GPT-3
wikipedia:GPT-3 (Jun 2020)
wikipedia:Generative Pre-trained Transformer 3
The architecture is a decoder-only transformer network with a 2048-token-long context and 175 billion parameters, requiring 800GB to store.
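A rough back-of-the-envelope check of the parameter count, sketched in Python below, assumes the hyperparameters reported in the GPT-3 paper (Brown et al. 2020): 96 decoder layers, a model width of 12288, a 2048-token context, and a 50257-token vocabulary. It estimates the weight storage at fp32 and fp16 precision; the 800GB figure above presumably covers more than the raw weights.

```python
# Parameter-count sketch for GPT-3, assuming the hyperparameters
# reported in the GPT-3 paper (Brown et al. 2020).
n_layers = 96       # decoder blocks
d_model = 12288     # model width
n_ctx = 2048        # context length
vocab = 50257       # BPE vocabulary size

# Each transformer block holds roughly 12 * d_model^2 weights:
# 4 * d_model^2 for the attention projections (Q, K, V, output) and
# 8 * d_model^2 for the 4x-wide feed-forward network.
per_layer = 12 * d_model ** 2
embeddings = (vocab + n_ctx) * d_model  # token + learned position embeddings

total = n_layers * per_layer + embeddings
print(f"~{total / 1e9:.0f}B parameters")           # ~175B
print(f"fp32 weights: ~{total * 4 / 1e9:.0f} GB")  # ~698 GB
print(f"fp16 weights: ~{total * 2 / 1e9:.0f} GB")  # ~349 GB
```

The estimate lands at about 175 billion parameters, matching the figure in the sentence above; per-layer biases and layer-norm parameters are small enough to ignore at this scale.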