GPT-3

[[wikipedia:Generative Pre-trained Transformer 3]]
 
The architecture is a decoder-only transformer network with a 2048-token context window and a then-unprecedented 175 billion [[parameters]], requiring about 800 GB to store.
  
  
