Transformer (machine learning model)

From wikieduonline
[[wikipedia:Transformer (machine learning model)]]
  
* [[GPT-3]]: a decoder-only [[transformer network]] with a 2048-token context window and 175 billion [[parameters]], requiring about 800 GB to store.
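The ~800 GB storage figure can be sanity-checked with simple arithmetic (a rough sketch; the exact on-disk size depends on the checkpoint format and numeric precision used):

```python
# Rough lower-bound estimate of GPT-3's weight storage at 32-bit precision.
params = 175_000_000_000      # 175 billion parameters
bytes_per_param = 4           # float32
size_gb = params * bytes_per_param / 1e9
print(f"{size_gb:.0f} GB")    # 700 GB of raw weights at fp32
```

Checkpoint metadata and optimizer state push the practical figure higher, consistent with the ~800 GB cited above.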
== Related ==
* [[Attention is all you need (2017)]]
* [[GPT]]: [[GPT-4]], [[GPT-3]]
* [[Diffusion]]
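A decoder-only transformer, as used by the GPT models listed above, is built around causal self-attention: each token may attend only to itself and earlier tokens. A minimal single-head NumPy sketch (illustrative only; the function name and toy sizes are assumptions, not any particular implementation):

```python
import numpy as np

def causal_self_attention(x, Wq, Wk, Wv):
    """Single-head causal self-attention over a (T, d) input matrix:
    each position attends only to itself and earlier positions."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)               # (T, T) attention logits
    mask = np.triu(np.ones_like(scores), k=1)   # 1s strictly above the diagonal
    scores = np.where(mask == 1, -1e9, scores)  # block attention to future tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v

rng = np.random.default_rng(0)
T, d = 4, 8                       # tiny toy sizes (GPT-3 uses a 2048-token context)
x = rng.normal(size=(T, d))
W = [rng.normal(size=(d, d)) for _ in range(3)]
out = causal_self_attention(x, *W)
print(out.shape)                  # (4, 8)
```

Because of the causal mask, the first position can only attend to itself, so its output equals its own value vector.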
  
 
== See also ==
 
* {{Transformer}}
 
* {{GPT}}
 
* {{NLP}}
* {{ML}}
 
* {{OpenAI}}
 
[[Category:ML]]
[[Category:AI]]

Latest revision as of 15:23, 9 April 2023
