Difference between revisions of "Generative Pre-trained Transformer"

Latest revision as of 15:20, 24 August 2023

GPT-4 (Mar 2023)
GPT-3 (Jun 2020, beta): a decoder-only transformer network with a 2048-token context window and 175 billion parameters, requiring 800 GB to store.
GPT-2 (Feb 2019)

@@ Line 1: / Line 1: @@
 [[wikipedia:Generative Pre-trained Transformer]]
+* [[GPT-4]] (Mar 2023)
+* [[GPT-3]] (Jun 2020, beta): a decoder-only transformer network with a 2048-token context window and 175 billion parameters, requiring 800 GB to store.
+* [[GPT-2]] (Feb 2019)
 * [[Improving Language Understanding by Generative Pre-Training]]
+* [[Attention is all you need (2017)]]
+== Related ==
+* [[/usr/lib/systemd/system-generators/systemd-gpt-auto-generator]]
+* [[GUID]]
+== See also ==
 * {{Transformer}}
 * {{GPT}}
+[[Category:GPT]]