Generative Pre-trained Transformer
* [[GPT-4]] (Mar 2023)
* [[GPT-3]] (Jun 2020, beta): a decoder-only transformer with a 2048-token context window and 175 billion parameters, requiring 800 GB to store
* [[GPT-2]] (Feb 2019)
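A quick back-of-the-envelope check of the GPT-3 figures above (a sketch, not from the source): 175 billion parameters at 4 bytes each come to 700 GB of raw fp32 weights, in the same ballpark as the quoted 800 GB storage figure.

```python
# Rough sizing arithmetic for the GPT-3 numbers quoted above.
params = 175_000_000_000        # 175 billion parameters
bytes_per_param_fp32 = 4        # 32-bit floats, 4 bytes each
raw_fp32_gb = params * bytes_per_param_fp32 / 1e9

context_tokens = 2048           # maximum context window, in tokens

print(f"raw fp32 weights: {raw_fp32_gb:.0f} GB")   # 700 GB
print(f"context window:   {context_tokens} tokens")
```

The gap between 700 GB of raw weights and the 800 GB figure would be checkpoint overhead (optimizer state, metadata), though the source does not say what the 800 GB includes.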
* [[Improving Language Understanding by Generative Pre-Training]]
* [[Attention is all you need (2017)]]
== Related ==
* [[/usr/lib/systemd/system-generators/systemd-gpt-auto-generator]]
* [[GUID]]
== See also ==
* [[wikipedia:Generative Pre-trained Transformer]]
* [[Transformer]], [[GPT]], [[Transformer 8]], [[Etched]], [[Megatron-Core]]
* [[GPT]], [[GPT-2]], [[GPT-3]], [[GPT-4]], [[GPT-4o]], [[Tiktoken]], [[Bigram]], [[Transformer]], [[PaLM]], [[ChatGPT]]

Latest revision as of 15:20, 24 August 2023