Difference between revisions of "GPT-3"

Revision as of 17:50, 8 April 2023

wikipedia:Generative Pre-trained Transformer 3

The architecture is a decoder-only transformer network with a 2048-token context window and a then-unprecedented 175 billion parameters, requiring 800 GB to store.
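As a rough sanity check on the storage figure, the sketch below multiplies the parameter count by the size of one parameter in a few common numeric formats. The function name `weight_storage_gb`, the format table, and the choice of decimal gigabytes are illustrative assumptions and do not come from the article, which does not state which precision or overhead its figure assumes.

```python
# Back-of-the-envelope estimate of raw weight storage for a model with
# 175 billion parameters.
# Assumption: storage = parameter count x bytes per parameter, with no
# optimizer state, checkpoint metadata, or other overhead included.

PARAMETER_COUNT = 175_000_000_000  # parameter count quoted in the text

# Bytes per parameter for a few common numeric formats (illustrative choices).
BYTES_PER_PARAMETER = {
    "float32": 4,
    "float16": 2,
    "int8": 1,
}


def weight_storage_gb(parameter_count: int, bytes_per_parameter: int) -> float:
    """Return raw weight storage in decimal gigabytes (1 GB = 10**9 bytes)."""
    return parameter_count * bytes_per_parameter / 10**9


if __name__ == "__main__":
    for name, size in BYTES_PER_PARAMETER.items():
        print(f"{name}: {weight_storage_gb(PARAMETER_COUNT, size):.0f} GB")
```

Under these assumptions, 32-bit weights alone come to 700 GB, which is the same order of magnitude as the quoted figure; 16-bit weights would need half of that.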

Older revision: 14:07, 24 January 2023, by Ant (→ See also)
Newer revision: 17:50, 8 April 2023, by Welcome

Line 3:
  [[wikipedia:Generative Pre-trained Transformer 3]]
+ The architecture is a decoder-only transformer network with a 2048-token-long context and then-unprecedented size of 175 billion [[parameters]], requiring 800GB to store.
