Transformer (machine learning model)
wikipedia:Transformer (machine learning model)
- GPT-3: the architecture is a decoder-only transformer network with a 2048-token context window and 175 billion parameters, requiring 800 GB to store (see the arithmetic sketch below).
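The 800 GB figure follows from simple arithmetic over the parameter count. Below is a minimal Python sketch, assuming 32-bit (4-byte) floating-point weights and decimal gigabytes; the gap between the raw ~700 GB of weights and the cited 800 GB is attributed here to checkpoint overhead, which is an assumption rather than something the source states.

```python
# Back-of-the-envelope storage estimate for GPT-3's weights.
# Assumptions (not from the source): fp32 precision, decimal GB (1 GB = 1e9 bytes).

N_PARAMS = 175_000_000_000   # 175 billion parameters
BYTES_PER_PARAM = 4          # fp32: 4 bytes per parameter

raw_gb = N_PARAMS * BYTES_PER_PARAM / 1e9
print(f"raw fp32 weights: {raw_gb:.0f} GB")  # -> 700 GB

# The cited 800 GB plausibly also covers checkpoint metadata and related
# overhead; that interpretation is an assumption.
```

At fp16 (2 bytes per parameter) the same weights would take roughly 350 GB, which is one reason inference deployments commonly use reduced precision.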
Related
See also
- Transformer, GPT, Transformer 8, Etched, Megatron-Core
- GPT, GPT-2, GPT-3, GPT-4, GPT-4o, Tiktoken, Bigram, Transformer, PaLM, ChatGPT
- Machine learning, Deep learning, AWS Sagemaker, PyTorch, Kubeflow, TensorFlow, Keras, Torch, Spark ML, Tinygrad, Apple Neural Engine, Scikit-learn, MNIST, MLOps, AutoML, ClearML, PostgresML, AWS Batch, Transformer, Diffusion, Backpropagation, JAX, Vector database, LLM, The Forrester Wave: AI/ML Platforms
- OpenAI, GitHub Copilot, ChatGPT, OpenAI Codex, GPT-3, GPT-4, Whisper, Sam Altman, Mira Murati, Greg Brockman, Ilya Sutskever, OpenAI board, John Schulman