Transformer (machine learning model)

Wikipedia: Transformer (machine learning model); paper: "Attention Is All You Need" (2017)

GPT-3: a decoder-only transformer network with a 2048-token context window and 175 billion parameters, reportedly requiring 800 GB of storage.
GPT-2
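
As a rough sanity check on the storage figure quoted for GPT-3, the raw weight size can be estimated as parameter count times bytes per parameter. The sketch below is only back-of-the-envelope arithmetic: the helper name `weight_storage_gb` and the choice of 16-bit and 32-bit precision are assumptions for illustration, not details given by the source.

```python
def weight_storage_gb(num_params: int, bytes_per_param: int) -> float:
    """Raw weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


GPT3_PARAMS = 175_000_000_000  # parameter count quoted for GPT-3

# Assumed precisions: 2 bytes (float16) and 4 bytes (float32) per parameter.
for label, nbytes in (("float16", 2), ("float32", 4)):
    print(f"{label}: {weight_storage_gb(GPT3_PARAMS, nbytes):.0f} GB")
```

Under these assumptions the raw weights come to 350 GB at 16-bit and 700 GB at 32-bit precision. Both are below the quoted figure, and the source does not say what accounts for the difference.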

Related

See also

Transformer, GPT, Transformer 8, Etched, Megatron-Core, Attention is all you need, Attention
GPT, GPT-2, GPT-3, GPT-4, GPT-4o, GPT-5.4-Cyber, Tiktoken, Bigram, Transformer, PaLM, ChatGPT
Artificial neural networks, Neural network (NN), CNN, RNN, Micrograd, NPU, ConvNet, AlexNet, GoogLeNet, Apache MXNet, Neural architecture search, DAG, Feedforward neural network, NeurIPS, Feature Pyramid Network, TPU, Apple Neural Engine (ANE), LLM, TFLOPS, Softmax function, Dilution (neural networks), AlphaGo
Machine learning, Deep learning, AWS Sagemaker, PyTorch, Kubeflow, TensorFlow, Keras, Torch, Spark ML, Tinygrad, Apple Neural Engine, Scikit-learn, MNIST, MLOps, AutoML, ClearML, PostgresML, AWS Batch, Transformer, Diffusion, Backpropagation, JAX, Vector database, LLM, The Forrester Wave: AI/ML Platforms, Embeddings, Stochastic gradient descent
AI: Autonomous driving, OpenAI, Google AI, Eliezer Yudkowsky, DeepMind, Computer Vision, Neural network, Vertex AI, Instadeep, Deep learning, Infogrid, Sapling, AssemblyAI, V7, MTIA, Yann LeCun, AI WiW, Salesforce AI, Pika, Amazon Q, LLM, Ollama, Cloud AI Developer Services, Hugging Face, Databricks, Generative AI, Azure OpenAI, AI token, AI code
