Difference between revisions of "Measuring Massive Multitask Language Understanding (MMLU)"
(Created page with "wikipedia:MMLU") |
|||
(5 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
[[wikipedia:MMLU]] | [[wikipedia:MMLU]] | ||
+ | |||
+ | |||
+ | |||
+ | * [[hellaswag]] (10-shot) | ||
+ | * [[winograde]] (5-shot) | ||
+ | * [[arc challenge]] (25-shot) | ||
+ | * [[TriviaQA]] (5-shot) | ||
+ | * [[TruthfulQA]] | ||
+ | * [[GSM8K]] | ||
+ | * [[MATH]] | ||
+ | * [[HumanEval]] | ||
+ | |||
+ | == See also == | ||
+ | * {{MATH}} | ||
+ | * {{LLM}} | ||
+ | |||
+ | [[Category:LLM]] |
Latest revision as of 09:58, 3 October 2024
- HellaSwag (10-shot)
- WinoGrande (5-shot)
- ARC Challenge (25-shot)
- TriviaQA (5-shot)
- TruthfulQA
- GSM8K
- MATH
- HumanEval
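
The "(k-shot)" annotations in the list above give the number of solved examples placed in the prompt before the question being scored. The following is a minimal sketch of that idea, not the method of any particular evaluation harness: the `FEW_SHOT_COUNTS` table, the `build_k_shot_prompt` helper, and the "Q:/A:" prompt layout are all hypothetical names and conventions chosen for illustration. Only the shot counts themselves come from the list.

```python
# Shot counts copied from the list above. Benchmarks listed without a
# count are stored as None rather than guessed.
FEW_SHOT_COUNTS = {
    "HellaSwag": 10,
    "WinoGrande": 5,
    "ARC Challenge": 25,
    "TriviaQA": 5,
    "TruthfulQA": None,
    "GSM8K": None,
    "MATH": None,
    "HumanEval": None,
}


def build_k_shot_prompt(examples, question, k):
    """Build a prompt with k solved examples followed by the open question.

    `examples` is a list of (question, answer) string pairs. The "Q:/A:"
    layout is an illustrative assumption, not a benchmark requirement.
    """
    if k < 0 or k > len(examples):
        raise ValueError("k must be between 0 and len(examples)")
    # One block per solved example, in the order given.
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples[:k]]
    # The final block leaves the answer empty for the model to complete.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)
```

For example, `build_k_shot_prompt(solved, new_question, FEW_SHOT_COUNTS["TriviaQA"])` would prepend five solved pairs, provided `solved` holds at least five.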
See also