Difference between revisions of "Measuring Massive Multitask Language Understanding (MMLU)"
Line 2:
+
+ * [[hellaswag]] (10-shot)
+ * [[winograde]] (5-shot)
+ * [[arc challenge]] (25-shot)
+ * [[TriviaQA]] (5-shot)
+ * [[TruthfulQA]]
  == See also ==
Revision as of 10:01, 10 April 2024
- HellaSwag (10-shot)
- WinoGrande (5-shot)
- ARC Challenge (25-shot)
- TriviaQA (5-shot)
- TruthfulQA
See also