Difference between revisions of "Measuring Massive Multitask Language Understanding (MMLU)"

Revision as of 10:01, 10 April 2024

@@ Line 2: / Line 2: @@
+* [[HellaSwag]] (10-shot)
+* [[WinoGrande]] (5-shot)
+* [[ARC Challenge]] (25-shot)
+* [[TriviaQA]] (5-shot)
+* [[TruthfulQA]]
 == See also ==