LLMAT
Model Performance vs Size
Question Difficulty Map
Leaderboard
Exam version: 1.2
Model Name
Score
Size
Family
MoE
ChatGPT 4o Latest
27
1800.0
GPT-4
o1-mini
26
8.0
o1
Claude 3 Opus
2000.0
Claude 3
o1-preview
220.0
GPT-4o
25
Claude 3.5 Sonnet
24
70.0
Claude 3.5
GPT-4 Turbo 0125-preview
GPT-4 Turbo 0409
Gemma 2 27B IT
23
27.0
Gemma
Claude 3 Haiku
20.0
Llama 3 70B
Llama 3
Gemini 1.5 Pro
22
Gemini
Gemini 1.5 Flash
21
Nous Hermes 2 Yi 34b
20
34.0
Yi
GPT 4o mini
Gemma 2 9B IT
9.0
Claude 3 Sonnet
Yi 1.5 34B Chat
Yi 1.5
Nous Hermes 2 SOLAR 10.7B
19
10.7
Solar
Llama 3 8B Instruct
Gemini Pro 1.0
175.0
Hermes 2 Theta Llama 3 8B
Yi 1.5 9B Chat
Hermes 2 Pro Llama 3
18
SFR Iterated DPO Llama 3 8B R
Mistral Nemo Instruct 2407
17
12.0
Mistral
Mixtral 34x2 MoE 60b
60.0
Starling LM 7B Beta
7.0
GPT-3.5 Turbo 0125
GPT-3.5
Mixtral 11bx2 MoE 19b
16
19.0
Llama 3.1 8B Instruct
Kunoichi DPO v2 7B
Hermes 2 Pro Mistral 7B
Mistral 7B Instruct v0.2
Llama 3 Refueled
Phi 3 Mini 4k Instruct
15
3.8
Phi 3
Mistral 7B Instruct v0.3
14
Phi 3 Mini 4k Instruct (2024-07-01)
Neural Hermes 2.5 7B
Gemma 1.1 7b IT
8.5
Command-R v01
13
35.0
C4AI Command-R
Yi 1.5 6B Chat
6.0
Gemma 7b IT
12
Mixtral 8x7b v0.1 instruct
45.0
Vicuna 33b Chat
10
33.0
Llama 1
Llama 3.2 3B Instruct
3.2
Zephyr 7B
Gemma 1.1 2B IT
2.5
StableLM Zephyr 3B
9
3.0
StableLM
StableLM 2 1.6B Chat
1.6
Llama 3.2 1B Instruct
7
1.2
H2O Danube 3 4b Chat
6
4.0
H2O Danube
H2O Danube 1.8b Chat
1.8
CodeLlama 13b Instruct
5
13.0
Llama 2
Gemma 2b IT
TinyLlama v1.0 chat 1.1b
1.1
TinyLlama