# AI Model Comparison

## Highlights

- **Quality:** Artificial Analysis Quality Index; higher is better.
- **Speed:** Output tokens per second; higher is better.
- **Price:** USD per 1M tokens; lower is better (see the cost sketch below).
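To turn the table's per-1M-token prices into a per-request figure, multiply each side by its token count and sum. A minimal Python sketch; the token counts in the example are illustrative, not from the table:

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost of one request, given the table's USD-per-1M-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Example: GPT-4o ($2.50 input / $10.00 output) with a 2,000-token prompt
# and a 500-token reply (illustrative token counts).
print(f"${request_cost_usd(2_000, 500, 2.50, 10.00):.4f}")  # -> $0.0100
```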
## Summary

- **Quality:** o1-preview and o1-mini are the highest-quality models, followed by Gemini 1.5 Pro (Sep '24) and GPT-4o.
- **Output speed (tokens/s):** Gemma 7B (917 t/s) and Llama 3.2 1B (485 t/s) are the fastest models, followed by Gemini 1.5 Flash (May '24) and Llama 3.1 8B.
- **Latency (seconds):** Sonar Large (0.00 s) and Sonar Small (0.00 s) are the lowest-latency models, followed by Reka Edge and Sonar 3.1 Small.
- **Price ($ per 1M tokens):** OpenChat 3.5 ($0.06) and Gemma 7B ($0.07) are the cheapest models, followed by Llama 3.2 3B and Llama 3.2 1B; a quality-per-dollar comparison is sketched after this list.
- **Context window:** Gemini 1.5 Pro (Sep '24) and Gemini 1.5 Pro (May '24) lead with 2M-token context windows, followed by Gemini 1.5 Flash (Sep '24) and Gemini 1.5 Flash (May '24).
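One way to read the table below is quality per blended dollar. The sketch assumes a 3:1 input:output token mix when blending the two prices; that ratio and the "value" metric are assumptions for illustration, while the rows are copied from the table:

```python
# Rows copied from the table below: (model, quality, input $/1M, output $/1M).
MODELS = [
    ("o1-preview", 85, 15.00, 60.00),
    ("GPT-4o", 77, 2.50, 10.00),
    ("Qwen2.5 72B", 75, 0.38, 0.40),
    ("GPT-4o mini", 71, 0.15, 0.60),
    ("Llama 3.1 8B", 53, 0.13, 0.16),
]

def blended_price(inp: float, out: float, ratio: float = 3.0) -> float:
    """Blend input/output prices at an assumed 3:1 input:output token ratio."""
    return (ratio * inp + out) / (ratio + 1)

# Rank by quality points per blended dollar (higher means better value).
for name, q, inp, out in sorted(MODELS,
                                key=lambda m: m[1] / blended_price(m[2], m[3]),
                                reverse=True):
    b = blended_price(inp, out)
    print(f"{name:15s} quality={q:3d} blended=${b:6.2f}/1M value={q / b:6.1f}")
```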
Model | Creator | Quality | Context Length (tokens) | Input Price (per 1M tokens) | Output Price (per 1M tokens) |
---|---|---|---|---|---|
o1-preview | OpenAI | 85 | 128000 | 15.00 | 60.00 |
o1-mini | OpenAI | 82 | 128000 | 3.00 | 12.00 |
Gemini 1.5 Pro (Sep '24) | Google | 80 | 2000000 | 1.25 | 5.00 |
GPT-4o | OpenAI | 77 | 128000 | 2.50 | 10.00 |
GPT-4o (May '24) | OpenAI | 77 | 128000 | 5.00 | 15.00 |
Claude 3.5 Sonnet | Anthropic | 77 | 200000 | 3.00 | 15.00 |
Qwen2.5 72B | Alibaba | 75 | 131000 | 0.38 | 0.40 |
GPT-4 Turbo | OpenAI | 74 | 128000 | 10.00 | 30.00 |
Gemini 1.5 Flash (Sep '24) | Google | 73 | 1000000 | 0.07 | 0.30 |
Mistral Large 2 | Mistral | 73 | 128000 | 3.00 | 9.00 |
Llama 3.1 405B | Meta | 72 | 128000 | 4.00 | 9.00 |
GPT-4o mini | OpenAI | 71 | 128000 | 0.15 | 0.60 |
Claude 3 Opus | Anthropic | 70 | 200000 | 15.00 | 75.00 |
Qwen2 72B | Alibaba | 69 | 128000 | 0.63 | 0.65 |
DeepSeek-Coder-V2 | DeepSeek | 67 | 128000 | 0.14 | 0.28 |
DeepSeek-V2.5 | DeepSeek | 66 | 128000 | 1.07 | 1.14 |
Llama 3.2 90B (Vision) | Meta | 66 | 128000 | 0.90 | 0.90 |
DeepSeek-V2 | DeepSeek | 66 | 128000 | 0.14 | 0.28 |
Llama 3.1 70B | Meta | 65 | 128000 | 0.88 | 0.90 |
Jamba 1.5 Large | AI21 | 64 | 256000 | 2.00 | 8.00 |
Llama 3 70B | Meta | 62 | 8000 | 0.88 | 0.90 |
Sonar Large | Perplexity | 62 | 33000 | 1.00 | 1.00 |
Gemma 2 27B | Google | 61 | 8000 | 0.80 | 0.80 |
Mixtral 8x22B | Mistral | 61 | 65000 | 1.20 | 1.20 |
Mistral Small (Sep '24) | Mistral | 60 | 128000 | 0.20 | 0.60 |
Yi-Large | 01.AI | 58 | 32000 | 3.00 | 3.00 |
Claude 3 Sonnet | Anthropic | 57 | 200000 | 3.00 | 15.00 |
Reka Core | Reka | 57 | 128000 | 2.00 | 10.00 |
Mistral Large | Mistral | 56 | 33000 | 4.00 | 12.00 |
Pixtral 12B | Mistral | 56 | 128000 | 0.13 | 0.13 |
Command-R+ | Cohere | 56 | 128000 | 2.75 | 12.50 |
Claude 3 Haiku | Anthropic | 54 | 200000 | 0.25 | 1.25 |
Llama 3.2 11B (Vision) | Meta | 53 | 128000 | 0.19 | 0.19 |
Llama 3.1 8B | Meta | 53 | 128000 | 0.13 | 0.16 |
GPT-3.5 Turbo | OpenAI | 52 | 16000 | 0.50 | 1.50 |
Mistral NeMo | Mistral | 52 | 128000 | 0.15 | 0.15 |
Command-R | Cohere | 51 | 128000 | 0.33 | 1.05 |
Mistral Small (Feb '24) | Mistral | 50 | 33000 | 1.00 | 3.00 |
DBRX | Databricks | 49 | 33000 | 0.97 | 1.73 |
Llama 3 8B | Meta | 46 | 8000 | 0.14 | 0.20 |
Llama 3.2 3B | Meta | 46 | 128000 | 0.09 | 0.10 |
Gemma 2 9B | Google | 46 | 8000 | 0.20 | 0.20 |
Command-R+ (Apr '24) | Cohere | 46 | 128000 | 3.00 | 15.00 |
Jamba 1.5 Mini | AI21 | 46 | 256000 | 0.20 | 0.40 |
Reka Flash | Reka | 46 | 128000 | 0.80 | 2.00 |
OpenChat 3.5 | OpenChat | 43 | 8000 | 0.06 | 0.06 |
Mixtral 8x7B | Mistral | 42 | 33000 | 0.47 | 0.55 |
Sonar Small | Perplexity | 41 | 33000 | 0.20 | 0.20 |
Llama 2 Chat 70B | Meta | 39 | 4000 | 1.22 | 2.17 |
Llama 2 Chat 13B | Meta | 36 | 4000 | 0.30 | 0.30 |
Codestral-Mamba | Mistral | 36 | 256000 | 0.25 | 0.25 |
Command-R (Mar '24) | Cohere | 36 | 128000 | 0.50 | 1.50 |
Reka Edge | Reka | 30 | 64000 | 0.40 | 1.00 |
Jamba Instruct | AI21 | 28 | 256000 | 0.50 | 0.70 |
Llama 3.2 1B | Meta | 28 | 128000 | 0.07 | 0.09 |
Mistral 7B | Mistral | 24 | 33000 | 0.15 | 0.20 |
GPT-4 | OpenAI | N/A | 8000 | 30.00 | 60.00 |
Llama 2 Chat 7B | Meta | N/A | 4000 | 0.29 | 0.46 |
Codestral | Mistral | N/A | 33000 | 0.20 | 0.60 |
GPT-3.5 Turbo Instruct | OpenAI | N/A | 4000 | 1.50 | 2.00 |
Gemini 1.0 Pro | Google | N/A | 33000 | 0.50 | 1.50 |
Mistral Medium | Mistral | N/A | 33000 | 2.75 | 8.10 |
Gemini 1.5 Flash (May '24) | Google | N/A | 1000000 | 0.07 | 0.30 |
Gemini 1.5 Pro (May '24) | Google | N/A | 2000000 | 3.50 | 10.50 |
Sonar 3.1 Small | Perplexity | N/A | 131000 | 0.20 | 0.20 |
Phi-3 Medium 14B | Microsoft | N/A | 128000 | 0.20 | 0.70 |
Sonar 3.1 Large | Perplexity | N/A | 131000 | 1.00 | 1.00 |
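For programmatic use, the table can be parsed into one record per model. A minimal sketch, assuming the table above is saved verbatim to a file (the filename `models.md` is hypothetical); Quality is kept as a string because some rows are `N/A`:

```python
def parse_table(path: str) -> list[dict]:
    """Parse the pipe-delimited model table into one dict per row."""
    with open(path, encoding="utf-8") as f:
        lines = [line.strip() for line in f if "|" in line]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    rows = []
    for line in lines[2:]:  # skip the header row and the ---|--- separator
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        rows.append(dict(zip(header, cells)))
    return rows

# Usage (hypothetical filename); Quality may be "N/A", so filter before casting.
# models = parse_table("models.md")
# cheapest = min(models, key=lambda m: float(m["Input Price (per 1M tokens)"]))
```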