Compare Leading AI & LLM Models
Compare performance metrics, benchmarks, and pricing across the leading AI models. Sort by any metric and find the best model for your use case.
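
The sorting and price comparison described above can also be reproduced offline on a copy of the table. What follows is a minimal sketch, not the site's implementation. It assumes cells are formatted as in the leaderboard (`94.2%`, `$5.00`, `1,851`), that `-` marks a missing value, and that `$/M` means US dollars per million tokens. The helper names `parse_metric`, `sort_by_metric` and `workload_cost` are hypothetical, and the sample rows are copied from the table.

```python
from typing import Optional


def parse_metric(cell: str) -> Optional[float]:
    """Convert a cell such as '94.2%', '$5.00', '1,851' or '-3.4' to a float.

    Returns None for a missing value ('-' or empty) and for cells with
    unit suffixes this sketch does not handle (e.g. '400k', '114c/s').
    """
    text = cell.strip()
    if text in ("", "-"):
        return None
    cleaned = text.replace("%", "").replace("$", "").replace(",", "")
    try:
        return float(cleaned)
    except ValueError:
        return None


def sort_by_metric(rows, metric, descending=True):
    """Return rows sorted by one column, with missing values placed last."""
    present = [r for r in rows if parse_metric(r[metric]) is not None]
    missing = [r for r in rows if parse_metric(r[metric]) is None]
    present.sort(key=lambda r: parse_metric(r[metric]), reverse=descending)
    return present + missing


def workload_cost(row, input_tokens, output_tokens):
    """Estimate the cost in dollars of a workload (assumes $/M = per million tokens)."""
    price_in = parse_metric(row["Input $/M"])
    price_out = parse_metric(row["Output $/M"])
    if price_in is None or price_out is None:
        return None  # no published price in the table
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Sample rows copied from the leaderboard (only the columns used here).
rows = [
    {"Model": "GPT-5.2", "GPQA": "92.4%", "Input $/M": "$1.75", "Output $/M": "$14.00"},
    {"Model": "Claude Opus 4.7", "GPQA": "94.2%", "Input $/M": "$5.00", "Output $/M": "$25.00"},
    {"Model": "Gemini 3 Pro", "GPQA": "91.9%", "Input $/M": "-", "Output $/M": "-"},
    {"Model": "Kimi K2.6", "GPQA": "90.5%", "Input $/M": "$0.95", "Output $/M": "$4.00"},
]

by_gpqa = [r["Model"] for r in sort_by_metric(rows, "GPQA")]
# ['Claude Opus 4.7', 'GPT-5.2', 'Gemini 3 Pro', 'Kimi K2.6']

cheapest_first = [r["Model"] for r in sort_by_metric(rows, "Input $/M", descending=False)]
# ['Kimi K2.6', 'GPT-5.2', 'Claude Opus 4.7', 'Gemini 3 Pro']

# Cost of 1M input tokens plus 200k output tokens on Kimi K2.6.
kimi_cost = workload_cost(rows[3], 1_000_000, 200_000)
```

Placing rows without a value last, rather than treating `-` as zero, keeps models with unpublished prices or scores from appearing as the cheapest or the worst.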
Model Leaderboard
Showing 298 of 298 models
| Org | 🌍 | Model | Multimodal | GPQA | AIME 2025 | SWE-bench | HLE | Input $/M | Output $/M | Context | Cutoff | Params (B) | License | Speed | Latency | Released | Code Arena | Reasoning | Math | Coding | Search | Writing | Vision | Tools | Long Ctx | Finance | Legal | Health | ARC-AGI v2 | MMMLU | MMMU | BrowseComp | CharXiv-R | MMMU-Pro | ScreenSpot Pro | MCP Atlas | SimpleQA | OSWorld | Toolathlon | Terminal Bench | TAU2 Retail | FrontierMath | MRCR v2 | SciCode | Apex Agents | SWE-bench Pro | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Anthropic | 🇺🇸 | Claude Mythos Preview (unreleased) | ✓ | 94.6% | - | 93.9% | 64.7% | $25.00 | $125.00 | - | - | - | Closed | - | - | - | - | 71.3 | 61.7 | 57.3 | 41.3 | - | 52.6 | 39.6 | 36.8 | - | - | 24.3 | - | 92.7% | - | 86.9% | 93.2% | - | - | - | - | - | - | - | - | - | - | - | - | 77.8% | |
| Google | 🇺🇸 | Gemini 3.1 Pro | ✓ | 94.3% | - | 80.6% | 51.4% | $2.50 | $15.00 | 1.0M | Jan. 2025 | - | Closed | 114c/s | 29.1s | Feb. 2026 | 2,093 | 59.0 | 54.8 | 44.1 | 37.0 | - | 41.0 | 34.1 | 13.2 | 4.8 | 4.8 | 29.5 | 77.1% | 92.6% | - | 85.9% | - | 80.5% | - | 69.2% | - | - | - | - | - | - | 26.3% | 59.0% | 33.5% | 54.2% | |
| Anthropic | 🇺🇸 | Claude Opus 4.7 | ✓ | 94.2% | - | 87.6% | 54.7% | $5.00 | $25.00 | 1M | - | - | Closed | 86c/s | 2.1s | Apr. 2026 | 1,851 | 62.6 | 52.3 | 51.6 | 31.4 | - | 44.6 | 39.6 | 26.3 | 40.6 | - | 41.8 | - | 91.5% | - | 79.3% | 91.0% | - | - | 77.3% | - | - | - | - | - | - | - | - | - | 64.3% | |
| OpenAI | 🇺🇸 | GPT-5.5 | ✓ | 93.6% | - | - | 52.2% | $5.00 | $30.00 | 1.1M | Dec. 2025 | - | Closed | 63c/s | 16.4s | Apr. 2026 | 1,583 | 62.8 | 48.6 | 53.1 | 35.6 | 30.8 | 46.9 | 40.4 | 30.5 | 21.8 | - | - | 85.0% | - | - | 84.4% | - | 83.2% | - | 75.3% | - | - | 55.6% | - | - | 35.4% | 74.0% | - | - | 58.6% | |
| OpenAI | 🇺🇸 | GPT-5.2 Pro | ✓ | 93.2% | 100.0% | - | 36.6% | $21.00 | $168.00 | 400k | - | - | Closed | - | - | Dec. 2025 | - | 56.1 | 51.4 | - | 29.5 | - | 33.1 | - | - | - | - | - | 54.2% | - | - | 77.9% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.4 | ✓ | 92.8% | - | - | 39.8% | $2.50 | $15.00 | 1M | - | - | Closed | 135c/s | 1.1s | Mar. 2026 | 1,727 | 57.8 | 47.3 | 44.3 | 32.0 | 35.5 | 38.1 | 36.3 | 27.4 | 2.0 | - | 43.5 | 73.3% | - | - | 82.7% | - | 81.2% | - | 67.2% | - | - | 54.6% | - | - | 47.6% | - | - | - | 57.7% | |
| OpenAI | 🇺🇸 | GPT-5.2 | ✓ | 92.4% | 100.0% | 80.0% | 34.5% | $1.75 | $14.00 | 400k | Aug. 2025 | - | Closed | 85c/s | 33.8s | Dec. 2025 | 1,514 | 54.0 | 50.5 | 35.7 | 26.1 | 33.1 | 35.8 | 28.7 | - | - | - | 44.4 | 52.9% | 89.6% | - | 65.8% | 82.1% | 79.5% | 86.3% | 60.6% | - | - | 46.3% | - | - | 40.3% | - | - | - | - | |
| Google | 🇺🇸 | Gemini 3 Pro | ✓ | 91.9% | 100.0% | 76.2% | 45.8% | - | - | - | Jan. 2025 | - | Closed | - | - | Nov. 2025 | 1,579 | 49.9 | 51.6 | 33.4 | - | - | 34.9 | 19.8 | 9.8 | - | - | 49.9 | 31.1% | 91.8% | - | - | 81.4% | 81.0% | 72.7% | - | 72.1% | - | - | - | - | - | 26.3% | - | - | - | |
| Anthropic | 🇺🇸 | Claude Opus 4.6 | ✓ | 91.3% | 99.8% | 80.8% | 53.1% | $5.00 | $25.00 | 1M | - | - | Closed | 71c/s | 2.4s | Feb. 2026 | 2,007 | 59.9 | 52.4 | 45.6 | 38.7 | 44.6 | 36.5 | 35.1 | 36.6 | 37.1 | 39.3 | 14.0 | 68.8% | 91.1% | - | 84.0% | 77.4% | 77.3% | - | 62.7% | - | 72.7% | - | - | - | - | 93.0% | - | - | - | |
| MoonshotAI | 🇨🇳 | Kimi K2.6 | ✓ | 90.5% | - | 80.2% | 36.4% | $0.95 | $4.00 | 262.1k | - | 1000 | Open | 20c/s | 54.1s | Apr. 2026 | 1,254 | 59.1 | 50.8 | 45.6 | 38.0 | - | 38.8 | 33.4 | - | - | - | - | - | - | - | 86.3% | 86.7% | 80.1% | - | - | - | - | 50.0% | - | - | - | - | 52.2% | 27.9% | 58.6% | |
| Google | 🇺🇸 | Gemini 3 Flash | ✓ | 90.4% | 99.7% | 78.0% | 43.5% | $0.50 | $3.00 | 1M | Jan. 2025 | - | Closed | 322c/s | 7.1s | Dec. 2025 | 1,695 | 49.5 | 50.8 | 31.5 | - | - | 34.2 | 24.1 | 5.3 | - | - | 44.7 | 33.6% | 91.8% | - | - | 80.3% | 81.2% | 69.1% | 57.4% | 68.7% | - | 49.4% | - | - | - | 22.1% | - | - | - | |
| Qwen | 🇨🇳 | Qwen3.6 Plus | ✓ | 90.4% | - | 78.8% | 28.8% | $0.50 | $3.00 | 1M | - | - | Closed | 78c/s | 65.8s | Mar. 2026 | 1,345 | 52.8 | 50.0 | 43.3 | 25.2 | - | 35.6 | 31.0 | 37.3 | 56.9 | 56.9 | 54.1 | - | 89.5% | 86.0% | - | 81.5% | 78.8% | 68.2% | 74.1% | - | - | 39.8% | - | - | - | - | - | - | 56.6% | |
| DeepSeek | 🇨🇳 | DeepSeek-V4-Pro-Max | x | 90.1% | - | 80.6% | 48.2% | $1.74 | $3.48 | 1.0M | - | 1600 | Open | 37c/s | 54.3s | Apr. 2026 | 1,096 | 57.7 | 53.9 | 45.0 | 33.5 | - | 33.6 | 35.2 | 20.0 | 45.2 | 45.7 | 48.6 | - | - | - | 83.4% | - | - | - | 73.6% | 57.9% | - | 51.8% | - | - | - | - | - | - | 55.4% | |
| Anthropic | 🇺🇸 | Claude Sonnet 4.6 | ✓ | 89.9% | - | 79.6% | 49.0% | $3.00 | $15.00 | 200k | - | - | Closed | 200c/s | 703ms | Feb. 2026 | 1,421 | 52.5 | 44.3 | 37.7 | 24.8 | 35.6 | 33.9 | 29.8 | 26.3 | 41.6 | 42.7 | 14.3 | 58.3% | 89.3% | - | 74.7% | - | 75.6% | - | 61.3% | - | 72.5% | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Muse Spark | ✓ | 89.5% | - | 77.4% | 58.4% | - | - | - | - | - | Closed | - | - | Apr. 2026 | - | 52.8 | 55.2 | 32.9 | 1.9 | 20.3 | 39.6 | 22.4 | - | 28.9 | 29.1 | 44.8 | 42.5% | - | - | - | 86.4% | 80.4% | 84.1% | - | - | - | - | - | - | - | - | - | - | 52.4% | |
| Bytedance | 🇨🇳 | Seed 2.0 Pro | ✓ | 88.9% | 98.3% | 76.5% | - | - | - | - | Jan. 2024 | - | Closed | - | - | Feb. 2026 | - | 54.5 | 45.4 | 33.3 | 28.9 | - | - | - | - | - | - | - | - | - | - | 77.3% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3.5-397B-A17B | ✓ | 88.4% | - | 76.4% | 28.7% | $0.60 | $3.60 | 262.1k | - | 397 | Open | 68c/s | 9.5s | Feb. 2026 | 1,210 | 49.6 | 46.9 | 31.0 | 25.8 | 28.7 | 28.9 | 22.8 | 38.8 | 55.4 | 55.4 | 54.3 | - | 88.5% | - | 69.0% | - | - | - | - | - | - | 38.3% | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-4 Heavy (unreleased) | ✓ | 88.4% | 100.0% | - | 50.7% | - | - | - | Dec. 2024 | - | Closed | - | - | - | - | 54.1 | 53.5 | 25.6 | - | - | 36.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.1 | ✓ | 88.1% | 94.0% | 76.3% | - | $1.25 | $10.00 | 400k | Sep. 2024 | - | Closed | 474c/s | 958ms | Nov. 2025 | 1,226 | 48.1 | 41.5 | 31.8 | 22.2 | 28.3 | 33.4 | 25.3 | - | - | - | 48.1 | - | - | 85.4% | - | - | - | - | - | - | - | - | - | - | 26.7% | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.1 High | ✓ | 88.1% | 99.6% | - | - | - | - | - | - | - | Closed | - | - | Nov. 2025 | 1,140 | 53.1 | 47.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5 Medium | ✓ | 88.1% | 88.9% | - | - | $1.25 | $10.00 | 400k | Sep. 2024 | - | Closed | 109c/s | 51.2s | Aug. 2025 | 1,101 | 44.5 | 29.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.1 Thinking | ✓ | 88.1% | 94.0% | 76.3% | - | $1.25 | $10.00 | 400k | - | - | Closed | 32c/s | 23.7s | Nov. 2025 | 1,024 | 46.2 | 38.7 | 30.8 | 13.7 | 25.4 | 31.8 | 22.4 | - | - | - | 44.3 | - | - | 85.4% | - | - | - | - | - | - | - | - | - | - | 26.7% | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.1 Instant | ✓ | 88.1% | 94.0% | 76.3% | - | $1.25 | $10.00 | 400k | - | - | Closed | 414c/s | 3.8s | Nov. 2025 | 805 | 49.6 | 40.1 | 31.3 | 18.3 | 26.9 | 35.0 | 23.8 | - | - | - | 46.0 | - | - | 85.4% | - | - | - | - | - | - | - | - | - | - | 26.7% | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V4-Flash-Max | x | 88.1% | - | 79.0% | 45.1% | $0.14 | $0.28 | 1.0M | - | 284 | Open | 112c/s | 23.8s | Apr. 2026 | 850 | 52.2 | 51.5 | 38.8 | 23.9 | - | 31.0 | 28.3 | 7.9 | 36.5 | 36.5 | 44.7 | - | - | - | 73.2% | - | - | - | 69.0% | 34.1% | - | 47.8% | - | - | - | - | - | - | 52.6% | |
| OpenAI | 🇺🇸 | GPT-5.4 mini | ✓ | 88.0% | - | - | 28.2% | $0.75 | $4.50 | 400k | Aug. 2025 | - | Closed | 402c/s | 988ms | Mar. 2026 | 547 | 46.1 | 36.4 | 35.4 | - | 22.2 | 29.5 | 25.0 | 19.5 | - | - | 38.1 | - | - | - | - | - | 76.6% | - | 57.7% | - | - | 42.9% | - | - | - | 33.6% | - | - | 54.4% | |
| Qwen | 🇨🇳 | Qwen3.6-27B | ✓ | 87.8% | - | 77.2% | 24.0% | $0.60 | $3.60 | 262.1k | - | 27.8 | Open | 135c/s | 35.7s | Apr. 2026 | 465 | 46.3 | 43.4 | 35.5 | - | - | 30.6 | 24.8 | 29.2 | 46.9 | 46.9 | 46.5 | - | - | 82.9% | - | 78.4% | 75.8% | - | - | - | - | - | - | - | - | - | - | - | 53.5% | |
| MoonshotAI | 🇨🇳 | Kimi K2.5 | ✓ | 87.6% | 96.1% | 76.8% | 50.2% | $0.60 | $3.00 | 262.1k | - | 1000 | Open | 68c/s | 64.1s | Jan. 2026 | 1,462 | 50.4 | 48.0 | 32.5 | 30.4 | - | 36.0 | 14.6 | 38.3 | 47.7 | 47.7 | 48.5 | - | - | - | 74.9% | 77.5% | 78.5% | - | - | - | - | - | - | - | - | - | 48.7% | - | 50.7% | |
| xAI | 🇺🇸 | Grok-4 | ✓ | 87.5% | 91.7% | - | 40.0% | - | - | - | Dec. 2024 | - | Closed | - | - | Jul. 2025 | 487 | 44.5 | 40.7 | 23.6 | - | - | 28.1 | - | - | - | - | - | 15.9% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5 High | ✓ | 87.3% | 94.6% | - | - | - | - | - | Sep. 2024 | - | Closed | - | - | Aug. 2025 | 1,301 | 48.2 | 40.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude Opus 4.5 | ✓ | 87.0% | - | 80.9% | - | $5.00 | $25.00 | 200k | Mar. 2025 | - | Closed | 195c/s | 2.2s | Nov. 2025 | 1,620 | 54.5 | 42.1 | 41.0 | - | 35.6 | 29.9 | 30.8 | - | - | - | 24.4 | 37.6% | 90.8% | - | - | - | - | - | 62.3% | - | 66.3% | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 3.1 Flash-Lite | ✓ | 86.9% | - | - | 16.0% | $0.25 | $1.50 | 1M | Jan. 2025 | - | Closed | 132c/s | 9.2s | Mar. 2026 | 1,162 | 41.9 | 32.2 | - | - | - | 25.9 | - | 30.6 | - | - | 40.4 | - | 88.9% | - | - | 73.2% | 76.8% | - | - | 43.3% | - | - | - | - | - | 60.1% | - | - | - | |
| Qwen | 🇨🇳 | Qwen3.5-122B-A10B | ✓ | 86.6% | - | 72.0% | 47.5% | $0.40 | $3.20 | 262.1k | - | 122 | Open | 66c/s | 12.9s | Feb. 2026 | 698 | 43.5 | 44.5 | 26.6 | 22.2 | 24.3 | 31.3 | 14.1 | 32.4 | 49.9 | 49.9 | 48.0 | - | 86.7% | 83.9% | 63.8% | 77.2% | 76.9% | 70.4% | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 2.5 Pro Preview 06-05 | ✓ | 86.4% | 88.0% | 67.2% | 21.6% | $1.25 | $10.00 | 1.0M | Jan. 2025 | - | Closed | - | - | Jun. 2025 | - | 38.2 | 32.4 | 21.9 | - | - | 26.5 | - | -3.4 | - | - | 40.2 | - | - | 82.0% | - | - | - | - | - | 54.0% | - | - | - | - | - | 16.4% | - | - | - | |
| ZAI | 🇨🇳 | GLM-5.1 | x | 86.2% | - | - | 52.3% | $1.40 | $4.40 | 200k | - | 754 | Open | 23c/s | 238.4s | Apr. 2026 | 1,539 | 55.0 | 47.3 | 45.1 | 30.1 | - | 39.5 | 30.6 | - | - | - | - | - | - | - | 79.3% | - | - | - | 71.8% | - | - | 40.7% | - | - | - | - | - | - | 58.4% | |
| Qwen | 🇨🇳 | Qwen3.6-35B-A3B | ✓ | 86.0% | - | 73.4% | 21.4% | - | - | - | - | 35 | Open | - | - | Apr. 2026 | - | 43.0 | 39.9 | 29.8 | 13.4 | - | 28.8 | 17.7 | 28.7 | 41.9 | 42.0 | 42.7 | - | - | 81.7% | - | 78.0% | 75.3% | - | 62.8% | - | - | 26.9% | - | - | - | - | - | - | 49.5% | |
| ZAI | 🇨🇳 | GLM-4.7 | ✓ | 85.7% | 95.7% | 73.8% | 42.8% | $0.60 | $2.20 | 204.8k | - | 358 | Open | 43c/s | 9.2s | Dec. 2025 | 1,066 | 44.3 | 44.1 | 23.2 | 16.8 | - | 29.8 | 12.5 | - | 36.6 | 36.5 | 36.3 | - | - | - | 52.0% | - | - | - | - | - | - | - | 33.3% | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5 | ✓ | 85.7% | 94.6% | 74.9% | 24.8% | - | - | - | Sep. 2024 | - | Closed | - | - | Aug. 2025 | 886 | 45.5 | 39.0 | 34.5 | 13.2 | 29.5 | 32.1 | 23.8 | 32.6 | 45.5 | 45.6 | 41.8 | - | - | 84.2% | 54.9% | 81.1% | 78.4% | - | - | - | - | - | - | - | 26.3% | - | - | - | - | |
| xAI | 🇺🇸 | Grok 4 Fast | ✓ | 85.7% | 92.0% | - | 20.0% | $0.20 | $0.50 | 2M | - | - | Closed | 163c/s | 8.6s | Aug. 2025 | 419 | 41.5 | 36.9 | 26.7 | 6.9 | - | 17.3 | - | - | - | - | - | - | - | - | 44.9% | - | - | - | - | 95.0% | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.5 Instant | ✓ | 85.6% | 81.2% | - | - | $5.00 | $30.00 | 400k | Aug. 2025 | - | Closed | 223c/s | 1.2s | May 2026 | - | 43.1 | 22.1 | - | - | - | 30.6 | - | - | - | - | 26.0 | - | - | - | - | 81.6% | 76.0% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3.5-27B | ✓ | 85.5% | - | 72.4% | 48.5% | $0.30 | $2.40 | 262.1k | - | 27 | Open | 48c/s | 25.2s | Feb. 2026 | 483 | 42.3 | 42.8 | 22.1 | 20.5 | 22.2 | 30.1 | 13.3 | 31.0 | 46.4 | 46.5 | 44.4 | - | 85.9% | 82.3% | 61.0% | 79.5% | 75.0% | 70.3% | - | - | - | - | - | - | - | - | - | - | - | |
| Bytedance | 🇨🇳 | Seed 2.0 Lite | ✓ | 85.1% | 93.0% | 73.5% | - | - | - | - | Jan. 2024 | - | Closed | - | - | Feb. 2026 | - | 43.1 | 32.9 | 26.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Baidu | 🇨🇳 | ERNIE 5.0 | ✓ | 85.0% | 87.0% | - | 39.0% | - | - | - | - | - | Closed | - | - | Jan. 2026 | - | 44.0 | 40.6 | - | - | - | 27.8 | - | - | 46.8 | 46.8 | 46.5 | - | - | - | - | - | - | - | - | 75.0% | - | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude 3.7 Sonnet | ✓ | 84.8% | 54.8% | 70.3% | - | $3.00 | $15.00 | 200k | - | - | Closed | - | - | Feb. 2025 | 632 | 29.5 | 21.1 | 19.9 | - | 23.7 | 18.2 | 21.9 | - | - | - | 27.7 | - | 86.1% | 75.0% | - | - | - | - | - | - | - | - | 35.2% | 81.2% | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-3 | ✓ | 84.6% | 93.3% | - | - | $3.00 | $15.00 | 128k | Nov. 2024 | - | Closed | 150c/s | 857ms | Feb. 2025 | 794 | 40.6 | 39.0 | 26.5 | - | - | 21.2 | - | - | - | - | 31.6 | - | - | 78.0% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| MoonshotAI | 🇨🇳 | Kimi K2-Thinking-0905 | x | 84.5% | 100.0% | 71.3% | 51.0% | - | - | - | - | 1000 | Open | - | - | Sep. 2025 | 941 | 46.0 | 47.5 | 26.6 | 19.9 | -6.3 | 36.8 | - | - | 26.4 | 26.3 | 39.4 | - | - | - | 60.2% | - | - | - | - | - | - | - | 47.1% | - | - | - | 44.8% | - | - | |
| Google | 🇺🇸 | Gemma 4 31B | ✓ | 84.3% | - | - | 26.5% | $0.14 | $0.40 | 262.1k | Jan. 2025 | 30.7 | Open | 57c/s | 6.6s | Apr. 2026 | 982 | 45.3 | 39.9 | - | - | - | 28.1 | 20.1 | 29.8 | 42.4 | 42.4 | 36.7 | - | 88.4% | - | - | - | 76.9% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3.5-35B-A3B | ✓ | 84.2% | - | 69.2% | 47.4% | $0.25 | $2.00 | 262.1k | - | 35 | Open | 139c/s | 7.3s | Feb. 2026 | 492 | 39.2 | 38.8 | 16.8 | 17.8 | 18.8 | 27.0 | 13.3 | 26.9 | 41.9 | 42.1 | 40.2 | - | 85.2% | 81.4% | 61.0% | 77.5% | 75.1% | 68.6% | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | ChatGPT-4o Latest | ✓ | 84.0% | - | - | - | $2.50 | $10.00 | 128k | - | - | Closed | - | - | May 2024 | 346 | 40.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-3 Mini | ✓ | 84.0% | 90.8% | - | - | $0.30 | $0.50 | 128k | Nov. 2024 | - | Closed | 103c/s | 9.7s | Feb. 2025 | 341 | 42.5 | 38.1 | 27.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Xiaomi | 🇨🇳 | MiMo-V2-Flash | x | 83.7% | 94.1% | 73.4% | 22.1% | $0.10 | $0.30 | 256k | - | 309 | Open | - | - | Dec. 2025 | 793 | 39.0 | 38.2 | 23.6 | 15.9 | - | 19.3 | 9.8 | 22.1 | 38.6 | 38.6 | 38.3 | - | - | - | 58.3% | - | - | - | - | - | - | - | 30.5% | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude Sonnet 4.5 | ✓ | 83.4% | 87.0% | - | - | $3.00 | $15.00 | 200k | Jan. 2025 | - | Closed | 46c/s | 2.9s | Sep. 2025 | 1,105 | 40.5 | 33.4 | 32.0 | - | 35.3 | 19.2 | 33.8 | 19.0 | - | - | 16.7 | - | 89.1% | - | - | - | - | - | - | - | 61.4% | - | 50.0% | 86.2% | - | - | - | - | - | |
| OpenAI | 🇺🇸 | o3 | ✓ | 83.3% | 86.4% | 69.1% | 14.7% | $2.00 | $8.00 | 200k | May 2024 | - | Closed | - | - | Apr. 2025 | - | 38.7 | 32.6 | 20.5 | 9.7 | 23.0 | 28.5 | 16.3 | - | - | - | 41.2 | 6.5% | - | 82.9% | 49.7% | 78.6% | 76.4% | - | - | - | - | - | - | - | 15.8% | - | - | - | - | |
| Google | 🇺🇸 | Gemini 2.5 Pro | ✓ | 83.0% | 83.0% | 63.2% | 17.8% | $1.25 | $10.00 | 1.0M | Jan. 2025 | - | Closed | 112c/s | 7.2s | May 2025 | 933 | 35.5 | 31.7 | 17.4 | - | - | 23.2 | - | 29.8 | - | - | 32.7 | 4.9% | - | 79.6% | - | - | - | - | - | 50.8% | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 2.5 Flash | ✓ | 82.8% | 72.0% | 60.4% | 11.0% | $0.30 | $2.50 | 1.0M | Jan. 2025 | - | Closed | 144c/s | 4.4s | May 2025 | 782 | 28.9 | 23.4 | 12.4 | - | - | 17.3 | - | 18.8 | - | - | 33.3 | - | - | 79.7% | - | - | - | - | - | 26.9% | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.4 nano | ✓ | 82.8% | - | - | 24.3% | $0.20 | $1.25 | 400k | Aug. 2025 | - | Closed | 497c/s | 1.2s | Mar. 2026 | 696 | 40.1 | 33.5 | 24.1 | - | 20.9 | 23.4 | 16.4 | 18.7 | - | - | 45.9 | - | - | - | - | - | 66.1% | - | 56.1% | - | - | 35.5% | - | - | - | 33.1% | - | - | 52.4% | |
| Nvidia | 🇺🇸 | Nemotron 3 Super (120B A12B) | x | 82.7% | 90.2% | 53.7% | 22.8% | - | - | - | Jun. 2025 | 120 | Open | - | - | Mar. 2026 | -63 | 30.5 | 38.2 | 17.4 | -0.1 | 13.6 | 19.7 | 5.8 | 19.0 | 36.0 | 36.1 | 35.7 | - | - | - | 31.3% | - | - | - | - | - | - | - | 25.8% | - | - | - | 42.0% | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V3.2 (Thinking) | x | 82.4% | 93.1% | 73.1% | 25.1% | - | - | - | - | 685 | Open | - | - | Dec. 2025 | 393 | 42.8 | 39.4 | 29.9 | 15.0 | - | 22.1 | 14.8 | - | 40.4 | 40.4 | 40.0 | - | - | - | 51.4% | - | - | - | - | - | - | 35.2% | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V3.2 | x | 82.4% | 93.1% | 73.1% | 40.8% | $0.26 | $0.38 | 163.8k | - | 685 | Open | 25c/s | 1.4s | Dec. 2025 | 548 | 42.3 | 39.1 | 28.8 | 13.7 | - | 29.1 | 15.7 | - | 39.7 | 39.7 | 39.4 | - | - | - | 51.4% | - | - | - | - | - | - | 35.2% | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5 mini | ✓ | 82.3% | 91.1% | - | 16.7% | $0.25 | $2.00 | 400k | May 2024 | - | Closed | 136c/s | 14.0s | Aug. 2025 | 1,086 | 36.9 | 33.4 | - | - | - | 12.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 22.1% | - | - | - | - | |
| Google | 🇺🇸 | Gemma 4 26B-A4B | ✓ | 82.3% | - | - | 17.2% | $0.13 | $0.40 | 262.1k | Jan. 2025 | 25.2 | Open | 99c/s | 1.6s | Apr. 2026 | 1,072 | 36.0 | 33.2 | - | - | - | 21.1 | 18.4 | 15.9 | 32.6 | 32.6 | 31.0 | - | 86.3% | - | - | - | 73.8% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3.5-9B | ✓ | 81.7% | - | - | - | - | - | - | - | 9 | Open | - | - | Mar. 2026 | - | 29.6 | 28.9 | - | - | 16.2 | 19.8 | 9.2 | 22.3 | 30.9 | 30.9 | 30.5 | - | 81.2% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meituan | 🇨🇳 | LongCat-Flash-Thinking | x | 81.5% | 90.6% | 59.4% | - | - | - | - | - | 560 | Open | - | - | Sep. 2025 | 791 | 36.0 | 34.2 | 19.4 | - | 23.7 | 16.6 | 21.6 | - | 36.6 | 32.5 | 32.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | o4-mini | ✓ | 81.4% | 92.7% | 68.1% | 14.7% | $1.10 | $4.40 | 200k | May 2024 | - | Closed | - | - | Apr. 2025 | - | 34.6 | 35.3 | 16.0 | 12.5 | 14.8 | 22.7 | 14.5 | - | - | - | 35.3 | - | - | 81.6% | 51.5% | 72.0% | - | - | - | - | - | - | - | 71.8% | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3-235B-A22B-Thinking-2507 | x | 81.1% | 92.3% | - | 18.2% | $0.30 | $3.00 | 262.1k | - | 235 | Open | 175c/s | 3.9s | Jul. 2025 | 358 | 34.8 | 38.4 | - | - | 18.1 | 20.7 | 13.0 | - | 42.1 | 43.1 | 41.5 | - | - | - | - | - | - | - | - | - | - | - | - | 67.8% | - | - | - | - | - | |
| ZAI | 🇨🇳 | GLM-4.6 | ✓ | 81.0% | 93.9% | 68.0% | 17.2% | $0.55 | $2.19 | 131.1k | - | 357 | Open | 80c/s | 2.2s | Sep. 2025 | 1,144 | 38.4 | 34.5 | 20.5 | 7.4 | - | 13.5 | - | - | - | - | - | - | - | - | 45.1% | - | - | - | - | - | - | - | 40.5% | - | - | - | - | - | - | |
| MiniMax | 🇨🇳 | MiniMax M2.1 | x | 81.0% | 81.0% | 67.0% | 22.0% | $0.30 | $1.20 | 1M | - | 230 | Open | 334c/s | 2.7s | Dec. 2025 | 908 | 41.5 | 36.7 | 27.6 | 19.1 | 18.2 | 18.6 | 21.1 | 20.1 | 51.5 | 51.5 | 51.0 | - | - | - | 62.0% | - | - | - | - | - | - | 43.5% | 47.9% | - | - | - | 39.0% | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-R1-0528 | x | 81.0% | 87.5% | 44.6% | 17.7% | $0.55 | $2.19 | 131.1k | - | 671 | Open | 58c/s | 3.2s | May 2025 | 360 | 31.4 | 33.4 | 12.6 | -14.0 | - | 14.6 | - | - | 40.6 | 40.6 | 40.3 | - | - | - | 8.9% | - | - | - | - | 92.3% | - | - | 5.7% | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude Opus 4.1 | ✓ | 80.9% | 78.0% | 74.5% | - | $15.00 | $75.00 | 200k | - | - | Closed | 124c/s | 2.6s | Aug. 2025 | 1,189 | 38.5 | 31.6 | 28.6 | - | 24.7 | 21.8 | 23.0 | - | - | - | 17.3 | - | 89.5% | - | - | - | - | - | - | - | - | - | 43.3% | 82.4% | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT OSS 120B High | x | 80.9% | 92.5% | - | - | $0.10 | $0.50 | 131.1k | - | 116.8 | Open | 191c/s | 20.8s | Aug. 2025 | 541 | 32.3 | 29.1 | - | - | - | - | 2.2 | - | 26.8 | 26.8 | 26.5 | - | 83.8% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meituan | 🇨🇳 | LongCat-Flash-Thinking-2601 | x | 80.5% | 99.6% | 70.0% | 25.2% | $0.30 | $1.20 | 128k | - | 560 | Open | 33c/s | 74.5s | Jan. 2026 | 529 | 46.4 | 42.0 | 26.1 | 18.7 | 38.0 | 22.9 | 34.2 | - | - | - | - | - | - | - | 56.6% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT OSS 120B | x | 80.1% | - | - | 14.9% | $0.09 | $0.45 | 131.1k | - | 116.8 | Open | 243c/s | 2.7s | Aug. 2025 | 335 | 30.9 | 32.2 | - | - | 10.3 | 10.5 | 8.4 | - | 32.0 | 32.1 | 35.2 | - | - | - | - | - | - | - | - | - | - | - | - | 67.8% | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V3.2-Exp | x | 79.9% | 89.3% | 67.8% | 19.8% | - | - | - | - | 685 | Open | - | - | Sep. 2025 | 750 | 36.1 | 33.8 | 22.3 | 1.9 | - | 16.5 | - | - | 39.1 | 39.1 | 38.8 | - | - | - | 40.1% | - | - | - | - | 97.1% | - | - | 37.7% | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude Opus 4 | ✓ | 79.6% | 75.5% | 72.5% | - | - | - | - | - | - | Closed | - | - | May 2025 | 932 | 35.6 | 27.4 | 22.8 | - | 25.0 | 21.2 | 23.2 | - | - | - | 9.3 | 8.6% | 88.8% | - | - | - | - | - | - | - | - | - | 39.2% | 81.4% | - | - | - | - | - | |
| ZAI | 🇨🇳 | GLM-4.5 | x | 79.1% | - | 64.2% | 14.4% | - | - | - | - | 355 | Open | - | - | Jul. 2025 | 744 | 34.5 | 34.6 | 21.3 | -4.2 | 25.2 | 8.8 | 25.8 | - | 42.0 | 37.9 | 37.7 | - | - | - | 26.4% | - | - | - | - | - | - | - | 37.5% | 79.7% | - | - | 41.7% | - | - | |
| OpenAI | 🇺🇸 | o1-pro | ✓ | 79.0% | - | - | - | - | - | - | Sep. 2023 | - | Closed | - | - | Dec. 2024 | - | 28.6 | 24.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Sarvam AI | 🇮🇳 | Sarvam-105B | x | 78.7% | 96.7% | 45.0% | 11.2% | - | - | - | - | 105 | Open | - | - | Mar. 2026 | - | 33.0 | 35.6 | 3.9 | 9.0 | - | 6.5 | - | - | 35.4 | 35.4 | 34.8 | - | - | - | 49.5% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| MiniMax | 🇨🇳 | MiniMax M2 | x | 78.0% | 78.0% | 69.4% | 12.5% | $0.30 | $1.20 | 1M | - | 230 | Open | 172c/s | 1.4s | Oct. 2025 | 834 | 34.3 | 25.9 | 23.9 | 5.3 | 19.0 | 6.7 | 14.5 | - | 29.9 | 29.9 | 29.6 | - | - | - | 44.0% | - | - | - | - | - | - | - | 46.3% | - | - | - | 36.0% | - | - | |
| OpenAI | 🇺🇸 | o1 | x | 78.0% | - | 41.0% | - | $15.00 | $60.00 | 200k | - | - | Closed | - | - | Dec. 2024 | - | 24.5 | 27.6 | 7.4 | - | 17.1 | 19.6 | 15.4 | - | 42.5 | 42.6 | 37.7 | - | 87.7% | 77.6% | - | - | - | - | - | 47.0% | - | - | - | 70.8% | 5.5% | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3-235B-A22B-Instruct-2507 | x | 77.5% | 70.3% | - | - | $0.15 | $0.80 | 262.1k | - | 235 | Open | - | - | Jul. 2025 | 147 | 28.4 | 29.1 | 7.2 | - | 13.5 | 13.0 | 8.5 | - | 35.2 | 35.7 | 36.8 | - | - | - | - | - | - | - | - | 54.3% | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | o3-mini | x | 77.2% | - | 49.3% | - | $1.10 | $4.40 | 200k | Sep. 2023 | - | Closed | - | - | Jan. 2025 | - | 21.9 | 30.7 | 8.3 | - | 9.0 | - | -2.8 | 2.3 | 23.3 | 23.3 | 22.8 | - | - | - | - | - | - | - | - | 15.0% | - | - | - | 57.6% | 9.2% | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3-Next-80B-A3B-Thinking | x | 77.2% | 87.8% | - | - | $0.15 | $1.50 | 65.5k | - | 80 | Open | 111c/s | 5.5s | Sep. 2025 | - | 30.1 | 32.2 | - | - | 16.1 | 18.3 | 14.1 | - | 34.8 | 33.5 | 35.1 | - | - | - | - | - | - | - | - | - | - | - | - | 69.6% | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3.5-4B | ✓ | 76.2% | - | - | - | - | - | - | - | 4 | Open | - | - | Mar. 2026 | - | 22.6 | 23.6 | - | - | 12.6 | 14.3 | 8.5 | 11.9 | 23.6 | 23.7 | 23.3 | - | 76.1% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Nvidia | 🇺🇸 | Llama 3.1 Nemotron Ultra 253B v1 | x | 76.0% | 72.5% | - | - | - | - | - | Dec. 2023 | 253 | Open | - | - | Apr. 2025 | - | 24.5 | 21.9 | 18.6 | - | - | - | 22.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| MoonshotAI | 🇨🇳 | Kimi K2 0905 | x | 75.8% | - | - | - | $0.60 | $2.50 | 262.1k | - | 1000 | Closed | 31c/s | 5.5s | Sep. 2025 | 1,003 | 26.0 | 28.3 | 30.1 | - | - | - | - | - | 35.1 | 35.1 | 34.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude Sonnet 4 | ✓ | 75.4% | 70.5% | 72.7% | - | - | - | - | - | - | Closed | 2c/s | - | May 2025 | 882 | 30.8 | 22.1 | 21.8 | - | 25.3 | 16.8 | 23.5 | - | - | - | 25.8 | - | 86.5% | 74.4% | - | - | - | - | - | - | - | - | 35.5% | 80.5% | - | - | - | - | - | |
| ZAI | 🇨🇳 | GLM-4.7-Flash | x | 75.2% | 91.6% | 59.2% | 14.4% | $0.07 | $0.40 | 128k | - | 30 | Open | 9c/s | 23.6s | Jan. 2026 | 759 | 31.9 | 29.4 | 9.0 | 3.9 | - | 8.0 | 10.1 | - | - | - | - | - | - | - | 42.8% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| MoonshotAI | 🇨🇳 | Kimi K2-Instruct-0905 | x | 75.1% | 49.5% | 65.8% | 4.7% | - | - | - | - | 1000 | Open | - | - | Sep. 2025 | - | 24.2 | 23.4 | 13.1 | - | 14.8 | -3.8 | 9.9 | - | 30.6 | 30.6 | 30.1 | - | - | - | - | - | - | - | - | 31.0% | - | - | 25.0% | - | - | - | - | - | - | |
| MoonshotAI | 🇨🇳 | Kimi K2 Instruct | x | 75.1% | 49.5% | - | 4.7% | $0.50 | $0.50 | 200k | - | 1000 | Open | 71c/s | 1.8s | Jul. 2025 | -882 | 25.6 | 25.7 | 17.3 | - | 16.1 | -5.4 | 11.4 | - | 31.4 | 31.5 | 31.0 | - | - | - | - | - | - | - | - | 31.0% | - | - | 30.0% | - | - | - | - | - | - | |
| Nvidia | 🇺🇸 | Nemotron 3 Nano (30B A3B) | x | 75.0% | 99.2% | 38.8% | 15.5% | $0.06 | $0.24 | 262.1k | Nov. 2025 | 32 | Open | 172c/s | 5.8s | Dec. 2025 | 187 | 24.7 | 28.2 | 2.9 | - | 5.4 | 10.9 | 1.2 | - | 18.1 | 18.2 | 17.7 | - | - | - | - | - | - | - | - | - | - | - | 8.5% | - | - | - | 33.3% | - | - | |
| ZAI | 🇨🇳 | GLM-4.5-Air | x | 75.0% | - | 57.6% | 10.6% | - | - | - | - | 106 | Open | - | - | Jul. 2025 | - | 30.6 | 29.3 | 16.8 | -7.8 | 25.1 | 3.7 | 24.5 | - | 35.5 | 28.5 | 28.2 | - | - | - | 21.3% | - | - | - | - | - | - | - | 30.0% | 77.9% | - | - | 37.3% | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V3.1 | x | 74.9% | 49.8% | 66.0% | 15.9% | $0.27 | $1.00 | 163.8k | - | 671 | Open | 23c/s | 1.9s | Jan. 2025 | - | 28.5 | 23.3 | 17.0 | 3.3 | - | 11.6 | - | - | 34.1 | 34.1 | 33.9 | - | - | - | 30.0% | - | - | - | - | 93.4% | - | - | 31.3% | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 30B A3B Thinking | ✓ | 74.4% | 83.1% | - | - | $0.20 | $1.00 | 262.1k | - | 31 | Open | 131c/s | 30.5s | Sep. 2025 | 78 | 21.7 | 26.3 | - | - | 13.2 | 16.8 | 6.0 | 17.6 | 28.1 | 28.4 | 28.9 | - | - | - | - | 56.6% | 63.0% | 57.3% | - | 23.9% | 30.6% | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT OSS 20B High | x | 74.2% | 98.7% | - | - | - | - | - | - | 20.9 | Open | - | - | Aug. 2025 | 230 | 38.0 | 45.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 2.0 Flash Thinking | ✓ | 74.2% | - | - | - | - | - | - | Aug. 2024 | - | Closed | - | - | Jan. 2025 | - | 22.3 | 13.8 | - | - | - | 19.4 | - | - | - | - | 29.5 | - | - | 75.4% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Inception | 🇺🇸 | Mercury 2 | x | 74.0% | 91.1% | - | - | $0.25 | $0.75 | 128k | - | - | Closed | 1,045c/s | 985ms | Feb. 2026 | 211 | 29.5 | 31.7 | 19.2 | - | 7.3 | - | 3.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 38.0% | - | - | |
| Baidu | 🇨🇳 | ERNIE 4.5 | x | 74.0% | - | - | - | $0.40 | $4.00 | 128k | - | 21 | Closed | - | - | Jun. 2025 | 124 | -18.2 | -21.1 | - | - | - | - | - | - | -22.3 | -22.3 | -22.6 | - | - | - | - | - | - | - | - | 1.8% | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek R1 Zero | x | 73.3% | - | - | - | - | - | - | - | 671 | Open | - | - | Jan. 2025 | - | 21.4 | 23.7 | 8.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | o1-preview | x | 73.3% | - | 41.3% | - | $15.00 | $60.00 | 128k | - | - | Closed | - | - | Sep. 2024 | - | 20.0 | 23.2 | 1.7 | - | - | - | - | - | 38.6 | 38.7 | 37.4 | - | - | - | - | - | - | - | - | 42.4% | - | - | - | - | - | - | - | - | - | |
| Meituan | 🇨🇳 | LongCat-Flash-Chat | x | 73.2% | 61.3% | 60.4% | - | $0.30 | $1.20 | 128k | - | 560 | Open | 163c/s | 5.6s | Aug. 2025 | 771 | 23.8 | 22.9 | 16.2 | - | 17.7 | - | 14.4 | - | 35.0 | 35.0 | 34.5 | - | - | - | - | - | - | - | - | - | - | - | 39.5% | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 32B Thinking | ✓ | 73.1% | 83.7% | - | - | - | - | - | - | 33 | Open | - | - | Sep. 2025 | - | 28.3 | 30.3 | - | - | 18.9 | 22.2 | 12.0 | 20.5 | 34.1 | 33.9 | 33.5 | - | - | - | - | 65.2% | 68.1% | 57.1% | - | 55.4% | 41.0% | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude Haiku 4.5 | ✓ | 73.0% | 80.7% | 73.3% | - | $1.00 | $5.00 | 200k | Feb. 2025 | - | Closed | 301c/s | 462ms | Oct. 2025 | 853 | 35.9 | 21.2 | 25.8 | - | 25.2 | 17.1 | 21.8 | 14.8 | - | - | -0.1 | - | 83.0% | - | - | - | - | - | - | - | 50.7% | - | 41.0% | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3-Next-80B-A3B-Instruct | x | 72.9% | 69.5% | - | - | $0.15 | $1.50 | 65.5k | - | 80 | Open | 264c/s | 2.4s | Sep. 2025 | - | 23.8 | 25.1 | 4.2 | - | 10.7 | 10.8 | 4.8 | - | 32.5 | 32.7 | 29.5 | - | - | - | - | - | - | - | - | - | - | - | - | 60.9% | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT OSS 20B | x | 71.5% | - | - | 10.9% | $0.10 | $0.50 | 131.1k | - | 20.9 | Open | 38c/s | 18.4s | Aug. 2025 | 551 | 20.2 | 22.6 | - | - | -5.1 | 4.0 | -7.2 | - | 16.7 | 16.7 | 21.6 | - | - | - | - | - | - | - | - | - | - | - | - | 54.8% | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5 nano | ✓ | 71.2% | 85.2% | - | 8.7% | - | - | - | May 2024 | - | Closed | - | - | Aug. 2025 | 663 | 24.8 | 25.6 | - | - | - | 1.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 9.6% | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 3 (14B Reasoning 2512) | ✓ | 71.2% | 85.0% | - | - | $0.20 | $0.20 | 262.1k | - | 14 | Open | - | - | Dec. 2025 | - | 27.7 | 29.6 | 17.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Small 4 | ✓ | 71.2% | 83.8% | - | - | $0.15 | $0.60 | 256k | - | 119 | Open | 303c/s | 1.2s | Mar. 2026 | 37 | 24.1 | 24.6 | 16.1 | - | - | 12.1 | - | 39.8 | 21.8 | 21.8 | 21.4 | - | - | - | - | - | 60.0% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Magistral Medium | ✓ | 70.8% | 64.9% | - | 9.0% | - | - | - | Jun. 2025 | 24 | Open | - | - | Jun. 2025 | - | 17.8 | 16.5 | 5.8 | - | - | 2.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 30B A3B Instruct | ✓ | 70.4% | 69.3% | - | - | $0.20 | $0.70 | 262.1k | - | 31 | Open | 299c/s | 1.9s | Sep. 2025 | 40 | 16.5 | 19.3 | - | - | 8.7 | 15.9 | 1.1 | 18.9 | 21.4 | 21.7 | 22.0 | - | - | - | - | 48.9% | 60.4% | 60.5% | - | 27.0% | 30.3% | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-4o | ✓ | 70.1% | - | 33.2% | 5.3% | $2.50 | $10.00 | 128k | - | - | Closed | 242c/s | 1.0s | Aug. 2024 | -417 | 17.4 | 15.3 | 1.3 | - | 6.7 | 12.6 | 5.8 | 15.5 | 19.7 | 19.7 | 20.2 | - | 81.4% | 72.2% | - | 58.8% | 59.9% | - | - | 38.2% | - | - | - | 60.3% | - | - | - | - | - | |
| MiniMax | 🇨🇳 | MiniMax M1 80K | x | 70.0% | 76.9% | 56.0% | 8.4% | $0.55 | $2.20 | 1M | - | 456 | Open | - | - | Jun. 2025 | - | 25.9 | 24.9 | 14.4 | - | 17.8 | 1.4 | 17.4 | 30.4 | 26.7 | 26.7 | 26.4 | - | - | - | - | - | - | - | - | 18.5% | - | - | - | 63.5% | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 8B Thinking | ✓ | 69.9% | 80.3% | - | - | $0.18 | $2.09 | 262.1k | - | 9 | Open | 185c/s | 1.0s | Sep. 2025 | -37 | 18.7 | 22.1 | - | - | 15.1 | 15.1 | -11.0 | 13.4 | 21.4 | 23.4 | 21.5 | - | - | - | - | 53.0% | 60.4% | 46.6% | - | 49.6% | 33.9% | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Llama 4 Maverick | ✓ | 69.8% | - | - | - | $0.17 | $0.60 | 1M | - | 400 | Open | - | - | Apr. 2025 | - | 15.4 | 22.8 | 5.7 | - | - | 18.7 | - | - | 23.5 | 23.5 | 25.4 | - | - | 73.4% | - | - | 59.6% | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-4.5 | ✓ | 69.5% | - | 38.0% | - | $75.00 | $150.00 | 128k | - | - | Closed | - | - | Feb. 2025 | - | 23.4 | 24.6 | 7.9 | - | 13.3 | 18.7 | 12.5 | 13.4 | 40.2 | 40.3 | 35.9 | - | 85.1% | 75.2% | - | 55.4% | - | - | - | 62.5% | - | - | - | 68.4% | - | - | - | - | - | |
| MiniMax | 🇨🇳 | MiniMax M1 40K | x | 69.2% | 74.6% | 55.6% | 7.2% | - | - | - | - | 456 | Open | - | - | Jun. 2025 | - | 24.7 | 22.6 | 12.8 | - | 17.1 | 0.6 | 15.7 | 32.8 | 25.7 | 25.7 | 25.4 | - | - | - | - | - | - | - | - | 17.9% | - | - | - | 67.8% | - | - | - | - | - | |
| Microsoft | 🇺🇸 | Phi 4 Reasoning Plus | x | 68.9% | 78.0% | - | - | - | - | - | Mar. 2025 | 14 | Open | - | - | Apr. 2025 | - | 18.3 | 22.3 | 9.8 | - | - | - | - | - | 19.5 | 19.5 | 19.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 32B Instruct | ✓ | 68.9% | 66.2% | - | - | - | - | - | - | 33 | Open | - | - | Sep. 2025 | - | 21.4 | 22.9 | - | - | 12.5 | 20.1 | 8.3 | 23.0 | 25.8 | 24.8 | 25.7 | - | - | - | - | 62.8% | 65.3% | 57.9% | - | - | 32.6% | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V3 0324 | x | 68.4% | - | - | - | $0.28 | $1.14 | 163.8k | - | 671 | Open | 77c/s | 3.3s | Mar. 2025 | 65 | 16.6 | 19.4 | 6.8 | - | - | - | - | - | 28.4 | 28.3 | 28.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Magistral Small 2506 | x | 68.2% | 62.8% | - | - | - | - | - | Jun. 2025 | 24 | Open | - | - | Jun. 2025 | - | 14.7 | 12.3 | 8.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude 3.5 Sonnet | ✓ | 67.2% | - | 49.0% | - | $3.00 | $15.00 | 200k | - | - | Closed | - | - | Oct. 2024 | - | 22.3 | 25.3 | 17.9 | - | 13.1 | 20.2 | 11.3 | - | 30.2 | 30.2 | 27.1 | - | - | 68.3% | - | - | - | - | - | - | - | - | - | 69.2% | - | - | - | - | - | |
| Meituan | 🇨🇳 | LongCat-Flash-Lite | x | 66.8% | 63.2% | 54.4% | - | $0.10 | $0.40 | 256k | - | 68.5 | Open | 248c/s | 7.3s | Feb. 2026 | 712 | 23.2 | 20.1 | 12.3 | - | 18.6 | - | 15.3 | - | 22.3 | 22.3 | 21.9 | - | - | - | - | - | - | - | - | - | - | - | 33.8% | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 3 (8B Reasoning 2512) | ✓ | 66.8% | 78.7% | - | - | $0.15 | $0.15 | 262.1k | - | 8 | Open | - | - | Dec. 2025 | - | 22.4 | 24.6 | 14.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Nvidia | 🇺🇸 | Llama-3.3 Nemotron Super 49B v1 | x | 66.7% | 58.4% | - | - | - | - | - | Dec. 2023 | 49.9 | Open | - | - | Mar. 2025 | - | 15.8 | 16.7 | - | - | 27.8 | - | 16.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Sarvam AI | 🇮🇳 | Sarvam-30B | x | 66.5% | 96.7% | 34.0% | - | - | - | - | - | 30 | Open | - | - | Mar. 2026 | - | 24.1 | 28.6 | 12.0 | 1.9 | - | - | - | - | 22.0 | 22.0 | 21.6 | - | - | - | 35.5% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-4.1 | ✓ | 66.3% | 46.4% | 54.6% | 5.4% | $2.00 | $8.00 | 1.0M | Jun. 2024 | - | Closed | 266c/s | 979ms | Apr. 2025 | 681 | 21.6 | 20.3 | 8.2 | - | 10.6 | 15.3 | 12.1 | 20.2 | 33.3 | 33.4 | 32.0 | - | 87.3% | 74.8% | - | 56.7% | - | - | - | - | - | - | - | 68.0% | - | - | - | - | - | |
| Nous Research | 🇺🇸 | Hermes 3 70B | x | 66.1% | - | - | - | - | - | - | - | 70 | Open | - | - | Aug. 2024 | 71 | -1.8 | -1.8 | - | - | 34.8 | - | - | - | 3.2 | 4.1 | 2.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 30B A3B | x | 65.8% | 70.9% | - | - | $0.10 | $0.44 | 128k | - | 30.5 | Open | 274c/s | 684ms | Apr. 2025 | 240 | 18.0 | 19.7 | 15.3 | - | 7.9 | - | 16.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Microsoft | 🇺🇸 | Phi 4 Reasoning | x | 65.8% | 62.9% | - | - | - | - | - | Mar. 2025 | 14 | Open | - | - | Apr. 2025 | - | 15.1 | 16.5 | 11.3 | - | - | - | - | - | 16.6 | 16.6 | 16.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek R1 Distill Llama 70B | x | 65.2% | - | - | - | $0.10 | $0.40 | 128k | - | 70.6 | Open | - | - | Jan. 2025 | - | 19.1 | 22.9 | 13.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | QwQ-32B | x | 65.2% | - | - | - | - | - | - | Nov. 2024 | 32.5 | Open | - | - | Mar. 2025 | - | 14.8 | 17.7 | 15.9 | - | - | - | 5.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | QwQ-32B-Preview | x | 65.2% | - | - | - | $0.15 | $0.60 | 32.8k | Nov. 2024 | 32.5 | Open | - | - | Nov. 2024 | - | 11.1 | 11.0 | 7.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-4.1 mini | ✓ | 65.0% | 40.2% | 23.6% | 3.7% | $0.40 | $1.60 | 1.0M | May 2024 | - | Closed | 240c/s | 716ms | Apr. 2025 | 949 | 15.9 | 16.5 | -1.1 | - | 2.7 | 14.7 | -0.5 | 13.6 | 25.0 | 25.1 | 26.0 | - | 78.5% | 72.7% | - | 56.8% | - | - | - | - | - | - | - | 55.8% | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 2.5 Flash-Lite | ✓ | 64.6% | 49.8% | 31.6% | 5.1% | $0.10 | $0.40 | 1.0M | Jan. 2025 | - | Open | - | - | Jun. 2025 | - | 12.8 | 10.2 | -2.1 | - | - | 9.1 | - | -13.7 | - | - | 23.0 | - | - | 72.9% | - | - | - | - | - | 10.7% | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 4B Thinking | ✓ | 64.1% | 74.5% | - | - | $0.10 | $1.00 | 262.1k | - | 4 | Open | 152c/s | 3.9s | Sep. 2025 | 37 | 15.4 | 18.2 | - | - | 12.2 | 12.0 | 2.9 | 11.9 | 19.5 | 18.8 | 18.0 | - | - | - | - | 50.3% | 57.0% | 49.2% | - | - | 31.4% | - | - | - | - | - | - | - | - | |
| Nvidia | 🇺🇸 | Nemotron Nano 9B v2 | x | 64.0% | 72.1% | - | - | - | - | - | Sep. 2024 | 8.9 | Open | - | - | Aug. 2025 | - | 22.8 | 22.9 | 21.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek R1 Distill Qwen 32B | x | 62.1% | - | - | - | $0.12 | $0.18 | 128k | - | 32.8 | Open | - | - | Jan. 2025 | - | 17.0 | 20.3 | 13.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 2.0 Flash | ✓ | 62.1% | - | - | - | $0.10 | $0.40 | 1.0M | Aug. 2024 | - | Closed | - | - | Dec. 2024 | - | 16.5 | 26.5 | 3.2 | - | - | 13.7 | - | 16.0 | 20.0 | 20.0 | 22.7 | - | - | 70.7% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 Max | x | 62.0% | 81.6% | 69.6% | - | $0.50 | $5.00 | 256k | - | 1000 | Closed | 84c/s | 1.3s | Dec. 2025 | 684 | 29.0 | 31.0 | 17.9 | - | - | - | 5.7 | - | 39.6 | 39.8 | 38.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | o1-mini | x | 60.0% | - | - | - | $3.00 | $12.00 | 128k | - | - | Closed | - | - | Sep. 2024 | - | 11.6 | 13.0 | 22.7 | - | - | - | - | - | 17.0 | 17.1 | 16.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude 3.5 Sonnet | ✓ | 59.4% | - | - | - | $3.00 | $15.00 | 200k | - | - | Closed | - | - | Jun. 2024 | - | 18.3 | 26.4 | 21.7 | - | - | - | - | - | 30.1 | 30.1 | 29.5 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek R1 Distill Qwen 14B | x | 59.1% | - | - | - | - | - | - | - | 14.8 | Open | - | - | Jan. 2025 | - | 13.9 | 16.8 | 10.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V3 | x | 59.1% | - | 42.0% | - | $0.27 | $1.10 | 131.1k | - | 671 | Open | - | - | Dec. 2024 | - | 15.9 | 20.4 | 8.9 | - | - | - | - | 6.5 | 25.4 | 25.5 | 25.0 | - | - | - | - | - | - | - | - | 24.9% | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 1.5 Pro | ✓ | 59.1% | - | - | - | $2.50 | $10.00 | 2.1M | Nov. 2023 | - | Closed | - | - | May 2024 | - | 10.1 | 17.9 | 3.5 | - | - | 14.7 | - | 25.3 | 21.0 | 21.1 | 20.5 | - | - | 65.9% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 4 E4B | ✓ | 58.6% | - | - | - | - | - | - | Jan. 2025 | 8 | Open | - | - | Apr. 2026 | - | 12.5 | 14.7 | - | - | - | 10.7 | -1.6 | 7.2 | 11.6 | 11.6 | 12.5 | - | 76.6% | - | - | - | 52.6% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Llama 4 Scout | ✓ | 57.2% | - | - | - | $0.08 | $0.30 | 10M | - | 109 | Open | - | - | Apr. 2025 | - | 6.0 | 14.3 | -0.5 | - | - | 15.3 | - | - | 12.8 | 12.9 | 15.8 | - | - | 69.4% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Microsoft | 🇺🇸 | Phi 4 | x | 56.1% | - | - | - | $0.07 | $0.14 | 16k | Jun. 2024 | 14.7 | Open | - | - | Dec. 2024 | - | 4.8 | 13.7 | 2.2 | - | - | - | - | - | 15.5 | 15.5 | 15.2 | - | - | - | - | - | - | - | - | 3.0% | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-2 | ✓ | 56.0% | - | - | - | $2.00 | $10.00 | 128k | - | - | Closed | - | - | Aug. 2024 | - | 12.3 | 20.9 | 13.6 | - | - | 13.2 | - | - | 23.7 | 23.8 | 22.3 | - | - | 66.1% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Nvidia | 🇺🇸 | Llama 3.1 Nemotron Nano 8B V1 | x | 54.1% | 47.1% | - | - | - | - | - | Dec. 2023 | 8 | Open | - | - | Mar. 2025 | - | 5.9 | 12.6 | - | - | 7.3 | - | -1.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-4o | ✓ | 53.6% | - | - | - | $2.50 | $10.00 | 128k | - | - | Closed | 425c/s | 847ms | May 2024 | 136 | 13.3 | 20.5 | 19.4 | - | - | 6.9 | - | - | 24.4 | 24.4 | 23.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 3 (3B Reasoning 2512) | ✓ | 53.4% | 72.1% | - | - | $0.10 | $0.10 | 131.1k | - | 3 | Open | - | - | Dec. 2025 | - | 15.0 | 17.1 | 11.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Microsoft | 🇺🇸 | Phi 4 Mini Reasoning | x | 52.0% | - | - | - | - | - | - | Feb. 2025 | 3.8 | Open | - | - | Apr. 2025 | - | 9.2 | 16.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3.5-2B | ✓ | 51.6% | - | - | - | - | - | - | - | 2 | Open | - | - | Mar. 2026 | - | -0.5 | 4.2 | - | - | -6.1 | -4.6 | -4.1 | 0.4 | 4.4 | 4.4 | 4.0 | - | 63.1% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 2.0 Flash-Lite | ✓ | 51.5% | - | - | - | $0.07 | $0.30 | 1.0M | Jun. 2024 | - | Closed | - | - | Feb. 2025 | - | 12.3 | 20.7 | - | - | - | 6.2 | - | -2.5 | 13.7 | 13.8 | 16.9 | - | - | 68.0% | - | - | - | - | - | 21.7% | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 1.5 Flash | ✓ | 51.0% | - | - | - | $0.15 | $0.60 | 1.0M | Nov. 2023 | - | Closed | - | - | May 2024 | - | 4.1 | 11.1 | -3.9 | - | - | 10.7 | - | 21.4 | 7.2 | 7.2 | 10.5 | - | - | 62.3% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-2 mini | ✓ | 51.0% | - | - | - | - | - | - | - | - | Closed | - | - | Aug. 2024 | - | 8.4 | 17.9 | 6.9 | - | - | 10.4 | - | - | 19.9 | 20.0 | 19.0 | - | - | 63.2% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Llama 3.1 405B Instruct | x | 50.7% | - | - | - | $0.89 | $0.89 | 128k | - | 405 | Open | - | - | Jul. 2024 | - | 14.0 | 22.1 | 16.5 | - | - | - | 40.5 | - | 22.4 | 22.4 | 22.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Llama 3.3 70B Instruct | x | 50.5% | - | - | - | $0.20 | $0.20 | 128k | - | 70 | Open | - | - | Dec. 2024 | - | 12.1 | 19.7 | 13.2 | - | - | - | 29.2 | - | 17.4 | 17.4 | 17.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude 3 Opus | ✓ | 50.4% | - | - | - | $15.00 | $75.00 | 200k | - | - | Closed | - | - | Feb. 2024 | - | 8.8 | 17.6 | 5.5 | - | - | - | - | - | 18.4 | 18.5 | 18.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-4.1 nano | ✓ | 50.3% | - | - | - | $0.10 | $0.40 | 1.0M | May 2024 | - | Closed | 355c/s | 771ms | Apr. 2025 | 73 | 1.1 | 5.6 | -18.7 | - | -19.7 | 1.7 | -20.5 | 3.8 | 5.4 | 5.4 | 6.3 | - | 66.9% | 55.4% | - | 40.5% | - | - | - | - | - | - | - | 22.6% | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5 32B Instruct | x | 49.5% | - | - | - | - | - | - | - | 32.5 | Open | - | - | Sep. 2024 | - | 5.2 | 18.1 | 12.4 | - | - | - | - | - | 10.3 | 10.5 | 10.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek R1 Distill Qwen 7B | x | 49.1% | - | - | - | - | - | - | - | 7.6 | Open | - | - | Jan. 2025 | - | 11.0 | 18.6 | 4.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek R1 Distill Llama 8B | x | 49.0% | - | - | - | - | - | - | - | 8.0 | Open | - | - | Jan. 2025 | - | 8.2 | 13.4 | 4.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5 72B Instruct | x | 49.0% | - | - | - | $0.35 | $0.40 | 131.1k | - | 72.7 | Open | - | - | Sep. 2024 | - | 12.0 | 19.5 | 11.7 | - | 31.0 | - | - | - | 13.5 | 13.5 | 13.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| MoonshotAI | 🇨🇳 | Kimi K2 Base | x | 48.1% | - | - | - | - | - | - | - | 1000 | Open | - | - | Jul. 2025 | - | 13.8 | 16.2 | 17.2 | - | - | - | - | - | 20.3 | 20.4 | 19.9 | - | - | - | - | - | - | - | - | 35.3% | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-4 Turbo | x | 48.0% | - | - | - | $10.00 | $30.00 | 128k | Dec. 2023 | - | Closed | 65c/s | 3.1s | Apr. 2024 | 104 | 9.2 | 20.1 | 8.6 | - | - | - | - | - | 23.1 | 23.1 | 22.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 235B A22B | x | 47.5% | 81.5% | - | - | $0.10 | $0.10 | 128k | - | 235 | Open | - | - | Apr. 2025 | - | 15.8 | 22.2 | 16.2 | - | - | - | 22.3 | - | 18.2 | 18.2 | 17.7 | - | 86.7% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Amazon | 🇺🇸 | Nova Pro | ✓ | 46.9% | - | - | - | $0.80 | $3.20 | 300k | - | - | Closed | - | - | Nov. 2024 | - | 11.8 | 21.1 | 15.3 | 20.6 | - | 10.7 | 13.9 | 10.9 | 21.0 | 19.5 | 17.0 | - | - | 61.7% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Llama 3.2 90B Instruct | ✓ | 46.7% | - | - | - | $0.35 | $0.40 | 128k | - | 90 | Open | - | - | Sep. 2024 | - | 6.3 | 13.2 | - | - | - | 6.9 | - | - | 20.8 | 20.9 | 17.3 | - | - | 60.3% | - | - | 45.2% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Small 3.2 24B Instruct | ✓ | 46.1% | - | - | - | - | - | - | Oct. 2023 | 23.6 | Open | - | - | Jun. 2025 | - | 7.5 | 12.3 | - | - | 9.0 | 15.1 | - | - | 11.5 | 11.6 | 13.1 | - | - | 62.5% | - | - | - | - | - | 12.1% | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Small 3.1 24B Instruct | ✓ | 46.0% | - | - | - | - | - | - | - | 24 | Open | - | - | Mar. 2025 | - | 2.9 | 9.0 | 13.8 | - | - | 3.5 | - | - | 9.2 | 9.2 | 9.8 | - | - | 59.3% | - | - | - | - | - | 10.4% | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5 VL 32B Instruct | ✓ | 46.0% | - | - | - | - | - | - | - | 33.5 | Open | - | - | Feb. 2025 | - | 8.1 | 12.7 | 19.6 | - | - | 10.6 | - | 10.2 | 7.6 | 7.7 | 12.6 | - | - | 70.0% | - | - | 49.5% | 39.4% | - | - | 5.9% | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5 14B Instruct | x | 45.5% | - | - | - | - | - | - | - | 14.7 | Open | - | - | Sep. 2024 | - | 0.9 | 11.1 | 2.9 | - | - | - | - | - | 5.3 | 6.3 | 6.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Small 3 24B Instruct | x | 45.3% | - | - | - | $0.07 | $0.14 | 32k | Oct. 2023 | 24 | Open | - | - | Jan. 2025 | - | 2.6 | 8.8 | 4.6 | - | 10.0 | - | - | - | 5.1 | 5.1 | 4.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Large 3 (675B Instruct 2512) | ✓ | 43.9% | - | - | - | $0.50 | $1.50 | 262.1k | - | 675 | Open | 267c/s | 734ms | Dec. 2025 | 644 | 9.2 | 19.9 | 0.1 | - | - | - | - | - | - | - | - | - | 85.5% | - | - | - | - | - | - | 23.8% | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Large 3 (675B Base) | ✓ | 43.9% | - | - | - | - | - | - | - | 675 | Open | - | - | Dec. 2025 | - | 10.3 | 26.3 | 1.9 | - | - | - | - | - | - | - | - | - | 85.5% | - | - | - | - | - | - | 23.8% | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Large 3 (675B Instruct 2512 Eagle) | ✓ | 43.9% | - | - | - | - | - | - | - | 675 | Open | - | - | Dec. 2025 | - | 11.7 | 24.0 | 1.3 | - | - | - | - | - | - | - | - | - | 85.5% | - | - | - | - | - | - | 23.8% | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Large 3 (675B Instruct 2512 NVFP4) | ✓ | 43.9% | - | - | - | - | - | - | - | 675 | Open | - | - | Dec. 2025 | - | 8.1 | 22.0 | 0.7 | - | - | - | - | - | - | - | - | - | 85.5% | - | - | - | - | - | - | 23.8% | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 4 E2B | ✓ | 43.4% | - | - | - | - | - | - | Jan. 2025 | 5.1 | Open | - | - | Apr. 2026 | - | 3.2 | 6.9 | - | - | - | 6.3 | -11.0 | -1.4 | 1.3 | 1.4 | 3.1 | - | 67.4% | - | - | - | 44.2% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3 27B | ✓ | 42.4% | - | - | - | $0.10 | $0.20 | 131.1k | - | 27 | Open | - | - | Mar. 2025 | - | 7.0 | 19.5 | 4.9 | - | - | -0.3 | - | - | 8.3 | 8.3 | 7.4 | - | - | - | - | - | - | - | - | 10.0% | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2 72B Instruct | x | 42.4% | - | - | - | - | - | - | - | 72 | Open | - | - | Jul. 2024 | - | 3.5 | 11.5 | 11.8 | - | - | - | - | - | 8.7 | 7.1 | 6.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Amazon | 🇺🇸 | Nova Lite | ✓ | 42.0% | - | - | - | $0.06 | $0.24 | 300k | - | - | Closed | - | - | Nov. 2024 | - | 5.7 | 15.0 | 5.7 | 11.6 | - | 6.0 | 10.3 | 5.4 | 10.3 | 8.2 | 8.1 | - | - | 56.2% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Llama 3.1 70B Instruct | x | 41.7% | - | - | - | $0.20 | $0.20 | 128k | - | 70 | Open | - | - | Jul. 2024 | - | 4.4 | 13.0 | 1.6 | - | - | - | 32.2 | - | 11.4 | 11.5 | 11.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude 3.5 Haiku | x | 41.6% | - | 40.6% | - | $0.80 | $4.00 | 200k | - | - | Closed | - | - | Oct. 2024 | - | 5.3 | 12.1 | 7.2 | - | -8.2 | - | -10.0 | - | 4.2 | 4.2 | 3.9 | - | - | - | - | - | - | - | - | - | - | - | - | 51.0% | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3 12B | ✓ | 40.9% | - | - | - | $0.05 | $0.10 | 131.1k | - | 12 | Open | - | - | Mar. 2025 | - | 4.0 | 14.4 | 2.0 | - | - | -1.8 | - | - | 2.7 | 2.7 | 2.0 | - | - | - | - | - | - | - | - | 6.3% | - | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude 3 Sonnet | ✓ | 40.4% | - | - | - | $3.00 | $15.00 | 200k | - | - | Closed | - | - | Feb. 2024 | - | -0.7 | 7.8 | -5.0 | - | - | - | - | - | 4.2 | 4.2 | 3.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini Diffusion | x | 40.4% | 23.3% | 22.9% | - | - | - | - | - | - | Closed | - | - | May 2025 | - | 0.5 | -3.5 | 5.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-4o mini | ✓ | 40.2% | - | 8.7% | - | $0.15 | $0.60 | 128k | Oct. 2023 | - | Closed | - | - | Jul. 2024 | - | 3.0 | 12.8 | 1.3 | - | - | 3.9 | - | - | 12.2 | 12.3 | 12.4 | - | - | 59.4% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Amazon | 🇺🇸 | Nova Micro | x | 40.0% | - | - | - | $0.03 | $0.14 | 128k | - | - | Closed | - | - | Nov. 2024 | - | -0.7 | 9.4 | 0.5 | -1.8 | - | - | -4.0 | -5.1 | 1.9 | 2.2 | 1.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 1.5 Flash 8B | ✓ | 38.4% | - | - | - | $0.07 | $0.30 | 1.0M | Oct. 2024 | 8 | Closed | - | - | Mar. 2024 | - | -2.3 | 2.3 | - | - | - | 1.3 | - | 15.9 | 1.6 | 1.6 | 3.9 | - | - | 53.7% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Small 3.1 24B Base | ✓ | 37.5% | - | - | - | $0.10 | $0.30 | 128k | - | 24 | Open | - | - | Mar. 2025 | - | -1.2 | 5.7 | - | - | - | 4.1 | - | - | 6.7 | 6.8 | 8.5 | - | - | 59.3% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| AI21 Labs | 🇮🇱 | Jamba 1.5 Large | x | 36.9% | - | - | - | $2.00 | $8.00 | 256k | Mar. 2024 | 398 | Open | - | - | Aug. 2024 | - | -1.2 | 4.9 | - | - | -0.2 | - | - | - | 6.1 | 6.2 | 5.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Microsoft | 🇺🇸 | Phi-3.5-MoE-instruct | x | 36.8% | - | - | - | - | - | - | - | 60 | Open | - | - | Aug. 2024 | - | -1.9 | 4.0 | -8.3 | - | - | - | - | 16.2 | 3.9 | 3.9 | 3.6 | - | 69.9% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5 7B Instruct | x | 36.4% | - | - | - | $0.30 | $0.30 | 131.1k | - | 7.6 | Open | - | - | Sep. 2024 | - | 0.5 | 8.7 | 1.7 | - | 21.5 | - | - | - | 0.7 | 0.7 | 0.5 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-1.5 | x | 35.9% | - | - | - | - | - | - | - | - | Closed | - | - | Mar. 2024 | - | -4.5 | 5.2 | -4.7 | - | - | -1.9 | - | - | 6.1 | 6.1 | 5.6 | - | - | 53.6% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-4 | ✓ | 35.7% | - | - | - | $30.00 | $60.00 | 32.8k | Dec. 2022 | - | Closed | 34c/s | 1.5s | Jun. 2023 | - | -0.1 | 9.7 | -9.6 | - | - | - | - | - | 22.6 | 22.6 | 22.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Small 3 24B Base | ✓ | 34.4% | - | - | - | - | - | - | Oct. 2023 | 23.6 | Open | - | - | Jan. 2025 | - | -3.2 | 4.0 | - | - | - | - | - | - | 6.3 | 9.2 | 6.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek R1 Distill Qwen 1.5B | x | 33.8% | - | - | - | - | - | - | - | 1.8 | Open | - | - | Jan. 2025 | - | -4.0 | 7.4 | -7.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude 3 Haiku | ✓ | 33.3% | - | - | - | $0.25 | $1.25 | 200k | - | - | Closed | - | - | Mar. 2024 | - | -4.3 | 4.2 | -1.8 | - | - | - | - | - | 0.3 | 0.3 | -0.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Llama 3.2 11B Instruct | ✓ | 32.8% | - | - | - | $0.05 | $0.05 | 128k | Dec. 2023 | 10.6 | Open | - | - | Sep. 2024 | - | -3.1 | 1.2 | - | - | - | 2.6 | - | - | -1.6 | -1.6 | -1.0 | - | - | 50.7% | - | - | 33.0% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Llama 3.2 3B Instruct | x | 32.8% | - | - | - | $0.01 | $0.02 | 128k | - | 3.2 | Open | - | - | Sep. 2024 | - | -10.5 | -5.4 | - | - | - | - | 8.4 | - | -13.9 | -13.9 | -14.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| AI21 Labs | 🇮🇱 | Jamba 1.5 Mini | x | 32.3% | - | - | - | $0.20 | $0.40 | 256.1k | Mar. 2024 | 52 | Open | - | - | Aug. 2024 | - | -9.1 | -5.8 | - | - | -9.5 | - | - | - | -5.7 | -5.7 | -6.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3 4B | ✓ | 30.8% | - | - | - | $0.02 | $0.04 | 131.1k | Aug. 2024 | 4 | Open | - | - | Mar. 2025 | - | -6.4 | 5.3 | -10.3 | - | - | -12.4 | - | - | -9.8 | -9.7 | -9.8 | - | - | - | - | - | - | - | - | 4.0% | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-3.5 Turbo | x | 30.8% | - | - | - | $0.50 | $1.50 | 16.4k | Sep. 2021 | - | Closed | 149c/s | 1.3s | Mar. 2023 | 538 | -12.6 | -3.8 | -10.0 | - | - | -20.1 | - | - | -5.2 | -5.2 | -8.2 | - | - | 0.0% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5-Omni-7B | ✓ | 30.8% | - | - | - | - | - | - | - | 7 | Open | - | - | Mar. 2025 | - | -1.0 | 5.3 | -1.6 | - | -11.3 | 8.9 | - | 1.9 | -6.9 | -6.9 | 1.4 | - | - | 59.2% | - | - | 36.6% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Llama 3.1 8B Instruct | x | 30.4% | - | - | - | $0.03 | $0.03 | 131.1k | Dec. 2023 | 8 | Open | - | - | Jul. 2024 | - | -7.3 | -3.8 | -6.2 | - | - | - | 23.9 | - | -4.1 | -4.1 | -4.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Microsoft | 🇺🇸 | Phi-3.5-mini-instruct | x | 30.4% | - | - | - | $0.10 | $0.10 | 128k | - | 3.8 | Open | - | - | Aug. 2024 | - | -9.1 | -3.3 | -13.5 | - | - | - | - | 14.4 | -0.4 | -0.3 | -0.7 | - | 55.4% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 1.0 Pro | x | 27.9% | - | - | - | $0.50 | $1.50 | 32.8k | Feb. 2024 | - | Closed | - | - | Feb. 2024 | - | -12.3 | -3.4 | - | - | - | -7.3 | - | -12.2 | -2.7 | -2.7 | -3.1 | - | - | 47.9% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2 7B Instruct | x | 25.3% | - | - | - | - | - | - | - | 7.6 | Open | - | - | Jul. 2024 | - | -7.6 | -2.7 | -1.3 | - | 15.6 | - | - | - | -6.4 | -5.1 | -5.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Microsoft | 🇺🇸 | Phi 4 Mini | x | 25.2% | - | - | - | - | - | - | Jun. 2024 | 3.8 | Open | - | - | Feb. 2025 | - | -8.4 | 1.8 | - | - | - | - | - | - | 0.7 | 0.7 | 0.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3n E2B Instructed | ✓ | 24.8% | 6.7% | - | - | - | - | - | Jun. 2024 | 8 | Closed | - | - | Jun. 2025 | - | -15.6 | -10.5 | -8.6 | - | - | - | - | - | -13.9 | -13.9 | -14.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3n E2B Instructed LiteRT (Preview) | ✓ | 24.8% | 6.7% | - | - | - | - | - | Jun. 2024 | 1.9 | Open | - | - | May 2025 | - | -17.7 | -12.4 | -11.0 | -16.8 | - | - | - | - | -17.3 | -17.3 | -17.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3n E4B Instructed | ✓ | 23.7% | 11.6% | - | - | $20.00 | $40.00 | 32k | Jun. 2024 | 8 | Closed | - | - | Jun. 2025 | - | -8.6 | -2.0 | -3.7 | - | - | - | - | - | -5.5 | -5.5 | -5.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3n E4B Instructed LiteRT Preview | ✓ | 23.7% | 11.6% | - | - | - | - | - | Jun. 2024 | 1.9 | Open | - | - | May 2025 | - | -11.5 | -3.6 | -5.7 | 2.7 | - | - | - | - | -7.0 | -7.0 | -7.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3 1B | x | 19.2% | - | - | - | - | - | - | - | 1 | Open | - | - | Mar. 2025 | - | -20.9 | -12.7 | -18.5 | - | - | - | - | - | -27.8 | -27.8 | -28.1 | - | - | - | - | - | - | - | - | 2.2% | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3.5-0.8B | ✓ | 11.9% | - | - | - | - | - | - | - | 0.8 | Open | - | - | Mar. 2026 | - | -15.3 | -8.3 | - | - | -12.7 | -18.3 | -20.8 | -15.3 | -7.5 | -7.5 | -7.7 | - | 44.3% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| ZAI | 🇨🇳 | GLM-5 | x | - | - | 77.8% | - | $1.00 | $3.20 | 200k | - | 744 | Open | 86c/s | 9.6s | Feb. 2026 | 1,581 | 52.1 | - | 37.3 | 26.5 | - | - | 26.7 | - | - | - | - | - | - | - | 75.9% | - | - | - | 67.8% | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.3 Codex | ✓ | - | - | - | - | $1.75 | $14.00 | 400k | - | - | Closed | 122c/s | 1.3s | Feb. 2026 | 1,243 | 56.1 | - | 43.9 | - | - | 27.9 | 37.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 56.8% | |
| OpenAI | 🇺🇸 | GPT-5.2 Codex | ✓ | - | - | - | - | $1.75 | $14.00 | 400k | - | - | Closed | 96c/s | 8.6s | Jan. 2026 | 1,160 | 52.7 | - | 40.2 | - | - | - | 27.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 56.4% | |
| MiniMax | 🇨🇳 | MiniMax M2.5 | x | - | - | 80.2% | - | $0.30 | $1.20 | 1M | - | 230 | Open | 543c/s | 2.5s | Feb. 2026 | 955 | 53.1 | - | 38.9 | 27.0 | - | - | - | - | -1.6 | - | - | - | - | - | 76.3% | - | - | - | - | - | - | - | - | - | - | - | - | - | 55.4% | |
| xAI | 🇺🇸 | Grok-4.20 Beta Non-Reasoning | ✓ | - | - | - | - | $2.00 | $6.00 | 2M | - | - | Closed | 390c/s | 2.3s | Mar. 2026 | 986 | - | - | - | - | - | - | - | 13.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.1 Medium | ✓ | - | 98.4% | - | - | $1.25 | $10.00 | 400k | - | - | Closed | 303c/s | 9.3s | Nov. 2025 | 884 | 46.1 | 44.5 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V3.2-Speciale | x | - | 96.0% | 73.1% | 30.6% | - | - | - | - | 685 | Open | - | - | Dec. 2025 | 832 | 43.0 | 46.2 | 23.5 | - | - | 25.4 | 13.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 35.2% | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V3.2 (Non-thinking) | x | - | - | - | - | $0.28 | $0.42 | 131.1k | - | 685 | Open | 144c/s | 1.1s | Dec. 2025 | 705 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.1 Codex High | ✓ | - | 96.7% | - | - | $1.25 | $10.00 | 400k | - | - | Closed | 314c/s | 11.4s | Nov. 2025 | 719 | 43.6 | 42.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.1 Codex | ✓ | - | - | 73.7% | - | $1.25 | $10.00 | 400k | Sep. 2024 | - | Closed | 235c/s | 3.3s | Nov. 2025 | 710 | 43.2 | - | 29.4 | - | - | - | 17.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-4.20 Multi-Agent Beta | ✓ | - | - | - | - | - | - | - | - | - | Closed | - | - | Mar. 2026 | 660 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| StepFun | 🇨🇳 | Step-3.5-Flash | x | - | 97.3% | 74.4% | - | $0.10 | $0.40 | 65.5k | - | 196 | Open | 154c/s | 12.3s | Feb. 2026 | 591 | 49.8 | 47.2 | 28.9 | 22.4 | - | - | 19.1 | - | - | - | - | - | - | - | 69.0% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3-Coder | x | - | - | - | - | $0.18 | $0.18 | 256k | - | 480 | Open | 105c/s | 1.1s | Jan. 2025 | 585 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-4 Fast Reasoning | ✓ | - | - | - | - | $0.20 | $0.50 | 2M | - | - | Closed | 110c/s | 15.8s | Aug. 2025 | 556 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| MiniMax | 🇨🇳 | MiniMax M2.7 | x | - | - | - | - | $0.30 | $1.20 | 204.8k | - | - | Open | 159c/s | 4.7s | Mar. 2026 | 831 | 53.9 | - | 40.1 | - | - | - | 24.3 | - | 31.2 | 31.5 | - | - | - | - | - | - | - | - | - | - | - | 46.3% | - | - | - | - | - | - | 56.2% | |
| xAI | 🇺🇸 | Grok-4.1 Fast Non-Reasoning | ✓ | - | - | - | - | $0.20 | $0.50 | 2M | - | - | Closed | 246c/s | 2.2s | Nov. 2025 | 572 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-4.1 Fast Reasoning | ✓ | - | - | - | - | $0.20 | $0.50 | 2M | - | - | Closed | 93c/s | 28.4s | Nov. 2025 | 554 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-4.20 Beta Reasoning | ✓ | - | - | - | - | $2.00 | $6.00 | 2M | - | - | Closed | 106c/s | 25.7s | Mar. 2026 | 571 | - | - | - | - | - | - | - | 29.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-4 Fast Non-Reasoning | ✓ | - | - | - | - | $0.20 | $0.50 | 2M | - | - | Closed | 267c/s | 965ms | Aug. 2025 | 520 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok Code Fast 1 | x | - | - | 70.8% | - | $0.20 | $1.50 | 256k | - | - | Closed | 208c/s | 9.0s | Aug. 2025 | 461 | 32.6 | - | 19.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.3 Chat | ✓ | - | - | - | - | $1.75 | $14.00 | 128k | Aug. 2025 | - | Closed | 187c/s | 1.6s | Mar. 2026 | 552 | - | - | - | - | - | - | - | - | - | - | 29.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 32B | x | - | 72.9% | - | - | $0.10 | $0.44 | 128k | - | 32.8 | Open | 177c/s | 839ms | Apr. 2025 | 314 | 18.8 | 22.4 | 13.1 | - | - | - | 19.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Xiaomi | 🇨🇳 | MiMo-V2-Pro | x | - | - | 78.0% | - | $1.00 | $3.00 | 1M | - | 1000 | Closed | - | - | Mar. 2026 | 215 | 51.8 | - | 35.4 | 33.1 | 28.3 | - | 26.3 | - | 26.9 | 27.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.1 Codex Mini | ✓ | - | 42.1% | - | - | - | - | - | - | - | Closed | - | - | Nov. 2025 | 210 | 0.6 | 0.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Xiaomi | 🇨🇳 | MiMo-V2-Omni | ✓ | - | - | 74.8% | - | $0.40 | $2.00 | 262k | - | - | Closed | - | - | Mar. 2026 | 191 | 45.5 | - | 28.6 | - | - | - | - | - | 24.4 | 24.5 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3-Coder 480B A35B Instruct | x | - | - | 69.6% | - | - | - | - | - | 480 | Open | - | - | Jan. 2025 | 46 | 27.4 | - | 16.8 | - | 21.6 | - | 14.5 | - | 22.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 77.5% | - | - | - | - | - | |
| Mistral | 🇫🇷 | Codestral-22B | x | - | - | - | - | - | - | - | - | 22.2 | Open | - | - | May 2024 | - | -1.8 | - | 1.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Cohere | 🇨🇦 | Command R+ | x | - | - | - | - | $0.25 | $1.00 | 128k | - | 104 | Open | - | - | Aug. 2024 | - | -3.4 | -3.1 | - | - | - | - | - | - | 1.2 | 1.2 | 0.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-R1 | x | - | - | - | - | $0.55 | $2.19 | 131.1k | - | 671 | Open | - | - | Jan. 2025 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V2.5 | x | - | - | 16.8% | - | $0.14 | $0.28 | 8.2k | - | 236 | Open | - | - | May 2024 | - | 8.0 | 15.9 | 10.7 | - | 24.5 | - | - | - | 7.7 | 7.8 | 7.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek VL2 | ✓ | - | - | - | - | - | - | - | - | 27 | Open | - | - | Dec. 2024 | - | -2.2 | 9.9 | - | - | - | 6.1 | - | - | - | - | -1.1 | - | - | 51.1% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek VL2 Small | ✓ | - | - | - | - | - | - | - | - | 16 | Open | - | - | Dec. 2024 | - | -5.0 | 7.2 | - | - | - | 3.9 | - | - | - | - | -3.5 | - | - | 48.0% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek VL2 Tiny | ✓ | - | - | - | - | - | - | - | - | 3 | Open | - | - | Dec. 2024 | - | -13.4 | 0.8 | - | - | - | -2.3 | - | - | - | - | -9.5 | - | - | 40.7% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Devstral Medium | x | - | - | 61.6% | - | $0.40 | $2.00 | 128k | - | - | Closed | - | - | Jul. 2025 | - | 24.0 | - | 11.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Devstral Small 1.1 | x | - | - | 53.6% | - | $0.10 | $0.30 | 128k | - | 24 | Open | - | - | Jul. 2025 | - | 17.9 | - | 5.5 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 2 27B | x | - | - | - | - | - | - | - | - | 27.2 | Open | - | - | Jun. 2024 | - | -4.9 | -2.2 | -14.6 | 23.8 | - | - | - | - | -0.3 | 1.4 | -0.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 2 9B | x | - | - | - | - | - | - | - | - | 9.2 | Open | - | - | Jun. 2024 | - | -8.9 | -5.4 | -18.8 | 13.5 | - | - | - | - | -2.7 | -1.1 | -3.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3n E2B | ✓ | - | - | - | - | - | - | - | Jun. 2024 | 8 | Closed | - | - | Jun. 2025 | - | -14.9 | -8.6 | - | -3.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3n E4B | ✓ | - | - | - | - | - | - | - | Jun. 2024 | 8 | Closed | - | - | Jun. 2025 | - | -9.3 | -3.2 | - | 8.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| ZAI | 🇨🇳 | GLM-5V-Turbo | ✓ | - | - | - | - | $1.20 | $4.00 | 200k | - | - | Closed | - | - | Apr. 2026 | - | 12.5 | - | - | - | - | 27.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 62.3% | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.5 Pro | ✓ | - | - | - | 57.2% | $30.00 | $180.00 | 1M | Dec. 2025 | - | Closed | - | - | Apr. 2026 | - | 61.9 | 51.6 | - | 43.2 | - | 42.4 | - | - | 10.6 | - | - | - | - | - | 90.1% | - | - | - | - | - | - | - | - | - | 39.6% | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5 Codex | x | - | - | 74.5% | - | - | - | - | Sep. 2024 | - | Closed | - | - | Sep. 2025 | - | 40.7 | - | 27.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| IBM | 🇺🇸 | Granite 3.3 8B Base | ✓ | - | - | - | - | - | - | - | Apr. 2024 | 8.2 | Open | - | - | Apr. 2025 | - | -3.0 | -1.6 | 18.2 | - | - | - | - | - | -9.8 | -8.6 | -10.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| IBM | 🇺🇸 | IBM Granite 4.0 Tiny Preview | x | - | - | - | - | - | - | - | - | 7 | Open | - | - | May 2025 | - | -9.6 | -8.5 | 1.3 | - | - | - | - | - | -5.9 | -5.8 | -6.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-1.5V | ✓ | - | - | - | - | - | - | - | - | - | Closed | - | - | Apr. 2024 | - | 0.3 | -1.6 | - | - | - | 1.6 | - | - | - | - | 1.1 | - | - | 53.6% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-2 Image 1212 | x | - | - | - | - | - | - | - | - | - | Closed | - | - | Dec. 2024 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-4.1 | ✓ | - | - | - | - | $3.00 | $15.00 | 256k | - | - | Closed | - | - | Nov. 2025 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-4.1 Thinking | ✓ | - | - | - | - | $3.00 | $15.00 | 256k | - | - | Closed | - | - | Nov. 2025 | - | - | - | - | - | - | -1.2 | - | - | - | - | -2.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| MoonshotAI | 🇨🇳 | Kimi-k1.5 | ✓ | - | - | - | - | - | - | - | - | - | Closed | - | - | Jan. 2025 | - | 19.6 | 24.0 | - | - | - | 18.6 | - | - | 25.2 | 25.3 | 24.1 | - | - | 70.0% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Nvidia | 🇺🇸 | Llama 3.1 Nemotron 70B Instruct | x | - | - | - | - | - | - | - | Dec. 2023 | 70 | Open | - | - | Oct. 2024 | - | -1.9 | 8.3 | - | - | -5.2 | - | - | - | 6.9 | 6.9 | 6.5 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | MedGemma 4B IT | ✓ | - | - | - | - | - | - | - | - | 4.3 | Open | - | - | May 2025 | - | -10.8 | - | - | - | - | -7.4 | - | - | - | - | -8.5 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenBMB | 🇨🇳 | MiniCPM-SALA | x | - | 78.3% | - | - | - | - | - | - | 9.5 | Open | - | - | Feb. 2026 | - | 16.4 | 17.5 | 33.1 | - | - | - | - | - | 6.6 | 6.7 | 6.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 3 (14B Base 2512) | ✓ | - | - | - | - | - | - | - | - | 14 | Open | - | - | Dec. 2025 | - | 2.1 | 10.2 | - | - | - | - | - | - | 5.4 | 8.1 | 5.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 3 (14B Instruct 2512) | ✓ | - | - | - | - | - | - | - | - | 14 | Open | - | - | Dec. 2025 | - | 13.5 | 29.7 | - | - | 12.5 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 3 (3B Base 2512) | ✓ | - | - | - | - | - | - | - | - | 3 | Open | - | - | Dec. 2025 | - | -9.9 | -1.4 | - | - | - | - | - | - | -3.3 | -3.0 | -3.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 3 (3B Instruct 2512) | ✓ | - | - | - | - | - | - | - | - | 3 | Open | - | - | Dec. 2025 | - | 1.6 | 19.3 | - | - | 3.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 3 (8B Base 2512) | ✓ | - | - | - | - | - | - | - | - | 8 | Open | - | - | Dec. 2025 | - | -3.9 | 4.5 | - | - | - | - | - | - | 1.2 | 4.6 | 0.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 3 (8B Instruct 2512) | ✓ | - | - | - | - | - | - | - | - | 8 | Open | - | - | Dec. 2025 | - | 8.5 | 24.8 | - | - | 8.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 8B Instruct | x | - | - | - | - | $0.10 | $0.10 | 128k | - | 8.0 | Open | - | - | Oct. 2024 | - | -7.6 | -5.0 | -23.4 | - | 10.2 | - | - | - | -10.0 | -10.8 | -10.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Large 2 | x | - | - | - | - | $2.00 | $6.00 | 128k | - | 123 | Open | - | - | Jul. 2024 | - | 8.4 | 14.1 | 21.1 | - | 19.5 | - | - | - | 15.1 | 15.2 | 14.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Large 3 | ✓ | - | - | - | - | $2.00 | $5.00 | 128k | - | 675 | Open | - | - | Sep. 2025 | - | 6.9 | 16.6 | - | - | 20.5 | - | - | - | - | - | - | - | 74.2% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral NeMo Instruct | x | - | - | - | - | $0.15 | $0.15 | 128k | - | 12 | Open | - | - | Jul. 2024 | - | -8.4 | -8.0 | - | 18.3 | - | - | - | - | -10.3 | -10.4 | -10.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Small | x | - | - | - | - | $0.20 | $0.60 | 32.8k | - | 22 | Open | - | - | Sep. 2024 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | o3-pro | ✓ | - | - | - | - | $20.00 | $80.00 | 200k | May 2024 | - | Closed | - | - | Jun. 2025 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Microsoft | 🇺🇸 | Phi-3.5-vision-instruct | ✓ | - | - | - | - | - | - | - | - | 4.2 | Open | - | - | Aug. 2024 | - | -6.4 | -8.2 | - | - | - | -1.9 | - | - | - | - | -6.8 | - | - | 43.0% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Microsoft | 🇺🇸 | Phi-4-multimodal-instruct | ✓ | - | - | - | - | $0.05 | $0.10 | 128k | Jun. 2024 | 5.6 | Open | - | - | Feb. 2025 | - | -0.8 | 9.1 | - | - | - | 5.6 | - | - | - | - | 3.9 | - | - | 55.1% | - | - | 38.5% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Pixtral-12B | ✓ | - | - | - | - | $0.15 | $0.15 | 128k | - | 12.4 | Open | - | - | Sep. 2024 | - | -6.0 | 0.9 | -6.7 | - | 5.8 | 3.4 | - | - | -6.0 | -6.0 | -2.1 | - | - | 52.5% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Pixtral Large | ✓ | - | - | - | - | $2.00 | $6.00 | 128k | - | 124 | Open | - | - | Nov. 2024 | - | 16.0 | 17.2 | - | - | 7.0 | 15.9 | - | - | - | - | 14.8 | - | - | 64.0% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | QvQ-72B-Preview | ✓ | - | - | - | - | - | - | - | - | 73.4 | Open | - | - | Dec. 2024 | - | 12.0 | 12.6 | - | - | - | 12.6 | - | - | - | - | 20.1 | - | - | 70.3% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5-Coder 32B Instruct | x | - | - | - | - | $0.09 | $0.09 | 128k | - | 32 | Open | - | - | Sep. 2024 | - | 0.1 | 4.8 | 13.0 | - | - | - | - | - | 0.4 | -1.3 | -1.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5-Coder 7B Instruct | x | - | - | - | - | - | - | - | - | 7 | Open | - | - | Sep. 2024 | - | -6.3 | -4.4 | 6.8 | - | - | - | - | - | -8.9 | -10.6 | -10.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5 VL 72B Instruct | ✓ | - | - | - | - | - | - | - | - | 72 | Open | - | - | Jan. 2025 | - | 13.6 | 10.3 | - | - | - | 13.5 | - | 14.9 | - | - | 20.0 | - | - | 70.2% | - | - | 51.1% | 43.6% | - | - | 8.8% | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5 VL 7B Instruct | ✓ | - | - | - | - | - | - | - | - | 8.3 | Open | - | - | Jan. 2025 | - | 2.4 | 5.8 | - | - | - | 8.1 | - | 6.4 | - | - | 6.2 | - | - | 58.6% | - | - | 38.3% | 29.0% | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2-VL-72B-Instruct | ✓ | - | - | - | - | - | - | - | Jun. 2023 | 73.4 | Open | - | - | Aug. 2024 | - | 13.3 | 8.2 | - | - | - | 15.8 | - | 21.1 | - | - | 4.1 | - | - | - | - | - | 46.2% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3-Next-80B-A3B-Base | x | - | - | - | - | - | - | - | - | 80 | Open | - | - | Sep. 2025 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 235B A22B Instruct | ✓ | - | 74.7% | - | - | $0.30 | $1.50 | 262.1k | - | 236 | Open | 95c/s | 2.8s | Sep. 2025 | 1,270 | 26.4 | 28.8 | - | - | 19.4 | 23.8 | 4.7 | 23.8 | 32.5 | 34.1 | 32.7 | - | - | - | - | 62.1% | 68.1% | 62.0% | - | 51.9% | 66.7% | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 235B A22B Thinking | ✓ | - | 89.7% | - | 13.6% | $0.45 | $3.49 | 262.1k | - | 236 | Open | 47c/s | 44.0s | Sep. 2025 | 211 | 32.0 | 35.8 | - | - | 20.9 | 23.4 | 13.2 | 19.7 | 39.6 | 40.3 | 39.5 | - | - | - | - | 66.1% | 69.3% | 61.8% | - | 44.4% | 38.1% | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 4B Instruct | ✓ | - | 46.6% | - | - | $0.10 | $0.60 | 262.1k | - | 4 | Open | 173c/s | 802ms | Sep. 2025 | 264 | 8.4 | 9.0 | - | - | 6.0 | 11.2 | -4.8 | 14.6 | 10.8 | 9.9 | 8.0 | - | - | - | - | 39.7% | 53.2% | 59.5% | - | 48.0% | 26.2% | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 8B Instruct | ✓ | - | 45.9% | - | - | $0.08 | $0.50 | 262.1k | - | 9 | Open | 187c/s | 2.5s | Sep. 2025 | -68 | 12.1 | 13.9 | - | - | 11.9 | 14.4 | -1.3 | 16.1 | 17.2 | 17.1 | 15.5 | - | - | - | - | 46.4% | 55.9% | 54.6% | - | - | 33.9% | - | - | - | - | - | - | - | - | |
| StepFun | 🇨🇳 | Step3-VL-10B | ✓ | - | 87.7% | - | - | - | - | - | - | 10 | Open | - | - | Jan. 2026 | - | 35.3 | 30.7 | - | - | 26.6 | 23.9 | - | - | - | - | 32.2 | - | - | 78.1% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| ZAI | 🇨🇳 | GLM-4.5V | ✓ | - | - | - | - | - | - | - | - | 108 | Open | - | - | Aug. 2025 | -45 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| LG AI Research | 🇰🇷 | K-EXAONE-236B-A23B | x | - | 92.8% | - | - | $0.60 | $1.00 | 32.8k | Oct. 2025 | 236 | Closed | - | - | Dec. 2025 | -65 | 35.0 | 34.6 | - | - | - | - | 4.3 | - | 36.5 | 36.5 | 36.2 | - | 85.7% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| IBM | 🇺🇸 | Granite 3.3 8B Instruct | ✓ | - | - | - | - | $0.50 | $0.50 | 128k | Apr. 2024 | 8 | Open | - | - | Apr. 2025 | -183 | 0.2 | 1.6 | 17.7 | - | - | - | - | - | 0.6 | 0.7 | 0.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok 4.3 NEW | x | - | - | - | - | $1.25 | $2.50 | 1M | - | - | Closed | 113c/s | 8.7s | May 2026 | 282 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
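
The Context and Latency columns mix units (`965ms` next to `9.0s`, `256k` next to `2M`), so sorting the raw cell text will not order rows correctly. The following is a minimal sketch of one way to normalize those cells before sorting, assuming the pipe-delimited layout and the column order of the header row. The function names and column-index constants are hypothetical, and the three sample rows are copied from the table but truncated to their first 18 columns for brevity.

```python
import re

# Column positions assumed from the header row (0-based, after splitting on "|").
MODEL, INPUT, OUTPUT, CONTEXT, LATENCY = 2, 8, 9, 10, 15


def parse_price(cell):
    """Convert a price cell such as '$0.20' to a float; '-' gives None."""
    cell = cell.strip()
    return float(cell.lstrip("$")) if cell.startswith("$") else None


def parse_context(cell):
    """Convert a Context cell such as '256k' or '2M' to a token count; '-' gives None."""
    m = re.fullmatch(r"([\d.]+)([kM])", cell.strip())
    if m is None:
        return None
    scale = {"k": 1_000, "M": 1_000_000}[m.group(2)]
    return round(float(m.group(1)) * scale)


def parse_latency(cell):
    """Convert a Latency cell such as '965ms' or '9.0s' to seconds; '-' gives None."""
    m = re.fullmatch(r"([\d.]+)(ms|s)", cell.strip())
    if m is None:
        return None
    value = float(m.group(1))
    return value / 1000 if m.group(2) == "ms" else value


def parse_row(line):
    """Split one table row into cells and pick out a few normalized fields."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    return {
        "model": cells[MODEL],
        "input_usd_per_m": parse_price(cells[INPUT]),
        "output_usd_per_m": parse_price(cells[OUTPUT]),
        "context_tokens": parse_context(cells[CONTEXT]),
        "latency_s": parse_latency(cells[LATENCY]),
    }


# Sample rows from the table above, truncated to the first 18 columns.
sample_rows = [
    "| xAI | 🇺🇸 | Grok-4 Fast Non-Reasoning | ✓ | - | - | - | - | $0.20 | $0.50 | 2M | - | - | Closed | 267c/s | 965ms | Aug. 2025 | 520 |",
    "| xAI | 🇺🇸 | Grok Code Fast 1 | x | - | - | 70.8% | - | $0.20 | $1.50 | 256k | - | - | Closed | 208c/s | 9.0s | Aug. 2025 | 461 |",
    "| Qwen | 🇨🇳 | Qwen3 32B | x | - | 72.9% | - | - | $0.10 | $0.44 | 128k | - | 32.8 | Open | 177c/s | 839ms | Apr. 2025 | 314 |",
]

parsed = [parse_row(r) for r in sample_rows]

# Rank by latency in seconds; rows without a latency value are skipped.
ranked = sorted(
    (p for p in parsed if p["latency_s"] is not None),
    key=lambda p: p["latency_s"],
)

for p in ranked:
    print(f'{p["model"]}: {p["latency_s"]:.3f}s, {p["context_tokens"]} tokens')
```

Sorting on the normalized `latency_s` field places the two millisecond entries ahead of `9.0s`, which a plain string sort would not do. Cells containing `-` are returned as `None` so that missing data is never mistaken for zero.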