Compare Leading AI & LLM Models
Compare performance metrics, benchmarks, and pricing across the leading AI models. Sort by any metric and find the best model for your use case.
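Sorting by a metric can also be reproduced offline from the table's cell text. The sketch below is illustrative only: `parse_cell` and `sort_rows` are hypothetical helper names, and it assumes that `-` means "no data", that `k` and `M` are thousand and million suffixes, and that `$` and `%` are plain unit markers. It does not handle the Speed (`c/s`) or Latency (`s`, `ms`) columns. The sample rows are copied from the leaderboard.

```python
from typing import Optional


def parse_cell(cell: str) -> Optional[float]:
    """Turn a cell such as '94.2%', '$2.50', '400k', '1.0M' or '2,093' into a float.

    Returns None for '-' or an empty cell (assumed to mean "no data").
    """
    text = cell.strip().replace(",", "")
    if text in ("", "-"):
        return None
    text = text.lstrip("$").rstrip("%")
    multiplier = 1.0
    if text[-1] in "kK":
        multiplier, text = 1e3, text[:-1]
    elif text[-1] == "M":
        multiplier, text = 1e6, text[:-1]
    return float(text) * multiplier


def sort_rows(rows, column, descending=True):
    """Sort dict rows by one parsed column; rows without data go last."""
    with_data = [r for r in rows if parse_cell(r[column]) is not None]
    without_data = [r for r in rows if parse_cell(r[column]) is None]
    with_data.sort(key=lambda r: parse_cell(r[column]), reverse=descending)
    return with_data + without_data


# A few rows copied from the leaderboard (only three columns kept).
rows = [
    {"Model": "Claude Opus 4.7", "GPQA": "94.2%", "Input $/M": "$5.00"},
    {"Model": "GPT-5.4", "GPQA": "92.8%", "Input $/M": "$2.50"},
    {"Model": "Gemini 3 Pro", "GPQA": "91.9%", "Input $/M": "-"},
    {"Model": "GPT-5.2", "GPQA": "92.4%", "Input $/M": "$1.75"},
]

cheapest_first = [r["Model"] for r in sort_rows(rows, "Input $/M", descending=False)]
best_gpqa_first = [r["Model"] for r in sort_rows(rows, "GPQA")]
print(cheapest_first)
print(best_gpqa_first)
```

Rows with no value in the chosen column are kept at the end rather than dropped, so a missing price or score is not mistaken for a zero.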
Model Leaderboard
Showing 296 of 296 models
| Org | Country | Model | Multimodal | GPQA | AIME 2025 | SWE-bench | HLE | Input $/M | Output $/M | Context | Cutoff | Params (B) | License | Speed | Latency | Released | Code Arena | Reasoning | Math | Coding | Search | Writing | Vision | Tools | Long Ctx | Finance | Legal | Health | ARC-AGI v2 | MMMLU | MMMU | BrowseComp | CharXiv-R | MMMU-Pro | ScreenSpot Pro | MCP Atlas | SimpleQA | OSWorld | Toolathlon | Terminal Bench | TAU2 Retail | FrontierMath | MRCR v2 | SciCode | Apex Agents | SWE-bench Pro | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Anthropic | 🇺🇸 | Claude Mythos Preview (unreleased) | ✓ | 94.6% | - | 93.9% | 64.7% | $25.00 | $125.00 | - | - | - | Closed | - | - | - | - | 71.4 | 61.7 | 57.3 | 41.3 | - | 52.6 | 39.6 | 36.8 | - | - | 24.3 | - | 92.7% | - | 86.9% | 93.2% | - | - | - | - | - | - | - | - | - | - | - | - | 77.8% | |
| Google | 🇺🇸 | Gemini 3.1 Pro | ✓ | 94.3% | - | 80.6% | 51.4% | $2.50 | $15.00 | 1.0M | Jan. 2025 | - | Closed | 114c/s | 29.1s | Feb. 2026 | 2,093 | 59.0 | 54.7 | 44.1 | 37.0 | - | 41.0 | 34.1 | 13.2 | 4.8 | 4.8 | 19.9 | 77.1% | 92.6% | - | 85.9% | - | 80.5% | - | 69.2% | - | - | - | - | - | - | 26.3% | 59.0% | 33.5% | 54.2% | |
| Anthropic | 🇺🇸 | Claude Opus 4.7 | ✓ | 94.2% | - | 87.6% | 54.7% | $5.00 | $25.00 | 1M | - | - | Closed | 108c/s | 2.2s | Apr. 2026 | 1,791 | 62.8 | 52.1 | 51.6 | 31.4 | - | 44.6 | 39.6 | 29.5 | 40.6 | - | 38.1 | - | 91.5% | - | 79.3% | 91.0% | - | - | 77.3% | - | - | - | - | - | - | - | - | - | 64.3% | |
| OpenAI | 🇺🇸 | GPT-5.5 | ✓ | 93.6% | - | - | 52.2% | $5.00 | $30.00 | 1M | - | - | Closed | - | - | Apr. 2026 | - | 63.1 | 48.5 | 53.1 | 35.6 | 30.8 | 46.9 | 40.4 | 30.5 | 21.8 | - | - | 85.0% | - | - | 84.4% | - | 83.2% | - | 75.3% | - | - | 55.6% | - | - | 35.4% | 74.0% | - | - | 58.6% | |
| OpenAI | 🇺🇸 | GPT-5.2 Pro | ✓ | 93.2% | 100.0% | - | 36.6% | $21.00 | $168.00 | 400k | - | - | Closed | - | - | Dec. 2025 | - | 56.6 | 51.3 | - | 29.5 | - | 33.2 | - | - | - | - | - | 54.2% | - | - | 77.9% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.4 | ✓ | 92.8% | - | - | 39.8% | $2.50 | $15.00 | 1M | - | - | Closed | 244c/s | 1.1s | Mar. 2026 | 1,741 | 58.0 | 47.3 | 44.3 | 32.0 | 35.5 | 38.4 | 36.3 | 27.5 | 2.0 | - | 39.9 | 73.3% | - | - | 82.7% | - | 81.2% | - | 67.2% | - | - | 54.6% | - | - | 47.6% | - | - | - | 57.7% | |
| OpenAI | 🇺🇸 | GPT-5.2 | ✓ | 92.4% | 100.0% | 80.0% | 34.5% | $1.75 | $14.00 | 400k | Aug. 2025 | - | Closed | 176c/s | 29.3s | Dec. 2025 | 1,522 | 54.3 | 50.9 | 35.7 | 26.1 | 33.1 | 35.8 | 28.7 | - | - | - | 44.4 | 52.9% | 89.6% | - | 65.8% | 82.1% | 79.5% | 86.3% | 60.6% | - | - | 46.3% | - | - | 40.3% | - | - | - | - | |
| Google | 🇺🇸 | Gemini 3 Pro | ✓ | 91.9% | 100.0% | 76.2% | 45.8% | - | - | - | Jan. 2025 | - | Closed | - | - | Nov. 2025 | 1,579 | 49.9 | 50.0 | 33.4 | - | - | 35.1 | 19.8 | 9.9 | - | - | 50.0 | 31.1% | 91.8% | - | - | 81.4% | 81.0% | 72.7% | - | 72.1% | - | - | - | - | - | 26.3% | - | - | - | |
| Anthropic | 🇺🇸 | Claude Opus 4.6 | ✓ | 91.3% | 99.8% | 80.8% | 53.1% | $5.00 | $25.00 | 1M | - | - | Closed | 116c/s | 1.9s | Feb. 2026 | 2,005 | 60.0 | 52.3 | 45.6 | 38.7 | 44.6 | 36.6 | 35.1 | 36.6 | 37.1 | 39.3 | 14.0 | 68.8% | 91.1% | - | 84.0% | 77.4% | 77.3% | - | 62.7% | - | 72.7% | - | - | - | - | 93.0% | - | - | - | |
| MoonshotAI | 🇨🇳 | Kimi K2.6 | ✓ | 90.5% | - | 80.2% | 36.4% | $0.95 | $4.00 | 262.1k | - | 1000 | Open | 82c/s | 64.4s | Apr. 2026 | 1,179 | 59.5 | 50.7 | 45.6 | 38.0 | - | 38.8 | 33.4 | - | - | - | - | - | - | - | 86.3% | 86.7% | 80.1% | - | - | - | - | 50.0% | - | - | - | - | 52.2% | 27.9% | 58.6% | |
| Google | 🇺🇸 | Gemini 3 Flash | ✓ | 90.4% | 99.7% | 78.0% | 43.5% | $0.50 | $3.00 | 1M | Jan. 2025 | - | Closed | 215c/s | 5.2s | Dec. 2025 | 1,690 | 49.5 | 50.6 | 31.5 | - | - | 34.1 | 24.1 | 5.3 | - | - | 43.6 | 33.6% | 91.8% | - | - | 80.3% | 81.2% | 69.1% | 57.4% | 68.7% | - | 49.4% | - | - | - | 22.1% | - | - | - | |
| Qwen | 🇨🇳 | Qwen3.6 Plus | ✓ | 90.4% | - | 78.8% | 28.8% | - | - | - | - | - | Closed | - | - | Mar. 2026 | - | 52.3 | 49.9 | 43.3 | 25.2 | - | 35.8 | 31.0 | 37.4 | 56.9 | 56.9 | 54.1 | - | 89.5% | 86.0% | - | 81.5% | 78.8% | 68.2% | 74.1% | - | - | 39.8% | - | - | - | - | - | - | 56.6% | |
| DeepSeek | 🇨🇳 | DeepSeek-V4-Pro-Max | x | 90.1% | - | 80.6% | 48.2% | $1.74 | $3.48 | 1.0M | - | 1600 | Open | 52c/s | 53.2s | Apr. 2026 | 41 | 57.8 | 53.8 | 45.0 | 33.5 | - | 33.6 | 35.2 | 20.0 | 45.2 | 45.7 | 48.8 | - | - | - | 83.4% | - | - | - | 73.6% | 57.9% | - | 51.8% | - | - | - | - | - | - | 55.4% | |
| Anthropic | 🇺🇸 | Claude Sonnet 4.6 | ✓ | 89.9% | - | 79.6% | 49.0% | $3.00 | $15.00 | 200k | - | - | Closed | 143c/s | 1.5s | Feb. 2026 | 1,421 | 52.6 | 43.9 | 37.7 | 24.8 | 35.6 | 34.0 | 29.8 | 26.3 | 41.6 | 42.7 | 11.6 | 58.3% | 89.3% | - | 74.7% | - | 75.6% | - | 61.3% | - | 72.5% | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Muse Spark | ✓ | 89.5% | - | 77.4% | 58.4% | - | - | - | - | - | Closed | - | - | Apr. 2026 | - | 52.9 | 55.2 | 32.9 | 1.9 | 20.3 | 39.7 | 22.4 | - | 28.9 | 29.1 | 44.9 | 42.5% | - | - | - | 86.4% | 80.4% | 84.1% | - | - | - | - | - | - | - | - | - | - | 52.4% | |
| Bytedance | 🇨🇳 | Seed 2.0 Pro | ✓ | 88.9% | 98.3% | 76.5% | - | - | - | - | Jan. 2024 | - | Closed | - | - | Feb. 2026 | - | 54.6 | 45.3 | 33.3 | 28.9 | - | - | - | - | - | - | - | - | - | - | 77.3% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3.5-397B-A17B | ✓ | 88.4% | - | 76.4% | 28.7% | $0.60 | $3.60 | 262.1k | - | 397 | Open | 84c/s | 18.9s | Feb. 2026 | 1,208 | 49.6 | 46.8 | 31.0 | 25.8 | 28.7 | 29.0 | 22.8 | 38.9 | 55.4 | 55.4 | 54.4 | - | 88.5% | - | 69.0% | - | - | - | - | - | - | 38.3% | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-4 Heavy (unreleased) | ✓ | 88.4% | 100.0% | - | 50.7% | - | - | - | Dec. 2024 | - | Closed | - | - | - | - | 53.3 | 53.5 | 25.6 | - | - | 36.5 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.1 | ✓ | 88.1% | 94.0% | 76.3% | - | $1.25 | $10.00 | 400k | Sep. 2024 | - | Closed | 277c/s | 3.2s | Nov. 2025 | 1,231 | 48.1 | 40.0 | 31.8 | 22.2 | 28.3 | 33.4 | 25.3 | - | - | - | 48.1 | - | - | 85.4% | - | - | - | - | - | - | - | - | - | - | 26.7% | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.1 High | ✓ | 88.1% | 99.6% | - | - | - | - | - | - | - | Closed | - | - | Nov. 2025 | 1,140 | 53.5 | 47.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5 Medium | ✓ | 88.1% | 88.9% | - | - | $1.25 | $10.00 | 400k | Sep. 2024 | - | Closed | 100c/s | 39.4s | Aug. 2025 | 1,089 | 44.5 | 29.5 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.1 Thinking | ✓ | 88.1% | 94.0% | 76.3% | - | $1.25 | $10.00 | 400k | - | - | Closed | 165c/s | 32.3s | Nov. 2025 | 1,003 | 46.3 | 38.6 | 30.8 | 13.7 | 25.4 | 31.9 | 22.4 | - | - | - | 44.4 | - | - | 85.4% | - | - | - | - | - | - | - | - | - | - | 26.7% | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.1 Instant | ✓ | 88.1% | 94.0% | 76.3% | - | $1.25 | $10.00 | 400k | - | - | Closed | 261c/s | 2.9s | Nov. 2025 | 770 | 49.7 | 41.4 | 31.3 | 18.3 | 26.9 | 35.1 | 23.8 | - | - | - | 46.0 | - | - | 85.4% | - | - | - | - | - | - | - | - | - | - | 26.7% | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V4-Flash-Max | x | 88.1% | - | 79.0% | 45.1% | $0.14 | $0.28 | 1.0M | - | 284 | Open | - | - | Apr. 2026 | - | 52.3 | 51.5 | 38.8 | 23.9 | - | 31.1 | 28.3 | 7.9 | 36.5 | 36.5 | 44.8 | - | - | - | 73.2% | - | - | - | 69.0% | 34.1% | - | 47.8% | - | - | - | - | - | - | 52.6% | |
| OpenAI | 🇺🇸 | GPT-5.4 mini | ✓ | 88.0% | - | - | 28.2% | $0.75 | $4.50 | 400k | Aug. 2025 | - | Closed | 291c/s | 745ms | Mar. 2026 | 520 | 46.2 | 36.4 | 35.4 | - | 22.2 | 29.5 | 25.0 | 19.4 | - | - | 36.1 | - | - | - | - | - | 76.6% | - | 57.7% | - | - | 42.9% | - | - | - | 33.6% | - | - | 54.4% | |
| Qwen | 🇨🇳 | Qwen3.6-27B | ✓ | 87.8% | - | 77.2% | 24.0% | $0.60 | $3.60 | 262.1k | - | 27.8 | Open | - | - | Apr. 2026 | - | 46.2 | 43.3 | 35.5 | - | - | 30.7 | 24.8 | 29.2 | 46.9 | 46.9 | 46.5 | - | - | 82.9% | - | 78.4% | 75.8% | - | - | - | - | - | - | - | - | - | - | - | 53.5% | |
| MoonshotAI | 🇨🇳 | Kimi K2.5 | ✓ | 87.6% | 96.1% | 76.8% | 50.2% | $0.60 | $3.00 | 262.1k | - | 1000 | Open | 41c/s | 77.9s | Jan. 2026 | 1,479 | 50.5 | 47.9 | 32.5 | 30.4 | - | 35.9 | 14.6 | 39.2 | 47.7 | 47.7 | 49.7 | - | - | - | 74.9% | 77.5% | 78.5% | - | - | - | - | - | - | - | - | - | 48.7% | - | 50.7% | |
| xAI | 🇺🇸 | Grok-4 | ✓ | 87.5% | 91.7% | - | 40.0% | - | - | - | Dec. 2024 | - | Closed | - | - | Jul. 2025 | 487 | 44.5 | 40.6 | 23.6 | - | - | 28.1 | - | - | - | - | - | 15.9% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5 High | ✓ | 87.3% | 94.6% | - | - | - | - | - | Sep. 2024 | - | Closed | - | - | Aug. 2025 | 1,301 | 48.0 | 40.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude Opus 4.5 | ✓ | 87.0% | - | 80.9% | - | $5.00 | $25.00 | 200k | Mar. 2025 | - | Closed | 257c/s | 1.6s | Nov. 2025 | 1,614 | 54.5 | 42.2 | 41.0 | - | 35.6 | 30.0 | 30.8 | - | - | - | 24.2 | 37.6% | 90.8% | - | - | - | - | - | 62.3% | - | 66.3% | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 3.1 Flash-Lite | ✓ | 86.9% | - | - | 16.0% | $0.25 | $1.50 | 1M | Jan. 2025 | - | Closed | 237c/s | 6.2s | Mar. 2026 | 1,164 | 42.0 | 31.9 | - | - | - | 25.9 | - | 30.4 | - | - | 38.9 | - | 88.9% | - | - | 73.2% | 76.8% | - | - | 43.3% | - | - | - | - | - | 60.1% | - | - | - | |
| Qwen | 🇨🇳 | Qwen3.5-122B-A10B | ✓ | 86.6% | - | 72.0% | 47.5% | $0.40 | $3.20 | 262.1k | - | 122 | Open | 136c/s | 20.4s | Feb. 2026 | 802 | 43.6 | 44.4 | 26.6 | 22.2 | 24.3 | 31.4 | 14.1 | 32.5 | 49.9 | 49.9 | 48.1 | - | 86.7% | 83.9% | 63.8% | 77.2% | 76.9% | 70.4% | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 2.5 Pro Preview 06-05 | ✓ | 86.4% | 88.0% | 67.2% | 21.6% | $1.25 | $10.00 | 1.0M | Jan. 2025 | - | Closed | - | - | Jun. 2025 | - | 38.2 | 32.3 | 21.9 | - | - | 26.6 | - | -3.4 | - | - | 40.2 | - | - | 82.0% | - | - | - | - | - | 54.0% | - | - | - | - | - | 16.4% | - | - | - | |
| ZAI | 🇨🇳 | GLM-5.1 | x | 86.2% | - | - | 52.3% | $1.40 | $4.40 | 200k | - | 754 | Open | 96c/s | 52.4s | Apr. 2026 | 1,423 | 54.9 | 46.4 | 45.1 | 30.1 | - | 39.5 | 30.6 | - | - | - | - | - | - | - | 79.3% | - | - | - | 71.8% | - | - | 40.7% | - | - | - | - | - | - | 58.4% | |
| Qwen | 🇨🇳 | Qwen3.6-35B-A3B | ✓ | 86.0% | - | 73.4% | 21.4% | - | - | - | - | 35 | Open | - | - | Apr. 2026 | - | 42.7 | 39.9 | 29.8 | 13.4 | - | 28.9 | 17.7 | 28.7 | 41.9 | 42.0 | 42.8 | - | - | 81.7% | - | 78.0% | 75.3% | - | 62.8% | - | - | 26.9% | - | - | - | - | - | - | 49.5% | |
| ZAI | 🇨🇳 | GLM-4.7 | ✓ | 85.7% | 95.7% | 73.8% | 42.8% | $0.60 | $2.20 | 204.8k | - | 358 | Open | 230c/s | 9.3s | Dec. 2025 | 1,046 | 44.2 | 44.0 | 23.2 | 16.8 | - | 29.8 | 12.5 | - | 36.6 | 36.5 | 36.3 | - | - | - | 52.0% | - | - | - | - | - | - | - | 33.3% | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5 | ✓ | 85.7% | 94.6% | 74.9% | 24.8% | - | - | - | Sep. 2024 | - | Closed | - | - | Aug. 2025 | 886 | 46.2 | 39.0 | 34.5 | 13.2 | 29.5 | 33.2 | 23.8 | 32.6 | 45.5 | 45.6 | 42.2 | - | - | 84.2% | 54.9% | 81.1% | 78.4% | - | - | - | - | - | - | - | 26.3% | - | - | - | - | |
| xAI | 🇺🇸 | Grok 4 Fast | ✓ | 85.7% | 92.0% | - | 20.0% | $0.20 | $0.50 | 2M | - | - | Closed | - | - | Aug. 2025 | - | 41.4 | 36.8 | 26.7 | 6.9 | - | 17.4 | - | - | - | - | - | - | - | - | 44.9% | - | - | - | - | 95.0% | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3.5-27B | ✓ | 85.5% | - | 72.4% | 48.5% | $0.30 | $2.40 | 262.1k | - | 27 | Open | 3c/s | 14.6s | Feb. 2026 | 521 | 42.6 | 42.8 | 22.1 | 20.5 | 22.2 | 30.1 | 13.3 | 31.1 | 46.4 | 46.5 | 44.2 | - | 85.9% | 82.3% | 61.0% | 79.5% | 75.0% | 70.3% | - | - | - | - | - | - | - | - | - | - | - | |
| Bytedance | 🇨🇳 | Seed 2.0 Lite | ✓ | 85.1% | 93.0% | 73.5% | - | - | - | - | Jan. 2024 | - | Closed | - | - | Feb. 2026 | - | 42.6 | 32.8 | 26.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Baidu | 🇨🇳 | ERNIE 5.0 | ✓ | 85.0% | 87.0% | - | 39.0% | - | - | - | - | - | Closed | - | - | Jan. 2026 | - | 44.1 | 40.5 | - | - | - | 27.9 | - | - | 46.8 | 46.8 | 46.6 | - | - | - | - | - | - | - | - | 75.0% | - | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude 3.7 Sonnet | ✓ | 84.8% | 54.8% | 70.3% | - | $3.00 | $15.00 | 200k | - | - | Closed | - | - | Feb. 2025 | 632 | 29.7 | 21.2 | 19.9 | - | 23.7 | 18.2 | 21.9 | - | - | - | 27.8 | - | 86.1% | 75.0% | - | - | - | - | - | - | - | - | 35.2% | 81.2% | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-3 | ✓ | 84.6% | 93.3% | - | - | $3.00 | $15.00 | 128k | Nov. 2024 | - | Closed | - | - | Feb. 2025 | - | 40.7 | 39.0 | 26.5 | - | - | 21.2 | - | - | - | - | 31.6 | - | - | 78.0% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| MoonshotAI | 🇨🇳 | Kimi K2-Thinking-0905 | x | 84.5% | 100.0% | 71.3% | 51.0% | - | - | - | - | 1000 | Open | - | - | Sep. 2025 | 941 | 45.9 | 47.5 | 26.6 | 19.9 | -6.3 | 36.8 | - | - | 26.4 | 26.3 | 39.2 | - | - | - | 60.2% | - | - | - | - | - | - | - | 47.1% | - | - | - | 44.8% | - | - | |
| Google | 🇺🇸 | Gemma 4 31B | ✓ | 84.3% | - | - | 26.5% | $0.14 | $0.40 | 262.1k | Jan. 2025 | 30.7 | Open | 80c/s | 2.5s | Apr. 2026 | 640 | 45.4 | 39.9 | - | - | - | 28.1 | 20.1 | 29.7 | 42.4 | 42.4 | 39.6 | - | 88.4% | - | - | - | 76.9% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3.5-35B-A3B | ✓ | 84.2% | - | 69.2% | 47.4% | $0.25 | $2.00 | 262.1k | - | 35 | Open | 175c/s | 11.5s | Feb. 2026 | 573 | 39.2 | 38.8 | 16.8 | 17.8 | 18.8 | 27.1 | 13.3 | 26.9 | 41.9 | 42.1 | 40.4 | - | 85.2% | 81.4% | 61.0% | 77.5% | 75.1% | 68.6% | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | ChatGPT-4o Latest | ✓ | 84.0% | - | - | - | $2.50 | $10.00 | 128k | - | - | Closed | - | - | May 2024 | 346 | 41.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-3 Mini | ✓ | 84.0% | 90.8% | - | - | $0.30 | $0.50 | 128k | Nov. 2024 | - | Closed | 105c/s | 7.9s | Feb. 2025 | 323 | 42.3 | 38.0 | 27.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Xiaomi | 🇨🇳 | MiMo-V2-Flash | x | 83.7% | 94.1% | 73.4% | 22.1% | $0.10 | $0.30 | 256k | - | 309 | Open | - | - | Dec. 2025 | 793 | 38.9 | 38.1 | 23.6 | 15.9 | - | 19.3 | 9.8 | 22.2 | 38.6 | 38.6 | 38.4 | - | - | - | 58.3% | - | - | - | - | - | - | - | 30.5% | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude Sonnet 4.5 | ✓ | 83.4% | 87.0% | - | - | $3.00 | $15.00 | 200k | Jan. 2025 | - | Closed | 159c/s | 1.4s | Sep. 2025 | 1,101 | 40.4 | 32.9 | 32.0 | - | 35.3 | 19.4 | 33.8 | 19.0 | - | - | 16.7 | - | 89.1% | - | - | - | - | - | - | - | 61.4% | - | 50.0% | 86.2% | - | - | - | - | - | |
| OpenAI | 🇺🇸 | o3 | ✓ | 83.3% | 86.4% | 69.1% | 14.7% | $2.00 | $8.00 | 200k | May 2024 | - | Closed | - | - | Apr. 2025 | - | 38.9 | 32.5 | 20.5 | 9.7 | 23.0 | 28.6 | 16.3 | - | - | - | 41.2 | 6.5% | - | 82.9% | 49.7% | 78.6% | 76.4% | - | - | - | - | - | - | - | 15.8% | - | - | - | - | |
| Google | 🇺🇸 | Gemini 2.5 Pro | ✓ | 83.0% | 83.0% | 63.2% | 17.8% | $1.25 | $10.00 | 1.0M | Jan. 2025 | - | Closed | 112c/s | 7.2s | May 2025 | 933 | 35.5 | 31.6 | 17.4 | - | - | 23.3 | - | 29.8 | - | - | 32.7 | 4.9% | - | 79.6% | - | - | - | - | - | 50.8% | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 2.5 Flash | ✓ | 82.8% | 72.0% | 60.4% | 11.0% | $0.30 | $2.50 | 1.0M | Jan. 2025 | - | Closed | 285c/s | 4.2s | May 2025 | 783 | 28.9 | 23.4 | 12.4 | - | - | 17.3 | - | 19.1 | - | - | 33.3 | - | - | 79.7% | - | - | - | - | - | 26.9% | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.4 nano | ✓ | 82.8% | - | - | 24.3% | $0.20 | $1.25 | 400k | Aug. 2025 | - | Closed | 335c/s | 821ms | Mar. 2026 | 668 | 40.2 | 33.4 | 24.1 | - | 20.9 | 22.1 | 16.4 | 19.4 | - | - | 42.5 | - | - | - | - | - | 66.1% | - | 56.1% | - | - | 35.5% | - | - | - | 33.1% | - | - | 52.4% | |
| Nvidia | 🇺🇸 | Nemotron 3 Super (120B A12B) | x | 82.7% | 90.2% | 53.7% | 22.8% | - | - | - | Jun. 2025 | 120 | Open | - | - | Mar. 2026 | -63 | 30.6 | 38.1 | 17.4 | -0.1 | 13.6 | 19.7 | 5.8 | 19.0 | 36.0 | 36.1 | 35.8 | - | - | - | 31.3% | - | - | - | - | - | - | - | 25.8% | - | - | - | 42.0% | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V3.2 (Thinking) | x | 82.4% | 93.1% | 73.1% | 25.1% | - | - | - | - | 685 | Open | - | - | Dec. 2025 | 393 | 42.8 | 39.3 | 29.9 | 15.0 | - | 22.1 | 14.8 | - | 40.4 | 40.4 | 40.1 | - | - | - | 51.4% | - | - | - | - | - | - | 35.2% | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V3.2 | x | 82.4% | 93.1% | 73.1% | 40.8% | $0.26 | $0.38 | 163.8k | - | 685 | Open | 32c/s | 6.1s | Dec. 2025 | 378 | 42.3 | 39.1 | 28.8 | 13.7 | - | 29.2 | 15.7 | - | 39.7 | 39.7 | 39.5 | - | - | - | 51.4% | - | - | - | - | - | - | 35.2% | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5 mini | ✓ | 82.3% | 91.1% | - | 16.7% | $0.25 | $2.00 | 400k | May 2024 | - | Closed | 147c/s | 16.0s | Aug. 2025 | 1,095 | 37.1 | 33.3 | - | - | - | 12.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 22.1% | - | - | - | - | |
| Google | 🇺🇸 | Gemma 4 26B-A4B | ✓ | 82.3% | - | - | 17.2% | $0.13 | $0.40 | 262.1k | Jan. 2025 | 25.2 | Open | 142c/s | 1.2s | Apr. 2026 | 935 | 36.0 | 31.9 | - | - | - | 21.1 | 18.4 | 15.9 | 32.6 | 32.6 | 31.4 | - | 86.3% | - | - | - | 73.8% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3.5-9B | ✓ | 81.7% | - | - | - | - | - | - | - | 9 | Open | - | - | Mar. 2026 | - | 29.7 | 29.0 | - | - | 16.2 | 19.9 | 9.2 | 22.3 | 30.9 | 30.9 | 30.6 | - | 81.2% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meituan | 🇨🇳 | LongCat-Flash-Thinking | x | 81.5% | 90.6% | 59.4% | - | - | - | - | - | 560 | Open | - | - | Sep. 2025 | 791 | 36.1 | 34.2 | 19.4 | - | 23.7 | 16.6 | 21.6 | - | 36.6 | 32.5 | 32.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | o4-mini | ✓ | 81.4% | 92.7% | 68.1% | 14.7% | $1.10 | $4.40 | 200k | May 2024 | - | Closed | - | - | Apr. 2025 | - | 34.7 | 35.2 | 16.0 | 12.5 | 14.8 | 22.8 | 14.5 | - | - | - | 35.3 | - | - | 81.6% | 51.5% | 72.0% | - | - | - | - | - | - | - | 71.8% | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3-235B-A22B-Thinking-2507 | x | 81.1% | 92.3% | - | 18.2% | $0.30 | $3.00 | 262.1k | - | 235 | Open | 169c/s | 1.6s | Jul. 2025 | 344 | 34.7 | 38.3 | - | - | 18.1 | 20.8 | 13.0 | - | 42.1 | 43.1 | 41.5 | - | - | - | - | - | - | - | - | - | - | - | - | 67.8% | - | - | - | - | - | |
| ZAI | 🇨🇳 | GLM-4.6 | ✓ | 81.0% | 93.9% | 68.0% | 17.2% | $0.55 | $2.19 | 131.1k | - | 357 | Open | 51c/s | 6.1s | Sep. 2025 | 1,135 | 38.4 | 34.7 | 20.5 | 7.4 | - | 14.1 | - | - | - | - | - | - | - | - | 45.1% | - | - | - | - | - | - | - | 40.5% | - | - | - | - | - | - | |
| MiniMax | 🇨🇳 | MiniMax M2.1 | x | 81.0% | 81.0% | 67.0% | 22.0% | $0.30 | $1.20 | 1M | - | 230 | Open | 328c/s | 2.6s | Dec. 2025 | 926 | 41.6 | 36.8 | 27.6 | 19.1 | 18.2 | 18.6 | 21.1 | 20.1 | 51.5 | 51.5 | 51.1 | - | - | - | 62.0% | - | - | - | - | - | - | 43.5% | 47.9% | - | - | - | 39.0% | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-R1-0528 | x | 81.0% | 87.5% | 44.6% | 17.7% | $0.55 | $2.19 | 131.1k | - | 671 | Open | 83c/s | 2.9s | May 2025 | 357 | 31.5 | 33.4 | 12.6 | -14.0 | - | 14.6 | - | - | 40.6 | 40.6 | 40.4 | - | - | - | 8.9% | - | - | - | - | 92.3% | - | - | 5.7% | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude Opus 4.1 | ✓ | 80.9% | 78.0% | 74.5% | - | $15.00 | $75.00 | 200k | - | - | Closed | 123c/s | 1.4s | Aug. 2025 | 1,167 | 38.3 | 30.3 | 28.6 | - | 24.7 | 21.8 | 23.0 | - | - | - | 16.9 | - | 89.5% | - | - | - | - | - | - | - | - | - | 43.3% | 82.4% | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT OSS 120B High | x | 80.9% | 92.5% | - | - | $0.10 | $0.50 | 131.1k | - | 116.8 | Open | 125c/s | 19.2s | Aug. 2025 | 517 | 32.4 | 29.1 | - | - | - | - | 2.2 | - | 26.8 | 26.8 | 26.5 | - | 83.8% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meituan | 🇨🇳 | LongCat-Flash-Thinking-2601 | x | 80.5% | 99.6% | 70.0% | 25.2% | $0.30 | $1.20 | 128k | - | 560 | Open | 42c/s | 89.4s | Jan. 2026 | 531 | 46.4 | 41.9 | 26.1 | 18.7 | 38.0 | 23.0 | 34.2 | - | - | - | - | - | - | - | 56.6% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT OSS 120B | x | 80.1% | - | - | 14.9% | $0.09 | $0.45 | 131.1k | - | 116.8 | Open | 113c/s | 3.7s | Aug. 2025 | 330 | 31.5 | 32.2 | - | - | 10.3 | 10.6 | 8.4 | - | 32.0 | 32.1 | 34.7 | - | - | - | - | - | - | - | - | - | - | - | - | 67.8% | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V3.2-Exp | x | 79.9% | 89.3% | 67.8% | 19.8% | - | - | - | - | 685 | Open | - | - | Sep. 2025 | 750 | 36.1 | 33.8 | 22.3 | 1.9 | - | 16.5 | - | - | 39.1 | 39.1 | 38.9 | - | - | - | 40.1% | - | - | - | - | 97.1% | - | - | 37.7% | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude Opus 4 | ✓ | 79.6% | 75.5% | 72.5% | - | - | - | - | - | - | Closed | - | - | May 2025 | 932 | 35.6 | 27.4 | 22.8 | - | 25.0 | 21.3 | 23.2 | - | - | - | 8.8 | 8.6% | 88.8% | - | - | - | - | - | - | - | - | - | 39.2% | 81.4% | - | - | - | - | - | |
| ZAI | 🇨🇳 | GLM-4.5 | x | 79.1% | - | 64.2% | 14.4% | - | - | - | - | 355 | Open | - | - | Jul. 2025 | 744 | 34.5 | 34.6 | 21.3 | -4.2 | 25.2 | 8.8 | 25.8 | - | 42.0 | 37.9 | 37.7 | - | - | - | 26.4% | - | - | - | - | - | - | - | 37.5% | 79.7% | - | - | 41.7% | - | - | |
| OpenAI | 🇺🇸 | o1-pro | ✓ | 79.0% | - | - | - | - | - | - | Sep. 2023 | - | Closed | - | - | Dec. 2024 | - | 28.8 | 24.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Sarvam AI | 🇮🇳 | Sarvam-105B | x | 78.7% | 96.7% | 45.0% | 11.2% | - | - | - | - | 105 | Open | - | - | Mar. 2026 | - | 33.0 | 35.5 | 3.9 | 9.0 | - | 6.5 | - | - | 35.4 | 35.4 | 34.8 | - | - | - | 49.5% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| MiniMax | 🇨🇳 | MiniMax M2 | x | 78.0% | 78.0% | 69.4% | 12.5% | $0.30 | $1.20 | 1M | - | 230 | Open | 298c/s | 2.2s | Oct. 2025 | 700 | 34.4 | 26.0 | 23.9 | 5.3 | 19.0 | 6.7 | 14.5 | - | 29.9 | 29.9 | 29.6 | - | - | - | 44.0% | - | - | - | - | - | - | - | 46.3% | - | - | - | 36.0% | - | - | |
| OpenAI | 🇺🇸 | o1 | x | 78.0% | - | 41.0% | - | $15.00 | $60.00 | 200k | - | - | Closed | - | - | Dec. 2024 | - | 24.7 | 27.6 | 7.4 | - | 17.1 | 19.6 | 15.4 | - | 42.5 | 42.6 | 37.7 | - | 87.7% | 77.6% | - | - | - | - | - | 47.0% | - | - | - | 70.8% | 5.5% | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3-235B-A22B-Instruct-2507 | x | 77.5% | 70.3% | - | - | $0.15 | $0.80 | 262.1k | - | 235 | Open | - | - | Jul. 2025 | 147 | 28.6 | 29.2 | 7.2 | - | 13.5 | 13.1 | 8.5 | - | 35.2 | 35.7 | 36.9 | - | - | - | - | - | - | - | - | 54.3% | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | o3-mini | x | 77.2% | - | 49.3% | - | $1.10 | $4.40 | 200k | Sep. 2023 | - | Closed | - | - | Jan. 2025 | - | 22.0 | 30.7 | 8.3 | - | 9.0 | - | -2.8 | 2.3 | 23.3 | 23.3 | 22.8 | - | - | - | - | - | - | - | - | 15.0% | - | - | - | 57.6% | 9.2% | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3-Next-80B-A3B-Thinking | x | 77.2% | 87.8% | - | - | $0.15 | $1.50 | 65.5k | - | 80 | Open | - | - | Sep. 2025 | - | 30.2 | 32.2 | - | - | 16.1 | 18.4 | 14.1 | - | 34.8 | 33.5 | 35.1 | - | - | - | - | - | - | - | - | - | - | - | - | 69.6% | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3.5-4B | ✓ | 76.2% | - | - | - | - | - | - | - | 4 | Open | - | - | Mar. 2026 | - | 22.7 | 23.6 | - | - | 12.6 | 14.4 | 8.5 | 11.9 | 23.6 | 23.7 | 23.3 | - | 76.1% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Nvidia | 🇺🇸 | Llama 3.1 Nemotron Ultra 253B v1 | x | 76.0% | 72.5% | - | - | - | - | - | Dec. 2023 | 253 | Open | - | - | Apr. 2025 | - | 24.8 | 22.0 | 18.6 | - | - | - | 22.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| MoonshotAI | 🇨🇳 | Kimi K2 0905 | x | 75.8% | - | - | - | $0.60 | $2.50 | 262.1k | - | 1000 | Closed | 37c/s | 21.4s | Sep. 2025 | 1,005 | 26.0 | 28.2 | 30.1 | - | - | - | - | - | 35.1 | 35.1 | 34.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude Sonnet 4 | ✓ | 75.4% | 70.5% | 72.7% | - | - | - | - | - | - | Closed | - | - | May 2025 | 882 | 30.6 | 22.1 | 21.8 | - | 25.3 | 16.8 | 23.5 | - | - | - | 25.8 | - | 86.5% | 74.4% | - | - | - | - | - | - | - | - | 35.5% | 80.5% | - | - | - | - | - | |
| ZAI | 🇨🇳 | GLM-4.7-Flash | x | 75.2% | 91.6% | 59.2% | 14.4% | $0.07 | $0.40 | 128k | - | 30 | Open | 56c/s | 29.6s | Jan. 2026 | 759 | 31.9 | 29.3 | 9.0 | 3.9 | - | 8.1 | 10.1 | - | - | - | - | - | - | - | 42.8% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| MoonshotAI | 🇨🇳 | Kimi K2-Instruct-0905 | x | 75.1% | 49.5% | 65.8% | 4.7% | - | - | - | - | 1000 | Open | - | - | Sep. 2025 | - | 24.4 | 23.5 | 13.1 | - | 14.8 | -3.8 | 9.9 | - | 30.6 | 30.6 | 30.2 | - | - | - | - | - | - | - | - | 31.0% | - | - | 25.0% | - | - | - | - | - | - | |
| MoonshotAI | 🇨🇳 | Kimi K2 Instruct | x | 75.1% | 49.5% | - | 4.7% | $0.50 | $0.50 | 200k | - | 1000 | Open | - | - | Jul. 2025 | -882 | 25.7 | 25.7 | 17.3 | - | 16.1 | -5.4 | 11.4 | - | 31.4 | 31.5 | 31.0 | - | - | - | - | - | - | - | - | 31.0% | - | - | 30.0% | - | - | - | - | - | - | |
| Nvidia | 🇺🇸 | Nemotron 3 Nano (30B A3B) | x | 75.0% | 99.2% | 38.8% | 15.5% | $0.06 | $0.24 | 262.1k | Nov. 2025 | 32 | Open | 96c/s | 10.2s | Dec. 2025 | 193 | 24.9 | 28.2 | 2.9 | - | 5.4 | 10.9 | 1.2 | - | 18.1 | 18.2 | 17.8 | - | - | - | - | - | - | - | - | - | - | - | 8.5% | - | - | - | 33.3% | - | - | |
| ZAI | 🇨🇳 | GLM-4.5-Air | x | 75.0% | - | 57.6% | 10.6% | - | - | - | - | 106 | Open | - | - | Jul. 2025 | - | 30.7 | 29.3 | 16.8 | -7.8 | 25.1 | 3.7 | 24.5 | - | 35.5 | 28.5 | 28.2 | - | - | - | 21.3% | - | - | - | - | - | - | - | 30.0% | 77.9% | - | - | 37.3% | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V3.1 | x | 74.9% | 49.8% | 66.0% | 15.9% | $0.27 | $1.00 | 163.8k | - | 671 | Open | - | - | Jan. 2025 | - | 28.6 | 23.4 | 17.0 | 3.3 | - | 11.7 | - | - | 34.1 | 34.1 | 33.9 | - | - | - | 30.0% | - | - | - | - | 93.4% | - | - | 31.3% | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 30B A3B Thinking | ✓ | 74.4% | 83.1% | - | - | $0.20 | $1.00 | 262.1k | - | 31 | Open | - | - | Sep. 2025 | - | 21.9 | 26.3 | - | - | 13.2 | 16.9 | 6.0 | 17.7 | 28.1 | 28.4 | 28.9 | - | - | - | - | 56.6% | 63.0% | 57.3% | - | 23.9% | 30.6% | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT OSS 20B High | x | 74.2% | 98.7% | - | - | - | - | - | - | 20.9 | Open | - | - | Aug. 2025 | 230 | 38.1 | 45.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 2.0 Flash Thinking | ✓ | 74.2% | - | - | - | - | - | - | Aug. 2024 | - | Closed | - | - | Jan. 2025 | - | 22.5 | 13.8 | - | - | - | 19.5 | - | - | - | - | 29.5 | - | - | 75.4% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Inception | 🇺🇸 | Mercury 2 | x | 74.0% | 91.1% | - | - | $0.25 | $0.75 | 128k | - | - | Closed | 886c/s | 1.0s | Feb. 2026 | 125 | 29.5 | 31.6 | 19.2 | - | 7.3 | - | 3.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 38.0% | - | - | |
| Baidu | 🇨🇳 | ERNIE 4.5 | x | 74.0% | - | - | - | $0.40 | $4.00 | 128k | - | 21 | Closed | - | - | Jun. 2025 | 124 | -18.1 | -21.1 | - | - | - | - | - | - | -22.3 | -22.3 | -22.6 | - | - | - | - | - | - | - | - | 1.8% | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek R1 Zero | x | 73.3% | - | - | - | - | - | - | - | 671 | Open | - | - | Jan. 2025 | - | 21.4 | 23.7 | 8.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | o1-preview | x | 73.3% | - | 41.3% | - | $15.00 | $60.00 | 128k | - | - | Closed | - | - | Sep. 2024 | - | 20.1 | 23.2 | 1.7 | - | - | - | - | - | 38.6 | 38.7 | 37.4 | - | - | - | - | - | - | - | - | 42.4% | - | - | - | - | - | - | - | - | - | |
| Meituan | 🇨🇳 | LongCat-Flash-Chat | x | 73.2% | 61.3% | 60.4% | - | $0.30 | $1.20 | 128k | - | 560 | Open | 113c/s | 7.0s | Aug. 2025 | 778 | 23.9 | 23.0 | 16.2 | - | 17.7 | - | 14.4 | - | 35.0 | 35.0 | 34.5 | - | - | - | - | - | - | - | - | - | - | - | 39.5% | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 32B Thinking | ✓ | 73.1% | 83.7% | - | - | - | - | - | - | 33 | Open | - | - | Sep. 2025 | - | 28.5 | 30.3 | - | - | 18.9 | 22.3 | 12.0 | 20.5 | 34.1 | 33.9 | 33.5 | - | - | - | - | 65.2% | 68.1% | 57.1% | - | 55.4% | 41.0% | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude Haiku 4.5 | ✓ | 73.0% | 80.7% | 73.3% | - | $1.00 | $5.00 | 200k | Feb. 2025 | - | Closed | 97c/s | 835ms | Oct. 2025 | 844 | 36.0 | 21.4 | 25.8 | - | 25.2 | 17.2 | 21.8 | 14.4 | - | - | -0.7 | - | 83.0% | - | - | - | - | - | - | - | 50.7% | - | 41.0% | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3-Next-80B-A3B-Instruct | x | 72.9% | 69.5% | - | - | $0.15 | $1.50 | 65.5k | - | 80 | Open | - | - | Sep. 2025 | - | 24.0 | 25.1 | 4.2 | - | 10.7 | 10.9 | 4.8 | - | 32.5 | 32.7 | 29.5 | - | - | - | - | - | - | - | - | - | - | - | - | 60.9% | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT OSS 20B | x | 71.5% | - | - | 10.9% | $0.10 | $0.50 | 131.1k | - | 20.9 | Open | 209c/s | 8.4s | Aug. 2025 | 547 | 20.3 | 22.6 | - | - | -5.1 | 4.1 | -7.2 | - | 16.7 | 16.7 | 22.6 | - | - | - | - | - | - | - | - | - | - | - | - | 54.8% | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5 nano | ✓ | 71.2% | 85.2% | - | 8.7% | - | - | - | May 2024 | - | Closed | - | - | Aug. 2025 | 663 | 24.8 | 25.6 | - | - | - | 1.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 9.6% | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 3 (14B Reasoning 2512) | ✓ | 71.2% | 85.0% | - | - | $0.20 | $0.20 | 262.1k | - | 14 | Open | - | - | Dec. 2025 | - | 27.6 | 29.6 | 17.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Small 4 | ✓ | 71.2% | 83.8% | - | - | $0.15 | $0.60 | 256k | - | 119 | Open | 468c/s | 804ms | Mar. 2026 | -113 | 24.1 | 24.5 | 16.1 | - | - | 12.3 | - | 39.9 | 21.8 | 21.8 | 21.5 | - | - | - | - | - | 60.0% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Magistral Medium | ✓ | 70.8% | 64.9% | - | 9.0% | - | - | - | Jun. 2025 | 24 | Open | - | - | Jun. 2025 | - | 17.9 | 16.6 | 5.8 | - | - | 2.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 30B A3B Instruct | ✓ | 70.4% | 69.3% | - | - | $0.20 | $0.70 | 262.1k | - | 31 | Open | - | - | Sep. 2025 | - | 16.6 | 19.3 | - | - | 8.7 | 16.0 | 1.1 | 18.9 | 21.4 | 21.7 | 22.0 | - | - | - | - | 48.9% | 60.4% | 60.5% | - | 27.0% | 30.3% | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-4o | ✓ | 70.1% | - | 33.2% | 5.3% | $2.50 | $10.00 | 128k | - | - | Closed | - | - | Aug. 2024 | - | 17.6 | 15.3 | 1.3 | - | 6.7 | 12.6 | 5.8 | 15.5 | 19.7 | 19.7 | 20.2 | - | 81.4% | 72.2% | - | 58.8% | 59.9% | - | - | 38.2% | - | - | - | 60.3% | - | - | - | - | - | |
| MiniMax | 🇨🇳 | MiniMax M1 80K | x | 70.0% | 76.9% | 56.0% | 8.4% | $0.55 | $2.20 | 1M | - | 456 | Open | - | - | Jun. 2025 | - | 26.0 | 25.0 | 14.4 | - | 17.8 | 1.5 | 17.4 | 30.5 | 26.7 | 26.7 | 26.4 | - | - | - | - | - | - | - | - | 18.5% | - | - | - | 63.5% | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 8B Thinking | ✓ | 69.9% | 80.3% | - | - | $0.18 | $2.09 | 262.1k | - | 9 | Open | - | - | Sep. 2025 | - | 18.9 | 22.1 | - | - | 15.1 | 15.2 | -11.0 | 13.4 | 21.4 | 23.4 | 21.5 | - | - | - | - | 53.0% | 60.4% | 46.6% | - | 49.6% | 33.9% | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Llama 4 Maverick | ✓ | 69.8% | - | - | - | $0.17 | $0.60 | 1M | - | 400 | Open | - | - | Apr. 2025 | - | 15.6 | 22.8 | 5.7 | - | - | 18.8 | - | - | 23.5 | 23.5 | 25.4 | - | - | 73.4% | - | - | 59.6% | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-4.5 | ✓ | 69.5% | - | 38.0% | - | $75.00 | $150.00 | 128k | - | - | Closed | - | - | Feb. 2025 | - | 23.6 | 24.7 | 7.9 | - | 13.3 | 18.8 | 12.5 | 13.4 | 40.2 | 40.3 | 36.0 | - | 85.1% | 75.2% | - | 55.4% | - | - | - | 62.5% | - | - | - | 68.4% | - | - | - | - | - | |
| MiniMax | 🇨🇳 | MiniMax M1 40K | x | 69.2% | 74.6% | 55.6% | 7.2% | - | - | - | - | 456 | Open | - | - | Jun. 2025 | - | 24.8 | 22.7 | 12.8 | - | 17.1 | 0.6 | 15.7 | 32.9 | 25.7 | 25.7 | 25.4 | - | - | - | - | - | - | - | - | 17.9% | - | - | - | 67.8% | - | - | - | - | - | |
| Microsoft | 🇺🇸 | Phi 4 Reasoning Plus | x | 68.9% | 78.0% | - | - | - | - | - | Mar. 2025 | 14 | Open | - | - | Apr. 2025 | - | 18.4 | 22.4 | 9.8 | - | - | - | - | - | 19.5 | 19.5 | 19.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 32B Instruct | ✓ | 68.9% | 66.2% | - | - | - | - | - | - | 33 | Open | - | - | Sep. 2025 | - | 21.6 | 22.9 | - | - | 12.5 | 20.2 | 8.3 | 23.0 | 25.8 | 24.8 | 25.7 | - | - | - | - | 62.8% | 65.3% | 57.9% | - | - | 32.6% | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V3 0324 | x | 68.4% | - | - | - | $0.28 | $1.14 | 163.8k | - | 671 | Open | - | - | Mar. 2025 | - | 16.7 | 19.5 | 6.8 | - | - | - | - | - | 28.4 | 28.3 | 28.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Magistral Small 2506 | x | 68.2% | 62.8% | - | - | - | - | - | Jun. 2025 | 24 | Open | - | - | Jun. 2025 | - | 14.8 | 12.5 | 8.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude 3.5 Sonnet | ✓ | 67.2% | - | 49.0% | - | $3.00 | $15.00 | 200k | - | - | Closed | - | - | Oct. 2024 | - | 22.5 | 25.4 | 17.9 | - | 13.1 | 20.1 | 11.3 | - | 30.2 | 30.2 | 27.1 | - | - | 68.3% | - | - | - | - | - | - | - | - | - | 69.2% | - | - | - | - | - | |
| Meituan | 🇨🇳 | LongCat-Flash-Lite | x | 66.8% | 63.2% | 54.4% | - | $0.10 | $0.40 | 256k | - | 68.5 | Open | 119c/s | 6.7s | Feb. 2026 | 712 | 23.2 | 20.0 | 12.3 | - | 18.6 | - | 15.3 | - | 22.3 | 22.3 | 21.9 | - | - | - | - | - | - | - | - | - | - | - | 33.8% | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 3 (8B Reasoning 2512) | ✓ | 66.8% | 78.7% | - | - | $0.15 | $0.15 | 262.1k | - | 8 | Open | - | - | Dec. 2025 | - | 22.6 | 24.7 | 14.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Nvidia | 🇺🇸 | Llama-3.3 Nemotron Super 49B v1 | x | 66.7% | 58.4% | - | - | - | - | - | Dec. 2023 | 49.9 | Open | - | - | Mar. 2025 | - | 15.9 | 16.8 | - | - | 27.8 | - | 16.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Sarvam AI | 🇮🇳 | Sarvam-30B | x | 66.5% | 96.7% | 34.0% | - | - | - | - | - | 30 | Open | - | - | Mar. 2026 | - | 24.1 | 28.6 | 12.0 | 1.9 | - | - | - | - | 22.0 | 22.0 | 21.7 | - | - | - | 35.5% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-4.1 | ✓ | 66.3% | 46.4% | 54.6% | 5.4% | $2.00 | $8.00 | 1.0M | Jun. 2024 | - | Closed | 259c/s | 1.5s | Apr. 2025 | 657 | 21.8 | 20.3 | 8.2 | - | 10.6 | 15.3 | 12.1 | 20.2 | 33.3 | 33.4 | 32.0 | - | 87.3% | 74.8% | - | 56.7% | - | - | - | - | - | - | - | 68.0% | - | - | - | - | - | |
| Nous Research | 🇺🇸 | Hermes 3 70B | x | 66.1% | - | - | - | - | - | - | - | 70 | Open | - | - | Aug. 2024 | 71 | -1.7 | -1.7 | - | - | 34.8 | - | - | - | 3.2 | 4.1 | 2.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 30B A3B | x | 65.8% | 70.9% | - | - | $0.10 | $0.44 | 128k | - | 30.5 | Open | 250c/s | 303ms | Apr. 2025 | 58 | 18.2 | 19.8 | 15.3 | - | 7.9 | - | 16.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Microsoft | 🇺🇸 | Phi 4 Reasoning | x | 65.8% | 62.9% | - | - | - | - | - | Mar. 2025 | 14 | Open | - | - | Apr. 2025 | - | 15.2 | 16.6 | 11.3 | - | - | - | - | - | 16.6 | 16.6 | 16.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek R1 Distill Llama 70B | x | 65.2% | - | - | - | $0.10 | $0.40 | 128k | - | 70.6 | Open | - | - | Jan. 2025 | - | 19.6 | 23.0 | 13.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | QwQ-32B | x | 65.2% | - | - | - | - | - | - | Nov. 2024 | 32.5 | Open | - | - | Mar. 2025 | - | 14.8 | 17.8 | 15.9 | - | - | - | 5.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | QwQ-32B-Preview | x | 65.2% | - | - | - | $0.15 | $0.60 | 32.8k | Nov. 2024 | 32.5 | Open | - | - | Nov. 2024 | - | 11.2 | 11.1 | 7.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-4.1 mini | ✓ | 65.0% | 40.2% | 23.6% | 3.7% | $0.40 | $1.60 | 1.0M | May 2024 | - | Closed | 206c/s | 1.9s | Apr. 2025 | 966 | 16.1 | 16.5 | -1.1 | - | 2.7 | 14.7 | -0.5 | 13.6 | 25.0 | 25.1 | 26.0 | - | 78.5% | 72.7% | - | 56.8% | - | - | - | - | - | - | - | 55.8% | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 2.5 Flash-Lite | ✓ | 64.6% | 49.8% | 31.6% | 5.1% | $0.10 | $0.40 | 1.0M | Jan. 2025 | - | Open | - | - | Jun. 2025 | - | 12.9 | 10.3 | -2.1 | - | - | 9.1 | - | -13.7 | - | - | 23.0 | - | - | 72.9% | - | - | - | - | - | 10.7% | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 4B Thinking | ✓ | 64.1% | 74.5% | - | - | $0.10 | $1.00 | 262.1k | - | 4 | Open | - | - | Sep. 2025 | - | 15.6 | 18.3 | - | - | 12.2 | 12.1 | 2.9 | 11.9 | 19.5 | 18.8 | 18.0 | - | - | - | - | 50.3% | 57.0% | 49.2% | - | - | 31.4% | - | - | - | - | - | - | - | - | |
| Nvidia | 🇺🇸 | Nemotron Nano 9B v2 | x | 64.0% | 72.1% | - | - | - | - | - | Sep. 2024 | 8.9 | Open | - | - | Aug. 2025 | - | 22.9 | 23.0 | 21.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek R1 Distill Qwen 32B | x | 62.1% | - | - | - | $0.12 | $0.18 | 128k | - | 32.8 | Open | - | - | Jan. 2025 | - | 17.3 | 20.4 | 13.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 2.0 Flash | ✓ | 62.1% | - | - | - | $0.10 | $0.40 | 1.0M | Aug. 2024 | - | Closed | - | - | Dec. 2024 | - | 16.6 | 26.5 | 3.2 | - | - | 13.7 | - | 16.1 | 20.0 | 20.0 | 22.7 | - | - | 70.7% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 Max | x | 62.0% | 81.6% | 69.6% | - | $0.50 | $5.00 | 256k | - | 1000 | Closed | 86c/s | 1.5s | Dec. 2025 | 636 | 29.1 | 30.9 | 17.9 | - | - | - | 5.7 | - | 39.6 | 39.8 | 38.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | o1-mini | x | 60.0% | - | - | - | $3.00 | $12.00 | 128k | - | - | Closed | - | - | Sep. 2024 | - | 11.7 | 13.1 | 22.7 | - | - | - | - | - | 17.0 | 17.1 | 16.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude 3.5 Sonnet | ✓ | 59.4% | - | - | - | $3.00 | $15.00 | 200k | - | - | Closed | - | - | Jun. 2024 | - | 18.5 | 26.5 | 21.7 | - | - | - | - | - | 30.1 | 30.1 | 29.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek R1 Distill Qwen 14B | x | 59.1% | - | - | - | - | - | - | - | 14.8 | Open | - | - | Jan. 2025 | - | 14.1 | 16.9 | 10.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V3 | x | 59.1% | - | 42.0% | - | $0.27 | $1.10 | 131.1k | - | 671 | Open | - | - | Dec. 2024 | - | 16.1 | 20.4 | 8.9 | - | - | - | - | 6.5 | 25.4 | 25.5 | 25.0 | - | - | - | - | - | - | - | - | 24.9% | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 1.5 Pro | ✓ | 59.1% | - | - | - | $2.50 | $10.00 | 2.1M | Nov. 2023 | - | Closed | - | - | May 2024 | - | 10.3 | 17.9 | 3.5 | - | - | 14.5 | - | 25.3 | 21.0 | 21.1 | 20.6 | - | - | 65.9% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 4 E4B | ✓ | 58.6% | - | - | - | - | - | - | Jan. 2025 | 8 | Open | - | - | Apr. 2026 | - | 12.6 | 14.7 | - | - | - | 10.8 | -1.6 | 7.2 | 11.6 | 11.6 | 12.6 | - | 76.6% | - | - | - | 52.6% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Llama 4 Scout | ✓ | 57.2% | - | - | - | $0.08 | $0.30 | 10M | - | 109 | Open | - | - | Apr. 2025 | - | 6.2 | 14.3 | -0.5 | - | - | 15.4 | - | - | 12.8 | 12.9 | 15.8 | - | - | 69.4% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Microsoft | 🇺🇸 | Phi 4 | x | 56.1% | - | - | - | $0.07 | $0.14 | 16k | Jun. 2024 | 14.7 | Open | - | - | Dec. 2024 | - | 4.9 | 13.7 | 2.2 | - | - | - | - | - | 15.5 | 15.5 | 15.2 | - | - | - | - | - | - | - | - | 3.0% | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-2 | ✓ | 56.0% | - | - | - | $2.00 | $10.00 | 128k | - | - | Closed | - | - | Aug. 2024 | - | 12.4 | 20.9 | 13.6 | - | - | 13.3 | - | - | 23.7 | 23.8 | 22.4 | - | - | 66.1% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Nvidia | 🇺🇸 | Llama 3.1 Nemotron Nano 8B V1 | x | 54.1% | 47.1% | - | - | - | - | - | Dec. 2023 | 8 | Open | - | - | Mar. 2025 | - | 6.0 | 12.7 | - | - | 7.3 | - | -1.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-4o | ✓ | 53.6% | - | - | - | $2.50 | $10.00 | 128k | - | - | Closed | - | - | May 2024 | - | 13.6 | 20.6 | 19.4 | - | - | 6.9 | - | - | 24.4 | 24.4 | 24.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 3 (3B Reasoning 2512) | ✓ | 53.4% | 72.1% | - | - | $0.10 | $0.10 | 131.1k | - | 3 | Open | - | - | Dec. 2025 | - | 15.0 | 17.2 | 11.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Microsoft | 🇺🇸 | Phi 4 Mini Reasoning | x | 52.0% | - | - | - | - | - | - | Feb. 2025 | 3.8 | Open | - | - | Apr. 2025 | - | 9.3 | 16.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3.5-2B | ✓ | 51.6% | - | - | - | - | - | - | - | 2 | Open | - | - | Mar. 2026 | - | -0.3 | 4.2 | - | - | -6.1 | -4.6 | -4.1 | 0.4 | 4.4 | 4.4 | 4.0 | - | 63.1% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 2.0 Flash-Lite | ✓ | 51.5% | - | - | - | $0.07 | $0.30 | 1.0M | Jun. 2024 | - | Closed | - | - | Feb. 2025 | - | 12.4 | 20.7 | - | - | - | 6.2 | - | -2.5 | 13.7 | 13.8 | 16.9 | - | - | 68.0% | - | - | - | - | - | 21.7% | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 1.5 Flash | ✓ | 51.0% | - | - | - | $0.15 | $0.60 | 1.0M | Nov. 2023 | - | Closed | - | - | May 2024 | - | 4.3 | 11.1 | -3.9 | - | - | 10.7 | - | 21.4 | 7.2 | 7.2 | 10.5 | - | - | 62.3% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-2 mini | ✓ | 51.0% | - | - | - | - | - | - | - | - | Closed | - | - | Aug. 2024 | - | 8.5 | 18.0 | 6.9 | - | - | 10.7 | - | - | 19.9 | 20.0 | 19.0 | - | - | 63.2% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Llama 3.1 405B Instruct | x | 50.7% | - | - | - | $0.89 | $0.89 | 128k | - | 405 | Open | - | - | Jul. 2024 | - | 14.2 | 22.1 | 16.5 | - | - | - | 40.5 | - | 22.4 | 22.4 | 22.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Llama 3.3 70B Instruct | x | 50.5% | - | - | - | $0.20 | $0.20 | 128k | - | 70 | Open | - | - | Dec. 2024 | - | 12.4 | 19.7 | 13.2 | - | - | - | 29.2 | - | 17.4 | 17.4 | 17.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude 3 Opus | ✓ | 50.4% | - | - | - | $15.00 | $75.00 | 200k | - | - | Closed | - | - | Feb. 2024 | - | 9.0 | 17.6 | 5.5 | - | - | - | - | - | 18.4 | 18.5 | 18.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-4.1 nano | ✓ | 50.3% | - | - | - | $0.10 | $0.40 | 1.0M | May 2024 | - | Closed | - | - | Apr. 2025 | - | 1.3 | 5.6 | -18.7 | - | -19.7 | 1.8 | -20.5 | 3.8 | 5.4 | 5.4 | 6.3 | - | 66.9% | 55.4% | - | 40.5% | - | - | - | - | - | - | - | 22.6% | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5 32B Instruct | x | 49.5% | - | - | - | - | - | - | - | 32.5 | Open | - | - | Sep. 2024 | - | 5.4 | 18.1 | 12.4 | - | - | - | - | - | 10.3 | 10.5 | 10.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek R1 Distill Qwen 7B | x | 49.1% | - | - | - | - | - | - | - | 7.6 | Open | - | - | Jan. 2025 | - | 11.3 | 18.7 | 4.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek R1 Distill Llama 8B | x | 49.0% | - | - | - | - | - | - | - | 8.0 | Open | - | - | Jan. 2025 | - | 8.4 | 13.4 | 4.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5 72B Instruct | x | 49.0% | - | - | - | $0.35 | $0.40 | 131.1k | - | 72.7 | Open | - | - | Sep. 2024 | - | 12.2 | 19.6 | 11.7 | - | 31.0 | - | - | - | 13.5 | 13.5 | 13.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| MoonshotAI | 🇨🇳 | Kimi K2 Base | x | 48.1% | - | - | - | - | - | - | - | 1000 | Open | - | - | Jul. 2025 | - | 13.9 | 16.2 | 17.2 | - | - | - | - | - | 20.3 | 20.4 | 19.9 | - | - | - | - | - | - | - | - | 35.3% | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-4 Turbo | x | 48.0% | - | - | - | $10.00 | $30.00 | 128k | Dec. 2023 | - | Closed | - | - | Apr. 2024 | - | 9.3 | 20.1 | 8.6 | - | - | - | - | - | 23.1 | 23.1 | 22.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 235B A22B | x | 47.5% | 81.5% | - | - | $0.10 | $0.10 | 128k | - | 235 | Open | - | - | Apr. 2025 | - | 15.9 | 22.2 | 16.2 | - | - | - | 22.3 | - | 18.2 | 18.2 | 17.8 | - | 86.7% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Amazon | 🇺🇸 | Nova Pro | ✓ | 46.9% | - | - | - | $0.80 | $3.20 | 300k | - | - | Closed | - | - | Nov. 2024 | - | 11.6 | 21.1 | 15.3 | 20.6 | - | 10.8 | 13.9 | 10.9 | 21.0 | 19.5 | 17.0 | - | - | 61.7% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Llama 3.2 90B Instruct | ✓ | 46.7% | - | - | - | $0.35 | $0.40 | 128k | - | 90 | Open | - | - | Sep. 2024 | - | 6.5 | 13.2 | - | - | - | 7.0 | - | - | 20.8 | 20.9 | 17.3 | - | - | 60.3% | - | - | 45.2% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Small 3.2 24B Instruct | ✓ | 46.1% | - | - | - | - | - | - | Oct. 2023 | 23.6 | Open | - | - | Jun. 2025 | - | 7.6 | 12.3 | - | - | 9.0 | 15.2 | - | - | 11.5 | 11.6 | 13.2 | - | - | 62.5% | - | - | - | - | - | 12.1% | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Small 3.1 24B Instruct | ✓ | 46.0% | - | - | - | - | - | - | - | 24 | Open | - | - | Mar. 2025 | - | 3.1 | 9.1 | 13.8 | - | - | 3.6 | - | - | 9.2 | 9.2 | 9.8 | - | - | 59.3% | - | - | - | - | - | 10.4% | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5 VL 32B Instruct | ✓ | 46.0% | - | - | - | - | - | - | - | 33.5 | Open | - | - | Feb. 2025 | - | 8.2 | 12.7 | 19.6 | - | - | 10.7 | - | 10.2 | 7.6 | 7.7 | 12.6 | - | - | 70.0% | - | - | 49.5% | 39.4% | - | - | 5.9% | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5 14B Instruct | x | 45.5% | - | - | - | - | - | - | - | 14.7 | Open | - | - | Sep. 2024 | - | 1.0 | 11.1 | 2.9 | - | - | - | - | - | 5.3 | 6.3 | 6.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Small 3 24B Instruct | x | 45.3% | - | - | - | $0.07 | $0.14 | 32k | Oct. 2023 | 24 | Open | - | - | Jan. 2025 | - | 2.8 | 8.8 | 4.6 | - | 10.0 | - | - | - | 5.1 | 5.1 | 4.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Large 3 (675B Instruct 2512) | ✓ | 43.9% | - | - | - | $0.50 | $1.50 | 262.1k | - | 675 | Open | 181c/s | 5.6s | Dec. 2025 | 626 | 8.3 | 19.9 | 0.1 | - | - | - | - | - | - | - | - | - | 85.5% | - | - | - | - | - | - | 23.8% | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Large 3 (675B Base) | ✓ | 43.9% | - | - | - | - | - | - | - | 675 | Open | - | - | Dec. 2025 | - | 11.8 | 26.3 | 1.9 | - | - | - | - | - | - | - | - | - | 85.5% | - | - | - | - | - | - | 23.8% | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Large 3 (675B Instruct 2512 Eagle) | ✓ | 43.9% | - | - | - | - | - | - | - | 675 | Open | - | - | Dec. 2025 | - | 10.5 | 24.0 | 1.3 | - | - | - | - | - | - | - | - | - | 85.5% | - | - | - | - | - | - | 23.8% | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Large 3 (675B Instruct 2512 NVFP4) | ✓ | 43.9% | - | - | - | - | - | - | - | 675 | Open | - | - | Dec. 2025 | - | 9.3 | 22.0 | 0.7 | - | - | - | - | - | - | - | - | - | 85.5% | - | - | - | - | - | - | 23.8% | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 4 E2B | ✓ | 43.4% | - | - | - | - | - | - | Jan. 2025 | 5.1 | Open | - | - | Apr. 2026 | - | 3.3 | 6.9 | - | - | - | 6.5 | -11.0 | -1.4 | 1.3 | 1.4 | 3.1 | - | 67.4% | - | - | - | 44.2% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3 27B | ✓ | 42.4% | - | - | - | $0.10 | $0.20 | 131.1k | - | 27 | Open | - | - | Mar. 2025 | - | 7.2 | 19.5 | 4.9 | - | - | -0.3 | - | - | 8.3 | 8.3 | 7.4 | - | - | - | - | - | - | - | - | 10.0% | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2 72B Instruct | x | 42.4% | - | - | - | - | - | - | - | 72 | Open | - | - | Jul. 2024 | - | 3.7 | 11.5 | 11.8 | - | - | - | - | - | 8.7 | 7.1 | 6.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Amazon | 🇺🇸 | Nova Lite | ✓ | 42.0% | - | - | - | $0.06 | $0.24 | 300k | - | - | Closed | - | - | Nov. 2024 | - | 5.8 | 15.0 | 5.7 | 11.6 | - | 6.1 | 10.3 | 5.4 | 10.3 | 8.2 | 8.1 | - | - | 56.2% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Llama 3.1 70B Instruct | x | 41.7% | - | - | - | $0.20 | $0.20 | 128k | - | 70 | Open | - | - | Jul. 2024 | - | 4.7 | 13.0 | 1.6 | - | - | - | 32.2 | - | 11.4 | 11.5 | 11.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude 3.5 Haiku | x | 41.6% | - | 40.6% | - | $0.80 | $4.00 | 200k | - | - | Closed | - | - | Oct. 2024 | - | 5.4 | 12.1 | 7.2 | - | -8.2 | - | -10.0 | - | 4.2 | 4.2 | 4.0 | - | - | - | - | - | - | - | - | - | - | - | - | 51.0% | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3 12B | ✓ | 40.9% | - | - | - | $0.05 | $0.10 | 131.1k | - | 12 | Open | - | - | Mar. 2025 | - | 4.2 | 14.5 | 2.0 | - | - | -1.7 | - | - | 2.7 | 2.7 | 2.0 | - | - | - | - | - | - | - | - | 6.3% | - | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude 3 Sonnet | ✓ | 40.4% | - | - | - | $3.00 | $15.00 | 200k | - | - | Closed | - | - | Feb. 2024 | - | -0.5 | 7.8 | -5.0 | - | - | - | - | - | 4.2 | 4.2 | 3.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini Diffusion | x | 40.4% | 23.3% | 22.9% | - | - | - | - | - | - | Closed | - | - | May 2025 | - | 0.7 | -3.4 | 5.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-4o mini | ✓ | 40.2% | - | 8.7% | - | $0.15 | $0.60 | 128k | Oct. 2023 | - | Closed | - | - | Jul. 2024 | - | 3.2 | 12.8 | 1.3 | - | - | 3.9 | - | - | 12.2 | 12.3 | 12.4 | - | - | 59.4% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Amazon | 🇺🇸 | Nova Micro | x | 40.0% | - | - | - | $0.03 | $0.14 | 128k | - | - | Closed | - | - | Nov. 2024 | - | -0.8 | 9.5 | 0.5 | -1.8 | - | - | -4.0 | -5.1 | 1.9 | 2.2 | 1.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 1.5 Flash 8B | ✓ | 38.4% | - | - | - | $0.07 | $0.30 | 1.0M | Oct. 2024 | 8 | Closed | - | - | Mar. 2024 | - | -2.1 | 2.3 | - | - | - | 1.3 | - | 16.0 | 1.6 | 1.6 | 3.9 | - | - | 53.7% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Small 3.1 24B Base | ✓ | 37.5% | - | - | - | $0.10 | $0.30 | 128k | - | 24 | Open | - | - | Mar. 2025 | - | -1.0 | 5.7 | - | - | - | 4.2 | - | - | 6.7 | 6.8 | 8.5 | - | - | 59.3% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| AI21 Labs | 🇮🇱 | Jamba 1.5 Large | x | 36.9% | - | - | - | $2.00 | $8.00 | 256k | Mar. 2024 | 398 | Open | - | - | Aug. 2024 | - | -1.1 | 4.9 | - | - | -0.2 | - | - | - | 6.1 | 6.2 | 5.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Microsoft | 🇺🇸 | Phi-3.5-MoE-instruct | x | 36.8% | - | - | - | - | - | - | - | 60 | Open | - | - | Aug. 2024 | - | -1.7 | 4.0 | -8.3 | - | - | - | - | 16.2 | 3.9 | 3.9 | 3.6 | - | 69.9% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5 7B Instruct | x | 36.4% | - | - | - | $0.30 | $0.30 | 131.1k | - | 7.6 | Open | - | - | Sep. 2024 | - | 0.7 | 8.7 | 1.7 | - | 21.5 | - | - | - | 0.7 | 0.7 | 0.5 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-1.5 | x | 35.9% | - | - | - | - | - | - | - | - | Closed | - | - | Mar. 2024 | - | -4.4 | 5.3 | -4.7 | - | - | -1.9 | - | - | 6.1 | 6.1 | 5.7 | - | - | 53.6% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-4 | ✓ | 35.7% | - | - | - | $30.00 | $60.00 | 32.8k | Dec. 2022 | - | Closed | - | - | Jun. 2023 | - | 0.0 | 9.7 | -9.6 | - | - | - | - | - | 22.6 | 22.6 | 22.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Small 3 24B Base | ✓ | 34.4% | - | - | - | - | - | - | Oct. 2023 | 23.6 | Open | - | - | Jan. 2025 | - | -3.1 | 4.0 | - | - | - | - | - | - | 6.3 | 9.2 | 6.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek R1 Distill Qwen 1.5B | x | 33.8% | - | - | - | - | - | - | - | 1.8 | Open | - | - | Jan. 2025 | - | -3.9 | 7.5 | -7.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Anthropic | 🇺🇸 | Claude 3 Haiku | ✓ | 33.3% | - | - | - | $0.25 | $1.25 | 200k | - | - | Closed | - | - | Mar. 2024 | - | -4.2 | 4.2 | -1.8 | - | - | - | - | - | 0.3 | 0.3 | -0.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Llama 3.2 11B Instruct | ✓ | 32.8% | - | - | - | $0.05 | $0.05 | 128k | Dec. 2023 | 10.6 | Open | - | - | Sep. 2024 | - | -3.0 | 1.2 | - | - | - | 2.7 | - | - | -1.6 | -1.6 | -0.9 | - | - | 50.7% | - | - | 33.0% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Llama 3.2 3B Instruct | x | 32.8% | - | - | - | $0.01 | $0.02 | 128k | - | 3.2 | Open | - | - | Sep. 2024 | - | -10.4 | -5.4 | - | - | - | - | 8.4 | - | -13.9 | -13.9 | -14.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| AI21 Labs | 🇮🇱 | Jamba 1.5 Mini | x | 32.3% | - | - | - | $0.20 | $0.40 | 256.1k | Mar. 2024 | 52 | Open | - | - | Aug. 2024 | - | -8.9 | -5.8 | - | - | -9.5 | - | - | - | -5.7 | -5.7 | -6.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3 4B | ✓ | 30.8% | - | - | - | $0.02 | $0.04 | 131.1k | Aug. 2024 | 4 | Open | - | - | Mar. 2025 | - | -6.2 | 5.3 | -10.3 | - | - | -12.3 | - | - | -9.8 | -9.7 | -9.8 | - | - | - | - | - | - | - | - | 4.0% | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-3.5 Turbo | x | 30.8% | - | - | - | $0.50 | $1.50 | 16.4k | Sep. 2021 | - | Closed | - | - | Mar. 2023 | - | -12.4 | -3.7 | -10.0 | - | - | -20.0 | - | - | -5.2 | -5.2 | -8.2 | - | - | 0.0% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5-Omni-7B | ✓ | 30.8% | - | - | - | - | - | - | - | 7 | Open | - | - | Mar. 2025 | - | -0.9 | 5.3 | -1.6 | - | -11.3 | 9.1 | - | 1.9 | -6.9 | -6.9 | 1.4 | - | - | 59.2% | - | - | 36.6% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Meta | 🇺🇸 | Llama 3.1 8B Instruct | x | 30.4% | - | - | - | $0.03 | $0.03 | 131.1k | Dec. 2023 | 8 | Open | - | - | Jul. 2024 | - | -7.1 | -3.8 | -6.2 | - | - | - | 23.9 | - | -4.1 | -4.1 | -4.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Microsoft | 🇺🇸 | Phi-3.5-mini-instruct | x | 30.4% | - | - | - | $0.10 | $0.10 | 128k | - | 3.8 | Open | - | - | Aug. 2024 | - | -8.9 | -3.2 | -13.5 | - | - | - | - | 14.4 | -0.4 | -0.3 | -0.6 | - | 55.4% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemini 1.0 Pro | x | 27.9% | - | - | - | $0.50 | $1.50 | 32.8k | Feb. 2024 | - | Closed | - | - | Feb. 2024 | - | -12.2 | -3.3 | - | - | - | -7.3 | - | -12.2 | -2.7 | -2.7 | -3.1 | - | - | 47.9% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2 7B Instruct | x | 25.3% | - | - | - | - | - | - | - | 7.6 | Open | - | - | Jul. 2024 | - | -7.4 | -2.7 | -1.3 | - | 15.6 | - | - | - | -6.4 | -5.1 | -5.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Microsoft | 🇺🇸 | Phi 4 Mini | x | 25.2% | - | - | - | - | - | - | Jun. 2024 | 3.8 | Open | - | - | Feb. 2025 | - | -8.3 | 1.8 | - | - | - | - | - | - | 0.7 | 0.7 | 0.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3n E2B Instructed | ✓ | 24.8% | 6.7% | - | - | - | - | - | Jun. 2024 | 8 | Closed | - | - | Jun. 2025 | - | -15.4 | -10.5 | -8.6 | - | - | - | - | - | -13.9 | -13.9 | -14.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3n E2B Instructed LiteRT (Preview) | ✓ | 24.8% | 6.7% | - | - | - | - | - | Jun. 2024 | 1.9 | Open | - | - | May 2025 | - | -17.5 | -12.4 | -11.0 | -16.8 | - | - | - | - | -17.3 | -17.3 | -17.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3n E4B Instructed | ✓ | 23.7% | 11.6% | - | - | $20.00 | $40.00 | 32k | Jun. 2024 | 8 | Closed | - | - | Jun. 2025 | - | -8.4 | -2.0 | -3.7 | - | - | - | - | - | -5.5 | -5.5 | -5.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3n E4B Instructed LiteRT Preview | ✓ | 23.7% | 11.6% | - | - | - | - | - | Jun. 2024 | 1.9 | Open | - | - | May 2025 | - | -11.3 | -3.5 | -5.7 | 2.7 | - | - | - | - | -7.0 | -7.0 | -7.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3 1B | x | 19.2% | - | - | - | - | - | - | - | 1 | Open | - | - | Mar. 2025 | - | -20.7 | -12.7 | -18.5 | - | - | - | - | - | -27.8 | -27.8 | -28.1 | - | - | - | - | - | - | - | - | 2.2% | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3.5-0.8B | ✓ | 11.9% | - | - | - | - | - | - | - | 0.8 | Open | - | - | Mar. 2026 | - | -15.1 | -8.3 | - | - | -12.7 | -18.2 | -20.8 | -15.3 | -7.5 | -7.5 | -7.7 | - | 44.3% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| ZAI | 🇨🇳 | GLM-5 | x | - | - | 77.8% | - | $1.00 | $3.20 | 200k | - | 744 | Open | 86c/s | 9.6s | Feb. 2026 | 1,576 | 52.1 | - | 37.3 | 26.5 | - | - | 26.7 | - | - | - | - | - | - | - | 75.9% | - | - | - | 67.8% | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.3 Codex | ✓ | - | - | - | - | $1.75 | $14.00 | 400k | - | - | Closed | 172c/s | 1.8s | Feb. 2026 | 1,244 | 56.2 | - | 43.9 | - | - | 27.9 | 37.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 56.8% | |
| OpenAI | 🇺🇸 | GPT-5.2 Codex | ✓ | - | - | - | - | $1.75 | $14.00 | 400k | - | - | Closed | 135c/s | 2.2s | Jan. 2026 | 1,148 | 52.6 | - | 40.2 | - | - | - | 27.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 56.4% | |
| MiniMax | 🇨🇳 | MiniMax M2.5 | x | - | - | 80.2% | - | $0.30 | $1.20 | 1M | - | 230 | Open | 200c/s | 4.2s | Feb. 2026 | 967 | 53.4 | - | 38.9 | 27.0 | - | - | - | - | -1.6 | - | - | - | - | - | 76.3% | - | - | - | - | - | - | - | - | - | - | - | - | - | 55.4% | |
| xAI | 🇺🇸 | Grok-4.20 Beta Non-Reasoning | ✓ | - | - | - | - | $2.00 | $6.00 | 2M | - | - | Closed | 219c/s | 1.9s | Mar. 2026 | 914 | - | - | - | - | - | - | - | 13.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.1 Medium | ✓ | - | 98.4% | - | - | $1.25 | $10.00 | 400k | - | - | Closed | 239c/s | 15.5s | Nov. 2025 | 860 | 46.0 | 44.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V3.2-Speciale | x | - | 96.0% | 73.1% | 30.6% | - | - | - | - | 685 | Open | - | - | Dec. 2025 | 832 | 43.0 | 46.2 | 23.5 | - | - | 25.5 | 13.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 35.2% | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V3.2 (Non-thinking) | x | - | - | - | - | $0.28 | $0.42 | 131.1k | - | 685 | Open | - | - | Dec. 2025 | 703 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.1 Codex High | ✓ | - | 96.7% | - | - | $1.25 | $10.00 | 400k | - | - | Closed | 184c/s | 29.6s | Nov. 2025 | 695 | 44.4 | 42.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.1 Codex | ✓ | - | - | 73.7% | - | $1.25 | $10.00 | 400k | Sep. 2024 | - | Closed | 222c/s | 6.9s | Nov. 2025 | 687 | 43.3 | - | 29.4 | - | - | - | 17.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-4.20 Multi-Agent Beta | ✓ | - | - | - | - | - | - | - | - | - | Closed | - | - | Mar. 2026 | 660 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| StepFun | 🇨🇳 | Step-3.5-Flash | x | - | 97.3% | 74.4% | - | $0.10 | $0.40 | 65.5k | - | 196 | Open | 109c/s | 7.3s | Feb. 2026 | 587 | 49.9 | 47.1 | 28.9 | 22.4 | - | - | 19.1 | - | - | - | - | - | - | - | 69.0% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3-Coder | x | - | - | - | - | $0.18 | $0.18 | 256k | - | 480 | Open | 105c/s | 1.1s | Jan. 2025 | 580 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-4 Fast Reasoning | ✓ | - | - | - | - | $0.20 | $0.50 | 2M | - | - | Closed | 276c/s | 3.4s | Aug. 2025 | 557 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| MiniMax | 🇨🇳 | MiniMax M2.7 | x | - | - | - | - | $0.30 | $1.20 | 204.8k | - | - | Open | 145c/s | 2.9s | Mar. 2026 | 555 | 53.6 | - | 40.1 | - | - | - | 24.3 | - | 31.2 | 31.5 | - | - | - | - | - | - | - | - | - | - | - | 46.3% | - | - | - | - | - | - | 56.2% | |
| xAI | 🇺🇸 | Grok-4.1 Fast Non-Reasoning | ✓ | - | - | - | - | $0.20 | $0.50 | 2M | - | - | Closed | 460c/s | 1.1s | Nov. 2025 | 533 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-4.1 Fast Reasoning | ✓ | - | - | - | - | $0.20 | $0.50 | 2M | - | - | Closed | 193c/s | 20.0s | Nov. 2025 | 532 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-4.20 Beta Reasoning | ✓ | - | - | - | - | $2.00 | $6.00 | 2M | - | - | Closed | 146c/s | 21.5s | Mar. 2026 | 494 | - | - | - | - | - | - | - | 29.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-4 Fast Non-Reasoning | ✓ | - | - | - | - | $0.20 | $0.50 | 2M | - | - | Closed | 444c/s | 628ms | Aug. 2025 | 468 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok Code Fast 1 | x | - | - | 70.8% | - | $0.20 | $1.50 | 256k | - | - | Closed | 234c/s | 4.4s | Aug. 2025 | 437 | 32.6 | - | 19.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.3 Chat | ✓ | - | - | - | - | $1.75 | $14.00 | 128k | Aug. 2025 | - | Closed | 267c/s | 1.5s | Mar. 2026 | 404 | - | - | - | - | - | - | - | - | - | - | 27.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 32B | x | - | 72.9% | - | - | $0.10 | $0.30 | 128k | - | 32.8 | Open | 249c/s | 1.1s | Apr. 2025 | 300 | 18.9 | 22.5 | 13.1 | - | - | - | 19.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Xiaomi | 🇨🇳 | MiMo-V2-Pro | x | - | - | 78.0% | - | $1.00 | $3.00 | 1M | - | 1000 | Closed | - | - | Mar. 2026 | 215 | 51.8 | - | 35.4 | 33.1 | 28.3 | - | 26.3 | - | 26.9 | 27.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.1 Codex Mini | ✓ | - | 42.1% | - | - | - | - | - | - | - | Closed | - | - | Nov. 2025 | 210 | 0.8 | 1.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Xiaomi | 🇨🇳 | MiMo-V2-Omni | ✓ | - | - | 74.8% | - | $0.40 | $2.00 | 262k | - | - | Closed | - | - | Mar. 2026 | 191 | 45.6 | - | 28.6 | - | - | - | - | - | 24.4 | 24.5 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3-Coder 480B A35B Instruct | x | - | - | 69.6% | - | - | - | - | - | 480 | Open | - | - | Jan. 2025 | 46 | 27.4 | - | 16.8 | - | 21.6 | - | 14.5 | - | 22.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 77.5% | - | - | - | - | - | |
| Mistral | 🇫🇷 | Codestral-22B | x | - | - | - | - | - | - | - | - | 22.2 | Open | - | - | May 2024 | - | -1.3 | - | 1.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Cohere | 🇨🇦 | Command R+ | x | - | - | - | - | $0.25 | $1.00 | 128k | - | 104 | Open | - | - | Aug. 2024 | - | -3.2 | -3.1 | - | - | - | - | - | - | 1.2 | 1.2 | 0.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-R1 | x | - | - | - | - | $0.55 | $2.19 | 131.1k | - | 671 | Open | - | - | Jan. 2025 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek-V2.5 | x | - | - | 16.8% | - | $0.14 | $0.28 | 8.2k | - | 236 | Open | - | - | May 2024 | - | 8.2 | 15.9 | 10.7 | - | 24.5 | - | - | - | 7.7 | 7.8 | 7.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek VL2 | ✓ | - | - | - | - | - | - | - | - | 27 | Open | - | - | Dec. 2024 | - | -2.2 | 9.9 | - | - | - | 5.7 | - | - | - | - | -1.1 | - | - | 51.1% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek VL2 Small | ✓ | - | - | - | - | - | - | - | - | 16 | Open | - | - | Dec. 2024 | - | -5.0 | 7.2 | - | - | - | 3.9 | - | - | - | - | -3.5 | - | - | 48.0% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| DeepSeek | 🇨🇳 | DeepSeek VL2 Tiny | ✓ | - | - | - | - | - | - | - | - | 3 | Open | - | - | Dec. 2024 | - | -13.3 | 0.8 | - | - | - | -2.2 | - | - | - | - | -9.5 | - | - | 40.7% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Devstral Medium | x | - | - | 61.6% | - | $0.40 | $2.00 | 128k | - | - | Closed | - | - | Jul. 2025 | - | 24.1 | - | 11.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Devstral Small 1.1 | x | - | - | 53.6% | - | $0.10 | $0.30 | 128k | - | 24 | Open | - | - | Jul. 2025 | - | 18.0 | - | 5.5 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 2 27B | x | - | - | - | - | - | - | - | - | 27.2 | Open | - | - | Jun. 2024 | - | -4.7 | -2.2 | -14.6 | 23.8 | - | - | - | - | -0.3 | 1.4 | -0.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 2 9B | x | - | - | - | - | - | - | - | - | 9.2 | Open | - | - | Jun. 2024 | - | -8.7 | -5.4 | -18.8 | 13.5 | - | - | - | - | -2.7 | -1.1 | -3.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3n E2B | ✓ | - | - | - | - | - | - | - | Jun. 2024 | 8 | Closed | - | - | Jun. 2025 | - | -14.7 | -8.6 | - | -3.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | Gemma 3n E4B | ✓ | - | - | - | - | - | - | - | Jun. 2024 | 8 | Closed | - | - | Jun. 2025 | - | -9.1 | -3.2 | - | 8.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| ZAI | 🇨🇳 | GLM-5V-Turbo | ✓ | - | - | - | - | - | - | - | - | - | Closed | - | - | Apr. 2026 | - | 12.7 | - | - | - | - | 27.8 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 62.3% | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5.5 Pro NEW | ✓ | - | - | - | 57.2% | $30.00 | $180.00 | 1M | - | - | Closed | - | - | Apr. 2026 | - | 62.0 | 51.6 | - | 43.2 | - | 42.4 | - | - | 10.6 | - | - | - | - | - | 90.1% | - | - | - | - | - | - | - | - | - | 39.6% | - | - | - | - | |
| OpenAI | 🇺🇸 | GPT-5 Codex | x | - | - | 74.5% | - | - | - | - | Sep. 2024 | - | Closed | - | - | Sep. 2025 | - | 40.8 | - | 27.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| IBM | 🇺🇸 | Granite 3.3 8B Base | ✓ | - | - | - | - | - | - | - | Apr. 2024 | 8.2 | Open | - | - | Apr. 2025 | - | -2.9 | -1.6 | 18.2 | - | - | - | - | - | -9.8 | -8.6 | -10.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| IBM | 🇺🇸 | IBM Granite 4.0 Tiny Preview | x | - | - | - | - | - | - | - | - | 7 | Open | - | - | May 2025 | - | -9.4 | -8.4 | 1.3 | - | - | - | - | - | -5.9 | -5.8 | -6.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-1.5V | ✓ | - | - | - | - | - | - | - | - | - | Closed | - | - | Apr. 2024 | - | 0.4 | -1.6 | - | - | - | 1.6 | - | - | - | - | 1.1 | - | - | 53.6% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-2 Image 1212 | x | - | - | - | - | - | - | - | - | - | Closed | - | - | Dec. 2024 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-4.1 | ✓ | - | - | - | - | $3.00 | $15.00 | 256k | - | - | Closed | - | - | Nov. 2025 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| xAI | 🇺🇸 | Grok-4.1 Thinking | ✓ | - | - | - | - | $3.00 | $15.00 | 256k | - | - | Closed | - | - | Nov. 2025 | - | - | - | - | - | - | -1.2 | - | - | - | - | -2.0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| MoonshotAI | 🇨🇳 | Kimi-k1.5 | ✓ | - | - | - | - | - | - | - | - | - | Closed | - | - | Jan. 2025 | - | 19.8 | 24.0 | - | - | - | 18.6 | - | - | 25.2 | 25.3 | 24.1 | - | - | 70.0% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Nvidia | 🇺🇸 | Llama 3.1 Nemotron 70B Instruct | x | - | - | - | - | - | - | - | Dec. 2023 | 70 | Open | - | - | Oct. 2024 | - | -1.8 | 8.3 | - | - | -5.2 | - | - | - | 6.9 | 6.9 | 6.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Google | 🇺🇸 | MedGemma 4B IT | ✓ | - | - | - | - | - | - | - | - | 4.3 | Open | - | - | May 2025 | - | -10.3 | - | - | - | - | -7.3 | - | - | - | - | -8.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenBMB | 🇨🇳 | MiniCPM-SALA | x | - | 78.3% | - | - | - | - | - | - | 9.5 | Open | - | - | Feb. 2026 | - | 16.5 | 17.6 | 33.1 | - | - | - | - | - | 6.6 | 6.7 | 6.4 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 3 (14B Base 2512) | ✓ | - | - | - | - | - | - | - | - | 14 | Open | - | - | Dec. 2025 | - | 2.3 | 10.3 | - | - | - | - | - | - | 5.4 | 8.1 | 5.1 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 3 (14B Instruct 2512) | ✓ | - | - | - | - | - | - | - | - | 14 | Open | - | - | Dec. 2025 | - | 13.6 | 29.7 | - | - | 12.5 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 3 (3B Base 2512) | ✓ | - | - | - | - | - | - | - | - | 3 | Open | - | - | Dec. 2025 | - | -9.8 | -1.4 | - | - | - | - | - | - | -3.3 | -3.0 | -3.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 3 (3B Instruct 2512) | ✓ | - | - | - | - | - | - | - | - | 3 | Open | - | - | Dec. 2025 | - | 1.8 | 19.3 | - | - | 3.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 3 (8B Base 2512) | ✓ | - | - | - | - | - | - | - | - | 8 | Open | - | - | Dec. 2025 | - | -3.6 | 4.5 | - | - | - | - | - | - | 1.2 | 4.6 | 0.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 3 (8B Instruct 2512) | ✓ | - | - | - | - | - | - | - | - | 8 | Open | - | - | Dec. 2025 | - | 8.7 | 24.8 | - | - | 8.2 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Ministral 8B Instruct | x | - | - | - | - | $0.10 | $0.10 | 128k | - | 8.0 | Open | - | - | Oct. 2024 | - | -7.4 | -5.0 | -23.4 | - | 10.2 | - | - | - | -10.0 | -10.8 | -10.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Large 2 | x | - | - | - | - | $2.00 | $6.00 | 128k | - | 123 | Open | - | - | Jul. 2024 | - | 8.5 | 14.1 | 21.1 | - | 19.5 | - | - | - | 15.1 | 15.2 | 14.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Large 3 | ✓ | - | - | - | - | $2.00 | $5.00 | 128k | - | 675 | Open | - | - | Sep. 2025 | - | 7.1 | 16.6 | - | - | 20.5 | - | - | - | - | - | - | - | 74.2% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral NeMo Instruct | x | - | - | - | - | $0.15 | $0.15 | 128k | - | 12 | Open | - | - | Jul. 2024 | - | -8.3 | -8.0 | - | 18.3 | - | - | - | - | -10.3 | -10.4 | -10.7 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Mistral Small | x | - | - | - | - | $0.20 | $0.60 | 32.8k | - | 22 | Open | - | - | Sep. 2024 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| OpenAI | 🇺🇸 | o3-pro | ✓ | - | - | - | - | $20.00 | $80.00 | 200k | May 2024 | - | Closed | - | - | Jun. 2025 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Microsoft | 🇺🇸 | Phi-3.5-vision-instruct | ✓ | - | - | - | - | - | - | - | - | 4.2 | Open | - | - | Aug. 2024 | - | -6.4 | -8.2 | - | - | - | -1.9 | - | - | - | - | -6.8 | - | - | 43.0% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Microsoft | 🇺🇸 | Phi-4-multimodal-instruct | ✓ | - | - | - | - | $0.05 | $0.10 | 128k | Jun. 2024 | 5.6 | Open | - | - | Feb. 2025 | - | -1.5 | 9.1 | - | - | - | 5.3 | - | - | - | - | 3.9 | - | - | 55.1% | - | - | 38.5% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Pixtral-12B | ✓ | - | - | - | - | $0.15 | $0.15 | 128k | - | 12.4 | Open | - | - | Sep. 2024 | - | -5.9 | 1.0 | -6.7 | - | 5.8 | 3.4 | - | - | -6.0 | -6.0 | -2.1 | - | - | 52.5% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Mistral | 🇫🇷 | Pixtral Large | ✓ | - | - | - | - | $2.00 | $6.00 | 128k | - | 124 | Open | - | - | Nov. 2024 | - | 16.2 | 17.2 | - | - | 7.0 | 16.2 | - | - | - | - | 14.9 | - | - | 64.0% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | QvQ-72B-Preview | ✓ | - | - | - | - | - | - | - | - | 73.4 | Open | - | - | Dec. 2024 | - | 12.1 | 12.6 | - | - | - | 12.6 | - | - | - | - | 20.1 | - | - | 70.3% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5-Coder 32B Instruct | x | - | - | - | - | $0.09 | $0.09 | 128k | - | 32 | Open | - | - | Sep. 2024 | - | 0.3 | 4.8 | 13.0 | - | - | - | - | - | 0.4 | -1.3 | -1.6 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5-Coder 7B Instruct | x | - | - | - | - | - | - | - | - | 7 | Open | - | - | Sep. 2024 | - | -6.2 | -4.4 | 6.8 | - | - | - | - | - | -8.9 | -10.6 | -10.9 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5 VL 72B Instruct | ✓ | - | - | - | - | - | - | - | - | 72 | Open | - | - | Jan. 2025 | - | 13.6 | 10.3 | - | - | - | 13.6 | - | 14.9 | - | - | 20.0 | - | - | 70.2% | - | - | 51.1% | 43.6% | - | - | 8.8% | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2.5 VL 7B Instruct | ✓ | - | - | - | - | - | - | - | - | 8.3 | Open | - | - | Jan. 2025 | - | 2.3 | 5.8 | - | - | - | 8.4 | - | 6.4 | - | - | 6.2 | - | - | 58.6% | - | - | 38.3% | 29.0% | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen2-VL-72B-Instruct | ✓ | - | - | - | - | - | - | - | Jun. 2023 | 73.4 | Open | - | - | Aug. 2024 | - | 15.4 | 8.2 | - | - | - | 16.5 | - | 21.1 | - | - | 4.1 | - | - | - | - | - | 46.2% | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3-Next-80B-A3B-Base | x | - | - | - | - | - | - | - | - | 80 | Open | - | - | Sep. 2025 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 235B A22B Instruct | ✓ | - | 74.7% | - | - | $0.30 | $1.50 | 262.1k | - | 236 | Open | - | - | Sep. 2025 | - | 26.6 | 28.8 | - | - | 19.4 | 23.9 | 4.7 | 23.8 | 32.5 | 34.1 | 32.7 | - | - | - | - | 62.1% | 68.1% | 62.0% | - | 51.9% | 66.7% | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 235B A22B Thinking | ✓ | - | 89.7% | - | 13.6% | $0.45 | $3.49 | 262.1k | - | 236 | Open | - | - | Sep. 2025 | - | 32.2 | 35.7 | - | - | 20.9 | 23.5 | 13.2 | 19.7 | 39.6 | 40.3 | 39.5 | - | - | - | - | 66.1% | 69.3% | 61.8% | - | 44.4% | 38.1% | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 4B Instruct | ✓ | - | 46.6% | - | - | $0.10 | $0.60 | 262.1k | - | 4 | Open | - | - | Sep. 2025 | - | 8.6 | 9.0 | - | - | 6.0 | 11.3 | -4.8 | 14.6 | 10.8 | 9.9 | 8.0 | - | - | - | - | 39.7% | 53.2% | 59.5% | - | 48.0% | 26.2% | - | - | - | - | - | - | - | - | |
| Qwen | 🇨🇳 | Qwen3 VL 8B Instruct | ✓ | - | 45.9% | - | - | $0.08 | $0.50 | 262.1k | - | 9 | Open | - | - | Sep. 2025 | - | 12.3 | 13.9 | - | - | 11.9 | 14.4 | -1.3 | 16.1 | 17.2 | 17.1 | 15.6 | - | - | - | - | 46.4% | 55.9% | 54.6% | - | - | 33.9% | - | - | - | - | - | - | - | - | |
| StepFun | 🇨🇳 | Step3-VL-10B | ✓ | - | 87.7% | - | - | - | - | - | - | 10 | Open | - | - | Jan. 2026 | - | 35.3 | 30.7 | - | - | 26.6 | 23.9 | - | - | - | - | 32.2 | - | - | 78.1% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| ZAI | 🇨🇳 | GLM-4.5V | ✓ | - | - | - | - | - | - | - | - | 108 | Open | - | - | Aug. 2025 | -45 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| LG AI Research | 🇰🇷 | K-EXAONE-236B-A23B | x | - | 92.8% | - | - | $0.60 | $1.00 | 32.8k | Oct. 2025 | 236 | Closed | - | - | Dec. 2025 | -65 | 35.1 | 34.6 | - | - | - | - | 4.3 | - | 36.5 | 36.5 | 36.3 | - | 85.7% | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
| IBM | 🇺🇸 | Granite 3.3 8B Instruct | ✓ | - | - | - | - | $0.50 | $0.50 | 128k | Apr. 2024 | 8 | Open | - | - | Apr. 2025 | -183 | 0.4 | 1.6 | 17.7 | - | - | - | - | - | 0.6 | 0.7 | 0.3 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |