AI Model Rankings
Track the market performance of mainstream AI models based on model popularity, real-world usage trends, and capability metrics.
Top Models
Weekly usage trends for mainstream models
LLM Leaderboard
Query call volume for different models by time range
1. Hy3 preview: 2.85T tokens (↑41%)
2. DeepSeek V4 Flash: 2.76T tokens (↑102%)
3. Claude Sonnet 4.6: 1.56T tokens (↑3%)
4. Claude Opus 4.7: 1.41T tokens (↓8%)
5. Owl Alpha: 1.21T tokens (↑115%)
6. Gemini 3 Flash Preview: 1.12T tokens (↓1%)
7. DeepSeek V3.2: 1.12T tokens (↑20%)
8. DeepSeek V4 Pro: 950B tokens (↑8%)
9. Step 3.5 Flash: 790B tokens (↑43%)
10. Kimi K2.6: 782B tokens (↓4%)
11. MiniMax M2.7: 637B tokens (↓17%)
12. Nemotron 3 Super (free): 589B tokens (↓2%)
13. Gemini 2.5 Flash: 566B tokens (↓1%)
14. Gemini 2.5 Flash Lite: 550B tokens (↓8%)
15. Claude Opus 4.6: 542B tokens (↓10%)
16. Gemini 3.1 Pro Preview: 502B tokens (↑73%)
17. GPT-5.5: 478B tokens (↑5%)
18. gpt-oss-120b: 440B tokens (↑14%)
19. Gemini 3.1 Flash Lite Preview: 385B tokens (↑17%)
20. GLM 5.1: 381B tokens (↑6%)
Tool Calls
Compare tool-call usage across models
1. Claude Sonnet 4.6: 94.1B calls (↑18%)
2. GPT-5.5: 88.5B calls (↑12%)
3. Gemini 3 Flash Preview: 76.4B calls (↑9%)
4. DeepSeek V4 Flash: 70.2B calls (↑6%)
5. Qwen3.6 35B A3B: 62.9B calls (↓5%)
6. Claude Opus 4.7: 58.6B calls (↓7%)
7. MiniMax M2.7: 50.2B calls (↑14%)
8. Kimi K2.6: 46.8B calls (4%)
9. gpt-oss-120b: 39.7B calls (↑11%)
10. GLM 5.1: 34.9B calls (↑3%)
Benchmarks
Compare model performance by composite capability score
1. GPT-5.5 (xhigh): 66.8
2. Claude Opus 4.7 (Adaptive): 64.9
3. MiMo-V2.5-Pro: 63.7
4. Grok 4.3: 61.4
5. Claude Sonnet 4.6: 55.1
6. Qwen3.6 35B A3B (Reasoning): 51.7
7. MiniMax-M2.1: 48.2
8. Mistral Medium 3.5: 45.6
9. Grok 4.1 Fast (Reasoning): 43.1
10. Gemini 3 Flash Preview: 40.5
Fastest models
Compare model throughput across providers
Highest throughput
1. gpt-oss-safeguard-20b: 645 tok/s, $0.07/M
2. gpt-oss-20b: 634 tok/s, $0.07/M
3. gpt-oss-120b: 626 tok/s, $0.35/M
4. Mercury 2: 355 tok/s, $0.25/M
5. Qwen3 32B: 351 tok/s, $0.29/M
6. GLM 4.7: 302 tok/s, $2.25/M
7. MiniMax M2.5: 261 tok/s, $0.30/M
8. Llama 3.1 8B Instruct: 230 tok/s, $0.10/M
9. Qwen3.6 35B A3B: 165 tok/s, $0.25/M
10. Nano Banana (Gemini 2.5): 162 tok/s, $0.35/M
Context Length
Compare model usage by context window
10K
1. Gemini 3 Flash Preview: 1M context (↑17%)
2. Gemini 2.5 Flash: 1M context (↓1%)
3. Claude Sonnet 4.6: 200K context (↑3%)
4. Claude Opus 4.7: 200K context (↓8%)
5. GPT-5.5: 400K context (↑5%)
6. Qwen3.6 35B A3B: 256K context (↑9%)
7. DeepSeek V4 Pro: 128K context (↑8%)
8. MiniMax M2.7: 1M context (↓17%)
9. Kimi K2.6: 256K context (↓4%)
10. Nemotron 3 Super: 128K context (↓2%)
Categories
Compare model performance by use case
Programming
1. DeepSeek V4 Flash: 2.1T tokens (↑102%)
2. Claude Sonnet 4.6: 1.4T tokens (↑3%)
3. GPT-5.5: 920B tokens (↑5%)
4. Gemini 3 Flash Preview: 870B tokens (↓1%)
5. Qwen3.6 35B A3B: 760B tokens (↑19%)
6. Kimi K2.6: 610B tokens (↓4%)
7. gpt-oss-120b: 440B tokens (↑14%)
8. GLM 5.1: 381B tokens (↑6%)
9. Mistral Medium 3.5: 300B tokens (2%)
10. Llama 3.1 8B Instruct: 230B tokens (↓8%)
Languages
Compare model performance by natural-language usage
1. DeepSeek V4 Flash: 2.38T tokens (↑66%)
2. Hy3 preview: 2.12T tokens (↑38%)
3. Qwen3.6 35B A3B: 1.76T tokens (↑21%)
4. Kimi K2.6: 1.42T tokens (↓9%)
5. GLM 5.1: 1.08T tokens (↑18%)
6. MiniMax M2.7: 820B tokens (↓7%)
7. DeepSeek V3.2: 790B tokens (↑12%)
8. Claude Sonnet 4.6: 520B tokens (↑4%)
9. Gemini 3 Flash Preview: 470B tokens (↓3%)
10. GPT-5.5: 390B tokens (5%)
Programming
Compare model performance by programming-language usage
Python
1. Claude Sonnet 4.6: 1.18T tokens (↑3%)
2. DeepSeek V4 Flash: 1.06T tokens (↑102%)
3. GPT-5.5: 620B tokens (↑5%)
4. Qwen3.6 35B A3B: 540B tokens (↑19%)
5. Gemini 3 Flash Preview: 502B tokens (↓1%)
6. Kimi K2.6: 440B tokens (↓4%)
7. DeepSeek V4 Pro: 390B tokens (↑8%)
8. gpt-oss-120b: 360B tokens (↑14%)
9. GLM 5.1: 306B tokens (↑6%)
10. Mistral Medium 3.5: 240B tokens (2%)
Images
Cumulative trend of image tasks processed by models
1. Nano Banana (Gemini 2.5): 48.2M images (↑22%)
2. Gemini 2.5 Flash Image: 41.8M images (↑13%)
3. GPT Image 1: 30.6M images (↑6%)
4. Claude Sonnet 4.6: 24.7M images (↑3%)
5. Qwen Image: 21.9M images (↑11%)
6. MiniMax M2.7: 17.1M images (↓4%)
7. Gemini 3 Flash Preview: 15.8M images (↓1%)
8. Mistral Medium 3.5: 13.2M images (2%)
9. GLM 5.1: 10.9M images (↑6%)
10. DeepSeek V4 Pro: 8.6M images (↑8%)
Audio Input
Cumulative trend of audio inputs processed by models
1. GPT-4o Transcribe: 26.4M prompts (↑16%)
2. Gemini 2.5 Flash: 21.9M prompts (↓1%)
3. Whisper Large V3: 18.2M prompts (↓4%)
4. MiniMax Speech 2.5: 14.7M prompts (↑12%)
5. Gemini 3 Flash Preview: 12.6M prompts (↑7%)
6. Claude Sonnet 4.6: 10.1M prompts (↑3%)
7. Qwen Audio: 8.8M prompts (↑9%)
8. Mistral Voxtral: 7.4M prompts (2%)
9. GLM Audio: 5.9M prompts (↑5%)
10. Llama 3.1 Audio: 4.8M prompts (↓6%)
Top Apps
Observe model adoption across apps and agent scenarios
1. Hermes Agent: 353B tokens
2. OpenClaw: 195B tokens
3. Kilo Code: 166B tokens
4. Claude Code: 70.5B tokens
5. CSS AI Pro: 66.7B tokens
6. Descript: 62.7B tokens
7. pi: 39.9B tokens
8. Janitor AI: 27.7B tokens
9. ISEKAI ZERO: 25B tokens
10. Roo Code: 22.8B tokens