Trader consensus on Polymarket reflects a razor-thin race for the top spot on the LMSYS Chatbot Arena leaderboard (Style Control On) by April 17, with Anthropic's Claude Opus 4.6 at 50% implied probability edging out OpenAI's GPT-5.4-high and others at 48.5%. Recent LMSYS updates and third-party benchmarks like GPQA-D and PinchBench show Claude leading in agentic reasoning and coding (Elo ~1504), but Google's Gemini 3.1 Pro Preview and xAI's Grok-4.20-beta1 have closed the gap via efficiency gains and multimodal prowess. Key differentiators include long-context handling, tool-use reliability, and cost-performance ratios. With resolution imminent, fresh evaluations or surprise releases could swing the leaderboard.
Experimental AI-generated summary based on Polymarket data. This is not trading advice and does not affect how this market resolves. · Updated
claude-opus-4-6 49%
claude-opus-4-6-thinking 48%
gemini-3.1-pro-preview 48%
gemini-2.5-pro 47.5%
Model                                   Implied probability
claude-opus-4-6                         49%
claude-opus-4-6-thinking                48%
gemini-3.1-pro-preview                  48%
gemini-2.5-pro                          48%
gpt-5.4-high                            48%
kimi-k2.5-thinking                      46%
grok-4.20-beta-0309-reasoning           34%
grok-4.20-beta1                         22%
gemini-3-pro                            19%
gemini-3-flash                          4%
gpt-5.2-chat-latest-20260210            4%
qwen3.5-max-preview                     3%
dola-seed-2.0-preview                   3%
claude-opus-4-5-20251101-thinking-32k   -
Results from the "Score" column under the "Text Arena | Overall" Leaderboard tab at https://lmarena.ai/leaderboard/text with style control on will be used to resolve this market.
Models will be ranked primarily by their arena score at this market’s check time. Alphabetical order of model names as listed in this market group (full string, including suffixes such as “-thinking”) is the tiebreaker: if two models are tied on arena score, “claude-opus-4-6” would rank ahead of “claude-opus-4-6-thinking”. This market will resolve to the model that occupies first place under this ranking.
The resolution source for this market is the Chatbot Arena LLM Leaderboard found at https://lmarena.ai/. If this resolution source is unavailable at check time, this market will remain open until the leaderboard comes back online and will resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
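The ranking rule described above (arena score first, then alphabetical order of the full model name) can be sketched in a few lines of Python. This is only an illustration, not Polymarket's or LMArena's actual resolution code: the function names `rank_models` and `first_place` are hypothetical, the scores in the usage example are made up, and it assumes that plain string comparison matches the market's notion of "alphabetical order".

```python
from typing import Dict, List, Tuple


def rank_models(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Order models by arena score (highest first).

    Ties on score are broken by the full model name in ascending
    string order, so a name that is a prefix of another sorts first.
    Assumption: Python's default string comparison is an acceptable
    stand-in for the market's "alphabetical order".
    """
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def first_place(scores: Dict[str, float]) -> str:
    """Return the name of the model ranked first under the rule above."""
    if not scores:
        raise ValueError("no scores supplied")
    return rank_models(scores)[0][0]


# Hypothetical scores, chosen only to exercise the tiebreaker.
example_scores = {
    "claude-opus-4-6-thinking": 1500.0,
    "claude-opus-4-6": 1500.0,
    "gpt-5.4-high": 1498.0,
}
winner = first_place(example_scores)
```

With these made-up scores the two Claude entries tie, and `winner` is `"claude-opus-4-6"` because the shorter name sorts ahead of `"claude-opus-4-6-thinking"`, matching the example in the rules.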
Market opened: Apr 9, 2026, 5:20 PM ET
Resolver
0x69c47De9D...
Frequently asked questions