Trader consensus on Polymarket reflects a nail-biting LMSYS Chatbot Arena leaderboard race under Style Control Off, with Anthropic's claude-opus-4-6-thinking at 49% implied probability edging claude-opus-4-6 (45.5%) and Google's gemini-3.1-pro-preview (44.5%), driven by Elo scores clustered around 1494–1504 from ongoing blind human votes. Claude Opus 4.6 variants surged to the top two days ago via superior adaptive reasoning and non-thinking efficiency in complex tasks, outpacing Gemini's strong multimodal and instruction-following benchmarks from its February release. Differentiators include Claude's edge in software engineering evals versus Gemini's legal/science prowess, but daily vote volatility—fueled by user battles—keeps odds fluid ahead of the April 17 snapshot.
Experimental AI-generated summary based on Polymarket data. This is not trading advice and does not affect how this market resolves.
Best AI model on April 17? (Style Control Off)
gemini-3.1-pro-preview 15%
gpt-5.4-high 5%
grok-4.20-beta1 1.6%
qwen3.5-max-preview 1.1%
gemini-3-pro 1%
dola-seed-2.0-preview 1%
gemini-2.5-pro 1%
gemini-3-flash 1%
kimi-k2.5-thinking 1%
claude-opus-4-6 45%
claude-opus-4-6-thinking 49%
Results from the "Score" column under the "Text Arena | Overall" Leaderboard tab at https://lmarena.ai/leaderboard/text with style control off will be used to resolve this market.
Models will be ranked primarily by their arena score at this market’s check time, with alphabetical order of model names as listed in this market group (full string, including suffixes such as “-thinking”) used as a tiebreaker (e.g., if the two models are tied by arena score, “claude-opus-4-6” would be ranked ahead of “claude-opus-4-6-thinking”). This market will resolve based on the model that occupies first place under this ranking.
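The ranking rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Entry` type, the `rank` and `winner` helpers, and the sample scores are hypothetical, and "alphabetical order" is taken to mean plain string comparison of the full model name, which the market text does not spell out.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    model: str    # full model string as listed in the market group
    score: float  # arena score read at the market's check time


def rank(entries: list[Entry]) -> list[Entry]:
    """Order by score (highest first), then by full model name (ascending)."""
    return sorted(entries, key=lambda e: (-e.score, e.model))


def winner(entries: list[Entry]) -> str:
    """Return the model in first place under the ranking above."""
    return rank(entries)[0].model


# Hypothetical scores, chosen only to exercise the tiebreaker.
sample = [
    Entry("gemini-3.1-pro-preview", 1500.0),
    Entry("claude-opus-4-6-thinking", 1504.0),
    Entry("claude-opus-4-6", 1504.0),
]

print(winner(sample))  # prints "claude-opus-4-6"
```

With the two Claude entries tied on score, the shorter string sorts first, which matches the example given in the rule: "claude-opus-4-6" is ranked ahead of "claude-opus-4-6-thinking".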
The resolution source for this market is the Chatbot Arena LLM Leaderboard found at https://lmarena.ai/. If this resolution source is unavailable at check time, this market will remain open until the leaderboard comes back online and will resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
Market opened: Apr 9, 2026, 5:18 PM ET
Resolver
0x69c47De9D...
Frequently asked questions