Gemini 3.1 Pro Preview and the Claude Opus 4-6 variants hold nearly identical implied probabilities of about 47% to be the top AI model on April 17 under lmarena.ai's Style Control Off leaderboard, reflecting trader consensus on a razor-thin Elo battle. Recent updates put Gemini at around 1494 against Claude's 1499-1504, with fresh blind user votes favoring Google's multi-task reasoning while Anthropic excels in adaptive thinking depth. Kimi k2.5-thinking trails at 35.5% amid strong competition from Chinese models, but the frontrunners dominate the Pareto frontiers for efficiency and performance. Key swing factors before resolution include incoming votes, potential fine-tunes, and previews such as Grok 4.20 beta1, underscoring how quickly the frontier of large language model capabilities shifts.
Experimental AI-generated summary using Polymarket data. This is not trading advice and plays no role in the resolution of this market. · Updated
claude-opus-4-6 45%
gemini-3.1-pro-preview 45%
qwen3.5-max-preview 9%
gemini-3-pro 6%
gpt-5.4-high 5%
gemini-2.5-pro 3%
grok-4.20-beta1 2%
kimi-k2.5-thinking 1%
dola-seed-2.0-preview 1%
gemini-3-flash 1%
claude-opus-4-6-thinking 49%
Results from the "Score" column under the "Text Arena | Overall" Leaderboard tab at https://lmarena.ai/leaderboard/text with style control off will be used to resolve this market.
Models will be ranked primarily by their arena score at this market’s check time, with alphabetical order of model names as listed in this market group (full string, including suffixes such as “-thinking”) used as a tiebreaker (e.g., if the two models are tied by arena score, “claude-opus-4-6” would be ranked ahead of “claude-opus-4-6-thinking”). This market will resolve based on the model that occupies first place under this ranking.
The resolution source for this market is the Chatbot Arena LLM Leaderboard found at https://lmarena.ai/. If this resolution source is unavailable at check time, this market will remain open until the leaderboard comes back online and will resolve based on the first check after it becomes available. If it becomes permanently unavailable, this market will resolve based on another resolution source.
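The ranking rule above (highest arena score first, ties broken alphabetically by full model name) can be sketched as follows. This is only an illustration: the function name `top_model` and the scores in `example` are hypothetical and are not taken from the leaderboard. It also assumes that "alphabetical order" means plain lexicographic comparison of the names as written, which the rules do not spell out.

```python
def top_model(scores: dict[str, float]) -> str:
    """Return the first-place model under the market's ranking rule.

    Highest score wins; ties are broken by alphabetical order of the
    full model name (assumed here to be plain string comparison).
    """
    if not scores:
        raise ValueError("scores must not be empty")
    # Negating the score makes the highest score sort first; the name is
    # the secondary key, so the alphabetically earlier name wins a tie.
    return min(scores, key=lambda name: (-scores[name], name))


# Hypothetical scores, chosen only to demonstrate the tiebreaker.
example = {
    "claude-opus-4-6-thinking": 1500,
    "claude-opus-4-6": 1500,
    "gemini-3.1-pro-preview": 1494,
}

print(top_model(example))
```

With the two Claude variants tied in this made-up data, the shorter name sorts first because a string that is a prefix of another compares as smaller, which matches the example given in the rules.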
Market opened: Apr 9, 2026, 5:18 PM ET
Resolver
0x69c47De9D...
Frequently asked questions