xAI's Grok models currently trail OpenAI's GPT-5 series on the FrontierMath benchmark. GPT-5.4 leads at 47.6% accuracy on expert-level math problems, including unsolved research challenges, while Grok-4 scored around 14-20% on Tiers 1-3 per Epoch AI evaluations. The Grok 4.20 release in early 2026 topped instruction-following (IFBench 82%) and Arena leaderboards with multi-agent reasoning and a 2M-token context window, fueling optimism for math gains amid xAI's rapid iteration and its 1GW Colossus training cluster. Competitive pressure is intensifying as OpenAI pushes records such as 31% on Tier 4. Traders see a potential Grok 5 rollout, rumored at 7 trillion parameters, before the June 30 deadline as the key catalyst for closing the gap.
Experimental AI-generated summary referencing Polymarket data. This is not trading advice and plays no role in the resolution of this market.
xAI Grok score on FrontierMath Benchmark by June 30?
$19,331 Vol.
25%+: 52%
30%+: 53%
40%+: 62%
50%+: 23%
This market will resolve according to Epoch AI's FrontierMath benchmark leaderboard (https://epoch.ai/frontiermath) for Tier 1-3. Studies that are not included in the leaderboard (e.g. https://x.com/EpochAIResearch/status/1945905796904005720) will not be considered.
The primary resolution source will be information from Epoch AI; however, a consensus of credible reporting may also be used.
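As a rough illustration of how the threshold outcomes relate to a single leaderboard score, here is a minimal Python sketch. It assumes that an outcome such as "25%+" resolves Yes when the best Grok score on Tier 1-3 is at or above that percentage; the rules above do not spell out the comparison, so treat this as an assumption. The function name `resolve_outcomes` and the example score are hypothetical.

```python
# Thresholds taken from the four outcomes listed for this market.
THRESHOLDS = [25.0, 30.0, 40.0, 50.0]


def resolve_outcomes(best_grok_score_pct: float) -> dict[str, bool]:
    """Map a Tier 1-3 leaderboard score (in percent) to Yes/No per outcome.

    Assumption: "N%+" means the score is greater than or equal to N.
    """
    return {f"{int(t)}%+": best_grok_score_pct >= t for t in THRESHOLDS}


# Hypothetical score, for illustration only (not a real leaderboard value).
print(resolve_outcomes(32.5))
```

Under this assumption, a higher threshold can never resolve Yes while a lower one resolves No.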
Market opened: Jan 30, 2026, 12:01 AM ET
Resolver
0x65070BE91...
Frequently asked questions