xAI's Grok models currently sit at 12-14% accuracy on Epoch AI's FrontierMath Tiers 1-3, a set of 300 unpublished, research-level math problems designed to resist data contamination and require hours or days of expert effort per question. This places them well behind leaders like OpenAI's o-series variants and GPT-5 iterations, which have posted scores in the mid-20s to low-50s in recent independent evaluations. With only days remaining until the June 30, 2026 resolution deadline and no confirmed Grok updates or capability jumps announced in the past month, trader sentiment reflects the narrow window for any rapid improvement. Competitive dynamics in advanced reasoning benchmarks continue to favor labs with stronger demonstrated tool use and scaling on math-specific tasks, though xAI's focus on unique problem-solving strengths has occasionally yielded novel solves on FrontierMath.
Experimental AI-generated summary referencing Polymarket data. This is not trading advice and plays no role in how this market resolves.
$24,234 Vol.
25%+: Yes
30%+: Yes
40%+: Yes
50%+: No
This market will resolve according to Epoch AI's FrontierMath benchmarking leaderboard (https://epoch.ai/frontiermath) for Tiers 1-3. Studies that are not included in the leaderboard (e.g. https://x.com/EpochAIResearch/status/1945905796904005720) will not be considered.
The primary resolution source will be information from Epoch AI; however, a consensus of credible reporting may also be used.
Market Opened: Jan 30, 2026, 12:01 AM ET
Resolver
0x65070BE91...
Outcome proposed: Yes
No dispute
Final outcome: Yes
Frequently Asked Questions