Anthropic's Claude Mythos Preview, detailed in its April 7, 2026 system card, has moved to the top of the Humanity's Last Exam leaderboard with 56.8% accuracy without tools and 64.7% with tools such as web search and code execution. That puts it ahead of rivals such as OpenAI's GPT-5 variants and Google's Gemini models on this frontier benchmark of 2,500 expert-level questions spanning math, the sciences, and the humanities. The result marks a leap from Claude Opus 4.6's mid-50s scores earlier in the year, fueled by Anthropic's accelerated 2026 release cadence (including Opus and Sonnet 4.6 in February), which emphasizes agentic capabilities and reasoning. Traders are watching for a possible full Mythos rollout or Claude 4.7 by June 30 amid competitive pressure, though differences in evaluation across leaderboards and in tool-use definitions add uncertainty about resolution thresholds.
Experimental AI-generated summary referencing Polymarket data. This is not trading advice and plays no role in how this market resolves.
$208,689 volume
35%+: 98%
45%+: 73%
The resolution source will be the official Humanity’s Last Exam leaderboard https://scale.com/leaderboard/humanitys_last_exam.
Market opened: Jan 30, 2026, 12:00 AM ET
Resolver
0x65070BE91...
Beware of external links.
Frequently Asked Questions