Will there be a score of 80% or higher on Humanity's Last Exam before April 1, 2025?

Manifold Markets

★★☆☆☆

Yes: < 1% (exceptionally unlikely)

Question description

Background

Humanity's Last Exam (HLE) is a benchmark designed to test AI models at the frontiers of human expertise. The exam consists of expert-level questions across various fields, deliberately crafted to be extremely challenging. Current AI models have performed poorly on this benchmark, with leading models answering fewer than 10% of expert questions correctly.

Resolution Criteria

This market will resolve YES if any AI model achieves a verified score of 80% or higher on Humanity's Last Exam before April 1, 2025. The score must be:

Independently verified by Scale AI or another reputable organization

Achieved on the full exam, not a subset

Publicly announced and documented

Achieved through a single model's capabilities (not through combining multiple models or human assistance)

The market will resolve NO if no AI model achieves a verified score of 80% or higher before April 1, 2025.
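
The resolution rule above can be read as a simple conjunction of conditions. The following is a minimal sketch of that reading, not the market's actual resolution mechanism: the `ScoreReport` record and its field names are hypothetical, and the deadline is assumed to be strict ("before April 1, 2025"), as in the question title.

```python
from dataclasses import dataclass
from datetime import date

THRESHOLD = 0.80             # 80% or higher, per the resolution criteria
DEADLINE = date(2025, 4, 1)  # assumption: strictly before this date


@dataclass
class ScoreReport:
    """Hypothetical record of one reported HLE result (field names are illustrative)."""
    score: float                  # fraction of questions answered correctly, 0.0 to 1.0
    achieved_on: date             # date the score was achieved
    independently_verified: bool  # verified by Scale AI or another reputable organization
    full_exam: bool               # achieved on the full exam, not a subset
    publicly_documented: bool     # publicly announced and documented
    single_model: bool            # no model ensembles or human assistance


def qualifies(report: ScoreReport) -> bool:
    """Return True only if every listed criterion holds for this report."""
    return (
        report.score >= THRESHOLD
        and report.achieved_on < DEADLINE
        and report.independently_verified
        and report.full_exam
        and report.publicly_documented
        and report.single_model
    )


def resolve(reports: list[ScoreReport]) -> str:
    """Resolve YES if any report qualifies, otherwise NO."""
    return "YES" if any(qualifies(r) for r in reports) else "NO"
```

Under this reading, a single failed condition (for example, a high score achieved on a subset of the exam) is enough to keep a report from triggering a YES resolution.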

Considerations

The current performance gap between AI models (<10%) and the target (80%) is substantial

Experts predict models might exceed 50% accuracy by the end of 2025, making an 80% score by April 2025 particularly ambitious

The exam is specifically designed to test the limits of AI capabilities, making rapid improvements more challenging than on typical benchmarks

Scale AI's methodology and scoring criteria may evolve, but resolution will be based on their official scoring system at the time of evaluation

Indicators

Indicator	Value
Stars	★★☆☆☆
Platform	Manifold Markets
Forecasters	20
Volume	M11k

Last updated: 2025-04-01