By Nov 12 2025, will there be a model that meets all of these criteria:

- 84.6% on the Artificial Analysis Quality Index, i.e. the average of benchmark scores on:
  - MMLU
  - GPQA
  - MATH
  - HumanEval
  - MGSM
- with no regressions on any individual benchmark
- <$3/M input tokens, <$12/M output tokens, >720 tokens/sec

Note:

- does not need to be an OpenAI model
- open weights or free models will count as cheaper
- quantized/distilled versions count, as long as they also beat the same accuracy thresholds
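
The criteria above amount to a conjunction of four checks, which can be sketched in Python. This is only an illustration under stated assumptions, not the market's official resolution logic: `Candidate`, `meets_criteria`, and the `baseline` argument are hypothetical names. The question does not say which reference scores "no regressions" is measured against, so the baseline is passed in by the caller. Reading "84.6%" as "at least 84.6%" is also an assumption.

```python
from dataclasses import dataclass
from statistics import mean

# The five benchmarks named in the question.
BENCHMARKS = ("MMLU", "GPQA", "MATH", "HumanEval", "MGSM")


@dataclass
class Candidate:
    """Hypothetical container for one model's published numbers."""
    scores: dict                # benchmark name -> score in percent
    input_price: float          # USD per million input tokens
    output_price: float         # USD per million output tokens
    tokens_per_sec: float       # output speed
    open_or_free: bool = False  # open weights or free models count as cheaper


def meets_criteria(c: Candidate, baseline: dict) -> bool:
    """Return True if the candidate passes all four checks.

    `baseline` maps each benchmark to the reference score that must not
    regress; which reference model applies is an assumption left to the caller.
    """
    # Quality index: plain average of the five benchmark scores.
    index = mean(c.scores[b] for b in BENCHMARKS)
    # No individual benchmark may fall below its baseline score.
    no_regressions = all(c.scores[b] >= baseline[b] for b in BENCHMARKS)
    # Price thresholds are strict; open or free models pass automatically.
    cheap = c.open_or_free or (c.input_price < 3 and c.output_price < 12)
    # Speed threshold is strict as well.
    fast = c.tokens_per_sec > 720
    return index >= 84.6 and no_regressions and cheap and fast
```

Because every check must hold at once, a model that clears the average but slips on a single benchmark, or that is fast but priced above either threshold, would not count.
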
| Indicator | Value |
|---|---|
| Stars | ★★☆☆☆ |
| Platform | Manifold Markets |
| Forecasters | 13 |
| Volume | M1.0k |