Large Language Models (LLMs) are famously data hungry, with the largest of today's models requiring >1T tokens for compute-optimal training. This high sample complexity has several important implications. For one thing, as Epoch recently reported, current LLMs may already be leveraging almost all of the available high-quality text data, and that stock is not growing anywhere near fast enough to sustain the current rate of progress. For another, high data requirements lead to high compute requirements, meaning that only well-resourced actors are able to train LLMs. If techniques for making better use of available data during LLM pretraining were invented, this might remove data as a bottleneck to progress, and could increase the dispersion of powerful models among actors.
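To put the >1T figure in context, here is a minimal sketch (not part of the question text) based on the widely cited Chinchilla scaling heuristic of roughly 20 training tokens per parameter; the function name and the ~20 tokens-per-parameter constant are illustrative assumptions, not figures from the source:

```python
# Rough back-of-the-envelope: compute-optimal token counts under the
# Chinchilla-style heuristic of ~20 training tokens per parameter.
def chinchilla_optimal_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Approximate compute-optimal training token count for a model with n_params parameters."""
    return n_params * tokens_per_param

for n_params in (7e9, 70e9, 175e9):
    tokens = chinchilla_optimal_tokens(n_params)
    print(f"{n_params / 1e9:.0f}B params -> ~{tokens / 1e12:.1f}T tokens")
```

On this rough rule of thumb, a 70B-parameter model already calls for ~1.4T tokens, which is why the available stock of high-quality text starts to look like a binding constraint.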
| Indicator | Value |
| --- | --- |
| Stars | ★★★☆☆ |
| Platform | Metaculus |
| Number of forecasts | 88 |
<iframe src="https://metaforecast.org/questions/embed/metaculus-14415" height="600" width="600" frameborder="0"></iframe>