Will a dense machine learning model with at least 100 trillion parameters be trained before 2026?

Metaculus

★★★☆☆

28%

Unlikely

Yes

Question description

Related Questions on Metaculus:

Will a 100 trillion parameter deep learning model be trained before 2026? (resolved as Yes)
If GPT-4 is announced before 2025, how many parameters will it have (in billions of parameters)?

Parameter count is a key attribute of modern machine learning (ML) systems: it strongly influences both model performance and training costs. Deepchecks describes parameters as follows:

The weights and coefficients that the algorithm extracts from the data are known as model parameters. Model parameters of neural networks consider how the predictor variable influences the target variable.

In other words, the model learns these parameters during training to fit the input data to the appropriate output.
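To make "parameter count" concrete, the following minimal sketch counts the learned weights and biases of a small fully connected network. The function names and the layer sizes are illustrative assumptions, not something taken from the question text.

```python
from typing import Sequence


def dense_layer_params(n_in: int, n_out: int, bias: bool = True) -> int:
    """Parameters in one fully connected layer: a weight matrix plus optional biases."""
    return n_in * n_out + (n_out if bias else 0)


def mlp_params(layer_sizes: Sequence[int]) -> int:
    """Total parameters of a multilayer perceptron with the given layer widths."""
    return sum(
        dense_layer_params(n_in, n_out)
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical example: 784 inputs, one hidden layer of 128 units, 10 outputs.
# (784 * 128 + 128) + (128 * 10 + 10) = 101,770 parameters.
print(mlp_params([784, 128, 10]))
```

A 100-trillion-parameter model would have roughly a billion times as many learned values as this toy network.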

In recent years, the number of parameters used in ML models has increased rapidly. However, as discussed in this writeup (and also here), research that DeepMind published in the spring of 2022, alongside a model named Chinchilla, suggested that previous work had underestimated the importance of dataset size relative to parameter count.

On March 29th, DeepMind published a paper, "Training Compute-Optimal Large Language Models", that shows that essentially everyone -- OpenAI, DeepMind, Microsoft, etc. -- has been training large language models with a deeply suboptimal use of compute.

Following the new scaling laws that they propose for the optimal use of compute, DeepMind trains a new, 70-billion parameter model that outperforms much larger language models, including the 175-billion parameter GPT-3 and DeepMind's own 280-billion parameter "Gopher".
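As a rough illustration of why these scaling laws matter for this question, the sketch below applies two rules of thumb that are commonly quoted alongside the Chinchilla paper: about 20 training tokens per parameter, and training compute of about 6 FLOPs per parameter per token. Both constants are approximations assumed here for illustration; they are not part of the question text or its resolution criteria.

```python
# Assumed rules of thumb (approximations, not exact results from the paper):
TOKENS_PER_PARAM = 20          # compute-optimal tokens per parameter
FLOPS_PER_PARAM_PER_TOKEN = 6  # forward plus backward pass cost


def compute_optimal_tokens(n_params: int) -> int:
    """Approximate number of training tokens for a compute-optimal dense model."""
    return TOKENS_PER_PARAM * n_params


def training_flops(n_params: int, n_tokens: int) -> int:
    """Approximate total training compute in FLOPs."""
    return FLOPS_PER_PARAM_PER_TOKEN * n_params * n_tokens


for name, n_params in [("70B (Chinchilla-sized)", 70 * 10**9),
                       ("100T (question threshold)", 100 * 10**12)]:
    tokens = compute_optimal_tokens(n_params)
    flops = training_flops(n_params, tokens)
    print(f"{name}: ~{tokens:.2e} tokens, ~{flops:.2e} FLOPs")
```

Under these assumptions, a compute-optimal dense model with 100 trillion parameters would need on the order of 2e15 training tokens and about 1.2e30 FLOPs, roughly two million times the compute of a 70-billion parameter model trained the same way. A model could still meet the parameter threshold while being trained on far fewer tokens than this.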

In March of 2022, a paper describing the BaGuaLu model was published; it discussed a variant of the model trained with 174 trillion parameters. However, this was a sparse model (seemingly a variant of mixture of experts) and was primarily a demonstration of the ability to train large-scale models.

Sparse models activate only a small share of their parameters in a forward pass, using those that were trained for the task at hand, while dense models use all or nearly all of their parameters for every input. In an ML model, a forward pass (or forward propagation) is the process of input data "travelling" through the neural network to the output layer.
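The difference between total and active parameters can be sketched with a simplified mixture-of-experts feed-forward layer. The layer shape, the top-k routing scheme, and all sizes below are hypothetical choices for illustration; they do not describe BaGuaLu or any other specific model.

```python
from typing import Tuple


def moe_param_counts(d_model: int, d_hidden: int,
                     num_experts: int, top_k: int) -> Tuple[int, int]:
    """Return (total, active) parameter counts for one simplified MoE layer.

    Each expert is a two-matrix feed-forward block (biases omitted), and a
    linear router scores the experts. Only the top_k highest-scoring experts
    run for a given input token.
    """
    expert = 2 * d_model * d_hidden      # up-projection + down-projection
    router = d_model * num_experts       # routing weights
    total = num_experts * expert + router
    active = top_k * expert + router
    return total, active


# Hypothetical sparse layer: 64 experts, 2 active per token.
total, active = moe_param_counts(d_model=1024, d_hidden=4096,
                                 num_experts=64, top_k=2)
print(f"sparse: {active:,} of {total:,} parameters active ({active / total:.1%})")

# A dense layer behaves like a single always-active expert.
total, active = moe_param_counts(d_model=1024, d_hidden=4096,
                                 num_experts=1, top_k=1)
print(f"dense:  {active:,} of {total:,} parameters active ({active / total:.1%})")
```

In this sketch the sparse layer holds 64 times as many expert weights as the dense one but uses only about 3% of its parameters per token, which is why headline parameter counts for sparse and dense models are not directly comparable.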

Note that the above is for information only and is not the resolution source for this question.

Indicators

Indicator	Value
Stars	★★★☆☆
Platform	Metaculus
Number of forecasts	87


Last updated: 2024-10-07
