Related Questions on Metaculus:
Parameter count is a key attribute of modern machine learning (ML) systems: it has a strong influence on model performance, and on training costs. Deepchecks describes parameters as follows:
The weights and coefficients that the algorithm extracts from the data are known as model parameters. Model parameters of neural networks consider how the predictor variable influences the target variable.
In other words, the model learns these parameters during training so that it maps input data to the appropriate output.
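As a rough illustration of what "parameter count" measures, the sketch below counts the weights and biases of a small fully connected network. The function name `count_dense_parameters` and the layer widths are hypothetical choices for this example and do not come from any particular model.

```python
def count_dense_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the width of each layer, input first and output last.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out  # one weight per input-output connection
        total += fan_out           # one bias per output unit
    return total


# Hypothetical network: 784 inputs, one hidden layer of 128 units, 10 outputs.
print(count_dense_parameters([784, 128, 10]))  # 101770
```

Every one of these numbers is adjusted during training, which is why the count grows quickly as layers get wider or deeper.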
In recent years the number of parameters used in ML models has increased rapidly. But, as discussed in this writeup (and also here), research by DeepMind published in the spring of 2022, along with a model named Chinchilla, suggested that the importance of dataset size relative to parameter count had been underestimated in previous work.
On March 29th, DeepMind published a paper, "Training Compute-Optimal Large Language Models", that shows that essentially everyone -- OpenAI, DeepMind, Microsoft, etc. -- has been training large language models with a deeply suboptimal use of compute.
Following the new scaling laws that they propose for the optimal use of compute, DeepMind trains a new, 70-billion parameter model that outperforms much larger language models, including the 175-billion parameter GPT-3 and DeepMind's own 280-billion parameter "Gopher".
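The trade-off between parameter count and dataset size can be sketched with two rules of thumb that are commonly associated with this line of work. Both are assumptions for this example and are not stated in the text above: training compute is approximated as C ≈ 6 · N · D FLOPs for N parameters and D training tokens, and the compute-optimal ratio is taken to be roughly 20 tokens per parameter. The function names are hypothetical.

```python
import math

# Assumed rules of thumb, not exact laws:
FLOPS_PER_PARAM_TOKEN = 6.0  # C ≈ 6 * N * D
TOKENS_PER_PARAM = 20.0      # D ≈ 20 * N at the compute-optimal point


def training_flops(params, tokens):
    """Approximate training compute for a dense model."""
    return FLOPS_PER_PARAM_TOKEN * params * tokens


def compute_optimal_split(flops_budget):
    """Split a compute budget into (parameters, tokens) under the assumptions above.

    Substituting D = 20 * N into C = 6 * N * D gives C = 120 * N**2.
    """
    params = math.sqrt(flops_budget / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM))
    tokens = TOKENS_PER_PARAM * params
    return params, tokens


# A 70-billion parameter model at 20 tokens per parameter sees 1.4 trillion tokens.
budget = training_flops(70e9, 1.4e12)
params, tokens = compute_optimal_split(budget)
print(f"budget={budget:.3g} FLOPs, params={params:.3g}, tokens={tokens:.3g}")
```

Under these assumptions, a fixed budget spent on a much larger model leaves fewer tokens to train on, which is the sense in which earlier models were considered undertrained for their size.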
In March of 2022, a paper describing the BaGuaLu model was published, and discussed a variant of this model trained with 174 trillion parameters. However, this was a sparse model (seemingly a variant of mixture of experts), and was primarily a demonstration of the ability to train large scale models.
Sparse models activate a smaller share of their parameters in a forward pass, using those that were trained for the task at hand, while dense models use a larger share of their parameters. In an ML model a forward pass or forward propagation is the process of input data "travelling" through the neural network to the output node.
<iframe src="https://ourworldindata.org/grapher/artificial-intelligence-parameter-count" loading="lazy" style="width: 100%; height: 600px; border: 0px none;"></iframe>
Note that the above is for information only and is not the resolution source for this question.
Indicator | Value |
---|---|
Stars | ★★★☆☆ |
Platform | Metaculus |
Number of forecasts | 87 |
<iframe src="https://metaforecast.org/questions/embed/metaculus-14502" height="600" width="600" frameborder="0" />