On the origins of algorithmic progress in artificial intelligence

A new paper from MIT FutureTech argues that research has overstated the impact of algorithmic breakthroughs on artificial intelligence progress while underestimating the role of growing compute resources.

The paper, titled "On the Origins of Algorithmic Progress in Artificial Intelligence," examines how much researchers' algorithmic breakthroughs actually contribute to progress in artificial intelligence, compared with simply building larger and more powerful datacenters. The authors argue that the existing literature overestimates the impact of algorithmic advances and underestimates the role of increased computing resources in driving recent capability gains. This challenges a common narrative in the field that places primary emphasis on novel training techniques and model architectures.

The paper frames artificial intelligence progress as emerging from three main components: compute scaling, hardware efficiency, and algorithmic efficiency. Compute scaling is defined as the raw number of mathematical operations, or FLOPs, performed during model training, where more FLOPs lead to better performance and can be achieved by having more hardware or running that hardware for longer. Hardware efficiency is described as the cost of each operation in monetary, energy, or time units, so better hardware efficiency means more FLOPs can be performed for the same number of dollars, joules, or days. Algorithmic efficiency refers to the cleverness of the training procedure, where better algorithmic efficiency means the same model performance can be reached using fewer operations.
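The relationship between the three components can be sketched as simple arithmetic. The snippet below is only an illustration under stated assumptions, not the paper's method: the function names, the budget, and the 2013 FLOPs-per-dollar figure are hypothetical placeholders, while the 45x hardware gain and the 10x algorithmic gain are the example values used in this article.

```python
def physical_flops(budget_dollars: float, flops_per_dollar: float) -> float:
    """Compute scale: operations a budget buys at a given hardware efficiency."""
    return budget_dollars * flops_per_dollar


def baseline_equivalent_flops(flops: float, algorithmic_gain: float) -> float:
    """FLOPs an older, less efficient algorithm would need for the same result."""
    return flops * algorithmic_gain


# Hypothetical placeholders, chosen only to make the arithmetic concrete.
BUDGET_DOLLARS = 1_000_000.0
FLOPS_PER_DOLLAR_2013 = 1e15

# Example values quoted in the article.
HARDWARE_GAIN = 45.0      # FLOPs-per-dollar improvement, 2013 to 2024
ALGORITHMIC_GAIN = 10.0   # "10x less compute" for the same performance

flops_2013 = physical_flops(BUDGET_DOLLARS, FLOPS_PER_DOLLAR_2013)
flops_2024 = physical_flops(BUDGET_DOLLARS, FLOPS_PER_DOLLAR_2013 * HARDWARE_GAIN)
equivalent_2024 = baseline_equivalent_flops(flops_2024, ALGORITHMIC_GAIN)

print(f"2013 budget buys:        {flops_2013:.2e} FLOPs")
print(f"2024 budget buys:        {flops_2024:.2e} FLOPs")
print(f"2024 baseline-equivalent: {equivalent_2024:.2e} FLOPs")
```

Under the conventional view, the gains simply multiply: the same budget buys 45x more operations, and each operation counts for 10x more when measured against the older algorithm.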

The conventional view has been that these three components are orthogonal and can be evaluated independently. For example, hardware efficiency (in FLOPs per dollar) improved 45x between 2013 and 2024, and because compute scale and hardware efficiency are considered independent, this roughly 45x gain is expected to apply whether a lab has 10 GPUs or 10,000. Algorithmic efficiency has traditionally been assumed to be independent of compute scale as well: whether a researcher has 10^15 FLOPs or 10^25 FLOPs, the literature assumes that the same algorithmic improvement yields a similar multiplicative benefit, such as a 10x reduction in compute at 10^15 FLOPs and at 10^25 FLOPs alike. The paper questions whether this assumption of scale-independent algorithmic efficiency is actually safe, setting up a reexamination of how progress in artificial intelligence should be attributed between algorithms and compute.
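To make the assumption under scrutiny concrete, the sketch below contrasts a scale-independent gain with one hypothetical way the assumption could fail. The power-law form, its exponent, and both function names are illustrative assumptions and are not taken from the paper; only the 10x figure and the two compute levels come from the example above.

```python
def constant_gain(compute_flops: float, multiplier: float = 10.0) -> float:
    """Scale-independent assumption: the same multiplier at every compute level."""
    return multiplier


def scale_dependent_gain(
    compute_flops: float,
    reference_flops: float = 1e15,
    gain_at_reference: float = 10.0,
    exponent: float = 0.1,  # hypothetical value, chosen only for illustration
) -> float:
    """Hypothetical alternative: the gain varies with compute as a power law."""
    return gain_at_reference * (compute_flops / reference_flops) ** exponent


for compute in (1e15, 1e25):
    print(
        f"{compute:.0e} FLOPs: "
        f"constant {constant_gain(compute):.1f}x, "
        f"scale-dependent {scale_dependent_gain(compute):.1f}x"
    )
```

With these made-up parameters, both models agree at 10^15 FLOPs, but the scale-dependent one reports a gain of roughly 100x at 10^25 FLOPs. If gains behaved anything like this, a single multiplier measured at one scale would misstate the benefit at another, which is why the independence assumption matters for attributing progress.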


