Charted: Exponential Growth in AI Computation
When experiments with AI began, electronic computers had barely existed; they were invented in the 1940s. Now we have AI models that can write poems and create images from text prompts. What has driven such exponential growth in such a short period of time?
This chart from Our World in Data tracks the history of AI by the amount of computing power used to train AI models, using data from Epoch AI.
Three eras of AI computing
In the 1950s, American mathematician Claude Shannon trained a robotic mouse named Theseus to navigate a maze and memorize its course—the first artificial learning of any kind.
Theseus was trained on about 40 floating-point operations (FLOPs), where a FLOP is a single basic arithmetic operation (addition, subtraction, multiplication, or division). The total FLOP count is a standard way to measure the amount of computation used to train an AI model.
ℹ️ FLOPs: A count of basic arithmetic operations. When measured per second (FLOPS), the figure describes the computing performance of hardware: the higher the number, the more powerful the system.
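To make the unit concrete, here is a minimal sketch of how FLOPs are counted for a simple computation (a dot product of two vectors; the function name is illustrative, not from the source):

```python
# Counting FLOPs for a dot product of two length-n vectors:
# n multiplications plus n - 1 additions, i.e. 2n - 1 FLOPs in total.

def dot_product_flops(n: int) -> int:
    return 2 * n - 1

print(dot_product_flops(1_000))  # 1999
```

Training a neural network involves vast numbers of such operations, which is why training compute is quoted in FLOPs.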
Computational power, the availability of training data, and algorithms are the three main factors driving the advancement of AI. For the first few decades of AI's progress, compute, that is, the computing power required to train an AI model, grew in line with Moore's Law.
Period | Era | Compute doubling time |
---|---|---|
1950-2010 | Pre-Deep Learning | 18-24 months |
2010-2016 | Deep Learning | 5-7 months |
2016-2022 | Large-Scale Models | 11 months |
Source: "Compute Trends Across Three Eras of Machine Learning" by Sevilla et al., 2022.
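To see what these doubling times imply, a short sketch converts each era's doubling period into a yearly growth factor (assuming clean exponential growth; the representative doubling times are taken from the table above):

```python
# Convert a compute doubling time (in months) into a yearly growth factor,
# assuming clean exponential growth: factor = 2 ** (12 / doubling_months).

def yearly_growth(doubling_months: float) -> float:
    return 2 ** (12 / doubling_months)

eras = [
    ("Pre-Deep Learning", 21),   # midpoint of 18-24 months
    ("Deep Learning", 6),        # midpoint of 5-7 months
    ("Large-Scale Models", 11),
]
for era, months in eras:
    print(f"{era}: ~{yearly_growth(months):.1f}x per year")
```

A 6-month doubling time means training compute grew roughly fourfold every year during the deep learning era, versus about 1.5x per year under Moore's Law.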
However, at the start of the deep learning era, heralded by AlexNet (an image-recognition AI) in 2012, the doubling time shortened to roughly six months as researchers invested more in computation and processors.
Since the emergence of AlphaGo in 2015, a computer program that beat professional human Go players, researchers have identified a third era: that of large-scale AI models whose computational demands dwarf those of all previous AI systems.
Predicting AI computing progress
Looking back over the last decade alone, compute has grown so enormously that it is hard to fathom.
For example, the compute used to train Minerva, an AI that can solve complex mathematical problems, was roughly six million times that used to train AlexNet ten years earlier.
Here is a list of notable AI models in history and the compute used to train them.
AI | Year | Training compute (FLOPs) |
---|---|---|
Theseus | 1950 | 40 |
Perceptron Mark I | 1957-58 | 695,000 |
Neocognitron | 1980 | 228 million |
NETtalk | 1987 | 81 billion |
TD-Gammon | 1992 | 18 trillion |
NPLM | 2003 | 1.1 petaFLOPs |
AlexNet | 2012 | 470 petaFLOPs |
AlphaGo | 2016 | 1.9 million petaFLOPs |
GPT-3 | 2020 | 314 million petaFLOPs |
Minerva | 2022 | 2.7 billion petaFLOPs |
Note: One petaFLOP = one quadrillion (10^15) FLOPs. Source: "Compute Trends Across Three Eras of Machine Learning" by Sevilla et al., 2022.
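As a quick sanity check of the Minerva figure, the ratio can be computed directly from the table (a sketch in Python; the FLOP values are those listed above):

```python
# Training compute from the table, in FLOPs (1 petaFLOP = 1e15 FLOPs).
alexnet = 470e15        # AlexNet, 2012: 470 petaFLOPs
minerva = 2.7e9 * 1e15  # Minerva, 2022: 2.7 billion petaFLOPs

ratio = minerva / alexnet
print(f"Minerva used ~{ratio / 1e6:.1f} million times the compute of AlexNet")
# ~5.7 million, consistent with the article's "roughly six million"
```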
As a result of this growth in compute, along with the availability of huge datasets and better algorithms, AI has made enormous progress in a very short time. AI now not only matches but surpasses human performance in many fields.
It is difficult to predict whether the same pace of compute growth will continue. Large-scale models require ever more computing power to train, and progress may slow if compute does not continue to ramp up. Exhausting all the data currently available for training AI models could also hinder the development and deployment of new models.
However, with all the funding poured into AI recently, perhaps more breakthroughs are on the way, such as matching the computing power of the human brain.
Where does this data come from?
Source: "Compute Trends Across Three Eras of Machine Learning" by Sevilla et al., 2022.
Note: Estimates of compute doubling times vary across research efforts, including Amodei and Hernandez (2018) and Lyzhov (2021). This article is based on our source's findings; please see their full paper for more details. Moreover, the authors acknowledge the difficulty of classifying AI models as "regular-scale" or "large-scale" and say that more research is needed in this area.
Methodology: The paper's authors used two methods to estimate the compute used to train AI models: counting the number of operations and tracking GPU time. Both methods have drawbacks, namely a lack of transparency in training processes and growing complexity as ML models scale up.

This article was published as part of Visual Capitalist’s Creator Program, which features data-driven visuals from some of our favorite creators around the world.