Exponential growth in AI computing


Charted: Exponential Growth in AI Computation

Electronic computers had barely existed in the 1940s, when the first experiments with AI began. Now we have AI models that can write poems and create images from text prompts. What has led to such exponential growth in such a short period of time?

This chart from Our World in Data tracks the history of AI through the amount of computation used to train AI models, using data from Epoch AI.

Three eras of AI computing

In the 1950s, American mathematician Claude Shannon trained a robotic mouse named Theseus to navigate a maze and memorize its course—the first artificial learning of any kind.

Theseus was built on just 40 floating point operations (FLOPs). A FLOP counts one basic arithmetic operation (addition, subtraction, multiplication, or division), so the total number of FLOPs measures how much computation went into training a model.

ℹ️ FLOPs are also used as a metric for the computing performance of hardware: the more FLOPs a system can perform per second, the more powerful it is.
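As a toy illustration of what a FLOP counts (an example added here, not from the source article), consider a naive dot product of two length-n vectors, which costs n multiplications and n − 1 additions:

```python
def dot_product_flops(n: int) -> int:
    """FLOPs in a naive dot product of two length-n vectors:
    n multiplications plus (n - 1) additions."""
    return n + (n - 1)

# A million-element dot product costs just under 2 million FLOPs.
print(dot_product_flops(1_000_000))  # 1999999
```

By this kind of accounting, Theseus's 40 FLOPs amount to less arithmetic than a single dot product of two 21-element vectors.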

Computational power, the availability of training data, and algorithms are the three main factors driving the advancement of AI. For the first few decades of AI's progress, compute—the computing power required to train an AI model—grew in line with Moore's Law.

Era                  Duration     Doubling time
Pre-Deep Learning    1950–2010    18–24 months
Deep Learning        2010–2016    5–7 months
Large-Scale Models   2016–2022    11 months

Source: “Compute Trends Across Three Eras of Machine Learning” by Sevilla et al., 2022.
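A short sketch of what these doubling times imply for compute growth over a single decade (the 21-month figure is an assumed midpoint of the 18–24 month range; the calculation is illustrative, not from the paper):

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Multiplicative growth in training compute over `years`,
    given a doubling time in months."""
    return 2 ** (years * 12 / doubling_months)

# Growth over one decade under each era's doubling time.
for era, months in [("Pre-Deep Learning", 21.0),
                    ("Deep Learning", 6.0),
                    ("Large-Scale Models", 11.0)]:
    print(f"{era}: ~{growth_factor(10, months):,.0f}x")
```

At a 6-month doubling time, compute grows by a factor of 2²⁰ (over a million) in ten years, versus only about 50x at the pre-deep-learning pace.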

However, at the start of the deep learning era—heralded by AlexNet, an image-recognition AI, in 2012—the doubling time shortened dramatically to roughly six months as researchers invested more in computation and processors.

Since the emergence of AlphaGo in 2015—a computer program that beat professional human Go players—researchers have identified a third era: large-scale AI models whose computation requirements dwarf those of all previous AI systems.

Predicting AI computing progress

Looking back over the last decade alone, computing power has grown so enormously that it’s hard to fathom.

For example, the computation used to train Minerva, an AI that can solve complex mathematical problems, was roughly six million times that used to train AlexNet ten years earlier.
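That multiplier can be sanity-checked from the training-compute figures in the table below (a back-of-envelope check added here, not a calculation from the paper):

```python
# Training compute in petaFLOPs, taken from the table of milestone models.
alexnet_pflops = 470.0       # AlexNet, 2012
minerva_pflops = 2.7e9       # Minerva, 2022

ratio = minerva_pflops / alexnet_pflops
print(f"Minerva / AlexNet compute: ~{ratio / 1e6:.1f} million x")  # ~5.7 million x
```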

Here is a list of important AI models in history and the computations used to train them.

AI model             Year      Training compute
Perceptron Mark I    1957–58   695,000 FLOPs
Neocognitron         1980      228 million FLOPs
NETtalk              1987      81 billion FLOPs
TD-Gammon            1992      18 trillion FLOPs
NPLM                 2003      1.1 petaFLOPs
AlexNet              2012      470 petaFLOPs
AlphaGo              2016      1.9 million petaFLOPs
GPT-3                2020      314 million petaFLOPs
Minerva              2022      2.7 billion petaFLOPs

Note: One petaFLOP = one quadrillion (10¹⁵) FLOPs. Source: “Compute Trends Across Three Eras of Machine Learning” by Sevilla et al., 2022.

As a result of this increase in computation—along with the availability of huge datasets and better algorithms—AI has made enormous progress in a very short time. AI now not only matches but surpasses human performance in many fields.

It is difficult to predict whether this pace of computational growth will continue. Training large-scale models requires ever more computing power, and progress could slow if compute does not keep scaling up. Exhausting the data currently available for training AI models could also hinder the development and deployment of new models.

However, with all the funding poured into AI recently, perhaps more advances are on the way—like matching the computing power of the human brain.

Where does this data come from?

Source: “Compute Trends Across Three Eras of Machine Learning” by Sevilla et al., 2022.

Note: Estimates of compute doubling times vary across research efforts, including Amodei and Hernandez (2018) and Lyzhov (2021). This article is based on the findings of our source; please see the full paper for more details. The authors also acknowledge the difficulty of categorizing AI models as “regular-scale” or “large-scale” and note that more research is needed in this area.

Method: The paper’s authors used two approaches to determine the amount of computation used to train AI models: counting the number of operations and tracking GPU time. Both methods have drawbacks—namely, a lack of transparency in training processes and the growing complexity of ML models.

[Chart: time series of the training computation used by machine learning systems, with year on the x-axis and compute measured in FLOPs on the y-axis.]


This article was published as part of Visual Capitalist’s Creator Program, which features data-driven visuals from some of our favorite creators around the world.
