Artificial intelligence has long been a fixture of everyday life. We rarely give it a thought when we use a smartphone's voice assistant or automatic image recognition in an app. Even Google Search relies on machine learning, a branch of artificial intelligence.

For about two years now, a technology called RankBrain has been intelligently sorting Google's search results. About 15% of daily queries are new to Google, that is, no user has ever phrased them before.

How Google's CPU Works: Free Artificial Intelligence

“Our artificial intelligence does without human commands”

Professor David Silver, Chief AI Programmer, Google AlphaGo Zero

RankBrain algorithms look for familiar patterns in unknown search queries and link them to semantically similar concepts. As a result, the search engine learns on its own and can provide appropriate answers to questions that have never been asked before.
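RankBrain's internals are not public, but a common way to link a query term to semantically similar concepts is to compare embedding vectors by cosine similarity. Here is a minimal sketch with made-up three-dimensional vectors; real systems learn much larger embeddings from huge text corpora:

```python
import numpy as np

# Toy word vectors; a real system would learn these from a large corpus.
embeddings = {
    "laptop":   np.array([0.9, 0.1, 0.0]),
    "notebook": np.array([0.8, 0.2, 0.1]),
    "banana":   np.array([0.0, 0.1, 0.9]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means similar."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def most_similar(word):
    """Return the known term whose vector is closest to `word`'s vector."""
    return max(
        (w for w in embeddings if w != word),
        key=lambda w: cosine_similarity(embeddings[word], embeddings[w]),
    )

print(most_similar("laptop"))  # notebook
```

With vectors like these, a never-before-seen query containing "laptop" can be routed toward results already known to work for "notebook".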

Google uses specially designed tensor processors – TPUs – in its data centers.

Tensor Processors (TPUs)

The success of Google’s artificial intelligence looks impressive. One reason is that Google has developed special hardware that accelerates neural networks far more efficiently than standard CPUs and GPUs. Strikingly, tensor processors reuse familiar PC infrastructure: each board fits into a standard SATA drive slot and communicates with the host and with other tensor processors via PCI Express.

Tensor processor structure

Fast addition and multiplication are the tensor processor's strong point. Neural network calculations are performed in the module responsible for them, whose central component is a matrix unit occupying about a quarter of the chip area. The rest of the space is devoted to feeding it input data quickly, which arrives via PCI Express and DDR3 RAM. The results are returned to the server via PCI Express and the host interface.

Superhuman intelligence

The processors got their name from the TensorFlow software library. The main purpose of the TPU is to accelerate artificial intelligence algorithms that rely on open-source software libraries.

TPUs initially gained attention as the hardware platform for AlphaGo, the artificial intelligence that beat the world’s best players at the Asian board game Go. Unlike chess, Go was long considered impossible for software to play at a professional level.

Its successor, AlphaGo Zero, learned the game entirely on its own, given nothing but the rules. Within three days it reached professional level; within three weeks it overtook the previous version of AlphaGo, whose training had taken enormous effort and millions of games played by professionals. It turned out that human game records had been limiting the AI's exploration of moves. Six weeks later, AlphaGo Zero was unbeatable.

Neural Network Accelerator

Unlike conventional processors, tensor processors are specialized for artificial neural networks. Such a network consists of many interconnected mathematical functions that mimic the human brain with its nerve cells and their connections. Like our brain, a neural network needs suitable input from which to learn, for example to recognize speech, images, or the rules of the game Go.

An artificial neural network comprises several layers of neurons. Each neuron computes a weighted sum of the outputs of the connected neurons in the previous layer. Choosing the right weights is the key to success in machine learning, but the weights must first be learned, which in practice means vast numbers of floating-point operations.
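The weighted sum described above is exactly a matrix-vector product, which is why a matrix unit accelerates neural networks so well. A minimal sketch of one layer (the weights and inputs here are made up for illustration):

```python
import numpy as np

def layer_forward(inputs, weights, bias):
    """One layer: each neuron computes a weighted sum of the previous
    layer's outputs plus a bias, passed through an activation (ReLU)."""
    z = weights @ inputs + bias      # weighted sums, one per neuron
    return np.maximum(z, 0.0)        # ReLU activation

inputs  = np.array([0.5, -1.0, 2.0])    # outputs of the previous layer
weights = np.array([[0.2,  0.4, 0.1],   # one row of weights per neuron
                    [0.7, -0.3, 0.5]])
bias    = np.array([0.1, -0.2])

print(layer_forward(inputs, weights, bias))  # [0.   1.45]
```

Training adjusts `weights` and `bias` over many such passes; a deep network simply chains layers like this one, so almost all of the work is multiply-and-add.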

For that training phase, GPUs are in fact the best choice. But to sort search results or predict moves later, during inference, the neural network no longer needs high-precision floating-point arithmetic. What this phase requires is a very large number of integer multiplications and additions.

Tensor processors at the Google Computing Center

Google has been using TPUs in its data centers since 2016. A board usually carries several processors in one cluster; they are mostly deployed in blocks of four.


The Google tensor processor consists mainly of a computational unit: a matrix of 256×256 multiply-accumulate cells. It works with eight-bit integers, achieves a processing rate of 92 trillion operations per second, and stores its results in on-chip memory.
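The headline figure follows directly from the matrix unit's size and clock rate. Assuming the roughly 700 MHz clock reported for the first-generation TPU, and counting each multiply-accumulate as two operations:

```python
mac_units = 256 * 256        # multiply-accumulate cells in the matrix unit
clock_hz = 700e6             # ~700 MHz (first-generation TPU, assumed here)
ops_per_mac = 2              # one multiplication + one addition per cycle

ops_per_second = mac_units * clock_hz * ops_per_mac
print(ops_per_second / 1e12)  # ≈ 91.75 trillion operations per second
```

That is the roughly 92 trillion operations per second quoted above, achievable only if the surrounding components keep all 65,536 cells fed with data every cycle.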

The diagram shows that the matrix unit occupies only about a quarter of the chip area. The remaining components are responsible for constantly feeding the compute cells new data. Tensor processors do not issue commands to themselves; the commands come from a connected server via PCI Express, and the final results are sent back the same way.

The weights required for the neural network calculations are supplied from a First-In/First-Out (FIFO) memory module. Since they change little for a given application, a connection via DDR3 RAM is sufficient. Intermediate results are placed in a 24 MB on-chip buffer and fed back into the computational unit.

Power consumption in comparison

Comparison of processor performance per watt of electricity consumed demonstrates the greater efficiency of tensor processors.

Computing power/watt

Racing with CPU and GPU

Tensor processors compute at a rate of 225,000 neural network predictions per second. CPU and GPU can’t compete.


45 times faster than a conventional processor

Although the tensor processor's CISC (Complex Instruction Set Computer) instruction set can express complex operations, it contains only about a dozen instructions. Most of the necessary work needs just five of them, including instructions for reading data, performing a matrix multiplication, and computing an activation function.
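A host-driven inference on such a chip can be pictured as a short sequence of those high-level instructions. The function names below are illustrative stand-ins, not Google's actual API; each mimics one instruction in the read → load weights → multiply → activate → write flow:

```python
import numpy as np

def read_host_memory(buf):      # inputs arrive from the host over PCI Express
    return np.asarray(buf)

def read_weights(fifo):         # weights stream in from the FIFO / DDR3
    return np.asarray(fifo)

def matrix_multiply(x, w):      # the 256x256 matrix unit's job
    return x @ w

def activate(z):                # apply the activation function (ReLU here)
    return np.maximum(z, 0.0)

def write_host_memory(result):  # results go back to the host over PCI Express
    return result.tolist()

# One complete "inference" with toy data:
x = read_host_memory([[1.0, -2.0]])
w = read_weights([[0.5, 1.0], [0.25, -1.0]])
out = write_host_memory(activate(matrix_multiply(x, w)))
print(out)  # [[0.0, 3.0]]
```

The point of such a tiny instruction set is that the chip spends its silicon on the matrix unit rather than on instruction decoding, which is where much of a conventional CPU's complexity goes.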

Optimized for artificial-intelligence calculations, tensor processors are significantly faster than conventional processors (45 times) and GPUs (17 times), while also operating with greater energy efficiency.

And Google is only at the beginning of the journey: simple measures could push tensor-processor performance even further. Swapping in GDDR5 RAM alone could triple the current computing power.


Photo: Google LLC