The emergence and rise of artificial intelligence has undoubtedly played an important role in the development of the Internet. Over the past decade, as its applications have spread through society, artificial intelligence has become increasingly relevant to people’s daily lives. This chapter introduces the concept of artificial intelligence, the related technologies, and the existing controversies over the topic.

1.1 The Concept of Artificial Intelligence

1.1.1 What Is Artificial Intelligence?

Currently, people mainly learn about artificial intelligence (AI) through news, movies, and applications in daily life, as shown in Fig. 1.1.

Fig. 1.1 Social cognition of AI

A widely accepted and relatively early definition of AI was proposed by John McCarthy at the 1956 Dartmouth Conference: artificial intelligence is about making a machine simulate the intelligent behavior of humans as closely as possible. However, this definition seems to ignore the possibility of strong artificial intelligence (that is, a machine that has the ability or intelligence to solve problems by reasoning).

Before explaining what “artificial intelligence” is, we should first clarify the concept of “intelligence”.

According to the theory of multiple intelligences, human intelligence was originally categorized into seven types: Linguistic, Logical-Mathematical, Spatial, Bodily-Kinesthetic, Musical, Interpersonal, and Intrapersonal intelligence. An eighth type, Naturalist intelligence, was added to the theory later.

  1. Linguistic Intelligence

    Linguistic intelligence refers to the ability to express one’s thoughts effectively in spoken or written language, understand others’ words or texts, flexibly master the phonology, semantics, and grammar of a language, manage verbal thinking, and convey or decode the meaning of linguistic expressions through verbal thinking. For people with strong linguistic intelligence, ideal career choices include politician, activist, host, attorney, public speaker, editor, writer, journalist, and teacher.

  2. Logical-Mathematical Intelligence

    Logical-mathematical intelligence designates the capability to calculate, quantify, reason, summarize, and classify effectively, and to carry out complicated mathematical operations. This capability is characterized by sensitivity to abstract concepts such as logical patterns and relationships, statements and claims, and functions. People who are strong in logical-mathematical intelligence are well suited to work as scientists, accountants, statisticians, engineers, and computer software developers.

  3. Spatial Intelligence

    Spatial intelligence is the ability to perceive visual space and the things in it accurately, and to represent what is perceived visually in paintings and graphs. People with strong spatial intelligence are very sensitive to color, line, shape, form, and the spatial relationships among them. Suitable jobs for them include interior designer, architect, photographer, painter, and pilot.

  4. Bodily-Kinesthetic Intelligence

    Bodily-kinesthetic intelligence indicates the capacity to use one’s whole body to express thoughts and emotions, and to use one’s hands and tools to fashion products or manipulate objects. This intelligence involves a variety of physical skills such as balance, coordination, agility, strength, suppleness, and speed, as well as tactile abilities. Potential careers for people with strong bodily-kinesthetic intelligence include athlete, actor, dancer, surgeon, jeweler, and mechanic.

  5. Musical Intelligence

    Musical intelligence is the ability to discern pitch, tone, melody, rhythm, and timbre. People with relatively high musical intelligence are particularly sensitive to these qualities and tend to excel at performing, creating, and reflecting on music. Recommended professions for them include singer, composer, conductor, music critic, and piano tuner.

  6. Interpersonal Intelligence

    Interpersonal intelligence is the capability to understand and interact effectively with others. People with strong interpersonal intelligence can better recognize the moods and temperaments of others, empathize with their feelings and emotions, notice the hidden information in different interpersonal relationships, and respond appropriately. Suitable professions for them include politician, diplomat, leader, psychologist, PR officer, and salesperson.

  7. Intrapersonal Intelligence

    Intrapersonal intelligence is about self-knowledge: the capability to understand oneself and to act on that understanding. People with strong intrapersonal intelligence are able to discern their own strengths and weaknesses, recognize their interests, moods, intentions, temperament, and self-esteem, and they like to think independently. Suitable professions for them include philosopher, politician, thinker, and psychologist.

  8. Naturalist Intelligence

    Naturalist intelligence refers to the ability to observe the various forms of nature, identify and classify objects, and distinguish between natural and artificial systems.

AI, by contrast, is a new technical science that studies and develops the theories, methods, technologies, and application systems used to simulate, extend, and expand human intelligence. AI is created to enable machines to reason like human beings and to endow them with intelligence. Today, the scope of AI has broadened greatly, making it an interdisciplinary subject, as shown in Fig. 1.2.

Fig. 1.2 Fields covered by artificial intelligence

Machine learning (ML) is clearly one of the major focuses of this interdisciplinary subject. Tom Mitchell, often called “the godfather of global ML”, defines machine learning as follows: a computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E. This is a relatively simple and abstract definition. As our understanding of the concept deepens, however, the connotation and denotation of machine learning change accordingly. It is not easy to define machine learning precisely in one or two sentences, not only because it covers a wide span of fields in both theory and application, but also because it is developing and transforming rapidly.
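Mitchell's definition can be made concrete with a toy experiment. The sketch below is only an illustration under assumed choices: the task T is classifying one-dimensional points into two classes, the experience E is a growing set of labelled samples, and the performance measure P is accuracy on a fixed test set. The midpoint-threshold classifier, the synthetic data generator, and all function names are hypothetical and do not come from the text.

```python
import random

def make_samples(n, rng):
    """Draw n labelled points: class 0 centred at -1.0, class 1 at +1.0."""
    samples = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 1.0 if label == 1 else -1.0
        samples.append((rng.gauss(center, 1.0), label))
    return samples

def train(samples):
    """Experience E -> model: threshold at the midpoint of the class means."""
    means = []
    for label in (0, 1):
        xs = [x for x, y in samples if y == label]
        # Fall back to 0.0 if a class has not been observed yet.
        means.append(sum(xs) / len(xs) if xs else 0.0)
    return (means[0] + means[1]) / 2.0

def accuracy(threshold, samples):
    """Performance P: fraction of samples classified correctly."""
    correct = sum(1 for x, y in samples if (x > threshold) == (y == 1))
    return correct / len(samples)

def learning_curve(sizes, seed=0):
    """Measure P on a fixed test set after training on the first n samples."""
    rng = random.Random(seed)
    test_set = make_samples(2000, rng)
    train_pool = make_samples(max(sizes), rng)
    return [(n, accuracy(train(train_pool[:n]), test_set)) for n in sizes]

if __name__ == "__main__":
    for n, acc in learning_curve([2, 10, 100, 1000]):
        print(f"E = {n:4d} samples -> P = {acc:.3f}")
```

In Mitchell's terms, the program "learns" if the printed accuracy tends to rise as the number of training samples grows. With very few samples the estimate is noisy, so individual runs need not improve at every step.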

Generally speaking, machine learning systems and algorithms make predictions mainly by identifying hidden patterns in data. Machine learning is an important subfield of AI, and in a broader sense AI is also intertwined with data mining (DM) and knowledge discovery in databases (KDD).

1.1.2 The Relationship Between AI, Machine Learning, and Deep Learning

The study of machine learning aims to enable computers to simulate or reproduce the human ability to learn and to acquire new knowledge and skills. Deep learning (DL) derives from the study of artificial neural networks (ANN). As a newer subfield of machine learning, it focuses on mimicking the mechanisms the human brain uses to interpret data such as images, sound, and text.

The relationship between AI, machine learning, and deep learning is shown in Fig. 1.3.

Fig. 1.3 The relationship between artificial intelligence, machine learning, and deep learning

Among the three concepts, machine learning is an approach to, and a subset of, AI, and deep learning is a special form of machine learning. If we take AI as the brain, then machine learning is the process of acquiring cognitive abilities, and deep learning is a highly efficient self-training system that dominates this process. Artificial intelligence is the target and the result, while deep learning and machine learning are methods and tools.

1.1.3 Types of AI

AI can be divided into two types: strong artificial intelligence and weak artificial intelligence.

Strong artificial intelligence is the view that it is possible to create intelligent machines that can truly accomplish reasoning and problem-solving tasks. Machines of this kind are believed to have consciousness and self-awareness, to be able to think independently, and to come up with the best solutions to problems. Strong AI would also have its own values and worldview, and would be endowed with instincts, such as the needs for survival and safety, just like living beings. In a certain sense, strong AI would be a new civilization.

Weak artificial intelligence is the view that it is not possible to make machines that can truly accomplish reasoning and problem-solving. Such machines may look smart, but they do not really have intelligence or self-awareness.

We are currently in the era of weak artificial intelligence. The introduction of weak artificial intelligence reduces the burden of intellectual work by functioning in a way similar to advanced bionics. AlphaGo and the robots that write news reports and novels all belong to weak artificial intelligence, and they outperform humans only in certain fields. In the era of weak artificial intelligence, data and computing power are undeniably both crucial, as they facilitate the commercialization of AI. In the coming age of strong artificial intelligence, these two factors will remain decisive. Meanwhile, the exploration of quantum computing by companies like Google and IBM is also laying a foundation for the advent of the strong artificial intelligence era.

1.1.4 The History of AI

Figure 1.4 presents a brief history of AI.

Fig. 1.4 A brief history of AI

The official origin of modern AI can be traced back to the Turing Test, proposed in 1950 by Alan M. Turing, known as “the Father of Artificial Intelligence”. According to his proposal, if a computer can engage in dialogue with humans without being detected as a computer, it is deemed to have intelligence. In the same year, Turing boldly predicted that it would be possible to create machines with real human intelligence in the future. Up to now, however, no computer has passed the Turing Test.

Although AI as a concept has a history of only a few decades, the theoretical foundations and supporting technologies behind it have been developing for far longer. The current prosperity of AI is the result of advances in all related disciplines and the collective efforts of generations of scientists.

  1. Precursors and Initiation Period (before 1956)

    The theoretical foundation for the birth of AI dates back as early as the fourth century BC, when the famous ancient Greek philosopher and scientist Aristotle invented the concept of formal logic. His theory of the syllogism remains an indispensable cornerstone of deductive reasoning today. In the seventeenth century, the German mathematician Gottfried Leibniz advanced a universal notation and some revolutionary ideas on reasoning and calculation, which laid the foundation for the establishment and development of mathematical logic. In the nineteenth century, the British mathematician George Boole developed Boolean algebra, which is the bedrock of the operation of modern computers and made the invention of the computer possible. During the same period, the British inventor Charles Babbage created the Difference Engine, the world’s first computing machine capable of evaluating quadratic polynomials. Although its functions were limited, this machine reduced the burden of calculation on the human brain for the first time. Machines have been endowed with computational intelligence ever since.

    In 1945, John Mauchly and J. Presper Eckert, working in a team at the Moore School, designed the Electronic Numerical Integrator and Computer (ENIAC), the world’s first general-purpose digital electronic computer. Although an epoch-making achievement, ENIAC had serious deficiencies, such as its enormous size, excessive power consumption, and reliance on manual operation to input and adjust commands. In 1947, John von Neumann, the father of modern computers, built on and improved the ENIAC design to create the modern electronic computer in the real sense: the Mathematical Analyzer Numerical Integrator and Automatic Computer (MANIAC).

    In 1943, the American neurophysiologist Warren McCulloch, working with Walter Pitts, established the first model of a neural network. This research on artificial intelligence at the microscopic level laid an important foundation for the development of neural networks. In 1949, Donald O. Hebb proposed Hebbian theory, a neuropsychological learning paradigm that states the basic principle of synaptic plasticity: the efficacy of synaptic transmission increases greatly with repeated and persistent stimulation of a postsynaptic neuron by a presynaptic neuron. This theory is fundamental to the modelling of neural networks. In 1948, Claude E. Shannon, the founder of information theory, introduced the concept of information entropy, borrowing the term from thermodynamics, and defined it as the average amount of information that remains after redundancy has been removed. The impact of this theory has been far-reaching, as it plays an important role in fields such as non-deterministic inference and machine learning.

  2. The First Booming Period (1956–1976)

    Finally, in 1956, John McCarthy officially introduced AI as a new discipline at the two-month-long Dartmouth Conference, which marked the birth of AI. A number of AI research groups were subsequently formed in the United States, such as the Carnegie-RAND group formed by Allen Newell and Herbert Alexander Simon, the research group at the Massachusetts Institute of Technology (MIT) formed by Marvin Lee Minsky and John McCarthy, and Arthur Samuel’s engineering research group at IBM.

    Over the following two decades, AI developed rapidly in a wide range of fields, and thanks to the great enthusiasm of researchers, AI technologies and applications kept expanding.

    (a) Machine Learning

      In 1956, Arthur Samuel of IBM wrote his famous checkers-playing program, which learned an implicit model by observing positions on the checkerboard and used it to guide later moves. After playing against the program for several rounds, Samuel concluded that it could reach a very high level of performance in the course of learning. With this program, Samuel refuted the notion that computers cannot go beyond their written code and learn patterns as human beings do. He went on to coin and define a new term: machine learning.

    (b) Pattern Recognition

      In 1957, C.K. Chow proposed adopting statistical decision theory to tackle pattern recognition, which stimulated the rapid development of pattern recognition research from the late 1950s onward. In the same year, Frank Rosenblatt proposed the perceptron, a simplified mathematical model that imitated the way the human brain recognizes patterns. It was the first machine whose recognition system could be trained on samples of given categories so that, after learning, it could correctly classify previously unseen patterns.

    (c) Pattern Matching

      In 1966, ELIZA, the first conversation program in the world, was written at the MIT Artificial Intelligence Laboratory. The program performed pattern matching against a set of rules and the user’s questions, and gave appropriate replies chosen from an archive of pre-written answers. It was also the first program to attempt the Turing Test. ELIZA once masqueraded as a psychotherapist talking to patients, and many of them failed to recognize it as a robot when it was first used. The idea that “conversation is pattern matching” opened the way for computer natural language conversation technology.

      In addition, during the first development period of AI, John McCarthy developed LISP, which became the dominant programming language for AI in the following decades. Marvin Minsky launched a more in-depth study of neural networks and discovered the disadvantages of simple neural networks. To overcome these limitations, scientists started to introduce multilayer neural networks and back propagation (BP) algorithms. Meanwhile, the expert system (ES) also emerged. During this period, the first industrial robot was deployed on a General Motors production line, and the world also witnessed the birth of the first mobile robot capable of acting autonomously.

      The advancement of related disciplines also contributed to the great strides made by AI. The emergence of bionics in the 1950s ignited the research enthusiasm of scientists, which led to the invention of the simulated annealing algorithm. It is a type of heuristic algorithm and is the foundation for search algorithms such as ant colony optimization (ACO), which has been quite popular in recent years.

  3. The First AI Winter (1976–1982)

    However, the AI mania did not last long. The over-optimistic projections were not fulfilled as promised, which raised doubt and suspicion about AI technology around the world.

    The perceptron, once a sensation in the academic world, ran into trouble in 1969, when Marvin Minsky and other scientists raised the famous exclusive OR (XOR) problem, demonstrating that the perceptron cannot handle linearly inseparable data such as the XOR function. For the academic world, the XOR problem became an almost insurmountable challenge.

    In 1973, AI came under strict questioning by the scientific community. Many scientists believed that the seemingly ambitious goals of AI were just unfulfilled illusions and that the relevant research had proved a complete failure. Amid growing suspicion and doubt, AI suffered severe criticism, and its actual value was also called into question. As a consequence, governments and research institutions all over the world withdrew or reduced funding for AI, and the industry encountered its first winter of development in the 1970s.

    The setback in the 1970s was no coincidence. Because of the limited computing power of the time, many problems that had been solved in theory could not be put into practice at all. There were many other obstacles as well; for example, it was difficult for expert systems to acquire knowledge, so many projects ended in failure. The study of machine vision took off in the 1960s, and the edge detection and contour composition methods proposed by the American scientist Lawrence G. Roberts have not only stood the test of time but are still widely used today. However, having a theoretical basis does not guarantee practical results. In the late 1970s, some scientists concluded that for a computer to imitate human retinal vision, it would need to execute at least one billion instructions per second. However, the world’s fastest supercomputer in 1976, the Cray-1 (which cost millions of US dollars to make), could perform no more than 100 million operations per second, and an ordinary computer could manage no more than one million. Hardware thus limited the development of AI. Another major basis for the progress of AI is data. At that time, computers and the Internet were not as widespread as they are today, so there was nowhere for developers to capture massive amounts of data.

    During this period, artificial intelligence developed slowly. Although the concept of BP had been proposed by Seppo Linnainmaa in the 1970s as the “reverse mode of automatic differentiation”, it was not until 1981 that Paul J. Werbos applied it to the multilayer perceptron. The invention of the multilayer perceptron and the BP algorithm contributed to the second leap forward of neural networks. In 1986, David E. Rumelhart and other scholars developed an effective BP algorithm to successfully train multilayer perceptrons, which had a profound impact.

  4. The Second Booming Period (1982–1987)

    In 1980, XCON, a complete expert system developed at Carnegie Mellon University (CMU), was officially put into use. The system contained more than 2500 rules and, in the following years, processed more than 80,000 orders with an accuracy of over 95%. This is considered a milestone that heralded a new era, in which expert systems began to showcase their potential in specific fields and lifted AI technology to a completely new level of booming development.

    An expert system normally addresses one specific professional field. By mimicking the thinking of human experts, it attempts to answer questions or provide knowledge that helps practitioners make decisions. By focusing on a narrow domain, the expert system avoids the challenges of artificial general intelligence (AGI) and can make full use of the knowledge and experience of existing experts to solve problems in that domain.

    The great commercial success of XCON encouraged 60% of Fortune 500 companies to embark on developing and deploying expert systems in their respective fields in the 1980s. According to statistics, more than 1 billion US dollars was invested in AI from 1980 to 1985, most of which went to the internal AI departments of those enterprises, and the market witnessed a surge in AI software and hardware companies.

    In 1986, the Bundeswehr University in Munich equipped a Mercedes-Benz van with a computer and several sensors, enabling automatic control of the steering wheel, accelerator, and brake. The vehicle, called VaMoRs, proved to be the world’s first self-driving car in the real sense.

    LISP was the mainstream programming language for AI development at that time. To improve the efficiency of LISP programs, many organizations turned to developing computer chips and storage devices designed specifically to execute them. Although LISP machines made some progress, personal computers (PCs) were also on the rise. IBM and Apple quickly expanded their presence across the entire computer marketplace. With steady increases in CPU frequency and speed, PCs became even more powerful than the costly LISP machines.

  5. The Second AI Winter (1987–1997)

    In 1987, with the collapse of the market for LISP machine hardware, the AI industry fell into another winter. The second AI trough lasted for years, as the hardware market collapsed and governments and institutions all over the world stopped investing in AI research. Even so, researchers made some important achievements during this period. In 1988, the American scientist Judea Pearl championed the probabilistic approach to AI inference, which made a crucial contribution to the future development of AI technology.

    In the almost 20 years after the advent of the second AI winter, AI technology gradually became deeply integrated with computer and software technologies, while research on AI algorithm theory progressed slowly. Many research results merely combined older theories with more powerful and faster computer hardware.

  6. Recovery Period (1997–2010)

    In 1995, Richard S. Wallace, inspired by ELIZA, developed a new chatbot program named A.L.I.C.E. (the Artificial Linguistic Internet Computer Entity). The robot was able to optimize its content and enrich its datasets automatically through the Internet.

    In 1996, the IBM supercomputer Deep Blue played a chess match against the world chess champion Garry Kasparov and was defeated. Kasparov believed that computers would never be able to defeat humans at chess. After the match, IBM upgraded Deep Blue. The new Deep Blue had 480 specialized CPUs and a doubled calculation speed of up to 200 million calculations per second, enabling it to look ahead 8 or more moves on the chessboard. In the later rematch, the computer defeated Kasparov. However, this landmark event only marked a victory of computer over human in a game with clear rules, achieved through calculation speed and enumeration. This is not real AI.

    In 2006, when Geoffrey Hinton published a paper in the journal Science, the AI industry entered the era of deep learning.

  7. Rapid Growth Period (2010–present)

    In 2011, the Watson system, also from IBM, took part in the quiz show Jeopardy!, competing against human players. Watson defeated two human champions with its outstanding natural language processing capabilities and powerful knowledge database. By this point, computers could already comprehend human language, which was a big advance in AI.

    In the twenty-first century, with the widespread use of PCs and the boom in mobile Internet and cloud computing technology, institutions have been able to capture and accumulate an unimaginably huge mass of data, providing sufficient material and impetus for the ongoing development of AI. Deep learning became a mainstream AI technology, exemplified by the famous Google Brain project, which raised the recognition rate on the ImageNet dataset by a large margin, to 84%.

    In 2011, the concept of the semantic network was proposed. The concept stems from the World Wide Web. It is essentially a large-scale distributed database that centers on Web data and connects those data in a form that machines can understand and process. The emergence of the semantic network greatly promoted progress in knowledge representation technology. A year later, Google first announced the concept of the knowledge graph and launched a knowledge-graph-based search service.

    In 2016 and 2017, Google staged two Go competitions between human players and a machine that shocked the world. Its AI program AlphaGo defeated two Go world champions, first Lee Sedol of South Korea and then Ke Jie of China.

    Today, AI can be found in almost all aspects of people’s lives. For instance, voice assistants, of which Apple’s Siri is the most typical, are based on natural language processing (NLP) technology. With the support of NLP, computers can process human language and match it, more and more naturally, with the commands and responses that humans expect. When users browse e-commerce websites, they may receive product recommendation feeds generated by a recommendation algorithm. The algorithm predicts the products that users might want to buy by analyzing historical data on their recent purchases and preferences.
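The perceptron's difficulty with XOR, described in the account of the first AI winter above, can be reproduced in a few lines. The following is a minimal sketch, assuming the classic perceptron learning rule with a step activation. The function names, learning rate, and epoch limit are illustrative choices and are not taken from the text.

```python
def predict(w, b, x):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

def train_perceptron(data, epochs=100, lr=0.1):
    """Perceptron rule: adjust weights by lr * (target - output) * input."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in data:
            delta = target - predict(w, b, x)
            if delta != 0:
                w[0] += lr * delta * x[0]
                w[1] += lr * delta * x[1]
                b += lr * delta
                mistakes += 1
        if mistakes == 0:  # every sample classified correctly: converged
            break
    return w, b

def accuracy(data, w, b):
    """Fraction of samples the trained perceptron classifies correctly."""
    return sum(predict(w, b, x) == t for x, t in data) / len(data)

# Truth tables: AND is linearly separable, XOR is not.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

if __name__ == "__main__":
    for name, data in (("AND", AND), ("XOR", XOR)):
        w, b = train_perceptron(data)
        print(name, "accuracy:", accuracy(data, w, b))
```

The single-layer perceptron learns AND perfectly because one straight line separates its two classes. No such line exists for XOR, so the accuracy stays below 1.0 however long training runs. Multilayer networks trained with back propagation were introduced to overcome this limitation.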

1.1.5 The Three Main Schools of AI

Currently, symbolism, connectionism, and behaviorism constitute the three main schools of AI. The following passages will introduce them in detail.

  1. Symbolism

    The basic theory of symbolism holds that the human cognitive process consists of the inference and processing of symbols. A human being is an example of a physical symbol system, and so is a computer. Therefore, computers should be able to simulate human intelligent activities. Knowledge representation, knowledge reasoning, and knowledge application are the three elements crucial to artificial intelligence. Symbolism argues that knowledge and concepts can be represented by symbols; thus cognition is a process of processing symbols, and reasoning is a process of solving problems with heuristic knowledge. The core of symbolism lies in reasoning, namely symbolic reasoning and machine reasoning.

  2. Connectionism

    The foundation of connectionism is the view that the basis of human thinking is the neuron, rather than a process of symbol processing. Connectionism holds that the human brain is different from a computer, and puts forward a connectionist model that imitates the working of the brain to replace the computer working model based on symbol manipulation. Connectionism is believed to stem from bionics, especially the study of human brain models. In connectionism, a concept is represented by a set of numbers, vectors, matrices, or tensors, that is, by a specific activation pattern of the entire network. No single node (neuron) in the network has a specific meaning, but every node participates in expressing the overall concept. For example, in symbolism, the concept of a cat can be represented by a “cat node” or by a group of nodes that stand for the attributes of a cat (e.g., “two eyes”, “four legs”, or “fluffy”). Connectionism, however, holds that no node has a specific meaning, so it is impossible to find a “cat node” or an “eye neuron”. The core of connectionism lies in neural networks and deep learning.

  3. Behaviorism

    The fundamental theory of behaviorism holds that intelligence depends on perception and behavior. Behaviorism introduces a “perception-action” model for intelligent activities and holds that intelligence has nothing to do with knowledge, representation, or reasoning. AI can evolve gradually, as human intelligence did, and intelligent activities can only be manifested through ongoing interaction with the surrounding environment in the real world. Behaviorism emphasizes application and practice, and constant learning from the environment in order to modify behavior. The core of behaviorism lies in behavior control, adaptation, and evolutionary computing.
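The contrast between the symbolic and connectionist representations of “cat” described above can be sketched in code. This is a toy illustration only: the attribute names, the hand-picked four-dimensional vectors, and the use of cosine similarity are assumptions made for demonstration and do not come from a trained network or from the text.

```python
import math

# Symbolic view: a concept is an explicit node with named attributes,
# and reasoning is rule application over those symbols.
SYMBOLIC_KB = {
    "cat": {"eyes": 2, "legs": 4, "fluffy": True},
    "bird": {"eyes": 2, "legs": 2, "fluffy": False},
}

def is_a(entity, concept):
    """Rule: an entity matches a concept if every listed attribute agrees."""
    return all(entity.get(k) == v for k, v in SYMBOLIC_KB[concept].items())

# Connectionist view: a concept is a pattern of activation over all units.
# No single component means "eyes" or "legs"; only the whole vector matters.
# These vectors are hypothetical values chosen by hand.
EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.8, 0.3],
    "kitten": [0.8, 0.2, 0.9, 0.2],
    "truck": [0.1, 0.9, 0.0, 0.7],
}

def cosine(u, v):
    """Cosine similarity between two activation patterns."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

if __name__ == "__main__":
    tom = {"name": "Tom", "eyes": 2, "legs": 4, "fluffy": True}
    print("symbolic: Tom is a cat?", is_a(tom, "cat"))
    print("connectionist: cat~kitten",
          round(cosine(EMBEDDINGS["cat"], EMBEDDINGS["kitten"]), 3))
    print("connectionist: cat~truck",
          round(cosine(EMBEDDINGS["cat"], EMBEDDINGS["truck"]), 3))
```

In the symbolic half, the answer is found by looking up a named node and checking its attributes. In the connectionist half, there is no node to look up, and “cat-ness” appears only as the closeness of whole activation patterns.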

1.2 AI-Related Technologies

AI technology is multi-layered, running through technical levels such as applications, algorithms, chips, devices, and processes, as shown in Fig. 1.5.

Fig. 1.5 Overview of AI-related technologies

AI technology has achieved the following developments at all technical levels.

  1. Application Level

    • Video and image: face recognition, target detection, image generation, image retouching, search by image, video analysis, video review, and augmented reality (AR).

    • Speech and voice: speech recognition, speech synthesis, voice wake-up, voiceprint recognition, and music generation.

    • Text: text analysis, machine translation, human-machine dialogue, reading comprehension, and recommender systems.

    • Control: autonomous driving, drones, robots, industrial automation.

  2. Algorithm Level

    Machine learning algorithms: neural network, support vector machine (SVM), K-nearest neighbor algorithm (KNN), Bayesian algorithm, decision tree, hidden Markov model (HMM), ensemble learning, etc.

    Common optimization algorithms for machine learning: gradient descent, Newton’s method, quasi-Newton method, conjugate gradient, spike-timing-dependent plasticity (STDP), etc.

    Deep learning is one of the most essential technologies for machine learning. The deep neural network (DNN) has been a research hotspot in this field in recent years; its types include the multilayer perceptron (MLP), convolutional neural network (CNN), recurrent neural network (RNN), spiking neural network (SNN), and others. Relatively popular CNNs include AlexNet, ResNet, and VGGNet, and popular RNNs include long short-term memory (LSTM) networks and the Neural Turing Machine (NTM). In addition, Google’s BERT (Bidirectional Encoder Representations from Transformers) is a natural language processing pre-training technology built on neural networks.

    In addition to deep learning, transfer learning, reinforcement learning, one-shot learning, and adversarial machine learning are also important technologies for realizing machine learning, and they offer solutions to some of the difficulties faced by deep learning.

  3. 3.

    Chip Level

    • Algorithm optimization chips: performance optimization, low power consumption optimization, high-speed optimization, and flexibility optimization, such as deep learning accelerators and face recognition chips.

    • Neuromorphic chips: bionic brain, biological brain-inspired intelligence, imitation of brain mechanism.

    • Programmable chips: designed with flexibility, programmability, algorithm compatibility, and general software compatibility in mind, such as digital signal processing (DSP) chips, graphics processing units (GPUs), and field programmable gate arrays (FPGAs).

    • Structure of system on chip: multi-core, many-core, single instruction, multiple data (SIMD), array structure of operation, memory architecture, on-chip network structure, multi-chip interconnection structure, memory interface, communication structure, multi-level cache.

    • Development toolchain: connection between deep learning frameworks (TensorFlow, Caffe, MindSpore), compiler, simulator, optimizer (quantization and clipping), atomic operation (network) library.

  4. 4.

    Device Level

    • High-bandwidth off-chip memory: high-bandwidth memory (HBM), dynamic random-access memory (DRAM), high-speed graphics double data rate memory (GDDR), low-power double data rate synchronous DRAM (LPDDR SDRAM), spin-transfer torque magnetic random-access memory (STT-MRAM).

    • High-speed interconnection devices: serializer/deserializer (SerDes), optical interconnection communication.

    • Bionic devices (artificial synapses, artificial neurons): memristor.

    • New type computing devices: analog computing, in-memory computing (IMC).

  5. 5.

    Process Level

    • On-chip memory (synaptic array): distributed static random-access memory (SRAM), resistive random-access memory (ReRAM), and phase change random-access memory (PCRAM).

    • CMOS process: technology node (16 nm, 7 nm).

    • CMOS multi-layer integration: 2.5D IC/SiP technology, 3D-Stack technology and Monolithic 3D.

    • New type processes: 3D NAND flash, tunneling FETs, FeFET, FinFET.
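Of the optimization algorithms listed at the algorithm level above, gradient descent is the simplest to write down. The following is a minimal sketch in plain Python, assuming a toy task that is not from the text: fitting a straight line to five points by minimizing the mean squared error. The function name `fit_line`, the learning rate and the step count are illustrative choices only.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ~ w*x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical, for illustration only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 4), round(b, 4))
```

With a learning rate that is too large the updates would diverge instead of converging, which is one reason the learning rate is usually treated as a hyperparameter to be tuned.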

1.2.1 Deep Learning Framework

The introduction of deep learning frameworks has made deep learning models easier to build. With a deep learning framework, we no longer need to hand-code complex neural networks and the backpropagation algorithm first; we only need to configure the model hyperparameters according to our needs, and the model parameters are learned automatically during training. We can also add a custom layer to an existing model, or choose the classifier and optimization algorithm we need on top of it.

We can think of a deep learning framework as a set of building blocks. Each block, or component, of the set is a model or algorithm, and we can assemble the components into an architecture that meets our needs.

The current mainstream deep learning frameworks include: TensorFlow, Caffe, PyTorch and so on.
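The building-block idea can be sketched in a few lines of plain Python. This is not the API of TensorFlow, Caffe or PyTorch; the class names `Dense`, `ReLU`, `Softmax` and `Sequential` are hypothetical stand-ins, the weights are hard-coded rather than learned, and only the forward pass is shown (a real framework would also learn the parameters through backpropagation).

```python
import math


class Dense:
    """Fully connected block: out[j] = sum_i(x[i] * weights[i][j]) + bias[j]."""

    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def __call__(self, x):
        return [
            sum(xi * row[j] for xi, row in zip(x, self.weights)) + self.bias[j]
            for j in range(len(self.bias))
        ]


class ReLU:
    """Activation block: replaces negative values with zero."""

    def __call__(self, x):
        return [max(0.0, v) for v in x]


class Softmax:
    """Classifier block: turns scores into probabilities that sum to 1."""

    def __call__(self, x):
        m = max(x)  # subtract the maximum for numerical stability
        exps = [math.exp(v - m) for v in x]
        total = sum(exps)
        return [e / total for e in exps]


class Sequential:
    """Assembles blocks into a model by feeding each output to the next block."""

    def __init__(self, *blocks):
        self.blocks = blocks

    def __call__(self, x):
        for block in self.blocks:
            x = block(x)
        return x


# Assemble an architecture from the blocks (weights chosen by hand for illustration).
model = Sequential(
    Dense([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ReLU(),
    Dense([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    Softmax(),
)
probs = model([2.0, 1.0])
print(probs)
```

Swapping `ReLU()` for another activation, or inserting a custom block that implements `__call__`, changes the architecture without touching the other components, which is the convenience the frameworks provide at much larger scale.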

1.2.2 An Overview of AI Processor

Among the four key elements of AI technology (data, algorithms, computing power, and scenarios), computing power is the one most reliant on the AI processor. Also known as an AI accelerator, an AI processor is a specialized functional module for handling the large-scale computing tasks in AI applications.

  1. 1.

    Types of AI Processors

    AI processors can be classified into different categories from different perspectives, and here we will take the perspectives of technical architecture and functions.

    In terms of the technical architecture, AI processors can be roughly classified into four types.

    1. (a)


      A central processing unit (CPU) is a large-scale integrated circuit that serves as the computing and control core of a computer. Its main function is to interpret program instructions and process the data in computer software.

    2. (b)


      Graphics processing unit (GPU), also known as display core (DC), visual processing unit (VPU) and display chip, is a specialized microprocessor dealing with image processing in personal computers, workstations, game consoles and some mobile devices (such as tablets and smartphones).

    3. (c)


      An application-specific integrated circuit (ASIC) is an integrated circuit customized for a particular use.

    4. (d)


      A field programmable gate array (FPGA) is a reconfigurable semi-custom chip, which means its hardware structure can be adjusted and reconfigured flexibly in real time as required.

    In terms of the functions, AI processors can be classified into two types: training processors and inference processors.

    1. (a)

      Training a complex deep neural network model usually entails feeding in a large amount of data, or using learning methods such as reinforcement learning. Training is a compute-intensive process. The large-scale training data and the complex deep neural network structure involved pose a huge challenge to the speed, accuracy, and scalability of the processor. Popular training processors include NVIDIA GPUs, Google’s tensor processing unit (TPU), and Huawei’s neural-network processing unit (NPU).

    2. (b)

      Inference here means using a trained model to draw conclusions from new data. For instance, a video surveillance system can determine whether a captured face belongs to a specific target by using a deep neural network model on the backend. Although inference entails much less computation than training, it still involves many matrix operations. GPUs, FPGAs and NPUs are commonly used as inference processors.

  2. 2.

    Current Status of AI Processor

    1. (a)


      In the early days, improvements in CPU performance mainly relied on progress in the underlying hardware technology in line with Moore’s Law. In recent years, as Moore’s Law seems to be gradually losing its effectiveness, the development of integrated circuits has been slowing down, and the hardware technology has run into physical bottlenecks. Limits on heat dissipation and power consumption have kept CPU performance and serial program efficiency under the traditional architecture from making much progress.

      This situation has prompted researchers to keep looking for CPU architectures and software frameworks better adapted to the post-Moore era. As a result, the multi-core processor came into being, which delivers higher CPU performance with more cores. Multi-core processors can better meet the demands that software places on hardware. For example, Intel Core i7 processors combine instruction-level parallelism with multiple independent cores on the x86 instruction set, which improves performance considerably but also leads to higher power consumption and cost. Since the number of cores cannot be increased indefinitely, and most traditional programs are written serially, this approach brings only limited improvements in CPU performance and program efficiency.

      In addition, AI performance can be improved by extending the instruction set. Examples include adding instruction sets such as AVX-512 to the x86 complex instruction set computer (CISC) architecture, adding fused multiply-add (FMA) instructions to the arithmetic logic unit (ALU) module, and adding new instructions to the ARM reduced instruction set computer (RISC) architecture.

      CPU performance can also be improved by raising the clock frequency, but only up to a limit, because a high frequency causes excessive power consumption and heat.

    2. (b)


      The GPU is very competitive in matrix computing and parallel computing and serves as the engine of heterogeneous computing. It was first introduced into the field of AI as an accelerator for deep learning and has since formed a mature ecosystem.

      With regard to the GPUs in the field of deep learning, NVIDIA made efforts mainly in the following three aspects:

      • Enrich the ecosystem: NVIDIA launched the NVIDIA CUDA Deep Neural Network library (cuDNN), a GPU-accelerated library customized for deep learning, which is optimized for the underlying GPU architecture and makes GPUs easier to apply in deep learning.

      • Improve customization: supporting multiple data types (no longer insisting on float32, and adopting int8, etc.).

      • Add modules specialized for deep learning (e.g., the NVIDIA V100 Tensor Core GPU adopts the improved Volta architecture, which introduces tensor cores).

      The main challenges of current GPUs are high cost, low energy efficiency, and high input and output latency.

    3. (c)


      Since 2016, Google has been committed to applying the concept of the application-specific integrated circuit (ASIC) to the study of neural networks. In 2016, it launched its custom-developed AI processor, the TPU, which supports the open-source deep learning framework TensorFlow. By combining large-scale systolic arrays with high-capacity on-chip memory, the TPU efficiently accelerates the convolutional operations that are most common in deep neural networks: systolic arrays can optimize matrix multiplication and convolutional operations, so as to increase computing power and reduce energy consumption.

    4. (d)


      The FPGA is programmed with a hardware description language (HDL), which makes it flexible, reconfigurable, and deeply customizable. By combining multiple FPGAs, a DNN model can be loaded onto the chips for low-latency operation, delivering higher computing performance than a GPU. However, because it has to allow for repeated erasing and reprogramming, an FPGA cannot reach optimal performance. As the FPGA is reconfigurable, its supply and R&D risks are relatively low. The hardware cost is determined by the amount of hardware purchased, so the cost is easy to control. However, because FPGA design is decoupled from the tape-out process, the development cycle is long, usually about half a year, and the entry barrier is high.
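To make the systolic-array idea mentioned for the TPU more concrete, the following is a simplified software simulation in plain Python. It assumes an output-stationary layout, in which each processing element (PE) at grid position (i, j) keeps one accumulator while operands flow past it with a one-cycle skew per row and column. This is an illustrative model only and is not a description of the TPU's actual dataflow; the function name `systolic_matmul` is hypothetical.

```python
def systolic_matmul(a, b):
    """Simulate an output-stationary systolic array computing a @ b cycle by cycle."""
    n, k_dim, m = len(a), len(b), len(b[0])
    if any(len(row) != k_dim for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    acc = [[0] * m for _ in range(n)]  # one accumulator per processing element
    # The last operand pair reaches PE (n-1, m-1) at cycle n + m + k_dim - 3,
    # so n + m + k_dim - 2 cycles are simulated in total.
    total_cycles = n + m + k_dim - 2
    for t in range(total_cycles):
        for i in range(n):
            for j in range(m):
                # Because of the skewed input, PE (i, j) sees a[i][k] and b[k][j]
                # at cycle t = i + j + k.
                k = t - i - j
                if 0 <= k < k_dim:
                    acc[i][j] += a[i][k] * b[k][j]  # one multiply-accumulate
    return acc, total_cycles


product, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product, cycles)
```

In hardware, all PEs perform their multiply-accumulate in the same cycle and pass operands directly to their neighbors, so each value is fetched from memory once and reused many times; the two inner loops here only stand in for that parallel grid.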

  3. 3.

    Comparison Between the Design of GPU and CPU

    The GPU is generally designed to handle large-scale data that are highly uniform in type and independent of each other, in a pure computing environment free of interruptions. The CPU is designed to be more general-purpose: it has to process different types of data and make logical decisions at the same time, which introduces a large number of branch-jump instructions and interrupt handling. The comparison between CPU and GPU architecture is shown in Fig. 1.6.

    The GPU has a massively parallel computing architecture composed of thousands of much smaller cores (designed for processing many tasks simultaneously). The CPU consists of several cores optimized for serial processing.

    1. (a)

      The GPU works with many ALUs and little cache memory. Unlike in the CPU, the cache of the GPU merely serves the threads and plays the role of data forwarding. When multiple threads need to access the same data, the cache coalesces these accesses, then accesses the DRAM, and forwards the data to each thread once it is obtained, which causes latency. However, because the large number of ALUs keeps many threads running in parallel, this latency is largely offset. In addition, the control units of GPUs can coalesce accesses.

    2. (b)

      The CPU has powerful ALUs, which can complete a computation within very few clock cycles. The CPU has large caches to reduce latency, and complicated control units that can perform branch prediction and data forwarding: when a program has multiple branches, the control units reduce latency through branch prediction; for instructions that depend on the results of previous instructions, the control units must determine the positions of these instructions in the pipeline and forward the result of the previous instruction as quickly as they can.

    GPUs are good at dealing with operations that are intensive and easy to be run in parallel, while CPUs excel at logic control and serial operations.

    GPU and CPU differ in architecture because they have different emphases. The GPU has an outstanding advantage in the parallel computing of large-scale intensive data, while the CPU puts more stress on logic control when executing instructions. Optimizing a program often requires coordinating the CPU and the GPU so that both can give full play to their capabilities.
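The contrast between the two kinds of workload can be illustrated with a small Python sketch. It only shows the dependence structure: the first function applies the same independent operation to every element, so the work can be handed to any number of workers in any order, while the second carries state from step to step and takes a data-dependent branch, so it must run serially. The function names and the arithmetic are hypothetical, and Python threads are used purely for illustration (they do not deliver GPU-style speedups).

```python
from concurrent.futures import ThreadPoolExecutor


def scale_and_shift(x):
    """Independent per-element work: no element depends on another."""
    return 2.0 * x + 1.0


def data_parallel_map(values, workers=4):
    """GPU-style workload: the same operation applied to every element."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, whatever the execution order.
        return list(pool.map(scale_and_shift, values))


def serial_recurrence(values):
    """CPU-style workload: each step needs the previous result and a branch decision."""
    state = 0.0
    history = []
    for x in values:
        # Data-dependent branch: the path taken depends on the running state.
        state = (0.5 * state + x) if state < 10.0 else x
        history.append(state)
    return history


parallel_result = data_parallel_map([0.0, 1.0, 2.0])
serial_result = serial_recurrence([4.0, 4.0, 20.0, 1.0])
print(parallel_result, serial_result)
```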

  4. 4.

    Huawei Ascend AI Processor

    An NPU is a processor whose design is optimized specifically for neural network computing; it processes neural network tasks with much higher performance than a CPU or GPU.

    The NPU mimics human neurons and synapses in its circuitry, and directly processes large-scale neurons and synapses through a deep learning processor instruction set. In an NPU, a single instruction can process a whole group of neurons. Currently, typical examples of NPUs include the Huawei Ascend AI processor, Cambricon chips and IBM’s TrueNorth chip.

There are two kinds of Huawei Ascend AI processors: Ascend 310 and Ascend 910.

Fig. 1.6
figure 6

CPU and GPU architecture

Ascend 910 is mainly applied to training scenarios and is mostly deployed in data centers, while Ascend 310 is mainly designed for inference scenarios, with deployment covering all scenarios across device, edge and cloud.

Ascend 910 is currently the AI processor with the strongest computing power and the fastest training speed in the world. Its computing power is twice that of the leading international AI processor, equivalent to 50 of today’s latest and most powerful CPUs.

The relevant parameters of Ascend 310 and Ascend 910 are shown in Table 1.1.

Table 1.1 The relevant parameters of Ascend 310 and Ascend 910

1.2.3 Ecosystem of AI Industry

Over the past half-century, the world has witnessed three waves of AI, each marked by a human-computer match. The first came in 1962, when the checkers-playing program developed by Arthur Samuel of IBM defeated the best checkers player in the United States. The second came in 1997, when IBM’s supercomputer Deep Blue defeated the human chess world champion Garry Kasparov by 3.5:2.5. The third came in 2016, when AlphaGo, the Go AI developed by Google’s subsidiary DeepMind, defeated Lee Sedol, the Go world champion and nine-dan player from South Korea.

In the future, AI will be embedded in every walk of life, from automobiles, finance, consumer goods to retail, healthcare, education, manufacturing, communications, energy, tourism, culture and entertainment, transportation, logistics, real estate and environmental protection, etc.

For example, in the automobile industry, intelligent driving technologies such as assisted driving, assisted decision-making, and fully automated driving are all realized with the help of AI. As a huge market, intelligent driving can in return support technical research in AI, thus forming a virtuous circle and becoming a solid foundation for AI development.

As for the financial industry, with the huge amount of data it has accumulated, AI can help with intelligent asset management, robo-advisory services, and more sensible financial decisions. AI can also play a part in combating financial fraud and be adopted in anti-fraud and anti-money laundering campaigns, as the related AI programs can infer the reliability of a transaction by analyzing data and materials of all kinds, predict where funds will flow, and identify the cycles of the financial market.

AI can also be widely used in the healthcare industry. For instance, it can assist doctors to diagnose and treat diseases by identifying problems reflected by the X-ray images, after being trained to interpret images at geometric level. AI can distinguish cancer cells from normal cells after training on classification tasks.

The related research data show that by 2025, the size of AI market will exceed 3 trillion US dollars, as shown in Fig. 1.7.

Fig. 1.7
figure 7

AI market size forecast

As Fig. 1.7 suggests, the AI market has huge potential. AI is known to have three pillars, namely data, algorithms and computing power. But to apply AI in real life, these three pillars are far from enough, because we also need to take scenarios into consideration. Data, algorithms, and computing power can drive the technical evolution of AI, but without application scenarios, the technological development is merely about data. We need to integrate AI with cloud computing, big data and the Internet of Things (IoT) to make the application of AI in real life possible; this integration is the foundation of the platform architecture for AI application, as shown in Fig. 1.8.

Fig. 1.8
figure 8

Platform architecture for AI application

The infrastructure includes smart sensors and smart chips, which reinforce the computing power of the AI industry and underpin its development. AI technological services are mainly about building AI technology platforms and providing solutions and services to external users. The vendors of these AI technologies are critical in the AI industry chain, as they provide key technology platforms, solutions and services to all kinds of AI applications on the strength of their infrastructure and the massive data they acquire. As China accelerates its drive to become competitive in manufacturing, the Internet, and the digital industry, demand for AI in manufacturing, home appliances, finance, education, transportation, security, medical care, logistics and other fields will be further unleashed, and AI products will take increasingly diversified forms and categories. Only when the infrastructure, the four major elements of AI, and AI technical services converge can the architecture fully support the upper-layer applications of the AI industrial ecosystem.

Although AI technology can be applied in a wide range of fields, its development and application also face challenges, chiefly the imbalance between limited AI development and huge market demand. Currently, the development and application of AI need to deal with the following four problems.

  1. 1.

    High occupational standards: To get engaged in AI industry, it is a prerequisite for a person to have considerable knowledge in machine learning, deep learning, statistics, linear algebra and calculus.

  2. 2.

    Low efficiency: training a model takes a long working cycle, which consists of data collection, data cleaning, model training and tuning, and optimization of the visualization experience.

  3. 3.

    Fragmented capabilities and experiences: applying the same AI model to another scenario requires repeating data collection, data cleaning, model training and tuning, and experience optimization, and the capabilities of the AI model cannot be directly passed on to the next scenario.

  4. 4.

    Difficult capacity upgrading and improvement: the model upgrading and effective data capturing are difficult tasks.

Currently, the smartphone-centered on-device AI has become a consensus of the industry. More and more smartphones will boast AI capabilities. As several consulting agencies in the UK and the USA estimated, about 80% of the world’s smartphones will have AI capabilities by 2022 or 2023. To meet the market outlook and tackle the challenges of AI, HUAWEI launched its open AI capability platform for smart devices, namely, HUAWEI HiAI. With a mission of “providing developers with convenience while connecting unlimited possibilities through AI”, HUAWEI HiAI enables developers to provide users with a better experience of smart application by swiftly making use of Huawei’s powerful AI processing capabilities.

1.2.4 Huawei CLOUD Enterprise Intelligence Application Platform

  1. 1.

    An Overview of Huawei CLOUD Enterprise Intelligence Application Platform

    The Huawei CLOUD Enterprise Intelligence (EI) application platform is an enabler of enterprise intelligence. It aims to provide open, trustworthy and smart platforms based on AI and big data technologies, in the form of cloud services (including public cloud, customized cloud, etc.). Combined with industrial scenarios, the enterprise application systems built with Huawei CLOUD can see, hear and express themselves: they can analyze and interpret images, videos, languages and texts, and they make AI and big data services easier to access. Huawei CLOUD can help enterprises speed up business development and benefit society.

  2. 2.

    Features of Huawei CLOUD EI

    Huawei CLOUD EI has four remarkable features.

    1. (a)

      Industrial wisdom: Huawei CLOUD has a deep understanding of the industry, its know-how, and its major pain points. It looks to AI technologies for solutions to these problems and guides the implementation of AI.

    2. (b)

      Industrial data: It enables the companies to utilize their own data to create massive value through data processing and data mining.

    3. (c)

      Algorithms: It provides enterprises with extensive algorithm libraries and model libraries, and solutions to corporate problems through general AI services and one-stop development platform.

    4. (d)

      Computing power: Based on Huawei’s 30 years of experience in ICT, the full-stack AI development platform can provide enterprises with the strongest and most economical AI computing power for fusion and changes.

  3. 3.

    The History of Huawei CLOUD EI

    The evolution of Huawei CLOUD EI is shown in Fig. 1.9 and summarized as follows.

    1. (a)

      In 2002, Huawei began to develop products of data governance and analysis targeting the traditional Business Intelligence (BI) operations in the field of telecommunication.

    2. (b)

      In 2007, Huawei initiated the Hadoop technology research project, mapping out big data-related strategies, and building a pool of relevant professionals and technology patents.

    3. (c)

      In 2011, Huawei tried to apply the big data technology in telecom big data solutions to deal with the network diagnosis and analysis, network planning, and network tuning.

    4. (d)

      In 2013, some large companies such as China Merchants Bank and Industrial and Commercial Bank exchanged views with Huawei regarding their big data-related demands and kicked off technical cooperation. In September of the same year, Huawei launched its enterprise-grade big data analysis platform FusionInsight at the Huawei Cloud Congress (HCC), which has been adopted by a wide range of industries.

    5. (e)

      In 2012, Huawei officially stepped into the AI industry, and it has been turning its research outcomes into products since 2014. By the end of 2015, the products developed for finance, supply chain, engineering acceptance, and e-commerce had begun to be put into use internally, with the following achievements.

      • Optical character recognition (OCR) for customs declaration documents recognition: The import efficiency was enhanced by 10 times.

      • Delivery route planning: Additional fees were reduced by 30%.

      • Intelligent auditing: The efficiency was increased by 6 times.

      • Intelligent recommendation for e-commerce users: Application conversion rate was increased by 71%.

    6. (f)

      In 2017, Huawei officially engaged in EI services in the form of cloud service and cooperated with more partners to provide more diversified AI services to the external users.

    7. (g)

      In 2019, Huawei CLOUD EI started to emphasize the inclusive AI, in the hope of making AI affordable and accessible, and safe to use. Based on the self-developed chip Ascend, it provided 59 cloud services (21 platform services, 22 vision services, 12 language services and 4 decision-making services) and developed 159 functions (52 platform functions, 99 application programming interface [API] functions and 8 pre-integrated solutions).

      Thousands of developers of Huawei were engaged in the technology R&D projects mentioned above (including the research and development of product technology, and the cutting-edge technologies such as analysis algorithms, machine learning algorithms, and natural language processing), while Huawei also actively shared the outcomes with the Huawei AI research community in return.

Fig. 1.9
figure 9

The history of Huawei CLOUD EI

1.3 The Technologies and Applications of AI

1.3.1 The Technologies of AI

As shown in Fig. 1.10, AI technologies mainly fall into three types of application technologies: computer vision, speech processing and natural language processing.

Fig. 1.10
figure 10

The technologies of AI

  1. 1.

    Computer Vision

    Computer vision is a science that explores how to make computers “see” things, and it is the most established of the three types of AI application technologies. The subjects that computer vision mainly deals with include image classification, object detection, image segmentation, visual tracking, text recognition and facial recognition. Currently, computer vision is generally used in electronic attendance tracking, identity verification, image recognition and image search, as shown in Figs. 1.11, 1.12, 1.13 and 1.14. In the future, computer vision will advance to a level at which it can interpret and analyze images and make decisions autonomously, thus truly endowing machines with the capability to “see” and playing a greater role in scenarios such as unmanned vehicles and smart homes.

  2. 2.

    Speech Processing

    Speech processing is the study of the statistical characteristics of speech signals and voice production. Processing technologies such as speech recognition, speech synthesis and voice wake-up are collectively referred to as “speech processing”. The main sub-domains of speech processing research include speech recognition, speech synthesis, voice wake-up, voiceprint recognition and sound event detection. The most mature sub-domain is speech recognition, which can achieve an accuracy rate of 96% in a quiet indoor environment with near-field recognition. At present, speech recognition technology is mainly used in intelligent question answering and intelligent navigation, as shown in Figs. 1.15 and 1.16.

  3. 3.

    Natural Language Processing (NLP)

    Natural language processing is a technology that aims to interpret and utilize natural language through computer technologies. The subjects of NLP include machine translation, text mining and sentiment analysis. Faced with a number of technical challenges, NLP is not yet a very mature technology. Because semantics are highly complex, AI cannot rival humans in understanding semantics through deep learning based on big data and parallel computing alone. In the future, AI is expected to progress from its current state, in which it can understand only shallow semantics, to a stage where it can automatically extract features and understand deep semantics, and to upgrade from single intelligence (machine learning) to hybrid intelligence (machine learning, deep learning and reinforcement learning). NLP technology is now widely applied in fields such as public opinion analysis, comment analysis and machine translation, as shown in Figs. 1.17, 1.18 and 1.19.

Fig. 1.11
figure 11

Electronic attendance

Fig. 1.12
figure 12

Identity verification

Fig. 1.13
figure 13

Image recognition

Fig. 1.14
figure 14

Image search

Fig. 1.15
figure 15

Intelligent Q&A

Fig. 1.16
figure 16

Intelligent navigation

Fig. 1.17
figure 17

Public opinion analysis

Fig. 1.18
figure 18

Comment analysis

Fig. 1.19
figure 19

Machine translation

1.3.2 The Applications of AI

The applications of AI are as follows.

  1. 1.

    Smart City

    A smart city uses information and communication technologies (ICT) to sense, analyze, and integrate the key information of the core urban operation system, so as to respond intelligently to the city’s demands in people’s livelihood, environmental protection, public safety, urban services, and industrial and commercial activities. The essence of a smart city is to manage and operate the city intelligently through advanced information technology, thereby improving the living standards of citizens and promoting harmonious and sustainable development of the city. In a smart city, AI is mainly embodied in smart environment, smart economy, smart life, smart information, smart supply chain and smart government. More specifically, AI technologies are adopted in traffic monitoring, logistics, and facial recognition for security. Figure 1.20 shows the structure of a smart city.

  2. 2.

    Smart Healthcare

    We can enable AI to “learn” professional medical knowledge, to “memorize” loads of health records, and to analyze medical images with computer vision, so as to provide doctors with reliable and efficient assistance, as shown in Fig. 1.21. For example, for the medical imaging widely used today, AI can build models based on historical data to analyze the medical images and quickly detect the lesions, thus improving the efficiency of consultation.

  3. 3.

    Smart Retail

    AI will also revolutionize the retail industry. A typical case is the unmanned supermarket. Amazon’s unmanned supermarket Amazon Go adopts sensors, cameras, computer vision, and deep learning algorithms and cancels the traditional check-out, so that customers can just walk in the store, grab the products they need and go.

    One of the major challenges faced by unmanned supermarkets is how to charge customers correctly. Up to now, Amazon Go is the only successful case, and even it depends on many preconditions. For instance, Amazon Go is only open to Amazon’s Prime members. Other companies will need to build their own membership system first if they want to follow Amazon’s model.

  4. 4.

    Smart Security

    AI is relatively easy to implement in the field of security, and its development in this field is relatively mature. The abundant security-related image and video data have provided a good foundation for training AI algorithms and models. In the security domain, applications of AI technology can be classified into civilian use and police use.

    For civilian use: facial recognition, early warning of potential dangers, home defense, and so on. For police-use: identification of suspicious targets, vehicle analysis, tracking suspects, searching and comparing criminal suspects, entrance guard of the key supervised areas, etc.

  5. 5.

    Smart Home

    A smart home is an IoT technology-based home ecosystem of hardware, software and cloud platforms, which provides users with customized life services and a more convenient, comfortable and safe living environment at home.

    Smart home appliances are designed to be controlled by voice processing technology, for example to adjust the temperature of the air conditioner, open the curtains and control the lighting system.

    Home security relies on computer vision technology, such as unlocking through facial or fingerprint recognition, real-time smart camera monitoring, and detection of illegal intrusion into the residence.

    With the help of machine learning and deep learning, the smart home can build user portraits and make recommendations based on the historical records stored in smart speakers and smart TVs.

  6. 6.

    Smart driving

    The Society of Automotive Engineers (SAE) defines six levels of autonomous driving, from L0 to L5, based on how much the vehicle depends on the driving system. L0 vehicles rely completely on the driver’s operation, vehicles at L3 and above allow hands-off driving under certain circumstances, and L5 vehicles are operated entirely by the driving system, without a driver, in all scenarios.

    Currently only a handful of commercial passenger vehicle models, such as the Audi A8 and models from Tesla and Cadillac, are equipped with an L2 or L3 Advanced Driving Assistance System (ADAS). With the further enhancement of sensors and on-board processors, the year 2020 witnessed the emergence of more L3 models. Vehicles with L4 and L5 autonomous driving systems are expected to be used first on commercial vehicle platforms in enclosed industrial parks. High-level autonomous driving on passenger vehicle platforms, however, will require further optimization of technology, relevant policies and infrastructure construction. It is estimated that such passenger vehicles will not be put into use on public roads until 2025.
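The level scheme described above can be pictured as a small lookup structure. The following is only an illustrative sketch: the names `DrivingLevel`, `hands_off_allowed` and `driver_required` are hypothetical, and the two rules simply restate this section's summary (L3 and above allow hands-off driving under certain circumstances, and only L5 needs no driver in all scenarios) rather than the full SAE definitions.

```python
from enum import IntEnum


class DrivingLevel(IntEnum):
    """Autonomous driving levels L0 to L5 (illustrative names)."""
    L0 = 0  # the driver performs all operations
    L1 = 1
    L2 = 2
    L3 = 3  # hands-off driving under certain circumstances
    L4 = 4
    L5 = 5  # the driving system operates the vehicle in all scenarios


def hands_off_allowed(level: DrivingLevel) -> bool:
    """Restates the text: L3 and above allow hands-off driving in some conditions."""
    return level >= DrivingLevel.L3


def driver_required(level: DrivingLevel) -> bool:
    """Restates the text: only L5 works without a driver in all scenarios."""
    return level < DrivingLevel.L5
```

Using an ordered enumeration keeps comparisons such as "L3 and above" explicit in the code.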

Fig. 1.20
figure 20

Smart city

Fig. 1.21
figure 21

Smart healthcare

1.3.3 The Current Status of AI

As shown in Fig. 1.22, AI development can be divided into three stages, and AI is currently still in the stage of perceptual intelligence.

Fig. 1.22
figure 22

The three stages of artificial intelligence

1.4 Huawei’s AI Development Strategy

1.4.1 Full-Stack All-Scenario AI Solutions

In the first quarter of 2020, Huawei released its all-scenario AI computing framework MindSpore to the open-source community. It then released the GaussDB OLTP stand-alone database in June 2020 and its server operating system on 31 December 2020, both also to the open-source community.

Full-stack refers to a solution that covers the chip, chip enablement, the training and inference framework, and application enablement.

All-scenario refers to a deployment environment that covers public cloud, private cloud, all kinds of edge computing, IoT terminals and consumer terminals.

As the bedrock of Huawei’s full-stack all-scenario AI solution, the Atlas artificial intelligence computing solution, based on the Ascend AI processor, provides products in different forms, including modules, circuit boards and servers, to meet customers’ all-scenario demands for computing power.

1.4.2 Directions of Huawei Full-Stack AI

  1. 1.

    Huawei’s one-stop AI development platform—ModelArts

    ModelArts is a one-stop development platform that Huawei designed for AI developers. It supports large-scale data preprocessing, semi-automatic labeling, distributed training, automated model building and on-demand model deployment on device, edge and cloud, helping developers quickly build and deploy models and manage the full AI development lifecycle. ModelArts has the following features.

    1. (a)

      Automatic learning: With the automatic learning function, ModelArts can automatically design models, adjust parameters, and train, compress and deploy models based on labeled data, so developers need no experience in coding or model development.

      The automatic learning of ModelArts is mainly realized through ModelArts Pro, a professional development kit designed for enterprise-grade AI. Based on the advanced algorithms and rapid training capability of HUAWEI CLOUD, ModelArts Pro provides preinstalled workflows and models that make AI application development more efficient and less difficult for enterprises. It lets users rebuild workflows independently and supports real-time development, sharing and launching of applications, which helps build an open ecosystem through joint efforts and bring AI to industries in ways that benefit the general public. The ModelArts Pro toolkit includes kits for natural language processing, text recognition, computer vision and more, enabling it to respond quickly to the AI implementation demands of different industries and scenarios.

    2. (b)

      Device-edge-cloud: Device, edge and cloud refer to end devices, Huawei intelligent edge devices and HUAWEI CLOUD respectively.

    3. (c)

      Online inference support: Online inference is an online service (web service) that generates a real-time prediction for each request.

    4. (d)

      Batch inference support: Batch inference generates a batch of predictions for a batch of data.

    5. (e)

      Ascend AI processor: The Ascend AI processor is an AI chip designed by Huawei that features high computing power and low power consumption.

    6. (f)

      High efficiency of data preparation: ModelArts has a built-in AI data framework, which can enhance the efficiency of data preparation through the convergence of automatic pre-labeling and hard example dataset labeling.

    7. (g)

      Reduced training time: ModelArts comes with Huawei’s self-developed high-performance distributed framework MoXing, which uses core technologies including cascaded hybrid parallelism, gradient compression and convolution acceleration to speed up model training substantially.

    8. (h)

      One-click model deployment: ModelArts can deploy models to device, edge and cloud scenarios with a single click, meeting requirements such as high concurrency and lightweight devices at the same time.

    9. (i)

      Full-process management: ModelArts provides visual workflow management of data, training, models and inference (covering the entire AI development cycle), and enables training auto-restart after power outage, training result comparison, and traceable management of models.

    10. (j)

      Active AI market: ModelArts supports data and model sharing, which can help companies improve the efficiency of internal AI development activities and can also let developers transform their knowledge into value.
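The difference between online inference (c) and batch inference (d) can be sketched in a few lines. This is not the ModelArts API: `toy_model`, `online_inference` and `batch_inference` are hypothetical names, and the "model" is a placeholder that averages its inputs. The sketch only contrasts the two calling patterns.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for a trained model: one feature vector in, one score out.
Model = Callable[[Sequence[float]], float]


def toy_model(features: Sequence[float]) -> float:
    """Placeholder model: the mean of the input features."""
    return sum(features) / len(features)


def online_inference(model: Model, request: Sequence[float]) -> float:
    """Online inference: each request gets its own prediction right away."""
    return model(request)


def batch_inference(model: Model, dataset: Sequence[Sequence[float]]) -> List[float]:
    """Batch inference: a whole dataset is scored in one job."""
    return [model(row) for row in dataset]


single = online_inference(toy_model, [1.0, 2.0, 3.0])
many = batch_inference(toy_model, [[1.0, 2.0, 3.0], [4.0, 6.0]])
```

In practice the online path would sit behind a web service and be tuned for latency, while the batch path would typically be tuned for throughput.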

  2. 2.

    MindSpore, All-Scenario AI Computing Framework

    Although the application of AI services to device, edge and cloud scenarios is thriving in this intelligent age, AI technology still faces huge challenges, including high technological standards, soaring development costs and long deployment cycles. These challenges are a brake on the development of the AI ecosystem for developers across all industries. In response, the all-scenario AI computing framework MindSpore was introduced. It was designed around three principles: friendly development, efficient execution and flexible deployment.

    In today’s world of deep learning frameworks, if Google’s TensorFlow, Amazon’s MXNet, Facebook’s PyTorch and Microsoft’s CNTK are regarded as the “four giants”, then Huawei’s MindSpore is the strongest competitor.

    Thanks to the automatic parallelization provided by MindSpore, senior data scientists and algorithm engineers dedicated to data modeling and problem solving can run an algorithm across dozens or even thousands of AI processing nodes with just a few lines of code.

    MindSpore supports architectures of different sizes and types, can be deployed independently in all scenarios, and runs on the Ascend AI processor as well as other processors such as GPUs and CPUs.

  3. 3.


    CANN

    Compute Architecture for Neural Networks (CANN) is a chip enablement layer that Huawei built for deep neural networks and Ascend AI processors. It consists of the following four major function modules.

    1. (a)

      Fusion Engine: The operator-level fusion engine performs operator fusion, which reduces memory movement between operators and improves performance by 50%.

    2. (b)

      CCE operator library: A deeply optimized library of common operators from Huawei that meets most of the needs of mainstream computer vision and NLP neural networks.

      Inevitably, some customers and partners will need custom operators for reasons of timeliness, privacy or research. This calls for the third function module of CANN.

    3. (c)

      Tensor Boost Engine (TBE): An efficient, high-performance tool for developing custom operators. It abstracts hardware resources into application programming interfaces (APIs), so that customers can quickly build the operators they need.

    4. (d)

      Compiler: The last module is the underlying compiler, which provides extreme performance optimization to support the Ascend AI processor in all scenarios.
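The idea behind operator fusion, as performed by the Fusion Engine in (a), can be illustrated with a toy example. The sketch below does not show how CANN implements fusion. It only contrasts three separate operators, each writing a full intermediate buffer, with a single fused operator that reads and writes each element once. The function names and the multiply-add-ReLU chain are assumptions chosen for illustration.

```python
from typing import List


def unfused(x: List[float], scale: float, bias: float) -> List[float]:
    """Three separate operators; each one writes a full intermediate buffer."""
    scaled = [v * scale for v in x]         # operator 1: multiply
    shifted = [v + bias for v in scaled]    # operator 2: add
    return [max(v, 0.0) for v in shifted]   # operator 3: ReLU


def fused(x: List[float], scale: float, bias: float) -> List[float]:
    """One fused operator: each element is read once and written once."""
    return [max(v * scale + bias, 0.0) for v in x]


data = [-2.0, 0.5, 3.0]
result_unfused = unfused(data, 2.0, 1.0)
result_fused = fused(data, 2.0, 1.0)
```

Both versions return the same values. The fused version avoids the two intermediate buffers, which is the kind of memory movement that fusion is meant to reduce.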

  4. 4.

    Ascend AI processor

    Although demand for AI is rising, the AI processor market is currently monopolized by a few companies, which leads to high prices, long supply cycles and weak local service support. As a result, the demand for AI in many industries has not been met effectively.

    At the HUAWEI CONNECT conference in October 2018, Huawei released the Ascend 310 and Ascend 910 processors, designed for AI inference and training scenarios respectively. The unique Da Vinci 3D Cube architecture of Ascend AI processors makes the series quite competitive in computing power, energy efficiency and scalability.

    Ascend 310 is a highly efficient AI system-on-chip (SoC) designed for inference in edge intelligence scenarios. It is built on a 12 nm process and delivers up to 16 TOPS (tera operations per second) of computing power while consuming only 8 W, which makes it well suited to edge intelligence scenarios that require low power consumption.

    Ascend 910 is currently the single chip with the greatest computing density, suitable for AI training. It is built on a 7 nm process and provides up to 512 TOPS of computing power with a maximum power consumption of 350 W.
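The figures quoted above can be turned into a rough energy-efficiency number, TOPS per watt. The calculation below is a back-of-the-envelope sketch that uses only the peak values stated in this section. It assumes that the two TOPS figures are comparable, which may not hold if they refer to different numeric precisions, and the helper name `tops_per_watt` is hypothetical.

```python
def tops_per_watt(tops: float, watts: float) -> float:
    """Energy efficiency: tera operations per second delivered per watt."""
    return tops / watts


ascend_310 = tops_per_watt(16, 8)      # 2.0 TOPS/W
ascend_910 = tops_per_watt(512, 350)   # about 1.46 TOPS/W at maximum power
```

On these numbers the inference chip delivers more operations per watt, while the training chip delivers 32 times the raw computing power.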

  5. 5.

    Atlas artificial intelligence computing solutions

    The Huawei Atlas artificial intelligence computing solution is based on Huawei Ascend AI processors. Through a wide range of products, including modules, circuit boards, edge stations, servers and clusters, it builds an all-scenario AI infrastructure solution for device, edge and cloud scenarios, as shown in Fig. 1.23. As a crucial part of Huawei’s full-stack all-scenario AI solution, Atlas launched its inference products in 2019 and completed the lineup with training products in 2020, bringing the industry a complete AI computing solution. Meanwhile, Huawei created a device-edge-cloud collaboration platform through all-scenario deployment, letting AI empower every link in the industry.

Fig. 1.23
figure 23

A panorama of the Atlas artificial intelligence computing platform

1.5 The Controversy of AI

1.5.1 Algorithmic Bias

Algorithmic bias is mainly caused by biased data.

When we make decisions with the help of AI, the algorithms may learn from the collected data to discriminate against a certain group of individuals. For instance, they could make discriminatory decisions based on race, gender or other factors. Even if factors such as race or gender are excluded from the data, the algorithms could still make discriminatory decisions based on personal information such as a person’s name or address.

Here are some examples. If you search for a name that sounds African-American, you may be shown an advertisement for a criminal record lookup tool, which is unlikely to happen with other kinds of names. Online advertisers tend to show advertisements for lower-priced products to female viewers. Google’s image app once mistakenly tagged a photo of black people as “gorillas.”
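The point that removing a sensitive attribute does not remove bias can be made concrete with a toy calculation. Everything in the sketch below is made up for illustration: the records, the groups "A" and "B", and the `approve` rule. The rule never sees the group, only the neighborhood, yet the approval rates differ by group because the neighborhood acts as a proxy for it.

```python
from collections import defaultdict

# Synthetic records: (group, neighborhood). The decision rule never sees the
# group, but the group is correlated with the neighborhood.
applicants = [
    ("A", "north"), ("A", "north"), ("A", "north"), ("A", "south"),
    ("B", "south"), ("B", "south"), ("B", "south"), ("B", "north"),
]


def approve(neighborhood: str) -> bool:
    """Hypothetical rule learned from biased history: it favors 'north'."""
    return neighborhood == "north"


def approval_rate_by_group(records):
    """Share of approved applicants within each group."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for group, neighborhood in records:
        total[group] += 1
        approved[group] += approve(neighborhood)
    return {g: approved[g] / total[g] for g in total}


rates = approval_rate_by_group(applicants)  # {"A": 0.75, "B": 0.25}
```

Comparing outcome rates across groups in this way is one simple check for proxy discrimination, even when the sensitive attribute is not a model input.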

1.5.2 Privacy Issues

Existing AI algorithms are all data-driven, as training models requires massive amounts of data. While enjoying the convenience brought by AI, people are also exposed to the risk of privacy leakage. For instance, if the huge amounts of user data collected by technology companies were leaked, our daily lives could be fully exposed.

When people are online, technology companies can technically record every click, every page scroll, the time spent viewing any content, and the users’ browsing history.

From users’ daily records of rides and purchases, these companies can also learn where the users are, where they have been and what they have done, as well as their educational background, purchasing power, preferences and other private information.

1.5.3 The Contradiction Between Technology and Ethics

With the development of computer vision, it is becoming more and more difficult to judge the credibility of images. People can produce fake or manipulated images with image editors (e.g., Photoshop), generative adversarial networks (GAN) and other techniques, making it really difficult to tell whether an image is fake or real.

Let’s take GAN as an example. The concept was introduced by the machine learning researcher Ian Goodfellow in 2014. In the name, “G” stands for “generative”, indicating that the model generates image-like information rather than predicted values related to the input data. “AN” stands for “adversarial network”, because the model uses two neural networks that contest with each other in a cat-and-mouse game, like a cashier facing a banknote counterfeiter: the counterfeiter tries to make the cashier believe that the money is real, and the cashier tries to verify its authenticity.
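The counterfeiter-and-cashier contest can be written down as two opposing loss functions. The sketch below is a simplified illustration and not a full GAN: it assumes that the discriminator (the cashier) outputs a probability that a sample is real, it uses the commonly used non-saturating form of the generator loss, and the function names are hypothetical.

```python
import math
from typing import Sequence


def discriminator_loss(d_real: Sequence[float], d_fake: Sequence[float]) -> float:
    """Cross-entropy: real samples should score near 1, generated ones near 0."""
    real_term = sum(-math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(-math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake: Sequence[float]) -> float:
    """Non-saturating loss: low when the discriminator scores fakes as real."""
    return sum(-math.log(p) for p in d_fake) / len(d_fake)


# A sharp cashier: real money scores 0.9, counterfeit money scores 0.1.
sharp = discriminator_loss([0.9], [0.1])
# A fooled cashier: it cannot tell the two apart and scores everything 0.5.
fooled = discriminator_loss([0.5], [0.5])
```

Training alternates between lowering the discriminator loss and lowering the generator loss, so each network improves against the other.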

1.5.4 Will Everyone Be Unemployed?

Throughout human development, people have always sought ways to enhance efficiency, namely to obtain more with fewer resources. We used sharp stones to hunt and gather food more efficiently, and invented the steam engine to reduce the reliance on horses. In the era of AI, jobs that are highly repetitive, require little creativity and involve little social interaction will be replaced by AI, while highly creative jobs will not be easily replaced.

1.6 The Development Trends for AI

  1. 1.

    Easier-to-Use Development Frameworks

    All AI development frameworks are evolving to be simpler to operate and more comprehensive in function. The threshold for AI development keeps getting lower.

  2. 2.

    Algorithms and Models with Better Performance

    In computer vision, GANs can generate high-quality images that the human eye cannot distinguish from real ones, and GAN-related algorithms have begun to be applied to other vision-related tasks, such as semantic segmentation, facial recognition, video synthesis and unsupervised clustering. In natural language processing, major breakthroughs have been made in Transformer-based pre-trained models, and models such as BERT, GPT and XLNet have begun to be widely applied in industrial scenarios. In reinforcement learning, DeepMind’s AlphaStar defeated top human players at the game StarCraft II.

  3. 3.

    Smaller Deep Models

    Models with better performance usually have more parameters, and larger models face problems of operational efficiency in industrial deployment. Therefore, an increasing number of model compression techniques have been proposed to reduce the size and parameter count of models and to accelerate inference, meeting the requirements of industrial applications while preserving performance.
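One widely used compression technique is quantization, which stores weights as low-precision integers instead of floating-point numbers. The text does not name a specific method, so the following is just one possible illustration: a minimal symmetric 8-bit scheme with a single shared scale, where the function names and sample weights are assumptions.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map float weights to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the stored integers."""
    return [v * scale for v in q]


weights = [0.5, -1.27, 0.01, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Storing each weight in 8 bits instead of 32 shrinks the model to a quarter of its size, at the cost of a rounding error of at most half the scale per weight.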

  4. 4.

    All-Round Development of Computing Power at Device, Edge and Cloud

    The application of artificial intelligence chips to the cloud, edge devices and mobile terminals is expanding, further solving the problem of computing power for AI.

  5. 5.

    More Sophisticated AI Basic Data Services

    As AI basic data services mature, more and more data labeling platforms and tools will be introduced to the market.

  6. 6.

    Safer Data Sharing

    On the premise of ensuring data privacy and security, federated learning uses different data sources to train models collaboratively and thus overcome the data bottleneck, as shown in Fig. 1.24.
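The collaborative training idea behind federated learning can be sketched with a simple averaging scheme. The text does not specify an algorithm, so this is only one possibility, in the style of federated averaging: each client takes a training step on its own data, and a server averages the resulting weights in proportion to the clients' sample counts. The tiny linear model, the data and the function names are all hypothetical.

```python
from typing import List, Tuple


def local_update(weights: List[float], data: List[Tuple[float, float]],
                 lr: float = 0.1) -> List[float]:
    """One gradient step of least-squares fitting y = w0 + w1 * x on local data."""
    g0 = g1 = 0.0
    for x, y in data:
        err = (weights[0] + weights[1] * x) - y
        g0 += err
        g1 += err * x
    n = len(data)
    return [weights[0] - lr * g0 / n, weights[1] - lr * g1 / n]


def federated_average(updates: List[List[float]], sizes: List[int]) -> List[float]:
    """Server step: average client weights, weighted by local sample counts."""
    total = sum(sizes)
    return [sum(u[i] * s for u, s in zip(updates, sizes)) / total
            for i in range(len(updates[0]))]


# Two hypothetical clients; only weights, never raw data, reach the server.
client_data = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]
global_weights = [0.0, 0.0]
for _ in range(500):
    updates = [local_update(global_weights, d) for d in client_data]
    global_weights = federated_average(updates, [len(d) for d in client_data])
```

All the data here follows y = 1 + 2x, and the shared model approaches those coefficients even though neither client sees the other's records.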

Huawei’s Global Industry Vision report (GIV 2025 for short) lists 10 major development trends of intelligent technologies in the future.

  1. 1.

    Popularization of intelligent robots

    Huawei predicts that by 2025, 14% of the families across the globe will own a smart robot, which will play an important role in people’s daily life.

  2. 2.

    Popularization of AR/VR

    The report predicts that the percentage of companies using AR/VR technology will reach 10% in the future. The application of technologies such as virtual reality will bring vigor and opportunities to industries such as commercial presentation and audio-visual entertainment.

  3. 3.

    Application of AI in a wide range of fields

    It is predicted that 97% of the large enterprises will adopt AI technology, mainly exemplified by the employment of speech intelligence, image recognition, facial recognition, human-computer interaction and so on.

  4. 4.

    Popularization of big data applications

    Enterprises will make efficient use of 86% of the data they produce. Big data analysis and processing will save time and enhance efficiency for enterprises.

  5. 5.

    Weakening of search engine

    In the future, 90% of people will have a smart personal assistant, which means they will be far less likely to look something up through a search portal.

  6. 6.

    Popularization of the Internet of Vehicles

    Cellular vehicle-to-everything (C-V2X) technology will be installed in 15% of the vehicles in the world. Smart and connected vehicles will become widespread, providing a safer and more reliable driving experience.

  7. 7.

    Popularization of industrial robots

    Industrial robots will work side by side with people in manufacturing, with 103 robots for every 10,000 employees. Hazardous, high-precision and high-intensity tasks will be assisted by industrial robots or completed by them independently.

  8. 8.

    Popularization of cloud technology and applications

    The usage rate of cloud-based applications will reach 85%. A majority of applications and program collaboration will be performed on the cloud.

  9. 9.

    Popularization of 5G

    Fifty-eight percent of the world’s population will enjoy 5G services. We may anticipate a revolution in the communications industry, in which the technology and speed of communications will advance greatly.

  10. 10.

    Popularization of digital economy and big data

    The amount of data stored globally each year will reach as high as 180 ZB. The digital economy and blockchain technology will be widely combined with the Internet.

Fig. 1.24
figure 24

Federated learning

1.7 Chapter Summary

This chapter introduces the basic concepts, development history and application background of AI. After reading this chapter, readers will see that artificial intelligence is an interdisciplinary science whose application and development cannot be achieved without the support of other disciplines. Its physical implementation relies on large-scale hardware, and its upper-layer applications rely on software design and implementation methods. As learners, readers are expected to understand the boundaries of AI applications and to improve themselves on this basis.

1.8 Exercises

  1. 1.

    There are different interpretations of artificial intelligence in different contexts. Please describe what artificial intelligence means to you.

  2. 2.

    Artificial intelligence, machine learning and deep learning are three concepts often mentioned together. What is the relationship between them? What are the similarities and differences between the three terms?

  3. 3.

    After reading the artificial intelligence application scenarios in this chapter, please describe in detail a field of AI application and its scenarios in real life based on your own life experience.

  4. 4.

    CANN is a chip enablement layer that Huawei introduced for deep neural networks and Ascend AI processors. Please briefly describe the four major modules of CANN.

  5. 5.

    Based on your current knowledge and understanding, please elaborate on the development trends of artificial intelligence in the future in your view.