This chapter introduces Huawei CLOUD Enterprise Intelligence (EI), covering the Huawei CLOUD EI service family, with a focus on the Huawei ModelArts platform and Huawei EI solutions.

8.1 Huawei CLOUD EI Service Family

Huawei CLOUD EI service family is composed of EI big data, EI basic platform, conversational bot, natural language processing (NLP), speech interaction, video analysis, image recognition, content review, image search, face recognition, Optical Character Recognition (OCR) and EI agent, as shown in Fig. 8.1.

Fig. 8.1 Huawei CLOUD EI service family

  1. EI big data provides services such as data access, cloud data migration, real-time streaming computing, MapReduce, Data Lake Insight and table store.

  2. EI basic platform provides services such as the ModelArts platform, deep learning, machine learning, HiLens, Graph Engine Service and video access.

  3. Conversational bot provides QABot, TaskBot, the intelligent quality inspection bot and customized conversational bot services.

  4. Natural language processing provides NLP fundamentals, text content review, language understanding, language generation, NLP customization and machine translation.

  5. Speech interaction provides speech recognition, speech synthesis and real-time speech transcription.

  6. Video analysis provides video content analysis, video editing, video quality detection and video tagging.

  7. Image recognition provides image tagging and celebrity recognition.

  8. Content review provides the review of text, images and videos.

  9. Image search searches images by image, helping customers find the same or similar images in a designated image library.

  10. Face recognition provides face recognition and body analysis.

  11. OCR provides character recognition of the general, certificate, bill, industry and customized template classes.

  12. EI agent is composed of the transportation AI agent, industrial AI agent, park AI agent, network AI agent, auto AI agent, medical AI agent and geographic AI agent.

8.1.1 Huawei CLOUD EI Agent

EI agent integrates AI technology into application scenarios across industries. Combining AI with domain technologies, it mines the value of data in depth to build scenario-based solutions that improve efficiency and user experience. EI agent is composed of the transportation AI engine, industrial AI engine, park AI engine and network AI engine, as shown in Fig. 8.2. In addition, Huawei has launched the vehicle AI engine, medical AI engine and geographic AI engine.

Fig. 8.2 EI agent

  1. Transportation AI Engine

    The transportation AI engine delivers products and solutions such as all-around road network analysis, traffic prediction, traffic incident monitoring and control, traffic light optimization, traffic parameter perception and situation evaluation, ensuring efficient, green and safe travel. The transportation AI engine is shown in Fig. 8.3.

    Transportation AI engine has the following advantages.

    (a) It realizes comprehensive and in-depth data mining, fully integrating Internet and transportation big data and deeply excavating the value of that data.

    (b) It provides all-around collaboration and pedestrian-vehicle collaboration, which maximizes the traffic flow of the whole region and minimizes the waiting time of vehicles in the region. It also coordinates the traffic demands of vehicles and pedestrians so as to realize their orderly passage.

    (c) It provides real-time traffic light scheduling and is the first in the industry to formulate a standard for the secure communication interface between the transportation AI agent and the traffic light control platform.

    (d) It can accurately predict driving path demand so as to plan routes in advance.

    Transportation AI engine is characterized as follows.

    (a) Full time: It realizes 7 × 24 h whole-area, full-time perception of traffic incidents.

    (b) Intelligence: It achieves regional traffic light optimization.

    (c) Completeness: It can identify key congestion points and key congestion paths and analyze congestion diffusion.

    (d) Prediction: It predicts crowd density and obtains the traffic regularity of crowd migration.

    (e) Accuracy: It achieves comprehensive and accurate control of traffic conditions 7 × 24 h.

    (f) Convenience: It realizes real-time traffic light scheduling and traffic clearance on demand.

    (g) Visuality: It displays the live traffic situation on a large screen.

    (h) Fineness: It realizes key vehicle control and fine-grained management.

  2. Industrial AI Engine

    Relying on big data and artificial intelligence, the industrial AI engine provides full-chain services in the fields of design, production, logistics, sales and service. It mines the value of data, helping enterprises take the lead with new technologies. The industrial AI engine is shown in Fig. 8.4.

    The industrial AI engine brings three major changes to existing industry.

    (a) Transformation from artificial experience to data intelligence: Based on data mining and analysis, new insights for improving efficiency and product quality can be obtained from data.

    (b) Transition from digital to intelligent: The ability of intelligent analysis has become the new driving force of enterprise digitization.

    (c) Transition from product manufacturing to product innovation: Data collaboration from product design to sales within enterprises, as well as between the upstream and downstream of the industrial chain, brings new competitive advantages.

    The applications of the industrial AI engine are as follows.

    (a) Product quality optimization and improvement: Based on customer feedback, Internet comment analysis, competitor analysis, maintenance records and after-sales historical data, a classified analysis is carried out to find the key problems of products, so as to guide new product improvement and product quality promotion.

    (b) Intelligent equipment maintenance: According to the past and present status of a system, predictive maintenance is carried out through predictive inference methods such as time series prediction, neural network prediction and regression analysis. It can predict whether the system will fail in the future, when it will fail and the type of failure, so as to improve the efficiency of service operation and maintenance, reduce unplanned equipment downtime and save the labor cost of on-site service.

    (c) Production material estimation: Based on historical material data, the materials needed for production are accurately predicted so as to shorten the storage cycle and improve efficiency. The deep algorithm optimization is based on an industry time series algorithm model, combined with deep optimization of the Huawei supply chain.

  3. Park AI Engine

    The park AI engine applies artificial intelligence to the management and monitoring of industrial, residential and commercial parks, providing a convenient and efficient environment through technologies such as video analysis and data mining. The park AI engine is shown in Fig. 8.5.

    Park AI engine brings the following three changes.

    (a) From manual defense to intelligent defense: Intelligent security based on artificial intelligence can relieve the pressure on security personnel.

    (b) From card swiping to face scanning: Face scanning clocks people in automatically, so it is no longer necessary to carry an entrance card.

    (c) From worry to reassurance: With powerful loss tracking and analysis ability, artificial intelligence makes employees and property owners feel more at ease.

    The applications of the park AI engine are as follows.

    (a) Park entrance control: Face recognition technology can accurately identify the identity of visitors and quickly return results, achieving high-throughput entrance control and automatic park management.

    (b) Safety monitoring: Through intelligent technologies such as intrusion detection, loitering detection and abandoned object detection, the area can be monitored to ensure its safety.

    (c) Smart parking: Through license plate recognition and trajectory tracking, services such as vehicle access control, driving route control, illegal parking management and parking space management can be realized.

  4. Network AI Engine

    The network AI engine (NAIE) introduces AI into the network field to solve the problems of network service prediction, repetitive work and complexity. It improves network resource utilization, operation and maintenance efficiency, energy efficiency and service experience, making the autonomous driving network possible. The network AI engine is shown in Fig. 8.6.

    Network AI engine has the following commercial values.

    (a) Improved resource utilization: AI is introduced to predict network traffic, and network resources are managed in a balanced way according to the prediction results, so as to improve the utilization rate of network resources.

    (b) Improved operation and maintenance efficiency: AI is introduced to reduce a large amount of repetitive work, predict faults and carry out preventive maintenance, so as to improve the operation and maintenance efficiency of the network.

    (c) Improved energy efficiency: AI technology is used to predict the service status in real time and dynamically adjust energy consumption according to the service volume, so as to improve the efficiency of energy utilization.

    The technical advantages of network AI engine are as follows.

    (a) Secure data ingestion into the data lake: It supports the rapid collection of various types of data, such as network parameters, performance data and alarms, into the data lake. A large number of tools are provided to improve the efficiency of data governance, while multi-tenant isolation and encrypted storage ensure security across the whole life cycle of the ingested data.

    (b) Embedded network domain experience: It provides a guided model development environment and presets multiple AI model development templates for the network domain. It offers different services for different developers, such as training services, model generation services and communication model services, helping developers quickly complete model and application development.

    (c) Rich application services: It provides application services for various network business scenarios such as wireless access, fixed network access, transmission load, core network, DC and energy. It can effectively solve specific problems of operation and maintenance efficiency, energy consumption efficiency and resource utilization in network services.

Fig. 8.3 Transportation AI engine

Fig. 8.4 Industrial AI engine

Fig. 8.5 Park AI engine

Fig. 8.6 Network AI engine

8.1.2 EI Basic Platform: Huawei HiLens

Huawei HiLens is a multimodal AI development and application platform featuring end-cloud collaboration, composed of end-side computing devices and a cloud platform. It provides a simple development framework, an out-of-the-box development environment, a rich AI skill market and a cloud management platform. It connects with a variety of end-side computing devices, supporting visual and auditory AI application development, online deployment of AI applications, and massive device management. HiLens helps users develop multimodal AI applications and distribute them to end-side devices to realize intelligent solutions for multiple scenarios. HiLens is shown in Fig. 8.7.

Fig. 8.7 Huawei HiLens end-cloud collaboration

The features of HiLens products are as follows.

  1. End-cloud collaborative inference, balancing low computing latency and high accuracy.

  2. End-side data analysis, minimizing the cost of cloud storage.

  3. One-stop skill development, shortening the development cycle.

  4. A skill market with rich preset skills, supporting online training and one-click deployment.

  1. HiLens Product Advantages

    (a) End-cloud collaborative inference.

      • End-cloud collaboration can work in scenarios with unstable networks and saves user bandwidth.

      • End-side devices can cooperate with the cloud side to update models online and quickly improve end-side accuracy.

      • The end side can analyze local data, greatly reducing the data sent to the cloud and saving storage costs.

    (b) Unified skill development platform.

      With collaborative optimization of software and hardware, HiLens products use a unified skill development framework, encapsulate basic components, and support common deep learning models.

    (c) Cross-platform design.

      • HiLens products support the Ascend AI processor, HiSilicon 35xx series chips and other mainstream chips on the market, covering the needs of mainstream monitoring scenarios.

      • HiLens products provide model conversion and algorithm optimization for end-side chips.

    (d) Rich skill market.

      • A variety of skills are preset in the HiLens skill market, such as human shape detection and crying detection. Users can select the required skills directly from the skill market and quickly deploy them on the end side without any development steps.

      • HiLens has done a great deal of algorithm optimization to address the limited memory and precision of end-side devices.

      • Developers can also develop custom skills through the HiLens management console and publish them to the skill market.

  2. HiLens Applications

    (a) From the perspective of users, Huawei HiLens mainly serves three types of users: ordinary users, AI developers and camera manufacturers.

      • Ordinary users: Ordinary users are skill users, such as family members, supermarket owners, parking lot attendants or site managers. HiLens Kit can provide functions such as improved home security, customer flow counting, identification of vehicle attributes and license plates, and detection of safety helmet wearing. Such users only need to register on the HiLens management console, purchase or customize appropriate skills (such as license plate recognition or safety helmet recognition) in the platform skill market, and install them on HiLens Kit with one click to meet their needs.

      • AI developers: AI developers are usually technicians or college students engaged in AI development who want to gain income or knowledge from AI. These users can develop AI skills in the HiLens management console and easily deploy them to devices to see how the skills work in real time.

        HiLens integrates the HiLens framework on the end side, which encapsulates the basic components, simplifies the development process and provides a unified API, with which developers can easily complete the development of a skill. After skill development is completed, users can deploy the skill to HiLens Kit with one click to see how it runs. Skills can also be released to the skill market for other users to buy and use, or shared as templates for other developers to learn from.

      • Camera manufacturers: These are manufacturers of camera products based on HiSilicon 35xx series chips. Because such cameras may have weak or even no AI capabilities, the manufacturers expect their products to become more competitive by gaining stronger AI capabilities.

    (b) In terms of application scenarios, Huawei HiLens can be applied in various fields such as intelligent home surveillance, intelligent park surveillance, intelligent supermarket surveillance and intelligent in-vehicle analysis.

      • Home intelligent surveillance: Home intelligent cameras and smart home devices integrated with Huawei HiSilicon 35xx series chips, as well as the high-performance HiLens Kit integrated with the D chip, can be used to improve home video intelligent analysis. It can be applied to the following scenarios.

        • Human shape detection. It can detect human figures in home surveillance, record the time of appearance, or send an alert to the user’s mobile phone when a figure is detected during a period when no one is at home.

        • Fall detection. When it detects that someone has fallen down, an alert is sent out. It is mainly used for elderly care.

        • Cry detection. When it detects a baby crying, an alert is sent to the user’s mobile phone. It is used for child care.

        • Vocabulary recognition. It can be customized with a specific word, such as “help”, and give an alert when the word is detected.

        • Face attribute detection. Face attributes are detected, including gender, age and smiling face, which can be used for door security, video filtering and so on.

        • Time album. The video clips of detected children are combined into a time album to record their growth.

      • Park intelligent surveillance: Through the HiLens management console, AI skills are distributed to intelligent edge stations integrated with the Ascend chip, so that edge devices can process some of the data locally. It can be applied to the following scenarios.

        • Face recognition gate. Based on face recognition technology, face-based access can be realized at the entrance and exit gates of the park.

        • License plate/vehicle identification. At the entrance and exit of the park or garage, license plate and vehicle type identification can be carried out to realize authorization based on license plate and vehicle type.

        • Safety helmet detection. Workers not wearing safety helmets are found in video surveillance and an alert is initiated on the specified equipment.

        • Track restoration. The faces of the same person, or the same vehicle, identified by multiple cameras are analyzed to restore the path of the pedestrian or vehicle.

        • Face retrieval. In surveillance, face recognition is used to identify specified faces, which can be used for blacklist recognition.

        • Abnormal sound detection. When an abnormal sound such as glass breaking or an explosion is detected, an alert is reported.

        • Intrusion detection. An alert is sent when a human shape is detected in the specified surveillance area.

      • Shopping mall intelligent surveillance: The terminal devices applicable to shopping malls include HiLens Kit, intelligent edge stations and commercial cameras. Small supermarkets can use HiLens Kit, which supports 4–5 channels of video analysis. Thanks to its small size, it can be placed in an indoor environment. It can be applied to the following scenarios.

        • Customer flow statistics. Through the surveillance of stores and supermarkets, intelligent customer flow statistics at the entrance and exit can be realized, which can be used to analyze customer flow changes at different periods.

        • VIP identification. Through face recognition, VIP customers can be accurately identified to help formulate marketing strategies.

        • Statistics of new and returning customers. Through face recognition, the numbers of new and returning customers can be counted.

        • Pedestrian counting heat map. Through the analysis of the pedestrian counting heat map, crowd density can be identified, which is beneficial for commodity popularity analysis.

      • Intelligent in-vehicle: Intelligent in-vehicle equipment based on the Android system can realize real-time intelligent analysis of conditions inside and outside the vehicle. It is applicable to scenarios such as driving behavior detection and the surveillance of “two passenger and one hazardous” vehicles (passenger coaches and hazardous chemical trucks). It can be applied to the following scenarios.

        • Face recognition. By checking whether the driver’s face matches the owner’s pre-stored photo library, the driver’s authority is confirmed.

        • Fatigue driving. The driver’s driving state is monitored in real time, and an intelligent warning is sent if fatigue driving is detected.

        • Posture analysis. It detects the driver’s distracted behaviors, such as making a phone call, drinking water, gazing around or smoking.

        • Vehicle and pedestrian detection. It can be used for pedestrian detection in blind areas.

8.1.3 EI Basic Platform: Graph Engine Service

Huawei Graph Engine Service (GES) is the first commercial distributed native graph engine with independent intellectual property rights in China. It is a service for querying and analyzing graph-structured data on the basis of “relations”.

Adopting EYWA, a high-performance graph engine developed by Huawei, as its core, GES holds a number of independent patents. It is widely used in scenarios rich in relational data, such as social applications, enterprise relationship analysis, logistics distribution, shuttle bus route planning, enterprise knowledge graphs, risk control, recommendation, public opinion analysis and fraud prevention.

Massive and complex associated data such as social relations, transaction records and transportation networks are naturally graph data, and Huawei GES is a service for storing, querying and analyzing such graph-structured data based on relations. It plays an important role in many scenarios such as social apps, enterprise relationship analysis, logistics distribution, shuttle bus route planning, enterprise knowledge graphs and risk control.

In individual analysis, GES builds user profiles for individual nodes according to the number and characteristics of their neighbors. It can also mine and identify opinion leaders according to the characteristics and importance of the nodes. For example, considering the quantity factor, the more attention a user receives from others, the more important the user is. Considering the quality transfer factor, based on the transfer characteristics of the graph, the quality of followers is transferred to the users they follow: a user followed by high-quality followers becomes more important.
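
The “quality transfer” idea described above is essentially the intuition behind PageRank-style influence scoring, one of the algorithms GES supports. The following is a minimal, self-contained sketch of power-iteration influence scoring on a tiny follow graph; the graph data, damping factor and function name are illustrative assumptions and are unrelated to the GES API.

```python
# PageRank-style influence scoring on a tiny "follow" graph.
# Edges point from a follower to the account they follow, so influence
# flows along edges. The graph and damping factor are illustrative only.
follows = {
    "alice": ["carol"],
    "bob": ["carol", "dave"],
    "carol": ["dave"],
    "dave": [],
}

def influence_scores(graph, damping=0.85, iterations=50):
    nodes = list(graph)
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for follower, followed in graph.items():
            if not followed:              # dangling node: spread evenly
                for n in nodes:
                    new[n] += damping * score[follower] / len(nodes)
            else:                         # quality flows to followed accounts
                share = damping * score[follower] / len(followed)
                for target in followed:
                    new[target] += share
        score = new
    return score

# "dave" ends up with the highest score: it is followed by accounts that
# are themselves followed by others.
print(sorted(influence_scores(follows).items(), key=lambda kv: -kv[1]))
```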

In group analysis, with the label propagation algorithm and community detection algorithms, GES groups nodes with similar characteristics into one class, so that it can be applied to node classification scenarios such as friend recommendation, group recommendation and user clustering. For example, in a social circle, if two people have a mutual friend, they may become friends in the future; the more mutual friends they have, the stronger the potential relationship. This makes it possible to recommend friends based on the number of mutual friends.
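
As a concrete illustration of the mutual-friend heuristic described above, the short sketch below counts common neighbors in a small friendship graph and ranks candidate friends for one user. The friendship data is invented for the example and has nothing to do with the GES API.

```python
# Undirected friendship graph (illustrative data only).
friends = {
    "ann":  {"bob", "carl", "dora"},
    "bob":  {"ann", "carl"},
    "carl": {"ann", "bob", "dora", "eve"},
    "dora": {"ann", "carl", "eve"},
    "eve":  {"carl", "dora"},
}

def recommend(user, graph):
    """Rank non-friends of `user` by the number of mutual friends."""
    candidates = {}
    for other in graph:
        if other == user or other in graph[user]:
            continue
        mutual = graph[user] & graph[other]
        if mutual:
            candidates[other] = len(mutual)
    # More mutual friends first, i.e. stronger potential relationships.
    return sorted(candidates.items(), key=lambda kv: -kv[1])

print(recommend("bob", friends))   # [('dora', 2), ('eve', 1)]
```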

In link analysis, GES can use link analysis and relationship prediction algorithms to predict and identify hot topics, so as to find “the tipping point”, as shown in Fig. 8.8.

Fig. 8.8 Graph engine service

It can be seen that the application scenarios of GES in the real world are rich and extensive. There will be more industries and application scenarios in the future worthy of in-depth exploration.

The product advantages of GES are as follows.

  1. Large scale: GES provides efficient data organization, enabling the query and analysis of graphs with 10 billion vertices and 100 billion edges.

  2. High performance: GES provides a deeply optimized distributed graph computing engine, through which users can obtain high-concurrency, second-level, multi-hop real-time query capability.

  3. Integration of query and analysis: GES provides a wealth of graph analysis algorithms, offering a variety of analysis capabilities for business scenarios such as relationship analysis, route planning and precision marketing.

  4. Ease of use: GES provides a guided, easy-to-use visual analysis interface where what you see is what you get; it supports the Gremlin query language and is thus compatible with user habits.

The functions provided by GES are as follows.

  1. Rich domain algorithms: GES supports many algorithms, such as PageRank, k-core, shortest path, label propagation, triangle count and interaction prediction.

  2. Visual graph analysis: GES provides a guided exploration environment that supports visualization of query results.

  3. Query and analysis APIs: GES provides APIs for graph query, graph index statistics, Gremlin query, graph algorithms, graph management, backup management, etc.

  4. Compatibility with the open source ecosystem: GES is compatible with Apache TinkerPop Gremlin 3.3.0, so existing Gremlin queries can be submitted to it (see the sketch after this list).

  5. Graph management: GES provides graph engine services such as overview, graph management, graph backup and metadata management.
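
Since GES accepts Gremlin, a typical programmatic interaction is to submit Gremlin traversal strings to the service over HTTPS. The sketch below shows this pattern in Python; the endpoint URL, project ID, graph name, request path and token handling are placeholders for illustration and should be taken from the official GES API documentation and your own Huawei Cloud account, not from this book.

```python
import requests

# Placeholder values -- replace with real ones from your Huawei Cloud account.
ENDPOINT = "https://ges.example-region.myhuaweicloud.com"   # hypothetical host
PROJECT_ID = "<project_id>"
GRAPH_NAME = "<graph_name>"
TOKEN = "<iam_token>"          # obtained separately via IAM authentication

def run_gremlin(query: str):
    """Submit a Gremlin traversal to a GES graph and return the JSON reply."""
    url = (f"{ENDPOINT}/v1.0/{PROJECT_ID}/graphs/{GRAPH_NAME}/action"
           "?action_id=execute-gremlin-query")        # illustrative path
    resp = requests.post(
        url,
        json={"command": query},
        headers={"X-Auth-Token": TOKEN, "Content-Type": "application/json"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

# Two illustrative Gremlin traversals: the 2-hop neighborhood of a vertex,
# and the five vertices with the largest out-degree.
print(run_gremlin("g.V('alice').both().both().dedup().limit(20)"))
print(run_gremlin("g.V().order().by(outE().count(), decr).limit(5)"))
```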

8.1.4 Introduction to Other Services Provided by EI Family

  1. Conversational Bot

    Conversational Bot Service (CBS) is composed of QABot, TaskBot, the intelligent quality inspection bot and the customized conversational bot, as shown in Fig. 8.9.

    (a) QABot: It helps enterprises quickly build, release and manage intelligent question-answering bot systems.

    (b) TaskBot: It accurately understands the intention of a conversation and extracts key information. It can be used for intelligent call traffic, intelligent hardware and so on.

    (c) Intelligent quality inspection bot: It uses natural language algorithms and user-defined rules to analyze the conversations between customer service agents and customers in call center scenarios, helping enterprises improve service quality and customer satisfaction.

    (d) Customized conversational bot: It builds AI bots with various capabilities according to customer needs, including knowledge base and knowledge graph QA, task-based conversation, reading comprehension, automatic text generation and multimodality, serving customers in different industries.

  2. Natural Language Processing

    Natural language processing (NLP) provides the services that bots need to realize semantic understanding. It is composed of four sub-services: NLP fundamentals, language understanding, language generation and machine translation. The NLP service is shown in Fig. 8.10.

    NLP fundamentals provides users with natural-language-related APIs, including word segmentation, named entity recognition, keyword extraction and text similarity, which can be applied to scenarios such as intelligent question answering, conversational bots, public opinion analysis, content recommendation, and e-commerce review analysis.

    Language understanding provides users with APIs such as sentiment analysis, opinion extraction, text classification and intention understanding, which can be applied to scenarios such as comment opinion mining, public opinion analysis, intelligent assistants and conversational bots.

    Based on advanced language models, language generation produces readable text from the input information, which may be text, data or images. It can be applied to human-computer interaction scenarios such as intelligent Q&A and conversation, news summarization and report generation.

    NLP customization builds a dedicated natural language processing model according to the specific needs of a customer, such as a customized automatic classification model for legal documents, a customized automatic generation model for medical reports, or a customized public opinion analysis model for a specific field, providing unique competitiveness for enterprise applications.

  3. Speech Interaction

    Speech interaction is composed of speech recognition, speech synthesis and real-time speech transcription, as shown in Fig. 8.11.

    The main applications of speech recognition are as follows.

    (a) Speech search: Search content is entered directly by speech, which makes searching more efficient. Speech recognition supports speech search in various scenarios, such as map navigation and web search.

    (b) Human-computer interaction: Through speech wake-up and speech recognition, speech commands are sent to terminal devices and the devices are operated in real time, improving the human-computer interaction experience.

    The applications of speech synthesis are as follows.

    (a) Speech navigation: Speech synthesis can convert in-vehicle navigation data into speech, providing users with accurate voice navigation. With the personalized customization ability of speech synthesis, it provides rich navigation voices.

    (b) Audio books: Speech synthesis can transform the text content of books, magazines and news into realistic human voice, so that people can fully free their eyes while obtaining information and having fun in scenarios such as taking the subway, driving or exercising.

    (c) Telephone follow-up: In customer service systems, follow-up content is converted into human voice through speech synthesis, and the user experience is improved through direct voice communication with customers.

    (d) Intelligent education: Speech synthesis can turn the text content of books into speech. Pronunciation close to that of a real person can simulate live teaching scenes, so as to realize reading aloud and guided reading of texts, helping students better understand and master the teaching content.

    The applications of real-time speech transcription are as follows.

    (a) Live subtitles: Real-time speech transcription converts the audio of live video or live broadcasts into subtitles in real time, providing a more efficient viewing experience for the audience and facilitating content monitoring.

    (b) Real-time conference records: Real-time speech transcription converts the audio of video or teleconferences into text in real time, allowing the transcribed conference content to be checked, modified and retrieved in real time, so as to improve conference efficiency.

    (c) Instant text entry: Real-time speech transcription in mobile apps can be used to record speech and provide transcribed text in real time, as in speech input methods. It is convenient for post-processing and content archiving, saving the labor and time cost of manual recording and thus greatly improving conversion efficiency.

  4. Video Analysis

    Video analysis provides services such as video content analysis, video editing and video tagging.

    The applications of video content analysis are as follows.

    (a) Monitoring management: Video content analysis analyzes all videos in a shopping mall or park in real time in order to extract key events, such as warehouse monitoring, cashier compliance issues and fire exit blockage. It also performs intruder detection, loitering detection and abandoned object detection in high-security areas, as well as intelligent loss prevention such as portrait surveillance and theft detection.

    (b) Park pedestrian analysis: Through real-time analysis of the pedestrians active in a park, video content analysis identifies and tracks high-risk persons once a pedestrian blacklist is configured, sending a warning. It also counts pedestrian flow at key intersections to support park management strategies.

    (c) Video character analysis: Through the analysis of public figures in media videos, video content analysis accurately identifies political figures, movie stars and other celebrities in the video.

    (d) Motion recognition: Video content analysis detects and recognizes the motions in a video by analyzing frame-to-frame information, optical flow motion information and scene content information.

    The applications of video editing are as follows.

    (a) Highlight clip extraction: Based on the content relevance and highlights of a video, video editing extracts scene segments to make a video summary.

    (b) News video splitting: Video editing splits a complete news program into news segments on different topics based on the analysis of characters, scenes, voices and on-screen text in the news.

    The applications of video tagging are as follows.

    (a) Video search: Based on the analysis of video scene classification, people recognition, speech recognition and character recognition, video tagging forms hierarchical classification tags to support accurate and efficient video search and improve the search experience, as shown in Fig. 8.12.

    (b) Video recommendation: Based on the analysis of scene classification, person recognition, speech recognition and OCR, video tagging forms hierarchical classification tags for personalized video recommendation.

  5. Image Recognition

    Image recognition, based on deep learning technology, can accurately identify the visual content in an image, providing tens of thousands of object, scene and concept tags. With target detection and attribute recognition capabilities, it helps customers accurately identify and understand image content. Image recognition provides functions such as scene analysis, smart photo album, target detection and image search, as shown in Fig. 8.13.

    (a) Scene analysis: A lack of content tags leads to low retrieval efficiency. Image tagging can accurately identify image content and improve retrieval efficiency and accuracy, making personalized recommendation, content retrieval and distribution more effective.

    (b) Smart photo album: Based on the tens of thousands of tags identified from images, smart photo albums can be organized into custom categories, such as “plants”, “food” and “work”, which is convenient for users to manage.

    (c) Target detection: At a construction site, based on customized image recognition, a target detection system can monitor in real time whether on-site staff wear safety helmets, in order to reduce safety risks.

    (d) Image search: Searching a massive image database is troublesome. Image search technology, based on image tags, can quickly find the desired image whether the user inputs a keyword or an image.

  6. Content Review

    Content review is composed of text review, image review and video review. Based on leading detection technology for text, images and videos, it can automatically detect content involving pornography, advertising, terrorism and politics, helping customers reduce the risk of business violations. Content review is shown in Fig. 8.14.

    Content review includes the following applications.

    (a) Pornography identification: Content review can judge the pornographic degree of a picture and give three confidence scores: pornographic, sexy and normal.

    (b) Terrorism detection: Content review can quickly detect whether a picture contains content involving fire, guns, knives, bloodiness, flags of terrorist organizations, etc.

    (c) Detection of sensitive political figures: Content review can judge whether the content involves sensitive political figures.

    (d) Text content detection: Content review can detect whether text content involves pornography, politics, advertising, abuse, spam or contraband.

    (e) Video review: Content review can judge whether a video carries the risk of violation and provide violation information along the dimensions of picture, sound and subtitle.

  7. Image Search

    Image search is searching for images with an image. Based on deep learning and image recognition technology, it uses feature vectorization and retrieval capabilities to help customers find the same or similar pictures in a specified library.

    The applications of image search are as follows.

    (a) Commodity image search: Image search can look up the picture taken by a user in the commodity library. Through similar-picture search, the same or similar commodities are pushed to the user, so as to sell or recommend related commodities, as shown in Fig. 8.15.

    (b) Image copyright search: Image copyright is an important asset of photography and design websites. Image search can quickly locate infringing pictures in massive image databases so as to defend the rights of image resource websites.

  8. Face Recognition

    Face recognition can quickly identify faces in images, analyze key information and obtain face attributes, so that accurate face comparison and retrieval can be achieved.

    The applications of face recognition are as follows.

    (a) Identity authentication: Face identification and comparison can be used for identity authentication, which is suitable for authentication scenarios such as airports and customs.

    (b) Electronic attendance: Face identification and comparison is applicable to the electronic attendance of employees in enterprises as well as security monitoring.

    (c) Trajectory analysis: Face search can retrieve, from an image database, the N face images most similar to the input face together with their similarity scores. According to the time, place and behavior information of the returned pictures, it can help customers carry out trajectory analysis.

    (d) Customer flow analysis: Customer flow analysis is of great value to shopping malls. Based on face recognition, comparison and search technology, it can accurately analyze customer information such as age and gender, so as to distinguish new and regular customers and help merchants with efficient marketing. Customer flow analysis is shown in Fig. 8.16.

  9. Optical Character Recognition

    Optical Character Recognition (OCR) recognizes the text in a picture or scanned copy as editable text. OCR can replace manual input and improve business efficiency. It supports character recognition in scenarios such as ID cards, driver’s licenses, vehicle licenses, invoices, English customs documents, common forms and general text, as shown in Fig. 8.17.

    OCR supports character recognition of the general, certificate, bill, industry and customized template classes.

    General OCR supports automatic recognition of the text information in pictures of arbitrary format, such as forms, documents and web images. It can analyze various layouts and forms, so as to quickly digitize all kinds of documents.

    The applications of general OCR are as follows.

    (a) Electronic filing of enterprise historical documents and reports: It can identify the text information in documents and reports and establish electronic archives, facilitating rapid retrieval.

    (b) Automatic filling of express delivery sender information: It can identify the contact information in a picture and automatically fill in the express delivery form, reducing manual input.

    (c) More efficient contract handling: It can automatically identify structured information and extract the signature and seal areas, which is helpful for rapid auditing.

    (d) Electronic customs documents: Many companies have overseas business. General OCR can automatically structure and digitize customs document data, improving the efficiency and accuracy of information entry.

    Certificate character recognition supports the automatic identification of valid information and the structured extraction of key fields from ID cards, driving licenses, vehicle licenses and passports.

    The applications of certificate character recognition are as follows.

    (a) Fast authentication: It can quickly complete real-name authentication for mobile phone account opening and other scenarios, reducing the cost of user identity verification.

    (b) Automatic information entry: It can identify key information in certificates, saving manual entry and improving efficiency.

    (c) Verification of identity information: It can verify whether a user is the holder of a real certificate.

    Bill character recognition supports the automatic recognition and structured extraction of valid information from various invoices and forms, such as VAT invoices, motor vehicle sales invoices and medical invoices.

    The applications of bill character recognition are as follows.

    (a) Automatic entry of reimbursement document information: It can quickly identify the key information in invoices and effectively shorten reimbursement time.

    (b) Automatic entry of document information: It can quickly enter motor vehicle sales invoice and contract information, improving the efficiency of vehicle loan processing.

    (c) Medical insurance: It can automatically identify key fields of medical documents, such as drug details, age and gender, before entering them into the system. Combined with ID card and bank card OCR, it can quickly complete insurance claim processing.

    Industry character recognition supports the extraction and recognition of structured information from various industry-specific pictures, such as logistics sheets and medical test documents, helping to improve the automation efficiency of the industry.

    The applications of industry type character recognition are as follows.

    (a) Automatic filling of express delivery sender information: It can identify the contact information in a picture and automatically fill in the express delivery form, minimizing manual input.

    (b) Medical insurance: It can automatically identify key fields of medical documents, such as drug details, age and gender, and enter them into the system. Combined with ID card and bank card OCR, it can quickly complete insurance claim processing.

    Customized template character recognition supports user-defined recognition templates. Users can specify the key fields to be recognized, realizing the automatic recognition and structured extraction of images in user-specific formats.

    (a) Identification of various certificates: For card images of various formats, templates can be made to realize automatic identification and extraction of key fields.

    (b) Recognition of various bills: For bill images of various formats, templates can be made to realize automatic recognition and extraction of key fields.

Fig. 8.9 Conversational bot

Fig. 8.10 Natural language processing service

Fig. 8.11 Speech interaction

Fig. 8.12 Video search

Fig. 8.13 Applications of image recognition

Fig. 8.14 Content review

Fig. 8.15 Commodity search

Fig. 8.16 Customer flow analysis

Fig. 8.17 Optical character recognition

8.2 ModelArts

As the EI basic platform in the EI service family, ModelArts is a one-stop development platform for AI developers. It provides massive data preprocessing and semi-automatic annotation, large-scale distributed training, automatic model generation, and on-demand deployment of models on the end, edge and cloud, helping users quickly create and deploy models and manage the full-cycle AI workflow.

“One-stop” means that all aspects of AI development, including data processing, algorithm development, model training and model deployment, can be completed on ModelArts. Technically, it supports various heterogeneous computing resources, so that developers can flexibly choose and use them according to their needs, regardless of the underlying technologies. At the same time, ModelArts supports mainstream open source AI development frameworks such as TensorFlow and MXNet, as well as developers’ own algorithm frameworks, to match the usage habits of developers.

Aiming to make AI development easier and more convenient, ModelArts provides AI developers with a convenient and easy-to-use workflow. For example, business-oriented developers can use the automatic learning process to quickly build AI applications without focusing on models or coding; AI beginners can use preset algorithms to build AI applications without much concern for model development; and AI engineers, provided with a variety of development environments, operation processes and modes, can easily extend code and quickly build models and applications.

8.2.1 Functions of ModelArts

ModelArts enables developers to complete all tasks in one stop, from data preparation to algorithm development and model training, and finally to deploying models and integrating them into the production environment. The function overview of ModelArts is shown in Fig. 8.18.

Fig. 8.18 Function overview of ModelArts

The features of ModelArts are as follows.

  1. Data governance: ModelArts supports data processing such as data filtering and annotation, and provides version management of data sets, especially the large data sets used in deep learning, so that training results can be reproduced.

  2. Extremely “fast” and “simple” training: The MoXing deep learning framework developed for ModelArts is more efficient and easier to use, greatly improving training speed.

  3. Multi-scenario deployment on the end, edge and cloud: ModelArts supports deploying models to a variety of production environments; models can be deployed for cloud online inference and batch inference, or deployed directly to the end and edge.

  4. Automatic learning: ModelArts supports a variety of automatic learning capabilities. By training models through “automatic learning”, users can complete automatic modeling and one-click deployment without writing code.

  5. Visual workflow: ModelArts uses GES to manage the metadata of the development process and automatically visualize the relationships between workflows and version evolution, so as to realize model traceability.

  6. AI market: ModelArts presets common algorithms and data sets, and supports sharing models within an enterprise or publicly.

8.2.2 Product Structure and Application of ModelArts

As a one-stop development platform, ModelArts supports the whole development process from data to AI application, including data processing, model training, model management and model deployment. It also provides an AI market, in which models can be shared with other developers. The product structure of ModelArts is shown in Fig. 8.19.

Fig. 8.19 Product structure of ModelArts

ModelArts supports the whole AI development process from data preparation to model deployment, and a variety of AI application scenarios, as detailed below.

  1. Image recognition: It can accurately identify the object classification information in a picture, such as animal identification, brand logo recognition and vehicle identification.

  2. Video analysis: It can accurately analyze key information in video, such as face recognition and vehicle feature recognition.

  3. Speech recognition: It enables machines to understand speech signals and assists in processing speech information, which is applicable to intelligent customer service Q&A, intelligent assistants, etc.

  4. Product recommendation: It can provide personalized business recommendations for customers according to their attributes and behavior characteristics.

  5. Anomaly detection: During the operation of network equipment, an automated network detection system can analyze traffic in real time so as to predict suspicious traffic or equipment that may fail.

  6. In the future, ModelArts will continue to invest in data augmentation, model training speed and weakly supervised learning, which will further improve the efficiency of AI model development.

8.2.3 Product Advantages of ModelArts

The product advantages of ModelArts are reflected in the following four aspects.

  1. One-stop: Out of the box, it covers the whole process of AI development, including data processing, model development, training, management and deployment. One or more of these functions can be used flexibly.

  2. Easy to use: It offers a variety of preset models, and open source models can be used at any time; model hyperparameters are automatically optimized, which is simple and fast; zero-code development makes it easy for users to train their own models; and one-click deployment of models to the end, edge and cloud is supported.

  3. High performance: The MoXing deep learning framework developed for ModelArts improves algorithm development efficiency and training speed; optimized GPU utilization in deep model inference accelerates cloud online inference; and models that run on the Ascend chip can be generated to realize efficient end-side inference.

  4. Flexibility: It supports a variety of mainstream open source frameworks (TensorFlow, Spark MLlib, etc.), mainstream GPUs and the self-developed Ascend chip, exclusive use of dedicated resources, and user-defined images to meet the needs of user-defined frameworks and operators.

In addition, ModelArts has the following advantages.

  1. Enterprise level: It supports massive data preprocessing and version management, as well as multi-scenario model deployment on the end, edge and cloud, realizing visual management of the whole AI development process. It also provides an AI sharing platform, helping enterprises build internal and external AI ecosystems.

  2. Intellectualization: It supports automatic model design, training models automatically according to the deployment environment and inference speed requirements. It also supports automatic modeling for image classification and object detection scenarios, as well as automatic feature engineering and automatic modeling for structured data.

  3. Data preparation efficiency improved 100-fold: It has a built-in AI data framework that improves the efficiency of data preparation through the combination of automatic pre-annotation and hard-example annotation.

  4. Greatly reduced model training time: It provides MoXing, a high-performance distributed framework developed by Huawei that adopts core technologies such as cascaded hybrid parallelism, gradient compression and convolution acceleration to greatly reduce model training time.

  5. One-click deployment of models to the end, edge and cloud.

  6. AI model deployment: It provides edge inference, online inference and batch inference.

  7. Accelerating the AI development process with AI methods (automatic learning): It provides UI guidance and adaptive training.

  8. Whole-process management: It realizes automatic visualization of the development process, resumption of training from breakpoints, and easy comparison of training results.

  9. AI sharing, helping developers reuse AI resources: It realizes intra-enterprise sharing so as to improve efficiency.

8.2.4 How to Access ModelArts

The Huawei Cloud service platform provides a web-based management console as well as an Application Programming Interface (API) mode based on HTTPS requests. ModelArts can be accessed in the following three ways.

  1. Management Console Mode

    ModelArts provides a simple and easy-to-use management console, including functions such as automatic learning, data management, development environment, model training, model management, online deployment and AI market, so that AI development can be completed end to end in the management console.

    To use the ModelArts management console, you need to register a Huawei Cloud account first. After registering, you can click the hyperlink “EI Enterprise Intelligence → AI Services → EI Basic Platform → AI Development Platform ModelArts” on the Huawei Cloud home page, and then click the “enter console” button on the page that appears to log in to the management console directly.

  2. SDK Mode

    If you need to integrate ModelArts into a third-party system for secondary development, you can call the ModelArts SDK. The ModelArts SDK is a Python encapsulation of the REST APIs provided by the ModelArts service, which simplifies development work. For the specific operations of calling the ModelArts SDK and a detailed description of the SDK, please refer to the product help document “SDK Reference” on the official website of ModelArts (a general sketch of this calling pattern is given after this list).

    In addition, when writing code in the Notebook of the management console, you can call the ModelArts SDK directly.

  3. API Mode

    ModelArts can also be integrated into a third-party system for secondary development by calling the ModelArts APIs directly. For detailed operations and API descriptions, please see the product help document “API Overview” on the official website of ModelArts.
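
Whichever mode is used, both the SDK and the APIs ultimately issue authenticated HTTPS requests. The sketch below illustrates this general pattern: it obtains an IAM token with the standard password-authentication request and then calls a ModelArts REST endpoint with it. The endpoint hosts, account values and the training-job path are placeholders for illustration; the authoritative request formats are those in the “API Overview” document mentioned above.

```python
import requests

# Placeholder account values; take real ones from your Huawei Cloud account.
IAM_ENDPOINT = "https://iam.example-region.myhuaweicloud.com"        # hypothetical
MODELARTS_ENDPOINT = "https://modelarts.example-region.myhuaweicloud.com"
DOMAIN, USER, PASSWORD, PROJECT = "<domain>", "<user>", "<password>", "<project>"

def get_token():
    """Request an IAM token; it is returned in the X-Subject-Token header."""
    body = {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {"user": {"name": USER, "password": PASSWORD,
                                      "domain": {"name": DOMAIN}}},
            },
            "scope": {"project": {"name": PROJECT}},
        }
    }
    resp = requests.post(f"{IAM_ENDPOINT}/v3/auth/tokens", json=body, timeout=30)
    resp.raise_for_status()
    return resp.headers["X-Subject-Token"]

def list_training_jobs(token, project_id):
    """Example call: list training jobs (illustrative path, see 'API Overview')."""
    url = f"{MODELARTS_ENDPOINT}/v1/{project_id}/training-jobs"
    resp = requests.get(url, headers={"X-Auth-Token": token}, timeout=30)
    resp.raise_for_status()
    return resp.json()
```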

8.2.5 How to Use ModelArts

ModelArts is a one-stop development platform for AI developers. Through the whole process management of AI development, it helps developers create AI models intelligently and efficiently and deploy them to the end, edge and cloud with one click.

ModelArts not only supports the automatic learning function but also presets a variety of trained models and integrates Jupyter Notebook to provide an online code development environment.

Different groups of users can use ModelArts in different ways.

For business developers without AI development experience, ModelArts provides the automatic learning function, which can build an AI model with no prior AI background. Developers do not need to focus on development details such as model development or parameter tuning; just three steps (data annotation, automatic training and online deployment) are needed to complete an AI development project. The product help document “Best Practice” on the official website of ModelArts provides a sample called “Find Yunbao” (Yunbao is the mascot of Huawei Cloud), which helps business developers quickly get familiar with the ModelArts automatic learning process. The example is an “object detection” project: using the preset Yunbao image data set, a detection model is automatically trained and deployed as an online service. After deployment, users can use the online service to identify whether an input image contains Yunbao.

For AI beginners with some AI background, ModelArts provides preset algorithms based on mainstream engines in the industry. Learners do not need to pay attention to the model development process; they can directly use a preset algorithm to train on existing data and quickly deploy the result as a service. The preset algorithms provided by ModelArts in the AI market can be used for object detection, image classification and text classification.

The product help document “Best Practice” on the official website of ModelArts also provides an example of a flower image classification application, which helps AI beginners quickly get familiar with the process of building models with ModelArts preset algorithms. The example annotates the preset flower image data set, trains a model with the preset ResNet_v1_50 algorithm, and finally deploys the model as an online service. After deployment, users can identify the flower species in an input image through the online service.

For AI engineers who are familiar with code writing and debugging, ModelArts provides one-stop management capabilities, through which they can complete the whole AI process in one stop, from data preparation through model development and model training to model deployment. ModelArts is compatible with mainstream engines in the industry and with user habits. At the same time, it provides the self-developed MoXing deep learning framework to improve algorithm development efficiency and training speed.

The product help document “Best Practice” on the official website of ModelArts provides an example of using MXNet and Notebook to build a handwritten digit recognition application, which helps AI engineers quickly walk through the whole ModelArts AI development process.

MNIST is a handwritten digit recognition data set that is often used as an introductory example for deep learning. This example uses the native MXNet interface and a Notebook model training script (provided by ModelArts by default) to train on the MNIST data set, and then deploys the model as an online service. After deployment, users can use the online service to recognize the digits in an input picture.
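In practice the ModelArts-provided script should be used as-is. For readers who want to see what such training code looks like, the following minimal sketch trains a small MXNet Gluon network on MNIST locally; it is an illustration of the native MXNet interface, not the script that ModelArts ships.

# A minimal local training sketch using the MXNet Gluon API, assuming the
# mxnet package is installed; not the ModelArts default training script.
import mxnet as mx
from mxnet import gluon, autograd
from mxnet.gluon import nn

def transform(data, label):
    # Scale pixel values to [0, 1] and cast labels to float32.
    return data.astype('float32') / 255.0, label.astype('float32')

train_data = gluon.data.DataLoader(
    gluon.data.vision.MNIST(train=True, transform=transform),
    batch_size=64, shuffle=True)

# A small multilayer perceptron is enough to learn MNIST reasonably well.
net = nn.Sequential()
net.add(nn.Dense(128, activation='relu'), nn.Dense(10))
net.initialize(mx.init.Xavier())

loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})

for epoch in range(2):                       # a couple of epochs for illustration
    for data, label in train_data:
        data = data.reshape((-1, 784))       # flatten the 28x28 images
        with autograd.record():
            loss = loss_fn(net(data), label)
        loss.backward()
        trainer.step(data.shape[0])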

8.3 Huawei CLOUD EI Solutions

This section mainly introduces the application cases and solutions of Huawei Cloud EI.

8.3.1 OCR Service Enabling Whole-Process Automated Reimbursement

Huawei Cloud OCR service can be applied to financial reimbursement scenarios. It automatically extracts the key information from bills, helping employees fill in reimbursement forms automatically; combined with Robotic Process Automation (RPA), it can greatly improve the efficiency of financial reimbursement. Huawei Cloud bill OCR supports recognition of various bills such as VAT invoices, taxi invoices, train tickets, flight itineraries and shopping receipts. It can correct skewed and distorted pictures and effectively remove the impact of seals on character recognition, thereby improving recognition accuracy.

In financial reimbursement it is very common for one image to contain multiple bills. Generally, an OCR service can only identify one kind of bill; for example, the VAT invoice service can only identify a single VAT invoice. The Huawei Cloud OCR intelligent classification and identification service, however, supports segmentation of multiple invoice and card formats. It can recognize multiple tickets in one image, multiple cards in one image, and mixed cards and tickets, with charging based on the total. Combined with the individual OCR services, it can recognize many kinds of invoices and cards, including but not limited to air tickets, train tickets, medical invoices, driver’s licenses, bank cards, identity cards, passports and business licenses.

Financial personnel have to input invoice information into the system manually after receiving a batch of invoices; even with the Huawei Cloud OCR service, each invoice has to be photographed and uploaded to a computer or server. Huawei Cloud therefore provides a batch-scanning OCR recognition solution: with only a scanner and a PC, invoices are scanned in batches into color images, the Huawei Cloud OCR service is called automatically in batches to quickly extract the invoice information, and the recognition results can be visually compared against the originals. The results can also be exported in batches to Excel or a financial system, greatly simplifying the data entry process.
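As an illustration of this batch flow only, the following Python sketch walks a folder of scanned images, sends each one to an OCR endpoint and exports the results to a CSV file that can be opened in Excel; the endpoint URL, token and folder name are placeholders, and the real request format must be taken from the Huawei Cloud OCR API documentation.

# A sketch of the batch flow only: OCR_URL, TOKEN and the "scans" folder are
# placeholders, not real Huawei Cloud OCR values.
import base64
import csv
import json
import os
import urllib.request

OCR_URL = "https://ocr.example.com/v1/invoice"   # placeholder endpoint
TOKEN = os.environ.get("OCR_TOKEN", "")          # placeholder credential

def recognize(image_path):
    """Send one scanned invoice image and return the parsed JSON result."""
    with open(image_path, "rb") as f:
        payload = json.dumps({"image": base64.b64encode(f.read()).decode()})
    req = urllib.request.Request(
        OCR_URL, data=payload.encode(), method="POST",
        headers={"Content-Type": "application/json", "X-Auth-Token": TOKEN})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Batch-process a folder of scanned invoices and export the raw results to a
# CSV file, which can be opened in Excel or imported into a financial system.
with open("invoices.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["file", "result"])
    for name in sorted(os.listdir("scans")):
        if name.lower().endswith((".jpg", ".png")):
            result = recognize(os.path.join("scans", name))
            writer.writerow([name, json.dumps(result, ensure_ascii=False)])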

The solution has the following characteristics.

  1. 1.

    Multiple access methods: automatic connection to scanners for batch image acquisition; image capture with document cameras or mobile phones.

  2. 2.

    Flexible deployment modes: public cloud, HCS, all-in-one and other deployment modes are supported, with a unified standard API.

  3. 3.

    Applicable to all kinds of invoices: general/special/electronic VAT invoices, ETC invoices, vouchers, taxi fares, train tickets, itinerary sheets, quota invoices, tolls, etc.

  4. 4.

    Support for multiple invoices in one image: multiple invoices are automatically classified and recognized.

  5. 5.

    Visual comparison: location information is returned and results can be converted to Excel format, which is convenient for statistics and analysis.

The invoice reimbursement solution is shown in Fig. 8.20. The advantages of the solution are as follows: improving efficiency and reducing cost, optimizing operations, simplifying processes and enhancing compliance.

Fig. 8.20
figure 20

Invoice reimbursement solution

8.3.2 OCR Supporting Smart Logistics

When picking up items, couriers can take pictures of ID cards with mobile terminals (such as a mobile app), and the Huawei Cloud ID card recognition service automatically extracts the identity information. When filling in express information, users can upload an address screenshot, chat-record screenshot or other pictures, and OCR automatically extracts information such as name, telephone number and address to complete automatic entry. During transportation, OCR can also extract waybill information to support automatic sorting of parcels and to judge whether the information on the express face sheet is complete. The Huawei Cloud OCR service supports recognition of complex pictures taken from arbitrary angles, with uneven illumination or partial incompleteness, and offers a high recognition rate and good stability, which can greatly reduce labor costs and enhance user experience. The smart logistics solution is shown in Fig. 8.21.

Fig. 8.21
figure 21

Smart logistics solution

8.3.3 Conversational Bot

Usually, a single-function robot cannot solve all the problems in a customer’s business scenario. By integrating multiple robots with different functions, a joint conversational bot solution is created and presented as a single service interface, so customers need only call one interface to solve different business problems. The functional characteristics of each robot are as follows.

  1. 1.

    Applicable Scenarios of Intelligent QABot

    1. (a)

      Intelligent QABot can handle common consultation and help-seeking questions in fields such as IT, e-commerce, finance and government, where users frequently consult or seek help.

    2. (b)

      Intelligent QABot requires a knowledge reserve, such as a Q&A knowledge base, FAQs or similar documents, as well as work-order and customer-service Q&A data.

  2. 2.

    Applicable Scenarios of TaskBot

    1. (a)

      TaskBot handles clear conversational tasks. The conversational process (multi-round interaction) can be flexibly configured according to the actual business scenario. After loading a script template, TaskBot conducts multiple rounds of dialogue with customers by speech or text in the corresponding scene, understanding and recording customers’ intentions at the same time.

    2. (b)

      Outbound Bot: This kind of TaskBot can complete tasks such as customer-satisfaction return visits, verification of user information, recruitment appointments, express delivery notices, sales promotion and screening of high-quality customers.

    3. (c)

      Customer Service: This kind of TaskBot can complete tasks such as hotel reservation, air ticket reservation and credit card activation.

    4. (d)

      Intelligent Hardware: This kind of TaskBot can serve in many fields such as speech assistants and smart homes.

  3. 3.

    Applicable Scenarios of Knowledge Graph QABot

    1. (a)

      Complex knowledge system.

    2. (b)

      Answer requiring logical inference.

    3. (c)

      Multiple rounds of interaction.

    4. (d)

      Factual questions about the values of an entity’s attributes or the relationships between entities, which cannot be exhaustively enumerated.

    The features of the conversational bot are as follows.

    1. (a)

      Multi-robot intelligent integration, more comprehensive: multiple robots, each with its own strengths, perform self-learning and self-optimization, so the best answer can be recommended to customers.

    2. (b)

      Multiple rounds of intelligent guidance, better understanding: through multiple rounds of natural dialogue, the user’s intention can be accurately identified and the user’s underlying semantics understood.

    3. (c)

      Knowledge graph, smarter: a general-domain language model combined with a domain knowledge graph, dynamic updating of graph content, and a more intelligent robot built on the graph. The architecture of the conversational bot is shown in Fig. 8.22.

An intelligent QABot based on a knowledge graph can conduct accurate knowledge Q&A. For example, a vehicle conversational bot can be used to query the price and configuration of a specific vehicle model, recommend vehicles by price and level, compare vehicles, and return the corresponding information as text, tables or pictures. The vehicle conversational bot is shown in Fig. 8.23.
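As a toy illustration of how such a bot answers factual questions from a graph, the following Python sketch stores a few invented vehicle triples and answers attribute and recommendation queries over them; a production system would query a graph engine rather than an in-memory list.

# A toy triple store with a few invented vehicle facts.
TRIPLES = [
    ("ModelX", "price", "150,000 CNY"),
    ("ModelX", "level", "compact SUV"),
    ("ModelY", "price", "220,000 CNY"),
    ("ModelY", "level", "mid-size SUV"),
]

def query(entity, attribute):
    """Return all values recorded for an entity's attribute."""
    return [o for s, p, o in TRIPLES if s == entity and p == attribute]

def recommend_by_level(level):
    """Return vehicles whose recorded level matches the requested type."""
    return [s for s, p, o in TRIPLES if p == "level" and o == level]

print(query("ModelX", "price"))           # ['150,000 CNY']
print(recommend_by_level("compact SUV"))  # ['ModelX']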

Fig. 8.22
figure 22

Architecture of conversational bot

Fig. 8.23
figure 23

Vehicle conversational bot

8.3.4 A Case of Enterprise Intelligent Q&A in a Certain District

The enterprise intelligent question-answering system in a district of Shenzhen provides automatic responses by robots for the relevant businesses. Questions that the robots cannot answer directly are automatically recorded, and the subsequent manual answers are pushed back to the questioners. The system provides a complete closed-loop solution for unsolved problems, realizing a continuous optimization cycle of recording unsolved problems, forming knowledge from manual answers, and annotating and optimizing the model, so that the robots become more and more intelligent. The enterprise intelligent question-answering system is shown in Fig. 8.24.

Fig. 8.24
figure 24

Enterprise intelligent QA system

Business related to enterprise intelligent QA system mainly includes the following three categories.

  1. 1.

    Policy consultation (frequent policy changes).

  2. 2.

    Enterprise matters in office hall (more than 500 items).

  3. 3.

    Appeals (various types).

8.3.5 A Case in Genetic Knowledge Graph

The genetic knowledge graph includes various types of entities, such as genes, mutations, diseases and drugs, as well as the complex relationships between genes and mutations, mutations and diseases, and diseases and drugs. Based on this graph, the following functions can be realized.

  1. 1.

    Entity query: Based on the genetic knowledge graph, the information of an entity (gene, mutation, disease or drug) can be quickly retrieved.

  2. 2.

    Auxiliary diagnosis: Based on genetic testing results, possible mutations or diseases can be inferred from the graph, so as to give diagnosis and treatment suggestions and recommend drugs (a toy sketch of this inference is given after this list).

  3. 3.

    Gene testing report: Based on the structured or semi-structured data of gene entities and their associations with mutations and diseases, a readable gene testing report can be generated automatically.
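As a toy illustration of the auxiliary diagnosis item above, the following Python sketch walks invented mutation–disease–drug triples to suggest candidate diseases and drugs for a detected mutation; it is for illustration only and does not reflect the real graph engine or any medical knowledge.

# A toy multi-hop traversal over invented triples; illustration only.
GRAPH = [
    ("GeneA", "has_mutation", "mutation_m1"),
    ("mutation_m1", "associated_with", "disease_d1"),
    ("disease_d1", "treated_by", "drug_x"),
]

def neighbors(node, relation):
    """Return all targets reachable from node via the given relation."""
    return [o for s, r, o in GRAPH if s == node and r == relation]

def diagnose(mutation):
    """Follow mutation -> disease -> drug edges to suggest candidates."""
    suggestions = []
    for disease in neighbors(mutation, "associated_with"):
        for drug in neighbors(disease, "treated_by"):
            suggestions.append((disease, drug))
    return suggestions

print(diagnose("mutation_m1"))   # [('disease_d1', 'drug_x')]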

Genetic knowledge graph is shown in Fig. 8.25.

Fig. 8.25
figure 25

Genetic knowledge graph

8.3.6 Policy Query Based on Knowledge Graph

The government often issues incentive policies for enterprises, such as tax reduction and tax rebate policies. The contents of these policies are highly specialized, so ordinary people find them hard to understand and need professional interpretation.

There are many kinds of policies and reward categories. There are more than 300 conditions under which an enterprise can qualify for a policy, and the conditions of the same policy are combined with logical relations such as and, or and not. It is therefore very difficult for enterprises to quickly find out which policies they can enjoy.

By constructing a policy knowledge graph, all kinds of policy incentives and their qualification conditions are modeled in the graph. In addition, a knowledge graph of enterprise information can be built. Then, given only an enterprise name, the values of the enterprise’s qualification conditions, such as type, tax amount and scale, can be obtained automatically from the enterprise graph. Matching these condition values against the policy graph finally yields all the policies and rewards that the enterprise can enjoy. The policy query based on knowledge graph is shown in Fig. 8.26.
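The following Python sketch illustrates the condition-matching step only, with invented policy rules and enterprise attributes; it evaluates nested and/or/not condition trees against an enterprise’s attribute values to list the policies the enterprise qualifies for.

# Invented policy rules and enterprise attributes, used only to illustrate
# evaluating nested and/or/not qualification conditions.
def evaluate(rule, facts):
    """Recursively evaluate a nested (op, ...) condition tree against facts."""
    op = rule[0]
    if op == "and":
        return all(evaluate(r, facts) for r in rule[1:])
    if op == "or":
        return any(evaluate(r, facts) for r in rule[1:])
    if op == "not":
        return not evaluate(rule[1], facts)
    # Leaf condition: (attribute, comparator, value).
    attr, cmp, value = rule
    actual = facts.get(attr)
    if cmp == "==":
        return actual == value
    if cmp == ">=":
        return actual is not None and actual >= value
    return actual is not None and actual <= value   # "<="

POLICIES = {
    "R&D tax rebate": ("and", ("type", "==", "high-tech"),
                              ("annual_tax", ">=", 1_000_000)),
    "SME subsidy":    ("and", ("employees", "<=", 200),
                              ("not", ("type", "==", "state-owned"))),
}

enterprise = {"type": "high-tech", "annual_tax": 2_500_000, "employees": 120}
eligible = [name for name, rule in POLICIES.items() if evaluate(rule, enterprise)]
print(eligible)   # ['R&D tax rebate', 'SME subsidy']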

Fig. 8.26
figure 26

Policy query based on knowledge graph

8.3.7 A Case in Smart Park

Tian’an Cloud Valley is located in Banxuegang Science and Technology City in the central area of Shenzhen, covering a site area of 760,000 square meters with a total floor area of 2.89 million square meters. It focuses on new-generation information technologies such as cloud computing and the mobile Internet, as well as leading industries such as robotics and intelligent device R&D, with the related modern service and producer service industries developed around them. To meet the needs of these leading industries, Tian’an Cloud Valley provides open, shared space and intelligent environment construction, so as to create a smart industry-city ecosystem that fully connects enterprises and talents.

This project adopts an edge-cloud collaborative video analysis scheme. Video analysis models such as face recognition, vehicle recognition and intrusion detection are deployed on local GPU inference servers in the park. The real-time video streams are analyzed locally, and the analysis results can then be uploaded to the cloud or passed to the local upper-layer application systems.
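As a minimal illustration of the edge side of this scheme, the following Python sketch reads a camera stream with OpenCV, runs a placeholder local analysis function on each frame, and reports only detected events to a cloud endpoint; the stream URL, reporting endpoint and analysis model are placeholders, not Huawei components.

# The stream URL, reporting endpoint and analysis model below are placeholders;
# OpenCV (cv2) is assumed to be installed on the edge server.
import json
import urllib.request
import cv2

RTSP_URL = "rtsp://camera.example/stream1"        # placeholder camera stream
CLOUD_URL = "https://cloud.example/v1/events"     # placeholder reporting endpoint

def analyze_frame(frame):
    """Placeholder for the locally deployed GPU inference model."""
    return {"intrusion": False, "person_count": 0}

cap = cv2.VideoCapture(RTSP_URL)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    result = analyze_frame(frame)                 # analysis stays at the edge
    if result["intrusion"]:                       # only events are sent to the cloud
        req = urllib.request.Request(
            CLOUD_URL, data=json.dumps(result).encode(), method="POST",
            headers={"Content-Type": "application/json"})
        urllib.request.urlopen(req)
cap.release()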

With the edge-cloud collaborative video analysis scheme, the park realizes intelligent analysis of surveillance video and real-time perception of abnormal events such as intrusion and large crowd flows, thereby reducing the park’s labor cost. At the same time, the park’s existing IPC cameras can be turned into intelligent cameras through edge-cloud collaboration, which protects the users’ existing investment. The smart park is shown in Fig. 8.27.

Fig. 8.27
figure 27

Smart park

The end side uses ordinary high-definition IPC cameras, while the edge uses a GPU hardware server. The competitiveness and value of edge video analysis are as follows.

  1. 1.

    Business value: The park conducts intelligent analysis of surveillance video, with real-time detection of abnormal events such as intrusion and large crowd flows, reducing the park’s labor cost.

  2. 2.

    Edge-cloud collaboration: Edge applications have full life-cycle management, with seamless upgrading.

  3. 3.

    Cloud model training: Models are trained automatically in the cloud, with good algorithm scalability and easy updates.

  4. 4.

    Good compatibility: The park’s existing IPC cameras can be turned into intelligent cameras through edge-cloud collaboration.

8.3.8 A Case in Pedestrian Counting and Heat Map

Pedestrian counting and heat maps are mainly used to identify crowd information in the frame, including the number of people and the heat distribution of people across regions. User-defined time windows and result-sending intervals are supported. They are mainly applied to pedestrian counting, visitor counting and heat identification in commercial areas, as shown in Fig. 8.28.

Fig. 8.28
figure 28

Pedestrian counting and heat map

The following improvements can be achieved by using pedestrian counting and heat map.

  1. 1.

    Strong anti-interference: It supports pedestrian counting in complex scenes, for example when faces or bodies are partially occluded.

  2. 2.

    High scalability: It supports simultaneous output of pedestrian line-crossing statistics, regional statistics and heat map statistics.

  3. 3.

    Usability improvement: It can be connected to ordinary 1080p surveillance cameras.

8.3.9 A Case in Vehicle Recognition

Vehicle recognition is shown in Fig. 8.29. With vehicle recognition, the following improvements can be achieved.

  1. 1.

    Comprehensive scene coverage: It supports recognition of vehicle type, body color and license plate in scenes such as traffic-enforcement (electric police) cameras and checkpoints.

  2. 2.

    High ease of use: Connected to an ordinary 1080p surveillance camera, it can identify the vehicle information in the frame, including the license plate and vehicle attributes. It can recognize vehicle types such as cars and medium-sized vehicles, vehicle colors, and license plate types including blue plates and new-energy plates. It is mainly used in scenarios such as park vehicle management, parking lot vehicle management and vehicle tracking.

Fig. 8.29
figure 29

Vehicle recognition

8.3.10 A Case in Intrusion Identification

Intrusion identification is mainly used to identify illegal intrusion behavior in the frame. It supports extracting moving targets in the camera’s field of view, and an alarm is triggered when a target crosses the designated area. It also supports setting the minimum number of people in the alarm area, the alarm trigger time and the algorithm detection period. Intrusion detection is mainly used to identify illegal entry into key or dangerous areas and illegal climbing. Intrusion identification is shown in Fig. 8.30.

Fig. 8.30
figure 30

Intrusion identification

Using intrusion detection, the following improvements can be achieved.

  1. 1.

    High flexibility: It supports flexible settings for alarm target size and category.

  2. 2.

    Low false alarm rate: It supports intrusion alarms based on persons or vehicles, filtering out interference from other objects.

  3. 3.

    Usability improvement: It can be connected to ordinary 1080p surveillance cameras.

8.3.11 CNPC Cognitive Computing Platform: Reservoir Identification for Well Logging

With the completion and improvement of its integrated systems, CNPC has accumulated a large amount of structured and unstructured data. The structured data has been well used, but the unstructured data has not been fully exploited; the relevant knowledge accumulation and expert experience have not been fully utilized either, and the capability for intelligent data analysis and application is insufficient.

The unstructured data is characterized by large volume, many varieties and low value density.

Cognitive computing represents a new computing mode and an advanced stage in the development of artificial intelligence. It involves many technological innovations in information analysis, natural language processing and machine learning, and can help decision makers obtain valuable information from large amounts of unstructured data.

Using Huawei Cloud knowledge graph and NLP technology, CNPC has constructed a knowledge graph for the oil and gas industry and built upper-layer business applications on top of it (reservoir identification for well logging is one of the business scenarios; other scenarios include seismic horizon interpretation, water content prediction and working condition diagnosis). Finally, the following functions are realized.

  1. 1.

    Knowledge aggregation: The oil and gas knowledge graph consolidates the professional knowledge of the industry.

  2. 2.

    Cost reduction and efficiency enhancement: Based on the oil and gas knowledge graph, the upper-layer business applications simplify business processes and shorten working time.

  3. 3.

    Reserve growth and production improvement: Based on the oil and gas knowledge graph, the upper-layer business applications help increase proved reserves and ensure energy security.

The solution of reservoir identification for well logging has the following features.

  1. 1.

    Key links such as ontology, data source, information extraction, knowledge mapping and knowledge fusion can be flexibly modified and manually adjusted.

  2. 2.

    Simple knowledge reuse: New pipeline tasks can be created quickly and new graphs built from existing ontologies and data sources (see the sketch after this list).

  3. 3.

    Flexible modification with one-click effectiveness: Frequent and quick testing improves efficiency. Overall, the solution shortens the processing time by 70% and increases the coincidence rate by 5%. Reservoir identification for well logging is shown in Fig. 8.31.
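As a toy illustration of such pipeline reuse, the following Python sketch composes three placeholder stages (information extraction, knowledge mapping against an ontology, and knowledge fusion) into a pipeline that can be pointed at new data sources; the ontology, extraction rules and data are invented and much simpler than the real system.

# Three placeholder stages composed into a reusable pipeline; all data invented.
ONTOLOGY = {"Well": ["name", "depth"], "Reservoir": ["lithology", "porosity"]}

def extract(document):
    """Information extraction: pull candidate well entities out of raw text."""
    return [{"type": "Well", "name": tok}
            for tok in document.split() if tok.startswith("W-")]

def map_to_ontology(entities):
    """Knowledge mapping: keep only attributes defined in the ontology."""
    return [{k: v for k, v in e.items() if k == "type" or k in ONTOLOGY[e["type"]]}
            for e in entities]

def fuse(entity_lists):
    """Knowledge fusion: merge duplicate entities from different sources."""
    seen, merged = set(), []
    for e in (x for lst in entity_lists for x in lst):
        key = (e["type"], e.get("name"))
        if key not in seen:
            seen.add(key)
            merged.append(e)
    return merged

def run_pipeline(sources):
    # A new pipeline task is just the same stages pointed at new data sources.
    return fuse([map_to_ontology(extract(doc)) for doc in sources])

print(run_pipeline(["log of W-101 at 3200m", "report covering W-101 and W-205"]))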

Fig. 8.31
figure 31

CNPC cognitive computing platform—reservoir identification for well logging

8.4 Chapter Summary

This chapter first introduces the Huawei Cloud EI ecosystem and explains its related services. It then focuses on the Huawei EI basic platform, ModelArts, whose services users can explore further through the listed experiments. Finally, relevant cases of enterprise intelligence in practical applications are discussed.

It should be noted that Huawei is committed to lowering the threshold of AI application. To help AI enthusiasts better understand the Huawei Cloud EI application platform, the Huawei Cloud official website provides an EI experience space and an EI course training camp, as shown in Figs. 8.32 and 8.33.

Fig. 8.32
figure 32

EI experience space

Fig. 8.33
figure 33

EI course training camp

8.5 Exercises

  1. 1.

    Huawei Cloud EI is an enabler of enterprise intelligence. Based on AI and big data technologies, it provides an open, trusted and intelligent platform through cloud services (public cloud, dedicated cloud, etc.). What services does the Huawei Cloud EI service family currently include?

  2. 2.

    In the Huawei Cloud EI service family, the solutions for large-scale scenarios are called EI agents. What are they?

  3. 3.

    In the Huawei Cloud EI service family, what does the EI basic platform consist of?

  4. 4.

    ModelArts belongs to the EI basic platform in the Huawei Cloud EI service family. It is a one-stop development platform for AI developers. What functions does it have?

  5. 5.

    As a one-stop AI development platform, what are the advantages of ModelArts?