Cloud computing builds on and integrates a wide variety of existing information technologies. It is technology-centric, and its architecture comprises a variety of technical mechanisms. This chapter discusses the main technical mechanisms underpinning cloud technology architecture from four aspects: cloud infrastructure mechanisms, cloud management mechanisms, cloud security mechanisms, and basic cloud architecture.

2.1 Cloud Infrastructure Mechanism

Just as a house has foundations, walls or columns, slabs and floors, stairs, a roof, doors, windows, and other main building components, the cloud computing environment, as a complex integrated system of IT resources, also has basic building blocks, called cloud infrastructure. These facilities are the basis for building a cloud environment. This section introduces five main cloud infrastructure mechanisms: the logical network boundary, the virtual server, the cloud storage device, cloud usage monitoring, and resource replication.

2.1.1 Logical Network Boundary

The logical network boundary can be regarded as a relatively independent area in which firewalls and other network devices isolate a network environment from the rest of the network, forming a virtual network boundary.

The logical network boundary isolates a group of related cloud-based IT resources from other entities in the cloud (such as unauthorized users). The resources in the logical network boundary may also be physically distributed in different areas.

Providing such a virtualized isolation mechanism requires corresponding network devices, which usually include the following.

  • Virtual firewall: actively filters the network traffic of the isolated network and controls its interaction with the Internet.

  • Virtual network: generally formed by a virtual local area network (VLAN), used to isolate a network environment within the data center infrastructure.

Logical network boundaries are generally deployed as virtualized IT environments.

Figure 2.1 shows two logical network boundaries: one is the internal enterprise environment containing the cloud users, and the other is the cloud environment belonging to the cloud service provider. The two logical network boundaries are connected through the Internet or a virtual private network (VPN). The advantage of using a VPN is that the data exchanged between the two parties can be encrypted, protecting the communication content.

Fig. 2.1

Two logical network boundaries including cloud users and cloud service provider network environments

The logical network boundary’s main function is network segmentation and isolation, ensuring the relative independence of the IT facilities within the area. Usually, the network infrastructure must be virtualized before the logical network boundary can form a logical network layout; a virtual firewall based on a physical firewall is then used for network isolation. For example, an enterprise may divide its entire network into an internal network and an external network. The internal network is a virtual network isolated by the internal firewall; hosts in it can only access resources within the internal network, so data in the internal data center cannot reach the external network or the Internet. The external network, located outside the external firewall, is directly connected to the Internet. The area between the external firewall and the internal firewall is called the demilitarized zone (DMZ). The DMZ is abstracted as a virtual network that generally includes a proxy server and a Web server: the proxy server coordinates access to common network services (DNS, E-mail, and Web), while the Web server provides webpage access. The enterprise’s network is thus divided into three logical network boundaries: extranet, intranet, and DMZ.
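As an illustration, this three-zone division can be modeled as a small rule table (the zone names and allowed flows below are assumptions for illustration, not a real firewall configuration):

```python
# Illustrative model of the extranet / DMZ / intranet layout described above.
# The pair of firewalls is modeled as a table of allowed (source, destination) flows.

ALLOWED_FLOWS = {
    ("extranet", "dmz"),      # external firewall lets Internet traffic reach the DMZ
    ("dmz", "extranet"),      # DMZ servers may answer Internet requests
    ("intranet", "dmz"),      # internal hosts may use the proxy and Web server
    ("dmz", "intranet"),      # internal firewall lets the proxy reach internal services
}

def is_allowed(src_zone: str, dst_zone: str) -> bool:
    """Return True if traffic may cross the logical network boundaries."""
    if src_zone == dst_zone:          # traffic inside one boundary is unrestricted
        return True
    return (src_zone, dst_zone) in ALLOWED_FLOWS

print(is_allowed("extranet", "dmz"))       # True: Internet can reach the DMZ Web server
print(is_allowed("extranet", "intranet"))  # False: Internet may not reach the intranet
```

A real virtual firewall enforces far richer policies (ports, protocols, stateful inspection), but the boundary-crossing decision has this basic shape.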

2.1.2 Virtual Server

A virtual server is a server virtualized on a physical server by virtualization software. Virtualization prevents any single application from monopolizing the physical resources, and the same physical server can host multiple virtual server instances. By providing independent virtual server instances to cloud users, cloud service providers can share limited physical servers and other IT resources among many users and improve resource utilization.

The virtual server is the most basic building block of a cloud environment. Each virtual server is allocated IT resources such as CPU, memory, external storage, and network. To make it easy for users to create virtual server instances, cloud service providers or users usually prepare customized virtual server images in advance. As mentioned earlier, the products listed for the Huawei Elastic Cloud Server (ECS) can be regarded as customized virtual server images: each product generally specifies the number of virtual CPUs in its corresponding virtual server instance, along with specific performance indicators such as clock speed, memory, and network bandwidth. Creating a virtual server instance from an image file is a resource allocation process that can be completed quickly and on demand.
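The on-demand allocation step can be sketched as follows (the image names, fields, and `create_instance` helper are hypothetical illustrations, not Huawei ECS’s actual API):

```python
import copy

# Hypothetical catalog of customized virtual server images.
IMAGES = {
    "general.small": {"vcpus": 2, "memory_gb": 4, "bandwidth_gbit": 1},
    "general.large": {"vcpus": 8, "memory_gb": 16, "bandwidth_gbit": 3},
}

def create_instance(image_name: str, owner: str) -> dict:
    """Allocate a virtual server instance on demand from a stored image."""
    spec = copy.deepcopy(IMAGES[image_name])   # the image itself is never modified
    spec.update({"image": image_name, "owner": owner, "state": "running"})
    return spec

vm = create_instance("general.small", "alice")
print(vm["vcpus"], vm["state"])   # 2 running
```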

By creating or releasing virtual servers, cloud users can customize their own environments, which remain independent of those of other cloud users whose virtual servers run on the same underlying physical server.

The two virtual servers in Fig. 2.2 are based on the same physical server, and they can provide services for different users.

Fig. 2.2

Two virtual servers based on the same physical server can provide services to different users

Virtual machine monitor software usually runs on the physical server to control and manage the virtual servers. A virtual infrastructure manager (VIM), also known as a virtual device manager, coordinates the work involved in creating virtual server instances. Figure 2.3 shows several virtual servers running on physical servers; they are created by the central VIM and are controlled and managed by a virtual machine monitor.

Fig. 2.3

The virtual machine monitor controls and manages virtual servers running on physical servers

In this book, the terms virtual server and virtual machine are synonymous. To create a virtual machine quickly, a pre-customized virtual machine image (VM image) is usually used.

Virtual machine images are usually stored as files in common cloud storage devices. Cloud service providers or cloud users can prepare virtual machine images with different hardware and performance specifications for users to choose from according to business needs. For example, on Huawei Cloud’s website, the Elastic Cloud Server (ECS) offers many types of virtual machine images for different user needs, covering general computing, memory-intensive, high-performance computing, big data, and computing acceleration scenarios. Each category contains many sub-categories with different CPU/memory ratios, virtual CPU ranges, base/turbo frequencies, network speeds, applicable scenarios, and other performance or product parameters. Cloud users can choose the product that best fits their needs.

2.1.3 Cloud Storage Devices

Cloud storage devices refer to storage devices that are used to provide cloud services. These physical storage devices are usually virtualized to offer services to cloud users in the form of virtual storage devices, just as physical servers are virtualized into virtual servers (virtual machines). Generally, cloud storage devices support remote access by cloud users.

Since cloud services are usually billed according to usage, cloud storage devices need to support a pay-per-use mechanism. When a cloud user and a cloud service provider negotiate a Service-Level Agreement (SLA), the user’s initial and maximum cloud storage capacities are usually determined. When capacity needs to grow, cloud storage devices generally expand the allocation in fixed increments, such as 1 GB at a time. Once the maximum capacity limit is reached, further expansion requests are rejected, and the cloud user must apply for a new, larger storage quota.
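A minimal sketch of this fixed-increment, SLA-capped expansion logic (the class and field names are illustrative assumptions):

```python
class CloudStorageAllocation:
    """Toy model of SLA-bounded, fixed-increment capacity allocation."""

    def __init__(self, initial_gb: int, max_gb: int, increment_gb: int = 1):
        self.capacity_gb = initial_gb     # initial capacity agreed in the SLA
        self.max_gb = max_gb              # maximum capacity agreed in the SLA
        self.increment_gb = increment_gb  # fixed expansion step, e.g. 1 GB

    def expand(self) -> bool:
        """Grow by one fixed increment; reject once the SLA ceiling is reached."""
        if self.capacity_gb + self.increment_gb > self.max_gb:
            return False                  # user must negotiate a larger quota
        self.capacity_gb += self.increment_gb
        return True

store = CloudStorageAllocation(initial_gb=10, max_gb=12)
print(store.expand(), store.expand(), store.expand())  # True True False
print(store.capacity_gb)                               # 12
```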

The main issues related to cloud storage are data security, integrity, and confidentiality. If a user uses a public cloud service, its data is stored in the cloud service provider’s cloud storage devices, which are managed and operated by the provider. Cloud users have only access rights to these data, not control rights, so there is no inherent guarantee that the data will not be viewed, tampered with, or deleted. Corresponding mechanisms are therefore needed to protect this data. For example, data can be encrypted before being saved to the cloud storage device; or divided into blocks whose order is shuffled before storage; or stored and restored using special redundant coding mechanisms such as erasure codes and regenerating codes; or its integrity and validity can be verified through special mechanisms such as data ownership verification schemes. Research in this area is a hot spot in the field of cloud computing: if the security, integrity, and confidentiality of data stored in the cloud cannot be guaranteed, cloud computing loses its appeal. Data protection deserves particular attention when data is entrusted to external cloud providers and other third parties. In addition, with the emergence of cross-regional and cross-border cloud services, legal and regulatory issues may arise when data is migrated across regions or national borders.

Cloud storage devices can be divided into four categories according to data storage levels: files (File), blocks (Block), data sets (Dataset), and objects (Object). There are three types of corresponding storage interfaces: network storage interface, object storage interface, and database storage interface.

  1. Cloud storage level

    The cloud storage device mechanism provides the following standard data storage levels.

    (a) File

      A file refers to a collection of data stored on a permanent storage device such as a hard disk. The file can be a text document, picture, program, etc. Files usually have a file extension to indicate the file type. File storage provides external services at the file system level, and the main operation objects are files and folders. Access to data in file operations is usually sequential.

    (b) Block

      Block storage divides data into fixed-length data blocks, which are directly stored on permanent storage devices such as hard disks in units of data blocks. Compared with file storage, block storage does not have the concept of files and directory trees. A data block is the smallest unit of data that can be accessed independently, and its size is generally fixed. For example, the Hadoop Distributed File System (HDFS) data block length is 64 MB by default. Block storage can be regarded as the lowest storage level closest to hardware and has higher storage efficiency.

    (c) Data set

      Data sets generally refer to table-based data in a relational database. These data consist of records, each containing fields separated by delimiters. They are usually stored in a relational database and can be queried, inserted, modified, and deleted with Structured Query Language (SQL) statements. In the big data era, non-relational (NoSQL) databases emerged as a data set storage type that improves the storage efficiency of big data.

    (d) Object

      Object storage manages data in the form of objects. The biggest difference between an object and a file is that metadata (i.e., data describing data) is added to the file. In general, an object has three parts: data, metadata, and an object ID (object identifier). Object data is usually unstructured, such as pictures, videos, or documents. The metadata of an object is a description of the object, such as a picture’s size or a document’s owner. The object ID is a globally unique identifier used to distinguish objects. Object storage generally corresponds to Web-based resources, and stored data can be managed with the HTTP CRUD operations: add (Create), read (Retrieve), update (Update), and delete (Delete).

      Each data storage level is typically associated with a type of storage interface that corresponds not only to a specific type of cloud storage device but also to the access protocol it uses (see Fig. 2.4).

  2. Network storage interface

    The network storage interface category mainly covers file storage and block storage. It includes storage devices that comply with industry-standard protocols, such as the Small Computer System Interface (SCSI) for block storage, and the Server Message Block (SMB), Common Internet File System (CIFS), and Network File System (NFS) protocols for file storage.

    The main operation objects of file storage are files and folders, and the protocols used are NFS and CIFS, which comply with the POSIX (Portable Operating System Interface) standards. POSIX is a set of standards developed by IEEE and ISO/IEC. Based on existing UNIX practice and experience, it describes the operating system’s call service interface, ensuring that applications can be ported across multiple operating systems at the source-code level. Taking NFS as an example, file-related interfaces include lookup (LOOKUP), access (ACCESS), read (READ), write (WRITE), create (CREATE), delete (REMOVE), and rename (RENAME); folder-related interfaces include creating a folder (MKDIR), deleting a folder (RMDIR), and reading a folder (READDIR). In addition, interfaces such as FSSTAT and FSINFO provide file-system-level information.

    SCSI is representative of the protocols used by block storage. It is a system-level interface standard between computers and their peripheral devices (such as hard disks, optical drives, printers, and scanners), independent of the processor. The SCSI standard defines commands, communication protocols, and electrical characteristics, and is mainly used in storage devices such as hard disks and tape drives. The main SCSI interfaces are READ, WRITE, READ CAPACITY, INQUIRY, etc. FC (Fibre Channel) and iSCSI (Internet Small Computer System Interface) are also block storage protocols. Compared with file storage, block storage has no concept of files and directory trees; it usually does not define disk creation and deletion operations and pays more attention to transmission control efficiency.

    The data search and retrieval performance of file storage is usually not optimal. Block storage requires data to be in a fixed format (the data block), which is the format closest to hardware and the smallest unit of storage and access. Compared with file storage, block storage usually delivers better performance.

  3. Object storage interface

    Various types of data can be referenced and stored as Web resources; this is object storage. It is based on technologies that support multiple data and media types. A cloud storage device mechanism that implements this interface can usually be accessed through REST, with HTTP as the main protocol, or through cloud services based on Web services.

    The main operation object of object storage is the object. Taking Amazon’s S3 storage as an example, the main interfaces are upload (PUT), download (GET), and delete (DELETE). Object storage has no random read/write interface and no concept of a directory tree, and the protocols it uses, such as HTTP, focus on simplicity and efficiency.

  4. Database storage interface

    In addition to basic storage operations, cloud storage device mechanisms based on database storage interfaces usually support query languages and implement storage management through standard APIs or management user interfaces.

    According to the storage structure, there are two types of this storage interface.

    (a) Relational database

      Relational databases use tables to organize related data into rows and columns. Usually, it uses industry-standard SQL to operate the database. Most commercial database products use relational database management systems. Therefore, the cloud storage device mechanism implemented by using relational data storage can be based on many existing commercial database products.

      The challenges of cloud-based relational databases mainly concern scaling and performance. It is difficult to scale a relational cloud storage device horizontally, and vertical scaling is limited by the capacity of a single machine. When accessed remotely by cloud services, complex relational databases and databases containing large amounts of data incur higher processing costs and longer delays.

      For large databases, its performance is also closely related to the location of data storage. Local data storage is better than storage on a Wide Area Network (WAN) in terms of network reliability and latency.

    (b) Non-relational database

      Traditionally, data set storage corresponds to relational databases. In the era of big data, because relational databases are not easy to scale horizontally and their storage efficiency is low, non-relational databases, often referred to as NoSQL, are also widely used. One of the main features of NoSQL databases is the removal of the relational characteristics of relational databases: compared with traditional relational databases, they use a looser structure to store data and place no emphasis on defining relationships or normalizing data. The resulting databases have a simple structure and are very easy to scale out. NoSQL databases generally offer very high read/write performance, especially on large amounts of data.

      The main motivation for using non-relational storage is to avoid the potential complexity and processing costs of relational databases. At the same time, non-relational storage scales horizontally more easily than relational storage.

      Non-relational storage generally does not support relational database features such as transactions and joins. The stored data is usually non-standardized, which limits its portability.
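The object storage interface described above can be sketched with a toy in-memory store holding data, metadata, and object IDs (the `ObjectStore` class is an illustrative assumption; real services such as Amazon S3 expose the same create/read/update/delete operations over HTTP):

```python
import uuid

class ObjectStore:
    """Toy object store: each object holds data, metadata, and a unique object ID."""

    def __init__(self):
        self._objects = {}

    def create(self, data: bytes, metadata: dict) -> str:     # HTTP PUT
        object_id = str(uuid.uuid4())                         # globally unique ID
        self._objects[object_id] = {"data": data, "metadata": metadata}
        return object_id

    def retrieve(self, object_id: str) -> dict:               # HTTP GET
        return self._objects[object_id]

    def update(self, object_id: str, data: bytes) -> None:    # HTTP PUT (overwrite)
        self._objects[object_id]["data"] = data

    def delete(self, object_id: str) -> None:                 # HTTP DELETE
        del self._objects[object_id]

store = ObjectStore()
oid = store.create(b"hello", {"owner": "alice", "type": "text"})
store.update(oid, b"hello, cloud")
print(store.retrieve(oid)["data"])   # b'hello, cloud'
store.delete(oid)
```

Note the contrast with the file and block levels: the only handle to the data is its object ID, and there is no directory tree or random in-place write.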

Fig. 2.4

Use different access interface technologies to operate virtualized cloud storage devices

2.1.4 Cloud Usage Monitoring

Because cloud services are usually billed according to usage, a cloud computing system needs a mechanism for monitoring and measuring cloud users’ resource usage. A cloud usage monitor is a lightweight, autonomous software program used to collect and process data on cloud users’ usage of IT resources.

Depending on the type of usage indicators to be collected and the method of collecting usage data, cloud usage monitors can take different forms. The following are three common agent-based forms, each of which sends the collected usage data to a log database for subsequent processing and reporting.

  1. Monitoring agent

    The monitoring agent is an intermediate, event-driven program that resides on an existing communication path as a service agent and transparently monitors and analyzes data flows. This type of cloud usage monitor is usually used to measure network traffic and message metrics (see Fig. 2.5).

    In this example, the cloud user sends a request message to the cloud service and the monitoring agent intercepts it. The agent forwards the request message to the cloud service while collecting the relevant usage data and storing it in the log database. After the cloud service receives the request message, it returns a response message, which the monitoring agent does not intercept.

  2. Resource agent

    A resource agent is a processing module that collects usage data by interacting with specialized resource software in an event-driven manner. Working on top of the resource software, it monitors usage indicators for predefined, observable events, such as the start, pause, resume, and vertical scaling of each entity. An example of a resource agent is shown in Fig. 2.6.

    In this example, the resource agent actively monitors the virtual server and detects an increase in resource usage. Specifically, the resource agent receives notifications from the underlying resource management program as the virtual server is scaled up to handle a growing volume of user requests. The resource agent stores the collected usage data in the log database according to its monitoring indicators.

  3. Polling agent

    A polling agent is a processing module that collects usage data by polling IT resources. It is usually used to periodically monitor the status of IT resources, such as uptime and downtime.

    The polling agent in Fig. 2.7 monitors the status of cloud services on the virtual server by sending polling messages periodically. When the polling response changes (e.g., when the usage status changes from A to B), the agent records the new usage status in the log database.
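A minimal sketch of this polling behavior, logging only status changes (the class and in-memory log are illustrative assumptions):

```python
class PollingAgent:
    """Toy polling agent: records a log entry only when the polled status changes."""

    def __init__(self, poll_fn):
        self.poll_fn = poll_fn      # callable returning the resource's current status
        self.last_status = None
        self.log = []               # stands in for the log database

    def poll_once(self, timestamp: int) -> None:
        status = self.poll_fn()
        if status != self.last_status:          # e.g. status changes from A to B
            self.log.append((timestamp, status))
            self.last_status = status

# Simulated polling responses returned by the monitored cloud service.
statuses = iter(["up", "up", "down", "down", "up"])
agent = PollingAgent(lambda: next(statuses))
for t in range(5):                  # periodic polling messages
    agent.poll_once(t)
print(agent.log)                    # [(0, 'up'), (2, 'down'), (4, 'up')]
```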

Fig. 2.5

Example of monitoring agent

Fig. 2.6

Example of resource agent

Fig. 2.7

Example of polling agent

2.1.5 Resource Replication

The resource replication mentioned here refers to using customized resource templates (such as virtual machine images) to create multiple virtual server instances. This is usually performed when the availability and performance of IT resources need to be enhanced.

The resource replication mechanism uses virtualization technology to realize the replication of cloud-based IT resources. In Fig. 2.8, the virtual machine monitor uses the stored virtual server image to replicate multiple virtual server instances.
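The replication step can be sketched as deep-copying a stored template so that each instance is independent (the template fields are illustrative assumptions):

```python
import copy

# Stored virtual server image (template) used for replication.
TEMPLATE = {"os": "linux", "vcpus": 4, "memory_gb": 8}

def replicate(template: dict, count: int) -> list:
    """Replicate multiple identical virtual server instances from one image."""
    instances = []
    for i in range(count):
        instance = copy.deepcopy(template)   # each copy is independent
        instance["instance_id"] = i
        instances.append(instance)
    return instances

replicas = replicate(TEMPLATE, 3)
replicas[0]["memory_gb"] = 16                # changing one replica...
print(replicas[1]["memory_gb"])              # 8: ...does not affect the others
```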

Fig. 2.8

The virtual machine monitor uses the stored virtual server image to replicate multiple virtual server instances

2.2 Cloud Management Mechanism

Cloud-based IT resources need to be established, configured, maintained, and monitored. This section mainly introduces the system that contains these mechanisms and can accomplish these management tasks. They promote the control and evolution of IT resources that form cloud platforms and solutions, thus forming a key part of cloud technology architecture.

The management mechanisms or systems introduced in this section mainly fall into four categories: Remote Administration, Resource Management System, SLA Management System, and Billing Management System. These systems usually provide integrated APIs and can be provided to users in the form of individual products, customized applications, various combined product packages, or multi-functional applications.

2.2.1 Remote Management System

The remote management system provides external cloud resource managers with tools and user interfaces to configure and manage cloud-based IT resources.

The remote management system will establish an entrance to access the control and management functions of various underlying systems, including resource management systems, SLA management systems, and billing management systems (see Fig. 2.9).

Fig. 2.9

Centralized management and control of external cloud resource managers through the remote management system

Cloud providers generally use the tools and APIs provided by the remote management system to develop and customize online portals, which offer cloud users various management and control functions.

The remote management system mainly creates the following two types of entrances.

  1. Usage and administration portal

    A universal portal that centrally manages different cloud-based IT resources and provides IT resource usage reports. This portal is an integral part of many cloud technology architectures.

  2. Self-service portal

    A portal that allows cloud users to browse lists of the latest cloud services and IT resources (usually available for rent) offered by cloud service providers. Cloud users then submit their choices to the cloud service provider for resource allocation.

    Figure 2.10 shows an example of a cloud resource manager using the two portals to access the remote management system. The cloud resource manager requests a new cloud service from the cloud service provider through the self-service portal; once the service is provided, the cloud resource manager completes its configuration through the usage and administration portal.

Fig. 2.10

Cloud resource managers use two types of portals

Through the remote management console, cloud users (cloud resource managers) can perform the following tasks.

  • Configure and establish cloud services.

  • Provide and release IT resources for on-demand cloud services.

  • Monitor the status, usage, and performance of cloud services.

  • Monitor the implementation of QoS and SLA.

  • Manage rental costs and usage expenses.

  • Manage user accounts, security credentials, authorization, and access control.

  • Track internal and external access to rental services.

  • Plan and evaluate the supply of IT resources.

  • Capacity planning.

2.2.2 Resource Management System

The resource management system helps coordinate IT resources in response to management operations performed by cloud users and cloud service providers. Its core is the VIM, which coordinates the server hardware so that virtual server instances can be created on the most suitable underlying physical servers. The VIM is a commercial product used to manage a range of IT resources across multiple physical servers.
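The “most suitable underlying physical server” decision can be sketched as a simple capacity-based placement rule (a toy heuristic over assumed host data; real VIMs weigh many more factors such as memory, affinity, and load history):

```python
# Toy placement: the VIM picks the physical server with the most free capacity
# that can still fit the requested virtual server.

hosts = [
    {"name": "host-a", "free_vcpus": 2},
    {"name": "host-b", "free_vcpus": 12},
    {"name": "host-c", "free_vcpus": 6},
]

def place_virtual_server(required_vcpus: int):
    """Return the most suitable host's name, or None if no host has enough capacity."""
    candidates = [h for h in hosts if h["free_vcpus"] >= required_vcpus]
    if not candidates:
        return None
    best = max(candidates, key=lambda h: h["free_vcpus"])
    best["free_vcpus"] -= required_vcpus     # allocate the capacity
    return best["name"]

print(place_virtual_server(4))    # host-b (most free capacity)
print(place_virtual_server(16))   # None (no host can fit the request)
```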

The tasks that are usually automated and realized by the resource management system are as follows:

  • Manage virtual IT resource templates, such as virtual server images.

  • Allocate, release, and coordinate virtual IT resources in the available physical infrastructure.

  • Monitor the operating conditions of IT resources and enforce usage policies and security regulations.

Figure 2.11 shows two different access methods for resource management. The cloud user’s cloud resource manager accesses the usage and administration portal from the outside (① in the figure), while the cloud service provider’s cloud resource manager uses the local user interface provided by the VIM to perform internal resource management tasks (② in the figure).

Fig. 2.11

Two different access methods for resource management

2.2.3 SLA Management System

An SLA is a contract between a network service provider and a customer that defines the service type, service quality, and customer payment.

The SLA management system represents a series of commercially available cloud management products, and its functions include the management, collection, storage, reporting, and runtime notification of SLA data.

A deployed SLA management system often includes a library (the QoS measurement library) for storing and retrieving collected SLA data based on predefined indicators and report parameters. Collecting SLA data also requires one or more SLA monitoring agents; through these agents, cloud resource managers can query and obtain the data in near real time via the usage and administration portal.

Figure 2.12 shows the SLA-related interaction between cloud users and cloud services. First, the cloud user sends an interaction message to the cloud service; the SLA monitoring agent intercepts the message, evaluates the interaction, and collects relevant runtime data related to the service quality guarantees defined in the cloud service’s SLA. Then the SLA monitoring agent stores the collected data in the QoS measurement library, which is part of the SLA management system. Finally, external cloud resource managers can issue queries and generate reports through the usage and administration portal, while internal cloud resource managers can query the SLA management system directly.
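The agent’s role can be sketched as follows (the class, the response-time metric, and the threshold are illustrative assumptions; real SLA monitors track many more indicators):

```python
class SLAMonitoringAgent:
    """Toy SLA monitor: intercepts interactions and records response-time metrics."""

    def __init__(self, sla_max_ms: float):
        self.sla_max_ms = sla_max_ms   # guarantee defined in the SLA
        self.qos_library = []          # stands in for the QoS measurement library

    def intercept(self, request_id: str, response_time_ms: float) -> None:
        """Evaluate one interaction and store the collected runtime data."""
        self.qos_library.append({
            "request": request_id,
            "ms": response_time_ms,
            "sla_met": response_time_ms <= self.sla_max_ms,
        })

    def compliance_report(self) -> float:
        """Fraction of interactions that met the SLA guarantee."""
        met = sum(1 for r in self.qos_library if r["sla_met"])
        return met / len(self.qos_library)

agent = SLAMonitoringAgent(sla_max_ms=200)
for req, ms in [("r1", 120), ("r2", 250), ("r3", 180), ("r4", 90)]:
    agent.intercept(req, ms)
print(agent.compliance_report())   # 0.75
```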

Fig. 2.12

SLA-related interaction between cloud users and cloud services

2.2.4 Billing Management System

The billing management system is dedicated to collecting and processing cloud users’ usage data. It handles settlement for cloud service providers and billing for cloud users. Specifically, the billing management system relies on pay-per-use monitors to collect usage data at run time; these data are stored in a library (the pay-per-use measurement library) inside the system.

Figure 2.13 shows the billing-related interaction between cloud users and cloud services. When a cloud user accesses a cloud service, a pay-per-use monitor tracks usage, collects billing-related data, and sends it to the library in the billing management system. The system periodically calculates usage fees and generates invoices for users. Invoices can be delivered to cloud users or cloud resource managers through the usage and administration portal.
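A minimal sketch of this flow (class and field names are illustrative assumptions; real billing systems support tiered pricing, discounts, and multiple meters):

```python
class BillingManagementSystem:
    """Toy pay-per-use billing: a monitor feeds usage records, invoices are derived."""

    def __init__(self, price_per_unit: float):
        self.price_per_unit = price_per_unit
        self.usage_library = {}   # stands in for the pay-per-use measurement library

    def record_usage(self, user: str, units: float) -> None:
        """Called by the pay-per-use monitor for each tracked interaction."""
        self.usage_library[user] = self.usage_library.get(user, 0.0) + units

    def generate_invoice(self, user: str) -> float:
        """Periodically compute the usage fee for one user."""
        return self.usage_library.get(user, 0.0) * self.price_per_unit

billing = BillingManagementSystem(price_per_unit=0.5)
billing.record_usage("alice", 100)    # e.g. 100 metered usage units
billing.record_usage("alice", 40)
print(billing.generate_invoice("alice"))   # 70.0
```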

Fig. 2.13

Billing interaction between cloud users and cloud services

2.3 Cloud Security Mechanism

When users move their business data into the cloud, cloud service providers must bear responsibility for the security of that data. The remote use of IT resources requires cloud users to extend their trust boundary to the external cloud they use. It is very difficult to establish a security architecture spanning such a trust boundary without introducing vulnerabilities. This section introduces a set of basic cloud security mechanisms to counter the security threats that cloud services may encounter.

2.3.1 Encryption

An encryption mechanism is a digital coding system used to protect data and ensure its confidentiality and integrity. It converts plaintext, that is, data encoded in a readable format, into a protected, unreadable format.

Encryption usually relies on a standardized algorithm called a cipher to convert the original plaintext into ciphertext, the encrypted data. Without the secret key, an attacker who obtains the ciphertext generally cannot recover the original plaintext, short of sophisticated cryptanalysis, and learns only metadata such as the length of the message and its creation date.

Encrypting the plaintext requires a key, which is a secret established and shared by authorized parties. The key is also needed to decrypt the ciphertext.

Encryption mechanisms help counter security threats such as traffic eavesdropping, malicious intermediaries, insufficient authorization, and overlapping trust boundaries.

There are two common forms of encryption: symmetric encryption and asymmetric encryption.

Figure 2.14 shows an example of using encryption mechanisms to defend against malicious service agent attacks. After the document is encrypted, the malicious service agent cannot obtain data from the encrypted message.

  1. Symmetric encryption

    Symmetric encryption uses the same key for encryption and decryption. Both processes are performed by authorized parties using the shared key; that is, a message encrypted with a specific key can only be decrypted with that same key.

    Symmetric encryption provides no non-repudiation: if more than one party owns the key, it is impossible to determine which party encrypted or decrypted a given message.

  2. Asymmetric encryption

    Asymmetric encryption uses two keys, Public Key and Private Key. In asymmetric encryption, only the owner has the private key, and the public key is generally publicly available. A document encrypted with a certain private key can only be decrypted correctly with the corresponding public key. Similarly, a document encrypted with a certain public key can only be decrypted with the corresponding private key.

    Because the computations involved are more complex, asymmetric encryption is generally much slower than symmetric encryption, but encryption with a private key can provide authenticity, non-repudiation, and integrity protection. The private key is held only by its owner: if a document is encrypted with the user’s private key, the user cannot deny having sent it, and since the document can only be decrypted with the user’s public key, a third party cannot tamper with its content undetected. Similarly, a message encrypted with the public key can only be decrypted by the legitimate holder of the private key, so a third party cannot snoop on its content, which provides confidentiality protection for the data.

Fig. 2.14

After encryption, the content of the document remains confidential through the untrusted transmission channel
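The public/private key relationship described above can be sketched with textbook RSA. This is a toy with tiny primes and no padding, purely to illustrate that what one key encrypts, only the other key decrypts; it is not a usable cipher:

```python
# Toy illustration of asymmetric encryption with textbook RSA.
# The tiny primes and lack of padding make this INSECURE -- it is a
# sketch of the public/private key relationship only.

p, q = 61, 53                 # two small primes (kept secret)
n = p * q                     # modulus, part of both keys
phi = (p - 1) * (q - 1)       # Euler's totient of n
e = 17                        # public exponent; gcd(e, phi) == 1
d = pow(e, -1, phi)           # private exponent: e * d == 1 (mod phi)

def apply_key(m: int, key: int) -> int:
    """Raise the message to the key's exponent modulo n."""
    return pow(m, key, n)

plaintext = 65                             # message as an integer < n
ciphertext = apply_key(plaintext, e)       # encrypt with the public key
recovered = apply_key(ciphertext, d)       # decrypt with the private key

assert recovered == plaintext              # only the private key reverses it
```

Swapping the roles of `e` and `d` gives the signing direction discussed later: encrypt with the private key, verify with the public key.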

2.3.2 Hashing

When a one-way, irreversible form of data protection is needed, a hashing mechanism is used. Hashing is a common mathematical technique that transforms an input of any length into a fixed-length output, called the hash value, by means of a hashing algorithm. This conversion is a compression mapping: the space of hash values is usually much smaller than the space of inputs, so different inputs may hash to the same output, and the unique input cannot be determined from the hash value alone.

Conversely, if the hash values differ, the inputs must also differ. Hashing, then, is simply a function that compresses a message of any length into a fixed-length Message Digest. If you compare a message to a person, hashing the message is like taking a low-resolution photo of that person: it roughly captures the person’s appearance and outline, and if two such photos taken under identical conditions turn out different, you can conclude that the people photographed are different as well.

A common application of the hashing mechanism is the storage of passwords.

Hashing can be used to obtain a message digest of a message. The digest is usually fixed in length and smaller than the original message. The sender appends the digest to the message; the receiver applies the same hash function to the received message to regenerate the digest and compares it with the original digest attached to the message. Whether they match determines whether the original message has been tampered with. With a suitable hash algorithm, the probability that two different messages produce the same digest is negligible, so in general any modification of the original message yields a completely different digest. Therefore, as long as the newly generated digest matches the original digest sent with the message, one can conclude that the message was not tampered with in transit. This method ensures the integrity and consistency of the message.
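The digest-and-compare procedure just described can be sketched with Python’s standard `hashlib`. SHA-256 is an illustrative choice here; the text does not prescribe a particular algorithm:

```python
import hashlib

def digest(message: bytes) -> str:
    """Fixed-length message digest (SHA-256, 32 bytes) of an any-length input."""
    return hashlib.sha256(message).hexdigest()

# Sender: append the digest to the outgoing message.
message = b"transfer 100 credits to account 42"
sent_digest = digest(message)

# Receiver: recompute the digest and compare it with the attached one.
tampered = b"transfer 900 credits to account 99"   # modified in transit
assert digest(tampered) != sent_digest             # tampering detected
assert digest(message) == sent_digest              # untouched copy verifies
```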

Figure 2.15 shows the protection of message integrity by the hashing mechanism. In the figure, a cloud user wants to transmit a document to the cloud service; a digest of the document is computed with a hash function in advance and attached to the transmitted document. If the document is intercepted and tampered with by a malicious service agent, the firewall recomputes the digest of the received document before it enters the cloud service and compares it with the original digest. If they are inconsistent, the document is proven to have been tampered with and is rejected.

Fig. 2.15

Hashing mechanism’s protection of message integrity

2.3.3 Digital Signature

A digital signature is a string of information that can only be generated by the sender and cannot be forged by others; it serves as effective proof of the authenticity of the information sent. It is analogous to an ordinary physical signature written on paper, but is realized with techniques from the field of public-key cryptography. A digital signature scheme usually defines two complementary operations, one for signing and the other for verification. Digital signatures are an application of asymmetric encryption combined with message-digest technology.

The digital signature mechanism provides data authenticity and integrity through identity verification and non-repudiation. Before a message is sent, it is given a digital signature; if the message is subsequently modified without authorization, the digital signature becomes invalid. The digital signature thus provides proof that the received message is identical to the one created by its legitimate sender.

The creation of a digital signature involves hashing and asymmetric encryption: it is in effect a message digest encrypted with the private key and appended to the original message. To verify the signature’s legitimacy, the receiver decrypts it with the corresponding public key to obtain the message digest, and also applies the hash function to the received message to compute a digest. If the two results are the same, the message has maintained its integrity.
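A minimal sketch of this sign-then-verify flow combines a SHA-256 digest with toy textbook-RSA keys. Real schemes add padding (e.g., RSASSA-PSS) and much larger keys; everything below is illustrative only:

```python
import hashlib

# Textbook-RSA key pair with toy-sized primes -- an INSECURE sketch meant
# only to show the signing/verification relationship.
p, q = 61, 53
n = p * q
phi = (p - 1) * (q - 1)
e = 17                      # public exponent
d = pow(e, -1, phi)         # private exponent

def sign(message: bytes) -> int:
    """Digest the message, then 'encrypt' the digest with the private key."""
    h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(h, d, n)

def verify(message: bytes, signature: int) -> bool:
    """Decrypt the signature with the public key and compare digests."""
    h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(signature, e, n) == h

doc = b"quarterly report v1"
sig = sign(doc)
assert verify(doc, sig)   # unmodified document: signature checks out
# A tampered document would (with overwhelming probability) fail verify().
```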

Digital signature mechanisms can help counter security threats such as malicious intermediaries, insufficient authorization, and overlapping trust boundaries.

The malicious service agent in Fig. 2.16 intercepted and tampered with a legitimate cloud user’s digitally signed document and pretended to be the cloud user to request cloud services, but the document’s digital signature was found to be invalid when it passed through the firewall, so the service request was rejected.

Fig. 2.16

Verify the legitimacy of the document through its digital signature

2.3.4 Public Key Infrastructure

Public Key Infrastructure (PKI) is a system composed of protocols, data formats, rules, and implementations. It is used to manage asymmetric key issuance so that large-scale systems can safely use public-key cryptography. This system links the public key with the corresponding key owner, and at the same time, verifies the validity of the key.

PKI relies on digital certificates, which are digitally signed data structures generally issued by a third-party certificate authority (CA). Unlike an ordinary digital signature, a digital certificate usually carries the following information: the identity of the certificate owner, which the CA has verified; the certificate owner’s public key; the CA’s digital signature over the certificate, produced by encrypting the certificate’s digest with the CA’s private key; and other related information such as the validity period. A CA is generally an authority trusted by the outside world, and its public key is usually public, so anyone can use the CA’s public key to verify the authenticity of the CA’s digital signature. If the signature is valid, it means the CA has verified the authenticity of the certificate owner’s identity, which indirectly verifies the authenticity and validity of the certificate. The typical steps for a CA to generate a certificate are shown in Fig. 2.17.

Fig. 2.17

Typical steps for certificate authorities to generate a certificate

Although most digital certificates are issued by a few trusted CAs such as VeriSign (a vendor in the domain-name and Internet security field), digital certificates can also be produced in other ways. Larger companies such as Microsoft can act as their own CA, issuing digital certificates to their customers and the public, and even individual users can generate digital certificates with the appropriate software tools.

PKI is a reliable method for realizing asymmetric encryption, managing the identity information of cloud users and cloud service providers, and defending against malicious intermediaries and insufficient authorization threats.

2.3.5 Identity and Access Management

The identity and access management (IAM) mechanism comprises the components and policies necessary for authentication and for managing user identities, along with access privileges to the related IT resources, environments, and systems.

The IAM mechanism has the following four main components.

  1. Authentication

    In addition to the most common username-and-password authentication, it also supports digital signatures, digital certificates, biometric hardware (such as fingerprint readers), specialized software (such as voice analysis programs), and binding user accounts to registered IP or MAC addresses.

  2. Authorization

    The authorization component is used to define the correct access control granularity and manage the relationship between user identity, access control authority, and IT resource availability.

  3. User management

    User management is responsible for creating new user identities and access groups, resetting passwords, defining password policies, and managing privileges.

  4. Credential management

    The credential management system establishes a set of rules for managing the defined user accounts, including user identity management and the corresponding access control policies.

In addition to assigning specific user privilege levels, the IAM mechanism also includes formulating corresponding access control and management strategies. This mechanism is mainly used to combat security threats such as insufficient authorization, denial of service attacks, and overlapping trust boundaries.

2.3.6 Single Sign On

Cloud users sometimes need to use multiple cloud services simultaneously or in succession. If every cloud service had to re-authenticate the user, the experience would be tiresome; yet providing cloud users with authentication and authorization information that spans multiple cloud services is not easy. The Single Sign On (SSO) mechanism enables a cloud user to be authenticated once by a security agent. This security agent establishes a security context that is persisted as the cloud user accesses other cloud services or cloud-based IT resources. Without it, the cloud user would have to re-authenticate with every subsequent request.

The SSO mechanism allows independent cloud services and IT resources to generate and circulate runtime authentication and authorization credentials. The credentials, first provided by the cloud user, remain valid for the duration of the session while their security context information is shared. The concept of a security context here is broad: it generally refers to the collection of settings that define what an entity is allowed to do, encompassing permissions, privileges, access tokens, integrity levels, and so on. The SSO mechanism’s security agent is especially useful when cloud users want to access cloud services located in different clouds.

Figure 2.18 shows an example of achieving access across multiple cloud services through single sign-on. First, the cloud user provides the security certificate for login to the security agent on the firewall. After successful authentication, the security agent responds with a security token representing the completion of the authentication. The token contains the user’s identity information and can be authenticated by multiple cloud services.

Fig. 2.18

Achieving access across multiple cloud services through single sign-on
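The broker-issued token idea can be sketched with an HMAC-signed payload. This is a hand-rolled illustration, not a real SSO protocol (production systems use standards such as SAML or OpenID Connect), and all names and the shared key are hypothetical:

```python
import base64
import hashlib
import hmac
import json

# Key shared between the security broker and the cloud services
# (an assumption of this sketch).
BROKER_KEY = b"shared-secret"

def issue_token(identity: dict) -> str:
    """Security broker: sign the user's identity after authentication."""
    payload = base64.urlsafe_b64encode(json.dumps(identity).encode()).decode()
    tag = hmac.new(BROKER_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + tag

def accept_token(token: str):
    """Any cloud service: verify the broker's signature, no re-login needed."""
    payload, tag = token.rsplit(".", 1)
    expected = hmac.new(BROKER_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected):
        return None                      # tampered or forged token
    return json.loads(base64.urlsafe_b64decode(payload))

token = issue_token({"user": "alice", "role": "admin"})
assert accept_token(token) == {"user": "alice", "role": "admin"}

# Any change to the tag (or payload) invalidates the token.
payload, tag = token.rsplit(".", 1)
bad_tag = format((int(tag, 16) + 1) % (1 << 256), "064x")
assert accept_token(payload + "." + bad_tag) is None
```

Because every service trusts the broker’s key rather than re-checking the user’s password, the same token carries the security context across services.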

2.3.7 Cloud-Based Security Group

Setting up isolation between IT resources can increase data protection. Separating cloud resources creates distinct physical and virtual IT environments for different users and groups, forming independent logical network boundaries. For example, according to its network security requirements, an enterprise’s network can be divided into an extranet and an intranet: the extranet deploys a flexible firewall for external Internet access, while the intranet can only reach the internal data center’s resources and cannot access the Internet.

This cloud-based resource segmentation process gives rise to the Cloud-Based Security Group mechanism, with security policies determining how the groups are divided. According to the established security policy, the cloud environment is divided into several logical cloud-based security groups, each forming an independent logical network boundary. Every cloud-based IT resource belongs to at least one logical cloud-based security group, and communication between security groups is governed by dedicated rules.

Multiple virtual servers running on the same physical server can belong to different cloud-based security groups.

Figure 2.19 shows an example of enhancing data protection by dividing cloud-based security groups. Among them, the cloud-based security group A includes virtual servers A and D, which are assigned to cloud user A; the cloud-based security group B includes virtual servers B, C, and E, and is assigned to cloud user B. Even if cloud user A’s certificate is compromised, the attacker can only attack the virtual servers A and D in the cloud-based security group A. The virtual servers B, C, and E in the cloud-based security group B cannot be affected.

Fig. 2.19

Enhance data protection by dividing cloud-based security groups
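The Figure 2.19 scenario can be modeled as a simple membership check; server and group names are illustrative:

```python
# Minimal sketch of cloud-based security group membership checks,
# mirroring the Figure 2.19 scenario (names are illustrative).

security_groups = {
    "group_A": {"vm_A", "vm_D"},           # assigned to cloud user A
    "group_B": {"vm_B", "vm_C", "vm_E"},   # assigned to cloud user B
}
user_groups = {"user_A": "group_A", "user_B": "group_B"}

def may_access(user: str, server: str) -> bool:
    """A user may reach only servers inside their own security group."""
    return server in security_groups[user_groups[user]]

# Even with user A's compromised credentials, an attacker cannot reach
# the servers in group B.
assert may_access("user_A", "vm_A")
assert not may_access("user_A", "vm_B")
```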

2.3.8 Hardened Virtual Server Image

A virtual server is created from a template called a virtual server image (or virtual machine image). Hardening is the process of stripping unnecessary software from the system to limit the potential vulnerabilities attackers may exploit. Removing redundant programs, closing unnecessary server ports, and disabling unused services, internal root accounts, and guest access are all examples of hardening.

Hardened Virtual Server Image is a hardened template used to create virtual server instances, which is usually safer than the original standard image.

Hardened virtual server images can help combat security threats such as denial of service, insufficient authorization, and overlapping trust boundaries.

Figure 2.20 shows an example of a hardened virtual server image. Here, the cloud service provider applies its security policy to the hardened virtual server image. As part of the resource management system, the hardened virtual server image template is stored in the VM image repository.

Fig. 2.20

Hardened virtual server image

2.4 Basic Cloud Architecture

This section will describe some common basic cloud architectures. These architectures are common in modern cloud environments and are usually an important part of cloud environments.

2.4.1 Load Distribution Architecture

IT resources can be scaled horizontally by adding one or more IT resources of the same kind, with a load balancer providing the runtime logic that evenly distributes the workload across the available IT resources. The resulting Workload Distribution Architecture relies to some extent on sophisticated load balancing algorithms and runtime logic to reduce over- or under-utilization of IT resources.

The load distribution architecture can be used to support distributed virtual servers, cloud storage devices, and cloud services. In fact, this basic architecture can be applied to any IT resource.

Figure 2.21 shows an example of a load distribution architecture. Here, cloud service A has a redundant copy on virtual server B. If the load on cloud service A grows too large, the load balancer intercepts cloud user requests and distributes them across virtual servers A and B to keep the load evenly balanced.

Fig. 2.21

Example of load distribution architecture
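The balancer’s runtime logic from Figure 2.21 can be sketched as a round-robin rotation over the redundant servers; names are illustrative, and real balancers also weigh server health and current load:

```python
import itertools
from collections import Counter

# Redundant implementations of cloud service A (illustrative names).
servers = ["virtual_server_A", "virtual_server_B"]
rotation = itertools.cycle(servers)       # simple round-robin policy

def route() -> str:
    """Pick the next server in rotation for the incoming request."""
    return next(rotation)

# Ten requests end up evenly distributed across both servers.
load = Counter(route() for _ in range(10))
assert load["virtual_server_A"] == load["virtual_server_B"] == 5
```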

In addition to the basic load balancing mechanism and the virtual server and cloud storage device mechanism that can be used for load balancing, the following mechanisms are also part of the cloud architecture:

  • Audit monitor.

  • Cloud usage monitor.

  • Virtual machine monitor.

  • Logical network boundary.

  • Resource cluster.

  • Resource replication.

2.4.2 Resource Pooling Architecture

The Resource Pooling Architecture is based on one or more resource pools, in which identical IT resources are grouped and maintained by a system that automatically ensures they remain synchronized.

Common resource pools are as follows:

  1. Physical server pool

    The physical server pool consists of networked servers on which the operating system and the other necessary programs and applications have already been installed, so they can be put into use immediately.

  2. Virtual server pool

    The virtual server pool is generally configured from customizable templates that the cloud user selects from the available options during provisioning. For example, a cloud user can configure a pool of low-end servers, each equipped with 2 vCPUs and 4 GB of memory, or a pool of high-end servers, each equipped with 16 vCPUs and 32 GB of memory.

  3. Storage pool

    Storage pools or cloud storage device pools generally consist of file-based or block-based storage structures.

  4. Network pool

    The network pool is composed of different pre-configured network interconnection devices. For example, for redundant connections, load balancing, or link aggregation, you can create a virtual firewall device pool or a physical network switch pool.

  5. CPU pool

    The CPU pool can be allocated to virtual servers, usually with a single virtual processing core (vCPU) as the basic unit.

  6. Memory pool

    The memory pool can be used as a new supply or vertical expansion of the physical server.

    A dedicated pool can be created for each type of IT resource, or multiple pools of different types can be aggregated into a larger mixed pool. In this mixed pool, each individual pool is called a sub-resource pool.

2.4.3 Dynamic Scalability Architecture

The Dynamic Scalability Architecture is based on a model of predefined scaling conditions. When these conditions are triggered, the system automatically and dynamically allocates IT resources from the resource pool. This dynamic allocation mechanism allows the quantity of resources available to users to change in step with changes in user demand.

Common types of dynamic expansion are as follows:

  1. Dynamic horizontal scaling

    IT resources are scaled by adding or removing instances of the same resource type to handle changes in workload. If resources need to be added, the automatic extension listener requests resource replication according to requirements and permissions, and sends a signal to start IT resource replication.

  2. Dynamic vertical scaling

    When the processing capacity of a single IT resource needs to be adjusted, the IT resource instance is scaled up (a stronger configuration) or down (a weaker configuration). For example, when a virtual server is overloaded, its memory capacity can be dynamically increased, or a processing core can be added.

  3. Dynamic relocation

    Relocate service requirements to other IT resources that can provide similar services at runtime. For example, the virtual server corresponding to the cloud service is migrated to a more powerful physical host.
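The decision logic of an automatic extension listener for dynamic horizontal scaling might be sketched as follows; the threshold values are illustrative assumptions:

```python
# Sketch of an automatic extension listener's scaling decision:
# compare measured utilization against predefined thresholds and choose
# a new instance count. Thresholds are illustrative assumptions.

SCALE_OUT_AT = 0.80   # add an instance above 80% average utilization
SCALE_IN_AT = 0.30    # remove one below 30%, but keep at least one

def scaling_decision(utilization: float, instances: int) -> int:
    """Return the new instance count for dynamic horizontal scaling."""
    if utilization > SCALE_OUT_AT:
        return instances + 1            # trigger resource replication
    if utilization < SCALE_IN_AT and instances > 1:
        return instances - 1            # release a redundant instance
    return instances                    # load within bounds: no change

assert scaling_decision(0.95, 2) == 3
assert scaling_decision(0.10, 2) == 1
assert scaling_decision(0.50, 2) == 2
assert scaling_decision(0.10, 1) == 1   # never scale below one instance
```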

The dynamic expansion architecture can be applied to a range of IT resources, including virtual servers and cloud storage devices. In addition to the core automatic extension listener and resource replication mechanism, the following mechanisms are also used in this form of cloud architecture.

  1. Cloud usage monitor

    In response to the dynamic changes caused by this architecture, a special cloud usage monitor can be used to track runtime usage.

  2. Virtual machine monitor

    The dynamically scalable system calls on the virtual machine monitor to create or remove virtual server instances, or to scale itself.

  3. Pay-per-use monitor

    The pay-per-use monitor collects usage cost information in response to the expansion of IT resources.

2.4.4 Elastic Resource Capacity Architecture

The Elastic Resource Capacity Architecture is mainly concerned with the dynamic provisioning of virtual servers. Relying on the system’s resource pools of IT resources such as CPU and memory, the architecture allocates and reclaims related IT resources in response to real-time changes in user demand, so that it can react to those changes immediately.

The elastic resource capacity architecture monitors cloud user demand through the automatic extension listener. When demand changes, it executes pre-deployed intelligent automation engine scripts that interact with the virtual machine monitor and the VIM, automatically processing user requests and, based on the results, notifying the resource pool to allocate or reclaim the corresponding resources. The architecture also monitors the runtime processing of the virtual server: before cloud service capacity reaches its threshold, additional processing power can be obtained from the resource pool through dynamic allocation. Under this architecture, virtual servers and their hosted applications and IT resources can be regarded as vertically scaled.

Figure 2.22 shows an example of a flexible resource capacity architecture. The cloud user actively sends a request to the cloud service, and the automatic extension listener monitors this. The intelligent automation engine script is deployed together with the workflow logic to automatically process changes in user requests and send the processing results to the virtual machine monitor. The virtual machine monitor controls the resource pool to allocate or reclaim the corresponding IT resources. When cloud users increase requests, the automatic extension listener will send a signal to the intelligent automation engine to execute the script. After the script runs, the virtual machine monitor will allocate more IT resources from the resource pool to the virtual machine so that the increased workload can be processed.

Fig. 2.22

Example of flexible resource capacity architecture

This type of cloud architecture can also include the following additional mechanisms.

  1. Cloud usage monitor

    Before, during, and after the expansion, the cloud usage monitor collects IT resources’ usage information to help define the future processing capacity threshold of the virtual server.

  2. Pay-per-use monitor

    The pay-per-use monitor is responsible for collecting resource usage cost information, which changes with elastic supply.

  3. Resource replication

    Resource replication is used in this architecture to generate new instances of extended IT resources.

2.4.5 Service Load Balancing Architecture

The Service Load Balancing Architecture can be considered a special variant of the load distribution architecture, used specifically to scale cloud services. By adding a load balancing system to dynamically distribute workloads, redundant deployments of cloud services can be created. The load balancer intercepts cloud users’ service requests and distributes them among multiple IT resources that provide similar services, following a load balancing policy, so that the workload of the cloud service system stays balanced. The load balancer can either be a built-in component of an IT device in the cloud environment or exist independently of the cloud device and its host server.

Sometimes, a copy of a cloud service instance (such as a redundant virtual server) is organized as a resource pool, and a load balancer acts as an external or built-in component, allowing the hosting server to balance the workload by itself.

Figure 2.23 shows an example of a service load balancing architecture. The load balancer intercepts the messages sent by cloud users and forwards them to multiple virtual servers so that the workload processing can be scaled horizontally.

Fig. 2.23

Example of service load balancing architecture

In addition to the load balancer, the service load balancing architecture can also include the following mechanisms.

  1. Cloud usage monitor

    The cloud usage monitor can monitor cloud service instances and their respective IT resource consumption levels, carrying out various runtime monitoring and usage data collection tasks.

  2. Resource cluster

    The architecture includes active-active cluster groups, which can help load balance among different members of the cluster.

  3. Resource replication

    Resource replication is used to generate implementations of cloud services in support of load balancing requests.

2.4.6 Cloud Bursting Architecture

The Cloud Bursting Architecture establishes a form of dynamic scaling: whenever a preset capacity threshold is reached, the enterprise’s internal IT resources are expanded, or “burst”, into the cloud.

Some cloud-based IT resources in the cloud burst architecture are redundantly pre-deployed, and they will remain inactive until the cloud bursts. When these resources are no longer needed, cloud-based IT resources are released, and the architecture returns to the internal environment of the enterprise.

The cloud burst architecture is an elastic scaling architecture that offers cloud users the option of using cloud-based IT resources, but only to cope with higher usage demands. The foundation of this architecture is the automatic extension listener and the resource replication mechanism.

The automatic extension listener decides when to redirect requests to cloud-based IT resources, and the resource replication mechanism maintains the synchronization of state information between the enterprise’s internal and cloud-based IT resources.

Figure 2.24 shows an example of the cloud bursting architecture. The automatic extension listener monitors the usage of service A within the enterprise. When service A’s usage threshold is exceeded, cloud user C’s requests are redirected to the redundant implementation of service A in the cloud (cloud service A).

Fig. 2.24

Example of cloud burst architecture
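The burst decision from Figure 2.24 reduces to a threshold check; the capacity figure below is an illustrative assumption:

```python
# Sketch of the automatic extension listener's burst logic from
# Figure 2.24: requests go to the on-premise service until its usage
# threshold is exceeded, then overflow to the cloud copy (illustrative).

ONPREM_CAPACITY = 2   # concurrent requests service A can handle in-house

def dispatch(active_onprem_requests: int) -> str:
    """Redirect to the cloud-based redundant implementation on overflow."""
    if active_onprem_requests < ONPREM_CAPACITY:
        return "on-premise service A"
    return "cloud service A"            # burst out to the cloud

assert dispatch(0) == "on-premise service A"
assert dispatch(1) == "on-premise service A"
assert dispatch(2) == "cloud service A"   # threshold exceeded: burst
```

When demand falls back below the threshold, the cloud-based resources are released and the architecture returns to the enterprise’s internal environment, as described above.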

2.4.7 Elastic Disk Provisioning Architecture

Generally, users of cloud-based storage space are charged according to a fixed allocated disk storage capacity, meaning the cost is determined by the pre-allocated capacity regardless of the amount of data actually stored. For example, a cloud service provides a user with a virtual machine configured with 200 GB of storage capacity; even if the user has not stored any data, they must still pay for the 200 GB of storage space.

The Elastic Disk Provisioning Architecture establishes a dynamic storage provisioning system that ensures accurate billing based on the amount of storage a cloud user actually consumes. The system uses thin-provisioning technology to allocate storage space automatically, and further supports runtime usage monitoring to collect accurate usage data for billing purposes.

The thin provisioning software is installed on the virtual server, and the dynamic storage allocation is handled through the virtual machine monitor. At the same time, the pay-per-use monitor tracks and reports accurate billing related to disk usage data.

In addition to cloud storage devices, virtual servers, and pay-per-use monitors, the architecture may also include cloud usage monitors and resource replication mechanisms.

Figure 2.25 shows an example of an elastic disk supply architecture. The cloud user requests a virtual server with three hard disks, each with a capacity of 120GB. According to the flexible disk supply architecture, the virtual server is allocated a total capacity of 360GB, which is the maximum disk usage. The current cloud user has not installed any software and uses 0GB, so the cloud user does not have to pay any disk space usage fees.

Fig. 2.25

Example of an elastic disk supply architecture
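The billing contrast described above can be sketched as follows; the per-GB price is a hypothetical value:

```python
# Sketch of pay-per-use billing under elastic disk provisioning: the
# bill tracks storage actually used, not the allocated maximum.
# The price constant is an illustrative assumption.

PRICE_PER_GB = 0.05   # hypothetical monthly price per GB actually used

def monthly_storage_cost(used_gb: float, allocated_gb: float) -> float:
    """Bill only the consumed share of the thin-provisioned allocation."""
    assert used_gb <= allocated_gb
    return used_gb * PRICE_PER_GB

# Figure 2.25 scenario: 3 x 120 GB disks allocated, nothing stored yet.
assert monthly_storage_cost(0, 3 * 120) == 0.0
# Fixed-allocation billing would instead charge for all 360 GB regardless.
assert monthly_storage_cost(50, 360) == 2.5
```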

2.4.8 Redundant Storage Architecture

Cloud storage devices sometimes fail or are damaged, for reasons including network connectivity problems, controller or general hardware failures, and security breaches. When this happens, the failure of a cloud storage device has a ripple effect: all services, applications, and infrastructure components in the cloud that depend on its availability are affected.

The Redundant Storage Architecture introduces the replicated secondary cloud storage device as part of the failure response system, which must be synchronized with the primary cloud storage device’s data. When the primary cloud storage device fails, the storage device gateway transfers the cloud user’s request to the secondary cloud storage device.

Figure 2.26 shows an example of redundant storage architecture. The primary cloud storage device regularly copies data to the secondary cloud storage device to achieve data synchronization. When the primary cloud storage device is unavailable, the storage device gateway automatically redirects the cloud user’s request to the secondary cloud storage device.

Fig. 2.26

Example of redundant storage architecture

This cloud architecture relies mainly on a storage replication system that keeps the primary cloud storage device synchronized with its replicated secondary device. The storage replication mechanism is a variant of the resource replication mechanism, used to copy data from the primary cloud storage device to the secondary one, either synchronously or asynchronously.
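The replicate-and-failover flow can be sketched with two in-memory stores standing in for the storage devices; all names are illustrative:

```python
# Sketch of a storage device gateway with a replicated secondary device
# (the Figure 2.26 flow); in-memory dicts stand in for cloud storage.

primary = {"report.txt": b"v1 data"}
secondary: dict = {}
primary_available = True

def replicate() -> None:
    """Storage replication: keep the secondary in sync with the primary."""
    secondary.update(primary)

def gateway_read(key: str) -> bytes:
    """Serve from the primary; fail over to the secondary when it is down."""
    if primary_available:
        return primary[key]
    return secondary[key]

replicate()                      # periodic synchronization
primary_available = False        # the primary device fails
assert gateway_read("report.txt") == b"v1 data"   # request redirected
```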

Cloud service providers sometimes place the secondary cloud storage device in a different geographic region from the primary one, partly for economic reasons and partly to facilitate load balancing and disaster recovery. In that case, to achieve replication between the devices at the two sites, the cloud service provider may need to lease a third-party network connection.

2.5 Exercise

  1. (1)

    Multiple choice.

    1. 1.

      Virtual servers typically do not contain () class IT resources.

      1. A.

        CPU.

      2. B.

        Memory.

      3. C.

        External.

      4. D.

        Peripheral.

    2. 2.

      The false statement about the virtual machine monitor is ().

      1. A.

        Virtual machine monitors are used to control and manage virtual servers.

      2. B.

        A virtual machine monitor is software that runs on a physical server.

      3. C.

        Virtual machine monitors are primarily used to create virtual server instances.

      4. D.

        Virtual machine monitors can work with VIMs to replicate virtual server instances using stored virtual server images.

    3. 3.

      The network storage interface protocol independent of block storage is ().

      1. A.

        SCSI.

      2. B.

        NFS.

      3. C.

        iSCSI.

      4. D.

        FC.

    4. 4.

      The false claim about the NoSQL database is ().

      1. A.

        NoSQL refers to a non-relationship database.

      2. B.

        The NoSQL database is easier to scale than the relationship database.

      3. C.

        The NoSQL database emphasizes data normalization and supports transaction and connection operations.

      4. D.

        NoSQL databases have better storage and read/write performance than relationship databases in large data environments.

    5. ( ) is not a main task of the resource management system.

       A. Managing virtual IT resource templates, such as virtual server images.
       B. Managing user accounts, security credentials, authorization, and access control.
       C. Allocating, releasing, and coordinating virtual IT resources among the available physical infrastructure.
       D. Monitoring the operational conditions of IT resources and enforcing usage policies and security regulations.

    6. The false statement about encryption is ( ).

       A. Symmetric encryption does not provide non-repudiation.
       B. Asymmetric encryption provides authenticity, non-repudiation, and integrity protection.
       C. A document encrypted with a private key can only be decrypted correctly with the corresponding public key.
       D. Symmetric encryption is generally computationally slower than asymmetric encryption.

    7. Assuming the same hash algorithm is applied to the messages, the correct statement about message digests is ( ).

       A. Different messages can never produce the same message digest.
       B. The same message always produces the same message digest.
       C. The length of a message digest is generally the same as that of the message.
       D. The digests of different messages generally have different lengths.

    8. The digital certificate requested by a cloud user generally does not contain ( ).

       A. CA-certified identity data of the cloud user.
       B. The public key of the cloud user.
       C. The private key of the cloud user.
       D. The digital signature of the CA.

    9. The elastic resource capacity architecture typically does not include ( ).

       A. SLA monitoring agent.
       B. Intelligent automation engine scripts.
       C. Resource pool.
       D. Automated scaling listener.

    10. The basic cloud architecture that establishes a dynamic storage provisioning system to ensure accurate billing based on the amount of storage actually used by cloud users is ( ).

        A. Service load balancing architecture.
        B. Elastic resource capacity architecture.
        C. Dynamic scalability architecture.
        D. Elastic disk provisioning architecture.

  (2) Fill in the blanks.

    1. ______ isolates a related set of cloud-based IT resources from other entities in the cloud, such as unauthorized users; its primary function is network segmentation and isolation, ensuring the relative independence of the IT facilities within the region.

    2. The main issues associated with cloud storage are data security, ______, and confidentiality.

    3. Cloud storage devices can be divided into four categories by data storage level: file, block, dataset, and ______.

    4. ______ is a contract between a network service provider and a customer that defines terms such as service type, quality of service, and customer payment.

    5. The ______ mechanism provides cloud users with authentication and authorization services across multiple cloud services, i.e., one sign-on, multiple accesses.

    6. ______ establishes a form of dynamic scaling that scales out, or "bursts," from in-house enterprise IT resources to the cloud whenever a pre-set capacity threshold is reached.

  (3) Answer the following questions.

    1. What mechanisms are necessary for a cloud computing system to function properly?

    2. What are the main principles behind digital certificates? What are their uses?

    3. From a functional point of view, what are the main categories of basic cloud architecture, and what functions does each provide?