To enable open communication among computers worldwide, the International Organization for Standardization (ISO) developed a reference model for network interconnection: the Open Systems Interconnection Reference Model (OSI/RM). This reference model (an architecture standard) defines a seven-layer framework for network interconnection, consisting of the physical layer, data link layer, network layer, transport layer, session layer, presentation layer, and application layer.

The TCP/IPv4 stack is the current industry standard that streamlines the OSI 7-layer model by integrating it into four layers. Based on functions, the TCP/IPv4 stack is a set of protocols that can be categorized into application layer protocols, transport layer protocols, network layer protocols and network interface layer protocols from top to bottom. This chapter introduces these protocols in detail.

2.1 Overview of Protocols

The key to learning about computer networks is to understand the protocols used for computer communication, yet many learners find these protocols hard to grasp. For this reason, before discussing the protocols used for computer communication, let’s take a look at a lease agreement.

2.1.1 Introduction to Protocols

You are no stranger to agreements: university students sign employment agreements with employers when they leave school to work, and once they start working they may need to rent a room and sign a lease agreement with a landlord. In the following, we use a lease agreement to explain what an agreement (protocol) is and what it contains, and then introduce the protocols used for computer communication.

If the lessor and the lessee do not sign an agreement, but only verbally agree on the amount of rent, when to pay the monthly rent, the amount of the deposit, and who is responsible for repairing damaged furniture and appliances, neither party may remember these arrangements over time. Once the lessor and the lessee disagree about a particular situation, misunderstandings and conflicts easily occur.

To avoid disputes, the lessor and the lessee need to sign a lease agreement and write the matters of mutual concern into it. Both sides sign their names upon confirmation, and the agreement is made in duplicate, which both sides must abide by. If the lessor and lessee are inexperienced in signing lease agreements and are worried about missing important matters, they can find a recognized, standardized lease agreement template on the Internet. Figure 2.1 is an example of a lease agreement template in which the covenants are already defined, so the lessor and lessee simply need to fill in the specified content.

Fig. 2.1 Lease agreement template

To simplify the filling process, the lease agreement template also provides a table, as shown in Fig. 2.2. When the lessor and the lessee sign the lease agreement, they only need to fill in the information at the positions specified in the table; the detailed terms and conditions of the agreement do not need to be filled in. In the table, the “name of the lessor”, “name of the lessee”, “ID number”, “location of the house” and other items are called fields. These fields can be either fixed-length or variable-length. If they are variable-length, the delimiters between the fields must be defined.
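The difference between fixed-length and variable-length fields can be sketched in Python. This is only an illustration: the three-field record (lessor, lessee, rent), the 16-byte field widths, and the `|` delimiter are assumptions chosen for the example, not part of any real protocol.

```python
import struct

# Hypothetical fixed-length layout: lessor (16 bytes), lessee (16 bytes),
# monthly rent (unsigned 32-bit integer). "!" selects network byte order
# and disables alignment padding.
FIXED_LAYOUT = struct.Struct("!16s16sI")

def pack_fixed(lessor, lessee, rent):
    """Fixed-length fields: every value sits at a known offset, padded with NUL bytes."""
    return FIXED_LAYOUT.pack(lessor.encode("ascii"), lessee.encode("ascii"), rent)

def unpack_fixed(data):
    lessor, lessee, rent = FIXED_LAYOUT.unpack(data)
    return (lessor.rstrip(b"\x00").decode("ascii"),
            lessee.rstrip(b"\x00").decode("ascii"),
            rent)

# Hypothetical delimiter for the variable-length form.
DELIMITER = "|"

def pack_variable(lessor, lessee, rent):
    """Variable-length fields: the delimiter marks where one field ends and the next begins."""
    return DELIMITER.join([lessor, lessee, str(rent)]).encode("ascii")

def unpack_variable(data):
    lessor, lessee, rent = data.decode("ascii").split(DELIMITER)
    return lessor, lessee, int(rent)

if __name__ == "__main__":
    print(len(pack_fixed("Alice", "Bob", 3000)))   # always 36 bytes
    print(pack_variable("Alice", "Bob", 3000))     # length depends on the values
```

The fixed-length form always occupies the same number of bytes, so the receiver can locate each field by offset; the variable-length form is more compact for short values but only works if the delimiter never appears inside a field.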

Fig. 2.2 Table in the lease agreement template that needs to be filled in

Figure 2.3 is a specific lease agreement filled out using the table in the lease agreement template, from which you can read information such as the lessor and lessee, the location of the house, the rent, and the deposit. The terms that Party A and Party B should follow do not need to be filled in, but both parties must comply with the matters agreed in the lease agreement template.

Fig. 2.3 The detailed lease agreement

Like the lease agreement template, the protocols used for computer communication are standardized; that is, each is a common template with a Party A and a Party B. In addition to stipulating the conventions that both parties need to follow, a computer communication protocol also defines the format of the messages that Party A and Party B exchange (messages are the information communicated and exchanged by the applications). This usually includes the format of request messages and of response messages. A message format is similar to the table shown in Fig. 2.2.

When data packets are analyzed with a packet capture tool, the table that the protocol requires both sides of the communication to fill in, together with the value of each field, resembles the completed lease agreement shown in Fig. 2.3. Figure 2.4 shows the table defined by the IPv4 protocol, which is called the IPv4 header. When computers communicate over the network, they only need to fill in the IPv4 header as specified; the two communicating computers and the network devices along the way can then work according to IP.
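To make the idea of a header as a "table to be filled in" concrete, the following Python sketch fills in and then reads back the 20-byte fixed part of an IPv4 header. It is a simplified illustration: the source address, the field values, and the helper name `parse_ipv4_header` are assumptions, IP options are ignored, and the checksum is left at zero rather than computed.

```python
import socket
import struct

# Fixed part of the IPv4 header (20 bytes, network byte order):
# version+IHL, type of service, total length, identification,
# flags+fragment offset, TTL, protocol, header checksum, source, destination.
IPV4_FIXED_HEADER = struct.Struct("!BBHHHBBH4s4s")

def parse_ipv4_header(data):
    """Read the fixed-length fields of an IPv4 header into a dictionary."""
    (version_ihl, _tos, total_length, identification, _flags_fragment,
     ttl, protocol, _checksum, src, dst) = IPV4_FIXED_HEADER.unpack(data[:20])
    return {
        "version": version_ihl >> 4,                 # upper 4 bits
        "header_length": (version_ihl & 0x0F) * 4,   # lower 4 bits, in 32-bit words
        "total_length": total_length,
        "identification": identification,
        "ttl": ttl,
        "protocol": protocol,                        # 6 means TCP, 17 means UDP
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }

# "Filling in the table": a sample header with illustrative values only
# (the checksum is not computed here).
sample_header = IPV4_FIXED_HEADER.pack(
    0x45, 0, 40, 1, 0, 64, 6, 0,
    socket.inet_aton("192.168.1.10"),      # hypothetical sender
    socket.inet_aton("202.206.100.34"),
)

if __name__ == "__main__":
    print(parse_ipv4_header(sample_header))
```

Because every field has a fixed position and width, any device that follows the same layout can read the header without negotiating anything with the sender first.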

Fig. 2.4 IPv4 header

2.1.2 Computer Communication Protocols

  1. 1.

    Overview of protocol layering

    The protocols used for computer communication in the Internet today form a set of protocols, namely the TCP/IPv4 protocol stack, which is currently the most complete and most widely used set of communication protocols. As shown in Fig. 2.5, each protocol in the TCP/IPv4 stack is independent, and each contains a Party A, a Party B, an objective, and terms of agreement. This set of protocols can be divided by function into application layer, transport layer, network layer and network interface layer protocols, which collaborate to enable communication between computers in the network.

    The TCP/IPv4 protocol stack can be used for both WANs and LANs. It is the cornerstone of the Internet and of intranets, and its charm lies in its ability to enable computers with different hardware structures and operating systems to communicate with each other. As shown in Fig. 2.5, its main protocols are the Transmission Control Protocol (TCP) and the Internet Protocol (IPv4).

  2. 2.

    The reason for protocol layering

    Why do we need this set of protocols for computer communication? How should the layering be understood? What is the relationship between the layers? In the following, we use online shopping as an example.

    In online shopping, a shopping protocol needs to be formed between the merchant and the customer, and both parties need to follow the transaction procedure specified by the shopping platform: the merchant provides the products; the customer browses, chooses a product, and pays online; the merchant ships the product; the customer receives the product and confirms receipt, after which the payment is finally transferred to the merchant’s account; the customer can return the product if he or she is not satisfied with it; and a customer who has purchased the product can comment on it. This is the shopping procedure, which can also be considered the protocol used in online shopping. Party A and Party B of the shopping agreement are the merchant and the customer, and the agreement stipulates the shopping procedure, i.e., what the merchant can do, what the customer can do, and in what order the operations take place. For example, a merchant cannot ship a product before the payment is made, and a customer cannot comment on a product without making a purchase. Such an “online shopping protocol” is equivalent to an application layer protocol adopted for computer communication. Similarly, online restaurant delivery, accessing websites, sending and receiving emails, remote login, and other applications each require an application layer protocol.

    Is the shopping protocol (the protocol of this layer) alone enough for conducting online shopping? As we all know, the purchased products also need to be delivered to the customer’s home by express delivery, and if the customer is not satisfied, the product needs to be returned to the merchant by express delivery as well. That means online shopping also needs the logistics services provided by delivery companies. Delivery companies such as SF Express, YTO Express, and ZTO Express all offer this function, that is, they provide logistics services for online shopping.

    It is worth noting that delivery companies also need a layer of protocol to deliver express packages, namely the express protocol. The express protocol stipulates the procedure and express waybill to be filled in for the express delivery. The customer fills in the information such as recipient and sender in the designated place in accordance with the format of the express waybill. The express waybill specifies the form that needs to be filled out for logistics, and the form specifies the content and location to be filled in. According to the recipient’s city, the delivery company sorts the packages, and selects the consignment route, and upon arrival at the target city, the couriers deliver the package to the recipient based on the specific address on the express waybill. As shown in Fig. 2.6, the IP header defined by IP is equivalent to the courier company’s express waybill, the purpose of which is to send the data packet to the destination address.

    Just as the delivery company provides logistics services for online shopping, there is a “service” relationship between the four layers of protocols in TCP/IPv4: each lower layer provides services for the layer above it. Specifically, the transport layer provides services for the application layer, the network layer provides services for the transport layer, and the network interface layer provides services for the network layer.

    Figure 2.7 depicts the layering of the TCP/IPv4 protocol stack and the functional range of each layer’s protocol. Party A and Party B of an application layer protocol are the server program and the client program, which carry out the functions of the application. Party A and Party B of a transport layer protocol are located in the two communicating computers. TCP implements reliable transmission for the application layer protocol, while UDP provides a message forwarding service for it. IP, the network layer protocol, selects the path for forwarding data packets across network segments. IP is a multi-party protocol that includes the two communicating computers as well as the routers passed through along the way. The data link layer is responsible for sending the packets of the network layer from one end of a link to the other. Devices on the same link are peer entities for the data link layer protocol, which works on one segment of the link and varies with the type of link. As shown in Fig. 2.7, Ethernet uses the CSMA/CD protocol, while a point-to-point link uses the Point-to-Point Protocol (PPP).

  3. 3.

    Advantages of protocol layering

    Layering the protocols used for computer communication has the following advantages.

    1. (a)

      The layers are independent of one another. A layer does not need to know how the layer below it works; it only needs to know the services that layer provides through the interlayer interface. To the lower layer, whatever the upper layer hands down is simply data to be processed, as shown in Fig. 2.8.

    2. (b)

      Good flexibility. Improvements and changes made in each layer do not affect other layers. For example, IPv4 implements the network layer function, and when it is upgraded to IPv6, it still implements the network layer function, while no change will be made to the TCP and UDP in the transport layer, nor will any change be made to the protocols used in the data link layer. As shown in Fig. 2.9, computers can use TCP/IPv4 and TCP/IPv6 for communication.

    3. (c)

      The functions of each layer can be implemented using the most appropriate technology. For instance, twisted-pair cable is used to connect the network if it is suitable for cabling, and wireless coverage is adopted if there are obstacles.

    4. (d)

      Promote standardization. Routers implement network layer functions and switches implement data link layer functions. Network layer standards and data link layer standards are the reason why routers and switches from different vendors can be connected to each other for computer communication.

    5. (e)

      Layering breaks a complex computer communication problem into multiple simpler problems and is conducive to network troubleshooting. For example, a network failure caused by a missing gateway setting on the computer is a network layer problem, a failure caused by a MAC address conflict is a data link layer problem, and a failure to access websites because the wrong proxy server is set in the IE browser is an application layer problem.
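Advantages (a) and (b) above can be illustrated with a toy Python sketch of encapsulation. It is not a real protocol implementation: the text tags such as `<tcp>` stand in for real headers, and the function names and layer lists are assumptions made for the example.

```python
def send_down(app_data, layers):
    """Pass data down the stack: each layer prepends its own header and
    treats everything handed to it by the layer above as opaque data."""
    unit = app_data
    for layer in layers:                         # ordered from top to bottom
        header = f"<{layer}>".encode("ascii")    # toy header, not a real format
        unit = header + unit
    return unit

def pass_up(frame, layers):
    """Pass data up the stack: each layer strips only its own header."""
    unit = frame
    for layer in reversed(layers):               # bottom layer is processed first
        header = f"<{layer}>".encode("ascii")
        if not unit.startswith(header):
            raise ValueError(f"expected a {layer} header")
        unit = unit[len(header):]
    return unit

# Swapping the network layer does not touch the transport or link layers.
STACK_V4 = ["tcp", "ipv4", "ethernet"]
STACK_V6 = ["tcp", "ipv6", "ethernet"]

if __name__ == "__main__":
    frame = send_down(b"GET /", STACK_V4)
    print(frame)                                 # b'<ethernet><ipv4><tcp>GET /'
    print(pass_up(frame, STACK_V4))              # b'GET /'
```

Replacing `STACK_V4` with `STACK_V6` changes only the network layer tag, which mirrors the point that upgrading IPv4 to IPv6 requires no change to TCP, UDP, or the data link layer protocols.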

Fig. 2.5 TCP/IPv4 protocol stack

Fig. 2.6 Express waybill

Fig. 2.7 The layering and functional range of protocols

Fig. 2.8 Relationships between layers

Fig. 2.9 IPv4 and IPv6 have the same function

2.1.3 Relationship Between OSI 7-Layer Model and the TCP/IPv4 Protocol Stack

The TCP/IPv4 protocol stack introduced earlier is the industry standard for Internet communication. When computer networks first emerged, communication was typically only possible between computers made by the same manufacturer. This barrier was broken in the late 1970s when the ISO created the Open Systems Interconnection (OSI) reference model (referred to as the OSI 7-layer model in this book).

The OSI 7-layer model divides the computer communication process into seven layers based on the functions and specifies the functions that each layer performs. This allows vendors of Internet devices as well as software companies to design their own hardware and software with reference to the OSI reference model, and network devices from different vendors can work in collaboration with each other.

The OSI 7-layer model is not a detailed protocol; the TCP/IPv4 stack is. So how should the relationship between them be understood? As an analogy, suppose the International Organization for Standardization defined a reference model for automobiles, stipulating that every automobile shall have a power system, a steering system, a braking system, and a transmission system. These systems correspond to the functions that the OSI 7-layer model assigns to each layer of computer communication. An automobile manufacturer, such as Audi, develops its own automobiles with reference to this model and equips them with all the functions the model requires; the resulting Audi automobile corresponds to the TCP/IPv4 protocol stack. Some Audi automobiles may use gasoline and others natural gas, and some may use 8-cylinder engines and others 10-cylinder engines, yet all of them implement the power system function of the reference model. Similarly, the OSI reference model only defines the functions that each layer of computer communication must implement, without specifying how to implement them. Different protocol stacks can implement them differently.

In the OSI reference model developed by the International Organization for Standardization, the computer communication process is divided into seven layers, which are illustrated below.

  1. 1.

    Application layer: application layer protocols are used to implement the functions of applications, and the standardization of the implementation methods leads to the application layer protocols. Due to the variety of applications in the Internet, such as accessing websites, sending and receiving emails, accessing file servers, there are all kinds of application layer protocols. The application layer protocols shall include what requests (commands) the client can send to the server, what responses the server can return to the client, the message formats used, the interaction order of commands, and so on.

  2. 2.

    Presentation layer: the presentation layer provides a presentation method for the information transmitted by the application layer. If a character file is transmitted by the application layer, it should be converted into data using a character set. If it is an image file or an application binary file, it should be converted into data by encoding. Whether the data is compressed or encrypted before transmission is also a matter for the presentation layer. The presentation layers of the sender and of the receiver are the two sides of the protocol. Encryption and decryption, compression and decompression, and the encoding and decoding of character files shall all follow the specification of the presentation layer protocol.

  3. 3.

    Session layer: the session layer establishes, maintains and closes a session for the communicating client and server programs. Establishing a session: for Computers A and B to communicate, a session should be established between them; establishing a session involves authentication, authorization, and so on. Maintaining a session: after the session is established, both sides of the communication start to transfer data. When the data transfer is completed, the session layer does not necessarily disconnect the session immediately, but maintains it according to the settings of the application and application layer, during which the two sides can transfer data over the session at any time. Closing a session: when the time specified by the application or application layer runs out, when A or B reboots or shuts down, or when the session is manually disconnected, the session between A and B is closed.

  4. 4.

    Transport layer: the transport layer mainly provides end-to-end services for communication between processes on different hosts, and handles transmission problems such as datagram errors and out-of-order datagrams. The transport layer is a key layer in the computer communication architecture: by using the data forwarding services provided by the network layer, it shields the upper layers from the communication details of the lower layers, so that users do not need to consider how the physical layer, data link layer and network layer work.

  5. 5.

    Network layer: the network layer is responsible for routing selection during the transmission process of data packets from the source network to the destination network. The Internet is a collection of multiple networks, and it is with the help of the routing path selection of the network layer that multiple networks can be connected and information can be shared.

  6. 6.

    Data link layer: The data link layer is responsible for transferring data from one end of the link to the other, the basic unit of transmission being the “frame”. It provides error control and flow control services for the network layer.

  7. 7.

    Physical layer: the physical layer is the lowest layer in the OSI reference model, which mainly defines the electrical, mechanical, procedural and functional standards of the system, such as voltage, bandwidth, maximum transmission distance and other similar characteristics. The primary function of the physical layer is to provide physical transmission for the data link layer by using transmission media. The basic unit of physical layer transmission is bitstream, i.e., 0s and 1s, which are also the most basic electrical or optical signals.

    The TCP/IPv4 protocol stack merges and streamlines the OSI reference model: its application layer implements the functions of the application, presentation, and session layers of the OSI reference model, and its network interface layer merges the data link layer and the physical layer, as shown in Fig. 2.10.

Fig. 2.10 OSI reference model and TCP/IP layering

2.2 Application Layer Protocols

Computer communication is essentially application communication on a computer, which usually consists of a client program initiating a communication request to a server program, and the server program returning a response to the client program, thus implementing the functions of the application.

There are many applications in the Internet, such as accessing websites, domain name resolution, sending e-mails, receiving e-mails, and transferring files. Each application needs to specify what requests the client program can send to the server, what responses the server program can return to the client, the order in which the client sends requests (commands) to the server, how to handle errors when they occur, what fields the request and response messages contain, the length of each field, and what the value of each field means. These provisions are the protocols used for application communication, and they are called application layer protocols.

Since it is a protocol, there is a Party A and a Party B. The client program and the server program of the communication are the Party A and Party B of the protocol, which are called peer entities in many books on computer networks, as shown in Fig. 2.11.

Fig. 2.11 Application layer protocol

The following lists the common application layer protocols in the TCP/IPv4 protocol stack and their uses.

  • HyperText Transfer Protocol (HTTP), which is used to access Web services.

  • HyperText Transfer Protocol over Secure Sockets Layer (HTTPS), which encrypts HTTP communication in transit.

  • Simple Mail Transfer Protocol (SMTP), which is adopted to send e-mails.

  • Post Office Protocol version 3 (POP3), which is used to receive e-mails.

  • Domain Name System (DNS), which is for domain name resolution.

  • File Transfer Protocol (FTP), which is used to upload and download files on the Internet.

  • Telnet, which is used to remotely configure network devices, Linux systems, and Windows systems.

  • Dynamic Host Configuration Protocol (DHCP), which is adopted to automatically configure IP addresses, subnet masks, gateways, DNS, etc. for computers or other network devices.

The following packet capture analysis of website access and file transfer traffic (with HTTP and FTP as the examples) shows how application layer protocols work, i.e., the interaction process between the client and the server, the requests sent by the client to the server, the responses sent by the server to the client, the format of request messages, and the format of response messages. This helps readers understand the application layer protocols.

2.2.1 HTTP Protocol

HTTP is the most widely used application layer protocol in the Internet. By the packet capture analysis of HTTP, this section observes the requests (commands) sent by the client (browser) to the Web server, the responses (status codes) returned by the Web server to the client, as well as the format of the request and response messages. The illustration of HTTP enabling the Web browser to access the Web server is presented in Fig. 2.12.

  1. 1.

    The main contents of HTTP

    To make it easier for readers to understand, the following part presents HTTP in the format of a lease agreement (only its main contents are shown).

    HTTP

    Party A: Web server

    Party B: Web browser

    HTTP is the transfer protocol used to carry hypertext from a World Wide Web (WWW) server to a local browser. HTTP is an application layer protocol of the TCP/IPv4 protocol stack and is used to transfer HTML files, image files, query results, etc.

    HTTP works on top of the client-server architecture. The browser acts as the HTTP client, sending all requests to the HTTP server (i.e., the Web server) through the Uniform Resource Locator (URL) (in this case, the URL entered in the browser). After the Web server receives the request, it sends a response message to the client.

    The terms of the protocol are as follows.

    1. (a)

      Steps of HTTP requests and responses

      1. (i)

        The client is connected to the Web server

        An HTTP client, usually a browser, establishes a TCP socket connection with the HTTP port of the Web server (TCP port 80 is used by default).

      2. (ii)

        Send HTTP request

        Through a TCP socket, the client sends a text request message to the Web server. A request message consists of four parts: request line, request header, blank line and request data.

      3. (iii)

        The Web server accepts the request and returns an HTTP response

        The web server parses the request and locates the requested resource. The server writes a copy of the resource to a TCP socket, which is read by the client. A response message consists of four parts: status line, response header, blank line and response data.

      4. (iv)

        Release the TCP connection

        If the connection mode is “close”, the server actively closes the TCP connection and the client passively closes it, which releases the TCP connection. If the connection mode is “keep-alive”, the connection is maintained for a period of time, during which further requests can be received.

      5. (v)

        The client browser parses the HTML content

        The client browser first parses the status line to check the status code, which indicates whether the request succeeded. It then parses each response header; the response headers indicate, for example, the length in bytes of the HTML document that follows and its character set. Finally, the client browser reads the HTML in the response data, formats it according to the HTML syntax, and displays it in the browser window.
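The five steps above can be sketched in Python with raw sockets. This is a minimal illustration rather than a real browser or Web server: the toy server `serve_once`, the helper names, the loopback address, the page content, and the use of an operating-system-assigned port instead of port 80 are all assumptions made for the example.

```python
import socket
import threading

def serve_once(listener):
    """Toy Web server: accept one connection, answer one request, then close it."""
    conn, _ = listener.accept()
    with conn:
        request = b""
        while b"\r\n\r\n" not in request:        # read up to the blank line
            chunk = conn.recv(1024)
            if not chunk:
                break
            request += chunk
        body = b"<html><body>hello</body></html>"
        head = (
            "HTTP/1.1 200 OK\r\n"
            "Content-Type: text/html; charset=utf-8\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Connection: close\r\n"
            "\r\n"
        ).encode("ascii")
        conn.sendall(head + body)                # step (iii): return the response
    # leaving the "with" block closes the connection: step (iv), mode "close"

def fetch(host, port, path="/"):
    """Toy client covering steps (i), (ii), (iv) and the parsing part of (v)."""
    with socket.create_connection((host, port), timeout=5) as sock:   # step (i)
        request = (f"GET {path} HTTP/1.1\r\n"
                   f"Host: {host}\r\n"
                   "Connection: close\r\n"
                   "\r\n")
        sock.sendall(request.encode("ascii"))                         # step (ii)
        raw = b""
        while True:
            chunk = sock.recv(4096)
            if not chunk:                        # server closed the connection
                break
            raw += chunk
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_code = int(lines[0].split(" ", 2)[1])  # status line comes first
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body

def demo():
    listener = socket.create_server(("127.0.0.1", 0))   # port 0: let the OS choose
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,), daemon=True)
    worker.start()
    try:
        return fetch("127.0.0.1", port, "/index.html")
    finally:
        worker.join(timeout=5)
        listener.close()

if __name__ == "__main__":
    print(demo())
```

Rendering the HTML, the last part of step (v), is left out; the sketch stops once the status code, the response headers, and the response data have been separated.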

    2. (b)

      Format of request messages

      Since HTTP is text-oriented, each field in the message consists of ASCII characters, and thus the length of each field is not fixed. As shown in Fig. 2.13, an HTTP request message consists of three parts, namely the request line, the header lines and the entity body, with a blank line separating the header lines from the entity body. The “CR” and “LF” in the figure represent “carriage return” and “line feed”, respectively.

      1. (i)

        Request line

        The request line indicates that the message is a request. Its three fields (the method, the URL, and the HTTP version) are separated by spaces.

      2. (ii)

        Header line

        This is used to specify some information about the browser, server or message body. The header can have multiple lines or can be left out. In each header line there is the field name of the header and its value, and each line ends with a “carriage return” and a “line feed”. At the end of the header line, there is a blank line to separate the header line from the entity body that follows.

      3. (iii)

        Entity body

        This field is generally not used in request messages such as GET requests; methods such as POST use it to carry the data being submitted.
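The request message format described above can be assembled in a few lines of Python. This is a sketch under stated assumptions: the function name `build_request`, the host `www.example.com`, and the path are illustrative, and only ASCII header values are handled.

```python
CRLF = "\r\n"   # "carriage return" followed by "line feed"

def build_request(method, url, headers, body=b"", version="HTTP/1.1"):
    """Assemble a request: request line, header lines, blank line, entity body."""
    request_line = f"{method} {url} {version}{CRLF}"        # three fields, space-separated
    header_lines = "".join(f"{name}: {value}{CRLF}"         # one "name: value" per line
                           for name, value in headers.items())
    blank_line = CRLF                                       # ends the header section
    return (request_line + header_lines + blank_line).encode("ascii") + body

if __name__ == "__main__":
    message = build_request("GET", "/index.html", {"Host": "www.example.com"})
    print(message.decode("ascii"))
```

Because the fields are variable-length text, the spaces and the CR LF sequences act as the delimiters that let the receiver split the message back into its fields.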

    3. (c)

      Methods in HTTP request messages

      The browser can send the following eight request methods (sometimes called “actions” or “commands”) to the Web server to indicate the different ways to operate the resources specified by the Request-URL.

      • GET: to request the resource identified by the Request-URL. When a web page is accessed by entering a URL in the browser’s address bar, the browser uses the GET method to request the web page from the server.

      • POST: to append new data to the resource identified by Request-URL and ask the requested server to accept the data attached to the request. It is often used to submit forms, such as submitting information to the server, posting, and logging in.

      • HEAD: to request to get the response message header of the resource identified by Request-URL.

      • PUT: to request the server to store a resource and use Request-URL as its identifier.

      • DELETE: to request the server to delete the resource identified by Request-URL.

      • TRACE: to request the server to return the request information received, mainly for testing or diagnostics.

      • CONNECT: it is used for proxy servers.

      • OPTIONS: to query the capabilities of the server, or the options and requirements related to the resource.

        The names of the methods are case-sensitive. When the resource targeted by a request does not support the corresponding request method, the server should return the status code 405 (Method Not Allowed); when the server does not recognize or support the corresponding request method, it should return the status code 501 (Not Implemented).
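The difference between status codes 405 and 501 can be sketched as a small Python function. The resource table `ALLOWED_METHODS`, its paths, and the function name are hypothetical; a real server would make this decision from its own configuration.

```python
# The eight methods listed above. Method names are case-sensitive,
# so "get" is not the same method as "GET".
KNOWN_METHODS = {"GET", "POST", "HEAD", "PUT", "DELETE", "TRACE", "CONNECT", "OPTIONS"}

# Hypothetical table of which methods each resource supports.
ALLOWED_METHODS = {
    "/index.html": {"GET", "HEAD"},
    "/login": {"POST"},
}

def status_for(method, url):
    """Return the status code a server might choose for a request."""
    if method not in KNOWN_METHODS:
        return 501          # Not Implemented: the server does not recognize the method
    allowed = ALLOWED_METHODS.get(url)
    if allowed is None:
        return 404          # Not Found: no such resource
    if method not in allowed:
        return 405          # Method Not Allowed: the resource does not support it
    return 200              # OK

if __name__ == "__main__":
    print(status_for("DELETE", "/index.html"))
    print(status_for("get", "/index.html"))
```

The order of the checks matters: whether the server knows the method at all is decided before any particular resource is consulted.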

    4. (d)

      Format of response messages

      After each request message is sent, a response message will be received. The first line of the response message is the status line. As shown in Fig. 2.14, the status line includes three items, namely, the version of HTTP, the status code and a simple phrase explaining the status code.

    5. (e)

      Status codes of HTTP response messages

      All status codes are three-digit, with a total of 33 types in five categories. The description is as follows.

      • 1xx indicates informational notifications, for example that the request has been received or is being processed.

      • 2xx indicates success, for example that the request has been accepted or understood.

      • 3xx indicates a redirect, meaning that further action must be taken to complete the request.

      • 4xx indicates a client error, for example that the request has incorrect syntax or cannot be completed.

      • 5xx indicates a server error, for example that the server failed to complete the request.

        The following three status lines are commonly seen in response messages.

        • HTTP/1.1 202 Accepted.

        • HTTP/1.1 400 Bad Request.

        • HTTP/1.1 404 Not Found.
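Splitting a status line into its three items and mapping the code to one of the five categories can be sketched in Python as follows. The function names and the category labels are assumptions chosen for the example.

```python
# Category labels keyed by the first digit of the status code.
CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirect",
    4: "client error",
    5: "server error",
}

def parse_status_line(line):
    """Split a status line into HTTP version, status code, and explanatory phrase."""
    version, code, phrase = line.rstrip("\r\n").split(" ", 2)
    if len(code) != 3 or not code.isdigit():
        raise ValueError("status code must be three digits")
    return version, int(code), phrase

def category(code):
    """Map a three-digit status code to its category."""
    return CATEGORIES.get(code // 100, "unknown")

if __name__ == "__main__":
    for line in ("HTTP/1.1 202 Accepted", "HTTP/1.1 400 Bad Request", "HTTP/1.1 404 Not Found"):
        version, code, phrase = parse_status_line(line)
        print(code, phrase, "->", category(code))
```

Splitting on at most two spaces keeps a multi-word phrase such as "Not Found" in one piece.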

          In summary, HTTP defines the steps a browser takes to access a Web server, what requests (methods) can be sent to the Web server, the format of HTTP request messages (what fields are there and what they mean), and what responses the Web server can send to the browser (status codes), and the format of HTTP response messages (what fields are there and what they mean).

          Likewise, the following content also has to be defined for other application layer protocols.

      • What requests (methods or commands) the client can send to the server.

      • The order in which the client and server commands interact, such as the POP3 protocol, which requires user authentication before receiving e-mails.

      • What responses (status codes) the server has, and what each status code means.

      • The format of each message in the protocol: what fields the message contains, whether the fields are fixed-length or variable-length, and, if they are variable-length, what the field delimiters are.

  2. 2.

    Packet capture analysis of HTTP

    A packet capture tool installed on the computer can capture the data packets sent and received by the network interface card, including those of application communication. This allows you to observe the interaction between the client and the server, that is, what requests the client sends and what responses the server returns. This is how the application layer protocol works.

    As shown in Fig. 2.15, enter “http and ip.addr == 202.206.100.34” in the display filter toolbar and apply the filter. Only the HTTP request and response packets are then displayed. Select the 1396th packet to see the HTTP request message it carries; its request method is GET, and it can be compared with the format of HTTP request messages introduced earlier.

    The 1440th data packet is a Web server response packet with a status code of 404. Status code 404 stands for “Not Found”.

    As shown in Fig. 2.16, the 11626th data packet is an HTTP response packet with a status code of 200, which indicates that the request was processed successfully. This is the status code returned in the normal case. This response message can be compared with the format of HTTP response messages introduced earlier.

    In addition to the GET method, which the client uses to request a web page, HTTP defines many other methods. For example, the POST method is used when the browser submits content to the server, such as when logging in to a site or searching a site. After a search has been submitted on the site, enter “http.request.method == POST” in the display filter and apply it. As shown in Fig. 2.17, the 19390th data packet appears, in which the client uses the POST method to submit the search content to the Web server.

Fig. 2.12 The illustration of HTTP enabling the Web browser to access the Web server

Fig. 2.13 HTTP message format

Fig. 2.14 Response message format

Fig. 2.15 GET method of HTTP request messages

Fig. 2.16 HTTP response message

Fig. 2.17 POST method in HTTP
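
The request and response formats described in this section can be sketched in a few lines of Python. This is only an illustration: the helper names build_request and parse_status_line, the host www.example.com and the path /index.html are assumptions made for the example and do not come from the captures shown in the figures.

```python
def build_request(method: str, path: str, host: str) -> bytes:
    """Build a minimal HTTP/1.1 request: request line, header fields, blank line."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line: method, URL, version
        f"Host: {host}",              # header field: name, colon, value
        "Connection: close",
        "",                           # empty line that ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response: bytes) -> tuple:
    """Split the first line of a response into version, status code and phrase."""
    first_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, phrase = first_line.split(" ", 2)
    return version, int(code), phrase


# Hypothetical request and a hand-written response used only for the demo.
request = build_request("GET", "/index.html", "www.example.com")
reply = b"HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n"

print(request.decode("ascii"))
print(parse_status_line(reply))
```

Every line ends with a carriage return and line feed, and the empty line marks where the header ends, which is the kind of field delimiter an application layer protocol has to define.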

2.2.2 FTP Protocol

FTP (File Transfer Protocol) is widely used on the Internet to control two-way file transfers. There are various FTP applications for different operating systems, but they all use the same protocol to transfer files. FTP hides the details of each computer system and reduces or eliminates incompatibilities in how different systems handle files, so it is suitable for transferring files between different operating systems. FTP provides only basic file transfer services and relies on TCP for reliable transmission.

While using FTP, users will encounter two concepts: “download” and “upload”. Downloading a file means copying it from the remote host to the user’s own computer, and uploading a file means copying it from the user’s own computer to the remote host. The following is a packet capture analysis of how FTP works.

In this experiment, Windows Server 2012 R2 is installed in a virtual machine with the FTP service enabled, and a packet capture tool on the client (Windows 10) captures the packets exchanged when the FTP client accesses the FTP server. By observing this interaction, you can see the requests sent by the client to the server and the responses returned by the server to the client. Disabling certain FTP commands on the FTP server, such as the command that deletes files, makes access to the server more secure.

After starting the packet capture tool on the FTP client, upload a file named test.txt, rename it to “abc.txt”, and finally delete abc.txt from the FTP server. The packet capture tool captures all the commands sent by the FTP client and all the responses returned by the FTP server. As shown in Fig. 2.18, right-click one of the FTP packets and click “Follow Stream” → “TCP Stream”; the window in Fig. 2.19 appears. It collates all the data exchanged between the FTP client and the FTP server, so you can see the FTP commands that were used: the STOR command uploads test.txt, the CWD command changes the working directory, the RNFR command (together with RNTO, which supplies the new name) renames test.txt, and the DELE command deletes abc.txt. Other FTP commands can be observed in the same way, such as those sent when the FTP client creates a folder on the FTP server, deletes a folder, or downloads a file.

Fig. 2.18 Follow stream

Fig. 2.19 Interaction process of FTP client accessing FTP server

To prevent the client from performing certain operations, you can configure the FTP server to disable some FTP commands. For example, to prohibit FTP clients from deleting files on the FTP server, you can configure FTP request filtering and deny the DELE command. As shown in Fig. 2.20, click “FTP Request Filtering”.

Fig. 2.20 Manage FTP request filtering

As shown in Fig. 2.21, click the “Commands” tab in the “FTP Request Filtering” interface that appears, click “Deny Command”, enter “DELE” in the pop-up “Deny Command” dialog box, and click the “OK” button.

Fig. 2.21 Disable DELE method

When you delete a file on the FTP server again on Windows 10, the prompt “500 Command not allowed” will appear, as shown in Fig. 2.22.

Fig. 2.22 Command not allowed
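
The effect of denying a command can be imitated with a toy filter. The sketch below is not how the Windows FTP service is implemented: the function handle_command and the placeholder reply for allowed commands are assumptions for illustration, and only the “500 Command not allowed” reply is taken from the text above. A real server returns command-specific reply codes.

```python
DENIED_COMMANDS = {"DELE"}  # commands this hypothetical server refuses


def handle_command(line: str) -> str:
    """Return a reply line for one FTP command line such as 'DELE abc.txt'."""
    verb = line.split(" ", 1)[0].upper()
    if verb in DENIED_COMMANDS:
        return "500 Command not allowed"
    # Placeholder: real servers answer each command with its own reply code.
    return f"200 {verb} accepted (placeholder reply)"


# The command sequence from the experiment: upload, rename, delete.
session = ["STOR test.txt", "RNFR test.txt", "RNTO abc.txt", "DELE abc.txt"]
for command in session:
    print(f"{command:<15} -> {handle_command(command)}")
```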

2.3 Transport Layer Protocols

The transport layer primarily provides end-to-end communication between applications on two hosts. In the TCP/IPv4 protocol stack, the transport layer contains two protocols: Transmission Control Protocol (TCP) and User Datagram Protocol (UDP).

2.3.1 Application Scenarios of TCP and UDP

TCP and UDP at the transport layer have their own application scenarios. In the following, the application scenarios of each will be described respectively.

  1.

    Application scenarios of TCP

    TCP provides reliable transmission services for application layer protocols. The sender sends data in order, and the receiver receives data in order. TCP is responsible for retransmission and sorting in case of packet loss or disorder. The following are the application scenarios of TCP.

    (a)

      The client program and the server program need several interactions to achieve the function of the application, such as POP3 for receiving e-mails and SMTP for sending e-mails, as well as FTP for transferring files, which all use TCP at the transport layer.

    (b)

      When an application transfers content that has to be segmented, such as a web page accessed through a browser or a file transferred using QQ, TCP is used at the transport layer to perform the segmentation.

      For example, downloading a 500 MB movie or a 200 MB software package from the network requires splitting the large file into multiple packets for sending, which may take several minutes or tens of minutes. During this period, the sending application writes the content as a byte stream into a cache, while the transport layer segments and numbers the byte stream in the cache and sends the segments in order. This process requires the sender and receiver to establish a connection and negotiate some parameters of the communication (e.g., the maximum number of bytes in a segment). Note that each segment referred to here becomes a data packet once the network layer adds an IP header.

      If a data packet is lost because the network is unstable, the sender must resend the lost packet, otherwise the received file will be incomplete; this is why TCP provides reliable transmission. If the sender sends too fast for the receiver to process, the receiver notifies the sender to slow down or even stop sending, which is the TCP flow control mechanism. Traffic on the Internet is not constant, and traffic peaks may cause network congestion (just like traffic jams in a city during rush hour), in which case routers drop packets that they cannot forward in time. TCP detects network congestion during transmission and adjusts its sending speed accordingly, which is its congestion avoidance mechanism.

      As shown in Fig. 2.23, the sending speed of the sender is controlled by two factors: whether the network is congested or not, and the receiving speed of the receiver, whichever speed is lower.

      Some applications become inefficient if they use TCP. For example, some applications fulfill their function simply by having the client send one request message to the server and the server return one response message. With TCP, such an exchange would also need three packets to establish the connection and four packets to close it. For such applications, UDP is usually used at the transport layer.

  2.

    Application scenarios of UDP

    (a)

      The client and server programs exchange messages that do not need segmentation. For example, for domain name resolution, the DNS protocol uses UDP at the transport layer. The client sends one message to the DNS server requesting the resolution of a website’s domain name, and the DNS server returns the result to the client in one message.

    (b)

      Real-time communication. Examples include using QQ and WeChat for voice chat and video chat. For such applications, the sender and receiver need real-time interaction, i.e., no long delays are allowed. Even if a few sentences are missed due to network congestion, there is no need to use TCP to wait for lost messages, because if the waiting time is too long, it will not be real-time chatting.

    (c)

      Multicast or broadcast communication. For example, in a school multimedia classroom, the content on the teacher’s computer screen needs to be received by the students’ computers. With multimedia classroom server software installed on the teacher’s computer and the corresponding client software installed on the students’ computers, the teacher’s computer sends messages to a multicast or broadcast address and all the students’ computers receive them. One-to-many communications like this use UDP at the transport layer.

      Knowing the characteristics and application scenarios of the two protocols at the transport layer, it is easy to determine what protocol a certain application layer protocol uses at the transport layer. Next, let’s analyze and determine what protocols are used at the transport layer for QQ file transfers and what protocols are used at the transport layer for QQ chats.

      When you transfer files to your friends by QQ, the process of the transfer will last for several minutes or tens of minutes, and the file transfer will not complete by a single packet, so the file to be transferred needs to be segmented. The reliable transmission, flow control, congestion avoidance and other functions that need to be implemented during the transfer process must all be implemented at the transport layer using the TCP protocol.

      When using QQ to chat with a friend, normally not much text is entered at one time, and a single packet is enough to send the chat content. After the first sentence is sent, it is not certain when the second one will be sent, that is, the process of sending data is not continuous, so it is not necessary to keep the two communicating computers connected all the time. Therefore, UDP is used at the transport layer to send the QQ chat content.

      In summary, it can be seen that depending on the characteristics of the communication, applications can choose different protocols at the transport layer.

Fig. 2.23 TCP functions
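
The request/response pattern that suits UDP can be tried on a single computer. The following sketch assumes that the loopback address 127.0.0.1 is available and lets the operating system pick the server port; the message contents are made up and are not real DNS messages.

```python
import socket

# "Server" side: a UDP socket bound to an OS-chosen port on the loopback address.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))
server.settimeout(2.0)
server_addr = server.getsockname()

# "Client" side: no connection setup, the request is sent as one datagram.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(2.0)
client.sendto(b"query: www.example.com", server_addr)

# The server reads the request and answers with a single datagram.
request, client_addr = server.recvfrom(1024)
server.sendto(b"answer for " + request[7:], client_addr)

reply, _ = client.recvfrom(1024)
print(reply.decode())

client.close()
server.close()
```

Exactly two datagrams are exchanged, and nothing is sent to establish or release a connection. The same exchange over TCP would add the connection setup and teardown packets mentioned in the text.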

2.3.2 Relationship Between Transport Layer Protocols and Application Layer Protocols

There are many application layer protocols, but only two transport layer protocols. So how can these two transport layer protocols be used to identify the different application layer protocols?

Usually, a transport layer protocol is used in conjunction with a port number to identify an application layer protocol. At the transport layer, a port is identified by a 16-bit binary number, so port numbers range from 0 to 65,535, which is sufficient for a computer.

Port numbers can be divided into two categories, namely, port numbers used by servers and those used by clients.

  1.

    Port numbers used by servers.

    The port numbers used by servers can be divided into two categories. The more important one consists of the well-known port numbers, also called system port numbers, which range from 0 to 1023. The Internet Assigned Numbers Authority (IANA) has assigned these port numbers to some of the most important TCP/IP applications, so that all users know them. Some commonly used well-known port numbers are given in Fig. 2.24.

    The other category is called registered port numbers, with a value range of 1024 to 49,151. These port numbers are for applications that do not have a well-known port number. Such port numbers must be registered with IANA according to the prescribed procedures to prevent duplication. For example, Microsoft’s Remote Desktop Protocol (RDP) uses TCP port 3389, which is a registered port number.

  2.

    Port numbers used by clients.

    When you open a browser to visit a website, or log in to QQ or other client software that establishes a connection with a server, the computer assigns a temporary port to the client software. This is the client port, with a value between 49,152 and 65,535. Since these port numbers are selected dynamically only while the client process is running, they are also called temporary (ephemeral) port numbers and are reserved for this purpose. When the server process receives a message from the client process, it learns the port number used by the client process and can therefore send data back to it. When the communication ends, the client port number is released and becomes available to other client processes.

    The following is a list of the default protocols and port numbers used by some common application layer protocols.

    • HTTP uses TCP port 80 by default.

    • FTP uses TCP port 21 by default.

    • SMTP uses TCP port 25 by default.

    • POP3 uses TCP port 110 by default.

    • HTTPS uses TCP port 443 by default.

    • DNS uses UDP port 53 by default.

    • RDP uses TCP port 3389 by default.

    • Telnet uses TCP port 23 by default.

    • Windows uses TCP port 445 to access shared resources by default.

    • Microsoft SQL database uses TCP port 1433 by default.

    • MySQL database uses TCP port 3306 by default.

      These are the default ports; the port used by an application layer protocol can also be changed. If the default port is not used, the client needs to specify the port explicitly. As shown in Fig. 2.25, the server is running a Web service, an SMTP service and a POP3 service, which use HTTP, SMTP and POP3 to communicate with clients, respectively. Computer A, Computer B and Computer C in the network access the server’s Web service, SMTP service and POP3 service, respectively, by sending three packets ①②③ whose destination ports are 80, 25 and 110. After the server receives these three packets, it hands each one to the corresponding service according to its destination port.

      In summary: the destination IP address of the data packet is used to locate a certain server in the network, and the destination port is to locate a certain service on the server.

Fig. 2.24 Well-known port numbers

Fig. 2.25 Relationship between ports and services
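
The port ranges and the idea of handing a packet to a service by its destination port can be summarized in a short sketch. The function names classify_port and deliver and the SERVICES table are illustrative assumptions; the ranges and default ports are the ones listed in this section.

```python
def classify_port(port: int) -> str:
    """Classify a 16-bit port number using the ranges given in the text."""
    if not 0 <= port <= 65535:
        raise ValueError("port numbers are 16-bit values (0 to 65535)")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic/ephemeral"


# Default (transport protocol, port) -> service, taken from the list above.
SERVICES = {
    ("TCP", 80): "HTTP",
    ("TCP", 25): "SMTP",
    ("TCP", 110): "POP3",
    ("UDP", 53): "DNS",
    ("TCP", 3389): "RDP",
}


def deliver(protocol: str, dst_port: int) -> str:
    """Pick the service that should receive a packet, as in Fig. 2.25."""
    return SERVICES.get((protocol, dst_port), "no service listening")


for proto, port in [("TCP", 80), ("TCP", 25), ("TCP", 110), ("TCP", 3389), ("TCP", 50000)]:
    print(proto, port, classify_port(port), deliver(proto, port))
```

The lookup key combines the transport protocol with the port number, which reflects the point made above: it is the pair, not the port alone, that identifies an application layer protocol.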

2.3.3 TCP Headers

The following illustrates the format of the TCP message header. TCP provides data segmentation, reliable transmission, flow control, congestion avoidance, and so on. Therefore, the TCP header has more fields than the UDP header, and its length is not fixed. As shown in Fig. 2.26, the first 20 bytes of the header of a TCP message segment are fixed, followed by 4N bytes of options that are added as needed (N is a non-negative integer). Therefore, the minimum length of a TCP header is 20 bytes.

Fig. 2.26 TCP header

The meanings of each field in the fixed part of the TCP header are described below.

  1.

    The source port and the destination port are two bytes each and hold the source port number and the destination port number, respectively. The transport layer port number is used to identify an application layer protocol.

  2.

    The sequence number is four bytes. Its range is [0, 2^32 − 1], a total of 2^32 (i.e., 4,294,967,296) values. After the sequence number reaches 2^32 − 1, the next number wraps around to 0. TCP is byte-stream oriented: each byte of the byte stream transmitted in a TCP connection is numbered sequentially, and the starting number of the entire byte stream must be set when the connection is established. The value of the sequence number field in the header is the sequence number of the first byte of the data sent in this message segment. Figure 2.27 shows an example in which Computer A sends a file to Computer B, illustrating the use of the sequence number and acknowledgement number. To simplify the illustration, the other fields of the transport layer are not shown. The value of the sequence number field of the first message segment is 1, and the segment carries 100 bytes of data. This means that the sequence number of the first byte of the data in this segment is 1 and that of the last byte is 100. The data of the next message segment should start from 101, that is, the value of the sequence number field of the next message segment should be 101. This field is also called the “message segment sequence number”.

    Computer B will store the received packets into the cache, and sort the bytes in the received packets according to the sequence number, and then the program of Computer B will read the bytes with consecutive numbers from the cache.

  3.

    The acknowledgement number is four bytes and is the sequence number of the first data byte that the receiver expects in the next message segment from the other party.

    TCP provides reliable transmission. After the receiver receives several packets, it sends the sender an acknowledgement packet to tell the sender which byte the next packet should start with. As shown in Fig. 2.27, after Computer B receives two packets, it sorts the bytes in the two packets to get the first 200 consecutive bytes. Computer B then sends an acknowledgement packet to Computer A, telling Computer A that it should send the 201st byte next, so the acknowledgement number of this packet is 201. The acknowledgement packet has no data portion, only the TCP header.

    In a word, we should remember that if the acknowledgement number is N, then all data up to the sequence number of N-1 has been received correctly.

    Since the sequence number field is 32 bits long and can number 4 GB (i.e., 4 gigabytes) of data, in general, it ensures that when the sequence number is reused, the data of the old sequence number will have already reached the destination of the network.

  4.

    The data offset occupies four bits and indicates the distance from the start of the TCP message segment to the start of its data. In effect, this field gives the length of the TCP header. The data offset field is essential because the header contains option fields of variable length. Note, however, that the unit of the data offset is four bytes. Since the largest decimal number that can be represented by a four-bit binary number is 15, the maximum data offset is 15 × 4 = 60 bytes, which is the maximum length of a TCP header, so the options cannot exceed 40 bytes. If there is only the fixed 20-byte header, the value of the data offset is 5, which is 0101 as a four-bit binary number.

  5.

    The reserved field takes six bits. It is reserved for future use and should be set to 0 for now.

  6.

    URG (URGent). When URG = 1, the urgent pointer field is valid. It tells the system that this message segment contains urgent data that should be delivered as soon as possible (equivalent to high-priority data), rather than in the original order. For example, a long program has been sent to run on a remote host, but then a problem is discovered that requires the program to be terminated, so the user issues an interrupt command (Control + C) via the keyboard. If urgent data is not used, these two characters are stored at the end of the receiving TCP’s cache. Only after all the preceding data has been processed are they delivered to the receiver’s application process, which wastes a lot of time.

    When URG is set to 1, the sending application process tells the sending TCP that it has urgent data to transmit. The sending TCP then inserts the urgent data at the front of the data in this message segment, while the data following the urgent data remains ordinary data. This flag is used in conjunction with the urgent pointer field in the header.

  7.

    ACK (ACKnowledgement). The acknowledgement number field is valid only when ACK = 1. When ACK = 0, the acknowledgement number is invalid. TCP stipulates that all transmitted message segments must have ACK set to 1 after the connection is established.

  8.

    PSH (PuSH). In the interactive communication of two application processes, sometimes the application process at one end expects to receive a response from the other end immediately after entering a command. In this case, TCP can use the Push operation, that is, the TCP sender sets PSH to 1 and immediately creates and sends a message segment. When the TCP receiver receives a message segment with PSH = 1, it delivers it to the receiving application process as soon as possible (i.e., “pushes” forward), rather than delivering it after the entire cache is filled. Although the application process can choose to use Push, it is rarely used.

  9.

    RST (ReSeT). When RST = 1, it indicates a serious error in the TCP connection (such as a host crash or other causes), so the connection must be released and then the transport connection will be re-established. Setting RST to 1 can also be used to deny an illegal message segment or refuse to open a connection.

  10.

    SYN (SYNchronization). This is used by TCP to synchronize sequence numbers when a connection is established. When SYN = 1 and ACK = 0, this is a message segment requesting connection. If the other party agrees to establish the connection, then SYN = 1 and ACK = 1 shall be used in the response message segment. Therefore, a SYN of 1 indicates that this is a message of requesting a connection or accepting the connection. The establishment and release of connections will be explained in detail later in the TCP Connection Management section.

  11.

    FIN (FINish, meaning “finished” or “end”). TCP uses this field to release a connection. FIN = 1 indicates that the sender of this message segment has finished transmitting the data and requests to release the transmission connection.

  12.

    The window is two bytes. The window value is an integer in the range [0, 2^16 − 1]. TCP provides flow control, and the window value tells the other party how much data (in bytes) the receiver currently allows the sender to send, starting from the acknowledgement number in the header of this message segment. The reason for this limit is that the receiver has limited cache space. In short, the window value is the basis on which the receiver instructs the sender to set its sending window. A computer that uses TCP to transmit data adjusts the window value at any time according to its own receiving capability, and the sender adjusts its sending window promptly with reference to this value, thereby controlling the flow.

  13.

    Checksum takes two bytes. The scope of the checksum field includes both the header and the data.

  14.

    Urgent pointer is two bytes. The urgent pointer is meaningful only when URG = 1, and it points out the number of bytes of urgent data in this message segment (the urgent data is followed by ordinary data). Thus, the urgent pointer indicates the position of the end of the urgent data in the message segment. After all the urgent data has been processed, TCP tells the application process to resume normal operation. It is important to note that urgent data can be sent even when the window value is zero.

  15.

    Options are variable in length, up to 40 bytes. When no options are used, the TCP header is 20 bytes. TCP originally specified only one option, the Maximum Segment Size (MSS). MSS is the maximum length of the data field in each TCP message segment. The data field plus the TCP header makes up the entire TCP message segment, so the MSS is not the maximum length of the entire TCP message segment, but rather “the length of the TCP message segment minus the length of the TCP header”.
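
The arithmetic behind the sequence number and acknowledgement number fields can be checked with a small sketch. The function names next_seq and cumulative_ack are assumptions introduced here; the numbers reproduce the example in which the first segment starts at 1 and each segment carries 100 bytes.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers are 32-bit values and wrap around to 0


def next_seq(seq: int, payload_len: int) -> int:
    """Sequence number of the segment that follows one carrying payload_len bytes."""
    return (seq + payload_len) % SEQ_SPACE


def cumulative_ack(first_seq: int, received_lengths: list) -> int:
    """Acknowledgement number after receiving consecutive segments in order."""
    ack = first_seq
    for length in received_lengths:
        ack = next_seq(ack, length)
    return ack


print(next_seq(1, 100))               # 101: start of the second segment
print(cumulative_ack(1, [100, 100]))  # 201: ack sent after two 100-byte segments
print(next_seq(SEQ_SPACE - 1, 1))     # 0: wrap-around at the top of the range
```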

Fig. 2.27 Understand sequence number and acknowledgement number
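
As a rough illustration of the header layout, the fixed 20-byte part can be decoded with Python’s struct module. The function name parse_tcp_header and the hand-built SYN segment are assumptions for the example, the checksum is left at 0 as a placeholder rather than computed, and the flag layout follows the six reserved bits and six flag bits described in this section.

```python
import struct

FLAG_NAMES = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]  # bit 0 up to bit 5


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the 20-byte fixed part of a TCP header (network byte order)."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    data_offset = offset_flags >> 12   # upper 4 bits, counted in 4-byte units
    flag_bits = offset_flags & 0x3F    # lower 6 bits hold URG ... FIN
    flags = [name for i, name in enumerate(FLAG_NAMES) if flag_bits & (1 << i)]
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": data_offset * 4,  # in bytes, between 20 and 60
        "flags": flags, "window": window,
        "checksum": checksum, "urgent_pointer": urgent,
    }


# Hand-built SYN segment: data offset 5 (20-byte header), only SYN set,
# checksum 0 as a placeholder (not a valid checksum).
syn = struct.pack("!HHIIHHHH", 50000, 80, 1000, 0, (5 << 12) | 0x02, 65535, 0, 0)
print(parse_tcp_header(syn))
```

Multiplying the data offset by four gives the header length in bytes, so a value of 5 corresponds to the 20-byte header without options and a value of 15 to the 60-byte maximum.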

2.3.4 TCP Connection Management

TCP is a reliable transmission protocol. Before communication formally starts, computers using TCP need to confirm the presence of the other party and negotiate the parameters of the communication, such as the size of the receiving window at the receiving end, the maximum message segment length supported, whether selective acknowledgement (SACK) is allowed, and whether timestamps are supported. Once a connection is established, two-way communication is enabled, and the connection must be released when the communication ends.

TCP connections are established using the client/server method. The application process that actively establishes the connection is called the client, while the application process that passively waits for the connection to be established is called the server. The following section describes the establishment and release of TCP connections in detail.

  1.

    The establishment of TCP connection

    The process of establishing a TCP connection is elaborated in Fig. 2.28. The client initiates communication with the server, and a TCP session is established between the TCP modules of the client and the server through “three-way handshaking”. “Three-way handshaking” means that a total of three TCP packets (which contain no data, but only TCP headers) are exchanged during the establishment of the TCP session, and these three packets are the packets that TCP protocol uses to establish the connection. It should be noted that at different phases, different states can be observed on the client and server.

    When the server starts the service, it listens for client requests on a TCP port and waits for clients to connect, and its state changes from CLOSED to LISTEN. The three-way handshake is described in detail below. Note that the abbreviations are case-sensitive: upper-case ACK refers to the ACK flag bit, while lower-case ack refers to the value of the acknowledgement number.

    (a)

      The client application sends a TCP connection request message to tell the other party its status. In the TCP header of this message, the SYN flag bit is 1, the ACK flag bit is 0, and the sequence number (seq) is x, which is called the initial sequence number of the client. Packet capture tools usually display x as a relative value of 0, although the actual value chosen by the client is generally not 0. After sending the connection request message, the client is in the SYN_SENT state.

    (b)

      After receiving the TCP connection request from the client, the server sends an acknowledgement message for connection to inform the client about its state. The SYN flag bit of the TCP header of this message is 1, the ACK flag bit is 1, the acknowledgement number (ack) is x + 1, and the sequence number (seq) is y (y is the initial sequence number of the server). The server is then in the SYN_RCVD state.

    (c)

      After the client receives the acknowledgement message for its connection request, its state changes to ESTABLISHED and it sends another acknowledgement message to the server to confirm the establishment of the session. The SYN flag bit of this message is 0, the ACK flag bit is 1, and the acknowledgement number (ack) is y + 1. When the server receives this acknowledgement message, its state also changes to ESTABLISHED.

      Note that the three-way handshake in fact establishes two TCP sessions between the client and the server, one from the client to the server and the other from the server to the client. Since the client initiates the communication, it has information to pass to the server, so it first sends a SYN segment requesting the establishment of a TCP session from the client to the server. The purpose of this session is to ensure that information is passed from the client to the server correctly and reliably. After receiving the SYN segment, the server sends a SYN + ACK segment in response. This SYN + ACK segment means that the server agrees to the client’s request and, at the same time, requests the establishment of a TCP session from the server to the client, whose purpose is to ensure the correct and reliable delivery of information from the server to the client. After receiving the SYN + ACK segment, the client responds with an ACK segment, indicating that it agrees to the server’s request. After that, two-way reliable communication is enabled.

  2.

    Release of TCP connection

    After TCP communication is completed, the connection shall be released. The process of releasing a TCP connection is complicated, so we will clarify the process of releasing a connection by combining the change of status of both parties. When the data transfer is over, both the sender and the receiver can release the connection. As shown in Fig. 2.29, both A and B are now in the ESTABLISHED state, and A’s application process first sends a connection release message to its TCP, stops sending data, and actively closes the TCP connection. A sets the FIN of the header of the release message segment to 1, with the sequence number of seq = u, which is equal to the sequence number of the last byte of the previously transmitted data plus 1. At this point, A enters the FIN-WAIT-1 state, waiting for the acknowledgement from B. Note that TCP specifies that the FIN message segment consumes a sequence number even if it contains no data.

    When B receives the connection release message, it sends an acknowledgement with an acknowledgement number of ack = u + 1 and its own sequence number v, which is equal to the sequence number of the last byte of the data that B has previously transmitted plus 1. B then enters the CLOSE-WAIT state, and the TCP server process notifies the higher-layer application process, so the connection is released in the direction from A to B. The TCP connection is now in the half-close state: A has no more data to send, but if B sends data, A still has to receive it. In other words, the connection in the direction from B to A is not closed. This state may last for some time.

    Once A receives the acknowledgment from B, it enters the FIN-WAIT-2 state and waits for B to send a connection release message segment. If B has no data to send to A, its application process will notify TCP to release the connection. Then B must send a connection release message with FIN = 1. Now assume that B’s sequence number is w (B may have sent some more data in the half-close state.). Meanwhile, B must also repeat the acknowledgement number ack = u + 1 that it has sent the last time. At this point, B enters the LAST-ACK (last-acknowledgment) state and waits for A’s acknowledgement.

    After receiving the connection release message segment from B, A must acknowledge it. In the acknowledgement message segment, ACK is set to 1, the acknowledgement number is ack = w + 1, and A’s own sequence number is seq = u + 1 (according to the TCP standard, the previously sent FIN message segment consumes one sequence number). A then enters the TIME-WAIT state. Note that the TCP connection has not been released yet: A enters the CLOSED state only after the TIME-WAIT timer, which is set to 2MSL, expires. MSL stands for Maximum Segment Lifetime, and RFC 793 recommends a value of 2 min. However, this is purely an engineering choice, and MSL = 2 min may be too long for today’s networks. Therefore, TCP allows implementations to use smaller MSL values as appropriate. Thus, with MSL = 2 min, A waits 4 min in the TIME-WAIT state before it enters the CLOSED state and a new connection can be established.
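
The sequence and acknowledgement numbers used during connection establishment and release can be written out as a small trace. The sketch below only restates the values given in the text; the function names handshake and release are assumptions, wrap-around at 2^32 is ignored for brevity, and a value of None marks a field whose value the text does not specify.

```python
def handshake(x: int, y: int) -> list:
    """Three segments of connection establishment as (sender, flags, seq, ack)."""
    return [
        ("client", "SYN", x, None),       # ACK flag is 0, so ack is not valid
        ("server", "SYN+ACK", y, x + 1),
        # seq = x + 1 because the SYN consumed one sequence number
        # (standard TCP behavior, not stated explicitly in the text).
        ("client", "ACK", x + 1, y + 1),
    ]


def release(u: int, v: int, w: int) -> list:
    """Four segments of connection release when A closes first."""
    return [
        ("A", "FIN", u, None),            # A's ack value is not given in the text
        ("B", "ACK", v, u + 1),
        ("B", "FIN+ACK", w, u + 1),       # B repeats ack = u + 1
        ("A", "ACK", u + 1, w + 1),       # the FIN consumed one sequence number
    ]


for segment in handshake(x=0, y=0):
    print(segment)
for segment in release(u=1001, v=5001, w=5201):
    print(segment)
```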

Fig. 2.28
figure 28

Establish TCP connection through three-way handshaking

Fig. 2.29
figure 29

The process of TCP connection release
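The sequence and acknowledgement numbers used during connection release can be tabulated with a short Python sketch. It assumes, as in the description above, that A's FIN carries seq = u, that B's acknowledgement carries seq = v, and that B's FIN carries seq = w. The names `Segment`, `release_segments` and `time_wait_seconds` are illustrative only and do not come from any TCP implementation.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    sender: str            # "A" (the side closing first) or "B"
    flags: str             # flags named in the text for this segment
    seq: int               # sequence number of the segment
    ack: Optional[int]     # acknowledgement number, if the text gives one


def release_segments(u: int, v: int, w: int) -> list:
    """Return the four segments exchanged when A releases the connection first."""
    return [
        Segment("A", "FIN", u, None),        # A enters FIN-WAIT-1
        Segment("B", "ACK", v, u + 1),       # B enters CLOSE-WAIT, A enters FIN-WAIT-2
        Segment("B", "FIN+ACK", w, u + 1),   # B enters LAST-ACK
        Segment("A", "ACK", u + 1, w + 1),   # A enters TIME-WAIT, B enters CLOSED
    ]


def time_wait_seconds(msl_seconds: float) -> float:
    """A stays in TIME-WAIT for 2 x MSL before entering CLOSED."""
    return 2 * msl_seconds


if __name__ == "__main__":
    # Example values chosen arbitrarily; w > v because B sent data while half-closed.
    for segment in release_segments(u=1000, v=5000, w=5200):
        print(segment)
    print("TIME-WAIT lasts", time_wait_seconds(120), "seconds when MSL = 2 min")
```

The last segment shows why the FIN consumes a sequence number: A acknowledges w + 1 and uses u + 1 as its own sequence number.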

2.3.5 Implementation of TCP Reliable Transmission

The TCP protocol implements reliable transmission by using the sliding window protocol and the Automatic Repeat-reQuest (ARQ) protocol. The following describes the working process of the sliding window protocol and the continuous ARQ protocol.

  1. 1.

    Working process of sliding window protocol

    After the TCP protocol establishes a connection, both parties can use the established connection to send data to each other. For the convenience of discussion, here we only consider that A sends data while B receives data and sends an acknowledgement. Therefore, A is called the sender and B the receiver.

    The sliding window is byte-stream oriented, and to make it easier for readers to remember the sequence number of each packet, it is assumed here that each packet is 100 bytes. To facilitate the drawing, the packets are numbered for a simplified representation, as shown in Fig. 2.30. But be sure to remember the sequence number of each packet.

    As shown in Fig. 2.31, when establishing a TCP connection, Computer B tells Computer A that it has a receiving window of 400 bytes, and Computer A sets a sending window of 400 bytes. If each packet has 100 bytes, the sending window holds four packets, M1, M2, M3 and M4, and Sender A can send these four packets consecutively. The sending time of each packet is recorded, as shown at time t1 in Fig. 2.31. After the four packets have been sent, A stops sending. Receiver B receives these four consecutive packets and only needs to send one acknowledgement to A with the acknowledgement number 401, notifying A that it has received all bytes up to and including byte 400. As shown at time t2 in Fig. 2.31, when Sender A receives the acknowledgement for M4, the sending window slides forward, and M5, M6, M7 and M8 enter the sending window. These four packets can be sent consecutively, and after they are sent, A stops sending and waits for an acknowledgement. This is the sliding window protocol.

  2. 2.

    The working process of continuous ARQ protocol

    Suppose M7 is lost during transmission, so that Computer B receives M5, M6 and M8. Because only M1 to M6 have arrived consecutively, B sends Computer A an acknowledgement with the acknowledgement number 601, notifying Computer A that all bytes up to and including byte 600 have been received. At time t3 in Fig. 2.31, Computer A receives the acknowledgement and, instead of resending M7 immediately, slides the sending window forward so that M9 and M10 enter the sending window and are sent. When is M7 resent? M7 is resent automatically when its retransmission timer expires. The timeout is a little longer than one round-trip time. If M7 times out just after M9 has been sent, the sending sequence becomes M9, M7 and M10. This is the continuous ARQ protocol.

Fig. 2.30
figure 30

Simplified representation of packets

Fig. 2.31
figure 31

The continuous ARQ protocol and sliding window protocol
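The acknowledgement numbers and window positions in the example above can be checked with a minimal Python sketch. It assumes 100-byte packets numbered M1, M2, ... with byte numbering starting at 1, as in Fig. 2.30. The function names are illustrative and not part of any TCP implementation.

```python
PACKET_SIZE = 100  # bytes per packet, as assumed in the text


def first_byte(packet_number: int) -> int:
    """Sequence number of the first byte carried by packet M<packet_number>."""
    return (packet_number - 1) * PACKET_SIZE + 1


def cumulative_ack(received: set) -> int:
    """Acknowledgement number sent by the receiver.

    The receiver acknowledges only the packets that arrived consecutively
    from M1 onward, so the result is the first byte it is still missing.
    """
    packet_number = 1
    while packet_number in received:
        packet_number += 1
    return first_byte(packet_number)


def packets_in_window(ack_number: int, window_bytes: int) -> list:
    """Packet numbers inside the sending window after ack_number arrives."""
    start = (ack_number - 1) // PACKET_SIZE + 1
    count = window_bytes // PACKET_SIZE
    return list(range(start, start + count))


if __name__ == "__main__":
    # Time t2: M1..M4 arrived, the window slides to M5..M8.
    ack = cumulative_ack({1, 2, 3, 4})
    print(ack, packets_in_window(ack, 400))

    # Time t3: M7 was lost, so only M1..M6 are acknowledged.
    ack = cumulative_ack({1, 2, 3, 4, 5, 6, 8})
    print(ack, packets_in_window(ack, 400))
```

In the second case the window covers M7 to M10. M7 and M8 have already been sent, so only M9 and M10 are new, and M7 stays in the window until its timer expires and it is resent.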

2.3.6 UDP

Like TCP, UDP is a transport layer protocol located directly above the IP protocol. Unlike TCP, UDP is connectionless: it provides a way for applications to send data encapsulated in IP packets without establishing a connection.

UDP does not sequence the packets it sends, retransmit lost packets, or perform flow control. In other words, after a message is sent, the sender has no way to know whether it has arrived safely and completely. The main job of UDP is to identify an application layer protocol by the combination of UDP and a port number.
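The connectionless nature of UDP can be seen with Python's standard `socket` module. The following is a minimal sketch that sends one datagram to a receiver on the loopback address 127.0.0.1. The function name `send_datagram_on_loopback` is illustrative, and the port is chosen by the operating system.

```python
import socket


def send_datagram_on_loopback(payload: bytes) -> bytes:
    """Send one UDP datagram to a local receiver and return what it received."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Port 0 asks the operating system for any free port.
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)
        port = receiver.getsockname()[1]

        # No connection is established: the datagram is simply addressed
        # to an IP address and a port, and then sent.
        sender.sendto(payload, ("127.0.0.1", port))

        data, _sender_address = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(send_datagram_on_loopback(b"hello over UDP"))
```

The sender gets no confirmation of delivery. If the datagram were lost, `sendto` would still succeed and only the receiver's timeout would reveal the loss.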

2.4 Network Layer Protocols

The network layer is used to process the packets that flow over the network. A packet is the smallest unit of data transmitted over the network, and this layer specifies the path (the so-called transmission line) through which the packet reaches the other computer and delivers the packet to the other party.

2.4.1 Two Versions of Network Layer Protocols

There are two versions of the core protocols of the network layer in the TCP/IP layered model, that is IPv4 and IPv6, which are collectively referred to in this book as IP. IPv6 is an improvement over IPv4, but their functions are the same. The network layer protocol serves the transport layer, responsible for sending segments of the transport layer to the receiver. IP protocol implements the functions of network layer protocols. The sender adds the IP headers to the transport layer segments. The IP header includes the source and target IP addresses, and the segments with the IP headers are called “packets”. Routers in the network forward packets based on IP headers.

As shown in Fig. 2.32, there are four protocols in the network layer of the TCP/IPv4 stack: ARP, IPv4, ICMP, and IGMP, among which ARP, ICMP, and IGMP are secondary protocols. TCP and UDP use port numbers to identify application layer protocols, while TCP segments, UDP messages, ICMP messages, and IGMP messages can all be encapsulated in IPv4 packets, differentiated by protocol numbers. It means that IPv4 uses protocol numbers to identify the upper-layer protocols, with TCP’s protocol number being 6, UDP’s being 17, ICMP’s being 1, and IGMP’s being 2. Although ICMP and IGMP are both at the network layer, in terms of their relationship, ICMP and IGMP are above the IP protocol, which means that ICMP and IGMP messages are to be encapsulated in IPv4 packets.

Fig. 2.32
figure 32

TCP/IPv4 protocol stack

ARP is used only in Ethernet to resolve IP addresses to MAC addresses. Only when the MAC address is resolved can the packet be encapsulated into a frame and sent out. Therefore, ARP provides a service for IP. Although ARP belongs to the network layer, in terms of relationship, ARP is under the IP protocol.

Figure 2.33 shows the TCP/IPv6 protocol stack, in which the network layer protocols have changed significantly, but it does not affect the transport layer protocols, nor the data link layer protocols. The network layer of the TCP/IPv6 protocol stack does not have ARP or IGMP protocols, the functions of ICMP protocols have been greatly extended, and the functions of ARP and IGMP protocols are also embedded in ICMPv6, namely, Neighbor Discovery (ND) and Multicast Listener Discovery (MLD) protocols, respectively.

Fig. 2.33
figure 33

TCP/IPv6 protocol stack

IPv6 is covered in detail in later chapters of this book. If not specified in this book, the default IP protocol refers to IPv4.

2.4.2 IP

IP (Internet Protocol) is the core of the TCP/IP protocol suite. It is responsible for communication between networks on the Internet and defines the rules for transmitting packets from one network to another.

When IP is adopted as the network layer protocol, both sides of the communication are assigned a “unique” IP address to identify themselves. IP addresses can be written in 32-bit binary form, but to make it easier for people to read and analyze, it is usually written in dotted decimal form, i.e., four bytes are separated and represented in decimal form, using dots for separation, such as 192.168.1.1.
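The relationship between the 32-bit binary form and the dotted decimal form can be illustrated with Python's standard `ipaddress` module. This is a small sketch and the helper names `to_binary` and `to_dotted` are illustrative.

```python
import ipaddress


def to_binary(dotted: str) -> str:
    """Return the 32-bit binary string of a dotted decimal IPv4 address."""
    value = int(ipaddress.IPv4Address(dotted))
    return format(value, "032b")


def to_dotted(bits: str) -> str:
    """Return the dotted decimal form of a 32-bit binary string."""
    return str(ipaddress.IPv4Address(int(bits, 2)))


if __name__ == "__main__":
    bits = to_binary("192.168.1.1")
    # Print the four bytes separately to show how they map to the four numbers.
    print(" ".join(bits[i:i + 8] for i in range(0, 32, 8)))
    print(to_dotted(bits))
```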

For IP to work, various routing protocols such as OSPF, IS-IS and BGP are needed to help routers build routing tables, and ICMP is needed to assist in network state diagnosis. If the influx of packets on a link exceeds the processing capacity of the router, the router drops the packets that it cannot process. Since a forwarding path is selected for each packet individually, packets may also arrive out of order. In short, IP forwards packets on a best-effort basis only: it cannot guarantee reliable transmission, so packets may be lost, and it cannot guarantee that packets reach the receiver in order.

The encapsulation and forwarding process of IP packets is as follows.

  1. 1.

    When the network layer receives data from the upper layer (such as the transport layer) protocols, it encapsulates an IP message header and adds both the source and destination IP addresses to that header.

  2. 2.

    The network devices (e.g., router) that are passed through along the way will maintain a routing table that guides the forwarding of IP messages, and by reading the destination address of the IP packet, forward the IP messages according to the local routing table.

  3. 3.

    The IP message finally reaches the destination host, which reads the destination IP address to determine whether to receive and continue the next processing.
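Step 2 above can be sketched as a routing table lookup. The text does not say how a router chooses among several matching entries. The sketch below assumes the usual longest-prefix-match rule, and the routing table contents and next-hop addresses are invented for illustration.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, next hop).
ROUTES = [
    ("0.0.0.0/0", "10.0.0.1"),        # default route
    ("192.168.0.0/16", "10.0.0.2"),
    ("192.168.1.0/24", "direct"),     # directly connected network
]


def next_hop(destination: str, routes=ROUTES):
    """Return the next hop for a destination IP address, or None if no route matches.

    Among all entries whose prefix contains the destination, the entry
    with the longest prefix wins (assumed rule, see the lead-in).
    """
    address = ipaddress.ip_address(destination)
    best_network = None
    best_hop = None
    for prefix, hop in routes:
        network = ipaddress.ip_network(prefix)
        if address in network:
            if best_network is None or network.prefixlen > best_network.prefixlen:
                best_network, best_hop = network, hop
    return best_hop


if __name__ == "__main__":
    for destination in ("192.168.1.20", "192.168.5.9", "8.8.8.8"):
        print(destination, "->", next_hop(destination))
```

Only the destination address is consulted. The source address in the IP header plays no part in this forwarding decision.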

The IP packet consists of two parts: the header and the data. The IP protocol defines the IP message header, as shown in Fig. 2.34. The first part of the IP message header is a fixed length of 20 bytes, which exists in all IP packets. Following the fixed part of the header are some optional fields whose length is variable.

Fig. 2.34
figure 34

Network layer IP packet header format

The following is a detailed explanation of each field in the fixed part of the network layer IP packet header.

  1. 1.

    Version is a four-bit value that indicates the version of the IP protocol. There are currently two versions of the IP protocol: IPv4 and IPv6. The version of the IP protocol used by both sides of the communication must be the same. The IP protocol version number that is widely used now is 4 (i.e., IPv4).

  2. 2.

    The header length field is four bits in size, so the maximum decimal value it can represent is 15. Note that the unit of this field is a 32-bit word (i.e., four bytes), so when the header length field is 1111 (i.e., 15 in decimal), the header length reaches its maximum of 60 bytes. When the IP packet’s header length is not an integer multiple of four bytes, it must be padded at the end with a padding field. Therefore, the data division always starts at an integer multiple of four bytes, which is more convenient when implementing the IP protocol. The disadvantage of limiting the header length to 60 bytes is that it may sometimes not be enough. However, the limit is imposed in the hope that users will keep the header overhead to a minimum. The most common header length is 20 bytes (i.e., a header length field of 0101), in which case no options are used. It is because of the variable part of the header that a field is needed to specify the header length; if the header were fixed in length, the header length field would not be needed.

  3. 3.

    The Differentiated Services (DS) field is eight bits long. With differentiated services, a computer is configured to mark the packets of a particular application, and the routers in the network are configured to forward marked packets with priority, so that this application gets sufficient bandwidth even when network bandwidth is relatively tight. In this way, differentiated services ensure Quality of Service (QoS). This field was called Type of Service in the earlier standard, but it was never actually used. In 1998, the Internet Engineering Task Force (IETF) renamed this field Differentiated Services. The field only takes effect when differentiated services are used.

  4. 4.

    Total length refers to the total length of the IP packet header and data, which is the length of the packet in bytes. The total length field is 16 bits, so the maximum length of the packet is 2^16 − 1 = 65,535 bytes. In fact, the transmission of such a long packet is rarely encountered in reality.

  5. 5.

    The Identification field is 16 bits. IP software maintains a counter in memory, and for each packet generated, the counter is incremented by 1, and this value is assigned to the identification field. However, this “Identification” is not a sequence number. Since IP is a connectionless service, the problem of receiving packets in order does not exist. When a packet must be fragmented because its length exceeds the network’s Maximum Transmission Unit (MTU), it is split into multiple fragments that share the same identifier, that is, the value of the packet’s identification field is copied to the identification fields of all packet fragments. The shared value of the identification field enables the fragments to be correctly reassembled into the original packet.

  6. 6.

    The Flag field has three bits, but currently only two of them are meaningful. The lowest bit in the flag field is MF (More Fragments). MF = 1 means that there are “more fragments” of the packet following, and MF = 0 means that this fragment is the last of the packet’s fragments. The middle bit of the flag field is DF (Don’t Fragment). Fragmentation is allowed only when DF = 0.

  7. 7.

    Fragment offset is 13 bits in size. The fragment offset indicates the relative position of a fragment in the original packet when a relatively long packet is fragmented. In other words, it indicates where the fragment starts relative to the beginning of the user data field. The unit of the fragment offset is eight bytes, which means that the length of each fragment except the last must be an integer multiple of eight bytes (64 bits).

  8. 8.

    The Time to Live field, commonly abbreviated as TTL, indicates the lifetime of the packet in the network. This field is set by the source of the outbound packet. It prevents undeliverable packets from circling in the network indefinitely (for example, being forwarded from router R1 to R2, then to R3, and then back to R1) and thus wasting network resources. TTL was originally measured in seconds. Each time a packet passed through a router, the time it spent at the router was subtracted from TTL; if the packet spent less than 1 s at the router, TTL was decreased by 1. When TTL reached zero, the packet was discarded. However, as technology progressed, the time a router needs to process a packet kept decreasing and is now generally much less than 1 s, so the function of the TTL field was later changed to a “hop limit” (although the name remains the same). A router decreases the TTL value by 1 before forwarding the packet. If the TTL value is reduced to zero, the packet is discarded and not forwarded. Therefore, the unit of TTL is now hops rather than seconds, and TTL indicates the maximum number of routers through which a packet can pass in the network. Since the field is eight bits long, this maximum is 255. If the initial value of TTL is set to 1, the packet can only be transmitted within the local LAN: as soon as the packet reaches a router on the LAN, the router decreases the TTL value to zero before forwarding and discards the packet.

  9. 9.

    Protocol is eight bits in size. The protocol field indicates which protocol is used for the data carried by this packet so that the network layer of the destination host knows which process the data division should be passed to. Some of the commonly used protocols and the corresponding protocol field values are shown in Fig. 2.35.

  10. 10.

    The header checksum is 16 bits in size, and this field only checks the header of the packet and not the data division. This is because every time a packet passes through a router, the router has to recalculate the header checksum (some fields, such as time to live, flags, fragment offsets, may change). The workload of the calculation can be reduced if the data division is not checked.

  11. 11.

    The source IP address is 32 bits long.

  12. 12.

    The target IP address is 32 bits long.
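The layout of the fixed 20-byte header described above can be made concrete by packing and unpacking it with Python's `struct` module. The sketch below is an illustration rather than a complete IP implementation: it handles no options, and the checksum routine uses the 16-bit one's complement sum defined in RFC 791, which the text above does not spell out. All function names are illustrative.

```python
import socket
import struct

HEADER_FORMAT = "!BBHHHBBH4s4s"  # the 20-byte fixed part, in network byte order


def checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum (RFC 791)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(source, target, protocol, data_length,
                 identification=1, ttl=64, df=0, mf=0, offset_units=0):
    """Build a 20-byte IPv4 header without options."""
    version_and_length = (4 << 4) | 5           # version 4, header length 5 x 4 bytes
    flags_and_offset = (df << 14) | (mf << 13) | offset_units
    fields = [version_and_length, 0, 20 + data_length, identification,
              flags_and_offset, ttl, protocol, 0,
              socket.inet_aton(source), socket.inet_aton(target)]
    # The checksum is computed over the header with the checksum field set to 0.
    fields[7] = checksum(struct.pack(HEADER_FORMAT, *fields))
    return struct.pack(HEADER_FORMAT, *fields)


def parse_header(header: bytes) -> dict:
    """Decode the fixed part of an IPv4 header into named fields."""
    (version_and_length, ds, total_length, identification, flags_and_offset,
     ttl, protocol, _checksum, source, target) = struct.unpack(HEADER_FORMAT, header[:20])
    header_length = (version_and_length & 0x0F) * 4   # field unit is 4 bytes
    return {
        "version": version_and_length >> 4,
        "header_length": header_length,
        "differentiated_services": ds,
        "total_length": total_length,
        "identification": identification,
        "df": (flags_and_offset >> 14) & 1,
        "mf": (flags_and_offset >> 13) & 1,
        "fragment_offset_bytes": (flags_and_offset & 0x1FFF) * 8,  # unit is 8 bytes
        "ttl": ttl,
        "protocol": protocol,
        # Summing a valid header including its checksum yields 0.
        "checksum_ok": checksum(header[:header_length]) == 0,
        "source": socket.inet_ntoa(source),
        "target": socket.inet_ntoa(target),
    }


if __name__ == "__main__":
    header = build_header("192.168.1.1", "192.168.1.20", protocol=6, data_length=100)
    for name, value in parse_header(header).items():
        print(name, "=", value)
```

Because the checksum covers only the header, a router that decreases TTL needs to recompute just these 20 bytes and does not have to touch the data division.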

Fig. 2.35
figure 35

Commonly used protocols and corresponding protocol field values
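The identification, MF and fragment offset fields described earlier work together when a packet is fragmented. The following sketch computes the fragments for a given data length and MTU. It assumes a 20-byte header without options on every fragment. The example values (3800 bytes of data, an MTU of 1420 bytes) are chosen only for illustration.

```python
def fragment(data_length: int, mtu: int, header_length: int = 20) -> list:
    """Split data_length bytes of data into fragments that fit the MTU.

    Every fragment except the last carries a multiple of 8 bytes,
    because the fragment offset field counts in units of 8 bytes.
    All fragments would share the identification value of the original packet.
    """
    max_data = (mtu - header_length) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    offset = 0
    while offset < data_length:
        size = min(max_data, data_length - offset)
        fragments.append({
            "offset_units": offset // 8,                       # value of the offset field
            "data_length": size,
            "mf": 1 if offset + size < data_length else 0,     # more fragments follow?
        })
        offset += size
    return fragments


if __name__ == "__main__":
    for piece in fragment(data_length=3800, mtu=1420):
        print(piece)
```

With these values each full fragment carries 1400 bytes of data, so the offset fields are 0, 175 and 350, and only the last fragment has MF = 0.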

2.4.3 ICMP

ICMP (Internet Control Message Protocol) is a TCP/IPv4 network layer protocol for delivering control messages between IP hosts and routers. Control messages are messages about the network itself, such as the network connectivity, host reachability, and route availability.

ICMP messages are encapsulated and transmitted inside IP packets. ICMP messages are usually used by the IP layer or by higher-layer protocols (TCP or UDP). Some ICMP messages return error messages to the user process.

ICMP request messages can be sent using the ping and tracert commands (the latter is named traceroute on Linux) on Windows, Linux, and network devices, either to test whether the destination is reachable or to trace the routers that packets pass through on the way to the destination IP address.

As shown below, when the ping command is used to query a website domain name on Windows 10, four ICMP request messages are sent and four ICMP responses from this address are received, indicating that the network is working properly.

C:\Users\hanlg>ping www.huawei.com
Pinging www.huawei.com.lxdns.com [111.11.0.121] with 32 bytes of data:
Reply from 111.11.0.121: bytes=32 time=10ms TTL=57
Reply from 111.11.0.121: bytes=32 time=11ms TTL=57
Reply from 111.11.0.121: bytes=32 time=10ms TTL=57
Reply from 111.11.0.121: bytes=32 time=11ms TTL=57
Ping statistics for 111.11.0.121:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 10ms, Maximum = 11ms, Average = 10ms

As shown below, when the tracert command is used to trace the routers along the path of the packet, it can be seen that the packet has passed 13 routers along the way, and the 14th is the destination address.

C:\Users\hanlg>tracert www.91xueit.com
Tracing route to www.91xueit.com [129.226.71.87]
over a maximum of 30 hops:
  1     2 ms     1 ms     3 ms  phicomm.me [192.168.2.1]
  2     4 ms     6 ms     8 ms  10.220.0.1
  3     5 ms     4 ms     4 ms  111.63.220.13
  4     4 ms     4 ms     5 ms  111.11.64.17
  5    10 ms     4 ms     5 ms  111.24.8.253
  6     9 ms     9 ms     8 ms  111.24.3.161
  7    38 ms    38 ms    59 ms  221.176.24.241
  8    37 ms    38 ms    39 ms  221.176.22.106
  9    53 ms    39 ms    42 ms  221.176.19.198
 10    77 ms    59 ms    48 ms  221.183.55.81
 11    49 ms    63 ms    65 ms  218.189.5.25
 12    61 ms    70 ms    71 ms  218.189.29.122
 13    47 ms    63 ms    46 ms  10.196.94.241
 14    45 ms    55 ms    47 ms  129.226.71.87

In the following, packet capture is used to view the format of ICMP messages. As shown in Fig. 2.36, PC1 pings 192.168.8.2, and the ping command generates an ICMP request message to send to the destination address to test whether the network is connected. If the destination computer receives the ICMP request message, it will return an ICMP response message.

Fig. 2.36
figure 36

ICMP request and response messages

The following part describes how to use the packet capture tool to capture ICMP request and response messages on the link and observe the differences between the two types of messages.

As demonstrated in Fig. 2.37, packets on router links AR1 and AR2 are captured. The figure shows an ICMP request message, which has an ICMP message type field, an ICMP message code field, a checksum field, and an ICMP data division. The value of request message type is 8 and the message code is 0.

Fig. 2.37
figure 37

ICMP request message

Figure 2.38 shows an ICMP response message with a type value of 0 and a message code of 0.

Fig. 2.38
figure 38

ICMP response message
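The request and response messages seen in the captures differ only in the type field and, as a consequence, in the checksum. The sketch below builds both in Python. It assumes the echo message layout of RFC 792, in which an identifier and a sequence number follow the checksum field. These two fields are not discussed in the text above, and the function names are illustrative.

```python
import struct

ECHO_REQUEST = 8   # type value of an ICMP request message
ECHO_REPLY = 0     # type value of an ICMP response message


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, data: bytes = b"abcdefgh") -> bytes:
    """Build an ICMP request message: type 8, code 0, checksum, then the rest."""
    rest = struct.pack("!HH", identifier, sequence) + data
    unchecked = struct.pack("!BBH", ECHO_REQUEST, 0, 0) + rest
    return struct.pack("!BBH", ECHO_REQUEST, 0, internet_checksum(unchecked)) + rest


def build_echo_reply(request: bytes) -> bytes:
    """Build the response: type 0, code 0, same remaining bytes, new checksum."""
    rest = request[4:]
    unchecked = struct.pack("!BBH", ECHO_REPLY, 0, 0) + rest
    return struct.pack("!BBH", ECHO_REPLY, 0, internet_checksum(unchecked)) + rest


def parse_icmp(message: bytes) -> dict:
    """Decode the type and code fields and verify the checksum."""
    message_type, code, _checksum = struct.unpack("!BBH", message[:4])
    return {
        "type": message_type,
        "code": code,
        # Unlike the IP header checksum, this one covers the ICMP data as well.
        "checksum_ok": internet_checksum(message) == 0,
    }


if __name__ == "__main__":
    request = build_echo_request(identifier=1, sequence=1)
    print(parse_icmp(request))
    print(parse_icmp(build_echo_reply(request)))
```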

ICMP messages can be divided into three types, each of which uses a code to further specify the different meanings it represents. Table 2.1 lists the common ICMP message types and codes and what they represent.

Table 2.1 ICMP message types and codes and what they represent

As can be seen from Table 2.1, there are five types of ICMP error reports, which are introduced in detail below.

  1. 1.

    Destination unreachable. When a router or host does not have a route to the destination address, it discards the packet and sends a destination unreachable message to the source.

  2. 2.

    Source quench. When a router or host discards a packet due to congestion, it sends a source quench message to the source, so that the source understands that it should slow down the sending of packets.

  3. 3.

    Redirect. The router sends a redirect message to the host to let it know that the packet should be sent to a different router next time (which might be a better route).

  4. 4.

    Time exceeded. When a router receives a packet with a time to live of zero, in addition to discarding the packet, it sends a time exceeded message to the source. When the destination fails to receive all the packet fragments of a packet within a pre-defined time, it discards all the received packet fragments and sends a time exceeded message to the source.

  5. 5.

    Parameter problem. When the router or the destination host receives a packet with incorrect field values in its header, it discards the packet and sends a parameter problem message to the source.

2.4.4 ARP

Address Resolution Protocol (ARP) is an indispensable protocol of IPv4. Its main functions are to resolve IP addresses into MAC addresses, to maintain a cache of the mappings between IP addresses and MAC addresses (the ARP table entries), and to detect duplicate IP addresses within a network segment.

  1. 1.

    Ethernet and MAC addresses

    To better illustrate the problem, before explaining ARP, we will first introduce the Ethernet and MAC addresses.

    1. (a)

      Ethernet. Ethernet is a broadcast data link layer protocol that supports multipoint access. A network formed by a switch is a typical Ethernet, and the computer’s network interface card follows the Ethernet standard. In Ethernet, the network interface card and network device interface (such as router interfaces and virtual interfaces of Layer 3 switches) of each computer have a MAC address.

      The MAC address is also called the physical address, or hardware address, and is burned into the flash memory chip of the Network Interface Card (NIC) by the manufacturer of the network device. MAC addresses are represented in the computer in 48-bit binary form. Computer communication over Ethernet must specify the destination MAC address and the source MAC address.

      In order to view the MAC address of a computer’s network interface card on Windows, you only need to type “ipconfig /all”, as shown below. In Windows, it is called “physical address”. Here is the MAC address in hexadecimal.

      C:\Users\hanlg>ipconfig /all
      Windows IP Configuration
         Connection-specific DNS Suffix  . : lan
         Description . . . . . . . . . . . : Intel(R) Dual Band Wireless-AC 3165
         Physical Address. . . . . . . . . : 00-DB-DF-F9-D2-51
         DHCP Enabled. . . . . . . . . . . : Yes
         Autoconfiguration Enabled . . . . : Yes
         Link-local IPv6 Address . . . . . : fe80::65d6:9e31:63a0:9dd1%11 (Preferred)
         IPv4 Address. . . . . . . . . . . : 192.168.2.161(Preferred)
         Subnet Mask . . . . . . . . . . . : 255.255.255.0
         Lease Obtained. . . . . . . . . . : August-03-20 15:46:18
         Lease Expires . . . . . . . . . . : August-04-20 16:43:43
         Default Gateway . . . . . . . . . : 192.168.2.1

      Network devices generally have an ARP Cache. The ARP cache is used to store the association information of IP addresses and MAC addresses.

      Before the data is sent, the device will look for the ARP cache table first. If the ARP table entry of the other device exists in the cache table, the MAC address in that table entry will be directly used to encapsulate the frame and then send the frame. If the corresponding information does not exist in the cache table, it is obtained by sending an ARP Request message.

      The mapping relation between IP address and MAC address is stored in the ARP cache table for a period of time. During the validity period (default: 180 s), the device can directly look up the target MAC address from this table for data encapsulation without ARP query. After this validity period, the ARP table entry will be automatically deleted.

      If the target IP address is located in another network, the source device will look up the MAC address of the gateway (router interface in this network) in the ARP cache table and then send the data to the gateway. After receiving the packet, the router selects a forwarding path for the packet.

  2. 2.

    Working process of ARP

    The working process of ARP is shown in Fig. 2.39. Computer A sends an ARP request message requesting to resolve the target MAC address of 192.168.1.20. Since Computer A does not know the target MAC address of 192.168.1.20, the request writes the target MAC address as a broadcast address, that is, FF-FF-FF-FF-FF-FF. Then the switch forwards the request to all ports once it receives the request.

    When all hosts receive this ARP request message, they check whether its target IP address field matches their own IP address. If not, the host will not respond to the ARP request message. And if it matches, the host will record the sender’s MAC address and IP address information in the ARP request message into its own ARP cache table, and then respond via an ARP reply message, as shown in Fig. 2.40. The target MAC address of ARP Response frame is the MAC address of Computer A.

  3. 3.

    Communication within the same network segment and cross-segment communication

    As shown in Fig. 2.41, there are two Ethernet networks and a point to point link in the network. The addresses of the computer and router interfaces are shown in the figure. The MA, MB…MF in the figure represent the MAC addresses of the corresponding interfaces. Computer A communicates with Computer B on the same network segment. Computer A sends an ARP broadcast frame to resolve the MAC address of the target IP address, and later the frames for communication encapsulate the target IP address and MAC address.

    When Computer A communicates with Computer F on a different network segment, Computer A needs to resolve the MAC address of the gateway. The frames sent from Computer A to Computer F are shown in Fig. 2.42. Pay attention to the IP addresses and MAC addresses with which the packet is encapsulated in the two Ethernet networks. The source and target IP addresses of the packet remain unchanged during the transmission. To get from Computer A to Computer F, the packet must first be sent to interface C of router R1, so the frame that carries the packet in Ethernet 1 has the source MAC address MA and the target MAC address MC. When the packet arrives at router R2, it has to be sent to Computer F from interface D of R2. The packet is re-encapsulated at the data link layer with the source MAC address MD and the target MAC address MF.

    In terms of encapsulating frames for cross-segment communication, the target IP address of the packet determines its destination, and the target MAC address of the frame determines the next hop interface of the packet. ARP can only resolve the MAC address of the same network segment. For packets from computers in other network segments, the source MAC address is that of the router interface. In this example, Computer F does not know the MAC address of Computer A. The source MAC address of the packet from Computer A that Computer F sees is the MAC address of interface D of router R2.

    Note

    ARP is only used in Ethernet. Point to point links commonly use PPP at the data link layer. The frame format defined by PPP has no MAC address field, so there is no need to use ARP to resolve the MAC address.

    After the MAC address is resolved by ARP, the Ethernet interface will cache the resolved MAC address. You can run “arp -a” on Windows system to view the ARP table entries. “Dynamic” in type column indicates the entry is obtained by ARP resolution and will be cleared from the cache if it is not used for a period of time.

C:\Users\hanlg>arp -a
Interface: 192.168.2.161 --- 0xb
  Internet Address      Physical Address      Type
  192.168.2.1           d8-c8-e9-96-a4-61     dynamic
  192.168.2.255         ff-ff-ff-ff-ff-ff     static
  224.0.0.22            01-00-5e-00-00-16     static
  224.0.0.251           01-00-5e-00-00-fb     static
  224.0.0.252           01-00-5e-00-00-fc     static
  255.255.255.255       ff-ff-ff-ff-ff-ff     static

  4. 4.

    Packet capture analysis of ARP frames

    Figure 2.43 shows an ARP request packet captured by the packet capture tool. The 27th frame is the ARP request packet sent by computer 192.168.80.20 to resolve the MAC address of 192.168.80.30. Note that the observed target MAC address is ff:ff:ff:ff:ff:ff, the broadcast address, which indicates that all devices in the network can receive this ARP request message. The opcode (operation code) indicates whether the current packet is a request message or a response message. Its value is 0x0001 in an ARP request message and 0x0002 in an ARP response message. The response message is a unicast frame.

    The 28th frame is an ARP response frame, and it can be seen that the target MAC address of this frame is not the broadcast address, but the MAC address of 192.168.80.20.

    ARP is built on the basis of mutual trust among hosts in the network. When Computer A sends an ARP broadcast frame to resolve the MAC address of Computer C, all computers in the same network segment can receive this ARP request message, and any host can send an ARP response message to Computer A, which may tell computer A a wrong MAC address. Computer A does not examine the authenticity of the message when it receives the ARP response message, but directly records it into the local ARP cache. Therefore, there is a security risk of ARP spoofing.
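Two behaviours described in this section, the aging of ARP cache entries and the choice between resolving the target itself or the gateway, can be sketched in a few lines of Python. The class and function names are illustrative, the 180 s lifetime is the default mentioned above, and the time is passed in explicitly so that the example is deterministic.

```python
import ipaddress


class ArpCache:
    """A simplified ARP cache whose entries expire after a fixed lifetime."""

    def __init__(self, lifetime_seconds: float = 180.0):
        self.lifetime_seconds = lifetime_seconds
        self._entries = {}   # IP address -> (MAC address, time the entry was learned)

    def learn(self, ip: str, mac: str, now: float) -> None:
        """Record a mapping taken from an ARP request or response.

        Like real ARP, nothing here checks whether the mapping is genuine,
        which is what makes ARP spoofing possible.
        """
        self._entries[ip] = (mac, now)

    def lookup(self, ip: str, now: float):
        """Return the cached MAC address, or None if an ARP request is needed."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, learned_at = entry
        if now - learned_at >= self.lifetime_seconds:
            del self._entries[ip]    # expired entries are removed
            return None
        return mac


def address_to_resolve(own_ip: str, prefix_length: int, target_ip: str, gateway_ip: str) -> str:
    """Return the IP address whose MAC address must be resolved.

    A target in the same network segment is resolved directly;
    for any other target the MAC address of the gateway is resolved.
    """
    network = ipaddress.ip_network(f"{own_ip}/{prefix_length}", strict=False)
    return target_ip if ipaddress.ip_address(target_ip) in network else gateway_ip


if __name__ == "__main__":
    cache = ArpCache()
    cache.learn("192.168.1.20", "00-11-22-33-44-55", now=0)
    print(cache.lookup("192.168.1.20", now=100))   # still valid
    print(cache.lookup("192.168.1.20", now=200))   # expired
    print(address_to_resolve("192.168.1.10", 24, "192.168.1.20", "192.168.1.1"))
    print(address_to_resolve("192.168.1.10", 24, "10.0.0.5", "192.168.1.1"))
```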

Fig. 2.39
figure 39

ARP request using broadcast frames

Fig. 2.40
figure 40

ARP response using unicast frames

Fig. 2.41
figure 41

The computer on the same network segment sends an ARP broadcast frame to resolve the MAC address of the target IP address

Fig. 2.42
figure 42

The computer for cross-segment communication sends ARP broadcast frame to resolve the MAC address of the gateway

Fig. 2.43
figure 43

ARP request frame
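An ARP request frame like the one in Fig. 2.43 can be assembled byte by byte. The sketch below assumes the field layout of RFC 826 for Ethernet and IPv4 (EtherType 0x0806, hardware type 1, protocol type 0x0800). Only the broadcast address and the opcode values come from the text above, and the function names are illustrative. The frame is built in memory only and is not sent anywhere.

```python
import socket
import struct

BROADCAST_MAC = b"\xff" * 6
OPCODE_REQUEST = 0x0001
OPCODE_RESPONSE = 0x0002


def mac_to_bytes(mac: str) -> bytes:
    """Convert a MAC address written as 00-DB-DF-F9-D2-51 or 00:db:... to 6 bytes."""
    return bytes.fromhex(mac.replace("-", "").replace(":", ""))


def build_arp_request(sender_mac: str, sender_ip: str, target_ip: str) -> bytes:
    """Build an Ethernet frame that carries an ARP request."""
    source = mac_to_bytes(sender_mac)
    # Ethernet header: target MAC (broadcast), source MAC, EtherType for ARP.
    ethernet_header = BROADCAST_MAC + source + struct.pack("!H", 0x0806)
    # ARP header: hardware type, protocol type, address lengths, opcode.
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, OPCODE_REQUEST)
    arp += source + socket.inet_aton(sender_ip)
    # The target MAC address is unknown, so it is filled with zeros.
    arp += b"\x00" * 6 + socket.inet_aton(target_ip)
    return ethernet_header + arp


def opcode_of(frame: bytes) -> int:
    """Read the opcode field, which follows the 14-byte Ethernet header by 6 bytes."""
    return struct.unpack("!H", frame[20:22])[0]


if __name__ == "__main__":
    frame = build_arp_request("00-DB-DF-F9-D2-51", "192.168.80.20", "192.168.80.30")
    print(len(frame), "bytes, opcode", opcode_of(frame))
    print(frame.hex())
```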

2.5 Network Interface Layer Protocols

In fact, the network interface layer of the TCP/IPv4 stack is not part of the Internet protocol set; it is the method by which packets are transmitted from the network layer of one device to that of another device. The process involves adding a data link layer header to encapsulate the packet into a frame, transmitting the data over physical media (e.g., optical fiber, twisted pair, wireless transmission, etc.), removing the encapsulated header of the data link layer after the receiver receives the data, and passing the received packet to the network layer.

The network interface layer of the TCP/IPv4 protocol stack contains the functions of the data link layer and physical layer of the OSI reference model. The interfaces of network devices (network interface cards of computers and interfaces of routers) are capable of the functions of the data link layer and the physical layer. The following part focuses on the functions implemented at the data link layer.

The data link layer protocol is responsible for encapsulating data packets into frames and transmitting them from one end of the link to the other. As shown in Fig. 2.44, PC1 communicates with PC2 through link 1, link 2, …, and link 6. The link connecting the computer to the switch is an Ethernet link, which uses the CSMA/CD protocol. The connection between the routers is a point-to-point connection, and the data link layer protocols for this kind of link include PPP, the High-Level Data Link Control (HDLC) protocol, and so on. Different data link layer protocols define different frame formats.

Fig. 2.44
figure 44

Links and data link protocols

Common data link layer protocols include CSMA/CD, PPP, HDLC, Frame Relay, X.25, etc. All these protocols have 3 basic functions, namely encapsulation, transparent transmission and error check.

  1. 1.

    Encapsulation.

    Encapsulation is to add the headers and footers to the front and back of the IP packets of the network layer, respectively, so as to form a frame. As shown in Fig. 2.45, the information contained in the header and footer of the frame is clearly defined for different data link layer protocols, and the header and footer of the frame have start frame delimiter and end frame delimiter, which are called frame delimiters. The receiver receives the digital signal from the physical layer and reads from the start frame delimiter all the way to the end frame delimiter, and then it is considered that a complete frame is received.

    The role of frame delimiters is most obvious when an error occurs in data transmission. If a sender fails before it has finished sending a frame, the transmission is cut short, and the receiver sees a start frame delimiter with no matching end frame delimiter. Such a frame is incomplete and must be discarded.

    To improve the efficiency of data link layer transmission, the data portion of the frame should be as long as possible relative to the header and footer. However, each data link layer protocol specifies an upper limit on the length of the data portion of a frame, i.e., the maximum transmission unit (MTU), which is 1500 bytes for Ethernet. As shown in Fig. 2.45, the MTU refers to the length of the data portion only, not of the whole frame.

  2. Transparent transmission.

    Ideally, the start and end frame delimiters are values that never appear in the data portion of a frame. When the data may contain the same values as the delimiters, the sender inserts an escape character before each such occurrence. On receipt, the receiver removes each escape character and treats the character that follows it as data. As shown in Fig. 2.46, suppose a data link layer protocol uses SOH as the start frame delimiter, EOT as the end frame delimiter, and ESC as the escape character. When Node A sends a data frame to Node B, it inserts the code of the escape character ESC before every SOH, ESC, and EOT character in the data before putting the frame on the link. This process is called byte stuffing. After receiving the frame, Node B removes the stuffed escape characters and treats the character following each one as data.

    Because Node A inserts escape characters where needed before sending, and Node B removes them after receiving, Node B recovers the original data exactly as it was, whatever values it contains. This is what is meant by “transparent transmission”.

  3. Error check.

    Real communication links are not ideal, so errors may occur when bits are transmitted: a 1 can become a 0, or a 0 can become a 1. This is called a bit error, and it is one type of transmission error. The ratio of erroneous bits to the total number of bits transmitted over a period of time is called the bit error rate (BER). For example, a BER of 10⁻¹⁰ means that, on average, one bit error occurs for every 10¹⁰ bits transmitted. The BER is closely related to the signal-to-noise ratio (SNR): improving the SNR lowers the BER, but no real link can reduce the BER to zero. Therefore, to ensure reliable data transmission, computer networks must apply error check measures when transmitting data. The error check technology most widely used at the data link layer today is the Cyclic Redundancy Check (CRC).

    To let the receiver determine whether an error occurred while a frame was in transit, the frame must carry information for error check, called the Frame Check Sequence (FCS). As shown in Fig. 2.47, the sender calculates the FCS from the data link layer header and the data portion of the frame and places it at the end of the frame. After receiving the frame, the receiver calculates the FCS over the same header and data portion and compares its result with the FCS carried in the frame. If the two are the same, the frame is assumed to have been transmitted without error; if they differ, the receiver discards the frame.
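
The arithmetic behind the BER figures above can be checked with a short Python sketch. It assumes that bit errors are independent of one another, which is a simplification (real links often produce errors in bursts), and the function names are invented for this example.

```python
import math


def expected_bit_errors(ber: float, bits: int) -> float:
    """Average number of corrupted bits when `bits` bits are transmitted."""
    return ber * bits


def frame_error_probability(ber: float, frame_bytes: int) -> float:
    """Probability that at least one bit of a frame is corrupted.

    Assumes independent bit errors: P = 1 - (1 - BER)^n for n bits.
    expm1/log1p keep the result accurate when BER is very small.
    """
    bits = frame_bytes * 8
    return -math.expm1(bits * math.log1p(-ber))


if __name__ == "__main__":
    ber = 1e-10
    print(expected_bit_errors(ber, 10**10))    # about one error per 10^10 bits
    print(frame_error_probability(ber, 1500))  # 1500-byte data portion
```

Under this assumption, a 1500-byte data portion (12,000 bits) sent over a link with a BER of 10⁻¹⁰ is corrupted with a probability of roughly 1.2 × 10⁻⁶, which is small but not zero. That is why every frame still carries an FCS.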

Fig. 2.45 Add a frame header and footer to encapsulate into a frame
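
The trade-off between the length of the data portion and the fixed cost of the header and footer can be made concrete with a small calculation. The sketch below takes the 1500-byte Ethernet MTU from the text. The 18 bytes of overhead (a 14-byte header plus a 4-byte FCS, with the preamble not counted) are supplied here as an assumption of the example, and `frame_efficiency` is a name invented for it.

```python
ETHERNET_MTU = 1500     # bytes, upper limit on the data portion (from the text)
ETHERNET_OVERHEAD = 18  # bytes: 14-byte header + 4-byte FCS (assumption of this sketch)


def frame_efficiency(payload_len: int,
                     overhead: int = ETHERNET_OVERHEAD,
                     mtu: int = ETHERNET_MTU) -> float:
    """Fraction of the transmitted frame that is data rather than header/footer."""
    if not 0 < payload_len <= mtu:
        raise ValueError("data portion must be between 1 byte and the MTU")
    return payload_len / (payload_len + overhead)


if __name__ == "__main__":
    for length in (46, 500, 1500):
        print(length, "bytes of data ->", f"{frame_efficiency(length):.1%}")
```

With these numbers, a full 1500-byte data portion is about 98.8% efficient, while a 46-byte data portion is only about 71.9% efficient. Asking for more than the MTU raises an error, mirroring the rule that the data portion of a frame may not exceed the MTU.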

Fig. 2.46 Solve the problem of transparent transmission by byte stuffing
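
Byte stuffing as described under transparent transmission can be sketched as follows. The sketch uses the ASCII codes of SOH (0x01), EOT (0x04), and ESC (0x1B) as the start delimiter, end delimiter, and escape character. The function names `stuff` and `unstuff` are invented for this example, and the frame carries no header fields, only delimiters and data.

```python
SOH, EOT, ESC = 0x01, 0x04, 0x1B  # ASCII control codes used as in Fig. 2.46


def stuff(data: bytes) -> bytes:
    """Sender: wrap data in SOH ... EOT, escaping SOH, EOT and ESC inside it."""
    out = bytearray([SOH])
    for b in data:
        if b in (SOH, EOT, ESC):
            out.append(ESC)  # insert the escape character before the special byte
        out.append(b)
    out.append(EOT)
    return bytes(out)


def unstuff(frame: bytes) -> bytes:
    """Receiver: remove delimiters and escape characters, or reject the frame."""
    if not frame or frame[0] != SOH:
        raise ValueError("missing start frame delimiter")
    data = bytearray()
    i = 1
    while i < len(frame):
        b = frame[i]
        if b == ESC:
            i += 1  # drop the escape, keep the next byte as plain data
            if i >= len(frame):
                raise ValueError("incomplete frame: dangling escape character")
            data.append(frame[i])
        elif b == EOT:
            if i != len(frame) - 1:
                raise ValueError("unexpected bytes after end frame delimiter")
            return bytes(data)
        elif b == SOH:
            raise ValueError("unescaped start frame delimiter inside frame")
        else:
            data.append(b)
        i += 1
    raise ValueError("incomplete frame: no end frame delimiter")


if __name__ == "__main__":
    original = bytes([0x41, SOH, 0x42, ESC, 0x43, EOT])
    frame = stuff(original)
    print(frame.hex(" "))
    print(unstuff(frame) == original)
```

A frame that is cut off before its end frame delimiter makes `unstuff` raise an error, which corresponds to the receiver discarding an incomplete frame.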

Fig. 2.47 Frame check sequence
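
The FCS procedure can be sketched with the CRC-32 function in Python's standard `zlib` module. This is an illustration of the principle, not an exact model of any particular protocol's frame: the header bytes are placeholders, the big-endian byte order of the appended FCS is an arbitrary choice for the sketch, and `add_fcs` and `check_fcs` are invented names.

```python
import zlib

FCS_LEN = 4  # CRC-32 produces a 32-bit (4-byte) check value


def add_fcs(header: bytes, data: bytes) -> bytes:
    """Sender: compute the FCS over header + data and place it at the end."""
    fcs = zlib.crc32(header + data)
    return header + data + fcs.to_bytes(FCS_LEN, "big")


def check_fcs(frame: bytes) -> bool:
    """Receiver: recompute the FCS and compare it with the one in the frame."""
    if len(frame) < FCS_LEN:
        return False
    body, received = frame[:-FCS_LEN], frame[-FCS_LEN:]
    return zlib.crc32(body).to_bytes(FCS_LEN, "big") == received


if __name__ == "__main__":
    frame = add_fcs(b"HDR", b"IP packet")  # placeholder header and data
    print(check_fcs(frame))                # frame arrived unchanged

    corrupted = bytearray(frame)
    corrupted[4] ^= 0x01                   # flip one bit in transit
    print(check_fcs(bytes(corrupted)))     # mismatch, so the frame is discarded
```

Note that the check only detects the error. The receiver in this sketch, like the data link layer described above, simply discards the damaged frame and does not try to repair it.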

2.6 Exercises

  1. Which layer of the TCP/IPv4 protocol stack is used for reliable transmission of computer communications? ( )

    A. Physical layer
    B. Application layer
    C. Transport layer
    D. Network layer

  2. As IPv4 is upgraded to IPv6, which layer of the TCP/IPv4 protocol stack has been changed? ( )

    A. Data link layer
    B. Network layer
    C. Application layer
    D. Physical layer

  3. What can ARP do? ( )

    A. Resolve computers’ MAC addresses to IP addresses
    B. Domain name resolution
    C. Reliable transmission
    D. Resolve IP addresses into MAC addresses

  4. What is the range of TCP and UDP port numbers? ( )

    A. 0–256
    B. 0–1023
    C. 0–65,535
    D. 1024–65,535

  5. Which of the following network protocols uses TCP port 25 by default? ( )

    A. HTTP
    B. Telnet
    C. SMTP
    D. POP3

  6. In Windows, the command used to check the listening ports is ( ).

    A. ipconfig /all
    B. netstat -an
    C. ping
    D. telnet

  7. In Windows, the ping command uses the ( ) protocol.

    A. HTTP
    B. IGMP
    C. TCP
    D. ICMP

  8. Which of the following statements about the functions of the network layer in the OSI reference model is correct? ( )

    A. It is the layer closest to the user in the OSI reference model and provides network services for applications
    B. It transfers bitstreams between devices, specifying levels, speeds, and cable pins
    C. It provides connection-oriented or connectionless data transfer and error check before retransmission
    D. It provides logical addresses for routers to determine paths

  9. From the upper layers to the lower layers, the layers of the OSI reference model are ( ).

    A. Application layer, session layer, presentation layer, transport layer, network layer, data link layer, physical layer
    B. Application layer, transport layer, network layer, data link layer, physical layer
    C. Application layer, presentation layer, session layer, transport layer, network layer, data link layer, physical layer
    D. Application layer, presentation layer, session layer, network layer, transport layer, data link layer, physical layer

  10. (Multiple answers) A network administrator uses the ping command to test the connectivity of a network. Which of the following protocols may be used in this process? ( )

    A. ARP
    B. TCP
    C. ICMP
    D. IP

  11. Figure 2.48 shows packets captured on a Windows system (the client) while it accesses a shared folder on a file server (the server). Answer the following questions based on the content displayed in the figure.

    A. What are the IP address and MAC address of the file server?
    B. Which three packets are used to establish the TCP connection? What is the size, in bytes, of the server's receive window when the TCP connection is established?
    C. Which layer does the Server Message Block (SMB) protocol belong to? Does SMB use TCP or UDP at the transport layer, and on which port?

  12. Based on the transport layer header shown in Fig. 2.49, write down the values of the “Sequence number”, “Source Port”, and “Destination Port” fields of the seventh packet.

  13. Figure 2.50 shows packets captured by a packet capture tool. A computer is infected with a “virus” and sends ARP broadcast frames on the network. Observe the packets in the figure and find out which computer is sending the ARP broadcast frames.

  14. Into which layers is the TCP/IP protocol stack divided? Write down the functions that each layer implements.

  15. List a few common application layer protocols.

  16. What should an application layer protocol define?

  17. Write down two transport layer protocols and their application scenarios.

  18. Write down four network layer protocols and the functions of each protocol.

Fig. 2.48 Captured packet

Fig. 2.49 Packet transport layer header

Fig. 2.50 ARP broadcast