Introduction

The discipline of cybersecurity is still in its infancy, which leads to differing interpretations of common terminology and concepts. Before we can discuss privacy concerns in cybersecurity, we must define cybersecurity. Cybersecurity, in essence, comprises the protective actions taken to safeguard the digital information and processes an organization deems necessary for its successful operations. In practice, cybersecurity professionals use many techniques to reduce the risk of systems being compromised by all types of threats. Practitioners must protect not only against adversarial threats but also against natural disaster events. The range of threats and the sophistication of the methods used to attack networks, called “attack vectors,” make this a rapidly evolving field. The Merriam-Webster dictionary (2023) defines privacy as “the quality or state of being apart from company or observation.” In cyberspace, this state translates to being protected from observation over a digital connection. To refine the definition of privacy as it relates to cybersecurity: privacy means keeping personal information safe from unauthorized disclosure and keeping someone’s network traffic protected from surveillance. Privacy concerns have become an essential element of cybersecurity due to the vast amount of information now available over the Internet.

A person’s privacy is hard to preserve over the Internet for many reasons. The Internet does not have predefined boundaries like those present in the physical world. If someone were to drive from the United States to Canada, they would pass through a border entry point, which lets them know they are leaving the laws and regulations of the United States and will now be under the laws of the Canadian government. In cyberspace, a user’s network traffic may start in one country but pass through many countries before reaching its destination. A network diagnostic tool called “traceroute” can show how traffic flows over the Internet. For example, if someone in central Arkansas were to run a traceroute to the University of Arkansas at Little Rock’s website from their home network, the output would start at their home router, traverse several Internet Service Provider (ISP) networks, and then cross the university system’s network before reaching the final destination. Internet routing does not use physical location as a metric for data flow; instead, paths are determined by factors such as network speed and the routes predefined by telecommunication companies. Furthermore, a website’s servers may be distributed around the world to reduce latency or to provide resilience against attacks. Every country has its own ideas about its citizens’ right to privacy, so an individual’s traffic may be collected if the data flows through a country that operates as a surveillance state.
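
To make the idea concrete, the hop-by-hop path can be pulled out of traceroute-style output with a few lines of code. The sketch below parses a fabricated trace rather than probing a live network; all hostnames and addresses are invented for illustration.

```python
import re

# Hypothetical traceroute output for the scenario described above.
# Every hostname and address here is fabricated, not a real trace.
SAMPLE_TRACEROUTE = """\
 1  home-router.lan (192.168.1.1)  1.2 ms
 2  isp-gw1.example-isp.net (10.20.0.1)  8.5 ms
 3  isp-core.example-isp.net (10.20.5.9)  12.3 ms
 4  peering.example-transit.net (172.16.8.2)  15.1 ms
 5  edge.university.example.edu (198.51.100.7)  18.9 ms
"""

def parse_hops(output):
    """Extract (hop_number, hostname) pairs from traceroute-style text."""
    hops = []
    for line in output.splitlines():
        m = re.match(r"\s*(\d+)\s+(\S+)", line)
        if m:
            hops.append((int(m.group(1)), m.group(2)))
    return hops

# Each hop is a network the traffic crosses, possibly in another country.
for num, host in parse_hops(SAMPLE_TRACEROUTE):
    print(num, host)
```

Each printed hop corresponds to a network, and potentially a jurisdiction, that the traffic transits on its way to the destination.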

Another area of privacy concern is the preservation of Personally Identifiable Information (PII) collected by organizations. PII is any information that can be used to ascertain who a person is, along with any data about that individual. (See more information in Chapter 4 on PII.) Before the boom of the Internet, PII was collected primarily at physical access points, for example, showing a passport to board an airplane. Now public and private entities have moved much of their presence online, where applications, forms, and e-commerce are handled. As a result, a person's information may be collected and stored on servers that are reachable from the Internet, which complicates the safeguarding of this data. Data breaches occur regularly; according to an article in the Healthcare Journal, the number of people affected by reported hacking events in the healthcare field rose from 15.3 million between 2005 and 2014 to 145.7 million between 2015 and 2019 (Seh et al., 2020). This trend shows no sign of slowing, especially as new technologies are connected to the Internet and personal information is still required for many services. Cybersecurity is becoming ever more central to the defense of private data.

Brief Introduction to the Internet of Things (IoT)

As embedded systems become more feature-rich and cost-effective, their integration into household appliances, smart consumer devices, and even the nation’s critical infrastructure has become more prevalent. This increased integration provides users with unprecedented situational awareness but also creates privacy concerns that must be acknowledged and mitigated. For instance, power utilities can now collect power quality data across large geographic regions in real time, which allows them to balance generation resources more effectively and mitigate potential outage events before they interrupt service to customers. Field devices with integrated communication channels for monitoring and control, such as Real-Time Automation Controllers (RTACs), protection relays, solar inverters, Battery Energy Storage Systems (BESSs), and Phasor Measurement Units (PMUs), are becoming more widespread. These devices transmit critical real-time information to grid operators, providing additional situational awareness as well as a means to resolve issues before they progress to a critical level.

However, these devices also increase the overall attack surface and may be connected through insecure external networks. As such, the control layer that manages, monitors, and provides mitigations for these systems must be protected using state-of-the-art cybersecurity techniques. A bad actor could intercept this data and malform it to cause the service provider to take a non-optimal action. Bad actors, or companies, could also collect personal user data from connected smart devices or derive consumer patterns from Internet-connected smart appliances such as thermostats, refrigerators, or fitness trackers.
One noteworthy incident, reported by Wired in January 2018, involved the locations of American bases and soldiers’ patrol patterns overseas being revealed because military personnel used fitness trackers while jogging and patrolling the perimeters of the bases (Hsu, 2018). It is important for users to fully review the terms of service and understand what data is being collected and how it will be used, stored, and, in some cases, sold to third parties to develop customer profiles for targeted advertisements.

Overview of Common Attack Vectors

In this section, we discuss three common cyber-attack vectors, as well as effective mitigations: Denial of Service (DoS), Man in the Middle (MitM), and Spoofing Attacks.

A DoS attack renders a service unavailable through either a direct or an indirect attack (see Fig. 4.1). The term also covers physical attacks on communication infrastructure, such as cut wires or wireless jamming. DoS attacks typically send many requests in rapid succession to overwhelm a server’s ability to respond to any other requests. More advanced attacks are executed using botnets, which consist of many previously compromised computers; these are characterized as Distributed Denial of Service (DDoS) attacks and require more advanced tactics and coordination to execute. A typical mitigation for this attack vector is software that identifies rapid repeated requests from a single computer or network of computers and blocks any further requests from those sources for a specified time.

Fig. 4.1
A flow diagram of a Denial-of-Service attack renders a service unavailable either through a direct or indirect attack. Includes a digital simulator, a router attacked by 3 hackers, and a photograph of a control room with people.

Denial-of-Service attack diagram

(Image credits Iconfinder.com/Dmitry Mirolyubov, Maxicons, Z Studio, iStock.com/LeonidKos; Opal-RT Technologies; Stephan Green under CC BY-SA 3.0 Deed)
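
The rate-limiting mitigation described above can be sketched in a few lines. The thresholds, class interface, and addresses below are illustrative assumptions, not a production design; real deployments would use dedicated appliances or services.

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Block a source that exceeds max_requests within a sliding window."""

    def __init__(self, max_requests=100, window=1.0, block_seconds=60.0):
        self.max_requests = max_requests
        self.window = window
        self.block_seconds = block_seconds
        self.history = defaultdict(deque)   # source -> recent request times
        self.blocked_until = {}             # source -> unblock timestamp

    def allow(self, source, now=None):
        now = time.monotonic() if now is None else now
        if self.blocked_until.get(source, 0) > now:
            return False                    # still serving a block penalty
        q = self.history[source]
        q.append(now)
        while q and now - q[0] > self.window:
            q.popleft()                     # drop requests outside the window
        if len(q) > self.max_requests:
            self.blocked_until[source] = now + self.block_seconds
            return False
        return True

# Simulate 10 requests from one address, 0.1 s apart, with a 5-per-second cap.
limiter = RateLimiter(max_requests=5, window=1.0, block_seconds=60.0)
results = [limiter.allow("203.0.113.9", now=0.1 * i) for i in range(10)]
print(results)   # first five allowed, the rest blocked
```

The sliding window is the key design choice: it distinguishes a sustained flood from a brief, legitimate burst.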

A MitM attack occurs when an external attacker is able to intercept, modify, suppress, or replay network packets undetected by tricking two communicating nodes into believing they are still communicating normally (see Fig. 4.2). This attack can enable the collection of PII from a system or allow a bad actor to present corrupted information to operators and managers, who would in turn make incorrect decisions based on the compromised data. A typical mitigation for this type of attack is to use up-to-date encryption and validation protocols. A Zero-Trust architecture is also an effective mitigation for this attack vector: Zero-Trust involves taking additional measures to verify the authenticity of all devices and users on a network, regardless of their physical location or privilege levels.

Fig. 4.2
An illustrative flow diagram of a man-in-the-middle attack by an external attacker capable of intercepting, modifying, suppressing, or replaying network packets undetected by tricking two communication nodes.

Man-in-the-Middle attack diagram

(Image credits Iconfinder.com/Dmitry Mirolyubov, Maxicons, Z Studio, iStock.com/LeonidKos; Opal-RT Technologies; Stephan Green under CC BY-SA 3.0 Deed)
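
One building block of the validation protocols mentioned above is a message authentication code, which lets a receiver detect packets modified in transit. The sketch below uses Python's standard hmac module with an invented shared key and message; real systems would rely on vetted protocols such as TLS rather than hand-rolled checks.

```python
import hmac
import hashlib

# Illustrative shared secret; in practice keys come from a key-management system.
SHARED_KEY = b"example-shared-secret"

def protect(payload: bytes) -> bytes:
    """Append an HMAC-SHA256 tag so the receiver can detect modification."""
    tag = hmac.new(SHARED_KEY, payload, hashlib.sha256).digest()
    return payload + tag

def verify(packet: bytes):
    """Return the payload if the tag checks out, else None."""
    payload, tag = packet[:-32], packet[-32:]
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).digest()
    # compare_digest avoids timing side channels during comparison
    return payload if hmac.compare_digest(tag, expected) else None

packet = protect(b"breaker 12: OPEN")
assert verify(packet) == b"breaker 12: OPEN"

tampered = packet.replace(b"OPEN", b"SHUT")   # a man-in-the-middle edit
assert verify(tampered) is None               # tampering is detected
```

An attacker who cannot forge the tag without the shared key can no longer modify packets undetected, which is exactly the property MitM attacks exploit.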

Spoofing is an attack in which an illegitimate actor pretends to be a legitimate one (see Fig. 4.3). There are many spoofing methods in the cybersecurity domain: a bad actor could spoof a web server, a WiFi access point, or even a signal from the Global Positioning System (GPS). GPS spoofing causes Global Navigation Satellite System (GNSS) receivers to lock onto simulated or replayed satellite signals instead of real ones, effectively causing the receiver to compute the wrong position and/or time. This class of attack is a major threat to PMU and synchrophasor systems, which are heavily reliant on time synchronization. To mitigate this class of attack, a Zero-Trust architecture along with a Public Key Infrastructure (PKI) is recommended. PKI is a framework that uses pairs of public and private keys to encrypt and verify data transmitted between users and servers. Regular site surveys are also recommended so that rogue equipment can be identified and removed.

Fig. 4.3
A flow diagram of a spoofing attack in the cybersecurity domain where an illegitimate actor pretends to be a legitimate actor.

Spoofing attack diagram

(Image credits Iconfinder.com/Dmitry Mirolyubov, Maxicons, Z Studio, iStock.com/LeonidKos; Opal-RT Technologies; Stephan Green under CC BY-SA 3.0 Deed)
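
Complementary to PKI, a receiver can sanity-check each GNSS time fix against a local holdover clock, since a jump larger than the local oscillator's plausible drift suggests spoofed or replayed signals. The interface and threshold values below are illustrative assumptions, not any vendor's algorithm.

```python
MAX_DRIFT_PPM = 50         # assumed worst-case local oscillator drift
MAX_STEP_SECONDS = 0.001   # largest believable jump between fixes (assumed)

def plausible(prev_gnss, curr_gnss, elapsed_local):
    """Check a new GNSS time fix against time elapsed on the local clock."""
    expected = prev_gnss + elapsed_local
    # Allow a small step plus drift proportional to the elapsed interval.
    tolerance = MAX_STEP_SECONDS + elapsed_local * MAX_DRIFT_PPM * 1e-6
    return abs(curr_gnss - expected) <= tolerance

# Normal case: 1.0 s elapsed locally, GNSS also advanced about 1.0 s.
assert plausible(1000.0, 1001.0, 1.0)

# Suspicious case: GNSS time jumps 6 s while only 1 s passed locally,
# as might happen when a replayed signal captures the receiver.
assert not plausible(1000.0, 1006.0, 1.0)
```

A real receiver would combine such checks with signal-strength monitoring and cryptographic authentication where available; the point is that timing consistency is itself a detectable property.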

The Dynamic Protections of Privacy in an Organization

The only true way to protect privacy in cyberspace is to use dynamic protective measures that adapt to the current threat landscape. Dynamic protection is the process of continuously reviewing and changing an entity’s defensive controls. For an organization, this process can work through administrative or technical means. Organizations need to build robust security plans that take privacy into account.

User consent management. User consent is an essential aspect of user privacy. Consent requirements will need to adapt over time as the information needs of the organization change and as attackers develop new techniques to subvert security measures. User consent should take the form of a contract between the individual and the organization and should be displayed in any location being monitored, whether physical or digital. By obtaining consent from all users, everyone understands their rights within the organization, which allows users to make a conscious decision about what activities they will partake in while on the organization’s network.

Access management. Another protective mechanism an organization can apply is access management. Access management regulates which data users can access. The concept is to develop markings for information based on its importance to the organization or on pertinent privacy laws. Several types of access control can be implemented, each based on different factors. One model uses a person’s role within the organization to approve or deny access to data. Another may use predefined classification markings to control data, similar to how the United States Federal Government assigns unclassified, confidential, and secret markings to information. The organization will need to determine which access control model suits its needs.

Choosing the wrong access control model can have adverse effects on privacy. For instance, if an organization were protecting its customers’ social security numbers, only employees with a need to access this information should be able to view the relevant details, and even they should not be able to modify the data. The model for this example should use a person’s role, or a set of predefined rules within the organization, to determine the level of access. If a model based solely on the criticality of the data were used, people without a need to access that data might inadvertently receive access rights.
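
The role-based model just described can be sketched as a simple permission table. The roles, resources, and actions below are hypothetical, chosen to mirror the social-security-number example.

```python
# Hypothetical role -> permitted (resource, action) pairs.
PERMISSIONS = {
    "hr_clerk":   {("ssn", "read")},                   # view only
    "hr_manager": {("ssn", "read"), ("ssn", "write")},
    "intern":     set(),                               # no access
}

def authorize(role, resource, action):
    """Grant access only when the role explicitly holds the permission."""
    return (resource, action) in PERMISSIONS.get(role, set())

assert authorize("hr_clerk", "ssn", "read")
assert not authorize("hr_clerk", "ssn", "write")   # read-only per policy
assert not authorize("intern", "ssn", "read")      # no need to know
```

Note the default-deny posture: an unknown role or an unlisted permission yields no access, which matches the principle that rights must be granted explicitly.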

Both user consent and access management are holistic approaches to privacy protection. But what if PII is needed by others while the specifics of a person’s identity must be obfuscated? Think of medical research or census data. (See Chapter 10, “Healthcare Privacy in an Electronic Data Age” for a deeper dive on healthcare privacy.) Other people may need particular information, such as the age or location of an individual in a study, but they do not need the person’s name or the social security number associated with the data. Here, anonymity techniques protect an individual’s privacy while still allowing the relevant data to be accessed.

Anonymity in Cybersecurity

In the cybersecurity discipline, three main tenets are sought: confidentiality, integrity, and availability, known as the CIA model of cybersecurity. Privacy is a crucial aspect of achieving the confidentiality of data. Maintaining confidentiality can be a daunting task that requires many techniques to ensure the privacy of data. One area of interest, anonymity, manipulates data in a way that obfuscates the identity of individuals but still allows others to view the applicable data. One procedure for keeping data private in this way is known as k-anonymity.

K-anonymity: The idea behind k-anonymity has been around since 1986, when Tore Dalenius wrote a paper describing how personal information could be identified from census data. The theory behind k-anonymity is that people’s personal information can be anonymized, preserving privacy while still allowing pertinent data to be shared for scientific studies. For example, if a cancer study were conducted by a major university, the data would undoubtedly contain personally identifiable information, raising privacy concerns for the subjects. K-anonymity allows specific data to be shared while preserving the subjects’ privacy.

K-anonymity is accomplished by first categorizing information into three classes: non-identifiers, identifiers, and quasi-identifiers (Samarati & Sweeney, 1998). An identifier is any attribute that would quickly reveal a person’s identity in the data. Non-identifiers are data that have no relation to a particular person unless directly correlated with an individual. Lastly, a quasi-identifier is an attribute that can be associated with multiple people but, when combined with other quasi-identifiers, can eventually allow an individual to be identified. Once the data is categorized, one of two methods can be applied to anonymize the dataset.

The first method is to generalize the data. This is done by looking through all the values in a particular column and finding commonalities. If one category is age, the values can be grouped into five- or ten-year increments. The objective is to release only the information pertinent to the study while blurring the exact characteristics of each subject. The other method is to suppress, or sanitize, information from the released data. There will be times when data cannot be generalized and is not needed for the study or research at hand; in these situations, the data should not be included in the released dataset. Examples include a person’s name, street address, and even religion, since this type of information can identify a person relatively easily.
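
Both techniques, generalization and suppression, can be sketched in a few lines. The record layout, the ten-year age bins, and the ZIP-code truncation below are illustrative assumptions, not a prescribed scheme.

```python
def generalize_age(age, width=10):
    """Replace an exact age with a ten-year range, e.g. 37 -> '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def anonymize(record):
    """Suppress direct identifiers and generalize quasi-identifiers."""
    return {
        "name": "*",                           # suppressed identifier
        "age": generalize_age(record["age"]),  # generalized quasi-identifier
        "zip": record["zip"][:3] + "**",       # truncated quasi-identifier
        "diagnosis": record["diagnosis"],      # retained study attribute
    }

row = anonymize({"name": "Jane Doe", "age": 37, "zip": "72204",
                 "diagnosis": "flu"})
print(row)   # name replaced by '*', age becomes '30-39', zip becomes '722**'
```

The study attribute (here, the diagnosis) passes through untouched, which is the point: the data stays useful for research while the subject becomes harder to single out.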

The key requirement of k-anonymity is that every record be indistinguishable from at least k-1 other records with respect to its quasi-identifiers (Samarati & Sweeney, 1998). For k = 2, this constraint means that every combination of quasi-identifier values must appear in at least two records. Table 4.1 shows how information should be presented if the dataset is to be considered k-anonymized.

Table 4.1 An example of a k-anonymized dataset

Looking at the table, one can see all the characteristics of k-anonymity applied. Each column is either sanitized with an asterisk or grouped into ranges, and each combination of quasi-identifier values appears in at least two rows, hiding the identity of any one person. K-anonymity is not impervious to attacks, however, and is not suitable for all datasets.
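
Whether a released table actually satisfies k-anonymity can be checked mechanically by grouping records on their quasi-identifiers. Below is a minimal sketch with fabricated rows, not the contents of Table 4.1.

```python
from collections import Counter

def k_of(records, quasi_identifiers):
    """Return the dataset's k: the size of the smallest group of records
    sharing identical quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

rows = [
    {"age": "30-39", "zip": "722**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "722**", "diagnosis": "cold"},
    {"age": "40-49", "zip": "721**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "721**", "diagnosis": "asthma"},
]
print(k_of(rows, ["age", "zip"]))   # each group has 2 rows, so k = 2
```

A release pipeline would compare this value against the target k and generalize further, or suppress rows, until the constraint holds.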

One area in which k-anonymity fails is small group studies or small datasets. When there is not a substantial amount of data in each column, obscuring a person’s identity becomes difficult while trying to preserve the pieces of information crucial to a particular study. For example, if there are only ten subjects in a study, commonalities are scarce, which can hinder generalization and thus reduce the amount of releasable information. When datasets are small, other data science methods for preserving identity must be used in conjunction with k-anonymity. The reality is that guaranteeing complete privacy is very difficult, and with the advancement of technology and the amount of data accessible to the public, privacy will only become harder to preserve.

Several techniques can be used to attack k-anonymity and must be understood before implementing this privacy control. One method of re-identification is known as a linkage attack. This attack combines the anonymized data with a dataset already known to the attacker. The attacker looks for overlapping information between the datasets; once enough overlap is found, the identity of an individual can be ascertained. Now that structured and unstructured datasets are sold and distributed regularly, this type of attack is becoming more and more effective. Furthermore, language models like OpenAI’s GPT-4 and Google’s BERT can expedite this process by taking available information and quickly isolating common parameters. As artificial intelligence advances and publicly available information grows, the protection of privacy will certainly weaken unless new measures are developed or defense in depth is followed.
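
A toy version of a linkage attack can be sketched by joining an "anonymized" table with a public voter-style list on shared quasi-identifiers. Every name and value below is fabricated for illustration.

```python
# Released table: direct identifiers removed, but quasi-identifiers remain.
anonymized = [
    {"age": "30-39", "zip": "72204", "diagnosis": "diabetes"},
    {"age": "40-49", "zip": "72201", "diagnosis": "flu"},
]
# Auxiliary dataset the attacker already holds (e.g. a public registry).
public = [
    {"name": "J. Smith", "age": "30-39", "zip": "72204"},
    {"name": "A. Jones", "age": "40-49", "zip": "72201"},
]

def link(anon_rows, public_rows, keys=("age", "zip")):
    """Re-identify rows whose quasi-identifiers match exactly one person."""
    matches = []
    for a in anon_rows:
        hits = [p for p in public_rows
                if all(p[k] == a[k] for k in keys)]
        if len(hits) == 1:        # unique overlap -> identity recovered
            matches.append((hits[0]["name"], a["diagnosis"]))
    return matches

print(link(anonymized, public))
```

Because each quasi-identifier combination here is unique, every diagnosis is re-identified; with k of at least 2, each lookup would return multiple candidates and the join would be ambiguous.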

The Privacy Control Catalog in NIST 800-53, Revision 5

One reliable way to ensure an organization is using defense in depth is to follow a well-developed and reviewed standard. Before we dive into one of these standards, let’s discuss defense in depth.

Defense in depth. Defense in depth is nothing more than adding layers to a cybersecurity defense. The quintessential analogy is a castle. If the castle is open and has no protections, no locked doors, curtain walls, moat, or guards, it can easily be conquered by adversaries. With defense in depth, the castle is built on top of a mountain, protected by a moat and drawbridge, with archers on the high walls. Each added layer of security aids in the overall defense of the castle or, in our case, of network security and ultimately privacy. With defense in depth explained, we can use a standard to guide its implementation; this approach is also known as layered defense.

The National Institute of Standards and Technology (NIST) is an organization that develops standards for U.S. Federal Agencies. One such standard, NIST Special Publication 800-53 revision 5, published in 2020, outlines how to implement security and privacy controls within an organization. The publication was developed for the U.S. Federal government but can be applied to any organization. NIST SP 800-53r5 uses three categories of controls, administrative, technical, and physical, to guide privacy safeguards. Each category identifies how a control will be implemented.

Administrative controls. An administrative control is nontechnical in nature. This type of control is typically developed by cybersecurity professionals and management within an organization; policies, training, and other types of plans make up much of this category. Administrative controls outline how an organization will operate and define the rules that employees must follow, implementing the procedures used to defend personal information and address other privacy concerns. Management is heavily involved in this category of controls, which in turn allows for the enforcement of policies and procedures. Enforcement is regarded as the most critical aspect of this category: if administrative controls are not enforced, users may decide not to follow the policies or procedures, which puts data at risk of compromise.

Technical controls. Technical controls are applied to computer systems, networking equipment, and other information systems. As the name implies, these controls harden equipment or add boundary protection to the organization's network. They are usually applied by skilled information technology professionals and overseen by cybersecurity practitioners. For example, an administrative control would define the organization's password policy, whereas a technical control would apply the corresponding settings to the systems. Cybersecurity personnel would then audit the control to verify it was implemented properly.
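
The password example illustrates the split: the policy values below stand in for the administrative control, and the checking function for the technical control that enforces it. All rule values are illustrative assumptions.

```python
import re

# Administrative control: the written policy (illustrative values).
POLICY = {"min_length": 12, "require_digit": True, "require_symbol": True}

def compliant(password):
    """Technical control: check one password against the written policy."""
    if len(password) < POLICY["min_length"]:
        return False
    if POLICY["require_digit"] and not re.search(r"\d", password):
        return False
    if POLICY["require_symbol"] and not re.search(r"[^\w\s]", password):
        return False
    return True

assert compliant("correct-h0rse-battery")
assert not compliant("short1!")   # fails the minimum-length rule
```

An auditor's job, in this framing, is to confirm that the deployed check actually matches the written policy, including after either one changes.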

Physical controls. Physical controls protect employees and data in the real world, implementing security through tangible means. A lock on a door is a physical control; so is an access control system that only allows authorized people into a particular room or area of a building. Physical controls are usually implemented by many divisions of an organization but can still be monitored by the cybersecurity team, because physical access to equipment can easily subvert the other control categories and allow attackers to execute malicious code locally on computer systems.

NIST SP 800-53r5 takes these three major categories and segments them into smaller areas, which allows an organization to focus on specific controls. If privacy is the key concern, the organization can quickly scan the control catalog and select the areas labeled as identifiable information or privacy. This streamlines implementation and allows each control to be prioritized. All the control categories rely on each other to preserve privacy within an organization.

The NIST control catalog is organized into twenty functional areas, where each group of controls is driven by governmental mandates or organizational needs. Controls are developed to combat many attack vectors, and as the attack landscape grows, NIST develops new controls and updates SP 800-53 on a consistent basis. Controls are listed in alphabetical order and labeled with the abbreviation of the control area plus a corresponding number; for example, AC-1 is the first control in the Access Control area. Each control also carries a documented description, which shortens the learning curve for implementing it. Thanks to this documentation approach, privacy-specific controls are easy to locate within the catalog.

PII Processing and Transparency, labeled PT in the catalog, is one control group written exclusively for privacy. It first outlines the need for the organization to create policies and procedures for handling personal information, then helps the organization find and document its authority to process that information. The group also requires that only authorized personnel be able to access PII, explicitly restricting access to all others. Additionally, it has organizations detail the purpose of collecting PII, obtain consent from their users, and provide privacy notices where a system does not otherwise provide them. This gives users some transparency into their privacy within the organization and allows them to make their own informed privacy decisions. SP 800-53 contains additional privacy-related controls, which ultimately support the cybersecurity posture of the organization.

Individual’s Privacy Relating to Cybersecurity (Three Ways to Protect Your Privacy)

Organizations have resources to provide security, including privacy protection, to their customers and employees. But what about the average person? How do they protect their privacy using cybersecurity principles and techniques? Applying NIST standards may not be feasible for an individual, given the equipment, time, and effort required to apply each control area. Even so, privacy can be enhanced by following cybersecurity best practices and using Free and Open-Source Software (FOSS).

1. Use non-attributable networks. One technique that can be very effective is the use of non-attributable networks. These networks are designed to obscure network traffic for the purpose of evading monitoring. One such network, The Onion Router (TOR) network, uses encryption, proxy devices, and tunneling techniques to make the user's network traffic ambiguous. One way to use TOR is to download and install the TOR web browser. When someone opens the TOR browser and surfs the Internet, the browser sets up an encrypted tunnel and passes traffic through multiple TOR relays. These relays are what make an individual user’s traffic anonymous and virtually untraceable back to the original sending device. Once the traffic has gone through the TOR relays, it passes through an exit node and returns to the public Internet. Using TOR provides some privacy on the Internet, but as with anything in cyberspace, one method alone is not enough to give full anonymity.

2. Use a privacy-specific Operating System (OS). Another way to add layers of privacy on the web is to use a privacy-specific operating system like the Debian Linux-based Tails OS. When one uses Apple’s macOS or Microsoft’s Windows, network traffic and browsing history can become compromised, lessening one’s privacy on the web. A few OSs were developed with privacy as their primary purpose. Everything users do on a conventional operating system leaves a trace on the hard drive. Digital forensic tools can look through each cluster on a hard drive and find files that were previously deleted, because operating systems do not typically erase data: when a file is deleted, the operating system simply marks that space as available to be written over again. This is where privacy operating systems come in. These operating systems run from a disc or USB drive and operate entirely in memory, without writing anything to the actual hard drive. Once users reboot their computers, all traces of the operating system disappear. This process grants users a much-needed layer of privacy protection, but it must be used as prescribed by the vendor and in line with proven best practices.

3. Use social behaviors. The last cybersecurity defensive technique in this category is not technical in nature. An individual who uses an anonymizing network and a privacy-focused operating system should also adopt social behaviors that preserve privacy. One tactic is to maintain several online personas and avoid logging into systems associated with one’s real accounts. Think about it this way: a user goes through all the trouble of obfuscating traffic and making it untraceable, but then logs into a social media account. Game over; they have just compromised their privacy by associating the login with the network traffic. Never do anything that would trace the traffic back to the originator; create a few online personalities instead. Users from the United Kingdom should deploy TOR and browse only .com rather than .uk domains, since .com is generic and could be attributed to anyone around the world, whereas .uk domains are specific to the United Kingdom. Moreover, one should never use a typical web browser at the same time as the TOR browser; otherwise, an individual’s anonymized traffic can be correlated with their ordinary traffic and traced back through the anonymizing network to the original sender. When it comes to preserving privacy in cyberspace, cybersecurity practices can aid in the overall achievement of confidentiality for the average person.

Conclusions

Cybersecurity and privacy in the digital age are closely coupled topics. As more advanced protections and protocols are developed to maintain the confidentiality and overall data security of users, bad actors are simultaneously finding new ways to circumvent or compromise these protections. Protecting user privacy is everyone’s responsibility. It is important for organizations charged with safeguarding PII to follow nationally recognized best practices, such as those outlined in NIST SP 800-53r5, to protect their customers’ information. Individuals should also take an active role in managing the way companies and organizations collect, use, share, and sell their data by reviewing and understanding the terms and conditions presented to them. Additionally, users should adopt best practices to limit their exposure while navigating online, using IoT devices in their homes and offices, and sharing information with organizations. Researchers and organizations that need to share user data with affiliates and partners should investigate k-anonymity and other methods of obfuscating personal information. Connected devices are becoming more prolific, and companies will continue to collect, process, and utilize personal information for activities such as targeted advertisements and tailored user experiences. While these activities can improve situational awareness and the overall user experience, users should take an active role in how their personal data is collected and used.