1 Introduction

There has been considerable recent expansion in the development of online social networks (OLSNs). This development is paralleled by improving investigative and analytic techniques for dealing with large bodies of unstructured data (McAfee and Brynjolfsson 2012). In particular, the fields of business intelligence and data analytics (BI-DA) and digital forensics (DF) have developed tools that are applicable to the analysis of such social networks (Garfinkel 2010; Huber et al. 2011; Zainudin et al. 2011). Such analysis has traditionally been conducted within limited areas such as police frameworks. However, with the emergence of the BI-DA and DF information technologies, organizations increasingly use the technologies for intelligence gathering and commercial analysis of online social networks. Many countries have well-defined laws dealing with legal rights to conduct such investigations. However, these laws are often directed at domestic investigations with a focus on individual privacy. The legality of such investigations is less clear when they originate within one nation state but penetrate across international borders via networks or other media to aim at foreign subjects (McKnight 2012). Further concerns arise from new data storage technologies, such as the cloud, where the distinction between domestic and international repositories is blurred or lost, and national privacy rights cannot even be determined (Kerr and Teng 2012). There are established international privacy rights frameworks that protect personal data about individuals. While it is clear that individuals have certain rights to the proper governance of their personal data, these privacy rights do not commonly extend to protect social groups.

There is distinct data that applies to a social group that is different from individual data. Examples of personal individual data may include traits such as age, income, gender, national origin, race, education, religion, etc. These may also include habits, such as entertainment interests, dining preferences, frequent visits to institutional sites like religious institutions, etc. Social groups also have parallel, collective data. Traits can be descriptive: average age, most common profiles of members’ gender, origin, race, education, religion, etc. A social group may also have collective habits, such as their common or shared entertainment interests, dining preferences, frequent institutional site visits, etc.
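The parallel between individual and collective data can be made concrete with a minimal sketch. The member records, field names, and values below are hypothetical and serve only to illustrate how group-level traits and habits can be derived from individual-level ones; they are not drawn from any real OLSN.

```python
from collections import Counter
from statistics import mean

# Hypothetical member records; field names and values are illustrative only.
members = [
    {"age": 34, "religion": "A", "interests": {"wine", "opera"}},
    {"age": 40, "religion": "A", "interests": {"wine", "hiking"}},
    {"age": 28, "religion": "B", "interests": {"wine", "opera"}},
]

def group_profile(records):
    """Derive collective traits and habits from individual records."""
    # Count how many members list each interest.
    interest_counts = Counter(i for r in records for i in r["interests"])
    religion_counts = Counter(r["religion"] for r in records)
    return {
        # Descriptive trait: the average age of the membership.
        "average_age": mean(r["age"] for r in records),
        # Descriptive trait: the most common religion among members.
        "most_common_religion": religion_counts.most_common(1)[0][0],
        # Collective habit: interests shared by every member.
        "shared_interests": sorted(
            i for i, n in interest_counts.items() if n == len(records)
        ),
    }

profile = group_profile(members)
```

Note that the resulting profile contains no name or identifier of any member, which is why such collective data tends to fall outside privacy protections framed around personal data.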

Most OLSNs must ultimately have owners that sponsor the social network’s servers for commercial or social purposes. In administering access, these owners may collect minimal personal data about the users (name, email, address, etc.). As a consequence, the owner’s data files may not hold much risk for the privacy of the individuals or the social group itself (except perhaps for the harvesting of email addresses for spamming). However, there is much higher risk in the various posts made by the individuals to their profiles or as communication for other members of the social group (Bulgurcu et al. 2010). For example, a hyperpersonal teenager (one seeking to present an image with certain idealized traits) may naively attract sexual predators (Baskerville and Sainsbury 2006).

The privacy and ethics issues engulfing online social networks are complex. Basic social notions are shifting as a result of the technology use. For example, there are concerns that OLSNs are changing the ethics of friendship (Cocking et al. 2012). Superficial online friendships may fail to meet the shared life criterion of virtue friendship (Vallor 2012). But this definition of friendship goes beyond philosophy. Friendship is used to provide the conceptual integrity in someone’s online presence that often distinguishes who is let into one’s privacy and who is excluded (Hull et al. 2011).

The release of profile data about communities online has encountered early difficulties with anonymizing such data. For example, the release of Facebook profile data was quickly mined to reveal the underlying community. There were concerns that, although the profiles did not contain any obvious personally identifying data, the profiles could be pierced using features such as physical, physiological, mental, economic, cultural or social identity. Such piercings could enable re-identification of individuals (Zimmer 2010). The presence of such features in online social networks further complicates the need to govern privacy within national identity systems (Lusoli and Compañó 2010). There appears to be a tendency for online social networks to pressure participants toward self-disclosure of private information (Posey et al. 2010). Governments may be required in the future to regulate social networks because of the ease with which participants may victimize themselves through unnecessary self-disclosures (Tow et al. 2010).

Like friendship, the very shape of individual identity is evolving as a result of OLSNs. It is an ethical dimension of identity that entails the potential for online social networks to either reinforce or deteriorate the identity of the participant (Mishra et al. 2012). Beyond actualizing social concepts such as friendship and privacy/identity disclosures, OLSN participation can actually redefine what these concepts mean and affect the norms that revolve around them.

Responsible owners will ensure that their social networking software provides a reasonable degree of privacy protection to enable members of the social network to protect their personal data appropriately. However, it is not clear whether the privacy regard for this personal data extends to this distinct data about the social network as a whole.

Businesses are moving to take advantage of the data generated by social network groups. Community relations efforts strategize the promotion of products and services through social networks. For example, social network marketing has applied various technologies to analyze the group-generated data such as “vocabulary, language patterns, and phrasing to determine if the comments of their products or services are positive or negative” (Laudon and Traver 2009). Social network group data have also been used for understanding the voters’ preferences in political campaigns. It will be no surprise that DF can retrieve data from OLSNs. Data scientists with BI-DA capabilities can use such data to analyze and understand group characteristics (e.g., culture, movement, interest, etc.) for such purposes as social media marketing (Kumar and Mirchandani 2012). A problem arises when these attributes are something that the group as a whole wants to reserve as private. There is little appropriate protection of the group data. Thus the group values might be vulnerable to misuse or subject to various threats and attacks from unexpected sources that gain access to these data and have the capability for their abuse. More importantly: an individual may be able to hide within a group, but the group itself cannot hide (Taylor 2017). Consequently, the violation of group privacy may lead to the compromise of individual privacy.

This situation is creating a privacy kill chain whereby the relatively easy compromise of highly vulnerable group privacy provides an avenue for individual privacy compromise. The notion of a “kill chain” has been widely adopted to describe the different stages in a cyber-attack (Cockburn 2015), especially network-based cyber-attacks (Hutchins et al. 2011). The purpose of this paper is to provide a model of the emerging privacy kill chain, plus the implicit preventative model that would impede it.

Our existing frameworks of privacy regulations and laws are already stretched in the protection of individual privacy. There is little regulatory and legal protection that governs the privacy of groups (Bloustein 2002). The privacy kill chain amounts to a back door through which DF and investigative technologies can be used to compromise the privacy of individuals by compromising the privacy of groups (Chen et al. 2012). Because these technologies are increasingly in use outside of law enforcement, breaking the privacy kill chain is important to the preservation of individual privacy.

This kill chain and its prevention involve various aspects of the OLSN phenomena. These aspects include social, commercial, legal, ethical, and technical ones. The social aspect regards organizations with commercial motives to penetrate group membership and group characteristics to deduce individual characteristics thereby creating legal and ethical issues. In this way, OLSN forensics can compromise individual privacy by first compromising group privacy. This may create conflicts between group and individual privacy. The technical aspect regards the investigative technologies to collect and analyze OLSN data in order to harvest group characteristics, together with group membership. Our framework explicates the issues arising from the changes that investigative technologies are wreaking through the vulnerability of privacy in online social networks.

The paper is organized as follows. In the next section, we introduce the privacy kill chain. Following this, we elaborate the elements of this kill chain, beginning with an introductory overview of online social networking in the ethical context. Then we elaborate the legal and ethical aspects of the implications of group and individual privacy compromises that can follow from the analysis of OLSNs. We then elaborate the social aspect of OLSN phenomena, namely how commercial motives in organizations are related to the compromise of OLSN privacy. Then, for the technical aspect of OLSN phenomena, we review investigative technologies and their potential applications to OLSNs. Finally, we conclude with a model of online privacy protection that impedes each stage of the privacy kill chain.

2 The Privacy Kill Chain

In general a kill chain is a military concept: “a systematic process to target and engage a target to create desired effects” (Hutchins et al. 2011, p.87). The concept has been used to describe cyber-attacks that systematically break through more weakly defended elements of a system to gain access to otherwise more strongly defended elements (Cockburn 2015; Hutchins et al. 2011). It is ideal for describing the situation at hand, because currently group privacy is more weakly protected than individual privacy. Figure 1 illustrates the four stages in the privacy kill chain. Each of these stages will be discussed in detail later in the paper. For the purpose of providing an early context and perspective, we will briefly introduce these stages in this section as an overview.

Fig. 1 Four stages of the privacy kill chain

Persistence conversion, the first stage, regards the ability of OLSNs to keep more-or-less permanent records of social activities and behaviors. The data produced when participants engage in OLSNs persists long after the engagement (Hogan and Quan-Haase 2010). This data represents an OLSN conversion of social ephemera to social perpetua. Before OLSNs, most social behavior was ephemeral, lasting only as long as the behavior itself. Social behavior was only rarely recorded and even more rarely made publicly available. But many OLSNs retain historical data on participant activities because analysis of such records can be valuable. This means that social behavior conducted through OLSNs is converted to more-or-less persistent data recording that behavior.

Advanced analysis, the second stage, regards the use of BI-DA/DF technologies for researching and investigating online data (Chen et al. 2012). Such investigations collect and analyze the evidence available in social perpetua for purposes of governing or profiting.

Group compromise, the third stage, regards the discovery of group characteristics as a result of the advanced analysis of OLSN data. Such revelations about group behavior are not usually subject to privacy regulation because groups usually have no particular privacy rights protected by law (Bloustein 2002). This situation means that the results of such investigations can be regarded as more-or-less public and available for all purposes.

Individual compromise, the fourth stage, regards the deductive projection of individual characteristics from the group characteristics. For example, if Alice is a member of an OLSN group of wine enthusiasts, it would be a simple deduction that Alice (at least occasionally) enjoys adult beverages.
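The deductive projection in the fourth stage can be sketched in a few lines. The group names, characteristics, and membership lists below are hypothetical, and the sketch assumes that the third stage has already yielded both group characteristics and group membership. The inferences it produces are plausible rather than certain, as the qualifier "at least occasionally" in the Alice example suggests.

```python
# Hypothetical group-level findings from the group compromise stage.
group_characteristics = {
    "wine_enthusiasts": {"enjoys adult beverages"},
    "marathon_club": {"runs regularly"},
}

# Hypothetical membership lists harvested alongside the group data.
memberships = {
    "Alice": ["wine_enthusiasts"],
    "Bob": ["wine_enthusiasts", "marathon_club"],
}

def project_to_individuals(memberships, group_characteristics):
    """Attribute each group's characteristics to every member of that group."""
    inferred = {}
    for person, groups in memberships.items():
        traits = set()
        for group in groups:
            # Union in the characteristics of each group the person belongs to.
            traits |= group_characteristics.get(group, set())
        inferred[person] = traits
    return inferred

inferred = project_to_individuals(memberships, group_characteristics)
```

Nothing in this step touches the individual's own posts or profile: the individual characteristics follow entirely from the weakly protected group data.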

3 Being Social Online: Online Social Networks

OLSNs are online service platforms that facilitate the building of social relations among people who want to share their interests, activities, and real-life connections (D. M. Boyd and Ellison 2010). OLSNs consist of members’ profiles, their social links, and a list of additional services that allow users to share information within their networks. Efforts to support computer-mediated social interaction and networking began in the late 1970s. Early OLSNs started as generalized online communities to enable people to interact with each other through chat rooms. With new technological features such as user profile creation and friend management, a new wave of OLSNs has gained popularity since the early 2000s. OLSNs such as Facebook and LinkedIn have attracted millions of users since their introduction. Some of them became the largest and fastest growing sites in the world. Progressive diversification and sophistication of purposes and usage patterns have accompanied the rapid growth of OLSNs. OLSNs vary in the technologies they use (e.g. mobile connectivity and blogging) and the topics they support (e.g. business, common interests, dating, and friends). Patterns of personal information revelation are also quite varied. The types and visibility of information revealed or elicited differ depending on each user’s settings and on the site.

People in all cultures form complex social networks. Such networks can be as formal as a college fraternity or as informal as a neighborhood garden club. Such networks typically share values and trust. OLSNs are different from other communications media, and could be viewed as social milieux rather than media. OLSNs are information and communications technology (ICT)-based meeting places where individuals gather to socialize, that is, to exchange information, observe and emulate each other, and compare status (Clemons 2009). With the advent of mobile computing, traditional OLSNs based on desktop applications have gradually become mobile OLSNs (Dinh et al. 2013). Because of the convenience, location-based services, and mobility of mobile devices, mobile OLSNs provide a richer user experience and more convenient communication, thereby increasing the exchange of individual and group information and raising privacy concerns (J. Li et al. 2017).

Most OLSNs allow users to use three core features: 1) construct a public or semi-public profile within a bounded system, 2) articulate a list of other users with whom they share a connection, and 3) view and traverse their list of connections and those made by others within the system. Beyond sharing profiles, OLSNs vary greatly in their features - meeting new friends or dates (e.g. Friendster, Orkut), finding new jobs (e.g. LinkedIn), receiving/providing recommendations (e.g. Tribe), etc. Some note that OLSNs and online community services are different in that OLSNs are individual-centered whereas online community services are group-centered (D. Boyd and Ellison 2007). However, online community services are considered as OLSNs in a broad sense.

Researchers have embraced the use of social network analysis as a means to study OLSNs (Howard 2008). The match seems natural since social network analysis has long been used for studying off-line social networks. Such network analysis has certainly been informative. We have quickly learned that OLSNs are not as stable as off-line networks, but instead evolve and transform rather quickly in terms of the patterns of interpersonal interactions in specific populations (Hu and Wang 2009). One possible explanation for the instability of OLSNs is the impact of co-membership on the way in which OLSNs grow and change. Essentially, the research shows that more recently joined members have a greater effect on the way the network will grow and change in the future than do members who have been part of the social network for much longer periods. This trait is contrary to the way off-line social networks grow and change (Peng and Woodlock 2009).

A group’s purpose can be compromised when group members are reluctant to join and be active due to concern over identification and the revelation of their private information. At risk are the social benefits of online groups. Establishing an OLSN makes it more feasible to use digital analysis to develop knowledge about the network group itself, over and above any compromise of its individual members. As a result, an online social network opens new avenues for automated digital scrutiny and new problems for the protection of privacy in a digital world.

4 Legal and Ethical Aspects of Online Privacy Compromise

4.1 Ethics and IS

Ethics refers to what people ought to do in order to provide basic moral principles from which norms can be derived (Culnan and Williams 2009). Ethics can be divided into three categories: meta-ethics, normative ethics, and applied ethics. Meta-ethics deals with the general nature of ethical theories (Mingers and Walsham 2010). Normative ethics addresses how moral conclusions should be reached. Applied ethics focuses on how ethics should be applied in a particular context. Prior study identifies three types of ethical approaches: consequentialism, deontology, and virtue ethics together with communitarianism (Donaldson and Werhane 1999; Pojman and Fieser 2011). Consequentialism matches rational decision making in arguing that correct actions are ones that maximize the overall good or minimize the overall harm (David Hume and Adam Smith). Deontology focuses on the act in itself instead of the consequences of an act. According to deontology, actions should be seen as morally right or wrong in themselves regardless of their consequences. Virtue ethics and communitarianism originated from Aristotle’s idea of the virtuous life and a modern renaissance in communitarianism (Crisp 2000; Hursthouse 2007). This approach holds that people should develop ways of behaving that would naturally lead to the well-being of both the individual and the community.

Information ethics deals with ethical dilemmas related to the collection, use and management of information, while business ethics focuses on the interaction of ethics and business in general (Kenneth McBride 2014; Mason 1986). PAPA, a classic IS normative model suggested by Mason (1986), has focused on a narrow set of issues, primarily information privacy. With IT’s increased complexity and establishment within society, ethical issues have increased and gained attention from both business and academia (Moor 2005). In particular, the rise of social computing has extended the ethical environment, which now encompasses individuals, information systems as commodities, and globally networked information systems. A new information ethics such as ACTIVE, proposed by Kenneth McBride (2014), incorporates not only a utilitarian approach but also virtue ethics in moral philosophy (Audi 2012).

The management of ethical issues is crucial to business. Basically, a business cannot pursue goals such as making a profit unless people follow minimum moral principles. Aside from the ethical argument that consumers should be protected due to their vulnerability and the obligation of business to protect them, there are two reasons for an organization to protect its customer information. First, a failure to protect customer information can lead to problems that threaten the trust relationship with its customers. Second, stakeholders including internal audiences (e.g., employees) and external audiences (e.g., regulators, media) would give legitimacy to an organization that keeps its moral responsibilities. Such organizations would be better positioned to survive in competitive markets as they can acquire necessary resources. Stakeholders in society would perceive such organizations as trustworthy (Porter and Kramer 2006; Suchman 1995).

There is a wide range of ethical issues important to business and academia. According to Walsham (2006), IS research is little different from other forms of research within the social sciences. The ethical topics covered by prior IS studies include codes of ethics for IS practitioners, privacy and security, cybercrime, intellectual property disputes, open-source software, hacking, and the digital divide (Himma and Tavani 2008; Tavani 2007; van den Hoven and Weckert 2008). For example, Walsham (1993) noted the importance of dealing with the individual analyst as a moral agent in the context of system development. In a similar context, Myers and Miller (1996) applied an Aristotelian perspective of ethics to the issue of privacy and information access. Park et al. (2009) have developed a model with morality to explain the extent to which developers are reluctant to report bad news on a software project. While the number of ethical issues in IS has been growing, there is a lack of studies dealing with them. A recent notable IS phenomenon related to ethics and privacy is the online social network.

4.2 Individual Privacy

Prior research found that it is difficult to conceptualize privacy, perhaps because the concept is evolving over time. Privacy has not been well defined even though it is one of the fundamental values in society (Dinev 2014). Thomson (1975) notes that no one has a clear idea about the meaning of privacy rights. Prior literature does not even provide an accepted definition of privacy (Inness 1996). Researchers argue that setting up the parameters of privacy and the arguments for its protection is not easy even after a century in the development of privacy rights (Borna and Sharma 2011; Regan 1995).

The rights of data privacy are founded on the protection of individual rights against interference with personal decisions, against illegal searches, against intrusion on seclusion, unwanted spying, etc. Invasion of privacy is a tort arising from the exposure of damaging information regarding an individual (Volokh 2000). Privacy rights protect against “the demands of a curious and intrusive society” (Post 1989, p. 958). Fundamentally, an autonomous individual in the presence of a community is interdependent with that community. It is the norms of the community that enable the individual to exercise their autonomy. Privacy helps provide personal immunity and enables an individual to possess personality, autonomy, and thereby human dignity (Post 1989). For example, the right to privacy can sometimes extend to the right to own one’s personal knowledge (Baskerville and Dulipovici 2006).

The rights to privacy are protected by multiple international conventions and guidelines. Article 12 of The UN Universal Declaration of Human Rights states “No one shall be subjected to arbitrary interference with his privacy, family, home or correspondence, nor to attacks upon his honor and reputation. Everyone has the right to the protection of the law against such interference or attacks.” (United Nations 1948). The OECD council adopted guidelines for the protection of privacy that specifically protected personal data transmitted across borders (OECD 2013). Article 7 of The Charter of Fundamental Rights of the European Union defines a right to respect for an individual’s private and family life, going further in Article 8 to extend this to the protection of personal data: “Everyone has the right to the protection of personal data concerning him or her. Such data must be processed fairly for specified purposes and on the basis of the consent of the person concerned or some other legitimate basis laid down by law.” (European Union 2000, Article 8).

These conventions and guidelines are implemented in myriad national, state, and provincial laws. Many countries have implemented omnibus national privacy laws that provide sweeping implementations of the conventions and guidelines, e.g. the European and Commonwealth nations. In others, notably the USA, there is more complex and diverse protection for privacy rights through a hodgepodge of special purpose laws. Such laws discretely protect the privacy of data related to health, education, finance, etc. (Baumer et al. 2004). Societies have tended to regard the growth in the power of information technologies for processing and communicating data as particularly threatening to the privacy of individuals’ data. The owners of IT systems that store and process these data, and the owners of communications networks that transfer these data, are largely regarded as data stewards. Whether government- or commercially-owned, the duty to act as responsible stewards for personal data has become both a legal and social obligation (Shapiro and Baker 2001). Concerns over the forensics analysis of online social networks may further increase the tension between privacy and IT.

4.3 Individual Privacy and Online Social Networks

An individual’s online identity is not necessarily the same as the individual’s off-line identity. Members of OLSNs may also operate on their own identities in rather instrumental ways. In other words some individuals create different identities for themselves in their OLSNs (Howard 2008). Members of OLSNs sometimes customize their identities to achieve their social goals (K. Young 2009).

Concerns for the individual privacy of members of OLSNs found voice almost from the beginning of OLSNs. The lack of privacy in these networks was obviously noticed as potentially harmful for those individuals who posted personal information in their profiles online (Rosenblum 2007). It is important to recognize that one distinguishing characteristic of OLSNs is the vulnerability of data in electronic media to computer-based storage and analysis. A face-to-face interpersonal social exchange is characterized by its nature as ephemera; it has a brief and transitory existence. Because an online interpersonal social exchange is brokered by computers, the data can be subject to detective-style matching, analyzed on the fly, or trapped and stored for possible later analysis. Unlike a face-to-face exchange, an online interpersonal social exchange can be characterized by its nature as perpetua; it has a permanent and indefinite existence. As a result, OLSNs can be easily subjected to analysis and investigation.

This point is not lost on the research community. Researchers are already demonstrating successful techniques for detecting and analyzing group characteristics and behavior. For example, research has helped us to recognize how OLSN members install cues in their public profiles to indicate their interests and motives (S. Young et al. 2009). Analysis of social network size has been shown to indicate a subject’s gender and characteristics of introvert or extrovert behavior. In addition, large networks and extensive time spent online can distinguish opinion seekers from opinion leaders (Acar and Polonsky 2007).

These research studies have the advantage of being able to capture and study the empirical evidence easily and electronically. However, these studies also illustrate how such methods compromise sensitive data and expose the characteristics and behavior of an OLSN to open scrutiny.

For example, in one study seeking to develop better suicide prevention among youngsters, researchers have used automated data collection programs to distinguish young lesbian, gay, and bisexual individuals participating in an OLSN, and to study the structure of their networks (Silenzio et al. 2009). While these purposes are beneficent, the same techniques could be applied with hostile intentions. Even the benign research is turning up surprising relationships from the analysis of OLSNs. For example, members who provide a religious affiliation in their public profile are also likely to install cues that indicate their interest in finding a romantic partner (S. Young et al. 2009).

4.4 Group Privacy and OLSN

Much of the work on privacy laws has been focused on the individual right to privacy. This work refers to an individual’s right to be let alone (Warren and Brandeis 1890). The notion of group privacy is more recent. Group privacy can be thought of as “the right to huddle” (Bloustein 2002, p. 121). While group privacy may lack a substantial theoretical foundation, it can be regarded as an extension of individual privacy to an association of individuals. Protection of group privacy is tightly linked to the duration, purpose, and context of the association. Examples of such associations that are already afforded certain privacy rights include the marital relationship, priest-penitent relationship, lawyer-client relationship, doctor-patient relationship, and journalist-informant relationship, as well as political organizations, social organizations, business organizations, etc. (Bloustein 2002).

The orientation of privacy concerns has always been individual. Expectations and security of group privacy are not pronounced. There appear to be three perspectives on the privacy rights of social groups, as shown in Table 1. The first perspective holds that no group privacy exists. People with this perspective argue that there is no regulation to support group privacy. The second perspective holds that the privacy rights of the social group are nothing more than a collective of the privacy rights of the individual persons who belong to the social group. The third perspective would hold that the social group itself possesses an identity that is distinct from those of its individual members. Because the social group has a distinctive identity, it accrues a right to privacy that is distinct from the simple collection of the privacy rights of its individual members.

Table 1 Conflict between group privacy and individual privacy

The first perspective is reflected in current legal systems where many countries do not define or protect group privacy. In such legal paradigms, online social group behavior might be considered more or less public and can be investigated by anyone with access to both the media and the forensics technology, to the legal degree that the group behavior cannot be traced to the individuals. In terms of whether group right-to-privacy itself exists as an ethical issue, some viewpoints may hold that an analysis of online social groups does not violate anyone’s privacy. Such a viewpoint is anchored to the lack of regulation as negating the existence of group privacy. An alternative view would acknowledge and protect group privacy because it is an ethical issue, and its absence from current legal paradigms is merely a legal oversight. Such an alternative view may be made timely by OLSNs in the presence of BI-DA and DF capabilities.

The second perspective holds that a social group has no distinctive identity apart from the collection of its members’ identities. Therefore, individual privacy rights define any limits on the privacy of an online social group. Under this perspective, a compromise of group privacy means a compromise of individual privacy when the compromise can be traced to a single individual person. It is therefore arguable that an online social group can be investigated without concern for privacy as long as the facts, behaviors, and characteristics making up evidence about the group cannot be traced to a single individual person.

In this second perspective, any group that has a distinctive identity is regarded as an organization. The extent to which research has regarded organizational privacy is generally limited to the perspective that organizations possess private individual information (Culnan and Williams 2009). The general perspective on the degree to which organizations are entitled to privacy is more typically regarded from the perspective of confidentiality rather than privacy. The prevailing assumption is that organizations hold their own information through property rights, and are consequently responsible for the protection of the confidentiality of their own information. Even when this information is held by another, the responsibility for confidentiality protection is often expected to be contractually defined (Adams et al. 1994). However, the responsibility for the protection of the confidentiality of the private information of individual persons is often viewed as ethically and legally vested in the holder of that information (Xu et al. 2008).

The third perspective holds that a social group has a distinctive identity apart from the collection of its members’ identities. Under this view, it may be possible to violate the privacy of the social group without necessarily penetrating the individual privacy of its members. In such a case it would be possible to collect evidence from within a social group without compromising the privacy of any individual member of that group. Group privacy is a form of privacy that people seek in their association with others. As an extension of individual privacy, group privacy protects the desire and need of people to come together, exchange information, share feelings and make plans. Group privacy can be compromised without affecting individual privacy (Dumsday 2008). For example, group privacy is compromised when a newspaper reports a secret society’s rituals that the society wishes to keep from the public eye. In this case, the group would experience loss independently of any loss of individual privacy.

From the perspective of the privacy kill chain, individual privacy can be compromised through an attacker’s breach of group privacy. Regulations have served as a deterrent that protects individual privacy, but no comparable regulations yet exist for breaches of group privacy. In the absence of relevant regulations, group privacy may be easily compromised in the context of OLSNs. With no deterrence in the form of regulations, attackers who intend to compromise individual privacy could use group privacy as a back channel to breach individual privacy.

If it is possible to collect evidence about a social group in such a way that it does not compromise the privacy of individuals, then the rights to investigate social groups may be poorly defined as well as poorly protected in current law. There is little effective legislative or judicial restriction on forms of interference or invasion against the freedom of association (Bloustein 2002). This lack of legal protection further complicates the collection of evidence about online social groups because many online social groups are transnational in their membership. It appears to be the present case that anyone in the world has the right to collect evidence and investigate online social groups as long as data about the individual members is not compromised.

5 Organizations: Commercial Motives for OLSN Privacy Compromise

Businesses may have a strong commercial interest in using OLSNs. Businesses recognize OLSNs as an effective marketing tool. Organizations are increasingly interacting with customers via social network sites (Dunn 2010). For example, electronic retailers have been using social media tools to spot trends and communicate with employees and customers. Previously, organizations invested substantially in traditional marketing tools such as advertisements on TV, radio, and web sites. From this perspective, OLSNs are new media and have good potential as vehicles for delivering targeted advertising messages.

Social network groups play a vital role in changing consumer behavior. Consumers today connect with brands in fundamentally different ways (Edelman 2010). Prior to the Internet age, consumers started with many potential brands in mind and narrowed down their choices until they decided which one to buy. Recent research indicates that consumers add and subtract brands from a group under consideration during an extended evaluation phase (Court et al. 2009). They rely heavily on digital interactions before they actually purchase. Even after a purchase, they still evaluate a shifting array of options and stay engaged with the brand via online media. Social media has been making the evaluate and advocate stages increasingly relevant. For example, positive word of mouth from social network groups can play a pivotal role in building awareness of products and driving purchases (Edelman 2010). Although organizations have encouraged and exploited OLSN for their marketing purposes, they have paid little attention to online privacy issues in the past. Organizations’ commercial motives have significant impact on the use of OLSN by both organizations and customers. The more information about customers is available, the more likely privacy issues are to arise.

Sophisticated information technologies coupled with BI-DA and DF capabilities enable organizations to profile social network groups, which raises digital privacy issues. In the past, however, few firms have developed ethical frameworks and privacy protection strategies (Sarathy and Robertson 2003). Organizations may find little incentive to make privacy a priority unless they are thrust into the public limelight in newspaper articles (Ashworth and Free 2006). Consequently, organizations’ common response to privacy issues has been one of drifting and reacting (Smith 1994).

Prior studies note that online consumers are concerned about their private information and that their concerns have a negative impact on business. Users of OLSN would have the same concerns and might be reluctant to post and share their private information over OLSN because of uncertainty regarding privacy and information security. It is important for OLSN businesses to find a way to address users’ privacy concerns. Prior research on privacy is based on the notion that individuals’ cost-benefit analysis affects their decision to disclose personal information (Il-Horn et al. 2007). Users may be willing to sacrifice their privacy for financial gains. Furthermore, privacy-related decision-making processes are dynamic, multi-faceted, and vary with situational factors that are not well known (Dinev and Hart 2006; H. Li et al. 2011). Therefore, further research on OLSN privacy from an organizational point of view is needed.

Trust and emotion can be focal constructs that work between organizations and OLSN. Organizations need to know the detailed mechanisms of users’ privacy-related decision-making processes. Prior research on e-commerce suggests that customers would be willing to give up their privacy in exchange for receiving quality service. Prior cost-benefit analysis may not be sufficient to explain users’ information sharing at OLSN. Social exchange theory can be used to explain social exchange and stability as a process of negotiated exchanges between engaged parties. The theory posits that subjective cost-benefit analysis and the comparison of alternatives shape all human relationships. Trust can be conceptualized as an alternative way to manage OLSN users’ privacy concerns. If users know that their personal information collected by OLSN would be used to create quality goods or services, privacy concerns might be reduced or superseded by their desire to participate. In a similar way, trust building can be proposed as an alternative way to address privacy concerns over OLSN.

Emotion may be involved in the user’s response to an organization’s management of OLSN privacy. Either positive or negative affect may be engaged in the user’s cognitive processing of an organization’s privacy management. For example, users may experience positive affect in response to an organization’s commitment to protecting privacy at OLSN. An emotion process model has been proposed by Frijda (Frijda 1986; Frijda et al. 1989) and widely used across several fields including psychology, criminology, behavioral decision making, marketing, and management. An emotional model for OLSN in the context of privacy would help explain how users’ emotional processes interact with organizational factors.

Here we propose that OLSN research should be conducted to explore the relationship between organizations and OLSN users. Trust and emotion are suggested as constructs that may affect users’ privacy concerns at OLSN. As to related theories, we suggest social exchange theory (Emerson 1976), emotion theory (Frijda et al. 1989), and cognition theory (Finke et al. 1992). As to research method, either a qualitative approach using a case study or a quantitative approach using model development and validation can be used.

While there is a degree of business research about individual privacy (Calluzzo and Cante 2004), there is less work still about group privacy. In the networked world, organizations may need to be more proactive in responding to privacy issues related to OLSN and anticipating the problems arising from the business use of network forensics technology.

Prior research has focused on the use and impact of OLSN for organizations’ marketing purpose without much attention to privacy issues. Prior research focusing on OLSN built upon social network theory. Studies have focused on the role of trust and intimacy, safety of young users, and representing and harvesting OLSN profile information (D. Boyd 2004). Social network theory (Granovetter 1983) has been applied to analyze the relevance of relations of different depth and strength under this context. Social network theory regards social relationships as a network consisting of nodes and ties. Nodes correspond to individuals within networks and ties correspond to relationships between individuals. Social networks operate at different levels (e.g. individual, groups, and organizations). Social network theory can also be used to develop a method in determining whether an individual has a reasonable expectation of privacy (Gross et al. 2005). The theory can be used to determine whether information is considered private or public. Although social network theory explains how privacy is not only an individual-level issue but also a group-level issue, prior research has focused on individual-level privacy.

6 Investigative Technologies: Business Intelligence and Digital Forensics

Organizations have been increasingly using BI-DA for decision making based on both structured and unstructured data. BI-DA helps organizations overcome computational challenges due to high volume, velocity and variety of data (Pegna 2015; Lu et al. 2014). In addition to the data collected from online social networks, other types of data such as large logs are available for organizations to analyze users’ behaviors and their interests. In particular logs stored by search engines and other online platforms have records of the interaction of users with not only the systems but also with their friends in terms of information sharing and tagging actions. In recent years, the BI-DA capability to collect and analyze data from online interactions and relationship links in online social network platforms such as Facebook and LinkedIn has been significantly enhanced (Bonchi et al. 2011). As a result, BI-DA supported data mining of online social networks has become ubiquitous as data collection and analysis are continuous. This capability enables organizations to redefine the types and amount of data that can be collected and analyzed from online social networks platforms. This leads to the emergence of a computational social science (Alvarez 2016).

An organization’s use of BI-DA is closely related to the issue of users’ privacy (Belanger and Xu 2015). Organizations can use BI-DA capability with OLSNs for various business purposes including targeted marketing and advertising, reputation monitoring, community detection and evolution, and expert finding (Bonchi et al. 2011). They often outsource the analysis of online social networks, sharing data with third party service providers. To support this, many OLSN platforms share data with third party organizations. Many of these OLSN platforms even provide open APIs that enable third party organizations to access user profiles, thereby potentially violating users’ privacy (Narayanan and Shmatikov 2009). In addition, the integration of multiple data sources from online social networks enables organizations to identify individuals easily because of the small world problem. The small world problem refers to the observation that social networks are densely connected, such that there are no more than six intermediate acquaintances between any two arbitrary people (Kleinberg 2000; Milgram 1967).

Data anonymization has been introduced to address users’ privacy when organizations use BI-DA to collect and analyze data from online social networks, because BI-DA can further analyze such data to identify the individuals engaged in those networks. Data anonymization enables organizations to perturb the data in such a way that individual values are hidden while useful information can still be recovered (Ghinita et al. 2007). Prior studies have introduced several anonymization techniques including data sanitization (Amiri 2007), output perturbation (Rastogi et al. 2009), query auditing (Kleinberg et al. 2003), and privacy preserving data mining (Y. Li et al. 2012). Despite the availability of these techniques, striking a delicate balance between data usability and privacy protection has not been an easy task.
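To make the idea of perturbing an output concrete, the following is a minimal sketch in Python. It is an illustration only, not the method of any study cited above: it substitutes the well-known Laplace mechanism from the differential privacy literature for the general notion of output perturbation, and the function names, the bounds, and the privacy parameter epsilon are all assumptions introduced here. The sketch releases a noisy version of a group-level statistic (an average age) so that the contribution of any one member is masked.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two
    independent exponential draws with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturbed_mean(values, lower, upper, epsilon, rng=None):
    """Return the mean of `values` plus Laplace noise.

    Assumptions (hypothetical, for illustration): every value is clipped
    to [lower, upper], so changing one member's value moves the mean by
    at most (upper - lower) / n; the noise scale is that sensitivity
    divided by epsilon. Smaller epsilon means more noise.
    """
    if rng is None:
        rng = random.Random()
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    sensitivity = (upper - lower) / len(clipped)
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical ages of the members of one small online group.
ages = [23, 31, 35, 42, 44]
noisy_average_age = perturbed_mean(ages, lower=18, upper=98, epsilon=0.5)
```

The trade-off described in the text appears directly in the choice of epsilon: a small value hides individual contributions well but makes the released average less useful, while a large value does the reverse.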

In addition to BI-DA, DF technology is developing impressive capabilities for revealing analytical evidence through online investigations. According to the Merriam-Webster Online Dictionary, the term forensic means “belonging to, used in or suitable to courts of judicature or to public discussion and debate”, and forensic science is “the application of scientific knowledge to legal problems; especially: scientific analysis of physical evidence (as from a crime scene)” (Merriam-Webster 2010). It derives from the Latin root forensis, which, rather concordantly for our discussion, means “public”. The use of information technologies often leaves digital information that can be subject to forensic analysis. The DF process typically involves several phases including evidence collection (e.g. data in hard drives, flash drives, etc.), preservation, analysis, and presentation (Brown 2009).

While the ethical correctness of forensic investigation on OLSN is debatable, the capability to intrude on these networks using DF capabilities is clear. For example, the US National Security Agency has sponsored research in collecting and analyzing the information that people post about themselves on social networks like MySpace. Combining this kind of data with “details such as banking, retail, and property records” allows government agencies such as “NSA to build extensive all-embracing personal profiles of individuals” (Rosenblum 2007, p. 48). To forensics analysts, information posted on OLSN sites can be valuable for such purposes as tracking policy violations, criminal groups, terrorists, political movements, etc. However, because the intense interest in OLSNs is so new, little research has been conducted so far that leads to a good understanding of the relationship between DF and social networks.

Because BI-DA/DF technologies are capable of analyzing these social networking data, we briefly review these methods and technologies in the context of OLSN usage and provide our research agenda.

First, organizations are already at work developing various methods and tools for analysis of OLSNs. In one study, data mining methods were applied to extract the social network structure based on the online information posted on social networking sites. The focus of the analysis is on the interaction and dynamics of the network (J. Haggerty et al. 2008). Examples of what this DF analysis can focus on include: 1) Centrality in a social network, 2) Identification of levels of culpability for action(s) involving a group, e.g. dissemination of malicious emails, 3) Commitment in exchange relationships, e.g. how and when individuals become committed to a group (like pedophiles exchanging images), 4) Roles of individuals in a social network, e.g. bridges, power relationships, strength of weak ties, etc.

Another study has developed visualization tools to help analyze certain large and complex social networks (J. Cheng et al. 2009). The graphical representation of the social network provides an aid in the analysis of the large dataset. Examples of related measures of an actor in a social network are listed as follows (John Haggerty 2009): 1) In-degree centrality (number of ties received by an actor in an OLSN), used to identify actor receptivity or popularity, e.g., facilitators in a business network, 2) Out-degree centrality (number of outgoing ties from an actor in an OLSN), used to determine the expansiveness of network ties that an actor possesses, 3) Betweenness centrality (the number of shortest paths from all vertices to all others that pass through an actor in an OLSN), used to identify actors that have power to isolate or broker relationships within a network, e.g., by examining paths of potential communication, 4) Closeness centrality (number of steps required to access every other actor from a given actor in an OLSN), used to identify actors that have independence within the network and to assess how quickly they can react to changes.
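The four measures listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tooling used in the cited studies: the network is a hypothetical directed graph stored as a dictionary in which every actor appears as a key, betweenness is computed with Brandes' algorithm (which sums the fraction of shortest paths passing through an actor, and so equals the plain count whenever shortest paths are unique), and closeness is reported in a normalized form (actors reached divided by total steps) rather than as a raw step count.

```python
from collections import deque


def in_out_degree(graph):
    """Return (in_degree, out_degree) dictionaries for a directed graph."""
    out_deg = {v: len(nbrs) for v, nbrs in graph.items()}
    in_deg = {v: 0 for v in graph}
    for nbrs in graph.values():
        for w in nbrs:
            in_deg[w] += 1
    return in_deg, out_deg


def bfs_distances(graph, source):
    """Steps from `source` to every actor reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def closeness(graph):
    """Actors reached divided by total steps needed; 0.0 if none reached."""
    result = {}
    for v in graph:
        dist = bfs_distances(graph, v)
        total = sum(dist.values())
        result[v] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness(graph):
    """Brandes' algorithm for an unweighted directed graph."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        order = []                        # actors in order of discovery
        preds = {v: [] for v in graph}    # predecessors on shortest paths
        sigma = {v: 0 for v in graph}     # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):         # accumulate from the farthest actor
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


# Hypothetical ties: "a" and "e" both message "b", who relays onward to "d".
ties = {"a": ["b"], "e": ["b"], "b": ["c"], "c": ["d"], "d": []}
in_degree, out_degree = in_out_degree(ties)
brokers = betweenness(ties)
reach = closeness(ties)
```

In this toy network the relay actor "b" receives the most ties and lies on the most shortest paths, which is the kind of brokerage position the betweenness measure is meant to surface.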

These capabilities need to be further developed with BI-DA (Isik et al. 2013). For example, processing a large amount of OLSN data takes a significant amount of time due to the limitations of serial processing. A parallel processing method may overcome this limitation (Chaudhuri et al. 2011). Analysis of the processed data may likewise be slow, and new analysis methods such as stochastic analysis or prioritized analysis may reduce the analysis time. In addition, the current visualization method based on the WIMP (Window/Icon/Menu/Pointing device) model may not be suitable for presenting complex OLSN results. An efficient and intuitive interface is needed to visualize and present such complex results.

7 Impeding the Privacy Kill Chain

The above exploration of the social, legal/ethical, commercial, and technical aspects of the online privacy kill chain enables us to analytically consider possible new controls that would impede the success of this chain. This analysis, together with examples of controls to consider, is illustrated in Table 2. We will discuss each of these types of control below, along with examples.

  • Social controls. Social controls would necessarily be enacted by a social group or society. As such, they represent a behavior that must be driven by the shared values that the social group expresses (L. Cheng et al. 2013). These kinds of controls are not very functional, but can be implemented by sharing information about the risks of privacy compromise and the protective measures that the group could encourage among its members. For example, the group could express the importance of not recirculating social data made available in the group. By limiting the further circulation of such data, the extent of the perpetua becomes somewhat more limited. Circulation of group data stays within the group. Likewise, sharing the importance of not retaining social data helps limit the extent of any collection of this data for analysis. An example would be a group value for securely deleting old messages as soon as these have been read. Minimization of the group membership, i.e., valuing a small community, would also limit the audience (and thereby compromise) of group behavior. Finally, avoiding recirculating or retaining individual data would be helpful in limiting the spread of data about who is in the group, making it more difficult to deduce individual characteristics from group characteristics simply by making it slightly more difficult to identify individuals in the group.

  • Legal/Ethical Controls. Enacting legal controls such as laws and regulations is perhaps more functional than social controls (Gerber and Von Solms 2008). Legal controls usually enforce the deontological ethics of a society. If regulations limit the rights to retain online social data, such as rules enforcing a group’s “right to be forgotten” (Bygrave 2014), information about group behaviors can cease to be perpetua. Providing regulation for BI-DA/DF analysis, such as licensure, would provide at least some limit on the kinds of analysis and treatments of group data. Probably of most importance would be the enactment and enforcement of group privacy regulation that would extend the already extensive individual privacy regulation to groups. Such group privacy regulation would be a necessary companion to the notion of licensure.

  • Commercial Controls. These controls might largely be regarded as good hygiene practices in the presence of social, legal and ethical scrutiny by management (Brooks and Corkill 2014). For example, a routine practice of purging perished data would ensure that old social data is not kept around unnecessarily, but is regarded as perishable and securely deleted before it “starts to smell bad”. Likewise, a common practice of avoiding retention of any secondary (noise) outputs of BI-DA/DF analysis would help limit unnecessary revelations about group characteristics. Further, a practice of treating both individuals and groups as equal under an organization’s privacy policy would help limit group privacy compromise. Finally, by adopting practices that limit any unnecessary demographic analyses, the organization would reduce the scope of any probabilistic deductions about characteristics shared by individuals within the group.

  • Technical Controls. Like legal regulation, technical controls may be more functional than commercial or social controls. These controls concern the use of information technology to impede the success of the privacy kill chain (Yang et al. 2016). For example, establishing an automated program of wiping perished social data would help restore the ephemeral characteristic of social information. If BI-DA/DF analysis is regulated and licensed, then electronic monitoring of BI-DA/DF execution would help limit the unauthorized investigation of groups. Group privacy compromise would also be impeded by carefully managed access controls to prevent unauthorized access to group data. Along the same lines, access to group membership data would need to be managed to prevent compromise of the individual identity of group members. This access control would prevent deducing individual characteristics from group characteristics.
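As one illustration of the technical controls above, an automated wipe of perished social data could start from a retention rule such as the following Python sketch. Everything in it is an assumption made for illustration: the record layout, the `posted_at` field name, and the retention window are hypothetical, and the function only filters records held in memory. Secure deletion from storage media and backups, which the control would also require, is outside the scope of the sketch.

```python
from datetime import datetime, timedelta, timezone


def purge_perished(records, max_age, now=None):
    """Return only the records posted within `max_age` of `now`.

    `records` is a list of dicts with a hypothetical timezone-aware
    `posted_at` datetime; older records are dropped from the result.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - max_age
    return [r for r in records if r["posted_at"] >= cutoff]


# Hypothetical group messages and a 30-day retention window.
reference_time = datetime(2016, 6, 30, tzinfo=timezone.utc)
messages = [
    {"id": 1, "posted_at": datetime(2016, 1, 5, tzinfo=timezone.utc)},
    {"id": 2, "posted_at": datetime(2016, 6, 20, tzinfo=timezone.utc)},
]
retained = purge_perished(messages, timedelta(days=30), now=reference_time)
```

Running such a rule on a schedule, rather than on demand, is what would make the data ephemeral by default instead of by individual effort.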

Table 2 Examples of potential controls that could impede the stages of the privacy kill chain

8 Discussion

The protection of group privacy in OLSN groups raises multiple issues. First, group privacy is integral for the freedom to organize and participate in social groups. This freedom is the foundation of democratic society. With the introduction of information technologies, an OLSN grows its own identity (e.g., group name, membership parameters, etc.). As a result, it can have its own privacy on a group level. Second, this OLSN group privacy can be compromised by an analysis using investigative technologies. The technologies introduced above can also easily identify and reveal membership information about each individual. The knowledge of this can lead to members’ apprehension over the revelation of their individual information.

The existing tension between investigative technologies and accessibility to OLSNs needs to be acknowledged and balanced in keeping with the benefits of the societies involved. Investigative technologies have a capability to identify and reveal information about individuals within OLSNs. As the knowledge about the capability of investigative technologies spreads via media to societies, this can make people reluctant to join and to be active in OLSNs. This reluctance might have negative impacts on societies’ needs to express and share diverse opinions, and thereby to integrate harmoniously. Further, investigative technologies are necessary to protect social justice. For example, DF expedites the identification of criminals, international terrorists, predators, etc. From an economic perspective, information disclosure on online consumer groups helps markets become more efficient in the allocation of scarce resources (Posner 1981; Stigler 1980). Therefore investigative technologies can be both a threat and a benefit to society. The challenge arises in how to use investigative technologies for the benefit of society while minimizing their inherent risks (such as those to privacy).

Societies with different cultural values and legislatures treat privacy differently. Specifically, people in more individualistic cultures as measured by Hofstede’s individualism-collectivism index (Hofstede and Hofstede 1991) are more concerned about privacy (Milberg et al. 2000). A society with a strongly individualistic culture tends to protect privacy with various regulations and laws. For example, western countries have more regulations and laws to protect privacy compared with Asian countries that have a strongly collectivist culture.

But most of these regulations and laws are aimed at protecting individual privacy. Group privacy also has its own cultural/sociological limits. Nations that span different cultures have their own distinctive privacy regulations. Group privacy would protect characteristics and activities that can be different from the characteristics and activities of the individual members. A group can have an average age, a dominant gender, typical behaviors, and other characteristics. In a secret religious gathering, for example, an overall, collective response as a group to a leader’s message at the meeting would only be considered confidential if the concept of group privacy is acknowledged. Individual members of the group may or may not exhibit the same characteristics as the group. Indeed, with factors such as the group response above, we cannot assume that all anonymous individuals share the same attitude to the message. Individual privacy cannot necessarily be saved through anonymity in settings where group privacy is compromised.

Because group characteristics and behaviors can be different from individual characteristics and behaviors, group privacy is not the same as the sum of the individual privacy within it. It is possible to compromise group characteristics without compromising any one individual’s privacy. Because few privacy rights are attached to a group, OLSN groups are open to a risk of compromise by investigative technologies. Because many OLSNs are international in scope, there is a risk for abuse across national borders. Investigative technologies can easily extend across borders. There are possibilities for the creation of a new form of digital divide, one in which participation in OLSN is risky for individuals whose association with an OLSN group might be regarded as dangerous behavior by their local authorities. Such a digital divide would separate “haves”, those for whom free participation in OLSN engenders little social risk, from “have nots”, those for whom such participation is dangerous and therefore avoided.

Because investigative technologies can compromise group privacy, individuals may limit their participation in groups in order to protect themselves from harms arising from disclosures about group characteristics and activities. This situation means that in some cases, the functional scope of OLSN is bounded not only technically by computing, software or networking limits, but also socially by privacy concerns.

9 Conclusion

Emerging technologies (e.g., OLSN and BI-DA/DF) each offer great potential to improve our societies. In this paper, we have explored how these emerging technologies can also combine to create new threats to individuals and social groups. Perhaps foremost among these threats is the compromise of privacy. Indeed, it may be the case that OLSN technologies enable a new kind of social group, with needs for a recognition of group privacy, while investigative technologies enable the compromise of such group privacy and individual privacy. The situation provides an example where one sociotechnical improvement generates great social value, and vast amounts of data, while another generates analytical technologies that can unexpectedly yield unintended, and socially unacceptable, results such as privacy breaches affecting both groups and individuals. The collision of these technologies creates a situation where rapid technological development outpaces social controls. While government snooping and commercial data mining of online behavior may seem distasteful, laws and social mores in the area are too underdeveloped to determine what, if any, societal controls are lagging.

Investigative technologies have developed sophisticated methods and tools to collect and analyze digital information of specific human behavior. There is great social value available from these tools, but also great social risk of a different kind than modern societies have encountered before. Care must be taken to balance the benefits and costs of both online social networking and investigative technologies. The research agenda we propose may help societies and organizations strike the proper balance.

By analyzing the privacy kill chain from social, ethical, commercial, and technical perspectives, this article has set out prospects for ways to impede group and individual privacy compromise. As more people and businesses use OLSNs for various purposes, the number of stakeholders, variables, and extensions into other domains will become large. Stakeholders include individuals, online social network groups, social network service providers, forensic analyzers, and other stakeholders who use analyzed data. The role of variables such as trust and emotion in the context of OLSN needs further investigation. Finally, legal and ethical consideration of the relationship between individual and group privacy needs to be extended for the protection of group privacy.