A comparison of privacy issues in collaborative workspaces and social networks
- Cite this article as:
- Pekárek, M. & Pötzsch, S. IDIS (2009) 2: 81. doi:10.1007/s12394-009-0016-4
With the advent of Web 2.0, numerous social software applications allow people to publish and share information on the Internet. Two of these types of applications – collaborative workspaces and social network sites – have a number of features in common, which are explored to provide a basis for comparative analysis. This basis is extended with a suitable definition of privacy, a sociological perspective and an applicable adversary model in order to facilitate an investigation of similarities and differences with regard to privacy threats. Practical examples are derived from the use of Wikipedia and Facebook. Analysis suggests that a combination of technical, legal, and normative solutions should be considered to counter privacy issues. A number of potential solutions that may mitigate these issues are proposed.
Keywords: Collaborative workspaces, Comparison, Facebook, Privacy, Privacy issues, Social network sites, Social software, Wikipedia
With the advent of the so-called Web 2.0, social software applications are gaining more and more users. People actively participate in discussions and the creation of content on the Internet. They create profiles with personal data and manage their relationships on the Web. This offers a variety of possibilities to make new friends and business connections, to share knowledge and to get support from an online community. In the process, users leave more and more information traces online, which may cause privacy issues. This insight is not new, and much research has been carried out on this topic (e.g. Grimmelmann 2009, Gross et al. 2005, Hogben 2007, Wong 2008).
As part of the European Community’s Seventh Framework Programme (FP7/2007–2013), one of the topics of the PrimeLife project is the investigation of privacy issues of collaborative workspaces and social network sites (PrimeLife 2008). Both types of platform have a number of elements in common. People participating in these platforms provide and adapt content, and divulge personally identifiable information in the process, thus leading to (potential) privacy issues. This paper investigates whether and to what extent social network sites and collaborative workspaces can be treated alike when addressing privacy threats, and suggests a number of potential solutions that may mitigate these issues. The scope of the analysis is relatively general, as it is not the objective to solve one particular privacy problem with one specific solution. Rather, the goal is to outline possible types of solutions that may be considered based on the particular features of collaborative workspaces and social network sites.
The structure of the paper is as follows. After a brief introduction of social software, we focus on the similarities and differences between collaborative workspaces and social network sites. This description of general features is supplemented with a suitable definition of privacy, a sociological perspective and an applicable adversary model in order to have a theoretical basis for the comparison of both types of social software. The mainstay of the paper is formed by an analysis of the privacy issues arising in collaborative workspaces and social network sites, and it concludes with a number of suggested improvements.
Information Management: finding, evaluating and administering information
Self Management: presenting aspects of oneself on the Internet
Relationship Management: representing and maintaining contacts with others via the Internet
Considering these functional differences, we distinguish between different types of social software applications. On the one hand we consider collaborative workspaces, which encompass applications that are primarily focused on documents that are created in a collaborative manner (Panoke-Babatz and Syri 1997), and that aim to support information management (cf. top corner of the triangle in Fig. 1). Technical systems that can be used for establishing collaborative workspaces are wikis, collaborative real-time editors, forums, chats, weblogs, or further groupware systems. A well-known example of a collaborative workspace realised by a wiki system is Wikipedia (Wikipedia 2008a). On the other hand we find social network sites (e.g. Facebook (Facebook 2008a), LinkedIn (LinkedIn 2009), Hyves (Hyves 2009), MySpace (MySpace 2009)), which stress the self-portrayal of social network site members, and the management of the relations between them (boyd and Ellison 2007). Social network applications are therefore positioned midway between the left-hand and the right-hand corner at the bottom of the triangle (cf. Fig. 1).
Differences and similarities between collaborative workspaces and social network sites
Social network sites and collaborative workspaces both aim at supporting users in online collaborations; however, they follow two different approaches.
The essential feature of a social network site is the provision of user profiles and connections between them (cf. relationship management in Fig. 1). It is focused on the individual, and users can create additional content, usually related to their profiles (private messages, listed groups, wall), to present themselves within a group of connections. Collaborative workspaces work the other way around. The key functionality in this case is the collaborative creation and editing of content (cf. information management in Fig. 1). The co-authors in a collaborative workspace form a social network, but this social network is not the essence of the collaborative workspace: the focal point is the jointly created content, whereas the value of a social network site lies in the network itself.
Table 1. Comparison of features for collaborative workspaces and social network sites
Collaborative workspaces: “Content is important.”
Social network sites: “The network, supported by user information and relationships, is important.”
Content Creation and Management
Collaborative workspaces: Collaborative creation and editing of content is the key feature.
User Administration
Social network sites: Management of user profiles and their connections is the key feature.
Access Control (AC)
Collaborative workspaces: If restricted, AC to documents is identity-based, depending on the goal of the document and the knowledge of the user.
Social network sites: If restricted, AC to profiles is relationship-based, depending on the connection between the user and the profile owner.
History Services
Collaborative workspaces: User gets newest version per default; all former versions of content may also be available.
Social network sites: User gets only the current version of others’ profiles and connections. Providers may have former versions stored in databases.
Event Services
Collaborative workspaces: Users can be informed in case of changes to certain contents or when new content is available.
Social network sites: Users can be informed in case of changes to certain contents or when new content is available.
The first two features, Content Creation and Management and User Administration, are inversely important for the particular applications. Theoretically, collaborative workspaces can be realised only with features for creating and managing content and without any user administration. For example, a wiki system can allow everybody to read and modify all articles without any restriction. Conversely, an application that allows people to create profiles and indicate connections fulfils all mandatory requirements of a social network site without providing additional features for communication and content production. Access Control, History Services and Event Services are realised slightly differently in current applications of both types. In any case, these features are subordinate functionalities.
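The difference between identity-based and relationship-based access control can be illustrated with a minimal Python sketch. The `Document` and `Profile` classes, their fields, and both check functions are hypothetical and chosen for illustration only; they do not describe how Wikipedia, Facebook, or any other platform actually implements access control.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A document in a collaborative workspace (hypothetical model)."""
    title: str
    # An empty set is taken to mean that access is unrestricted.
    allowed_users: set[str] = field(default_factory=set)


@dataclass
class Profile:
    """A profile on a social network site (hypothetical model)."""
    owner: str
    friends: set[str] = field(default_factory=set)
    friends_only: bool = True


def can_read_document(user: str, doc: Document) -> bool:
    # Identity-based: the decision depends on who the requesting user is.
    return not doc.allowed_users or user in doc.allowed_users


def can_view_profile(user: str, profile: Profile) -> bool:
    # Relationship-based: the decision depends on the connection between
    # the requesting user and the profile owner.
    if user == profile.owner or not profile.friends_only:
        return True
    return user in profile.friends


draft = Document("Project plan", allowed_users={"alice", "bob"})
open_article = Document("Public article")
carol = Profile("carol", friends={"alice"})
```

In this sketch `bob` may read the restricted draft because he is named on it, yet he cannot view `carol`'s profile because no connection between them exists.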
Collaborative workspaces and social network sites are both automated tools to support interactions between participants. The underlying goals of these interactions may vary considerably. Members who have an account on a social software application aim to maintain social connections or to share information with others. In order to become part of such a community, it is necessary to disclose at least some personal data that shows who you are, what skills you have and what you are interested in. This disclosure is facilitated directly by profile pages that contain basic information, such as name and age, but also other identity related data, like hobbies and interests, images, and personal opinions as expressed in blogs. Further, personal data is indirectly disclosed when content is provided by the user. This encompasses semantically included information, e.g. someone who writes his real name or place of residence in a forum, as well as writing style, habits of technology usage and other information. The digital availability of this data leads to potential privacy risks, since copying of data and tracing of users is simple and data, once disclosed, is out of control of the data supplier.
Investigating informational privacy in social software
When discussing the privacy aspects of collaborative workspaces and social network sites, an appropriate definition of privacy is indispensable. We limit ourselves to informational privacy, which can be defined as the freedom from unreasonable constraints on the construction of one’s own identity (Agre and Rotenberg 1997). The availability of personal information in the hands of unintended others may cause such constraints, which calls for equipping the user with control over his personal data in order to minimise misuse of information. The ability of the user to actively influence the access to and the use of his personal information is key.
When interacting with other people or organisations, every individual plays a role that is appropriate in a particular situation. The behaviour of someone who is surrounded by close family members may differ substantially from the behaviour displayed at work when interacting with colleagues or management. According to Goffman, which part of one’s identity someone is prepared to show to the environment depends on the context, and it is essential to keep these contexts separate; he coined the term ‘audience segregation’ for this phenomenon. Audience segregation can be defined as the ability of the user to have different partial identities to play different roles and portray the self to others in a way he chooses (Goffman 1959). Thanks to the careful segregation of the different audiences, the partial identities can be allowed to co-exist. Rachels states that this audience segregation “is an essential characteristic of modern (western) societies and allows for different kinds of social relationships to be established and maintained” (Rachels 1975).
The sociological theory concisely introduced above was drafted long before the advent of social network sites or collaborative workspaces, but the concepts hold up well online. On social network sites, the user profile is the image someone presents to his environment, and it forms the basis for his interactions with the other members of the social network site. However, the image someone presents is often only directed at a certain audience (e.g. someone’s closest friends), and may cause embarrassment when accessed by others. The theory behind context segregation and the risk of collapsing contexts form a powerful means to analyse the privacy issues in both social network sites and collaborative workspaces.
Another sociological perspective deals with the specific norms users of social software bring to the table. It was theorised that every social network comes with its own set of social norms (Tönnies 1965). Actions of members of these networks are based on assumptions about the norms that regulate the interactions. A mismatch between the user’s expectation of social norms and the existing practices in a particular network or workspace could be another source of privacy issues. The extent to which stakeholders in a network or workspace act in accordance with the normative expectations of other stakeholders forms a useful basis for analysis.
Providers of the collaborative workspaces/social network site application
Providers of the collaborative workspaces or social network site are insiders with the most comprehensive insight in the application since they are the ones who are responsible for implementation, delivery and maintenance of the software.
Privacy issues in collaborative workspaces and social network sites
Since it is not possible to take into consideration all available social network sites and collaborative workspaces, we have done a detailed analysis of potential privacy issues in Wikipedia (Wikipedia 2008a) as a popular example of a collaborative workspace and Facebook (Facebook 2008a) as a well-known instance of a social network site. The following sections discuss similarities and differences with regard to privacy issues that we found for both applications, viewed from the perspective of the three different adversary types.
Privacy issues caused by third parties
On both platforms, Wikipedia and Facebook, it is quite simple for third parties to gain access to personal data without infringing the technical rules set out for the use of the systems. In the case of Facebook this is due to the settings which are applied by default and which are rarely customised by the user, since users believe these are the optimal settings or they have no interest in privacy settings at all (Gross et al. 2005). For users of Wikipedia, customisation is simply not foreseen by the application. As a result the general public is often allowed access to at least a limited set of personal information (a basic profile or the user page). These very open access control settings make it possible for third parties to collect information on individual users, which can be used for profiling purposes. The user-created content may also be lifted from the original context and combined into an overarching view of the individual, for which purpose an increasing number of so-called mashups are available. Mashups are a genre of interactive Web applications that draw upon content retrieved from external data sources to create entirely new services (IBM 2006). In this case, personal data publicised in different locations on the web may be brought together on one new web page. This use of personal information leads to the collapsing of contexts that was introduced in the section titled Sociological Perspective, which in turn may impede individuals from constructing their identity differently for each separate context, thus causing privacy infringements. Third parties like tax authorities may also link publicly available information on social network sites to check information on tax returns (McDonagh 2008).
In Wikipedia, where minimal access control options give every third party access to user pages, the issue is not limited to the confidentiality of personal data. Since third parties are also allowed to modify the user pages of Wikipedia members, it also concerns the correctness and integrity of the data published on an individual’s personal page. In addition, it is easy to search for all articles to which a user has contributed and thereby gain a good profile of his interests. It should be mentioned that Wikipedia does not require people to have an account with username and password in order to contribute. However, it is also possible to search for all contributions made from the same IP address, which is stored if the contributor does not provide a registered username.
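How little effort such profiling requires can be sketched in a few lines of Python. The revision log below is invented sample data, and the function name is an assumption made for illustration; the sketch only shows the principle that a public edit history keyed by username or IP address can be regrouped into a per-contributor interest profile.

```python
from collections import defaultdict

# Hypothetical revision log entries: (contributor, article). The contributor
# is a registered username or, for anonymous edits, the stored IP address.
revisions = [
    ("Alice42", "Dresden"),
    ("192.0.2.17", "Tax law"),
    ("Alice42", "Rock climbing"),
    ("192.0.2.17", "Diabetes"),
    ("Alice42", "Dresden"),
]


def interest_profiles(log):
    """Group edited article titles by contributor identifier."""
    profiles = defaultdict(set)
    for contributor, article in log:
        profiles[contributor].add(article)
    return {who: sorted(titles) for who, titles in profiles.items()}


profiles = interest_profiles(revisions)
```

Note that the anonymous contributor is profiled just as easily as the registered one, because the IP address acts as a stable identifier in the log.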
Besides legitimate access, third parties may also gain access to personal data through deceitful conduct, e.g. hacking the system’s databases (e.g. Valleywag 2008). In both collaborative workspaces and social network sites, information collected by such means may cause severe privacy issues, ranging from embarrassment to identity theft, which affect many users. The security breaches at the root of these problems are similar for both platforms.
Privacy issues caused by other users
Other users of Wikipedia and Facebook have at least the same potential to cause privacy issues as third parties. The issues identified in the previous section are therefore equally applicable when other users act as the adversaries. Many privacy issues can be traced back to the out-of-context use of personal data. On Facebook the impact of these infringements is generally higher when trusted contacts are involved, since these normally have legitimate access to more personal data than the general public. It holds true for both platforms that, even when all technical rules are respected, the lack of enforceability of social norms concerning the publication of personal information in other, unsolicited contexts lies at the root of these privacy issues. Legal provisions attempting to stem these privacy issues, e.g. the prohibition on using fake identities when constructing social network site profiles, are hard to uphold in practice.
The interactions between users, however, lead to some other potential privacy issues. In both example applications it is possible to leave remarks or comments on the personal site of other members (the so-called Wall on Facebook, user pages on Wikipedia). On Wikipedia it is even possible to create a new public article with an individual as its central topic (e.g. Wikipedia 2008b). Some control of the information flow is thus relinquished to other people, with the result that unwelcome information generated by other users may be shown to the public. Another example is third parties trying to gain access to private information through other users on social network sites. When someone installs an application as an add-on to his profile, it may harvest data from the available network without the users concerned being aware of it. The application may be granted rights to access profile information when installed. Such an application acquires the privileges of the profile owner and can query personal information of the user and members of the user’s network (Felt and Evans 2008).
Concerns regarding the confidentiality, correctness and integrity of personal data have already been discussed in the previous section for the example of Wikipedia, and also apply to social network sites when other users are considered as potential adversaries.
Neither application requests any proof of identity for registration, which provides users with some anonymity, but on the other hand enables malicious users to perform social engineering attacks by creating an account with false data. Pretending to be someone else, e.g. a friend or a relative of the target, opens ways to elicit personal data from members of Facebook or Wikipedia. Further, it is possible to compose embarrassing contributions under the name of someone else in both applications.
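The delegation problem described here can be made concrete with a small Python sketch. The in-memory social graph, the `InstalledApp` class and the `harvest` method are all hypothetical; they model the general idea of an add-on inheriting the installing user's read privileges, not the actual platform API of any social network site.

```python
# Hypothetical in-memory social graph; names and fields are illustrative.
PROFILES = {
    "alice": {"friends": {"bob", "carol"}, "birthday": "1985-03-02"},
    "bob": {"friends": {"alice"}, "birthday": "1979-11-23"},
    "carol": {"friends": {"alice"}, "birthday": "1990-07-14"},
    "dave": {"friends": set(), "birthday": "1982-01-30"},
}


def visible_to(viewer):
    """Profiles a user may read: their own and those of direct contacts."""
    return {viewer} | PROFILES[viewer]["friends"]


class InstalledApp:
    """An add-on that acts with the privileges of the installing user."""

    def __init__(self, installed_by):
        self.installed_by = installed_by

    def harvest(self, field_name):
        # The app can read whatever the installing user can read, which
        # includes contacts who never installed or approved the app.
        return {
            user: PROFILES[user][field_name]
            for user in sorted(visible_to(self.installed_by))
        }


quiz_app = InstalledApp(installed_by="alice")
harvested = quiz_app.harvest("birthday")
```

Here only `alice` installed the application, yet the birthdays of `bob` and `carol` are collected as well; `dave`, who has no connection to her, stays out of reach.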
Privacy issues caused by providers
Social norms are currently the only forces effectively delimiting the unabridged use of personal information. When these norms are not respected, users perceive this as a breach of privacy. A good example of this is Facebook’s use of Beacon, a technology that collects information about the use of certain commercial websites and relays it back to Facebook (Facebook 2008b). A public outcry from privacy advocates and negative press coverage (e.g. Malik 2007, wikiHow 2008) led Facebook to the decision to review its position (Zuckerberg 2007). Effectively, the social pressure from the public and the risk of popularity loss are interlinked and may restrain platform providers from making extensive use of personal information in commercial settings.
Conclusion on similarities and differences concerning privacy issues
The main conclusion from the previous sections is that both collaborative workspaces and social network sites suffer from the same potential privacy issues. Both types of platform store a mix of personal data and content, where the balance between these two is mainly dictated by the goal and actual use of the system. Third parties and other users have comparable means to access personal information of platform users, which are technically only restricted by the limitations imposed by the system and the access control settings that have been established.
The access to personal data that the providers enjoy thanks to their direct access to the supporting systems is essentially the same for collaborative workspaces and social network sites. In short, platform providers have full access, assuming that the social network site or collaborative workspaces are realised as a client-server application, which is the case for today’s popular applications. The subsequent use of the available information for new purposes is only limited by self-imposed norms on behalf of the provider. Effective technical and legal means to limit provider data access are virtually absent.
Access to personal information that contravenes the established rules (be they technical, legal, or social) is also similar between collaborative workspaces and social network sites. Breaking technical access restrictions (also known as hacking) is possible on both platforms, just like the infringement of legal constraints surrounding the use of the applications. Social phenomena cause most privacy issues: information originally presented in one context that reappears in a new, unintended context accounts for a large portion of all privacy intrusions. Especially when the norms of people divulging information and parties using information do not match up, the perceived impact on privacy can be substantial.
After having concluded that similar privacy issues exist in both collaborative workspaces and social network sites, the question remains what the differences are. The prime difference is caused by the design and use of the systems, for which we refer back to Table 1. The focus of collaborative workspaces is the management and manipulation of content, whereas social network sites primarily provide management of user profiles and connections. Therefore, collaborative workspaces contain far fewer isolated personal data items than social network sites, where it is a key feature for users to build up their own profile with a large amount of well-structured personal information. The bulk of information in collaborative workspaces is content, which may also contain personal data; however, this data is less structured and requires semantic analysis for extraction. It is therefore easier to collect users’ personal data automatically from social network sites than from collaborative workspaces.
Finally, the analysis of the two example applications demonstrated differences in the currently available user-determined access control settings. The fewer means of access control are available, the more opportunities adversaries have to cause privacy issues.
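The contrast between structured profile data and personal data embedded in content can be illustrated as follows. This is a minimal Python sketch under stated assumptions: the profile fields, the sample comment and both function names are invented, and the regular expression stands in for the far more demanding semantic analysis that real extraction from free text would need.

```python
import re

# Social network site: personal data sits in named, structured fields.
profile = {"name": "Alice Example", "hometown": "Dresden", "birth_year": 1985}

# Collaborative workspace: the same fact is buried in free text.
talk_page_comment = (
    "I disagree with the last edit. I live in Dresden and the tram line "
    "described in the article was closed years ago."
)


def hometown_from_profile(p):
    # A single field lookup suffices.
    return p.get("hometown")


def hometown_from_text(text):
    # A naive pattern that only catches one phrasing; real extraction would
    # need far more robust semantic analysis.
    match = re.search(r"\bI live in ([A-Z][a-z]+)", text)
    return match.group(1) if match else None
```

The lookup works for every profile that uses the same field name, whereas the pattern fails as soon as the author phrases the same fact differently, for example "My home town is Dresden."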
Because of the high degree of similarity between the privacy issues originating from the use of both collaborative workspaces and social network sites, it is likely that several potential improvements will be applicable to both platforms. This section discusses improvements that may help to mitigate privacy issues in collaborative workspaces and social network sites.
Fine-grained, user-determined access control policies to enable users as owners of their data to define who can access what personal data (e.g. Franz et al. 2006).
Group encryption to prevent access from anybody outside the group (e.g. Camenisch and Damgard 2000).
Open source peer-to-peer networks to prevent all data being stored on a central server under the control of one provider (e.g. Noserub 2008).
Allow use of multiple pseudonyms to enable some level of unlinkability for users between different contexts (cf. Pfitzmann and Koehntopp 2001).
Digital signatures to ensure the integrity of personal data and the authenticity of the sender (cf. Chaum 1985).
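As an illustration of one item in the list above, the use of multiple pseudonyms, the following Python sketch derives a separate pseudonym per context from a secret held only by the user. The derivation via HMAC-SHA256, the function name and the context labels are assumptions made for this example; the cited literature does not prescribe this particular construction.

```python
import hashlib
import hmac


def context_pseudonym(user_secret: bytes, context: str) -> str:
    """Derive a stable pseudonym for one context from a user-held secret."""
    digest = hmac.new(user_secret, context.encode("utf-8"), hashlib.sha256)
    return "user-" + digest.hexdigest()[:12]


# Hypothetical secret; it would be kept by the user and never shared.
secret = b"kept-private-by-the-user"
wiki_name = context_pseudonym(secret, "wiki:medical-articles")
work_name = context_pseudonym(secret, "social-network:colleagues")
```

The same context always yields the same pseudonym, so a reputation can be built within it, while an observer who lacks the secret cannot link the pseudonyms by their names alone. Unlinkability at this level does not prevent linking through other traces such as IP addresses or writing style.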
As we have seen, one of the main causes of privacy issues is that personal information from one context is shifted to a new context without the consent of the user. On top of that, the user has lost control of the use of her personal information in this new context, quite often because she is not even aware of the information shift. There is a distinct lack of shared social norms concerning the acceptability of the use of personal information in new contexts. It will be a challenge to facilitate the forming and the acceptance of social norms in collaborative workspaces and social network sites. The Wikipedia community is a good example of the development of social norms in collaborative workspaces. Having had no rules before 2001, the community has since discussed and introduced a set of policies and guidelines that serve as standards or advice, respectively (Wikipedia 2009).
Instead of only focusing on the legal framework presented by the platform provider for improvements, it may be worthwhile to explore the wider legal landscape in addition. Intellectual property legislation, portrait rights or general privacy protection legislation may serve as alternative bases to prevent privacy issues in the future.
Conclusions and future work
This paper compared Wikipedia as an example of a collaborative workspace and Facebook as an example of a social network site. Similarities and differences concerning privacy issues of both social software applications have been identified. In general, the issues we have found arise mainly due to collapsing contexts, i.e. users’ personal data used in contexts other than the original and intended one. The finding that social software lacks fine-grained and user-determined access control options aggravates this source of privacy issues.
Serious privacy issues are not only the result of the breach of technical implementations, but may also be brought about through the disregard of social norms and legal provisions. Therefore we conclude that solutions to address privacy issues in social software can neither be only technical, nor only legal, nor only based on upholding certain social norms: it is necessary to find a comprehensive approach. A combination of all three areas is needed in order to improve privacy protection on the one hand without losing important functionalities on the other hand, whilst safeguarding the social usability of the application for the average user. These aspects are considered by the PrimeLife project (PrimeLife 2008) and will be the topic of our future research.
The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007–2013) under grant agreement no. 216483 (PrimeLife 2008).
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.