Accountability Through Transparency for Cloud Customers
Public cloud providers process data on behalf of their customers in data centres that are typically physically remote from those customers. This context creates a number of challenges related to data privacy and security, and may hinder the adoption of cloud technology. One of these challenges is how to maintain transparency of processes and procedures while at the same time providing services that are secure and cost-effective. This chapter presents results from an empirical study in which cloud customers identified a number of transparency requirements for the adoption of cloud services. We have compared our results with previous studies and found that, in general, customers' expectations are in line with established research criteria for cloud service provider transparency, but that there are also some additional pieces of information that customers are looking for. We further explain how the A4Cloud tools contribute to addressing the customers' requirements.
Keywords: Cloud computing · Accountability · Transparency · Privacy · Security
1 Introduction
Cloud computing, which allows for highly scalable computing and storage, is increasing in importance throughout information technology (IT). Cloud computing providers offer a variety of services to individuals, companies, and government agencies, with users employing cloud computing for storing and sharing information, database management and mining, and deploying web services, which can range from processing vast datasets for complicated scientific problems to using clouds to manage and provide access to medical records.
These challenges affect several groups of stakeholders:

- users of cloud services who are currently not convinced by the balance of risk against opportunity;
- their customers, especially end-users who do not understand the need to control access to personal information;
- suppliers within the cloud eco-system, who need to be able to differentiate themselves in the ultimate commodity market.
In this chapter we report on the results of an elicitation activity related to transparency requirements from the perspective of cloud customers. A Cloud Customer in our context is an entity that (a) maintains a business relationship with, and (b) uses services from, a Cloud Provider; correspondingly, a Cloud Provider is an entity responsible for making a [cloud] service available to Cloud Customers.
Transparency is the property of an accountable system that is capable of ‘giving account’ of, or providing visibility of, how it conforms to its governing rules and commitments. Transparency involves operating in such a way as to maximize the amount of, and ease of access to, information that may be obtained about the structure and behavior of a system or process. An accountable organization is transparent in the sense that it makes its policies on the treatment of personal and confidential data known to relevant stakeholders, can demonstrate how these are implemented, provides appropriate notifications in case of policy violation, and responds adequately to data subject access requests. In an ideal scenario, the user knows the information requirements and is able to communicate them clearly to the provider, and in return, the provider is transparent and thus willing to address the regulatory and legislative obligations that apply to the assets.
The rest of the chapter is organized as follows. Section 2 presents some background from the literature. Section 3 explains the methods that we used to elicit the views of the stakeholders. In Sect. 4 we present the results, and in Sect. 5 we illustrate how the tools developed by the A4Cloud project contribute to meeting the customer transparency requirements. We discuss our findings compared to related work in Sect. 6, and draw our conclusions in Sect. 7.
2 Related Work
Table 1. Pauley’s cloud provider transparency scorecard. For each criterion, the original table also records whether it was mentioned in our interviews.

Business:
- Length in years in business > 5?
- Published security or privacy breaches?
- Published data loss?
- Member of ENISA, CSA, CloudAudit, OCCI, or other cloud standards groups?
- Profitable or public?

Security:
- Portal area for security information?
- Published security policy?
- White paper on security standards?
- Does the policy specifically address multi-tenancy issues?
- Email or online chat for questions?
- ISO/IEC 27000 certified?
- COBIT, NIST SP800-53 security certified?
- Offer security professional services (assessment)?
- Employees CISSP, CISM, or other security certified?

Privacy:
- Portal area for privacy information?
- White paper on privacy standards?
- Email or online chat for questions?
- Offer privacy professional services (assessment)?
- Employees CIPP or other privacy certified?

Audit:
- External audits or certifications?
- SAS 70 Type II?

SLA:
- Does it offer an SLA?
- Does the SLA apply to all services?
- Publish outage and remediation?
Khorshed et al. [10] highlight the gaps between cloud customers’ expectations and the services actually delivered, as shown in Fig. 1 (adapted from Khorshed et al. [10]). They affirm that cloud customers may form their expectations based on their past experiences and their organizations’ needs. Customers are likely to conduct some sort of survey before choosing a cloud service provider, similar to what people do before choosing an Internet Service Provider (ISP), and are also expected to establish to what extent providers satisfy confidentiality, integrity and availability requirements. Cloud service providers, on the other hand, may promise a lot to entice a customer to sign a deal, but may then face insurmountable barriers to keeping some of their promises. Many potential cloud customers are well aware of this, and are consequently still sitting on the sidelines: they will not venture into cloud computing unless they get a clear indication that all gaps are within acceptable limits.
Durkee says that transparency is one of the first steps to developing trust in a relationship, and that the end customer must have a quantitative model of the cloud’s behavior. The cloud provider must provide details, under NDA if necessary, of the inner workings of its cloud architecture as part of developing a closer relationship with the customer. Durkee also says that this transparency can only be achieved if the billing models for the cloud clearly communicate the value (and avoided costs) of using the service. To achieve such clarity, the cloud vendor has to be able to measure the true cost of the computing operations that the customer executes and bill for them.
Pauley proposed an instrument for evaluating the transparency of a cloud provider. It is the only empirical evaluation we found that focuses on transparency in the cloud as a subject of study. The study aims to help businesses assess the transparency of a cloud provider’s security, privacy, auditability, and service-level agreements via self-service Web portals and publications. Pauley designed a scorecard (Table 1) to cover the assessment areas frequently raised in his research, and to begin to establish high-level criteria for assessing provider transparency. He concludes that further research is needed to determine the standard for measuring provider transparency. In our research we used a different strategy from Pauley’s: we interviewed customers of cloud services to see what kind of information they would like to get from cloud providers.
3 Method
As part of the project, we were responsible for running a set of stakeholder workshops for eliciting requirements for accountability tools. In total, our elicitation effort has involved more than 300 stakeholders, resulting in 149 stakeholder requirements. The first workshop dealt with eliciting initial accountability requirements, serving as a reality check on the three selected business use cases we had constructed [13]. The second workshop dealt with risk perception; its aim was to focus on the notion of risk and trust assessment of cloud services, future Internet services and dynamic combinations of such services (mashups). After the first two workshops, we decided to organize multiple smaller, local workshops on each theme to ease participation of cloud customers and end users. The third set of workshops presented stakeholders with accountability mechanisms to gather their operational experiences and expectations about accountability in the cloud.
Of particular importance to this study was the risk workshop, where 15 tentative requirements related to transparency were identified. This workshop comprised 20 international stakeholders from the manufacturing industry, telecom, service providers, banking industry and academia, and the tentative transparency requirements were subsequently presented to our interviewees as a starting point for the discussion.
In addition to the stakeholder requirements, we have devised a set of high-level requirements which, from an organizational perspective, set out what it takes to be an accountable cloud provider [14]. These requirements are intended to supplement the requirements elicitation process by providing a set of high-level “guiding light” requirements, formulated as requirements that accountable organizations should meet. In short, these requirements state that an accountable organization that processes personal and/or business confidential data must (1) demonstrate willingness and capacity to be responsible and answerable for its data practices, (2) define policies regarding its data practices, (3) monitor its data practices, (4) correct policy violations, and (5) demonstrate policy compliance.
From these activities we have created a repository with requirements from all elicitation workshops, the guiding-light requirements, as well as a number of more technical requirements originating from the conceptual work and technical packages in the project. These have been classified in terms of whether they are functional requirements, which are directly related to the actors involved in the cloud service delivery chain, or requirements for accountability mechanisms, which are related to the tools and technologies that are being developed in the project.
For refining and confirming the elicited transparency requirements, we performed an interview study with eight interviewees, followed by an in-depth analysis of the collected information.
Invitations were sent to our list of contacts in Norwegian software companies. Participation was voluntary. Eight people agreed to participate in the interviews. The participants were all IT security experts working on cloud-related projects. They represented six different organizations: a consultancy, two cloud service providers (one public, one private), an application service provider, a distribution service provider, and a tertiary education institution.
The interviews were structured around four main questions:

- What is the most important information you think should be provided to the cloud customer when buying services from cloud service providers? (Fig. 2)
- In which parts would you like to be involved in making the decisions? In which parts would you like just to be informed of the decisions? (Fig. 3)
- What would increase your trust that the data is secure in this scenario?
- What do you want to know about how the provider corrects data security problems? (Fig. 4)
The eight interviews for this study were transcribed into text documents based on the audio recordings. For further analysis of the transcriptions, we followed the thematic synthesis steps recommended by Cruzes and Dybå [15]. Thematic synthesis is a method for identifying, analyzing, and reporting patterns (themes) within data. It comprises the identification of the main, recurrent or most important issues or themes (based on the specific question being answered or the theoretical position of the reviewer) arising from a body of evidence. The level of sophistication achieved by this method can vary, ranging from a simple description of all the themes identified through to an analysis of how the different themes relate to one another in a conceptual map. Five steps were performed in this research: initial reading of the data/text (extraction), identification of specific segments of text, labeling of segments of text (coding), translation of codes into themes, creation of the model, and assessment of the trustworthiness of the model.
4 Results
On the question of the most important information that should be provided to the cloud customer, the respondents mentioned:

- clear statements of what is possible to do with the data,
- conformance to data agreements,
- how the provider handles data,
- who else other than the provider participates in the value chain,
- what the provider does with the data,
- procedures to leave the service, and
- assurance that the user still owns the rights to the data.
One respondent commented that even though he would like to have clear statements of what is possible to do with the data: “100 pages document could be written about this, but for some non-technical people it would not help at all”. Another one said: “I would like to have a [web] page where they could tell me about security mechanisms, for example, firewalls, backup etc.”
On conformance to data agreements, the respondents agreed that having data agreements helps, but that they are mainly for technicians, not for non-technical people. On how the provider handles data, the respondents said that they would like functional, technical and security-related information about how providers handle the data. On location, the respondents were concerned about where the data is physically stored and about the legal jurisdiction of the services. Another important piece of information concerns sub-providers, if there are any: where they are located and whether they meet the legal requirements of the customer’s location. Multi-tenant situations are a concern for customers, who would like this information to be transparent, including how providers ensure that data from one customer cannot be accessed by another customer.
It is also important for transparency to know what the provider does to protect customers’ data. One respondent said that he would like information on: “How to protect the information or how the information is protected; not much in detail for the end-user, but only for enterprises.” The respondents also wanted transparent procedures for leaving the service and for moving data from one service to another, as well as assurance that they still own the rights to their data. On the question “What would increase your trust that the data is secure in this scenario?” the participants mentioned eight different themes: (1) upfront transparency; (2) community discussions; (3) customer awareness; (4) way out; (5) reputation; (6) encryption; (7) data processor agreements; and (8) location.
When asked what they would want to know about how the provider corrects data security problems, it was again surprising to learn that the participants had not thought much about what they could expect from providers if a security issue were to happen. Most of the respondents needed further elaboration of the question before they would start saying something. The participants then stated that they would like to know what is planned before something happens; when something happens, they want to know how the provider is handling the situation, why the problem happened, and when the services will be back online. Interestingly, the participants also wanted to know how providers improve their services after something happens, based on lessons learned. These responses are collated in the taxonomy shown in Fig. 4.
5 Transparency Tools
In the following subsections, we show in more detail how the A4Cloud Data Track tool enhances transparency for end users by allowing them to visualize the personal data that they have disclosed to different online services.
5.1 The Data Track Tool
The backend architecture of the Data Track consists of four high-level components. First, the user interface (UI) component displays different visualizations of the data disclosures provided by the Data Track’s core. Second, the core component is a backend to the UI with local encrypted storage. Through a RESTful API, the core provides the UI with a uniform view of all the user’s data obtained from service providers via so-called plugins. Third, the plugin component provides the means for acquiring data disclosures from a given source (e.g., a service provider’s database) and parsing them into the internal format readable by the core. Fourth, the Data Track specifies a generic API component that enables a service provider to support the Data Track by providing remote access, correction, and deletion of personal data. Based on solutions proposed by Pulls et al., the transfer of data through a service’s API can be done in a secure and privacy-friendly manner. By retrieving data from different services through their provided APIs, users can import their data immediately into the Data Track and visualize it in different ways, thus providing immediate value for end users.
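The plugin/core split described above can be sketched in a few lines of code. This is an illustrative sketch only: the class and field names (Disclosure, Plugin, Core) and the internal record format are our assumptions, not the actual A4Cloud implementation, and the real core would additionally persist records in local encrypted storage and expose them over a RESTful API.

```python
# Illustrative sketch of the Data Track's plugin/core architecture.
# All names and the record format are assumptions for illustration.
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Disclosure:
    """A single data disclosure in the core's internal format."""
    service: str                 # service the data was disclosed to
    timestamp: datetime          # when the disclosure happened
    attributes: dict             # personal attributes that were sent


class Plugin(ABC):
    """Acquires disclosures from one source (e.g., a provider's database)
    and parses them into the internal format readable by the core."""

    @abstractmethod
    def fetch(self) -> list:
        ...


class Core:
    """Aggregates all registered plugins into the uniform view that the
    UI component consumes."""

    def __init__(self):
        self._plugins = []

    def register(self, plugin: Plugin) -> None:
        self._plugins.append(plugin)

    def all_disclosures(self) -> list:
        # In the sketch we simply merge the output of every plugin;
        # the real core reads from encrypted storage via a REST API.
        return [d for p in self._plugins for d in p.fetch()]
```

A new data source is then supported by writing one `Plugin` subclass, without touching the UI or the core.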
Detailed descriptions of the initial Data Track proof-of-concept, its user interfaces and the results of its usability evaluations are given by Fischer-Hübner et al., and the further design process is described by Angulo et al. [21]. The security and privacy mechanisms of its software implementation have been documented by Hedbom, Pulls et al. [22, 23, 24].
5.2 Visualizing Data Disclosures
The design of the Data Track’s UI considers different methods for visualizing a user’s data disclosures in a way that is connected to the user’s momentary intentions. Based on ideas from previous studies suggesting ways to display data disclosures [25, 26] and on the creation of meaningful visualizations for large data sets [27, 28, 29], we have designed and prototyped two main visualizations for the Data Track as part of the A4Cloud project; we refer to them as the trace view and the timeline view.
In order to cater for users’ perceptual capabilities, and considering the available screen real estate, filtering mechanisms are in place that allow users to filter for the information that is relevant to what they want to find out. In the trace view, users can search using free text (i.e., by typing the name of a company, like Flickr or Spotify, or the name of a personal attribute, like ‘credit card’ or ‘heart rate’); they can also select categories of data or individual pieces of data, as well as the number of entities to be displayed on the screen.
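The free-text filter described above amounts to a simple predicate over the disclosure records, capped at the number of entities the user chooses to display. The record layout and function names below are illustrative assumptions, not the tool’s actual code:

```python
# Minimal sketch of the trace view's free-text filter.
# The dict-based record layout is an assumption for illustration.
def matches(disclosure: dict, query: str) -> bool:
    """True if the query matches the service name or any attribute name."""
    q = query.lower()
    return (q in disclosure["service"].lower()
            or any(q in attr.lower() for attr in disclosure["attributes"]))


def filter_disclosures(disclosures, query, limit=None):
    """Free-text search over disclosures, optionally capped at `limit`
    entities (the number the user chose to display on screen)."""
    hits = [d for d in disclosures if matches(d, query)]
    return hits if limit is None else hits[:limit]
```

Typing ‘heart rate’ would thus surface every service that ever received that attribute, which is exactly the question the trace view is meant to answer.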
The other visualization presents each disclosure in chronological order, hence its name: the timeline view. In this view, shown in Fig. 6, each circle along the vertical line represents the service to which personal data was disclosed at a specific point in time. Each box beside a circle contains the personal attributes that were sent with that particular disclosure. In order to keep the size of the boxes consistent and not overwhelm users with visual information, the boxes initially show only four attributes; users can look at the rest of the attributes in that particular disclosure by clicking the “Show more” button. Users can scroll vertically indefinitely, unveiling the disclosures of data that they have made over time, and allowing them to answer the question “What information about me have I sent to which online services at a particular point in time?”
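The timeline view’s two rules, chronological ordering and showing at most four attributes per box, can be expressed compactly. The following is a sketch under assumed record names; the real UI of course renders circles and boxes rather than returning dictionaries:

```python
# Sketch of the timeline view's data preparation: sort disclosures
# chronologically and truncate each box to four visible attributes,
# counting the remainder hidden behind the "Show more" button.
# Record and key names are assumptions for illustration.
def timeline_entries(disclosures, visible=4):
    ordered = sorted(disclosures, key=lambda d: d["timestamp"])
    entries = []
    for d in ordered:
        attrs = list(d["attributes"])
        entries.append({
            "service": d["service"],
            "shown": attrs[:visible],              # drawn in the box
            "hidden": max(0, len(attrs) - visible),  # behind "Show more"
        })
    return entries
```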
Thanks to the architecture envisioned in the A4Cloud project, which considers the use of the A-PPL Engine mentioned earlier, the Data Track allows its end users to access personal data about them that is located on the service’s side (i.e., stored in the service’s databases). In both the trace view and the timeline view, a button (in the shape of a cloud) located beside a service provider’s logo opens a dialog showing users the data about them that is held on the service’s side. This dialog, shown in Fig. 7, presents not only the personal attributes that have been explicitly collected by the service provider, but also data about the user that has been derived from analysis. Through this dialog, users can also request correction or deletion of personal attributes, thus exercising their data access rights.
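The three operations offered by the cloud-button dialog map naturally onto HTTP verbs against the service-side API. The endpoint paths below are entirely hypothetical, invented only to illustrate the shape of such requests; the actual A4Cloud API and A-PPL Engine define their own interfaces:

```python
# Hypothetical sketch of the remote access, correction and deletion
# requests behind the Data Track's cloud-button dialog. Paths and
# payloads are invented for illustration.
import json


def access_request(base_url, subject_id):
    """GET all personal data the service holds about the data subject."""
    return ("GET", f"{base_url}/subjects/{subject_id}/data")


def correction_request(base_url, subject_id, attribute, new_value):
    """PUT a corrected value for one personal attribute."""
    body = json.dumps({attribute: new_value})
    return ("PUT", f"{base_url}/subjects/{subject_id}/data/{attribute}", body)


def deletion_request(base_url, subject_id, attribute):
    """DELETE one personal attribute from the service's side."""
    return ("DELETE", f"{base_url}/subjects/{subject_id}/data/{attribute}")
```

Keeping the request construction separate from transport also makes it easy to route the calls through a privacy-friendly channel, as the chapter notes the transfer should be.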
5.3 User Evaluations of the Data Track’s UI
Throughout the A4Cloud project, the user interface of the Data Track has gone through several iterative rounds of design and user evaluations. The evaluations served both to test how well the interface was understood and as a method for gathering end-user requirements on the needs and expectations that such a tool should address.
Usability testing of earlier designs of the Data Track revealed that lay users expressed feelings of surprise and discomfort at the knowledge that service providers analyze their disclosed data in order to derive additional insights about them, such as their music preferences or religion. In general, the evaluations have also shown that participants understand the purpose of the tool and ways to interact with it, correctly identifying the data that has been sent to particular service providers, and using the filtering functions to answer questions about their disclosed personal data. The tests also revealed users’ difficulties in differentiating between data that is stored locally under their control on their computers and data that is accessed on the service’s side (and shown through the pop-up dialog), as well as skepticism about the level of security of the data stored locally.
During an evaluation workshop, attendees discussed the advantages and possible risks of using such a tool, as well as the requirements for making such a tool not only user-friendly but also adopted in their routine Internet activities. One participant, for instance, commented that the transparency the Data Track provides would encourage service providers to comply with their policies and be responsible stewards of their customers’ data: “it would keep me informed and hold big companies in line.” Another participant mentioned the benefit of becoming more aware of disclosures made to service providers: “makes you aware of what information you put on the Internet, you probably would be more careful.” On the other hand, a participant commented on the risk of accumulating large amounts of personal data in a single place: “if there is one tool collecting all the data, then it is a single point of failure...”.
6 Discussion
After analyzing all the collected information, we compiled a list of requirements elicited in the interviews, as shown in Appendix A. The main topics mentioned by the respondents related to what is possible to do with the data, conformance to data agreements, data handling, the value chain, multi-tenant situations, protection of the data, and decisions about and corrections of the data.
Pauley designed a scorecard, reproduced in Table 1, to cover the assessment areas frequently raised in his research and to begin to establish high-level criteria for assessing provider transparency. When comparing our list of elicited requirements (see Appendix A) to Pauley’s scorecard, we see some slight differences between the criteria that Pauley described as information that should be provided by cloud providers and the information that customers are looking for. Regarding the business factors, the customers did not mention being concerned about the number of years in business, membership of CSA, CloudAudit, OCCI, or other cloud standards groups, or whether the providers are profitable or public. The respondents may not have mentioned these criteria because (a) companies in Norway are usually stable, and (b) membership of a group or association does not in itself guarantee good performance or compliance, even if the group or association promotes a certain standard.
On the security and privacy aspects, the customers mentioned all the criteria, but did not directly mention the standards/certifying bodies, such as ISO/IEC 27000, COBIT and NIST; they did mention that it would be nice to know whether the provider was certified somehow, based on some criteria. The customers also did not mention the need to know about “external” audits. One reason for not mentioning security standards and certification bodies may be that the companies we investigated are predominantly private companies in Norway, where there are as yet no strong requirements from certification bodies.
One important aspect not much explored in Pauley’s scorecard is that customers would like providers to be transparent about what is possible to do with the data. In addition, customers were quite concerned about transparency on exit procedures (“way out”) and ownership of the data. The concern over data ownership is interesting in the light of Hon et al. [30], who found no evidence of cloud contracts leading to loss of Intellectual Property Rights.
Another aspect mentioned by the customers concerns decisions made about “ongoing” services, where the customers stated: “The cloud providers should get the consent of the cloud customer before moving the data to another country, in cases where new parties will be involved in the value chain and on changes on the initial terms of contract.”
Physical location and legal jurisdiction, as well as specific information on the value chain, were very important aspects for the cloud customers to have transparency about, and they were not explicitly mentioned in Pauley’s scorecard.
The interviewees did not show a desire for the kind of detailed information Durkee deems necessary (the inner workings of the cloud architecture, shared as part of developing a closer relationship with the customer). As also pointed out by Durkee, some respondents were aware that the costs of such clarity may be prohibitive, and we might add that this level of disclosure seems highly unlikely for ordinary customers of commodity cloud services.
The Data Track tool that we described in Sect. 5 focuses more on end users (data subjects) than on professional cloud users, but it is clearly relevant for cloud customers. The tool can be used to follow up on what a provider claims to be able to do with the data (Appendix A.1) and on the geographical location of the customer’s data (Appendix A.2), and it can also help illustrate the existence of services from other parties (Appendix A.4).
7 Conclusions
Cloud computing has been receiving a great deal of attention, not only in the academic field, but also amongst users and providers of IT services, regulators and government agencies. The results from our study focus on an important aspect of the accountability of cloud services to customers: transparency.
The customers made explicit all the information that they would like the providers to be transparent about. Much of this information can be easily provided at a provider’s website. Our contention is that being transparent can be a business advantage, and that cloud customers who are concerned with, e.g., privacy of the data they put into the cloud, will choose providers who can demonstrate transparency over providers who cannot.
Our study increases the body of knowledge on the criteria needed for more accountable and transparent cloud services, and confirms the results from previous studies on these criteria. The list of requirements in Appendix A complements, in part, the existing criteria.
An area for future research is to evaluate how cloud providers currently make the information required by cloud customers available, and to examine the effects of transparent services in terms of costs and benefits to cloud customers and providers. We also plan to increase the number of participants responding to our interview guide, adding strength to the evidence provided in this chapter. Another aspect we would like to investigate is whether the results differ for users of different types of services (e.g., SaaS vs. IaaS).
Acknowledgements. This chapter is based on joint research in the EU FP7 A4CLOUD project, grant agreement no. 317550.
References

- 3. Gavrilov, G., Trajkovik, V.: Security and privacy issues and requirements for healthcare cloud computing. In: Proceedings of the ICT Innovations (2012)
- 6. Ahuja, S.P., Mani, S., Zambrano, J.: A survey of the state of cloud computing in healthcare. Netw. Commun. Technol. 1, 12–19 (2012)
- 7. Felici, M., Koulouris, T., Pearson, S.: Accountability for data governance in cloud ecosystems. In: 2013 IEEE 5th International Conference on Cloud Computing Technology and Science (CloudCom), vol. 2, pp. 327–332 (2013)
- 8. Yang, H., Tate, M.: A descriptive literature review and classification of cloud computing research. Commun. Assoc. Inf. Syst. 31, 35–60 (2012)
- 10. Khorshed, M.T., Ali, A.S., Wasimi, S.A.: A survey on gaps, threat remediation challenges and some thoughts for proactive attack detection in cloud computing. Future Gener. Comput. Syst. 28, 833–851 (2012)
- 13. Bernsmed, K., Tountopoulos, V., Brigden, P., Rübsamen, T., Felici, M., Wainwright, N., Santana De Oliveira, A., Sendor, J., Sellami, M., Royer, J.C.: Consolidated use case report. A4Cloud Deliverable D23.2 (2014)
- 14. Jaatun, M.G., Pearson, S., Gittler, F., Leenes, R.: Towards strong accountability for cloud service providers. In: 2014 IEEE 6th International Conference on Cloud Computing Technology and Science (CloudCom), pp. 1001–1006 (2014)
- 15. Cruzes, D.S., Dybå, T.: Recommended steps for thematic synthesis in software engineering. In: Proceedings of ESEM 2011, pp. 275–284 (2011)
- 16. Azraoui, M., Elkhiyaoui, K., Önen, M., Bernsmed, K., De Oliveira, A.S., Sendor, J.: A-PPL: an accountability policy language. In: Garcia-Alfaro, J., Herrera-Joancomartí, J., Lupu, E., Posegga, J., Aldini, A., Martinelli, F., Suri, N. (eds.) DPM/SETOP/QASA 2014. LNCS, vol. 8872, pp. 319–326. Springer, Heidelberg (2015)
- 17. Alnemr, R., Pearson, S., Leenes, R., Mhungu, R.: COAT: cloud offerings advisory tool. In: 2014 IEEE 6th International Conference on Cloud Computing Technology and Science (CloudCom), pp. 95–100 (2014)
- 19. Pulls, T.: Preserving privacy in transparency logging. Ph.D. thesis, Karlstad University Studies, vol. 28 (2015)
- 21. Angulo, J., Fischer-Hübner, S., Pulls, T., Wästlund, E.: Usable transparency with the data track: a tool for visualizing data disclosures. In: Extended Abstracts of the Conference on Human Factors in Computing Systems, CHI 2015, Seoul, Republic of Korea, pp. 1803–1808. ACM (2015)
- 24. Pulls, T., Peeters, R., Wouters, K.: Distributed privacy-preserving transparency logging. In: Workshop on Privacy in the Electronic Society, WPES 2013, Berlin, Germany, pp. 83–94 (2013)
- 26. Kolter, J., Netter, M., Pernul, G.: Visualizing past personal data disclosures. In: ARES 2010 International Conference on Availability, Reliability, and Security, pp. 131–139. IEEE (2010)
- 27. Becker, H., Naaman, M., Gravano, L.: Beyond trending topics: real-world event identification on Twitter. In: Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, ICWSM 2011 (2011)
- 28. Freeman, L.C.: Visualizing social networks. J. Soc. Struct. 1, 4 (2000)
- 29. Kairam, S., MacLean, D., Savva, M., Heer, J.: GraphPrism: compact visualization of network structure. In: Proceedings of the International Working Conference on Advanced Visual Interfaces, pp. 498–505. ACM (2012)
- 30. Hon, W., Millard, C., Walden, I.: Negotiating cloud contracts – looking at clouds from both sides now. Stan. Tech. L. Rev. 81 (2012). Queen Mary School of Law Legal Studies Research Paper No. 117/2012. https://journals.law.stanford.edu/stanford-technology-law-review/online/negotiating-cloud-contracts-looking-clouds-both-sides-now