Privacy and Confidentiality in Service Science and Big Data Analytics

  • Conference paper
Privacy and Identity Management for the Future Internet in the Age of Globalisation (Privacy and Identity 2014)

Part of the book series: IFIP Advances in Information and Communication Technology (Tutorials, volume 457)

Abstract

Vast amounts of data are now being collected from censuses and surveys, scientific research, instruments, observation of consumer and internet activities, and sensors of many kinds. These data hold a wealth of information; however, there is a risk that personal privacy will not be protected when they are accessed and used.

This paper provides an overview of current and emerging approaches to balancing the use and analysis of data with confidentiality protection in research, where the need for privacy protection is widely recognised. These approaches were generally developed by national statistical agencies and other data custodians releasing social and survey data for research, but are increasingly being adapted as our information society globalises. As examples, the paper discusses some of the confidentiality issues that arise in the service science and big data analytics contexts.

Acknowledgments

I warmly thank the organisers of the International Federation for Information Processing (IFIP) 9th Summer School on Privacy and Identity Management for the Future Internet in the Age of Globalisation, for their invitation to participate. I acknowledge the financial support of the Authentication and Authorization for Entrusted Unions (AU2EU) project funded by the European Commission Seventh Framework Programme for Research and Technological Development.

Author information

Corresponding author

Correspondence to Christine M. O’Keefe.

Copyright information

© 2015 IFIP International Federation for Information Processing

About this paper

Cite this paper

O’Keefe, C.M. (2015). Privacy and Confidentiality in Service Science and Big Data Analytics. In: Camenisch, J., Fischer-Hübner, S., Hansen, M. (eds) Privacy and Identity Management for the Future Internet in the Age of Globalisation. Privacy and Identity 2014. IFIP Advances in Information and Communication Technology, vol 457. Springer, Cham. https://doi.org/10.1007/978-3-319-18621-4_5

  • DOI: https://doi.org/10.1007/978-3-319-18621-4_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18620-7

  • Online ISBN: 978-3-319-18621-4

  • eBook Packages: Computer Science, Computer Science (R0)
