JPEG-Based Microdata Protection

  • Javier Jiménez
  • Guillermo Navarro-Arribas
  • Vicenç Torra
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8744)


JPEG-based protections can be obtained by regarding microdata as an image that is transformed by means of a lossy JPEG compression-decompression process. Here we propose a general model that decouples JPEG-based methods into two parts. The first part encompasses the transformations between the data and image spaces. The second part consists of the image transformation itself. Under this general model, we first explore different maps between the data and image spaces. In our experiments, quantization using histogram equalization, in combination with JPEG-based methods, outperforms the other approaches. Second, image transformations other than JPEG can be used; we illustrate this point by introducing JPEG 2000 as a valid alternative to JPEG. Finally, we experimentally analyze the effectiveness of the generalized JPEG-based method, comparing it with well-known state-of-the-art protection methods such as rank swapping, microaggregation, and noise addition.
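
The two-part model described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`to_pixels`, `from_pixels`, `protect`) are hypothetical, the histogram-equalization map is approximated here by a simple rank-based (empirical CDF) quantization to 256 levels, and the image transformation is left as a pluggable function. The coarse requantization used in the usage note is only a crude stand-in for a lossy codec; it is not JPEG or JPEG 2000.

```python
from bisect import bisect_left


def to_pixels(values, levels=256):
    """Data -> image space: quantize by rank (empirical CDF), so pixel
    levels are used roughly uniformly, as in histogram equalization."""
    ordered = sorted(values)
    n = len(ordered)
    # rank of v in [0, n-1], rescaled to an integer level in [0, levels-1]
    pixels = [bisect_left(ordered, v) * levels // n for v in values]
    return pixels, ordered


def from_pixels(pixels, ordered, levels=256):
    """Image -> data space: map each pixel level back to an empirical quantile."""
    n = len(ordered)
    restored = []
    for p in pixels:
        p = max(0, min(levels - 1, p))           # clamp out-of-range pixels
        idx = min(n - 1, -(-p * n // levels))    # ceil(p * n / levels), clamped
        restored.append(ordered[idx])
    return restored


def protect(values, image_transform, levels=256):
    """Part 1 (data -> image), part 2 (image transformation), then back to data."""
    pixels, ordered = to_pixels(values, levels)
    return from_pixels(image_transform(pixels), ordered, levels)
```

As a usage example, `protect([3.0, 100.0, 7.5, 1000.0], lambda px: px)` applies no image transformation and returns the original values, whereas passing a lossy transformation such as `lambda px: [(p // 128) * 128 for p in px]` merges nearby records onto shared values. In a setting closer to the one described in the abstract, `image_transform` would wrap a real JPEG or JPEG 2000 compression-decompression round trip supplied by an image library.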


Keywords: Information Loss, Image Space, Histogram Equalization, JPEG Compression, Image Transformation
These keywords were added by machine and not by the authors.





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Javier Jiménez (1)
  • Guillermo Navarro-Arribas (2)
  • Vicenç Torra (1)
  1. IIIA-CSIC, Artificial Intelligence Research Institute, Spanish National Research Council, Spain
  2. DEIC-UAB, Dep. of Information Engineering and Communications, Universitat Autònoma de Barcelona, Spain
