A Novel Method for Micro-Aggregation in Secure Statistical Databases Using Association and Interaction

  • B. John Oommen
  • Ebaa Fayyoumi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4861)

Abstract

We consider the problem of micro-aggregation in secure statistical databases by enhancing the primitive Micro-Aggregation Technique (MAT), which incorporates proximity information. The state-of-the-art MAT recursively reduces the size of the data set by excluding the points that are farthest from the centroid, together with the points that are closest to these farthest points, but it ignores the mutual Interaction between the records. In this paper, we argue that inter-record relationships can be quantified in terms of two entities, namely their “Association” and their “Interaction”. Based on the theoretically sound principles of neural networks (NNs), we believe that the proximity information can be quantified using the mutual Association between records, and that their mutual Interaction can be quantified by invoking transitive-closure-like operations on the Association. By repeatedly invoking the inter-record Associations and Interactions, the records are grouped into sets of cardinality “k”, where k is the security parameter of the algorithm. Our experimental results, obtained on artificial data and on benchmark real-life data sets, demonstrate that the newly proposed method is superior to the state-of-the-art by as much as 13%.
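The baseline step described in the abstract (peel off the record farthest from the centroid together with the records closest to it, then repeat on what is left) can be illustrated with a short sketch. This is not the authors' Association/Interaction method, which the abstract does not specify in enough detail to reproduce; it is a simplified fixed-size baseline under stated assumptions. The function names `centroid`, `micro_aggregate` and `information_loss` are hypothetical, Euclidean distance is assumed, and the loss measure is taken to be the within-group sum of squares divided by the total sum of squares (SSE/SST), which is a common choice in the micro-aggregation literature rather than something stated here.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def micro_aggregate(points, k):
    """Partition record indices into groups of at least k records.

    Simplified baseline (an assumption, not the paper's algorithm):
    repeatedly take the record farthest from the centroid of the
    ungrouped records and group it with its k - 1 nearest neighbours.
    """
    if k < 1 or len(points) < k:
        raise ValueError("need k >= 1 and at least k records")
    remaining = list(range(len(points)))
    groups = []
    while len(remaining) >= 2 * k:
        c = centroid([points[i] for i in remaining])
        # Record farthest from the centroid of the ungrouped records.
        far = max(remaining, key=lambda i: dist(points[i], c))
        # Its k - 1 nearest neighbours among the other ungrouped records.
        others = sorted(
            (i for i in remaining if i != far),
            key=lambda i: dist(points[i], points[far]),
        )
        group = [far] + others[: k - 1]
        groups.append(group)
        remaining = [i for i in remaining if i not in group]
    # The leftover records (between k and 2k - 1 of them) form the last group.
    groups.append(remaining)
    return groups


def information_loss(points, groups):
    """Assumed loss measure: within-group SSE divided by total SST."""
    overall = centroid(points)
    sst = sum(dist(p, overall) ** 2 for p in points)
    sse = 0.0
    for g in groups:
        c = centroid([points[i] for i in g])
        sse += sum(dist(points[i], c) ** 2 for i in g)
    return sse / sst if sst else 0.0


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    groups = micro_aggregate(data, k=3)
    print(groups, information_loss(data, groups))
```

In a sketch like this, every released record would be replaced by the centroid of its group, so a smaller SSE/SST ratio means less distortion for the same group size k.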

Keywords

  • Information loss (IL)
  • Micro-Aggregation Technique (MAT)
  • Inter-record association
  • Interaction between micro-units

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • B. John Oommen (1)
  • Ebaa Fayyoumi (2)
  1. Chancellor’s Professor; Fellow: IEEE and Fellow: IAPR. School of Computer Science, Carleton University, Ottawa, K1S 5B6, Canada
  2. School of Computer Science, Carleton University, Ottawa, K1S 5B6, Canada
