Memetic Algorithm for Solving the Task of Providing Group Anonymity

Part of the Studies in Fuzziness and Soft Computing book series (STUDFUZZ, volume 312)

Abstract

Modern information technologies make it possible to analyze large amounts of primary, non-aggregated data. Publishing such data increases the risk of disclosing sensitive information. Protecting information about a single person requires individual data anonymity, whereas providing group data anonymity presupposes protecting intrinsic data features, properties, and distributions. Methods for providing group anonymity must protect the underlying data distribution while also ensuring sufficient data utility after the data are transformed. In our opinion, the latter task can be solved exactly only by exhaustive search; therefore, heuristic procedures need to be developed to find suboptimal solutions.

Evolutionary algorithms are heuristic, guided random search techniques that mimic biological evolution by natural selection. They are inherently stochastic, which turns out to be a drawback when converging to an optimum. Memetic algorithms combine evolutionary algorithms with local search procedures. Applying local search speeds up convergence and enhances algorithm performance by incorporating problem-specific knowledge.
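
The general pattern described above (an evolutionary loop with a local search step applied to each offspring) can be sketched as follows. This is a minimal illustration under stated assumptions and not the algorithm introduced in the chapter: the bit-string encoding, the toy OneMax objective, and the names `fitness`, `local_search`, `tournament`, and `memetic` are all hypothetical placeholders chosen for the example.

```python
import random


def fitness(bits):
    # Toy objective (OneMax): the number of ones in the string.
    # Assumption: it stands in for a real data utility measure.
    return sum(bits)


def local_search(bits, rng, tries=3):
    # Bounded hill climbing: try a few random single-bit flips
    # and keep only those that improve the fitness.
    best = list(bits)  # work on a copy, leave the input untouched
    best_fit = fitness(best)
    for _ in range(tries):
        i = rng.randrange(len(best))
        best[i] ^= 1
        new_fit = fitness(best)
        if new_fit > best_fit:
            best_fit = new_fit
        else:
            best[i] ^= 1  # undo a non-improving flip
    return best


def tournament(pop, rng):
    # Binary tournament selection: the fitter of two random individuals.
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b


def memetic(n_bits=30, pop_size=20, generations=40, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        new_pop = [max(pop, key=fitness)]  # elitism: carry over the current best
        while len(new_pop) < pop_size:
            p1, p2 = tournament(pop, rng), tournament(pop, rng)
            cut = rng.randrange(1, n_bits)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation with probability 1 / n_bits per position.
            child = [b ^ 1 if rng.random() < 1.0 / n_bits else b for b in child]
            # The memetic step: refine each offspring by local search.
            child = local_search(child, rng)
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)


if __name__ == "__main__":
    best = memetic()
    print(fitness(best), best)
```

In this sketch, problem-specific knowledge would enter through `local_search`; for a real task, the random bit flips would be replaced by moves that are meaningful for the data being transformed.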

In this paper, we introduce a memetic algorithm for providing group anonymity. We illustrate its application by solving a problem, based on real data, of protecting the regional distribution of military personnel.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Applied Mathematics Department, National Technical University of Ukraine “Kyiv Polytechnic Institute”, Kyiv, Ukraine
