Data Mining and Knowledge Discovery, Volume 11, Issue 2, pp 117–119

Privacy in Data Mining

  • Josep Domingo-Ferrer
  • Vicenç Torra

Widespread computerization and, especially, the booming use of the Internet have in recent years enabled an unprecedented level of automated data collection. In parallel, data mining has emerged as an important discipline providing powerful tools for data analysis. Beyond the positive consequences of higher information accuracy, a negative consequence is an Orwellian feeling of dwindling privacy for individuals (or companies, for that matter).

Privacy in administrative, statistical and other databases is about finding tradeoffs between the societal right to know and the individual right to private life. Thus, the passive subject of privacy is the individual citizen or, in business data collection, the individual company.

Several disciplines have been active subjects in studying privacy:
  • Statistics: Most national statistical laws contain commitments to respondents' privacy, which are essential to encourage citizens' response;
  • Philosophy: The ethics of information society...





Copyright information

© Springer Science+Business Media, Inc. 2005

Authors and Affiliations

  1. Department of Computer Engineering and Maths, Rovira i Virgili University of Tarragona, Tarragona, Spain
  2. Institut d'Investigació en Intel·ligència Artificial-CSIC, Bellaterra, Spain
