Encyclopedia of Database Systems

2018 Edition | Editors: Ling Liu, M. Tamer Özsu

Horizontally Partitioned Data

  • Murat Kantarcıoğlu
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_1391

Synonyms

Homogeneously distributed data

Definition

Data is said to be horizontally partitioned when several organizations own the same set of attributes for different sets of entities. More formally, horizontal partitioning can be defined as follows. Let DB = (E, I) be a dataset (e.g., hospital discharge data for the state of Texas), where E is the set of entities about whom information is collected (e.g., the set of patients) and I is the set of attributes collected about those entities (e.g., the features recorded for each patient). DB is said to be horizontally partitioned among k sites, where site i owns DB_i = (E_i, I_i) for 1 ≤ i ≤ k, if E = E_1 ∪ E_2 ∪ … ∪ E_k, E_i ∩ E_j = ∅ for 1 ≤ i ≠ j ≤ k, and I = I_1 = I_2 = … = I_k. In relational terms, with horizontal partitioning, the relation to be mined is the union of the relations at the sites.
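
The three conditions in the definition (identical attribute sets, pairwise disjoint entity sets, and entity sets whose union is E) can be checked mechanically. The following Python sketch is illustrative only and is not part of the original entry: the function name is_horizontal_partition, the representation of a dataset as a pair of sets, and the patient identifiers and attribute names are all assumptions made for the example.

```python
from itertools import combinations


def is_horizontal_partition(db, sites):
    """Return True if `sites` horizontally partition `db`.

    Hypothetical representation: every dataset is a pair
    (entities, attributes), where `entities` is a set of entity
    identifiers and `attributes` is a set of attribute names.
    """
    entities, attributes = db

    # Every site collects the same attributes: I = I_1 = ... = I_k.
    same_schema = all(site_attrs == attributes for _, site_attrs in sites)

    # No entity appears at two sites: E_i and E_j are disjoint for i != j.
    disjoint = all(
        a.isdisjoint(b) for (a, _), (b, _) in combinations(sites, 2)
    )

    # Together the sites cover every entity: E is the union of the E_i.
    covered = set().union(*(site_entities for site_entities, _ in sites))

    return same_schema and disjoint and covered == entities


# Made-up example: two hospitals record the same features for different patients.
attrs = {"age", "diagnosis", "length_of_stay"}
hospital_a = ({"p1", "p2"}, attrs)
hospital_b = ({"p3"}, attrs)
statewide = ({"p1", "p2", "p3"}, attrs)

print(is_horizontal_partition(statewide, [hospital_a, hospital_b]))
```

In this made-up example the check succeeds because the two hospitals share a schema and no patient appears at both. It would fail if a patient identifier were held by both hospitals or if one hospital recorded a different set of attributes.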

Historical Background

Cheap data storage and abundant network capacity have revolutionized data collection and data dissemination. At the same time,...

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. University of Texas at Dallas, Richardson, USA

Section editors and affiliations

  • Chris Clifton (1)
  1. Department of Computer Science, Purdue University, West Lafayette, USA