Hybrid Microdata via Model-Based Clustering

  • Anna Oganian
  • Josep Domingo-Ferrer
Conference paper

DOI: 10.1007/978-3-642-33627-0_9

Part of the Lecture Notes in Computer Science book series (LNCS, volume 7556)
Cite this paper as:
Oganian A., Domingo-Ferrer J. (2012) Hybrid Microdata via Model-Based Clustering. In: Domingo-Ferrer J., Tinnirello I. (eds) Privacy in Statistical Databases. PSD 2012. Lecture Notes in Computer Science, vol 7556. Springer, Berlin, Heidelberg


In this paper we propose a new scheme for statistical disclosure limitation that can be classified as a hybrid method of protection, that is, a method that combines properties of perturbative and synthetic methods. The approach is based on model-based clustering followed by synthesis of the records within each cluster. The novelty is that the clustering and synthesis methods have been carefully chosen to fit each other, with a view to reducing information loss. The model-based clustering tries to obtain clusters such that the within-cluster data distribution is approximately normal; a multivariate normal synthesizer can then be used for the local synthesis of data. In this way, some of the non-normal characteristics of the data are captured by the clustering, so that a simple synthesizer for normal data can be used within each cluster. Our method is shown to be effective when compared to other disclosure limitation strategies.
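The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the mixture is fitted, how the number of clusters is chosen, or how small clusters are handled. The sketch assumes a plain EM fit of a Gaussian mixture with a fixed number of components `k`, hard assignment of each record to its most probable component, and a draw from a multivariate normal with each cluster's sample mean and covariance. The names `fit_gmm` and `synthesize` and all parameters are hypothetical.

```python
import numpy as np


def fit_gmm(X, k, n_iter=100, reg=1e-6, seed=0):
    """Fit a k-component Gaussian mixture by plain EM; return hard cluster labels.

    Assumption: k is fixed in advance; the abstract does not say how the
    authors choose the number of clusters.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Initialise: means at k distinct records, a shared covariance, equal weights.
    means = X[rng.choice(n, size=k, replace=False)]
    base_cov = np.atleast_2d(np.cov(X, rowvar=False)) + reg * np.eye(d)
    covs = np.array([base_cov.copy() for _ in range(k)])
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: log(weight * normal density) for every record and component.
        log_r = np.empty((n, k))
        for j in range(k):
            chol = np.linalg.cholesky(covs[j])
            z = np.linalg.solve(chol, (X - means[j]).T)  # whitened residuals, d x n
            log_det = 2.0 * np.sum(np.log(np.diag(chol)))
            log_r[:, j] = np.log(weights[j]) - 0.5 * (
                d * np.log(2.0 * np.pi) + log_det + np.sum(z ** 2, axis=0)
            )
        # Normalise in log space for numerical stability.
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted weights, means and covariances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        for j in range(k):
            means[j] = resp[:, j] @ X / nk[j]
            diff = X - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)
    return resp.argmax(axis=1)


def synthesize(X, labels, seed=0):
    """Replace the records of each cluster with multivariate normal draws."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    synth = np.empty_like(X)
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        if idx.size < 2:
            # Simplification of this sketch only: a singleton cluster has no
            # covariance estimate, so its record is copied unchanged. A real
            # method would need a minimum cluster size to avoid disclosure.
            synth[idx] = X[idx]
            continue
        mu = X[idx].mean(axis=0)
        cov = np.atleast_2d(np.cov(X[idx], rowvar=False))
        synth[idx] = rng.multivariate_normal(mu, cov, size=idx.size)
    return synth
```

A call such as `synthesize(X, fit_gmm(X, k=3))` returns an array of the same shape as `X`. Because each cluster is synthesized from its own mean and covariance, a clearly bimodal or skewed original can be approximated by several normal pieces, which is the motivation the abstract gives for pairing the two steps.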

Keywords and Phrases

Statistical disclosure limitation (SDL); hybrid SDL methods; mixture models; model-based clustering; expectation-maximization (EM) algorithm



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Anna Oganian (1)
  • Josep Domingo-Ferrer (2)

  1. Department of Mathematical Sciences, Georgia Southern University, Statesboro, U.S.A.
  2. Department of Computer Engineering and Maths, Universitat Rovira i Virgili, UNESCO Chair in Data Privacy, Tarragona, Spain
