Abstract
In this paper, we propose the diagonal model (DM), a survey technique for multicategorical sensitive variables. The DM is a nonrandomized response method; that is, it avoids the use of any randomization device, which reduces both survey complexity and study costs. The DM does not require that at least one outcome of the sensitive variable be nonsensitive. The model can therefore be applied even to characteristics such as income, for which every outcome is sensitive. We describe maximum likelihood estimation of the distribution of the sensitive variable and show that the EM algorithm is well suited to computing the estimates. Subsequently, we present asymptotic as well as bootstrap confidence intervals. Applying properties of circulant matrices, we show the connection between efficiency loss and the degree of privacy protection (DPP). Here, we prove that the efficiency loss has a lower bound that depends on the DPP. Moreover, for any desired DPP, we derive model parameters that ensure the largest possible efficiency.
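As a rough illustration of the EM-based maximum likelihood estimation mentioned above, the following sketch assumes that the distribution of the observed answers equals C π, where π is the unknown distribution of the sensitive variable and C is a known circulant matrix whose columns each sum to one. The abstract does not state the response rule, so the vector `c`, the layout of `C`, the function names, and the example counts are all hypothetical and are not taken from the paper.

```python
import numpy as np


def circulant_design(c):
    """Build a k x k circulant matrix with first column c (assumed layout)."""
    c = np.asarray(c, dtype=float)
    k = len(c)
    # Column j is c shifted cyclically by j positions, so each column sums to 1.
    return np.column_stack([np.roll(c, j) for j in range(k)])


def em_estimate(counts, C, n_iter=5000, tol=1e-12):
    """EM iterations for pi in the model: answer distribution = C @ pi."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    k = C.shape[1]
    pi = np.full(k, 1.0 / k)  # start from the uniform distribution
    for _ in range(n_iter):
        lam = C @ pi  # answer probabilities implied by the current pi
        # E-step and M-step combined: reweight each pi_x by the expected
        # share of respondents whose hidden category is x.
        new_pi = pi * (C.T @ (counts / lam)) / n
        converged = np.max(np.abs(new_pi - pi)) < tol
        pi = new_pi
        if converged:
            break
    return pi


# Hypothetical design vector and answer counts, for illustration only.
C = circulant_design([0.7, 0.2, 0.1])
counts = [250, 420, 330]
pi_hat = em_estimate(counts, C)
print(pi_hat)
```

Each update keeps the estimate nonnegative and summing to one, which a direct inversion of C applied to the observed answer frequencies does not guarantee. The sketch requires every entry of `C @ pi` to stay positive, which holds here because all entries of `c` are positive.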
Acknowledgments
The author would like to thank a referee for valuable comments and suggestions.
Cite this article
Groenitz, H. A new privacy-protecting survey design for multichotomous sensitive variables. Metrika 77, 211–224 (2014). https://doi.org/10.1007/s00184-012-0406-8
DOI: https://doi.org/10.1007/s00184-012-0406-8