
A new privacy-protecting survey design for multichotomous sensitive variables

Metrika

Abstract

In this paper, we propose the diagonal model (DM), a survey technique for multicategorical sensitive variables. The DM is a nonrandomized response method; that is, it avoids the use of any randomization device, which reduces both survey complexity and study costs. The DM does not require that at least one outcome of the sensitive variable be nonsensitive. Thus, the model can be applied even to characteristics such as income, which are sensitive as a whole. We describe maximum likelihood estimation for the distribution of the sensitive variable and show that the EM algorithm is well suited to computing the estimates. Subsequently, we present asymptotic as well as bootstrap confidence intervals. Applying properties of circulant matrices, we show the connection between efficiency loss and the degree of privacy protection (DPP). Here, we prove that the efficiency loss has a lower bound that depends on the DPP. Moreover, for any desired DPP, we derive model parameters that ensure the largest possible efficiency.
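The EM-based estimation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the answer rule, so the code assumes one: each respondent combines the sensitive value X with a nonsensitive auxiliary value W (known distribution `c`, independent of X) and reports A = (W - X) mod k, with categories coded 0, ..., k-1. The function names `design_matrix` and `em_estimate`, the starting value, and the stopping rule are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np


def design_matrix(c):
    """Return the k x k matrix P with P[a, x] = P(A = a | X = x).

    Assumption (not stated in the abstract): A = (W - X) mod k, where
    W has known distribution c and is independent of X. Then
    W = (A + X) mod k, so P[a, x] = c[(a + x) mod k].
    """
    c = np.asarray(c, dtype=float)
    k = len(c)
    a = np.arange(k)[:, None]
    x = np.arange(k)[None, :]
    return c[(a + x) % k]


def em_estimate(counts, c, n_iter=5000, tol=1e-12):
    """EM iterations for pi = distribution of X, given answer counts.

    counts[a] is the number of respondents who gave answer a.
    All entries of c are assumed positive, so no division by zero occurs.
    """
    counts = np.asarray(counts, dtype=float)
    P = design_matrix(c)
    k = len(counts)
    n = counts.sum()
    pi = np.full(k, 1.0 / k)  # uniform starting value (an arbitrary choice)
    for _ in range(n_iter):
        # E-step: posterior P(X = x | A = a) under the current pi.
        joint = P * pi
        post = joint / joint.sum(axis=1, keepdims=True)
        # M-step: expected share of respondents in each category x.
        new_pi = counts @ post / n
        converged = np.max(np.abs(new_pi - pi)) < tol
        pi = new_pi
        if converged:
            break
    return pi


if __name__ == "__main__":
    # Illustrative inputs only; these numbers are not from the paper.
    c = [0.6, 0.3, 0.1]
    answer_counts = [410, 300, 290]
    print(em_estimate(answer_counts, c))
```

Under the assumed answer rule, the design matrix is built from cyclic shifts of `c`, which is where circulant-type structure enters. The sketch covers point estimation only; it does not reproduce the paper's confidence intervals or its efficiency results.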



Acknowledgments

The author would like to thank a referee for valuable comments and suggestions.

Author information

Correspondence to Heiko Groenitz.

Cite this article

Groenitz, H. A new privacy-protecting survey design for multichotomous sensitive variables. Metrika 77, 211–224 (2014). https://doi.org/10.1007/s00184-012-0406-8

  • DOI: https://doi.org/10.1007/s00184-012-0406-8
