Linear Regression as a Non-cooperative Game
Abstract
Linear regression amounts to estimating a linear model that maps features (e.g., age or gender) to corresponding data (e.g., the answer to a survey or the outcome of a medical exam). It is a ubiquitous tool in experimental sciences. We study a setting in which features are public but the data is private information. While the estimation of the linear model may be useful to participating individuals (e.g., if it leads to the discovery of a treatment for a disease), individuals may be reluctant to disclose their data due to privacy concerns. In this paper, we propose a generic game-theoretic model to express this trade-off. Users add noise to their data before releasing it. In particular, they choose the variance of this noise to minimize a cost comprising two components: (a) a privacy cost, representing the loss of privacy incurred by the release; and (b) an estimation cost, representing the inaccuracy in the linear model estimate. We study the Nash equilibria of this game, establishing the existence of a unique non-trivial equilibrium. We determine its efficiency for several classes of privacy and estimation costs, using the concept of the price of stability. Finally, we prove that, for a specific estimation cost, the generalized least-squares estimator is optimal among all linear unbiased estimators in our non-cooperative setting. This result extends the famous Aitken/Gauss-Markov theorem in statistics, establishing that its conclusion persists even in the presence of strategic individuals.
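To make the estimation side of this trade-off concrete, the following is a minimal sketch, not the paper's implementation. It assumes each user's report carries independent noise with a known total variance (inherent plus user-added), computes the generalized least-squares estimate, and scores inaccuracy by the trace of the estimator's covariance. The function names `gls_estimate` and `estimation_cost`, and the choice of the trace as the cost, are illustrative assumptions; the abstract does not specify the cost functions.

```python
import numpy as np


def gls_estimate(X, y, variances):
    """Generalized least-squares estimate under independent noise.

    X: (n, d) public feature matrix, one row per user.
    y: (n,) perturbed data released by the users.
    variances: (n,) total noise variance of each report (assumed known).
    Returns the estimate and its covariance (X^T W X)^{-1}, W = diag(1/variances).
    """
    w = 1.0 / np.asarray(variances, dtype=float)  # precision of each report
    XtW = X.T * w                                 # same as X^T @ diag(w)
    cov = np.linalg.inv(XtW @ X)                  # covariance of the estimator
    beta_hat = cov @ (XtW @ y)
    return beta_hat, cov


def estimation_cost(cov):
    """Hypothetical scalar estimation cost: trace of the covariance."""
    return float(np.trace(cov))
```

Under these assumptions, a user who raises the variance of their own noise lowers the weight of their report and increases `estimation_cost` for everyone, which is the externality the game captures.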
Keywords
Linear regression, Gauss-Markov theorem, Aitken theorem, privacy, potential game, price of stability