Fairness-Aware Classifier with Prejudice Remover Regularizer

  • Toshihiro Kamishima
  • Shotaro Akaho
  • Hideki Asoh
  • Jun Sakuma
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7524)

Abstract

With the spread of data mining technologies and the accumulation of social data, such technologies and data are being used for determinations that seriously affect individuals’ lives. For example, credit scoring is frequently determined based on the records of past credit data together with statistical prediction techniques. Needless to say, such determinations must be nondiscriminatory and fair with respect to sensitive features such as race, gender, and religion. Several researchers have recently begun to develop analysis techniques that are aware of social fairness or discrimination. They have shown that simply avoiding the use of sensitive features is insufficient for eliminating biases in determinations, due to the indirect influence of sensitive information. In this paper, we first discuss three causes of unfairness in machine learning. We then propose a regularization approach that is applicable to any prediction algorithm with probabilistic discriminative models. We further apply this approach to logistic regression and empirically show its effectiveness and efficiency.
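
The abstract does not state the form of the regularizer, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the penalty is a plug-in estimate of the mutual information between the predicted label and the sensitive feature, added to the logistic regression loss with a weight `eta`. All names (`make_data`, `mutual_information`, `objective`, `fit`, `eta`, `lam`) and the synthetic data are hypothetical, and the gradient is taken by finite differences purely for brevity. The sensitive feature is deliberately not used as an input, so any dependence on it enters only through a correlated proxy feature, which illustrates the indirect influence mentioned above.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, x):
    # w[0] is the bias; the sensitive feature is NOT part of x.
    return sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))


def mutual_information(probs, s):
    """Plug-in estimate of I(Y_hat; S) for a binary label, from soft predictions.

    Assumption: this stands in for the fairness penalty; the abstract does not
    give the authors' exact formula.
    """
    n = len(probs)
    p_y1 = sum(probs) / n
    mi = 0.0
    for group in set(s):
        members = [p for p, si in zip(probs, s) if si == group]
        p_s = len(members) / n
        p_y1_s = sum(members) / len(members)
        for p_ys, p_y in ((p_y1_s, p_y1), (1.0 - p_y1_s, 1.0 - p_y1)):
            if p_ys > 0.0:  # 0 * log(0) is treated as 0
                mi += p_s * p_ys * math.log(p_ys / p_y)
    return mi


def objective(w, X, y, s, eta, lam=0.01):
    # Mean negative log-likelihood + eta * fairness penalty + L2 penalty.
    eps = 1e-12
    probs = [predict_proba(w, x) for x in X]
    nll = -sum(
        yi * math.log(p + eps) + (1 - yi) * math.log(1.0 - p + eps)
        for p, yi in zip(probs, y)
    ) / len(y)
    l2 = 0.5 * lam * sum(wi * wi for wi in w)
    return nll + eta * mutual_information(probs, s) + l2


def fit(X, y, s, eta, steps=200, lr=0.3, h=1e-5):
    # Plain gradient descent with a forward-difference gradient (for brevity).
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        base = objective(w, X, y, s, eta)
        grad = []
        for j in range(len(w)):
            w_h = list(w)
            w_h[j] += h
            grad.append((objective(w_h, X, y, s, eta) - base) / h)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


def make_data(n=200, seed=0):
    # Hypothetical data: one proxy feature correlated with the sensitive value s.
    rng = random.Random(seed)
    X, y, s = [], [], []
    for _ in range(n):
        si = rng.randint(0, 1)
        proxy = si + rng.gauss(0.0, 0.5)
        yi = 1 if proxy + rng.gauss(0.0, 0.5) > 0.5 else 0
        X.append([proxy])
        y.append(yi)
        s.append(si)
    return X, y, s


if __name__ == "__main__":
    X, y, s = make_data()
    for eta in (0.0, 5.0):
        w = fit(X, y, s, eta)
        probs = [predict_proba(w, x) for x in X]
        print(f"eta={eta}: I(Y_hat; S) = {mutual_information(probs, s):.4f}")
```

In this sketch, `eta = 0` recovers ordinary L2-regularized logistic regression, while larger values of `eta` trade predictive fit for weaker statistical dependence between the predictions and the sensitive feature.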

Keywords

fairness, discrimination, logistic regression, classification, social responsibility, information theory

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Toshihiro Kamishima (1)
  • Shotaro Akaho (1)
  • Hideki Asoh (1)
  • Jun Sakuma (2, 3)

  1. National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Japan
  2. University of Tsukuba, Tsukuba, Japan
  3. Japan Science and Technology Agency, Kawaguchi, Japan
