Discrimination-Aware Association Rule Mining for Unbiased Data Analytics

  • Ling Luo (Email author)
  • Wei Liu
  • Irena Koprinska
  • Fang Chen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9263)

Abstract

A discriminatory dataset refers to a dataset with undesirable correlation between sensitive attributes and the class label, which often leads to biased decision making in data analytics processes. This paper investigates how to build discrimination-aware models even when the available training set is intrinsically discriminating based on some sensitive attributes, such as race, gender or personal status. We propose a new classification method called Discrimination-Aware Association Rule classifier (DAAR), which integrates a new discrimination-aware measure and an association rule mining algorithm. We evaluate the performance of DAAR on three real datasets from different domains and compare it with two non-discrimination-aware classifiers (a standard association rule classification algorithm and the state-of-the-art association rule algorithm SPARCCC), and also with a recently proposed discrimination-aware decision tree method. The results show that DAAR is able to effectively filter out the discriminatory rules and decrease the discrimination on all datasets with insignificant impact on the predictive accuracy.
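To make the notion of "undesirable correlation between sensitive attributes and the class label" concrete, the sketch below computes one measure commonly used in discrimination-aware classification: the difference in positive-label rate between the non-deprived and the deprived group. This is an illustrative assumption and not necessarily the new measure that DAAR introduces, which the abstract does not specify. The function name `discrimination_score`, its parameters, and the toy data are all hypothetical.

```python
from typing import Sequence


def discrimination_score(
    sensitive: Sequence[str],
    labels: Sequence[int],
    deprived: str,
) -> float:
    """Positive-label rate of the non-deprived group minus that of the deprived group.

    Assumption: labels are 0/1 with 1 as the desirable outcome. This is a common
    measure in the literature, not necessarily the one used by DAAR.
    """
    deprived_labels = [y for s, y in zip(sensitive, labels) if s == deprived]
    favoured_labels = [y for s, y in zip(sensitive, labels) if s != deprived]
    if not deprived_labels or not favoured_labels:
        raise ValueError("both groups must contain at least one instance")
    favoured_rate = sum(favoured_labels) / len(favoured_labels)
    deprived_rate = sum(deprived_labels) / len(deprived_labels)
    return favoured_rate - deprived_rate


# Toy, made-up data: sensitive attribute and a binary class label.
gender = ["m", "m", "m", "m", "f", "f", "f", "f"]
hired = [1, 1, 1, 0, 1, 0, 0, 0]

score = discrimination_score(gender, hired, deprived="f")
print(score)
```

A score of zero means both groups receive the positive label at the same rate. Under this assumed measure, larger positive values indicate stronger bias against the deprived group.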

Keywords

Discrimination-aware data mining, Association rule classification, Unbiased decision making

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Ling Luo (1, 2), Email author
  • Wei Liu (2, 3)
  • Irena Koprinska (1)
  • Fang Chen (2)
  1. School of Information Technologies, University of Sydney, Sydney, Australia
  2. NICTA ATP Laboratory, Sydney, Australia
  3. Faculty of Engineering and IT, University of Technology, Sydney, Australia