Skip to main content

Enhancing Fairness and Accuracy in Machine Learning Through Similarity Networks

  • Conference paper
  • First Online:
Cooperative Information Systems (CoopIS 2023)

Abstract

Machine Learning is a powerful tool for uncovering relationships and patterns within datasets. However, applying it to a large datasets can lead to biased outcomes and quality issues, due to confounder variables indirectly related to the outcome of interest. Achieving fairness often alters training data, like balancing imbalanced groups (privileged/unprivileged) or excluding sensitive features, impacting accuracy. To address this, we propose a solution inspired by similarity network fusion, preserving dataset structure by integrating global and local similarities. We evaluate our method, considering data set complexity, fairness, and accuracy. Experimental results show the similarity network’s effectiveness in balancing fairness and accuracy. We discuss implications and future directions.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 59.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 79.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

Notes

  1. 1.

    https://rpubs.com/rhuebner/hrd_cb_v14.

  2. 2.

    https://scikit-learn.org/stable/modules/impute.html.

  3. 3.

    https://search.r-project.org/CRAN/refmans/mice/html/mice.html.

  4. 4.

    https://radimrehurek.com/gensim/models/word2vec.html.

References

  1. Abdel-Megeed, S.M.: Monte Carlo study of psychometric effects of scaling levels on the pearson product moment correlation coefficient (1984)

    Google Scholar 

  2. Agarwal, A., Agarwal, H., Agarwal, N.: Fairness score and process standardization: framework for fairness certification in artificial intelligence systems. AI Ethics 3(1), 267–279 (2023). https://doi.org/10.1007/s43681-022-00147-7

    Article  Google Scholar 

  3. Aurelio, Y.S., De Almeida, G.M., de Castro, C.L., Braga, A.P.: Learning from imbalanced data sets with weighted cross-entropy function. Neural Process. Lett. 50, 1937–1949 (2019). https://doi.org/10.1007/s11063-018-09977-1

    Article  Google Scholar 

  4. Barocas, S., Hardt, M., Narayanan, A.: Fairness and Machine Learning: Limitations and Opportunities. fairmlbook.org (2019). http://www.fairmlbook.org

  5. Bellandi, V., Damiani, E., Ghirimoldi, V., Maghool, S., Negri, F.: Validating vector-label propagation for graph embedding. In: Sellami, M., Ceravolo, P., Reijers, H.A., Gaaloul, W., Panetto, H. (eds.) CoopIS 2022. LNCS, vol. 13591, pp. 259–276. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-17834-4_15

    Chapter  Google Scholar 

  6. Casiraghi, E., et al.: A method for comparing multiple imputation techniques: a case study on the US national COVID cohort collaborative. J. Biomed. Inform. 139, 104295 (2023)

    Article  Google Scholar 

  7. Corbett-Davies, S., Pierson, E., Feller, A., Goel, S., Huq, A.: Algorithmic decision making and the cost of fairness. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 797–806 (2017)

    Google Scholar 

  8. Cotter, A., et al.: Training well-generalizing classifiers for fairness metrics and other data-dependent constraints. In: International Conference on Machine Learning, pp. 1397–1405. PMLR (2019)

    Google Scholar 

  9. Cotter, A., et al.: Optimization with non-differentiable constraints with applications to fairness, recall, churn, and other goals. J. Mach. Learn. Res. 20(172), 1–59 (2019)

    MathSciNet  MATH  Google Scholar 

  10. Dwork, C., Hardt, M., Pitassi, T., Reingold, O., Zemel, R.: Fairness through awareness. In: Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, pp. 214–226 (2012)

    Google Scholar 

  11. Fish, B., Kun, J., Lelkes, Á.D.: A confidence-based approach for balancing fairness and accuracy. In: Proceedings of the 2016 SIAM International Conference on Data Mining, pp. 144–152. SIAM (2016)

    Google Scholar 

  12. Friedler, S.A., Scheidegger, C., Venkatasubramanian, S., Choudhary, S., Hamilton, E.P., Roth, D.: A comparative study of fairness-enhancing interventions in machine learning. In: Proceedings of the Conference on Fairness, Accountability, and Transparency, pp. 329–338 (2019)

    Google Scholar 

  13. Garcia, L.P., de Carvalho, A.C., Lorena, A.C.: Effect of label noise in the complexity of classification problems. Neurocomputing 160, 108–119 (2015)

    Article  Google Scholar 

  14. Ghazimatin, A., Kleindessner, M., Russell, C., Abedjan, Z., Golebiowski, J.: Measuring fairness of rankings under noisy sensitive information. In: Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2022, pp. 2263–2279. Association for Computing Machinery, New York (2022). https://doi.org/10.1145/3531146.3534641

  15. Gliozzo, J., et al.: Heterogeneous data integration methods for patient similarity networks. Briefings Bioinform. 23(4), bbac207 (2022)

    Google Scholar 

  16. Gower, J.C.: A general coefficient of similarity and some of its properties. Biometrics 27(4), 857–871 (1971)

    Article  Google Scholar 

  17. Hardt, M., Price, E., Srebro, N.: Equality of opportunity in supervised learning. In: Advances in Neural Information Processing Systems, vol. 29 (2016)

    Google Scholar 

  18. Ho, T.K., Basu, M.: Complexity measures of supervised classification problems. IEEE Trans. Pattern Anal. Mach. Intell. 24(3), 289–300 (2002)

    Article  Google Scholar 

  19. Japkowicz, N., Shah, M.: Performance evaluation in machine learning. In: El Naqa, I., Li, R., Murphy, M.J. (eds.) Machine Learning in Radiation Oncology, pp. 41–56. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-18305-3_4

    Chapter  Google Scholar 

  20. Kleinberg, J.: Inherent trade-offs in algorithmic fairness. SIGMETRICS Perform. Eval. Rev. 46(1), 40 (2018). https://doi.org/10.1145/3292040.3219634

    Article  MathSciNet  Google Scholar 

  21. Lepri, B., Oliver, N., Letouzé, E., Pentland, A., Vinck, P.: Fair, transparent, and accountable algorithmic decision-making processes: the premise, the proposed solutions, and the open challenges. Philos. Technol. 31, 611–627 (2018). https://doi.org/10.1007/s13347-017-0279-x

    Article  Google Scholar 

  22. Liang, A., Lu, J., Mu, X.: Algorithmic design: fairness versus accuracy. In: Proceedings of the 23rd ACM Conference on Economics and Computation, pp. 58–59 (2022)

    Google Scholar 

  23. Lorena, A.C., Garcia, L.P., Lehmann, J., Souto, M.C., Ho, T.K.: How complex is your classification problem? A survey on measuring classification complexity. ACM Comput. Surv. (CSUR) 52(5), 1–34 (2019)

    Article  Google Scholar 

  24. Lundberg, S.M., Lee, S.I.: A unified approach to interpreting model predictions. In: Advances in Neural Information Processing Systems, vol. 30 (2017)

    Google Scholar 

  25. Mary, J., Calauzenes, C., El Karoui, N.: Fairness-aware learning for continuous attributes and treatments. In: International Conference on Machine Learning, pp. 4382–4391. PMLR (2019)

    Google Scholar 

  26. Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., Galstyan, A.: A survey on bias and fairness in machine learning. ACM Comput. Surv. (CSUR) 54(6), 1–35 (2021)

    Article  Google Scholar 

  27. Menon, A.K., Williamson, R.C.: The cost of fairness in binary classification. In: Conference on Fairness, Accountability and Transparency, pp. 107–118. PMLR (2018)

    Google Scholar 

  28. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, vol. 26 (2013)

    Google Scholar 

  29. Morais, G., Prati, R.C.: Complex network measures for data set characterization. In: 2013 Brazilian Conference on Intelligent Systems, pp. 12–18. IEEE (2013)

    Google Scholar 

  30. Naeem, S.B., Bhatti, R., Khan, A.: An exploration of how fake news is taking over social media and putting public health at risk. Health Inf. Libr. J. 38(2), 143–149 (2021)

    Article  Google Scholar 

  31. Oneto, L., Chiappa, S.: Fairness in machine learning. In: Oneto, L., Navarin, N., Sperduti, A., Anguita, D. (eds.) Recent Trends in Learning From Data. SCI, vol. 896, pp. 155–196. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-43883-8_7

    Chapter  Google Scholar 

  32. Ormiston, C.K., Chiangong, J., Williams, F.: The COVID-19 pandemic and hispanic/latina/o immigrant mental health: why more needs to be done. Health Equity 7(1), 3–8 (2023)

    Article  Google Scholar 

  33. Pessach, D., Shmueli, E.: A review on fairness in machine learning. ACM Comput. Surv. 55(3), 1–44 (2022). https://doi.org/10.1145/3494672

    Article  Google Scholar 

  34. Ribeiro, M.T., Singh, S., Guestrin, C.: “Why should i trust you?” Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135–1144 (2016)

    Google Scholar 

  35. Schölkopf, B.: The kernel trick for distances. In: Advances in Neural Information Processing Systems, vol. 13 (2000)

    Google Scholar 

  36. Singh, A., Joachims, T.: Fairness of exposure in rankings. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2219–2228 (2018)

    Google Scholar 

  37. Smola, A.J., Kondor, R.: Kernels and regularization on graphs. In: Schölkopf, B., Warmuth, M.K. (eds.) COLT-Kernel 2003. LNCS (LNAI), vol. 2777, pp. 144–158. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-45167-9_12

    Chapter  MATH  Google Scholar 

  38. Sugiyama, M., Borgwardt, K.: Halting in random walk kernels. In: Cortes, C., Lawrence, N., Lee, D., Sugiyama, M., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 28. Curran Associates, Inc. (2015). https://proceedings.neurips.cc/paper_files/paper/2015/file/31b3b31a1c2f8a370206f111127c0dbd-Paper.pdf

  39. Tizpaz-Niari, S., Kumar, A., Tan, G., Trivedi, A.: Fairness-aware configuration of machine learning libraries. In: Proceedings of the 44th International Conference on Software Engineering, pp. 909–920 (2022)

    Google Scholar 

  40. Wang, B., et al.: Similarity network fusion for aggregating data types on a genomic scale. Nat. Methods 11(3), 333–337 (2014)

    Article  Google Scholar 

  41. Zhang, T., Zhu, T., Gao, K., Zhou, W., Philip, S.Y.: Balancing learning model privacy, fairness, and accuracy with early stopping criteria. IEEE Trans. Neural Netw. Learn. Syst. 34(9), 5557–5569 (2023)

    Article  MathSciNet  Google Scholar 

  42. Zhu, X.: Semi-supervised learning with graphs. Carnegie Mellon University (2005)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding authors

Correspondence to Samira Maghool or Paolo Ceravolo .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Maghool, S., Casiraghi, E., Ceravolo, P. (2024). Enhancing Fairness and Accuracy in Machine Learning Through Similarity Networks. In: Sellami, M., Vidal, ME., van Dongen, B., Gaaloul, W., Panetto, H. (eds) Cooperative Information Systems. CoopIS 2023. Lecture Notes in Computer Science, vol 14353. Springer, Cham. https://doi.org/10.1007/978-3-031-46846-9_1

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-46846-9_1

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-46845-2

  • Online ISBN: 978-3-031-46846-9

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics