Dealing with Mislabeling via Interactive Machine Learning

Abstract

We propose an interactive machine learning framework in which the machine questions user feedback when it finds that feedback inconsistent with the knowledge it has previously accumulated. The key idea is that the machine uses its available knowledge to check the correctness of both its own labeling and the user's. The proposed architecture and algorithms run through a series of modes with progressively higher confidence and feature a conflict resolution component. The proposed solution is tested in a project on university student life, where the tasks include recognizing user location and transportation mode from sensor data. The results highlight the unexpectedly high pervasiveness of annotation mistakes and the advantages provided by skeptical learning.
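
The core loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture or algorithm: the class `SkepticalLearner`, the parameters `challenge_threshold` and `min_examples`, the `confirm` callback, and the frequency-count predictor are all assumptions made for illustration. The `min_examples` gate loosely stands in for an early mode in which user labels are accepted unchallenged, and `confirm` stands in for the conflict resolution step.

```python
from collections import Counter, defaultdict


class SkepticalLearner:
    """Toy learner that challenges user labels contradicting its own knowledge."""

    def __init__(self, challenge_threshold=0.8, min_examples=3):
        # Hypothetical parameters, not taken from the paper.
        self.challenge_threshold = challenge_threshold
        self.min_examples = min_examples
        self.counts = defaultdict(Counter)  # input value -> label counts

    def predict(self, x):
        """Return (label, confidence) for x, or (None, 0.0) if x is unseen."""
        seen = self.counts.get(x, Counter())
        total = sum(seen.values())
        if total == 0:
            return None, 0.0
        label, hits = seen.most_common(1)[0]
        return label, hits / total

    def observe(self, x, user_label, confirm):
        """Process one user annotation and return (final_label, challenged).

        confirm(x, user_label, machine_label) is called only on a conflict and
        returns the label the user settles on after being challenged.
        """
        machine_label, confidence = self.predict(x)
        support = sum(self.counts.get(x, Counter()).values())
        conflict = (
            machine_label is not None
            and machine_label != user_label
            and support >= self.min_examples          # enough accumulated knowledge
            and confidence >= self.challenge_threshold  # machine is confident
        )
        final = confirm(x, user_label, machine_label) if conflict else user_label
        self.counts[x][final] += 1
        return final, conflict


learner = SkepticalLearner()

# Early phase: four consistent annotations are accepted without challenge.
for _ in range(4):
    learner.observe("gps_campus", "university", confirm=lambda x, u, m: u)

# A contradicting annotation triggers a challenge; this simulated user
# accepts the machine's suggestion.
final, challenged = learner.observe(
    "gps_campus", "home", confirm=lambda x, u, m: m
)
print(final, challenged)  # prints "university True"
```

In this sketch the user always has the last word through `confirm`; a real conflict resolution component could instead weigh the reliability of both parties before settling on a label.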

Notes

  1. In order to support its argument, the machine could provide an explainable critique of the user feedback, such as counter-examples or evidence of inconsistencies with respect to the SK. This is a promising direction for future research.

Acknowledgements

This research has received funding from the European Union’s Horizon 2020 FET Proactive project “WeNet–The Internet of us”, Grant Agreement No: 823783.

Author information

Correspondence to Wanyi Zhang.

About this article

Cite this article

Zhang, W., Passerini, A. & Giunchiglia, F. Dealing with Mislabeling via Interactive Machine Learning. Künstl Intell (2020). https://doi.org/10.1007/s13218-020-00630-5

Keywords

  • Interactive learning
  • Knowledge and learning
  • Managing annotator mistakes