Dealing with Mislabeling via Interactive Machine Learning

  • Wanyi Zhang
  • Andrea Passerini
  • Fausto Giunchiglia
Dissertation and Habilitation Abstracts


We propose an interactive machine learning framework in which the machine questions the user's feedback when it finds that feedback inconsistent with the knowledge it has previously accumulated. The key idea is that the machine uses its available knowledge to check the correctness of both its own labeling and the user's. The proposed architecture and algorithms run through a series of modes with progressively higher confidence and feature a conflict resolution component. The proposed solution is tested in a project on university student life, where the goal is to recognize tasks such as user location and transportation mode from sensor data. The results highlight the unexpectedly extreme pervasiveness of annotation mistakes and the advantages provided by skeptical learning.
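The core loop described above, accepting user labels until enough knowledge has accumulated and then challenging labels that contradict it, can be illustrated with a minimal Python sketch. This is not the authors' architecture: the class `SkepticalLearner`, the parameters `min_examples` and `threshold`, and the `ask_user` callback are hypothetical names introduced for illustration, and a simple frequency count stands in for the learned model and prior knowledge.

```python
from collections import Counter, defaultdict


class SkepticalLearner:
    """Toy learner that challenges a user label when it conflicts
    with what it has accumulated so far (illustrative assumption only)."""

    def __init__(self, min_examples=3, threshold=0.8):
        # Hypothetical parameters: how much evidence and how much
        # confidence the machine needs before it contests the user.
        self.min_examples = min_examples
        self.threshold = threshold
        self.counts = defaultdict(Counter)  # input -> label counts

    def predict(self, x):
        """Return (label, confidence, support) for input x."""
        seen = self.counts.get(x)
        if not seen:
            return None, 0.0, 0
        label, n = seen.most_common(1)[0]
        total = sum(seen.values())
        return label, n / total, total

    def observe(self, x, user_label, ask_user):
        """Process one user annotation and return the label finally stored."""
        predicted, confidence, support = self.predict(x)
        conflict = predicted is not None and predicted != user_label
        confident = (support >= self.min_examples
                     and confidence >= self.threshold)
        if conflict and confident:
            # Conflict resolution step: the user confirms or revises.
            final = ask_user(x, user_label, predicted)
        else:
            # Not enough knowledge to be skeptical: trust the user.
            final = user_label
        self.counts[x][final] += 1
        return final


# Usage: the machine stays silent at first, then contests a conflicting label.
challenges = []


def ask_user(x, user_label, predicted):
    challenges.append((x, user_label, predicted))
    return predicted  # here the simulated user accepts the machine's label


learner = SkepticalLearner()
for _ in range(3):
    learner.observe("campus_wifi", "university", ask_user)
stored = learner.observe("campus_wifi", "home", ask_user)
```

In this sketch the `min_examples` guard loosely mirrors the idea of moving through modes of progressively higher confidence: the machine only contests the user once it has seen enough consistent evidence.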


Interactive learning · Knowledge and learning · Managing annotator mistakes



This research has received funding from the European Union’s Horizon 2020 FET Proactive project “WeNet–The Internet of us”, Grant Agreement No: 823783.



Copyright information

© Gesellschaft für Informatik e.V. and Springer-Verlag GmbH Germany, part of Springer Nature 2020

Authors and Affiliations

  1. DISI, University of Trento, Trento, Italy
