We propose an interactive machine learning framework in which the machine questions the user's feedback when it finds that feedback inconsistent with the knowledge it has previously accumulated. The key idea is that the machine uses its available knowledge to check the correctness of both its own labeling and the user's. The proposed architecture and algorithms run through a series of modes with progressively higher confidence and feature a conflict resolution component. The proposed solution is tested in a project on university student life, where the goal is to perform tasks such as recognizing user location and transportation mode from sensor data. The results highlight the unexpectedly extreme pervasiveness of annotation mistakes and the advantages provided by skeptical learning.
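The core loop described above, in which the machine challenges a user label that contradicts what it has learned so far, can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `SkepticalLearner`, the frequency-count predictor, the `challenge_threshold` and `min_examples` parameters, and the `ask_user` callback are all hypothetical choices made for illustration, and the example feature `"wifi:library"` is invented.

```python
from collections import Counter, defaultdict


class SkepticalLearner:
    """Toy learner that challenges user labels contradicting its own experience.

    Assumption: a simple per-feature label frequency table stands in for
    whatever predictor and prior knowledge a real system would use.
    """

    def __init__(self, challenge_threshold=0.8, min_examples=3):
        self.counts = defaultdict(Counter)  # feature -> label frequencies
        self.challenge_threshold = challenge_threshold
        self.min_examples = min_examples

    def predict(self, feature):
        """Return (label, confidence), or (None, 0.0) for an unseen feature."""
        seen = self.counts[feature]
        total = sum(seen.values())
        if total == 0:
            return None, 0.0
        label, hits = seen.most_common(1)[0]
        return label, hits / total

    def observe(self, feature, user_label, ask_user):
        """Process one annotation; ask_user is called only on a conflict."""
        predicted, confidence = self.predict(feature)
        total = sum(self.counts[feature].values())
        conflict = (
            predicted is not None
            and predicted != user_label
            and total >= self.min_examples
            and confidence >= self.challenge_threshold
        )
        # On a conflict, the user decides which label is kept.
        final = ask_user(feature, user_label, predicted) if conflict else user_label
        self.counts[feature][final] += 1
        return final, conflict


# Hypothetical user who always accepts the machine's suggestion.
def accept_machine(feature, user_label, predicted):
    return predicted


learner = SkepticalLearner()
for _ in range(4):
    learner.observe("wifi:library", "library", accept_machine)

# A label that contradicts four consistent earlier annotations is challenged.
final, challenged = learner.observe("wifi:library", "gym", accept_machine)
print(final, challenged)
```

Requiring a minimum number of examples before challenging mirrors, in a very simplified way, the idea of moving through modes of progressively higher confidence: early on the learner trusts the user unconditionally, and only later does it become skeptical.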
To support its argument, the machine could provide some form of explainable critique of the user feedback, in terms of counter-examples or evidence of inconsistencies with respect to the SK. This is a promising direction for future research.
This research has received funding from the European Union’s Horizon 2020 FET Proactive project “WeNet–The Internet of us”, Grant Agreement No: 823783.
Zhang, W., Passerini, A. & Giunchiglia, F. Dealing with Mislabeling via Interactive Machine Learning. Künstl Intell 34, 271–278 (2020). https://doi.org/10.1007/s13218-020-00630-5
Keywords
- Interactive learning
- Knowledge and learning
- Managing annotator mistakes