Advances in Data Analysis and Classification, Volume 11, Issue 4, pp 659–690

Parametric classification with soft labels using the evidential EM algorithm: linear discriminant analysis versus logistic regression

Regular Article

Abstract

Partially supervised learning extends both supervised and unsupervised learning by considering situations in which only partial information about the response variable is available. In this paper, we consider partially supervised classification and assume that the learning instances are labeled by Dempster–Shafer mass functions, called soft labels. Linear discriminant analysis and logistic regression are considered as special cases of generative and discriminative parametric models, respectively. We show that the evidential EM algorithm can be particularized to fit the parameters of each of these models. We describe experimental results with simulated data sets as well as with two real applications: K-complex detection in sleep EEG signals and facial expression recognition. These results confirm the benefit of using soft labels for classification, as compared to potentially erroneous crisp labels, when the true class membership is partially unknown or ill-defined.
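To make the idea of fitting a generative model from soft labels more concrete, here is a minimal sketch of an evidential-EM-style fit for a univariate discriminant model with a shared variance. It is an illustration under stated assumptions, not the authors' implementation: it assumes each soft label is summarized by its contour function (the plausibility of each single class), and the function names `gaussian_pdf` and `e2m_lda_1d`, the fixed iteration count, and the toy data are all hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def e2m_lda_1d(xs, pl, n_iter=50):
    """Estimate priors, class means and a shared variance from soft labels.

    xs: list of scalar features.
    pl: pl[i][k] is the plausibility that instance i belongs to class k.
        A one-hot row is a crisp label; a row of ones carries no label
        information (the instance is treated as unlabeled).
    """
    n, n_classes = len(xs), len(pl[0])
    # Initial memberships: plausibilities normalized to sum to one per instance.
    t = [[p / sum(row) for p in row] for row in pl]
    for _ in range(n_iter):
        # M-step: weighted maximum-likelihood updates.
        nk = [sum(t[i][k] for i in range(n)) for k in range(n_classes)]
        priors = [nk[k] / n for k in range(n_classes)]
        means = [sum(t[i][k] * xs[i] for i in range(n)) / nk[k]
                 for k in range(n_classes)]
        var = sum(t[i][k] * (xs[i] - means[k]) ** 2
                  for i in range(n) for k in range(n_classes)) / n
        # E-step: posterior memberships reweighted by the plausibilities.
        for i in range(n):
            w = [pl[i][k] * priors[k] * gaussian_pdf(xs[i], means[k], var)
                 for k in range(n_classes)]
            total = sum(w)
            t[i] = [wk / total for wk in w]
    return priors, means, var, t


# Toy data (made up): two crisp labels per class, two uncertain labels,
# and two instances with no label information at all.
xs = [-2.1, -1.9, -2.0, 2.0, 1.8, 2.2, -1.5, 1.7]
pl = [[1, 0], [1, 0], [1, 0.3], [0, 1], [0, 1], [0.2, 1], [1, 1], [1, 1]]
priors, means, var, t = e2m_lda_1d(xs, pl)
print(priors, means, var)
```

With crisp one-hot plausibilities this reduces to ordinary supervised estimation, and with all plausibilities equal to one it reduces to standard EM for a mixture, which is the sense in which partially supervised learning covers both settings. The sketch does not guard against empty classes or a vanishing variance.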

Keywords

Partially supervised learning; Belief functions; Dempster–Shafer theory; Machine learning; Uncertain data; Discriminant analysis; Logistic regression

Mathematics Subject Classification

62H30 62F86 68T10 68T37 

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2017

Authors and Affiliations

  1. CNRS, Heudiasyc (UMR 7253), Sorbonne Universités, Université de Technologie de Compiègne, Compiègne, France
  2. College of Applied Sciences, Beijing University of Technology, Beijing, China
