
The Fuzzy Misclassification Analysis with Deep Neural Network for Handling Class Noise Problem

  • Anupiya Nugaliyadde
  • Ratchakoon Pruengkarn
  • Kok Wai Wong
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11304)

Abstract

Most real-world data is embedded with noise, and noise can negatively affect the classification models used to analyse that data. Therefore, noisy data should be handled in order to avoid any negative effect on the learning algorithm used to build the analysis model. Deep learning algorithms have been shown to outperform general classification algorithms; however, they are undermined by noisy data. This paper proposes Fuzzy misclassification Analysis with Deep neural networks (FAD) to handle noise in classification data. By combining fuzzy misclassification analysis with a deep neural network, FAD improves classification confidence by better handling noisy data. FAD was tested on the Ionosphere, Pima, German and Yeast3 datasets after randomly adding 40% noise to the data. It consistently provided good results compared with other noise removal techniques, outperforming CMTF-SVM by an average of 3.88% on the testing datasets.
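The general idea described in the abstract — train a model on noisy labels, use its misclassifications and a fuzzy confidence (membership) score to flag suspect samples, remove them, and retrain — can be illustrated with a minimal sketch. This is not the authors' exact FAD procedure: plain logistic regression stands in for the deep neural network, the data is synthetic, and the 0.5 membership threshold is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data (a hypothetical stand-in for e.g. the Pima dataset).
n = 400
X = rng.normal(size=(n, 2))
y_true = (X[:, 0] + X[:, 1] > 0).astype(int)

# Inject class noise: randomly flip a fraction of the labels.
noise_rate = 0.2
flip = rng.random(n) < noise_rate
y_noisy = np.where(flip, 1 - y_true, y_true)

def train_logreg(X, y, epochs=200, lr=0.5):
    """Logistic regression by gradient descent (stand-in for the DNN)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-(X @ w + b)))
        g = p - y
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

def predict_proba(X, w, b):
    return 1 / (1 + np.exp(-(X @ w + b)))

# Step 1: fit the model on the noisy labels.
w, b = train_logreg(X, y_noisy)
p = predict_proba(X, w, b)

# Step 2: fuzzy membership = model's confidence that each sample
# belongs to its *own* (possibly noisy) label.
membership = np.where(y_noisy == 1, p, 1 - p)

# Step 3: misclassified samples with low membership are treated as class noise.
suspect = membership < 0.5
X_clean, y_clean = X[~suspect], y_noisy[~suspect]

# Step 4: retrain on the cleaned training set.
w2, b2 = train_logreg(X_clean, y_clean)
acc_clean = ((predict_proba(X, w2, b2) > 0.5) == y_true).mean()
print(f"removed {suspect.sum()} suspect samples; "
      f"accuracy vs. true labels after cleaning: {acc_clean:.2f}")
```

The key design choice this sketch shows is that the filter is driven by the classifier's own confidence in the given label, not only by a hard misclassification count, which is what the fuzzy component contributes.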

Keywords

Class noise · Fuzzy misclassification analysis · Deep neural networks · Noise removal technique

Notes

Acknowledgement

This work was partially supported by a Murdoch University internal grant for high-end computing.


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Anupiya Nugaliyadde (1)
  • Ratchakoon Pruengkarn (2)
  • Kok Wai Wong (1)
  1. Murdoch University, Perth, Australia
  2. Dhurakij Pundit University, Bangkok, Thailand
