Feature and Search Space Reduction for Label-Dependent Multi-label Classification

  • Prema Nedungadi
  • H. Haripriya
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 380)


High dimensionality in the multi-label domain is an emerging research problem. We propose a strategy that efficiently combines multiple regression with a hybrid k-Nearest Neighbor (kNN) algorithm for high-dimensional multi-label classification. The hybrid kNN applies dimensionality reduction to the feature space of the multi-labeled data, reducing both the search space and the feature space for kNN, while multiple regression extracts label-dependent information from the label space. For prediction, our multi-label classifier combines label dependency in the label space with feature similarity in the reduced feature space. The approach has applications in domains such as information retrieval, query categorization, medical diagnosis, and marketing.
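The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes PCA (named in the keywords) as the dimensionality reduction step, plain Euclidean kNN in the reduced space, and an ordinary least-squares regression of each label on the remaining labels. The function names, the equal-weight averaging of the two scores, and the 0.5 threshold are all hypothetical choices made for illustration only.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the feature mean and the top principal directions (via SVD)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project rows of X onto the principal directions."""
    return (X - mean) @ components.T

def knn_label_scores(Z_train, Y_train, z, k):
    """Fraction of the k nearest neighbours (in reduced space) carrying each label."""
    dist = np.linalg.norm(Z_train - z, axis=1)
    nearest = np.argsort(dist)[:k]
    return Y_train[nearest].mean(axis=0)

def fit_label_regression(Y):
    """Least-squares regression of each label on all other labels plus an intercept.

    Row j of the result holds the coefficients for label j; the last
    column is the intercept, and the diagonal entry stays zero.
    """
    n, q = Y.shape
    W = np.zeros((q, q + 1))
    for j in range(q):
        others = [i for i in range(q) if i != j]
        A = np.hstack([Y[:, others], np.ones((n, 1))])
        coef, *_ = np.linalg.lstsq(A, Y[:, j], rcond=None)
        W[j, others] = coef[:-1]
        W[j, q] = coef[-1]
    return W

def predict(X_train, Y_train, x, n_components=2, k=3, threshold=0.5):
    """Combine feature similarity and label dependency (assumed: simple average)."""
    mean, comps = pca_fit(X_train, n_components)
    Z_train = pca_transform(X_train, mean, comps)
    z = pca_transform(x[None, :], mean, comps)[0]
    sim = knn_label_scores(Z_train, Y_train, z, k)   # feature-similarity scores
    W = fit_label_regression(Y_train)
    dep = W[:, :-1] @ sim + W[:, -1]                 # label-dependency scores
    return ((sim + dep) / 2 >= threshold).astype(int)
```

Here `X_train` is an n-by-d feature matrix, `Y_train` an n-by-q binary label matrix, and `x` a single d-dimensional query. In this sketch the neighbour search runs in the reduced space, which is where the savings in search cost and feature count would come from.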


Multi-label, Multiple regression, Hybrid kNN, PCA



This work derives inspiration and direction from the Chancellor of Amrita University, Sri Mata Amritanandamayi Devi.



Copyright information

© Springer India 2016

Authors and Affiliations

  1. Amrita CREATE, Amrita University, Kollam, India
