AWX: An Integrated Approach to Hierarchical-Multilabel Classification

  • Luca Masera
  • Enrico Blanzieri
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11051)


The recent surge of work on artificial neural networks (ANNs) has reshaped the machine learning landscape. Despite the vast literature, there is still a lack of methods that tackle the hierarchical multilabel classification (HMC) task relying entirely on ANNs. Here we propose AWX, a novel approach that aims to fill this gap. AWX is a versatile component that can be used as the output layer of any ANN whenever a fixed structured output is required, as in the case of HMC. AWX exploits prior knowledge of the output domain by embedding the hierarchical structure directly in the network topology. Information flows from the leaf terms to the inner ones, allowing a joint optimization of the predictions. Different options for combining the signals received from the leaves are proposed and discussed. Moreover, we propose a generalization of the true path rule to the continuous domain and demonstrate that AWX’s predictions are guaranteed to be consistent with respect to it. Finally, the proposed method is evaluated on 10 benchmark datasets and shows a significant increase in performance over plain ANN, HMC-LMLP, and the state-of-the-art method CLUS-HMC. Code related to this paper is available at:
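
The idea of letting leaf predictions flow up to the inner terms can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the toy tree, the names `R`, `hierarchical_output`, and `is_consistent`, and the choice of `max` as the combination function are all assumptions made for illustration, since the abstract does not spell out the combination options. The continuous true path rule is read here as "the score of a term is never lower than the score of any of its descendants".

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Hypothetical toy hierarchy (a tree): root -> {A, B}, A -> {A1, A2}.
# Real HMC hierarchies may be DAGs; a single-parent dict is a simplification.
terms = ["root", "A", "B", "A1", "A2"]
parent = {"A": "root", "B": "root", "A1": "A", "A2": "A"}
leaves = ["A1", "A2", "B"]

def ancestors_or_self(term):
    """Return the term followed by all of its ancestors up to the root."""
    chain = [term]
    while term in parent:
        term = parent[term]
        chain.append(term)
    return chain

# R[i, j] = 1 if leaf j is term i itself or lies beneath term i.
R = np.zeros((len(terms), len(leaves)))
for j, leaf in enumerate(leaves):
    for t in ancestors_or_self(leaf):
        R[terms.index(t), j] = 1.0

def hierarchical_output(leaf_logits):
    """Map leaf logits (batch, n_leaves) to scores for every term (batch, n_terms).

    Each term takes the maximum sigmoid score among the leaves beneath it
    (assumed combination function).
    """
    z = sigmoid(leaf_logits)
    # Sigmoid scores are strictly positive, so masking with zeros is safe for max.
    return (z[:, None, :] * R[None, :, :]).max(axis=2)

def is_consistent(scores, tol=1e-12):
    """Check that every parent scores at least as high as each of its children."""
    return all(
        np.all(scores[:, terms.index(p)] + tol >= scores[:, terms.index(c)])
        for c, p in parent.items()
    )

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, len(leaves)))  # stand-in for the last hidden layer's output
scores = hierarchical_output(logits)
print(is_consistent(scores))
```

Because every inner term aggregates over a superset of the leaves seen by its children, the `max` combination makes the parent-child inequality hold by construction, whatever the leaf logits are.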


Hierarchical multilabel classification · Structured prediction · Artificial neural networks



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. University of Trento, Trento, Italy
