Semantic HMC: Ontology-Described Hierarchy Maintenance in Big Data Context

  • Rafael Peixoto
  • Christophe Cruz
  • Nuno Silva
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9416)


One of the biggest challenges in Big Data is exploiting Value from large volumes of constantly changing data. Exploiting value requires extracting knowledge from these Big Data sources. To extract knowledge and value from unstructured text, we propose a Hierarchical Multi-Label Classification process called Semantic HMC, which uses ontologies to describe the predictive model, including the label hierarchy and the classification rules. To avoid overloading the user, this process automatically learns the ontology-described label hierarchy from a very large set of text documents. This paper presents a maintenance process that incrementally updates the relations of the ontology-described label hierarchy as a stream of unstructured text documents arrives in a Big Data context.


Maintenance, Multi-label classification, Hierarchy induction, Ontology, Machine learning





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. GECAD - ISEP, Polytechnic of Porto, Porto, Portugal
  2. LE2I UMR 6306 CNRS, University Bourgogne Franche-Comté, Dijon, France
