
Knowledge and Information Systems, Volume 38, Issue 3, pp 641–667

A multi-phase correlation search framework for mining non-taxonomic relations from unstructured text

  • Mei Kuan Wong
  • Syed Sibte Raza Abidi
  • Ian D. Jonsen
Regular Paper

Abstract

Over the last decade, ontology engineering has been pursued by “learning” the ontology from domain-specific electronic documents. Most research has focused on the extraction of concepts and taxonomic relations, while the extraction of non-taxonomic relations is often neglected and not well researched. In this paper, we present a multi-phase correlation search framework for extracting non-taxonomic relations from unstructured text. Our framework addresses the two main problems in non-taxonomic relation extraction: (a) the discovery of non-taxonomic relations and (b) the labelling of non-taxonomic relations. First, the framework extracts correlated concepts beyond the ordinary search window of a single sentence. Interesting correlations are then filtered using association rule mining with the lift interestingness measure. Next, the framework distinguishes non-taxonomic concept pairs from taxonomic concept pairs based on an existing domain ontology. Finally, the framework uses domain-related verbs as labels for the non-taxonomic relations. The proposed framework has been tested in the marine biology domain. Domain experts validated the results, which were found to be reliable and to show a significant improvement over the traditional association rule approach in the search for non-taxonomic relations from unstructured text.
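
To make the lift-based filtering step named in the abstract concrete, here is a minimal sketch. It is not the authors' implementation. It assumes the standard definition lift(A, B) = P(A, B) / (P(A) P(B)), treats each search window as a set of concepts, and uses hypothetical concept names and a hypothetical cut-off of lift > 1.

```python
from collections import Counter
from itertools import combinations


def lift_scores(windows):
    """Return the lift of every concept pair that co-occurs at least once.

    windows: list of sets of concept names, one set per text window.
    (The window contents are an assumption made for illustration.)
    """
    n = len(windows)
    single = Counter()  # number of windows containing each concept
    pair = Counter()    # number of windows containing each concept pair
    for concepts in windows:
        single.update(concepts)
        pair.update(combinations(sorted(concepts), 2))
    # lift(A, B) = P(A, B) / (P(A) * P(B))
    return {
        (a, b): (count / n) / ((single[a] / n) * (single[b] / n))
        for (a, b), count in pair.items()
    }


# Hypothetical windows of marine concepts.
windows = [
    {"seal", "cod"},
    {"seal", "cod"},
    {"seal", "kelp"},
    {"kelp"},
]

scores = lift_scores(windows)
# Keep pairs that co-occur more often than independence would predict.
interesting = {p: s for p, s in scores.items() if s > 1.0}
```

A lift above 1 means the two concepts appear together more often than they would if they were independent, which is why it can serve as a filter for candidate relations. The threshold used in the paper is not stated in this abstract.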

Keywords

Correlation search; Non-taxonomic relation; Relation labelling; Association rule mining; Lift interestingness measure; Ontology learning

Notes

Acknowledgments

This research is supported by an R&D grant from CANARIE, Canada, through the Network Enabled Platform program. We would also like to extend our gratitude to Dr. Isidora Katara for her valuable help in the evaluation of the proposed framework.

Copyright information

© Springer-Verlag London 2012

Authors and Affiliations

  • Mei Kuan Wong (1)
  • Syed Sibte Raza Abidi (1)
  • Ian D. Jonsen (2)
  1. Faculty of Computer Science, Dalhousie University, Halifax, Canada
  2. Department of Biology, Dalhousie University, Halifax, Canada
