Abstract
The growth of multilingual web content and increasing internationalization portend a need for cross-language query processing. We offer ML-OntoES (a MultiLingual Ontology-based Extraction System) as a solution for narrow-domain, data-rich applications. Based on language-independent extraction ontologies (Embley et al. 2011a), ML-OntoES enables semantic search over domain-specific, semistructured information. Key ideas of ML-OntoES include (1) monolingual semantic indexing and query interpretation with extraction ontologies and (2) conceptual-level cross-language translation. A prototype implementation, together with experiments showing good extraction accuracy in multiple languages, demonstrates the viability of using multilingual extraction ontologies for cross-language query processing.
Notes
1. See, for example, http://www.clef-initiative.eu.
2. Similar to the linguistic grounding discussed in Buitelaar et al. (2009), but different in its details.
3. By π and σ, we mean projection and selection, respectively.
4. Our mapping typology here resonates with that of León-Araúz and Faber (this volume), though our lexical type inventory is not as finely articulated.
References
Aggarwal, N., Polajnar, T., & Buitelaar, P. (2013). Cross-lingual natural language querying over the web of data. In E. Métais, F. Meziane, M. Saraee, V. Sugumaran, & S. Vadera (Eds.), Natural Language Processing and Information Systems: 18th International Conference on Applications of Natural Language to Information Systems (NLDB 2013). Lecture Notes in Computer Science (Vol. 7934, pp. 152–163). New York: Springer.
Beach, M., Ely, W., & Vanderpoel, G. (1902). The Ely ancestry. New York: The Calumet Press. On-line book: http://www.archive.org/details/elyancestrylinea00beac.
Buitelaar, P., Cimiano, P., Haase, P., & Sintek, M. (2009). Towards linguistically grounded ontologies. In Proceedings of the 6th European Semantic Web Conference (ESWC’09), Heraklion, Greece (pp. 111–125).
Declerck, T., Krieger, H.-U., Thomas, S., Buitelaar, P., O’Riain, S., Wunner, T., et al. (2010). Ontology-based multilingual access to financial reports for sharing business knowledge across Europe (pp. 67–76). Budapest: Memolux Kft.
Dorr, B., Hovy, E., & Levin, L. (2006). Machine translation: Interlingual methods. In Natural language processing and machine translation. Encyclopedia of Language and Linguistics (2nd ed., pp. 383–394). Oxford, UK: Elsevier Ltd.
Embley, D. (1980). Programming with data frames for everyday data items. In Proceedings of the 1980 National Computer Conference, Anaheim, CA (pp. 301–305).
Embley, D., Liddle, S., & Lonsdale, D. (2011a). Conceptual modeling foundations for a web of knowledge. In D. Embley & B. Thalheim (Eds.), Handbook of conceptual modeling: Theory, practice, and research challenges (Chap. 15, pp. 477–516). Heidelberg, Germany: Springer.
Embley, D., Liddle, S., Lonsdale, D., Machado, S., Packer, T., Park, J., et al. (2011b). Enabling search for facts and implied facts in historical documents. In Proceedings of the International Workshop on Historical Document Imaging and Processing (HIP 2011), Beijing, China (pp. 59–66).
Embley, D., Liddle, S., Lonsdale, D., & Tijerino, Y. (2011c). Multilingual ontologies for cross-language information extraction and semantic search. In Proceedings of the 30th International Conference on Conceptual Modeling (ER 2011), Brussels, Belgium (pp. 147–160).
Feigenbaum, E. (1984). Knowledge engineering: The applied side of artificial intelligence. Annals of the New York Academy of Sciences, 426, 91–107.
Fu, B., Brennan, R., & O’Sullivan, D. (2012). A configurable translation-based cross-lingual ontology mapping system to adjust mapping outcomes. Web Semantics: Science, Services and Agents on the World-Wide Web, 15, 15–36.
Hoekstra, R. (2010). The knowledge reengineering bottleneck. Semantic Web—Interoperability, Usability, Applicability, 1, 1–5.
Lonsdale, D., Ding, Y., Embley, D., & Melby, A. (2002). Peppering knowledge sources with SALT: Boosting conceptual content for ontology generation. In Proceedings of the AAAI Workshop: Semantic Web Meets Language Resources: The 18th National Conference on Artificial Intelligence, Edmonton, AB, Canada (pp. 30–36).
Mahesh, K. (1996). Ontology development for machine translation: Ideology and methodology. Technical Report MCCS-96-292. Albuquerque, NM: Computing Research Laboratory, University of New Mexico.
Montiel-Ponsoda, E., de Cea, G. A., Gómez-Pérez, A., & Peters, W. (2011). Enriching ontologies with multilingual information. Natural Language Engineering, 17(3), 283–309.
Olive, J., Christianson, C., & McCary, J. (Eds.). (2011). Handbook of natural language processing and machine translation: DARPA global autonomous language exploitation. New York: Springer.
Packer, T., & Embley, D. (2013). Cost effective ontology population with data from lists in OCRed historical documents. In Proceedings of the 2nd International Workshop on Historical Document Imaging and Processing (HIP 2013), Washington, DC, USA (pp. 1–8).
Peters, C., Braschler, M., & Clough, P. (2012). Multilingual information retrieval: From research to practice. New York: Springer.
Tijerino, Y. (2010). Cross-cultural and cross-lingual ontology engineering. In Proceedings of the 2010 Workshop on Cross-Cultural and Cross-Lingual Aspects of the Semantic Web, Shanghai, China (pp. 44–53).
Acknowledgments
We are grateful to Tae Woo Kim and Rebecca Brinck for annotating our Korean and French document sets. We are also grateful to the reviewers for their insightful feedback and challenging questions.
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Embley, D.W., Liddle, S.W., Lonsdale, D.W., Shin, BJ., Tijerino, Y. (2014). Multilingual Extraction Ontologies. In: Buitelaar, P., Cimiano, P. (eds) Towards the Multilingual Semantic Web. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43585-4_10
DOI: https://doi.org/10.1007/978-3-662-43585-4_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-43584-7
Online ISBN: 978-3-662-43585-4
eBook Packages: Computer Science (R0)