Multilingual Extraction Ontologies

Towards the Multilingual Semantic Web

Abstract

The growth of multilingual web content and increasing internationalization portend the need for cross-language query processing. We offer ML-OntoES (a MultiLingual Ontology-based Extraction System) as a solution for narrow-domain, data-rich applications. Based on language-independent extraction ontologies (Embley et al. 2011a), ML-OntoES enables semantic search over domain-specific, semistructured information. Key ideas of ML-OntoES include (1) monolingual semantic indexing and query interpretation with extraction ontologies and (2) conceptual-level cross-language translation. A prototype implementation, along with experimental work showing good extraction accuracy in multiple languages, demonstrates the viability of using multilingual extraction ontologies for cross-language query processing.
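
The two key ideas can be pictured with a toy sketch. It is not the ML-OntoES implementation: the car-ad domain, the `LEXICONS` table, and the `extract` and `search` functions are all hypothetical names introduced here for illustration, and simple word lookup stands in for the much richer recognizers of a real extraction ontology. Each language maps its own surface terms to shared, language-neutral concept values, so indexing and query interpretation stay monolingual while matching happens at the conceptual level.

```python
import re

# Hypothetical lexicons: each language maps its own surface terms to
# language-neutral concept values. None of these names come from ML-OntoES.
LEXICONS = {
    "en": {
        "Color": {"red": "RED", "blue": "BLUE"},
        "Make": {"honda": "HONDA", "toyota": "TOYOTA"},
    },
    "fr": {
        "Color": {"rouge": "RED", "bleu": "BLUE", "bleue": "BLUE"},
        "Make": {"honda": "HONDA", "toyota": "TOYOTA"},
    },
}


def extract(text, lang):
    """Monolingual step: map surface terms in `text` to concept-level facts."""
    facts = {}
    tokens = re.findall(r"\w+", text.lower())
    for concept, terms in LEXICONS[lang].items():
        for token in tokens:
            if token in terms:
                facts[concept] = terms[token]
                break  # keep the first value found for this concept
    return facts


def search(query, query_lang, documents):
    """Interpret the query in its own language, then match at the concept level."""
    constraints = extract(query, query_lang)
    hits = []
    for doc_id, (lang, text) in documents.items():
        indexed = extract(text, lang)  # each document is indexed in its own language
        if constraints and all(indexed.get(c) == v for c, v in constraints.items()):
            hits.append(doc_id)
    return hits


documents = {
    "ad1": ("fr", "Honda Civic rouge, très bon état"),
    "ad2": ("fr", "Toyota Yaris bleue"),
    "ad3": ("en", "Blue Honda Accord, low mileage"),
}

# An English query retrieves a French ad; no text is machine-translated.
print(search("red Honda", "en", documents))  # prints ['ad1']
```

In this sketch the only cross-language step is the shared concept vocabulary (`RED`, `HONDA`), which is one simple reading of "conceptual-level translation"; the chapter itself should be consulted for how ML-OntoES actually realizes it.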

Notes

  1. See, for example, http://www.clef-initiative.eu.

  2. Similar to the linguistic grounding discussed in Buitelaar et al. (2009), but different in its details.

  3. By π and σ, we mean projection and selection, respectively.

  4. Our mapping typology here resonates with that of León-Araúz and Faber (this volume), though our lexical type inventory is not as finely articulated.

References

  • Aggarwal, N., Polajnar, T., & Buitelaar, P. (2013). Cross-lingual natural language querying over the web of data. In E. Métais, F. Meziane, M. Saraee, V. Sugumaran, & S. Vadera (Eds.), Natural Language Processing and Information Systems: 18th International Conference on Applications of Natural Language to Information Systems (NLDB 2013). Lecture Notes in Computer Science (Vol. 7934, pp. 152–163). New York: Springer.

  • Beach, M., Ely, W., & Vanderpoel, G. (1902). The Ely ancestry. New York: The Calumet Press. On-line book: http://www.archive.org/details/elyancestrylinea00beac.

  • Buitelaar, P., Cimiano, P., Haase, P., & Sintek, M. (2009). Towards linguistically grounded ontologies. In Proceedings of the 6th European Semantic Web Conference (ESWC’09), Heraklion, Greece (pp. 111–125).

  • Declerck, T., Krieger, H.-U., Thomas, S., Buitelaar, P., O’Riain, S., Wunner, T., et al. (2010). Ontology-based multilingual access to financial reports for sharing business knowledge across Europe (pp. 67–76). Budapest: Memolux Kft.

  • Dorr, B., Hovy, E., & Levin, L. (2006). Machine translation: Interlingual methods. In Natural language processing and machine translation. Encyclopedia of Language and Linguistics (2nd ed., pp. 383–394). Oxford, UK: Elsevier Ltd.

  • Embley, D. (1980). Programming with data frames for everyday data items. In Proceedings of the 1980 National Computer Conference, Anaheim, CA (pp. 301–305).

  • Embley, D., Liddle, S., & Lonsdale, D. (2011a). Conceptual modeling foundations for a web of knowledge. In D. Embley & B. Thalheim (Eds.), Handbook of conceptual modeling: Theory, practice, and research challenges (Chap. 15, pp. 477–516). Heidelberg, Germany: Springer.

  • Embley, D., Liddle, S., Lonsdale, D., Machado, S., Packer, T., Park, J., et al. (2011b). Enabling search for facts and implied facts in historical documents. In Proceedings of the International Workshop on Historical Document Imaging and Processing (HIP 2011), Beijing, China (pp. 59–66).

  • Embley, D., Liddle, S., Lonsdale, D., & Tijerino, Y. (2011c). Multilingual ontologies for cross-language information extraction and semantic search. In Proceedings of the 30th International Conference on Conceptual Modeling (ER 2011), Brussels, Belgium (pp. 147–160).

  • Feigenbaum, E. (1984). Knowledge engineering: The applied side of artificial intelligence. Annals of the New York Academy of Sciences, 426, 91–107.

  • Fu, B., Brennan, R., & O’Sullivan, D. (2012). A configurable translation-based cross-lingual ontology mapping system to adjust mapping outcomes. Web Semantics: Science, Services and Agents on the World-Wide Web, 15, 15–36.

  • Hoekstra, R. (2010). The knowledge reengineering bottleneck. Semantic Web—Interoperability, Usability, Applicability, 1, 1–5.

  • Lonsdale, D., Ding, Y., Embley, D., & Melby, A. (2002). Peppering knowledge sources with SALT: Boosting conceptual content for ontology generation. In Proceedings of the AAAI Workshop: Semantic Web Meets Language Resources: The 18th National Conference on Artificial Intelligence, Edmonton, AB, Canada (pp. 30–36).

  • Mahesh, K. (1996). Ontology development for machine translation: Ideology and methodology. Technical Report MCCS-96-292. Albuquerque, NM: Computing Research Laboratory, University of New Mexico.

  • Montiel-Ponsoda, E., de Cea, G. A., Gómez-Pérez, A., & Peters, W. (2011). Enriching ontologies with multilingual information. Natural Language Engineering, 17(3), 283–309.

  • Olive, J., Christianson, C., & McCary, J. (Eds.). (2011). Handbook of natural language processing and machine translation: DARPA global autonomous language exploitation. New York: Springer.

  • Packer, T., & Embley, D. (2013). Cost effective ontology population with data from lists in OCRed historical documents. In Proceedings of the 2nd International Workshop on Historical Document Imaging and Processing (HIP 2013), Washington, DC, USA (pp. 1–8).

  • Peters, C., Braschler, M., & Clough, P. (2012). Multilingual information retrieval: From research to practice. New York: Springer.

  • Tijerino, Y. (2010). Cross-cultural and cross-lingual ontology engineering. In Proceedings of the 2010 Workshop on Cross-Cultural and Cross-Lingual Aspects of the Semantic Web, Shanghai, China (pp. 44–53).

Acknowledgments

We are grateful to Tae Woo Kim and Rebecca Brinck for annotating our Korean and French document sets. We are also grateful to the reviewers for their insightful feedback and challenging questions.

Author information

Corresponding author

Correspondence to David W. Embley.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Embley, D.W., Liddle, S.W., Lonsdale, D.W., Shin, BJ., Tijerino, Y. (2014). Multilingual Extraction Ontologies. In: Buitelaar, P., Cimiano, P. (eds) Towards the Multilingual Semantic Web. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43585-4_10

  • DOI: https://doi.org/10.1007/978-3-662-43585-4_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-43584-7

  • Online ISBN: 978-3-662-43585-4

  • eBook Packages: Computer Science, Computer Science (R0)
