OOV Term Translation, Context Information and Definition Extraction Based on OOV Term Type Prediction

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7614)

Abstract

Although many approaches exist for the out-of-vocabulary (OOV) term translation problem, they cannot handle different types of OOV terms, especially hybrid translations such as “Kenny-Caffey syndrome (Kenny-Caffey 氏症候群)”. We propose a novel integrated ranking approach that considers the type of an OOV term before translating it, so that different types of OOV terms can be translated differently. Furthermore, the translations mined in other languages are themselves OOV terms, and none of the existing approaches offers context information or definitions for them, so users without specialist knowledge cannot easily understand what the terms mean. Our integrated ranking approach therefore also extracts monolingual definitions and multilingual context information for OOV terms. Moreover, we propose a novel adaptive rules approach with Bayesian net and AdaBoost for handling hybrid translations. Experiments show that our approach performs better than existing approaches.
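To make the notion of a hybrid translation concrete, the following minimal sketch flags candidate translations that mix Latin letters with CJK ideographs, as in the example “Kenny-Caffey 氏症候群”. It is not the adaptive rules approach described in the abstract (which uses a Bayesian net and AdaBoost); the function names `scripts_in` and `is_hybrid_translation` and the script-mixing heuristic itself are assumptions made purely for illustration.

```python
import unicodedata


def scripts_in(text):
    """Return the coarse scripts ("latin", "cjk") found among the letters of text.

    Hypothetical helper: script membership is inferred from Unicode
    character names, which is a rough heuristic, not the paper's method.
    """
    scripts = set()
    for ch in text:
        if not ch.isalpha():
            continue  # skip hyphens, spaces, digits, punctuation
        name = unicodedata.name(ch, "")
        if name.startswith("CJK UNIFIED IDEOGRAPH"):
            scripts.add("cjk")
        elif name.startswith("LATIN"):
            scripts.add("latin")
    return scripts


def is_hybrid_translation(candidate):
    """True if the candidate mixes Latin letters and CJK ideographs."""
    return {"latin", "cjk"} <= scripts_in(candidate)


if __name__ == "__main__":
    for term in ["Kenny-Caffey 氏症候群", "Kenny-Caffey syndrome", "症候群"]:
        print(term, is_hybrid_translation(term))
```

A check of this kind could serve at most as one input feature to a type predictor; how the paper actually predicts term types is not stated in the abstract.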



Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Qu, J., Shimazu, A., Le Nguyen, M. (2012). OOV Term Translation, Context Information and Definition Extraction Based on OOV Term Type Prediction. In: Isahara, H., Kanzaki, K. (eds) Advances in Natural Language Processing. JapTAL 2012. Lecture Notes in Computer Science, vol 7614. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33983-7_8

  • DOI: https://doi.org/10.1007/978-3-642-33983-7_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-33982-0

  • Online ISBN: 978-3-642-33983-7

  • eBook Packages: Computer Science, Computer Science (R0)
