A Bilingual Dictionary Extracted from the Wikipedia Link Structure
Many bilingual dictionaries have been released on the WWW. However, these dictionaries insufficiently cover new and domain-specific terminology. In our demonstration, we present a dictionary constructed by analyzing the link structure of Wikipedia, a large-scale encyclopedia containing a large number of links between articles in different languages. We analyzed not only these interlanguage links but also extracted further translation candidates from redirect pages and link text information. In an experiment, we demonstrated the advantages of our dictionary over manually created dictionaries as well as over bilingual terminology extracted from parallel corpora.
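The three sources of translation candidates named above (interlanguage links, redirect pages, and link text) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `translation_candidates`, the plain-dictionary input format, and the toy German-English data are all assumptions made for illustration, and the sketch omits any scoring or filtering of candidates.

```python
from collections import defaultdict


def translation_candidates(interlanguage, redirects, link_texts):
    """Collect target-language candidates for each source-language title.

    All three inputs are hypothetical, simplified views of Wikipedia data:
      interlanguage: source-language title -> target-language title
      redirects:     target-language redirect title -> title it redirects to
      link_texts:    target-language title -> anchor texts used to link to it
    """
    # Invert the redirect map so every article knows its alternative titles.
    redirects_to = defaultdict(set)
    for alias, target in redirects.items():
        redirects_to[target].add(alias)

    candidates = defaultdict(set)
    for source, target in interlanguage.items():
        # 1. The interlanguage link itself gives the primary candidate.
        candidates[source].add(target)
        # 2. Redirect pages pointing at the target give alternative titles.
        candidates[source].update(redirects_to.get(target, ()))
        # 3. Link (anchor) texts pointing at the target give further variants.
        candidates[source].update(link_texts.get(target, ()))

    return {source: sorted(found) for source, found in candidates.items()}


# Toy data, invented for this example only.
example = translation_candidates(
    interlanguage={"Hund": "Dog"},
    redirects={"Domestic dog": "Dog"},
    link_texts={"Dog": ["dogs", "canine"]},
)
print(example)
```

Keeping the candidates as a set per source term means a variant that appears both as a redirect title and as a link text is counted once; a real system would presumably also need to rank or filter noisy link texts.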
Keywords: Machine Translation, Link Structure, Parallel Corpus, Bilingual Dictionary, Target Page