Illinois CCG LoReHLT 2016 named entity recognition and situation frame systems

Machine Translation

Abstract

This paper describes the Illinois Cognitive Computation Group’s system for the 2016 NIST Low Resource Human Language Technology (LoReHLT) evaluation, in which the target language is Uyghur. We participate in two tasks: named entity recognition (NER) and situation frame (SF). For NER, we develop two models. The first is a rule-based model built on knowledge obtained by inspecting the monolingual documents, reading a Uyghur grammar book, and interacting with native informants. The second is a transfer model trained on labeled Uzbek data. Combining the outputs of these two models yields a significant improvement and achieves an F1 score of 60.4 on the official evaluation set. For the new SF task, we apply the dataless classification technique to build an English classifier for eight situation types, and use a Uyghur-to-English dictionary to translate the Uyghur documents. Using this classifier, we propose two frameworks for grounding situations to the locations mentioned in the text.
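As an illustration of the SF pipeline outlined in the abstract, the following is a minimal sketch of dictionary translation followed by dataless classification. It is not the authors' implementation. Dataless classification as usually described compares documents and label descriptions in a richer semantic space; this sketch substitutes plain word-count vectors and cosine similarity to stay self-contained. The label names, label descriptions, dictionary entries, and function names are all hypothetical placeholders, and only three labels are shown rather than the eight situation types.

```python
import math
from collections import Counter

# Hypothetical label descriptions (assumption: the abstract does not list
# the eight situation types, so these three are placeholders).
LABEL_DESCRIPTIONS = {
    "food": "food hunger famine meal supply",
    "water": "water drinking thirst supply clean",
    "shelter": "shelter tent housing homeless",
}

# Toy stand-in for a Uyghur-to-English dictionary. The keys are placeholder
# tokens, not real Uyghur words.
TOY_DICTIONARY = {
    "src_w1": ["water"],
    "src_w2": ["drinking"],
    "src_w3": ["tent", "shelter"],
}


def translate(tokens, dictionary):
    """Word-by-word lookup; tokens missing from the dictionary are dropped."""
    english = []
    for token in tokens:
        english.extend(dictionary.get(token, []))
    return english


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(tokens, dictionary=TOY_DICTIONARY, labels=LABEL_DESCRIPTIONS):
    """Return (best_label, scores); best_label is None if nothing matches.

    No labeled training documents are used: each label is represented only
    by the words in its description.
    """
    doc_vec = Counter(translate(tokens, dictionary))
    scores = {
        label: cosine(doc_vec, Counter(description.split()))
        for label, description in labels.items()
    }
    best = max(scores, key=scores.get)
    return (best if scores[best] > 0 else None), scores


if __name__ == "__main__":
    label, scores = classify(["src_w1", "src_w2", "unknown_token"])
    print(label, scores)
```

The property this sketch shares with the described approach is that the classifier operates entirely on the English side, so the only Uyghur-specific resource it needs is the dictionary.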

Notes

  1. The incident in this evaluation is the 2008 Sichuan earthquake.

Acknowledgements

This work was supported by Contract HR0011-15-2-0025 with the US Defense Advanced Research Projects Agency (DARPA). Approved for Public Release, Distribution Unlimited. The views expressed are those of the authors and do not reflect the official policy or position of the Department of Defense or the U.S. Government.

Author information

Corresponding author

Correspondence to Chen-Tse Tsai.

About this article

Cite this article

Tsai, CT., Mayhew, S., Song, Y. et al. Illinois CCG LoReHLT 2016 named entity recognition and situation frame systems. Machine Translation 32, 91–103 (2018). https://doi.org/10.1007/s10590-017-9211-5

  • DOI: https://doi.org/10.1007/s10590-017-9211-5
