References
Allan, K. (1977). Classifiers. Language, 53(2), 285–311.
Bird, S., & Simons, G. (2003). Seven dimensions of portability for language documentation and description. Language, 79(4), 557–582.
Bond, F., & Paik, K. (2000). Reusing an ontology to generate numerical classifiers. In Proceedings of the 18th International Conference on Computational Linguistics (COLING 2000), pp. 90–96.
Brants, T., & Franz, A. (2006). Web 1T 5-gram Version 1. LDC Catalog No. LDC2006T13.
Butt, M., & King, T. H. (2007). Urdu in a parallel grammar development environment. Language Resources and Evaluation, 41(2), 191–207.
Clarke, C., Craswell, N., & Soboroff, I. (2004). Overview of the TREC 2004 terabyte track. In Proceedings of the 13th Text Retrieval Conference (TREC 2004).
Huang, C.-R., Tokunaga, T., & Lee, S. Y. M. (2006). Special issue on Asian language processing: state-of-the-art resources and processing. Language Resources and Evaluation, 40(3–4).
Kilgarriff, A. (2007). Googleology is bad science. Computational Linguistics, 33(1), 147–151.
Kilgarriff, A., & Grefenstette, G. (2003). Introduction to the special issue on the web as corpus. Computational Linguistics, 29(3), 333–347.
Nakamura, J., & Nagao, M. (1988). Extraction of semantic information from an ordinary English dictionary and its evaluation. In Proceedings of the 12th International Conference on Computational Linguistics (COLING 1988), pp. 459–464.
Naseem, T., & Hussain, S. (2007). A novel approach for ranking spelling error corrections for Urdu. Language Resources and Evaluation, 41(2), 117–128.
Pantel, P., & Pennacchiotti, M. (2006). Espresso: leveraging generic patterns for automatically harvesting semantic relations. In Proceedings of the 21st International Conference on Computational Linguistics/the 44th Annual Meeting of the Association for Computational Linguistics (COLING/ACL 2006), pp. 113–120.
Ringlstetter, C., Schulz, K. U., & Mihov, S. (2006). Orthographic errors in web pages: toward cleaner web corpora. Computational Linguistics, 32(3), 295–340.
Shirai, K., Tokunaga, T., Huang, C.-R., Hsieh, S.-K., Kuo, T.-Y., Sornlertlamvanich, V., & Charoenporn, T. (2008). Constructing taxonomy of numerative classifiers for Asian languages. In Proceedings of the 3rd International Joint Conference on Natural Language Processing (IJCNLP 2008), pp. 397–402.
Tanaka, K., & Iwasaki, H. (1996). Extraction of lexical translations from non-aligned corpora. In Proceedings of the 16th International Conference on Computational Linguistics (COLING 1996), pp. 580–585.
Tsurumaru, H., Hitaka, T., & Yoshida, S. (1986). An attempt to automatic thesaurus construction from an ordinary Japanese language dictionary. In Proceedings of the 11th International Conference on Computational Linguistics (COLING 1986), pp. 445–447.
Resources
British National Corpus. http://www.natcorp.ox.ac.uk/.
Brown Corpus. http://icame.uib.no/brown/bcm.html.
Cobuild Project. http://www.collins.co.uk/corpus/CorpusSearch.aspx.
Sinica Corpus. http://www.sinica.edu.tw/SinicaCorpus.
Chinese Gigaword. http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2003T09.
English Gigaword. http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2003T05.
Tagged Chinese Gigaword. http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2007T03.
Acknowledgements
We would like to thank all the authors, who submitted 74 papers on a wide range of research topics on Asian languages. We had the privilege of going through all these papers and wish that the full range of resources and topics could have been presented. We would also like to thank all the reviewers, whose prompt action and helpful comments helped us work through the submitted papers. We would like to thank AFNLP for its support of the initiative to promote Asian language processing. Various colleagues helped us process the papers, including Dr. Sara Goggi at CNR-Italy and Liwu Chen at Academia Sinica. Finally, we would like to thank four people at LRE and Springer who made this special issue possible. Without the generous support of the chief editors Nancy Ide and Nicoletta Calzolari, this volume would not have been possible. In addition, without the diligent work of both Estella La Jappon and Jenna Cataluna at Springer, we would never have been able to negotiate all the steps of publication. For this introductory chapter, we would like to thank Kathleen Ahrens, Nicoletta Calzolari, and Nancy Ide for their detailed comments. Any remaining errors are, of course, ours.
Cite this article
Tokunaga, T., Huang, CR. & Lee, S.Y.M. Asian language resources: the state-of-the-art. Lang Resources & Evaluation 42, 109–116 (2008). https://doi.org/10.1007/s10579-008-9071-y
DOI: https://doi.org/10.1007/s10579-008-9071-y