Abstract
In this methodological paper, I review a number of corpus linguistics studies that rely heavily on off-the-shelf computer programs known as concordancers. While acknowledging the fruitful research findings that concordancers have generated, I argue that natural language processing (NLP) tools such as the Stanford Parser and SyntaxNet should be used to automate certain analytical procedures that corpus linguistics researchers often perform manually with concordancers. I call for more collaboration between NLP researchers and corpus linguists to help advance the field of corpus linguistics into a post-concordancer era.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Wang, S.H. (2017). Text Analysis of Corpus Linguistics in a Post-concordancer Era. In: Wu, TT., Gennari, R., Huang, YM., Xie, H., Cao, Y. (eds) Emerging Technologies for Education. SETE 2016. Lecture Notes in Computer Science, vol 10108. Springer, Cham. https://doi.org/10.1007/978-3-319-52836-6_41
DOI: https://doi.org/10.1007/978-3-319-52836-6_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-52835-9
Online ISBN: 978-3-319-52836-6
eBook Packages: Computer Science (R0)