Text Analysis of Corpus Linguistics in a Post-concordancer Era

Wang, Simon Ho

doi:10.1007/978-3-319-52836-6_41

Simon Ho Wang

Conference paper
First Online: 19 February 2017

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10108)

Abstract

In this methodological paper, I review a number of studies in corpus linguistics that rely heavily on off-the-shelf computer programs known as concordancers. While acknowledging the fruitful research findings generated using concordancers, I argue that natural language processing (NLP) tools such as the Stanford parser and SyntaxNet should be used to automate certain analytical procedures that corpus linguistics researchers often perform manually with concordancers. I call for more collaboration between NLP researchers and corpus linguists to help advance the field of corpus linguistics into a post-concordancer era.

Author information

Authors and Affiliations

Language Center, Hong Kong Baptist University, Kowloon Tong, Kowloon, Hong Kong
Simon Ho Wang

Corresponding author

Correspondence to Simon Ho Wang.

Editor information

Editors and Affiliations

National Yunlin University of Science and Technology, Yunlin, Taiwan
Ting-Ting Wu
Free University of Bozen-Bolzano, Rome, Italy
Rosella Gennari
National Cheng-Kung University, Tainan, Taiwan
Yueh-Min Huang
The Education University of Hong Kong, Hong Kong, Hong Kong
Haoran Xie
MC Information Multimedia Communication AG, Saarbrücken, Germany
Yiwei Cao

Cite this paper

Wang, S.H. (2017). Text Analysis of Corpus Linguistics in a Post-concordancer Era. In: Wu, TT., Gennari, R., Huang, YM., Xie, H., Cao, Y. (eds) Emerging Technologies for Education. SETE 2016. Lecture Notes in Computer Science, vol 10108. Springer, Cham. https://doi.org/10.1007/978-3-319-52836-6_41

DOI: https://doi.org/10.1007/978-3-319-52836-6_41
Published: 19 February 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-52835-9
Online ISBN: 978-3-319-52836-6
eBook Packages: Computer Science, Computer Science (R0)
