Text Analysis of Corpus Linguistics in a Post-concordancer Era

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10108)

Abstract

In this methodological paper, I review a number of studies in corpus linguistics that rely heavily on off-the-shelf computer programs known as concordancers. While acknowledging the fruitful research findings that concordancers have generated, I argue that natural language processing (NLP) tools such as the Stanford Parser and SyntaxNet should be used to automate certain analytical procedures that corpus linguistics researchers often perform manually with concordancers. I call for more collaboration between NLP researchers and corpus linguists to help advance the field of corpus linguistics into a post-concordancer era.
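
For readers unfamiliar with concordancers, the core operation is a keyword-in-context (KWIC) search. The following is a minimal sketch in Python, not the paper's method or any particular tool's implementation; the function name kwic, the width parameter, and the sample text are all illustrative assumptions.

```python
import re

def kwic(text, keyword, width=30):
    """Return keyword-in-context lines for each whole-word match of `keyword`.

    Hypothetical helper for illustration; real concordancers add sorting,
    corpus indexing, and frequency statistics on top of this.
    """
    lines = []
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for m in pattern.finditer(text):
        # Slice a fixed-width window of characters on each side of the match.
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append(f"{left:>{width}} [{m.group(0)}] {right:<{width}}")
    return lines

# Invented sample text (not drawn from any corpus discussed in the paper).
sample = ("We argue that corpora help learners. "
          "It is suggested that parsers automate this step. "
          "That claim is disputed.")

hits = kwic(sample, "that", width=20)
for line in hits:
    print(line)
```

A search like this returns every occurrence of the word form, so the analyst still has to read each line and decide by hand whether a given "that" introduces a complement clause or is a demonstrative. This manual classification is the kind of step that, on the abstract's argument, a syntactic parser could automate.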

Author information

Corresponding author

Correspondence to Simon Ho Wang.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Wang, S.H. (2017). Text Analysis of Corpus Linguistics in a Post-concordancer Era. In: Wu, TT., Gennari, R., Huang, YM., Xie, H., Cao, Y. (eds) Emerging Technologies for Education. SETE 2016. Lecture Notes in Computer Science, vol 10108. Springer, Cham. https://doi.org/10.1007/978-3-319-52836-6_41

  • DOI: https://doi.org/10.1007/978-3-319-52836-6_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-52835-9

  • Online ISBN: 978-3-319-52836-6

  • eBook Packages: Computer Science, Computer Science (R0)
