Corpus Linguistics and Statistics with R

Introduction to Quantitative Methods in Linguistics

  • Guillaume Desagulier

About this book


This textbook examines empirical linguistics from a theoretical linguist’s perspective. It combines a theoretical discussion of what quantitative corpus linguistics entails with detailed, hands-on, step-by-step instructions for implementing its techniques. The statistical methodology and R code presented in the book teach readers basic and then more advanced skills for working with large data sets in their linguistics research and studies. Massive data sets are now, more than ever, the basis for work that ranges from usage-based linguistics to the far reaches of applied linguistics. Much of the methodology is presented within a corpus-based approach, but these corpus-based methods are also essential components of recent developments in sociolinguistics, historical linguistics, computational linguistics, and psycholinguistics. The material will also appeal to researchers in digital humanities and in the many non-linguistic fields that use textual data analysis and text-based sensorimetrics. Chapters cover topics including corpus processing, frequency data, and clustering methods. Case studies illustrate each chapter, with accompanying data sets, R code, and exercises for readers. The book may be used in advanced undergraduate courses, graduate courses, and self-study.


Keywords: R package linguistics; categorical data; clustering methods; data organization; frequency data; linguistics with R; modeling; quantitative methods for linguistics; regression methods; statistics for linguistics; textual data analysis

Authors and affiliations

  • Guillaume Desagulier (1)
  1. Université Paris 8, Saint Denis, France

Bibliographic information

  • Copyright Information: Springer International Publishing AG 2017
  • Publisher Name: Springer, Cham
  • eBook Packages: Mathematics and Statistics
  • Print ISBN: 978-3-319-64570-4
  • Online ISBN: 978-3-319-64572-8
  • Series Print ISSN: 2199-0956
  • Series Online ISSN: 2199-0964