Utility and Application of Language Corpora

  • Niladri Sekhar Dash
  • L. Ramamoorthy

Table of contents

  1. Front Matter
    Pages i-xxx
  2. Niladri Sekhar Dash, L. Ramamoorthy
    Pages 1-16
  3. Niladri Sekhar Dash, L. Ramamoorthy
    Pages 17-34
  4. Niladri Sekhar Dash, L. Ramamoorthy
    Pages 35-56
  5. Niladri Sekhar Dash, L. Ramamoorthy
    Pages 57-71
  6. Niladri Sekhar Dash, L. Ramamoorthy
    Pages 73-90
  7. Niladri Sekhar Dash, L. Ramamoorthy
    Pages 91-103
  8. Niladri Sekhar Dash, L. Ramamoorthy
    Pages 105-119
  9. Niladri Sekhar Dash, L. Ramamoorthy
    Pages 121-138
  10. Niladri Sekhar Dash, L. Ramamoorthy
    Pages 139-153
  11. Niladri Sekhar Dash, L. Ramamoorthy
    Pages 155-172
  12. Niladri Sekhar Dash, L. Ramamoorthy
    Pages 173-191
  13. Niladri Sekhar Dash, L. Ramamoorthy
    Pages 193-217
  14. Niladri Sekhar Dash, L. Ramamoorthy
    Pages 219-236
  15. Niladri Sekhar Dash, L. Ramamoorthy
    Pages 237-249
  16. Niladri Sekhar Dash, L. Ramamoorthy
    Pages 251-266
  17. Back Matter
    Pages 267-290

About this book

Introduction

This book discusses basic issues in corpus generation and the methods normally used to generate a corpus. Since corpus-related research goes beyond corpus generation, the book also addresses other major topics connected with the use and application of language corpora: corpus readiness in the context of corpus sanitation and pre-editing of corpus texts, the application of statistical methods, and various text processing techniques. It also explores how corpora can be used as a primary or secondary resource in English language teaching, in creating dictionaries, in word sense disambiguation, in various language technologies, and in other branches of linguistics. Lastly, the book sheds light on the current state of corpus generation in Indian languages and identifies current and future needs.

The book discusses technical issues in the field in a lucid manner, provides extensive new diagrams and charts for easy comprehension, and uses simplified English, making it an ideal resource for non-native English readers. Written by academics with many years of experience teaching and researching corpus linguistics, it focuses on Indian languages and on English corpora, which makes it relevant to graduate and postgraduate students of applied linguistics, computational linguistics and language processing in South Asia and in other countries where English is spoken as a first or second language.

Keywords

  • digital speech corpora
  • monolingual corpora
  • issue relating to text corpus generation
  • skewedness and imbalance in corpus generation
  • corpus sanitation
  • statistical approaches for processing a corpus
  • use of corpus as primary resource in ELT
  • application to bilingual work
  • bilingual dictionary creation
  • use in studying dialects
  • word sense disambiguation
  • corpora use in linguistics

Authors and affiliations

  • Niladri Sekhar Dash (1)
  • L. Ramamoorthy (2)

  1. Indian Statistical Institute, Linguistic Research Unit, Kolkata, India
  2. Linguistic Data Consortium-Indian Languages, Central Institute of Indian Languages, Mysore, India

Bibliographic information

  • DOI https://doi.org/10.1007/978-981-13-1801-6
  • Copyright Information Springer Nature Singapore Pte Ltd. 2019
  • Publisher Name Springer, Singapore
  • eBook Packages Social Sciences
  • Print ISBN 978-981-13-1800-9
  • Online ISBN 978-981-13-1801-6