Building a parallel corpus of English/Panjabi

  • Chapter
Parallel Text Processing

Part of the book series: Text, Speech and Language Technology ((TLTB,volume 13))

Abstract

This chapter is concerned primarily with the development of new parallel corpora, specifically for English paired with Indic languages. The discussion focuses on Panjabi, though the issues explored apply almost equally to other Indic languages and scripts. We highlight a range of difficulties facing those who construct corpus resources for the exploration of these languages, especially parallel corpora. To this end, we describe two corpora, one of 16th-century Panjabi and one of modern Panjabi, and briefly present some preliminary work on English/Panjabi alignment.

Copyright information

© 2000 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Singh, S., McEnery, T., Baker, P. (2000). Building a parallel corpus of English/Panjabi. In: Véronis, J. (ed.) Parallel Text Processing. Text, Speech and Language Technology, vol 13. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2535-4_17

  • DOI: https://doi.org/10.1007/978-94-017-2535-4_17

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-90-481-5555-2

  • Online ISBN: 978-94-017-2535-4

  • eBook Packages: Springer Book Archive
