Unicode and Natural Language

Abstract

Text that computers deal with tends to fall into two categories: text meant to be consumed by humans (like prose), and text meant to be consumed by software (machine code and encrypted files come to mind).

Keywords

  • Codepoint
  • Basic Latin
  • Code Point
  • Graph Clustering
  • Writing Systems

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

  1. http://unicode.org/reports/tr29/

  2. https://en.wikipedia.org/wiki/Eastern_Arabic_numerals

  3. https://docs.perl6.org/language/regexes#Unicode_properties

Copyright information

© 2017 Moritz Lenz

About this chapter

Cite this chapter

Lenz, M. (2017). Unicode and Natural Language. In: Parsing with Perl 6 Regexes and Grammars. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-3228-6_12
