Encyclopedia of Database Systems

2018 Edition
| Editors: Ling Liu, M. Tamer Özsu

Unicode

  • Ethan V. Munson
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_5045

Definition

Unicode is an international standard for representing text characters. Unicode supports the scripts of all human languages with any substantial level of use and has a flexible design that is capable of supporting all known human languages and all of their variant scripts. The development of the Unicode standard is coordinated by the Unicode Consortium.

Key Points

Unicode’s development is motivated by the need to encode characters in all languages without conflicts between the encodings for different languages. Obviously, the achievement of this goal is fraught with technical and political complexities.

Unicode has several different encodings. The most widely used is the 8-bit, variable-width UTF-8 encoding, which permits the encoding of many European languages in an efficient 1-byte form and is backward compatible with both the ASCII and ISO-8859-1 character sets. UTF-16 is a 16-bit, variable-width encoding that is more suitable to languages with many characters, such as...

This is a preview of subscription content, log in to check access.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. 1.Department of EECSUniversity of Wisconsin-MilwaukeeMilwaukeeUSA

Section editors and affiliations

  • Frank Tompa
    • 1
  1. 1.David R. Cheriton School of Computer ScienceUniversity of WaterlooWaterlooCanada