Chapter 3: Strings and Characters

Abstract

Whereas “computing” was mostly about numbers in its earlier days, strings and characters are ubiquitous now—just think about XML and Internet protocols like HTTP. In Common Lisp, characters, as well as strings, are first-class data types.

Notes

  1. This is also called “Latin u with diaeresis.”

  2. For more about character names, see Recipe 3-2.

  3. Meaning that different characters have different character codes.

  4. Modulo character attributes, actually. But that’s already too much detail…

  5. So the CHAR-CODE function is injective, but not surjective.

  6. For how to change the syntax of Common Lisp, see Chapter 8.

  7. CL-Unicode won’t magically extend your Lisp’s supply of characters. It’ll just provide it with Unicode information about the characters it already has.

  8. See http://en.wikipedia.org/wiki/UTF-8.

  9. That an implementation supports a specific character encoding doesn’t necessarily imply that it supports all characters this encoding can encode (see Recipe 3-1).

  10. Note that while EQ might accidentally work for you, it is nothing you should rely on. See Recipe 10-1 for more on this.

  11. The technical reason is that strings are compound objects (vectors) internally and two strings will likely be two different vectors, although they might denote the same sequence of characters. See Recipe 10-1 for more on this.

  12. This is, of course, fine because every value that is not NIL is a true value in Common Lisp.

  13. For example, in Swedish, z comes before ö, whereas in German, it’s the other way around. And even in German, dictionaries and phone books disagree about whether of or öf comes first.

  14. See http://www.unicode.org/reports/tr10/ for some interesting examples.

  15. See https://en.wikipedia.org/wiki/Long_s.

  16. In 13.1.4.3, in case you want to look it up.

  17. See Recipe 3-2 for the (implementation-dependent) meaning of #\U+00DC.

  18. More about how you can change the syntax of Common Lisp in Chapter 8.

  19. Well, of course the standard technically doesn’t require this, but you can be very sure that every self-respecting Lisp implementation does this.

  20. The HyperSpec defines exactly what a “word” means in this context, but it’s pretty intuitive and very likely does what you mean.

  21. See Chapter 5.

  22. See https://en.wikipedia.org/wiki/Locale.

  23. See also Recipe 7-5.

  24. It has to be a bounding index, of course, which is to say that it must not be greater than the length of the string.

  25. These functions make certain assumptions about the ordering of the characters, which aren’t necessarily portable between all theoretical Common Lisp implementations. For example, it would be perfectly legal if (char<= #\a #\B #\c) returned true or if (code-char (1+ (char-code #\A))) weren’t #\B. However, for all current Common Lisp implementations, it is safe to assume that the examples will work as intended (see Recipe 3-1).

  26. These terms are explained in Chapter 5.

  27. See Recipe 7-8, for example.

  28. See http://www.gigamonkeys.com/book/a-few-format-recipes.html.

  29. See Chapter 8 for the mechanics of modifying the syntax of Common Lisp.

  30. See https://tools.ietf.org/html/rfc4180.

  31. Or have a look at http://xach.com/rpw3/articles/2qydnU8FD8--B0CiXTWc-w@speakeasy.net.html.

Author information

Corresponding author

Correspondence to Edmund Weitz.

Electronic Supplementary Material

Below is the link to the electronic supplementary material.

chapter-03 (zip 3 kb)

Copyright information

© 2016 Edmund Weitz

About this chapter

Cite this chapter

Weitz, E. (2016). Chapter 3: Strings and Characters. In: Common Lisp Recipes. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-1176-2_3
