Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

Region Algebra

  • Matthew Young-Lai
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_304


A region algebra is a collection of operators, each of which returns a set of regions as a result and takes as arguments one or more sets of regions. A region of a string is a pair of natural number positions (s, e) that correspond to the substring starting at s and ending at e. A position is a count of bytes, characters, or words from the beginning of the string.
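As a minimal sketch of this definition, a region can be modeled as a pair of positions over a string. The names `Region` and `text_of` are hypothetical, and the sketch assumes positions count characters, start at 0, and that the end position e is inclusive; the definition above leaves these choices open.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """A region (s, e) of a string; both positions are assumed inclusive."""
    start: int
    end: int


def text_of(region: Region, s: str) -> str:
    """Return the substring of s that the region corresponds to."""
    # Python slices exclude the upper bound, so add 1 to include position e.
    return s[region.start:region.end + 1]


doc = "region algebra"
first_word = Region(0, 5)
second_word = Region(7, 13)

print(text_of(first_word, doc))   # region
print(text_of(second_word, doc))  # algebra
```

Counting words or bytes instead of characters would change only how positions map onto the string, not the shape of the pair.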

Choosing a set of operators defines a particular region algebra. Operators are chosen for efficiency as well as utility. For example, if regions correspond to structure elements such as chapters and sections in a document, then many operators for querying structural conditions are useful and can be implemented efficiently. One example of such an operator is containedIn(X,Y), which takes two sets of regions X and Y and returns the subset of regions in X that are contained in some region of Y, i.e., {(s_x, e_x) ∈ X ∣ ∃(s_y, e_y) ∈ Y : (s_x ≥ s_y) ∧ (e_x ≤ e_y)}. A similar operator is contains(X,Y) which returns the subset of regions...
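The containedIn operator can be illustrated with a direct, unoptimized transcription of its set definition: a region of X is kept when some region of Y starts at or before it and ends at or after it. The names `Region` and `contained_in` and the sample chapter and section positions are hypothetical, and this quadratic-time version is only a sketch, not the efficient implementation the entry alludes to.

```python
from typing import NamedTuple, Set


class Region(NamedTuple):
    """A region (s, e) given by its start and end positions."""
    start: int
    end: int


def contained_in(xs: Set[Region], ys: Set[Region]) -> Set[Region]:
    """Return the regions of xs that lie inside at least one region of ys."""
    return {
        x
        for x in xs
        # x is contained in y when y starts no later and ends no earlier.
        if any(y.start <= x.start and x.end <= y.end for y in ys)
    }


# Hypothetical positions for two chapters and three sections.
chapters = {Region(0, 99), Region(100, 199)}
sections = {Region(10, 40), Region(95, 120), Region(150, 199)}

result = contained_in(sections, chapters)
print(sorted(result))
```

Here the section spanning positions 95 to 120 straddles the chapter boundary, so it is not contained in either chapter and is dropped from the result.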


Recommended Reading

  1. Burkowski FJ. Retrieval activities in a database consisting of heterogeneous collections of structured text. In: Proceedings of the 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 1992. p. 112–25.
  2. Burkowski FJ. An algebra for hierarchical organized text-dominated databases. Inf Process Manag. 1994;28(3):313–24.
  3. Clarke CLA, Cormack GV, Burkowski FJ. An algebra for structured text search and a framework for its implementation. Comput J. 1995;38(1):43–56.
  4. Consens MP, Milo T. Algebras for querying text regions. In: Proceedings of the 14th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems; 1995. p. 11–22.
  5. Consens MP, Milo T. Algebras for querying text regions: expressive power and optimization. J Comput Syst Sci. 1998;57(3):272–88.
  6. Cormack GV, Clarke CLA, Palmer CR, Good RC. The multitext retrieval system (demonstration abstract). In: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 1999. p. 334.
  7. Jaakkola J, Kilpeläinen P. Using sgrep for querying structured text files. In: Proceedings of Standard Generalized Markup Language Finland 1996, Saarela J, editor. 1996. p. 56–67. Available as Report C-1996-83, Department of Computer Science, University of Helsinki, Nov 1996.
  8. Jaakkola J, Kilpeläinen P. Nested text-region algebra. Technical Report C-1999-2, Department of Computer Science, University of Helsinki, Jan 1999.
  9. Miller RC. Lightweight structure in text. Ph.D thesis, School of Computer Science, Carnegie Mellon University, 2002.
  10. Salminen A, Tompa F. PAT expressions: an algebra for text search. Acta Linguistica Hungarica. 1992;41(1–4):277–306.
  11. Young-Lai M. Text structure recognition using a region algebra. Ph.D thesis, Department of Computer Science, University of Waterloo, 2000.
  12. Young-Lai M, Tompa FW. One-pass evaluation of region algebra expressions. Inf Syst. 2003;28(3):159–68.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Sybase iAnywhere, Waterloo, Canada
  2. Google, Inc., Mountain View, USA

Section editors and affiliations

  • Frank Tompa (1)
  1. David R. Cheriton School of Computer Science, Univ. of Waterloo, Waterloo, Canada