Synonyms
Retrieval Models for Text Databases
Definition
Structured text retrieval models provide a formal definition or mathematical framework for querying semi-structured textual databases. A textual database contains both content and structure. The content is the text itself, and the structure divides the database into separate textual parts and relates those textual parts by some criterion. Often, textual databases can be represented as marked-up text, for instance, as XML, where the XML elements define the structure on the text content. Retrieval models for textual databases should comprise three parts: (i) a model of the text, (ii) a model of the structure, and (iii) a query language [4]. The model of the text defines a tokenization into words or other semantic units, as well as stop words, stemming, synonyms, etc. The model of the structure defines parts of the text, typically a contiguous portion of the text called an element, region, or segment, which is defined on top of the...
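As a rough illustration of the first two parts named above, the following Python sketch tokenizes a small marked-up text and records each element as a contiguous region of token positions. It is only a sketch under stated assumptions, not the model of any cited system: the names `STOP_WORDS`, `tokenize`, `Region`, `parse`, and `regions_containing` are hypothetical, the stop-word list is arbitrary, the markup is assumed to be well-formed tags without attributes, and the single containment function stands in for a query language only in the loosest sense.

```python
import re
from dataclasses import dataclass

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "the", "of", "and"}


def tokenize(text):
    """Model of the text: lowercase word tokens with stop words dropped."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


@dataclass(frozen=True)
class Region:
    """Model of the structure: a contiguous span of token positions."""
    name: str
    start: int  # position of the first token, inclusive
    end: int    # position of the last token, inclusive


def parse(marked_up):
    """Split well-formed, attribute-free markup into tokens and regions."""
    tokens, regions, open_tags = [], [], []
    for piece in re.findall(r"</?\w+>|[^<]+", marked_up):
        if piece.startswith("</"):
            # Closing tag: the region ends at the last token seen so far.
            name, start = open_tags.pop()
            regions.append(Region(name, start, len(tokens) - 1))
        elif piece.startswith("<"):
            # Opening tag: the region starts at the next token position.
            open_tags.append((piece[1:-1], len(tokens)))
        else:
            tokens.extend(tokenize(piece))
    return tokens, regions


def regions_containing(tokens, regions, name, word):
    """Return the regions called `name` whose token span contains `word`."""
    return [
        r for r in regions
        if r.name == name and word in tokens[r.start:r.end + 1]
    ]


doc = ("<article><title>Structured retrieval</title>"
       "<sec>Models of the text</sec></article>")
tokens, regions = parse(doc)
hits = regions_containing(tokens, regions, "sec", "text")
```

Storing regions as token offsets rather than as a tree keeps the structure model independent of the markup syntax, which is one common way to realise "a contiguous portion of the text"; other representations are equally possible.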
Recommended Reading
Alink W. XIRAF: an XML information retrieval approach to digital forensics. Master’s thesis, University of Twente. 2005.
Amer-Yahia S, Botev C, Shanmugasundaram J. TeXQuery: a full-text search extension to XQuery. In: Proceedings of the 12th International World Wide Web Conference; 2004.
Amer-Yahia S, Lakshmanan LVS, Pandit S. FleXPath: flexible structure and full-text querying for XML. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2004.
Baeza-Yates RA, Navarro G. Integrating contents and structure in text retrieval. ACM SIGMOD Rec. 1996;25(1):67–79.
Burkowski FJ. Retrieval activities in a database consisting of heterogeneous collections of structured text. In: Proceedings of the 15th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 1992. p. 112–24.
Carmel D, Maarek YS, Mandelbrod M, Mass Y, Soffer A. Searching XML documents via XML fragments. In: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 2003. p. 151–8.
Clarke CLA, Cormack GV, Burkowski FJ. An algebra for structured text search and a framework for its implementation. Comput J. 1995;38(1):43–56.
Fuhr N, Gövert N, Kazai G, Lalmas M, editors. Proceedings of the 1st International Workshop of the Initiative for the Evaluation of XML Retrieval; 2002.
Fuhr N, Grossjohann K. XIRQL: a query language for information retrieval in XML. In: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 2001. p. 172–80.
Gonnet GH, Tompa FW. Mind your grammar: a new approach to modelling text. In: Proceedings of the 13th International Conference on Very Large Data Bases; 1987. p. 339–46.
Jaakkola J, Kilpeläinen P. Nested text-region algebra. Technical report. University of Helsinki. 1999.
Mihajlovic V, Blok HE, Hiemstra D, Apers PMG. Score region algebra: building a transparent XML-IR database. In: Proceedings of the International Conference on Information and Knowledge Management; 2005. p. 12–9.
Navarro G, Baeza-Yates RA. Proximal nodes: a model to query document databases by content and structure. ACM Trans Inf Syst. 1997;15(4):400–35.
Ogilvie P, Callan J. Hierarchical language models for XML component retrieval. In: Advances in XML information retrieval. Lecture notes in computer science 3493. Springer; 2005. p. 224–37.
Salminen A, Tompa FW. PAT expressions: an algebra for text search. Proc Complex. 1992;92:309–32.
Copyright information
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Hiemstra, D., Baeza-Yates, R. (2018). Structured Text Retrieval Models. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_379
DOI: https://doi.org/10.1007/978-1-4614-8265-9_379
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering