XMLibrary Search: An XML Search Engine Oriented to Digital Libraries

  • Enrique Sánchez-Villamil
  • Carlos González Muñoz
  • Rafael C. Carrasco
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3652)

Abstract

The growing volume of data available in digital libraries calls for search engines that let users find what they are looking for quickly and effectively. XML tagging makes it possible to add structural information to digitized content, and this metadata opens the way to a wide variety of new services. This paper describes the requirements that a search engine within a digital library should fulfill and presents a specific XML search engine architecture, designed to index large amounts of structurally tagged text and to be available on the web. The architecture has been developed and successfully tested at the Miguel de Cervantes Digital Library.

Keywords

Passage Information Retrieval Systems; XML Search Engines; Digital Libraries

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Enrique Sánchez-Villamil (1)
  • Carlos González Muñoz (1)
  • Rafael C. Carrasco (1)
  1. Transducens, Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante, Alicante