FuzzyXPath: Using Fuzzy Logic and IR Features to Approximately Query XML Documents

  • Ernesto Damiani
  • Stefania Marrara
  • Gabriella Pasi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4529)


XML has become a key technology for interoperability, providing a common data model to applications. However, diverse data modeling choices may lead to heterogeneous XML structure and content. This paper jointly applies information retrieval and database-related techniques to tolerate XML data diversity when evaluating flexible queries. Approximate structure and content matching is supported via a straightforward extension to standard XPath syntax. We also outline a query execution technique that is a first step toward efficiently evaluating structural pattern queries together with predicates over XML element content.


Information Retrieval; Query Language; Structure Document; Common Data Model; Approximate Query
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Ernesto Damiani (1)
  • Stefania Marrara (1)
  • Gabriella Pasi (2)
  1. Università degli Studi di Milano, Dipartimento di Tecnologie dell’Informazione, via Bramante 65, 26013 Crema (CR), Italy
  2. Università degli Studi di Milano Bicocca, DISCO, Via Bicocca degli Arcimboldi 8, 20126 Milano (MI), Italy
