This paper proposes a method for ranking XML documents against an Information Retrieval query by means of fuzzy logic. The method allows imprecise queries to be evaluated against an XML document collection and provides a model for ranking the matching documents. In addition, it enables more sophisticated ranking by employing proximity measures and the edit (Levenshtein) distance between terms or XML paths.
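
As an illustration of the edit distance mentioned above, the following minimal sketch computes the Levenshtein distance between two terms or between two XML paths treated as sequences of steps. The `similarity` normalization and the `path_steps` helper are assumptions introduced here for illustration; the abstract does not state how the paper maps distances to fuzzy membership degrees.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or lists of path steps)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Hypothetical normalization of the distance to a degree in [0, 1]."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def path_steps(path):
    """Hypothetical helper: split a simple XML path into its element steps."""
    return [step for step in path.split("/") if step]


if __name__ == "__main__":
    # Distance between a query term and a misspelled document term.
    print(levenshtein("retrieval", "retreival"))
    # Degree of match between a query path and a document path.
    print(similarity(path_steps("/book/chapter/title"),
                     path_steps("/book/section/title")))
```

Treating a path as a list of steps means that replacing one element name counts as a single edit, regardless of how long the element names are.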


Keywords: Fuzzy Logic; Information Retrieval; Editing Distance; Inverse Document Frequency; Levenshtein Distance





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Evangelos Kotsakis, Joint Research Center (CCR), TP267, Ispra (VA), Italy
