Using the Web Information Structure for Retrieving Web Pages

  • Mirna Adriani
  • Rama Pandugita
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4022)

Abstract

We report on our participation in the mixed monolingual web task of the 2005 Cross-Language Evaluation Forum (CLEF). We compared web page retrieval based on the page content, the page title, and a combination of page content and page title. The results show that the combination of page content and page title gave the best retrieval performance, compared with using only page content or only page title. Taking into account the number of links referring to a web page and the depth of the directory path in its URL did not significantly improve retrieval performance.
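
The abstract does not give the scoring formula, so the following is only a minimal sketch of how a content score and a title score might be combined, and how the directory depth of a URL might be measured. The linear weighting, the `title_weight` parameter, and the function names `combined_score`, `url_depth`, and `rank_pages` are all assumptions made for illustration; they are not taken from the paper.

```python
from urllib.parse import urlparse


def combined_score(content_score: float, title_score: float,
                   title_weight: float = 0.5) -> float:
    """Linear mix of two similarity scores (assumed form, not the paper's)."""
    return (1.0 - title_weight) * content_score + title_weight * title_score


def url_depth(url: str) -> int:
    """Number of directories in the URL path, ignoring a trailing file name."""
    path = urlparse(url).path
    segments = [s for s in path.split("/") if s]
    # If the path does not end with "/", treat the last segment as a file.
    if segments and not path.endswith("/"):
        segments = segments[:-1]
    return len(segments)


def rank_pages(scores: dict, title_weight: float = 0.5) -> list:
    """Order page ids by combined score, best first.

    `scores` maps a page id to a (content_score, title_score) pair.
    """
    return sorted(
        scores,
        key=lambda page: combined_score(*scores[page], title_weight),
        reverse=True,
    )
```

Under these assumptions, a page with a modest content score but a strong title match can outrank a page that matches on content alone, which is the kind of effect a content-plus-title combination is meant to capture.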

Keywords

Retrieval Performance; Information Retrieval System; Weight Composition; Average Success; Retrieval Effectiveness
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Mirna Adriani (1)
  • Rama Pandugita (1)
  1. Faculty of Computer Science, University of Indonesia, Depok, Indonesia
