Query Expansion for Effective Geographic Information Retrieval

  • Qiang Pu
  • Daqing He
  • Qi Li
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5706)


We developed two methods for the monolingual GeoCLEF 2008 task. The GCEC method tests the effectiveness of our online geographic coordinates extraction and clustering algorithm, and the WIKIGEO method examines the usefulness of the geographic coordinates information in Wikipedia for identifying geo-locations. We proposed a measure of topic distance to evaluate these two methods. The experimental results show that: 1) our online geographic coordinates extraction and clustering algorithm is useful for the type of locations that do not have clear corresponding coordinates; 2) the expansion based on the geo-locations generated by GCEC is effective in improving geographic retrieval; 3) Wikipedia can help in finding the coordinates for many geo-locations, but its usage for query expansion still needs further study; 4) query expansion based on the title only obtained better results than expansion based on both the title and narrative parts, even though the latter contain more related geographic information. This last finding also needs further study.
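The abstract does not describe how the coordinates clustering works, so the following is only a minimal sketch of one way candidate coordinates for an ambiguous location could be grouped, and it should not be read as the GCEC algorithm itself. The function names, the greedy single-pass strategy, and the 200 km radius are all assumptions introduced for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def centroid(points):
    """Arithmetic mean of (lat, lon) pairs; adequate only for small, compact clusters."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def cluster_coordinates(points, radius_km=200.0):
    """Greedy clustering (hypothetical): a point joins the first cluster
    whose centroid lies within radius_km, otherwise it starts a new cluster."""
    clusters = []
    for p in points:
        for c in clusters:
            if haversine_km(p, centroid(c)) <= radius_km:
                c.append(p)
                break
        else:
            clusters.append([p])
    return clusters


def dominant_location(points, radius_km=200.0):
    """Return the centroid of the largest cluster as the representative coordinates."""
    clusters = cluster_coordinates(points, radius_km)
    return centroid(max(clusters, key=len))


# Example: three candidates near Pittsburgh and one outlier near Chengdu.
candidates = [(40.44, -80.00), (40.45, -79.99), (40.43, -80.01), (30.67, 104.07)]
lat, lon = dominant_location(candidates)
```

In this sketch the outlier forms its own single-point cluster, so the representative coordinates come from the three nearby candidates. A real system would need to choose the radius and the tie-breaking rule empirically.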





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Qiang Pu
    • 1
  • Daqing He
    • 2
  • Qi Li
    • 2
  1. School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
  2. School of Information Sciences, University of Pittsburgh, Pittsburgh, USA
