Extracting Focused Locations for Web Pages

  • Conference paper
Web-Age Information Management (WAIM 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7142)

Abstract

Most Web pages contain location information, which can be used to improve the effectiveness of search engines. In this paper, we concentrate on focused locations, that is, the locations most appropriately associated with a Web page. Current algorithms suffer from ambiguity among locations: many different locations share the same name (GEO/GEO ambiguity), and some locations share a name with non-geographical entities such as persons (GEO/NON-GEO ambiguity). We first propose a new algorithm named GeoRank, which uses an idea similar to PageRank to resolve GEO/GEO ambiguity. We also introduce heuristic rules to eliminate GEO/NON-GEO ambiguity. We then present an algorithm with dynamic parameters that determines the focused locations. We conduct experiments on two real datasets to evaluate our approach. The experimental results show that our algorithm outperforms state-of-the-art methods in both disambiguation and focused-location determination.
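The abstract gives no details of GeoRank, so the following is only a minimal sketch of what a PageRank-style approach to GEO/GEO disambiguation could look like, not the authors' algorithm. Everything in it is an assumption made for illustration: the toy gazetteer, the `affinity` measure (shared leading components of an administrative path), the damping value, and the function name `disambiguate`. Each candidate interpretation of a place name is a node; candidates of different names pass score to one another in proportion to their affinity, and the highest-scoring candidate per name is kept.

```python
# Hypothetical toy gazetteer: place name -> candidate administrative paths.
# These entries are illustrative assumptions, not data from the paper.
GAZETTEER = {
    "Athens": [("USA", "Georgia", "Athens"), ("Greece", "Attica", "Athens")],
    "Atlanta": [("USA", "Georgia", "Atlanta")],
    "Georgia": [("USA", "Georgia"), ("Georgia",)],
}


def affinity(path_a, path_b):
    """Number of leading path components two candidates share (assumed measure)."""
    shared = 0
    for x, y in zip(path_a, path_b):
        if x != y:
            break
        shared += 1
    return shared


def disambiguate(mentions, gazetteer, damping=0.85, iterations=50):
    """Pick one candidate per mentioned name using PageRank-style score propagation."""
    nodes = [(name, cand) for name in mentions for cand in gazetteer[name]]
    n = len(nodes)

    # Edge weights connect candidates of *different* names only, so that
    # competing readings of the same name never reinforce each other.
    weights = [[0.0] * n for _ in range(n)]
    for i, (name_i, cand_i) in enumerate(nodes):
        for j, (name_j, cand_j) in enumerate(nodes):
            if name_i != name_j:
                weights[i][j] = float(affinity(cand_i, cand_j))

    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = [(1.0 - damping) / n] * n
        for i in range(n):
            out_weight = sum(weights[i])
            for j in range(n):
                if out_weight == 0:
                    # Dangling node: spread its score evenly, as in PageRank.
                    share = 1.0 / n
                else:
                    share = weights[i][j] / out_weight
                new_scores[j] += damping * scores[i] * share
        scores = new_scores

    # Keep the highest-scoring candidate for each name.
    best = {}
    for (name, cand), score in zip(nodes, scores):
        if name not in best or score > best[name][1]:
            best[name] = (cand, score)
    return {name: cand for name, (cand, _) in best.items()}


result = disambiguate(["Athens", "Atlanta", "Georgia"], GAZETTEER)
print(result)
```

In this toy page, the unambiguous mention of Atlanta pulls both "Athens" and "Georgia" toward their readings in the USA, because those candidates share path components with it while the Greek city and the country do not. A real system would draw candidates from a gazetteer and would likely use a richer affinity measure.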

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, Q., Jin, P., Lin, S., Yue, L. (2012). Extracting Focused Locations for Web Pages. In: Wang, L., Jiang, J., Lu, J., Hong, L., Liu, B. (eds) Web-Age Information Management. WAIM 2011. Lecture Notes in Computer Science, vol 7142. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28635-3_7

  • DOI: https://doi.org/10.1007/978-3-642-28635-3_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28634-6

  • Online ISBN: 978-3-642-28635-3

  • eBook Packages: Computer Science, Computer Science (R0)
