Abstract
Browsing is an important part of how users search for information on the Web. In this paper, we present ESpotter, a browser plug-in that recognizes entities of various types on Web pages and highlights them by type to assist browsing. ESpotter uses a range of standard named entity recognition techniques. Its key new feature is that it addresses the problem of multiple domains on the Web by adapting its lexicon and patterns to these domains.
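The idea of adapting a lexicon and patterns to the domain of the page being browsed can be illustrated with a minimal sketch. It is not ESpotter's implementation: the resource tables, the entity types, the host names used as domain keys, and the functions `resources_for` and `spot_entities` are all hypothetical, and choosing the domain by the URL's host is an assumption made only for illustration.

```python
import re
from urllib.parse import urlparse

# Hypothetical resources; none of these entries are taken from the paper.
GENERIC = {
    "lexicon": {"Organization": ["Knowledge Media Institute"]},
    "patterns": {"Person": [r"\b(?:Dr|Prof)\.\s+[A-Z][a-z]+"]},
}
DOMAIN_RESOURCES = {
    # The same surface string can map to different types in different domains.
    "kmi.open.ac.uk": {
        "lexicon": {"Organization": ["KMi"], "Project": ["Magpie"]},
        "patterns": {},
    },
    "www.rspb.org.uk": {
        "lexicon": {"Bird": ["Magpie"]},
        "patterns": {},
    },
}


def resources_for(url):
    """Merge the generic resources with those of the page's domain (assumed: URL host)."""
    host = urlparse(url).netloc.lower()
    specific = DOMAIN_RESOURCES.get(host, {"lexicon": {}, "patterns": {}})
    merged = {"lexicon": {}, "patterns": {}}
    for kind in ("lexicon", "patterns"):
        for source in (GENERIC, specific):
            for entity_type, items in source[kind].items():
                merged[kind].setdefault(entity_type, []).extend(items)
    return merged


def spot_entities(text, url):
    """Return sorted (start, end, type, surface) tuples found in text for the page at url."""
    resources = resources_for(url)
    found = []
    # Lexicon lookup: exact, whole-word matches of known terms.
    for entity_type, terms in resources["lexicon"].items():
        for term in terms:
            for m in re.finditer(r"\b" + re.escape(term) + r"\b", text):
                found.append((m.start(), m.end(), entity_type, m.group()))
    # Pattern matching: regular expressions that describe an entity's shape.
    for entity_type, patterns in resources["patterns"].items():
        for pattern in patterns:
            for m in re.finditer(pattern, text):
                found.append((m.start(), m.end(), entity_type, m.group()))
    return sorted(found)
```

Under these assumed tables, the string "Magpie" would be typed as a project on a page from one host and as a bird on a page from another, which is the kind of domain-dependent ambiguity that adapting the lexicon is meant to resolve. A highlighter could then use the returned offsets to wrap each match in markup that is styled by type.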
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhu, J., Uren, V., Motta, E. (2005). ESpotter: Adaptive Named Entity Recognition for Web Browsing. In: Althoff, KD., Dengel, A., Bergmann, R., Nick, M., Roth-Berghofer, T. (eds) Professional Knowledge Management. WM 2005. Lecture Notes in Computer Science, vol 3782. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11590019_59
DOI: https://doi.org/10.1007/11590019_59
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30465-4
Online ISBN: 978-3-540-31620-6
eBook Packages: Computer Science, Computer Science (R0)