Monitoring Web Resources Discovery by Reusing Classification Knowledge
Automated client-pull systems, such as web monitoring systems, Web services, and RSS systems, require resources that publish relevant information. These resources can be discovered manually, but doing so is not ideal. Public search engines can be used to find them by submitting appropriate queries. In previous research, we proposed a search query formulation method that reuses MCRDR (Multiple Classification Ripple-Down Rules) classification knowledge bases. However, not all search results are relevant web resources, because some are not resources themselves but web pages linked from the candidate resources. Therefore, it is necessary to develop methods that locate candidate monitoring web resources from search results. This paper summarizes heuristics obtained from a user study, which will be used in this automated resource location process.
Keywords: Search Result, Classification Knowledge, Document Page, Commercial Search Engine, Current Page
- 1. Kim, Y.S. and B.H. Kang. Search Query Generation with MCRDR Document Classification Knowledge. in EKAW 2008 - 16th International Conference on Knowledge Engineering and Knowledge Management Knowledge Patterns. 2008. Acitrezza, Catania, Italy.
- 2. Kim, Y.S. and B.H. Kang. A Study on Monitoring Web Page Locating Heuristics. in The 2008 International Conference on Information and Knowledge Engineering (IKE'08). 2008. Monte Carlo Resort, Las Vegas, Nevada, USA.
- 4. Sebastiani, F., Text categorization, in The Encyclopedia of Database Technologies and Applications, L.C. Rivero, J.H. Doorn, and V.E. Ferraggine, Editors. 2005, Idea Group Publishing, Hershey, US.
- 5. Sebastiani, F., Text categorization, in Text Mining and its Applications, A. Zanasi, Editor. 2004, WIT Press, Southampton, UK. pp. 109–129.
- 6. Matsuda, K. and T. Fukushima. Task-oriented world wide web retrieval by document type classification. in the eighth international conference on Information and knowledge management. 1999. Kansas City, Missouri, United States: ACM New York, NY, USA.
- 7. Glover, E.J., G.W. Flake, S. Lawrence, W.P. Birmingham, A. Kruger, C.L. Giles, and D.M. Pennock. Improving Category Specific Web Search by Learning Query Modifications. in SAINT 2001. 2001. San Diego, California: IEEE Computer Society.
- 8. Elsas, J. and M. Efron. HTML tag based metrics for use in web page type classification. in American Society for Information Science and Technology Annual Meeting. 2004. Providence, Rhode Island, USA.
- 9. Wilson, R.F., Google's Index Shows Only a Few Backlinks. 2006.
- 10. Boyapati, V., K. Chevrier, A. Finkel, N. Glance, T. Pierce, R. Stockton, and C. Whitmer. ChangeDetector[tm]: a site-level monitoring tool for the WWW. in Eleventh International World Wide Web Conference (WWW 2002). 2002. Hawaii, USA.
- 11. Douglis, F. and T. Ball. Tracking and Viewing Changes on the Web. in USENIX Annual Technical Conference. 1996.
- 12. Liu, L., W. Tang, D. Buttler, and C. Pu, Information Monitoring on the Web: A Scalable Solution. World Wide Web Journal, 2002. 5(4): p. 263–304.
- 13. Pandey, S., K. Dhamdhere, and C. Olston. WIC: A General-Purpose Algorithm for Monitoring Web Information Sources. in 30th VLDB Conference. 2004. Toronto, Canada.