Abstract
We present a new approach to learning hypertext classifiers that combines a statistical text-learning method with a relational rule learner. This approach is well suited to hypertext domains: its statistical component characterizes text in terms of word frequencies, whereas its relational component describes how neighboring documents are related to each other by the hyperlinks that connect them. We evaluate our approach on tasks that involve (i) learning definitions for classes of pages, (ii) learning definitions for particular relations that exist between pairs of pages, and (iii) locating a particular class of information in the internal structure of pages. Our experiments demonstrate that this new approach learns more accurate classifiers than either of its constituent methods alone.
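To make the combination described in the abstract concrete, the following is a minimal sketch in which a word-frequency classifier serves as a predicate inside a hand-written relational rule over hyperlinks. It is an illustration only: the Naive Bayes model, the names `train_naive_bayes`, `looks_like_homework`, and `is_course_page`, and the toy pages and links are all assumptions made for this example. The rule is fixed by hand rather than learned, so this should not be read as the authors' algorithm.

```python
import math
from collections import Counter


def train_naive_bayes(docs, labels):
    """Fit a two-class multinomial Naive Bayes model with Laplace smoothing.

    Returns a predicate that is True when the positive class scores higher.
    Assumes both classes appear at least once in `labels`.
    """
    vocab = {w for d in docs for w in d.split()}
    counts = {True: Counter(), False: Counter()}
    n_docs = Counter(labels)
    for text, label in zip(docs, labels):
        counts[label].update(text.split())
    totals = {c: sum(counts[c].values()) for c in (True, False)}

    def log_score(text, c):
        score = math.log(n_docs[c] / len(docs))
        for w in text.split():
            if w in vocab:  # words never seen in training are ignored
                score += math.log((counts[c][w] + 1) / (totals[c] + len(vocab)))
        return score

    return lambda text: log_score(text, True) > log_score(text, False)


# Hypothetical toy hypertext: page text plus directed hyperlinks (source, target).
pages = {
    "home": "welcome department research people",
    "cs101": "course syllabus lecture schedule",
    "hw1": "homework assignment due problem",
    "smith": "professor research publications",
    "pubs": "publications paper journal",
}
links = {("home", "cs101"), ("home", "smith"), ("cs101", "hw1"), ("smith", "pubs")}

# Statistical component: a word-frequency predicate over page text.
looks_like_homework = train_naive_bayes(
    ["homework assignment due problem", "homework problem set due",
     "publications paper journal", "welcome department research people"],
    [True, True, False, False],
)


def is_course_page(page):
    """Relational component, written by hand for illustration:
    course_page(A) :- link_to(A, B), looks_like_homework(B).
    """
    return any(src == page and looks_like_homework(pages[dst])
               for src, dst in links)


course_pages = [p for p in sorted(pages) if is_course_page(p)]
```

Here the text of `cs101` shares no words with the training documents, so the statistical predicate alone gives no evidence about it; the rule classifies it through the page it links to, which is the kind of neighborhood information the relational component contributes.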
Cite this article
Craven, M., Slattery, S. Relational Learning with Statistical Predicate Invention: Better Models for Hypertext. Machine Learning 43, 97–119 (2001). https://doi.org/10.1023/A:1007676901476