Improving Automatic Edge Selection for Relational Classification
In this paper, we address the problem of edge selection for networked data: given a set of interlinked entities for which many different kinds of links can be defined, how do we select the links that lead to a better classification of the dataset? We evaluate current approaches to edge selection for relational classification, which define a metric over the graph that quantifies the goodness of a specific link type. We propose a new metric with the same goal. Experimental results show that our proposed metric outperforms the existing ones.
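To make the idea of a per-link-type goodness metric concrete, the following is a minimal sketch that scores each link type by label homophily, that is, the fraction of its edges joining two entities with the same known class. This is an illustrative stand-in only: it is neither the metric proposed in this paper nor necessarily one of the existing metrics it evaluates, and the function names `homophily_by_edge_type` and `rank_edge_types`, as well as the toy data, are hypothetical.

```python
from collections import defaultdict


def homophily_by_edge_type(edges, labels):
    """Score each edge type by the fraction of its edges whose endpoints share a label.

    edges:  iterable of (u, v, edge_type) triples (hypothetical input format).
    labels: dict mapping a node to its known class label; edges that touch an
            unlabeled node are skipped.
    """
    same = defaultdict(int)
    total = defaultdict(int)
    for u, v, edge_type in edges:
        if u in labels and v in labels:
            total[edge_type] += 1
            if labels[u] == labels[v]:
                same[edge_type] += 1
    return {t: same[t] / total[t] for t in total}


def rank_edge_types(edges, labels):
    """Return edge types ordered from highest to lowest homophily score."""
    scores = homophily_by_edge_type(edges, labels)
    return sorted(scores, key=scores.get, reverse=True)


# Toy example: two candidate link types over four labeled entities.
edges = [
    ("a", "b", "coauthor"),
    ("c", "d", "coauthor"),
    ("a", "c", "cites"),
    ("b", "c", "cites"),
    ("a", "b", "cites"),
]
labels = {"a": 0, "b": 0, "c": 1, "d": 1}

scores = homophily_by_edge_type(edges, labels)
ranking = rank_edge_types(edges, labels)
```

Under this assumed score, a link type whose edges mostly connect same-class entities ranks higher, so it would be preferred when building the graph used for relational classification.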
Keywords: Aggregation Operator, Versus Test, Manhattan Distance, Induction Algorithm, Feature Subset Selection