Abstract
This paper introduces a class of predictive self-organizing neural networks, known as the Adaptive Resonance Associative Map (ARAM), for the classification of free-text documents. Whereas most statistical approaches to text categorization derive classification knowledge from training examples alone, ARAM performs supervised learning and also integrates user-defined classification knowledge in the form of IF-THEN rules. Our experiments on the Reuters-21578 news database show that ARAM performs reasonably well in mining categorization knowledge from a sparse, high-dimensional document feature space. In addition, ARAM's predictive accuracy and learning efficiency can be improved by incorporating a set of rules derived from the Reuters category description. The impact of rule insertion is most significant for categories with a small number of relevant documents.
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tan, AH. (2001). Predictive Self-Organizing Networks for Text Categorization. In: Cheung, D., Williams, G.J., Li, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2001. Lecture Notes in Computer Science, vol 2035. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45357-1_10
DOI: https://doi.org/10.1007/3-540-45357-1_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41910-5
Online ISBN: 978-3-540-45357-4