Web Documents Categorization Using Neural Networks

  • Renato Fernandes Corrêa
  • Teresa Bernarda Ludermir
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3316)

Abstract

This paper shows, through experimental results, that artificial neural networks are good classifiers for the text categorization task. It compares the results of text categorization experiments using the Multilayer Perceptron, Self-Organizing Maps, the C4.5 decision tree, and PART decision rules. The experiments were carried out on the K1 collection of web documents.

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Renato Fernandes Corrêa (1, 2)
  • Teresa Bernarda Ludermir (1)

  1. Polytechnic School, Pernambuco University, Recife, Brazil
  2. Center of Informatics, Federal University of Pernambuco, Recife, Brazil
