Web Document Classification Using Changing Training Data Set

  • Gilcheol Park
  • Seoksoo Kim
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3984)


Machine learning methods are commonly used to acquire the knowledge needed for automated document classification. They work well when a large pre-sampled training set is available and the domain does not change rapidly. In the real world, however, a complete training data set is hard to obtain, and the classification knowledge itself keeps changing as situations change. This is known as the maintenance problem, or the knowledge acquisition bottleneck. Multiple Classification Ripple-Down Rules (MCRDR), an incremental knowledge acquisition method, was introduced to resolve this problem and has been applied in several commercial expert systems and in a document classification system. Evaluation results for several domains show that our MCRDR-based document classification method can be applied successfully to real-world document classification tasks.
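
The abstract does not spell out how MCRDR stores its rules, so the following is only a minimal sketch of the technique as it is commonly described: rules form a tree, every satisfied path is followed, and the deepest satisfied rule on each path gives a conclusion. Knowledge is acquired incrementally by attaching a new rule under the rule that misfired, rather than by retraining. The names `Rule`, `classify`, `has`, and `add_exception`, and the representation of a document as a set of keywords, are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set

Document = Set[str]  # assumption: a document is reduced to a set of keywords


@dataclass
class Rule:
    """One node of the rule tree: a condition, a conclusion, and exceptions."""
    condition: Callable[[Document], bool]
    conclusion: Optional[str]  # None marks a "stopping" rule (no label)
    children: List["Rule"] = field(default_factory=list)

    def add_exception(self, condition: Callable[[Document], bool],
                      conclusion: Optional[str] = None) -> "Rule":
        """Attach a refinement (or a stopping rule) beneath this rule."""
        child = Rule(condition, conclusion)
        self.children.append(child)
        return child


def classify(root: Rule, doc: Document) -> Set[str]:
    """Return the conclusions of the deepest satisfied rule on every path."""
    labels: Set[str] = set()

    def visit(rule: Rule) -> None:
        fired = [c for c in rule.children if c.condition(doc)]
        if not fired:
            # No exception applies, so this rule's conclusion stands.
            if rule.conclusion is not None:
                labels.add(rule.conclusion)
            return
        for child in fired:
            visit(child)

    visit(root)
    return labels


def has(*words: str) -> Callable[[Document], bool]:
    """Build a condition that holds when all given keywords are present."""
    return lambda doc: all(w in doc for w in words)


# The root always fires and carries no conclusion of its own.
root = Rule(lambda doc: True, None)
sports = root.add_exception(has("football"), "Sports")
root.add_exception(has("election"), "Politics")

# Incremental corrections, added only when a misclassified case shows up:
sports.add_exception(has("transfer", "fee"), "Business")  # refinement
sports.add_exception(has("fantasy"), None)                # stopping rule

print(classify(root, {"football", "election"}))
print(classify(root, {"football", "transfer", "fee"}))
```

In this sketch, a document mentioning both football and an election receives both labels, while the refinement replaces "Sports" with "Business" for transfer-fee stories. Existing rules are never edited; each correction is a new child rule, which is what makes the acquisition incremental.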


Knowledge Acquisition, Knowledge Engineer, Precision Rate, Intermediate Representation, Document Classification
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Gilcheol Park (1)
  • Seoksoo Kim (1)
  1. Dept. of Multimedia Engineering, Hannam University, Daejeon, South Korea
