Abstract
To give text categorization (TC) systems predictable runtime performance, a system design method is proposed for soft real-time TC systems. An analyzable mathematical model is established to approximate the nonlinear, time-varying behavior of TC systems. Based on this model, feedback control theory is applied to prove that the system is stable and has zero steady-state error. Experimental results show that the error of the deadline satisfied ratio is kept within 4% of the desired value, and that the system can dynamically adjust the number of classifiers to save computation resources. The proposed methodology enables theoretical analysis and evaluation of TC systems, leading to a high-quality, low-cost implementation approach.
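The abstract does not give the paper's model or controller, so the following is only a minimal sketch of the general idea: a discrete integral controller that measures the deadline satisfied ratio and adjusts the number of active classifiers toward a setpoint. The plant function `simulated_ratio`, the `capacity` value, the gain, and all other names and numbers are hypothetical assumptions chosen for illustration, not the authors' design.

```python
def simulated_ratio(num_classifiers, capacity=12.0):
    """Hypothetical plant (an assumption, not the paper's model).

    Returns the fraction of documents that meet their deadline when
    `num_classifiers` classifiers run per document and the machine can
    afford `capacity` classifier executions per deadline window.
    """
    return min(1.0, capacity / float(num_classifiers))


def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return min(max(value, low), high)


def run_controller(setpoint=0.8, steps=50, gain=10.0,
                   n_min=1, n_max=40, n_start=40):
    """Integral feedback loop on the deadline satisfied ratio.

    Each step measures the ratio, computes the error against the
    setpoint, and accumulates it into the control signal `u`, which is
    rounded to an integer number of classifiers. Returns a list of
    (num_classifiers, measured_ratio) pairs, one per step.
    """
    u = float(n_start)
    history = []
    for _ in range(steps):
        n = int(round(clamp(u, n_min, n_max)))
        ratio = simulated_ratio(n)
        history.append((n, ratio))
        # Positive error means spare capacity, so classifiers are added;
        # negative error means missed deadlines, so classifiers are shed.
        error = ratio - setpoint
        u = clamp(u + gain * error, n_min, n_max)
    return history


if __name__ == "__main__":
    for step, (n, ratio) in enumerate(run_controller()[:12]):
        print(f"step {step:2d}: classifiers={n:2d} ratio={ratio:.3f}")
```

The integral term is what drives the error toward zero in this toy setting: the control signal keeps changing as long as the measured ratio differs from the setpoint. Because the number of classifiers is an integer, a real system would generally hold the ratio only within a tolerance band around the target.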
Additional information
Foundation item: Supported by the National Natural Science Foundation of China (90104032) and the National High-Tech Research and Development Plan of China (2003AA1Z2090).
Biography: WANG Hua-yong (1978-), male, Ph.D. candidate; research interests: information retrieval, real-time systems.
About this article
Cite this article
Hua-yong, W., Yu, C. & Yi-qi, D. A text categorization system with soft real-time guarantee. Wuhan Univ. J. Nat. Sci. 11, 226–229 (2006). https://doi.org/10.1007/BF02831736
DOI: https://doi.org/10.1007/BF02831736