ObCom 2011: Global Trends in Information Systems and Software Applications pp 680-690
Evaluation of Classifier Models Using Stratified Tenfold Cross Validation Techniques
Abstract
Prediction is one of the most important data mining functions. Many predictive models can be built for a given dataset, whose attributes may be continuous, categorical, or a combination of both. For each of these data types many similar predictive models are available, so it is important to choose the most accurate predictive model for the user's data. To do this, the models are evaluated using resampling techniques. Each evaluated model yields statistical results, which are then analysed and compared. The model that gives the maximum accuracy on the user's data is used to make predictions for further data of the same type. The predictions made by the selected model can be visualized to form decision reports for the user's data. A proposal is also made to apply fuzzy rough set techniques to the evaluation of classifier models [7].
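The evaluation procedure named in the title, stratified tenfold cross validation, can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic two-class data, the two toy classifiers (a majority-class baseline and a 1-nearest-neighbour rule), and every function name below are assumptions introduced only for illustration. The sketch splits the data into ten folds that preserve the class proportions, scores each model on every held-out fold, and selects the model with the highest mean accuracy.

```python
import random
from collections import Counter, defaultdict


def stratified_kfold(labels, k=10, seed=0):
    """Split indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    next_fold = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets its share.
        for idx in indices:
            folds[next_fold % k].append(idx)
            next_fold += 1
    return folds


def majority_model(train_X, train_y):
    """Toy baseline (assumption): always predict the most frequent class."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return lambda x: most_common


def nearest_neighbour_model(train_X, train_y):
    """Toy 1-nearest-neighbour classifier (assumption), squared Euclidean distance."""
    def predict(x):
        best = min(
            range(len(train_X)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
        )
        return train_y[best]
    return predict


def cross_validated_accuracy(fit, X, y, folds):
    """Train on k-1 folds, test on the held-out fold, average the accuracies."""
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Hypothetical dataset: 60 points of class 0 near (0, 0), 40 of class 1 near (3, 3).
rng = random.Random(42)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(60)] + \
    [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(40)]
y = [0] * 60 + [1] * 40

folds = stratified_kfold(y, k=10, seed=0)
results = {
    "majority": cross_validated_accuracy(majority_model, X, y, folds),
    "1-NN": cross_validated_accuracy(nearest_neighbour_model, X, y, folds),
}
best_model = max(results, key=results.get)
print(results, "-> selected:", best_model)
```

Because every fold keeps the 60/40 class ratio of the full dataset, the per-fold accuracies are measured on test sets that resemble the whole, which is the motivation for stratifying before comparing models.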
Keywords
Dataset, Stratified Tenfold Cross Validation, Accuracy, Class label, Training data, Test data, Model induction, Model deduction
References
- 1. Dubois, D., Prade, H.: Rough fuzzy sets model. International Journal of General Systems 46(1), 191–208 (1990)
- 2. Kibler, D., Langley, P.: Machine learning as an experimental science. In: Proc. of 1988 Euro. Working Session on Learning, pp. 81–92 (1988)
- 3. Wolpert, D.: On the connection between in-sample testing and generalization error. Complex Systems 6, 47–94 (1992)
- 4. Witten, I.H., Frank, E.: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, p. 371
- 5. Gascuel, O., Caraux, G.: Statistical significance in inductive learning. In: Proc. of the European Conf. on Artificial Intelligence (ECAI), New York, pp. 435–439 (1992)
- 6. Pawlak, Z.: Rough sets. International Jour. of Information and Computer Science 11, 341–356 (1982)
- 7. Maji, P., Pal, S.K.: Rough Set Based Generalized Fuzzy C-Means Algorithm and Quantitative Indices. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 37(6), 1529–1540 (2007)
- 8. Murphy, P.M.: UCI repository of machine learning databases – a machine-readable data repository. Maintained at the Department of Information and Computer Science, University of California, Irvine (1995). Anonymous FTP from ftp.ics.uci.edu in the directory pub/machine-learning-databases
- 9. Kohavi, R.: A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection. Computer Science Department, Stanford University, Stanford, CA 94305, ronnyk@CS.Stanford.EDU, pp. 1137–1143
- 10. Dietterich, T.G.: Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms. Department of Computer Science, Oregon State University, Corvallis, OR 97331, December 30 (1997)
- 11. Pawlak, Z.: Rough Sets, Theoretical Aspects of Reasoning about Data. Kluwer, Dordrecht (1991)
- 12. Zadeh, L.A.: Fuzzy Sets. Information and Control 11, 338–353 (1965)