Distinctive Features of Minimization of a Risk Functional in Mass Data Sets

Cybernetics and Systems Analysis

Abstract

A statistical learning model is considered within the theory of uniform convergence of error frequencies, for the case where convergence is violated because the informativeness of the training examples increases. Nonconstructive refinements of the Vapnik-Chervonenkis estimates that rely on an assumption about the distribution law of the violations are shown to have drawbacks. A new approach to obtaining constructive estimates for mass data sets is proposed.

Cite this article

Perevozchikova, O.L., Tul'chinskii, V.G. & Kharchenko, A.V. Distinctive Features of Minimization of a Risk Functional in Mass Data Sets. Cybernetics and Systems Analysis 39, 501–508 (2003). https://doi.org/10.1023/B:CASA.0000003500.91054.5b

  • DOI: https://doi.org/10.1023/B:CASA.0000003500.91054.5b
