Learning in Hybrid Noise Environments Using Statistical Queries

  • Scott E. Decatur
Part of the Lecture Notes in Statistics book series (LNS, volume 112)


We consider formal models of learning from noisy data. Specifically, we focus on learning in the probably approximately correct (PAC) model as defined by Valiant. Two of the most widely studied models of noise in this setting have been classification noise and malicious errors. However, a more realistic model combining the two types of noise has not been formalized. We define a learning environment based on a natural combination of these two noise models. We first show that hypothesis testing is possible in this model. We next describe a simple technique for learning in this model, and then describe a more powerful technique based on statistical query learning. We show that the noise tolerance of this improved technique is roughly optimal with respect to the desired learning accuracy and that it provides a smooth tradeoff between the tolerable amounts of the two types of noise. Finally, we show that statistical query simulation yields learning algorithms for other combinations of noise models, thus demonstrating that statistical query specification truly captures the generic fault tolerance of a learning algorithm.
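To make the statistical query idea concrete, the sketch below simulates the classification-noise component of the hybrid model (it does not treat malicious errors, and it is not the paper's exact construction). It follows the standard correction due to Kearns: a query's expectation under label noise of rate η is a known linear function of its noise-free value, so the noise-free value can be recovered by inverting that relation. The target concept, the query `chi`, and the assumption that η is known exactly are all illustrative choices.

```python
import random

random.seed(0)

eta = 0.2          # classification noise rate (assumed known for this sketch)
n = 200_000        # sample size

def target(x):
    """Illustrative target concept: a threshold on [0, 1]."""
    return 1 if x > 0.3 else 0

def chi(x, label):
    """Illustrative statistical query: indicator of (x > 0.5 and label == 1)."""
    return 1 if (x > 0.5 and label == 1) else 0

# Accumulate the two quantities estimable from the noisy oracle:
#   A = E_eta[chi(x, noisy_label)]    (query evaluated on noisy examples)
#   B = E[chi(x, 0) + chi(x, 1)]      (label-independent, so noise-free)
A = B = 0.0
true_val = 0.0     # noise-free value, computed here only for comparison
for _ in range(n):
    x = random.random()
    l = target(x)
    noisy_l = 1 - l if random.random() < eta else l
    A += chi(x, noisy_l)
    B += chi(x, 0) + chi(x, 1)
    true_val += chi(x, l)
A, B, true_val = A / n, B / n, true_val / n

# Under noise rate eta:  E_eta[chi] = (1 - 2*eta) * E[chi] + eta * B.
# Inverting gives the corrected estimate of the noise-free expectation:
corrected = (A - eta * B) / (1 - 2 * eta)

print(f"noisy estimate:     {A:.4f}")
print(f"corrected estimate: {corrected:.4f}")
print(f"noise-free value:   {true_val:.4f}")
```

Because an algorithm specified via statistical queries interacts with the data only through such expectations, corrections of this kind can be applied query by query, which is what makes the specification a natural vehicle for generic fault tolerance.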




References


  1. [AD93]
    Javed Aslam and Scott Decatur. General bounds on statistical query learning and PAC learning with noise via hypothesis boosting. In Proceedings of the 34th Annual Symposium on Foundations of Computer Science, pages 282–291, November 1993.
  2. [AD94]
    Javed Aslam and Scott Decatur. Improved noise-tolerant learning and generalized statistical queries. Technical Report TR-17-94, Harvard University, July 1994.
  3. [AL88]
    Dana Angluin and Philip Laird. Learning from noisy examples. Machine Learning, 2(4):343–370, 1988.
  4. [Ang92]
    Dana Angluin. Computational learning theory: Survey and selected bibliography. In Proceedings of the 24th Annual ACM Symposium on the Theory of Computing, 1992.
  5. [AV79]
    Dana Angluin and Leslie G. Valiant. Fast probabilistic algorithms for Hamiltonian circuits and matchings. Journal of Computer and System Sciences, 18(2):155–193, April 1979.
  7. [Dec93]
    Scott Decatur. Statistical queries and faulty PAC oracles. In Proceedings of the Sixth Annual ACM Workshop on Computational Learning Theory, pages 262–268. ACM Press, July 1993.
  8. [Dec95]
    Scott Decatur. Efficient Learning from Faulty Data. PhD thesis, Harvard University, 1995.
  9. [DG95]
    Scott Decatur and Rosario Gennaro. On learning from noisy and incomplete examples. In Proceedings of the Eighth Annual ACM Workshop on Computational Learning Theory. ACM Press, July 1995.
  10. [HSW92]
    David Helmbold, Robert Sloan, and Manfred K. Warmuth. Learning integer lattices. SIAM Journal on Computing, 21(2):240–266, 1992.
  11. [Kea93]
    Michael Kearns. Efficient noise-tolerant learning from statistical queries. In Proceedings of the 25th Annual ACM Symposium on the Theory of Computing, pages 392–401, San Diego, 1993.
  12. [KL88]
    Michael Kearns and Ming Li. Learning in the presence of malicious errors. In Proceedings of the 20th Annual ACM Symposium on Theory of Computing, Chicago, Illinois, May 1988.
  13. [Lai88]
    Philip D. Laird. Learning from Good and Bad Data. Kluwer International Series in Engineering and Computer Science. Kluwer Academic Publishers, Boston, 1988.
  14. [Sim93]
    Hans Ulrich Simon. General bounds on the number of examples needed for learning probabilistic concepts. In Proceedings of the Sixth Annual ACM Workshop on Computational Learning Theory, pages 402–411. ACM Press, 1993.
  15. [Val84]
    Leslie Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134–1142, November 1984.
  16. [Val85]
    Leslie Valiant. Learning disjunctions of conjunctions. In Proceedings of the Ninth International Joint Conference on Artificial Intelligence, 1985.
  17. [VC71]
    V. N. Vapnik and A. Ya. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and Its Applications, 16(2):264–280, 1971.

Copyright information

© Springer-Verlag New York, Inc. 1996

Authors and Affiliations

  • Scott E. Decatur
  1. Aiken Computation Laboratory, Division of Applied Sciences, Harvard University, Cambridge, USA
