Abstract
As much as we would like to have analytical solutions to important problems, it is a fact that many of them are simply too difficult to admit closed-form solutions. Common examples of this phenomenon are finding exact distributions of estimators and statistics, computing the value of an exact optimal procedure, such as a maximum likelihood estimate, and numerous combinatorial algorithms of importance in computer science and applied probability. Unprecedented advances in computing power and availability have inspired creative new methods and algorithms for solving old problems; often, these new methods are better than what we had in our toolbox before. This chapter provides a glimpse into a few selected computing tools and algorithms that have had a significant impact on the practice of probability and statistics, specifically, the bootstrap, the EM algorithm, and the use of kernels for smoothing and modern statistical classification. The treatment is intended to be introductory, with references to more advanced parts of the literature.
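As an illustration of the first of these tools, the following is a minimal sketch of the nonparametric bootstrap for approximating the standard error of a statistic whose exact sampling distribution is intractable (the function name, sample data, and replication count here are invented for illustration, not taken from the chapter):

```python
import random
import statistics

def bootstrap_se(data, stat, n_boot=2000, seed=0):
    """Approximate the standard error of `stat` by resampling
    the observed data with replacement n_boot times."""
    rng = random.Random(seed)
    n = len(data)
    replicates = [
        stat([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    ]
    # The spread of the bootstrap replicates estimates the
    # sampling variability of the statistic.
    return statistics.stdev(replicates)

# Example: standard error of the sample median, for which no
# simple closed-form finite-sample answer exists.
data = [2.1, 3.4, 1.7, 4.0, 2.8, 3.1, 2.5, 3.9, 1.9, 2.7]
se_median = bootstrap_se(data, statistics.median)
```

The same resampling scheme applies to essentially any statistic; only the function passed as `stat` changes.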
References
Aizerman, M., Braverman, E., and Rozonoer, L. (1964). Theoretical foundations of the potential function method in pattern recognition learning, Autom. Remote Control, 25, 821–837.
Aronszajn, N. (1950). Theory of reproducing kernels, Trans. Amer. Math. Soc., 68, 307–404.
Athreya, K. (1987). Bootstrap of the mean in the infinite variance case, Ann. Statist., 15, 724–731.
Berlinet, A. and Thomas-Agnan, C. (2004). Reproducing Kernel Hilbert Spaces in Probability and Statistics, Kluwer, Boston.
Bickel, P.J. (2003). Unorthodox bootstraps, Invited paper, J. Korean Statist. Soc., 32, 213–224.
Bickel, P.J. and Doksum, K. (2006). Mathematical Statistics, Basic Ideas and Selected Topics, Prentice Hall, Upper Saddle River, NJ.
Bickel, P.J. and Freedman, D. (1981). Some asymptotic theory for the bootstrap, Ann. Statist., 9, 1196–1217.
Carlstein, E. (1986). The use of subseries values for estimating the variance of a general statistic from a stationary sequence, Ann. Statist., 14, 1171–1179.
Chan, K. and Ledolter, J. (1995). Monte Carlo estimation for time series models involving counts, J. Amer. Statist. Assoc., 90, 242–252.
Cheney, W. (2001). Analysis for Applied Mathematics, Springer, New York.
Cheney, W. and Light, W. (2000). A Course in Approximation Theory, Brooks/Cole, Pacific Grove, CA.
Cristianini, N. and Shawe-Taylor, J. (2000). An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge Univ. Press, Cambridge, UK.
DasGupta, A. (2008). Asymptotic Theory of Statistics and Probability, Springer, New York.
Dempster, A., Laird, N., and Rubin, D. (1977). Maximum likelihood from incomplete data via the EM algorithm, J. Roy. Statist. Soc. Ser. B, 39, 1–38.
Devroye, L., Györfi, L., and Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition, Springer, New York.
Efron, B. (2003). Second thoughts on the bootstrap, Statist. Sci., 18, 135–140.
Efron, B. and Tibshirani, R. (1993). An Introduction to the Bootstrap, Chapman and Hall, London.
Giné, E. and Zinn, J. (1989). Necessary conditions for bootstrap of the mean, Ann. Statist., 17, 684–691.
Hall, P. (1986). On the number of bootstrap simulations required to construct a confidence interval, Ann. Statist., 14, 1453–1462.
Hall, P. (1988). Rate of convergence in bootstrap approximations, Ann. Prob., 16, 1665–1684.
Hall, P. (1989). On efficient bootstrap simulation, Biometrika, 76, 613–617.
Hall, P. (1990). Asymptotic properties of the bootstrap for heavy-tailed distributions, Ann. Prob., 18, 1342–1360.
Hall, P. (1992). The Bootstrap and Edgeworth Expansion, Springer, New York.
Hall, P., Horowitz, J. and Jing, B. (1995). On blocking rules for the bootstrap with dependent data, Biometrika, 82, 561–574.
Hall, P. (2003). A short prehistory of the bootstrap, Statist. Sci., 18, 158–167.
Künsch, H.R. (1989). The jackknife and the bootstrap for general stationary observations, Ann. Statist., 17, 1217–1241.
Lahiri, S.N. (1999). Theoretical comparisons of block bootstrap methods, Ann. Statist., 27, 386–404.
Lahiri, S.N. (2003). Resampling Methods for Dependent Data, Springer-Verlag, New York.
Lahiri, S.N. (2006). Bootstrap methods, a review, in Frontiers in Statistics, J. Fan and H. Koul, eds., 231–256, Imperial College Press, London.
Lange, K. (1999). Numerical Analysis for Statisticians, Springer, New York.
Le Cam, L. and Yang, G. (1990). Asymptotics in Statistics, Some Basic Concepts, Springer, New York.
Lehmann, E.L. (1999). Elements of Large Sample Theory, Springer, New York.
Lehmann, E.L. and Casella, G. (1998). Theory of Point Estimation, Springer, New York.
Levine, R. and Casella, G. (2001). Implementation of the Monte Carlo EM algorithm, J. Comput. Graph. Statist., 10, 422–439.
McLachlan, G. and Krishnan, T. (2008). The EM Algorithm and Extensions, Wiley, New York.
Mercer, J. (1909). Functions of positive and negative type and their connection with the theory of integral equations, Philos. Trans. Royal Soc. London A, 209, 415–446.
Minh, H., Niyogi, P., and Yao, Y. (2006). Mercer’s theorem, feature maps, and smoothing, Proc. Comput. Learning Theory, COLT, 154–168.
Murray, G.D. (1977). Discussion of paper by Dempster, Laird, and Rubin (1977), J. Roy. Statist. Soc. Ser. B, 39, 27–28.
Politis, D. and Romano, J. (1994). The stationary bootstrap, J. Amer. Statist. Assoc., 89, 1303–1313.
Politis, D. and White, A. (2004). Automatic block length selection for the dependent bootstrap, Econ. Rev., 23, 53–70.
Politis, D., Romano, J. and Wolf, M. (1999). Subsampling, Springer, New York.
Rudin, W. (1986). Real and Complex Analysis, 3rd edition, McGraw-Hill, New York.
Rosenblatt, M. (1956). Remarks on some nonparametric estimates of a density function, Ann. Math. Statist., 27, 832–835.
Shao, J. and Tu, D. (1995).The Jackknife and Bootstrap, Springer, New York.
Singh, K. (1981). On the asymptotic accuracy of Efron’s bootstrap, Ann. Statist., 9, 1187–1195.
Sundberg, R. (1974). Maximum likelihood theory for incomplete data from an exponential family, Scand. J. Statist., 1, 49–58.
Tong, Y. (1990). The Multivariate Normal Distribution, Springer, New York.
Vapnik, V. and Chervonenkis, A. (1964). A note on one class of perceptrons, Autom. Remote Control, 25.
Vapnik, V. (1995). The Nature of Statistical Learning Theory, Springer, New York.
Wei, G. and Tanner, M. (1990). A Monte Carlo implementation of the EM algorithm, J. Amer. Statist. Assoc., 85, 699–704.
Wu, C.F.J. (1983). On the convergence properties of the EM algorithm, Ann. Statist., 11, 95–103.
© 2011 Springer Science+Business Media, LLC

Cite this chapter: DasGupta, A. (2011). Useful Tools for Statistics and Machine Learning. In: Probability for Statistics and Machine Learning. Springer Texts in Statistics. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-9634-3_20

Print ISBN: 978-1-4419-9633-6
Online ISBN: 978-1-4419-9634-3