Probability, Statistics, and Computational Science

Beerenwinkel, Niko; Siebourg, Juliane

doi:10.1007/978-1-61779-582-4_3

Part of the book series: Methods in Molecular Biology (MIMB, volume 855)

Abstract

In this chapter, we review basic concepts from probability theory and computational statistics that are fundamental to evolutionary genomics. We provide a very basic introduction to statistical modeling and discuss general principles, including maximum likelihood and Bayesian inference. Markov chains, hidden Markov models, and Bayesian network models are introduced in more detail as they occur frequently and in many variations in genomics applications. In particular, we discuss efficient inference algorithms and methods for learning these models from partially observed data. Several simple examples are given throughout the text, some of which point to models that are discussed in more detail in subsequent chapters.

Author information

Authors and Affiliations

Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
Niko Beerenwinkel & Juliane Siebourg

Corresponding author

Correspondence to Niko Beerenwinkel.

Editor information

Editors and Affiliations

Department of Computer Science, ETH Zürich, Universitätsstr. 6, Zürich, 8092, Switzerland
Maria Anisimova

About this protocol

Cite this protocol

Beerenwinkel, N., Siebourg, J. (2012). Probability, Statistics, and Computational Science. In: Anisimova, M. (eds) Evolutionary Genomics. Methods in Molecular Biology, vol 855. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-61779-582-4_3

DOI: https://doi.org/10.1007/978-1-61779-582-4_3
Published: 07 February 2012
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-61779-581-7
Online ISBN: 978-1-61779-582-4
eBook Packages: Springer Protocols
