Using Statistics to Shed Light on the Dynamics of the Human Genome: A Review
In this article, we review a number of recent studies in which information derived from genomic alignments, together with data on the composition, location and biochemical features of nuclear DNA, is used to investigate salient properties and determinants of change (mutations) in the human genome. The studies under review, all conducted by an interdisciplinary group of investigators at The Pennsylvania State University, required a range of statistical techniques, from regression to multivariate analysis to the modeling of latent structures.
Keywords: Repeat Number, Fragile Site, Mutagenic Process, Genomic Landscape, Microsatellite Mutability
We wish to thank G. Ananda, A. Fungtammasan, Y.D. Kelkar, E.M. Kvikstad, P. Kuruppumullage Don and S. Tyekucheva, the brilliant and hard-working graduate students who took the lead and collaborated with each other in the studies reviewed in this article. We also wish to thank our collaborators in the Center for Medical Genomics of The Pennsylvania State University, in particular K. Eckert, whose group performed experimental work critical for our studies of microsatellites and common fragile sites. Finally, we are indebted to a reviewer of this manuscript who offered useful and interesting comments on our work. Our research over the years has been supported by various sources; particularly important for the studies reviewed here were awards from the NSF (DBI 0965596) and the NIH (General Medical Sciences R01 GM087472-01).