Accurate Restoration of DNA Sequences
Conference paper
Keywords
Error Rate; Markov Chain; Hidden Markov Model; Posterior Distribution; Gibbs Sampler
Copyright information
© Springer-Verlag New York, Inc. 1995