Introduction

Part of the book series: Computational Biology (COBO, volume 20)

Abstract

This book is meant to serve as an introduction to the new and very exciting field of comparative gene finding. We introduce the field in its current state and walk through the construction of a comparative gene finder by breaking it down into its separate building blocks. Before diving into the algorithmic details, we begin with a brief introduction to the underlying biological theory. In this chapter we introduce the basic concepts of genetics needed for this book and define the gene finding problem we have set out to solve. We then give a brief account of how approaches to the gene finding problem have developed historically, up to where the field stands today. In the last section we split the process of building a gene finder into its smaller parts; the rest of the book is structured in the same manner.

Author information

Corresponding author

Correspondence to Marina Axelson-Fisk.

Copyright information

© 2015 Springer-Verlag London

About this chapter

Cite this chapter

Axelson-Fisk, M. (2015). Introduction. In: Comparative Gene Finding. Computational Biology, vol 20. Springer, London. https://doi.org/10.1007/978-1-4471-6693-1_1

  • DOI: https://doi.org/10.1007/978-1-4471-6693-1_1

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-6692-4

  • Online ISBN: 978-1-4471-6693-1

  • eBook Packages: Computer Science, Computer Science (R0)
