A Parsimony Approach to Genome-Wide Ortholog Assignment

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2006)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3909)

Abstract

The assignment of orthologous genes between a pair of genomes is a fundamental and challenging problem in comparative genomics, since many computational methods for solving various biological problems critically rely on bona fide orthologs as input. While ortholog assignment is usually done using sequence similarity search, we recently proposed a new combinatorial approach that combines sequence similarity and genome rearrangement. This paper continues the development of that approach and unites genome rearrangement events and (post-speciation) duplication events in a single framework under the parsimony principle. In this framework, orthologous genes are assumed to correspond to each other in the most parsimonious evolutionary scenario involving both genome rearrangement and (post-speciation) gene duplication. Besides several original algorithmic contributions, the enhanced method allows for the detection of inparalogs. Following this approach, we have implemented a high-throughput system for ortholog assignment on a genome scale, called MSOAR, and applied it to the genomes of human and mouse. As the results show, MSOAR finds 99 more true orthologs than the INPARANOID program. We have also compared MSOAR with the iterated exemplar algorithm on simulated data and found that MSOAR performed very well in terms of assignment accuracy. These test results indicate that our approach is very promising for genome-wide ortholog assignment.
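To make the parsimony idea concrete, the following is a toy sketch only and is not the MSOAR algorithm. It assumes each genome is a single linear chromosome written as a list of signed gene-family identifiers, uses the breakpoint count as a crude stand-in for rearrangement distance, charges one unit per unmatched gene copy (treated as a post-speciation duplicate), and searches all one-to-one matchings by brute force. The function names `breakpoints`, `family_matchings`, and `assign_orthologs` are hypothetical.

```python
from itertools import permutations, product


def breakpoints(g, h):
    """Count adjacencies of signed permutation g that are absent from h.

    An adjacency (a, b) also counts as present in h when h contains
    (-b, -a), i.e. the same pair read on the opposite strand.
    """
    n = len(g)
    fg = [0] + g + [n + 1]  # frame both sequences with fixed end markers
    fh = [0] + h + [n + 1]
    adj = set()
    for a, b in zip(fh, fh[1:]):
        adj.add((a, b))
        adj.add((-b, -a))
    return sum((a, b) not in adj for a, b in zip(fg, fg[1:]))


def family_matchings(pg, ph):
    """All maximum one-to-one matchings between two lists of positions."""
    if len(pg) <= len(ph):
        return [list(zip(pg, p)) for p in permutations(ph, len(pg))]
    return [list(zip(p, ph)) for p in permutations(pg, len(ph))]


def assign_orthologs(G, H):
    """Return (cost, pairs) minimising breakpoints + unmatched copies.

    pairs is a list of (position in G, position in H) for genes called
    orthologs; unmatched positions are candidate inparalogs.
    Assumed cost model: 1 per breakpoint, 1 per unmatched copy.
    """
    families = sorted({abs(x) for x in G} | {abs(x) for x in H})
    options = []
    for f in families:
        pg = [i for i, x in enumerate(G) if abs(x) == f]
        ph = [j for j, x in enumerate(H) if abs(x) == f]
        options.append(family_matchings(pg, ph))

    best = None
    for choice in product(*options):
        pairs = sorted(p for fam in choice for p in fam)  # ordered along G
        # Relabel so matched genes of G read 1..k; H inherits the labels,
        # negated where the two copies lie on opposite strands.
        label = {
            j: (k + 1) * (1 if G[i] * H[j] > 0 else -1)
            for k, (i, j) in enumerate(pairs)
        }
        g_perm = list(range(1, len(pairs) + 1))
        h_perm = [label[j] for j in sorted(label)]
        duplicates = len(G) + len(H) - 2 * len(pairs)
        cost = breakpoints(g_perm, h_perm) + duplicates
        if best is None or cost < best[0]:
            best = (cost, pairs)
    return best


if __name__ == "__main__":
    # Family 2 has two copies in G and one in H.
    print(assign_orthologs([1, 2, 3, 2, 4], [1, 2, 3, 4]))
```

In this example the copy of family 2 at position 1 of G is matched to H because doing so preserves gene order, and the copy at position 3 is left unmatched as a candidate inparalog. Exhaustive enumeration is feasible only for tiny inputs; a genome-scale method needs a real rearrangement distance and a far more efficient search.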

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fu, Z., Chen, X., Vacic, V., Nan, P., Zhong, Y., Jiang, T. (2006). A Parsimony Approach to Genome-Wide Ortholog Assignment. In: Apostolico, A., Guerra, C., Istrail, S., Pevzner, P.A., Waterman, M. (eds) Research in Computational Molecular Biology. RECOMB 2006. Lecture Notes in Computer Science, vol 3909. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11732990_47

  • DOI: https://doi.org/10.1007/11732990_47

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33295-4

  • Online ISBN: 978-3-540-33296-1

  • eBook Packages: Computer Science (R0)
