Skip to main content

Advertisement

Log in

Median Approximations for Genomes Modeled as Matrices

  • Original Article
  • Published:
Bulletin of Mathematical Biology Aims and scope Submit manuscript

Abstract

The genome median problem is an important problem in phylogenetic reconstruction under rearrangement models. It can be stated as follows: Given three genomes, find a fourth that minimizes the sum of the pairwise rearrangement distances between it and the three input genomes. In this paper, we model genomes as matrices and study the matrix median problem using the rank distance. It is known that, for any metric distance, at least one of the corners is a \(\frac{4}{3}\)-approximation of the median. Our results allow us to compute up to three additional matrix median candidates, all of them with approximation ratios at least as good as the best corner, when the input matrices come from genomes. We also show a class of instances where our candidates are optimal. From the application point of view, it is usually more interesting to locate medians farther from the corners, and therefore, these new candidates are potentially more useful. In addition to the approximation algorithm, we suggest a heuristic to get a genome from an arbitrary square matrix. This is useful to translate the results of our median approximation algorithm back to genomes, and it has good results in our tests. To assess the relevance of our approach in the biological context, we ran simulated evolution tests and compared our solutions to those of an exact DCJ median solver. The results show that our method is capable of producing very good candidates.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2

Similar content being viewed by others

References

  • Bader M (2009) On reversal and transposition medians. Int Sci Index 30 3(6):504–512

    Google Scholar 

  • Berestycki N, Durrett R (2006) A phase transition in the random transposition random walk. Probab Theory Relat F 136(2):203–233

    Article  MathSciNet  MATH  Google Scholar 

  • Biller P, Feijao P, Meidanis J (2013) Rearrangement-based phylogeny using the single-cut-or-join operation. IEEE/ACM Trans Comput Biol Bioinform (TCBB) 10(1):122–134

    Article  Google Scholar 

  • Biller P, Guéguen L, Tannier E (2015) Moments of genome evolution by double cut-and-join. BMC Bioinform 16(Suppl 14):S7

    Article  Google Scholar 

  • Bourque G, Pevzner PA (2002) Genome-scale evolution: reconstructing gene orders in the ancestral species. Genome Res 12(1):26–36

    Google Scholar 

  • Bryant D (1998) The complexity of the breakpoint median problem. Technical repot CRM-2579, Centre de recherches mathematiques, Universite de Montreal

  • Caprara A (1999) Formulations and hardness of multiple sorting by reversals. In: Proceedings of the third annual international conference on computational molecular biology (RECOMB), ACM, pp 84–93

  • Caprara A (2002) Additive bounding, worst-case analysis, and the breakpoint median problem. SIAM J Opt 13(2):508–519

    Article  MathSciNet  MATH  Google Scholar 

  • Caprara A (2003) The reversal median problem. INFORMS J Comput 15(1):93–113. doi:10.1287/ijoc.15.1.93.15155

    Article  MathSciNet  MATH  Google Scholar 

  • Cosner M, Jansen R, Moret B (2000) An empirical comparison of phylogenetic methods on chloroplast gene order data in campanulaceae. Kluwer, Dordrecht

    Book  MATH  Google Scholar 

  • Delsarte P (1978) Bilinear forms over a finite field, with applications to coding theory. J Comb Theory Ser A 25(3):226–241

    Article  MathSciNet  MATH  Google Scholar 

  • Feijao P, Meidanis J (2011) SCJ: a breakpoint-like distance that simplifies several rearrangement problems. IEEE IEEE/ACM Trans Comput Biol Bioinform 8:1318–1329. doi:10.1109/TCBB.2011.34

    Article  Google Scholar 

  • Feijao P, Meidanis J (2012) Extending the algebraic formalism for genome rearrangements to include linear chromosomes. In: de Souto M, Kann M (eds) BSB 2012, LNBI 7409. Springer, Berlin, pp 13–24

    Google Scholar 

  • Fertin G, Labarre A, Rusu I, Tannier E, Vialette S (2009) Combinatorics of genome rearrangements. MIT Press, Cambridge

    Book  MATH  Google Scholar 

  • Gao N (2014) Using genetic algorithm to solve median problem and phylogenetic inference. PhD thesis, University of South Carolina

  • Hagberg AA, Schult DA, Swart PJ (2008) Exploring network structure, dynamics, and function using networkX. In: Proceedings of the 7th python in science conference (SciPy2008), Pasadena, CA USA, pp 11–15

  • Haghighi M, Sankoff D (2012) Medians seek the corners, and other conjectures. BMC bioinformatics 13(Suppl 19):S5

    Google Scholar 

  • Jamshidpey A, Sankoff D (2013) Phase change for the accuracy of the median value in estimating divergence time. BMC bioinformatics 14(Suppl 15):S7

    Article  Google Scholar 

  • Meyer CD (2000) Matrix analysis and applied linear algebra. Society for Industrial and Applied Mathematics, Philadelphia

    Book  Google Scholar 

  • Moret BM, Wang LS, Warnow T, Wyman SK (2001) New approaches for reconstructing phylogenies from gene order data. Bioinformatics 17(suppl 1):S165–S173

    Article  Google Scholar 

  • Pe’er I, Shamir R (1998) The median problems for breakpoints are NP-Complete. In: Electronic colloquium on computational complexity, vol 71. Hasso Plattner Institute, Potsdam

  • Pe’er I, Shamir R (2000) Approximation algorithms for the median problem in the breakpoint model. Comparative genomics. Springer, Berlin, pp 225–241

    Chapter  Google Scholar 

  • Rajan V, Xu AW, Lin Y, Swenson KM, Moret BM (2010) Heuristics for the inversion median problem. BMC bioinformatics 11(Suppl 1):S30

    Article  Google Scholar 

  • Sankoff D, Blanchette M (1998) Multiple genome rearrangement and breakpoint phylogeny. J Comput Biol 5(3):555–570

    Article  Google Scholar 

  • Sankoff D, Sundaram G, Kececioglu J (1996) Steiner points in the space of genome rearrangements. Int J Found Comput Sci 7(01):1–9

    Article  MATH  Google Scholar 

  • Tannier E, Zheng C, Sankoff D (2009) Multichromosomal median and halving problems under different genomic distances. BMC Bioinform 10:120. doi:10.1186/1471-2105-10-120

    Article  Google Scholar 

  • Xu AW (2009a) DCJ median problems on linear multichromosomal genomes: graph representation and fast exact solutions. Comp genomics. Springer, Berlin, pp 70–83

    Chapter  Google Scholar 

  • Xu AW (2009b) A fast and exact algorithm for the median of three problem: a graph decomposition approach. J Comput Biol 16(10):1369–1381

    Article  MathSciNet  Google Scholar 

  • Yancopoulos S, Attie O, Friedberg R (2005) Efficient sorting of genomic permutations by translocation, inversion and block interchange. Bioinformatics 21(16):3340–3346. doi:10.1093/bioinformatics/bti535

    Article  Google Scholar 

  • Zanetti JPP, Biller P, Meidanis J (2013) On the matrix median problem. In: Darling A, Stoye J (eds) Algorithms in bioinformatics., Lecture notes in computer scienceSpringer, Berlin, pp 230–243. doi:10.1007/978-3-642-40453-5_18

    Chapter  Google Scholar 

  • Zhang M, Arndt W, Tang J (2009) An exact solver for the DCJ median problem. In: Pacific symposium on biocomputing, vol 14, pp 138–149

Download references

Acknowledgments

We thank funding agency FAPESP (Brazil) for financial support (Grants 2012/13865-7 and 2012/14104-0).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Joao Paulo Pereira Zanetti.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Zanetti, J.P.P., Biller, P. & Meidanis, J. Median Approximations for Genomes Modeled as Matrices. Bull Math Biol 78, 786–814 (2016). https://doi.org/10.1007/s11538-016-0162-4

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11538-016-0162-4

Keywords

Navigation