Abstract
A point-interval\((I_v, p_v)\) is a pair constituted by an interval \(I_v\) of \({\mathbb {R}}\) and a point \(p_v \in I_v\). A graph \(G=(V,E)\) is a Max-Point-Tolerance (MPT) graph if each vertex \(v\in V\) can be mapped to a point-interval in such a way that (u, v) is an edge of G iff \(I_u \cap I_v \supseteq \{p_u, p_v\}\). MPT graphs constitute a superclass of interval graphs and naturally arise in genetic analysis as a way to represent specific relationships among DNA fragments extracted from a population of individuals. One of the most important applications of MPT graphs concerns the search for an association between major human diseases and chromosome regions from patients that exhibit loss of heterozygosity events. This task can be formulated as a minimum cost clique cover problem in a MPT graph and gives rise to a \({{\mathcal {N}}}{{\mathcal {P}}}\)-hard combinatorial optimization problem known in the literature as the Parsimonious Loss of Heterozygosity Problem (PLOHP). In this article, we investigate ways to speed up the best known exact solution algorithm for the PLOHP as well as techniques to enlarge the size of the instances that can be optimally solved. In particular, we present a Branch&Price algorithm for the PLOHP and we develop a number of preprocessing techniques and decomposition strategies to dramatically reduce the size of its instances. Computational experiments show that the proposed approach is 10–30\(\times \) faster than previous approaches described in the literature, and suggest new directions for the development of future exact solution approaches that may prove of fundamental assistance in practice.
Similar content being viewed by others
References
Achterberg T (2009) SCIP: solving constraint integer programs. Math Program Comput 1(1):1–41
Asinowski A, Cohen E, Golumbic MC, Limouzy V, Lipshteyn M, Stern M (2012) Vertex intersection graphs of paths on a grid. J Graph Algorithms Appl 16:129–150
Booth KS, Lueker GS (1976) Testing for the consecutive ones property, interval graphs, and graph planarity using PQ-tree algorithms. J Comput Syst Sci 13(3):335–379
Catanzaro D, Labbé M (2009) The pure parsimony haplotyping problem: overview and computational advances. Int Trans Oper Res 16(5):561–584
Catanzaro D, Labbe M, Halldorsson BV (2013) An integer programming formulation of the parsimonious loss of heterozygosity problem. IEEE/ACM Trans Comput Biol Bioinform 10(6):1391–1402
Catanzaro D, Chaplick S, Felsner S, Halldórsson BV, Halldórsson MM, Hixon T, Stacho J (2017) Max point-tolerance graphs. Discret Appl Math 216(1):84–97
Conrad DF, Andrews TD, Carter NP, Hurles ME, Pritchard JK (2006) A high-resolution survey of deletion polymorphism in the human genome. Nat Genet 38(1):75–81
Cormen TH, Leiserson CE, Rivest RL, Stein C (2009) Introduction to algorithms. The MIT Press, Cambridge
Corneil DG, Kamula PA (1987) Extensions of permutation and interval graphs. In: Proceedings of 18th southeastern conference on combinatorics, graph theory and computing, pp 267–275
Diestel R (2010) Graph theory. Springer, Berlin
Dion G, Jost V, Queyranne M (2007) Clique partitioning of interval graphs with submodular costs on the cliques. RAIRO Oper Res 41:275–287
Fay MP, Proschan MA (2010) Wilcoxon–Mann–Whitney or t-test? On assumptions for hypothesis tests and multiple interpretations of decision rules. Stat Surv 4:1–39
Fishburn P (1985) Interval orders and interval graphs: a study of partially ordered sets. Wiley, New York
Frank A (1976) Some polynomial algorithms for certain graphs and hypergraphs. In: Proceedings of the 5th British combinatorial conference (Aberdeen 1975), Congressus numerantium XV, pp 211–226
Fulkerson DR, Gross OA (1965) Incidence matrices and interval graphs. Pac J Math 15:835–855
Garey MR, Johnson DS (2003) Computers and intractability: a guide to the theory of NP-completeness. Freeman, New York
Gavril F (1978) A recognition algorithm for the intersection graphs of paths in trees. Discret Math 23:211–227
Golumbic MC (2004) Algorithmic graph theory and perfect graphs. Elsevier, North-Holland
Golumbic MC, Monma CL (1982) A generalization of interval graphs with tolerances. Congressus Numerantium 35:321–331. Proceedings of the 13th southeastern conference on combinatorics, graph theory and computing
Golumbic MC, Trenk A (2004) Tolerance graphs, Cambridge studies in advanced mathematics, vol 89. Cambridge University Press, Cambridge
Halldorsson BV, Aguiar D, Tarpine R, Istrail S (2011) The Clark phaseable sample size problem: long-range phasing and loss of heterozygosity in GWASlark phaseable sample size problem: long-range phasing and loss of heterozygosity in GWAS. J Comput Biol 18(3):323–333
Kaufmann M, Kratochvil J, Lehmann K, Subramanian A (2006) Max-tolerance graphs as intersection graphs: cliques, cycles and recognition. In: Proceedings of 17th annual ACM-SIAM symposium on discrete algorithms SODA 06, SIAM, pp 832–841
Martin RK (1999) Large scale linear and integer optimization: a unified approach. Springer, Berlin
Matsumoto M, Nishimura T (1998) Mersenne twister: a 623-dimensionally equidistributed uniform pseudorandom number generator. ACM Trans Model Comput Simul 8(1):3–30
McCarroll SA, Kuruvilla FG, Korn JM, Cawley S, Nemesh J, Wysoker A, Shapero MH, de Bakker PIW, Maller JB, Kirby A, Elliott AL, Parkin M, Hubbell E, Webster T, Mei R, Veitch J, Collins PJ, Handsaker R, Lincoln S, Nizzari M, Blume J, Jones KW, Rava R, Daly MJ, Gabriel SB, Altshuler D (2008) Integrated detection and population-genetic analysis of snps and copy number variation. Nat Genet 40(10):1166–1174
Nemhauser GL, Wolsey LA (1989) Optimization. In: Nemhauser GL, Kan AHGR, Todd MJ (eds) Handbooks in operations research and management science. Elsevier, North-Holland
Speed T, Huang H, Corona E, Raphael B, Eskin E (2007) Identification of deletion polymorphisms from haplotypes, vol 4453. Springer, Berlin, pp 354–365
Stefansson H, Rujescu D, Cichon S, Pietiläinen OPH, Ingason A, Steinberg S, Fossdal R, Sigurdsson E, Sigmundsson T, Buizer-Voskamp JE, Hansen T, Jakobsen KD, Muglia P, Francks C, Matthews PM, Gylfason A, Halldorsson BV, Gudbjartsson D, Thorgeirsson TE, Sigurdsson A, Jonasdottir A, Jonasdottir A, Bjornsson A, Mattiasdottir S, Blondal T, Haraldsson M, Magnusdottir BB, Giegling I, Moeller HJ, Hartmann A, Shianna KV, Ge D, Need AC, Crombie C, Fraser G, Walker N, Lonnqvist J, Suvisaari J, Tuulio-Henriksson A, Paunio T, Toulopoulou T, Bramon E, Forti MD, Murray R, Ruggeri M, Vassos E, Tosato S, Walshe M, Li T, Vasilescu C, Moehleisen TW, Wang AG, Ullum H, Djurovic S, Melle I, Olesen J, Kiemeney LA, Franke B, Sabatti C, Freimer NB, Gulcher JR, Thorsteinsdottir U, Kong A, Andreassen OA, Ophoff RA, Georgi A, Rietschel M, Werge T, Petursson H, Goldstein DB, Nöthen MM, Peltonen L, Collier DA, Clair DS, Stefansson K (2008) Large recurrent microdeletions associated with schizophrenia. Nature 455:232–236
Acknowledgements
Part of this work has been developed when L. Porretta and D. Catanzaro were visiting both Reykjavik University and DeCODE Genetics. L. Porretta acknowledges support from the Belgian National Fund for Scientific Research (FRS-FNRS) who funded his short-term research visit. D. Catanzaro acknowledges support from the FRS-FNRS via the grant “Crédit de Recherche” Ref. S/25-MCF/OL J.0026.17, the Université Catholique de Louvain via the “Fonds Spéciaux de Recherche” (FSR) 2017–2019, the Fondation Louvain via the grant “Le numérique au service de l’humain”, and the computational resources provided by the “Consortium des equipements de Calcul Intensif” (CCI), funded by the FRS-FNRS via the Grant Ref. 2.5020.11. The authors thank the anonymous referees for their valuable suggestions.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Porretta, L., Catanzaro, D., Halldórsson, B.V. et al. A Branch & Price algorithm for the minimum cost clique cover problem in max-point tolerance graphs. 4OR-Q J Oper Res 17, 75–96 (2019). https://doi.org/10.1007/s10288-018-0377-3
Received:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10288-018-0377-3
Keywords
- Clique partitioning
- Max point tolerance graphs
- Column generation
- Branch-and-price
- Computational biology
- Loss of heterozygosity