Abstract
We study a combinatorial problem arising from microarray synthesis. The objective of the Border Minimization Problem (BMP) is to place a set of sequences in the array and to find an embedding of these sequences into a common supersequence such that the total “border length” is minimized. In a variant of the problem, called P-BMP, the placement is given and the task is only to find the embedding.
Approximation algorithms have been proposed for the problem [21], but it has been unknown whether the problem is NP-hard. In this paper, we give a comprehensive study of different variations of BMP by presenting NP-hardness proofs and improved approximation algorithms. We show that P-BMP, 1D-BMP, and BMP are all NP-hard. In contrast with the result in [21] that 1D-P-BMP is polynomial-time solvable, the interesting implications include: (i) the array dimension (1D or 2D) differentiates the complexity of P-BMP; (ii) for a 1D array, whether the placement is given differentiates the complexity of BMP; (iii) BMP is NP-hard regardless of the dimension of the array. Another contribution of the paper is improving the approximation ratio for BMP from $O(n^{1/2} \log^2 n)$ to $O(n^{1/4} \log^2 n)$, where $n$ is the total number of sequences.
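The abstract does not spell out how border length is measured. As a minimal sketch, assume the definition common in the microarray-layout literature: each sequence is embedded into the common supersequence by leaving gaps at unused positions, and the border length between two adjacent array cells is the Hamming distance between their embeddings. The gap symbol `-` and the helper names `is_embedding`, `hamming`, and `border_length` are illustrative choices, not taken from the paper.

```python
GAP = "-"  # assumed gap symbol marking unused supersequence positions


def is_embedding(embedded, sequence, supersequence):
    """Return True if `embedded` places `sequence` into `supersequence`.

    Every non-gap character must match the supersequence at the same
    position, and removing the gaps must give back the sequence.
    """
    if len(embedded) != len(supersequence):
        return False
    if any(c != GAP and c != s for c, s in zip(embedded, supersequence)):
        return False
    return embedded.replace(GAP, "") == sequence


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def border_length(grid):
    """Total border length of a rectangular grid of embeddings.

    Sums the Hamming distance over every pair of horizontally or
    vertically adjacent cells. A 1D array is a grid with a single row.
    """
    rows, cols = len(grid), len(grid[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:  # right neighbour
                total += hamming(grid[r][c], grid[r][c + 1])
            if r + 1 < rows:  # neighbour below
                total += hamming(grid[r][c], grid[r + 1][c])
    return total


# Fixed placement (as in P-BMP), two different embeddings of "C" into "ACAC".
aligned = [["AC--", "-C--"]]
shifted = [["AC--", "---C"]]
```

With the placement fixed, `border_length(aligned)` is 1 while `border_length(shifted)` is 3, so the choice of embedding alone changes the objective. This is the quantity P-BMP optimizes under the assumed definition; BMP additionally optimizes over the placement.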
References
Bartal, Y.: Probabilistic approximations of metric spaces and its algorithmic applications. In: FOCS, pp. 184–193 (1996)
Bonizzoni, P., Vedova, G.D.: The complexity of multiple sequence alignment with SP-score that is a metric. TCS 259(1-2), 63–79 (2001)
de Carvalho Jr., S.A., Rahmann, S.: Improving the Layout of Oligonucleotide Microarrays: Pivot Partitioning. In: Bücher, P., Moret, B.M.E. (eds.) WABI 2006. LNCS (LNBI), vol. 4175, pp. 321–332. Springer, Heidelberg (2006)
de Carvalho Jr., S.A., Rahmann, S.: Microarray layout as quadratic assignment problem. In: Proc. GCB, pp. 11–20 (2006)
de Carvalho Jr., S.A., Rahmann, S.: Improving the design of genechip arrays by combining placement and embedding. In: Proc. 6th CSB, pp. 54–63 (2007)
Chatterjee, M., Mohapatra, S., Ionan, A., Bawa, G., Ali-Fehmi, R., Wang, X., Nowak, J., Ye, B., Nahhas, F.A., Lu, K., Witkin, S.S., Fishman, D., Munkarah, A., Morris, R., Levin, N.K., Shirley, N.N., Tromp, G., Abrams, J., Draghici, S., Tainsky, M.A.: Diagnostic markers of ovarian cancer by high-throughput antigen cloning and detection on arrays. Cancer Research 66(2), 1181–1190 (2006)
Cretich, M., Chiari, M.: Peptide Microarrays: Methods and Protocols. Methods in Molecular Biology, vol. 570. Humana Press (2009)
Ernvall, J., Katajainen, J., Penttonen, M.: NP-completeness of the hamming salesman problem. BIT Numerical Mathematics 25, 289–292 (1985)
Fakcharoenphol, J., Rao, S., Talwar, K.: A tight bound on approximating arbitrary metrics by tree metrics. In: STOC, pp. 448–455 (2003)
Feng, D.F., Doolittle, R.F.: Approximation algorithms for multiple sequence alignment. TCS 182(1), 233–244 (1987)
Fodor, S., Read, J., Pirrung, M., Stryer, L., Lu, A., Solas, D.: Light-directed, spatially addressable parallel chemical synthesis. Science 251(4995), 767–773 (1991)
Gerhold, D., Rushmore, T., Caskey, C.T.: DNA chips: promising toys have become powerful tools. Trends in Biochemical Sciences 24(5), 168–173 (1999)
Gusfield, D.: Efficient methods for multiple sequence alignment with guaranteed error bounds. Bulletin of Mathematical Biology 55(1), 141–154 (1993)
Hannenhalli, S., Hubbell, E., Lipshutz, R., Pevzner, P.A.: Combinatorial algorithms for design of DNA arrays. Adv. in Biochem. Eng./Biotech. 77, 1–19 (2002)
Kaderali, L., Schliep, A.: Selecting signature oligonucleotides to identify organisms using DNA arrays. Bioinformatics 18, 1340–1349 (2002)
Kahng, A.B., Mandoiu, I.I., Pevzner, P.A., Reda, S., Zelikovsky, A.: Scalable heuristics for design of DNA probe arrays. JCB 11(2/3), 429–447 (2004); Preliminary versions in WABI 2002 and RECOMB 2003
Kahng, A.B., Mandoiu, I.I., Reda, S., Xu, X., Zelikovsky, A.: Computer-aided optimization of DNA array design and manufacturing. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 25(2), 305–320 (2006)
Kasif, S., Weng, Z., Derti, A., Beigel, R., DeLisi, C.: A computational framework for optimal masking in the synthesis of oligonucleotide microarrays. Nucleic Acids Research 30(20), e106 (2002)
Kundeti, V., Rajasekaran, S.: On the hardness of the border length minimization problem. In: BIBE, pp. 248–253 (2009)
Kundeti, V., Rajasekaran, S., Dinh, H.: On the border length minimization problem (BLMP) on a square array. CoRR, abs/1003.2839 (2010)
Li, C.Y., Wong, P.W.H., Xin, Q., Yung, F.C.C.: Approximating Border Length for DNA Microarray Synthesis. In: Agrawal, M., Du, D.-Z., Duan, Z., Li, A. (eds.) TAMC 2008. LNCS, vol. 4978, pp. 410–422. Springer, Heidelberg (2008)
Li, F., Stormo, G.: Selection of optimal DNA oligos for gene expression arrays. Bioinformatics 17(11), 1067–1076 (2001)
Melle, C., Ernst, G., Schimmel, B., Bleul, A., Koscielny, S., Wiesner, A., Bogumil, R., Möller, U., Osterloh, D., Halbhuber, K.-J., von Eggeling, F.: A technical triade for proteomic identification and characterization of cancer biomarkers. Cancer Research 64(12), 4099–4104 (2004)
Rahmann, S.: The shortest common supersequence problem in a microarray production setting. Bioinformatics 19(suppl.2), 156–161 (2003)
Räihä, K.-J.: The shortest common supersequence problem over binary alphabet is NP-complete. Theoretical Computer Science 16(2), 187–198 (1981)
Reinert, K., Lenhof, H.P., Mutzel, P., Mehlhorn, K., Kececioglu, J.D.: A branch-and-cut algorithm for multiple sequence alignment. In: RECOMB, pp. 241–250 (1997)
Slonim, D.K., Tamayo, P., Mesirov, J.P., Golub, T.R., Lander, E.S.: Class prediction and discovery using gene expression data. In: RECOMB, pp. 263–272 (2000)
Sung, W.K., Lee, W.H.: Fast and accurate probe selection algorithm for large genomes. In: Proc. 2nd CSB, pp. 65–74 (2003)
Welsh, J., Sapinoso, L., Kern, S., Brown, D., Liu, T., Bauskin, A., Ward, R., Hawkins, N., Quinn, D., Russell, P., Sutherland, R., Breit, S., Moskaluk, C., Frierson Jr., H., Hampton, G.: Large-scale delineation of secreted protein biomarkers overexpressed in cancer tissue and serum. PNAS 100(6), 3410–3415 (2003)
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Popa, A., Wong, P.W.H., Yung, F.C.C. (2012). Hardness and Approximation of the Asynchronous Border Minimization Problem. In: Agrawal, M., Cooper, S.B., Li, A. (eds) Theory and Applications of Models of Computation. TAMC 2012. Lecture Notes in Computer Science, vol 7287. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29952-0_20
DOI: https://doi.org/10.1007/978-3-642-29952-0_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29951-3
Online ISBN: 978-3-642-29952-0
eBook Packages: Computer Science (R0)