Hardness and Approximation of the Asynchronous Border Minimization Problem

(Extended Abstract)
  • Alexandru Popa
  • Prudence W. H. Wong
  • Fencol C. C. Yung
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7287)

Abstract

We study a combinatorial problem arising from microarray synthesis. The objective of the border minimization problem (BMP) is to place a set of sequences in the array and to find an embedding of these sequences into a common supersequence such that the total “border length” is minimized. A variant of the problem, called P-BMP, takes the placement as given, so that the only task is to find the embedding.
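
The abstract does not define “border length”. As a minimal sketch, assume the definition that is standard in the microarray layout literature: every sequence is embedded into the common supersequence (the deposition sequence) with `-` marking skipped steps, and the border length between two adjacent cells is the Hamming distance between their embeddings. The function names and the toy supersequence `ACAC` below are illustrative assumptions, not taken from the paper.

```python
from typing import List


def is_embedding(embedding: str, probe: str, deposition: str) -> bool:
    """Return True if `embedding` spells `probe` inside `deposition`.

    Assumed convention: '-' marks a step of the deposition sequence
    that the probe skips.
    """
    if len(embedding) != len(deposition):
        return False
    letters = []
    for e, d in zip(embedding, deposition):
        if e == "-":
            continue
        if e != d:
            return False
        letters.append(e)
    return "".join(letters) == probe


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length embeddings differ."""
    return sum(x != y for x, y in zip(a, b))


def border_length(grid: List[List[str]]) -> int:
    """Sum of Hamming distances over all horizontally and vertically
    adjacent cells of a rectangular grid of embeddings."""
    total = 0
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if c + 1 < len(row):
                total += hamming(cell, row[c + 1])
            if r + 1 < len(grid):
                total += hamming(cell, grid[r + 1][c])
    return total


# Probe "A" can be embedded into the supersequence "ACAC" in two ways.
# Next to the probe "AC" embedded as "--AC", the choice changes the border.
early = border_length([["A---", "--AC"]])
late = border_length([["--A-", "--AC"]])
```

In this toy 1D array the placement is fixed and only the embedding of the first probe varies, which is the situation P-BMP addresses: `early` counts three differing positions and `late` only one.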

Approximation algorithms have been proposed for the problem [21], but it was not known whether the problem is NP-hard. In this paper, we give a comprehensive study of different variations of BMP by presenting NP-hardness proofs and improved approximation algorithms. We show that P-BMP, 1D-BMP, and BMP are all NP-hard. Contrasted with the result in [21] that 1D-P-BMP is solvable in polynomial time, these results imply that (i) the array dimension (1D or 2D) differentiates the complexity of P-BMP; (ii) for a 1D array, whether the placement is given differentiates the complexity of BMP; and (iii) BMP is NP-hard regardless of the dimension of the array. Another contribution of the paper is improving the approximation ratio for BMP from $O(n^{1/2} \log^2 n)$ to $O(n^{1/4} \log^2 n)$, where $n$ is the total number of sequences.

Keywords

Approximation Algorithm; Deposition Sequence; Euler Tour; Optimal Embedding; Binary Alphabet
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Bartal, Y.: Probabilistic approximations of metric spaces and its algorithmic applications. In: FOCS, pp. 184–193 (1996)
  2. Bonizzoni, P., Vedova, G.D.: The complexity of multiple sequence alignment with SP-score that is a metric. TCS 259(1-2), 63–79 (2001)
  3. de Carvalho Jr., S.A., Rahmann, S.: Improving the Layout of Oligonucleotide Microarrays: Pivot Partitioning. In: Bücher, P., Moret, B.M.E. (eds.) WABI 2006. LNCS (LNBI), vol. 4175, pp. 321–332. Springer, Heidelberg (2006)
  4. de Carvalho Jr., S.A., Rahmann, S.: Microarray layout as quadratic assignment problem. In: Proc. GCB, pp. 11–20 (2006)
  5. de Carvalho Jr., S.A., Rahmann, S.: Improving the design of genechip arrays by combining placement and embedding. In: Proc. 6th CSB, pp. 54–63 (2007)
  6. Chatterjee, M., Mohapatra, S., Ionan, A., Bawa, G., Ali-Fehmi, R., Wang, X., Nowak, J., Ye, B., Nahhas, F.A., Lu, K., Witkin, S.S., Fishman, D., Munkarah, A., Morris, R., Levin, N.K., Shirley, N.N., Tromp, G., Abrams, J., Draghici, S., Tainsky, M.A.: Diagnostic markers of ovarian cancer by high-throughput antigen cloning and detection on arrays. Cancer Research 66(2), 1181–1190 (2006)
  7. Cretich, M., Chiari, M.: Peptide Microarrays Methods and Protocols. Methods in Molecular Biology, vol. 570. Humana Press (2009)
  8. Ernvall, J., Katajainen, J., Penttonen, M.: NP-completeness of the hamming salesman problem. BIT Numerical Mathematics 25, 289–292 (1985)
  9. Fakcharoenphol, J., Rao, S., Talwar, K.: A tight bound on approximating arbitrary metrics by tree metrics. In: STOC, pp. 448–455 (2003)
  10. Feng, D.F., Doolittle, R.F.: Approximation algorithms for multiple sequence alignment. TCS 182(1), 233–244 (1987)
  11. Fodor, S., Read, J., Pirrung, M., Stryer, L., Lu, A., Solas, D.: Light-directed, spatially addressable parallel chemical synthesis. Science 251(4995), 767–773 (1991)
  12. Gerhold, D., Rushmore, T., Caskey, C.T.: DNA chips: promising toys have become powerful tools. Trends in Biochemical Sciences 24(5), 168–173 (1999)
  13. Gusfield, D.: Efficient methods for multiple sequence alignment with guaranteed error bounds. Bulletin of Mathematical Biology 55(1), 141–154 (1993)
  14. Hannenhalli, S., Hubbell, E., Lipshutz, R., Pevzner, P.A.: Combinatorial algorithms for design of DNA arrays. Adv. in Biochem. Eng./Biotech. 77, 1–19 (2002)
  15. Kaderali, L., Schliep, A.: Selecting signature oligonucleotides to identify organisms using DNA arrays. Bioinformatics 18, 1340–1349 (2002)
  16. Kahng, A.B., Mandoiu, I.I., Pevzner, P.A., Reda, S., Zelikovsky, A.: Scalable heuristics for design of DNA probe arrays. JCB 11(2/3), 429–447 (2004); Preliminary versions in WABI 2002 and RECOMB 2003
  17. Kahng, A.B., Mandoiu, I.I., Reda, S., Xu, X., Zelikovsky, A.: Computer-aided optimization of DNA array design and manufacturing. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 25(2), 305–320 (2006)
  18. Kasif, S., Weng, Z., Derti, A., Beigel, R., De Lisi, C.: A computational framework for optimal masking in the synthesis of oligonucleotide microarrays. Nucleic Acids Research 30(20), e106 (2002)
  19. Kundeti, V., Rajasekaran, S.: On the hardness of the border length minimization problem. In: BIBE, pp. 248–253 (2009)
  20. Kundeti, V., Rajasekaran, S., Dinh, H.: On the border length minimization problem (BLMP) on a square array. CoRR, abs/1003.2839 (2010)
  21. Li, C.Y., Wong, P.W.H., Xin, Q., Yung, F.C.C.: Approximating Border Length for DNA Microarray Synthesis. In: Agrawal, M., Du, D.-Z., Duan, Z., Li, A. (eds.) TAMC 2008. LNCS, vol. 4978, pp. 410–422. Springer, Heidelberg (2008)
  22. Li, F., Stormo, G.: Selection of optimal DNA oligos for gene expression arrays. Bioinformatics 17(11), 1067–1076 (2001)
  23. Melle, C., Ernst, G., Schimmel, B., Bleul, A., Koscielny, S., Wiesner, A., Bogumil, R., Möller, U., Osterloh, D., Halbhuber, K.-J., von Eggeling, F.: A technical triade for proteomic identification and characterization of cancer biomarkers. Cancer Research 64(12), 4099–4104 (2004)
  24. Rahmann, S.: The shortest common supersequence problem in a microarray production setting. Bioinformatics 19(suppl.2), 156–161 (2003)
  25. Räihä, K.-J.: The shortest common supersequence problem over binary alphabet is NP-complete. Theoretical Computer Science 16(2), 187–198 (1981)
  26. Reinert, K., Lenhof, H.P., Mutzel, P., Mehlhorn, K., Kececioglu, J.D.: A branch-and-cut algorithm for multiple sequence alignment. In: RECOMB, pp. 241–250 (1997)
  27. Slonim, D.K., Tamayo, P., Mesirov, J.P., Golub, T.R., Lander, E.S.: Class prediction and discovery using gene expression data. In: RECOMB, pp. 263–272 (2000)
  28. Sung, W.K., Lee, W.H.: Fast and accurate probe selection algorithm for large genomes. In: Proc. 2nd CSB, pp. 65–74 (2003)
  29. Welsh, J., Sapinoso, L., Kern, S., Brown, D., Liu, T., Bauskin, A., Ward, R., Hawkins, N., Quinn, D., Russell, P., Sutherland, R., Breit, S., Moskaluk, C., Frierson Jr., H., Hampton, G.: Large-scale delineation of secreted protein biomarkers overexpressed in cancer tissue and serum. PNAS 100(6), 3410–3415 (2003)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Alexandru Popa (1)
  • Prudence W. H. Wong (2)
  • Fencol C. C. Yung (2)
  1. Department of Communications & Networking, Aalto University School of Electrical Engineering, Aalto, Finland
  2. Department of Computer Science, University of Liverpool, UK
