Application of Biomolecular Computing to Medical Science: A Biomolecular Database System for Storage, Processing, and Retrieval of Genetic Information and Material

  • Chapter
Complex Systems Science in Biomedicine

Abstract

A key problem in medical science and genomics is the efficient storage, processing, and retrieval of genetic information and material. This chapter presents an architecture for a Biomolecular Database system that would provide a unique capability in genomics. It completely bypasses the usual transformation from biological material (genomic DNA and transcribed RNA) to digital media, as done in conventional bioinformatics. Instead, biotechnology techniques provide the needed capability of a Biomolecular Database system without ever transferring the biological information into digital media. The inputs to the system are DNA obtained from tissues: either genomic DNA or reverse-transcribed cDNA. The input DNA is then tagged with artificially synthesized DNA strands. These “information tags” encode essential information (e.g., identification of the DNA donor, as well as the date of the sample, gender, and date of birth) about the individual or cell type from which the DNA was obtained. The resulting Biomolecular Database is capable of containing a vast store of genomic DNA obtained from many individuals (e.g., multiple army divisions). For example, the DNA of a million individuals requires about 6 petabits (6 × 10^15 bits), but due to the compactness of DNA, a volume the size of a conventional test tube with a few milliliters of solution could contain that entire Biomolecular Database. Known procedures for amplification and reproduction of the resulting Biomolecular Database are discussed. The Biomolecular Database system is capable of retrieving subsets of the stored genetic material, which are specified by associative queries on the tags and/or the attached genomic DNA strands, as well as by logical selection queries on the tags of the database. We describe how these queries can be executed by applying recombinant DNA operations to the Biomolecular Database, which have the effect of selecting the subsets of the database specified by the queries.
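The storage figure of 6 × 10^15 bits for a million individuals can be checked with a short calculation. The sketch below is only a back-of-the-envelope check, not a computation taken from the chapter: the genome length of about 3 × 10^9 bases and the 2 bits per base are assumptions introduced here, and the function name `database_bits` is hypothetical.

```python
# Back-of-the-envelope check of the storage figure quoted in the abstract.
# The two constants below are assumptions, not values stated in the abstract.
BASES_PER_GENOME = 3_000_000_000  # assumed: about 3 x 10^9 bases per human genome copy
BITS_PER_BASE = 2                 # assumed: 4 nucleotides (A, C, G, T) need log2(4) = 2 bits
INDIVIDUALS = 1_000_000           # from the abstract: a million individuals


def database_bits(individuals: int = INDIVIDUALS) -> int:
    """Return the number of bits needed to represent one genome copy per individual."""
    return individuals * BASES_PER_GENOME * BITS_PER_BASE


total = database_bits()
print(f"{total:.1e} bits, i.e. {total / 1e15:.0f} petabits")
```

Under these assumptions the product comes out at 6 × 10^15 bits, which matches the figure in the abstract.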
In particular, we describe how to execute these queries on this Biomolecular Database by the use of biomolecular computing (also known as DNA computing) techniques, including execution of parallel associative search queries on DNA databases, and the execution of logical operations using recombinant DNA operations. We also utilize recent biotechnology developments (recombinant DNA technology, DNA hybridization arrays, DNA tagging methods, etc.), which are quickly being enhanced in scale (e.g., output via DNA hybridization array technology). The chapter also discusses applications of such a Biomolecular Database system to various medical sciences and genomic processing capabilities, including: (a) rapid identification of subpopulations possessing a specific known genotype, (b) large-scale gene expression profiling using DNA databases, and (c) streamlining identification of susceptibility genes (high-throughput screening of candidate genes to optimize genetic association analysis for complex diseases). Such a Biomolecular Database system may provide a revolutionary change in the way that these genomic problems are solved.
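The logical selection queries on tags described above can be illustrated with a purely in-silico toy model. This is a sketch under stated assumptions, not the chapter's laboratory protocol or tag design: the codebook of DNA "words", the `Strand` record, and the `select` function are all hypothetical, and exact substring matching stands in for hybridization of a complementary probe to a tag. The words are assumed to be chosen so that none of them appears across the boundary between two adjacent words.

```python
from dataclasses import dataclass

# Hypothetical codebook mapping an attribute value to a short DNA "word".
CODEBOOK = {
    ("gender", "F"): "ACGTAC",
    ("gender", "M"): "TGCATG",
    ("cohort", "A"): "GATTCA",
    ("cohort", "B"): "CTAAGT",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


@dataclass(frozen=True)
class Strand:
    tag: str      # concatenated information-tag words
    payload: str  # label standing in for the attached genomic DNA


def make_strand(attrs: dict, payload: str) -> Strand:
    """Build a tagged strand by concatenating the words for each attribute."""
    tag = "".join(CODEBOOK[(key, value)] for key, value in sorted(attrs.items()))
    return Strand(tag, payload)


def select(pool, key, value):
    """Split a pool into (bound, unbound) strands for one attribute value.

    The probe is the reverse complement of the tag word; a strand "binds"
    if its tag contains the sequence complementary to the probe.
    """
    probe = reverse_complement(CODEBOOK[(key, value)])
    target = reverse_complement(probe)
    bound = [s for s in pool if target in s.tag]
    unbound = [s for s in pool if target not in s.tag]
    return bound, unbound


pool = [
    make_strand({"gender": "F", "cohort": "A"}, "payload-1"),
    make_strand({"gender": "M", "cohort": "A"}, "payload-2"),
    make_strand({"gender": "F", "cohort": "B"}, "payload-3"),
    make_strand({"gender": "M", "cohort": "B"}, "payload-4"),
]

# AND: run two selection rounds in series.
females, non_females = select(pool, "gender", "F")
females_in_a, _ = select(females, "cohort", "A")

# OR: pool the bound fraction with a second selection on the unbound fraction.
a_among_rest, _ = select(non_females, "cohort", "A")
female_or_a = females + a_among_rest

# NOT: keep the fraction that did not bind.
_, not_in_a = select(pool, "cohort", "A")

print([s.payload for s in females_in_a])
print(sorted(s.payload for s in female_or_a))
print([s.payload for s in not_in_a])
```

In this model a conjunction is a series of selections on the bound fraction, a disjunction pools bound fractions, and a negation keeps the unbound fraction. A real implementation would have to account for imperfect hybridization and strand loss, which this toy ignores.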

Author information

Corresponding author

Correspondence to John H. Reif.

Copyright information

© 2006 Springer Inc.

About this chapter

Cite this chapter

Reif, J.H., Hauser, M., Pirrung, M., LaBean, T. (2006). Application of Biomolecular Computing to Medical Science: A Biomolecular Database System for Storage, Processing, and Retrieval of Genetic Information and Material. In: Deisboeck, T.S., Kresh, J.Y. (eds) Complex Systems Science in Biomedicine. Topics in Biomedical Engineering International Book Series. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-33532-2_31
