Distributed and Parallel Databases, Volume 13, Issue 1, pp 7–42

A Computational Biology Database Digest: Data, Data Analysis, and Data Management

  • François Bry
  • Peer Kröger

Abstract

Computational Biology or Bioinformatics has been defined as the application of mathematical and Computer Science methods to solving problems in Molecular Biology that require large scale data, computation, and analysis [26]. As expected, Molecular Biology databases play an essential role in Computational Biology research and development. This paper gives an introduction to current Molecular Biology databases, stressing data modeling, data acquisition, data retrieval, and the integration of Molecular Biology data from different sources. It is primarily intended for an audience of computer scientists with a limited background in Biology.

computational biology, molecular biology databases, data analysis, data modelling, data integration

References

  1. 3D Hit Homepage: http://3dhit.bioinfo.pl/.
  2. S. Abiteboul, P. Buneman, and D. Suciu, Data on the Web: From Relations to Semistructured Data and XML, Morgan Kaufmann Publishers: San Francisco, 2000.
  3. ACEDB Documentation Library, http://genome.cornell.edu/acedocs/.
  4. S. Altschul, W. Gish, W. Miller, E.W. Myers, and D.J. Lipman, “Basic local alignment search tool,” Journal of Molecular Biology, vol. 215, pp. 403–410, 1990.
  5. ASN.1 Standard. Web Site. http://asn1.elibel.tm.fr.
  6. T.L. Bailey and C. Elkan, “Fitting a mixture model by expectation maximization to discover motifs in biopolymers,” in Proceedings of the 2nd International Conference on Intelligent Systems in Molecular Biology (ISMB'94), 1994, pp. 28–36.
  7. T.L. Bailey and M. Gribskov, “Combining evidence using P-values: Application to sequence homology searches,” Bioinformatics, vol. 14, pp. 48–54, 1998.
  8. A. Bairoch and R. Apweiler, “The SWISS-PROT database and its supplement TrEMBL in 2000,” Nucleic Acids Research, vol. 28, no. 1, pp. 45–48, 2000.
  9. P. Baker, A. Brass, S. Bechhofer, C. Goble, N. Paton, and R. Stevens, “Tambis--Transparent access to multiple bioinformatics information sources,” in Proceedings of the 6th International Conference on Intelligent Systems in Molecular Biology (ISMB'98), 1998, pp. 25–34.
  10. P. Baker, C. Goble, S. Bechhofer, N. Paton, R. Stevens, and A. Brass, “An ontology for bioinformatics applications,” Bioinformatics, vol. 15, no. 6, pp. 510–520, 1999.
  11. W. Baker, A. van den Broek, E. Camon, P. Hingamp, P. Sterk, G. Stoesser, and M.A. Tuli, “The EMBL nucleotide sequence database,” Nucleic Acids Research, vol. 28, no. 1, pp. 19–23, 2000.
  12. F. Bancilhon, C. Delobel, and P. Kanellakis, Building an Object-Oriented Database System: The Story of O2, Morgan Kaufmann: San Francisco, 1992.
  13. W.C. Barker, J.S. Garavelli, Z. Hou, H. Huang, R.S. Ledley, P.B. McGarvey, H.-W. Mewes, B.C. Orcutt, F. Pfeiffer, A. Tsugita, C.R. Vinayaka, C. Xiao, L.-S.L. Yeh, and C. Wu, “Protein information resource: A community resource for expert annotation of protein data,” Nucleic Acids Research, vol. 29, pp. 29–32, 2001.
  14. D. Benson, I. Karsch-Mizrachi, D.J. Lipman, J. Ostell, B.A. Rapp, and D.L. Wheeler, “GenBank,” Nucleic Acids Research, vol. 28, no. 1, pp. 15–18, 2000.
  15. B. Bos, H. Wium Lie, C. Lilley, and I. Jacobs, “Cascading style sheets, level 2,” W3C Recommendation, 1998. http://www.w3.org/TR/REC-CSS2/.
  16. S.H. Bryant, J.-F. Gibrat, and T. Madej, “Threading a database of protein cores,” Proteins, vol. 23, pp. 356–369, 1995.
  17. P. Buneman, “Semistructured data,” in Tutorial Proceedings of the 16th ACM Symposium on Principles of Database Systems, 1997.
  18. C. Burge and S. Karlin, “Prediction of complete gene structures in human genomic DNA,” Journal of Molecular Biology, vol. 268, pp. 78–94, 1997.
  19. CBS Prediction Server: http://www.cbs.dtu.dk/services/.
  20. D. Chamberlin, J. Clark, D. Florescu, J. Robie, J. Siméon, and M. Stefanescu, “XQuery 1.0: An XML query language,” W3C Working Draft, 2001, http://www.w3.org/TR/xquery/.
  21. I.-M. Chen and V. Markowitz, “An overview of the object protocol model (OPM) and the OPM data management tools,” Information Systems, vol. 20, no. 5, pp. 393–418, 1995.
  22. Chime Homepage: http://www.mdlchime.com/chime/.
  23. J. Clark and S. DeRose, “XML path language (XPath) Version 1.0,” W3C Recommendation, 1999, http://www.w3.org/TR/xpath.
  24. P. Clote and R. Backofen, Computational Molecular Biology, an Introduction, John Wiley and Sons, Ltd.: Chichester, 2000.
  25. ClustArray Homepage: http://www.cbs.dtu.dk/services/DNAarray/.
  26. M. Clutter, “Hearing on computational biology,” Statement before the Subcommittee on Science, Technology and Space Committee on Commerce, Science, and Transportation, U.S. Senate, 1996. http://www.nsf.gov/od/lpa/congress/cluttes2.htm.
  27. C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
  28. J.A. Cuff and G.J. Barton, “Evaluation and improvement of multiple sequence methods for protein secondary structure prediction,” PROTEINS: Structure, Function and Genetics, vol. 34, pp. 508–519, 1999.
  29. S.B. Davidson, C. Overton, V. Tannen, and L. Wong, “BioKleisli: A digital library for biomedical researchers,” International Journal on Digital Libraries, vol. 1, no. 1, pp. 36–53, 1997.
  30. F. Davis, B. Kahle, H. Morris, J. Salem, T. Shen, R. Wang, J. Sui, and M. Grinbaum, “WAIS interface protocol prototype functional specification (Version 1.5),” Thinking Machines Corporation, April 1990.
  31. A.L. Delcher, S. Kasif, R.D. Fleischmann, J. Peterson, O. White, and S.L. Salzberg, “Alignment of whole genomes,” Nucleic Acids Research, vol. 27, no. 11, pp. 2369–2376, 1999.
  32. U. Dengler, A.S. Siddiqui, and G.J. Barton, “Protein structural domains: Analysis of the 3Dee domains database,” Proteins, vol. 42, pp. 332–344, 2001.
  33. C. Discala, X. Benigni, E. Barillot, and G. Vaysseix, “DBcat: A catalog of 500 biological databases,” Nucleic Acids Research, vol. 28, no. 1, pp. 8–9, 2000.
  34. D.R. Dolk, “Model management and structured modeling: The role of an information resource dictionary system,” Communications of the ACM, vol. 31, no. 6, pp. 704–718, 1988.
  35. R. Durbin and J. Thierry-Mieg, “The ACEDB genome database,” in Computational Methods in Genome Research, S. Suhai (Ed.), Plenum Press: New York, 1994.
  36. S.R. Eddy, “Profile hidden Markov models,” Bioinformatics, vol. 14, pp. 755–763, 1998.
  37. Entrez Online Documentation: http://www.ncbi.nlm.nih.gov/Database/index.html.
  38. T. Etzold, A. Ulyanov, and P. Argos, “SRS: Information retrieval system for molecular biology data banks,” Methods in Enzymology, vol. 266, pp. 114–128, 1996.
  39. D.V. Faulkner and J. Jurka, “Multiple aligned sequence editor (MASE),” Trends in Biochemical Sciences, vol. 13, no. 8, pp. 321–322, 1988.
  40. J. Felsenstein, “PHYLIP--Phylogeny inference package (Version 3.2),” Cladistics, vol. 5, pp. 164–166, 1989.
  41. W. Fujibuchi, S. Goto, H. Migimatsu, I. Uchiyama, A. Ogiwara, Y. Akiyama, and M. Kanehisa, “DBGET/LinkDB: An integrated database retrieval system,” in Pacific Symposium on Biocomputing (PSB'97), 1997, pp. 683–694.
  42. M. Gardiner-Garden and M. Frommer, “CpG islands in vertebrate genomes,” Journal of Molecular Biology, vol. 196, pp. 261–282, 1987.
  43. M.S. Gelfand, A.A. Mironov, and P.A. Pevzner, “Gene recognition via spliced sequence alignment,” in Proceedings of the National Academy of Sciences USA (PNAS), vol. 93, 1996, pp. 9061–9066.
  44. GenBank Growth: http://www.ncbi.nlm.nih.gov/Genbank/genbankstats.html.
  45. D. George, H.-W. Mewes, and H. Kihara, “A standardized format for sequence data exchange,” Protein Sequence Data Analysis, vol. 1, pp. 27–39, 1987.
  46. D.R. Gilbert, D.R. Westhead, N. Nagano, and J.M. Thornton, “Motif-based searching in TOPS protein topology databases,” Bioinformatics, vol. 15, no. 4, pp. 317–326, 1999.
  47. N. Guex and M.C. Peitsch, “SWISS-MODEL and the Swiss-PdbViewer: An environment for comparative protein modeling,” Electrophoresis, vol. 18, pp. 2714–2723, 1997.
  48. A. Gupta, H.V. Jagadish, and I.S. Mumick, “Data integration using self-maintainable views,” in Proceedings of the International Conference on Extending Database Technology (EDBT), LNCS, vol. 1057, Springer Verlag, 1996, pp. 140–144.
  49. D. Gusfield, “Efficient methods for multiple sequence alignment with guaranteed error bounds,” Bulletin of Mathematical Biology, vol. 55, pp. 141–154, 1993.
  50. M. Hammer and D. McLeod, “Database description with SDM: A semantic database model,” ACM Transactions on Database Systems, vol. 6, no. 3, 1981.
  51. HIV-MAP Homepage: http://hiv-web.lanl.gov/content/hiv-db/MAP/hivmap.html.
  52. K. Hofmann, P. Bucher, L. Falquet, and A. Bairoch, “The PROSITE database, its status in 1999,” Nucleic Acids Research, vol. 27, no. 1, pp. 215–219, 1999.
  53. L. Holm and C. Sander, “Protein structure comparison by alignment of distance matrices,” Journal of Molecular Biology, vol. 233, pp. 123–138, 1993.
  54. A.K. Jain and R.C. Dubes, Algorithms for clustering data, Prentice-Hall, 1988.
  55. Jalview Homepage: http://circinus.ebi.ac.uk:6543/jalview/help.html.
  56. F. Jeanmougin, J.D. Thompson, M. Gouy, D.G. Higgins, and T.J. Gibson, “Multiple sequence alignment with clustal X,” Trends in Biochemical Sciences, vol. 23, pp. 403–405, 1998.
  57. T.K. Jenssen, A. Laegreid, J. Komorowski, and E. Hovig, “A literature network of human genes for high-throughput analysis of gene expression,” Nature Genetics, vol. 28, no. 1, pp. 21–28, 2001.
  58. D.T. Jones, “GenTHREADER: An efficient and reliable protein fold recognition method for genomic sequences,” Journal of Molecular Biology, vol. 287, pp. 797–815, 1999.
  59. D.T. Jones, “Protein secondary structure prediction based on position-specific scoring matrices,” Journal of Molecular Biology, vol. 292, pp. 195–202, 1999.
  60. D.T. Jones, W.R. Taylor, and J.M. Thornton, “A model recognition approach to the prediction of all-helical membrane protein structure and topology,” Biochemistry, vol. 33, pp. 3038–3049, 1994.
  61. P. Karp, “A strategy for database interoperation,” Journal of Computational Biology, vol. 2, no. 4, pp. 573–586, 1995.
  62. D.G. Kneller, F.E. Cohen, and R. Langridge, “Improvements in protein secondary structure prediction by an enhanced neural network,” Journal of Molecular Biology, vol. 214, pp. 171–182, 1990.
  63. T. Kohonen, Self-Organization and Associative Memory, Springer Verlag: Berlin, 1984.
  64. R. Koradi, M. Billeter, and K. Wüthrich, “MOLMOL: A program for display and analysis of macromolecular structures,” Journal of Molecular Graphics and Modelling, vol. 14, pp. 51–55, 1996.
  65. A. Krogh, B. Larsson, G. von Heijne, and E.L. Sonnhammer, “Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes,” Journal of Molecular Biology, vol. 305, no. 3, pp. 567–580, 2001.
  66. J. Kyte and R.F. Doolittle, “A simple method for displaying the hydropathic character of a protein,” Journal of Molecular Biology, vol. 157, no. 1, pp. 105–132, 1982.
  67. L.V.S. Lakshmanan, F. Sadri, and I.N. Subramanian, “SchemaSQL: A language for interoperability in relational multidatabase systems,” in Proceedings of the 22nd International Conference on Very Large Databases (VLDB'96), 1996, pp. 239–250.
  68. H. Lehväslaiho, M. Ashburner, and T. Etzold, “Unified access to mutation databases,” Trends in Genetics, vol. 14, no. 5, pp. 205–206, 1998.
  69. S. Letovsky, R.W. Cottingham, C.J. Porter, and P.W.D. Li, “GDB: The human genome database,” Nucleic Acids Research, vol. 26, no. 1, pp. 94–99, 1998.
  70. O. Lund, K. Frimand, J. Gorodkin, H. Bohr, J. Bohr, J. Hansen, and S. Brunak, “Protein distance constraints predicted by neural networks and probability density functions,” Protein Engineering, vol. 10, no. 11, pp. 1241–1248, 1997.
  71. J. MacQueen, “Some methods for classification and analysis of multivariate observations,” in 5th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–297, 1967.
  72. V. Markowitz, I.-M. Chen, A. Kosky, and E. Szeto, “Facilities for exploring molecular biology databases on the web: A comparative study,” in Pacific Symposium on Biocomputing (PSB'97), 1997, pp. 256–267.
  73. M.A. Marti-Renom, A. Stuart, A. Fiser, R. Sánchez, F. Melo, and A. Sali, “Comparative protein structure modeling of genes and genomes,” Annual Review of Biophysics and Biomolecular Structure, vol. 29, pp. 291–325, 2000.
  74. D.C. McArthur, “An extensible XML schema definition for automated exchange of protein data: PROXIML (PROtein eXtensIble Markup Language),” http://www.cse.ucsc.edu/ douglas/proximl/.
  75. R. McEntire, P. Karp, N. Abernethy, D. Benton, G. Helt, M. DeJongh, R. Kent, A. Kosky, S. Lewis, D. Hodnett, E. Neumann, F. Olken, D. Pathak, P. Tarczy-Hornoch, L. Toldo, and T. Topaloglou, “An evaluation of ontology exchange languages for bioinformatics,” in Proceedings of the 8th International Conference on Intelligent Systems in Molecular Biology (ISMB'00), 2000, pp. 239–250.
  76. C. Medigue, A. Viari, A. Henaut, and A. Danchin, “Colibri: A functional database for the Escherichia coli genome,” Microbiology and Molecular Biology Reviews, vol. 57, no. 3, pp. 623–654, 1992.
  77. H.-W. Mewes, D. Frishman, C. Gruber, B. Geier, D. Haase, A. Kaps, K. Lemcke, G. Mannhaupt, F. Pfeiffer, C. Schüller, S. Stocker, and B. Weil, “MIPS: A database for genomes and protein sequences,” Nucleic Acids Research, vol. 28, no. 1, pp. 37–40, 2000.
  78. MolScript Homepage: http://www.avatar.se/molscript/.
  79. Motif Homepage: http://motif.genome.ad.jp/.
  80. S.B. Needleman and C.D. Wunsch, “A general method applicable to the search for similarities in the amino acid sequence of two proteins,” Journal of Molecular Biology, vol. 48, pp. 443–453, 1970.
  81. Patscan Homepage: http://www-unix.mcs.anl.gov/compbio/PatScan/HTML/patscan.html.
  82. W. Pearson and D. Lipman, “Improved tools for biological sequence comparison,” in Proceedings of the National Academy of Sciences USA (PNAS), vol. 85, pp. 2444–2448, 1988.
  83. M.C. Peitsch, “ProMod and swiss-model: Internet-based tools for automated comparative protein modelling,” Biochemical Society Transactions, vol. 24, pp. 274–279, 1996.
  84. G. Perrière, P. Bessières, and B. Labedan, “EMGLib: The enhanced microbial genomes library (Update 2000),” Nucleic Acids Research, vol. 28, no. 1, pp. 68–71, 2000.
  85. Predator Homepage: http://www.embl-heidelberg.de/argos/predator/predator info.html.
  86. D.S. Prestridge, “SIGNAL SCAN: A computer program that scans DNA sequences for eukaryotic transcriptional elements,” CABIOS, vol. 7, pp. 203–206, 1991.
  87. ProFit Homepage: http://www.bioinf.org.uk/software/.
  88. M. Prokop, J. Damborsky, and J. Koca, “TRITON: In Silico construction of protein mutants and prediction of their activities,” Bioinformatics, vol. 16, pp. 845–846, 2000.
  89. Promoter Scan Homepage: http://bimas.dcrt.nih.gov/molbio/proscan/index.html.
  90. Protein Structure Prediction Center: http://predictioncenter.llnl.gov/.
  91. PubMed Database: http://www.ncbi.nlm.nih.gov/PubMed/.
  92. Readseq Homepage: http://www.nih.go.jp/%7Ejun/cgi-bin/readseq.pl.
  93. F. Rechenmann, “Knowledge bases and computational biology,” in Towards Very Large Knowledge Bases, N. Mars (Ed.), IOS Press, 1995, pp. 1–12.
  94. I.T. Rombel, K.F. Sykes, S. Rayner, and S.A. Johnston, “ORF-FINDER: A vector for high-throughput gene identification,” Gene, vol. 282, nos. 1/2, pp. 33–41, 2002.
  95. B. Rost, “Review: Protein secondary structure prediction continues to rise,” Journal of Structural Biology, vol. 134, nos. 2/3, pp. 204–218, 2001.
  96. B. Rost and C. Sander, “Prediction of protein secondary structure at better than 70% accuracy,” Journal of Molecular Biology, vol. 232, pp. 584–599, 1993.
  97. RPFOLD Homepage: http://www.imtech.res.in/raghava/rpfold/.
  98. K.-U. Sattler, S. Conrad, and G. Saake, “Adding conflict resolution features to a query language for database federations,” Australian Journal of Information Systems, vol. 8, no. 1, pp. 116–125, 2000.
  99. R. Sayle and E.J. Milner-White, “RasMol: Biomolecular graphics for all,” Trends in Biochemical Sciences, vol. 20, no. 9, p. 374, 1995.
  100. S. Schwartz, Z. Zhang, K.A. Frazer, A. Smit, C. Riemer, J. Bouck, R. Gibbs, R. Hardison, and W. Miller, “PipMaker: A web server for aligning two genomic DNA sequences,” Genome Research, vol. 10, no. 4, pp. 577–586, 2000.
  101. SFgate Homepage: http://ls6-www.informatik.uni-dortmund.de/ir/projects/SFgate/#intro.
  102. A.P. Sheth and J.A. Larson, “Federated database systems for managing distributed, heterogeneous, and automated databases,” ACM Computing Surveys, vol. 22, no. 3, pp. 183–196, 1990.
  103. J. Shi, T.L. Blundell, and K. Mizuguchi, “FUGUE: Sequence-structure homology recognition using environment-specific substitution tables and structure-dependent gap penalties,” Journal of Molecular Biology, vol. 310, pp. 243–257, 2001.
  104. A.S. Siddiqui, U. Dengler, and G.J. Barton, “3Dee: A database of protein structural domains,” Bioinformatics, vol. 17, pp. 200–201, 2001.
  105. T.F. Smith and M.S. Waterman, “Identification of common molecular subsequences,” Journal of Molecular Biology, vol. 147, pp. 195–197, 1981.
  106. R.F. Smith, B.A. Wiese, M.K. Wojzynski, D.B. Davison, and K.C. Worley, “BCM search launcher--an integrated interface to molecular biology data base search and analysis services available on the world wide web,” Genome Research, vol. 6, no. 5, pp. 454–462, 1996.
  107. S. Spaccapietra, C. Parent, and Y. Dupont, “Model independent assertions for integration of heterogeneous schemas,” VLDB Journal, vol. 1, no. 1, pp. 81–126, 1992.
  108. SRS User Guide, 2000, /srs6/doc/srsuser.pdf.
  109. S.A. Sullivan, L. Aravind, I. Makalowska, A.D. Baxevanis, and D. Landsman, “The histone database: A comprehensive WWW resource for histones and histone fold-containing proteins,” Nucleic Acids Research, vol. 28, no. 1, pp. 320–322, 2000.
  110. R.M. Sweet and D. Eisenberg, “Correlation of sequence hydrophobicities measures similarity in three-dimensional protein structure,” Journal of Molecular Biology, vol. 171, no. 4, pp. 479–488, 1983.
  111. Y. Tateno, S. Miyazaki, M. Ota, H. Sugawara, and T. Gojobori, “DNA data bank of Japan (DDBJ) in collaboration with mass sequencing teams,” Nucleic Acids Research, vol. 28, no. 1, pp. 24–26, 2000.
  112. J.D. Thompson, D.G. Higgins, and T.J. Gibson, “CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, positions-specific gap penalties and weight matrix choice,” Nucleic Acids Research, vol. 22, pp. 4673–4680, 1994.
  113. S. Tsur, “Data mining in the bioinformatics domain,” in Proceedings of the 26th Conference on Very Large Databases (VLDB'00), 2000.
  114. J. van Helden, A. Naim, R. Mancuso, M. Eldridge, L. Wernisch, D. Gilbert, and S.J. Wodak, “Representing and analysing molecular and cellular function in the computer,” Biological Chemistry, vol. 381, pp. 921–935, 2000.
  115. J. Vilo, A. Brazma, I. Jonassen, A. Robinson, and E. Ukkonen, “Mining for putative regulatory elements in the yeast genome using gene expression data,” in Proceedings of the 8th International Conference on Intelligent Systems in Molecular Biology (ISMB'00), 2000, pp. 384–394.
  116. G. von Heijne, “Membrane protein structure prediction: Hydrophobicity analysis and the 'Positive Inside' rule,” Journal of Molecular Biology, vol. 225, pp. 487–494, 1992.
  117. A.C. Wallace, R.A. Laskowski, and J.M. Thornton, “LIGPLOT: A program to generate schematic diagrams of protein-ligand interactions,” Protein Engineering, vol. 8, pp. 127–134, 1995.
  118. Y. Wang, L.Y. Geer, C. Chappey, J.A. Kans, and S.H. Bryant, “Cn3D: Sequence and structure views for entrez,” Trends in Biochemical Sciences, vol. 25, no. 6, pp. 300–302, 2000.
  119. Wise2 Homepage: http://www.sanger.ac.uk/Software/Wise2/.
  120. XEMBL Project: http://www.ebi.ac.uk/xembl/.
  121. G. Xie, R. DeMarco, R. Blevins, and Y. Wang, “Storing biological sequence databases in relational form,” Bioinformatics, vol. 16, no. 2, pp. 288–289, 2000.
  122. Y. Xu, R.J. Mural, and E.C. Uberbacher, “Inferring gene structures in genomic sequences using pattern recognition and expressed sequence tags,” in Proceedings of the 5th International Conference on Intelligent Systems in Molecular Biology (ISMB'97), 1997, pp. 344–353.
  123. R. Zimmer and T. Lengauer, “Protein structure prediction,” in Bioinformatics--From Genomes to Drugs, T. Lengauer (Ed.), Vol. 1: Basic Technologies, Wiley-VCH, 2002.

Copyright information

© Kluwer Academic Publishers 2003

Authors and Affiliations

  • François Bry (1)
  • Peer Kröger (1)

  1. Institute for Computer Science, University of Munich, Germany
