Abstract
The Internet consists of a vast inhomogeneous reservoir of data. Developing software that can integrate a wide variety of data sources is a major challenge that must be met if the Internet is to realise its full potential as a scientific research tool. This article presents a semi-automated object-oriented programming system for integrating web-based resources. We demonstrate that current Internet standards (HTML, CGI [common gateway interface], Java™, etc.) can be exploited to develop a data retrieval system that scans existing web interfaces and then uses a set of rules to generate new Java™ code that can automatically retrieve data from the Web. The software has been validated by testing it on several biological databases. We also examine the current limitations of the Internet and discuss the need for universal standards for web-based data.
Notes
Supplementary figures to this article can be found with the electronic version of this article on AdisOnline (http://www.AdisOnline.com/). The supplementary material is listed as ‘ArticlePlus’.
Acknowledgements
The software described in this paper was developed by the author in the department of Professor Hans Lehrach, Max Planck Institute for Molecular Genetics, Berlin, Germany.
The author has no conflicts of interest that are directly relevant to the content of this article.
Appendix
Refer to figure A1, Java™ code for the GenericInternetQuery class. This is a typical example of a derived InternetQuery class.
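Figure A1 itself is not reproduced here. As a rough illustration only, the sketch below shows what a base InternetQuery class and a derived GenericInternetQuery class might look like. It is not the author's code: the method names (setField, buildUrl, extractResult), the CGI-style query string, and the assumption that results are wrapped in a <pre> element are all hypothetical choices made for this example, and the network request is left out so that the sketch stays self-contained.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical base class: stores the CGI endpoint and the form fields
// of one web interface, and turns them into a GET request URL.
abstract class InternetQuery {
    private final String baseUrl;
    // LinkedHashMap keeps the fields in the order they were set.
    private final Map<String, String> fields = new LinkedHashMap<>();

    protected InternetQuery(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    public void setField(String name, String value) {
        fields.put(name, value);
    }

    // Builds "baseUrl?name1=value1&name2=value2" with URL-encoded parts.
    public String buildUrl() {
        StringBuilder sb = new StringBuilder(baseUrl);
        char separator = '?';
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append(separator)
              .append(encode(e.getKey()))
              .append('=')
              .append(encode(e.getValue()));
            separator = '&';
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(ex);
        }
    }

    // Each derived class decides how to pull the data out of the returned page.
    public abstract String extractResult(String html);
}

// Hypothetical derived class: assumes the result sits in the first <pre> element.
class GenericInternetQuery extends InternetQuery {
    public GenericInternetQuery(String baseUrl) {
        super(baseUrl);
    }

    @Override
    public String extractResult(String html) {
        int start = html.indexOf("<pre>");
        int end = html.indexOf("</pre>", start + 5);
        if (start < 0 || end < 0) {
            return ""; // no <pre> block found
        }
        return html.substring(start + 5, end).trim();
    }
}
```

Under these assumptions, a caller would create a GenericInternetQuery for a given endpoint, set the form fields with setField, request the address returned by buildUrl, and pass the returned page to extractResult. Separating URL construction from result extraction means a new web interface would only need a new derived class with its own extraction rule.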
Cite this article
Beveridge, A. An Object-Oriented Programming System for the Integration of Internet-Based Bioinformatics Resources. Appl-Bioinformatics 5, 29–39 (2006). https://doi.org/10.2165/00822942-200605010-00004
DOI: https://doi.org/10.2165/00822942-200605010-00004