Abstract
Ever-increasing data volumes and restricted execution times call for automated computational workflow systems, and several tools are emerging to meet this need. Such systems rely on scripting to define the repetitive tasks that are run on new data to generate the desired output. They let the researcher focus on what a particular virtual experiment will achieve rather than on how the process is executed. The theme of this chapter is the identification of repetitive tasks that can be automated, so that workflows can streamline a series of computational tasks efficiently. A brief introduction to workflows and their components is followed by in-depth tutorials using today’s state-of-the-art workflow-based applications in chemoinformatics for drug discovery research. J-ProLINE, a stand-alone chemo-bioinformatics workflow application developed in-house for protein–ligand networks, is also presented.
© 2014 Springer India
About this chapter
Cite this chapter
Karthikeyan, M., Vyas, R. (2014). Integration of Automated Workflow in Chemoinformatics for Drug Discovery. In: Practical Chemoinformatics. Springer, New Delhi. https://doi.org/10.1007/978-81-322-1780-0_9
DOI: https://doi.org/10.1007/978-81-322-1780-0_9
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-1779-4
Online ISBN: 978-81-322-1780-0
eBook Packages: Chemistry and Materials Science, Chemistry and Material Science (R0)