Abstract
Bacteriocins are proteinaceous toxins produced and exported by both gram-negative and gram-positive bacteria as a defense mechanism. The bacteriocin protein family is highly diverse, which complicates the identification of bacteriocin-like sequences by alignment approaches. Topological indices (TIs), which do not depend on sequence similarity, are a promising alternative for predicting proteinaceous bacteriocins. We therefore present Topological Indices to BioPolymers (TI2BioP), an alignment-free approach inspired by both the Topological Substructural Molecular Design (TOPS-MODE) and the Markov Chain Invariants for Network Selection and Design (MARCH-INSIDE) methodologies. TI2BioP calculates spectral moments as simple TIs for building quantitative sequence-function relationship (QSFR) models. Since hydrophobicity and basicity are major determinants of the bactericidal activity of bacteriocins, the spectral moments (HPμk) were derived for the first time from artificial protein secondary structures obtained by clustering amino acids in a Cartesian system of hydrophobicity and polarity. Several orders of HPμk were used to numerically characterize 196 bacteriocin-like sequences and a control group of 200 representative CATH domains. These descriptors were then used to develop an alignment-free QSFR model that achieved 76.92% discrimination of bacteriocin proteins from other domains, a relevant result given the high sequence diversity among the members of both groups. The model showed an overall prediction performance of 72.16%, specifically detecting 66.7% of proteinaceous bacteriocins, whereas InterProScan retrieved just 60.2%. As a practical validation, the model also successfully predicted the cryptic bactericidal function of the Cry1Ab C-terminal domain from the Bacillus thuringiensis endotoxin, which had not been detected by classical alignment methods.
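As an illustration only, the idea of deriving spectral moments from a sequence placed in a Cartesian system can be sketched as follows. This is not the TI2BioP implementation. It assumes that a spectral moment of order k is the trace of the k-th power of an adjacency-type matrix, as in spectral-moment approaches such as TOPS-MODE. The four residue classes, the direction assigned to each class, and the function names are hypothetical choices made for this example, and the sketch uses an unweighted node adjacency matrix for brevity.

```python
from typing import Dict, List, Tuple

# Hypothetical grouping of the 20 residues into four classes, each mapped to a
# unit step on a 2D lattice. Classes and directions are assumptions made for
# this sketch, not the scheme used by TI2BioP.
STEP_BY_CLASS: Dict[str, Tuple[int, int]] = {
    "AVLIMFWPG": (0, 1),   # treated here as nonpolar: step up
    "STCYNQ": (1, 0),      # treated here as polar: step right
    "DE": (0, -1),         # acidic: step down
    "KRH": (-1, 0),        # basic: step left
}
STEP_BY_RESIDUE = {aa: step for group, step in STEP_BY_CLASS.items() for aa in group}


def sequence_graph(sequence: str) -> List[List[int]]:
    """Walk the lattice residue by residue and return the adjacency matrix
    of the visited coordinates (the origin is node 0)."""
    position = (0, 0)
    index_of = {position: 0}
    edges = set()
    for aa in sequence.upper():
        dx, dy = STEP_BY_RESIDUE[aa]
        nxt = (position[0] + dx, position[1] + dy)
        if nxt not in index_of:
            index_of[nxt] = len(index_of)
        # Undirected edge; revisiting the same pair of nodes adds nothing new.
        edges.add(frozenset((index_of[position], index_of[nxt])))
        position = nxt
    n = len(index_of)
    adjacency = [[0] * n for _ in range(n)]
    for edge in edges:
        i, j = tuple(edge)
        adjacency[i][j] = adjacency[j][i] = 1
    return adjacency


def matmul(a: List[List[int]], b: List[List[int]]) -> List[List[int]]:
    """Plain square-matrix product, kept dependency-free on purpose."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_moments(sequence: str, max_order: int) -> List[int]:
    """Return [mu_0, ..., mu_max_order], where mu_k = Tr(A^k)."""
    adjacency = sequence_graph(sequence)
    n = len(adjacency)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # A^0
    moments = []
    for _ in range(max_order + 1):
        moments.append(sum(power[i][i] for i in range(n)))
        power = matmul(power, adjacency)
    return moments
```

In this unweighted sketch the odd-order moments are always zero, because every edge joins neighbouring lattice points and the resulting graph is bipartite. Weighted schemes such as TOPS-MODE place property values on the matrix diagonal, which changes this; a hydrophobicity or polarity weighting would therefore be needed before values like these could serve as model descriptors.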
Acknowledgments
The authors acknowledge the Portuguese Fundação para a Ciência e a Tecnologia (FCT) for financial support to GACH (SFRH/BD/47256/2008) and AMH (SFRH/BPD/63946/2009), and for the projects PTDC/BIA-BDE/69144/2006 and PTDC/AAC-AMB/104983/2008. GACH acknowledges Assistant Professor Roberto I. Vázquez-Padrón of the University of Miami, USA, for providing useful information on the Cry1Ab endotoxin from Bt subsp. kurstaki.
Cite this article
Agüero-Chapin, G., Pérez-Machado, G., Molina-Ruiz, R. et al. TI2BioP: Topological Indices to BioPolymers. Its practical use to unravel cryptic bacteriocin-like domains. Amino Acids 40, 431–442 (2011). https://doi.org/10.1007/s00726-010-0653-9