TI2BioP: Topological Indices to BioPolymers. Its practical use to unravel cryptic bacteriocin-like domains

  • Original Article
  • Amino Acids

Abstract

Bacteriocins are proteinaceous toxins produced and exported by both Gram-negative and Gram-positive bacteria as a defense mechanism. The bacteriocin protein family is highly diverse, which complicates the identification of bacteriocin-like sequences by alignment approaches. Topological indices (TIs), which do not depend on sequence similarity, are a promising alternative for predicting proteinaceous bacteriocins. We therefore present Topological Indices to BioPolymers (TI2BioP), an alignment-free approach inspired by both the Topological Substructural Molecular Design (TOPS-MODE) and the Markov Chain Invariants for Network Selection and Design (MARCH-INSIDE) methodologies. TI2BioP calculates spectral moments as simple TIs in order to build quantitative sequence-function relationship (QSFR) models. Since hydrophobicity and basicity are major criteria for the bactericidal activity of bacteriocins, the spectral moments (HPμ_k) were derived for the first time from artificial protein secondary structures obtained by clustering amino acids into a Cartesian system of hydrophobicity and polarity. Several orders of HPμ_k were used to characterize numerically 196 bacteriocin-like sequences and a control group of 200 representative CATH domains. They were then used to develop an alignment-free QSFR model that discriminated 76.92% of bacteriocin proteins from other domains, a relevant result considering the high sequence diversity among the members of both groups. The model showed an overall prediction performance of 72.16%, specifically detecting 66.7% of proteinaceous bacteriocins, whereas InterProScan retrieved only 60.2%. As a practical validation, the model also successfully predicted the cryptic bactericidal function of the Cry 1Ab C-terminal domain from the Bacillus thuringiensis endotoxin, which has not been detected by classical alignment methods.
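
To make the idea of spectral moments on a sequence-derived map concrete, the following is a minimal sketch, not the TI2BioP implementation. It assumes a four-class residue grouping and one unit step per class on a 2D Cartesian plane (both hypothetical choices made here for illustration), builds a graph from the resulting walk, and computes μ_k as the trace of the k-th power of a plain, unweighted node adjacency matrix. TOPS-MODE itself works with a bond (edge) adjacency matrix carrying property weights, which this sketch does not reproduce.

```python
# Illustrative residue classes (an assumption, not the published grouping).
RESIDUE_CLASS = {}
for residues, cls in (("DE", "acidic"), ("KRH", "basic"),
                      ("STNQCYG", "polar"), ("AVLIMFWP", "nonpolar")):
    for r in residues:
        RESIDUE_CLASS[r] = cls

# Hypothetical unit step assigned to each class on the 2D Cartesian map.
STEP = {"acidic": (-1, 0), "basic": (1, 0), "polar": (0, 1), "nonpolar": (0, -1)}


def pseudo_fold(sequence):
    """Walk the sequence over the plane and return the visited points in order."""
    x, y = 0, 0
    path = [(x, y)]  # the walk starts at the origin
    for res in sequence:
        dx, dy = STEP[RESIDUE_CLASS[res]]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


def adjacency_matrix(path):
    """Unweighted, symmetric adjacency matrix over the distinct visited points."""
    nodes = sorted(set(path))
    index = {p: i for i, p in enumerate(nodes)}
    n = len(nodes)
    a = [[0] * n for _ in range(n)]
    for p, q in zip(path, path[1:]):
        i, j = index[p], index[q]
        a[i][j] = a[j][i] = 1
    return a


def mat_mul(a, b):
    """Plain-Python product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_moments(a, max_order):
    """Return [mu_0, ..., mu_max_order], where mu_k = trace(A^k)."""
    n = len(a)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # A^0
    moments = []
    for _ in range(max_order + 1):
        moments.append(sum(power[i][i] for i in range(n)))
        power = mat_mul(power, a)
    return moments


path = pseudo_fold("KKD")
print(spectral_moments(adjacency_matrix(path), 4))  # [3, 0, 4, 0, 8]
```

In this unweighted form every odd-order moment is zero, because a graph drawn on a square lattice contains no odd-length closed walks. Placing residue properties on the matrix diagonal, as weighted schemes do, is what makes the odd orders informative.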

Acknowledgments

The authors acknowledge the Portuguese Fundação para a Ciência e a Tecnologia (FCT) for financial support to GACH (SFRH/BD/47256/2008) and AMH (SFRH/BPD/63946/2009), and for the projects PTDC/BIA-BDE/69144/2006 and PTDC/AAC-AMB/104983/2008. GACH thanks Assistant Professor Roberto I. Vázquez-Padrón of the University of Miami, USA, for providing useful information on the Cry 1Ab endotoxin from Bt subsp. kurstaki.

Corresponding author

Correspondence to Agostinho Antunes.

Cite this article

Agüero-Chapin, G., Pérez-Machado, G., Molina-Ruiz, R. et al. TI2BioP: Topological Indices to BioPolymers. Its practical use to unravel cryptic bacteriocin-like domains. Amino Acids 40, 431–442 (2011). https://doi.org/10.1007/s00726-010-0653-9

  • DOI: https://doi.org/10.1007/s00726-010-0653-9