Abstract
DNA-binding transcription factors (TFs) play a central role in gene expression in all organisms, from viruses to humans, including bacteria and archaea. These proteins determine the fate of gene expression in response to environmental challenges. Because thousands of genomes have been sequenced to date, the encoded proteins are predicted with bioinformatics tools, and these predictions provide the knowledge needed to guide subsequent experimental validation. In this chapter, we describe three approaches to identify TFs in protein sequences. The first approach integrates the results of sequence comparisons and Pfam assignments, using a manually curated collection of TFs as reference. The second approach considers the prediction of DNA-binding structures, such as the classical helix-turn-helix (HTH), and the third approach uses a deep learning model. We suggest that all three approaches be considered together to increase the likelihood of identifying new TFs in bacterial and archaeal genomes.
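The suggestion to consider the three approaches together can be illustrated with a minimal sketch. It assumes each approach has already produced a set of protein identifiers flagged as putative TFs; the approach labels, the protein IDs, and the function names `combine_predictions` and `rank_candidates` are hypothetical and are not taken from the chapter. It only shows one plausible way to tally how many approaches support each candidate, not the authors' actual procedure.

```python
def combine_predictions(predictions):
    """Map each protein ID to the sorted list of approaches that flag it.

    predictions: dict mapping an approach name to a set of protein IDs.
    """
    support = {}
    for approach, protein_ids in predictions.items():
        for pid in protein_ids:
            support.setdefault(pid, []).append(approach)
    return {pid: sorted(names) for pid, names in support.items()}


def rank_candidates(support):
    """Order candidates by number of supporting approaches, then by ID."""
    return sorted(support, key=lambda pid: (-len(support[pid]), pid))


# Hypothetical protein IDs and approach labels, for illustration only.
example = {
    "homology_pfam": {"protA", "protB", "protC"},
    "hth_prediction": {"protB", "protC", "protD"},
    "deep_learning": {"protC", "protE"},
}

support = combine_predictions(example)
ranking = rank_candidates(support)
for pid in ranking:
    print(pid, len(support[pid]), ",".join(support[pid]))
```

Candidates supported by several approaches appear first, while those flagged by a single approach are kept at the end of the list rather than discarded, since they may still be novel TFs worth inspecting.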
Acknowledgments
We thank Joaquin Morales, Sandra Sauza, and Israel Sanchez for their technical support. Leonardo Ledesma is a doctoral student in the Programa de Doctorado en Ingeniería y Ciencias de la Computación at UNAM and received a fellowship from Consejo Nacional de Ciencia y Tecnología (CONACYT CVU 857463). This work was supported by Dirección General de Asuntos del Personal Académico-Universidad Nacional Autónoma de México (IN-209620), Programa Iberoamericano de Ciencia y Tecnología para el Desarrollo (P918PTE0261), and CONACYT (320012). No additional external funding was received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
1 Electronic Supplementary Materials
Supplementary Material S1
(DOCX 16 kb)
Supplementary Material S2
(DOCX 17 kb)
Copyright information
© 2022 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Ledesma, L., Hernandez-Guerrero, R., Perez-Rueda, E. (2022). Prediction of DNA-Binding Transcription Factors in Bacteria and Archaea Genomes. In: Peeters, E., Bervoets, I. (eds) Prokaryotic Gene Regulation. Methods in Molecular Biology, vol 2516. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2413-5_7
DOI: https://doi.org/10.1007/978-1-0716-2413-5_7
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-2412-8
Online ISBN: 978-1-0716-2413-5
eBook Packages: Springer Protocols