Determination of specificity influencing residues for key transcription factor families

Patel, Ronak Y.; Garde, Christian; Stormo, Gary D.

doi:10.1007/s40484-015-0045-y

Research Article
Published: 16 June 2015

Volume 3, pages 115–123, (2015)

Quantitative Biology

Ronak Y. Patel¹,
Christian Garde² &
Gary D. Stormo¹

Abstract

Transcription factors (TFs) are major modulators of transcription and of the cellular processes that depend on it. The binding of a TF to specific regulatory elements is governed by its specificity. Because far more TF sequences are known than TF specificities, frameworks that predict specificity are highly desirable. A key input to such frameworks is the set of protein residues that modulate the specificity of the TF under consideration. Simple measures such as mutual information (MI) fail to delineate specificity influencing residues (SIRs) from alignments, because the three-dimensional structure of the protein constrains the evolution of its amino-acid sequence, and these structural constraints lead to the identification of false SIRs. In this manuscript we extend three methods (direct information, PSICOV and adjusted mutual information), which have been used to disentangle spurious indirect protein residue-residue contacts from direct contacts, to identify SIRs from joint alignments of amino acids and specificity. We predicted SIRs for the homeodomain (HD), helix-loop-helix, LacI and GntR families of TFs using these methods and compared the results to those obtained with MI. Using various measures, we show that the three methods perform comparably to one another and better than MI. The implications of these methods for specificity prediction frameworks are discussed. The methods are implemented as an R package, which is available along with the alignments at http://stormo.wustl.edu/SpecPred.
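To make the MI baseline concrete, the following is a minimal sketch in Python of how residue positions might be scored against a specificity column in a joint alignment. It is not the authors' R package: the function names (`mutual_information`, `rank_positions_by_mi`), the toy sequences and the single-base encoding of specificity are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Return the mutual information (in bits) between two equal-length
    sequences of discrete symbols, using plain frequency estimates."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_positions_by_mi(protein_rows, specificity_column):
    """Score every protein alignment column against one specificity column
    and return (position, MI) pairs sorted from highest to lowest MI."""
    columns = list(zip(*protein_rows))
    scores = [(i, mutual_information(col, specificity_column))
              for i, col in enumerate(columns)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy joint alignment: four proteins of three residues each,
# paired with the preferred base at one binding-site position.
toy_proteins = ["AKR", "AKR", "GQR", "GQR"]
toy_bases = ["A", "A", "C", "C"]

ranking = rank_positions_by_mi(toy_proteins, toy_bases)
for position, score in ranking:
    print(position, round(score, 3))
```

In this toy alignment, positions 0 and 1 vary together, so both receive the same MI with the specificity column even if only one of them actually influences specificity. This is the kind of indirect signal that the abstract says leads MI to report false SIRs; the invariant position 2 scores zero.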

Author information

Authors and Affiliations

Department of Genetics, School of Medicine, Washington University, St. Louis, MO, 63108, USA
Ronak Y. Patel & Gary D. Stormo
Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, Kgs. Lyngby, DK 2800, Denmark
Christian Garde

Corresponding authors

Correspondence to Ronak Y. Patel or Gary D. Stormo.

Electronic supplementary material

Supplementary material, approximately 2.07 MB.

About this article

Cite this article

Patel, R.Y., Garde, C. & Stormo, G.D. Determination of specificity influencing residues for key transcription factor families. Quant Biol 3, 115–123 (2015). https://doi.org/10.1007/s40484-015-0045-y

Received: 01 March 2015
Revised: 18 May 2015
Accepted: 21 May 2015
Published: 16 June 2015
Issue Date: September 2015
DOI: https://doi.org/10.1007/s40484-015-0045-y
