An Easy Protocol for Evolutionary Analysis of Intrinsically Disordered Proteins

  • Protocol
Intrinsically Disordered Proteins

Part of the book series: Methods in Molecular Biology (MIMB, volume 2141)

Abstract

We present an easy protocol for evolutionary analysis of proteins, with an emphasis on studying the evolutionary dynamics of disordered regions. Using the p53 protein family as an example, we provide a guide for finding homologous sequences in a database and refining a dataset before constructing the evolutionary context by building a phylogenetic tree. We show how a multiple sequence alignment and phylogeny for a protein family can be further partitioned into smaller datasets in order to investigate the changes in disorder content across the phylogeny. Based on the evolutionary context, we also investigate site-specific conservation of disorder. Last, we address how to evaluate the evolutionary dynamics of disorder-to-order transitions.
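
As an illustration of the site-specific step mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes that a multiple sequence alignment and per-residue disorder scores (for example, from a predictor such as IUPred) are already available, and that a score of 0.5 or higher counts as disordered. The function names, the toy alignment, and the scores are all hypothetical.

```python
from typing import Dict, List, Optional

GAP = "-"
THRESHOLD = 0.5  # assumed cutoff: scores at or above this count as disordered


def map_scores_to_alignment(aligned_seq: str, scores: List[float]) -> List[Optional[float]]:
    """Place per-residue scores onto alignment columns; gap columns become None."""
    n_residues = sum(1 for c in aligned_seq if c != GAP)
    if n_residues != len(scores):
        raise ValueError("score count must equal the number of non-gap residues")
    remaining = iter(scores)
    return [None if c == GAP else next(remaining) for c in aligned_seq]


def disorder_fraction_per_column(
    alignment: Dict[str, str],
    scores: Dict[str, List[float]],
    threshold: float = THRESHOLD,
) -> List[Optional[float]]:
    """For each column, return the fraction of non-gap residues predicted disordered."""
    mapped = [map_scores_to_alignment(alignment[name], scores[name]) for name in alignment]
    fractions: List[Optional[float]] = []
    for column in zip(*mapped):
        present = [s for s in column if s is not None]
        if not present:
            fractions.append(None)  # column contains only gaps
        else:
            disordered = sum(1 for s in present if s >= threshold)
            fractions.append(disordered / len(present))
    return fractions


# Hypothetical toy data: three aligned sequences with made-up disorder scores.
toy_alignment = {"seqA": "MK-TE", "seqB": "MKLTE", "seqC": "M--TE"}
toy_scores = {
    "seqA": [0.9, 0.8, 0.2, 0.1],
    "seqB": [0.7, 0.6, 0.6, 0.3, 0.2],
    "seqC": [0.4, 0.1, 0.1],
}

if __name__ == "__main__":
    for i, frac in enumerate(disorder_fraction_per_column(toy_alignment, toy_scores)):
        print(i, frac)
```

Columns with a fraction near 1 or near 0 would indicate conserved disorder or conserved order, respectively, while intermediate values flag candidate sites for disorder-to-order transitions. Applying the same function to the sequences of individual clades would give a rough per-clade comparison.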

Author information

Corresponding author

Correspondence to Jessica Siltberg-Liberles.

1 Electronic Supplementary Material

P.1.1

Supplementary_materials.zip (1477 KB)

Copyright information

© 2020 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Nunez-Castilla, J., Siltberg-Liberles, J. (2020). An Easy Protocol for Evolutionary Analysis of Intrinsically Disordered Proteins. In: Kragelund, B.B., Skriver, K. (eds) Intrinsically Disordered Proteins. Methods in Molecular Biology, vol 2141. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0524-0_7

  • DOI: https://doi.org/10.1007/978-1-0716-0524-0_7

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-0523-3

  • Online ISBN: 978-1-0716-0524-0

  • eBook Packages: Springer Protocols
