Skip to main content

PROMALS3D: Multiple Protein Sequence Alignment Enhanced with Evolutionary and Three-Dimensional Structural Information

  • Protocol
  • First Online:
Multiple Sequence Alignment Methods

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1079))

Abstract

Multiple sequence alignment (MSA) is an essential tool with many applications in bioinformatics and computational biology. Accurate MSA construction for divergent proteins remains a difficult computational task. The constantly increasing protein sequences and structures in public databases could be used to improve alignment quality. PROMALS3D is a tool for protein MSA construction enhanced with additional evolutionary and structural information from database searches. PROMALS3D automatically identifies homologs from sequence and structure databases for input proteins, derives structure-based constraints from alignments of three-dimensional structures, and combines them with sequence-based constraints of profile–profile alignments in a consistency-based framework to construct high-quality multiple sequence alignments. PROMALS3D output is a consensus alignment enriched with sequence and structural information about input proteins and their homologs. PROMALS3D Web server and package are available at http://prodata.swmed.edu/PROMALS3D.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Protocol
USD 49.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 119.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info
Hardcover Book
USD 109.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Do CB, Katoh K (2008) Protein multiple sequence alignment. In: Walker J (ed) Methods Mol Biol, vol 484, 1st edn. Humana, Totowa, pp 379–413

    Google Scholar 

  2. Pei J (2008) Multiple protein sequence alignment. Curr Opin Struct Biol 18(3):382–386

    Article  PubMed  CAS  Google Scholar 

  3. Notredame C (2007) Recent evolutions of multiple sequence alignment algorithms. PLoS Comput Biol 3(8):e123

    Article  PubMed  Google Scholar 

  4. Edgar RC, Batzoglou S (2006) Multiple sequence alignment. Curr Opin Struct Biol 16(3):368–373

    Article  PubMed  CAS  Google Scholar 

  5. Wallace IM, Blackshields G, Higgins DG (2005) Multiple sequence alignments. Curr Opin Struct Biol 15(3):261–266

    Article  PubMed  CAS  Google Scholar 

  6. Needleman SB, Wunsch CD (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 48(3):443–453

    Article  PubMed  CAS  Google Scholar 

  7. Smith TF, Waterman MS (1981) Identification of common molecular subsequences. J Mol Biol 147(1):195–197

    Article  PubMed  CAS  Google Scholar 

  8. Lipman DJ, Altschul SF, Kececioglu JD (1989) A tool for multiple sequence alignment. Proc Natl Acad Sci USA 86(12):4412–4415

    Article  PubMed  CAS  Google Scholar 

  9. Wang L, Jiang T (1994) On the complexity of multiple sequence alignment. J Comput Biol 1(4):337–348

    Article  PubMed  CAS  Google Scholar 

  10. Feng DF, Doolittle RF (1987) Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J Mol Evol 25(4):351–360

    Article  PubMed  CAS  Google Scholar 

  11. Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22(22):4673–4680

    Article  PubMed  CAS  Google Scholar 

  12. Edgar RC (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 32(5):1792–1797

    Article  PubMed  CAS  Google Scholar 

  13. Katoh K, Misawa K, Kuma K, Miyata T (2002) MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res 30(14):3059–3066

    Article  PubMed  CAS  Google Scholar 

  14. Notredame C, Higgins DG, Heringa J (2000) T-Coffee: a novel method for fast and accurate multiple sequence alignment. J Mol Biol 302(1):205–217

    Article  PubMed  CAS  Google Scholar 

  15. Do CB, Mahabhashyam MS, Brudno M, Batzoglou S (2005) ProbCons: probabilistic consistency-based multiple sequence alignment. Genome Res 15(2):330–340

    Article  PubMed  CAS  Google Scholar 

  16. Pei J, Grishin NV (2006) MUMMALS: multiple sequence alignment improved by using hidden Markov models with local structural information. Nucleic Acids Res 34(16):4364–4374

    Article  PubMed  CAS  Google Scholar 

  17. Sadreyev R, Grishin N (2003) COMPASS: a tool for comparison of multiple protein alignments with assessment of statistical significance. J Mol Biol 326(1):317–336

    Article  PubMed  CAS  Google Scholar 

  18. Soding J (2005) Protein homology detection by HMM-HMM comparison. Bioinformatics 21(7):951–960

    Article  PubMed  Google Scholar 

  19. Pei J, Grishin NV (2007) PROMALS: towards accurate multiple sequence alignments of distantly related proteins. Bioinformatics 23(7):802–808

    Article  PubMed  CAS  Google Scholar 

  20. Deng X, Cheng J (2011) MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue-residue contacts. BMC Bioinformatics 12:472

    Article  PubMed  CAS  Google Scholar 

  21. Chothia C, Lesk AM (1986) The relation between the divergence of sequence and structure in proteins. EMBO J 5(4):823–826

    PubMed  CAS  Google Scholar 

  22. Armougom F, Moretti S, Poirot O, Audic S, Dumas P, Schaeli B, Keduas V, Notredame C (2006) Expresso: automatic incorporation of structural information in multiple sequence alignments using 3D-Coffee. Nucleic Acids Res 34(Web Server issue):W604–W608

    Article  PubMed  CAS  Google Scholar 

  23. Pei J, Kim BH, Grishin NV (2008) PROMALS3D: a tool for multiple protein sequence and structure alignments. Nucleic Acids Res 36(7):2295–2300

    Article  PubMed  CAS  Google Scholar 

  24. Zhou H, Zhou Y (2005) SPEM: improving multiple sequence alignment with sequence profiles and predicted secondary structures. Bioinformatics 21(18):3615–3621

    Article  PubMed  CAS  Google Scholar 

  25. Pei J, Tang M, Grishin NV (2008) PROMALS3D web server for accurate multiple protein sequence and structure alignments. Nucleic Acids Res 36(Web Server issue):W30–W34

    Article  PubMed  CAS  Google Scholar 

  26. Li W, Godzik A (2006) Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22(13):1658–1659

    Article  PubMed  CAS  Google Scholar 

  27. Katoh K, Kuma K, Toh H, Miyata T (2005) MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res 33(2):511–518

    Article  PubMed  CAS  Google Scholar 

  28. Pei J, Sadreyev R, Grishin NV (2003) PCMA: fast and accurate multiple sequence alignment based on profile consistency. Bioinformatics 19(3):427–428

    Article  PubMed  CAS  Google Scholar 

  29. Henikoff S, Henikoff JG (1992) Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci USA 89(22):10915–10919

    Article  PubMed  CAS  Google Scholar 

  30. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25(17):3389–3402

    Article  PubMed  CAS  Google Scholar 

  31. Suzek BE, Huang H, McGarvey P, Mazumder R, Wu CH (2007) UniRef: comprehensive and non-redundant UniProt reference clusters. Bioinformatics 23(10):1282–1288

    Article  PubMed  CAS  Google Scholar 

  32. Jones DT (1999) Protein secondary structure prediction based on position-specific scoring matrices. J Mol Biol 292(2):195–202

    Article  PubMed  CAS  Google Scholar 

  33. Lafferty J, McCallum A, Pereira F (2001) Conditional random fields: Probabilistic models for segmenting and labeling sequence data. Proceedings of the 18th international conference on machine learning, pp 282–289

    Google Scholar 

  34. Pei J, Grishin NV (2001) AL2CO: calculation of positional conservation in a protein sequence alignment. Bioinformatics 17(8):700–712

    Article  PubMed  CAS  Google Scholar 

  35. Holm L, Park J (2000) DaliLite workbench for protein structure comparison. Bioinformatics 16(6):566–567

    Article  PubMed  CAS  Google Scholar 

  36. Zhu J, Weng Z (2005) FAST: a novel protein structure alignment algorithm. Proteins 58(3):618–627

    Article  PubMed  CAS  Google Scholar 

  37. Zhang Y, Skolnick J (2005) TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res 33(7):2302–2309

    Article  PubMed  CAS  Google Scholar 

Download references

Acknowledgments

The work is supported in part by the National Institutes of Health (GM094575 to NVG) and the Welch Foundation (I-1505 to NVG).

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2014 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Pei, J., Grishin, N.V. (2014). PROMALS3D: Multiple Protein Sequence Alignment Enhanced with Evolutionary and Three-Dimensional Structural Information. In: Russell, D. (eds) Multiple Sequence Alignment Methods. Methods in Molecular Biology, vol 1079. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-646-7_17

Download citation

  • DOI: https://doi.org/10.1007/978-1-62703-646-7_17

  • Published:

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-62703-645-0

  • Online ISBN: 978-1-62703-646-7

  • eBook Packages: Springer Protocols

Publish with us

Policies and ethics