PicXAA: A Probabilistic Scheme for Finding the Maximum Expected Accuracy Alignment of Multiple Biological Sequences

Sahraeian, Sayed Mohammad Ebrahim; Yoon, Byung-Jun

doi:10.1007/978-1-62703-646-7_13

Part of the book series: Methods in Molecular Biology (MIMB, volume 1079)

Abstract

PicXAA is a probabilistic nonprogressive alignment algorithm that finds protein (or DNA) multiple sequence alignments with maximum expected accuracy. PicXAA greedily builds up the alignment from sequence regions with high local similarity, thereby yielding an accurate global alignment that effectively captures the local similarities across sequences. PicXAA consistently yields accurate alignment results on a wide range of reference sets that have different characteristics, with especially remarkable improvements over other leading algorithms on sequence sets with high local similarities. In this chapter, we describe the overall alignment strategy used in PicXAA and discuss several important considerations for effective deployment of the algorithm.
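
To make the greedy, nonprogressive idea in the abstract concrete, the following is a minimal sketch and not the PicXAA implementation. It assumes that pairwise residue alignment probabilities are already available (in practice these would have to be estimated, which is not shown here), and it covers only the greedy step: residue pairs are visited from most to least probable, and a pair is accepted only if the resulting columns can still be ordered consistently with every sequence. The names `greedy_mea_alignment`, `_has_cycle`, `lengths`, and `pair_probs` are hypothetical and chosen for illustration.

```python
def _has_cycle(lengths, col_of):
    """Return True if the column order implied by the sequences is cyclic."""
    succ = {c: set() for c in set(col_of.values())}
    indeg = {c: 0 for c in succ}
    # The column of residue (s, i) must precede the column of residue (s, i + 1).
    for s, n in enumerate(lengths):
        for i in range(n - 1):
            u, v = col_of[(s, i)], col_of[(s, i + 1)]
            if v not in succ[u]:
                succ[u].add(v)
                indeg[v] += 1
    # Kahn's algorithm: if some column is never released, there is a cycle.
    ready = [c for c, d in indeg.items() if d == 0]
    seen = 0
    while ready:
        u = ready.pop()
        seen += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return seen != len(indeg)


def greedy_mea_alignment(lengths, pair_probs):
    """Greedily merge residues into columns, most probable pairs first.

    lengths:    list of sequence lengths.
    pair_probs: dict mapping ((s, i), (t, j)) to the assumed probability that
                residue i of sequence s aligns with residue j of sequence t.
    Returns a dict mapping each residue (s, i) to a column label.
    """
    # Every residue starts in its own column, labelled by the residue itself.
    col_of = {(s, i): (s, i) for s, n in enumerate(lengths) for i in range(n)}
    for (a, b), _p in sorted(pair_probs.items(), key=lambda kv: -kv[1]):
        ca, cb = col_of[a], col_of[b]
        if ca == cb:
            continue  # already aligned through earlier merges
        # Tentatively merge the two columns, then keep the merge only if the
        # columns can still be arranged in a valid left-to-right order.
        trial = {r: (ca if c == cb else c) for r, c in col_of.items()}
        if not _has_cycle(lengths, trial):
            col_of = trial
    return col_of


# Toy input: two sequences of length 2 with two mutually exclusive pairs.
cols = greedy_mea_alignment(
    [2, 2],
    {((0, 0), (1, 1)): 0.9, ((0, 1), (1, 0)): 0.8},
)
```

In this toy input the higher-probability pair is accepted first, and the second pair is rejected because accepting it would require the two columns to precede each other. The full-copy consistency check is quadratic and is meant only to keep the sketch short; a practical implementation would maintain the column order incrementally.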

Acknowledgment

This work was supported in part by the National Science Foundation through NSF Award CCF-1149544.

Author information

Authors and Affiliations

Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
Sayed Mohammad Ebrahim Sahraeian
Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, USA
Byung-Jun Yoon

Editor information

Editors and Affiliations

Dept. of Electrical Engineering, University of Nebraska-Lincoln, Lincoln, Nebraska, USA
David J Russell

About this protocol

Cite this protocol

Sahraeian, S.M.E., Yoon, BJ. (2014). PicXAA: A Probabilistic Scheme for Finding the Maximum Expected Accuracy Alignment of Multiple Biological Sequences. In: Russell, D. (eds) Multiple Sequence Alignment Methods. Methods in Molecular Biology, vol 1079. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-646-7_13

DOI: https://doi.org/10.1007/978-1-62703-646-7_13
Published: 23 August 2013
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-645-0
Online ISBN: 978-1-62703-646-7
eBook Packages: Springer Protocols
