Abstract
Alpha-helical transmembrane (TM) proteins constitute an important structural class of membrane proteins involved in a wide variety of cellular functions. Predicting their transmembrane topology, and identifying them in newly sequenced genomes, is of great importance for elucidating their structure and function. Several methods, based on different algorithmic techniques, have been applied to predict the transmembrane segments and the topology of alpha-helical transmembrane proteins. Hidden Markov Models (HMMs) have been used effectively in several computational methods developed for this task. In this chapter we briefly review the available prediction methods for alpha-helical transmembrane proteins, pointing out the sequence and structural features that a prediction method should incorporate. We then describe the design and development of a Hidden Markov Model capable of predicting the transmembrane alpha helices in proteins and of discriminating transmembrane proteins from globular proteins.
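To make the idea of HMM-based helix prediction concrete, the following is a minimal sketch of Viterbi decoding with a toy two-state model. It is not the model developed in the chapter: the two states, the grouping of residues into "hydrophobic" and "other", and every probability value are illustrative assumptions chosen only so that the example runs, and the function names are hypothetical.

```python
import math

# Toy two-state HMM: "M" = membrane helix, "o" = non-membrane region.
# ASSUMPTION: all probabilities below are made-up illustrative values,
# not parameters trained on real transmembrane proteins.
HYDROPHOBIC = set("AVLIFMWC")  # assumed, simplified hydrophobic residue set
STATES = ("o", "M")
START = {"o": 0.9, "M": 0.1}
TRANS = {"o": {"o": 0.95, "M": 0.05},
         "M": {"o": 0.05, "M": 0.95}}
# Emission probabilities over two residue classes:
# "h" (hydrophobic) and "p" (everything else).
EMIT = {"o": {"h": 0.3, "p": 0.7},
        "M": {"h": 0.8, "p": 0.2}}


def residue_class(aa):
    """Map an amino acid letter to the toy observation alphabet."""
    return "h" if aa in HYDROPHOBIC else "p"


def viterbi(seq):
    """Return the most probable state path for seq as a string of 'o'/'M'."""
    obs = [residue_class(a) for a in seq.upper()]
    if not obs:
        return ""
    # Log-probability of the best path ending in each state at position 0.
    score = {s: math.log(START[s]) + math.log(EMIT[s][obs[0]]) for s in STATES}
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for o in obs[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            pointers[s] = prev
            new_score[s] = (score[prev] + math.log(TRANS[prev][s])
                            + math.log(EMIT[s][o]))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


if __name__ == "__main__":
    toy_sequence = "KKDDSS" + "L" * 20 + "KKDDSS"
    print(toy_sequence)
    print(viterbi(toy_sequence))
```

A realistic predictor would use many more states (for example, separate inside and outside loop states to capture topology), full 20-letter emission distributions, and parameters estimated from proteins of known topology; the sketch only shows how a decoded state path labels each residue.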
Copyright information
© 2017 Springer Science+Business Media LLC
About this protocol
Cite this protocol
Tsaousis, G.N., Theodoropoulou, M.C., Hamodrakas, S.J., Bagos, P.G. (2017). Predicting Alpha Helical Transmembrane Proteins Using HMMs. In: Westhead, D., Vijayabaskar, M. (eds) Hidden Markov Models. Methods in Molecular Biology, vol 1552. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-6753-7_5
DOI: https://doi.org/10.1007/978-1-4939-6753-7_5
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-6751-3
Online ISBN: 978-1-4939-6753-7
eBook Packages: Springer Protocols