Abstract
Proteins in a proteome can be identified from a sequence of K integers, each equal to the digitized volume of a subsequence of L residues from the primary sequence of a stretched protein. Exhaustive computations on the proteome of Helicobacter pylori (UniProt ID UP000000210) with L and K in the range 4–8 show that ~90% of the proteins can be identified uniquely in this manner. This computational result can be translated into practice with a nanopore, an emerging technology that does not require analyte immobilization, proteolysis or labeling. Unlike other methods, most of which focus on a specific target protein, nanopore-based methods enable the identification of multiple proteins from a sample in a single run. Recent work by Kennedy, Kolmogorov and associates shows that the blockade current due to a protein molecule translocating through a nanopore is roughly proportional to the volume of one or more contiguous residues. The present study points to a modified version in which the volumes of subsequences (rather than of single residues) may be obtained by integrating the blockade current due to L contiguous residues. The advantages of this modification include lower detector bandwidth, elimination of the homopolymer problem and reduced noise. Because an identifier is based on near as well as distant (up to 2KL-L) residues, this approach uses more global information than one based on single residues and short-range correlations. The results of the study, which are available in a data supplement, are discussed in detail, and potential implementation issues are addressed.
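The identifier described in the abstract can be illustrated with a minimal Python sketch. It is not the study's implementation: the window placement (K consecutive, non-overlapping windows of L residues starting at position `start`), the quantization rule (floor division by a hypothetical `step`), the function names and the per-residue values in `TOY_VOLUMES` are all assumptions made for illustration. The toy values are placeholders in arbitrary units, not measured residue volumes.

```python
from collections import defaultdict

# Placeholder per-residue "volumes" in arbitrary units (assumption for
# illustration only; these are NOT measured residue volumes).
TOY_VOLUMES = {"A": 2, "G": 1, "K": 4, "L": 4, "S": 2, "W": 6}


def volume_identifier(seq, volumes, L, K, step, start=0):
    """Return a tuple of K integers for one protein sequence.

    Each integer is the summed volume of one window of L residues,
    digitized by floor division with `step`. Windows are taken
    back to back from `start` (an assumed placement). Returns None
    if the sequence is too short to supply K windows.
    """
    if len(seq) < start + K * L:
        return None
    ident = []
    for k in range(K):
        window = seq[start + k * L : start + (k + 1) * L]
        total = sum(volumes[residue] for residue in window)
        ident.append(int(total // step))
    return tuple(ident)


def uniquely_identified(proteome, volumes, L, K, step):
    """Return the sorted names of proteins whose identifier is shared
    with no other protein in `proteome` (a dict of name -> sequence)."""
    groups = defaultdict(list)
    for name, seq in proteome.items():
        ident = volume_identifier(seq, volumes, L, K, step)
        if ident is not None:
            groups[ident].append(name)
    return sorted(names[0] for names in groups.values() if len(names) == 1)


# Toy "proteome": p1 and p2 differ only in residue order inside windows,
# so their window volumes (and hence identifiers) coincide.
toy_proteome = {"p1": "AGKLSW", "p2": "GAKLWS", "p3": "WWKLAG"}
unique = uniquely_identified(toy_proteome, TOY_VOLUMES, L=2, K=3, step=1)
```

In this toy case only `p3` is uniquely identified, because a summed window volume does not depend on the order of residues inside the window. The same loop over a real proteome and a real volume table would yield the fraction of uniquely identified proteins for a given L and K.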
References
Acharya S, Edwards S and Schmidt J 2015 Nanopore protein detection and analysis. Lab Chip https://doi.org/10.1039/c5lc90076j
Bell NAW and Keyser UF 2015 Specific protein detection using designed DNA carriers and nanopores. J. Am. Chem. Soc. https://doi.org/10.1021/ja512521w
Berg JM, Tymoczko JL and Stryer L 2012 Biochemistry 7th edition (New York: WH Freeman)
Branden C and Tooze J 1999 Introduction to protein structure 2nd edition (New York: Garland Publishing)
Heather JM and Chain B 2016 The sequence of sequencers: The history of sequencing DNA. Genomics 107 1–8
Kennedy E, Dong Z, Tennant C and Timp G 2016 Reading the primary structure of a protein with 0.07 nm³ resolution using a subnanometre-diameter pore. Nat. Nanotechnol. 11 968–976
Kolmogorov M, Kennedy E, Dong Z, Timp G and Pevzner P 2017 Single-molecule protein identification by sub-nanopore sensors. PLoS Comput. Biol. https://doi.org/10.1371/journal.pcbi.1005356
Madampage C, Tavassoly O, Christensen C, Kumari M and Lee JS 2012 Nanopore analysis: An emerging technique for studying the folding and misfolding of proteins. Prion 6 116–123
Marcotte EM 2007 How do shotgun proteomics algorithms identify proteins? Nat. Biotechnol. 25 755–757
Nivala J, Mulroney L, Li G, Schreiber J and Akeson M 2014 Discrimination among protein variants using an unfoldase-coupled nanopore. ACS Nano 8 12365–12375
Oukhaled A, Bacri L, Pastoriza-Gallego M, Betton J-M and Pelta J 2012 Sensing proteins through nanopores: Fundamental to applications. ACS Chem. Biol. 7 1935–1949
Perkins SJ 1986 Protein volumes and hydration effects. Eur. J. Biochem. 157 169–180
Quick J, Quinlan A and Loman N 2014 A reference bacterial genome dataset generated on the MinION portable single-molecule nanopore sequencer. Gigascience 3 1–6
Reiner JE, Balijepalli A, Robertson JWF, Campbell J, Suehle J and Kasianowicz JJ 2012 Disease detection and management via single nanopore-based sensors. Chem. Rev. 112 6431–6451
Rodriguez-Larrea D and Bayley H 2013 Multistep protein unfolding during nanopore translocation. Nat. Nanotechnol. 8 288–295
Rosen CB, Rodriguez-Larrea D and Bayley H 2014 Single-molecule site-specific detection of protein phosphorylation with a nanopore. Nat. Biotechnol. 32 179–181
Sampath G 2015 Amino acid discriminators in a nanopore and the feasibility of sequencing peptides with a tandem cell and exopeptidase. RSC Adv. 5 30694–30700
Sampath G 2017 Protein identification with a nanopore and a binary alphabet. bioRxiv https://doi.org/10.1101/119313
Simpson RJ 2008 Proteins and proteomics: A laboratory manual (Cold Spring Harbor (NY): CSHL Press)
Smith SW 1999 The scientist and engineer’s guide to digital signal processing 2nd edition (San Diego: California Technical Publishing)
Steen H and Mann M 2004 The ABC’s (and XYZ’s) of peptide sequencing. Nat. Rev. Mol. Cell Biol. 5 699–711
Swaminathan J, Boulgakov AA and Marcotte EM 2015 A theoretical justification for single molecule peptide sequencing. PLoS Comput. Biol. 11 e1004080
Timp W, Nice AM, Nelson EM, Kurz V, McKelvey K and Timp G 2014 Think small: Nanopores for sensing and synthesis. IEEE Access 2 1396–1408
Wei R, Gatterdam V, Wieneke R, Tampe R and Rant U 2012 Stochastic sensing of proteins with receptor-modified solid-state nanopores. Nat. Nanotechnol. 7 257–263
Yusko EC, Bruhn BR, Eggenberger O, Houghtaling J, Rollings RC, Walsh NC, Nandivada S, Pindrus M, Hall AR, Sept D, Li J, Kalonia DS and Mayer M 2016 Real-time shape approximation and fingerprinting of single proteins using a nanopore. Nat. Nanotechnol. https://doi.org/10.1038/nnano.2016.267
Zhao Y, Ashcroft B, Zhang P, Liu H, Sen S, Song W, Im J, Gyarfas B, Manna S, Biswas S, Borges C and Lindsay S 2014 Single-molecule spectroscopy of amino acids and peptides by recognition tunneling. Nat. Nanotechnol. 9 466–473
Acknowledgements
The author thanks an anonymous reviewer for helpful comments and suggestions.
Additional information
Corresponding editor: Ravindra Venkatramani.
Cite this article
Sampath, G. Protein fingerprinting with digital sequences of linear protein subsequence volumes: a computational study. J Biosci 44, 54 (2019). https://doi.org/10.1007/s12038-019-9863-9
DOI: https://doi.org/10.1007/s12038-019-9863-9