Before we consider the elements of analytical proteomics in detail, let’s sketch out the basic approach. Analytical protein identification is built around one essential fact: most peptide sequences of approximately six or more amino acids are unique in the proteome of an organism. Put another way, a typical six-amino-acid peptide maps to a single gene product. Thus, if we can obtain the sequence of a peptide, or if we can accurately measure its mass, we can identify the protein it came from simply by finding its match in a database of protein sequences (Fig. 1). Of course, some hexapeptides map to more than one protein, but such multiple “hits” typically come from highly conserved regions of related proteins (such as the paralogs discussed in Chapter 2). Obtaining sequences of several peptides that map to the same gene product strengthens the validity of the match. Accordingly, the essence of analytical proteomics is to convert proteins to peptides, obtain sequences of the peptides, and then identify the corresponding proteins from matching sequences in a database.
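The matching step described above can be illustrated with a minimal Python sketch. It is not a real search engine: the protein IDs and sequences in `PROTEIN_DB` are invented for illustration, and the helper names `find_matches` and `rank_proteins` are hypothetical. It assumes peptide sequences are already known and simply looks each one up as a substring of every database entry, then counts how many peptides support each protein.

```python
from collections import Counter

# Hypothetical toy "database": made-up IDs and sequences, for illustration only.
# protA and protB deliberately share the stretch AYIAKQ, mimicking a conserved
# region in related proteins.
PROTEIN_DB = {
    "protA": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
    "protB": "MSDNGWLCHEPTIDEKRRAYIAKQWPLMNV",
    "protC": "MGGSWLCHEPTIDEKLLVVA",
}


def find_matches(peptide, db):
    """Return the IDs of all proteins whose sequence contains the peptide."""
    return [pid for pid, seq in db.items() if peptide in seq]


def rank_proteins(peptides, db):
    """Count, for each protein, how many of the observed peptides map to it.

    Returns (protein_id, hit_count) pairs, best-supported protein first.
    """
    hits = Counter()
    for pep in peptides:
        for pid in find_matches(pep, db):
            hits[pid] += 1
    return hits.most_common()


if __name__ == "__main__":
    observed = ["AYIAKQ", "SHFSRQ", "LEERLG"]  # hypothetical hexapeptides
    for pep in observed:
        print(pep, "->", find_matches(pep, PROTEIN_DB))
    print(rank_proteins(observed, PROTEIN_DB))
```

In this toy case one hexapeptide hits two proteins because of the shared region, while the other two hit only one; counting the peptides that agree on the same entry is what singles out the best-supported protein.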