Characterization and Comparison of Major Urinary Proteins from the House Mouse, Mus musculus domesticus, and the Aboriginal Mouse, Mus macedonicus
Urine from the house mouse, Mus musculus domesticus, contains a high concentration of major urinary proteins (MUPs), which convey olfactory information between conspecifics. In wild populations, each individual expresses a different pattern of around 8 to 14 electrophoretically separable MUP isoforms. To examine whether other Mus species express MUPs and exhibit a similar level of individual heterogeneity, we characterized urinary proteins in urine samples from an aboriginal species, Mus macedonicus, captured from different sites in Turkey. Anion exchange chromatography and electrospray ionization mass spectrometry demonstrated that M. macedonicus urine contained a single major peak of mass 18,742 Da and that, in contrast to M. m. domesticus, all individuals expressed the same pattern. This mass was not predicted from any known MUP gene sequence. Endoproteinase Lys-C (Lys-C) digestion of the purified M. macedonicus urinary protein followed by matrix assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry demonstrated that it shared considerable, but not complete, sequence identity with M. m. domesticus MUPs. Three M. macedonicus Lys-C peptides differed in mass from their M. m. domesticus counterparts. These three peptides were further characterized by tandem mass spectrometry. The complete sequences of two were determined, and in conjunction with methyl esterification, the amino acid composition of the third was inferred, and the sequence narrowed down to three permutations. The complete M. macedonicus sequence contained a maximum of seven amino acid substitutions, discernible by tandem mass spectrometry, relative to a reference M. m. domesticus sequence. Six of these were on the surface of the molecule. Molecular modeling of the M. macedonicus sequence demonstrated that the amino acid substitutions had little effect on the tertiary structure.
The differences in the level of heterogeneity between the two species are discussed in relation to their environment and behavior. In addition, the differences in protein structure allow speculation into molecular mechanisms of MUP function.
Keywords: Aboriginal mouse, De novo protein sequencing, House mouse, Major urinary proteins, Peptide mass spectrometry, Protein mass spectrometry, Protein sequence heterogeneity
House mice, Mus musculus domesticus, have evolved intricate and sophisticated methods of chemical communication (Hurst, 1987; Hurst and Beynon, 2004). The major urinary proteins (MUPs) are key components of this signaling system (Beynon and Hurst, 2004). MUPs are eight stranded beta-barrel pheromone binding proteins (Adams and Sawyer, 1990; Flower et al., 1993), encoded by a multigene family, expressed in the liver, and secreted into the urine via the kidneys (Finlayson et al., 1965). A number of small, volatile molecules have been identified as endogenous MUP ligands, most of which are reproductive priming pheromones (Bacchini et al., 1992; Robertson et al., 1993; Novotny et al., 1999) that also influence behavior (Malone et al., 2001). MUPs prolong the release of the volatile ligands from scent marks and may also protect them from oxidation (Hurst et al., 1998). However, this does not explain the extreme polymorphism that is a feature of these proteins in M. m. domesticus. Each individual expresses around 8 to 14 electrophoretically separable MUP isoforms, with only very closely related individuals (Hurst et al., 2001) or inbred laboratory mice (Robertson et al., 1996) expressing the same pattern. We have demonstrated that this extreme polymorphism provides a genetically stable method of communicating the individual ownership of scent marks (Hurst et al., 2001, 2005).
Knowledge of the structure and function of MUPs has mostly been derived from M. m. domesticus. These mice are thought to have become commensal some 10,000 years ago in the Fertile Crescent, utilizing some of the earliest human settlements there (Cucchi et al., 2005). Therefore, the history of M. m. domesticus has become inextricably intertwined with that of humans. Their innate agility and flexibility have proved ideal for the continued exploitation of human populations as a ready source of food and other resources, which has facilitated their spread from the Fertile Crescent to all parts of the world (Silver, 1995). Where food resources are abundant (for example around grain stores and livestock housing), population densities typically reach high levels (Berry, 1981; Bronson, 1979), and mice live in territorial social groups in which the ranges of many individuals overlap (Hurst, 1987; Barnard et al., 1991). Mus m. domesticus is one of at least three subspecies of house mice that have parapatric distributions and that interbreed where they make contact (Boursot et al., 1993). There are also three other Mus species (Mus macedonicus, Mus spretus, and Mus spicilegus) that are closely related to and occur sympatrically with M. musculus in Europe and the Middle East (Boursot et al., 1993; Suzuki et al., 2004). These species are termed aboriginal mice because, unlike the commensal M. musculus, they live independently of humans. Commensal mouse populations differ from free living aboriginal species in a number of aspects of their behavior (Patris and Baudoin, 1998; Ivantcheva and Cassaing, 1999). Here, we studied the aboriginal mouse, M. macedonicus, a short-tailed species that is found in Greece, Turkey, and elsewhere in the Middle East. The aim of this study was to characterize MUPs from M. macedonicus and compare them to the well-characterized MUPs from M. m. domesticus.
Methods and Materials
Animals and Sampling
Urine Sample Preparation
Urine samples were desalted before further analysis with Vivaspin 5,000 Da molecular weight cutoff centrifugal concentrators (Sartorius, Epsom, UK). Typically, a 50 μl urine sample was placed in a prerinsed concentrator and made up to a final volume of 500 μl with distilled water. The sample was then centrifuged at 10,500×g for 15 min, which was sufficient to reduce it to 50 μl. This was then diluted again to 500 μl with distilled water, and the process was repeated. The sample was then removed from the apparatus and stored at −20°C until required. An identical process was used to desalt fractions collected from anion exchange chromatography before analysis by electrospray ionization mass spectrometry (ESI-MS).
Ion Exchange Chromatography
Anion exchange chromatography was performed on a Dionex Bio-LC platform fitted with a Dionex ProPac SAX column (2 × 250 mm) and a ProPac SAX guard column (2 × 50 mm). In all cases, the column flow rate was 0.2 ml/min. After equilibration with 20 mM Tris, pH 8.5, desalted urine samples (typically 1–5 μl) were loaded onto the column. Samples were eluted from the column with a linear NaCl gradient of 0–500 mM in 30 min. The eluent from the column was monitored at 214 nm in a flow cell of 9 mm path length. Where applicable, fractions were collected by hand directly after passage through the flow cell. All aspects of data acquisition and processing were controlled through the Dionex Chromeleon software.
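As a rough guide to the salt concentration at which a given peak elutes, the NaCl concentration at any time follows directly from the linear ramp described above. The following is a minimal sketch, assuming that the gradient starts at time zero and ignoring any system dwell volume; the function name and these simplifications are illustrative assumptions and are not part of the Dionex method.

```python
FLOW_ML_PER_MIN = 0.2        # column flow rate stated in the text
GRADIENT_MIN = 30.0          # gradient duration stated in the text


def nacl_mM(t_min, start_mM=0.0, end_mM=500.0, duration_min=GRADIENT_MIN):
    """NaCl concentration (mM) at time t_min of an ideal linear gradient.

    Assumption: no dwell volume, gradient begins at t = 0.
    """
    if t_min <= 0:
        return start_mM
    if t_min >= duration_min:
        return end_mM
    return start_mM + (end_mM - start_mM) * t_min / duration_min


# Volume of eluent delivered over the whole gradient (ml)
gradient_volume_ml = FLOW_ML_PER_MIN * GRADIENT_MIN

print(nacl_mM(15.0), gradient_volume_ml)
```

Under these assumptions the slope is 500 mM over 30 min, so a peak eluting halfway through the gradient corresponds to about 250 mM NaCl.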
Proteolysis of MUPs
Anion exchange purified MUPs were proteolyzed by endoproteinase Lys-C (Lys-C), endoproteinase Glu-C (Glu-C), or trypsin (Roche Diagnostics, Lewes, UK). An aliquot (typically 100 μl) of the MUP fraction was reduced with 10 mM 2-mercaptoethanol for 1 hr at room temperature (RT) or with 10 mM dithiothreitol (DTT) at 55°C for 1 hr. In some instances, cysteine residues were carbamidomethylated by incubating the reduced protein preparation with a 55-mM final concentration of iodoacetamide for 1 hr at RT. Removal of reduction and carbamidomethylation reagents was achieved by protein precipitation or centrifugal filtration in a Vivaspin apparatus as described above. Protein precipitation was achieved by the addition of an equal volume of 20% (w/v) trichloroacetic acid (TCA). After centrifugation at 14,000×g for 10 min, the supernatant was discarded, and the precipitate was washed twice with diethyl ether. Residual ether was removed from the sample by incubation at 40°C for 10 min, and the precipitate was resuspended in its starting volume of either 100 mM Tris, pH 8.5 (Lys-C and Glu-C) or 100 mM Tris/2 mM CaCl2, pH 8.5 (trypsin). Protease solution (2 μl of 0.1 mg/ml) was added to the resuspended precipitate, which was then incubated overnight at 37°C. The reaction was subsequently stopped by addition of a 5-μl aliquot of formic acid. Samples prepared by Vivaspin desalting were digested under identical conditions.
Esterification of MUP Lys-C Peptides
Esterification of carboxyl groups within MUP Lys-C peptides was achieved according to Shevchenko et al. (2003). Lys-C peptides from digests of M. macedonicus and M. m. domesticus MUPs were initially desalted into a solution of 0.1% (v/v) trifluoroacetic acid (TFA)/50% (v/v) acetonitrile with Zip-Tips (Millipore, Billerica, MA, USA). This solution was subsequently reduced to dryness in a vacuum centrifuge. A 1 ml aliquot of methanol was placed in a 1.5-ml test tube and held at −20°C for 15 min, after which a 150 μl aliquot of acetyl chloride was added, and the mixture was incubated at room temperature for 10 min. The dried peptides were immediately treated with 15 μl of this mixture and incubated at room temperature for 45 min before drying in a vacuum centrifuge. The dried, esterified peptides were finally dissolved in MALDI matrix solution (see below) before matrix assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry.
Matrix Assisted Laser Desorption Ionization-Time of Flight (MALDI-TOF) Mass Spectrometry
MALDI-TOF mass spectrometry of MUP digests was performed on a Waters-Micromass M@LDI instrument. Peptide mixtures from proteolytic digests of MUPs were mixed in a 1:1 ratio with a matrix solution consisting of saturated α-cyano-4-hydroxycinnamic acid (Sigma Chemical, Poole, UK) in 50% (v/v) acetonitrile/0.2% (v/v) TFA. A 1 μl aliquot of this preparation was deposited on the MALDI target and allowed to dry at RT. Spectra were subsequently acquired between 1,000 and 4,000 Th with the laser energy optimized to give the best signal to noise ratio for each sample. The laser firing rate was 5 Hz, and 10 spectra (collected over 2 sec) were combined. The final mass spectrum was a combination of 10–15 such combined data sets, representing 100–150 individual laser shots. All aspects of data acquisition, processing, and machine management were controlled through the MassLynx software suite (versions 3.5 and 4.0).
Electrospray Ionization (ESI) Mass Spectrometry
Electrospray ionization mass spectrometry (ESI-MS) and tandem mass spectrometry (ESI-MSMS) were performed on a Micromass Q-ToF Micro instrument, fitted with a nanospray source. The electrospray was created from a silver-coated glass capillary with a 10-μm orifice (New Objective, Woburn, MA, USA), held at a potential of +2,000 V relative to the sample cone. For measurement of the mass of intact MUPs, a desalted sample was introduced into the mass spectrometer by syringe pump infusion (Harvard Instruments Ltd, Edenbridge, UK) at a rate of 0.5 μl/min. In this case, the instrument was operated in TOF only mode, with the quadrupole analyzer operating in Rf only mode to allow transmission of all ions. Raw data were gathered between 700 and 1,400 Th at a scan/interscan speed of 2.4/0.1 sec. These raw data were subsequently de-convoluted using the MaxEnt 1 module contained within the MassLynx 3.5 software. To create the MaxEnt damage model, peak width and resolution parameters of 0.75 and 1 Da/channel were used, respectively, and data were processed over the mass range 18,400–19,000 Da.
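The acquisition window of 700–1,400 Th determines which charge states of an intact MUP contribute to the raw spectrum before deconvolution. The following is a minimal sketch of that calculation for a protein of 18,742 Da, assuming simple protonation ([M + zH]z+) and no adducts; the function name is hypothetical, and the calculation is independent of the MaxEnt 1 processing itself.

```python
PROTON = 1.00728  # mass of a proton in Da


def charge_states_in_window(mass_da, mz_min, mz_max):
    """Return (z, m/z) pairs for [M + zH]z+ ions that fall inside the window.

    m/z falls monotonically as z rises, so the loop can stop as soon as
    the ion drops below the lower limit of the window.
    """
    hits = []
    z = 1
    while True:
        mz = (mass_da + z * PROTON) / z
        if mz < mz_min:
            break
        if mz <= mz_max:
            hits.append((z, round(mz, 2)))
        z += 1
    return hits


states = charge_states_in_window(18742.0, 700.0, 1400.0)
for z, mz in states:
    print(z, mz)
```

Under these assumptions, the 14+ to 26+ ions of an 18,742-Da protein fall within the acquired range, and it is this envelope of multiply charged ions that the deconvolution collapses onto a single mass.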
De novo Sequencing
For tandem mass spectrometry on peptides derived from MUP proteolytic digests, samples were introduced into the mass spectrometer at 0.2 μl/min following reversed phase high pressure liquid chromatography (RP-HPLC) (see below). In this case, the initial quadrupole analyser was set to allow passage only of selected precursor ions to a gas cell, where they were fragmented by collision with argon. The mass of the resulting fragment ions was then measured by the TOF analyzer. Selection of precursors and fragmentation energy were controlled automatically by using the data dependent acquisition facility within the MassLynx software. Precursor spectra were acquired between m/z 400 and 1,500 at a scan/interscan speed of 2.4/0.1 sec. Product ion spectra were acquired between m/z 100 and 2,000 at a scan/interscan speed of 1.0/0.1 sec. Raw product ion spectra were de-convoluted using the MaxEnt 3 algorithm within the MassLynx software. The charge state of the parent peptide was determined from the isotope envelope in the precursor spectrum. Interpretation of product ion spectra and the determination of peptide sequences were facilitated by the PepSeq module within MassLynx.
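Interpretation of a product ion spectrum rests on matching observed fragment masses against the b and y ion ladders expected for a candidate sequence. The following is a minimal sketch of how those ladders are calculated for singly charged ions from monoisotopic residue masses. The peptide GAVK is an arbitrary illustration and not a MUP peptide, the function names are hypothetical, and the sketch does not reproduce the PepSeq algorithm.

```python
# Monoisotopic residue masses in Da; J stands for the isobaric pair Leu/Ile.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "J": 113.08406,
    "L": 113.08406, "I": 113.08406, "N": 114.04293, "D": 115.02694,
    "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333,
    "W": 186.07931,
}
PROTON = 1.00728   # Da
WATER = 18.01056   # Da


def b_ions(sequence):
    """Singly charged b ions (b1 .. b(n-1)): N-terminal fragments."""
    ions, running = [], 0.0
    for residue in sequence[:-1]:
        running += RESIDUE_MASS[residue]
        ions.append(round(running + PROTON, 4))
    return ions


def y_ions(sequence):
    """Singly charged y ions (y1 .. y(n-1)): C-terminal fragments."""
    ions, running = [], 0.0
    for residue in reversed(sequence[1:]):
        running += RESIDUE_MASS[residue]
        ions.append(round(running + WATER + PROTON, 4))
    return ions


peptide = "GAVK"  # illustrative peptide only
print("b:", b_ions(peptide))
print("y:", y_ions(peptide))
```

Because Leu and Ile share the same residue mass in this table, the two ladders are identical whichever residue is present, which is why such positions can only be reported as J from these data.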
Before ESI-MSMS, MUP peptides were separated by RP-HPLC on a Dionex Ultimate system. The system was fitted with a PepMap C18 column (LC Packings, Camberley, UK), 15 cm × 75 μm, bead size 3 μm and pore size 100 Å. Before separation, aliquots (2–5 μl) of MUP peptides were desalted in-line using a Dionex Switchos apparatus, fitted with a 1 mm × 300 μm, C18 precolumn. The precolumn was initially equilibrated in 0.2% (v/v) formic acid at 30 μl/min. Peptides were then loaded and washed for 3 min at the same flow rate and then eluted with 90% acetonitrile/0.2% formic acid, introduced as a linear gradient of 0–50% in 30 min at 0.2 μl/min. The column eluent was monitored by UV absorbance at 214 nm, before delivery to the mass spectrometer.
Protein sequences derived from the M. macedonicus MUPs were used to direct homology models. Briefly, protein structures were modeled on the structure of MUP1 (PDB entry 1I04; Timm et al., 2001). After alignment of the two sequences (no insertions or deletions were necessary), the models were built to the highest quality possible using the Modeler module of the Discovery Studio package (Accelrys, Cambridge, UK, ver 1.5). Because the four C-terminal residues of MUP1 cannot be defined owing to main chain flexibility (Krizova et al., 2004), these residues were not modeled.
Results and Discussion
The mass spectra obtained from urine samples of M. m. domesticus were more complex and more heterogeneous among individuals (Fig. 3b) than those found for M. macedonicus, even though the two species were both collected from the same geographical region (Fig. 1). The number of major peaks (>20% base peak intensity) differed greatly. Sample 171 was the simplest and, unusually, expressed a single major peak at 18,694 Da, whereas sample 141 contained three peaks at 18,682, 18,730, and 18,893 Da. Sample 145 was the most complex, with peaks at 18,666, 18,682, 18,695, and 18,731 Da. Some of these masses have previously been observed in other wild mice or in inbred mouse strains, notably the protein at 18,694 Da in C57BL/6, BALB/c, and wild mice (Robertson et al., 1996, 1997); 18,682 Da in wild mouse populations (Robertson et al., 1997); and 18,893 Da in inbred C57BL/6 mice (Armstrong et al., 2005).
The corresponding mass spectrum of M. macedonicus Lys-C peptides revealed six peaks that shared masses with peptides in the C57BL/6 reference spectrum (Fig. 4). This was presumptive evidence for the identity of 100 amino acid residues and that the 18,742-Da M. macedonicus protein was indeed a MUP. The amino acid coverage in this instance was 61% (assuming a total of 162 residues). Three of the Lys-C peptides predicted from the cDNA sequence and present in the M. m. domesticus sample were absent from the M. macedonicus MUP (Lys-C peptides L2, L4, and L5), whereas the M. macedonicus sample contained three different peptides (m/z 1,623.0, 2,066.2, and 2,893.9). These peptides, likely to be the equivalents of L2, L4, and L5, were selected for further MS analysis. The presence of a peak at +76 Da (putatively a 2-mercaptoethanol adduct) adjacent to the peptide at m/z 2,066.2 suggested that, in common with the equivalent M. m. domesticus peptide L5, it contained a cysteine residue.
The second unique M. macedonicus peptide (m/z 2,893.9, [M + H]+) was observed as a triply charged ion of m/z 965.2 in the precursor spectrum, implying an internal basic residue. During tandem mass spectrometry, this peptide did not generate a full series of y ions but fragmented to produce a complete set of b ions (Fig. 5b), from which the sequence of the peptide was determined as JEEHGNFRJFJEQJHVJENSJDJK. The sequence of the N-terminal eight residues of this peptide (JEEHGNFR) was confirmed by tandem mass spectrometry of a tryptic digest of the same preparation (data not shown). Although this sequence shared significant homology with peptide L4 from AAI00587 (32IEDNGNFRLFLEQIHVLENSLVLK55), it differed at three positions. The observed differences (from M. m. domesticus to M. macedonicus) were D34E, N35H, and V53D. The arginine residue at position 39 was consistent with the triply charged precursor ion. Thus, an additional 24 residues of the M. macedonicus MUP were identified, and again, the interpretation was confirmed by the agreement between the observed mass difference from the M. m. domesticus L4 peptide (52.4 Da) and the difference predicted from the proposed M. macedonicus sequence (53 Da).
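The predicted mass difference of 53 Da can be checked from the residue masses of the three proposed substitutions (Asp to Glu, Asn to His, and Val to Asp). The following is a minimal sketch using monoisotopic residue masses; the function name is hypothetical, and the use of monoisotopic rather than average masses is an assumption that does not alter the result at this precision.

```python
# Monoisotopic residue masses (Da) for the residues involved
RESIDUE_MASS = {
    "D": 115.02694,  # aspartic acid
    "E": 129.04259,  # glutamic acid
    "N": 114.04293,  # asparagine
    "H": 137.05891,  # histidine
    "V": 99.06841,   # valine
}


def substitution_shift(substitutions):
    """Net peptide mass change (Da) for a list of (old, new) residue pairs."""
    return sum(RESIDUE_MASS[new] - RESIDUE_MASS[old] for old, new in substitutions)


# M. m. domesticus residue -> M. macedonicus residue
proposed = [("D", "E"), ("N", "H"), ("V", "D")]
shift = substitution_shift(proposed)
print(round(shift, 2))
```

The three substitutions contribute approximately +14.02, +23.02, and +15.96 Da, respectively, giving a net shift close to 53 Da, in line with the predicted value quoted in the text.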
Endopeptidase Lys-C digestion of AAI00587 creates two tripeptides (L3 and L6), each of which has a predicted mass of less than 400 Da. Such peptides are difficult to analyze with MALDI-TOF mass spectrometry because of background signal from ions in the chemical matrix. We have, however, observed in previous experiments that Lys-C has a tendency to omit cleavages in the peptide chain in areas where lysine residues are in close proximity. To assess whether the two tripeptides in M. macedonicus possessed the same sequence as those in AAI00587, an additional LC-MS experiment was run on the Lys-C peptides. In this instance, the m/z values for L2 + L3 (1,018.54, [M + 2H]2+) and L5 + L6 (820.70, [M + 3H]3+ with an oxidized methionine residue) were readily observed in extracted ion chromatograms (data not shown). The two tripeptides were therefore assigned to the M. macedonicus sequence as identical to the M. m. domesticus sequence.
In constructing the model M. macedonicus MUP structure, we had to consider two types of uncertainty in the sequences derived from proteomics experiments. First, the amino acids between residues 58 and 61 could not be unambiguously determined by tandem mass spectrometry, and the three possibilities that existed (GRED, GRDE, ARDD) were modeled. Second, it is not possible to discriminate between the isobaric residues leucine and isoleucine. To assess the importance of these residues, we constructed two artificial sequences based on AAI00587, in which all Ile/Leu residues were converted to either Leu or Ile. Models built on the parent 1I04 structure were compared, and the RMSD of the alpha carbon atoms of the two structures was 0.14 Å. The overall main chain trajectories were virtually identical. We conclude that Leu/Ile substitution would not have a major effect on the model structures. Accordingly, we built M. macedonicus models assuming that the identity of each Leu/Ile residue was the same as in AAI00587. For the six variant structures based on the unresolved pentapeptide (see above), all yielded high quality models with alpha carbon RMSD of less than 0.25 Å and similar main chain trajectories (Fig. 9). All proteins passed the Protein Health checks built into the Modeler package. Apart from the ambiguous acidic loop region of the M. macedonicus protein sequences, there are a number of amino acid substitutions relative to MUPs from M. m. domesticus. These changes (H20Y, D34E, N35H, V53D, T58L, V59G) do not influence the ligand binding cavity and are solvent exposed.
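The alpha carbon RMSD values quoted above summarize the average displacement between equivalent atoms in two models. The following is a minimal sketch of the calculation, assuming that the two structures have already been superimposed and that their alpha carbons are listed in the same residue order. The coordinates are invented toy values, the function name is hypothetical, and the sketch does not reproduce the routine used within Discovery Studio.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as input) between two
    equally long lists of (x, y, z) alpha carbon coordinates.

    Assumption: the structures are already superimposed; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: three alpha carbons, second model displaced by 0.1 Å along y
model_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_2 = [(0.0, 0.1, 0.0), (3.8, 0.1, 0.0), (7.6, 0.1, 0.0)]
print(round(ca_rmsd(model_1, model_2), 3))
```

Values of 0.14–0.25 Å are small relative to the distance of about 3.8 Å between consecutive alpha carbons, which is consistent with the statement that the main chain trajectories are virtually identical.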
To assess the level of similarity between the M. macedonicus sequence and those from M. m. domesticus, we compared it to all the known M. m. domesticus sequences in the NCBI database. The amino acid sequence of the M. macedonicus MUP shares most similarity with a group of MUPs for which only incomplete sequences are available. These, in turn, are most similar to the male-specific 18,893 Da MUP discovered in the C57BL/6 mouse, but also widely present in wild caught M. m. domesticus. This protein has a high capacity for binding the male specific ligand 2-sec-butyl-4,5-dihydrothiazole (thiazole), which accords with a function in male-specific signaling (Armstrong et al., 2005).
The simple MUP pattern in M. macedonicus, together with the lack of individual variability, suggests that MUPs do not have sufficient polymorphism to provide an individual ownership signal in scent marks in this species, in contrast to M. m. domesticus. This may reflect the difference in the population ecology of this aboriginal grassland species compared to the commensal house mice and possibly reflects the more ancestral form. While the field ecology of M. macedonicus is not well known, in line with the other aboriginal species of Mus in Europe and the Middle East, M. macedonicus in Turkey can be presumed to have large territories, and individuals are highly agonistic toward each other (reviewed by Frynta et al., 2005). Both male and female M. macedonicus are much more aggressive than M. m. domesticus (Frynta and Čiháková, 1996), and the ranges of individuals may be largely nonoverlapping. By contrast, in commensal M. m. domesticus populations, multiple males and females live within territorial social groups, and there may be extensive spatial overlap between neighbors when borders are not easily defended (Hurst, 1987; Barnard et al., 1991). This results in an unusually high level of aggregation and contact between individuals in this species. As male M. m. domesticus advertise their territorial dominance through scent marks (Hurst and Beynon, 2004), it is possible that this may have been a strong driver for the evolution of individual-specific MUP patterns in M. m. domesticus but not in the aboriginal species. Given their dispersed distribution, there is likely to be much less requirement for advertising individual scent ownership in M. macedonicus and other widely dispersed rodent species. This suggests the intriguing hypothesis that the extreme polymorphism and individual variability in MUP patterns seen in M. m. domesticus is a species-specific adaptation for signaling individuality in a complex social system where individuals vary in social status.
M. macedonicus is one of a number of aboriginal mouse species that have evolved independently of M. m. domesticus, inhabit different environments, and display different forms of behavior. Information about the MUPs and MUP genes in these animals remains scarce (Sampsell and Held, 1985), yet they provide a valuable opportunity to investigate the function of both MUPs and their ligands and the evolution of a family of proteins used for scent signaling. To this end, a wider survey of MUPs from aboriginal mice of both sexes, which includes both primary structure and ligand status, has the potential to produce greater insights into the wider context of MUP polymorphism and function.
We are grateful for the assistance of Coskun Tez and support from the Genetics Society toward the fieldwork. The work was carried out under research grants from the Biotechnology and Biological Sciences Research Council, UK to J.L.H. and R.J.B.
In memory of J.E. Robertson, 1928–2003.
- Berry, R. J. 1981. Population dynamics of the house mouse. Symp. Zool. Soc. Lond. 47:395–425.
- Frynta, D. and Čiháková, J. 1996. Neutral cage interactions in Mus macedonicus (Rodentia: Muridae): an aggressive mouse? Acta Soc. Zool. Bohem. 60:97–102.
- Hurst, J. L. 1987. The functions of urine marking in a free living population of house mice. Anim. Behav. 35:1433–1442.
- Krizova, H., Zidek, L., Stone, M. J., Novotny, M. V., and Sklenar, V. 2004. Temperature-dependent spectral density analysis applied to monitoring backbone dynamics of major urinary protein-I complexed with the pheromone 2-sec-butyl-4,5-dihydrothiazole. J. Biomol. NMR 4:369–384.
- Malone, N., Payne, C. E., Beynon, R. J., and Hurst, J. L. 2001. Social status, odour communication and mate choice in wild house mice, pp. 217–224, in A. Marchlewska-Koj, D. Muller-Schwarze, and J. Lepri (eds.). Chemical Signals in Vertebrates. Plenum Press, New York.
- Pes, D., Robertson, D. H. L., Hurst, J. L., Gaskell, S., and Beynon, R. J. 1999. How many major urinary proteins are produced by the house mouse Mus domesticus?, pp. 149–162, in R. E. Johnston, D. Müller-Schwarze, and P. W. Sorensen (eds.). Advances in Chemical Signals in Vertebrates 9. Plenum Press, New York.
- Silver, L. M. 1995. Mouse Genetics, Concepts and Applications. Oxford University Press, Oxford, UK.
- Veggerby, C., Payne, C. E., Robertson, D. H. L., Gaskell, S. J., Humphries, R. E., Hurst, J. L., and Beynon, R. J. 2002. Polymorphism in major urinary proteins: heterogeneity in a wild mouse population. J. Chem. Ecol. 28:1425–1440.