Biochemical Genetics, Volume 46, Issue 3, pp 154–161

A Rare Y Chromosome Missense Mutation in Exon 25 of Human USP9Y Revealed by Pyrosequencing


  • Lynn M. Sims
    • Graduate Program in Biomolecular Sciences, University of Central Florida
    • National Center for Forensic Science
  • J. Ballantyne
    • Graduate Program in Biomolecular Sciences, University of Central Florida
    • Department of Chemistry, University of Central Florida
    • National Center for Forensic Science

DOI: 10.1007/s10528-007-9139-1

Cite this article as:
Sims, L.M. & Ballantyne, J. Biochem Genet (2008) 46: 154. doi:10.1007/s10528-007-9139-1


Abstract

Ubiquitin-specific protease 9, Y-linked (USP9Y), is a protein encoded by the Y chromosome. Its precise function in the cell is unknown, although a role in the regulation of protein turnover has been postulated. Nonetheless, mutations in this gene could result in the over- or under-abundance of proteins involved in the regulation of spermatogenesis. We have identified a novel mutation, SM1, located in exon 25 of USP9Y (c.3642G→A), which results in an amino acid substitution (p.V1214I). The mutation is closely linked to (four bases distant from) a silent mutation referred to as M222 (p.E1212E, c.3636G→A). In our male population (n = 374), SM1 was found in one individual (0.3%), who belongs to the recently described haplogroup R1b3h, defined by the U152 SNP. This new mutation is expected to represent a new haplogroup (R1b1c10a); within our population of individuals from haplogroup R1b3h (R1b1c10) (n = 16), it has a frequency of 6.3% (95% CI: 2.7–9.9%).




Introduction

Single nucleotide polymorphisms (SNPs) are the smallest and most abundant type of human DNA polymorphism (Brookes 1999). SNPs have been extensively used in the study of human evolutionary and migratory patterns (Shastry 2002), in linkage analysis, and for establishing loss of heterozygosity (Craig et al. 2005). More recently, SNPs have been used for genomewide fine mapping of disease-associated genes and large-scale association studies (Syvanen 2005). Y-SNPs, in particular, are of interest for their paternal inheritance, lack of recombination, abundance, and low mutation rate and are currently being investigated for characterizing male population structure and individualization in forensic science (Brion et al. 2005; Hammer et al. 2005; Jobling 2001; Kidd et al. 2006; Lao et al. 2006; Onofri et al. 2006; Sanchez et al. 2003; Underhill et al. 2001; Vallone and Butler 2004).
A Y-SNP, M222 (rs20321: G→A), within exon 25 of the ubiquitin-specific protease 9 gene located on the Y chromosome (USP9Y: OMIM:400005, HGNC:12633), originally termed DFFRY (Jones et al. 1996), was recently evaluated for its use in population differentiation and found at a frequency of 2% (Sims et al. 2007). In another study, M222 was found essentially to define the Irish modal haplotype (IMH) (Moore et al. 2006). M222 is a synonymous SNP that appears to be rarely used in evolutionary studies because it is located within a gene, USP9Y, in which mutations have been found to be associated with infertility (Sun et al. 1999). USP9Y is located within the AZFa region of the Y chromosome, where deletions within it and the DBY gene have been associated with azoospermia or severe oligozoospermia (Foresta et al. 2000a, b; Friel et al. 2002; Kleiman et al. 2007; Krausz et al. 2006; Kuo et al. 2004; Lin et al. 2004; Van Landuyt et al. 2001). Since M222 is a synonymous substitution, it is unlikely to produce significant deleterious consequences for USP9Y protein function.
During the course of a population study for M222 that involved pyrosequencing (Ronaghi 2001) of the SNP site and contiguous bases, a novel mutation (SM1, single mutation 1) was identified. It is located exactly four nucleotides from M222 and appears to be a missense mutation (p.V1214I). Only one individual in our population sample possessed this mutation; interestingly, he belongs to the recently phylogenetically characterized haplogroup R1b3h (R1b1c10 according to the International Society of Genetic Genealogy, ISOGG), which is defined by the SNP U152 (rs1236440: C→T) (Sims et al. 2007). Other individuals from several different Y-SNP haplogroups, including the individuals that possess the derived M222 allele, were tested. None were found to possess the novel mutation. Furthermore, the one individual who possesses the new mutation has the ancestral M222 allele; therefore, this new mutation most likely represents a new, albeit rare, sub-R1b3h haplogroup.

Materials and Methods

Study Subjects

A total of 374 unrelated male individuals were sequenced, including 5 males known to possess the M222 SNP, who belong to Y-SNP haplogroup R1b1c7, and 16 individuals known to belong to haplogroup R1b3h (ISOGG: R1b1c10), defined by the SNP U152 (Sims et al. 2007). An additional 13 (+) M222 individuals from an independent study were tested to ascertain any correlation between the M222 and SM1 alleles. All samples were obtained with the individual’s informed consent in accordance with the requirements of the University of Central Florida’s Institutional Review Board.

Genomic DNA Isolation and Amplification

Genomic DNA was extracted from whole blood dried on filter paper or from dried buccal swabs by standard organic extraction using phenol/chloroform/isoamyl alcohol, followed by purification with Microcon centrifugal devices for blood and/or ethanol precipitation for buccal swabs. PCR primers (M222F: 5′-CAGAGCATTCCTAATCCCTCA-3′ and M222R: Biotin-5′-CCTGAGCAAGAAGTATGGACTC-3′) and the pyrosequencing extension primer (M222E: 5′-CTAATCCCTCATCCGA-3′) were designed using Primer3 (Rozen and Skaletsky 2000) and SNP Primer Design software version 1.0.1 (Pyrosequencing AB), respectively, specifically to amplify regions flanking and including the M222 SNP. Female controls were genotyped to ensure candidate markers were confined to the Y chromosome. Sequence data were detected only in male individuals. The 50 μL single-plex PCR reaction contained 1 ng DNA, 0.2 μM each primer (forward and reverse), 125 μM dNTPs, 1× PCR Buffer II (10 mM Tris–HCl, pH 8.3, 50 mM KCl), 2.0 mM MgCl2, 10 μg nonacetylated BSA (Sigma, St. Louis, MO), and 1.5 U AmpliTaq Gold Polymerase (Applied Biosystems, Foster City, CA). Cycling conditions were (1) 95°C for 10 min, (2) 45 cycles of 95°C for 15 s, 54°C for 30 s, and 72°C for 15 s, and (3) final extension at 72°C for 5 min, followed by a hold at 4°C until further analysis.
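As a quick sanity check on the oligonucleotides and thermal profile listed above, the following Python sketch reports the length and GC content of each primer and the total programmed hold time of the cycling protocol. The helper names are illustrative only, and the time total ignores ramping between temperatures, which depends on the thermal cycler.

```python
# Primer sequences as given in the text (5'->3'); the biotin label on M222R is not modeled.
PRIMERS = {
    "M222F": "CAGAGCATTCCTAATCCCTCA",
    "M222R": "CCTGAGCAAGAAGTATGGACTC",
    "M222E": "CTAATCCCTCATCCGA",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a primer sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def programmed_minutes(initial_s, cycles, step_seconds, final_s):
    """Sum of programmed hold times in minutes (ramp times are ignored)."""
    return (initial_s + cycles * sum(step_seconds) + final_s) / 60


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.1%}")

# 95°C for 10 min; 45 cycles of 15 s + 30 s + 15 s; final extension of 5 min.
total = programmed_minutes(600, 45, (15, 30, 15), 300)
print(f"Programmed hold time: {total:.0f} min")
```

The 4°C hold is left out of the total because it is open-ended.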

Pyrosequencing and Sequence Data Alignment

Genotyping was performed by pyrosequencing on a PSQ 96 MA instrument using PyroGold Reagents (Biotage, Uppsala, Sweden), according to the manufacturer’s recommendations (Pyrosequencing 2003). Short reads (∼15 bp) were sequenced using the extension primer designed for SNP genotyping, and long reads (∼100 bp), covering larger regions of DNA, were sequenced using the nonbiotinylated forward primer. Sequence alignment and translation of the presumed amino acids were prepared using MegAlign within the Lasergene software package (1994 Lasergene, DNAStar, Madison, WI) (Hein 1990), using four sequences (the BV679147 reference sequence and individuals CC44, CC20, and 9948) that represent all possible genotypes for M222 and SM1.


Results

During the course of a population study to identify useful Y-SNPs, a novel G→A mutation (SM1), located at the fourth nucleotide downstream of M222, was identified in exon 25 of the USP9Y gene. Since pyrosequencing is a sequencing-by-synthesis method, sequence data are obtained only when the dispensed nucleotide is the one incorporated next. Thus, the new mutation appeared as a truncated sequence result for one individual (CC44) when using the M222 SNP assay (Fig. 1). All individuals, including CC44, were then resequenced for the whole PCR product amplified for M222 (64 bp) (Fig. 2). For this, the dispensation order of nucleotides on the pyrosequencing instrument was set to the expected sequence of the product, with all possible nucleotides entered at the position in question, in order to determine which nucleotide CC44 carries at that position. Once the nucleotide sequence was determined, all 374 individuals were amplified and the full product was sequenced in order to establish the frequency of SM1 in our population. In the male population (n = 374), which represented most Y haplogroups, SM1 was found once (0.3%), in an individual from the recently described haplogroup R1b3h, defined by U152 (Sims et al. 2007). Since only five individuals in our original population sample were M222 (+), and we wanted to test for any allelic association between M222 and SM1, we selected and tested an additional 13 M222 (+) individuals from a previous study. We found no association between the M222 SNP and the SM1 mutation. Even though SM1 is present in only 0.3% of the entire population, it has a frequency of 6.3% (95% CI: 2.7–9.9%) within our population of individuals belonging to haplogroup R1b3h (n = 16). Detailed allele frequency data are listed in Table 1.
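The way an unexpected base truncates a pyrosequencing read can be illustrated with a toy simulation. The Python sketch below is an assumption-laden illustration, not the instrument's algorithm: the 7-base sequences and both dispensation orders are hypothetical stand-ins (site 1 plays the role of M222, site 5 the role of SM1), and the function name is invented for this example.

```python
def pyrosequence(target: str, dispensation: str):
    """Toy sequencing-by-synthesis run.

    target is the sequence of the strand being synthesized. Returns (peaks, read):
    peaks holds one (nucleotide, bases incorporated) pair per dispensation, and
    read is the sequence obtained by the end of the run.
    """
    position = 0
    peaks = []
    for nucleotide in dispensation:
        incorporated = 0
        # A dispensed nucleotide is incorporated only if it is the next base needed.
        while position < len(target) and target[position] == nucleotide:
            incorporated += 1
            position += 1
        peaks.append((nucleotide, incorporated))
    return peaks, target[:position]


# Hypothetical sequences, not taken from USP9Y.
REFERENCE = "GCTCGTC"
VARIANT = "GCTCATC"  # G->A at site 5

SNP_ASSAY = "AGCTCGTC"        # A/G at site 1, then the expected reference bases
RESEQUENCING = "AGCTCACGTTC"  # as above, but all four nucleotides at site 5

for label, order in (("SNP assay", SNP_ASSAY), ("resequencing", RESEQUENCING)):
    for name, seq in (("reference", REFERENCE), ("variant", VARIANT)):
        _, read = pyrosequence(seq, order)
        print(f"{label:12s} {name:9s} read = {read}")
```

With the SNP-assay order the variant read stops before site 5, because the nucleotide it needs there is never dispensed. Dispensing all four nucleotides at that site recovers the full read.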
Since the mutation is in the coding region of USP9Y, we investigated the possibility that the mutation results in a change in the protein product of the gene. Indeed, aligning the SM1 (+) and SM1 (−) sequences against the reference sequence BV679147 (Repping et al. 2006) shows that SM1 could result in an amino acid change (p.V1214I) (Fig. 2).
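Whether a G→A substitution is silent or missense depends on the codon and on the position within it. The Python sketch below classifies single-base substitutions with the standard genetic code. The codons are assumptions chosen for illustration (GAG for glutamate and the four GTN codons for valine), since the text gives the amino acid changes but not the codon sequences, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify(codon: str, index: int, new_base: str):
    """Return (mutated codon, amino acid before, amino acid after, class)."""
    mutated = codon[:index] + new_base + codon[index + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        kind = "silent"
    elif after == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return mutated, before, after, kind


# G->A at the third position of an assumed glutamate codon (M222-like change).
print("GAG", classify("GAG", 2, "A"))

# G->A at the first position of each valine codon (SM1-like change).
for codon in ("GTT", "GTC", "GTA", "GTG"):
    print(codon, classify(codon, 0, "A"))
```

Under the standard code, a first-position G→A change gives isoleucine for GTT, GTC, and GTA, whereas GTG would give methionine.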

Fig. 1. M222 pyrosequencing SNP assay. The pyrosequencing assay was designed using the primers indicated in the “Methods” section, which result in the sequencing of the forward strand. An example of the theoretical outcome (expected for the Y chromosome only) for this assay is shown, with actual results for an individual that is (+) M222 and one that is (−) M222. The individual with the change in sequence (CC44) is shown below the correct sequences.

Fig. 2. Sequence analysis of full PCR product. (a) The sequence of the full product (minus the forward primer), using the forward primer as the extension primer. Arrows indicate positions of variation, M222 (first arrow) and SM1 (second arrow). (b) The DNA and amino acid sequence of the full product (including the forward primer) up to nucleotide position 43 in the sequence from (a). Nucleotide changes (blue) and amino acid changes (green) are highlighted. BV679147, reference sequence. CC20, (+) M222. CC44, (+) SM1. 9948, (−) both.

Table 1. Allele frequency of SM1 in the entire population and subpopulations

Population                                        SM1 (−), % (n)    SM1 (+), % (n)
Entire population (a)                             99.7% (373)       0.3% (1)
  European Americans                              99.5% (181)       0.5% (1)
  African Americans                               100% (122)
  Hispanic/Latin Americans                        100% (40)
  (label missing)                                 100% (20)
  (label missing)                                 100% (10)
Total population belonging to haplogroup R1b3h    93.7% (13)        6.3% (1)
  European Americans                              92.3% (12)        7.7% (1)
  African Americans                               100% (1)
  (label missing)                                 100% (2)

(a) Does not include the 13 additional (+) M222 individuals
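The point frequencies quoted in the text follow directly from the counts (1 of 374 overall, 1 of 16 within haplogroup R1b3h). The Python sketch below recomputes them and adds a Wilson score interval as one common way to attach a confidence interval to a small-sample proportion. The text does not state how its confidence interval was computed, so the Wilson method here is an assumption and its bounds need not match the published values.

```python
from math import sqrt


def proportion(count: int, total: int) -> float:
    """Observed allele frequency."""
    return count / total


def wilson_interval(count: int, total: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = count / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


# Counts as stated in the text: one SM1 carrier among 374 males, and among 16 R1b3h males.
for label, count, total in (("entire population", 1, 374), ("haplogroup R1b3h", 1, 16)):
    low, high = wilson_interval(count, total)
    print(f"{label}: {proportion(count, total):.2%} "
          f"(Wilson 95% interval {low:.1%} to {high:.1%})")
```

With a single carrier among 16 individuals, any interval estimate is wide, which is consistent with the call for testing more R1b3h individuals.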


Discussion

While genotyping individuals for a population study on the use of Y-SNP haplogroups for the potential determination of ethnogeographic ancestry, we performed extensive typing of M222, a SNP in exon 25 of USP9Y. M222 has apparently received less attention as a male lineage marker because of its possible association with male infertility (Sun et al. 1999). Since the M222 SNP is a silent mutation, it probably does not have significant deleterious consequences for USP9Y. M222 was found in 2% of our population, and it was associated with the 17-marker Y-STR Irish modal haplotype (Sims et al. 2007), which is found at high frequencies in Northwest Ireland (Moore et al. 2006). This in itself is further correlative evidence that M222 probably has only minor effects on male fertility. Thus, we have continued to use this marker for the differentiation of individuals within haplogroup R1b3. During the investigation of M222 in some additional individuals, a novel mutation was identified, located at the fourth nucleotide downstream from M222. This mutation is a G→A substitution, which should result in an amino acid change from valine to isoleucine in the USP9Y protein.

In a recent study, USP9Y was identified as a ubiquitin-specific protease with a C-terminal hydrolase domain in males, suggesting the involvement of the ubiquitin system in spermatogenesis and a role in infertility (Ginalski et al. 2004). The same study predicted the structural characteristics of several Y-chromosomal proteins and suggested that USP9Y may play a role in the regulation of protein turnover by de-ubiquitinating proteins, protecting them from degradation by the proteasome (Ginalski et al. 2004). On the other hand, it has been suggested that USP9Y is more of a fine tuner, improving efficiency, and thus does not provide a key function in spermatogenesis (Krausz et al. 2006). Although the amino acid substitution identified here is conservative with respect to protein charge, it may affect protein folding, potentially affecting the protein’s function, efficiency, or cellular localization. It was recently shown that even silent mutations can affect protein folding by changing the translation kinetics, resulting in a change in substrate specificity (Kimchi-Sarfaty et al. 2007). The relatively low frequency of SM1 in our population (∼0.3%) could be due to sampling bias, to a recent origin (a private mutation, or confinement to a rare male lineage), or even to negative selection. Further studies would be required to eliminate some of these possibilities.

The SM1 mutation may or may not affect the function of the enzyme, but it certainly warrants further investigation in terms of Y-SNP haplogroups. The SM1 mutation is expected to represent a new haplogroup, R1b3h1 (ISOGG: R1b1c10a). Haplogroup R1b3h has only recently been characterized; therefore, most studies do not include U152 in their Y-SNP testing panels. Furthermore, since we were able to test only 16 individuals belonging to haplogroup R1b3h in the present work, it is possible that the derived SM1 allele is more frequent within haplogroup R1b3h than in the general population (if it is not a private mutation). Future work will test for SM1 in more haplogroup R1b3h individuals.


Acknowledgments

This work was supported under Award Nos. 1998-IJ-CX-K003 and 2005-MU-MU-K044 from the Office of Justice Programs, National Institute of Justice, Department of Justice. Points of view in this manuscript are those of the authors and do not necessarily represent the official position of the U.S. Department of Justice. The authors would like to acknowledge Dr. George Duncan, Nova Southeastern University, for his assistance in obtaining many of the ethnically diverse biological samples used in this study, and Dr. Allah Rakha, Xi’an Jiaotong University, for donating biological samples from Pakistan.

Copyright information

© Springer Science+Business Media, LLC 2008