Introduction

Globally, an estimated 604,000 new cases of cervical cancer and 342,000 deaths from the disease occurred in 2020, making it the fourth most frequently diagnosed cancer [1]. Females living in developing countries accounted for about 85% of the cervical cancer burden, and the death rate there was higher than in developed countries (12.4 vs 5.2 per 100,000) [1,2]. In China, approximately 111,820 new cases of cervical cancer and 61,579 deaths were projected for 2022 [3].

Human papillomavirus (HPV) is identified in most patients with cervical carcinoma, and persistent infection with one or more oncogenic HPV genotypes is the necessary cause of cervical cancer [4,5,6]. HPV is a small, circular, non-enveloped, double-stranded DNA virus that belongs to the family Papillomaviridae [7]. More than 200 HPV genotypes have been identified and classified as high-risk HPV (HR-HPV) or low-risk HPV (LR-HPV) according to their potential to cause cancer [8]. Globally, the most common HPV genotypes in invasive cervical cancer are 16, 18, 31, 33, 35, 45, 52 and 58 [9]. The distribution of HPV genotypes has also been reported to differ significantly among countries, and even among regions of one country. For example, the most prevalent HR-HPV genotype was HPV52 in Beijing and HPV16 in Shanghai, China [10,11]. The preventive effect of HPV vaccines on cervical cancer and other HPV-associated diseases has been confirmed in multiple studies [12,13]. Currently, commercial 2-valent (HPV16 and 18), 4-valent (HPV6, 11, 16 and 18) and 9-valent (HPV6, 11, 16, 18, 31, 33, 45, 52 and 58) HPV vaccines have been approved by the National Medical Products Administration of China. Some provinces of China, such as Guangdong, have piloted HPV vaccination programs among local females. Understanding the distribution of HPV genotypes in a region provides baseline information for implementing vaccine-based HPV prevention strategies. To date, information about the genotype distribution of HPV in Henan province, in central China, is limited.

An epidemiological study showed that, apart from HPV16, HPV18 and HPV58 were the most prevalent genotypes among females with cervical lesions in Henan province of China [14]. HPV18 is divided into three lineages and ten sublineages: (1) A, with A1–A6; (2) B, with B1, B2 and B3; (3) C [15]. HPV58 is classified into four lineages and eight sublineages: (1) A, with A1–A3; (2) B, with B1 and B2; (3) C; (4) D, with D1 and D2 [15]. The sublineages of HPV18 and HPV58 have been determined in Hunan and Zhejiang provinces of China, where A1 was the predominant sublineage [16,17]. To our knowledge, there are no reports on the sublineages and sequence variations of HPV18 and HPV58 in Henan province.

The objective of the present study was to investigate the HPV genotype distribution and the sequence variations of HPV18 and HPV58 among females in Luoyang city, Henan province, in central China. The findings should assist the formulation and development of vaccine-based HPV prevention strategies against cervical cancer.

Results

Characteristics of the study participants

A total of 6538 females were included in this study, and 807 (12.34%) were infected with HPV. As shown in Table 1, the positive rate of HR-HPV was 9.85%, higher than that of LR-HPV (3.79%) (χ2 = 188.673, P < 0.01). The top five HR-HPV genotypes were HPV52 (1.94%), HPV16 (1.93%), HPV58 (1.48%), HPV51 (1.02%) and HPV39 (0.99%). The most prevalent LR-HPV genotype was HPV61 (0.89%), followed by HPV54 (0.72%), HPV81 (0.60%), HPV6 (0.37%) and HPV11 (0.29%). Among the 807 HPV-positive females, 637 were infected with a single HPV genotype and 170 with more than one genotype. Among the 637 females with a single-genotype infection, HR-HPV accounted for 76.45% (487/637). The rate of multiple HR-HPV infection was 2.40%, higher than that of multiple LR-HPV infection (1.50%) (χ2 = 13.922, P < 0.01). The most prevalent genotype in multiple infections was HPV52 (0.70%), followed by HPV58 (0.55%), HPV16 (0.47%), HPV39 (0.41%) and HPV51 (0.41%).
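The HR-HPV versus LR-HPV comparison above has the form of a Pearson chi-squared test on a 2 × 2 table. The sketch below is illustrative only: the positive counts are back-calculated from the reported rates (9.85% and 3.79% of 6538) rather than taken from Table 1, no continuity correction is assumed, and `pearson_chi2_2x2` is a hypothetical helper name. The analysis in the study itself was run in SPSS.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction)
    for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


TOTAL = 6538
# Assumption: positives back-calculated from the reported percentages.
hr_pos = round(0.0985 * TOTAL)  # HR-HPV positives
lr_pos = round(0.0379 * TOTAL)  # LR-HPV positives

# Rows: HR-HPV, LR-HPV; columns: positive, negative.
chi2 = pearson_chi2_2x2(hr_pos, TOTAL - hr_pos, lr_pos, TOTAL - lr_pos)
print(f"chi2 = {chi2:.3f}")
```

Under these assumptions the statistic comes out at about 188.67, in line with the reported χ2 = 188.673.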

Table 1 The prevalence of 37 HPV genotypes of single/multiple infection in all the specimens (n = 6538).

Prevalence of HPV infection in different age groups

The positive rates of HPV infection differed significantly among age groups (χ2 = 149.128, P < 0.01). The highest rate of infection with any HPV type was observed in the ≤ 20-year-old group (29.73%, 11/37), followed by the 61–65-year-old group (25.44%, 43/169), while the 31–35-year-old group had the lowest prevalence (9.95%, 120/1206). As shown in Fig. 1, LR-HPV, HR-HPV, single and multiple infections followed the same trend across age groups as infection with any HPV type.

Figure 1 Prevalence of the HPV infection types in different age groups.

Variations of L1 genes on HPV18 and HPV58

Thirty-nine HPV18 and fifty-six HPV58 L1 genes were sequenced successfully. After identical sequences were excluded, thirteen distinct HPV18 (18HNL01–18HNL13) and twenty-four distinct HPV58 (58HNL01–58HNL24) sequences were submitted to GenBank (accession numbers OP684028–OP684040 and OP684041–OP684064, respectively). The HPV18 (AY262282) and HPV58 (D90400) reference sequences were used as the standards for comparison and for numbering the polymorphic sites of the respective genotypes. The nucleotide sequence mutations of the studied sequences are shown in Tables 2 and 3.

Table 2 Nucleotide sequence mutations of HPV18 L1 gene.
Table 3 Nucleotide sequence mutations of HPV58 L1 genes.

For the HPV18 L1 gene, the 18HNL02 sequence accounted for 48.7% (19/39) of isolates and was the predominant strain. Twenty-one variations were observed in the HPV18 L1 gene, four of which were non-synonymous mutations: G5503A (R25Q, 39/39), C5920T (A164V, 5/39), A5796G (I123V, 3/39) and C5875A (T149N, 3/39). The most frequent synonymous mutations in the HPV18 L1 gene were A5832C (6/39) and A5924C (6/39).
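Paired labels such as G5503A (R25Q) link a genome coordinate to a codon of the L1 open reading frame. A minimal sketch of that mapping follows. The L1 start coordinate 5430 is an assumption inferred from the four reported nucleotide and amino acid pairs, not a value stated in this paper, and `codon_position` is a hypothetical helper name.

```python
def codon_position(nt_pos, orf_start):
    """Map a 1-based genome coordinate to (codon number, position in codon).

    Both returned values are 1-based; orf_start is the genome coordinate
    of the first base of the open reading frame.
    """
    offset = nt_pos - orf_start
    if offset < 0:
        raise ValueError("position lies upstream of the ORF start")
    return offset // 3 + 1, offset % 3 + 1


# Assumption: HPV18 L1 start inferred from the reported pairs, not from the paper.
HPV18_L1_START = 5430

# Reported non-synonymous mutations: genome coordinate -> amino acid position.
reported = {5503: 25, 5920: 164, 5796: 123, 5875: 149}

for nt_pos, aa_pos in reported.items():
    codon, pos_in_codon = codon_position(nt_pos, HPV18_L1_START)
    print(f"nt {nt_pos}: codon {codon} (reported {aa_pos}), base {pos_in_codon} of codon")
```

With this assumed start coordinate, all four coordinates fall in the codons named by the reported amino acid changes.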

For the HPV58 L1 gene, fifteen nucleotide changes were non-synonymous mutations. The most prevalent synonymous variation was A6560G (27/56), which was found in 26 HPV58 sublineage A1 isolates. The most frequent non-synonymous mutation was C6688A (T375N, 24/56), which was observed in 24 HPV58 sublineage A1 isolates. The T6434C, A6539G and G6641A variations were found only in sublineage A2. The A6014C synonymous variation was found in 15 HPV58 sequences, 14 of which belonged to sublineage A2.

Variations of E6–E7 genes on HPV18 and HPV58

E6–E7 gene sequences were obtained from thirty-nine HPV18 and fifty-six HPV58 isolates, and the nucleotide and amino acid sequence variations are summarized in Tables 4 and 5. For HPV18, thirteen distinct E6–E7 gene sequences (18HNE01–18HNE13) were submitted to GenBank (accession numbers OP684065–OP684077 for HPV18 E6 and OP684078–OP684090 for HPV18 E7). 18HNE01 was the predominant strain (27/39) and was identical to the HPV18 reference sequence (AY262282). Five non-synonymous and four synonymous variations were identified in the E6 gene. One non-synonymous and seven synonymous variations were observed in the E7 gene. The non-synonymous mutations in E6 included E29Q, E40K, R74K and L93R; the one in E7 was Q222H.

Table 4 Nucleotide sequence mutations of HPV18 E6-E7 genes.
Table 5 Nucleotide sequence mutations of HPV58 E6-E7 genes.

For HPV58, eleven distinct E6–E7 sequences (58HNE01–58HNE11) were obtained and submitted to GenBank (accession numbers OP684091–OP684101 for HPV58 E6 and OP684017–OP684027 for HPV58 E7). A total of thirteen mutations were observed, five in E6 and eight in E7. The most frequent synonymous mutations were C307T (11/56) in the E6 gene and T744G (42/56) in the E7 gene. The most prevalent non-synonymous mutations were A388C (K93N) in the E6 gene and T803C (V77A) in the E7 gene.

Phylogenetic analysis

Phylogenetic trees were constructed from the full-length L1 genes of the HPV18 and HPV58 isolates, together with reference HPV18 and HPV58 L1 sequences representing the individual variant lineages and sublineages. As shown in Fig. 2, 92.3% (36/39) of HPV18 isolates fell into sublineage A1 and 7.7% (3/39) belonged to sublineage A5. Among the fifty-six HPV58 isolates, 75.0% (42/56) belonged to sublineage A1 and 25.0% (14/56) to sublineage A2 (Fig. 3).

Figure 2 Phylogenetic tree generated using nucleotide sequences of the HPV18 L1 gene. Study sequences are marked with dots; sequences without dots are reference strains: A1 (AY262282, EF202143, MF288710, LC509001, KC470208, GQ180788, MF288706, LC508998), A2 (EF202146, KC470210, KC470211), A3 (EF202147, EF202148, EF202149), A4 (EF202150, EF202151, KC470213), A5 (GQ180787, MF288727), A6 (MF288723, MF288724, MF288725), B1 (EF202153, EF202154, EF202155), B2 (KC470224, KC470225), B3 (EF202152) and C (KC470229, KC470230). The tree was constructed by the Maximum Likelihood method with the MEGA 6.0 package. Only bootstrap values above 70% are displayed on the branches.

Figure 3 Phylogenetic tree generated using nucleotide sequences of the HPV58 L1 gene. Study sequences are marked with dots; sequences without dots are reference strains: A1 (D90400, KY225918, KY225919, FJ385262–FJ385268), A2 (KY225926, KY225931, KY225934, HQ537752), A3 (KY225936, KY225937, KY225940, HQ537756, HQ537758), B1 (HQ537761–HQ537763), B2 (HQ537764, HQ537765, KY225956, KY225957), C (KY225961, KY225962, HQ537773, HQ537777), D1 (HQ537766–HQ537768) and D2 (KY225966, KY225967, HQ537768–HQ537770). The tree was constructed by the Maximum Likelihood method with the MEGA 6.0 package. Only bootstrap values above 70% are displayed on the branches.

Risk association with cervical lesions

Because each non-synonymous mutation in the HPV18 E6–E7 genes was found in only one sample, the association between amino acid substitutions and cervical lesions was assessed for HPV58 only. Among the fifty-six females infected only with HPV58, thirty-one had a normal cervix on colposcopy examination and twenty-five were diagnosed with cervical intraepithelial neoplasia grade 2 (CIN2) or worse. No association was found between amino acid substitutions and cervical lesions (Table 6).

Table 6 Analysis on the oncogenic risk association of HPV58 E6 and E7 amino substitutions.

Discussion

Cervical cancer remains a major cause of death among females in China, with an estimated 111,820 new cases and 61,579 deaths in 2022 [3]. Persistent infection with HR-HPV is known to be the necessary cause of cervical cancer [4,5,6]. A retrospective study showed that HPV16 and HPV18 were detected in 71% of invasive cervical cancers worldwide [9]. In the present study, the distribution of HPV genotypes was investigated among 6538 females who attended the gynecological outpatient clinic in Luoyang city during 2019–2021. The overall prevalence of HPV was 12.34%, which was similar to that in Zhengzhou city of Henan province but lower than that in Beijing (21.06%) or Shanghai (18.98%) [10,11,18]. The most prevalent HR-HPV genotypes in Luoyang city were HPV52, 16, 58, 51 and 39. The 2-valent, 4-valent and 9-valent HPV vaccines covered 28.6%, 34.0% and 73.1% of HR-HPV-positive samples in the present study, respectively (data not shown). Vaccines that also included HPV51 and HPV39 would cover 87.6% of HR-HPV infections in Luoyang city (data not shown), which should be taken into consideration in the future. In China, the knowledge score for HPV vaccines and the proportion of females willing to receive them were relatively low [19]. More efforts should be made by the government to increase awareness and knowledge of HPV vaccines.

The prevalence of HPV infection was calculated for each age group. HPV infection showed two peaks, one in females aged ≤ 20 years and the other in females aged 61–65 years. Two age peaks of HPV infection have also been observed in other reports [18,20,21,22]. The first peak may be due to the lack of immunity to HPV in females aged ≤ 20 years [23]. Adolescent girls should therefore be given priority in HPV vaccination programs. The second peak, in the 61–65-year-old group, is assumed to be caused by physiologic and immunologic deregulation [24]. Although the Chinese government has invested heavily in cancer screening since 2009, more attention should be paid to females around 60 years old [25].

Globally, HPV18 is the second most carcinogenic HPV genotype and is found in a higher proportion of cervical adenocarcinomas (ADC) [26]. HPV58 accounted for 6.4% of invasive cervical cancers worldwide, with an especially high proportion in Eastern Asia [27,28]. HPV18 and HPV58 have been reported to be significantly associated with an increased risk of cervical cancer in China [14,29,30,31]. Genetic variations and sublineages of HPV may affect pathogenic potential and host immune responses [32,33,34,35]. In the present study, the L1, E6 and E7 genes of HPV18 and HPV58 were sequenced. The phylogenetic tree based on the L1 gene showed that the most common HPV18 sublineage was A1, as in other provinces of China [16,17]. In other countries of Eastern Asia and the Pacific, such as Korea and Japan, the predominant HPV18 sublineage was also A1 [36,37]. Compared with HPV18 lineage A, lineages B and C tend to carry a higher cancer risk [38]. Four non-synonymous mutations were found in the HPV18 L1 gene (R25Q, I123V, T149N and A164V), all of which have been detected in Zhejiang province of China [17]. The R25Q, T149N and A164V mutations were also prevalent in Korean HPV18 sublineage A1 isolates [36]. In China, the distribution of the R25Q mutation has been reported to differ by geographical region and ethnic group [36,39]. The A5474G, A5741G, A5796G, C5875A, G6089A, G6143C, A6406G, A7079T and G7130A variations were found only in 18HNL03, which represented sublineage A5. Compared with the HPV18 sublineage A5 reference (accession number GQ180787), the A5468G, A5790G, T5914C and A7073T substitutions were found in the L1 sequence. For the HPV18 E6 and E7 genes, twenty-seven 18HNE01 isolates shared the same sequence as the reference. The E29Q, E40K and L93R mutations in the E6 protein have also been reported in Zhejiang province of China [17]. Because of the limited numbers of 18HNE02–18HNE13 sequences, associations between amino acid substitutions and cervical lesions were not analyzed.

For the HPV58 L1 gene, five non-synonymous mutations (N82T, L150F, F318Y, I325M and T375N) have previously been detected in Liaoning province of China [35]. L150F (15/56) is located in the DE loop of the L1 protein, which plays an important role in the recognition of virus-like particles (VLPs) [35]. The A6014C, A6416G, T6434C and A6539G variations in the L1 gene were present throughout HPV58 sublineage A2. Variations in this fragment of the L1 gene (nucleotides 6014–6539) have been reported to be characteristic of HPV58 sublineage A2 [35]. In the current study, HPV58 sublineage A1 was the predominant sublineage, as in other provinces of China such as Zhejiang, Hunan and Liaoning (56.9%) [16,34,35]. In Japan, most HPV58 isolates belonged to lineage A, with sublineage A2 being more common [40]. Nevertheless, no association between HPV58 (sub)lineages and cervical lesions has been reported [34]. For the HPV58 E6 gene, A388C (K93N) was the predominant mutation (55.4%, 31/56), and it has also been reported in Hubei province of China [41]. In our study, no significant difference was observed among HPV58 (K93N)-infected females with normal cervix or low-grade lesions. However, K93N has been reported to significantly reduce the risk of cervical lesions in Hong Kong and Shanghai [42,43]. The most common synonymous mutations observed in the HPV58 E6 and E7 genes were C307T and T744G, which have also been reported previously [44,45,46]. The C307T and T744G mutations were also common in other countries, such as Mexico, Korea and Italy [35,47,48,49]. However, the variations identified in the present study have been reported to have no association with cervical lesions [34,35,49].

Conclusion

In summary, the present study provides baseline information about the distribution, genotypes and sequence variations of HPV among the female population in Luoyang city, which should assist in the formulation of HPV screening and vaccination programs and of preventive strategies for HPV-attributable cancer in this region.

Methods

Study subjects and specimen collection

From April 2019 to April 2021, 6538 females (aged 18 to 90 years; mean 41.14 ± 11.42 years) who underwent cervical cancer screening at the 989 Hospital of Joint Service Support Force of Chinese PLA (Luoyang city, Henan province, China) were included in this study. A female was eligible if she: (a) had not used vaginal medication or washing in the previous 72 h; (b) had no sexual activity in the previous 24 h; (c) was not menstruating; (d) had not used acetic acid or iodine. Written informed consent was obtained from each participant before specimen collection. The study protocol adhered to the principles of the Declaration of Helsinki and was approved by the institutional ethics committee (Grant No: LLSC20190305).

HPV genotyping

Cervical specimens were collected by a gynecological practitioner using a cytobrush from the ecto- and endocervix. The samples were stored at −20 °C until HPV genotyping. HPV genotyping was performed with a commercial gene chip (Chaozhou Hybribio Limited Corporation, Chaozhou, China) according to the manufacturer's instructions. The gene chip contained 37 genotype-specific oligonucleotides designed to detect 18 high-risk human papillomavirus genotypes (HR-HPV: 16, 18, 26, 31, 33, 35, 39, 45, 51, 52, 53, 56, 58, 59, 66, 68, 73 and 82) and 19 low-risk human papillomavirus genotypes (LR-HPV: 6, 11, 34, 40, 42, 43, 44, 54, 55, 57, 61, 67, 69, 70, 71, 72, 81, 83 and 84). The final results were determined by colorimetric change on the chip under direct visualization, and blue-purple spots were recognized as HPV positive.

HPV sequencing

Samples positive for HPV18 or HPV58 alone were selected to amplify the full-length L1, E6 and E7 genes. The primers were designed based on the published HPV18 (AY262282) and HPV58 (D90400) sequences in GenBank and synthesized by Sangon Biotech, Inc. (Shanghai, China) (Table 7). Each 50 μl PCR reaction mixture contained 2 μl of each primer, 25 μl of 2 × PrimeSTAR Max Premix (Takara Biotechnology Co., LTD, Dalian, China), 19 μl of ultrapure water and 2 μl of template cDNA. The PCR conditions were as follows: 94 °C for 10 min; 35 cycles of 95 °C for 30 s, 60 °C for 30 s and 72 °C for 60 s; and a final step at 72 °C for 10 min. The amplified products were ligated into the p-EASY-Blunt cloning vector (TransGen Biotech, China) according to the manufacturer's instructions and then sequenced by Sangon Biotech, Inc. (Shanghai, China).
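When many samples are amplified in one run, per-reaction volumes like those above are commonly scaled into a shared master mix. The sketch below is only an illustration: the 10% pipetting overage, the forward/reverse primer labels and the function name `master_mix` are assumptions rather than details from this protocol, and the template is left out of the mix because it is added to each tube separately.

```python
# Per-reaction volumes in microlitres, as listed in the protocol above.
REACTION_UL = {
    "forward primer": 2.0,  # assumption: "each primer" means one forward, one reverse
    "reverse primer": 2.0,
    "2x PrimeSTAR Max Premix": 25.0,
    "ultrapure water": 19.0,
    "template": 2.0,
}


def master_mix(n_reactions, overage=0.10):
    """Return total volumes (µl) of the shared components for n reactions.

    overage is a hypothetical safety margin for pipetting loss.
    The template is excluded because it differs per tube.
    """
    if n_reactions < 1:
        raise ValueError("n_reactions must be at least 1")
    factor = n_reactions * (1.0 + overage)
    return {
        name: round(volume * factor, 1)
        for name, volume in REACTION_UL.items()
        if name != "template"
    }


# The listed components add up to the stated 50 µl reaction volume.
assert sum(REACTION_UL.values()) == 50.0

for name, volume in master_mix(8).items():
    print(f"{name}: {volume} µl")
```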

Table 7 Primers used for the amplification of HPV18 and HPV58 L1, E6 and E7 genes.

Variants and phylogenetic analysis of HPV18 and HPV58

To identify variations in the HPV18 and HPV58 L1, E6 and E7 genes, the reference HPV18 (GenBank AY262282) and HPV58 (GenBank D90400) sequences were selected and compared with the studied sequences. The comparison was performed with DNAStar (Madison, WI, USA), and the positions of variations were numbered according to the reference sequence.
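The comparison step can be sketched in a few lines: walk along a study sequence aligned to the reference, record each substitution with its reference coordinate, and translate the affected codon to see whether the change is synonymous. This is not the DNAStar workflow used in the study; the sequences and the start coordinate below are toy values, and `call_substitutions` is a hypothetical helper that handles substitutions only (no insertions or deletions).

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def call_substitutions(reference, sample, orf_start):
    """List substitutions between two aligned, in-frame ORF sequences.

    Both sequences must have the same length (a multiple of 3).
    orf_start is the 1-based genome coordinate of the first ORF base.
    Returns (nucleotide label, amino acid label, is_synonymous) tuples.
    """
    if len(reference) != len(sample) or len(reference) % 3:
        raise ValueError("sequences must be aligned, equal length and in frame")
    variants = []
    for i, (ref_base, alt_base) in enumerate(zip(reference, sample)):
        if ref_base == alt_base:
            continue
        codon_start = i - i % 3
        ref_aa = CODON_TABLE[reference[codon_start:codon_start + 3]]
        alt_aa = CODON_TABLE[sample[codon_start:codon_start + 3]]
        codon_number = codon_start // 3 + 1
        variants.append((
            f"{ref_base}{orf_start + i}{alt_base}",
            f"{ref_aa}{codon_number}{alt_aa}",
            ref_aa == alt_aa,
        ))
    return variants


# Toy example (hypothetical sequences and start coordinate).
for variant in call_substitutions("ATGCGTACC", "ATGCATACT", orf_start=100):
    print(variant)
```

The labels follow the same convention as the tables in this paper: reference base, reference coordinate, variant base for the nucleotide change, and reference residue, codon number, variant residue for the amino acid change.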

Phylogenetic trees based on the L1 genes of HPV18 and HPV58 were constructed by the Maximum Likelihood method with 1000 bootstrap replicates using MEGA (version 6.0). Reference sequences representing each HPV18 and HPV58 lineage were used to construct the distinct phylogenetic branches.

Statistical analysis

SPSS version 19.0 (IBM, Armonk, NY, USA) was used to assess the significance of differences in HPV positivity rates among groups. Females with no lesion on colposcopic biopsy were classified as having a normal cervix, and those with CIN2 or worse were treated as the outcome group. The oncogenic risk association of HPV18 and HPV58 E6/E7 amino acid substitutions was assessed using the Chi-squared test or Fisher's exact test. A P-value < 0.05 was considered statistically significant.
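For small cell counts, Fisher's exact test evaluates a 2 × 2 table directly from the hypergeometric distribution. The sketch below shows one common two-sided definition, summing the probabilities of all tables no more likely than the observed one. The counts in the example are hypothetical and are not taken from Table 6, `fisher_exact_two_sided` is an illustrative name, and the study itself used SPSS.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the table [[a, b], [c, d]].

    Rows could be, for example, substitution present/absent and
    columns CIN2 or worse/normal cervix.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def probability(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = probability(a)
    # Sum every table that is no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Hypothetical counts, for illustration only.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"P = {p_value:.4f}")
```

Other two-sided conventions exist (for example, doubling the smaller one-sided tail), so values from different software packages can differ slightly.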

Ethics approval and consent to participate

All participants were informed about the study and provided written consent. The study protocol adhered to the principles of the Declaration of Helsinki and was approved by the institutional ethics committee of the 989 Hospital of Joint Service Support Force of Chinese PLA, Military Training Medical Research Institute of the Whole Army (Grant No: LLSC20190305).