Introduction

Human papillomavirus (HPV) infection is the most prevalent sexually transmitted infection in women. Persistent infection with high-risk HPV (hrHPV) subtypes, but not all HPV infections, can lead to the development of cervical cancer (CC), a disease that adversely affects the health of women worldwide [1]. Nearly 98,900 new cases of CC are diagnosed in China every year, and the overall hrHPV infection rate in mainland Chinese women is 19.0% (95% confidence interval [CI] 17.1–20.9%) [2]. The five hrHPV subtypes with the highest infection rates in China are HPV-16, HPV-52, HPV-58, HPV-53, and HPV-18 [3]. HPV infection is most prevalent among women of childbearing age and menopausal women; the peak among menopausal women is known as the “second peak” [4]. Factors reported to be closely correlated with HPV infection include alcohol consumption, smoking, age at first marriage, marital status, vulvovaginal ulcers, and vulvovaginal inflammation [5,6]. The biological mechanisms through which these risk factors act on HPV infection have been investigated for decades. However, these investigations have yielded few results regarding the pathogenesis of this infection.

In recent years, scientists have become interested in the vaginal flora and have made promising progress in investigating its mechanisms. Some studies have reported that vaginal microorganisms might promote or impede HPV infection by affecting the host’s cervical micro-ecological environment, and may further play essential roles in the pathogenicity of HPV and cervical lesions [7]. Lactobacillus protects against HPV by producing lactic acid, which maintains a low vaginal pH [8], and Candida stimulates T cell proliferation, which counters HPV infection [9]. In contrast, Gardnerella, Fusobacteria, Mycobiota, and Chlamydia trachomatis (CT) have been associated with HPV infection [8,10,11]; for example, CT acts as an entryway through the cervical epithelium and facilitates HPV infection [11].

In 2016, Xiao et al. reported on the characteristics of HPV infection in Changsha based on information obtained from the Gynecological Outpatient Center of our hospital (The Third Xiangya Hospital of Central South University, January 2009–December 2013) [12]. However, updated evidence is needed to analyze the longitudinal development of cervical cancer prevention and treatment in Changsha. Many risk factors for cervical cancer, including lifestyle factors such as smoking, HPV type and distribution, and the influence of vaginal flora on HPV, need to be re-examined. We examined patients who attended the Physical Examination Outpatient Center at our hospital and hypothesized that the distribution of HPV might have changed in the few years since this previous study. We also hypothesized that vaginal flora, including Candida, might play a vital role in influencing this infection.

Results

Population baseline characteristics

Of the 12,628 enrolled female participants, aged 19 to 84 years, 10,875 were HPV negative and 1753 were HPV positive; the overall infection rate was 13.88% (1753/12,628). There were significant differences between the HPV-negative and HPV-positive participants in mean age (42.57 ± 9.99 vs. 43.83 ± 10.49, respectively; P < 0.001), vaginal pH (4.35 ± 0.25 vs. 4.37 ± 0.25, respectively; P < 0.001), vaginal samples positive for galactosidase (P < 0.001), sialidase (P < 0.001), and leukocyte esterase (P = 0.001), Candida infection (P < 0.001), age at first marriage (P < 0.001), age at first childbirth (P < 0.001), and alcohol consumption (P = 0.003). In contrast, there were no significant differences between the two groups in BMI (P = 0.855), white blood cell (WBC) count (P = 0.873), neutrophil percentage (P = 0.298), lymphocyte percentage (P = 0.199), fasting blood glucose (P = 0.199), vaginal H2O2 (P = 0.199), vaginal grades (P = 0.227), vaginal trichomonas (P = 0.482), age of menstruation onset (P = 0.874), whether they had given birth (P = 0.106), diet [including spicy diet (P = 0.224) and sweet diet (P = 0.716)], smoking (P = 0.299), daily sitting time (P = 0.748), and waistline measurements (P = 0.704) (Table 1).

Table 1 Baseline characteristics of the study population.

Age distribution of HPV-positive participants

We analyzed the age distribution of the HPV-positive participants. The 40–49.9-year age group accounted for the largest share of the 1753 HPV-positive cases (31.26%, n = 548; 4.34% of all 12,628 participants), followed by the 30–39.9-year group (27.44%, n = 481) and the 50–59.9-year group (24.59%, n = 431). The full distribution across age groups is shown in Table 2.
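The percentages quoted for the age groups use two different denominators (HPV-positive cases and all participants), which is easy to misread. The following minimal Python sketch recomputes them from the counts reported above; the helper name `share` is illustrative and not part of the study's analysis.

```python
# Counts reported in the text; the helper name is illustrative only.
HPV_POSITIVE = 1753
ALL_PARTICIPANTS = 12628

age_group_cases = {"30-39.9": 481, "40-49.9": 548, "50-59.9": 431}

def share(count, denominator):
    """Return count/denominator as a percentage rounded to 2 decimals."""
    return round(100 * count / denominator, 2)

for group, n in age_group_cases.items():
    # First value: share of HPV-positive cases; second: share of all participants.
    print(group, share(n, HPV_POSITIVE), share(n, ALL_PARTICIPANTS))
```

Read this way, the figures describe how the positive cases are distributed across age groups, not the infection rate within each age group, which would require the number of screened women per group.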

Table 2 Age distribution of HPV-infected participants.

HPV subtype distribution

Multiple HPV infections were observed in the study. The top five HPV subtypes among the 1753 HPV-positive participants were HPV-52 (n = 491; 28.01%), HPV-58 (n = 260; 14.83%), CP8304 (n = 201; 11.47%), HPV-53 (n = 190; 10.84%), and HPV-39 (n = 169; 9.64%). These results are largely consistent with previous research performed at our hospital [12] (Table 3, Fig. 1).

Table 3 Subtype distribution in 1753 HPV-infected participants.
Figure 1 Subtype distribution in 1753 HPV-infected participants.

Risk and protective factors of HPV infection identified using multiple logistic regression analysis

The trends of the unadjusted OR (ORu) and the adjusted OR (ORa) were consistent. The risk factors for HPV infection were age (ORa 1.01; 95% CI 1.00–1.01; P = 0.011) and alcohol consumption (ORa 1.31; 95% CI 1.09–1.56; P < 0.01). In contrast, vaginal Candida infection (ORa 0.62; 95% CI 0.48–0.80; P < 0.001), age at first marriage ≥ 20 years (ORa 0.79; 95% CI 0.65–0.95; P = 0.012), and age at first childbirth ≥ 30 years (ORa 0.67; 95% CI 0.49–0.93; P = 0.016) were protective factors against HPV infection. Age at first childbirth of ≥ 20 to < 30 years also had an ORa below 1 (0.88; 95% CI 0.67–1.17), but the association was not significant (P = 0.38) (Table 4).
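For readers less familiar with how an adjusted OR and its interval relate to the underlying logistic regression coefficient, the relationship can be sketched in a few lines of Python. This is an illustration under a stated assumption, not the study's SAS procedure: it assumes a Wald-type 95% interval, exp(β ± 1.96·SE), and the coefficient and standard error are back-calculated from the reported Candida estimate purely for demonstration. The function names are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval (assumed Wald interval)

def odds_ratio_ci(beta, se, z=Z_95):
    """Convert a logistic coefficient and its standard error to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def back_calculate(or_value, lower, upper, z=Z_95):
    """Recover an approximate (beta, se) from a reported OR and Wald interval."""
    beta = math.log(or_value)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se

# Reported for vaginal Candida infection: ORa 0.62, 95% CI 0.48-0.80.
beta, se = back_calculate(0.62, 0.48, 0.80)
or_value, lower, upper = odds_ratio_ci(beta, se)
print(f"beta={beta:.3f} se={se:.3f} OR={or_value:.2f} CI=({lower:.2f}, {upper:.2f})")
```

A negative coefficient corresponds to an OR below 1 (protective), and an interval that excludes 1 corresponds to P < 0.05 under the same assumption.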

Table 4 Risk and protective factors of HPV infection with the corresponding 95% confidence intervals.

Assessment of prediction accuracy using tenfold cross-validation

The areas under the curve (AUCs) were significant for vaginal Candida infection, age at first marriage, age at first childbirth, and alcohol consumption (P < 0.05). The AUC values for all these variables were approximately 0.8. Additionally, the ANOVA test and the initial models that included all variables produced results that were not significant, which indicates that the AUC-based statistics were reliable. The largest AUC as a predictive factor for HPV infection was that of vaginal Candida infection: the AUC was 0.881, the optimal cutoff value was 0.164, the sensitivity was 1.000, and the specificity was 0.827 (Fig. 2).

Figure 2 The ROC curve of HPV risk prediction logistic regression model with fungal infection. Using tenfold cross-validation, AUC values were calculated for the ROC curves of significant variables. The largest AUC as a predictive factor for HPV infection was that of fungal infection. The AUC value was 0.881, the optimal cutoff value was 0.164, the sensitivity was 1.000, and the specificity was 0.827.
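To make the quantities in this section concrete, the sketch below computes an AUC, sensitivity, specificity, and an optimal cutoff from predicted probabilities in plain Python. Two points are assumptions: the text does not say how the optimal cutoff was chosen, so the Youden index (sensitivity + specificity − 1) is used here as one common choice, and the labels and scores are toy values invented for illustration, not study data.

```python
def roc_points(labels, scores):
    """Return (threshold, sensitivity, specificity) for each distinct score.
    A sample is called positive when its score is >= threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
        points.append((threshold, tp / positives, tn / negatives))
    return points

def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def youden_cutoff(labels, scores):
    """Pick the threshold maximising J = sensitivity + specificity - 1."""
    return max(roc_points(labels, scores), key=lambda p: p[1] + p[2] - 1)

# Toy data, invented for illustration only (1 = HPV positive, 0 = HPV negative).
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.12, 0.15, 0.20, 0.30, 0.40, 0.60]

cutoff, sensitivity, specificity = youden_cutoff(labels, scores)
print(auc(labels, scores), cutoff, sensitivity, specificity)
```

Under the Youden assumption, the reported sensitivity of 1.000 and specificity of 0.827 would correspond to J = 0.827 at the cutoff of 0.164.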

Discussion

The overall HPV-positive rate in this study (13.88%) was lower than that in mainland Chinese women overall (19.0%) [3]. In our study, the top five HPV subtypes were HPV-52, HPV-58, HPV-CP8304, HPV-53, and HPV-39. By comparison, the top five subtypes in the study of Xiao et al., performed at the same hospital, were HPV-52, HPV-16, HPV-58, HPV-CP8304, and HPV-53 [12], and the top five HPV subtypes in China overall were HPV-16, HPV-52, HPV-58, HPV-53, and HPV-18 [3]. The detection results from our hospital over the most recent five years showed that the positive rates of hrHPV and HPV-16 were lower than both earlier local rates and the national average, suggesting that Changsha has achieved good results in reducing HPV-16 infection and related cervical precancerous lesions. HPV-CP8304 is a common subtype in our region, with high prevalence but low risk. Our data were predominantly consistent with the previous investigation [12] and mostly in line with the analysis of national data [3]. The remaining differences may result from variation in geography, ethnicity, education, and horizontal health care. When considering vertical health care, however, the decrease in HPV infection evident in our study may be explained by improvements in women's health care and in awareness of women's health in recent years [13,14].

In the early 2000s, Bosch et al. identified persistent hrHPV infection as the causative agent in over 90% of CC cases [15,16]. In 2015, Wang et al. confirmed that the hrHPV rate was 91.8% among patients with invasive cervical cancer (ICC) in Hunan province [17]. HPV-16 and HPV-18 are also considered the top hrHPV subtypes worldwide [18,19,20]. However, neither HPV-16 (ranked 6th) nor HPV-18 was among the most prevalent subtypes in our study. Several factors may account for this, such as regional differences, improvements in women's health care, and greater awareness of women's health. Another reason is the difference in participant recruitment between studies: we investigated women from the Physical Examination Outpatient Center, whereas other studies recruited from the Gynecological Outpatient Center. Symptomatic or sick patients generally choose the Gynecological Outpatient Center for CC screening, while the Physical Examination Outpatient Center sees more asymptomatic or healthy women. Taking this into account, although our data are reasonably consistent with previous studies, some differences were unavoidable. These differences in screening samples also suggest that CC screening at physical examination outpatient centers, rather than at gynecological outpatient centers, might yield results closer to the rate of HPV infection across the whole country.

We then examined the age distribution. The age group with the most HPV-positive cases was 40–50 years, followed by 30–40 years and then 50–60 years. This finding is not in full agreement with the reported bimodal pattern in the distribution of HPV infection by age, with peaks at 26–30 years and 46–50 years [21]. Several factors might contribute to this pattern. Immunity decreases with age, meaning that older women are at an increased risk of developing HPV infections [22]. In addition, postmenopausal women have elevated vaginal pH because decreased estrogen levels are correlated with low glycogen levels and low Lactobacillus abundance, making them susceptible to HPV infection [23,24]. Older women also tend to seek routine gynecological care and cancer screening [25], which potentially elevates the HPV-positive rate for this age group. These factors may explain our finding that increased age is a risk factor for HPV infection (OR 1.01, 95% CI 1.00–1.01, P < 0.05) (Table 4). In young women, however, the risk of developing HPV infection might be more associated with an active sex life and less awareness of sexual protection [6]. Our research had some limitations with regard to participants, including the limited number of HPV cases and the lack of information regarding participants’ marital status [26], number of sexual partners [4], education levels, living conditions, and use of oral contraceptives [27]. Nevertheless, our findings suggest that more attention should be paid to older women and that more routine gynecological examinations should be provided as part of routine health care for them.

We also investigated potential risk factors for HPV infection. Our data showed that the vaginal pH of the HPV-positive group was significantly higher than that of the HPV-negative group (4.37 ± 0.25 vs. 4.35 ± 0.25, respectively; P < 0.001), which might be due to vaginal dysbiosis, particularly the displacement of Lactobacillus [28]. In addition, sialidase positivity was positively correlated with HPV infection (HPV-negative: 4.5%, 487/10,875; HPV-positive: 7.1%, 124/1753; P < 0.001). A possible explanation is that sialidase, one of the virulence biomarkers of Gardnerella vaginalis, hinders epithelial biofilm formation and thereby facilitates infection or co-infection with HPV and other microorganisms, such as bacterial communities, Chlamydia, and viruses including human immunodeficiency virus (HIV), herpes simplex virus (HSV), and human cytomegalovirus (HCMV) [29,30,31,32,33,34,35]. In line with the study of Lili et al., the leukocyte esterase positive rate was higher in the HPV-positive group than in the HPV-negative group (21.7%, 380/1753 vs. 18.2%, 1977/10,875; P = 0.001) [36]. We also found that alcohol consumption was correlated with HPV infection. An underlying reason could be that alcohol increases sexual disinhibition, which results in more unsafe sexual behaviors [37,38]. Smoking is another potential risk factor. Multiple studies have posited that both passive and active cigarette smoking increase the risk of developing HPV infection, as smoking suppresses innate immunity and can cause structural and functional changes within the respiratory system [39,40,41]. However, our data showed no significant correlation between smoking and HPV infection. This is possibly due to the limited number of HPV-infected participants and the relatively restricted region within which participants lived.

We also summarized the probable protective factors. Our study found that age at first marriage of 20 years or older (OR 0.79) and age at first childbirth of 30 years or older (OR 0.67) were protective against HPV infection (P < 0.05). Consistent with these findings, Niyazi et al. reported that early marriage might be a risk factor for hrHPV infection [6]. This might be explained by the notion that older women are more likely to engage in safe sexual behaviors [6]. Additionally, older women are believed to have greater awareness of genital hygiene and health care, which reduces the rate of HPV infection [42]. Vaginal Candida infection was significantly negatively correlated with HPV infection (OR 0.62; 95% CI 0.48–0.80; P < 0.001). This might be because the presenting symptoms of Candida vaginitis, such as leukorrhea, vulvar pruritus, dyspareunia, and dysuria [43], encourage patients to seek timely gynecological screening. Additionally, Candida parapsilosis can form a biofilm on the surface of the genital tract, which may act as a shield against invasion by other microorganisms [44]. Liang et al. support this result; they reported that Candida albicans was a protective factor against HPV infection (OR 0.63, 95% CI 0.49–0.82, P < 0.05) [45]. Engberts et al. also reported that Candida infection does not increase the risk of developing CC [46]. Gonia et al. found that Candida parapsilosis protects premature intestinal epithelial cells from invasion and damage by other microorganisms [47]. Wang et al. reported that Candida albicans can enhance T cell proliferation, which might play a vital role in regulating the local vaginal and cervical microenvironments and further inhibit the pathogenicity of HPV infection and cervical lesions [9]. However, the mechanisms that mediate the relationship between Candida infection and HPV infection remain controversial. Studies with larger sample sizes and further investigations are required to explore these mechanisms.

This study has several limitations that cannot be ignored. In addition to the limited number of HPV cases, we lacked information on participants’ marital status, number of sexual partners, use of oral contraceptives, and history of sexually transmitted infections (STIs) such as gonorrhea, syphilis, and HIV, which might affect the vaginal-cervical microbiota [48,49]. Further constraints might limit the wider applicability of our study. We neither investigated the details of alcohol consumption nor followed up on infection persistence, progression, or participant outcomes. In addition, our documented vaginal microecology focused narrowly on Candida among the different fungal types; these preliminary clinical findings need further study alongside the remaining vaginal confounders, such as Gardnerella, Fusobacterium, bacterial vaginosis, and aerobic vaginitis [8,10,45,50]. Furthermore, we excluded patients with previous positive HPV results, with or without local drug administration, which might bias the observed distribution. Finally, we did not further investigate the participants infected by multiple subtypes of HPV.

Despite these limitations, our study is in broad agreement with previous research. Additionally, this study is the first to explore the potential relationship between vaginal Candida and HPV infection in Changsha, Hunan. Our findings potentially provide valuable information to assist in improving clinical HPV screening and CC prevention in the local region. Future research would benefit from larger sample sizes, an optimized participant questionnaire, improved follow-up of participants, and in vitro experimentation.

In conclusion, we found that the prevalence of HPV infection and the distribution of its subtypes are relatively constant in Changsha, Hunan. Our data suggest that vaginal Candida infection is a protective factor against HPV infection and that more radical HPV management is required in the local Changsha area for perimenopausal women and for women who regularly consume alcohol.

Materials and methods

Participants

Every female patient who attended the Physical Examination Outpatient Center at the Third Xiangya Hospital of Central South University between 11 August 2017 and 11 September 2018 was asked to complete a questionnaire and to voluntarily sign a written informed consent form. This study was approved by the ethics committee of the Third Xiangya Hospital of Central South University (IRB No. 20017). All methods were carried out in accordance with relevant guidelines and regulations.

To be included in this study, participants had to satisfy all of the following inclusion criteria: (1) age ≥ 18 years and no previous positive HPV results; (2) no sexual intercourse, vaginal douching, or administration of vaginal medications in the 3 days before vaginal sampling; (3) mentally competent to understand the consent form and communicate with study staff. The exclusion criteria were: (1) no history of sexual activity; (2) HPV-positive results within the past year or a history of CC; (3) previous hysterectomy; (4) treatment for vaginitis in the past 3 months; (5) sexual intercourse, vaginal douching, or use of vaginal medication in the 3 days before vaginal sampling; (6) inability to understand the consent form and communicate with study staff; (7) pregnancy; (8) withdrawal from the study.

Medical data and diagnosis information

Patient data were extracted from medical records and were carefully recorded and double-checked by two research assistants. General information included age, body mass index (BMI), waist measurements, age of menstruation onset, age at first marriage, childbirth history, age at first childbirth, eating habits (salty diet: less salty/more salty; spicy diet: yes/no; sweet diet: yes/no), and lifestyle habits (smoking: never, rarely, quit, or daily; alcohol consumption: no/yes; daily sitting time: < 2 h, 2–4 h, 4–6 h, or > 6 h). Test results included key peripheral blood test results, which were automatically calculated and generated by machine (WBC count, neutrophil percentage, lymphocyte percentage, and fasting blood glucose), and vaginal microecology test results, which were diagnosed by an experienced cytologist (galactosidase: negative/positive; sialidase: negative/positive; leukocyte esterase: negative/positive; H2O2: negative/positive; Candida: negative/positive; Trichomonas: negative/positive; grade: I/II/III/IV). Grade I is dominated by Lactobacillus species. Grade II represents an intermediate status between grade I and grade III, with the presence of L. iners, L. gasseri, L. crispatus, Atopobium vaginae, Gardnerella vaginalis, Actinomyces neuii, and Peptoniphilus. Grade III is characterized by the presence of bacterial vaginosis (BV)-associated species (Prevotella bivia, A. vaginae, G. vaginalis, Bacteroides ureolyticus, and Mobiluncus curtisii) and low amounts of Lactobacillus species, mainly L. iners. Finally, Grade IV is characterized by the presence of a variety of Streptococcus spp. [51,52].

The vaginal and cervical samples were collected by a gynecologist, while the blood was drawn by a nurse. Vaginal samples were collected for vaginal microecology testing, cervical samples for HPV screening, and blood samples for laboratory testing. The Pentaplex Vaginitis Detection kit (Rhfay, Guangzhou) was used for vaginal microecology testing, including galactosidase, sialidase, leukocyte esterase, H2O2, and pH value. Candida vaginitis was diagnosed by experts based on microscopic examination (spores and/or hyphae) and biochemical testing (galactosidase) of vaginal discharge. In our study, the criteria for diagnosing bacterial infection by the Amsel method were as follows: (1) uniform vaginal secretion; (2) pH > 4.5; (3) amine smell in the secretion with 10% potassium hydroxide; (4) positive laboratory test results (Gram staining for bacteria in the secretion or wet film for clue cells). The diagnosis was made if any three of the above criteria were met, provided that the fourth was among them.
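The decision rule for the Amsel criteria, as stated in this section (three of four criteria, with the laboratory criterion mandatory), can be written as a small Python function. This is a sketch of the rule as worded here; the function and parameter names are hypothetical and do not come from the study's records system.

```python
def amsel_positive(uniform_secretion, ph, amine_odor, lab_positive):
    """Apply the rule as stated in the text: at least three of the four
    criteria are met, and criterion (4), the laboratory result, is one of them."""
    criteria = [
        bool(uniform_secretion),  # (1) uniform vaginal secretion
        ph > 4.5,                 # (2) pH strictly above 4.5
        bool(amine_odor),         # (3) amine smell with 10% potassium hydroxide
        bool(lab_positive),       # (4) Gram stain or wet-film clue cells positive
    ]
    return criteria[3] and sum(criteria) >= 3

# Three criteria met including the laboratory result: diagnosis is made.
print(amsel_positive(True, 4.8, False, True))
# Three criteria met but laboratory result negative: diagnosis is not made.
print(amsel_positive(True, 4.8, True, False))
```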

A total of 12,628 participants with complete medical records were enrolled in this study and retrospectively analyzed; 10,875 were HPV negative and 1753 were HPV positive (Supplementary Information).

Human papillomavirus typing

HPV DNA was amplified by polymerase chain reaction (PCR). HPV genotyping was then performed by HybriMax using an HPV GenoArray Test Kit (HybriBio Ltd., Chaozhou, China). This assay applies the flow-through hybridization technique to the PCR-amplified HPV DNA and can determine 21 HPV types: 14 high-risk HPV types (16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66, and 68), five low-risk HPV types (6, 11, 42, 43, and 44), and two unknown-risk types (53 and CP8304) [53].
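The 21-type panel described above maps naturally onto a lookup table. The following Python sketch shows one way to classify a genotyping result by risk class and to flag multiple infections; the type lists are copied from this section, while the function names and the input label format (for example "HPV-52") are assumptions made for illustration.

```python
# Type lists copied from the assay description; function names are illustrative.
HIGH_RISK = {"16", "18", "31", "33", "35", "39", "45", "51", "52", "56", "58", "59", "66", "68"}
LOW_RISK = {"6", "11", "42", "43", "44"}
UNKNOWN_RISK = {"53", "CP8304"}

def risk_class(hpv_type):
    """Map a type label such as 'HPV-52', '52' or 'CP8304' to its risk class."""
    label = hpv_type.upper().replace("HPV-", "").strip()
    if label in HIGH_RISK:
        return "high"
    if label in LOW_RISK:
        return "low"
    if label in UNKNOWN_RISK:
        return "unknown"
    raise ValueError(f"type not covered by the 21-type panel: {hpv_type}")

def summarize(detected_types):
    """Summarise one participant's genotyping result."""
    classes = {risk_class(t) for t in detected_types}
    return {
        "positive": bool(detected_types),
        "multiple_infection": len(set(detected_types)) > 1,
        "any_high_risk": "high" in classes,
    }

print(summarize(["HPV-52", "CP8304"]))
```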

Statistical analysis

Statistical analyses were performed using Statistical Analysis System (SAS) 9.4 (SAS Institute, USA). Continuous variables were analyzed by t-tests, and categorical variables were analyzed by Chi-square tests. The age distribution and subtype distribution were summarized based on HPV-positive cases.
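As an illustration of the Chi-square test on a categorical variable, the sketch below applies it to the sialidase counts reported in the Discussion (124 of 1753 HPV-positive and 487 of 10,875 HPV-negative samples). It is a plain-Python approximation of what a statistics package would do, not the SAS procedure used in the study, and it assumes the uncorrected Pearson statistic; whether a continuity correction was applied is not stated.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].
    Returns (statistic, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper-tail probability equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Sialidase-positive counts reported in the Discussion.
pos_yes, pos_total = 124, 1753      # HPV-positive group
neg_yes, neg_total = 487, 10875     # HPV-negative group

statistic, p_value = chi_square_2x2(
    pos_yes, pos_total - pos_yes, neg_yes, neg_total - neg_yes
)
print(f"chi2={statistic:.2f}, p={p_value:.2g}")
```

The resulting P value falls below 0.001, in line with the significance level reported for sialidase.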

A multivariate logistic regression model was used to investigate the risk and protective factors of HPV infection. A risk factor was defined as a variable with an odds ratio (OR) > 1, and a protective factor as one with OR < 1. The multivariable regression model was established in three steps. First, univariate analyses were performed to identify which patient variables were correlated with the presence of HPV infection at a significance level of P < 0.05, and, in the analysis of variance (ANOVA) test, all variables differed significantly in the model (P < 0.05). Next, non-significant variables (P > 0.05) were removed, and stepwise regression was performed using the forward and backward methods. Finally, the variables shown to be significant (P < 0.05), namely age, a vaginal microecology test sample positive for fungal infection, age at first marriage, age at first childbirth, and alcohol consumption, were included in the model. Analyses were performed using tenfold cross-validation, in which the data set is divided into 10 parts, with 9 parts used as training data and 1 part as test data. The area under the curve (AUC) of the receiver operating characteristic (ROC) curve was then calculated to determine the model's classification ability, and AUCs were compared to assess prediction accuracy. A P value of < 0.05 was considered statistically significant.
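The tenfold partitioning described above can be sketched in plain Python as follows. This is an illustration rather than the study's SAS code: the text does not say whether the folds were randomized or stratified by HPV status, so a simple seeded shuffle without stratification is assumed, and the function name is hypothetical.

```python
import random

def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs so that every sample appears in
    exactly one test fold and in the training data of the other k - 1 folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumed: random, unstratified folds
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

# With the study's 12,628 participants, each round trains on about 90% of the
# records and evaluates (for example, by AUC) on the remaining 10%.
for round_number, (train_idx, test_idx) in enumerate(kfold_indices(12628), start=1):
    print(round_number, len(train_idx), len(test_idx))
```

In each round a model would be fitted on the training indices and scored on the test indices, and the ten scores would then be summarized.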

Ethical statement

The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.