Background

Nearly 40 years have passed since the first case of AIDS was reported in China in 1985 [1]. In 2018, the Chinese government estimated that approximately 1,250,000 people were living with HIV/AIDS in the country [2]. Various strategies have been implemented, including promoting condom use among female sex workers and men who have sex with men (MSM), providing syringe service programs for injecting drug users, offering prevention of mother-to-child transmission interventions for HIV-positive pregnant women, and making pre-exposure prophylaxis (PrEP) and post-exposure prophylaxis (PEP) available to all populations. Despite these measures, the number of newly reported HIV infections per year has not declined significantly, which underscores the persistent challenges facing HIV prevention and control in China.

In 2018, the US Centers for Disease Control and Prevention (CDC) proposed the use of HIV molecular transmission networks as a new strategy for HIV prevention [3]. This approach gained recognition and was included as one of the main strategies in the US Department of Health and Human Services' plan "Ending the HIV Epidemic: A Plan for America" in 2019 [4]. In line with these developments, the National Center for AIDS/STD Control and Prevention at the Chinese Center for Disease Control and Prevention published a trial version of guidelines for monitoring and intervening in HIV transmission networks in China in September 2019 [5]. HIV molecular transmission networks are constructed using viral genetic data from individuals infected with HIV. By identifying similarities and connections between viral sequences, these networks aim to reconstruct the macro-social networks of infected individuals and to characterize the active and critical groups in the network, with the ultimate goal of preventing and controlling HIV transmission [6]. Research on molecular transmission networks has since expanded, helping not only to analyze the basic characteristics of the epidemic but also to determine when and where newly diagnosed individuals were likely infected, to assess the speed of HIV spread, and to guide intervention efforts.

The use of HIV molecular transmission networks has shown promise in understanding the dynamics of HIV transmission and designing targeted prevention strategies. By identifying clusters of interconnected infections, public health officials can prioritize interventions and resources to effectively reach populations at higher risk. This approach allows for a more precise understanding of transmission patterns and can inform the development of tailored prevention and control measures.

In China, the adoption of HIV molecular transmission networks as a strategy reflects the country's commitment to staying at the forefront of HIV prevention and control efforts. By leveraging genetic data and analyzing transmission networks, China aims to enhance its understanding of the epidemic, identify key populations, and implement interventions that can have a significant impact on reducing new HIV infections [7].

Continued research and collaboration are necessary to refine the use of HIV molecular transmission networks in China. This includes strengthening laboratory capacities for genetic sequencing, improving data sharing and integration, and ensuring the ethical use of this information while protecting individuals' privacy. By harnessing the potential of molecular transmission networks, China can further enhance its HIV prevention and control efforts and work towards reducing the burden of HIV/AIDS in the country.

In China, the predominant HIV-1 subtypes are CRF01_AE and CRF07_BC [8]. In addition, second-generation recombinant forms such as CRF01_AE/CRF07_BC, CRF01_AE/CRF08_BC, and CRF07_BC/CRF55_01B have been reported to be prevalent in different regions and high-risk populations. Differences in HIV-1 subtypes, high-risk groups, and regional distribution result in different HIV transmission characteristics, which poses challenges to HIV epidemic prevention and control. Nanjing, the capital of Jiangsu Province and one of the central cities of the Yangtze River Delta Economic Belt, has a highly developed cultural tourism sector and transportation network. These geographical characteristics contribute to its distinct epidemic profile. Currently, the predominant mode of HIV transmission in Nanjing is homosexual contact, which accounts for 69.0% of cases, a proportion higher than the provincial and national averages [9,10,11]. Frequent population movement driven by social and economic development further complicates the HIV epidemic in Nanjing, leading to diverse and complex transmission patterns. It is therefore crucial to understand the distribution of HIV subtypes and to identify transmission clusters using molecular transmission networks. This approach provides insight into the locally circulating strains, their subtypes, and the interconnectedness of infections, allowing public health officials to formulate targeted intervention strategies, allocate resources more effectively, and prioritize the measures most likely to reduce new HIV infections in Nanjing.

Material and methods

Study population and sample collection

This study included all newly diagnosed HIV-1 patients in Nanjing from January 2019 to December 2021 who had not received antiretroviral therapy (ART). Informed consent was obtained from each participant before inclusion in the study. A 6 mL peripheral blood sample was collected from each participant and stored at -80 °C within 12 h of collection for further analysis. Baseline CD4+ T lymphocyte (CD4) cell counts and HIV viral load (VL) data were collected at diagnosis, before the start of ART. Epidemiological information, including age, sex, and route of transmission, was collected at the time of enrollment. This study was conducted in accordance with ethical guidelines and was reviewed and approved by the ethics committee of the Nanjing Center for Disease Prevention and Control.

Laboratory operations

Viral RNA was extracted from 200 μL plasma samples using the QIAamp Viral RNA Mini Kit (Qiagen, Hilden, Germany) following the manufacturer's protocol. Nested polymerase chain reaction (PCR) was used to amplify a 1060 bp target fragment of the pol gene (HXB2: 2253–3313). cDNA synthesis and first-round PCR were carried out using PrimeScript™ One Step RT-PCR Ver. 2.0 (TaKaRa, China). Cycling conditions were 50 °C for 45 min; 94 °C for 2 min; 50 cycles of 94 °C for 15 s, 55 °C for 20 s, and 72 °C for 2 min; followed by a final extension at 72 °C for 10 min. The nested PCR was conducted using Ex Taq (TaKaRa, China). Cycling conditions were 94 °C for 4 min; 40 cycles of 94 °C for 15 s, 55 °C for 20 s, and 72 °C for 2 min; followed by a final extension at 72 °C for 10 min. PCR products were verified by 1% agarose gel electrophoresis. Successfully amplified samples were sent to Sangon Biotechnology Co. for sequencing on an Applied Biosystems 3730XL. The primers for PCR and sequencing are listed in Additional Table S1.

HIV sequence acquisition and subtyping

Sequencher 4.10.1 (Gene Codes Corporation, Ann Arbor, MI, USA) was used for sequence assembly, and the aligned sequences were analyzed using BioEdit (version 7.0.9, Informer Technologies Inc.). To ensure sequence quality, the WHO HIVDR QC Tool (the drug resistance sequence quality control tool provided by the World Health Organization, https://sequenceqc.bccfe.ca/who_qc) was employed. Sequences that were longer than 1000 bp and contained less than 5% ambiguous nucleotides were included in the analysis. FastTree 2.1 was used to generate a maximum likelihood (ML) phylogenetic tree for subtype identification. The GTR + G + I nucleotide substitution model was applied, and node support values were calculated using a Shimodaira–Hasegawa-like test. Clusters with a node support value greater than 0.90 (90%) were classified as belonging to the same subtype. Reference sequences from the HIV Databases (https://www.hiv.lanl.gov/content/index) were used, covering the major international epidemic strains (subtypes A–D, F–H, J, and K) as well as the major epidemic recombinant strains found in China. The ML trees were visualized and edited using FigTree v1.4.3.
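As an illustration of the inclusion criteria above (length greater than 1000 bp and less than 5% ambiguous nucleotides), the following minimal Python sketch applies both checks to a single sequence. It is not the WHO QC tool or the authors' pipeline. The function name `passes_qc`, the treatment of every character other than A, C, G, or T as ambiguous, and the removal of alignment gaps before counting are assumptions made for this example.

```python
# Hypothetical helper for illustration only; not part of the WHO HIVDR QC Tool.
UNAMBIGUOUS = set("ACGT")  # assumption: any other symbol (R, Y, N, ...) is ambiguous


def passes_qc(seq, min_length=1000, max_ambiguous_fraction=0.05):
    """Return True if the sequence is longer than min_length bases and
    contains less than max_ambiguous_fraction ambiguous bases."""
    seq = seq.upper().replace("-", "")  # assumption: drop alignment gaps first
    if len(seq) <= min_length:
        return False
    ambiguous = sum(1 for base in seq if base not in UNAMBIGUOUS)
    return ambiguous / len(seq) < max_ambiguous_fraction


if __name__ == "__main__":
    clean = "ACGT" * 265                # 1060 unambiguous bases
    noisy = "ACGT" * 250 + "N" * 60     # 1060 bases, 60 of them ambiguous
    print(passes_qc(clean), passes_qc(noisy))
```

In practice the thresholds would be applied to the assembled pol fragments before alignment; the defaults here simply mirror the two cut-offs stated in the text.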

TDR analysis and HIV molecular transmission network construction

The partial pol gene sequences were aligned and uploaded to the Stanford HIV Drug Resistance Database website (https://hivdb.stanford.edu/) for analysis of transmitted drug resistance (TDR) and resistance mutations [12]. The TDR analysis covered protease inhibitors (PIs), nucleoside reverse transcriptase inhibitors (NRTIs), and non-nucleoside reverse transcriptase inhibitors (NNRTIs). Low-level, intermediate, and high-level resistance were all counted as TDR among ART-naïve individuals [13]. Pairwise genetic distances were calculated using the Tamura-Nei 93 (TN93) model, and HIV-TRACE was employed to construct the molecular transmission network [14]. In the network, nodes represent individuals, and an edge connects two nodes whose sequences fall within a given genetic distance threshold, indicating a potential transmission relationship. Links, also known as degree, refer to the number of edges connected to each node. The optimal genetic distance threshold is the one at which the largest number of clusters is identified [15]; at this threshold, the network has its highest resolution. As the threshold increases further, separate transmission clusters begin to merge and the number of clusters decreases, indicating that the network's resolution declines. On this basis, a genetic distance threshold of ≤ 1.50% was adopted, and individuals entering a cluster at this threshold were considered to have a potential transmission link. In this study, small clusters contained ≤ 10 nodes and large clusters contained more than 10 nodes.
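The threshold logic described above can be sketched in a few lines of Python. This is a simplified illustration, not HIV-TRACE's implementation: it assumes the pairwise TN93 distances have already been computed and are supplied as a dictionary, and the function names, sample identifiers, and toy distances are hypothetical. Two sequences are linked when their distance is at or below the threshold, clusters are the connected components of the resulting graph, and scanning several thresholds shows how the cluster count first rises and then falls as clusters merge.

```python
from collections import defaultdict


def build_clusters(distances, threshold=0.015):
    """distances: dict mapping (id_a, id_b) -> pairwise genetic distance.
    Returns (list of clusters as sets of ids, dict of node -> degree)."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    degree = defaultdict(int)
    for (a, b), d in distances.items():
        if d <= threshold:            # an edge exists at or below the threshold
            parent[find(a)] = find(b)
            degree[a] += 1
            degree[b] += 1

    clusters = defaultdict(set)
    for node in degree:               # unlinked sequences never enter the network
        clusters[find(node)].add(node)
    return list(clusters.values()), dict(degree)


def count_clusters_by_threshold(distances, thresholds):
    """Number of clusters found at each candidate threshold."""
    return {t: len(build_clusters(distances, t)[0]) for t in thresholds}


if __name__ == "__main__":
    # Toy distances, invented for illustration.
    toy = {
        ("p1", "p2"): 0.004, ("p2", "p3"): 0.012, ("p4", "p5"): 0.009,
        ("p3", "p4"): 0.021, ("p6", "p1"): 0.040,
    }
    print(count_clusters_by_threshold(toy, [0.005, 0.010, 0.015, 0.025]))
```

With the toy data, the cluster count is 1 at 0.005, 2 at 0.010 and 0.015, and 1 again at 0.025, because the two clusters merge once the 0.021 link is admitted. That merging is the loss of resolution the text describes.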

Variable definition

The sample sources were categorized into three groups: VCT, hospital, and others. VCT refers to voluntary counseling and testing conducted by CDC or community-based organization (CBO) workers. Hospital samples included those from preoperative testing, sexually transmitted disease (STD) clinics, testing before receipt of blood or blood products, testing of other patients, premarital medical examinations, and pregnancy and prenatal examinations. The "others" category included samples from physical examinations, blood donation tests, and testing of individuals in compulsory or reeducation-through-labor drug rehabilitation, among others. Education level was divided into two categories: low education, which included individuals with a middle school education or below, and high education, which included individuals with a senior high school education or above. For the route of transmission, injecting drug users (IDUs) and individuals with an unknown transmission route were categorized as "others."

Statistical analysis

Statistical analysis was performed using R v4.1.3. Descriptive statistics, such as interquartile ranges (IQR) and medians, were used to summarize continuous variables with non-normal distributions. Categorical variables were described using frequencies and percentages. Univariate and multivariate logistic regression models were used to analyze the factors associated with transmission within molecular clusters or large molecular clusters. The variables with significance (P < 0.05) in the univariate analysis were included in the multivariate logistic regression model for analysis. All statistical tests were two-tailed, and a p-value of less than 0.05 was considered statistically significant.
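To make the reported odds ratios (OR) and 95% confidence intervals easier to interpret, the sketch below computes an OR with a Wald interval from a 2×2 table. For a single binary predictor, a univariate logistic regression yields the same OR and Wald interval as this cross-product calculation. The multivariate estimates in this study come from models fitted in R and cannot be reproduced this way. The function name and the counts in the example are hypothetical and are not taken from the study data.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed and clustered       b: exposed and not clustered
    c: unexposed and clustered     d: unexposed and not clustered
    All four cells must be positive."""
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR), as in a saturated univariate logistic model.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Invented counts for illustration only.
    or_, lo, hi = odds_ratio_wald(30, 10, 20, 40)
    print(f"OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
```

An interval that excludes 1 corresponds to P < 0.05 under the two-tailed Wald test, which matches the significance criterion stated above.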

Results

Basic characteristics and subtype distributions

A total of 1161 newly diagnosed HIV-positive individuals were included in our study from 2019 to 2021. Among them, 93.88% were male, and the median age was 29 years (interquartile range [IQR], 24–43). The majority of the participants (97.93%) were of Han ethnicity, and 62.45% were unmarried. In terms of occupation, 48.15% were employees, 16.11% were students, 15.33% were unemployed, 12.83% were classified as "others," and 7.58% were classified as "peasants." Regarding education level, 77.95% of the participants had a high level of education. Approximately 52.54% of the samples were obtained from voluntary counseling and testing (VCT) centers, and 17.05% of the participants had already developed AIDS. Among these individuals, 77.09% had a baseline HIV viral load (VL) above 1000 copies/mL, while 56.42% had a CD4 cell count between 200 and 500 cells/mm³. Additionally, 17.31% had previously been diagnosed with sexually transmitted diseases (STDs), and 7.84% exhibited TDR.

In terms of transmission routes, 68.91% of the participants were infected through homosexual behaviors, followed by heterosexual behaviors (29.97%), and other behaviors (1.12%) (Table 1).

Table 1 Demographic and clinical patient characteristics associated with the probability of belonging to a molecular transmission cluster

Regarding HIV subtypes, the predominant subtype was CRF07_BC (40.57%, 471/1161), followed by CRF01_AE (38.41%, 446/1161), CRF119_0107 (6.29%, 73/1161), CRF67_01B (3.19%, 37/1161), CRF55_01B (3.10%, 36/1161), B (3.01%, 35/1161), CRF08_BC (2.24%, 26/1161), CRF68_01B (2.07%, 24/1161), and URFs (1.12%, 13/1161) (Fig. 1).

Fig. 1 Phylogenetic tree analysis of nucleotide sequences from newly infected individuals in Nanjing

Characteristics of HIV molecular transmission network

Pairwise genetic distances were calculated with the TN93 model, and the network was constructed at the optimal genetic distance threshold of 1.5%. A total of 137 transmission clusters were identified, containing 613 individuals (52.80%) and 1381 edges, with cluster sizes ranging from 2 to 74 sequences (Fig. 2A). Among the 137 clusters, 69 (50.36%) included 2 sequences, 49 (35.77%) contained 3–5 sequences, 11 (8.03%) contained 6–10 sequences, and 8 (5.84%) consisted of 11 or more sequences. The number of links per node ranged from 1 to 28: 184 nodes (30.02%, 184/613) had one link, 349 (56.93%) had 2–10 links, 73 (11.91%) had 11–20 links, and only 7 (1.14%) had more than 20 links (Fig. 2B). Of the 1381 edges identified below the 1.5% genetic distance threshold, 818 (59.23%) had a genetic distance of less than 0.010 (Fig. 2C).
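The percentages in this paragraph follow directly from the reported counts, and the short Python sketch below recomputes them as a consistency check. The counts are those stated in the text; the dictionary labels and the helper name `shares` are chosen here for illustration.

```python
# Counts as reported in the text; labels are illustrative.
cluster_size_counts = {"2": 69, "3-5": 49, "6-10": 11, ">=11": 8}
node_link_counts = {"1": 184, "2-10": 349, "11-20": 73, ">20": 7}


def shares(counts):
    """Return the total and each category's percentage, rounded to 2 decimals."""
    total = sum(counts.values())
    return total, {k: round(100 * v / total, 2) for k, v in counts.items()}


if __name__ == "__main__":
    print(shares(cluster_size_counts))  # totals 137 clusters
    print(shares(node_link_counts))     # totals 613 nodes
    print(round(100 * 613 / 1161, 2))   # share of individuals in the network
    print(round(100 * 818 / 1381, 2))   # share of edges with distance < 0.010
```

The category totals recover the 137 clusters and 613 nodes, and the recomputed shares match the percentages given above.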

Fig. 2 The characteristics of the molecular transmission networks. A Distribution of molecular transmission clusters by cluster size; B Distribution of nodes in clusters by links; C Distribution of edges by different genetic distances

Regarding the subtype distribution of clusters, 208 nodes with CRF01_AE formed 64 clusters, of which 62 were small clusters comprising 176 nodes. Similarly, 250 nodes with CRF07_BC formed 43 clusters, of which 39 were small clusters comprising 136 nodes. The 30 nodes with CRF67_01B formed 6 clusters, of which 5 were small clusters comprising 16 nodes. For CRF119_0107, 63 nodes formed 5 clusters, of which 4 were small clusters comprising 17 nodes. The remaining subtypes, CRF68_01B, CRF55_01B, CRF08_BC, subtype B, and URFs, had 17 nodes forming 4 clusters, 20 nodes forming 5 clusters, 13 nodes forming 4 clusters, 10 nodes forming 5 clusters, and 2 nodes forming 1 cluster, respectively. All of these clusters were small clusters.

The largest molecular cluster was a CRF07_BC cluster consisting of 71 males and 3 females, in which the primary mode of transmission was homosexual transmission (71.62%). A total of 28 clusters involved female individuals. Among the analyzed clusters, 23 transmission clusters contained 44 individuals infected with TDR strains. Of these, 9 clusters (39.13%) showed transmission links between TDR individuals, and 6 of those 9 clusters were composed entirely of TDR individuals. As for TDR mutations in the transmission network, the main PI-associated mutation, Q58E/QE, was found only in CRF07_BC clusters; the NNRTI-associated mutation K103N/KN was found in CRF01_AE and CRF119_0107 clusters; and V179D/E and G190A were found in CRF07_BC and CRF55_01B clusters, respectively (Fig. 3).

Fig. 3 The molecular transmission networks of newly diagnosed individuals, with clusters categorized by subtype. Node shape indicates age: individuals aged ≤ 30 years are shown as squares (■), those aged 31–59 years as circles (●), and those aged ≥ 60 years as regular hexagons (⬢). Patients infected through heterosexual contact are shown in red, homosexual contact in blue, and other types of contact in green. Male individuals are indicated by a black border, and female individuals by an orange border. Letters highlight cases of TDR, with k, q, g, v, m and o indicating K103N/KN, Q58E/QE, G190A, V179D/E, M41ML and other TDR mutations, respectively. Node size indicates the number of edges connected to the node: the more connections, the larger the node

Analysis of individuals with potential transmission links

Of the 1161 patients analyzed, 613 (52.80%) were nodes within the 137 clusters. Univariate and multivariate logistic regression models were used to analyze factors associated with clustering, and the results are presented in Table 1. In the multivariate analysis, individuals aged ≥ 60 years were more likely to cluster than those aged ≤ 30 years (OR = 2.10, 95% CI = 1.18–3.79, P = 0.013). Employed patients were less likely to cluster than unemployed patients (OR = 0.67, 95% CI = 0.47–0.96, P = 0.029). Patients whose samples were obtained from sources other than VCT were more likely to cluster than those from VCT (OR = 1.69, 95% CI = 1.09–2.66, P = 0.021). Patients with CD4 cell counts of 200–500 or above 500 were more likely to cluster than those with CD4 cell counts below 200 (200–500: OR = 1.95, 95% CI = 1.20–3.21, P = 0.008; above 500: OR = 2.03, 95% CI = 1.19–3.50, P = 0.010). Regarding viral subtypes, subtype B was considerably less likely to cluster than CRF01_AE (OR = 0.44, 95% CI = 0.19–0.93, P = 0.037), whereas CRF119_0107 (OR = 8.36, 95% CI = 4.32–17.90, P < 0.001), CRF67_01B (OR = 5.45, 95% CI = 2.44–13.90, P < 0.001), and CRF68_01B (OR = 3.60, 95% CI = 1.49–9.63, P = 0.006) were more likely to cluster than CRF01_AE (Table 1).

Characterization of large clusters

In our study, eight large clusters (with more than 10 nodes) were identified, containing a total of 206 individuals (191 males and 15 females). Among these clusters, the major infection route was homosexual transmission (58.74%), followed by heterosexual transmission (40.29%), and other transmission routes (0.97%) (Table 2). A total of 5 TDR cases were distributed across 4 large clusters.

Table 2 Characteristics of the large molecular transmission clusters

In the analysis of large clusters, patients aged ≥ 60 years were more likely to belong to a large cluster than those aged ≤ 30 years (OR = 4.14, 95% CI = 2.02–8.55, P < 0.001), and patients without TDR were more likely to do so than patients with TDR (OR = 5.32, 95% CI = 2.15–16.30, P = 0.001). Regarding subtypes, compared with individuals infected with CRF01_AE, those infected with CRF119_0107 (OR = 40.92, 95% CI = 21.08–79.42, P < 0.001), CRF07_BC (OR = 4.48, 95% CI = 2.91–7.09, P < 0.001), and CRF67_01B (OR = 10.55, 95% CI = 4.73–23.52, P = 0.002) were more likely to belong to a large cluster (Table 3).

Table 3 Factors influencing the inclusion of individuals into the large molecular transmission clusters

Discussion

Eight subtypes were identified in this study, more than the six prevalent subtypes found in Shanghai and Hangzhou [16, 17]. This highlights the local HIV genetic diversity and the complexity of the HIV-1 epidemic in Nanjing. Interestingly, in contrast to previous studies [18], CRF07_BC has emerged as the dominant strain, surpassing CRF01_AE. A meta-analysis has also suggested that CRF07_BC would become the dominant circulating strain among MSM [19]. The prevalence of CRF07_BC has surpassed that of CRF01_AE among MSM in Nanjing [20,21,22], and the HIV epidemic in Nanjing is predominantly driven by homosexual transmission [9]; together, these observations may explain the increasing prevalence of CRF07_BC. Similar trends have been observed in Shenzhen [12].

In our study, we detected several novel circulating recombinant forms that have become increasingly prevalent in China in recent years, including CRF119_0107, CRF55_01B, CRF67_01B, and CRF68_01B [23, 24]. Particularly noteworthy was CRF119_0107, a second-generation HIV-1 recombinant strain composed of CRF01_AE and CRF07_BC, which has become the third most prevalent strain. This recombinant strain was first reported in the MSM population in Nanjing [25]. Previous studies by Wei Li have shown that CRF01_AE and CRF07_BC strains were already prevalent in Nanjing, with the earliest strains dating back to the 1980s–1990s [20, 21]. CRF01_AE and CRF07_BC were the main circulating strains among populations infected through sexual transmission, particularly the MSM population [18, 26]. After 30–40 years of evolution and transmission, some MSM became infected with both CRF01_AE and CRF07_BC, leading to the emergence of the second-generation recombinant CRF119_0107 (CRF01_AE/CRF07_BC). Multiple second-generation recombinant forms (CRF01_AE/CRF07_BC) have been reported nationwide in recent years, such as CRF80_0107, CRF102_0107, CRF109_0107, and CRF123_0107 [27,28,29,30]. Additionally, certain second-generation recombinant forms (CRF01_AE/CRF07_BC), such as CRF79_0107 and CRF125_0107, have also been observed in heterosexual populations [31, 32]. Notably, we report for the first time a female individual infected with the CRF119_0107 strain, which suggests that CRF119_0107 has started to spread from the MSM population to other populations. Considering the diversity and changing trends of HIV-1 subtypes in Nanjing, a future comparative analysis against the national and global HIV-1 subtype background is needed to support targeted surveillance, investigation, and public education interventions.

The risk of HIV transmission increases with the access rate, that is, the proportion of individuals who enter the molecular network. In our study, more than half of the individuals (52.80%) entered the molecular network, which was higher than the access rates reported under the same genetic distance threshold in Guangzhou (42.9%) [33] and Baoding (14.0%) [34]. This indicates a higher risk of local HIV transmission in Nanjing. In 2017, several MSM were found to be infected with CRF119_0107 strains [25]. After several years of transmission, CRF119_0107 has formed a large transmission cluster and has the highest access rate, which indicates that it has spread rapidly in Nanjing. Further surveillance of this strain is therefore warranted.

The access rates of CRF67_01B and CRF68_01B were also high, with CRF67_01B forming a large cluster, consistent with a previous study in Jiangsu. CRF67_01B mainly spreads within Jiangsu, with few reports from other provinces [19]. A further study found that CRF67_01B is growing rapidly in Wuxi City and is concentrated in young MSM populations. In this study, nearly four-fifths of the young MSM in the CRF67_01B transmission cluster were from Nanjing, suggesting that Nanjing contributes to the spread of CRF67_01B. It has been reported that from 2015 to 2019, CRF67_01B and CRF68_01B experienced a rapid growth stage in 2014–2015 and then remained stable in Nanjing [24]. This finding also emphasizes the need for monitoring and intervention regarding the CRF67_01B and CRF68_01B subtypes to prevent the formation and spread of large clusters of the CRF68_01B subtype. In 2017–2018, the access rates of CRF01_AE and CRF07_BC in Jiangsu Province were 32.97% and 43.56%, respectively, which were lower than those in Nanjing. A further study found that Nanjing accounted for the highest proportion of individuals in that network [19]. CRF01_AE and CRF07_BC show interprovincial transmission in Hefei, with some cases related to Jiangsu Province [35]. As a neighboring city of Hefei and an important city in the Yangtze River Delta, Nanjing attracts a large mobile population. The frequent population movement in Nanjing and the long-term prevalence of CRF01_AE and CRF07_BC there may lead to the trans-regional transmission of these strains.

We found that several factors influence access to HIV-1 molecular transmission networks, including age, occupation, sample source, and baseline CD4 cell count. Younger individuals tend to have higher mobility than older patients, and their infections may have been acquired elsewhere. A study on partner testing of HIV-infected individuals in Zhejiang showed that nearly 80% of the partners who tested positive for HIV were newly diagnosed infections [36]. Furthermore, research has shown that HIV patients diagnosed through passive detection have a higher prevalence of delayed HIV diagnosis [37]. Compared with individuals diagnosed through other sources, the lower rate of access to HIV-1 molecular transmission networks among individuals diagnosed through voluntary counseling and testing (VCT) may be related to their own earlier diagnosis combined with delayed diagnosis of the sexual partner who infected them. Previous studies have demonstrated that CD4 cell counts continue to decrease in untreated HIV-infected populations [38, 39]. Because individuals with low baseline CD4 cell counts have typically been infected for a long time, and because of population migration, they are less likely to enter the molecular transmission network.

Homosexual and bisexual individuals can accelerate the spread of HIV [19]. In our study, the first three large clusters were composed of CRF07_BC, CRF119_0107, and CRF01_AE, respectively. These clusters included not only MSM but also heterosexual males, with MSM being the main component. This can be explained by the fact that some individuals reported as heterosexual were actually MSM or bisexual, as previously reported [40]. We found that MSM were linked with heterosexual individuals in three-quarters of the large molecular clusters. This is a warning sign that intervention efforts targeting MSM need to be strengthened to halt the rapid spread of the epidemic and prevent the transmission of HIV from MSM to other populations. Although the access rate of CRF07_BC was higher than that of CRF01_AE, the difference was not statistically significant; however, CRF07_BC was more likely to form large clusters. We also observed that CRF07_BC had a higher proportion of patients aged 60 years and older, and these older individuals were more likely to be included in larger transmission clusters. Similar to the findings of a study conducted in Zhangjiajie [41], we found that patients aged 60 years and above were more likely to acquire or transmit HIV within their own region. This can be attributed to the fact that the elderly tend to confine their activities to their local area and to the common occurrence of unprotected non-marital sex among them [42], which further promotes the spread of HIV. Hence, we will carry out prospective, continuous monitoring of molecular transmission networks, identify expanding molecular clusters and core sources of transmission in a timely manner, and locate key venues through in-depth interviews, so as to carry out targeted interventions to curb the HIV epidemic.

The overall prevalence of TDR in our study was 7.84%, which was markedly higher than the prevalence in Jiangsu (3.20%) [43] and nationwide (4.51%) [44]. The increased availability of ART for newly diagnosed HIV patients may lead to an increase in TDR [45]. Additionally, the prevalence of TDR is related to regional socioeconomic development to some extent [46]. For instance, high-income regions such as Shanghai and Tianjin have relatively high TDR prevalence rates of 17.4% and 12.2%, respectively [17, 47]. As the capital of Jiangsu Province, Nanjing has a relatively high economic level, so its high TDR prevalence is plausible. It is worth noting that TDR has been transmitted within small clusters, highlighting the need for future TDR monitoring and individual intervention efforts.

The NNRTI-associated mutations K103N/KN, G190A, and V179D/E were detected most frequently in the transmission network; these mutations confer resistance to efavirenz (EFV) and nevirapine (NVP), consistent with previous research [48, 49]. In China, EFV and NVP have been used in free first-line ART regimens since 2004. Long-term use of these drugs has led to a high prevalence of NNRTI-associated mutations conferring EFV and NVP resistance among patients on ART, which in turn leads to the emergence of NNRTI-associated TDR. Similar to the Shenyang study [50], we found that the PI-associated mutation Q58E/QE was detected only in CRF07_BC clusters; this mutation is associated with tipranavir/ritonavir (TPV/r) resistance. Fortunately, TPV/r is not included among the free antiviral drugs and is rarely used in China [51, 52]; therefore, Q58E has little impact on the selection of ART regimens for TDR patients. Nevertheless, it is necessary to implement optimized treatment regimens for cases in the network that are resistant to first-line drugs and to continuously monitor molecular clusters containing TDR cases; otherwise, the efficacy of ART will be reduced and TDR will spread further. Furthermore, TDR surveillance should be strengthened and the list of available ART drugs should be expanded to facilitate the optimization of treatment regimens.

This study had some limitations. First, despite our efforts to collect samples from all newly diagnosed HIV patients, some patients could not be included in the analysis because they could not be traced. Second, we grouped 9 patients with unknown infection routes and 4 IDU patients into a single category, "others," which may have introduced some bias.

Conclusion

In this study, we conducted a systematic and comprehensive analysis of the molecular transmission network among newly reported HIV cases in Nanjing from 2019 to 2021. Our findings suggest that the composition of HIV molecular subtypes is complex and diverse, with a high risk of local HIV transmission, rapid spread of CRF119_0107, and a high prevalence of TDR. Therefore, it is crucial to implement targeted interventions for the molecular transmission clusters identified in this study to effectively control the HIV epidemic.