Acquired immune deficiency syndrome (AIDS) is caused by human immunodeficiency virus (HIV), a member of the genus Lentivirus within the family Retroviridae. It is an RNA virus with a 9.2-kb genome that encodes structural and regulatory proteins in the order 5’-gag-pol-vif-vpr-tat-rev-vpu-env-3’ [1]. The virus exists in two types: HIV type-1 (HIV-1) and HIV type-2 (HIV-2). HIV-1 is the most common worldwide, whereas HIV-2 is largely limited to specific regions of West Africa [2]. Because the viral reverse transcriptase lacks proofreading ability, HIV has a high mutation rate, and recombination also occurs frequently. Consequently, viral variants such as escape mutants and/or drug-resistant mutants have emerged over time [3]. Indeed, HIV-1 has evolved into multiple genetic recombinant variants, named circulating recombinant forms (CRFs) and unique recombinant forms (URFs) [4]. Currently, according to the Los Alamos HIV Database (http://www.hiv.lanl.gov/), more than 90 CRFs are circulating around the globe. Subtypes A, B, C, D and G are circulating widely, predominantly in Asian countries, together with multiple CRFs [5,6,7].

As in other developing countries, where prevailing socio-economic conditions compound the problem, AIDS is a significant health concern in Pakistan because of the growing number of clinical cases. Since it was first reported in 1987 [8], the number of clinical cases has increased to 165,000 (https://www.nacp.gov.pk/#). According to the NACP (National AIDS Control Program) report, Punjab province has the largest number of HIV cases (75,000), followed by Sindh (60,000), Khyber Pakhtunkhwa (16,322) and Balochistan (5,275) (https://www.nacp.gov.pk/#). The large number of clinical cases reported from Punjab compared to other provinces makes this region an interesting model for exploring the genetic diversity of HIV-1 subtypes and the prevalence of drug-resistance-associated substitutions. In addition, to ensure the efficacy of ongoing treatment regimens, population-based surveillance and monitoring of HIV drug resistance is recommended in resource-limited settings [9,10,11] such as Pakistan. In this regard, data on the circulation of different variants, subtypes, sub-subtypes and CRFs in a few other Asian countries are adequate to elucidate the epidemiology of the virus and drug-resistance-associated mutations [12,13,14]. In Pakistan, a few HIV-related studies have been conducted in Sindh province [15, 16], but there is a paucity of data on circulating subtypes and resistance-associated substitutions among the HIV strains prevalent in Punjab province. Therefore, the current study was conducted to examine the phylogenetic relationships and drug-resistance characteristics of circulating HIV-1 subtypes so that, if required, necessary interventions may be devised for treatment and control.

A total of 130 plasma samples were collected at HIV-treatment centers of the Primary and Secondary Healthcare Department, the Punjab AIDS Control Program (PACP), located in the districts of Lahore (n = 43), Faisalabad (n = 36), Gujranwala (n = 23) and Sargodha (n = 28) in Punjab province. These districts of central Punjab were selected because PACP surveillance centers are located there, and at-risk individuals (sex workers, injecting drug users and homosexuals) and referred patients living in the surrounding areas visit these centers for laboratory-based testing, free-of-cost treatment, and counseling services. The samples came from individuals with a history of activities associated with HIV risk, including injecting drug users, sex workers and homosexuals. Each individual provided signed informed consent before sample collection and processing. Necessary demographic information, including age, marital status, education level, profession, and history of injecting drug use, was also recorded. Approval for sampling and subsequent procedures was obtained from the Institutional Review Committee (IRC) at the Institute of Public Health Lahore via a letter (# IRC: 41/17).

Initial screening was performed using an HIV-1/2 Ag/Ab Combo test (Alere Determine, USA), and the results were interpreted as per the manufacturer’s instructions. The assay simultaneously detects both antigen and antibodies in a sample: the antigen test is specific for HIV-1, while the antibody test detects antibodies against either HIV-1 or HIV-2. After initial screening, viral RNA was extracted from antigen-positive samples using a QIAamp DSP Viral RNA Mini Kit (QIAGEN, Germany) as per the manufacturer’s instructions. The HIV-1 pol gene (1084 bp) was amplified by reverse transcription polymerase chain reaction (RT-PCR) using SuperScript® III and Platinum® Taq DNA Polymerase (Invitrogen, Carlsbad, CA), following a previously described protocol [17]. The amplified PCR products were purified using a Wizard® SV Gel and PCR Clean-Up System (Promega, Madison, WI, USA) and sequenced bidirectionally with the forward and reverse primers on an ABI PRISM 310 Genetic Analyzer (Applied Biosystems). All of the sequences obtained were submitted to the GenBank database and are available under the accession numbers MN336502-MN336519.

The sequences obtained in this study were aligned with the reference sequences available in the Los Alamos National Laboratory (LANL) HIV Database (http://www.hiv.lanl.gov), using the ClustalW method in BioEdit [18]. A phylogenetic tree was constructed by the neighbor-joining method based on the Kimura two-parameter model and 1000 bootstrap replicates in MEGA 7.0 software [19]. Afterward, the subtypes of the sequences were determined using the REGA HIV-1 Subtyping Tool (Version 3.0) (http://dbpartners.stanford.edu:8080/RegaSubtyping/stanford-hiv/typingtool/), and the subtyping results were confirmed using the jumping profile Hidden Markov Model (jpHMM) online tool (http://jphmm.gobics.de/). Both tools were also used to investigate the occurrence of putative recombination events. The Stanford HIV database algorithm (http://www.hivdb.stanford.edu) was used to detect major and minor mutations associated with resistance to protease inhibitors (PIs), nucleoside reverse transcriptase inhibitors (NRTIs), and non-nucleoside reverse transcriptase inhibitors (NNRTIs). Univariate analysis was carried out to investigate possible associations between categorical variables (demographic particulars) and the outcome of HIV screening (positive or negative), using IBM SPSS Statistics version 21.0. Variables that showed an initial association with p ≤ 0.20 were included in further logistic regression analysis. A p-value less than 0.05 was considered significant.
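To illustrate the distance measure behind the tree, the Kimura two-parameter distance between two aligned sequences can be sketched as below. This is a minimal Python sketch, not the MEGA implementation: the function name `k2p_distance` and the decision to skip sites with gaps or ambiguous bases are assumptions made here. With P and Q denoting the proportions of transitional and transversional differences, the distance is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)].

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # math.log / math.sqrt raise ValueError when the pair is too divergent
    # for the model to yield a finite distance.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Pairwise distances of this kind form the matrix from which a neighbor-joining tree is built; bootstrap support is then obtained by resampling alignment columns and repeating the procedure.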

Initial screening with the Alere HIV Combo test showed that 45 samples were positive for HIV (34.62%; 95% CI: 26.99-43.13). These samples were positive either for antigen, alone or together with antibodies (n = 18, 40%; 95% CI: 27.02-54.55), or for antibodies alone (n = 27, 60%; 95% CI: 45.45-72.98). The highest prevalence was observed among individuals from the district of Lahore (44%; 95% CI: 24.99-41.16), followed by Faisalabad (33%; 95% CI: 20.00-35.38), Sargodha (32%; 95% CI: 14.47-28.61) and Gujranwala (21%; 95% CI: 11.13-24.25) (Fig. 1). The odds of HIV infection were about 11 times higher in individuals with a history of injecting drug use (68.08%; OR = 11.15; 95% CI: 53.84-79.61, p = 0.0001). No significant association was found for the other variables, including male sex (36.06%; OR: 1.31; 95% CI: 28.09-44.9, p = 0.261), lack of formal education (35.71%; OR: 1.22; 95% CI: 26.93-45.57, p = 0.6759), and age younger than 30 years (34.83%; OR: 1.03; 95% CI: 25.75-45.17, p = 1.000) (Table 1). Although only a history of injecting drug use was significantly associated with the outcome of the screening assay, the higher proportion of HIV infection in the other categories corresponds to observations made previously in a disease-endemic setting [20,21,22]. Therefore, the findings of the current study emphasize the need for more intensive and focused awareness programs regarding health education and safe injection practices around the country, particularly in Punjab province.
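The text does not state how the 95% confidence intervals were calculated. As a minimal sketch, the Wilson score interval, one common choice for binomial proportions, gives values close to those reported for the overall positivity (45 of 130). The function name `wilson_interval` and the z value of 1.96 are assumptions made for illustration.

```python
import math


def wilson_interval(positives, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    Returns (lower, upper) as fractions between 0 and 1.
    z = 1.96 corresponds to a two-sided 95% interval (assumed here).
    """
    if total <= 0 or not 0 <= positives <= total:
        raise ValueError("need total > 0 and 0 <= positives <= total")

    p = positives / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    )
    return centre - half_width, centre + half_width


# Overall screening positivity reported in the text: 45 of 130 samples.
low, high = wilson_interval(45, 130)
print(f"{45 / 130:.2%} (95% CI: {low:.2%}-{high:.2%})")
```

Unlike the simple normal approximation, the Wilson interval stays within 0 to 100% and behaves reasonably for the small subgroup sizes typical of district-level breakdowns.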

Fig. 1 Geographical distribution of HIV-1 subtypes across selected districts of Punjab, Pakistan

Table 1 Association of socio-demographic details with HIV-1 infection in studied subjects

We found congruence between the screening test and RT-PCR: all of the antigen-positive samples (n = 18) were confirmed to be positive by RT-PCR. This is consistent with the sensitivity and specificity of the Alere HIV Combo test, which the manufacturer reports as 99.9% and 99.7%, respectively. Similar observations have been reported in previous studies [23,24,25]. Subtyping of currently prevailing HIV-1 isolates is considered crucial for understanding the genetic evolution of this virus around the globe [26], and, for this purpose, the pol gene is considered an important marker for accurate delineation of HIV-1 subtypes [15,16,17, 27,28,29]. The amplified region of the pol gene comprised the protease (PR) gene (nt 1 to 297) and a part of the reverse transcriptase (RT) gene (nt 1 to 753). Since minimal genotyping of HIV-1 requires nt 30-297 of the PR gene and nt 123-720 of the RT gene [30,31,32] to classify a prevalent strain, we amplified both of these regions.
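The coverage argument above amounts to a simple interval check, sketched here in Python. The nucleotide ranges are those quoted in the text (1-based, inclusive); the names `AMPLIFIED`, `REQUIRED` and `covers` are hypothetical and introduced only for illustration.

```python
# Nucleotide ranges (1-based, inclusive) as quoted in the text.
AMPLIFIED = {"PR": (1, 297), "RT": (1, 753)}
REQUIRED = {"PR": (30, 297), "RT": (123, 720)}


def covers(amplified, required):
    """Return True if every required range lies within the amplified range."""
    for gene, (req_start, req_end) in required.items():
        if gene not in amplified:
            return False
        amp_start, amp_end = amplified[gene]
        if amp_start > req_start or amp_end < req_end:
            return False
    return True


print(covers(AMPLIFIED, REQUIRED))  # True
```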

Phylogenetic analysis showed that the 18 pol sequences from this study clustered into three distinct clades. Of these, 14 clustered with sequences representing CRF02_AG, originating from Pakistan and Ghana, two sequences clustered within a clade representing subtype A from Uganda, South Africa and Pakistan, and two sequences clustered within a clade representing subtype G from Kenya, Cameroon and Ghana (Fig. 2). Although the travel histories of the patients were not available to better correlate potential ancestral relationships, the clustering of the CRF02_AG, A and G sequences with sequences from African countries is consistent with an evolutionary origin of HIV-1 in Africa [33]. Such clustering of isolates with a wide range of previously reported sequences indicates the genetic heterogeneity and potential evolutionary dynamics of HIV-1, resulting in the generation of multiple subtypes and inter-subtype recombinant forms at the population level [34]. Indeed, the identification of three different genetic forms highlights the propensity of this virus to mutate and recombine. Molecular epidemiological studies should be performed periodically and at a much higher resolution in the future.

Fig. 2 Phylogenetic analysis of pol gene (1084 nt) from clinical samples (black dots). The tree was constructed by the neighbor-joining method with 1000 bootstrap replicates in MEGA 7.0 software

The analysis revealed the circulation of two subtypes (A and G) and one circulating recombinant form (CRF02_AG) in the affected population. Among the districts studied, the highest genetic diversity of HIV-1 was observed in the district of Lahore, with identification of subtypes A and G, while only CRF02_AG was found in the districts of Faisalabad, Gujranwala and Sargodha (Figs. 1 and 2). The high genetic diversity of HIV-1 in the district of Lahore could be attributed to its dense population, lack of implementation of biosecurity measures in medical facilities, and frequent movement of the population from nearby regions for education, health and other purposes. Several factors, such as reuse of syringes for multiple individuals in hospitals and private clinics, certain recreational activities, and a lack of education about HIV, have previously been found to be associated with the spread of HIV-1 [35, 36]. In other studies, population density and frequent migration of HIV-infected individuals have been suggested to influence the emergence of novel subtypes or inter-subtype recombinants and circulating recombinant forms [37, 38]. The findings of the current study are in agreement with previous studies conducted in Sindh province, Pakistan, in which multiple HIV-1 subtypes and inter-subtype recombinants were identified [15, 39,40,41]. Furthermore, according to the Los Alamos HIV Database, subtype A (74.2%) is the most frequently reported HIV-1 subtype in Pakistan, while subtypes B, C, G, A1/G, 02_AG, 35_AD, 02A1, 01_AE and others account for 5.7%, 4.3%, 1.6%, 1.7%, 4.3%, 1.4%, 2.7%, 2.9% and 1.3%, respectively (www.hiv.lanl.gov). This article is the first report of identification of CRF02_AG in the Punjab region of Pakistan.

Protease inhibitors, non-nucleoside analog reverse transcriptase inhibitors, and nucleoside analog reverse transcriptase inhibitors are anti-HIV drugs that are provided in combination at the PACP treatment centers at no cost (https://www.nacp.gov.pk/whatwedo/treatment.html). Therefore, it was essential to examine the sequences of the RT and PR regions of the pol gene for substitutions associated with drug resistance among the circulating HIV-1 subtypes. Although a number of substitutions were observed in the study subjects (T12A, I13A, K14R, I15V, K20I, E35D, M36I, R41K, H69K, L89M) at several sites across the whole length of the PR region, none of these was significant with respect to drug resistance. Only one subject (PK/LHR/HIV-1/97) from the district of Lahore had an accessory resistance mutation (M46K). Our findings agree with those of a previous study conducted in Sindh province, Pakistan, in which no major resistance-associated substitutions were identified [15]. However, in another study from Sindh province, in which sequences from patients with and without a history of use of protease inhibitor (PI) drugs were compared, the authors reported one primary mutation (L90M) in a patient with a history of PI use and two secondary mutations (E35E/G and M46T) in subjects without a history of PI use [16]. The data presented here suggest that the circulating viruses are sensitive to all protease inhibitors, including atazanavir/r, darunavir/r, and lopinavir/r. Similarly, analysis of the RT region demonstrated a lack of any major or minor mutations conferring resistance to reverse transcriptase inhibitors (RTIs) in 16 subjects, suggesting that the current strains are sensitive to the NRTIs and NNRTIs currently being used in Pakistan. Only two subjects had a mutation (V106I) known to confer a low level of resistance to NNRTIs [42,43,44] such as doravirine, etravirine, nevirapine, and rilpivirine. Another study also demonstrated low-level resistance against NNRTIs with minor mutations in the RT region of the pol gene [45]. Other non-significant substitutions observed across the whole length of the RT region were V35T, E40D, K49R, V60I, K104R, K122E, D123S, I135T, K173S, Q174K, D177E, V179I, T200A, Q207A, R211S, and V245Q. These minor mutations have so far been reported to have an insignificant impact on drug resistance [15, 16]. Taken together, the analysis of the PR and RT regions shows a low prevalence of drug-resistance-associated mutations in the affected population of Punjab province, which is reassuring, and PIs in combination with RTIs (lopinavir, ritonavir, efavirenz, nevirapine, lamivudine, tenofovir, zidovudine, abacavir) could continue to be used to treat these patients. However, studies such as this should be continued to monitor trends so that necessary changes in drug regimens can be applied promptly for effective treatment and subsequent control in the future.
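The substitution labels used above follow the usual pattern of wild-type residue, position, and observed residue (for example, M46K). The following minimal Python sketch parses such labels and attaches only the annotations stated in the text; it is not the Stanford HIVdb algorithm, and the names `parse_mutation`, `annotate` and `NOTED_IN_TEXT` are hypothetical. Mixture labels such as E35E/G are outside the scope of this sketch and are rejected.

```python
import re

# Labels such as "M46K": wild-type residue, position, observed residue.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")

# Annotations restricted to what the text states; this is NOT the
# Stanford HIVdb rule set.
NOTED_IN_TEXT = {
    ("PR", "M46K"): "accessory PI resistance mutation",
    ("PR", "L90M"): "primary PI resistance mutation (reported in [16])",
    ("RT", "V106I"): "low-level NNRTI resistance",
}


def parse_mutation(label):
    """Split a label like 'M46K' into ('M', 46, 'K')."""
    match = MUTATION_RE.match(label.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised mutation label: {label!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def annotate(region, labels):
    """Map each label in a region ('PR' or 'RT') to the text's annotation."""
    result = {}
    for label in labels:
        key = label.strip().upper()
        parse_mutation(key)  # validates the label format
        result[key] = NOTED_IN_TEXT.get((region, key), "not flagged in the text")
    return result


print(annotate("PR", ["K20I", "M36I", "M46K"]))
print(annotate("RT", ["V106I", "V179I"]))
```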

The results of this work provide information about the circulating subtypes and drug-resistance-associated mutations in patients in selected districts of Punjab province, Pakistan, that will be useful to the regulatory authorities for choosing necessary interventions. Although the data analysis and conclusions are limited, this study provides basic information about the evolutionary relationship and the drug resistance pattern of currently prevailing HIV strains. Future large-scale studies are required to confirm these results, and continuous monitoring of drug resistance is needed for choosing appropriate therapy options.