Introduction

Late HIV presentation is a core indicator for monitoring the progress of HIV testing interventions funded by the U.S. Department of Health and Human Services [1] and one indicator for monitoring national health in Healthy People 2030 [2]. Individuals who are not aware of their HIV status at an early stage of infection cannot benefit from prompt linkage to medical care and antiretroviral treatment, and late presentation with advanced HIV disease (“LPAD”) is associated with a greater risk of severe disease and death [3,4,5]. There were more than 20 different definitions of HIV late presentation, such as the concurrence of HIV and AIDS diagnosis or having AIDS diagnosis within one year of initial HIV diagnosis. The consensus definition was reached in October 2009 in the Europe 2009 Conference at the Nobel Forum in Stockholm, and it was agreed upon that LPAD is defined as persons presenting for care with a CD4 count below 200 cells/ μL or presenting with an AIDS-defining event, regardless of the CD4 cell count [6]. Decreasing late HIV diagnosis has been increasingly emphasized as part of a comprehensive HIV prevention strategy for ending the HIV epidemic [4]. In 2020, 21.6% of new HIV diagnoses in the United States were LPAD, and South Carolina is one of the ten states in the United States with an LPAD percentage higher than 24.7%. Given the high percentage of HIV late presentation in South Carolina, it is imperative to understand the risk factors associated with LPAD to develop targeted interventions and practical HIV testing strategies.

Based on the modified socioecological framework used to understand the risks of the HIV epidemic, late HIV presentation may be driven by multi-level social risk factors [7]. Various structural-level characteristics, such as income inequality, socioeconomic deprivation, and high unemployment rate, are positively associated with late presentation [8, 9]. At the individual level, higher percentages of late presentation were found among persons who were Black or Hispanic/Latino [10, 11], older [12, 13], male [14], and lived in rural areas [10]. However, there are some knowledge gaps. For example, evidence linking HIV transmission mode and late presentation is mixed. According to a literature review, a positive relationship between heterosexual exposure and late presentation was observed in 16 out of 27 studies reviewed [11]. However, some studies found that infection attributed to men who have sex with men (MSM) was a risk factor for late presentation [12]. Additionally, few studies focused on time to engagement in care among people who were diagnosed late, limiting our ability to identify the risk factors of longer delay time among people with late HIV presentation. To know more about the underlying personal, environmental, and physical risk factors, there is a need for large real-world cohort studies to investigate both individual- and neighborhood-level risk factors of late HIV presentation and the length of delay time from infection to diagnosis.

By linking statewide individual-level public health surveillance data and publicly available county-level surveillance datasets (e.g., American Community Survey), this study aims to (1) explore the temporal variation of HIV LPAD and delay time to diagnoses and (2) identify the association of multi-level characteristics (e.g., individual demographic characteristics and county-level economic status) with HIV LPAD and delay time of diagnosis.

Methods

Study Population and data Source

All adults (≥ 18 years old) diagnosed with HIV in South Carolina between January 2005 and December 2019 and with at least one CD4 count record or AIDS diagnosis within three months of initial HIV diagnosis were included in this study. Their deidentified information, including HIV/AIDS diagnosis date, demographic characteristics (e.g., age, gender, race/ethnicity, and residence), HIV transmission mode, and laboratory results (e.g., CD4 count and viral load) were extracted from the South Carolina statewide Enhanced HIV/AIDS Reporting System. This public health surveillance database contains all HIV cases reported to the South Carolina Department of Health and Environmental Control. Individual data in the Enhanced HIV/AIDS Reporting System were further linked with county-level publicly available datasets (e.g., County Health Rankings and American community survey) via the Federal Information Procession Standards code, a residential county identifier. For the purpose of data analysis, the current study only included PWH with at least one CD4 cell count record or AIDS diagnosis within three months of initial HIV diagnosis (9,588 out of 11,209, 85.54%) [13]. The institutional review boards at the University of South Carolina and relevant South Carolina state agencies approved the study protocol.

Outcome

In the current study, LPAD was a binary outcome indicating whether a person has an AIDS diagnosis or lab record of CD4 count < 200 cells/μL within three months of initial HIV diagnosis [6]. To be more specific, a person was considered as “LPAD” if any of the following criteria were met: (1) was diagnosed with AIDS at the time of initial HIV diagnosis; (2) had AIDS diagnosis within three months of initial HIV diagnosis; or (3) had CD4 cell counts record less than 200 cells/μL within three months of initial HIV diagnosis.

Another outcome we were interested in was the delayed time of diagnosis, the estimated interval from HIV infection to initial diagnosis [14, 15]. We estimated the time from HIV infection to diagnosis based on a well-characterized CD4 depletion model commonly used in the literature:

$$\sqrt{CD4\left(t\right)}={a}_{i}+\left({b}_{i}\times t\right)+{e}_{it}$$

Where CD4 is the first CD4 value at HIV diagnosis, t is the duration from HIV infection to the date of the CD4 test, and ai and bi are the specific model parameters to the U.S. HIV population subgroups (sex, transmission category, and age) [16]. The detailed information for CD4 model parameters used for different subsets of populations was described elsewhere [16]. One difficulty in applying this model is the age of infection was unknown. We used the age at the first CD4 test as the approximation of the age at infection to estimate t. The age at infection was then calculated using the date of the first CD4 test minus t. The estimation of t was repeated using the calculated age at infection to determine the final CD4 modeling age group. The diagnosis delay was estimated as follows:

$${\rm{Diagnosis}}\,{\rm{delay}}\,{\rm{ = }}\,{\rm{Date}}\,{\rm{of}}\,{\rm{infection}} - {\rm{Date}}\,{\rm{of}}\,{\rm{first}}\,{\rm{CD4}}\,{\rm{test}}$$

Using data on the first CD4 value after HIV diagnosis during the study period, we calculated the diagnosis delays in PWH with LPAD and with at least one CD4 count record within three months of initial HIV diagnosis.

Individual-level Factors

Individual-level characteristics were obtained from the Enhanced HIV/AIDS Reporting System and included age at diagnosis (18–34, 35–54, ≥ 55 years old), gender (male, female), and race/ethnicity (Black, White, Hispanic, other/unknown). HIV transmission modes were categorized into heterosexual, MSM, injection drug users (IDU), and other/unknown. Residence information (Federal Information Procession Standards code) was classified into rural and urban areas based on the 2013 USDA Rural-Urban Continuum Codes, with codes 1–3 representing urban residents and 4–9 representing rural residents [17]. The International Classification of Diseases codes for 19 different comorbid disease categories (e.g., congestive heart failure) based on the Charlson Comorbidity Index (CCI) scoring system were extracted to calculate the CCI score [18]. For the updated CCI score, the scores assigned to HIV infection and AIDS were subtracted from the CCI score because all the subjects in the current study were diagnosed with HIV. The total score indicates comorbidity burden, and the higher the score, the higher the burden [19]. The adapted CCI score was further grouped into 0, 1, and ≥ 2 for data analysis.

County-level Factors

Seven social demographic factors obtained from the American Community Survey five-year estimates included population density, percentage of males, percentage of the population being Black or African American alone, percentage of the total civilian population for whom poverty status is determined, percentage of the civilian labor force that is unemployed, percentage of housing units with no vehicle available, and percentage of the adult population (aged 25 and older) with less than high school education. One healthcare resource-related factor, the number of Ryan White centers within 25 miles radius of each county per 100,000 people, was measured using the geolocation information of Ryan White HIV centers from the United States Department of Health and Human Services. The number of mental health providers per 100,000 residents was another healthcare resource-related factor obtained from the County Health Rankings and Roadmaps Program. The racial dissimilarity index, ranging from 0 to 1 and with higher values indicating greater residential segregation between Black and White residents, was also obtained from County Health Rankings and Roadmaps and used in the analysis.

Statistical Analysis

The temporal and spatial variations of the annual percentage of LPAD among people diagnosed with HIV from 2005 to 2019 in South Carolina were illustrated by bar plots with a linear regression line and a heat map. The median length of delay time by individual-level characteristics (e.g., residence, age, and gender) across time was illustrated by a bar plot. Generalized estimating equations models with backward selection were built to explore multi-level factors related to LPAD status (binary variable) and delay time (continuous variable). The “GEE” function in SAS with log-binomial and identity links was used for the binary and continuous outcomes, respectively. To capture the repeated value at the county level, the first-order autoregressive structure and compound symmetry correlation matrix was tested in two models, and the results with smaller Quasilikelihood Information Criterion were presented. Detailed SAS code information for the GEE models was provided in supplemental materials.

During the HIV acute infection period, the CD4 count may drop below 200 cells/μL within a few weeks of infection. Thus, we could not differentiate LPAD from those in the acute infection period based only on CD4 cell numbers. One potential way to distinguish the acute and late stages of HIV infection is the serum HIV viral load because the viral load in the acute infection stage often reaches over 1 × 106 copies/mL, which is typically higher than that after several years of infection [20]. To examine the influence of acute infection, we did a sensitivity analysis for delay time by excluding those having CD4 cell counts less than 200 cells/μL and the HIV viral load higher than 1 × 106 copies/mL within three months of initial HIV diagnosis. Statistically significant relationships were considered at p < 0.05. Figures were created using R 4.1.3, and all other statistical analyses were conducted in SAS software version 9.4 (SAS Institute, Inc., Cary, NC).

Results

Patterns and Temporal Trend for LPAD and Delay Time

The analysis included 8,913 people diagnosed with HIV from 2005 to 2019 in South Carolina. Most were male (75.24%) and Black (69.27%). Of the 8,913 new HIV diagnoses, 3,733 (41.88%) were LPAD. Compared to those who were not LPAD, a greater portion of LPAD were aged ≥ 55 (14.55% vs. 7.72%) and Hispanic (7.85% vs. 4.36%) (Table 1). Among the LPAD, 2,842 had at least one CD4 count record within three months of initial HIV/AIDS diagnosis. Their estimated median delay time from infection to diagnosis was 13.04 years.

Table 1 Late presentation with advanced disease (LPAD) by individual and county level factors, South Carolina, 2005–2019

As displayed in Fig. 1, there was a significant declining trend from 2005 to 2019 in the percentage of LPAD (52.06% in 2005 and 28.46% in 2019) in South Carolina. When breaking down the annual percentage of LPAD at the county level, an overall declining trend (indicated by fewer red cells in the later phase of Fig. 2) was observed in most counties in South Carolina. As shown in Fig. 3, the declining trend of the percentage of LPAD among newly diagnosed PWH who were aged over 55 years old, male, Hispanic, and with HIV transmission mode being heterosexual from 2005 to 2019 was significant. Delay time to diagnosis was greater among those Hispanic and those aged between 35 and 54 years old. (Fig. 4)

Fig. 1
figure 1

Percent of new HIV diagnosis that were late presentation with advanced HIV disease from 2005 to 2019 in South Carolina: a bar plot with a linear regression line

Fig. 2
figure 2

Percent of new HIV diagnosis that were late presentation with advanced disease (LPAD) across 46 counties in South Carolina, 2005–2019

Fig. 3
figure 3

Percent of late presentation with advanced disease among newly diagnosed HIV cases who were aged over 55 years old, male, Hispanic, and with HIV transmission mode being heterosexual: bar plots with linear regression lines

Fig. 4
figure 4

Median length (years) of diagnosis delay among people with late presentation with advanced disease by individual demographic characteristics

Factors Associated with LPAD and Delay Time

Compared to PWH aged 18 to 34 years, those aged 35–54 years (adjusted Prevalence ratio [aPR]: 1.57, 95% CI: 1.47 ∼ 1.67) and older than 55 years (aPR: 1.76, 95% CI: 1.62 ∼ 1.92) were more likely to be LPAD. In addition, LPAD were more likely to be male (aPR: 1.22, 95% CI: 1.12 ∼ 1.33) vs. female, and Hispanic (aPR: 1.42, 95% CI: 1.26 ∼ 1.61) vs. White and were more likely to be Black (aPR: 1.08, 95% CI: 1.03 ∼ 1.15) vs. White. Compared to PWH with heterosexual transmission mode, those reporting MSM (aPR: 0.89, 95% CI: 0.82–0.96) and IDU (aPR: 0.87, 95% CI: 0.79–0.96) were less likely to be LPAD. At the county level, PWH living in counties with a higher percentage of unemployment (aPR: 1.02, 95% CI: 1.00 ∼ 1.03) and a lower percentage of Black (aPR: 0.996, 95% CI: 0.993 ∼ 0.999) were more likely to be LPAD. (Table 2)

Table 2 Factors associated with late presentation with advanced HIV disease (LPAD): Generalized estimating equations model with backward selection

Similar to the results of LPAD, PWH aged 35–54 years (adjusted beta: 2.18, 95% CI: 1.72 ∼ 2.63) had a longer delay time to diagnosis compared to those aged 18–34 years. Hispanic individuals (adjusted beta: 1.17, 95% CI: 0.49 ∼ 1.85) had a longer delay time, but the association between being Black and delay time disappeared (adjusted beta: 0.11, 95% CI: -0.30 ∼ 0.52). Having the transmission mode as MSM (adjusted beta: -1.39, 95% CI: -2.01 ~ -0.77) or IDU (adjusted beta: -4.03, 95% CI: -4.64 ~ -3.41) was consistently associated with shorter delay time to diagnosis compared to heterosexual transmission. (Table 3) After excluding those having CD4 cell counts less than 200 cells/μL and the first HIV viral load larger than 1 × 106 copies/mL within three months of initial HIV diagnosis (304 out of 2,842 [10.70%]), the sensitivity analysis of delay time shows similar results to the findings in the models with the full sample. (Supplemental Table 1)

Table 3 Factors associated with delay time from HIV infection to diagnosis: Generalized estimating equations model with backward selection

Discussion

This study is one of the first attempts to simultaneously explore the association of multi-level factors with LPAD and the length of delay time to diagnosis. Our population-based analysis found that people who were older, male, Black or Hispanic, with the heterosexual HIV transmission mode, and living in counties with a larger proportion of unemployment were more likely to be LPAD. In addition, among people who were LPAD, those who were older, Hispanic, and with heterosexual transmission mode had a longer delay time. Targeted strategies should be tailored to different subgroups to address the barriers to LPAD and delay time.

In this study, our findings regarding the associations of age, gender, race/ethnicity, and HIV transmission mode with LPAD were consistent with most existing literature [11, 16, 21]. Numerous reasons could explain why older adults are more likely to be diagnosed late, such as inaccurate self-perception of risk, failure of medical providers to suggest HIV screening, and medical providers not able to recognize the clinical signs of HIV/AIDS when older adults present with illness [21]. Access to early HIV diagnosis may vary between men and women due to socio-cultural, healthcare access, and economic factors. Some of the women’s healthcare utilization, such as the opt-out test in prenatal care and cervical cancer screening, might contribute to lower likelihood of LPAD among women [22]. Numerous community efforts have been taken to increase HIV testing among gay and bisexual men since the mid-2000s because of their high-risk status [23]. This could partially explain the higher possibility of LPAD in heterosexual PWH. This finding suggests the need for greater public awareness and promotion efforts of HIV testing among this population despite their relatively lower risk of HIV in comparison with key populations such as gay and bisexual men.

Hispanic rather than Black/African American individuals had longer diagnosis delays compared to White individuals. One potential explanation is that except for some common barriers to accessing testing services in Hispanic and Black individuals, such as lack of health insurance, discrimination, and homophobia, Hispanic persons may face additional immigration issues and language barriers. Besides a higher possibility of LPAD, heterosexual persons were also found to have a longer diagnosis delay. This finding is consistent with previous studies [24]. A national-level study based on surveillance data in France found that heterosexual men, whether born in France or abroad, tended to have the longest time intervals from infection to diagnosis [24]. Hispanic and heterosexual persons should be targeted with greater attention to promoting early HIV testing because they are more likely to be LPAD and have longer delay time.

Individuals living in counties with a larger percentage of Black/African Americans were found to be less likely to be LPAD. The percentage of Black/African American residents in each county has been interpreted as a measure of neighborhood racial context, and higher values are postulated to be related to enhanced social cohesion, mutual social support, lower exposure to discrimination, and a sense of community. This could help explain why living in counties with higher percentages of Black residents was related to a lower likelihood of LPAD. CCI scores in the current study did not demonstrate statistically significant associations with LPAD and the duration of diagnosis delay. A potential explanation is that some components in the CCI scoring system (e.g., kidney and liver disease) might be caused by HIV infection, and this could raise temporality concerns when examining the relationship between CCI and HIV late presentation. The absence of significant association between some county-level variables (e.g., the quantity of Ryan White centers, the availability of mental health providers, and the dissimilarity index) and HIV late presentation suggests that these variables were not able to fully capture the complexities of HIV diagnosis patterns and further exploration of other contextual factors related to healthcare access and residential segregation may be needed.

Effective strategies to promote timely HIV testing include but are not limited to learning from the lived experience of PWH with late diagnosis, targeting and removing barriers they face for HIV testing, increasing HIV testing services at community settings, normalizing HIV, reducing HIV-related stigma, and expanding access to at-home HIV self-testing [23]. Literature suggests that self-testing was less stigmatizing for participants. One at-home self-testing program in Canada during the COVID-19 pandemic found that self-testing is one solution to have persons never tested for HIV being tested [25]. Additionally, HIV testing is a requirement for initiating or continuing pre-exposure prophylaxis (PrEP), a medicine that reduces the risk of HIV infection from sex or injecting drug use [26, 27]. Thus, the promotion of PrEP has a direct effect on HIV testing and could also be an effective strategy in reducing later HIV diagnosis. In addition, timely initiation of Highly Active Antiretroviral Therapy (HAART), coupled with direct referral to treatment centers after diagnosis, is pivotal for improving HIV health outcomes among PWH who were LPAD and suppressing onward HIV transmission within communities. Direct referral after diagnosis enables timely access to HIV treatment by mitigating barriers to linkage to care, such as stigma and transportation challenges.

The current study has several limitations that need to be acknowledged. First, the statistics based on the current definition of LPAD might lead to overestimating the number of late-presented patients because of a transient decline in the CD4 count in the acute phase of HIV infection. However, the results of our sensitivity analysis regarding delay time showed similar results to the primary analysis, supporting our findings regarding the risk factors of delay time among people with LPAD [20]. Second, individuals without AIDS diagnosis or CD4 count record within three months of initial HIV diagnosis were excluded from the analysis because of the uncertainty about whether they are LPAD or not. Thus, caution needs to be taken when generalizing findings to all PWH who are LPAD in the South Carolina Enhanced HIV/AIDS Reporting System dataset. Third, we are not able to incorporate some other important individual-level socioeconomic-related characteristics (e.g., insurance status, social support, and education) into our analysis due to the limitation of data availability. These factors may explain some of the differences we observed in LPAD by the proportion of unemployment at the county level. Fourth, caution should be taken when generalizing findings regarding risk factors of delay time. The value of CD4 count is necessary for the calculation of delay time, and those with initial HIV diagnosis being AIDS but having no CD4 counts within three months of initial diagnosis were excluded. Thus, not all people with LPAD were included in the delay time analysis. However, all the excluded late presented PWH have AIDS diagnosis within three months of initial HIV diagnosis, and AIDS diagnosis is based on CD4 count < 200 or opportunistic illness and could indicate linkage to healthcare. Thus, we believe that the excluded population is less likely to be impacted by healthcare-related delays than the included population, and the impact of excluding this population on our findings is minimal. Fifth, the accuracy of diagnosis delay time depends on the precision of the CD4 depletion model, which may undergo alterations over time in response to the virus’s evolving nature. Additionally, age at HIV diagnosis is used as a proxy for age at infection to estimate the duration from HIV infection to the date of the CD4 test, which may influence the accuracy of the CD4 depletion model.

Conclusion

The percentage of LPAD and the median delay time decreased from 2005 to 2019. Although these results are encouraging, there were consistent and significant demographic and socioeconomic disparities in late HIV diagnosis. Targeted and sustained interventions are needed for older, male, Hispanic, and heterosexual persons due to their higher possibility of LPAD. In addition, specific attention should be paid to people living in rural areas due to their longer delay time. Promoting HIV testing in communities with socioeconomic disadvantages (e.g., counties with a greater proportion of unemployment) with tailored interventions (e.g., expanding at-home HIV self-testing) is warranted to end the HIV epidemic in the South Carolina and beyond.