Host genetics and viral load in primary HIV-1 infection: clear evidence for gene by sex interactions

Research in the past two decades has generated unequivocal evidence that host genetic variations substantially account for the heterogeneous outcomes following human immunodeficiency virus type 1 (HIV-1) infection. In particular, genes encoding human leukocyte antigens (HLA) have various alleles, haplotypes, or specific motifs that can dictate the set-point (a relatively steady state) of plasma viral load (VL), although rapid viral evolution driven by innate and acquired immune responses can obscure the long-term relationships between HLA genotypes and HIV-1-related outcomes. In our analyses of VL data from 521 recent HIV-1 seroconverters enrolled from eastern and southern Africa, HLA-A*03:01 was strongly and persistently associated with low VL in women (frequency = 11.3 %, P < 0.0001) but not in men (frequency = 7.7 %, P = 0.66). This novel sex by HLA interaction (P = 0.003, q = 0.090) did not extend to other frequent HLA class I alleles (n = 34), although HLA-C*18:01 also showed a weak association with low VL in women only (frequency = 9.3 %, P = 0.042, q > 0.50). In a reduced multivariable model, age, sex, geography (clinical sites), previously identified HLA factors (HLA-B*18, B*45, B*53, and B*57), and the interaction term for female sex and HLA-A*03:01 collectively explained 17.0 % of the overall variance in geometric mean VL over a 3-year follow-up period (P < 0.0001). Multiple sensitivity analyses of longitudinal and cross-sectional VL data yielded consistent results. These findings can serve as a proof of principle that the gap of “missing heritability” in quantitative genetics can be partially bridged by a systematic evaluation of sex-specific associations.


Introduction
In the era of genome-wide association studies (GWAS) on human traits and diseases, one overwhelming issue is "missing heritability," as thousands of GWAS (http://www.genome.gov/gwastudies/) have readily identified and confirmed quantitative trait loci (QTLs) based on statistical significance, but these QTLs typically explain only limited phenotypic variance (Brookfield 2013). Proponents of quantitative genetics have called for close attention to study design (Putter et al. 2011), phenotypic robustness (Queitsch et al. 2012), and the effects of rare (including de novo) variants, haplotypes (combinations of variants that are inherited as a single unit), gene by gene interaction (epistasis), gene by environment interaction, as well as epigenetics (Eichler et al. 2010; Gianola et al. 2013; Keller et al. 2012; Lee et al. 2011; Mahachie John et al. 2011). For complex traits with evolving and multifactorial mechanisms, the journey ahead for finding the missing heritability can be long and bumpy.
During the natural course of human immunodeficiency virus type 1 (HIV-1) infection, viremia and time from infection to development of severe immunodeficiency or AIDS are often used as quantitative traits to gauge HIV-1 pathogenesis and/or rates of disease progression. In particular, plasma viral load (VL) set-point during chronic HIV-1 infection offers a relatively steady and widely available outcome measure with both clinical and epidemiological implications (Fideli et al. 2001; Lyles et al. 2000; Mellors et al. 1995; Quinn et al. 2000; Saag et al. 1996). Predictors of set-point VL range from viral characteristics (e.g., subtypes and replicative capacity) (Prentice et al. 2014a; Prince et al. 2012; Yue et al. 2013) to host genotypes (QTLs) that govern innate and adaptive immune responses (Apps et al. 2013; Fellay et al. 2009; Leslie et al. 2010; Prentice and Tang 2012). Depending on the study population and definition of set-point VL (single or multiple measurements), the proportion of VL variance explained by any single host or viral factor is often less than 4 % (Fellay et al. 2007; Prentice et al. 2014a; Yue et al. 2013). The most promising model that incorporates genetic and non-genetic features of epidemiologically linked HIV-1 transmission pairs (source and recipient partners) can account for nearly 37 % of early set-point VL variance (Yue et al. 2013).
Our recent data from a large cohort of HIV-1 seroconverters (SCs) suggest that host and viral factors associated with set-point VL can evolve as the infection progresses (Prentice et al. 2014a), even during the early chronic phase when complications by coinfections and comorbidities are infrequent. The correlates of longitudinal and cross-sectional VL in this cohort include four HLA-B variants (B*18, B*45, B*53, and B*57) that encode polymorphic cell surface glycoproteins specializing in antigen presentation (Prentice et al. 2014). While these observations are consistent with the well-documented hypothesis that viral epitopes bound to HLA-B molecules can dominate the induction of HIV-1-specific, cytotoxic T-lymphocyte responses (Kiepiela et al. 2004, 2007; Rajapaksa et al. 2012) and further dictate viral evolution or adaptation (Goulder and Walker 2012; Kawashima et al. 2009; Leslie et al. 2004; Moore et al. 2002; Rolland et al. 2010), the VL variance explained by individual HLA-B variants is also limited (ranging from 0.7 to 1.6 %). Our new objective is to refine the analytical approaches and to identify potential interaction terms between sex and HLA variants.

Subjects and methods
Study population
Recent HIV-1 seroconverters (SCs) were enrolled from Kenya, Rwanda, Uganda, and Zambia between 2005 and 2011 (Table 1), under a uniform study protocol sponsored by the International AIDS Vaccine Initiative (IAVI) (Amornkul et al. 2013; Price et al. 2011). The procedures for written informed consent and multidisciplinary research activities were approved by institutional review boards at all clinical research centers and participating institutions.
Follow-up strategies, genotyping, and outcome measures
SCs in this study were identified by frequent (monthly to quarterly) testing of HIV-1 seronegative subjects at high risk of HIV-1 infection through heterosexual and homosexual exposure, with the majority being seronegative partners in HIV-1 discordant couples and/or individuals reporting multiple heterosexual partners or diagnosed with sexually transmitted infections (85 % of the SC cohort). The subjects included for this study were SCs with sufficient longitudinal data, and the visit window was expanded from 3 to 24 months (Prentice et al. 2014a) to 2 to 36 months beyond estimated dates of infection (EDI). All study visits considered were before the initiation of antiretroviral therapy under national guidelines (Ngongo et al. 2012). Viral sequencing, molecular HLA genotyping, and quantification of plasma VL followed procedures described in detail elsewhere (Amornkul et al. 2013; Prentice et al. 2014a; Price et al. 2011; Tang et al. 2011). Identification of HLA-B*18 (unfavorable), B*45 (unfavorable), B*53 (unfavorable), and B*57 (favorable) as independent correlates of longitudinal or cross-sectional VL in this heterogeneous cohort (Prentice et al. 2014a) was highly consistent with results concerning Africans and African Americans (Apps et al. 2013; Lazaryan et al. 2011; Leslie et al. 2010; Tang et al. 2010).

Descriptive statistics
HIV-1-infected men and women were compared for their overall baseline characteristics, using (a) Wilcoxon's rank-sum test for quantitative variables lacking a normal distribution, (b) t test for quantitative variables with a normal distribution, and (c) χ² and Fisher exact tests for categorical variables (Table 1). These and other analytical procedures (summarized below) were done using SAS, version 9.3 (SAS Institute, Cary, NC, USA).

Central hypothesis and analytical procedures
Our study aimed to test a central hypothesis that gene (HLA class I) by sex (viral microenvironment) interaction can be uncovered by separate analyses of men and women, especially when longitudinal VL measurements (with log10 transformation) are evaluated in mixed models. Data analyses began with the screening of potential interaction terms, with a focus on common HLA variants (population frequencies ≥4 %). The timing and magnitude of sex-specific effects on VL were further assessed by local regression (LOESS) curves (longitudinal data) and generalized linear models for geometric mean (cross-sectional) VL. Association signals with false discovery rate (FDR) below 0.20 were entered into a series of sensitivity analyses using subsets of data corresponding to (1) the 3- to 24-month follow-up period with densely [...] were treated as covariates. The performance of individual statistical models was gauged by their overall R² values (corresponding to variance explained by factors in the model), while the impact of individual factors was measured by the regression beta (adjusted mean beta difference, Δβ, and standard error, SE). Associations with borderline statistical significance (P ≤ 0.050, FDR = 0.20-0.50) were excluded from multivariable models and sensitivity analyses.
Refinement through evaluation of linkage disequilibrium (LD) profiles and extended haplotypes
Using SAS Genetics (SAS Institute, Cary, NC, USA), HLA genotyping data for eastern and southern African SCs were analyzed separately for LD and extended haplotypes, with additional reference to fully resolved haplotypes in other populations (Cao et al. 2001). Association analyses based on 2- and 3-locus haplotypes were deemed informative if the adjusted effect sizes improved over those attributable to the component alleles.
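The FDR-based screening step in the analytical procedures can be illustrated with a short sketch. The source does not state which FDR method was applied in SAS, so the Benjamini-Hochberg step-up procedure used here is an assumption, and the function names `bh_qvalues` and `screen` are hypothetical. The default cutoff of 0.20 mirrors the threshold quoted in the text.

```python
def bh_qvalues(pvalues):
    """Return Benjamini-Hochberg adjusted P values (q values) in input order.

    Assumption: the source does not name its FDR method; BH is used here
    only as a common choice for illustration.
    """
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so q values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def screen(pvalues, fdr_cutoff=0.20):
    """Return indices of tests whose q value falls below the cutoff."""
    return [i for i, q in enumerate(bh_qvalues(pvalues)) if q < fdr_cutoff]
```

Calling `screen` on the P values from one round of allele-by-sex tests would return the positions of the signals that proceed to sensitivity analyses under this assumed procedure.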

Results
Characteristics of men and women in the study population
A total of 521 subjects had sufficient prospective data (three or more visits) during the 2- to 36-month interval after EDI (Table 1). The overall baseline data differed between 327 men and 194 women in terms of age (P < 0.0001), country of origin (P < 0.0001), HIV-1 subtype (P = 0.040), and first available VL (P = 0.048). HLA alleles of interest had similar distribution in men and women (P = 0.14-0.97) (Table 1).

Multivariable models for longitudinal VL data
For the 2- to 36-month interval, the interaction term between female sex and HLA-A*03:01 was independent of other known factors pertinent to the study population (Table 2), with an adjusted P value of 0.005. On average, VL differed by −0.67 ± 0.24 log10 between HLA-A*03:01+ and A*03:01− women after adjusting for other known factors. Analyses of data over the 3- to 24-month interval yielded almost identical results (−0.71 ± 0.25 log10, P = 0.005 for the interaction term) (Table 2).
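To make the meaning of a sex by allele interaction term concrete, the sketch below computes it as a difference of differences in mean log10 VL across the four sex by carrier groups. In a linear model with only sex, allele, and their product as predictors, this contrast equals the interaction coefficient. It is an unadjusted toy calculation, not the mixed model reported in Table 2. The name `interaction_contrast` and the values in `TOY_RECORDS` are hypothetical and are not study data.

```python
from statistics import mean

# Hypothetical toy records: (is_female, carries_allele, log10 VL).
# These numbers are invented for illustration and are not study data.
TOY_RECORDS = [
    (True, True, 3.5), (True, True, 3.7),
    (True, False, 4.2), (True, False, 4.4),
    (False, True, 4.5), (False, True, 4.7),
    (False, False, 4.6), (False, False, 4.8),
]


def interaction_contrast(records):
    """Return (allele effect in women, allele effect in men, interaction).

    Each effect is the carrier-minus-noncarrier difference in mean log10 VL;
    the interaction is the effect in women minus the effect in men.
    """
    cells = {}
    for is_female, carries_allele, log10_vl in records:
        key = (bool(is_female), bool(carries_allele))
        cells.setdefault(key, []).append(log10_vl)
    means = {key: mean(values) for key, values in cells.items()}
    effect_women = means[(True, True)] - means[(True, False)]
    effect_men = means[(False, True)] - means[(False, False)]
    return effect_women, effect_men, effect_women - effect_men
```

A negative third value indicates that carriage of the allele lowers VL more in women than in men; adjustment for age, geography, and other HLA factors, as done in the source, requires a full regression model.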
HLA-A*03-restricted HIV-1 epitopes
In the context of antigen presentation and CTL responses, multiple studies have identified HLA-A*03-restricted HIV-1 epitopes, especially a conserved epitope (KK9/RK9) in Gag (p17).
Figure caption (fragment): see Table 2 for summary statistics based on mixed models. Arrows indicate plasma viral load measurements that are <400 RNA copies/mL (routinely transformed to 1.30 log10).

Further findings from bioinformatics
In populations of African ancestry (e.g., Yoruba), HLA-A*03:01 is tagged by one intergenic SNP (rs2524024), which is in strong LD (r² = 0.81-1.0) with 63 other intergenic SNPs distributed along a 45.1 kb region (5.9-51 kb upstream of HLA-A). The rs2524024 SNP is also a known eQTL for the integral membrane protein 2A gene (ITM2A).
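The r² statistic quoted for LD between the tagging SNP and its neighbors follows the standard definition for two biallelic loci, which can be sketched as below. The function name `ld_r_squared` is hypothetical, and the inputs are assumed to be known haplotype and allele frequencies; with unphased genotype data these frequencies would first have to be estimated.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Return r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    # D is the deviation of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
```

When two alleles always occur together at the same frequency, r² reaches 1; when the haplotype frequency equals the product of the allele frequencies, r² is 0.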

Discussion
By focusing on generalizable findings that are applicable to eastern and southern Africa with multiple circulating HIV-1 subtypes, our analyses yielded clear evidence that female sex can be an important environmental factor to facilitate HLA class I-mediated immune control of HIV-1 infection. Because women typically have lower VL than men after acquiring HIV-1 (Fideli et al. 2001; Prentice and Tang 2012; Tang et al. 2002), our hypothesis about gene by sex interaction may offer some explanation.
In the context of HIV-1 infection, at least two earlier studies have alluded to sex-specific findings with HLA-A*74:01 and HLA-DRB1*11 (Hendel et al. 1999; Koehler et al. 2010). In our analysis, HLA-A*74:01 (a frequent allele) was weakly associated with relatively low VL in men. However, there was no evidence for interaction between HLA-A*74:01 and sex. The second hypothesis about HLA-DRB1*11 being unfavorable in women was derived from a French cohort (Hendel et al. 1999), but analyses of HIV-1-infected Zambians did not replicate that finding (Tang et al. 2010). Unlike earlier studies that did not account for potential false discoveries from random, multiple testing, the interaction term seen here for female sex and HLA-A*03:01 was accompanied by a low FDR (<0.10). A series of sensitivity analyses established that other potential confounders, including age, geography, and viral subtypes, did not obscure or compromise our analytical approaches. Data from the Multicenter AIDS Cohort Study may provide anecdotal evidence to support our key findings, as analyses of viral load and disease progression have never detected differential effects for HLA-A*03 in HIV-1-infected men (Kaslow et al. 1996; Mann et al. 1998). Statistical significance aside, the threshold for a biologically significant difference in HIV-1 VL is around 0.30 log10 after accounting for intra- and inter-assay variability (Modjarrad et al. 2008; Saag et al. 1996). By our estimates, female sex by HLA-A*03:01 interaction was independently associated with ~0.70 log10 reduction in VL (Tables 2, 3, 4), which should impact disease progression and vertical or horizontal HIV-1 transmission.
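The log10 differences discussed above are easier to interpret on the original copies/mL scale. A minimal sketch of the conversion follows; the function name `fold_change` is hypothetical, and the calculation is plain arithmetic rather than part of the study's analysis.

```python
def fold_change(log10_difference):
    """Convert a difference in log10 VL into a fold change in copies/mL."""
    return 10 ** abs(log10_difference)


if __name__ == "__main__":
    # 0.30 log10 is the biological threshold cited in the text;
    # 0.70 log10 is the approximate reduction reported for the interaction.
    for delta in (0.30, 0.70):
        print(f"{delta:.2f} log10 corresponds to a {fold_change(delta):.1f}-fold difference")
```

Under this conversion, a 0.30 log10 difference is roughly a 2-fold change in viral load, and a 0.70 log10 reduction is roughly a 5-fold change.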
The conditions for analyzing gene by sex interactions in our study population were somewhat suboptimal. First, men and women eligible for analyses differed in several non-genetic (and potentially confounding) features (Table 1), which mandated the application of multivariable models and sensitivity analyses. As such, the effect sizes (regression beta and R² values) attributable to specific interaction terms often differed by statistical models and complicated the interpretation process. Second, HLA profile and genetic backgrounds can differ by country and geographic region, suggesting that our emphasis on generalizable findings might have come at the expense of country-specific phenomena. Third, sample size was not equal between men and women, so the statistical power was somewhat compromised in analyses of female-specific associations. As such, the modest trend seen with HLA-C*18:01 in women (Fig. 2) is still worth noting. In the long term, statistical models for gene by sex interactions should continue to improve when homogeneous cohorts with unbiased sex ratios are available for follow-up studies.
HLA alleles that have early influences on HIV-1 viral load tend to impose a strong selection pressure for viral immune escape mutations, as often seen in individuals with HLA-B*57 and related alleles (Bansal et al. 2007; Crawford et al. 2009; Leslie et al. 2004; Novitsky et al. 2010; Wang et al. 2009). In HIV-1-infected African women, the VL trajectory associated with HLA-A*03:01 was relatively steady in the first 3 years of follow-up (Fig. 1). Further evaluation of immune responses and HIV-1 immune escape mutations in HIV-1-infected women with HLA-A*03:01 may provide new insights about durable immune protection against a broad spectrum of HIV-1 subtypes.
Although HLA-A*03:01 itself can play an important role in inducing immune responses to a variety of CTL epitopes, it is also possible that the interaction term seen with A*03:01 actually reflects the function of other variants that operate in a sex-specific fashion. Such genetic variations can be either upstream (telomeric) or downstream (centromeric) from the HLA-A locus (Vandiedonck and Knight 2009). The LD profiles in our study cohort strongly suggested that genes downstream from the HLA-A locus, including HLA-C and HLA-B, could not explain the A*03:01 effect. Two alternative hypotheses can relate to other genomic regions. First, through strong LD with rs2524024, a trans-acting eQTL for the ITM2A gene at Xq13.3-Xq21.2, HLA-A*03:01 can [...] (Davis and Dorak 2010; Dorak et al. 1999; Morrison et al. 2010). For HLA-A variants alone, evidence of sex-specific effect further points to a short sequence motif corresponding to polymorphic amino acid residues 161, 163, and 165 of the HLA-A protein product (Song et al. 2009). This particular sequence motif does not match the ones highlighted in a recent fine-mapping of HLA class I amino acid sequences in HIV-1-infected African Americans (in the absence of stratification by sex) (McLaren et al. 2012). Nonetheless, the HLA-A locus is often overshadowed by HLA-B and HLA-C in studies of HIV/AIDS (Apps et al. 2013; Fellay et al. 2009; Leslie et al. 2010; Prentice and Tang 2012).
Table 4 Multivariable models for geometric mean viral load (VL) at two overlapping intervals of early HIV-1 infection. (a) With log10 transformation before analysis. Summary statistics: β, regression beta (mean deviation, Δ, from the reference group); SE, standard error of the mean (Δ); R², proportion of VL variance attributable to each factor. (b) As part of the sensitivity analyses. (c) For consistency with earlier work, age is retained as a covariate regardless of its statistical significance.
If environmental factors indeed dictate how HLA-A alleles are expressed or regulated, close attention to gene × environment or gene × sex interaction should provide a deeper understanding of "missing heritability" in quantitative genetics.
Acknowledgments
This work was funded in part by IAVI and made possible by the support from many donors, including: the Bill & Melinda Gates Foundation, the Ministry of Foreign Affairs of Denmark, Irish Aid, the Ministry of Finance of Japan, the Ministry of Foreign Affairs of the Netherlands, the Norwegian Agency for Development Cooperation (NORAD), the United Kingdom Department for International Development (DFID), and the United States Agency for International Development (USAID). The full list of IAVI donors is available at http://www.iavi.org. Additional funding for this work came from (i) the United States National Institute of Allergy and Infectious Diseases (NIAID), through two R01 grants (AI071906 to R.A.K./J.T. and AI064060 to E.H.), (ii) the Fogarty AIDS International Training and Research Program (AITRP) (grant FIC 2D43 TW001042 to S.L.), and (iii) the KEMRI-Wellcome Trust Research Programme at the Centre for Geographical Medicine Research-Kilifi (Wellcome Trust award #077092). Submission of this study for publication required approval by representatives of the Kenya Medical Research Institute (KEMRI) and IAVI, but the contents are the responsibility of the study authors and do not necessarily reflect the views of IAVI, NIAID, USAID or the United States government. We thank all members of the IAVI Africa HIV Prevention Partnership for their valuable contributions to cohort assembly and collection of prospective data. We are also grateful to several associates, especially Travis R. Porter, Heather A. Prentice, and Wei Song, for assistance with genotyping and biostatistics.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Open Access
This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
Figure caption (fragment): [...] HLA-C*18:01-negative subjects. The thick and thin lines correspond to the expected mean value and 95 % confidence intervals for each stratum. Arrows indicate plasma viral load measurements that are <400 RNA copies/mL (transformed to 1.30 log10).