Gut microbiome composition in the Hispanic Community Health Study/Study of Latinos is shaped by geographic relocation, environmental factors, and obesity
Birthplace and lifestyle may have unrecognized influences on the gut microbiome of Hispanics living in the USA. We report a cross-sectional analysis of 1674 participants from four centers of the Hispanic Community Health Study/Study of Latinos (HCHS/SOL), aged 18 to 74 years at recruitment.
Amplicon sequencing of 16S rRNA gene V4 and fungal ITS1 fragments from self-collected stool samples indicates that the host microbiome is determined by sociodemographic and migration-related variables. Those who relocate from Latin America to the USA at an early age have reductions in Prevotella to Bacteroides ratios that persist across the life course. Shannon index of alpha diversity in fungi and bacteria is low in those who relocate to the USA in early life. In contrast, those who relocate to the USA during adulthood, at over 45 years of age, have high bacterial and fungal diversity and high Prevotella to Bacteroides ratios compared to USA-born individuals and childhood arrivals. Low bacterial diversity is associated in turn with obesity. Contrasting with prior studies, our study of the Latino population shows increasing Prevotella to Bacteroides ratio with greater obesity. Taxa within Acidaminococcus, Megasphaera, Ruminococcaceae, Coriobacteriaceae, Clostridiales, Christensenellaceae, YS2 (Cyanobacteria), and Victivallaceae are significantly associated with both obesity and earlier exposure to the USA, while Oscillospira and Anaerotruncus show paradoxical associations with both obesity and late-life introduction to the USA.
Our analysis of the gut microbiome of Latinos demonstrates unique features that might be responsible for health disparities affecting Hispanics living in the USA.
Keywords: Microbiome, Epidemiology, Hispanic population, Mycobiome, Obesity
Immigrants from Latin America and the Spanish-speaking Caribbean make up the majority of the foreign-born population living in the USA. Immigration-related life course experiences may affect the gut microbiome (GMB) among Latinos, with potential implications for chronic diseases that have been linked to the GMB. Many of these, including obesity, diabetes, and asthma, are highly prevalent in the US Hispanic population [2, 3], although the association of these diseases with the Hispanic GMB pattern is unknown.
Migration from lower-income countries to higher-income countries is associated with change in community structure of the GMB due to adoption of a Western-style diet, exposure to new natural and built environments, and other influences. Follow-up studies of migrants suggest that geographic relocation to the USA often coincides with a decrease in gut microbial diversity and a transition in GMB organisms, concurrent with replacement of dietary starches and fiber with animal proteins and fats. Changes in diet alter the GMB makeup by restricting nutrients needed for growth of certain bacteria while enhancing the growth of others. After an altered GMB is established, the new microbial communities in the host gastrointestinal tract can lead to changes in metabolic processes and generation of metabolites [5, 6].
Hispanic/Latino groups, which include the largest immigrant population in the USA, are known to harbor a distinct GMB as compared with non-Hispanics, but this has only been studied in small, local populations. Longitudinal assessments among migrants (e.g., Thailand to USA) have extended over weeks to months and are consistent with geographic variation in GMB shown in cross-national comparisons between lower- and higher-income countries. What is lacking are large and detailed multicenter US Hispanic cohorts that can estimate the effects of immigration on the GMB over the life course and inform about disease associations, which may differ among populations. Furthermore, such knowledge has the potential to facilitate the development of therapeutic interventions to alter the microbiome and treat or prevent disease.
We used data from a longstanding multicenter US cohort study to characterize the association of relocation to the mainland USA with GMB characteristics among individuals from several Latin American national backgrounds.
Demographic, behavioral, and socioeconomic variables, by birthplace and age at relocation to the mainland USA
Mainland US born
Relocated 0–17 years old
Relocated 18–34 years old
Relocated 35–44 years old
Relocated 45+ years old
Age in years, mean
Education level < 9th grade
Education level > high school
Household income > $30K/year
Mother’s education > high school
Father’s education > high school
English language preference
Median year of relocation to USA
Childhood economic hardship*
Return to home country in last year
SASH social relations subscale
Hispanic diet habits (dietary acculturation) ‡
AHEI score, mean
MVPA, meets 2008 guideline goals
Sedentary time, upper quartile
Hours of sleep, mean
Analysis of GMB composition and its correlates
Several markers of gut microbiome community structure were defined. We quantified alpha diversity using the Shannon index to describe the 16S rRNA gene V4 region bacterial and ITS1 fungal microbiome. We also derived the Prevotella to Bacteroides ratio from 16S data; these taxa frequently appear as important and dominant in other gut microbiome studies [14, 15, 16], hence the focus here. Using Bray-Curtis dissimilarities, we performed principal coordinate analysis (PCoA) of the 16S and ITS1 data. The first 16S principal coordinate (PCoA1) was strongly correlated with the Prevotella to Bacteroides ratio (Spearman’s r = −0.89), while PCoA2 was strongly correlated with the Shannon index (r = 0.77). Correlations with PCoA1 were −0.89 and 0.94 for relative abundance of Prevotella and Bacteroides, respectively.
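The three summary markers named above can be illustrated with a minimal Python sketch. It is not the study pipeline: the natural-log base for the Shannon index, the log10 scale of the ratio, and the pseudocount of 1 are assumptions made here for illustration, and the function names are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def log_pb_ratio(prevotella, bacteroides, pseudocount=1):
    """log10 of the Prevotella to Bacteroides read ratio.

    The pseudocount (an assumption, not taken from the paper) avoids
    division by zero when one genus is absent from a sample.
    """
    return math.log10((prevotella + pseudocount) / (bacteroides + pseudocount))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(a) + sum(b)
    return numerator / denominator if denominator else 0.0
```

A pairwise matrix of `bray_curtis` values across all samples is the usual input to a principal coordinate analysis; the ordination itself is omitted here to keep the sketch dependency-free.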
Fungal (ITS1) GMB populations were dominated by Aspergillus proliferans and Saccharomyces cerevisiae in both mainland US-born and Latin American-born groups (Table 1). Relative abundance of several fungal taxa differed according to place of birth. Those born in the mainland USA had a mean relative abundance of Cyberlindnera jadinii of 5.8%, which was many-fold higher than that among Latin American-born groups. Candida sake, Candida tropicalis, Candida glabrata, and Rhodotorula mucilaginosa were nearly absent in the US-born group but were fairly abundant (1 to 7%) in Latin American-born populations.
We evaluated 156 participant characteristics and health-related phenotypes, including dietary behaviors and disease-associated variables, one by one in univariate analyses of beta diversity based on genus-level bacterial 16S and fungal ITS1 data. Multiple sociodemographic variables reflecting country of birth and relocation from Latin America to the mainland USA were among the top 35 variables (all P < 0.05) associated with Bray-Curtis distance in bacterial and fungal community-level analyses (Fig. 2). Nearly all of the variables associated with Bray-Curtis distance also met the q value criterion of < 0.05, with a few exceptions for ITS1 analyses (Fig. 2).
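One common way to test a single variable against a Bray-Curtis distance matrix and to obtain the share of variation it explains (R²) is a permutation-based multivariate analysis of variance (PERMANOVA). The text above does not name the test that was used, so the following Python sketch is an assumption about the general technique rather than a description of the study's method. It handles one categorical variable and requires at least two groups and more samples than groups; the function names are hypothetical.

```python
import random


def _sum_sq(dist, idx):
    """Sum of squared distances over all unordered pairs of samples in idx."""
    return sum(dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:])


def _partition(dist, labels):
    """Return (SS_total, SS_within, number of groups) for a labeling."""
    n = len(labels)
    groups = {}
    for i, g in enumerate(labels):
        groups.setdefault(g, []).append(i)
    ss_total = _sum_sq(dist, list(range(n))) / n
    ss_within = sum(_sum_sq(dist, members) / len(members) for members in groups.values())
    return ss_total, ss_within, len(groups)


def _pseudo_f(ss_total, ss_within, n, a):
    """Pseudo-F statistic: between-group over within-group mean squares."""
    if ss_within == 0:
        return float("inf")
    return ((ss_total - ss_within) / (a - 1)) / (ss_within / (n - a))


def permanova(dist, labels, n_perm=999, seed=0):
    """One-way PERMANOVA on a square distance matrix (list of lists)."""
    n = len(labels)
    ss_total, ss_within, a = _partition(dist, labels)
    f_obs = _pseudo_f(ss_total, ss_within, n, a)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute group labels across samples
        _, ssw_perm, _ = _partition(dist, shuffled)
        if _pseudo_f(ss_total, ssw_perm, n, a) >= f_obs - 1e-12:
            hits += 1
    return {
        "R2": (ss_total - ss_within) / ss_total,
        "F": f_obs,
        "p": (hits + 1) / (n_perm + 1),
    }
```

Running such a test once per variable and ranking the variables by `R2` would give a ranking of the kind summarized in Fig. 2; continuous covariates and covariate adjustment would need a fuller implementation.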
Relocation to the US mainland is associated with GMB composition
We found little evidence that the geographic place of origin within Latin America was associated with summary measures of GMB composition (Fig. 3a, b). We conducted two additional analyses to discern whether the varied national backgrounds of our participants influenced our results. The association of birthplace and relocation age with Prevotella to Bacteroides ratio and GMB diversity was similar after serial exclusion of each Latino background group, indicating that no single group was disproportionately influencing the overall result (data not shown). A subgroup analysis limited to Mexican/Mexican American individuals was also conducted (Additional file 1: Figure S4), and it generally supported the conclusions derived from the analyses of the overall population shown in Fig. 3a, b.
Association between acculturation factors and GMB
Relative abundance of fungal species (ITS1 classification) comparing HCHS/SOL participants by region of birth
Country of birth
Aspergillus proliferans, abundance
Ratio versus mainland USA
Saccharomyces cerevisiae, abundance
Ratio versus mainland USA
Candida albicans, abundance
Ratio versus mainland USA
Candida sake, abundance
Ratio versus mainland USA
Debaryomyces hansenii, abundance
Ratio versus mainland USA
Candida tropicalis, abundance
Ratio versus mainland USA
Wallemia muriae, abundance
Ratio versus mainland USA
Candida glabrata, abundance
Ratio versus mainland USA
Cyberlindnera jadinii, abundance
Ratio versus mainland USA
Rhodotorula mucilaginosa, abundance
Ratio versus mainland USA
Association between diet and GMB
Diet among HCHS/SOL participants, classified according to place of birth and age at relocation from Latin America to the mainland USA
Mainland US born, Mean (SE)
Relocated 0–17 years old, Mean (SE)
Relocated 18–34 years old, Mean (SE)
Relocated 35–44 years old, Mean (SE)
Relocated 45+ years old, Mean (SE)
AHEI score, mean
51.4 (0.2) †
51.4 (0.4) †
51.4 (0.4) †
Dietary acculturation scale*
2.2 (0.05) †
1.7 (0.03) †
1.5 (0.05) †
1.5 (0.05) †
Energy, mean (kcal)
1888 (13) †
Carbohydrates, mean (g)
241 (1.5) †
242 (0.9) †
245 (1.3) †
245 (1.5) †
Carbohydrates, mean (% calories)
51.8 (0.2) †
52.2 (0.1) †
52.7 (0.2) †
52.6 (0.2) †
Protein, mean (g)
Protein, mean (% calories)
Fat, mean (g)
65.6 (0.4) †
64.5 (0.3) †
63.2 (0.4) †
63.0 (0.4) †
Fat, mean (% calories)
30.2 (0.2) †
29.8 (0.1) †
29.3 (0.1) †
29.2 (0.2) †
Saturated fat, mean (% calories)
9.9 (0.1) †
9.7 (0.0) †
9.6 (0.1) †
9.5 (0.1) †
Saturated fat, mean (g)
21.3 (0.2) †
20.8 (0.1) †
20.5 (0.2) †
20.4 (0.2) †
Trans fat, mean (g)
2.6 (0.04) †
2.4 (0.02) †
2.4 (0.04) †
2.3 (0.04) †
Dietary fiber, mean (g)
17.9 (0.2) †
19.1 (0.1) †
19.3 (0.2) †
19.1 (0.2) †
Sodium, mean (mg)
3179 (37) †
3141 (22) †
3159 (33) †
3152 (37) †
Intake of specific foods
Vegetable excluding potato, mean (servings/day)
Fruit excluding juice, mean (servings/day)
1.2 (0.1) †
1.3 (0.1) †
1.3 (0.1) †
Whole grain, mean (servings/day)
1.5 (0.2) †
1.9 (0.1) †
1.8 (0.1) †
1.9 (0.2) †
Refined grain, mean (servings/day)
Meat, mean (servings/day)
5.4 (0.2) †
5.3 (0.2) †
Red meat, mean (servings/day)
Processed meat, mean (servings/day)
Nuts and legumes, mean (servings/day)
Sugar sweetened beverages, mean (servings/day)
Milk/dairy, mean (servings/day)
Alcohol, mean (servings/day)
Association of socioeconomic variables and diet quality with features of the gut microbiome
16S Shannon index
Prevotella to Bacteroides ratio
ITS1 Shannon index
Beta (95% CI)
Beta (95% CI)
Beta (95% CI)
Household income/year ($)
0.03 (− 0.06, 0.13)
0.04 (− 0.04, 0.12)
0.10 (0.00, 0.20)
0.02 (− 0.08, 0.11)
0.02 (− 0.06, 0.10)
0.10 (0.01, 0.20)
0.08 (− 0.02, 0.18)
− 0.11 (− 0.19, − 0.03)
0.10 (− 0.01, 0.20)
< 9th grade
Some high school
0.01 (− 0.09, 0.11)
− 0.09 (− 0.18, − 0.01)
− 0.02 (− 0.12, 0.08)
High school diploma
− 0.03 (− 0.11, 0.05)
− 0.08 (− 0.15, − 0.01)
0.01 (− 0.08, 0.09)
Beyond high school
0.02 (− 0.05, 0.09)
− 0.15 (− 0.21, − 0.09)
0.05 (− 0.02, 0.12)
Childhood economic hardship
− 0.02 (− 0.10, 0.07)
0.05 (− 0.02, 0.12)
0.01 (− 0.07, 0.10)
Childhood sanitation (had basic facilities such as plumbing and sewer tank)
− 0.03 (− 0.14, 0.08)
− 0.14 (− 0.23, − 0.05)
0.08 (− 0.03, 0.19)
Alternative Healthy Eating Index*
−.0004 (−.0048, 0.0040)
0.0063 (0.0027, 0.0100)
0.0055 (0.0010, 0.0099)
Association of foods and nutrients with gut microbiome composition
16S Shannon index
Prevotella to Bacteroides ratio
ITS1 Shannon index
Beta (95% CI)
Beta (95% CI)
Beta (95% CI)
− .0280 (− .0553, − .0006)
0.0417 (0.0187, 0.0647)
0.0219 (− .0068, 0.0505)
− .0112 (− .0207, − .0016)
0.0007 (− .0073, 0.0088)
0.0016 (− .0083, 0.0116)
− .0467 (− .1392, 0.0457)
0.0723 (− .0056, 0.1501)
0.0382 (− .0579, 0.1344)
0.0237 (− .0172, 0.0646)
0.0078 (− .0267, 0.0422)
0.0366 (− .0061, 0.0794)
Vegetables (no potatoes)
0.0100 (− .0377, 0.0576)
0.0638 (0.0237, 0.1039)
0.0527 (0.0032, 0.1023)
Sugar sweetened beverages
− .0148 (− .0502, 0.0206)
0.0151 (− .0147, 0.0449)
−.0026 (− .0398, 0.0347)
Red and processed meat
− .0042 (− .1168, 0.1084)
− .1310 (− .2257, − .0364)
−.0790 (− .1956, 0.0376)
− .0016 (− .0076, 0.0044)
0.0044 (− .0007, 0.0094)
0.0060 (− .0003, 0.0122)
0.0654 (− .0698, 0.2006)
− .2413 (− .3546, − .1279)
−.0484 (− .1897, 0.0928)
0.0071 (− .0327, 0.0469)
− .0150 (− .0486, 0.0185)
−.0001 (− .0417, 0.0415)
0.0091 (− .0493, 0.0676)
0.0014 (− .0478, 0.0507)
0.0267 (− .0343, 0.0876)
Physical activity habits
Using data from 7-day accelerometry, we observed that late-life migrants to the USA had the worst physical activity habits (Table 2). However, there was no evidence that physical activity habits were related to measures of GMB composition including diversity or Prevotella to Bacteroides ratio (data not shown).
Association between socioeconomic variables and GMB
As compared with those who relocated to the US mainland in adulthood, both mainland US-born individuals and those arriving during childhood (age 0 to 17 years) had greater attained height, which is a marker of early-life socioeconomic advantage, and larger current household income (Table 2). Lower ratio of Prevotella to Bacteroides was associated with annual household income above $40,000 and higher educational attainment (Table 4). Conversely, higher Prevotella to Bacteroides ratio was found among those who lacked plumbing facilities during childhood.
Relatively few individuals (N = 293) had a body mass index (BMI) in the healthy range of 18.5 to 25 kg/m², while a similar number (approximately 17% of the cohort) had class II obesity (N = 188, BMI 35 to 40 kg/m²) or class III obesity (N = 106, BMI above 40 kg/m²). Geographic region of birth and timing of relocation to the mainland USA were associated with obesity, and especially class II–III obesity (Additional file 1: Figure S5 and Table S3).
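The BMI categories and the cohort shares quoted above can be checked with a short Python sketch. The healthy, class II, and class III ranges are those stated in the text; the underweight, overweight, and class I cutoffs are the standard WHO ones and are added here only to complete the classification. Using 1674 as the denominator is an assumption (the analytic sample size from the abstract), and the function names are hypothetical.

```python
def bmi_category(bmi):
    """Classify BMI (kg/m^2) using standard WHO cutoffs."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "healthy weight"
    if bmi < 30:
        return "overweight"
    if bmi < 35:
        return "class I obesity"
    if bmi < 40:
        return "class II obesity"
    return "class III obesity"


def percent_of_cohort(n, cohort_size=1674):
    """Share of the cohort, in percent; 1674 is an assumed denominator."""
    return 100.0 * n / cohort_size


healthy_share = percent_of_cohort(293)             # healthy-weight participants
severe_obesity_share = percent_of_cohort(188 + 106)  # class II plus class III
```

Under that assumed denominator, both shares come to roughly 17.5%, in line with the "approximately 17%" stated in the text.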
Identification of bacterial and fungal taxa associated with birthplace, relocation, and obesity
Regression analyses relating genera with obesity, birthplace, and age at relocation to the mainland USA. After individually examining the associations of 74 genera with relative abundance > 0.01% with obesity and with birthplace/age at relocation to the mainland USA, the ten genera displayed in this table were found to be overlapping between these two analyses. Regression models for obesity adjusted for age, sex, and center, and regression models for birthplace and age at relocation adjusted for sex and center
Association with obesity
Association with birthplace and age at relocation to USA
Direction of association
Early-life exposure to USA
Fungal ITS1 classification yielded 16 class-level, 49 order-level, 109 family-level, 192 genus-level, and 396 species-level taxa (Additional file 2: Table S6). Analysis of fungal taxa (Additional file 1: Table S7) revealed a few differences comparing those born in the mainland USA versus those born in Latin America (|LDA score| > 10⁴) (Aspergillus, Cyberlindnera, Tremellomycetes). Furthermore, in analysis of relocation age, among the 23 predominant fungal genera with relative abundance > 0.01% and present in more than 5% of individuals, Candida achieved an FDR-adjusted P value of 0.046 (Additional file 1: Table S8), while four others met nominal but not FDR-adjusted P value < 0.05 (Cyberlindnera, Aspergillus, Mrakia, Saccharomyces). We did not find any fungal correlates of obesity, with only Debaryomyces achieving a nominal P value < 0.05 (P value = 0.299 after FDR correction) (Additional file 1: Table S9).
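The distinction drawn above between nominal and FDR-adjusted P values can be made concrete with a small Python sketch of the Benjamini-Hochberg step-up adjustment. The text does not state which FDR procedure was applied, so Benjamini-Hochberg is an assumption here, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Under this procedure the smallest of m nominal P values is multiplied by up to m, which is why a genus can pass the nominal 0.05 threshold and still fail after adjustment across the 23 genera tested.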
The study of the human microbiome provides a new approach to understand health consequences of the environment across different geographic regions. Prior data suggest that gut microbiomes of Hispanic/Latino adults appear as a distinct cluster when analyzed alongside a collection of USA and worldwide populations [7, 23]. The results presented here describe characteristics of GMB variation and their determinants within the US Hispanic population. GMB heterogeneity among the US Latino study population was significantly accounted for by differences between the “first-generation” (Latin America-born) and “second-generation” (mainland US-born) groups. Each group had its own distinct microbiome pattern, which depended both upon place of birth and upon timing of geographic relocation to the mainland USA (i.e., “relocation age”). People who relocated to the mainland USA from Latin America, particularly those who did so relatively late in life, were characterized by a relatively high ratio of Prevotella to Bacteroides. This accounts for the fact that migration- and acculturation-related variables were among the leading explanatory variables in Bray-Curtis distance clustering analyses of 16S sequence data when ranked by explained variation (Fig. 2, R²). There was also evidence for increased GMB diversity of both bacterial and fungal components in arrivals from Latin America, particularly among those who arrived in the USA during middle to late adulthood as opposed to early life. Our data are consistent with the prevailing tendency for people in lower-income countries to have different gut microbial characteristics, including a Prevotella-dominant microbiome, when compared with the US population. In contrast to the Latin America-born, the US-born Latino population had low Prevotella to Bacteroides ratio and low fungal alpha diversity.
Among Hispanic populations, dietary patterns (fiber, sugary sweets, animal products, etc.) and medical history (e.g., diabetes, number of medications, Charlson comorbidity index) ranked high in terms of the variance explained according to community-wide comparisons, consistent with other cohorts . A novel contribution of our study was our observation that the strength of sociodemographic, region of birth, and migration-related influences rivaled that of known contributors to GMB diversity. The findings are supportive of a strong and lasting influence of early-life environment on the gut microbiome. Our cohort of largely immigrant US Latinos captured the “1.5 generation,” a subset of the first generation which refers to those who relocated to the USA during childhood and adolescence. Individuals in this group have lived their adult life in the US environment, but during childhood development, their gut microbiomes would have been established under the influence of the Latin American environment and lifestyle. The “1.5 generation” had levels of Prevotella to Bacteroides ratio that were intermediate between the “first” and “second” generations. Particularly interesting was that relocation age effects were seen regardless of the current age of participants. Thus, the tendency for childhood arrivals with longer time living in the USA to have lower Prevotella to Bacteroides ratio as compared with adult arrivals was a consistent phenomenon that did not dissipate across the life course. This finding suggests a critical time window for establishment of the adult microbiome, in line with the observation that age at separation determined GMB concordance between twins in the UK Twins cohort . We also showed that Hispanic adult US residents raised in Latin America had diet patterns that differed from that among the US-born. 
These differences in prevailing diet patterns were discernable even after immigrants had lived in the USA a long time, and they appeared to contribute to the makeup of the GMB. However, diet did not explain the GMB differences by birthplace and migration. The dual dependencies of both GMB and diet on the historical age at migration provide an interesting avenue of research to understand the long-term health of Hispanic populations of the USA.
In contrast to results for Prevotella to Bacteroides ratio, the association of GMB bacterial diversity with birthplace and geographic region was less clear. We found a relatively weak overall association between exposure to the USA and bacterial diversity. As compared both with those who relocated as adults and with those who were born in the mainland USA, those who relocated to the USA during childhood tended to have lower bacterial diversity. Moreover, those preferring to use the English language over the Spanish language had significantly higher 16S Shannon index, which was at odds with the a priori expectation that higher acculturation to the US environment would be associated with reduced bacterial alpha diversity. This seems to provide a more nuanced picture when compared with findings among other communities, which have observed loss of GMB diversity after migration from a low-to-moderate income setting to the USA. It should be noted that in some studies these immigrant generation differences in bacterial diversity have been relatively modest, and most studies have not analyzed data separately from the “generation 1.5” childhood-arrival population.
We confirmed the expected association of low bacterial (16S) diversity with obesity. We also classified subjects according to Prevotella to Bacteroides ratio because it is a frequently used metric to define the microbiome, although it only captures one feature of microbiome space. While decreasing Prevotella relative to Bacteroides was associated with exposure to the USA and “US style” (versus “Latino”) foods, enigmatically Prevotella to Bacteroides ratio tended to be higher rather than lower among obese individuals. Therefore, our results were not consistent with the hypothesis that “replacement” of Prevotella with Bacteroides among immigrants relocating to high-income nations is associated with increased risk of obesity. On the contrary, our data suggested that normal weight Latino adults had low prevalence of Prevotella relative to Bacteroides. While specific species and strains could not be resolved from our 16S data, it seems clear that this will be an important next step for assessing health effects of the GMB in Hispanics. For example, Prevotella copri is a common species that has been associated with increased risks of various diseases including diabetes. Both Prevotella and Bacteroides are highly diverse, with strain-specific gene functions that differ between Western and non-Western populations. As compared with the Prevotella-dominant GMB typical of the Latin American region, Latinos highly adapted to the USA who have a Bacteroides-dominant GMB may have different responses to dietary components and exposure to disease-related mechanisms such as short-chain fatty acid production and degradation of the GI mucus barrier [5, 6]. To resolve apparent differences between studies, an intriguing hypothesis that trans-cohort collaborations might be able to address is that disease-associated microbiota patterns may differ across geographic regions.
Having observed a significant influence of dietary fiber on Prevotella to Bacteroides ratio, we considered whether types of carbohydrates, legumes, and starches consumed differed across subgroups of the Hispanic population. Fruit and whole grain consumption were variable in the population, favoring the older adult age immigrants to the USA who had higher intakes of these foods. Bean and legume consumption was high by US standards. However, consumption of this food was similar across the population, and based on our adjusted analyses, we consider this diet component unlikely to contribute to the observed GMB differences.
Additional analyses identified several genera that were related in the same direction both to obesity and to early-life US exposure. For instance, Acidaminococcus (an anaerobic, Gram-negative, acetate- and butyrate-producer) was more abundant both with high BMI and with mainland US birth. Acidaminococcus has been associated with metabolic disease risks in prior worldwide studies. Abundance of these bacteria may be reduced in type 1 diabetes (China, Mexico) and increased in children with stunting (Malawi, Bangladesh). Consistent with our results, Acidaminococcus has been found to be increased in higher BMI adults (Bangladesh, USA) and in adults with high combined cardiovascular risk factors (China). We also confirmed that those with unfavorable body weight had reduced abundance of Oscillospira, which has also been shown to correlate with fatty liver disease, a condition of particularly high prevalence among Latinos. Paradoxically, although adiposity and US exposure are strongly associated with one another, Oscillospira as well as Anaerotruncus (another genus known to be negatively related to obesity) had lower abundance in the obese but higher abundance in the US-born. This discordant pattern between these two epidemiologically linked participant characteristics was therefore seen for Prevotella, Anaerotruncus, and Oscillospira, which we consider an interesting finding albeit of uncertain interpretation.
We found an association of reduced mycobiome diversity with early-life exposure to the USA. Components of the mycobiome have been implicated in chronic disease risk, but this is an understudied area . The lead explanatory variable for fungal beta diversity (Bray-Curtis distance) was poor oral health (missing teeth), and oral health overall is poor in the Latino population, as shown for the groups enrolled in HCHS/SOL . Fungal diversity also varied by income and neighborhood of residence (census block), which may be further evidence that low socioeconomic status and living environment may influence the mycobiome. A few of our findings relating to particular fungal taxa are worthy of note. We suspect that higher abundance of Cyberlindnera jadinii (which is added to processed foods ) among US-born as compared with Latin American-born individuals may be associated with some aspect of diet. Rhodotorula mucilaginosa, a yeast species that can be found in the environment including within foods and beverages , was practically absent in the US-born members of our cohort; however, among those of Latin American birth, this species had mean abundance ~ 1% in the Caribbean-born groups (Cuba, Dominican Republic, Puerto Rico) and 2–3% in the Mexico-, Central America-, and South America-born groups. R. mucilaginosa is considered a rare although emerging human pathogen , and in the context of chronic disease, it is interesting for its carotenoid-producing potential . Latin American-born individuals also had substantial mean abundance of several Candida species that were rare in the US-born, including C. sake, C. glabrata, and C. tropicalis. C. tropicalis is considered part of the normal human microbiota, yet it is of particular clinical interest for producing a virulent and sometimes antifungal-resistant systemic infection among patients in the Latin American and Asian regions . 
Despite several interesting differences in the fungal distribution between US- and Latin American-born people, we were unable to identify particular fungal taxa that correlated significantly with obesity among US Hispanics.
Following seminal work in this area , we can point to several possible explanations for why exposure to the Latin American and US environments may be associated with distinct microbiota patterns. These may include conditions and mode of childbirth, breastfeeding, diet, functioning of the immune system due to pathogen exposures, and exposure to pets and livestock. In our study, lifestyle factor profiles including diet and socioeconomic status differed between the Latin American-born and US-born groups. Physical activity levels also varied across Hispanic groups, although this dimension of lifestyle was not found to be associated with GMB, an interesting null finding in light of prior studies showing GMB differences across more extreme contrasts of exercise habits . Although several of these lifestyle factors were themselves associated with GMB, our multivariable adjustment models showed that lifestyle and socioeconomic variables did not explain the birthplace and migration associations with GMB or obesity risk. Nonetheless, despite the availability of a lengthy and wide-ranging in-person data collection protocol, it can be hard to exclude the influence of mismeasurement, unmeasured behaviors, or other environmental variables.
Over the short term, time-since-immigration effects on the GMB have been previously described in the USA. Is it plausible that the timing as well as the duration of US exposure may have independent effects? We speculate that the life course experience of childhood migrants from Latin America may have a particular influence on GMB. For instance, dramatic changes in diet, nutritional status, and environment after relocation to the USA may exert different effects when experienced in early life versus later adulthood. Thus, we might consider age-varying explanatory biological phenomena involving immunity, the physiology and function of the gastrointestinal tract, or social factors such as contacts with other US- and non-US-born individuals in the household. The time course for establishment of the adult microbiome pattern has been well studied, although little is known about how age may alter the response to environmental perturbation (here represented as age at relocation from Latin America to the USA). In this regard, we note our prior report from the HCHS/SOL cohort that adults who were childhood migrants to the USA had higher prevalence of asthma as compared with both US-born individuals and adulthood migrants. Like our GMB findings, these data on asthma are consistent with an immunological phenotype associated with early-life geographic relocation.
While we lacked a sufficient sample size to examine household clustering in this study [48, 49], in sensitivity analyses, we confirmed that key conclusions were similar after limiting the study to the subset of non-cohabitating individuals (data not shown). Other possible explanations that we may not have been able to control fully include differences across waves of migrant influx into the USA, as well as secular changes over time in the relevant environments (social, built, nutritional) of both the US and the Latin American source nations.
Limitations of this study include restriction to 16S and ITS1 sequencing. Shotgun metagenomic sequencing is in progress, which may allow identification of specific taxa down to the species and subspecies level, a necessary step to derive well-understood and modifiable biological targets. While we addressed the bacterial and fungal microbiome in parallel, interplay among bacterial and fungal taxa (co-occurrence, co-exclusion) will be complex to disentangle and will require larger samples and new statistical methods. Data on diet were assessed years prior to the GMB assessment, although we obtained these data using rigorous methods designed to capture habitual diet and showed strong associations between diet and GMB. Early-life environment was assessed retrospectively and subject to recall bias, suggesting that the relatively weak GMB signals in our data for variables such as childhood sanitation are likely to be underestimated. We did not study recent migrants because of the design of SOL, and geographic data was limited to the place of birth and the location of residence during the years of study participation. We also lacked repeated stool samples over time, and the analyses were cross-sectional, which will be overcome as the HCHS/SOL cohort members undergo future longitudinal assessments. Extant data suggest that genetic influences on the GMB are relatively weak and overshadowed by the environment [51, 52]. Hispanic background groups differ in average continental ancestry  yet we did not see a consistent pattern of difference by Hispanic background. Finally, only adults were studied, although results on migration suggest that studying children and adolescent migrant populations may capture a critical period for influences on lifelong GMB composition.
Strengths of the study setting include an extensive platform of clinical, biometric, behavioral, and sociodemographic variables which are of potential relevance to interactions among the host’s resident microbiome and the environment. Another design feature which lends credence to these comparisons was the approach of sampling all study participants from four US communities using random population-based recruitment methods and conducting assessments in a uniform manner across four US locations. The parent HCHS/SOL cohort had a relatively high participation rate of over 40%, which is notable considering that the cohort was inducted into a lengthy research program by door-to-door community recruitment. The participants were not selected from a diseased population, which allows us to address a large array of disease and biometric characteristics across a range of disease severity.
In summary, this study shows that early-life migration and length of stay in the mainland USA significantly affect key components of the GMB of Hispanic/Latino groups, which differ from other groups in the USA in microbiome features. In addition, obesity was associated with low bacterial alpha diversity, consistent with other studies, but the finding of a higher Prevotella to Bacteroides ratio in obese individuals was enigmatic, suggesting a unique aspect of the GMB-host relationship in Latinos. This in turn suggests the hypothesis that particular aspects of the microbiome may explain unusual epidemiological patterns observed among the Latino community, such as high prevalence of diabetes, obesity, and asthma [47, 54, 55], concurrent with a paradoxical propensity for longevity.
HCHS/SOL is a prospective, population-based cohort study of 16,415 Hispanic/Latino adults (ages 18–74 years at the time of recruitment during 2008–2011) who were selected using a two-stage probability sampling design from randomly sampled census block areas within four US communities (Chicago, IL; Miami, FL; Bronx, NY; San Diego, CA) [57, 58]. The HCHS/SOL Gut Origins of Latino Diabetes (GOLD) ancillary study was conducted to examine the role of gut microbiome composition on diabetes and other outcomes, enrolling participants for this analysis from the HCHS/SOL approximately concurrent with the second in-person HCHS/SOL visit cycle (2014–2017). The study was conducted with the approval of the Institutional Review Boards (IRBs) of Albert Einstein College of Medicine, Feinberg School of Medicine at Northwestern University, Miller School of Medicine at the University of Miami, San Diego State University, and University of North Carolina at Chapel Hill. Written informed consent was obtained from all study participants.
Participant characteristics and collection of clinical and behavioral data
A number of participant characteristics were ascertained by questionnaire at entry into HCHS/SOL, conducted by bilingual interviewers using the language preferred by the respondent. Self-reported variables included Hispanic/Latino background, place of birth, age at relocation (here termed “relocation age”), and years living in the mainland USA (with the US territory of Puerto Rico considered to be part of Latin America). Following previously described approaches, we used a combination of self-report, objective monitoring, clinical examination, and blood laboratory components to define sociodemographic factors, medical history and medication use, physical activity including sedentary time and moderate-to-vigorous physical activity (MVPA) derived from 7-day hip-worn accelerometry (Actical version B-1 model 198-0200-03; Respironics, Inc., Bend, OR), and diet. Sedentary time was classified according to quartiles, while MVPA was categorized according to whether participants met the 2008 US guidelines. Diet variables were derived from the average of two 24-h dietary recalls that were collected at the HCHS/SOL baseline visit. The first recall was collected in person, and the second recall was collected by telephone within the following 3 months. Diet recalls were conducted using the Nutrition Data System for Research software (version 11) developed by the Nutrition Coordinating Center, University of Minnesota (Minneapolis, Minnesota). Health insurance was defined according to participant self-report. Childhood economic hardship was assessed by the question, “Did your family ever experience a period of time when they had trouble paying for their basic needs, such as food, housing, medical care, and utilities, when you were a child?
/ Spanish: ¿Su familia alguna vez tuvo dificultades para pagar sus necesidades básicas como comidas, vivienda, cuidados médicos, o servicios públicos, cuando usted era niño(a)?” Access to sanitation during childhood was assessed by, “When you were growing up, did your home have the following basic utilities?... plumbing, septic tank. / Spanish: ¿Cuándo usted estaba creciendo, la casa donde vivía tenía los siguientes servicios públicos? Plomería, Drenaje/fosa séptica.” English or Spanish language preference was defined by the participant’s choice of English or Spanish written and spoken language in data collection encounters. Dietary acculturation was a self-reported measure stating whether a typical Hispanic, non-Hispanic (“American”), or blended style diet was consumed (“Of Hispanic/Latino and American food, do you usually eat...? Mainly or Mostly Hispanic/Latino foods” / Spanish: “De la comida hispana/latina y la comida americana, ¿por lo general come usted...? Principalmente comidas hispanas/latinas, or Mayormente comidas hispanas/latinas y algunas comidas americanas”). We administered a modified 10-item version of the Short Acculturation Scale for Hispanics (SASH), which has 5-point Likert scale responses. The derived score for social acculturation was an average of the four SASH items regarding socialization practices and preferences. Higher SASH response values represent greater acculturation to the dominant US culture. The overall SASH reliability was acceptable in the full sample (Cronbach’s α = .90), and for both the English and Spanish language versions (α = .76 for English; α = .85 for Spanish). The reliability of SASH was similar across Hispanic/Latino background groups (ranging from α = .85 for South Americans to α = .89 for Mexicans).
In addition, the use of antibiotics or probiotic supplements and dietary preferences within the prior 6 months, as well as stool characteristics (Bristol scale), were ascertained via directed questions on a self-administered questionnaire at the time of stool sample collection.
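The social acculturation score and the reliability coefficients reported above can be illustrated with a minimal Python sketch. It is not the study's code: the function names and the toy responses are hypothetical, and it assumes the standard formula for Cronbach's α with sample variances.

```python
from statistics import mean, variance


def social_acculturation_score(item_responses):
    """Average of one participant's Likert responses (1 to 5) to the scale items."""
    return mean(item_responses)


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    `items` is a list of k lists; items[i][j] is respondent j's answer to item i.
    """
    k = len(items)
    n = len(items[0])
    if k < 2 or any(len(col) != n for col in items):
        raise ValueError("need at least two items with equal numbers of respondents")
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(col) for col in items)
    # Sample variance of each respondent's total score across all items.
    totals = [sum(col[j] for col in items) for j in range(n)]
    total_var = variance(totals)
    return (k / (k - 1)) * (1 - item_var_sum / total_var)
```

With three items answered identically by every respondent, `cronbach_alpha` returns 1; values fall as the items agree less with one another. The sketch does not guard against the degenerate case in which all total scores are equal.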
Stool sample collection and processing
Enrolled participants were provided with a stool collection kit. For each participant, a single fecal specimen was self-collected using a disposable paper inverted hat (Protocult collection device, ABC Medical Enterprises, Inc., Rochester, MN). Participants were instructed to collect a sample of the specimen with a plastic applicator attached to the cap, to place the applicator into a supplied container with a stabilizer (RNAlater, Invitrogen, Carlsbad, CA) and 0.5-mm-diameter glass beads, and then to shake the container to mix stool and preservative. Samples were shipped to Albert Einstein College of Medicine, aliquoted into 1-ml tubes, and frozen at −80 °C. Each aliquot was barcoded A–C and stored in a separate box.
The following method was used to randomize the samples sent to the Knight Lab for microbial sequencing. Using a team of three, three boxes were randomly selected from the set of all boxes containing the “A” sample using a random number generator. From a chosen box containing 81 samples, each person randomly selected three rows (9 tubes per row) of tubes and placed them randomly in one 96-well tube rack (1 rack per person; total 3 racks). The boxes were then rotated among the group, and the process was repeated twice, resulting in three trays of 81 tubes consisting of 27 samples from each box. The process took less than 5 min, and the tube racks were immediately returned to −80 °C. The tubes from each rack were scanned in the randomized order, creating a spreadsheet listing sample ID and location, placed in a new, labeled freezer box, and then returned to −80 °C until shipment. Samples were shipped on dry ice via FedEx overnight delivery to the Knight Lab for further analysis.
DNA extraction and sequencing
DNA extraction and 16S rRNA gene and ITS1 amplicon sequencing were done using Earth Microbiome Project (EMP) standard protocols (http://www.earthmicrobiome.org/protocols-and-standards/). Briefly, DNA was extracted with the Qiagen MagAttract PowerSoil DNA kit as previously described. Amplicon polymerase chain reaction (PCR) was performed on the V4 region of the 16S rRNA gene using the primer pair 515f and 806r with Golay error-correcting barcodes on the reverse primer. Amplicon PCR was performed on the ITS1 region using primer pair ITS1f and ITS2 as described in the Earth Microbiome Project (http://www.earthmicrobiome.org/protocols-and-standards/ITS1/). ITS1 amplicons were barcoded and pooled in equal concentrations for sequencing. The amplicon pool was purified with the MO BIO UltraClean PCR cleanup kit (Qiagen, Venlo, Netherlands) and sequenced on an Illumina MiSeq sequencing platform. Sequence data were demultiplexed and minimally quality filtered using the Quantitative Insights Into Microbial Ecology (QIIME) 1.9.1 script split_libraries_fastq.py, with a PHRED quality threshold of 3 and default parameters to generate per-study FASTA sequence files.
Bioinformatics processing and statistical analysis
Bioinformatic processing steps and statistical analyses were conducted in R versions 3.4.1 and 3.4.3. 16S sequence reads were clustered into operational taxonomic units (OTUs) based on ≥ 97% similarity by the UCLUST algorithm, matched against the GreenGenes reference database (version 13_8) [70, 71]. Phylogenetic reconstruction was performed by PyNAST with the information from the centroids of the reference sequence clusters contained in the GreenGenes reference database. Sequences that failed to align (e.g., chimeras) were removed. Data were then rarefied and subsampled to a coverage depth of 10,000 reads per sample for downstream analyses. Rarefaction curves are presented in Additional file 1: Figure S8.
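As an illustration of the rarefaction step, the following Python sketch subsamples one sample's OTU counts to a fixed depth without replacement. It is a simplified stand-in, not the pipeline used in the study: the function name `rarefy`, the OTU labels, and the fixed seed are assumptions made for the example.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from a dict of OTU counts.

    Returns a new dict of counts, or None when the sample holds fewer than
    `depth` reads (such a sample would be dropped from downstream analysis).
    """
    total = sum(counts.values())
    if total < depth:
        return None
    # Expand the count table into one label per read, then draw without replacement.
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    drawn = random.Random(seed).sample(pool, depth)
    return dict(Counter(drawn))


# Hypothetical sample with 20,000 reads, rarefied to 10,000.
sample = {"OTU_a": 12000, "OTU_b": 7000, "OTU_c": 1000}
rarefied = rarefy(sample, 10000)
```

Every retained sample then has the same total read count, so diversity metrics are not driven by differences in sequencing depth.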
For fungal bioinformatic processing, reads were trimmed for bases that fell below a PHRED score of 25 at the 3′ end with PrinSeq V0.20.4. DADA2 V1.8 was used to pre-process the ITS1 sequencing and to remove chimeras using the default de novo protocol. Processed reads were then clustered into amplicon sequence variants using DADA2, and reference taxonomy was assigned using the naïve Bayesian classifier and the UNITE reference database. Outputs were imported into R using the phyloseq package and further processed with the vegan and coin packages.
16S rRNA gene V4 region (“16S”) amplicon sequencing [80, 81] was performed on 1920 samples, 142 of which were blank controls. The sequencing yielded 21,991 ± 12,087 (mean ± SD) reads per sample. After analysis with QIIME (version 1.9.1) closed-reference OTU picking, there was an average of 20,624 ± 10,771 (mean ± SD) reads per sample. Of the 1778 participant samples, 1674 samples passed all QC metrics and were used in subsequent analyses. To evaluate the fungal component of the GMB, ITS1 amplification and sequencing were performed on the same samples, resulting in 12,468 ± 41,628 reads per sample. Following DADA2 analysis, an average read count of 11,902 ± 36,170 reads per sample was obtained. Rarefaction analysis identified a stable plateau point at 500 reads, which allowed 1028 samples to be used in subsequent analysis. PERMANOVA analysis using Bray-Curtis distances did not show any significant biases among the four sequencing runs.
Taxonomic analyses were performed after collapsing OTUs at the genus level. Genera data were normalized with cumulative sum scaling (CSS) and log2 transformation to account for non-normal distribution. The α-diversity (Shannon index) and β-diversity (Bray-Curtis distances) were calculated to investigate the community-level diversity of gut microbiota using the phyloseq, vegan, and dada2 packages in R (version 3.4.1) [77, 78]. Linear modeling was performed using the base R lm function.
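The two community-level metrics named above have simple closed forms. The Python sketch below shows them under the usual definitions (Shannon index with the natural logarithm, Bray-Curtis dissimilarity on count vectors); it is an illustration with hypothetical function names, not the R code used in the study, and it omits the CSS normalization step.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("count vectors must list the same taxa")
    denom = sum(a) + sum(b)
    if denom == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / denom
```

A sample split evenly between two genera has H = ln 2, a sample dominated by a single genus has H = 0, and two samples that share no taxa have a Bray-Curtis dissimilarity of 1.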
To identify correlates of GMB within the HCHS/SOL US Hispanic cohort, we used available information from the two in-person HCHS/SOL study examinations as well as a brief diet, medication, and stool characteristic questionnaire that was collected at the time of GMB sampling. Lead correlates of beta diversity were identified by conducting PERMANOVA analysis of Bray-Curtis distances, computing the percent of sample clustering explained by 156 participant characteristics relating to stool quality, anthropometry (for example, height), behaviors (for example, diet), disease and use of medications (including clinical laboratory values, for example liver function tests), childhood exposures (including access to sanitation in home), sociocultural characteristics (including birthplace and relocation to the mainland USA), and demographic variables (sex, age). This set of variables was a subset of all collected variables available at the HCHS/SOL baseline and follow-up examinations, including those that had a plausible relationship with GMB and after selecting one variable from each set of highly correlated variables. Pairwise correlations among included variables are shown in Additional file 1: Figure S9 and Additional file 1: Figure S10. The adonis function from the vegan package in R was used to assess statistical significance for PERMANOVA analyses. For simplicity, we used a single, uniform modeling approach for PERMANOVA analysis, using linear ordination across categories of independent (predictor) variables. This test was most sensitive to dose-response relationships between levels of the explanatory variable and Bray-Curtis distance. To understand our results more fully, we also explored alternative statistical approaches including global differences among categories without assuming a dose-response ordination, which provided a more sensitive statistical test for variables such as relocation age which had a non-linear association with GMB metrics (data not shown).
As expected, those variables rose in the R² and P value rankings under the alternate modeling approach.
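To make the PERMANOVA quantities concrete, the following Python sketch computes the pseudo-F statistic, R², and a permutation P value for a single categorical predictor. This corresponds to the "global differences among categories" variant described above rather than the linear ordination used for the main analysis, and it is a simplified illustration with hypothetical function names, not a substitute for vegan's adonis.

```python
import random


def sums_of_squares(dist, labels):
    """Return (SS_total, SS_within) from a square distance matrix and group labels."""
    n = len(labels)
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in set(labels):
        idx = [i for i in range(n) if labels[i] == g]
        pair_sum = sum(dist[i][j] ** 2 for a, i in enumerate(idx) for j in idx[a + 1:])
        ss_within += pair_sum / len(idx)
    return ss_total, ss_within


def permanova(dist, labels, n_perm=999, seed=0):
    """One-way PERMANOVA: returns (pseudo_F, R2, permutation P value).

    Assumes at least two groups and nonzero within-group variation.
    """
    n, a = len(labels), len(set(labels))

    def pseudo_f(lab):
        ss_t, ss_w = sums_of_squares(dist, lab)
        ss_a = ss_t - ss_w  # among-group sum of squares
        return (ss_a / (a - 1)) / (ss_w / (n - a)), ss_a / ss_t

    f_obs, r2 = pseudo_f(labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute group labels across samples
        if pseudo_f(shuffled)[0] >= f_obs:
            hits += 1
    return f_obs, r2, (hits + 1) / (n_perm + 1)
```

R² here is the share of the total sum of squared distances attributable to the grouping, which is the "percent of sample clustering explained" reported for each participant characteristic.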
Using multivariable adjusted models, we isolated independent correlates of GMB outcomes. Linear modeling was performed using the base R lm function with the dependent variable defined as the metrics of GMB including Shannon index, Prevotella to Bacteroides ratio, and the first two principal coordinates of Bray-Curtis distance. We performed log transformation as appropriate to improve model fit. We used stratification combined with multivariable adjustment to address the relationship among multiple correlates of GMB in order to isolate associations with the variables of primary interest and exclude confounding. Adjustment variables were chosen based on a combination of empiric data on correlates of the main predictor and outcome variable, and knowledge of risk factor and disease relationships. These covariates included age (except for analyses with the primary predictor of interest defined as relocation age), gender, and study center for the initial adjusted models, and for the fully adjusted models, we added intake of vegetables without potatoes, intake of whole fruit, intake of whole grains, moderate-to-vigorous physical activity (continuous), BMI (six groups), diabetes/pre-diabetes/normoglycemic defined by American Diabetes Association criteria applied to study glucose and hemoglobin A1c levels (three groups), length and frequency of visits back to the participant’s country of origin (continuous), education level (four groups), income level (five groups), antibiotic use in the last 6 months (binary), and metformin use (binary). Next, in order to exclude confounding effects of age at the time of study, we examined the associations of relocation age with GMB across strata of current age at the time of GMB collection. This analysis was done after excluding individuals who relocated to the USA after age 26 years in order to remove the strong correlation between relocation age and current age.
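The idea of covariate adjustment in a linear model can be sketched in a few lines of Python. The function below fits ordinary least squares through the normal equations; it is an illustrative stand-in for R's lm, with a hypothetical function name, and it returns coefficients only (no standard errors or P values).

```python
def ols(X, y):
    """Least-squares coefficients for y ~ X via the normal equations.

    X is a list of rows; put a leading 1.0 in each row to fit an intercept.
    """
    p = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [x - factor * z for x, z in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]
```

In use, each row of `X` might hold an intercept term, the predictor of interest (for example, relocation age), and the adjustment covariates, with `y` holding a GMB metric such as the Shannon index. The coefficient on the predictor is then its association with the outcome holding the other columns fixed.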
A leave-one-out approach was also used to determine whether any single Hispanic background group was responsible for our main findings, and the Mexican subgroup of the HCHS/SOL was deemed large enough to allow analyses to be repeated in this group alone. To avoid false inferences due to small sample size, we excluded participant subgroups that had a small number of participants (for example, some of the mainland US-born groups separated out by Hispanic background). The final set of analyses examined the independent associations of GMB metrics and individual bacterial (16S) and fungal (ITS1) defined taxa with body mass index (obesity) and birthplace and migration. Significance testing followed a P < 0.05 criterion, and q values were used to control for multiple testing in R according to the method of Storey (http://github.com/jdstorey/qvalue).
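For readers unfamiliar with q values, the Python sketch below shows the core calculation in a simplified form. It estimates the proportion of true null hypotheses (π0) at a single tuning value λ = 0.5, which is an assumption made for brevity; the qvalue R package cited above offers more refined estimators, so this sketch should not be expected to reproduce its output exactly. The function name is hypothetical.

```python
def storey_qvalues(pvalues, lam=0.5):
    """q values with pi0 estimated at a single tuning value `lam`.

    Returns q values in the same order as the input P values.
    """
    m = len(pvalues)
    if m == 0:
        raise ValueError("no P values supplied")
    # pi0: estimated proportion of true null hypotheses, capped at 1.
    # If no P value exceeds lam, pi0 is 0 and every q value is 0;
    # a real analysis would need to handle that case separately.
    pi0 = min(1.0, sum(p > lam for p in pvalues) / (m * (1.0 - lam)))
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * pvalues[i] / rank)
        q[i] = running_min
    return q
```

Each q value estimates the false discovery rate incurred if that test and all tests with smaller P values are called significant, which is why q values are reported alongside P values when many taxa are tested at once.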
The authors gratefully acknowledge Dr. Noel Weiss and Dr. Bing Yu for reviewing this work prior to submission. Dr. Kaplan gratefully acknowledges the Helen Riaboff Whiteley Center of University of Washington for facilitating the completion of this work.
Peer review information
Kevin Pang was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
The review history is available as Additional file 3.
RCK conceived of the study, collected the data and specimens from the HCHS/SOL participants, obtained funding, and drafted the manuscript. ZW did the statistical analysis and drafted the manuscript. MU did the statistical analysis and processing of the HCHS/SOL fecal samples and drafted the manuscript. DSA collected the data and specimens from the HCHS/SOL participants, coordinated the HCHS/SOL cohort project, and obtained funding. MD collected the data and specimens from the HCHS/SOL participants and obtained funding. NS collected the data and specimens from the HCHS/SOL participants and obtained funding. GAT collected the data and specimens from the HCHS/SOL participants and obtained funding. MDG collected the data and specimens from the HCHS/SOL participants and obtained funding. BT collected the data and specimens from the HCHS/SOL participants and obtained funding. J-YM did the statistical analysis and drafted the manuscript. YV-B did the statistical analysis and gut microbial sequencing analysis and drafted the manuscript. DM did the statistical analysis and gut microbial sequencing analysis and drafted the manuscript. JSW-N did the statistical analysis and drafted the manuscript. MCW did the statistical analysis and drafted the manuscript. KEN collected the data and specimens from the HCHS/SOL participants, coordinated the HCHS/SOL cohort project, and obtained funding. JS did the statistical analysis. CCS collected the data and specimens from the HCHS/SOL participants and did the processing of the HCHS/SOL fecal samples. QQ collected the data and specimens from the HCHS/SOL participants, obtained funding, and drafted the manuscript. CRI collected the data and specimens from the HCHS/SOL participants, obtained funding, and drafted the manuscript. TW did the statistical analysis and drafted the manuscript. RK conceived of the study, did the gut microbial sequencing analysis, obtained funding, and drafted the manuscript.
RDB conceived of the study, collected the data and specimens from the HCHS/SOL participants, did the processing of the HCHS/SOL fecal samples, obtained funding, and drafted the manuscript. All authors read and approved the final manuscript.
The Hispanic Community Health Study/Study of Latinos (HCHS/SOL) is a collaborative study supported by contracts from the National Heart, Lung, and Blood Institute (NHLBI) to the University of North Carolina (HHSN268201300001I/N01-HC-65233), University of Miami (HHSN268201300004I/N01-HC-65234), Albert Einstein College of Medicine (HHSN268201300002I/N01-HC-65235), University of Illinois at Chicago (HHSN268201300003I/N01-HC-65236 Northwestern Univ), and San Diego State University (HHSN268201300005I/N01-HC-65237). The following Institutes/Centers/Offices have contributed to the HCHS/SOL through a transfer of funds to the NHLBI: National Institute on Minority Health and Health Disparities, National Institute on Deafness and Other Communication Disorders, National Institute of Dental and Craniofacial Research, National Institute of Diabetes and Digestive and Kidney Diseases, National Institute of Neurological Disorders and Stroke, NIH Institution-Office of Dietary Supplements. Additional funding for the “Gut Origins of Latino Diabetes” (GOLD) ancillary study to HCHS/SOL was provided by 1R01MD011389-01 from the National Institute on Minority Health and Health Disparities. None of the funding agencies had a role in the design, conduct, interpretation, or reporting of this study.
Ethics approval and consent to participate
All participants enrolled in the HCHS/SOL completed informed consent at the time of enrollment into the cohort and subsequently provided informed consent to participate in the gut microbiome ancillary study project. IRBs of all participating institutions approved the study. The IRB of the lead institution (Albert Einstein College of Medicine) has approved the HCHS/SOL project under reference number 2007-432. The National Institutes of Health maintains an Observational Study Monitoring Board which reviews the project, participant safety, and burden. All experimental methods comply with the Helsinki Declaration.
Consent for publication
All participants provided written informed consent for collection, analysis, and publication of their study data and results of laboratory tests derived from their biospecimens. The manuscript was reviewed by the HCHS/SOL Publications Committee which provided its approval for submission of this work for publication.
The authors declare that they have no competing interests.
- 8. Romero-Ibarguengoitia ME, Garcia-Dolagaray G, Gonzalez-Cantu A, Caballero AE. Studying the gut microbiome of Latin America and Hispanic/Latino populations. Insight into Obesity and Diabetes. Systematic Review. Curr Diabetes Rev. 2018.
- 11. Marin G, Sabogal F, Marin BV, Otero-Sabogal R, Perez-Stable EJ. Development of a Short Acculturation Scale for Hispanics. Hisp J Behav Sci. 1987;9:183–205.
- 13. U.S. DHHS. 2008 physical activity guidelines for Americans. Washington, DC: 2008.
- 20. Mattei J, Sotres-Alvarez D, Daviglus ML, Gallo LC, Gellman M, Hu FB, et al. Diet quality and its association with cardiometabolic risk factors vary by Hispanic and Latino ethnic background in the Hispanic Community Health Study/Study of Latinos. J Nutr. 2016;146(10):2035–44.
- 22. de la Cuesta-Zuluaga J, Corrales-Agudelo V, Carmona JA, Abad JM, Escobar JS. Body size phenotypes comprehensively assess cardiometabolic risk and refine the association between obesity and gut microbiota. Int J Obes. 2018;42(3):424–32.
- 25. McDonald D, Hyde E, Debelius JW, Morton JT, Gonzalez A, Ackermann G, et al. American gut: an open platform for citizen science microbiome research. mSystems. 2018;3(3):e00031-18. https://doi.org/10.1128/mSystems.00031-18.
- 35. Osborne G, Wu F, Yang L, Kelly D, Hu J, Li H, et al. The association between gut microbiome and anthropometric measurements in Bangladesh. Gut Microbes. 2019:1–14. https://doi.org/10.1080/19490976.2019.1614394.
- 38. Del Chierico F, Nobili V, Vernocchi P, Russo A, Stefanis C, Gnani D, et al. Gut microbiota profiling of pediatric nonalcoholic fatty liver disease and obese patients unveiled by an integrated meta-omics-based approach. Hepatology (Baltimore). 2017;65(2):451–64.
- 55. Kaplan RC, Aviles-Santa ML, Parrinello CM, Hanna DB, Jung M, Castaneda SF, et al. Body mass index, sex, and cardiovascular disease risk factors among Hispanic/Latino adults: Hispanic community health study/study of Latinos. J Am Heart Assoc. 2014;3(4):e000923. https://doi.org/10.1161/JAHA.114.000923.
- 61. Qi Q, Strizich G, Merchant G, Sotres-Alvarez D, Buelna C, Castaneda SF, et al. Objectively measured sedentary time and cardiometabolic biomarkers in US Hispanic/Latino adults: the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). Circulation. 2015;132(16):1560–9.
- 64. Marin G, Sabogal F, Marin BV, Otero-Sabogal R, Perez-Stable EJ. Development of a short acculturation scale for Hispanics. Hisp J Behav Sci. 1987;9(2):183–205.
- 65. Flores R, Shi JX, Gail MH, Gajer P, Ravel J, Goedert JJ. Assessment of the human faecal microbiota: II. Reproducibility and associations of 16S rRNA pyrosequences. Eur J Clin Investig. 2012;42(8):855–63.
- 66. Gilbert JA, Jansson JK, Knight R. Earth Microbiome Project and Global Systems Biology. mSystems. 2018;3(3):e00217-17. https://doi.org/10.1128/mSystems.00217-17.
- 69. R Core Team. R: A language and environment for statistical computing. 2017.
- 78. Oksanen J, Blanchet FG, Friendly M, Kindt R, Legendre P, McGlinn D, et al. vegan: Community Ecology Package. R package version 2.5–3 ed 2018.
- 79. Hothorn T, Hornik K, van de Wiel M, Zeileis A. A Lego system for conditional inference. Am Stat. 2006;60(3):257–63.
- 81. Walters W, Hyde ER, Berg-Lyons D, Ackermann G, Humphrey G, Parada A, et al. Improved bacterial 16S rRNA Gene (V4 and V4-5) and fungal internal transcribed spacer marker gene primers for microbial community surveys. mSystems. 2016;1(1):e00009-15.
- 83. Kaplan RC, Wang Z, Usyk M, Sotres-Alvarez D, Daviglus ML, Schneiderman N, et al. Burk_SOL GOLD. ERP117287. EMBL-EBI European Nucleotide Archive. https://www.ebi.ac.uk/ena/data/search?query=ERP117287. Accessed 14 Oct 2019.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.