Aerobiologia, Volume 33, Issue 1, pp 71–86

Molecular analysis of environmental plant DNA in house dust across the United States

  • Joseph M. Craine
  • Albert Barberán
  • Ryan C. Lynch
  • Holly L. Menninger
  • Robert R. Dunn
  • Noah Fierer
Original Paper

DOI: 10.1007/s10453-016-9451-5

Cite this article as:
Craine, J.M., Barberán, A., Lynch, R.C. et al. Aerobiologia (2017) 33: 71. doi:10.1007/s10453-016-9451-5

Abstract

Despite the prevalence and costs of allergic diseases caused by pollen, we know little about the distributions of allergenic and non-allergenic pollen inside and outside homes at the continental scale. To better understand patterns in potential pollen diversity across the United States, we used DNA sequencing of a chloroplast marker gene to identify the plant DNA found in settled dust collected on indoor and outdoor surfaces across 459 homes. House location was the best predictor of the relative abundance of plant taxa found in outdoor dust samples. Urban, southern houses in hotter climates that were further from the coast were more likely to have more DNA from grass and moss species, while rural houses in northern, cooler climates closer to the coast were more likely to have higher relative abundances of DNA from Pinus and Cedrus species. In general, those plant taxa that were more abundant outdoors were also more abundant indoors, but indoor dust had uniquely high abundances of DNA from food plants and plants associated with lawns. Approximately 14 % of the plant DNA sequences found outside were from plant taxa that are known to have allergenic pollen, compared to just 8 % inside. There was little geographic pattern in the total relative abundance of these allergens, highlighting the difficulties associated with trying to predict allergen exposures based on geographic location alone. Together, this work demonstrates the utility of using environmental DNA sequencing to reconstruct the distributions of plant DNA inside and outside buildings, an approach that could prove useful for better understanding and predicting plant allergen exposures.

Keywords

Environmental DNA; Plant allergens; Geography; Next-generation sequencing

1 Introduction

Allergic diseases are increasing worldwide (Pawankar et al. 2012) with 10–30 % of the world population affected by allergic rhinitis (WAO 2011). In the United States alone, approximately 16 million people suffer from allergic asthma (Denning et al. 2006) and the direct and indirect costs of asthma and allergic rhinitis in the USA exceed US$40 billion (Pawankar et al. 2012). Although a relatively small number of plant species are known to have allergenic pollen, pollen is a major source of the allergens that cause respiratory allergic reactions and asthma (D’Amato et al. 2007). Pollen itself is too large to directly enter the respiratory tract where allergic reactions take place (Taylor et al. 2002), yet activation or rupture of pollen in the environment can initiate aerosolization of allergens contained in cytoplasmic contents. As such, understanding the spatial patterns in allergenic and non-allergenic pollen exposures is critical to understanding the geographic variability in allergic disease. Information on allergenic pollen exposures can influence where a person chooses to live, the amount of time they spend outside, how they manage their local outdoor environment, and the steps they can take to manage their indoor environments to reduce pollen exposures.

The types and amounts of pollen found in the atmosphere can vary significantly across locations, often reflecting local abundances of plants (Dunwiddie 1987; McLauchlan et al. 2013; Sugita 1993), including lawn grasses and other plants growing close to homes. Yet, local pollen assemblages can also be influenced by transport of pollen, with some pollen able to disperse thousands of kilometers (Campbell et al. 1999; Cecchi et al. 2006; Hjelmroos 1991; Rogers and Levetin 1998). As long-distance transport of pollen has the potential to decouple local plant and pollen assemblages, the degree to which pollen in homes reflects local conditions is still poorly understood.

In addition to being encountered outside of buildings, pollen can enter into homes via open windows and doors, ventilation, textiles (e.g., clothes and laundry), and pets (O’Rourke and Lebowitz 1984; Takahashi et al. 2008; Zavada et al. 2007). Pollen concentrations are generally higher outside than inside homes (Burge 2002; Hugg and Rantio-Lehtimäki 2007; Pichot et al. 2015; Sterling and Lewis 1998), but allergy-causing compounds derived from pollen can reside in household dust for months (O’Rourke and Lebowitz 1984; Pichot et al. 2015). Consequently, the abundance of allergens from pollen can be higher inside than outside homes during times when pollen is not being produced (Fahlbusch et al. 2000, 2001). As people in the United States and other developed nations spend more than 85 % of their time inside buildings (Klepeis et al. 2001), even relatively low pollen concentrations in the home may represent significant exposures. What remains undetermined is the degree to which indoor and outdoor pollen assemblages are correlated across broad scales and whether pollen of some taxa are more likely than others to enter the home.

To investigate broad patterns in pollen assemblages inside and outside homes, we used high-throughput sequencing of chloroplast DNA to identify the taxonomic source of environmental plant DNA found in settled dust collected on indoor and outdoor surfaces across USA homes. Since this DNA metabarcoding technique allows for greater taxonomic resolution than typical techniques using visual identification of pollen grains or immunoassays, DNA metabarcoding has recently been used to characterize pollen assemblages with nuclear and chloroplast markers (Bruni et al. 2015; Kress et al. 2015; Richardson et al. 2015). This technique will sequence any plant DNA that is deposited on surfaces, which includes pollen but also other aerosolized particles, and so cannot exclusively be interpreted as representing pollen on surfaces. Yet, just as pollen abundance is thought to correlate with allergen load, even though it is not always pollen that is the source of respiratory plant allergens, we begin with the assumption that plant DNA on surfaces will correlate with pollen abundance and the corresponding allergen load. Using the metabarcoding technique with a chloroplast marker, we quantified the abundance of environmental plant DNA inside and outside 459 homes across the United States to (1) quantify the basic geography of environmental plant DNA inside and outside homes, (2) identify the differences between indoor and outdoor environmental plant DNA signatures, and (3) assess the importance of covariates such as climate and home characteristics on assemblage composition, including allergenic taxa.

2 Methods

2.1 Sample collection

Dust samples were analyzed from 459 homes. This sample set represents a random subset of a broader sampling of approximately 1200 homes as part of a project described previously (Barberán et al. 2015a). Briefly, indoor and outdoor dust samples were collected by volunteers participating in the Wild Life of Our Homes citizen science project (http://robdunnlab.com/projects/wild-life-of-our-homes/). Enrolled participants were provided a written informed consent form approved by the North Carolina State University’s Human Research Committee (Approval No. 2177) as well as instructions for sampling their home and a sampling kit that included dual-tipped sterile BBL CultureSwabs. Participants were instructed to sample the upper door trim on an interior door in the main living area of the home and the upper door trim on the outside surface of an exterior door using the dual-tipped sterile swab. These sampling locations were selected as they are unlikely to be cleaned frequently and serve as passive collectors of settled dust inside or outside the home with little to no direct contact from the home occupants. Each participant sampled their home once between March 2012 and May 2013.

Geographic coordinates were derived from home addresses with the latitude and longitude of each home recorded to only the nearest 0.5° to preserve anonymity. Coordinates were then used to obtain georeferenced variables for each household. These variables included climatic and soil factors, population density, and plant productivity [see details in (Barberán et al. 2015b)]. Home characteristics and occupant information (for both humans and pets) were obtained through a questionnaire filled out by the participants.

2.2 Molecular analyses

Genomic DNA from the settled dust samples was extracted using the MoBio PowerSoil htp-96 well Isolation Kit (Carlsbad, CA). A portion of the chloroplast trnL intron was PCR amplified from each genomic DNA sample using the c and h trnL primers (Taberlet et al. 2007), modified to include appropriate barcodes and adapter sequences for Illumina multiplexed sequencing. The barcodes were 12-bp error-correcting barcodes unique to each sample (Caporaso et al. 2012). Each 25-μL PCR reaction was mixed according to the Promega PCR Master Mix specifications (Madison, WI), with 2 μL of genomic DNA template. The thermocycling program used an initial step at 94 °C for 2 min, a final extension at 72 °C for 2 min and the following steps cycled 35 times: 2 min at 94 °C, 1 min at 55 °C, and 30 s at 72 °C. Amplicons from each sample were cleaned and normalized using SequalPrep Normalization Plates (Life Technologies) prior to being pooled together for sequencing on an Illumina MiSeq instrument running the 2 × 150 bp chemistry at the University of Colorado Next-Generation Sequencing Facility.

Sequences were demultiplexed and paired-end reads were merged using fastq_mergepairs (Edgar 2010). Since merged reads often extended beyond the amplicon region of the sequencing construct, we used fastx_clipper to trim primer and adaptor regions from both ends (https://github.com/agordon/fastx_toolkit). Sequences lacking a primer region on both ends of the merged reads were discarded.

Sequences were clustered into operational taxonomic units (OTUs) at the ≥97 % sequence similarity level, and sequence abundance counts for each OTU were determined using the usearch7 approach (Edgar 2013). We clustered sequences into OTUs to reduce the impact that PCR or sequencing errors could have on our estimates of diversity. In addition, because the targeted gene region often does not allow for identification of plant taxa down to the species level of resolution, using OTUs defined by a sequence similarity threshold allows for more conservative and consistent interpretation of the results. Sequences were quality trimmed to have a maximum expected number of errors per read of less than 0.1 and only sequences with more than three identical replicate reads were included in downstream analyses. BLASTN 2.2.30+ was run locally, with a representative sequence for each OTU as the query and the current NCBI nucleotide collection and taxonomy database as the reference. The tabular BLAST hit tables for each OTU representative were then parsed so only hits with at least 97 % query coverage and >97 % identity were kept. The NCBI genus names associated with each hit were used to populate the OTU taxonomy assignment lists. We determined whether species produced allergenic pollen or not by comparing this output to a list of plants known to produce allergenic pollen (Esch et al. 2001).
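The quality filter described above rests on the expected number of errors in a read, which is the sum of the per-base error probabilities implied by the Phred quality scores. The following is a minimal Python sketch of that calculation, not the pipeline used in the study. It assumes Phred+33 encoded quality strings; the function names and the example reads are illustrative.

```python
def expected_errors(quality: str, offset: int = 33) -> float:
    """Sum of per-base error probabilities, P = 10 ** (-Q / 10)."""
    return sum(10 ** (-(ord(ch) - offset) / 10) for ch in quality)


def passes_filter(quality: str, max_ee: float = 0.1) -> bool:
    """Keep a read only if its expected error count is below max_ee."""
    return expected_errors(quality) < max_ee


# Hypothetical 100-base reads (assumption: Phred+33 encoding).
high_quality = "I" * 100  # every base at Q40
low_quality = "5" * 100   # every base at Q20

print(passes_filter(high_quality))  # expected errors of about 0.01
print(passes_filter(low_quality))   # expected errors of about 1.0
```

Under this definition a 100-base read needs an average quality of roughly Q30 or better to stay below the 0.1 threshold, which is why the filter is a strict one.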

2.3 Statistical analyses

A total of 1042 OTUs were identified in either indoor or outdoor samples. For most analyses, the 100 OTUs with the highest average abundance were used. As taxonomic groups represented by multiple OTUs could disproportionately affect multivariate relationships among taxa, we combined individual OTUs that shared genera into OTU groups before running principal component analyses (PCAs). From the original 100 OTUs, this produced a total of 57 individual OTUs and OTU groups. Separate PCAs were run for the indoor and outdoor samples with the two axes with highest eigenvalues for each PCA examined in detail here.
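The grouping step can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the authors' code: the function name and the "group:" label are hypothetical, and the three example OTUs and their abundances are the indoor values for OTUs 1781, 5, and 12 from Table 1.

```python
from collections import defaultdict


def group_otus_by_genus(abundance, genus_of):
    """Sum abundances of OTUs that share a genus; unique OTUs are kept as is."""
    members = defaultdict(list)
    for otu, genus in genus_of.items():
        members[genus].append(otu)
    grouped = {}
    for genus, otus in members.items():
        # A genus with several OTUs becomes one OTU group (hypothetical label).
        label = f"group:{genus}" if len(otus) > 1 else otus[0]
        grouped[label] = sum(abundance[otu] for otu in otus)
    return grouped


# Indoor mean relative abundances (%) taken from Table 1.
abundance = {"1781": 4.39, "5": 2.28, "12": 1.56}
genus_of = {"1781": "Zea", "5": "Zea", "12": "Musa"}

grouped = group_otus_by_genus(abundance, genus_of)
print(grouped)
```

Here the two Zea OTUs collapse into a single group while the Musa OTU is carried through unchanged, which is how 100 OTUs can reduce to a smaller set of OTUs and OTU groups.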

To determine the influence of covariates on OTU assemblages, a backward-elimination stepwise regression model explaining PCA scores of the first two indoor and outdoor axes was used. Predictor variables included latitude, longitude, mean annual precipitation (MAP), mean annual temperature (MAT), elevation, regional soil pH, distance from the coast, a categorical classification of urban or rural locations, whether the house had dogs, and whether the house had cats. After examination of the data, it appeared that homes in Washington State had unique assemblages, so we included a t test to determine whether these homes were uniquely different in their assemblages than those in all other states. Minimum p value for retention in the model was P < 0.01.

We calculated assemblage similarity using the Bray-Curtis metric after Hellinger transformation to reduce the influence of high abundance OTUs (Legendre and Gallagher 2001). We used Mantel tests to assess correlations among assemblage similarity matrices, and between assemblage matrices and geographic distance. To determine the influence of covariates on assemblage similarity, we used permutational multivariate analysis of variance (PERMANOVA) based on 1,000 permutations (Anderson 2001).
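As an illustration of the dissimilarity calculation, the sketch below applies a Hellinger transformation (square root of relative abundance) to two samples and then computes the Bray-Curtis dissimilarity between them. It is plain Python rather than the vegan functions used in the study, the read counts are hypothetical, and it assumes every sample contains at least one read.

```python
import math


def hellinger(counts):
    """Hellinger transformation: square root of each relative abundance."""
    total = sum(counts)
    return [math.sqrt(c / total) for c in counts]


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum |a_i - b_i| / sum (a_i + b_i)."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator


# Hypothetical read counts for three OTUs in two homes.
home_a = [900, 90, 10]
home_b = [100, 800, 100]

raw = bray_curtis(home_a, home_b)
transformed = bray_curtis(hellinger(home_a), hellinger(home_b))
print(raw, transformed)
```

The square-root step compresses large values, so the dominant OTU contributes less to the dissimilarity after transformation than it does on raw counts.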

To identify taxa disproportionately associated with either indoor or outdoor samples, indicator values were calculated for each of the top 100 OTUs based on abundance and frequency of occurrence (Dufrêne and Legendre 1997). Generalized linear models (GLM) with binomial errors were used to explain the total abundance indoors and outdoors of taxa with allergenic pollen. We used the R (www.r-project.org) packages vegan (http://vegan.r-forge.r-project.org/), labdsv (http://ecology.msu.montana.edu/labdsv/R/) and ecodist (http://cran.r-project.org/web/packages/ecodist/) for multivariate statistics and indicator value analyses. Stepwise regression was computed in JMP v. 11.2.0 (SAS Institute, Cary NC, USA). For graphical representation of patterns, the proportional abundances of taxa and principal component axis scores were mapped by inverse distance weighting interpolation on 100 × 100 grid cells, using the gstat package (https://r-forge.r-project.org/projects/gstat/).
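Inverse distance weighting, used here only for mapping, estimates the value at a grid cell as a weighted mean of the sampled values, with weights that decay with distance. The sketch below is a minimal Python version and not the gstat implementation. It assumes planar coordinates and a distance exponent of 2; the exponent used for the published maps is not stated, and the sample points are hypothetical.

```python
import math


def idw(points, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y) from (px, py, value) samples."""
    numerator = 0.0
    denominator = 0.0
    for px, py, value in points:
        distance = math.hypot(x - px, y - py)
        if distance == 0:
            return value  # the target coincides with a sample point
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Hypothetical samples: (x, y, relative abundance in %).
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 30.0)]
print(idw(samples, 1.0, 0.0))
```

Filling a 100 × 100 grid amounts to calling such a function once per cell centre.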

3 Results

3.1 Geography of outdoor environmental plant DNA

Across all outdoor samples, an OTU that matched with Pinus species was the most abundant (mean relative abundance of 21.4 %; Table 1). This OTU likely represents multiple species in the genus Pinus, as is the case with OTUs of many of the genera we consider. The next nine most abundant OTUs were from grasses as well as species of the genus Cedrus, Fabaceae species such as those from genus Glycine, Cupressaceae species from the genus Juniperus, and Ulmus species. Grouping OTUs to shared taxonomic groups, Pinus sequences represented 25 % of the outdoor sequences, while sequences from Pooid grasses such as Elymus and Festuca represented 11 % of the sequences.
Table 1

Ten most abundant OTUs indoors and outdoors, the genus selected to represent the OTU, its average abundance, and the percentage of base pairs matched with the reference sequence

OTU ID    Representative genus    % Sequences    % Match
Indoors
  1668    Pinus                   13.47          98.6
  3       Triticum                12.06          100
  1783    Setaria                 7.26           98.68
  1781    Zea                     4.39           98.01
  1479    Festuca                 4.21           98.18
  7       Solanum                 2.85           100
  5       Zea                     2.28           100
  6       Glycine                 2.05           100
  9       Ambrosia                1.95           100
  12      Musa                    1.56           100
Outdoors
  1668    Pinus                   21.35          98.6
  4       Cedrus                  6.79           100
  1783    Setaria                 5.72           98.68
  3       Triticum                4.41           100
  1479    Festuca                 3.64           98.18
  8       Juniperus               3.36           100
  6       Glycine                 3.36           100
  11      Ulmus                   2.23           100
  5       Zea                     2.02           100
  1781    Zea                     1.89           98.01

In outdoor samples, OTUs were non-randomly distributed across the USA (Fig. 1). For example, in the outdoor samples, Festuca DNA is largely restricted to the northern regions with continental climates, Cedrus DNA is largely restricted to coastal areas, and Glycine DNA is largely restricted to areas where soybeans are grown, namely the Mississippi River Valley of the Midwestern US.
Fig. 1

Maps of the relative abundance of (a) Festuca, (b) Cedrus, and (c) Glycine as determined from environmental plant DNA found on the outside of homes. Points represent sampling locations

The composition of communities (based on the top 100 OTUs) outside of houses was influenced by climate, elevation, distance from the coast, and the degree of urbanization (Table 2). The explanatory power of the individual predictor variables included in our analyses was generally low, with each variable explaining, on average, only 1.9 % of the total OTU variation outdoors. Distance from the coast explained the greatest proportion of overall composition (5.4 % for outdoor samples).
Table 2

Results of PERMANOVA tests of the relative abundance of the top 100 OTUs for indoor and outdoor samples

                  Indoors              Outdoors
Predictor         r2 (%)    P          r2 (%)    P
MAT               1.94      <0.001     1.95      <0.001
MAP               1.66      <0.001     1.60      <0.001
Soil pH           2.10      <0.001     2.97      <0.001
Urbanization      0.55      <0.001     1.26      <0.001
Distance coast    2.44      <0.001     5.39      <0.001
Elevation         2.18      <0.001     1.31      <0.001
Dogs              0.50      <0.001     0.15      0.73
Cats              0.71      <0.001     0.22      0.37

Predictor variables include mean annual temperature (MAT), mean annual precipitation (MAP), regional soil pH, the degree of urbanization, distance from the coast, elevation, and the presence of dogs or cats in the household
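The average explanatory power quoted in the text can be recomputed directly from the r2 columns of Table 2. The short Python check below uses only the tabulated values; the variable and function names are illustrative.

```python
# r2 (%) values transcribed from Table 2.
indoor_r2 = {
    "MAT": 1.94, "MAP": 1.66, "Soil pH": 2.10, "Urbanization": 0.55,
    "Distance coast": 2.44, "Elevation": 2.18, "Dogs": 0.50, "Cats": 0.71,
}
outdoor_r2 = {
    "MAT": 1.95, "MAP": 1.60, "Soil pH": 2.97, "Urbanization": 1.26,
    "Distance coast": 5.39, "Elevation": 1.31, "Dogs": 0.15, "Cats": 0.22,
}


def mean_r2(table):
    """Unweighted mean of the per-predictor r2 values."""
    return sum(table.values()) / len(table)


for name, table in (("indoors", indoor_r2), ("outdoors", outdoor_r2)):
    strongest = max(table, key=table.get)
    print(name, round(mean_r2(table), 1), strongest)
```

The means come out at about 1.5 % indoors and 1.9 % outdoors, with distance from the coast the strongest single predictor in both settings.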

Outdoors, plant OTU assemblages of different homes became more similar with decreasing geographic distance (Mantel test; rM = 0.13; P < 0.001). In other words, homes located in closer proximity to one another tend to have more similar plant OTU assemblages outdoors. Examining the multivariate relationships among OTU abundances across sites for the outdoor dust samples, there were two broad geographic patterns that reflect fundamental distributions of North American plants. The first PCA axis reflects contrasts between the abundance of grasses and mosses and the abundance of Pinus and Cedrus species (Table 3). Urban, lower latitude houses in hotter climates that were further from the coast were more likely to have higher relative abundances of grass and moss OTUs, while rural houses in northern, cooler climates closer to the coast were more likely to have higher relative abundance of Pinus and Cedrus OTUs (P < 0.005 for all predictors; Table 4). The second geographic pattern primarily related to the unique OTU assemblage of houses in the Pacific Northwest (Fig. 2; Table 4). Houses in this region were more likely to have OTUs of moss species such as Ceratodon and Funaria in combination with tree species such as Thuja (Fig. 2; Table 3), all typical of the region.
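A Mantel test of this kind correlates the off-diagonal entries of two distance matrices and judges significance by permuting the rows and columns of one of them. The Python sketch below is a simplified, one-sided version based on the Pearson correlation and is not the implementation used for the analyses. The four homes, their positions, and their dissimilarities are hypothetical.

```python
import math
import random


def upper_triangle(matrix):
    """Entries above the diagonal, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(d1, d2, permutations=999, seed=0):
    """Return (r_M, one-sided permutation P value) for two distance matrices."""
    rng = random.Random(seed)
    n = len(d1)
    target = upper_triangle(d2)
    observed = pearson(upper_triangle(d1), target)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns of d1 together so it stays a distance matrix.
        permuted = [[d1[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), target) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical homes along a line (arbitrary distance units).
positions = [0.0, 1.0, 3.0, 7.0]
geo = [[abs(a - b) for b in positions] for a in positions]
# Hypothetical assemblage dissimilarities that grow with distance.
dissim = [
    [0.0, 0.2, 0.4, 0.9],
    [0.2, 0.0, 0.3, 0.8],
    [0.4, 0.3, 0.0, 0.6],
    [0.9, 0.8, 0.6, 0.0],
]
r_m, p_value = mantel(dissim, geo)
print(r_m, p_value)
```

With only four homes the number of distinct permutations is tiny, so the P value in this toy case is coarse; with hundreds of homes the permutation distribution is far better resolved.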
Table 3

Eigenvectors and eigenvalues of the principal components analyses (PCAs) for the top 100 indoor and outdoor OTUs

                                   Indoor              Outdoor
OTU ID    Representative genus    Axis 1    Axis 2    Axis 1    Axis 2
6         Glycine                 −0.01     0.11      0.19      0.08
9         Ambrosia                0.07      0.04      0.08      0.05
11        Ulmus                   0.01      −0.15     0.06      −0.05
12        Musa                    −0.03     −0.03     0.03      −0.01
13        Medicago                0.03      −0.04     0.04      0.02
16        Ginkgo                  −0.07     −0.01     −0.03     −0.03
18        Tsuga                   −0.09     0.04      −0.16     0.08
19        Picea                   −0.04     0.02      −0.05     −0.12
20        Olea                    0.00      −0.01     0.00      −0.01
21        Ceratodon               0.07      0.28      −0.08     0.55
22        Amaranthus              −0.04     −0.11     0.07      −0.01
23        Huperzia                −0.10     −0.04     −0.08     −0.07
25        Podocarpus              −0.09     0.01      −0.01     −0.01
26        Sphagnum                0.08      0.15      0.00      −0.01
28        Juglans                 −0.07     0.11      0.04      −0.01
29        Polygonum               0.10      −0.05     0.08      0.04
33        Funaria                 −0.05     0.25      −0.06     0.53
35        Abies                   −0.10     0.07      −0.07     0.05
37        Spinacia                −0.03     −0.02     0.04      0.01
38        Trifolium               0.19      −0.05     0.15      −0.07
41        Platanus                −0.09     0.07      −0.01     −0.02
42        Thuja                   −0.03     0.17      −0.10     0.34
43        Liquidambar             −0.10     −0.15     0.05      −0.07
45        Oryza                   −0.05     0.04      0.11      0.04
46        Sorghum                 0.08      −0.12     0.13      −0.02
48        Convallaria             0.00      −0.03     0.01      −0.01
49        Arachis                 0.05      0.11      −0.01     0.00
50        Stellaria               0.26      −0.27     0.14      −0.02
51        Fagopyrum               0.00      0.10      −0.09     0.02
52        Citrus                  0.03      0.06      −0.06     0.01
55        Plantago                0.20      0.06      0.03      0.01
57        Dicranum                0.04      0.16      −0.03     −0.03
58        Cinnamomum              0.04      −0.21     0.01      0.02
59        Camellia                −0.07     0.24      −0.01     −0.01
62        Asparagus               0.02      0.06      0.06      0.05
64        Unknown                 0.08      −0.20     0.06      −0.01
125       Magnolia                0.12      −0.13     0.05      −0.01
328       Pseudotsuga             −0.02     0.13      −0.13     0.20
491       Juglans                 −0.02     −0.17     0.05      −0.11
498       Lycopodium              −0.02     0.16      0.01      0.02
688       Chamaecyparis           0.06      −0.02     −0.09     −0.15
790       Draba                   0.00      −0.04     0.04      0.00
1213      Unknown                 −0.06     0.02      −0.01     0.01
1456      Capsicum                −0.11     −0.02     0.03      −0.02
A         Allium                  −0.01     0.18      0.14      0.10
B         Castanea                −0.08     −0.15     −0.04     −0.18
C         Cedrus                  −0.16     −0.01     −0.18     0.18
D         Cucurbita               −0.03     0.18      0.02      −0.05
E         Digitaria               0.19      −0.33     0.17      −0.06
F         Elymus                  −0.07     0.04      0.05      0.01
G         Festuca                 0.41      0.20      0.45      0.02
H         Juniperus               −0.09     −0.08     −0.09     0.05
I         Lycopodium              −0.01     0.01      −0.04     −0.01
J         Pinus                   −0.44     −0.22     −0.43     −0.27
K         Salvia                  −0.02     0.06      −0.03     0.00
L         Solanum                 0.04      0.17      0.03      −0.01
M         Zea                     0.50      −0.03     0.52      0.04
Eigenvalue                        2.1       1.8       2.1       1.7
% Explained                       3.6       3.1       3.7       2.9

Separate PCAs were run for each. OTUs that shared genera were aggregated to common OTU groups
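The relationship between the eigenvalues and the percentages in the last two rows of Table 3 can be checked with one line of arithmetic. The sketch below assumes that the PCAs were run on standardized variables, so that total variance equals the number of variables (57 OTUs and OTU groups); this assumption is not stated in the text but is consistent with the tabulated values.

```python
N_VARIABLES = 57  # individual OTUs and OTU groups entering each PCA


def percent_explained(eigenvalue, n_variables=N_VARIABLES):
    """Share of total variance carried by one axis, in percent."""
    return 100.0 * eigenvalue / n_variables


# Eigenvalues as reported in Table 3 (rounded to one decimal place).
reported = {
    "indoor axis 1": 2.1, "indoor axis 2": 1.8,
    "outdoor axis 1": 2.1, "outdoor axis 2": 1.7,
}
for axis, eigenvalue in reported.items():
    print(axis, round(percent_explained(eigenvalue), 2))
```

The results (about 3.7, 3.2, 3.7, and 3.0 %) match the tabulated 3.6, 3.1, 3.7, and 2.9 % to within the rounding of the reported eigenvalues.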

Table 4

Stepwise regression results of scores of homes for outdoor and indoor PCA axes

                          Parameter estimate    SE        SS        P
Outdoor
 Axis 1
  Intercept               2.74                  0.63                <0.001
  Distance coast (km)     0.0019                0.0002    192.81    <0.001
  Elevation (m)           −0.0004               0.0002    12.06     0.008
  Latitude                −0.06                 0.01      25.47     <0.001
  Longitude               0.013                 0.004     19.21     <0.001
  Urbanization [Rural]    −0.22                 0.07      17.78     0.001
 Axis 2
  Intercept               −3.82                 0.77                <0.001
  WA [Yes]                1.46                  0.15      84.44     <0.001
  Latitude                0.062                 0.011     29.25     <0.001
  Longitude               −0.023                0.004     38.05     <0.001
  MAP (mm)                0.0007                0.0002    13.61     <0.001
Indoor
 Axis 1
  Intercept               −4.11                 1.62                0.012
  Latitude                0.094                 0.032     82.00     0.003
  Longitude               0.018                 0.004     18.41     <0.001
  MAT (°C)                0.12                  0.04      16.13     0.003
  Distance coast (km)     0.0013                0.0002    37.13     <0.001
  Dogs present [Yes]      0.40                  0.13      16.80     0.002
 Axis 2
  Intercept               1.02                  0.44                0.02
  WA [Yes]                0.52                  0.17      13.12     0.003
  Elevation (m)           −0.0007               0.0002    34.33     <0.0001
  Longitude               −0.016                0.004     21.05     <0.001
  MAT (°C)                −0.13                 0.016     106.49    <0.001

Parameter estimates include the slope of the relationship between the response variable and continuous independent variables. If the independent variable is categorical and binary, the estimate represents the change in the intercept if the categorical variable is one value with the change in the intercept opposite in sign if it is the contrasting value, e.g., rural versus urban or yes versus no
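The coding of the binary predictors described in the table note can be made concrete with a worked prediction for outdoor Axis 1. The Python sketch below uses the Table 4 estimates; the example home is hypothetical, and longitude is assumed to be in decimal degrees with western longitudes negative, which the table does not state.

```python
# Outdoor Axis 1 parameter estimates from Table 4.
INTERCEPT = 2.74
SLOPES = {
    "distance_coast_km": 0.0019,
    "elevation_m": -0.0004,
    "latitude": -0.06,
    "longitude": 0.013,  # assumption: decimal degrees, west is negative
}
RURAL_EFFECT = -0.22  # applied as -0.22 for rural homes and +0.22 for urban homes


def predict_outdoor_axis1(distance_coast_km, elevation_m, latitude, longitude, rural):
    """Predicted outdoor PCA Axis 1 score for one home."""
    score = INTERCEPT
    score += SLOPES["distance_coast_km"] * distance_coast_km
    score += SLOPES["elevation_m"] * elevation_m
    score += SLOPES["latitude"] * latitude
    score += SLOPES["longitude"] * longitude
    # Binary predictor: same magnitude, opposite sign for the contrasting level.
    score += RURAL_EFFECT if rural else -RURAL_EFFECT
    return score


# Hypothetical mid-continental home: 1000 km inland, 300 m elevation, 40° N, 95° W.
urban = predict_outdoor_axis1(1000, 300, 40.0, -95.0, rural=False)
rural = predict_outdoor_axis1(1000, 300, 40.0, -95.0, rural=True)
print(round(urban, 3), round(rural, 3))
```

Because the effect is added for one level and subtracted for the other, the predicted difference between an otherwise identical urban and rural home is twice the tabulated estimate, here 0.44 axis units.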

Fig. 2

Maps of PCA axes for outdoor samples (a, b) and indoor samples (c, d), showing Axis 1 (a, c) and Axis 2 (b, d). Points represent sampling locations

3.2 Geography of environmental plant DNA indoors

OTUs that were abundant outdoors were also abundant indoors. Averaging the relative abundances of each OTU across all homes inside and also averaging them outside, those OTUs that were most abundant outdoors were also more abundant indoors (r = 0.83 for untransformed percentages, r = 0.61 for log-transformed percentages of OTUs that had abundance >0 both indoors and outdoors). In addition, the overall compositions of communities inside and outside were also similar, based on a Mantel test (rM = 0.11; P < 0.001). Despite this broad correspondence, there were some notable differences in the relative abundances of specific OTUs found inside versus outside homes (Fig. 3; Table 7 in “Appendix”). For example, the Pinus OTUs were less abundant inside homes than outside homes (mean relative abundance of 15.6 and 24.8 % inside and outside homes, respectively; P < 0.001). The remainder of the 10 most abundant indoor OTUs was predominately grasses, but also included Solanaceae species such as Solanum, Asteraceae species such as Ambrosia, Fabaceae species such as Glycine, and banana (genus Musa). Grouping all OTUs to shared taxonomic groups, Pinus sequences represented 16 % of the indoor sequences, while Pooid grasses represented 26 % of the indoor sequences.
Fig. 3

Log-transformed average abundance of OTUs indoors versus outdoors. Line indicates reduced major axis (r = 0.61, n = 1041)
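A reduced major axis line treats both variables as measured with error: its slope is the ratio of the two standard deviations, signed by the correlation. The Python sketch below illustrates the calculation on the seven OTUs that appear in both halves of Table 1. It is not a reproduction of Fig. 3, which is based on 1041 OTUs, so the resulting slope and r will differ from the published values.

```python
import math


def reduced_major_axis(x, y):
    """Return (slope, intercept, r) of the reduced major axis of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    # Slope magnitude is sd(y) / sd(x); its sign follows the correlation.
    slope = math.copysign(math.sqrt(syy / sxx), r)
    intercept = mean_y - slope * mean_x
    return slope, intercept, r


# Mean relative abundances (%) from Table 1 for OTUs 1668, 1783, 3, 1479, 6, 5, 1781.
outdoor = [21.35, 5.72, 4.41, 3.64, 3.36, 2.02, 1.89]
indoor = [13.47, 7.26, 12.06, 4.21, 2.05, 2.28, 4.39]

log_outdoor = [math.log10(v) for v in outdoor]
log_indoor = [math.log10(v) for v in indoor]
slope, intercept, r = reduced_major_axis(log_outdoor, log_indoor)
print(slope, intercept, r)
```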

Those OTUs that were consistently more abundant in indoor samples than in outdoor samples were generally food plants or plants associated with lawns (Table 5). Among the top 100 most abundant OTUs, key food species that uniquely identified indoor environments included banana, onion, walnut, spinach, rice, asparagus, chili pepper, cucumber, corn, and buckwheat. In contrast, most of those OTUs that were significantly less abundant in indoor samples were generally trees such as Pinus, Ulmus, Cedrus, Juniperus, and Tsuga or moss and lycophyte species such as Sphagnum, Dicranum, Huperzia, or Lycopodium (Table 5).
Table 5

OTUs that significantly indicate outdoor and indoor samples

OTU/OTU group    Indicator value    P         Representative genus
Outdoor
 OTU 11          0.360              <0.001    Ulmus
 OTU 23          0.597              <0.001    Huperzia
 C               0.418              <0.001    Cedrus
 H               0.188              <0.001    Juniperus
 I               0.524              <0.001    Lycopodium
 J               0.134              <0.001    Pinus
 OTU 18          0.048              <0.001    Tsuga
 OTU 57          0.121              <0.001    Dicranum
 OTU 26          0.064              <0.001    Sphagnum
 OTU 498         0.203              <0.001    Lycopodium
 OTU 19          0.099              0.002     Picea
 OTU 125         0.360              0.003     Magnolia
Indoor
 OTU 9           0.484              <0.001    Ambrosia
 OTU 12          0.257              <0.001    Musa
 OTU 16          0.455              <0.001    Ginkgo
 OTU 28          0.278              <0.001    Juglans
 OTU 37          0.195              <0.001    Spinacia
 OTU 38          0.341              <0.001    Trifolium
 OTU 45          0.192              <0.001    Oryza
 OTU 48          0.121              <0.001    Convallaria
 OTU 51          0.055              <0.001    Fagopyrum
 OTU 62          0.128              <0.001    Asparagus
 OTU 790         0.487              <0.001    Draba
 OTU 1213        0.301              <0.001    Unknown
 OTU 1456        0.327              <0.001    Capsicum
 A               0.500              <0.001    Allium
 D               0.500              <0.001    Cucurbita
 F               0.517              <0.001    Elymus
 G               0.563              <0.001    Festuca
 K               0.329              <0.001    Salvia
 L               0.036              <0.001    Solanum
 M               0.546              <0.001    Zea

Shown are indicator values and P values for indicator value test
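The indicator value of Dufrêne and Legendre (1997) is the product of two terms: specificity (the share of a taxon's mean abundance that falls in one group) and fidelity (the fraction of samples in that group where the taxon occurs). The following Python sketch computes the value on a 0 to 1 scale for hypothetical abundances. It omits the permutation test that yields the P values in Table 5 and is not the labdsv implementation.

```python
def indicator_value(abundances_by_group, group):
    """Indicator value (0 to 1) of one taxon for one group of samples."""
    means = {g: sum(v) / len(v) for g, v in abundances_by_group.items()}
    total = sum(means.values())
    if total == 0:
        return 0.0  # taxon absent everywhere
    # Specificity: share of the summed group means that belongs to this group.
    specificity = means[group] / total
    # Fidelity: fraction of this group's samples in which the taxon occurs.
    values = abundances_by_group[group]
    fidelity = sum(1 for a in values if a > 0) / len(values)
    return specificity * fidelity


# Hypothetical relative abundances (%) of one OTU in four homes.
otu = {"indoor": [4.0, 0.0, 2.0, 2.0], "outdoor": [1.0, 0.0, 0.0, 1.0]}
print(indicator_value(otu, "indoor"), indicator_value(otu, "outdoor"))
```

In this toy case the OTU scores about 0.6 indoors and 0.1 outdoors. A taxon reaches the maximum of 1 only when it occurs in every sample of one group and in none of the other.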

Geographic patterns in plant distributions identified from indoor samples paralleled those for outdoor samples. As observed for the outdoor samples, factors including climate, elevation, distance from the coast, and the degree of urbanization were significant predictors of the composition of assemblages of the 100 most common OTUs found inside homes across the USA (Table 2). In contrast to the outdoor samples, the presence of dogs and cats in a home influenced plant assemblage similarity in homes (Table 2). The relative explanatory power of the predictor variables examined for indoor samples was even lower than outside samples, with each variable explaining, on average, 1.5 % of the overall assemblage similarity. Distance from the coast explained the greatest proportion of plant assemblage similarity inside homes, though the explanatory power of this factor was less for samples collected inside homes than for those collected outside homes (2.4 vs. 5.4 %, respectively). Like the outdoor samples, plant OTU assemblages of different homes became more similar with decreasing geographic distance, but with slightly less convergence than outdoor samples (Mantel test; rM = 0.11; P < 0.001).

Examining multivariate relationships among OTUs from indoor samples, homes with high grass and moss abundances were more likely to have lower abundances of Pinus (Table 3). Northern homes far from the coast tended to have greater grass and moss abundance than homes near the coast with western mid-continent homes having greater grass and moss abundance than eastern mid-continent homes (P < 0.001 for distance to coast and longitude; Table 4). The presence of dogs increased the relative abundance of grass and moss OTUs inside the home as it was a significant predictor of Axis 1 scores (P = 0.002) as well as Festuca OTU abundance (P < 0.01, Mann–Whitney test). The second multivariate axis similarly identified homes in the Pacific Northwest as having high moss and Thuja abundance, but also higher abundance of DNA from tea (Camellia), onions (Allium), and tomatoes (Solanum). In contrast, homes in the southeastern USA were more likely to have DNA from warm-season grasses such as Digitaria, Pinus, and Stellaria (chickweed) than cooler, western homes.

3.3 Plant taxa with allergenic pollen

Across all homes, OTUs representing allergenic plant taxa were relatively more abundant outside homes than inside homes (13.7 and 8.5 % of sequences outside and inside homes, respectively, P < 0.001, paired Mann–Whitney test). Among all allergenic taxa, Cupressus/Juniperus (4.0 %) and Ulmus (2.2 %) were the most abundant allergenic taxa found outside homes, while Ambrosia (2.0 %) and Morus (1.3 %) were the most abundant taxa identified from inside homes. Allergenic taxa that were more abundant inside were also generally more abundant outside homes (r = 0.91 for log-transformed abundances, P < 0.001). There were some exceptions to this pattern: some allergenic taxa had notably higher relative abundance inside the home than outside, e.g., Ambrosia (2.0 vs. 1.4 %, P = 0.01; Table 6), while others were markedly more abundant outside homes, e.g., Cupressus/Juniperus (4.0 vs. 1.2 %; Table 6).
Table 6

Paired Mann–Whitney tests results of allergenic taxa indoors and outdoors (P values after false discovery rate correction for multiple comparisons)

OTU                                    Indoors (%)    Outdoors (%)    P
Acacia                                 0.22           0.37            0.61
Acer                                   0.10           0.09            <0.001
Alnus + Betula                         0.02           0.02            0.02
Amaranthus + Chenopodium + Atriplex    0.09           0.62            <0.001
Ambrosia                               2.00           1.36            <0.001
Artemisia                              0.00           0.00            0.05
Corylus                                0.00           0.00            0.17
Cupressus + Juniperus                  1.19           3.96            <0.001
Fagus                                  0.00           0.00            0.02
Fraxinus + Ligustrum + Olea            0.46           0.76            0.30
Juglans                                0.84           0.25            <0.001
Liquidambar                            0.50           0.45            0.003
Morus                                  1.29           1.47            <0.001
Myrica                                 0.05           0.09            0.81
Parietaria                             0.03           0.01            0.99
Plantago                               0.29           0.19            <0.001
Platanus                               0.15           0.33            0.28
Populus                                0.01           0.04            0.01
Quercus                                0.26           0.48            0.005
Rumex                                  0.09           0.05            0.03
Salix                                  0.01           0.06            0.06
Taxodium                               0.20           0.78            <0.001
Ulmus                                  0.56           2.19            <0.001
Urtica                                 0.10           0.06            0.04

Shown are average total percentages of OTUs from allergenic taxa indoors and outdoors
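A false discovery rate correction rescales each raw P value according to its rank among all tests. The Python sketch below implements the Benjamini-Hochberg procedure as one common choice; the text does not name the FDR method that was used, so this is an assumption, and the raw P values in the example are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw P values for four taxa.
raw_p = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw_p))
```

With 24 taxa tested, as in Table 6, the smallest raw P value is multiplied by 24 and the largest is left unchanged, with the rest scaled in between.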

There were no strong patterns of coincidence of the major allergenic taxa. Examining pairwise correlations of the 24 allergenic taxa to assess basic coincidence of the abundance of OTUs, only 14 correlations of the possible 552 pairwise correlations were significant indoors, which is the same as was expected by chance at P < 0.05. Among the 25 allergenic taxa examined outdoors, 25 correlations were significant at P < 0.05 and they were all positive. The strongest were between Parietaria and Corylus (r = 0.43, P < 0.001) and between Parietaria and Morus (r = 0.38, P < 0.001).
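The chance expectation quoted above can be checked by simple counting. The sketch below assumes independent tests, each with a 5 % false-positive rate. On that reading, the figure of 552 appears to count ordered pairs of the 24 taxa, while the roughly 14 significant correlations expected by chance correspond to the 276 unordered pairs.

```python
from math import comb

n_taxa = 24
alpha = 0.05

unordered_pairs = comb(n_taxa, 2)      # each pair of taxa counted once
ordered_pairs = n_taxa * (n_taxa - 1)  # each pair counted in both directions

# Expected number of significant results if no true correlations exist.
expected_by_chance = alpha * unordered_pairs

print(unordered_pairs, ordered_pairs, round(expected_by_chance, 1))
```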

The total relative abundance of allergenic OTUs inside or outside homes was not well correlated with the measured environmental variables. There was only a very weak relationship between allergenic OTU assemblage composition and geographic distance (Mantel test; indoors: rM = 0.06, P < 0.001; outdoors: rM = 0.04; P < 0.001). Among the best predictors of allergenic OTUs outdoors, houses in warmer climates had higher relative abundances of allergenic plants (r2 = 0.03, P < 0.001).

4 Discussion

By studying tiny samples of dust, we are able to document a measure of the plants and plant material found in and around homes. Both geography and decisions made by owners influence the plant DNA present in homes. DNA sequence-based analyses of outdoor dust revealed predictable geographic patterns in environmental plant DNA across the continental USA. For example, southern, mid-continental homes had a greater proportion of grass and moss OTUs, while northern, coastal homes were more likely to have high abundance of Pinus and Cedrus. In contrast, homes in the Pacific Northwest had higher proportions of certain mosses and tree species such as Thuja typical of the temperate rainforest biome where most of the sampled homes were located. The general patterns observed for dust collected outside homes are similar to biogeographic patterns of pollen assemblages collected in pollen traps or in surface lake sediments. For example, McLauchlan et al. (2013) found patterns comparable to those of Fig. 2, with North American mid-continental pollen assemblages also dominated by pollen from Poaceae, Ambrosia, and Chenopodiaceae. Likewise, the outdoor dust pollen signatures from homes in the Pacific Northwest are similar to modern pollen assemblages in the region (Dunwiddie 1987; Gavin et al. 2005; Heusser 1978). In short, our approach quantified geographic distributions of environmental plant DNA like those quantified with morphology-based identification of pollen grains by pollen experts. Although only a subset of all potential covariates regarding the behavior of occupants could be examined, it appears that choosing to have dogs and bringing certain foods into the house affect the plant DNA found in homes. What other behaviors affect the abundance of environmental plant DNA in homes remains to be investigated.

In addition to showing us plant taxa we might expect based on pollen data, we found plant DNA in the collected dust samples from a number of taxa that do not have anemophilous (wind-dispersed) pollen. Although some of this DNA might be from intact pollen grains, we hypothesize that much of it comes from aerosolized plant material, whether pollen or other plant tissues. For example, approximately 3 % of the outdoor sequences were from Glycine (soybean) and similar Fabaceae species, which are exclusively pollinated by insects and have almost no atmospheric dispersal (Yoshimura 2011). Yet, somehow, large quantities of their DNA are being deposited on outdoor surfaces. This DNA could be from nearby agricultural fields processing soybean, which would aerosolize its DNA, from aerosolized food products such as soy meal or cooking oil, or even from insect frass. We also recovered DNA in outdoor dust samples from plants that do not grow in the region and from insect-pollinated plants (e.g., Solanum) that are unlikely to be producing aerosolized pollen. More testing will be necessary to better understand the source of this plant DNA, but for now, we can conclude that although the majority of plant DNA on the sampled surfaces is likely coming from pollen, there are other routes by which plant DNA can be deposited in settled dust inside or outside homes.

The composition of plant taxa found inside homes generally mirrors that found outdoors. Indoor and outdoor samples recovered similar geographic patterns in plant composition. In contrast to outdoor dust samples, indoor surfaces had a high proportion of DNA from food species. Almost all of the plant taxa that were found to be significantly more abundant inside than outside homes are important as human food items (Table 5). These results highlight that plant DNA analysis from settled dust samples could be used to reconstruct culinary patterns in homes across the US.

The abundance of food DNA in homes could be coming from deposited insect frass or, more likely, from the aerosolization of DNA during food preparation or consumption. If the DNA of food plants in homes is associated with aerosolization of food particles, those same food particles could serve as allergens that are invisible to traditional pollen-based studies of allergens. For example, inhalation of wheat products can cause asthma in those allergic to wheat (Salvatori et al. 2008), and inhalation of peanut and/or tree nut allergens can cause allergic reactions (Sicherer et al. 1999). Detecting DNA from plants such as wheat or peanut in dust could help improve our understanding of the sources and distributions of indoor allergens.

In general, we found minimal continental-scale geographic pattern to the distribution of plant species with allergenic pollen inside homes. However, since our work only examined relative abundances of these species, absolute abundances of species with allergenic pollen may still vary geographically. Although determining absolute abundances of allergens is difficult, doing so will be critical in determining allergen exposures within homes and ultimately linking dust-based surveys of plant allergens to human health outcomes. Although we found that dogs appear to elevate the relative amounts of grass DNA found inside homes, these data do not provide specific guidance on how to reduce pollen influx into homes nor how to manage home environments to reduce the abundance of pollen (or DNA) from allergenic taxa.
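The caveat about relative versus absolute abundances can be made concrete with a small Python sketch. The two homes and their DNA copy numbers below are entirely hypothetical and are not data from this study. The function name is likewise an assumption made for illustration.

```python
def relative_abundance(counts):
    """Convert per-taxon counts into proportions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot compute relative abundance of an empty sample")
    return {taxon: n / total for taxon, n in counts.items()}


# Hypothetical absolute DNA copy numbers in dust from two homes.
home_a = {"allergenic grass": 80, "pine": 20}
home_b = {"allergenic grass": 8000, "pine": 2000}

# Both homes have the same proportion of allergenic grass DNA...
print(relative_abundance(home_a)["allergenic grass"])
print(relative_abundance(home_b)["allergenic grass"])

# ...even though home B holds 100 times more of it in absolute terms.
print(home_b["allergenic grass"] / home_a["allergenic grass"])
```

Because proportions discard the total amount of DNA in a sample, two homes with identical relative abundances could still differ greatly in actual allergen load, which is why absolute measurements would be needed to estimate exposure.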

5 Conclusions

This research demonstrates how the relatively simple measurement of plant DNA in home environments can be used to assess potential exposures to plant allergens. Given the consistent differences in the relative abundances of different plant taxa inside and outside homes, it cannot be assumed that the allergens we are exposed to outside our homes are similar to those found inside our homes. Additionally, these findings suggest that plant DNA barcoding approaches could be superior to direct pollen counting for allergen detection because they are likely to detect aerosolized allergenic plant tissues in both outdoor and indoor samples. Furthermore, the high richness of plant taxa in dust and the local to regional patterns observed in the samples suggest that quantification of plant DNA in household dust could improve forensic models, but more geographic sampling is necessary to examine these patterns. Still, plant DNA signatures could provide a geographic fingerprint that could be combined with other data such as fungal data to better identify the geographic origin of dust found on objects (Grantham et al. 2015).

Acknowledgments

We thank the volunteers who participated in the Wild Life of Our Homes project for collecting dust samples. Funding for the sample collection was provided by a grant from the A. P. Sloan Microbiology of the Built Environment Program (to NF and RRD).

Copyright information

© Springer Science+Business Media Dordrecht 2016

Authors and Affiliations

  • Joseph M. Craine (1)
  • Albert Barberán (2)
  • Ryan C. Lynch (3)
  • Holly L. Menninger (4)
  • Robert R. Dunn (5)
  • Noah Fierer (2, 6)

  1. Jonah Ventures, Manhattan, USA
  2. Cooperative Institute for Research in Environmental Sciences, University of Colorado, Boulder, USA
  3. Medicinal Genomics, Woburn, USA
  4. Department of Biological Sciences, North Carolina State University, Raleigh, USA
  5. Department of Applied Ecology, North Carolina State University, Raleigh, USA
  6. Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, USA
