Introduction

DNA banking is the long-term storage and preservation of genetic material from an individual for future clinical or research testing. The service is typically voluntary and performed for a fee at the request of a consenting individual or their legal representative(s). Unlike in clinical and research biorepositories, the individual or their legal representative controls how, when, and why the genetic material is used. By comparison, biobanks for human subjects accept, process, store, and distribute de-identified biological specimens and associated data for use in research and clinical settings with appropriate consent and oversight (De Souza and Greenspan. 2013). Typically, a biobank participant may request specimen and record destruction, but cannot withdraw the information for personal use and will not receive individual results generated by research. Draft societal guidelines for DNA banking were first published in 1989 in response to emerging clinical molecular test availability for heritable disorders (Yates et al. 1989). At a time when molecular testing was in its infancy, preservation of genetic material was advocated in the hope that testing would eventually become available for molecular confirmation of a medical diagnosis, presymptomatic diagnosis, carrier detection, and prenatal diagnosis.

DNA banking may be performed at any point in one’s life, but is frequently utilized in the perimortem and postmortem periods, especially in the context of a personal or family history of a heritable condition (Quillin et al. 2010, 2018; Cléophat et al. 2020). Banking may be the last chance to retain a DNA sample for future interpretation when current testing is unclear, not readily available or accessible, or not desired at the time. For severe adult-onset neurological disorders, some individuals may not want to know their molecular status, but choose to bequeath testing rights to extended family members (Smith et al. 2014). In cases where genetic testing is unavailable or inconclusive, DNA banking ensures that sufficient relevant genetic material is available for testing as technologies advance and become more affordable and accessible (Overwater et al. 2014).

For certain high-risk professions, DNA banking may be a requirement of employment. For instance, the US military maintains a biobank of DNA reference specimens, the Armed Forces Repository of Specimen Samples for the Identification of Remains (AFRSSIR), for all active duty and reserve service members (Mehlman and Li. 2014; De Castro et al. 2016). Every member of the military is required to submit a blood sample upon enlistment, which can be retained for up to 50 years. Samples may be destroyed at the request of the depositor after the conclusion of military service (De Castro et al. 2016).

As with all biobanks for human subjects, there are potential ethical and legal considerations surrounding the use of commercial DNA banks (Coppola et al. 2019). These include informed consent, specimen ownership, specimen and data storage, duration of storage, and ensuring dignity, privacy, and confidentiality. DNA banks are narrower in scope than traditional biobanks, as the individual or their legal representative has complete ownership of the biological specimen. After providing appropriate informed consent, the owner may request that the specimen be withdrawn, transferred, tested, or destroyed at any time. It is important for DNA banks to consider specimen ownership in a broader context, as specimens may be received from deceased individuals (postmortem) or from individuals whose life expectancy or capacity to provide informed consent is limited. In such situations, ownership of the specimen may be assigned to a designated legal representative.

Although the utility of DNA banking has been described in the military setting, the palliative care setting, isolated cases of rare disease, and professional societal guidelines, to date no studies have examined its utilization in general populations. Here, we describe a cohort that has contributed to a single commercial DNA banking service over a 22-year period and explore general socioeconomic trends in utilization.

Materials and methods

This cross-sectional, descriptive study included individuals who utilized a single commercial DNA banking service (PreventionGenetics, Marshfield, WI, USA) between January 1997 and December 2019. The DNA banking service required an individual or their legal representative to complete a DNA banking requisition form, pay a service fee, and arrange specimen collection and transport to the facility for secure DNA storage over a period of at least 50 years from the date of deposit. All data were stored in a secure database, which was accessed in accordance with Institutional Review Board approval for this study. All individuals or their legal representatives provided consent for DNA banking. No information was obtained from the individual or provider for research purposes. All participant-related data were de-identified prior to commencement of this study and remained de-identified for analysis. Cohort demographics and clinical information were summarized using descriptive statistics based on participant- or representative-completed requisition forms. This included information on year of deposit, age at deposit, sex, specimen type, source of referral for the service, and number of withdrawals. For participants residing in the US at the time of banking, location was limited to their state and zip code. For participants residing in a US territory or other country at the time of banking, location was limited to the country of origin. All analyses were conducted using R version 3.5.3. Graphics were created using the R packages usmap and ggplot2.

To broadly compare the socioeconomic characteristics of the study population to the overall adult US population, 5-digit zip code–level US Census data for economic status, race, ethnicity, and educational level were estimated for each study participant. Estimates for the DNA banking cohort were compared to the overall national metrics. Participants from US territories or non-US countries were excluded from this analysis.
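The zip code–level linkage described above can be illustrated with a minimal sketch. It is written in Python purely for illustration (the study itself used R), and the table `acs_median_income`, the list `participant_zips`, and every value in them are hypothetical placeholders rather than study or Census data. The sketch assigns each depositor the ACS value of their zip code and then summarizes those values across the cohort.

```python
from statistics import mean, median, stdev

# Hypothetical zip-code-level ACS values (placeholders, not study data).
acs_median_income = {
    "54449": 52000.0,
    "90210": 150000.0,
    "55401": 68000.0,
}

# Hypothetical de-identified records: one reported zip code per depositor.
participant_zips = ["54449", "54449", "90210", "55401", "00000"]


def summarize_by_zip(zips, zip_metric):
    """Assign each participant the metric of their zip code, then summarize.

    Participants whose zip code has no entry in the lookup table are skipped,
    analogous to excluding records without a valid US zip code.
    """
    values = [zip_metric[z] for z in zips if z in zip_metric]
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),
    }


summary = summarize_by_zip(participant_zips, acs_median_income)
print(summary)
```

Because every depositor in a zip code receives the same area-level value, the resulting summary describes the neighborhoods in which depositors live, not the depositors themselves.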

The analysis was limited to the 2016 American Community Survey (ACS) 5-year estimates, which represent 60 months of data collected between 2012 and 2016 (data.census.gov, accessed July 2020). ACS 5-year datasets are released annually, have large sample sizes, and provide data for all areas of the US regardless of population size (census.gov, accessed July 2020). During this ACS 5-year interval, 50.19% (n = 2,378/4,738) of the DNA banking cohort with valid US zip codes banked DNA. A total of 23.17% (n = 1,098) of samples with valid US zip codes were banked from 1997 to 2011, and 26.64% (n = 1,262) were banked between 2017 and 2019. The 2012 to 2016 interval therefore represents the midpoint for service volumes.
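The period shares quoted above follow directly from the reported counts. The short Python sketch below (illustrative only; the variable names are not from the study) recomputes them and confirms that the three periods together account for all 4,738 samples with valid US zip codes.

```python
# Counts of samples with valid US zip codes, as reported in the text.
counts = {"1997-2011": 1098, "2012-2016": 2378, "2017-2019": 1262}

total = sum(counts.values())

# Share of each deposit period, as a percentage rounded to two decimals.
shares = {period: round(100 * n / total, 2) for period, n in counts.items()}

print(total)
print(shares)
```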

Results

Cohort demographics

Between 1997 and 2019, DNA banking was performed for 4,874 individuals. Overall, 55.58% (n = 2,709) of individuals identified as female, 41.18% (n = 2,007) identified as male, and 3.24% (n = 158) did not disclose sex. Deposits for fetal-derived samples comprised 6.50% of the cohort (n = 317). Of these samples, 22.08% (n = 70) were reported as female and 29.34% (n = 93) were reported as male. However, for the largest share of fetal-derived samples (48.58%; n = 154), sex was not specified. Excluding known fetal-derived samples, 9.90% (n = 451/4,557) of individuals were confirmed as deceased at the time of deposit.

Samples originated from 31 countries across 6 continents (Fig. 1). However, the majority of samples originated from the US (97.37%; n = 4,746). Samples were received from all 50 US states. Wisconsin (14.38%; n = 701), California (11.49%; n = 560), and Minnesota (7.22%; n = 352) contributed the most deposits (Fig. 1).

Fig. 1

Geographical distribution of DNA banking samples. a DNA banking samples have been received from 31 countries. b A heat map of sample origin within the US. The darker the state color, the larger the total number of samples received from that state. A total of 4,746 (97.37%) samples originated from the US (zip codes were available for 4,738 of these samples). Samples were received from all 50 states (range 0.04 to 14.38%). Of the non-US samples, 81 (1.66%) were of international origin and 47 (0.96%) did not specify origin

The vast majority of participants submitted a single specimen type for DNA banking (96.68%; n = 4,712; Fig. 2a). However, 3.32% (n = 162) of participants submitted two or more specimen types. A total of 5,043 specimens were submitted for DNA banking. Whole blood was the most frequently submitted specimen type (83.11%; n = 4,191). The second most frequent specimen type was extracted DNA (6.62%; n = 334), followed by biopsied tissue (4.38%; n = 221), cultured cells (3.63%; n = 183), and saliva or buccal swabs (2.20%; n = 111). Biopsied tissue sources were diverse and included liver, cardiac tissue, skin, skeletal muscle, brain, placenta, and products of conception. A whole blood specimen was provided for every DNA banking participant (sometimes with additional specimen types) between 1997 and 2007 (Fig. 2b). However, the primary specimen type submitted increased in diversity between 2008 and 2019 (Fig. 2b).

Fig. 2

Biological specimen source. a A total of 5,043 specimen types were provided for 4,874 individuals. b Whole blood was the most common specimen type provided for DNA banking (83.72%; n = 4,222/5,043). From 1997 to 2007, a whole blood specimen was provided for every DNA banking request, sometimes with additional specimen types. Between 2008 and 2019, whole blood submissions varied from 72.85 to 98.54% per year

The median age of the cohort was 59.33 years (n = 4,523; range = 0–106.47 years). However, the median age for females (61.74 years; range = 0–104.11) was higher than that for males (54.53 years; range = 0–106.47). The cohort age had a bimodal distribution, peaking at 0 to 5 years of age and again at 60 to 70 years of age (Fig. 3). Males were the majority for the fetal, 0 to 20, and 20 to 25 years of age groupings. Females were the majority for the 20 to 25, and 30 to 100 years of age groupings.
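The grouping of ages into 5-year intervals by sex can be sketched as follows. This is a minimal Python illustration, not the authors' R code: the helper `age_bin` and the example records are hypothetical, and fetal-derived samples, which have no age at deposit, would need a separate category.

```python
from collections import Counter


def age_bin(age_years, width=5):
    """Return the label of the half-open interval [lower, lower + width)."""
    lower = int(age_years // width) * width
    return f"{lower}-{lower + width}"


# Hypothetical (age at deposit in years, reported sex) records, not study data.
records = [(0.5, "M"), (3.2, "F"), (61.7, "F"), (64.0, "F"), (54.5, "M")]

# Count depositors per (age interval, sex) pair.
counts_by_bin = Counter((age_bin(age), sex) for age, sex in records)

for key in sorted(counts_by_bin):
    print(key, counts_by_bin[key])
```

Half-open intervals ensure that an age falling exactly on a boundary, such as 20.0 years, is counted in only one interval.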

Fig. 3

The study population distribution by age (5-year intervals) and sex

Sample deposits peaked in 2015 with 559 deposits, after 10 years of steady growth (Fig. 4a). Since 2015, deposits have ranged from 421 to 527 samples per year (Fig. 4a). During the first decade of the banking service, samples were predominantly received from females. In the most recent decade of the service, the numbers of male and fetal-derived samples have increased.

Fig. 4

DNA banking deposits and withdrawals from 1997 to 2019. a The number of participants depositing specimens per year by sex. b The number of DNA withdrawals per year by sex

During 2013, the DNA banking requisition form was updated to include questions regarding the source of referral for this service. As a result, source of referral is only available for 67.71% (n = 3,300) of depositors (Table 1). Clinical genetic counselors were the most frequent source of referral (41.73%; n = 2,034). Other health care providers accounted for 8.86% (n = 432) of referrals. Referral by a relative or friend occurred for 7.16% (n = 349) of cases. Organizations such as the Sudden Arrhythmia Death Syndromes Foundation and Sudden Unexplained Death in Childhood Foundation accounted for 4.72% (n = 230) of referrals. The "other" category includes diverse sources such as autopsy facilities, funeral homes, conferences, health fairs, and the commercial laboratory’s website, and accounts for 3.75% (n = 183) of referrals for the service.

Table 1 Referral

One or more DNA withdrawals for clinical testing, research testing, or another purpose have been made for 9.93% (n = 484) of the cohort (Fig. 4b). Of these, 90.50% (n = 438) had a single withdrawal, 8.06% (n = 39) had 2 withdrawals, and 1.45% (n = 7) had 3 withdrawals. Currently, samples banked in 2017 have the highest volume of withdrawals (n = 83).

Socioeconomic trends in utilization

A valid 5-digit US zip code was available for 97.21% (n = 4,738) of depositors. Socioeconomic features were estimated for each of these individuals based on corresponding 2016 ACS 5-year estimate values for their reported 5-digit US zip code. These values were summarized across the DNA banking cohort and compared to national metrics (Table 2).

Table 2 Zip code–based socioeconomic characteristics based on American Community Survey data. Approximate income, racial/ethnic background, and other socioeconomic features among the study population, using the zip code–based metrics reported in the American Community Survey (National Census). The study population’s mean, median, and standard deviation are presented, along with the national metrics reported in Census.gov reports for 2016

Overall, individuals utilizing this DNA banking service are estimated to reside in zip codes with higher incomes and higher levels of educational attainment than the national metrics. These individuals are also estimated to reside in zip codes with lower rates of individuals below the poverty line, lower usage of food stamps or the Supplemental Nutrition Assistance Program (SNAP), and less reliance on Medicaid or other means-tested public coverage than the national metrics. In terms of race and ethnicity, individuals utilizing the DNA banking service are estimated to reside in zip codes with a larger percentage of their population identifying as white, and a lower percentage identifying as African American or Hispanic or Latino, compared to the national metrics.

As the commercial laboratory offering the DNA banking service is based in the state of Wisconsin and markets locally, the socioeconomic trend analysis was repeated excluding samples originating from this state (n = 701). Trends in income, socioeconomic features, Hispanic or Latino ancestry, and general race and ethnicity were similar between the entire DNA banking cohort and the sub-cohort with the state of Wisconsin excluded (Supplementary Table 1).

Discussion

Historically, a human biobank or biorepository describes a professional collection of preserved, anonymized biological specimens from consented individuals, which may be linked to medical and epidemiological data (Henderson et al. 2013; Paskal et al. 2018). Once the specimen and associated data are collected and stored, the donor typically has limited control over access and utilization, but may request destruction. Commercial DNA banks are a distinct and non-traditional type of biorepository that provides this service directly to the consumer. This study investigates a single commercial DNA banking service that has acquired samples for 4,874 individuals over a 22-year period.

Overall, participation rates were higher for female individuals, particularly as age increased. This may reflect current societal trends in which females are the primary health care decision makers for their families in the US (Matoff-Stepp et al. 2014). Higher volumes of male samples were observed in age ranges corresponding to prenatal, neonatal, infancy, and early childhood periods. This is consistent with trends in pediatric mortality in the US and may partially reflect male susceptibility to X-linked recessive disorders (Balsara et al. 2013). The number of samples deposited for fetal-derived specimens is not surprising given current estimates of monogenic disease and fetal loss (Smith et al. 2015). For individuals referred to the service by health care providers, the volume of fetal and postmortem specimens in our cohort may also reflect the National Association of Medical Examiners (NAME) recommendation for retaining appropriate postmortem samples for DNA banking and genetic testing (Middleton et al. 2013). Preservation of DNA is particularly important for cases with a suspected genetic etiology, as well as sudden and/or unexplained deaths where a cause is not clear at autopsy. Postmortem genetic testing has the potential to identify cause of death and identify relatives at risk for a genetic disorder or sudden death. In cases of recurrent fetal loss, retaining perimortem and postmortem DNA may aid future reproductive planning. Furthermore, in cases where individuals decline traditional autopsy due to personal beliefs, DNA banking may offer an alternate means to elucidate cause of death.

In this study, whole blood was the preferred specimen type submitted for genomic DNA extraction and banking, with the majority of individuals submitting a single specimen type (96.68%) at a single time point. Although genomic DNA is currently the preferred specimen type for most clinical and research-based molecular tests, preserving additional tissue types and biomolecules at multiple time points could enable future identification and interpretation of germline and somatic mosaicism, RNA analysis, and even epigenetic testing (Spinner and Conlin. 2014; Cavalli and Heard. 2019; Marco-Puche et al. 2019). Preservation of additional biomolecules may also enable molecular testing for infectious agents such as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) RNA (Chan et al. 2020). Such approaches have proved invaluable for DNA-based testing of dried blood spots from newborns to confirm suspected congenital cytomegalovirus infection, and may thus also prove useful for RNA virus detection provided RNA is preserved (Koontz et al. 2019).

Clinical genetic testing and DNA banking have existed in the US for over 30 years (Yates et al. 1989; Amos and Patnaik. 2002). Although the DNA bank in this study has operated for over 22 years, substantial growth in the service did not occur until after 2005. This growth likely reflects the rapid expansion of the field of human genomics that was catalyzed by the completion of the Human Genome Project (HGP) (Collins et al. 2003). Since completion of the HGP, clinical genetic testing and direct-to-consumer genetic testing volumes in the US have grown rapidly due to increased access, health spending, marketing, and media coverage, as well as technological advancements and decreasing costs (Amos and Patnaik. 2002; Collins et al. 2003; Lynch et al. 2011; Phillips et al. 2018).

The increase in DNA banking sample volume over time may also reflect the growth of the genetic counseling profession (Abacan et al. 2019). Of the individuals in this study who provided a valid source of referral, 62.45% (n = 2,034/3,257) indicated they were referred by a clinical genetic counselor. By comparison, only 13.26% (n = 432/3,257) indicated referral by an alternate health care provider. The low rate of referrals from health care providers other than clinical genetic counselors may reflect a gap in medical practice. Prior studies suggest palliative oncologists do not feel qualified to recommend DNA banking and genetic counseling to patients (Quillin et al. 2011).
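The genetic counselor referral rate appears as 41.73% in the Results and 62.45% here because the two figures use different denominators: the full cohort versus only those depositors who provided a valid source of referral. The Python sketch below (illustrative only; the helper `pct` is not from the study) reproduces both values from the reported counts.

```python
def pct(numerator, denominator):
    """Percentage rounded to two decimals, matching the reporting style."""
    return round(100 * numerator / denominator, 2)


genetic_counselor_referrals = 2034
full_cohort = 4874               # all depositors, 1997 to 2019
with_valid_referral = 3257       # depositors providing a valid referral source

share_of_cohort = pct(genetic_counselor_referrals, full_cohort)
share_of_referred = pct(genetic_counselor_referrals, with_valid_referral)

print(share_of_cohort, share_of_referred)
```

Because referral source was only collected from 2013 onward, the second denominator is the more informative one for comparing referral sources with each other.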

Individuals utilizing DNA banking services were estimated to reside in US zip codes with a higher percentage of non-Hispanic white residents, higher incomes, and fewer indicators of poverty compared to national metrics. These findings are consistent with prior studies, which have reported lower utilization of clinical genetic services in racial and ethnic minorities within the US (Underhill et al. 2016; Lynch et al. 2017; Carroll et al. 2020). Likewise, lack of insurance and/or reliance on Medicaid has been reported to be a major barrier to genetic service access, particularly in racial and ethnic minorities in the US (Lynch et al. 2017; Rajpal et al. 2017). As the out-of-pocket cost of DNA banking is typically lower than that of clinical genetic testing in the US, this service has the potential to be accessible to individuals from a broad range of socioeconomic and demographic backgrounds (Quillin et al. 2011). Although cost is a common barrier to genetic testing, lack of a testing recommendation or referral for genetic services is also cited as a major barrier to access, particularly in minority populations (Muller et al. 2018; Cragun et al. 2019). In addition, lack of knowledge, lack of information, lack of communication, and distrust of the medical system may also limit participation in biobanks and utilization of precision medicine services among individuals from minority populations within the US (Heredia et al. 2017; Rosas et al. 2020). Interestingly, willingness to donate to research biobanks is correlated with increased knowledge and positive opinions of the service, which suggests that public education could be a key initiative for improving commercial DNA bank enrollment (Domaradzki and Pawlikowski. 2019). Collectively, promotion of genetic training programs for health professionals, community outreach programs for the general public, and partial or full coverage of DNA banking services by insurance may increase the accessibility and utilization of DNA banking and other genomic services in the US (Quillin et al. 2011; Senier et al. 2019).

Limitations

This study has several limitations. Firstly, the study is observational and limited to a cohort from a single commercial laboratory. Secondly, acceptable specimen types changed over the 22-year interval. For the first 14 years of the service, the specimen type was almost exclusively restricted to whole blood, although on rare occasions additional specimen types were banked along with whole blood. From 2011 onwards, the lab began formally accepting additional tissue types, as well as extracted DNA. Thirdly, this dataset is known to contain related individuals, and the DNA banking service does not maintain genealogical information for participants. Due to this potential violation of random, independent sampling, the analysis in this study was limited to descriptive statistics. Generalizations from this cohort may also be limited, as there are known socioeconomic biases in access to and knowledge of DNA banking and genetic testing in the US (Quillin et al. 2018; Canedo et al. 2019). Lastly, the information presented in this study is based on zip codes as a proxy for demographic information, which was imputed from ACS estimates for a fixed 5-year interval. Although Census information provides a good overview of community characteristics, it is known to have limitations (McKenney and Bennett. 1994; Valles et al. 2015).

Conclusion

This study is the first to describe a cohort utilizing a direct-to-consumer DNA banking service in the US. The service is being utilized by the general public and clinical genetic counselors, but appears to be under-utilized by other health care providers. This study also suggests that there may be socioeconomic barriers to access that require further investigation.