Saliva-derived DNA is suitable for the detection of clonal haematopoiesis of indeterminate potential

O’Reilly, Robert L.; Burke, Jared; Harraka, Philip; Yeh, Paul; Howlett, Kerryn; Behrouzfar, Kiarash; Rewse, Amanda; Tsimiklis, Helen; Giles, Graham G.; Bubb, Kristen J.; Nicholls, Stephen J.; Milne, Roger L.; Southey, Melissa C.

doi:10.1038/s41598-024-69398-0

Article
Open access
Published: 14 August 2024

Volume 14, article number 18917, (2024)

Scientific Reports

Robert L. O’Reilly^1,6,
Jared Burke¹,
Philip Harraka^1,6,
Paul Yeh^2,3,
Kerryn Howlett^1,3,
Kiarash Behrouzfar³,
Amanda Rewse¹,
Helen Tsimiklis¹,
Graham G. Giles^1,4,5,
Kristen J. Bubb^6,9,
Stephen J. Nicholls^6,7,
Roger L. Milne^1,4,5 &
Melissa C. Southey^1,4,6,8

Abstract

Clonal haematopoiesis of indeterminate potential (CHIP) has been associated with many adverse health outcomes. However, further research is required to understand the critical genes and pathways relevant to CHIP subtypes, evaluate how CHIP clones evolve with time, and further advance functional characterisation and therapeutic studies. Large epidemiological studies are well placed to address these questions but often collect saliva rather than blood from participants. Paired saliva- and blood-derived DNA samples from 94 study participants were sequenced using a targeted CHIP-gene panel. The ten genes most frequently identified to carry CHIP-associated variants were analysed. Fourteen unique variants associated with CHIP, ten in DNMT3A, two in TP53 and two in TET2, were identified with a variant allele fraction (VAF) between 0.02 and 0.2 and variant depth ≥ 5 reads. Eleven of these CHIP-associated variants were detected in both the blood- and saliva-derived DNA samples. Three variants were detected in blood with a VAF > 0.02 but fell below this threshold in the paired saliva sample (VAF 0.008–0.013). Saliva-derived DNA is suitable for detecting CHIP-associated variants. Saliva can offer a cost-effective biospecimen that could both advance CHIP research and facilitate clinical translation into settings such as risk prediction, precision prevention, and treatment monitoring.

Introduction

Age-related clonal haematopoiesis of indeterminate potential (CHIP), usually observed as somatic mosaicism in blood-derived DNA, has been associated with many adverse health outcomes including haematological conditions, cardiovascular disease (CVD), and all-cause mortality¹. CHIP is characterised as haematopoietic cells of peripheral blood with at least one driver mutation, and without haematological malignancy or detectable morphological evidence of dysplasia^2,3. Haematopoietic stem cells and progenitor cells with mutations that confer a fitness advantage will proliferate in clonal expansion, and the accumulation of these mutations can result in disease^1,2.

Research deciphering the molecular and associated clinical features of CHIP has gained considerable momentum via the analysis of large human data sets available from research initiatives such as the UK Biobank and All of Us⁴. These studies have refined both our understanding of CHIP and the bioinformatic approaches required to identify CHIP in a range of genomic datasets including whole genome, whole exome, and targeted gene panel sequencing data. They have also revealed CHIP to have diverse molecular phenotypes (somatic mutation-driven subtypes) that are associated with a spectrum of germline genetic causes and clinical features⁵.

Recently, population-scale genomic datasets have enabled further interrogation of the complexities of CHIP and the identification of important differential associations between disease susceptibility and the clone-specific gene mutation. For instance, DNMT3A mutations are not associated with CVD but have been shown to be associated with an increased risk of solid tumours. Kessler et al. further described common genetic variation associated with CHIP⁵. For example, common germline variants at the CD164 gene region were associated with decreased risk of DNMT3A CHIP, whereas germline variants in TCL1A were associated with increased risk of DNMT3A CHIP.

More research is required to understand the critical genes and pathways relevant to each CHIP subtype, evaluate how CHIP clones change with time, and further advance functional and therapeutic studies. Population-scale genomic studies rarely involve serial blood sampling of participants and are thus not well placed to address some of these emerging questions in CHIP research. In contrast, large-scale epidemiological studies of human health often take serial biological samples from participants over long periods of time (often decades). These studies can therefore be well positioned to address some of these gaps in CHIP knowledge.

In this context, saliva is often collected as a source of germline DNA from research participants because it can be collected non-invasively at home and shipped at room temperature at lower cost with no time sensitivities for downstream biobanking (e.g., processing and freezing). Several pieces of evidence suggest that DNA extracted from saliva may be a suitable template for CHIP analysis. First, white blood cells are known to cross the mucosal barrier and have been suggested to make up approximately 75% of the nucleated cells in a saliva specimen⁶. Second, DNA derived from mouthwashes after allogeneic blood stem cell transplantation has been shown to display a chimeric or complete donor genotype, supporting a considerable blood-DNA contribution^6,7. Third, saliva-derived DNA has been successfully used in targeted gene panel sequencing. Fourth, Soyfer et al. (2024) assessed saliva for haematopoietic cells and were able to successfully quantify somatic variants in families with myeloproliferative neoplasm⁸. However, there are likely to be considerable saliva-specific technical and bioinformatic challenges to overcome in differentiating germline and CHIP-associated genetic variation, especially if the CHIP-associated variant allele fraction (VAF) is reduced because the contribution of blood-cell nuclei to the DNA yield of a saliva sample is not high. If it can be demonstrated to be a suitable template for CHIP analysis, saliva-derived DNA offers a cost-effective, practical alternative biospecimen that could be utilised to both advance research and be a companion to clinical translation into settings such as risk prediction, precision prevention, and treatment monitoring.

This study sought to assess the suitability of saliva-derived DNA for the detection of CHIP-associated variants using a custom targeted gene panel (focusing on the 10 genes most frequently detected to carry CHIP-associated variants), a massively parallel sequencing approach, and saliva- and blood-derived DNA samples from 94 cohort study participants.

Results

Library preparation and sequencing

Paired blood and saliva samples were obtained from 94 healthy participants of the Australian Breakthrough Cancer cohort (Table 1) and DNA was extracted from all samples. A total of 192 samples successfully underwent library preparation. This included 188 test samples (94 blood-derived DNA and 94 saliva-derived DNA pairs), two commercial controls, and two in-house high molecular weight (HMW) controls. Quality metrics of all sequenced samples showed a median read duplication rate of 54.2% and, following deduplication, a median off-target base rate of 20.8%. Of the 188 test samples, 33 samples (17.6%) did not reach ≥ 80% target coverage at 500 × depth; 32 of these 33 samples were saliva-derived DNA, with one blood-derived DNA sample (Table 2). Nine of 188 test samples (5%) did not reach > 50% target coverage at 500 × depth; 8 of these 9 samples were saliva-derived DNA and 1 was a blood-derived DNA sample (Table 2). These 9 correspond to samples that, following enzymatic fragmentation, had poor pre-capture DNA library profiles (long fragment sizes, a plateau peak and/or low concentrations).
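The coverage criteria above can be illustrated with a minimal Python sketch. It assumes that per-base depths over the target region are already available as a list of integers; the function names, category labels, and example depths are hypothetical and are not taken from the study pipeline.

```python
from typing import Sequence


def fraction_at_depth(depths: Sequence[int], min_depth: int = 500) -> float:
    """Fraction of target bases whose deduplicated depth is >= min_depth."""
    if len(depths) == 0:
        raise ValueError("depths must contain at least one target base")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths)


def coverage_category(fraction: float) -> str:
    """Bin a sample using the two cut-offs reported in the text (80% and 50%)."""
    if fraction >= 0.80:
        return "meets_80_percent"
    if fraction > 0.50:
        return "below_80_percent"
    return "at_or_below_50_percent"


# Hypothetical per-base depths for a tiny four-base target region.
example_depths = [1200, 750, 480, 90]
example_fraction = fraction_at_depth(example_depths)
example_category = coverage_category(example_fraction)
```

In practice the depths would come from a per-base depth report over the panel's target intervals; the sketch only shows how the two thresholds partition samples.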

Table 1 A demographic representation of the 94 participants selected from the Australian Breakthrough Cancer cohort.

Table 2 Sequencing alignment metrics of deduplicated reads for 188 samples and 4 controls.

Controls

Variants that were included in the myeloid control, and in the 10 genes assessed, were called down to a VAF of 0.01 (Supplementary Table 1). Sequencing metrics for both our in-house HMW and commercial controls met the criterion of > 80% target coverage at 500× depth (Table 2).

Variants identified with VAFs between 0.02 and 0.2

In our cohort of healthy participants aged 64–75 years (Table 1), twenty-one variants (VAF 0.02–0.20) were identified in 18 participants. Thirteen were detected in both blood and saliva-derived DNA pairs. Six variants appeared to be present only in blood-derived DNA, within the VAF thresholds, while two were detected only in saliva-derived DNA (Supplementary Table 2). Upon further investigation, five of the six variants found only in the blood-derived DNA were present below the 0.02 threshold in the paired saliva-derived DNA (VAF 0.007–0.019). The two variants observed in one saliva-derived DNA sample were not detected in the paired blood-derived DNA.

Only one artifact was identified (NM_004972.4:c.1777-7del), in 30/188 samples (15.9%): 14 blood-derived and 16 saliva-derived DNA samples (VAF ~ 0.03). This artifact was removed. No artifacts were observed in the manual inspection of CHIP-associated variants in IGV.

Variants associated with CHIP

Fourteen of the twenty-one variants (VAF 0.02–0.20) were found to be associated with CHIP (Table 3). Ten variants were identified in DNMT3A; two variants in TP53; and two variants in TET2. No putative CHIP-associated variants were identified in the other seven genes assessed. Eleven of fourteen (79%) CHIP-associated variants were found in both the blood and saliva-derived DNA pairs when applying the VAF 0.02–0.20 and variant depth (VD) ≥ 5 read thresholds. For a given variant, the VAFs were very similar between the blood and saliva-derived DNA pairs, with a largest difference of ~ 3% (Table 3). Three of the fourteen (21%) CHIP-associated variants were found in only the blood-derived DNA samples using the thresholds of VD ≥ 5 and VAF 0.02–0.20 (Table 3; Fig. 1). However, they were detected in their paired saliva-derived DNA with a VD ≥ 5 and VAFs of 0.008–0.013 (Table 3).

Table 3 Fourteen CHIP-associated genetic variants identified in 94 paired saliva and blood-derived DNA samples. Three variants fell below the VAF 0.02 threshold as indicated in bold.

Discussion

Our study demonstrates high concordance between CHIP-associated variants called in pairs of DNAs sourced from blood and saliva, illustrating the suitability of saliva-derived DNA for the detection of CHIP.

This study focused on the analysis of 10 genes that have been reported in large studies to be the most frequently involved in CHIP-associated somatic mutation⁴. Vlasschaert et al. examined the distribution of genes carrying CHIP variants in 19,921 individuals and found that these ten genes carried the most CHIP-associated variants. Consistent with this and other literature^4,9,10,11, our small study only identified variants in DNMT3A, TP53, and TET2, with DNMT3A being the most frequently mutated gene.

Prior to this study, there was some evidence to support saliva-derived DNA being a suitable biological resource for detecting somatic mutations in clonal haematopoiesis and other haematologic malignancies. Soyfer et al. recently presented data that examined the feasibility of using DNA prepared from saliva specimens to measure somatic variation at low VAFs (≤ 0.1)⁸. However, challenges were still anticipated relating to the poorer quality of saliva-derived DNA and the proportion of blood cell nuclei represented in the DNA yield. Indeed, eight of the nine DNA samples that did not meet the quality metric threshold of 50% coverage at 500× were from saliva and corresponded to pre-capture libraries with poor TapeStation profiles and/or low concentrations after pre-capture PCR. However, the vast majority of saliva-derived DNA samples performed very well and had similar metrics to their paired blood-derived DNA sample.

When considering all variants identified with VAFs between 0.02 and 0.20, six variants were identified in blood-derived DNA, but not in the corresponding saliva-derived DNA pair, for six individuals. Five of these variants were found below the 0.02 threshold in saliva-derived DNA while one variant was not detected in saliva. Three of these five variants were identified as CHIP-associated variants (Table 3). There were two variants detected in saliva that were not detected in the paired blood samples (Supplementary Table 2). Interestingly, these were from the same individual, a woman with a prior history of smoking who had ceased smoking 40 years before providing these samples. It is possible, given their absence in blood, that these two variants were derived from mucosal epithelia⁸. Further development of methodologies aimed at reducing the epithelial content of saliva, such as those described by Soyfer et al. (2024), could help to refine a saliva-based assay for CHIP.

When considering all CHIP-associated variants with VAFs between 0.02 and 0.20, eleven of the fourteen variants were detected in both the blood and saliva-derived DNA pairs with these thresholds. The VAFs of these variants in blood and saliva were similar between pairs and there was no suggestion that the VAF measured in the saliva-derived DNA was consistently reduced compared to blood, which is consistent with the DNA being predominantly from blood cell nuclei. There was no identifiable technical reason why three CHIP-associated variants identified in different saliva-derived DNA samples had lower VAFs (between 0.008 and 0.013). TapeStation profiles were consistent with other well-performing saliva-derived DNA samples, and all three of these saliva-derived DNA samples had at least 50% coverage at 500× (one had as high as 88% target coverage at 500×). The time between sampling of the three saliva and blood sample pairs ranged between 2 months and 34 months. However, given that CHIP progression seems to be ~ 0.5–1.0% per year², it is unlikely that CHIP clones evolved enough between the two sampling time points to account for the observed differences in VAF.

The small number of artifacts found in this study is likely a result of a combination of the small sample size; assessing only ten specific genes, none of which present technical sequencing challenges; and deep sequencing (average 1196x).

This study has a number of strengths. The Horizon myeloid control was diluted with a wildtype reference to provide confidence that variants would be called if present in the samples. All variants that were in this control, and in the 10 genes assessed, were successfully called after applying our pipeline and filtering methods. The participants included in this work were 64–75 years old; given the age-relatedness of CHIP, the prevalence of CHIP-associated variants in this group was anticipated to be ~ 10–15%^11,12, which was consistent with our results. Variants were detected below the VAF threshold of 0.02 in saliva samples, indicating that this method could be applied to variants present below this frequency. There is some evidence that supports clinical relevance for detecting CHIP-associated variants below the standard 2% threshold^13,14,15. A limitation of this study, due to its technical design, is that it does not capture large chromosomal alterations and thus cannot detect mosaic chromosomal alterations.

Conclusion

This study has demonstrated that saliva-derived DNA is a suitable template for CHIP analysis. Saliva-derived DNA offers a cost-effective, practical alternative biospecimen that could be utilised to both advance research and be a companion to clinical translation into settings such as risk prediction, precision prevention and treatment monitoring.

Methods

Ethical statement

The Australian Breakthrough Cancer Study is approved by the Cancer Council Victoria Human Ethics Review Committee (#1403). The conduct of our study is consistent with The National Health and Medical Research Council of Australia’s National Statement on ethical conduct in human research and performed in accordance with the Declaration of Helsinki. Written informed consent was obtained from all participants.

Source material

Paired saliva and blood samples were collected from 94 participants aged 64–75 years at enrolment into the Australian Breakthrough Cancer Study, a prospective cohort of over 56,000 Australians aged 40–74 and unaffected by cancer when recruited in 2014–18. Study participants were provided an at-home saliva collection kit, Oragene OG-500 (DNAGenotek), and returned the sample to Biobanking Victoria via a postal service. Blood samples were collected in EDTA tubes at local pathology services and processed centrally within 72 h of blood draw. Duration between collection of paired saliva and blood samples ranged from 2 to 34 months.

Reference standards were utilised including 100% wildtype (Catalogue ID: HD752) and a myeloid DNA reference standard (Catalogue ID: HD829) (Horizon Discovery, UK) to identify if this platform could detect variants at a VAF of at least 0.01. This control mix was included in each of the two 96 well plates.

DNA extraction

DNA was extracted from paired whole blood and saliva samples using either a Qiagen Symphony or Chemagic™ platform following the manufacturers' protocols (Qiagen, Valencia, CA; PerkinElmer, Waltham, MA, United States).

Sequencing panel design

The panel design consisted of 39 genes and covered 57.111 kbp. This study considered ten specific genes and gene regions (~ 28.805 kbp of the design) that were most likely to contain somatic variants associated with CHIP: DNMT3A, TET2, ASXL1, JAK2, GNB1, PPM1D, TP53, NF1, SRSF2, SF3B1^1,4,9,16.

Library preparation and sequencing

Agilent’s SureSelect XT HS2 DNA System was used on the automated Agilent NGS Workstation Option B (SureSelect; Agilent Technologies, Santa Clara, CA, USA). Input genomic DNA was 200 ng for both blood and saliva-derived DNA samples and 100 ng for the prepared Horizon control. DNA enzymatic fragmentation and library preparation followed the SureSelect protocol with minor modifications, including extension of the fragmentation incubation time from 25 to 30 min to accommodate the target size of 2 × 75 bp. Pre-capture PCR involved 8 cycles with unique dual-indexed primers, and sample libraries were assessed on Agilent’s 4200 TapeStation system using a D1000 ScreenTape. Libraries with poor profiles or low concentrations were noted but not excluded from sequencing, in order to understand the impact that poor libraries had on variant calling between the source materials. A multiplex hybridisation (16x) and capture method for enrichment of targeted genes was applied before sequencing on a NextSeq 550 using Illumina’s high output kit v2.5 (150 cycles), with the aim of reaching 80% coverage of the target region at 500×. Sequencing methods followed Illumina’s NextSeq System: Denature and Dilute Libraries Guide¹⁷.

Bioinformatic pipeline for variant calling

Bioinformatic pipelines (Fig. 1) were written in Nextflow (v23.10.1)¹⁸ (https://github.com/Prec-Med/bldsal-analysis/tree/main) and executed on the Multi-modal Australian ScienceS Imaging and Visualisation Environment (MASSIVE) high performance computing infrastructure established by Monash University and partners¹⁹.

Raw sequence data were converted from BCL files to FASTQ using Illumina’s bcl2fastq (v2.20). SureSelect adapters were trimmed with Agilent’s AGeNT tools v3.0.6 trimmer (Agilent Technologies, Santa Clara, CA, USA), before alignment to human genome reference build GRCh38 using BWA-MEM v0.7.17²⁰. Unique Molecular Index (UMI) deduplication was performed with Agilent’s AGeNT CReaK in hybrid mode (Agilent Technologies, Santa Clara, CA, USA). Metrics for FASTQ and BAM files were generated with FastQC (v0.12.1)²¹ and Genome Analysis Toolkit (GATK v4.4.0.0)²² before aggregation using MultiQC (v1.18)²³.

VarDict-java (v1.8.3)²⁴ was used to call somatic variants because it can call single nucleotide variants, multi-nucleotide variants, insertions/deletions, complex variants, and even structural variants^13,24,25. However, this study focused specifically on insertions/deletions and single nucleotide variants. Variant calling thresholds were set at a VAF ≥ 0.005 before applying secondary thresholds later in the pipeline. Indel normalisation and multiallelic site decomposition, along with general VCF file manipulation, were conducted using bcftools (v1.18)²⁶ before annotating with Ensembl-VEP v111²⁷. Variants were then filtered with slivar (v0.3.0)²⁸ using a threshold requiring a minimum of 5 reads per variant and a VAF between 0.02 and 0.20 (2–20%).
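The secondary filtering thresholds described above can be sketched in plain Python. This is not the slivar expression used in the pipeline; it is a minimal illustration that assumes the alternate-allele read count and total depth are known for each call, and that both VAF bounds are inclusive. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantCall:
    """One variant observation in one sample (hypothetical structure)."""
    variant_id: str
    alt_reads: int    # reads supporting the variant (variant depth, VD)
    total_depth: int  # total reads covering the position

    @property
    def vaf(self) -> float:
        """Variant allele fraction = supporting reads / total reads."""
        if self.total_depth <= 0:
            raise ValueError("total_depth must be positive")
        return self.alt_reads / self.total_depth


def passes_secondary_filter(
    call: VariantCall,
    min_vaf: float = 0.02,
    max_vaf: float = 0.20,
    min_alt_reads: int = 5,
) -> bool:
    """Apply the VD >= 5 and 0.02 <= VAF <= 0.20 thresholds (bounds assumed inclusive)."""
    return call.alt_reads >= min_alt_reads and min_vaf <= call.vaf <= max_vaf


# Hypothetical calls: only the first satisfies both thresholds.
calls = [
    VariantCall("var_a", alt_reads=30, total_depth=1000),   # VAF 0.03
    VariantCall("var_b", alt_reads=4, total_depth=100),     # VD too low
    VariantCall("var_c", alt_reads=10, total_depth=1000),   # VAF 0.01, below range
    VariantCall("var_d", alt_reads=300, total_depth=1000),  # VAF 0.30, above range
]
kept = [c.variant_id for c in calls if passes_secondary_filter(c)]
```

The upper VAF bound removes variants that are likely to be germline, while the lower bound and read-count minimum guard against sequencing noise.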

Agreement between variants called in the paired blood-saliva samples was evaluated using Starfish (https://github.com/dancooke/starfish), which uses the Real Time Genomics (RTG)²⁹ engine for VCF intersections. Blood/saliva VCF pairing, parallel execution of intersections, and aggregation of variant statistics from intersected VCFs (Supplementary Material) were performed in Python using pysam (https://github.com/pysam-developers/pysam)²⁶. Sequence artifacts were identified and removed by applying a threshold of detection in more than 10% of samples; other studies have used similar cut-offs (6%)¹³.
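The recurrence-based artifact threshold can be illustrated as follows. This is a minimal sketch, assuming each sample's calls have already been reduced to a set of variant identifiers; the function name, sample names, and the identifier `hypothetical_variant` are illustrative only, while the counts mirror the single artifact reported in the Results (30 of 188 samples).

```python
from collections import Counter
from typing import Dict, Set


def recurrent_artifacts(
    calls_by_sample: Dict[str, Set[str]],
    max_fraction: float = 0.10,
) -> Set[str]:
    """Return variants detected in more than max_fraction of all samples."""
    n_samples = len(calls_by_sample)
    if n_samples == 0:
        return set()
    # Count, for each variant, the number of samples in which it was called.
    counts = Counter(
        variant for variants in calls_by_sample.values() for variant in variants
    )
    return {v for v, c in counts.items() if c / n_samples > max_fraction}


# 188 hypothetical samples: the recurrent call appears in the first 30,
# and one other variant appears in a single sample.
calls_by_sample: Dict[str, Set[str]] = {f"sample_{i}": set() for i in range(188)}
for i in range(30):
    calls_by_sample[f"sample_{i}"].add("NM_004972.4:c.1777-7del")
calls_by_sample["sample_100"].add("hypothetical_variant")

flagged = recurrent_artifacts(calls_by_sample)
```

A variant seen in 30 of 188 samples exceeds the 10% cut-off and is flagged, whereas a variant seen once is retained for downstream curation.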

Variant filtering and identifying putative CHIP variants

Only variants identified in the genetic regions reported by Vlasschaert et al. were assessed, excluding premature truncating variants 3’ to the last 50 bases of the penultimate exon. This was done to distinguish bona fide CHIP variants from somatic variants that have not previously been associated with clonal expansion of haematopoietic stem cells⁴.

Read alignment and quality for all variants were manually inspected using the Integrative Genomics Viewer (IGV, Broad Institute, MA) to confirm sufficient read depth and allele balance. Variants were also inspected to make sure they were not i) in regions of low genomic complexity (i.e. homopolymer regions), ii) in regions with multiple misaligned reads, iii) in regions with multiple nearby non-reference or poor-quality base calls, or iv) in regions with exon–intron boundary soft clipping. Any variants suspected to be sequencing or mapping artifacts were flagged. Variants that were not identified in both samples of a pair were investigated to determine whether this was because the VAF fell outside the 0.02–0.20 cut-off or because the VD was less than 5.
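The follow-up check on discordant pairs can be sketched as a small classifier. This is an illustrative sketch only: it assumes each observation is available as an (alternate reads, total depth) tuple, or `None` when the variant was not called, and the status labels and function names are hypothetical.

```python
from typing import Dict, Optional, Tuple

Observation = Optional[Tuple[int, int]]  # (alt reads, total depth), or None if not called


def classify_observation(
    obs: Observation,
    min_vaf: float = 0.02,
    max_vaf: float = 0.20,
    min_vd: int = 5,
) -> str:
    """Explain whether, and why, one observation passes the thresholds."""
    if obs is None or obs[0] == 0:
        return "not_detected"
    alt_reads, total_depth = obs
    if alt_reads < min_vd:
        return "vd_below_threshold"
    vaf = alt_reads / total_depth
    if not (min_vaf <= vaf <= max_vaf):
        return "vaf_outside_range"
    return "pass"


def compare_pair(blood: Observation, saliva: Observation) -> Dict[str, object]:
    """Summarise a blood/saliva pair for a single variant."""
    blood_status = classify_observation(blood)
    saliva_status = classify_observation(saliva)
    return {
        "blood": blood_status,
        "saliva": saliva_status,
        "concordant": blood_status == "pass" and saliva_status == "pass",
    }


# Hypothetical pair: VAF 0.03 in blood, 0.01 in saliva with adequate read support.
result = compare_pair(blood=(30, 1000), saliva=(10, 1000))
```

In this hypothetical pair the saliva observation has enough supporting reads but a VAF below 0.02, which is the pattern reported for the three CHIP-associated variants in Table 3.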

Data availability

Data presented in this report can be requested via PEDIGREE. https://www.cancervic.org.au/research/epidemiology/pedigree.

References

1. Asada, S. & Kitamura, T. Clonal hematopoiesis and associated diseases: a review of recent findings. Cancer Sci. 112, 3962–3971. https://doi.org/10.1111/cas.15094 (2021).
2. Steensma, D. P. et al. Clonal hematopoiesis of indeterminate potential and its distinction from myelodysplastic syndromes. Blood. 126, 9–16. https://doi.org/10.1182/blood-2015-03-631747 (2015).
3. Heuser, M., Thol, F. & Ganser, A. Clonal hematopoiesis of indeterminate potential. Dtsch Arztebl Int. 113, 317–322. https://doi.org/10.3238/arztebl.2016.0317 (2016).
4. Vlasschaert, C. et al. A practical approach to curate clonal hematopoiesis of indeterminate potential in human genetic data sets. Blood. 141, 2214–2223. https://doi.org/10.1182/blood.2022018825 (2023).
5. Kessler, M. D. et al. Common and rare variant associations with clonal haematopoiesis phenotypes. Nature. 612, 301–309. https://doi.org/10.1038/s41586-022-05448-9 (2022).
6. Thiede, C., Prange-Krex, G., Freiberg-Richter, J., Bornhäuser, M. & Ehninger, G. Buccal swabs but not mouthwash samples can be used to obtain pretransplant DNA fingerprints from recipients of allogeneic bone marrow transplants. Bone Marrow Transplant. 25, 575–577. https://doi.org/10.1038/sj.bmt.1702170 (2000).
7. Endler, G., Greinix, H., Winkler, K., Mitterbauer, G. & Mannhalter, C. Genetic fingerprinting in mouthwashes of patients after allogeneic bone marrow transplantation. Bone Marrow Transplant. 24, 95–98. https://doi.org/10.1038/sj.bmt.1701815 (1999).
8. Soyfer, E. M. et al. Saliva as a feasible alternative to blood for interrogation of somatic hematopoietic variants. Blood Neoplasia. https://doi.org/10.1016/j.bneo.2024.100012 (2024).
9. Bolton, K. L. et al. Cancer therapy shapes the fitness landscape of clonal hematopoiesis. Nat. Genet. 52, 1219–1226 (2020).
10. Coombs, C. C. et al. Therapy-related clonal hematopoiesis in patients with non-hematologic cancers is common and associated with adverse clinical outcomes. Cell Stem Cell. 21, 374–382. https://doi.org/10.1016/j.stem.2017.07.010 (2017).
11. Jaiswal, S. et al. Age-related clonal hematopoiesis associated with adverse outcomes. New Engl. J. Med. 371, 2488–2498. https://doi.org/10.1056/NEJMoa1408617 (2014).
12. Park, S. J. & Bejar, R. Clonal hematopoiesis in aging. Curr. Stem Cell Rep. 4, 209–219. https://doi.org/10.1007/s40778-018-0133-9 (2018).
13. Chan, I. C. C. et al. ArCH: Improving the performance of clonal hematopoiesis variant calling and interpretation. Bioinformatics. https://doi.org/10.1093/bioinformatics/btae121 (2024).
14. Friedman, D. N. et al. Clonal hematopoiesis in survivors of childhood cancer. Blood Adv. 7, 4102–4106. https://doi.org/10.1182/bloodadvances.2023009817 (2023).
15. Young, A. L., Tong, R. S., Birmann, B. M. & Druley, T. E. Clonal hematopoiesis and risk of acute myeloid leukemia. Haematologica. 104, 2410–2417. https://doi.org/10.3324/haematol.2018.215269 (2019).
16. Marnell, C. S., Bick, A. & Natarajan, P. Clonal hematopoiesis of indeterminate potential (CHIP): Linking somatic mutations, hematopoiesis, chronic inflammation and cardiovascular disease. J. Mol. Cell Cardiol. 161, 98–105. https://doi.org/10.1016/j.yjmcc.2021.07.004 (2021).
17. Illumina. NextSeq system: denature and dilute libraries guide. https://support.illumina.com/content/dam/illumina-support/documents/documentation/system_documentation/nextseq/nextseq-denature-dilute-libraries-guide-15048776-09.pdf (2018).
18. Di Tommaso, P. et al. Nextflow enables reproducible computational workflows. Nat. Biotechnol. 35, 316–319. https://doi.org/10.1038/nbt.3820 (2017).
19. Goscinski, W. J. et al. The multi-modal Australian sciences imaging and visualization environment (MASSIVE) high performance computing infrastructure: Applications in neuroscience and neuroinformatics research. Front. Neuroinformatics. https://doi.org/10.3389/fninf.2014.00030 (2014).
20. Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. ArXiv 1303, (2013).
21. Andrews, S. FastQC a quality control tool for high throughput sequence data. Available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc (2010).
22. O'Connor, B. D. & van der Auwera, G. Genomics in the Cloud: Using Docker, GATK, and WDL in Terra. (O'Reilly Media, Incorporated, 2020).
23. Ewels, P., Magnusson, M., Lundin, S. & Käller, M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 32, 3047–3048. https://doi.org/10.1093/bioinformatics/btw354 (2016).
24. Lai, Z. et al. VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research. Nucleic Acids Res. 44, e108. https://doi.org/10.1093/nar/gkw227 (2016).
25. Soerensen, M. et al. Clonal hematopoiesis and epigenetic age acceleration in elderly Danish twins. HemaSphere. 6, e768. https://doi.org/10.1097/hs9.0000000000000768 (2022).
26. Danecek, P. et al. Twelve years of SAMtools and BCFtools. Gigascience. https://doi.org/10.1093/gigascience/giab008 (2021).
27. McLaren, W. et al. The Ensembl variant effect predictor. Genome Biol. 17, 122. https://doi.org/10.1186/s13059-016-0974-4 (2016).
28. Pedersen, B. S. et al. Effective variant filtering and expected candidate variant yield in studies of rare human disease. NPJ Genom. Med. 6, 60. https://doi.org/10.1038/s41525-021-00227-3 (2021).
29. Cleary, J. G. et al. Comparing variant call files for performance benchmarking of next-generation sequencing variant calling pipelines. bioRxiv. 023754, https://doi.org/10.1101/023754 (2015).

Acknowledgements

This study was made possible by the contribution of many people. In particular, we thank the thousands of participants from across Australia who continue to participate in the study. The ABC Study was funded by Cancer Council Victoria, State Trustees, and a generous gift from the Geary Estate. Funding to collect blood samples was provided by Gandel Philanthropy, the Ian Potter Foundation, the Harry Secomb Foundation and the Percy Baxter Charitable Trust, managed by Perpetual Trustees. Funding to collect faecal samples was provided by Gandel Philanthropy and Perpetual Trustees (Winifred & John Webster Charitable Trust Fund, Pf – Alan (Agl), Shaw Endowment and Broomhead Family Foundation). Cases and their vital status were ascertained through the Victorian Cancer Registry and the Australian Institute of Health and Welfare, including the Australian Cancer Database. This work was also funded by the Australian Medical Research Future Fund (PI Nicholls) and the National Health and Medical Research Council (Investigator Grant GNT2017325; Southey).

Author information

Authors and Affiliations

1. Precision Medicine, School of Clinical Sciences at Monash Health, Monash University, Clayton, VIC, Australia
Robert L. O’Reilly, Jared Burke, Philip Harraka, Kerryn Howlett, Amanda Rewse, Helen Tsimiklis, Graham G. Giles, Roger L. Milne & Melissa C. Southey
2. Monash Haematology, Clayton, VIC, Australia
Paul Yeh
3. Department of Medicine, School of Clinical Sciences at Monash Health, Monash University, Clayton, VIC, Australia
Paul Yeh, Kerryn Howlett & Kiarash Behrouzfar
4. Cancer Epidemiology Division, Cancer Council Victoria, Melbourne, VIC, Australia
Graham G. Giles, Roger L. Milne & Melissa C. Southey
5. Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville, VIC, Australia
Graham G. Giles & Roger L. Milne
6. Victorian Heart Institute, Monash University, Clayton, VIC, Australia
Robert L. O’Reilly, Philip Harraka, Kristen J. Bubb, Stephen J. Nicholls & Melissa C. Southey
7. Victorian Heart Hospital, Clayton, VIC, Australia
Stephen J. Nicholls
8. Department of Clinical Pathology, The University of Melbourne, Parkville, VIC, Australia
Melissa C. Southey
9. Biomedicine Discovery Institute, Monash University, Clayton, VIC, Australia
Kristen J. Bubb

Authors

Robert L. O’Reilly
Jared Burke
Philip Harraka
Paul Yeh
Kerryn Howlett
Kiarash Behrouzfar
Amanda Rewse
Helen Tsimiklis
Graham G. Giles
Kristen J. Bubb
Stephen J. Nicholls
Roger L. Milne
Melissa C. Southey

Contributions

R.L.O., J.B., P.H., M.C.S., P.Y. and K.H. conceptualised and drafted the manuscript; developed the logistics of the study design; and generated, analysed, and interpreted the data. R.L.O. and A.R. conducted the lab work. H.T. managed the biological material and collection. M.C.S., S.J.N., K.J.B., G.G.G. and R.L.M. provided grant funding and conceptualised the study design. All authors contributed substantially to manuscript preparation. All authors approved the final version.

Corresponding author

Correspondence to Melissa C. Southey.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

O’Reilly, R.L., Burke, J., Harraka, P. et al. Saliva-derived DNA is suitable for the detection of clonal haematopoiesis of indeterminate potential. Sci Rep 14, 18917 (2024). https://doi.org/10.1038/s41598-024-69398-0

Received: 21 May 2024
Accepted: 05 August 2024
Published: 14 August 2024
DOI: https://doi.org/10.1038/s41598-024-69398-0
Springer Nature Limited
