Assisting the analysis of insertions and deletions using regional allele frequencies

  • Research
  • Functional & Integrative Genomics

Abstract

Accurate estimation of population allele frequency (AF) is crucial for gene discovery and genetic diagnostics. However, determining AF for frameshift-inducing small insertions and deletions (indels) is challenging because of discrepancies between mapping and variant calling methods. Here, we propose an innovative approach to assess indel AF. We developed CRAFTS-indels (Calculating Regional Allele Frequency Targeting Small indels), an algorithm that combines the AF of distinct indels within a given region and provides a “regional AF” (rAF). We tested and validated CRAFTS-indels using three independent datasets: gnomAD v2 (n=125,748 samples), an internal dataset (IGM; n=39,367), and the UK Biobank (UKBB; n=469,835). By comparing rAF against standard AF (sAF), we labeled rare indels whose rAF exceeds the rarity cutoff (sAF ≤ 10⁻⁴ and rAF > 10⁻⁴) as “rAF-hi” indels. Notably, a high percentage of rare indels were “rAF-hi”, with higher proportions in gnomAD v2 (11-20%) and IGM (11-22%) than in the UKBB (5-9%), depending on the CRAFTS-indels parameters. Analysis of the overlap of regions, grouped by their rAF, with low-complexity regions and with ClinVar classifications supported the pertinence of rAF. Using the internal dataset, we illustrated the utility of CRAFTS-indels in the analysis of de novo variants and the potential negative impact of rAF-hi indels on gene discovery. In summary, annotating indels with cohort-specific rAF can address some of the limitations of current annotation pipelines and facilitate detection of novel gene-disease associations. CRAFTS-indels offers a user-friendly approach to providing rAF annotation. It can be integrated into public databases such as gnomAD and UKBB, and used by ClinVar to revise indel classifications.
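
The idea of a regional AF and the rAF-hi label can be illustrated with a minimal Python sketch. This is not the CRAFTS-indels implementation: the abstract does not state how a region is defined or how AFs are combined, so the fixed window size, the simple summation, and all names here (`Indel`, `regional_af`, `is_raf_hi`, `window`) are assumptions made for illustration. Only the 10⁻⁴ cutoff comes from the abstract.

```python
from dataclasses import dataclass

RARE_THRESHOLD = 1e-4  # sAF/rAF cutoff quoted in the abstract


@dataclass(frozen=True)
class Indel:
    chrom: str
    pos: int
    ref: str
    alt: str
    af: float  # standard per-variant allele frequency (sAF)


def regional_af(target, indels, window=10):
    """Sum the sAF of all indels within `window` bp of `target`.

    ASSUMPTION: the window size and the use of a plain sum are
    illustrative choices, not the published algorithm's definition.
    """
    return sum(
        v.af
        for v in indels
        if v.chrom == target.chrom and abs(v.pos - target.pos) <= window
    )


def is_raf_hi(target, indels, window=10, threshold=RARE_THRESHOLD):
    """True if the indel is rare by sAF but not rare by rAF."""
    return target.af <= threshold and regional_af(target, indels, window) > threshold


# Made-up example data: two distinct indels 3 bp apart, one isolated indel.
indels = [
    Indel("1", 1000, "AT", "A", 5e-5),
    Indel("1", 1003, "T", "TG", 8e-5),
    Indel("1", 5000, "C", "CA", 2e-5),
]

for v in indels:
    print(v.chrom, v.pos, v.af, regional_af(v, indels), is_raf_hi(v, indels))
```

In this toy data the two neighbouring indels are each rare on their own, but their combined regional frequency passes the cutoff, so both would be flagged as rAF-hi under these assumptions. The isolated indel would not.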

Data availability

The code used to generate the rAF for the gnomAD, IGM and UKBB datasets is available at https://github.com/ColumbiaCPMG/CRAFTs-Indel, along with the other scripts used to generate the tables and figures in this paper.

Funding

Dr. Milo Rasouly received a K01 award from the NIH (Grant number K01DK132495) during the conduct of this study.

Dr. Milo Rasouly was also awarded the Donald E. Wesson Research Fellowship from the ASN Foundation for Kidney Research.

Dr. Motelow received support as a Samberg Scholar and a Thrasher Early Career Research Award during the conduct of this study.

Author information

Contributions

Hila Milo Rasouly and Sarath Babu Murthy Krishna contributed to the study conception and design. Material preparation, data collection and analysis were performed by Sarath Babu Murthy Krishna, Sandy Yang, Shiraz Bheda and Nikita Tomar. The first draft of the manuscript was written by Hila Milo Rasouly and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Hila Milo Rasouly.

Ethics declarations

Competing interests

The authors declare no competing interests. 

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Krishna Murthy, S.B., Yang, S., Bheda, S. et al. Assisting the analysis of insertions and deletions using regional allele frequencies. Funct Integr Genomics 24, 104 (2024). https://doi.org/10.1007/s10142-024-01358-3

  • DOI: https://doi.org/10.1007/s10142-024-01358-3
