Abstract
Sequencing has revealed hundreds of millions of human genetic variants, and continued efforts will only add to this variant avalanche. Insufficient information exists to interpret the effects of most variants, limiting opportunities for precision medicine and comprehension of genome function. A solution lies in experimental assessment of the functional effect of variants, which can reveal their biological and clinical impact. However, variant effect assays have generally been undertaken reactively for individual variants only after, and in most cases long after, their first observation. Now, multiplexed assays of variant effect can characterise massive numbers of variants simultaneously, yielding variant effect maps that reveal the functional effect of every possible single nucleotide change in a gene or regulatory element. Generating maps for every protein-encoding gene and regulatory element in the human genome would create an ‘Atlas’ of variant effect maps, transforming our understanding of genetics and ushering in a new era of nucleotide-resolution functional knowledge of the genome. An Atlas would reveal the fundamental biology of the human genome, inform studies of human evolution, empower the development and use of therapeutics and maximise the utility of genomics for diagnosing and treating disease. The Atlas of Variant Effects Alliance is an international collaborative group comprising hundreds of researchers, technologists and clinicians dedicated to realising an Atlas of Variant Effects to help deliver on the promise of genomics.
Introduction
Two decades after sequencing the first human genome, millions of human exomes and genomes have been sequenced. Interpreting the effects of the hundreds of millions of variants thus discovered has become a central challenge for genomics. The genomes of the 8 billion people alive today collectively contain nearly all ~9 billion possible single nucleotide genetic variants compatible with life, as well as numerous insertions, deletions and other types of variants [1, 2]. Moreover, within the trillions of cells of each individual, every possible single nucleotide genetic variant will have arisen through somatic mutation. The functional impact of genetic variants has primarily been determined by asking whether the variant co-occurs with a disease, disorder or other trait, an approach that has collectively characterised the functional impact of less than 1% of genetic variation. Furthermore, our knowledge of variant effects is focused on the best-understood 1–2% of our DNA, namely the genes that encode proteins. For non-coding variation, the situation is even less certain, because the location of most known non-coding functional elements has only recently been identified [3]. In addition, non-coding elements are not as highly conserved and their functions are often cell type- and developmental stage-specific [4].
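The figure of ~9 billion possible single nucleotide variants follows from simple counting: each position can change to any of the three other bases. The sketch below illustrates this by enumerating every single nucleotide variant of a short, made-up sequence and then scaling the count to a haploid genome. The round genome length of 3 billion base pairs and the toy sequence are assumptions chosen for illustration only.

```python
from typing import Iterator, Tuple

BASES = "ACGT"


def single_nucleotide_variants(seq: str) -> Iterator[Tuple[int, str, str]]:
    """Yield (1-based position, reference base, alternate base) for every possible SNV."""
    for pos, ref in enumerate(seq.upper(), start=1):
        for alt in BASES:
            if alt != ref:
                yield pos, ref, alt


def count_possible_snvs(length_bp: int) -> int:
    """Each position has three alternate bases, so the count is 3 x length."""
    return 3 * length_bp


# Hypothetical 6-bp sequence, not taken from any real gene.
toy_sequence = "ATGGCC"
variants = list(single_nucleotide_variants(toy_sequence))
print(len(variants))  # 18 variants for 6 positions

# Assumed round figure for the haploid human genome length.
APPROX_GENOME_LENGTH_BP = 3_000_000_000
print(count_possible_snvs(APPROX_GENOME_LENGTH_BP))  # 9000000000
```

The same enumeration, applied to a single gene or regulatory element, gives the set of variants that a comprehensive single nucleotide variant effect map would need to cover.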
Our lack of information about the effect of variation found through genetic testing or genome sequencing is the major barrier to the use of sequence information for diagnosing genetic disease. This lack of information limits the effectiveness of genetic precision medicine and hinders our ability to understand genome function. Even when a variant in a well-annotated functional element is known to increase disease risk, the mechanism by which it does so is often unknown. A solution lies in our ability to assess the functional effect of variants using in vitro or cell-based assays, which can provide strong evidence to interpret their biological and clinical impact and can, in principle, be applied to any variant. However, owing to the resource- and time-intensive nature of traditional variant effect assays, they have generally been undertaken reactively for individual variants only after, and in most cases long after, the first observation of the variant. Now, multiplexed assays of variant effect (MAVEs) enable the generation of ‘variant effect maps’ characterising aspects of the function of every possible single nucleotide change in a gene or functional element of interest. Because variant effect maps are comprehensive, they profile all previously observed variants, as well as those that might be found in the future. Generating variant effect maps for every protein-encoding gene and regulatory element in the human genome would create an ‘Atlas’ of variant effect maps that would transform our understanding of genetics by ushering in a new era of nucleotide-resolution functional knowledge of the genome.
The generation of an Atlas of Variant Effects (AVE) would have a major impact across multiple areas of basic and translational research and, importantly, on clinical care. Any effort to determine whether a variant alters function would be transformed by having an Atlas, including in the following high-impact areas (Fig. 1):
- Precision genomic medicine. Variant effect maps of functional elements known to harbour disease-causing variation can drive more accurate, rapid and inexpensive genetic diagnostic testing. Variant effect maps can also enhance our understanding of penetrance and variable expressivity and potentially even reveal compensatory genetic perturbations. For a wide variety of genetically driven disorders, knowledge of disease risk variants allows screening within families or even populations for early detection and thus early intervention [5].
- Disease association studies. Just as targeted variant functional assays have assisted discovery and validation of associations between specific rare genetic variants and disease risk, variant effect maps can enable this approach broadly, at scale [6, 7].
- Therapeutic development and pharmacogenetics. Variant effect maps can shed light on disease mechanisms and may identify novel potential targets for drugs or other therapeutics [8], help predict the safety and efficacy of modulating specific targets, reveal routes of resistance and identify patients likely to respond favourably in clinical trials. Variant effect maps of pharmacogenes, where genetic variation can influence the activity or metabolism of drugs, could reveal the optimal dose for an individual or identify predispositions to adverse reactions. Variant effect maps could also enable the systematic study of genetic dose–response curves through functional and clinical correlations.
- Sequence/structure/function relationships. The relationship between sequence and function is fundamental to biology [9] and remains difficult to predict. Variant effect maps can illuminate this relationship, for example by improving or benchmarking computational variant effect prediction; revealing protein function, allostery or structure; and discerning the composition and mechanisms of regulatory elements [10,11,12,13,14,15,16,17,18,19].
- Evolutionary genetics. Differences in the biology of species, including those of commercial interest, are genomically encoded. Variant effect maps can highlight the subset of genetic differences between species that have functional consequences, probe inferred ancestral sequences [20] and improve phylogenetic inference [21,22,23].
- Pathogen biology. Genetic variation in pathogen genomes influences key characteristics of pathogen biology, including virulence, transmission, immune evasion and drug resistance. Variant effect maps can inform the surveillance of pathogen evolution [24] and provide opportunities to respond more rapidly, as well as revealing drug resistance and immune evasion variants [25].
By comprehensively capturing the impact of variants in functional elements throughout the genome, an Atlas of Variant Effects would accelerate and empower biological research, drug discovery and clinical practice. Systematic variant analysis, unbiased by allele frequency in any population, would empower equitable interpretation and reduce healthcare disparities [26]. Building and implementing a coherent Atlas of Variant Effects will necessarily be a collective endeavour, drawing together diverse expertise from different communities, including patients, patient advocates, researchers, clinicians, diagnostics companies and drug developers.
MAVEs can measure the effect of genetic variants at the scale necessary to compile an Atlas of Variant Effects
MAVEs are a rapidly growing family of methods that involve mutagenesis of a DNA-encoded protein or regulatory element followed by a multiplexed assay for some aspect of function [9, 27,28,29]. High-throughput DNA sequencing is used to read out each variant’s effect in the assay (Fig. 2A). MAVEs encompass both assays of protein function, often called deep mutational scans, and of regulatory elements, often called massively parallel reporter assays. Early MAVEs were applied to small protein domains and short regulatory elements [14, 15, 30] generally querying single ‘sub-functions’ of an element such as promoter activity [14, 15], protein–ligand interactions [30,31,32] or stability [33, 34]. Other early efforts focused on the ability of an element to perform its overall cellular function in a cell-based growth assay [35]. Subsequently, MAVEs have been developed for a variety of functions and have been used to generate multiple variant effect maps examining different functions for the same element [9, 28, 36]. Now, MAVEs have been scaled up and optimised to enable routine application to entire genes, measuring the relative functional impact of tens of thousands of variants in a single controlled experiment.
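To make the sequencing readout concrete, the sketch below shows one common way of turning read counts into variant scores: a log2 enrichment ratio of each variant's counts after versus before selection, normalised to wild type. This is an illustrative simplification and not the scoring method of any particular map cited here. The function name, the pseudocount value, the variant labels and the counts are all assumptions introduced for this example.

```python
import math
from typing import Dict


def log2_enrichment_scores(
    pre_counts: Dict[str, int],
    post_counts: Dict[str, int],
    wild_type: str = "WT",
    pseudocount: float = 0.5,
) -> Dict[str, float]:
    """Score each variant as log2 of its post/pre count ratio relative to wild type.

    A score near 0 means wild-type-like behaviour in the assay; strongly
    negative scores indicate depletion under selection. The pseudocount
    (an assumed value) avoids division by zero for variants that drop out.
    """
    wt_ratio = (post_counts.get(wild_type, 0) + pseudocount) / (
        pre_counts[wild_type] + pseudocount
    )
    scores = {}
    for variant, pre in pre_counts.items():
        if variant == wild_type:
            continue
        post = post_counts.get(variant, 0)
        ratio = (post + pseudocount) / (pre + pseudocount)
        scores[variant] = math.log2(ratio / wt_ratio)
    return scores


# Hypothetical read counts before and after a growth-based selection.
pre = {"WT": 1000, "p.Ala2Val": 100, "p.Gly5Ter": 100}
post = {"WT": 2000, "p.Ala2Val": 200, "p.Gly5Ter": 10}

for variant, score in log2_enrichment_scores(pre, post).items():
    print(f"{variant}\t{score:.2f}")
```

Normalising to wild type cancels differences in total sequencing depth between the two samples, which is why only the per-variant counts are needed here. Real analyses typically also combine replicates and estimate the uncertainty of each score.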
To date, variant effect maps have been generated for hundreds of functional elements encompassing over 11 million total variants (Fig. 2B). However, existing variant effect maps cover <1% of the known clinically relevant human genome and are largely focused on single nucleotide variants, as these are the type of variants most often encountered in current human genome sequencing and clinical testing. No functional element has been mapped in a diverse panel of cell types or across developmental stages. Nevertheless, even at this very early stage in the development of a comprehensive Atlas of Variant Effects, multiplexed variant functional data are proving to be powerful. In particular, variant effect maps are beginning to reshape how human variants found in clinical genetic testing are interpreted and also to redefine our understanding of the mapping between DNA sequence and molecular, cellular and organismal phenotype.
The value of functional evidence for informing clinical variant interpretation is already well appreciated and has been incorporated within current professional guidelines for genetic diagnosis that are used internationally [37, 38]. MAVE-derived variant functional data have numerous advantages over functional data derived from traditional, low-throughput assays. Unlike testing variants in small batches using different methods in different labs, MAVEs can determine the effects of thousands of variants simultaneously, not only improving reproducibility but also allowing each variant to be assessed in the context of the functional effects of all other variants in that gene, including known pathogenic and benign variants. Thus, MAVE-derived functional data can be used to resolve many, if not most, of the clinically observed variants of uncertain significance in monogenic disease genes, demonstrating the power of functional data to help deliver more definitive genetic test results to patients and clinicians [39,40,41].
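One way in which known pathogenic and benign variants give context to an assay is by calibrating how much weight a functional readout should carry. The sketch below computes an odds of pathogenicity for a functionally abnormal readout from counts of control variants, following the general form described in the functional evidence recommendations [38] as we understand them. The function name and the control counts are hypothetical, and a real calibration would follow the cited guidance in full.

```python
def odds_of_pathogenicity(
    n_pathogenic: int,
    n_benign: int,
    n_pathogenic_abnormal: int,
    n_benign_abnormal: int,
) -> float:
    """Odds of pathogenicity for variants scored as functionally abnormal.

    p1 is the prior: the proportion of pathogenic variants among all controls.
    p2 is the posterior: the proportion of pathogenic variants among controls
    with a functionally abnormal readout.
    """
    p1 = n_pathogenic / (n_pathogenic + n_benign)
    n_abnormal = n_pathogenic_abnormal + n_benign_abnormal
    if n_abnormal == 0:
        raise ValueError("no control variant has an abnormal readout")
    p2 = n_pathogenic_abnormal / n_abnormal
    if p2 == 1.0:
        # Perfect separation gives infinite odds; this sketch leaves the
        # choice of correction to the caller rather than picking one.
        raise ValueError("posterior is 1; apply a correction before computing odds")
    return (p2 * (1 - p1)) / ((1 - p2) * p1)


# Hypothetical control set: 20 pathogenic and 20 benign variants, of which
# 19 pathogenic and 1 benign variant score as functionally abnormal.
odds = odds_of_pathogenicity(20, 20, 19, 1)
print(round(odds, 2))  # 19.0
```

Because a variant effect map scores many control variants in the same experiment, such calibration can draw on far more controls than a low-throughput assay usually allows.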
Multiplexed variant functional data can also transform our understanding of how variants encode molecular and cellular function and how sequence dictates biological structure. For example, multiplexed measurements of variant abundance and ligand binding in SH3 and PDZ domains, combined with a model, enabled a comprehensive accounting of allostery within each domain [16]. Multiplexed variant functional data can be used to validate proposed protein structures [17, 42] or, where variant combinations are assayed, even infer them de novo [12, 13]. Knowledge of the precise mechanism of variant effects opens the door for variant-guided therapies designed to ameliorate protein misfolding or aggregation, aberrant splicing and more.
Existing variant effect maps for human genes have been generated by a range of different technologies, from yeast complementation assays to CRISPR-based saturation genome editing in human cells. Each technology has specific advantages and disadvantages. For example, yeast complementation assays are only applicable to a minority of human genes [43] and would not be appropriate for identifying some variant effects, such as those that affect functions beyond those needed for complementation or those that disrupt splicing. CRISPR-based saturation genome editing of an endogenous locus is costly and practical only for growth-based assays. Thus, no single technology can currently be used to generate maps of variant effects for all functional elements. Indeed, even within a single gene, multiple assays may be required to assess different pathophysiological mechanisms. Current MAVEs require appreciable effort, and the time and cost needed to develop new assays can be considerable. Moreover, some variant effects may only be well-modelled in terminally differentiated cell types or in multicellular systems or by assaying variant effects on complex phenotypes like cell morphology or transcriptional state. Thus, the existing portfolio of MAVE technologies can be applied to a substantial fraction of the genome, but more technology development is required to achieve comprehensive coverage of genomic functional elements and to identify the mechanism by which most variants act.
The AVE Alliance provides international coordination to create, disseminate and implement an Atlas of Variant Effects
Compiling a complete Atlas of Variant Effects for all 20,000 human genes, not to mention potentially hundreds of thousands of noncoding regulatory elements, will require an international collaborative effort involving thousands of researchers, clinicians and technologists. Comparing this initiative to some of the landmark genomic collaborative achievements of the past 30 years highlights some of the key challenges to be addressed. The Human Genome Project (HGP) required a small number of centres generating data at unprecedented scales, in a highly coordinated and centralised fashion. By contrast, the Protein Data Bank (PDB) contains structures for thousands of human proteins, generated by thousands of researchers, in a largely uncoordinated and decentralised fashion [44]. Despite their differences, both HGP and PDB succeeded in generating an enduring and sustainable knowledge base and depended, crucially, on robust data standards, community-agreed quality metrics and centralised data deposition and dissemination. Moreover, a strong community ethos was essential for the development and adoption of these core standards and infrastructure. Some of the critical informatics infrastructure needed to support the AVE has already been developed, for example the MaveDB repository [45, 46], initial standards [47] for MAVE datasets and a MAVE project registry [48].
We envisage that the AVE will sit between the extremes exemplified by HGP and PDB, with a combination of a small number of centres generating variant effect maps at scale using generalisable assays and a large number of laboratories generating small numbers of maps, using bespoke assays, leveraging their expertise in investigating particular genes and biological pathways. Integration of variant effect data for the same gene, generated using different MAVEs, will in some cases be required to achieve accurate and comprehensive characterisation of different functional effects [39, 49,50,51,52]. The computational prediction of variant effect maps using AI/ML methods will continue to improve and will leverage growing numbers of experimentally determined variant effect maps, analogous to the advances in computational prediction of protein structures based on thousands of experimentally determined protein structures (Fig. 3). With these expectations in mind, we can identify some of the key challenges that realising the AVE vision will face and some of the likely solutions on the critical path to success:
- Diverse expertise. Developing new experimental technologies that reflect the complexity of biology and disease, scaling existing technologies, processing and managing complex data, and translating knowledge into clinical benefits require a broad range of expertise, interests and competencies, working collaboratively. No one centre or community will be able to create the AVE in isolation. Technology developers, geneticists, cell biologists, protein scientists, data scientists, software engineers and clinicians will need to work together, aligned around a common vision, language and values.
- Technology development and scaling. Generating variant effect maps for all 20,000 genes will require both the scaling of existing technologies that can be applied to many genes, and the development of new technologies that will extend coverage of MAVE-compatible assays to all functional elements. Moreover, new approaches will be needed to assess variant effects in more complex contexts, such as specific cell types or in development, and for more complex phenotypes, such as cell morphology and behaviour.
- Democratisation of technology. Completing the AVE will require a major expansion in the numbers of researchers and organisations actively performing MAVEs. Readily accessible training materials, protocols, experimental resources (e.g. cell lines, libraries) and easy-to-use and flexible software will all be crucial, as will advocacy and support to help researchers with expertise in informative assays adopt MAVE technologies.
- Data standards and coordination. Data standards, community-agreed quality standards, centralised data deposition, open dissemination and a FAIR ethos [53] are all necessary but not sufficient for compiling the Atlas of Variant Effects. The existing informatics infrastructure needs to evolve, become integrated into the wider clinical and biological data ecosystem and be actively sustained for long-term impact. Moreover, community-wide adoption of best practices with regard to data and metadata deposition is critical for data integration.
- Ensuring trustworthy clinical adoption. The potential clinical impact of the Atlas of Variant Effects can only be achieved through rigorous and clinician-trusted integration into diagnostic workflows. Co-development of quality standards and guidelines with clinical communities will help to build trust, as will starting conservatively. Integration with existing clinical decision support software (e.g. DECIPHER [54]) and data resources (e.g. ClinVar [55]), as opposed to requiring diagnosticians to use new systems, will facilitate rapid adoption.
To achieve the AVE vision and tackle these challenges, an international group of diverse researchers, clinicians and diagnosticians established the Atlas of Variant Effects Alliance (www.varianteffect.org). The AVE Alliance currently has over 400 members from over 100 institutions, located in 30 countries, united by the mission to bring the AVE into reality. The AVE Alliance is committed to Open Science and places diversity and inclusion at the heart of its activities. The AVE Alliance organises an annual meeting, the Mutational Scanning Symposium, and a monthly seminar series, the Variant Effect Seminar Series. To tackle the challenges identified above, AVE has established workstreams to:
- Develop, standardise and democratise experimental and computational technologies,
- Develop the infrastructure necessary to ingest, store and disseminate high quality FAIR data,
- Ensure that clinical benefits are realised,
- Expand, coordinate and sustain a diverse and motivated community.
The AVE Alliance provides a ‘front door’ for other organisations and initiatives to work with the diverse AVE community, ranging from complementary large-scale national initiatives, such as the NIH-funded Impact of Genomic Variation on Function (IGVF) Consortium, to research funders and commercial organisations that are keen to engage with the community as a whole. We welcome any and all readers who are interested in building and learning from the Atlas of Variant Effects to join the Alliance and get involved [56, 57].
Availability of data and materials
Not applicable.
References
Shirts BH, Pritchard CC, Walsh T. Family-specific variants and the limits of human genetics. Trends Mol Med. 2016;22:925–34.
Kruglyak L, Nickerson DA. Variation is the spice of life. Nat Genet. 2001;27:234–6.
Roadmap Epigenomics Consortium, Kundaje A, Meuleman W, Ernst J, Bilenky M, Yen A, et al. Integrative analysis of 111 reference human epigenomes. Nature. 2015;518:317–30.
ENCODE Project Consortium, Moore JE, Purcaro MJ, Pratt HE, Epstein CB, Shoresh N, et al. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature. 2020;583:699–710.
Green RC, Berg JS, Grody WW, Kalia SS, Korf BR, Martin CL, et al. ACMG recommendations for reporting of incidental findings in clinical exome and genome sequencing. Genet Med. 2013;15:565–74.
Schiabor Barrett KM, Masnick M, Hatchell KE, Savatt JM, Banet N, Buchanan A, et al. Clinical validation of genomic functional screen data: analysis of observed BRCA1 variants in an unselected population cohort. HGG Adv. 2022;3:100086.
Dorling L, Carvalho S, Allen J, Parsons MT, Fortuno C, González-Neira A, et al. Breast cancer risks associated with missense variants in breast cancer susceptibility genes. Genome Med. 2022;14:51.
Nelson MR, Tipney H, Painter JL, Shen J, Nicoletti P, Shen Y, et al. The support of human genetic evidence for approved drug indications. Nat Genet. 2015;47:856–60.
Kinney JB, McCandlish DM. Massively parallel assays and quantitative sequence-function relationships. Annu Rev Genomics Hum Genet. 2019;20:99–127.
Gray VE, Hause RJ, Luebeck J, Shendure J, Fowler DM. Quantitative missense variant effect prediction using large-scale mutagenesis data. Cell Syst. 2018;6:116-24.e3.
Riesselman AJ, Ingraham JB, Marks DS. Deep generative models of genetic variation capture the effects of mutations. Nat Methods. 2018;15:816–22.
Rollins NJ, Brock KP, Poelwijk FJ, Stiffler MA, Gauthier NP, Sander C, et al. Inferring protein 3D structure from deep mutation scans. Nat Genet. 2019;51:1170–6.
Schmiedel JM, Lehner B. Determining protein structures using deep mutagenesis. Nat Genet. 2019;51:1177–86.
Kinney JB, Murugan A, Callan CG Jr, Cox EC. Using deep sequencing to characterize the biophysical mechanism of a transcriptional regulatory sequence. Proc Natl Acad Sci U S A. 2010;107:9158–63.
Patwardhan RP, Lee C, Litvin O, Young DL, Pe’er D, Shendure J. High-resolution analysis of DNA regulatory elements by synthetic saturation mutagenesis. Nat Biotechnol. 2009;27:1173–5.
Faure AJ, Domingo J, Schmiedel JM, Hidalgo-Carcedo C, Diss G, Lehner B. Mapping the energetic and allosteric landscapes of protein binding domains. Nature. 2022;604:175–83.
Chiasson MA, Rollins NJ, Stephany JJ, Sitko KA, Matreyek KA, Verby M, et al. Multiplexed measurement of variant abundance and activity reveals VKOR topology, active site and human variant impact. Elife. 2020;9. https://doi.org/10.7554/eLife.58026.
Livesey BJ, Marsh JA. Using deep mutational scanning to benchmark variant effect predictors and identify disease mutations. Mol Syst Biol. 2020;16:e9380.
Frazer J, Notin P, Dias M, Gomez A, Min JK, Brock K, et al. Disease variant prediction with deep generative models of evolutionary data. Nature. 2021;599:91–5.
Starr TN, Picton LK, Thornton JW. Alternative evolutionary histories in the sequence space of an ancient protein. Nature. 2017;549:409–13.
Klein JC, Keith A, Agarwal V, Durham T, Shendure J. Functional characterization of enhancer evolution in the primate lineage. Genome Biol. 2018;19:99.
Bloom JD. An experimentally determined evolutionary model dramatically improves phylogenetic fit. Mol Biol Evol. 2014;31:1956–78.
Gallego Romero I, Lea AJ. Leveraging massively parallel reporter assays for evolutionary questions. Genome Biol. 2023;24:26.
Lee JM, Huddleston J, Doud MB, Hooper KA, Wu NC, Bedford T, et al. Deep mutational scanning of hemagglutinin helps predict evolutionary fates of human H3N2 influenza variants. Proc Natl Acad Sci U S A. 2018;115:E8276–85.
Stiffler MA, Hekstra DR, Ranganathan R. Evolvability as a function of purifying selection in TEM-1 β-lactamase. Cell. 2015;160:882–92.
Wright CF, Campbell P, Eberhardt RY, Aitken S, Perrett D, Brent S, et al. Optimising diagnostic yield in highly penetrant genomic disease. medRxiv. 2022. Available from: https://www.medrxiv.org/content/10.1101/2022.07.25.22278008v1.
Tabet D, Parikh V, Mali P, Roth FP, Claussnitzer M. Scalable functional assays for the interpretation of human genetic variation. Annu Rev Genet. 2022;56:441–65.
Starita LM, Ahituv N, Dunham MJ, Kitzman JO, Roth FP, Seelig G, et al. Variant interpretation: functional assays to the rescue. Am J Hum Genet. 2017;101:315–25.
Gasperini M, Starita L, Shendure J. The power of multiplexed functional analysis of genetic variants. Nat Protoc. 2016;11:1782–7.
Fowler DM, Araya CL, Fleishman SJ, Kellogg EH, Stephany JJ, Baker D, et al. High-resolution mapping of protein sequence-function relationships. Nat Methods. 2010;7:741–6.
Zhang H, Torkamani A, Jones TM, Ruiz DI, Pons J, Lerner RA. Phenotype-information-phenotype cycle for deconvolution of combinatorial antibody libraries selected against complex systems. Proc Natl Acad Sci U S A. 2011;108:13456–61.
Ernst A, Gfeller D, Kan Z, Seshagiri S, Kim PM, Bader GD, et al. Coevolution of PDZ domain-ligand interactions analyzed by high-throughput phage display and deep sequencing. Mol Biosyst. 2010;6:1782–90.
Kim I, Miller CR, Young DL, Fields S. High-throughput analysis of in vivo protein stability. Mol Cell Proteomics. 2013;12:3370–8.
Araya CL, Fowler DM, Chen W, Muniez I, Kelly JW, Fields S. A fundamental protein property, thermodynamic stability, revealed solely from large-scale measurements of protein function. Proc Natl Acad Sci U S A. 2012;109:16858–63.
Hietpas RT, Jensen JD, Bolon DNA. Experimental illumination of a fitness landscape. Proc Natl Acad Sci U S A. 2011;108:7896–901.
Weile J, Roth FP. Multiplexed assays of variant effects contribute to a growing genotype–phenotype atlas. Hum Genet. 2018;137:665–78.
Richards S, Aziz N, Bale S, Bick D, Das S, Gastier-Foster J, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med. 2015;17:405–24.
Brnich SE, Abou Tayoun AN, Couch FJ, Cutting GR, Greenblatt MS, Heinen CD, et al. Recommendations for application of the functional evidence PS3/BS3 criterion using the ACMG/AMP sequence variant interpretation framework. Genome Med. 2019;12:3.
Fayer S, Horton C, Dines JN, Rubin AF, Richardson ME, McGoldrick K, et al. Closing the gap: systematic integration of multiplexed functional data resolves variants of uncertain significance in BRCA1, TP53, and PTEN. Am J Hum Genet. 2021;108:2248–58.
Radford EJ, Tan HK, Andersson MHL, Stephenson JD, Gardner EJ, Ironfield H, et al. Saturation genome editing of DDX3X clarifies pathogenicity of germline and somatic variation. medRxiv. 2022. Available from: https://www.medrxiv.org/content/10.1101/2022.06.10.22276179v1.
Scott A, Hernandez F, Chamberlin A, Smith C, Karam R, Kitzman JO. Saturation-scale functional evidence supports clinical variant interpretation in Lynch syndrome. Genome Biol. 2022;23:266.
Adkar BV, Tripathi A, Sahoo A, Bajaj K, Goswami D, Chakrabarti P, et al. Protein model discrimination using mutational sensitivity derived from deep sequencing. Structure. 2012;20:371–81.
Kachroo AH, Laurent JM, Yellman CM, Meyer AG, Wilke CO, Marcotte EM. Systematic humanization of yeast genes reveals conserved functions and genetic modularity. Science. 2015;348:921–5.
wwPDB consortium. Protein Data Bank: the single global archive for 3D macromolecular structure data. Nucleic Acids Res. 2019;47:D520-8.
Esposito D, Weile J, Shendure J, Starita LM, Papenfuss AT, Roth FP, et al. MaveDB: an open-source platform to distribute and interpret data from multiplexed assays of variant effect. Genome Biol. 2019;20:223.
Rubin AF, Min JK, Rollins NJ, Da EY, Esposito D, Harrington M, et al. MaveDB v2: a curated community database with over three million variant effects from multiplexed functional assays. bioRxiv. 2022. p. 2021.11.29.470445. Available from: https://www.biorxiv.org/content/10.1101/2021.11.29.470445v2. [Cited 2022 Dec 5].
Gelman H, Dines JN, Berg J, Berger AH, Brnich S, Hisama FM, et al. Recommendations for the collection and use of multiplexed functional data for clinical variant interpretation. Genome Med. 2019;11:85.
Kuang D, Weile J, Kishore N, Rubin AF, Fields S, Fowler DM, et al. MaveRegistry: a collaboration platform for multiplexed assays of variant effect. Bioinformatics. 2021;37:3382–3.
Mighell TL, Thacker S, Fombonne E, Eng C, O’Roak BJ. An integrated deep-mutational-scanning approach provides clinical insights on PTEN genotype-phenotype relationships. Am J Hum Genet. 2020;106:818–29.
Suiter CC, Moriyama T, Matreyek KA, Yang W, Scaletti ER, Nishii R, et al. Massively parallel variant characterization identifies NUDT15 alleles associated with thiopurine toxicity. Proc Natl Acad Sci U S A. 2020;117:5394–401.
Jepsen MM, Fowler DM, Hartmann-Petersen R, Stein A, Lindorff-Larsen K. Chapter 5 - Classifying disease-associated variants using measures of protein activity and stability. In: Pey AL, editor. Protein Homeostasis Diseases. Academic Press; 2020. p. 91–107. https://doi.org/10.1101/688234, https://www.biorxiv.org/content/10.1101/688234v2.full.pdf.
Cagiada M, Johansson KE, Valanciute A, Nielsen SV, Hartmann-Petersen R, Yang JJ, et al. Understanding the origins of loss of protein function by analyzing the effects of thousands of variants on activity and abundance. Mol Biol Evol. 2021;38:3235–46.
Wilkinson MD, Dumontier M, Jan Aalbersberg I, Appleton G, Axton M, Baak A, et al. Addendum: the FAIR guiding principles for scientific data management and stewardship. Sci Data. 2019;6:6.
DECIPHER v11.16: Mapping the clinical genome. Available from: http://www.deciphergenomics.org. [Cited 2022 Dec 3].
Landrum MJ, Lee JM, Riley GR, Jang W, Rubinstein WS, Church DM, et al. ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res. 2014;42:D980–5.
Atlas of variant effects alliance. Atlas of Variant Effects Alliance. Available from: http://www.varianteffect.org. [Cited 2022 Dec 3].
AVE Alliance Founding Members. The Atlas of Variant Effects (AVE) Alliance: understanding genetic variation at nucleotide resolution. Zenodo; 2021. Available from: https://zenodo.org/record/4989960.
Acknowledgements
We thank all members of the Atlas of Variant Effects Alliance for their work and contributions. Uta Mackensen helped prepare final figure graphics. We thank Carlos Araya for helpful input. We thank Alex Hopkins for administrative support.
To fairly give credit, and for the purposes of PubMed, we list the following AVE Alliance contributing authors, who provided substantial comments and edits to this manuscript.
Atlas of Variant Effects Alliance contributing authors
Nadav Ahituv14, Orli G. Bahcal15, Dustin Baldridge16, Jonathan S. Berg17, Alice H. Berger18, Aisha Haley Bianchi19, Benedetta Bolognesi20, Michael Boutros21, Steven Brenner22, Matthew H. Brush23, Vanessa Bryant24, Carol J. Bult25, Martha Bulyk26, Melissa Call27, Hannah Carter28, Melina Claussnitzer28,29, Feng Chen30, Melissa S. Cline31, Josh T. Cuperus1, Moez Dawood32, Hannah N. De Jong33, Mafalda Dias34, Michael Dunn5, Jesse Engreitz35, Kyle Farh30, Phillip G. Febbo30, Stanley Fields1, Gregory M. Findlay36, Helen Firth37, James S. Fraser38, Jonathan Frazer34, Mattia Frontini39, Irene Gallego Romero40, Andrew M. Glazer41, Murat Güler21, Rasmus Hartmann-Petersen42, Richard Houlston43, Kuan-lin Huang44, Carolyn M. Hutter45, Sujatha Jagannathan46,47, Richard G. James48, Martin Kampmann49,50, Rachel Karchin51, Justin B. Kinney52, Alexis C. Komor53, Sriram Kosuri54, Ben Lehner5,34,55,56, Kresten Lindorff-Larsen42, Zané Lombard57, Daniel G. MacArthur58, Maria Martin59, Ultan McDermott60, Shannon M. McNulty61, Alex N. Nguyen Ba62, Anne O'Donnell-Luria63,64, Brian J. O'Roak65, Victoria N. Parikh66, Leopold Parts5, Michael J. Pazin45, Tina Pesaran67, Slavé Petrovski68, Christine Queitsch1,3, David E. Root8, Jay Shendure1,3, Amanda B. Spurdle69, Kevin L Taylor70, Clare Turnbull43, Judit Villén1, L.E.L.M. Vissers71, Alex H. Wagner72,73, Matthew J. Wakefield74, Jochen Weile10, Jenny Xiao75
14 Department of Bioengineering and Therapeutic Sciences and Institute for Human Genetics, University of California San Francisco USA
15 Cell Genomics, New York, NY USA
16 Washington University School of Medicine, St. Louis, MO USA
17 The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
18 Fred Hutchinson Cancer Center, Seattle, WA USA
19 National Institute on Aging, Baltimore, Maryland USA
20 Institute for Bioengineering of Catalunya (IBEC) and The Barcelona Institute for Science and Technology (BIST), Barcelona, Spain
21 German Cancer Research Center (DKFZ) and Heidelberg University, Germany
22 University of California, Berkeley, Berkeley, CA USA
23 Department of Biomedical Informatics, University of Colorado, Aurora, CO USA
24 The Walter and Eliza Hall Institute of Medical Research/Department of Medical Biology, University of Melbourne/Dept Clinical Immunology, Royal Melbourne Hospital, Australia
25 The Jackson Laboratory/Mouse Genome Informatics (MGI) consortium
26 Brigham & Women's Hospital and Harvard Medical School, USA
27 The Walter and Eliza Hall Institute of Medical Research/Department of Medical Biology, University of Melbourne, Australia
28 Department of Medicine, University of California San Diego, La Jolla CA USA
29 The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease USA
30 Illumina USA
31 UC Santa Cruz Genomics Institute, Santa Cruz, CA USA
32 Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
33 Department of Genetics, Stanford University School of Medicine, Stanford, CA USA
34 Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST) and Universitat Pompeu Fabra (UPF), Barcelona, Spain
35 Department of Genetics, Stanford University School of Medicine, CA USA
36 The Francis Crick Institute, London, UK
37 Sanger/Cambridge Hospitals UK
38 Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA USA
39 Department of Clinical and Biomedical Sciences, University of Exeter Medical School, Faculty of Health and Life Sciences, RILD Building, Barrack Road, Exeter, EX2 5DW.
40 School of BioSciences, University of Melbourne, Parkville, Australia
41 Department of Medicine, Vanderbilt University Medical Center, Nashville, TN USA
42 Department of Biology, University of Copenhagen, Copenhagen, Denmark
43 Division of Genetics and Epidemiology, The Institute of Cancer Research, Sutton, Surrey UK
44 Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY USA
45 Division of Genome Sciences, NHGRI, Bethesda, MD, USA
46 Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
47 RNA Bioscience Initiative, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
48 Seattle Children's Research Institute, Seattle WA USA
49 Department of Biochemistry & Biophysics, University of California, San Francisco CA USA
50 Institute for Neurodegenerative Diseases, University of California, San Francisco CA USA
51 Johns Hopkins University, Baltimore, Maryland USA
52 Cold Spring Harbor Laboratory, NY USA
53 Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA USA
54 Octant Inc USA
55 University Pompeu Fabra (UPF), Barcelona, Spain
56 Institució Catalana de Recerca i estudis Avançats (ICREA), Barcelona, Spain
57 Division of Human Genetics, National Health Laboratory Service, and School of Pathology, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
58 Centre for Population Genomics, Garvan Institute of Medical Research and Murdoch Children's Research Institute, Australia
59 European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
60 R&D Oncology, AstraZeneca UK
61 Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC USA
62 Department of Biology, University of Toronto, Toronto, Canada
63 Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA USA
64 Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA USA
65 Oregon Health & Science University, Portland, OR USA
66 Stanford Center for Inherited Cardiovascular Disease, Stanford School of Medicine, CA USA
67 Ambry Genetics, Aliso Viejo, CA USA
68 Centre for Genomics Research, Discovery Sciences, R&D, AstraZeneca UK
69 QIMR Berghofer Medical Research Institute, Brisbane, Australia
70 Proteogenomics, BioLegend USA
71 Department of Human Genetics, Radboud University, Nijmegen, NL
72 The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH 43215, USA
73 Departments of Pediatrics and Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH 43210, USA
74 The Walter and Eliza Hall Institute, Parkville, Vic, Australia & Department of Obstetrics and Gynaecology, The University of Melbourne, Australia
75 Guardant Health, Palo Alto USA
76 Center for Genomic Medicine Massachusetts General Hospital, Harvard Medical School USA
Funding
F.P.R. acknowledges support from the NIH/NHGRI Impact of Genomic Variation on Function (IGVF) Initiative (HG011989), from an NIH/NHLBI R01 grant (HL164675) and from a Canadian Institutes of Health Research Foundation Grant. L.M.S., L.A.M., D.M.F. and A.F.R. acknowledge support from the NIH/NHGRI Impact of Genomic Variation on Function (IGVF) Initiative (HG011969). L.M.S., L.A.M., D.M.F., A.F.R. and F.P.R. all receive support from the NIH/NHGRI Center of Excellence in Genomic Science (HG010461). D.M.F. also receives support from R01HL152066. L.M.S. is also supported by the Brotman Baty Institute. A.L.G. is a Wellcome Trust Senior Fellow (200837/Z/16/Z) and is also supported by NIDDK (UM-1DK126185). W.C.H. acknowledges support from NIH/NCI U01CA176058. J.T.N. acknowledges support from the Novo Nordisk Foundation (NNF21SA0072102) and from an NIH DP2 grant (1DP2GM146252). D.J.A. is supported by Cancer Research UK (CG-MAVE: EDDPGM-Nov22/100004) and the Wellcome Trust. A.F.R. received grant funding from the Australian Government. D.S.M. acknowledges support from the Chan Zuckerberg Initiative (CZI2018-191853) and an NIH TR01 grant (1R01CA260415).
Author information
Contributions
D.M.F., A.L.G. and M.E.H. contributed to the conceptualization and writing of the original draft. The remaining co-authors contributed to the writing of the original draft. Contributing authors listed in the acknowledgements reviewed and edited the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Competing interests
A.L.G. declares that her spouse is an employee of Genentech and holds stock options in Roche. D.J.A. is a consultant for Microbiotica and AstraZeneca. D.S.M. is a consultant for Insitro, Dyno and Octant. J.T.N. receives research support from Bristol Myers Squibb. F.P.R. holds shares in Ranomics, Inc., and is an investor and advisor for SeqWell, Inc. and Constantiam Biosciences, Inc. L.M.S. is a consultant for Nostos Genomics. W.C.H. is a consultant for Thermo Fisher, Solasta Ventures, MPM Capital, Tyra Biosciences, Frontier Medicines, Jubilant Therapeutics, KSQ Therapeutics, RAPPTA Therapeutics, Serinus Biosciences, Hexagon Bio, Function Oncology, Riva Therapeutics, and Calyx. M.E.H. is a consultant for AstraZeneca and a co-founder, director and shareholder of Congenica Ltd.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Fowler, D.M., Adams, D.J., Gloyn, A.L. et al. An Atlas of Variant Effects to understand the genome at nucleotide resolution. Genome Biol 24, 147 (2023). https://doi.org/10.1186/s13059-023-02986-x