Introduction

Stroke is a leading cause of death and disability, which is attributed to complex genetic and environmental risk factors (Starby et al. 2014). Familial aggregation is a risk factor that increases stroke susceptibility (Chung et al. 2016; Ilinca et al. 2016). Whole-exome sequencing (WES) has become an effective tool to detect rare monogenic disorders (Rabbani et al. 2014). Lingren and his colleagues reported possibly disease-causing variants in six out of 22 familial young stroke patients (under 55 years old) by WES, revealing a higher hit rate (27.3%) than targeted gene panels (Ilinca et al. 2020). However, only a few WES studies are available addressing limited and specific cerebrovascular diseases (Santoro et al. 2018; Carrera et al. 2016; Mukawa et al. 2017; Sauvigny et al. 2020). The strategy and efficacy to apply clinical WES in familial stroke are still elusive. The main challenges come from the high heterogeneity in both phenotypes and genotypes of familial stroke patients, as well as inconsistent variant interpretation from WES data across laboratories.

There is a lack of comprehensive WES studies in Asian stroke patients. Compared to Caucasians, East Asians are more susceptible to small vessel disease (SVD), intracerebral hemorrhage (ICH), and intracranial large artery atherosclerosis (LAA) (Toyoda et al. 2015; Wei et al. 2021; Gurdasani et al. 2019; Kim and Kim 2014). Genetic variations may play an essential role in the different distribution of stroke subtypes among ethnic populations. In East Asians, cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy (CADASIL) caused by NOTCH3 mutations (Lee et al. 2019), CADASIL2 and CARASIL (recessive) caused by HTRA1 mutations (Liao et al. 2015), and moyamoya disease (MD) caused by RNF213 mutations, are particularly prevalent (Bang et al. 2020). To delineate the monogenic causes and the genotype–phenotype correlations of familial stroke in Asians, here we investigated the frequency of likely disease-causing variants across a full spectrum of stroke subtypes in a Taiwanese cohort of familial stroke patients aged 18–79 using prevalent gene/hotspot screening and/or WES, depending on stroke subtypes and neuroimaging features. The efficacy of this integrative diagnostic pipeline and the therapeutic implications for the unraveled monogenic stroke disorders were addressed.

Materials and Methods

Patient Enrollment and Study Design

This study was performed with the approval of the Taipei Veterans General Hospital Institutional Review Board (2016-09-010C, 2020-02-007B) in accordance with the tenets of the Declaration of Helsinki. We prospectively screened 432 patients with familial ischemic or hemorrhagic stroke between September 2016 and July 2021 from 4769 inpatients consecutively admitted to Taipei Veterans General Hospital, a cohort from one of the largest tertiary referral centers in Taiwan. A positive family history was defined as a patient and at least one of the first or second-degree relatives having their first stroke between 18 and 79 years old. The inclusion criteria and diagnostic algorithm of this study are summarized in Fig. 1. The diagnosis of stroke was established by brain MRI/MRA and corresponding neurological deficits. The subtype classification was determined according to the classification of TOAST (Trial of Org 10172 Acute Stroke Treatment) for ischemic stroke (IS) and SMASH-U for intracerebral hemorrhage (ICH) (Adams et al. 1993; Meretoja et al. 2012). Cerebral venous sinus thrombosis-induced stroke was separately subtyped.

Fig. 1

The inclusion criteria and diagnostic algorithm of familial stroke. We prospectively screened 432 patients with a family history of stroke among 4769 consecutive inpatients admitted to Taipei Veterans General Hospital (VGHTPE stroke registry) between 2016 and 2021. In total, 161 probands fulfilled the criteria and participated. Nine probands were diagnosed by sequencing specific gene hotspots and the other 152 probands were sent for whole-exome sequencing (WES) and a customized array. Thirty-three probands were identified as carrying pathogenic/likely pathogenic variants and 39 carried variants of uncertain significance among 325 stroke-related genes. The other 89 probands were categorized as unidentified, including 42 probands with no variants identified after variant filtering and another 47 with variants incompatible with the phenotype or inheritance mode. MRI: magnetic resonance imaging

In total, 161 probands fulfilled the criteria and provided consent to participate. Patients with characteristic clinical and neuroimaging features were first screened by conventional Sanger sequencing of specific genetic hotspots, including NOTCH3 p.R544C for CADASIL, m.3243A>G for mitochondrial encephalopathy, lactic acidosis, and stroke-like episodes (MELAS), and the GLA gene for Fabry disease. The remaining undiagnosed probands were submitted to an integrative approach that combined WES with a supplementary single nucleotide polymorphism (SNP) array customized for the genomic variants of the Taiwanese population, including intronic and mitochondrial variants (Axiom Genome-Wide Taiwan Biobank Array Plate, TWB 2.0, National Center for Genome Medicine, Academia Sinica, Taiwan) (Wei et al. 2021). Rare variants were defined as those with a minor allele frequency (MAF) of 0.1% or less in the global population of the gnomAD database. The MAF in the local control population was obtained from the Taiwan Biobank whole-genome sequencing dataset of 1517 normal subjects. The clinical significance of each genetic variant was determined according to the American College of Medical Genetics (ACMG) guideline (Kalia et al. 2017). Detailed clinical data, including medical records, intensive interviews, and laboratory and imaging studies, were reviewed by two neurologists to confirm the phenotype-genotype correlation and segregation. The level of genetic diagnosis was categorized into (1) a definite genetic diagnosis with a pathogenic/likely pathogenic variant (PV) (i.e., monogenic stroke), (2) variants of uncertain significance (VUS) in the 325 stroke-associated genes (Supplementary Table S1), and (3) unidentified (Fig. 1).

Brain Magnetic Resonance Imaging (MRI)

All probands received MRI within 7 days after stroke onset on a 1.5 Tesla MRI scanner (Signa HDxt, GE Healthcare, Milwaukee, WI). The regular stroke MRI protocol consisted of T1-weighted imaging, T2-weighted and Fluid Attenuated Inversion Recovery (FLAIR) imaging, Diffusion-Weighted Imaging (DWI), Apparent Diffusion Coefficient (ADC), and time-of-flight (TOF) MR Angiography (MRA). For probands with SVD or ICH, T2*-weighted Susceptibility-Weighted Angiography (SWAN) was additionally applied. For probands with suspected dissection, high-resolution contrast-enhanced vessel wall imaging was subsequently performed on a 3.0 Tesla MRI scanner (Discovery MR750, GE Healthcare) to look for intramural hematoma, intimal flap, or double lumen.

Whole-Exome Sequencing (WES)

Genomic DNA was extracted from peripheral blood with a Gentra Puregene Blood Kit (QIAGEN, Hilden, Germany) according to the manufacturer’s instructions. Thirty-eight samples were enriched by NimbleGen SeqCap v3 (Roche NimbleGen, Inc., California, USA) and processed by the Genome Research Center of National Yang Ming Chiao Tung University, while the other 114 samples were enriched by SureSelect V6 (Agilent Technologies, California, USA) and processed by Genomics Biotechnology Co., Ltd. All sequencing was performed on the Illumina NovaSeq 6000 platform.

Bioinformatic Pipelines of Variant Annotation

The quality of raw reads was checked by FastQC (v0.11.4) (Andrews 2010). We used BWA (v0.7.12) (Li and Durbin 2009) and Samtools (v1.9) (Li et al. 2009) for alignment to the GRCh38 reference genome by default parameters. Marking duplications, base quality score recalibration, and variant calling were performed using GATK (v4.1.4.1) (McKenna et al. 2010; Van der Auwera et al. 2013). Variants with read depths less than 30 were excluded from data interpretation. After variant calling, multiple reference databases, including ClinVar (retrieved 2021-07), dbNSFP (v4.1a), dbSNP (v154), and Online Mendelian Inheritance in Man (OMIM, retrieved 2021-07), were applied for further annotation (McLaren et al. 2016).
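The read-depth cut-off described above can be illustrated with a short filter over VCF records. This is a minimal sketch and not the pipeline used in the study: it assumes that depth is taken from the `DP` key of the VCF INFO column (the text does not state whether site-level or per-sample depth was used) and that records without a depth value are dropped. The function names `info_depth` and `passes_depth` and the two example records are hypothetical.

```python
MIN_DEPTH = 30  # threshold stated in the text: variants with depth < 30 are excluded


def info_depth(vcf_line):
    """Return the integer DP value from the INFO column of one VCF data line, or None."""
    fields = vcf_line.rstrip("\n").split("\t")
    info = fields[7]  # INFO is the 8th fixed column of a VCF data line
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        if key == "DP" and value.isdigit():
            return int(value)
    return None


def passes_depth(vcf_line, min_depth=MIN_DEPTH):
    """Keep a record only if its depth is known and at least min_depth.

    Dropping records with no DP value is an assumption of this sketch.
    """
    depth = info_depth(vcf_line)
    return depth is not None and depth >= min_depth


# Hypothetical records for illustration only (not data from the study).
records = [
    "chr1\t1000\t.\tA\tG\t50\tPASS\tAC=1;DP=45",
    "chr1\t2000\t.\tC\tT\t50\tPASS\tAC=1;DP=12",
]
kept = [r for r in records if passes_depth(r)]
```

With these two illustrative records, only the first one remains in `kept`, because the second has a depth below the cut-off.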

Variant Filtration and Prioritization of Candidate Causal Variants

We first searched for stroke-related pathogenic and likely pathogenic variants (PVs) in the ClinVar database (https://www.ncbi.nlm.nih.gov/clinvar/). The stroke-related phenotypes were determined according to the OMIM database (https://www.omim.org/). If no reported or phenotype-compatible PV was found, we then examined novel rare variants predicted to have structural and functional impacts on the coding products of 325 stroke-related genes (Supplementary Table S1), which were compiled by searching the terms stroke, ischemic stroke, hemorrhagic stroke, stroke-like episode, cerebral hemorrhage, lacunar stroke, and thromboembolic stroke in OMIM and PubMed (Ilinca et al. 2019; Hamosh et al. 2005; Dichgans et al. 2019). The selected variants had to meet all of the following inclusion criteria: (1) a global MAF ≤ 0.1% in public genome databases; (2) not reported as a benign or likely benign variant in ClinVar or the SNP database; (3) relatively high evolutionary conservation (GERP++ > 3); (4) computational predictions of a damaging effect by at least two scoring systems (SIFT, PolyPhen2, MutationTaster, Combined Annotation Dependent Depletion [CADD] score > 20) or an Ensembl rare exome variant ensemble learner (REVEL) score ≥ 0.6 (Ioannidis et al. 2016); and (5) a compatible phenotype-genotype correlation. If the selected variant was classified as pathogenic or likely pathogenic according to the ACMG guideline (PVs, judged with InterVar’s assistance) (Kalia et al. 2017; Li and Wang 2017), Sanger sequencing validation and intrafamilial segregation analysis were performed in the proband and available family members.
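The five inclusion criteria can be expressed as a single predicate over an annotated variant. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Variant` fields, the function names, and the example values are hypothetical, criterion 4 is read as "at least two of the four listed tools predict a damaging effect, or the REVEL score is at least 0.6", and criterion 5 is represented as a flag set by the reviewing clinician. Missing annotation scores are not handled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Variant:
    global_maf: float              # allele frequency as a fraction (0.001 = 0.1%)
    clinvar_benign: bool           # reported benign or likely benign
    gerp: float                    # GERP++ conservation score
    sift_damaging: bool
    polyphen2_damaging: bool
    mutationtaster_damaging: bool
    cadd_phred: float
    revel: float
    phenotype_compatible: bool     # criterion 5, a manual clinical judgement


def in_silico_support(v):
    """Criterion 4 under one reading: >= 2 of 4 tools agree, or REVEL >= 0.6."""
    votes = [
        v.sift_damaging,
        v.polyphen2_damaging,
        v.mutationtaster_damaging,
        v.cadd_phred > 20,
    ]
    return sum(votes) >= 2 or v.revel >= 0.6


def is_candidate(v):
    """Return True only if all five inclusion criteria are met."""
    return (
        v.global_maf <= 0.001          # (1) rare in the global population
        and not v.clinvar_benign       # (2) not benign / likely benign
        and v.gerp > 3                 # (3) evolutionarily conserved
        and in_silico_support(v)       # (4) computational prediction
        and v.phenotype_compatible     # (5) phenotype-genotype correlation
    )


# Hypothetical variant for illustration only (not a variant from the study).
example = Variant(
    global_maf=0.0, clinvar_benign=False, gerp=5.2,
    sift_damaging=True, polyphen2_damaging=True, mutationtaster_damaging=False,
    cadd_phred=18.0, revel=0.4, phenotype_compatible=True,
)
too_common = replace(example, global_maf=0.01)  # fails criterion 1
```

Because the criteria are combined with a logical AND, a variant such as `too_common` is rejected on frequency alone, regardless of its conservation or prediction scores.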

Statistical Analysis

The statistical analysis was conducted using SPSS (Version 20.0 for Windows; IBM Corp., Armonk, NY, USA). To compare the characteristics between probands with and without monogenic PVs, two-sample independent t-tests were used for age at onset, initial severity by National Institutes of Health Stroke Scale (NIHSS) score, modified Rankin Scale (mRS) scores at baseline and three months after stroke, and risk factors. The number of risk factors (0–7) was derived by summing the risk factors present in a stroke patient, including hypertension, diabetes, hyperlipidemia, atrial fibrillation, cigarette smoking, excessive alcohol consumption, and obesity (body mass index ≥ 30). Categorical variables, including sex, stroke subtypes, and obesity, were compared by the chi-square test or Fisher’s exact test. The hit rate of monogenic stroke for a given stroke subtype was defined as the number of patients with monogenic causes divided by the total number of patients of that specific subtype. Statistical significance was set at p < 0.05.
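The risk factor count defined above is a simple sum of seven binary indicators. The sketch below shows one way to compute it; it is illustrative only and not the SPSS procedure used in the study. The dictionary keys, the function names, and the example patient are hypothetical, and obesity is derived from a `bmi` entry using the stated threshold of 30.

```python
# Six risk factors recorded as yes/no flags; obesity is derived from BMI below.
BINARY_FACTORS = (
    "hypertension",
    "diabetes",
    "hyperlipidemia",
    "atrial_fibrillation",
    "smoking",
    "excess_alcohol",
)
OBESITY_BMI = 30  # threshold stated in the text (body mass index >= 30)


def risk_factor_count(patient):
    """Return the number of risk factors present (0 to 7) for one patient record."""
    count = sum(1 for name in BINARY_FACTORS if patient.get(name, False))
    bmi = patient.get("bmi")
    if bmi is not None and bmi >= OBESITY_BMI:
        count += 1
    return count


def has_few_risk_factors(patient):
    """True for patients in the 0-1 risk factor group."""
    return risk_factor_count(patient) <= 1


# Hypothetical patient for illustration only.
patient = {"hypertension": True, "diabetes": False, "smoking": True, "bmi": 31.2}
n_factors = risk_factor_count(patient)
```

For this illustrative patient the count is three (hypertension, smoking, and obesity), so the patient does not fall in the 0–1 risk factor group.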

Results

Patient Characteristics of the Familial Stroke Caused by Monogenic Etiologies

Nine out of 161 patients (9/161 = 5.6%) with characteristic phenotypes and brain MRI/MRA findings were diagnosed via conventional Sanger sequencing of specific gene hotspots, including six with NOTCH3-related CADASIL (all carried p.R544C), two with GLA-related Fabry disease (p.Y365* and IVS4+919G>A), and one with m.3243A>G-related MELAS. For the remaining 152 probands, who had a family history but no clear candidate gene (152/161 = 94.4%), WES revealed pathogenic/likely pathogenic variants (PVs) in 24 individuals (24/152 = 15.8%). The overall yield of a PV for the relevant phenotype by the combination of WES and conventional sequencing was high, at 20.5% (33/161). The clinical diseases and the genotypes of the 33 probands with ascertained monogenic stroke are summarized in Table 1. The detailed phenotypic and genotypic spectrum is described below. Another 23 family members ascertained with monogenic stroke shared the corresponding phenotypes and genotypes with the individual probands (Supplementary Fig. S1-S2).

Table 1 Genotypes and phenotypes of the 33 unrelated probands diagnosed as monogenic familial stroke, including two carrying novel variants

The demographic characteristics of the 161 probands are shown in Table 2. The mean age at onset was 53.2 ± 13.7 years and 63.4% were male. The most common ischemic stroke subtypes were SVD (29.8%) and LAA (27.3%). ICH accounted for 11.3%. The stroke subtype distribution was close to that in previous publications for Asians (Kim and Kim 2014). Notably, the subtype distribution differed between the patients with and without ascertained monogenic stroke. The patients ascertained with monogenic stroke (n = 33) had more SVD (45.5% vs 25.8%, p = 0.046) and ICH (21.2% vs 8.6%, p = 0.04) than the unassigned probands. In contrast, LAA and cardioembolism were less common subtypes in the monogenic group. Additionally, the patients with monogenic stroke had fewer vascular risk factors (the proportion with 0–1 risk factor: 42.4% vs 19.5%, p = 0.01). Apart from stroke subtypes and risk factors, we did not find significant differences in the other clinical features between the two groups, including age at onset, sex, the number of stroke-affected members per family cluster, the initial severity by National Institutes of Health Stroke Scale (NIHSS) score, and the modified Rankin Scale (mRS) outcome at three months after stroke.
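A subtype comparison of this kind can be framed as a 2 × 2 table and tested with Fisher's exact test. The following is a minimal standard-library sketch, not the SPSS analysis used in the study, and it may not reproduce the reported p-values, which could come from the chi-square test. The function name is hypothetical, and the SVD counts (15 of 33 monogenic probands versus 33 of 128 unassigned probands) are back-calculated here from the reported percentages, which is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )
    return min(1.0, p_value)


# Rows: monogenic vs unassigned probands; columns: SVD vs other subtypes.
# Counts are back-calculated from the reported percentages (an assumption).
p_svd = fisher_exact_two_sided(15, 33 - 15, 33, 128 - 33)
```

The same function can be applied to any other categorical comparison between the two groups, for example ICH versus non-ICH, once the corresponding counts are tabulated.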

Table 2 Demographics of ascertained monogenic versus unassigned familial stroke patients

Phenotypic and Genotypic Spectrum and the Hit Rate of Monogenic Causes

Among the 33 probands with ascertained monogenic stroke (Table 1), CADASIL was the most common hereditary stroke disease (n = 16, 48.5%). The other disorders included pseudoxanthoma elasticum (n = 2), coagulation factor II disorder (n = 2), beta-thalassemia (n = 2), and Fabry disease (n = 2), as well as one proband each with MELAS, protein S deficiency, hereditary hemorrhagic telangiectasia (HHT), polycystic kidney disease (PKD), CADASIL2, moyamoya disease (MD), cerebral cavernous malformation (CCM), familial hyperlipidemia, and primary thrombocythemia.

The hit rates (HR) of detecting causative genes for a specific subtype are shown in Fig. 2. The greatest diagnostic yield was in ICH patients (7/18, 38.9%). Specifically, all four patients with structural vasculopathy subtype had a monogenic etiology (4/4, 100%): KRIT1 for CCM, RNF213 for MD, ENG for HHT, and PKD1 for PKD. For ischemic stroke (IS), the SVD subtype had the highest HR (15/48, 31.3%), followed by other determined causes (4/21, 19%), whereas the LAA (4/44, 9.1%) and the cardioembolic (0%) subtypes had the lowest HR. NOTCH3 mutations presented with various subtypes, including SVD (n = 13), hypertensive ICH (n = 2), and LAA (n = 1).
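The hit rate defined in the Methods is the number of probands with a monogenic cause divided by the total number of probands of that subtype. The sketch below recomputes it from the counts given in the text; it is illustrative only, and the names `SUBTYPE_COUNTS` and `hit_rate_percent` are hypothetical. The cardioembolic subtype is omitted because its denominator is not stated in the text. Half-up decimal rounding is used because Python's built-in `round` rounds exact halves to the nearest even digit, which would turn 31.25 into 31.2.

```python
from decimal import Decimal, ROUND_HALF_UP

# (probands with a monogenic cause, all probands of that subtype), from the text.
SUBTYPE_COUNTS = {
    "ICH": (7, 18),
    "SVD": (15, 48),
    "Other determined causes": (4, 21),
    "LAA": (4, 44),
}


def hit_rate_percent(monogenic, total):
    """Hit rate as a percentage, rounded half-up to one decimal place."""
    rate = Decimal(100 * monogenic) / Decimal(total)
    return rate.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


hit_rates = {
    subtype: hit_rate_percent(monogenic, total)
    for subtype, (monogenic, total) in SUBTYPE_COUNTS.items()
}
# Subtypes ordered from highest to lowest hit rate.
ranked = sorted(hit_rates, key=hit_rates.get, reverse=True)
```

Keeping the numerator and denominator together for each subtype makes it easy to check every reported percentage against its underlying counts.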

Fig. 2

Phenotypic and genotypic spectrum and the hit rate across stroke subtypes. CVT Cerebral venous sinus thrombosis, HTN hypertensive subtype, LAA large artery atherosclerosis, SMASH-U Structural lesion, Medication, Amyloid angiopathy, Systemic/other disease, Hypertension, Undetermined, SV structural vasculopathy, SVD small vessel disease, TOAST Trial of Org 10172 in Acute Stroke Treatment. *Novel variant

Intracerebral Hemorrhage (ICH)

The representative monogenic cases identified using WES are shown in Fig. 3. We identified a novel stop-gain variant, KRIT1 p.E379*, in a 63-year-old man with multiple CCMs presenting as recurrent ICH and cavernoma-related epilepsy (Fig. 3a). His epilepsy was first diagnosed at 27 years old and had been poorly controlled by medication since then. After his first ICH event in 2019, multiple CCMs were found by MRI, and the largest cavernoma (4.8 cm), in the left temporal region, was removed by craniotomy. The histopathology showed aggregated dilated vascular spaces with thinned walls and thrombus formation, which was compatible with cavernoma. His younger brother, who carried the same variant, also had recurrent ICH due to multiple CCMs and microbleeds in the bilateral cerebral and cerebellar hemispheres. In the segregation analysis, KRIT1 p.E379* was consistent with the disease and was present only in the two affected family members. This novel variant is absent from the global population database gnomAD and the Taiwan Biobank database. The stop-gain variant is presumed to cause loss of function of the gene at vascular endothelial tight junctions (Merello et al. 2016). It is predicted to be deleterious by multiple lines of bioinformatic evidence (CADD phred score 40, GERP++ NS 5.66). Therefore, this variant is considered pathogenic according to ACMG guideline criteria (PVS1 + PM2 + PP3 + PP4, see Table 1). Based on the genetic diagnosis, we provided genetic counseling, close follow-up with brain MRI, and strict blood pressure control for the patient and affected family members.

Fig. 3

Representative pedigrees of monogenic familial strokes. a A 63-year-old man with numerous cerebral cavernous malformations carried a novel stop-gain variant KRIT1 p.E379*, which segregated in terms of genotype and phenotype with his younger brother. b A 35-year-old woman with intraventricular hemorrhage due to moyamoya disease carried a heterozygous RNF213 p.R4810K variant. c A 42-year-old man with cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy (CADASIL) with recurrent infarctions and severe white matter hyperintensities carried homozygous NOTCH3 p.R544C variants, which segregated with his parents. d A 21-year-old woman with superior sagittal venous sinus thrombosis and bilateral hemorrhagic infarctions carried a novel likely pathogenic variant F2 p.F382L, which segregated within her family. AO age of onset, CTA computed tomography angiography, DVT deep vein thrombosis, DWI diffusion-weighted imaging, MRA magnetic resonance angiography, SWAN susceptibility-weighted angiography, WMH white matter hyperintensity. Yellow arrowheads indicate the occlusive vessel

A 35-year-old woman with no known risk factors presented with acute intraventricular hemorrhage and obstructive hydrocephalus, and brain MRA documented Suzuki stage IV disease with occlusive changes in the bilateral distal internal carotid arteries and tenuous anterior and middle cerebral arteries (Fig. 3b). Her mother had died of a hemorrhagic stroke in her late 40s. The proband was found to carry a heterozygous RNF213 p.R4810K variant, a hotspot variant of moyamoya disease (MD) in the East Asian population (Lee et al. 2015; Ishigami et al. 2022). We provided education about the condition and referred her to the neurosurgery department for extracranial-intracranial arterial bypass according to treatment recommendations (Fujimura et al. 2022).

WES detected an intronic pathogenic variant, PKD1 c.11014-10C>A, in a 38-year-old man with PKD who suffered a subarachnoid hemorrhage from a ruptured anterior communicating artery aneurysm. His father had PKD and died from a ruptured brain aneurysm. All three of his siblings had been diagnosed with PKD and shared the same PKD1 variant. Computed tomography (CT) angiography was suggested for these siblings to detect asymptomatic brain aneurysms. In addition, ENG p.Y258Lfs*76 was identified in a 23-year-old woman with an arteriovenous fistula who presented with ICH and seizure. The proband and both of her siblings had had spontaneous and recurrent nosebleeds since childhood, which was compatible with the hereditary hemorrhagic telangiectasia (HHT) phenotype. Education was also provided for these two probands.

Another two probands carried the CADASIL hotspot PV NOTCH3 p.R544C and suffered from dementia prior to the ICH event. HBB p.K18* variant was detected in a 63-year-old man who suffered from hypertensive thalamic ICH and thalassemia (low hemoglobin).

Ischemic Small Vessel Disease (SVD)

In this subtype, twelve probands were identified as carrying the hotspot PV NOTCH3 p.R544C and one as carrying NOTCH3 p.R1321C. A representative proband, carrying homozygous NOTCH3 p.R544C variants (Fig. 3c), had his first stroke at 42 years old and later developed early-onset dementia (Mini-Mental State Examination = 21) at the age of 46. His brain MRI/MRA revealed large tissue loss in the right anterior frontal region, severe bilateral white matter hyperintensities, and multisegmental mild narrowing in the M1 segment of the right middle cerebral artery. He also had post-stroke seizures treated with Frisium and Depakine. Both of his parents were confirmed to carry heterozygous NOTCH3 p.R544C and had moderate white matter hyperintensities. Regular MRI follow-up for brain microbleeds and careful antithrombotic adjustment were suggested. The other pedigrees of NOTCH3-related stroke are presented in Supplementary Fig. S1. Another proband, a 51-year-old woman with a CADASIL syndrome of stroke and mild dementia, carried heterozygous HTRA1 p.G276A, which was reported to be likely pathogenic for cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy type 2 (CADASIL2) (Lee et al. 2018). Furthermore, a 52-year-old woman with thalassemia who suffered a lacunar stroke in the motor area carried the HBB p.F42fs variant.

Cerebral Venous Sinus Thrombosis (CVT)

We discovered a novel variant, F2 p.F382L, in a 21-year-old woman suffering from acute consciousness change and generalized tonic–clonic seizures due to cerebral venous sinus thrombosis and hemorrhagic infarction in the bilateral frontoparietal cortices (Fig. 3d). Her mother and maternal grandmother, who also carried F2 p.F382L, both suffered from recurrent episodes of deep vein thrombosis and cerebral venous thrombosis. One maternal uncle died from pulmonary embolism in his 30s and another suffered from stroke and deep vein thrombosis in his 50s. Decreased coagulation factor II activities were observed in the proband, her mother, and her grandmother (44.7, 67.1, and 68.5%, respectively; reference: 79–131%). The segregation analysis was consistent with the disease in the aforementioned three affected family members and one unaffected family member. This novel variant is absent from the global population database gnomAD and the Taiwan Biobank database. However, it is unknown how this mutation leads to low factor II activity and to prothrombotic and prohemorrhagic tendencies. The variant location is highly evolutionarily conserved (GERP++ NS = 5.23, phastCons = 1) and encodes the exosite I domain of factor IIa (thrombin), which mediates the interaction with fibrin (Huntington 2005). Missense variants in the F2 gene are rarely benign (ClinVar F2 missense variants: 21/38 pathogenic or likely pathogenic, versus 2/38 benign or likely benign). Taken together, this variant is considered likely pathogenic according to ACMG guideline criteria (PM1 + PM2 + PP1 + PP2 + PP4, see Table 1). Establishing the genetic etiology prompted timely non-vitamin-K oral anticoagulant therapy with a specific factor Xa inhibitor (Yaghi et al. 2022), which has successfully prevented the recurrence of thromboembolic events and achieved good functional outcomes in this family over years of follow-up to date.

Other Determined Diseases and Undetermined Ischemic Stroke

A 43-year-old male proband carrying a rare homozygous ABCC6 p.E709K variant had infarcts in the bilateral corona radiata, bilateral basal ganglia, and pons because of vertebrobasilar artery dissection, which was phenotypically compatible with ABCC6-related pseudoxanthoma elasticum. Pseudoxanthoma elasticum is an inherited elastic fiber disorder usually caused by ABCC6 variants in an autosomal recessive inheritance pattern, but heterozygous carriers may express partial phenotypes (Miksch et al. 2005; Plomp et al. 2004). In addition, a rare pathogenic variant, F2 p.R596Q, known to cause thrombophilia and antithrombin resistance (Takagi et al. 2016), was identified in a 41-year-old man who presented with an embolic stroke in the right parietal cortex. The proband’s factor II activity remained mildly low during follow-up. Lastly, JAK2 p.V617F, a well-known pathogenic variant for myeloproliferative disorders, was identified in a 72-year-old woman with thrombocythemia and an undetermined subtype of stroke.

Large Artery Atherosclerosis (LAA)

A 59-year-old male proband who had basilar artery atherosclerotic infarction as well as peripheral artery occlusive disease carried the heterozygous ABCC6 p.R1235W variant, which is known to be pathogenic for pseudoxanthoma elasticum. In addition, LDLR p.D90N, a PV for familial hyperlipidemia, was detected in a 55-year-old man who suffered from recurrent stroke, with the third event being an acute vertebral artery occlusion with infarction of the bilateral cerebellum and left medulla oblongata. Furthermore, WES identified a PROS1 p.Y592* mutation in a 72-year-old man who had protein S deficiency and episodes of large artery atherosclerosis-related infarction and deep vein thrombosis. The same variant was also detected in his son, who likewise suffered from deep vein thrombosis and decreased protein S activity. Lastly, a 62-year-old woman who suffered an infarction from in-situ thrombosis of the right posterior cerebral artery was identified as carrying the hotspot pathogenic variant NOTCH3 p.R544C.

Stroke-Associated Variants of Uncertain Significance (VUS)

Apart from pathogenic or likely pathogenic findings, 41 rare VUS (Table 3) in 325 stroke-related candidate genes (Supplementary Table S1) were detected in 39 probands, including eight novel variants: ABCA1 p.H1223P, ACTA2 p.N298S, COL5A1 p.P930R, FBN1 p.Q514E, FBN1 p.K1052T, GP1BA p.P437*, RNF213 p.N1631Efs*34, and TSC1 p.P397del. Although the evidence of pathogenicity is currently insufficient because familial segregation data were unavailable or functional evidence is scarce, the rare allele frequency in our population, the in silico predictions, and the corresponding clinical manifestations indicate that these variants may still be associated with the pathogenesis of stroke. We report them here in the expectation that more evidence will elucidate their clinical significance in the future. Finally, no variants remained after variant filtering in 42 probands, and another 47 had variants incompatible with the phenotype or inheritance mode; these probands were categorized as unidentified (n = 89).

Table 3 Variants of uncertain significance in 325 stroke-associated genes detected in this study

Discussion

We proposed stroke subtype-guided WES in combination with prevalent gene/hotspot screening and achieved an overall diagnostic yield of 20.5% for monogenic causes (33 out of 161 unrelated probands with well-defined familial stroke). This study included all IS and ICH subtypes and thus outlined a more comprehensive genomic landscape of familial stroke. The identified PVs were distributed across a wide range of genes and rare diseases. For example, CADASIL syndrome could be caused by NOTCH3 or HTRA1 mutations, both of which account for a significant number of SVD and/or ICH patients in our population (Lee et al. 2018). Notably, we identified two novel PVs: KRIT1 p.E379* in a family with cerebral cavernous malformation-related ICH, and F2 p.F382L in a family with cerebral venous thrombosis-related stroke. Our findings demonstrate the power and necessity of WES in the genetic diagnosis of monogenic stroke, a disease entity with high heterogeneity and complex etiologies.

Compared with the previous next-generation sequencing (NGS) studies in stroke patients, our diagnostic yield was relatively high as summarized in Table 4, although these study designs, methods and key findings were substantially different. Overall, most NGS studies were limited by small sizes, restricted screened genes, or narrow stroke subtypes (Tan et al. 2019; Lee et al. 2019; Bersano et al. 2016; Ilinca et al. 2020; Mishra et al. 2019; Auer et al. 2015). According to our results, the subtypes of familial ICH and SVD stroke, determined by the clinical features and characteristic brain MRI features, can provide the most informative clue to possible underlying monogenic etiologies. The highest HR was in the patients with the ICH (38.9%), especially for structural vasculopathy (100%), followed by SVD (31.3%). In parallel, when a patient with familial stroke has very few (0–1) conventional vascular risk factors, an underlying monogenic etiology should be aggressively searched. Otherwise, there was no difference in age at onset, sex, family history, the initial stroke severity or stroke outcome between the patients with and without a definite genetic diagnosis.

Table 4 Comparison of the diagnostic yield of monogenic stroke diseases among previously relevant genetic studies and our study*

NOTCH3 is the most prevalent disease-causing gene in our cohort of familial stroke. The presence of certain NOTCH3 variants is also an important risk factor for sporadic stroke in the Taiwanese population (Lee et al. 2019). Because NOTCH3 p.R544C is the leading hotspot in both familial stroke (15/161, 9.3%, in this study) and sporadic stroke patients in Taiwan (Wei et al. 2021), patients with suspected CADASIL could be screened for NOTCH3 p.R544C by Sanger sequencing. However, if the hotspot screening is negative, it would be tedious and time-consuming to screen the whole NOTCH3 gene by conventional sequencing or to search for other monogenic cerebrovascular diseases. In this study, we adopted the Taiwan Biobank (TWB) 2.0 SNP Array in the genetic diagnosis pipeline, which includes NOTCH3 p.R544C and other pathogenic intronic and mitochondrial variants not covered by WES, such as m.3243A>G for MELAS (Wei et al. 2021). Our integrative NGS approach combining WES and the TWB 2.0 Array may be an efficient alternative to conventional sequencing for non-CADASIL stroke patients. Establishing a genetic diagnosis is important for personalized antiplatelet therapy (Chung et al. 2020) and early risk factor modification in carriers of these disease-associated variants (Lee et al. 2019).

Genetic diagnosis of familial CCM is an emerging issue. Familial CCM, whose hallmark is multiple ICH of the structural vasculopathy subtype, carries significantly higher risks of rebleeding and epilepsy than sporadic CCM. The Angioma Alliance Scientific Advisory Board Clinical Experts Panel has recommended that diagnostic genetic testing be performed when there is a positive family history or multiple CCMs (Akers et al. 2017). The three responsible genes, CCM1 (KRIT1), CCM2 (MGC4607) and CCM3 (PDCD10), are associated with variable risks of rebleeding or refractory epilepsy (Fischer et al. 2013; Yadla et al. 2010; Riant et al. 2010). CCM3 variations have been reported to be associated with a higher risk of early-onset ICH and recurrent bleeding (Choquet et al. 2015; Riant et al. 2013; Shenkar et al. 2015). Therefore, establishing the genetic etiology is essential for predicting the risk and severity of CCM-related epilepsy or ICH, and it informs decisions about genetic testing and individualized treatment plans.

This study has limitations, and caution is needed in determining the clinical significance of the identified rare variants. First, interpretation is particularly challenging for novel variants or VUS with limited clinical evidence. Accumulating genomic data in large populations may help to clarify the significance of some undetermined variants (VUS in 39 out of 161, 24.2%) in the future. Second, the function of the novel variant F2 p.F382L is unknown. Further studies of its impact on thrombin-fibrin interaction and the thrombin-antithrombin complex (Miyawaki et al. 2012) are warranted to elucidate the pathophysiology. Third, this study primarily focused on known genes related to stroke. Given that trio studies were unavailable for most of our mid-age-onset and late-onset stroke probands, novel causative genes are probably concealed in the unidentified cases (89 out of 161, 55.3%). Fourth, severely affected patients, e.g., those who were unconscious or had unstable vital signs, could not sign the consent form, so there was selection bias by stroke severity. Lastly, in our small, single-center cohort, the LAA and cardioembolic stroke subtypes made up much smaller proportions of the patients with a definite genetic diagnosis, suggesting that these subtypes are more likely secondary to acquired and aging factors than to monogenic causes. Further studies are warranted.

In summary, monogenic diseases cause a significant number of familial strokes, particularly in patients with the ICH or SVD subtypes. This study supports subtype-based WES as an effective approach to delineating rare monogenic etiologies of familial stroke. Well-characterized phenotypes, including the clinical and neuroimaging manifestations, are the cornerstone for guiding the diagnostic approach and enhancing the genetic yield. We also propose that a SNP array has the potential to serve as an alternative for screening ethnicity-specific gene hotspots. The integrative diagnostic pipeline could target the most prevalent monogenic causes and benefit patients and their families by introducing therapeutic interventions, thereby improving personalized stroke medicine. Accumulating knowledge of genotype–phenotype correlations would strengthen our understanding of the genomic landscape of stroke.