Next-Generation Sequencing-Based Copy Number Variation Analysis in Chinese Patients with Primary Ciliary Dyskinesia Revealed Novel DNAH5 Copy Number Variations

Primary ciliary dyskinesia (PCD) is a rare disorder characterized by extensive genetic heterogeneity. However, copy number variation (CNV) has received little attention in the genetic pathogenesis of PCD and has rarely been reported, especially in China. Next-generation sequencing (NGS) followed by targeted CNV analysis was used in patients highly suspected of having PCD whose routine whole-exome sequencing (WES) analysis was negative. Quantitative real-time polymerase chain reaction (qPCR) and Sanger sequencing were used to confirm these CNVs. To further characterize the ciliary phenotypes, high-speed video microscopy analysis (HSVA), transmission electron microscopy (TEM), and immunofluorescence (IF) analysis were used. Patient 1 (F1: II-1), a 0.6-year-old girl, came from a nonconsanguineous family (Family I). She presented with situs inversus totalis, neonatal respiratory distress, and sinusitis. The nasal nitric oxide level was markedly reduced. The respiratory cilia beat with reduced amplitude. TEM revealed shortened outer dynein arms (ODA) of cilia. A deletion, chr5:13717907-13722661del, spanning exons 71–72 was identified by NGS-based CNV analysis. Patient 2 (F2: IV-4), a 37-year-old man, and his eldest brother, Patient 3 (F2: IV-2), came from a consanguineous family (Family II). Both had sinusitis, bronchiectasis, and situs inversus totalis. The respiratory cilia of Patient 2 and Patient 3 were uniformly immotile, with ODA defects. Two novel homozygous deletions, chr5:13720087_13733030delinsGTTTTC and chr5:13649539_13707643del, spanning exons 69–71 and exons 77–79, respectively, were identified by NGS-based CNV analysis. Abnormalities in DNA copy number were confirmed by qPCR amplification. IF showed that the respiratory cilia of Patient 1 and Patient 2 were deficient in dynein axonemal heavy chain 5 (DNAH5) protein expression. This report identified three novel DNAH5 disease-associated variants by WES-based CNV analysis. Our study expands the genetic spectrum of PCD with DNAH5 in the Chinese population.

Supplementary Information: The online version contains supplementary material available at 10.1007/s43657-023-00130-0.


Introduction
Primary ciliary dyskinesia (PCD) is a rare, heterogeneous ciliopathy resulting in mucociliary clearance failure. Patients typically present with newborn respiratory distress, daily wet cough, chronic nasal congestion, and laterality defects. PCD diagnosis can be achieved by following diagnostic algorithms that include nasal nitric oxide (nNO) measurements, high-speed video microscopy analysis (HSVA), immunofluorescence (IF), transmission electron microscopy (TEM), and molecular testing (Lucas et al. 2017). Currently, approximately 50 genes are known to be associated with PCD (Wallmeier et al. 2020). Due to a lack of appropriate diagnostic facilities and trained diagnosticians, few PCD patients have been reported in China (Guan et al. 2021).
Dynein axonemal heavy chain 5 (DNAH5) is a force-generating protein that produces ciliary bending (Olbrich et al. 2002). DNAH5 has been reported to be responsible for 15-29% of all cases of PCD in European and American populations (Zariwala et al. 1993). Previous studies focused on patients with DNAH5 variants in non-Asian populations, revealing that mutations in DNAH5 lead to absent or shorter outer dynein arms (ODA) in respiratory cilia that are mostly immotile or only exhibit flickering movements (Raidt et al. 2014; Hornef et al. 2006; Emiralioğlu et al. 2020).
At least 100 different pathogenic variants in the DNAH5 gene have been reported (Zariwala et al. 1993; Hornef et al. 2006; Emiralioğlu et al. 2020; Wang et al. 2021). However, the clinical significance of copy number variations (CNVs) in DNAH5 has rarely been reported. Moreover, CNVs in PCD have never been examined in the Chinese population. CNVs contribute substantially to autosomal recessive Mendelian disorders: they have been reported in up to 10.8% of such disorders but are not detectable by routine whole-exome sequencing (WES) analysis (Aradhya et al. 2012). In this study, we performed WES analysis to identify disease-causing variants. For "negative" cases, WES-based CNV analysis was conducted to detect CNVs. Eventually, three PCD patients in the Chinese population carrying three novel DNAH5 CNVs were identified. Sanger sequencing, quantitative real-time polymerase chain reaction (qPCR), and IF were performed to further support pathogenicity.

Patients and Clinical Materials
The project was approved by the Ethics Committee of The Children's Hospital of Fudan University, and written informed consent was acquired from the guardians of the patients participating in this study. Three affected individuals and five relatives from two families were recruited for this study. Clinical data, including lung function, high-resolution computed tomography (HRCT), bacterial cultures, and PCD-related diagnostic tests, were obtained at enrollment. At the same time, clinical evaluations and genetic testing were performed in five relatives, including the patients' parents and Patient 3's daughter (Table S1). The diagnosis of PCD was based on clinical findings, nNO, HSVA, TEM, and genetic testing, in accordance with the guidelines of the European Respiratory Society (Lucas et al. 2017).

nNO Measurements
Measurements of nNO were performed with an EcoMedics CLD88 chemiluminescence NO analyser (Duernten, Switzerland); the aspiration sampling rate of 330 mL/min (0.33 L/min) was verified before and after each subject was tested. In cooperative children, nNO was measured during a breath-hold maneuver of at least 10 s to close the velum, as described previously (Guo et al. 2020). For children who were uncooperative (less than five years old), nasal sampling was performed for 60 s during tidal breathing. The results were reported in nL/min with the following equation: nNO (nL/min) = NO (ppb) × sampling rate (L/min).
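The conversion from measured concentration to production rate can be illustrated with a minimal Python sketch. It assumes the sampling flow is expressed in L/min, so that 1 ppb × 1 L/min = 1 nL/min; the function name and the example concentration of 30 ppb are hypothetical and are not measurements from this study.

```python
def nno_production_rate(no_ppb: float, flow_l_per_min: float) -> float:
    """Return nasal NO output in nL/min.

    1 ppb corresponds to 1 nL of NO per litre of sampled gas, so
    concentration (ppb) x sampling flow (L/min) yields nL/min.
    """
    if no_ppb < 0 or flow_l_per_min <= 0:
        raise ValueError("concentration must be >= 0 and flow must be > 0")
    return no_ppb * flow_l_per_min


# Hypothetical reading of 30 ppb at an assumed sampling flow of 0.33 L/min.
rate = nno_production_rate(30.0, 0.33)
print(f"{rate:.1f} nL/min")
```

Keeping the flow as an explicit argument makes the unit assumption visible and lets the same function be reused if the sampling rate verified before a session differs from the nominal value.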

TEM and HSVA
Nasal tissue was collected using a Rhino-Probe (Arlington Scientific, Springville, UT). To avoid secondary ciliary dyskinesia, nasal samples were taken in patients who were free from acute airway infection for at least four weeks. The tissue was suspended in L-15 medium (Invitrogen, CA) immediately for analysis using a Leica inverted microscope (Leica DMI300B, Solms, Germany) with a 63× oil objective under differential interference contrast optics. Cilia beats were recorded at 200 frames/s at room temperature (25 °C) using a 680 PROSILICA GE camera (Allied Vision, PA). At least 10 videos were derived from each sample. The digital recordings were evaluated by two experienced but blinded investigators.
The ciliary ultrastructure was analyzed in nasal tissue fixed in 2.5% glutaraldehyde, as described previously (Guo et al. 2020). For each specimen, at least 30 transverse ciliary sections of different cells were used to evaluate the internal axonemal structure. Ciliary abnormalities were defined as the presence of defects in > 50% of cilia.

WES Analysis
Blood samples were obtained from the proband and available family members. Genomic deoxyribonucleic acid (DNA) was extracted using the Gene Blood DNA Rapid Extraction Kit (Qiagen, China). WES was performed by Gemple Biotech Co., Ltd. (Shanghai, China). Whole-exome libraries were constructed using the KAPA platform and KAPA Hyper Prep kit (Roche KAPA, Switzerland). Exomes were captured using SeqCap EZ MedExome (Roche NimbleGen) and sequenced on a HiSeq X Ten instrument (Illumina, San Diego, CA). Raw data were evaluated using FASTQC (version 0.11.5), and the linker sequences were removed by Cutadapt (version 1.10). BWA software (version 0.7.15) was used to align the reads to the human reference genome GRCh37/hg19 (UCSC). Variant filtering was conducted according to variant type, pathogenicity predictor scores, and variant frequencies in population databases. Briefly, allelic variants with a frequency ≥ 5% in any of the databases used (GnomAD, ExAC and 1000 Genomes) were filtered out. Variants classified as benign or probably benign by multiple submitters in the ClinVar database, synonymous variants, and intronic variants located more than 10 nucleotides from the exon/intron junction were also filtered out. Nonsynonymous missense variants with a "benign effect" according to all pathogenicity predictors (SIFT, PolyPhen, MutationTaster, CADD) were excluded. The remaining variants were further assessed according to the inheritance model, available reports on pathogenicity, and recorded clinical manifestations. Variants were classified following the American College of Medical Genetics and Genomics (ACMG) guidelines (Richards et al. 2015). Sanger sequencing was performed to validate the candidate variants, and segregation analyses were performed in family members. The RefSeq accession numbers of the transcript and corresponding protein isoform used for mutation nomenclature were NM_001369.2 and NP_001360.1, respectively.
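The filtering rules described above can be sketched as a simple predicate. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the `Variant` record, its field names, and the example variants are all hypothetical, and annotation values are assumed to have been collected beforehand.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Variant:
    """Hypothetical annotated variant record (field names are illustrative)."""
    name: str
    consequence: str                 # e.g. "missense", "synonymous", "intronic"
    max_population_af: float         # highest frequency across population databases
    clinvar_benign: bool = False     # benign / probably benign in ClinVar
    distance_to_junction: Optional[int] = None   # nucleotides, intronic variants only
    predictor_benign: Dict[str, bool] = field(default_factory=dict)


def passes_filters(v: Variant) -> bool:
    """Return True if the variant survives the filters described in the text."""
    if v.max_population_af >= 0.05:          # frequency >= 5% in any database
        return False
    if v.clinvar_benign:
        return False
    if v.consequence == "synonymous":
        return False
    if (v.consequence == "intronic"
            and v.distance_to_junction is not None
            and v.distance_to_junction > 10):   # deep intronic
        return False
    if (v.consequence == "missense"
            and v.predictor_benign
            and all(v.predictor_benign.values())):  # benign by every predictor
        return False
    return True


# Hypothetical examples.
candidates = [
    Variant("common_missense", "missense", 0.12),
    Variant("deep_intronic", "intronic", 0.0001, distance_to_junction=45),
    Variant("splice_site", "intronic", 0.0, distance_to_junction=1),
    Variant("benign_missense", "missense", 0.001,
            predictor_benign={"SIFT": True, "PolyPhen": True,
                              "MutationTaster": True, "CADD": True}),
]
kept = [v.name for v in candidates if passes_filters(v)]
print(kept)
```

Variants that survive such a filter would still need the manual assessment of inheritance model, literature, and phenotype described in the text.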

CNV Analyses
CNV detection based on targeted capture-based next-generation sequencing (NGS) data was performed using the R package panel copy number estimation by a mixture of Poissons (panelcn.MOPS) (Povysil et al. 2017) set to default parameters. The panelcn.MOPS package is based on the genome-wide and whole-exome-wide CNV detection tool cn.MOPS (Klambauer et al. 2012). cn.MOPS builds a local model that captures the read characteristics of each region of interest, avoiding bias induced by the targeting procedure (Povysil et al. 2017). Twenty-five blood samples that were sequenced using the same targeted panel and did not show any CNVs in complementary array analysis were used as the control dataset. The complete target regions of all target genes of the test and control samples were used for normalization and as controls for the panelcn.MOPS pipeline. Probes spanning an individual region of more than 300 nucleotides were subdivided into smaller targets of at least 70-100 nucleotides and a maximum of 200 nucleotides to increase the resolution of CNV detection. In addition, a csv file was created for each sample, displaying statistical parameters and copy number changes (CN: CN0 = loss; CN1 = one copy; CN2 = two copies; CNx = x copies) for each target.
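The subdivision of long probe regions can be illustrated with a short Python sketch. It is not the panelcn.MOPS implementation; the function name, the half-open coordinate convention, and the even-split strategy are assumptions chosen only to satisfy the size limits stated above (regions longer than 300 nucleotides are split into pieces of at most 200 nucleotides, which here never fall below 70 nucleotides).

```python
import math
from typing import List, Tuple


def split_target(start: int, end: int,
                 split_above: int = 300,
                 max_len: int = 200) -> List[Tuple[int, int]]:
    """Split a half-open interval [start, end) into near-equal sub-targets.

    Regions of at most `split_above` nucleotides are returned unchanged.
    Longer regions are cut into the smallest number of pieces that keeps
    every piece at or below `max_len` nucleotides.
    """
    length = end - start
    if length <= 0:
        raise ValueError("end must be greater than start")
    if length <= split_above:
        return [(start, end)]
    n_pieces = math.ceil(length / max_len)
    base, extra = divmod(length, n_pieces)
    pieces = []
    pos = start
    for i in range(n_pieces):
        size = base + (1 if i < extra else 0)  # spread the remainder evenly
        pieces.append((pos, pos + size))
        pos += size
    return pieces


# Hypothetical 1,000-nt capture region starting at position 0.
for sub_start, sub_end in split_target(0, 1000):
    print(sub_start, sub_end, sub_end - sub_start)
```

Splitting evenly, rather than cutting fixed 200-nt windows and leaving a short remainder, avoids producing a final sub-target that is too small to give a stable read count.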
An average of 10 Gb of raw data (fastq) was generated for each sample by Illumina sequencers and used as input. First, quality control of the paired-end reads was performed with FASTQC version 0.11.5. Second, Burrows-Wheeler Aligner (BWA) version 0.7.15 (Li and Durbin 2009) was used to align sequencing reads to the reference genome GRCh37, generating SAM format files. Third, the SAM files were converted to BAM files using Samtools version 1.3.1 (Li et al. 2009), and duplicates were removed with Picard version 2.5 (http://broadinstitute.github.io/picard/). After these steps, variant calling was performed with GATK version 3.5 (McKenna et al. 2010) to generate the vcf file. Finally, in-house software was used to annotate the variants from the vcf file and integrate information from multiple databases. The final variants were then passed to the downstream analysis pipeline.

qPCR Analysis
To confirm CNVs, qPCR was performed using DNA from patients, their parents, and two healthy controls of similar age to Patient 1 and Patient 2. DNA was extracted from peripheral blood. qPCR was prepared using KAPA SYBR FAST qPCR Kit Master Mix (2×) Universal (Roche KAPA, Switzerland) on a CFX96™ Real-Time System instrument (BIO-RAD) and was performed on a LightCycler 480 System II (Roche Diagnostics) in triplicate. The reaction conditions were set at 95 °C for 3 min, followed by 40 cycles of 95 °C for 5 s and 55 °C for 30 s. Specific primers designed using NCBI tools (https://www.ncbi.nlm.nih.gov/) are summarized in Table S2. The comparative ΔΔCt method was used to calculate relative DNA content, with data normalized to the mean level of an internal standard.
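The comparative ΔΔCt calculation can be sketched as follows. This is a minimal illustration that assumes roughly 100% amplification efficiency (a doubling per cycle); the function names, the Ct values, and the classification cut-offs are hypothetical and are not taken from this study.

```python
def relative_copy_ratio(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Relative DNA content of the target region by the comparative ddCt method.

    dCt   = Ct(target) - Ct(internal reference), computed per individual
    ddCt  = dCt(sample) - dCt(control)
    ratio = 2 ** (-ddCt), assuming the product doubles every cycle
    """
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_control = ct_target_control - ct_ref_control
    return 2.0 ** -(d_ct_sample - d_ct_control)


def classify_ratio(ratio: float) -> str:
    """Map a ratio to a copy-number state using illustrative cut-offs."""
    if ratio < 0.25:
        return "homozygous deletion"
    if ratio < 0.75:
        return "heterozygous deletion"
    if ratio < 1.25:
        return "two copies"
    return "copy number gain"


# Hypothetical Ct values: the target amplifies one cycle later in the sample.
ratio = relative_copy_ratio(26.0, 20.0, 25.0, 20.0)
print(ratio, classify_ratio(ratio))
```

A ratio near 1 is expected for two copies, near 0.5 for a heterozygous deletion, and near 0 for a homozygous deletion, which is the pattern used to interpret the qPCR results in the families studied here.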

Whole-Genome Sequencing (WGS)
Genomic DNA was obtained and sheared into fragments with an average length of 350 bp to construct libraries using the KAPA platform and KAPA Hyper Prep kit (Roche KAPA, Switzerland). The libraries were then subjected to sequencing on the Illumina HiSeq X platform, and 150 bp paired-end reads were generated. The same bioinformatics pipeline described above for the exome sequencing analysis was used to analyze DNAH5 mutations and precise breakpoints.

Clinical Data
A 0.6-year-old female (II-1) from Family 1 (Fig. 1a) was referred to our clinic with a chief complaint of persistent wet cough and nasal congestion. Situs inversus totalis was detected during the fetal period (Fig. 1b). The girl was born naturally at full term but developed shortness of breath within a few hours of life and needed supplemental oxygen. No obvious abnormality was noticed on CT scans of the lung (Fig. 1c). A CT scan of the paranasal sinus showed maxillary sinusitis and otitis media (Fig. 1d). Nasal NO production was 8.1 nL/min, which is low and compatible with PCD.
In Family 2, Patient 2 (IV-4) was a 39-year-old male, an offspring of a consanguineous family (Fig. 2a). The proband did not suffer from neonatal respiratory distress during the newborn period or recurrent respiratory diseases during childhood. Otitis and hearing loss were not reported. However, he had suffered from productive cough and recurrent pneumonia since adolescence. In addition, the proband was diagnosed with infertility. Chest X-ray (CXR) showed that the cardiac apex was rightward (Fig. 2d). HRCT showed a tree-in-bud pattern and mild bronchiectasis without significant patchy shadow (Fig. 2e). nNO testing of Patient 2 revealed an nNO production rate of 9.9 nL/min. Two siblings of Patient 2 were also affected. The eldest brother, Patient 3 (IV-2), had suffered from recurrent respiratory diseases and had been hospitalized for pneumonia repeatedly since childhood. HRCT showed situs inversus totalis and imaging features of bronchiectasis (Fig. 2h). In addition, a CT scan of the paranasal sinus showed thickened mucosa of the bilateral maxillary, ethmoidal, and sphenoid sinuses (Fig. 2i). Patient 3 died of lung failure at the age of 42 years. The clinical records of IV-3 were not accessible. According to his parents, IV-3 had had recurrent lung infections since the age of three years. He was diagnosed with situs inversus and developed lung failure at the age of 38. After six months of home oxygen therapy, he died of lung failure. No PCD symptoms were observed in the consanguineous parents or Patient 3's daughter. No abnormal organ positions or bronchiectasis features were observed on CXR (Fig. 2b, c).
The detailed clinical manifestations and latest medical examination of each patient are summarized in Table 1.

Ciliary Structural and Functional Analysis
Ultrastructural analysis of Patient 1 showed a shortened ODA. HSVA of her nasal cilia exhibited minimal residual, disorganized beating (Video S1). Ultrastructural analysis of Patient 2 and Patient 3 showed complete absence of ODA. Most frames in the videos showed completely immotile cilia (Video S2). A powerful beating stroke followed by a recovery stroke was observed in the healthy control (Video S3).

Genetic Analysis and Validation
The WES results were analyzed with an algorithm to detect CNVs. The positions of the CNVs identified in our study are shown in Fig. 3a. A CNV causing a heterozygous deletion corresponding to exons 71-72 of DNAH5 was suspected in Patient 1. To verify this large deletion, qPCR was performed with DNA from the proband and her family members. Single copy loss was detected in Patient 1 and her father by qPCR (Fig. 3b). As WES only covers exonic regions, WGS was performed to specify the precise breakpoints. Eventually, chr5:13717907-13722661del was detected and confirmed by Sanger sequencing (Fig. 3d). Two CNVs causing homozygous deletions spanning exons 69-71 and exons 77-79 were suspected in Patient 2 and his eldest brother, Patient 3 (Table 2); qPCR verified that exons 69-71 and exons 77-79 were homozygously deleted in Patient 2 and Patient 3 and heterozygously deleted in their consanguineous parents (Fig. 3c). This indicated an autosomal recessive inheritance pattern, consistent with the known inheritance of PCD. To specify the precise breakpoints of the two CNVs, WGS was performed. Ultimately, chr5:13720087_13733030delinsGTTTTC and chr5:13649539_13707643del were detected and confirmed by Sanger sequencing (Fig. 3e).
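For orientation, the genomic extent of each reported deletion can be derived directly from its breakpoint coordinates. The Python sketch below is illustrative only: it assumes the coordinates are 1-based and inclusive, and the `Deletion` type, the regular expression, and the function name are hypothetical helpers rather than part of any published tool.

```python
import re
from typing import NamedTuple, Optional


class Deletion(NamedTuple):
    chrom: str
    start: int
    end: int
    inserted: Optional[str]   # sequence after "delins", if any

    @property
    def deleted_bp(self) -> int:
        # Inclusive coordinates: both breakpoint positions are deleted.
        return self.end - self.start + 1


# Accepts the two notations used in the text: "start-end" and "start_end".
_PATTERN = re.compile(r"(chr[0-9XYM]+):(\d+)[-_](\d+)del(?:ins([ACGT]+))?")


def parse_deletion(description: str) -> Deletion:
    match = _PATTERN.fullmatch(description)
    if match is None:
        raise ValueError(f"unrecognized deletion description: {description}")
    chrom, start, end, inserted = match.groups()
    return Deletion(chrom, int(start), int(end), inserted)


for text in ("chr5:13717907-13722661del",
             "chr5:13720087_13733030delinsGTTTTC",
             "chr5:13649539_13707643del"):
    deletion = parse_deletion(text)
    print(text, deletion.deleted_bp, deletion.inserted)
```

Under the inclusive-coordinate assumption, the three events remove 4,755 bp, 12,944 bp (with 6 bp inserted), and 58,105 bp of genomic sequence, respectively.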
To further confirm the pathogenicity of the CNVs, we analyzed the cDNA of Patient 2. The consequence of chr5:13720087_13733030del was verified at the messenger ribonucleic acid (mRNA) level. It resulted in a reading frameshift after exon 68, followed by a premature stop codon in exon 72 (Fig. S1a). Subsequently, ciliary protein from Patient 2 was extracted for western blotting. The level of DNAH5 protein expression was markedly decreased (Fig. S1b). Because Patient 1 is an infant, we obtained very few cilia samples. Therefore, cDNA sequencing and western blotting could not be performed on Patient 1. However, chr5:13717907-13722661del, which spans exons 71 and 72 of DNAH5, was predicted to cause a 466-bp deletion at the mRNA level. The 466-bp deletion was predicted to cause a reading frameshift during translation, which could damage protein coding. In addition, c.13724-1G>A was predicted to cause a splicing aberration by scsnv. Finally, we studied DNAH5 localization in respiratory cells by IF and found a complete absence of DNAH5 in the ciliary axonemes of Patient 1 and Patient 2 compared with the healthy control (Fig. 4).
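The frameshift prediction for the 466-bp deletion follows from simple arithmetic: a deletion preserves the reading frame only if its length in the coding sequence is a multiple of three. The short Python sketch below makes this explicit; the function name is hypothetical.

```python
def disrupts_reading_frame(deleted_nt: int) -> bool:
    """Return True if removing `deleted_nt` coding nucleotides shifts the frame."""
    if deleted_nt <= 0:
        raise ValueError("deletion length must be positive")
    return deleted_nt % 3 != 0


# 466 = 155 * 3 + 1, so one nucleotide is left over and the frame shifts.
print(466 % 3, disrupts_reading_frame(466))
```

Because 466 leaves a remainder of 1 when divided by 3, every codon downstream of the deletion is read out of frame, which is consistent with the predicted disruption of the protein-coding sequence.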

Discussion
Our study recruited patients with situs inversus totalis, chronic sinusitis, and productive cough. HSVA, nNO, and TEM results raised a high suspicion of PCD in these patients. A total of 115 patients have undergone genetic testing in our center. Through NGS and targeted CNV analysis, clinically significant single-nucleotide variants (SNVs) have been identified in 97 patients. CNVs have been detected in three patients. Subsequent qPCR confirmed the copy number abnormalities. WGS and Sanger sequencing were performed to specify and confirm the precise breakpoints. IF was performed to further support the pathogenicity. In addition, there are 15 cases in our cohort where the carrier status is unknown.
Several studies have systematically investigated associations between genotype and phenotype in PCD (Emiralioğlu et al. 2020; Pifferi et al. 2021; Shoemark et al. 2021; Blanchon et al. 2020). Individuals with DNAH5 mutations have been reported to be phenotypically diverse (Shoemark et al. 2021; Blanchon et al. 2020). Consistent with these reports, phenotypic diversity was observed in our cohort, even among siblings. Despite frequent respiratory infections, Patient 2 still had an apparently normal life at 39 years old with only mild to moderate impairment of pulmonary function. In contrast, his older brothers, who carried identical mutations and shared the same living environment, were not so fortunate; both had more severe bronchiectasis and died from progressive lung failure in middle age. Data on the mortality of PCD are extremely limited. To the best of our knowledge, the two respiratory deaths in our study were the first reported in DNAH5-associated PCD.
In recent years, sequencing techniques and bioinformatics analysis have rapidly advanced. Currently, WES plays a very important role in PCD diagnosis in China. To date, over 50 genes have been reported to cause PCD (Wallmeier et al. 2020). However, the genetic basis of PCD remains unknown in approximately 30% of suspected patients. Routine WES analysis most often focuses on SNVs and short insertions and deletions. CNVs cannot be detected by routine WES analysis (Harel et al. 2018). Defined as genomic intervals that deviate from the normal diploid state, CNVs have been collectively detected in an estimated 12-16% of the human genome (Pös et al. 2021). However, the prevalence and clinical significance of CNVs in PCD-associated genes are unclear. Few studies have reported the relationship between CNVs and PCD in European, American and Japanese populations (Takeuchi et al. 2020; Marshall et al. 2015; Keicho et al. 2020). Marshall and colleagues reported that WES followed by targeted CNV analysis identified four of 45 (8.8%) PCD patients who harbored clinically significant CNVs (Marshall et al. 2015). Fassad and colleagues reported that CNVs accounted for 4% of variants overall in PCD patients (Fassad et al. 2020).
In the present study, CNVs were predicted from WES data using panelcn.MOPS, a reliable algorithm that detects CNVs with high sensitivity and specificity, and were confirmed by qPCR and Sanger sequencing. Through this process, we identified three clinically significant CNVs in two Chinese families with PCD. Our study suggests that the complementary roles of WES and CNV analysis in the molecular diagnosis of PCD are clinically beneficial. Generally, the gold standard tool for CNV detection in diagnostic settings is microarray. Microarray is economical and fast for large CNVs, but it is not normally sensitive for small CNV events involving one or a few exons. In addition, microarrays can only detect the target regions covered by the probes. The accuracy of WES in different regions varies according to sequencing depth. Therefore, CNVs detected based on WES data need to be further verified. However, WES allows for the identification of large and small CNVs in entire coding regions at once. It also permits the detection of both SNVs and CNVs concurrently, thus eliminating the need to use a range of different technologies in one patient and optimizing the diagnostic process (Royer-Bertrand et al. 2021).

Fig. 1 Pedigree structure and clinical examinations of Patient 1's family. a Pedigree structure of Patient 1's family. The arrow denotes proband Patient 1. The affected patient is designated by a black symbol; the half-black symbol indicates heterozygous deletions of DNAH5. Chest X-ray (CXR) images show a mirror-image distribution of the viscera (b). CT images of Patient 1 show no abnormality except for dextrocardia (c). CT of the paranasal sinus shows otitis media and maxillary sinusitis (d)

Fig. 2 Pedigree structure and clinical examinations of Patient 2's family. a Pedigree structure of Patient 2's family. The arrow denotes proband Patient 2. Symbols with diagonal slashes denote deceased individuals. CXR images of the mother (b) and father (c) of the proband show normal distribution of organs. Clinical imaging of

Fig. 3 CNV analysis by qPCR and Sanger sequencing. a Schematic of the DNAH5 protein structure, with the red slash indicating the position of the CNVs. qPCR analysis of relative DNA content in whole blood from the control subject, Patient 1 (b),

Fig. 4 Subcellular localization of DNAH5 in respiratory epithelial cells. Axoneme-specific antibodies against acetylated α-tubulin (red) were used as the control. In respiratory epithelial cells from healthy probands (a), DNAH5 (green) localized predominantly along the

Table 1 Phenotypic features of individuals with DNAH5. FEV1, forced expiratory volume in 1 s; FVC, forced vital capacity; NA, not available

Table 2 DNAH5 mutations in primary ciliary dyskinesia. ACMG, American College of Medical Genetics and Genomics; CNV, copy number variation; LP, likely pathogenic