Background

Cleft lip with or without cleft palate (CL/P) is a common congenital malformation affecting speech, hearing, and feeding, among other functions [1]. Individuals with CL/P require comprehensive treatment, including multiple plastic and maxillofacial surgeries from birth to adulthood, and speech therapy [2]. CL/P has an average worldwide birth prevalence of 1 in 1000, ranging from 1 in 500 in the Asian population to 1 in 2500 in the African population, with wide variability by geographic origin, ethnicity, and socioeconomic status [3, 4]. Its complex etiology seems to result from various genetic and environmental risk factors along with gene-environment interactions; approximately 70% of cases are non-syndromic [5]. Genome-wide association studies (GWAS) and linkage studies have identified genetic susceptibility to CL/P and differences in such susceptibility in various populations and ethnic groups [6]. However, as seen in many complex diseases or traits, most of the mutations and loci identified from such studies are in noncoding genomic regions and do not have specific functional annotations [7, 8]. Although much progress has been made in identifying genes whose mutations are associated with CL/P, little is known about the mechanisms by which environmental and epigenetic factors adversely influence gene expression during lip development. Recent studies indicate that environmental factors contribute to changes in phenotype or gene expression at the post-transcriptional level through the regulation of noncoding RNAs, including microRNAs (miRNAs) [9]. miRNAs are short (~22 nucleotides) noncoding RNA molecules that regulate gene expression at the post-transcriptional level, and they fine-tune the expression of ~30% of all mammalian protein-encoding genes [10]. miRNAs were first reported in mammalian systems in 2001 [11], and the latest release of the miRNA database [miRBase (ver. 20): more than 24,000 miRs annotated] highlights the rapid growth of this field of research; however, the expression patterns and functions of most miRNAs remain to be discovered. miRNA-gene regulatory mechanisms contribute to the pathogenesis of various diseases [12, 13]. Nonetheless, only a limited number of miRNAs (e.g., miRNA-140, the miRNA-17-92 cluster, miRNA-200b, and miRNA-133b) have been reported to be involved in craniofacial deformities in zebrafish and mouse models, as well as in humans [14]. Therefore, an investigation of functional regulation, at the level of biological pathways and post-transcriptional regulatory mechanisms, will improve our understanding of genetic susceptibility to CL/P [7, 15].

In this study, in order to identify signatures of causative pathways in the complex etiology of CL/P, we performed a systematic review and subsequent bioinformatics analysis for CL/P-candidate genes. In addition, we analyzed miRNA-gene regulation by bioinformatics analyses. The function in cell proliferation/survival for six candidate miRNAs was experimentally evaluated by cell proliferation assays. Our findings will help us understand the role of genes and miRNAs that are associated with CL/P, in biological pathways and networks. Such knowledge will provide the basis for the diagnosis, prevention, and treatment of craniofacial anomalies such as CL/P.

Methods

Data sources

The publishing guidelines set forth by PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) were followed during the literature search and review. The search was conducted using three main literature databases: Medline (Ovid), PubMed (NLM), and Embase (Ovid). In addition, related citations were searched in Scopus (Elsevier, Inc.) to check whether any unique studies were missed from the regular database searches.

Inclusion and exclusion criteria

The articles meeting the following eligibility criteria were included in our systematic review:

  • described genes causing or potentially associated with CL/P in humans;

  • referred to studies of either syndromic or non-syndromic CL/P;

  • were published as original articles (not as review articles, editorials, dissertations, conference proceedings or comments);

  • included cross-sectional, case-control, cohort studies, or clinical trials;

  • were published in English.

After screening for articles using the criteria above, we manually excluded those articles in which:

  • gene mutations were not described in the original articles;

  • CL/P was not specifically described;

  • CL/P resulted from environmental factors;

  • treatments for CL/P were described;

  • the articles were case reports;

  • the articles did not fall under any of the above criteria but nonetheless lacked CL/P candidate genes or related information.

Search methods to identify studies

Concepts included in the search were: CL/P, genetics (gene mutations), and humans. A combination of Medical Subject Headings (MeSH) terms and terms from titles, abstracts, and keywords was used to develop the initial Medline search string, which was then adapted to search the other databases. The last search date was May 28, 2018.

Study selection and data collection

All the citations found in the search process were stored in RefWorks (ProQuest), and any duplicates were excluded. Search strategies and results were tracked using the Primary Excel Workbook designed for systematic reviews (http://libguides.sph.uth.tmc.edu/excel_SR_workbook). To check the reliability of study selection between the screeners, Cohen’s Kappa test was performed using a randomly selected sample of 66 citations screened for CL/P-candidate genes by titles and abstracts. After achieving a > 90% score for the Cohen’s Kappa, all the titles and abstracts of the articles found through the database search were independently examined by two screeners. The full text of the articles not excluded in the above process was manually reviewed, and all results from the screening were recorded in the Primary Excel Workbook. The data collected were displayed as a descriptive narrative. A codebook for data extraction from eligible articles was developed, as described previously [16]. The data elements extracted for the codebook included citation information, study level information (characteristics and results), and quality level information. The quality assessment of each study identified was performed using the Newcastle-Ottawa Scale (NOS), considering the selection criteria of pre- and post-fortification samples, comparability of these groups, and the ascertainment of either the exposure or outcome. NOS assigns a maximum score of 9 points; studies scoring < 5 points have a high risk of bias and limitations and are excluded from a meta-analysis [17].
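The inter-screener agreement check described above can be illustrated with a minimal sketch of Cohen's kappa. The function name and the 66 include/exclude decisions below are hypothetical, made up only to show the calculation; they are not the study's screening data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical decisions on 66 citations (illustrative only).
screener_1 = ["include"] * 30 + ["exclude"] * 33 + ["include"] * 2 + ["exclude"] * 1
screener_2 = ["include"] * 30 + ["exclude"] * 33 + ["exclude"] * 2 + ["include"] * 1

kappa = cohens_kappa(screener_1, screener_2)
print(f"Cohen's kappa = {kappa:.3f}")
```

In this made-up example the screeners disagree on 3 of 66 citations, which still yields a kappa above 0.90 because kappa discounts the agreement expected by chance.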

Bioinformatics analysis of CL/P genes

Functional enrichment analyses of the candidate genes were conducted using the pathway databases Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) through tools of the Database for Annotation, Visualization, and Integrated Discovery (DAVID, version 6.8, http://david.ncifcrf.gov) [18]. The enrichment of CL/P genes in a pathway or GO term was tested by the hypergeometric test implemented in DAVID. The p-values were adjusted by false discovery rate (FDR, q-value). Pathways were considered significant at q < 0.05 with at least four genes from the list of input genes (CL/P genes) included in each pathway. Hierarchical level 4 was used as the cutoff in order to avoid overly general GO terms. The miRNA-mediated post-transcriptional regulation of the CL/P candidate genes was analyzed as follows: first, the miRNA-gene pairs were identified from three computationally predicted miRNA-target gene interaction databases, miRanda, PITA, and TargetScan [19,20,21,22]; then, the miRNA-gene pairs that were experimentally validated were selected through miRTarBase analysis [13]; finally, for each miRNA, the enrichment of CL/P candidate genes was examined using Fisher’s Exact Test. To investigate which human phenotype terms are strongly enriched for the candidate genes, the WEB-based Gene SeT AnaLysis Toolkit (WebGestalt) (http://webgestalt.org) was used to perform the overrepresentation enrichment analysis (OEA) using the Human Phenotype Ontology (HPO) annotation database. The top 30 results from the OEA were retrieved on March 15, 2019. The minimum number of genes per category was set to 5 (default). The Benjamini–Hochberg procedure was used for multiple test correction [23]. FDR values were used for all statistical analyses.
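The two statistical ingredients named above, a hypergeometric over-representation test and Benjamini–Hochberg adjustment, can be sketched as follows. This is a generic illustration rather than the code used by DAVID or WebGestalt (DAVID, for instance, reports its own modified score); the function names, gene counts, and p-values are assumptions chosen only for demonstration. The upper-tail hypergeometric probability is equivalent to a one-sided Fisher's exact test on the corresponding 2 × 2 table.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k): chance of seeing at least k annotated genes in a list of n
    genes drawn from a background of N genes, K of which carry the annotation."""
    total = comb(N, n)
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical term: 7 of 170 input genes fall in a term that annotates
# 40 of 20,000 background genes (all numbers are illustrative).
p_term = hypergeom_upper_tail(k=7, K=40, n=170, N=20000)

# Hypothetical raw p-values for four terms, then FDR adjustment.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(p_term, q_values)
```

A term would then be retained if its adjusted value falls below the chosen cutoff (q < 0.05 in the text) and it meets the minimum gene count.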

Genotype-phenotype association analysis

Genotype-phenotype association analysis was conducted using GWAS data available from dbGaP (dbGaP accession phs000774.v1.p1, https://www.ncbi.nlm.nih.gov/gap). This dbGaP repository includes 11,925 individuals from 4058 families, mostly from trios. Out of 11,925 individuals, 5327 were reported as Caucasians, 2221 as Asians, 473 as Africans, and 3904 were reported as having more than one ancestry. A total of 5008 individuals were reported as of Hispanic ethnicity. Not including the parents, 52% of children were males and 48% were females. Array-based genotypes from Illumina Infinium HumanCore Beadchips, including GWAS markers from HumanCore v1 with additional exome and custom contents from the Center for Craniofacial and Dental Genetics (CCDG) consortium, were analyzed with the PLINK software (version 1.90b) to test the association of each directly genotyped variant in the genes listed through the systematic review. The transmission disequilibrium test (TDT) was applied to maximize analytical power for trios and to minimize artifacts from population stratification.
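For readers unfamiliar with the TDT, the classical statistic can be sketched in a few lines. This is not PLINK's implementation; it is a minimal illustration of the standard McNemar-type test, assuming the transmission counts from heterozygous parents have already been tallied. The function name and the counts are hypothetical.

```python
from math import erfc, sqrt


def tdt(transmitted, untransmitted):
    """Classical TDT for one biallelic variant.

    transmitted / untransmitted: number of heterozygous parents who did /
    did not transmit the tested allele to an affected child.
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / informative
    # Survival function of a chi-square with 1 df: P(|Z| > sqrt(chi2)).
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: the allele was transmitted 60 times and not
# transmitted 40 times among 100 informative parents.
chi2, p_value = tdt(60, 40)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Because only transmissions within families are compared, differences in allele frequency between ancestry groups do not inflate the statistic, which is why the test is robust to population stratification.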

Cell culture

Human lip fibroblasts were obtained from JCRB Cell Bank (#JCRB9103KD) and cultured in Dulbecco’s Modified Eagle Medium (DMEM) supplemented with 10% fetal bovine serum (FBS), penicillin/streptomycin, and L-glutamine. The cells were plated onto 96-well cell culture plates at a density of 5000 cells/well and treated with a mimic for negative control, miR-124-3p, miR-369-3p, miR-374a-5p, miR-374b-5p, miR-497-5p, or miR-655-3p (mirVana miRNA mimic, ThermoFisher Scientific) using the TransIT-X2 system (Mirus Bio LLC, Madison, WI), following the manufacturer’s protocol (3 pmol mimic and 0.3 μl transfection reagent in 100 μl DMEM per well). Cell proliferation was determined using a cell counting kit 8 (Dojindo Molecular Technologies, Gaithersburg, MD) (n = 6 per group).

Quantitative RT-PCR

Total RNAs were isolated from cultured human lip fibroblasts treated with a mimic for target miRNAs for 1 day (n = 6 per group), using the QIAshredder and RNeasy mini extraction kit (QIAGEN), as previously described [24]. The PCR primers used for quantitative RT-PCR are listed in Additional file 1: Table S1.

Statistical analysis

Two-tailed Student’s t-tests were applied for the statistical analysis. A p-value < 0.05 was considered statistically significant. For all graphs, data are represented as mean ± standard deviation (SD).

Results

Literature search

Our systematic search identified a total of 5016 publications. After eliminating 2653 duplicates from the list, the remaining 2363 articles were further screened, using the titles and abstracts, independently by the two screeners, which resulted in 1558 publications being further excluded based on reasons such as referring to non-genetic studies and case reports. The remaining 748 articles were further assessed for eligibility through manual full-text review. Through this process, 393 articles satisfying all inclusion criteria were selected while 355 articles were excluded. These selected 393 studies were used for collection of CL/P-candidate genes and in the follow-up analyses (Fig. 1 and Additional file 1: Table S2).

Fig. 1

PRISMA flowchart for study selection. A graphical representation of the flow of citations reviewed in the course of the systematic review was provided using a PRISMA flow diagram

Summary of human CL/P genes

We identified 172 CL/P-candidate genes from the qualified studies above (Table 1 and Additional file 1: Tables S3–S8). For the bioinformatics analyses, we excluded phenotypic markers and genes with unknown genomic location. Among the CL/P-candidate genes, 10 genes were studied at least five times in different populations: IRF6 (52 studies, encoding interferon regulatory factor 6, located at genomic locus 1q32.2), MTHFR (26 studies, encoding methylenetetrahydrofolate reductase, at 1p36.2), TGFA (18 studies, encoding transforming growth factor alpha, at 2q13.3), MSX1 (25 studies, encoding msh homeobox 1, at 4p16.2), TGFB3 (16 studies, encoding transforming growth factor beta 3, at 14q24.3), NECTIN1 (10 studies, encoding Nectin cell adhesion molecule 1, at 11q23.3), BMP4 (10 studies, encoding bone morphogenetic protein 4, at 14q22.2), FOXE1 (6 studies, encoding forkhead box E1, at 9q22.33), BCL3 (6 studies, encoding B-cell CLL/lymphoma 3, at 19q13.32), and CRISPLD2 (5 studies, encoding cysteine rich secretory protein LCCL domain containing 2, at 16q24.1). Most of the gene mutations (168/177 = 94.9%) were reported only in non-syndromic CL/P while mutations in nine genes (ADH1C, FGFR1, IRF6, MID1, NECTIN1, PHF8, SOX9, TGFA, and TP63) were also reported in syndromic CL/P (9/177 = 5.1%). Several genes were reported only in syndromic cases; CHD7 (CHARGE syndrome), CR1 (van der Woude syndrome), EFNB1 (craniofrontonasal syndrome), KISS1R (Kallmann syndrome), MID1 (Opitz G/BBB syndrome), REN (van der Woude syndrome), and SOX9 (Pierre-Robin syndrome). Mutations in eight genes (BCL3, BMP4, IRF6, MSX1, MTHFR, NECTIN1, TGFA, and TGFB3) were reported across different ethnic groups such as Caucasians, African Americans, Hispanics, and Asians, including but not limited to Chinese, Japanese, Indian, Turkish, Polish, Finnish, Brazilian, American, and European. By contrast, a total of 75 genes were reported as insignificant in some populations. 
For example, while mutations in TGFA were significant in the Iranian, Korean, Caucasian and Chilean population, they were not significant in the South American, Italian, Malaysian or Indian populations. Mutations in IRF6 were not significant in some studies, but they were significant in larger studies in various populations (Additional file 1: Tables S3–S8). Recent advances in the discovery of genetic variants at whole genome level and the design of genome-wide approaches enable investigators to identify the involvement of multiple genes and loci in a single study (e.g., GWAS). Mutations in multiple genes and loci were reported in 87 and 12 studies, respectively. A significant association of IRF6, MSX1 and NECTIN1 with CL/P has been reported in several GWAS in different populations (Additional file 1: Table S4). However, no significant association was reported in several studies conducted in Danish, Mesoamerican, and African populations for IRF6, American and Latvian populations for MSX1, and Taiwanese population for NECTIN1 (Additional file 1: Table S6).

Table 1 Summary of genes associated with cleft lip with/without cleft palate in humans

GO term enrichment analysis

We analyzed CL/P genes enriched in the GO terms to assess the functional features of CL/P genes (Additional file 1: Tables S9–S12). The most specific enriched terms among GO biological processes showed strong association with development and morphogenesis of other organs (Additional file 1: Tables S9 and S10). These results suggest that CL/P genes may potentially cause additional developmental disorders while 70% of CL/P cases are non-syndromic without any additional birth defects. In non-syndromic CL/P cases, molecules that are specifically expressed in the lip region during lip formation may help explain the molecular mechanism in regard to how only lip formation is affected. Lip formation involves the growth and fusion of maxillary and nasal processes during embryogenesis [25]. We identified a strong association with positive and negative regulators of several cellular processes, including cell proliferation, apoptosis, differentiation, and epithelial-to-mesenchymal transition. We also found that genes involved in folic acid metabolic process were significantly enriched (7 CL/P genes) (Additional file 1: Table S10). The current approach for the prevention of neural tube defects and CL/P is folic acid supplementation [26]. Our results suggest that some individuals with CL/P have defects in folic acid metabolism, and that a folic acid supplement can potentially prevent CL/P.

Among the enriched GO terms in the Molecular Function (MF) domain (Additional file 1: Tables S9 and S11), we observed enrichment of several terms related to molecular binding: transcription factor activity, sequence-specific binding (28 CL/P genes), sequence-specific DNA binding (23 CL/P genes), chromatin binding (16 CL/P genes), and binding to Frizzled, a family of G protein-coupled receptors for WNT ligands (11 CL/P genes). The remaining enriched terms in the MF domain included: protein homodimerization activity (26 CL/P genes); growth factor activity (13 CL/P genes), which is induced by FGF, BMP, TGFβ and PDGF; protein tyrosine kinase activity (10 CL/P genes), 1-phosphatidylinositol-3-kinase activity (8 CL/P genes), and phosphatidylinositol-4,5-bisphosphate 3-kinase activity (8 CL/P genes) induced by FGF signaling; and others. Thus, the GO MF term analysis highlighted the contribution of morphogenetic factors involved in WNT, FGF, BMP, TGFβ, and PDGF signaling pathways.

Among GO terms representing cellular components (Additional file 1: Tables S9 and S12), most terms were enriched in extracellular components: extracellular region (48 CL/P genes), extracellular space (36 CL/P genes), cell surface (27 CL/P genes), and proteinaceous extracellular matrix (ECM) (22 CL/P genes). In GO terms, many growth factors including WNT, FGF, BMP, TGFβ, and PDGF and their extracellular regulators (e.g. ECM) were identified in these groups. These findings are in agreement with the fact that growth factors are activated in the extracellular space.

Human phenotype ontology enrichment analysis

We used the HPO database to investigate which human phenotype terms were enriched for the identified set of CL/P candidate genes. The most enriched phenotypic term was abnormal number of teeth (28 genes), followed by CL (23 genes), cleft upper lip (22 genes), reduced number of teeth (25 genes), and oral cleft (36 genes) (Additional file 1: Table S13). Among the top 30 results, most phenotypes were directly related to CL/P, congenital dental disorders, or congenital abnormalities of the fingers.

KEGG pathway analysis

We hypothesized that CL/P genes share common features among wide arrays of biological functions and pathways. We therefore examined which biological pathways were enriched with CL/P genes by using the DAVID bioinformatics tool and the canonical pathways from the KEGG database (Additional file 1: Table S14). The regulatory pathway annotation was performed based on the score and visualization of the pathways used in the KEGG database. Among the cellular functions in KEGG pathways, 19 pathways were statistically significant in the enrichment of CL/P genes (FDR < 5%). Seven of these pathways were related to cancer: pathways in cancer (43 CL/P genes), basal cell carcinoma (16 CL/P genes), proteoglycans in cancer (16 CL/P genes), melanoma (10 CL/P genes), pancreatic cancer (8 CL/P genes), bladder cancer (6 CL/P genes), and colorectal cancer (7 CL/P genes). Previous population-based studies suggest that individuals with CL/P have a higher risk of cancer in the breast, brain, and lung later in life [27,28,29]. Molecules related to Hippo, FGF, WNT, and TGFβ signaling are highlighted in these cancer-related pathways. The following cellular signaling pathways were also significantly enriched with CL/P genes: Hippo signaling pathway (22 CL/P genes), signaling pathways regulating pluripotency of stem cells (21 CL/P genes), phosphoinositide 3-kinase (PI3K)-Akt signaling pathway (15 CL/P genes), mitogen-activated protein kinase (MAPK) signaling pathway (13 CL/P genes), WNT signaling pathway (11 CL/P genes), and TGF-β signaling pathway (9 CL/P genes).

Genotype-phenotype association analysis

To determine the contribution of genetic variations to CL/P, a genotype-phenotype association analysis was conducted using the GWAS dataset for the genetics of orofacial clefts and related phenotypes (e.g. CL/P) available from dbGaP (dbGaP accession phs000774.v1.p1). By PLINK analysis, we identified 2975 cases vs 8751 controls out of 11,925 individuals. We investigated whether single nucleotide polymorphisms (SNPs) mapped to the CL/P-candidate genes are associated with human CL/P phenotypes by applying the TDT to The Genetics of Orofacial Clefts and Related Phenotypes GWAS dataset. This method tests for a statistical imbalance between transmitted and non-transmitted alleles in parent-child trios. We investigated all directly genotyped (i.e. not imputed) variants within the genes found in the systematic review. A total of 5437 variants from 179 genes were included in the association analysis, while some genes did not have genotyped SNPs in this dataset and some SNPs did not present any variation in the analyzed samples. Because most SNPs in the same gene are in strong linkage disequilibrium, we set our candidate-wise significance threshold at 2.79 × 10⁻⁴ by applying the Bonferroni correction to the number of genes tested (0.05/179). We identified 56 SNPs from 12 genes with p-values smaller than 2.79 × 10⁻⁴ (Table 2). The top association signals were from IRF6 and NTN1 (Netrin 1), which also showed nominal genome-wide significance in the GWAS dataset (p < 5 × 10⁻⁸). We also observed gene-wide significant signals (p < 2.5 × 10⁻⁶) from ABCA4 (ATP-binding cassette, sub-family A, member 4) and PAX7 (paired box 7). Although ADAM3A (ADAM metallopeptidase domain 3A), FOXE1 (forkhead box E1), MSX2 (msh homeobox 2), MTHFR (methylenetetrahydrofolate reductase), TP63 (tumor protein p63), TPM1 (tropomyosin 1), VAX1 (ventral anterior homeobox 1), and WNT9B (wingless-type MMTV integration site family, member 9B) did not reach the genome-wide or gene-wide significance thresholds, these genes reached the candidate-wise significance level (p < 2.79 × 10⁻⁴).
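The three significance tiers used above can be reproduced with a short sketch. The thresholds are those stated in the text (5 × 10⁻⁸ genome-wide, 2.5 × 10⁻⁶ gene-wide, and 0.05/179 candidate-wise); the function names and the example p-values are hypothetical and only illustrate how a variant would be assigned to a tier.

```python
GENOME_WIDE = 5e-8   # genome-wide threshold quoted in the text
GENE_WIDE = 2.5e-6   # gene-wide threshold quoted in the text


def candidate_wise_threshold(alpha=0.05, genes_tested=179):
    """Bonferroni threshold over the number of candidate genes tested."""
    return alpha / genes_tested


def significance_tier(p_value, alpha=0.05, genes_tested=179):
    """Return the most stringent tier that a p-value satisfies."""
    if p_value < GENOME_WIDE:
        return "genome-wide"
    if p_value < GENE_WIDE:
        return "gene-wide"
    if p_value < candidate_wise_threshold(alpha, genes_tested):
        return "candidate-wise"
    return "not significant"


threshold = candidate_wise_threshold()
print(f"candidate-wise threshold = {threshold:.3g}")

# Hypothetical p-values, one per tier.
for p in (1e-9, 1e-7, 1e-4, 1e-3):
    print(p, significance_tier(p))
```

Correcting for the number of genes rather than the number of SNPs is a deliberate choice in the text: SNPs within a gene are correlated through linkage disequilibrium, so a per-SNP Bonferroni correction would be overly conservative.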

Table 2 Genotype-phenotype association

Environmental and epigenetic factors

In addition to gene mutations, genetic background such as ethnicity, population and gender has substantial influence on the birth prevalence of CL/P. Environmental factors such as maternal age, smoking, alcohol consumption, obesity, and micronutrient deficiencies are known to be strong susceptibility risk factors. Recent studies suggest that environmental factors can interact with the epigenetic system and control gene expression at the post-transcriptional level [30,31,32]. To explore the degree to which miRNAs regulate the expression of CL/P genes, we performed a bioinformatics analysis to identify miRNAs whose target genes are statistically enriched with CL/P genes (Table 3). By applying the adjusted p-value (FDR) < 0.1, a total of 16 miRNAs were significantly enriched with the targeted CL/P genes. These 16 miRNAs were categorized into 10 known and one unknown miRNA families: the miR-27 family (hsa-miR-27b-3p, 23 CL/P genes), miR-124 family (hsa-miR-124-3p, 29 CL/P genes), miR-154 family (hsa-miR-369-3p, 13 CL/P genes; hsa-miR-655-3p, 14 CL/P genes; hsa-miR-300, 18 CL/P genes; hsa-miR-381-3p, 18 CL/P genes), miR-203 family (hsa-miR-203a-3p, 9 CL/P genes), miR-368 family (hsa-miR-376b-3p, 10 CL/P genes), miR-374 family (hsa-miR-374a-5p, 22 CL/P genes; hsa-miR-374b-5p, 22 CL/P genes), miR-497 family (hsa-miR-497-5p, 25 CL/P genes), miR-503 family (hsa-miR-503-5p, 10 CL/P genes), miR-550 family (hsa-miR-550a-5p, 5 CL/P genes; hsa-miR-550a-3-5p, 5 CL/P genes), miR-1271 family (hsa-miR-1271-3p, 5 CL/P genes), and an unknown family (hsa-miR-3678-3p, 5 CL/P genes). Notably, except for hsa-miR-27b-3p [33], these miRNAs have not yet been reported in CL/P. In the enrichment analysis, CL/P genes regulated by multiple miRNAs were: EN2 (targeted by 6 miRNA families), FZD6 (targeted by 5 miRNA families), HECTD1 (targeted by 5 miRNA families), and YOD1 (targeted by 5 miRNA families) (Table 4). Because gene expression of EN2, FZD6, HECTD1, and YOD1 is regulated by several miRNAs, expression of these genes may be more susceptible to environmental factors during lip formation.

Table 3 MicroRNA (miRNA) enrichment analysis of 161 human CL/P genes (FDR < 0.1)
Table 4 Human CL/P genes targeted by at least two microRNA (miRNA) families

Experimental validation

miRNAs negatively regulate the expression of their target genes [34]. To test whether induction of potential CL/P-candidate miRNAs caused proliferation defects through the suppression of target gene expression, we treated cultured human lip fibroblasts with each miRNA mimic. The miR-497-5p and miR-655-3p mimics most significantly inhibited cell proliferation in human lip fibroblasts (Fig. 2a). To identify target genes regulated by either miR-497-5p or miR-655-3p, we performed quantitative RT-PCR analyses for the predicted target genes (AXIN2, BAG4, BCL2, CHD7, CRISPLD2, EN2, EYA1, FGF1, FGF2, FGFR1, FGFR2, FOXP2, FZD6, HECTD1, JARID2, MTHFR, PAX7, RHPN2, RUNX2, SATB2, SLC6A4, TFAP2A, TPM1, WNT3A, and YOD1 for miR-497-5p; and BCL2, CYP1A1, DMD, EN2, FZD6, GREM1, HOXB3, MAFB, MID1, NTN1, PAX6, SATB2, TULP4, and YOD1 for miR-655-3p) in human lip fibroblasts after treatment with mimics of either miR-497-5p or miR-655-3p. The expression of almost all target genes, except EN2, GREM1, MAFB, TULP4 and YOD1, was significantly downregulated by treatment with the miR-655-3p mimic (Fig. 2b). The expression of target genes (BAG4, CHD7, CRISPLD2, FGFR1, FOXP2, HECTD1, RUNX2, and TFAP2A) was significantly downregulated by treatment with the miR-497-5p mimic (Fig. 2c). PAX6 was excluded because its expression is restricted to the anterior ectoderm during early embryogenesis and the ectoderm of the craniofacial surface during craniofacial development [35,36,37]. EYA1 expression was unexpectedly increased after treatment with the miR-497-5p mimic.

Fig. 2

Experimental validation of predicted miRNAs. a Cell proliferation assays in human lip fibroblasts treated with the indicated mimics of miRNAs. Negative control (control, light blue), miR-369-3p (orange), miR-655-3p (gray), miR-374a-3p (yellow), miR-374b-3p (blue), miR-497-5p (light green), and miR-124-3p (dark blue). *** p < 0.001. b Quantitative RT-PCR for the indicated genes after treatment with negative control (light blue) or miR-655-3p mimic (orange). * p < 0.05, ** p < 0.01, *** p < 0.001. c Quantitative RT-PCR for the indicated genes after treatment with negative control (light blue) or miR-497-5p mimic (orange). * p < 0.05, ** p < 0.01, *** p < 0.001

Discussion

Orofacial cleft is described in approximately 400 known human syndromes [38,39,40]. Several factors have been implicated in clefting by studies of mouse models [25] and genetic screening in humans. Our literature search identified 177 genes as possible causative genes of CL/P. Our follow-up bioinformatics analysis grouped these genes by common features in function, pathway, and miRNA regulation. As expected, the contribution of most of the highlighted pathways (e.g. FGF, Hippo, TGFβ, and WNT) to growth and developmental processes has been shown in previous mouse genetic studies of craniofacial development [25]. Cellular metabolic pathways such as one-carbon metabolism are also highlighted in the KEGG pathway analysis. One-carbon metabolism, mediated by the folate cofactor, is involved in multiple physiological processes including biosynthesis of purines and thymidine, amino acid homeostasis, epigenetic maintenance, and redox defense [41]. Mice with one-carbon metabolic aberrations (Mthfd1l−/− mice) are embryonic lethal by E12.5, with craniofacial deformities including midfacial cleft [41]. However, in human studies, the significance of mutations in genes involved in one-carbon metabolism remains controversial [42,43,44,45]; further studies are necessary to reach conclusive evidence.

miRNAs are synthesized through multiple steps: they are transcribed as long primary transcripts, which are processed and finally cleaved by Dicer, a type III ribonuclease, to produce mature, functional miRNAs. In human genetic studies, an increasing number of studies show the functional significance of single-nucleotide polymorphisms (SNPs) in genes related to CL/P [46,47,48,49,50]. These SNPs might alter the binding activity of miRNAs. For example, previous studies show that SNPs in the miRNA-binding sites of MSX1, FGF2, FGF5 and FGF9 are associated with susceptibility to nonsyndromic orofacial clefts [51, 52]. A recent study shows that plasma miRNAs are differentially expressed in nonsyndromic CP and nonsyndromic CL/P [53]. miRNAs may also systemically regulate gene expression during embryogenesis, while some miRNAs may uniquely regulate gene expression in particular tissues through interaction with mRNAs expressed in a tissue-specific manner. In mice, loss of Dicer results in the absence of mature miRNAs and, therefore, the phenotype of Dicer knockout mice reflects how important miRNAs are for proper development. Interestingly, cranial neural crest (CNC) cell-specific Dicer knockout (DicerF/F;Wnt1-Cre) mice, but not epithelial-specific Dicer knockout (DicerF/F;K14-Cre) mice, exhibit severe midfacial deformities resulting from decreased cell proliferation and increased apoptosis in the developing craniofacial regions, indicating that miRNAs have crucial roles in the fate determination of CNC cells during midfacial development [54,55,56,57]. In this study, to evaluate the function of each miRNA in cell proliferation/survival, we employed human lip fibroblasts, which are CNC-derived cells, for our experimental validation. The functional significance of candidate miRNAs was tested in cell proliferation/survival assays in cultured human lip fibroblasts. We found that miR-369-3p, miR-655-3p, miR-374a-5p, miR-374b-5p, and miR-497-5p, which were first identified in this study as candidates involved in human CL/P, suppressed cell proliferation in cultured human lip fibroblasts. The top two candidate miRNAs, miR-655-3p and miR-497-5p, were further tested in the miRNA-gene regulation assay. We found that these miRNA mimics suppressed the expression of genes associated with human CL/P. Thus, the predicted miRNAs were successfully validated in our experiments. Taken together, this study provides a better understanding of CL/P as well as data that will be available for future research using genetic approaches to characterize the individual or clustered miRNAs identified in this study.

In this study, we validated that overexpression of miR-655-3p and miR-497-5p suppressed the expression of multiple genes (BCL2, CYP1A1, DMD, FZD6, HOXB3, MID1, NTN1, and SATB2 by miR-655-3p; and BAG4, CHD7, FGFR1, FOXP2, HECTD1, RUNX2, and TFAP2A by miR-497-5p) that are associated with CL/P. Previous studies indicate that miR-655-3p is downregulated in several cancers through dysregulation of ADAM10 (a disintegrin and metalloproteinase domain-containing protein 10) and the WNT/β-catenin pathway [58, 59]. By contrast, it remains unknown how miR-655-3p expression is induced.

Previous studies also show that miR-497-5p is downregulated in several cancers [60,61,62]. Interestingly, miR-497-5p is upregulated during myofibroblast differentiation of lung resident mesenchymal stem cells and in the lung tissues of a pulmonary fibrosis mouse model [63]. While it remains unclear what factors and triggers induce miR-655-3p expression, increased miR-655-3p levels may result in accelerated tissue differentiation with less proliferation in various tissues. Several environmental risk factors for CL/P such as smoking, alcohol consumption and toxins [64] may cause CL/P through the upregulation of these CL/P-associated miRNAs. In addition, when the levels of these microRNAs are increased too much, multiple CL/P genes would be suppressed, as we demonstrated in this study. Our results may partially explain why individuals with CL/P show suppression of multiple CL/P genes without genetic mutations in the coding regions, which contributes to the complexity of the CL/P etiology.

In this study, we found that miR-497-5p suppressed CHD7 expression by a maximum of 80% in cultured human lip fibroblasts. Mutations in CHD7 cause CHARGE syndrome, which is characterized by CL/P (in 20–36% of the cases), abnormal middle and external ear, coloboma, choanal atresia and hypoplastic semi-circular canals, rhombencephalic dysfunction, hypothalamo-hypophyseal dysfunction, mental retardation, and tracheoesophageal fistula, but do not contribute to nonsyndromic CL/P [65]. miR-497-5p also suppressed RUNX2 expression; SNPs in RUNX2 are known to increase the risk of nonsyndromic CL/P. Although FOXP2 and BAG4 mutations are associated with nonsyndromic CL/P [66, 67], less is known about the role of these genes in CL/P development. Mouse genetic mutant models are useful tools to investigate the role of genes in lip formation, but the loss of these CL/P genes fails to cause CL/P in mice; whether suppression of multiple CL/P-associated genes causes CL/P has not been examined. However, mice deficient for Tfap2a, which is suppressed by miR-497-5p, exhibit midfacial cleft [68], suggesting that TFAP2A may be a principal downstream target of miR-497-5p in human lip fibroblasts. Mutations in TFAP2A cause Branchio-Oculo-Facial Syndrome (BOFS), which is characterized by CL/P, branchial arch defects, and ocular anomalies [69], and are associated with nonsyndromic CL/P in several populations. In addition, Tfap2a deficiency suppresses Fgf8 expression in craniofacial regions in mice [68], suggesting that FGF signaling is a downstream pathway of TFAP2A. In this study, we found that miR-497-5p suppressed FGFR1 gene expression in cultured human lip fibroblasts. Therefore, a combined downregulation of TFAP2A and FGFR1 may compromise this cascade. FGFR1 mutations are also associated with increased risk of nonsyndromic CL/P, as well as of Kallmann syndrome with CL/P [70] and Hartsfield syndrome, which is characterized by holoprosencephaly, ectrodactyly, and CL/P [71].

miR-655-3p also suppressed the expression of FZD6, a WNT receptor, with a maximum reduction of 85%, and of SATB2 in cultured human lip fibroblasts. Mutations in FZD6 are found in nonsyndromic CL/P individuals, and SNPs in SATB2 are associated with nonsyndromic CL/P in several populations. Importantly, mice deficient for Satb2 also show CL and CP, as seen in individuals with mutations in SATB2. Interestingly, wild-type mouse embryos maternally exposed to phenytoin, a drug used to prevent and control seizures whose common side effects include congenital anomalies, show decreased Satb2 expression in craniofacial tissues [72]. This suggests that miR-655-3p may be upregulated in this CL/P condition. miR-655-3p also suppressed the expression of DMD and MID1 in cultured human fibroblasts. These genes are located on the X chromosome, and their mutations are associated with X-linked nonsyndromic CL/P in men and women. Mutations in MID1 are associated with the X-linked Opitz G/BBB syndrome, characterized by midline defects such as CL and CP, hypertelorism, and laryngo-tracheo-esophageal (LTE) abnormalities [73, 74]. CYP1A1 is involved in drug/agent metabolism, and several of the resulting metabolites are known to be carcinogens [75]; CYP1A1 mutations may be associated with CL/P induced by smoking and detoxification [76]. The role of the other downstream targets of miR-655-3p (NTN1 and BCL2) in the CL/P etiology remains largely unknown, and mice deficient for either Ntn1 or Bcl2 fail to display CL/P or CP.

Our systematic data collection provides a timely summary of the CL/P-candidate genes; however, it has some limitations. For example, some genes come from syndromes that can include CL/P as a feature but often do not. These causative genes may be more broadly related to development without contributing directly to CL/P; CL/P may be secondary to other defects in these syndromes. The current genetic signature of CL/P or other complex genetic diseases may be biased by the types of genes that have been studied. Unbiased genome sequencing approaches will likely overcome these limitations and help us discover genes associated with CL/P. Molecular profiling of CL/P at the genomic, transcriptomic, regulatory (e.g., enhancer), epigenomic, and proteomic levels in model organisms such as mice will provide us with a more detailed and accurate view of how genetic changes cause CL/P.

Conclusion

Our bioinformatics analysis results suggest that a disruption of extracellular cues and their signaling pathways might be a major cause of CL/P, and that miRNAs might play important roles in the CL/P etiology through the regulation of CL/P genes. In this study, we found that several miRNAs suppressed cell proliferation in cultured human lip fibroblasts. Among them, we confirmed that miR-655-3p and miR-497-5p negatively regulated CL/P-candidate genes in the cultured cells. This study will have potential relevance to the miRNA-gene pathways and networks, not only in CL/P but also in other organogenesis processes. Therefore, this study will contribute to a better understanding of the mechanisms of CL/P and to future clinical interventions to prevent and diagnose CL/P.