Background

Lipids are found in all organisms and are essential for life [1,2,3]. They are constituents of cell membranes, act as signaling molecules and are an important source of energy. In animal species, the deregulation of lipid homeostasis is responsible for dyslipidemia and for many diseases of major importance in human health, such as obesity, diabetes, non-alcoholic fatty liver disease (NAFLD) and cardiovascular diseases. The major site of lipid synthesis may differ from one species to another. Lipogenesis from carbohydrate sources occurs mainly in the liver in primates and rodents, and in adipose tissue in carnivores and ungulates (pig, cow, goat, dog) [1]. In chicken [4,5,6] and many fishes [7], the major site of de novo fatty acid synthesis is the liver. The synthesis of cholesterol in mammals, birds and fish is more ubiquitous but predominates in the liver [5, 8,9,10]. The storage of fatty acids as triglycerides is the animal’s main energy reserve, which is built up during periods of energy excess and mobilized during periods of energy deprivation. This storage mainly occurs in adipose tissue, within the adipocytes. Energy homeostasis is controlled by the hypothalamus, which regulates food intake and energy expenditure [11], and involves many hormones and adipokines that organize the crosstalk between organs. Lipid metabolism is a complex process that involves a large variety of molecular pathways and transcriptional regulators, only some of which have been described so far.

Since the 2000s, with the advent of whole-genome sequencing technologies, most protein-coding genes have gradually been identified. The databases reference nearly 20,000 genes in human [12] [Human GENCODE v.28–10 July 2018], 22,000 in mouse [13] [Mouse GENCODE v.M17–10 July 2018] and pig [Ensembl v.92–10 July 2018], and 18,000 in chicken [Ensembl v.92–10 July 2018]. Small RNAs that do not code for proteins, such as miRNAs, are also relatively well annotated today (from 1705 to 5531). In contrast, long noncoding RNAs (lncRNAs) are less well described in the genomes: they are strictly defined as RNAs of more than 200 nt with no ORF long enough to be translated into a protein (< 150 nt) [14]. They have important roles since they regulate gene expression via a large diversity of mechanisms, as they can interact with DNA, RNA or proteins [15, 16]. The proportion of lncRNAs with a well-described functional role is now estimated at 1% [17]. lncRNA genes appear to be as numerous as, or even more numerous than, protein-coding genes, with 15,779 and 12,533 lncRNAs referenced by the GENCODE projects [12,13,14] in human [Human GENCODE v.28] and mouse [Mouse GENCODE v.M17], respectively. The modeling of the structure of these genes is far from complete. In human, specialized databases such as NONCODE [18] or LNCipedia [19] report even higher numbers than those referenced in the Ensembl and NCBI databases. In contrast, in less studied livestock species such as pig and chicken [Ensembl v.92], only a very small fraction of lncRNAs is referenced (361 and 4643, respectively). So far, most lncRNAs and their underlying transcripts are gene models predicted by bioinformatics pipelines from RNAseq data, and in the majority of species they have rarely been experimentally confirmed. The difficulty of lncRNA gene modeling and identification has several causes: 1. an expression 10 to 100 times weaker than that of protein-coding genes [14, 20, 21], which makes it harder to identify the exon-intron structures of the different transcripts and therefore the gene locus; 2. the tissue-specificity of lncRNA expression [14, 20], which requires the use of many different tissues to establish an exhaustive catalogue; 3. the criteria used for lncRNA prediction, which can be the presence of an ORF, a BLAST search against a protein database, the size of transcripts and/or the k-mer composition, depending on the bioinformatics tool used, the main tools being CPC [22, 23], CPAT [24], PhyloCSF [25], FEELnc [26] and PLAR, the pipeline of Ulitsky’s team [27]; 4. finally, the low conservation of lncRNA sequences between species, especially between evolutionarily distant species [14, 20, 27,28,29,30]. All these drawbacks make it difficult to find orthologous long noncoding genes by sequence conservation analysis, as is frequently done for protein-coding genes.
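As an illustration of the strict definition given above (more than 200 nt, no ORF of 150 nt or more), the following minimal Python sketch applies only these two size criteria. It is not one of the cited tools (CPC, CPAT, PhyloCSF, FEELnc, PLAR), which rely on additional features. The function names, the restriction to the three forward reading frames and the ATG-to-stop definition of an ORF are assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_length(seq: str) -> int:
    """Length (nt) of the longest ATG..stop ORF in the three forward frames.

    Assumption: an ORF runs from the first ATG after the previous stop
    codon up to and including the next in-frame stop codon.
    """
    seq = seq.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def is_candidate_lncrna(seq: str, min_length: int = 200, max_orf: int = 150) -> bool:
    """True if the transcript is longer than min_length nt and its
    longest forward ORF is shorter than max_orf nt (thresholds from the text)."""
    return len(seq) > min_length and longest_orf_length(seq) < max_orf
```

A transcript passing this filter is only a candidate: as noted above, the prediction tools differ in the criteria they combine, so the same transcript can be classified differently from one pipeline to another.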

The first review about lncRNAs involved in lipid metabolism is very recent [31] (2016). Since then, several other reviews and articles have been published, but none of them provides an exhaustive catalogue of lipid-related lncRNAs. The first objective of this study was to fill this gap. An extensive analysis of the literature, which generally focuses on human or mouse, allowed us to draw up a catalogue of 60 lncRNA genes related to lipid metabolism, for which we report the mechanisms of action when they are described. This led us to highlight some spurious genes and to rename some lncRNAs in accordance with the rules published by the HUGO Gene Nomenclature Committee for long noncoding genes [32]. Second, we analyzed the conservation of these 60 lncRNAs in chicken, whose last common ancestor with mammals dates back to 300 million years ago, with the assumption that such conservation would support an important role of these genes in the metabolism of interest. For this, a synteny-based approach was used, which highlighted 5 lncRNAs conserved between human/mouse and chicken. Finally, we described their functional roles more precisely and analyzed their conservation across 8 species, from mammals to zebrafish.

Results

60 lncRNA identified as involved in lipid metabolism by expert curation of the literature

To our knowledge, the first catalogue of lncRNAs potentially involved in lipid metabolism was proposed by Chen in 2016 [31] and included 5 lncRNAs: lncLSTR, HULC, APOA1-AS, lincRNA-DYNLRB2–2 and SRA. The same year, Zhou et al. [33] published a broader review of lncRNA genes potentially involved in lipid and glucose metabolism and related diseases (atherosclerosis, type 2 diabetes, insulin resistance), in which TRIBAL, ANRIL, lncLSTR, AT102202, APOA1-AS, lincRNA-DYNLRB2–2, RP5-833A20.1 and CRNDE were described as involved in lipid metabolism. Later in the year, Smekalova et al. [34] published a list of lncRNAs involved in liver pathophysiology, including two lncRNAs involved in lipid metabolism, HULC and lincRNA-DYNLRB2–2, and Ananthanarayanan [35] reported three other lncRNAs involved in triglyceride, cholesterol and bile acid homeostasis: lnc-HC, lncLSTR and APOA1-AS. In 2017, Zhao et al. [36] presented a review on lncRNAs involved in liver metabolism and cholestatic liver disease, in which lncLSTR, Lnc18q22.2, SRA1, HULC, MALAT1 and lncHR1 were related to lipid metabolism, and lnc-HC, APOA1-AS, H19, MEG3, lincRNA-DYNLRB2–2 and LeXis to cholestatic liver pathologies. More recently, in 2018, Van Solingen et al. [17] presented a review covering the 12 lncRNAs previously described by Zhao et al., enriched with RP5-833A20.1, renamed NFIA-AS1, and MeXis, a new lncRNA discovered in early 2018 by the team of Tontonoz [37]. Also recently, Zeng et al. [38] added eight other lncRNAs: Gm16551, SPRY4-IT1, APOA4-AS, LINK-A, RP1-13D10.2, E330013P06 (named CARMN in databases), LOC100506036 (named CNNM3-DT in databases) and SNHG14. All these reviews report a total of 27 lncRNAs involved in lipid metabolism. An extensive literature analysis allowed us to add 33 new lncRNAs, bringing the total to 60 lncRNAs. 
These extra lncRNAs include one identified in 2015 and not mentioned in the aforementioned studies: the lncRNA AT115872, described together with the lncRNA AT102202 (more widely cited in the literature). The first acts at a distance on the expression of ACAT2, which encodes a key enzyme for the absorption of dietary cholesterol, while the second acts locally on the HMGCR gene, which encodes a key enzyme for cholesterol anabolism [39]. Other genes have been described between 2016 and today: 8 in 2016, 15 in 2017 and 15 in 2018. These 60 genes potentially involved in lipid metabolism are listed in Table 1. Most of these genes have been identified in human (34) and mouse (19), with two in both species; the others have been described in rat (1) or in livestock species such as pig (5) and chicken (4).

Table 1 The 60 genes involved in lipid metabolism and their associated publication

Mechanisms of action of lncRNAs

The demonstration of a link between the lncRNA of interest and lipid metabolism can be variable from a publication to another (Table 1). For 38 of the 60 genes, a direct or indirect causative effect of the lncRNA on the lipid metabolism is given in response to an invalidation and/or an overexpression of the lncRNA. Such experiments were conducted in human (22) or murine (4) cell cultures or through in vivo experiments, by injection of viral vector in the tail of mice (11) or more rarely KO mice (3 with LeXis, MeXis and SRA1 lncRNAs). Out of these 38 studies, 25 go further to partially or totally decipher the action mechanism (Table 2). For 20 of the 60 genes, the link with lipid metabolism was only based on a co-expression between the lncRNA and one (several) transcript(s) or metabolite(s) related to the lipids in response to a disease (10), a genotype (6) or a molecule (8) known to act more or less specifically on lipid metabolism. Finally, 3 studies reported a link with lipid metabolism by GWAS analysis between genetic markers within or close to the lncRNA and a phenotype associated to lipid metabolism suggesting that the lncRNA is a potential causal gene for the phenotype variation.

Table 2 Mechanisms of action reported for 25 lncRNAs

All the 60 lncRNAs we suggested as involved in lipid metabolism cover most of the types of action described so far in the literature for the noncoding genes. As shown in Fig. 1, these lncRNAs may function as regulators of the transcription by acting at the DNA level (Fig. 1a), of the post-transcription and translation by acting at the RNA level (Fig. 1b) and finally of the post-translation by acting at the protein level (Fig. 1c). Concerning the underlying biochemical mechanisms, most of them are based on lncRNA-RNA or lncRNA-protein(s) interactions. RNA immunoprecipitation and pull-down assays [98] have revealed a vast range of interactions between lncRNAs and proteins, proteins that sometimes interact with other RNAs. Such interactions constitute real scaffolds that can inhibit or activate different biological processes. At the transcriptional level (Fig. 1a), different studies showed an action of lncRNAs on the promoters of genes involved in lipid metabolism. For example, it is the case of LeXis as reported in a very comprehensive study conducted by the Tontonoz’s lab [55] (Fig. 1a, right part): first, LeXis was observed as the most up-regulated lncRNA in mouse primary hepatocytes when treated with GW3965, an agonist to the liver X receptor (LXR) that mediates cellular and systemic cholesterol homeostasis and in particular inhibits cholesterol biosynthesis. The existence of a response element to LXR was then demonstrated in the LeXis promoter using luciferase reporter gene experiment and ChIP-qPCR. Using overexpression of LeXis by adenovirus injection or knockdown experiments in mouse, the authors show that LeXis decreased cholesterol and HDL and decreased the expression of genes involved in the cholesterol biosynthesis pathway. LeXis−/− mice also showed an increase in hepatic cholesterol. Present in the nucleus, LeXis is suspected to interact on gene transcription by modifying protein recruitment on chromatin. 
Tontonoz’s lab then demonstrated, using ChIRP-MS and ChIP experiments, that LeXis binds the heterogeneous nuclear ribonucleoprotein (hnRNP) RALY, suspected to be a transcriptional co-factor. RALY knockdown is responsible for a decrease in cholesterol levels as well as in the expression of genes of the cholesterol biosynthetic pathway. The authors showed, using knockdown and ChIP-qPCR experiments, that RALY binds the promoters of different cholesterologenic genes and activates their expression, an activation that LeXis affects by modulating RALY DNA binding [55].

Fig. 1

The different mechanisms of action of lncRNAs. a Mechanisms with effect at transcriptional level, b at post-transcriptional level and (c) on proteins. d LncRNAs with a role as small noncoding RNA host. e LncRNAs with translational activity through a small ORF. In red, lncRNA; in green, mRNA; in blue, miRNA; the green, yellow and blue oval spheres are proteins. The genes in bold are those mentioned in this review, the others are examples from research fields other than lipid metabolism: SAF66 and NRON [97]

Other studies have shown an action of lncRNAs on transcription. For example, APOA1-AS seems to inhibit the transcription of the APO gene cluster (APOA1, APOC3, APOA4, APOA5), which codes for protein components of lipoproteins, by DNA compaction through the modulation of epigenetic marks [41] (Fig. 1a, left part). This mechanism seems to require the recruitment of the LSD1 protein, known to induce gene silencing through the removal of active methyl marks, and of the SUZ12 protein, a key component of the polycomb repressive complex 2 (PRC2), known to mediate chromatin silencing through H3K27 trimethylation [41]. Indeed, APOA1-AS depletion in HepG2 cells increased the active histone H3K4me3 marks at the APOA1 promoter, in parallel with a significant decrease in LSD1 occupancy. APOA1-AS depletion also decreased the repressive histone H3K27me3 marks at the APOA1 promoter, which coincided with a marked reduction of SUZ12 occupancy in this region [41]. A second example of lncRNA action on DNA compaction is lnc-leptin [68] (Fig. 1a, middle part). Using chromatin conformation capture experiments, a direct interaction was detected between lnc-leptin and the LEP (leptin) gene, which codes for a major adipokine secreted by white adipocytes that functions as an energy sensor to regulate energy homeostasis. This “lnc-leptin/leptin promoter” interaction occurred at the enhancer region of LEP and was diminished upon lnc-leptin knockdown in mature adipocytes.

At the post-transcriptional level, some lncRNAs seem to play a role in the maturation of RNAs such as lncRNA uc.372 (Fig. 1b, left part) which prevents the maturation of a pri-miRNA by camouflaging the area targeted by the Drosha protein [93]. On mature RNAs, the so called “competing endogenous RNA (ceRNA)” role of lncRNAs resembles that of a sponge for small RNAs regulating the mRNA target of small RNAs (Fig. 1b, middle part). Among the lncRNAs involved in lipid metabolism, NEAT1 [80], SNHG16 [88] and PVT1 [83] are endogenous competitors of ATGL, SCD and FASN transcripts, respectively. Sometimes, lncRNA-protein complexes target a mRNA and thus regulate its stability (lnc-HC [65], H19 [50], MEG3 [76], NEAT1 [79], APOA4-AS [42]) or its translation (lncSHGL [71]) (Fig. 1b, right part). LncRNAs can bind via their three-dimensional conformation, involving one or more proteins in the structure (Fig. 1c). They can bind via their sequence RNA of hnRNP such as SPRY4-IT1 [89] (Fig. 1c, left part). They can also form protein scaffolds such as Blnc1 which associates with EDF1, LXRα and hnRNPU [95] or linc-ADAL which associates with IGF2BP2 and hnRNPU [57] (Fig. 1c, left part). These protein scaffolds do not yet have a well-defined mechanism of action. Some publications have gone further by highlighting the interest of this complex. For example, lncRNA protein complexes can modulate the half-life of the proteins involved in the complex, such as MALAT1 which binds to SREBP1c in order to stabilize it [75] (Fig. 1c, middle part), or they can modulate its function, such as LINK-A which binds to PIP3 and intensifies its interaction with Akt [59] (Fig. 1c, right part) or modify its cell location as illustrated by NRON [97]. Another role of lncRNA [99] is that of hosting small RNAs (Fig. 1d). 
SNHG16 [88] illustrates this role very well: it hosts 3 small nucleolar RNAs (snoRNAs), SNORD1A, SNORD1B and SNORD1C, which are positioned within its introns in the sense direction and thus benefit from co-transcription with the host lncRNA. Finally, lncRNAs can also host small ORFs (Fig. 1e) allowing the translation of small peptides. CASIMO1 is a good example: it hosts the sequence of a small transmembrane peptide that interacts directly with the SQLE protein and modulates the formation of lipid droplets [45].

New names proposed for misnamed lncRNAs

When a new field is explored, the associated nomenclature requires a certain amount of time to be standardized; this was the case for the nomenclature of protein-coding genes and it is also the case for long noncoding genes. When referring to the official HUGO Gene Nomenclature Committee (HGNC), it appears that some lncRNAs were not properly named, which can lead to misunderstandings. Indeed, the HGNC lists very precise rules on how to name lncRNAs [32]. These can be summarized in seven points (Fig. 2): 1. if the lncRNA function is well described, the lncRNA takes an abbreviated name symbolizing its function, e.g. LeXis for liver-expressed LXR-induced sequence, lncLSTR for liver-specific triglyceride regulator. If the function is unknown, the lncRNA takes the symbol of the nearby or host protein-coding gene, appended with a suffix describing its genomic location: 2. intronic lncRNA genes in the sense orientation are appended with -IT for Intronic Transcript (e.g. SPRY4-IT1); 3. lncRNA genes in the sense orientation that overlap a protein-coding gene are appended with -OT for Overlapping Transcript (e.g. SOD2-OT1); 4. antisense lncRNA genes are appended with -AS for Antisense (e.g. APOA1-AS, NFIA-AS1); 5. close intergenic lncRNAs (< 1 kb) transcribed divergently, in the opposite direction to a nearby protein-coding gene, take the symbol of this gene appended with -DT for Divergent Transcript (e.g. DHCR24-DT); 6. other intergenic lncRNAs take the name LINC followed by a number assigned by the HGNC (for human genes); 7. an exception exists for lncRNAs hosting small noncoding RNAs, which take the name of the hosted small RNA appended with HG for Host Gene (e.g. SNHG16). Finally, a long noncoding transcript that shares splice junctions with protein-coding transcripts is considered as an additional isoform and therefore belongs to this protein-coding gene [32]. 
In spite of this approved nomenclature guidance, some lncRNAs are still misnamed. For example, the lncRNAs described by Tristán-Flores et al. [62] have been misnamed lncACACA, lncFASN and lncSREBF1, although they are neither sense nor antisense to the ACACA, FASN and SREBF1 genes. Depletion of these lncRNAs by KO experiments affected the expression levels of the three genes, but this is not sufficient to justify these names.
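The seven rules summarized above can be written as a small decision function. The sketch below is only an illustration of the rules as listed in this section, not an HGNC tool: the function name, the argument names, the location labels and the order in which the rules are tested are assumptions made for the example, and LINC numbers are left as a placeholder since they are assigned by the HGNC.

```python
from typing import Optional

# Suffixes from the rules summarized in the text (points 2 to 5).
LOCATION_SUFFIX = {
    "intronic_sense": "IT",     # Intronic Transcript
    "overlapping_sense": "OT",  # Overlapping Transcript
    "antisense": "AS",          # Antisense
    "divergent": "DT",          # Divergent Transcript (< 1 kb, opposite strand)
}


def suggest_lncrna_symbol(
    location: str,
    neighbor_gene: Optional[str] = None,
    functional_symbol: Optional[str] = None,
    hosted_small_rna: Optional[str] = None,
    shares_splice_junctions: bool = False,
    index: Optional[int] = None,
) -> str:
    """Suggest a symbol for a lncRNA gene model (hypothetical helper)."""
    # Shared splice junctions: additional isoform of the coding gene.
    if shares_splice_junctions:
        return f"isoform of {neighbor_gene}"
    # Point 1: a well-described function gives the name.
    if functional_symbol:
        return functional_symbol
    # Point 7: host genes of small noncoding RNAs.
    if hosted_small_rna:
        return f"{hosted_small_rna}HG"
    # Points 2 to 5: neighbor gene symbol plus location suffix.
    if location in LOCATION_SUFFIX:
        number = "" if index is None else str(index)
        return f"{neighbor_gene}-{LOCATION_SUFFIX[location]}{number}"
    # Point 6: other intergenic lncRNAs; the number is assigned by HGNC.
    if location == "intergenic":
        return "LINC#####"
    raise ValueError(f"unknown location: {location}")
```

With these assumptions, a divergent lncRNA next to DHCR24 is given the symbol DHCR24-DT and an antisense lncRNA of NFIA with index 1 the symbol NFIA-AS1, while a transcript sharing splice junctions with HMGCR is reported as an isoform of HMGCR rather than as a new lncRNA gene.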

Fig. 2

HGNC decision tree for naming lncRNAs according to Wright’s schema [32], here updated by including divergent lncRNAs and lncRNAs hosting small noncoding RNAs

We propose to rename some lncRNAs according to the HGNC rules, whatever the species (Table 1, column 3). Seven lncRNAs have been renamed according to the nearby protein-coding gene: lnc_DHCR24 [20] to DHCR24-DT, AT115872 [39] to SOD2-OT1, XLOC_014379 [94] to NF1-IT1, FLRL3 [43] to RAD54B-AS1 and XLOC_019518 [94] to RNF7-DT, as well as LISPR1 (long intergenic noncoding RNA antisense to S1PR1) [60] to S1PR1-DT and uc.372 (ultra-conserved 372) [93], which has a generic name and should rather be called RALGAPA1-AS1. We have restored the official and standardized names, LINC01970 and SMCR2 (Smith-Magenis syndrome Chromosome Region, candidate 2), for the lncFASN and lncSREBF1 genes [62], respectively. All the other genes, XLOC_013639, XLOC_011279, XLOC_064871 [94], lncACACA [62], LOC157273 [74], Gm16551 [49], PLA2G1Bat1 [47] and lnc-KDM5D-4 [67], should have a standard name of the type LINC#####. Finally, the lncRNA AT102202 has several splice junctions in common with the HMGCR protein-coding gene and should therefore be considered as an HMGCR isoform rather than as a new lncRNA gene. Note that the knockdown of this isoform in HepG2 cells was described by Liu et al. to increase the expression level of the gene encoding HMGCR, due to the presence of several HMGCR isoforms with different expression patterns [39], although the authors did not specify which isoform was targeted by the siRNAs and which one was up-regulated (see Additional file 1). According to Additional file 1, the knockdown of the AT102202 isoform should lead to an absence of functional HMGCR protein.

In some cases, a gene is named with several aliases between species. This is the case for lncSHGL in mouse, known as B4GALT1-AS1 in human, both names being mentioned in Wang et al. [71]. Naming problems also exist within the same species. For example, the NCBI mouse gene named Gm30838 by the MGI (Mouse Genome Informatics) is given two names in Lo et al., lnc-ORIA9 and lnc-leptin [68]. Similarly, the human ENSG00000266304 lincRNA gene, first known as RP11-484N16.1, was named lnc18q22.2 by Atanasovska et al. [61] and finally officially renamed LIVAR. Because some genes have been studied only recently, they appear in databases with an “ID” name instead of the common name used in the literature (e.g. AC023161.1 instead of lncHR1 [66]). Other problems are due to articles that use lncRNA gene names without referring to any database: the most classical examples are XLOC/TCONS names, which reflect experiment-dependent naming by Cufflinks (e.g. the new pig genes in Huang et al. [94]), but other names exist, such as AT102202 [39], whose source is not easily traceable because it is a contraction of the NONCODE ID NONHSAT102202.

Detailed examination of model architecture

It is important to note that the vast majority of RNAs present in the databases are only predictions and that their structure has only very rarely been validated experimentally. A lncRNA model located close to a protein-coding gene and on the same strand can be spurious: it may be an extension of the protein-coding gene, owing to the difficulty of modelling the ends of transcripts [100]. We detected two lncRNAs in this case. The ALDBGALG0000005049 lncRNA is close to (~ 1 kb from the SCD 3’UTR) and on the same strand as the SCD gene (Fig. 3a), which encodes the stearoyl-CoA desaturase enzyme that catalyzes the rate-limiting step in the formation of monounsaturated fatty acids. Fan et al. reported that, in chicken myoblasts, the inactivation of the ALDBGALG0000005049 gene by siRNA led to an under-expression of the SCD gene and a modulation of other genes such as PPARA, PPARB and PPARG (Peroxisome Proliferator Activated Receptor) [101], making it a potentially important lncRNA in lipid metabolism. We therefore checked the existence of the ALDBGALG0000005049 gene using chicken liver cDNA treated with DNase, with genomic DNA as control. The amplification of the intergenic region “SCD-ALDBGALG0000005049”, confirmed by sequencing, clearly shows the existence of a transcript that overlaps the two genes (Fig. 3b), demonstrating that ALDBGALG0000005049 and SCD are in fact a single gene (Fig. 3c). We report another case, the lncRNA FLRL7, located very close to (150 nt from the FADS2 3’UTR) and on the same strand as the FADS2 (fatty acid desaturase 2) protein-coding gene (Fig. 3d); both were up-regulated in the liver of NAFLD mice according to Chen et al. [43]. Using mouse liver cDNA treated with DNase, we demonstrated that FLRL7 and FADS2 are a single gene (Fig. 3e), thus extending the 3′ UTR end of the FADS2 gene (Fig. 3f).

Fig. 3

Verification of the ALDBGALG0000005049 and FLRL7 gene models with their neighboring SCD and FADS2 genes in chicken and mouse. a, d Models of protein-coding genes are from Ensembl v.92 and models of lncRNAs are from the article of Fan et al. [101], which used ALDB v1.0, a lncRNA database (a for ALDBGALG0000005049), and from the article of Chen et al. [43], which used their own models (d for FLRL7). Primers (black arrows) were designed to amplify a fragment (red line) specific to the lncRNA gene (I), to the protein-coding gene (IV), to the intergenic region (III) and a fragment linking both genes (II). The expected sizes are specified (black for RNA; red for DNA) according to the models. b, e Electrophoretic gels with the lengths of the amplicons, showing the existence of a single gene, the lncRNA being an extension of the protein-coding gene. c, f New experimentally corrected models for the protein-coding gene

Conservation between phylogenetically distant species

Gene conservation during evolution, especially between phylogenetically distant species, suggests an important functional role for the conserved genes. The conservation of lncRNA function across species was tested in some studies, which showed that a function can be retained despite a low sequence conservation [29]. However, the task remains difficult because lncRNA sequences, as opposed to those of protein-coding genes, are poorly conserved between phylogenetically distant species, suggesting a more rapid evolution of lncRNAs [14, 20, 27]. Therefore, we searched for orthologous lncRNAs between phylogenetically distant species (chicken and mammals) using an approach based on genome synteny, as already presented in Muret et al. [20] and Foissac et al. [28] (see Materials & Methods). Even though this manual approach is time-consuming, we investigated all 60 long noncoding genes involved in lipid metabolism.
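The principle of a synteny-based search can be sketched as follows. This is a simplified illustration and not the procedure of Muret et al. [20] or Foissac et al. [28], whose criteria are given in Materials & Methods: here a lncRNA locus is assumed to be described only by its two flanking protein-coding genes (identified by orthologous gene symbols) and by its configuration relative to the nearest one. The class, field and function names, as well as the gene symbols used in any example, are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LncLocus:
    """Minimal description of a lncRNA locus (assumed representation)."""
    name: str
    left_gene: str       # flanking protein-coding gene on one side
    right_gene: str      # flanking protein-coding gene on the other side
    configuration: str   # e.g. "divergent", "same_strand", "antisense"


def is_syntenic(query: LncLocus, candidate: LncLocus) -> bool:
    """True if both loci share the same pair of flanking orthologous genes
    (in either order, to tolerate a locus inversion) and the same
    configuration relative to the neighboring gene."""
    same_flanks = {query.left_gene, query.right_gene} == {
        candidate.left_gene,
        candidate.right_gene,
    }
    return same_flanks and query.configuration == candidate.configuration


def find_syntenic_candidates(query: LncLocus, target_loci: List[LncLocus]) -> List[LncLocus]:
    """Return every lncRNA locus of the target species that is syntenic
    with the query locus."""
    return [locus for locus in target_loci if is_syntenic(query, locus)]
```

In such a scheme, a single hit would support a one-to-one positional orthology, whereas several hits for the same query, or a changed gene environment, would call for manual inspection.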

Out of the 60 lncRNA genes listed in Table 1, five satisfy the aforementioned criteria between chicken and human (or mouse) and can therefore be considered as orthologous between the species analyzed: CRNDE, DHCR24-DT, NFIA-AS1, PVT1 and SRA1. Their roles are described in more detail below. Note that 5 other lncRNAs are potentially orthologous, since 1. they correspond positionally to several lncRNAs in the other species or 2. the gene environment has slightly evolved, e.g. a new coding gene appears in the locus or the distances between genes differ between species. These lncRNAs are lncACACA, DYNLRB2–2, MeXis, PLA2G1Bat1 and TRIBAL. Table 3 shows the genomic positions of these 10 lncRNAs in human, mouse and chicken. These lncRNAs potentially play important roles in lipid metabolism since they act on the expression of key genes of this metabolism, such as key enzyme genes (FASN, SCD, ACADS, ACADVL, ATGL), lipid transport genes (ABCA1, FABP4) or lipid metabolism transcription factor genes (PPARA, PPARG).

Table 3 The 10 lncRNAs potentially conserved between human/mouse and chicken genomes

CRNDE (ColoRectal Neoplasia Differentially Expressed) is a lncRNA divergent with respect to the IRX5 gene, at 2 kb and 1.2 kb in human and chicken, respectively. In chicken, it has been described as ORat7 [47] (ALDBGALT0000001763). Composed of 6 to 9 exons in chicken and 6 exons in human, CRNDE is mainly noncoding but some isoforms are known in human to code for small proteins [102]. In human, CRNDE knockdown by siRNA in colorectal cancer cell lines leads to an under-expression of FASN, and an overexpression of ACADVL and ACOT9, two genes involved in lipid catabolism [46].

Lnc_DHCR24, renamed DHCR24-DT (Table 1), is a lncRNA divergent with respect to the DHCR24 gene, which codes for a key enzyme of cholesterol synthesis. We modelled this transcript for the first time in 2017 in chicken and located a possible orthologue in human [20]. The intergenic region is very small, i.e., 200 bp in human and 300 bp in chicken [20]. In a previous study, we suggested that these two genes are likely to be co-regulated by a bi-directional promoter, because of their high hepatic co-expression in several chicken lines (layers and broilers) analysed at different ages (young and adult stages) [20].

NFIA-AS1 is an intronic antisense lncRNA of the NFIA gene (Nuclear Factor I A). Composed of 4 to 7 exons, it is found in the second intron of the NFIA gene in human as well as in chicken. In THP-1 cells, it negatively regulates the expression of the NFIA gene through a negative feedback mechanism in which the transcription factor NFIA binds the promoter of NFIA-AS1 and activates it [86]. NFIA-AS1 (also called RP5-833A20.1) is induced by a high level of oxidized and acetylated LDL [86]. These two genes (NFIA-AS1 and NFIA) appear to regulate the homeostasis of cholesterol (LDL, HDL, VLDL) and of inflammatory cytokines (IL-1β/6, TNF-α and CRP) in Apo−/− mice [86].

PVT1 (Plasmacytoma Variant Translocation 1) is a lncRNA located on the same strand as MYC, its nearest protein-coding gene. The two genes are modelled in the NCBI databases of human, mouse and rat and are separated by 50-60 kb in these three species. In chicken, we also found PVT1 at a similar distance (56 kb) from the MYC gene. In human, PVT1 acts as a competing endogenous RNA by sponging MIR-195 in osteosarcoma cells [83]. It upregulates BCL1 and BCL2, two genes playing roles in the control of cell cycle and apoptosis, but also FASN, one of the key lipogenic enzymes.

SRA1, better known by its first name SRA (Steroid Receptor Activator), is one of the first lncRNAs described in human (1999) [103]. Its functional RNA acts as a steroid receptor coactivator. It can coactivate the androgen receptor (AR), estrogen receptors (ERα, ERβ), progesterone receptor (PR), glucocorticoid receptor (GR), thyroid hormone receptor (TR) and retinoic acid receptor (RAR) [104]. As early as 2003, it was demonstrated that it also codes for a functional protein, SRAP, which improves the trans-activation of PPARG and of the genes coding for AR and GR [105]. In human, as in mouse or other species, this double “protein coding-lncRNA” classification exists [106]. In this review, we are only interested in its noncoding isoform. SRA1 is an antisense exonic lncRNA of the gene coding for ANKHD1 in human. Currently, we do not know whether SRA1 has noncoding isoforms in chicken, but a coding isoform is present in the Ensembl database (ENSGALG00000040453). Experiments in SRA1 KO mice suggest that the noncoding SRA1 isoform regulates, in the liver, the expression of genes involved in lipid metabolism such as PPARA, PPARG, FABP4 and SCD, by a mechanism that is still unknown [91]. In 2016, the same team showed that, in the liver, the noncoding SRA1 isoform positively regulates the expression of PNPLA2. PNPLA2 codes for an enzyme involved in lipid hydrolysis whose under-expression leads to progressive liver steatosis [90].

For the 5 lncRNAs found to be conserved between human and chicken, two model species, Xenopus and zebrafish, and three livestock species, goat, cattle and pig, were added to the syntenic conservation analysis to better assess the conservation in vertebrates. Note that the model species Xenopus (370 Myr) and zebrafish (440 Myr) are phylogenetically more distant from human than chicken is (320 Myr) (Fig. 4a). We also selected cattle and pig because they are economically important livestock species, widely consumed in the world. To study conservation, in addition to the public reference Ensembl and NCBI databases, we used the lncRNA catalogues that we recently published for these species [28]. Fig. 4b shows the conservation of the 5 lncRNAs across the 8 species. In the current state of genome annotation, two genes, DHCR24-DT and NFIA-AS1, are conserved only within some amniotes, despite the fact that the two neighboring protein-coding genes, DHCR24 and NFIA, are conserved in other vertebrates. Such observations can be due to a poor gene annotation in the species where these two long noncoding genes were not identified. Another hypothesis is a convergent evolution in mammals and birds, which could explain why they are present in chicken and not in some mammals that are evolutionarily closer to human and mouse. On the other hand, CRNDE is found in all tetrapods with robust reliability, and the cluster of IRX genes in which CRNDE is located is very strongly conserved in vertebrates [109]. Finally, two genes are conserved in all vertebrates: PVT1 and SRA1, which are in particular involved in fatty acid (FA) biosynthesis through the regulation of two key enzymes, FASN and SCD, respectively.

Fig. 4

Genomic conservation in 8 species of the five lncRNA previously found as conserved between human and chicken. a Tree of genome evolution in vertebrates based on Kumar and Hedges studies [107, 108]. b Conservation of the five lncRNA (yellow) through the animal kingdom in relation to their genomic environment: protein-coding gene (blue). The distances between the intergenic entities are in bases

We then studied the expression of these five lncRNAs in three tissues known to be involved in lipid metabolism, using human [110] and chicken samples. The liver and adipose tissue were chosen as key organs/tissues for lipid synthesis and triglyceride storage, and the hypothalamus because of its central role in energy homeostasis via the regulation of food intake and energy expenditure [11]. The chicken embryo was also studied because many lncRNAs have been described as expressed during embryonic development [111,112,113]. The expression of the 5 lncRNAs is shown in Fig. 5a. In chicken, the 5 lncRNAs were more highly expressed in the 3 tissues and in the embryo than lncRNAs as a whole (t-test; p-value < 0.05), the median of their expression being equivalent to the third quartile of total lncRNA expression (Fig. 5b). LncRNAs are known to be very weakly expressed compared to protein-coding genes (10 to 100 times less [14, 20]). This observation shows that the lncRNAs described so far are generally more highly expressed than the full set of modelled lncRNAs. PVT1 and SRA1 were well expressed in all 3 tissues, in both human and chicken. DHCR24-DT was more highly expressed in the liver than in the other tissues in both species. In contrast, CRNDE was not expressed in human, at least in the tissues and samples analyzed. Finally, NFIA-AS1 was not expressed in any of the 3 tissues analyzed, whatever the species. This does not preclude expression in other tissues, since Hu et al. have shown a link between this gene and lipids in macrophage cells [86].
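The comparison between the median expression of a small gene set and the third quartile of a background distribution can be illustrated with a minimal sketch. The function name and all expression values below are hypothetical toy data, not measurements from this study, and the quartile method (Python's default "exclusive" method) is an assumption, since the text does not specify how quartiles were computed.

```python
import statistics


def median_vs_third_quartile(subset, background):
    """Return (median of subset, third quartile of background).

    Hypothetical helper: compares the typical expression of a few
    genes of interest with the upper quartile of all genes.
    """
    # quantiles(n=4) returns the three cut points Q1, Q2 and Q3
    _q1, _q2, q3 = statistics.quantiles(background, n=4)
    return statistics.median(subset), q3


# Toy expression values (arbitrary units), invented for illustration only
all_lncrna_expression = [float(x) for x in range(1, 12)]
studied_lncrna_expression = [8.0, 9.0, 9.5, 10.0, 12.0]

subset_median, background_q3 = median_vs_third_quartile(
    studied_lncrna_expression, all_lncrna_expression
)
print(f"median of studied set: {subset_median}")
print(f"third quartile of background: {background_q3}")
```

A subset median at or above the background third quartile corresponds to the situation described above, in which the studied lncRNAs sit in the upper part of the overall lncRNA expression distribution.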

Fig. 5

Expression of lncRNAs in embryo, liver, adipose tissue and hypothalamus in human and chicken. a Expression of the 5 lncRNAs conserved between human and chicken. b Expression of all lncRNAs (pale colors) in the different tissues against the 5 lncRNAs studied here. Embryo (E), liver (L), adipose tissue (A), hypothalamus (H). Top: expression in human with n = 3 (embryo not represented); bottom: expression in chicken with n = 16. **: p-value < 5%; ***: p-value < 1%

Discussion

To our knowledge, previous reviews taken together described 27 lncRNAs involved in lipid metabolism. An extensive literature search allowed us to report 33 additional lncRNAs, most of which were identified in human or mouse, resulting in a total of 60 long noncoding genes with a role in lipid metabolism. Such a search has to be meticulous because of the multiple aliases within and between species, and is tedious and time-consuming.

In addition to gathering these 60 genes related to lipid metabolism, we demonstrated experimentally that two lncRNAs, named ALDBGALG00000000505049 and FLRL7 by their authors and reported as related to lipid metabolism, were spurious; they lie close to, and on the same strand as, the SCD and FADS2 protein-coding genes. We also revised the lncRNA nomenclature and propose renaming some genes according to the official rules of the HUGO Gene Nomenclature Committee [32].

We then performed a comparative analysis to identify orthologous genes between phylogenetically distant species. Such an approach makes it possible to use the knowledge obtained in one species to infer the presence of lncRNAs in other species not examined so far. In particular, it facilitates the understanding of their biological role by allowing work on species in which gene manipulations are easier. Unfortunately, this search is difficult for lncRNAs because these genes are very poorly conserved in sequence between relatively distant species; it requires synteny approaches that examine the conservation of the local genomic structure around the gene of interest. Out of the 60 lncRNAs, we identified 5 lncRNAs conserved within amniotes (320 Myr), including two conserved within vertebrates (440 Myr). Such results are consistent with other studies; some articles reported less than 20% of lncRNAs conserved across mammals [114, 115]. Interestingly, we showed that the lncRNAs described so far are more highly expressed than the full set of lncRNAs present in the gene databases; the reason is likely technical, since it is easier to work on lncRNAs that are more highly expressed.

Conclusions

Such a catalogue of 60 genes is of interest, first, because it provides new regulators of a complex lipid metabolism known to be highly regulated at the mRNA level [42, 76]. Second, lipid metabolism is particularly important in human health because of the numerous diseases related to it (obesity, cardiovascular diseases, hepatic steatosis, etc.). Third, this metabolism is also important in livestock species, where it is linked to economically important traits such as intramuscular fat content, meat quality traits [116,117,118], ectopic fat deposition and economic carcass value. It should be noted that more than 2/3 of the 60 lncRNAs reported in this review were discovered in the last 2 years, and we have no doubt that this first set will be rapidly enriched in the coming years.

Methods

Literature review

The state of the art regarding lncRNAs potentially involved in lipid metabolism was obtained by expert curation of the PubMed literature database. For this purpose, lipid metabolism was defined as including the synthesis, degradation and transport of different types of lipids, with a particular interest in fatty acids and cholesterol. The keyword literature search was therefore conducted by combining the term “lncRNA” with one of the following terms, associated with apolipoprotein or with the 8 lipid classes: sterol (cholesterol), prenol, fatty acid, acylglycerol (triglyceride), phosphoglyceride, glycolipids, polyketides and sphingolipids.
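The keyword combinations described above can be enumerated with a short sketch. The term list is taken from the paragraph above; the `AND` operator, the quoting of multi-word terms and the function name are assumptions made for illustration, since the exact query syntax used for the search is not stated.

```python
# Terms listed in the text: apolipoprotein plus the lipid classes
# (with the parenthesized synonyms given for sterol and acylglycerol).
LIPID_TERMS = [
    "apolipoprotein",
    "sterol",
    "cholesterol",
    "prenol",
    "fatty acid",
    "acylglycerol",
    "triglyceride",
    "phosphoglyceride",
    "glycolipids",
    "polyketides",
    "sphingolipids",
]


def build_queries(base="lncRNA", terms=LIPID_TERMS):
    """Build one query string per lipid term (hypothetical query syntax)."""
    queries = []
    for term in terms:
        # Quote multi-word terms so they are searched as a phrase
        quoted = f'"{term}"' if " " in term else term
        queries.append(f"{base} AND {quoted}")
    return queries


queries = build_queries()
for query in queries:
    print(query)
```

Each resulting string would then be submitted to the literature database and the hits curated manually, as described in the text.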

Syntenic conservation between the chicken and the species where the lncRNA was discovered

The syntenic conservation analysis was performed for all 60 lncRNA genes identified in the first step, using the following annotation reference databases: Ensembl v.92, NCBI release 109, NONCODEv5 [18] and ALDB v1.0 [119], complemented by the new lncRNA models that we have previously published for chicken [20, 28]. Conservation of long noncoding genes by synteny analysis between human/mouse and chicken was assessed according to three criteria, used in the following order of priority: 1. The lncRNA in the human/mouse genome is surrounded by two neighboring protein-coding genes for which a 1-to-1 orthologous gene is available in the chicken genome. We considered the lncRNA as syntenically conserved if a lncRNA was also found in chicken between these two orthologous protein-coding genes, in the same orientation and order, and with the same relative position and/or intron-exon structure. 2. The lncRNA in the human/mouse genome is a close divergent lncRNA (< 1 kb) or is antisense to a protein-coding gene with a 1-to-1 orthologous gene in the chicken genome. 3. The lncRNA in the human/mouse genome hosts small noncoding genes. Such a lncRNA was considered syntenically conserved if a lncRNA in chicken hosts small noncoding genes with the same numbered names.
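The first criterion can be sketched as follows. This is a simplified illustration under stated assumptions, not the pipeline used in this study: the `Gene` record, the function names, the gene names and all coordinates are hypothetical, and the sketch only tests whether a candidate lncRNA lies between the two orthologous flanking genes. It does not check orientation, gene order or intron-exon structure, which the criterion also requires.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int
    end: int
    biotype: str  # "protein_coding" or "lncRNA"


def flanking_coding(lnc, genes):
    """Nearest protein-coding gene on each side of lnc (None if absent)."""
    coding = [g for g in genes
              if g.chrom == lnc.chrom and g.biotype == "protein_coding"]
    left = [g for g in coding if g.end < lnc.start]
    right = [g for g in coding if g.start > lnc.end]
    left_gene = max(left, key=lambda g: g.end) if left else None
    right_gene = min(right, key=lambda g: g.start) if right else None
    return left_gene, right_gene


def candidates_criterion1(lnc, source_genes, target_genes, orthologs):
    """Target lncRNAs located between the orthologs of the flanking genes.

    orthologs maps a source gene name to its 1-to-1 target ortholog name.
    """
    left, right = flanking_coding(lnc, source_genes)
    if left is None or right is None:
        return []
    by_name = {g.name: g for g in target_genes}
    t_left = by_name.get(orthologs.get(left.name))
    t_right = by_name.get(orthologs.get(right.name))
    if t_left is None or t_right is None or t_left.chrom != t_right.chrom:
        return []
    # Interval between the two orthologs, whichever comes first
    lo = min(t_left.end, t_right.end)
    hi = max(t_left.start, t_right.start)
    return [g for g in target_genes
            if g.biotype == "lncRNA" and g.chrom == t_left.chrom
            and lo < g.start and g.end < hi]


# Toy annotation with invented names and coordinates
human = [
    Gene("GENE_A", "1", 1_000, 5_000, "protein_coding"),
    Gene("lncX", "1", 6_000, 7_000, "lncRNA"),
    Gene("GENE_B", "1", 9_000, 12_000, "protein_coding"),
]
chicken = [
    Gene("GENE_A_gg", "8", 500, 3_000, "protein_coding"),
    Gene("lncX_gg", "8", 3_500, 4_200, "lncRNA"),
    Gene("GENE_B_gg", "8", 5_000, 7_000, "protein_coding"),
    Gene("lncY_gg", "8", 8_000, 8_500, "lncRNA"),
]
orthologs = {"GENE_A": "GENE_A_gg", "GENE_B": "GENE_B_gg"}

hits = candidates_criterion1(human[1], human, chicken, orthologs)
print([g.name for g in hits])
```

In practice, candidates returned by such a positional filter would still need to be checked for strand, order and structure before being called syntenically conserved.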

Syntenic conservation between eight species

The syntenic conservation analysis was further performed across eight species (human, mouse, goat, cattle, pig, chicken, Xenopus and zebrafish) for the subset of 5 lncRNA genes found to be conserved between chicken and human/mouse. We used the same three criteria described above, taking the lncRNA reference databases in the following order: Ensembl v.92, NCBI annotation release 109, and the new lncRNA catalogue that we have previously published for livestock species (goat, cattle, pig) [28]. The phylogenetic tree was built with the phyloT v2018.3 tool (https://phylot.biobyte.de/), which generates phylogenetic trees based on NCBI species names and taxonomy. The tree was visualized with the iTOL v4.2.3 tool [120].

Tissue expression analysis in chicken and human

The raw human expression data from liver, white adipose tissue and brain were obtained from RNAseq data (2 × 100 bp) from three subjects (approximately 18 M read-pairs per sample) chosen from the NCBI BioProject accession number PRJEB4337 [110]. The aim of Fagerberg et al. was to study the tissue-specific expression of genes across 27 healthy human tissues. The chicken liver and abdominal adipose tissue expression data were obtained from RNAseq data (2 × 100 bp) from 16 animals (about 55 M read-pairs per sample for liver and 70 M for adipose tissue) extracted from our PRJNA330615 project [20]. The aim of Muret et al. was to model new lncRNAs in chicken in these two tissues. For the hypothalamus and the embryo in chicken, the raw data correspond to RNAseq data (2 × 150 bp) from 16 individuals (approximately 50 M read-pairs per sample - PRJEB28745). For these 2 species and all the tissues mentioned, normalized RPKM data were used.
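For reference, the standard RPKM definition (reads per kilobase of transcript per million mapped reads) can be written as a short function. This is a generic sketch of the usual formula, not the normalization code used in this study; the function name and the example counts are hypothetical. For paired-end data such as those described above, counts are often taken per fragment (read-pair) rather than per read, which is an additional assumption left to the caller.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase per million mapped reads.

    RPKM = read_count * 1e9 / (gene_length_bp * total_mapped_reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Toy example: 500 reads on a 2 kb gene in a library of 50 M mapped reads
example_value = rpkm(500, 2_000, 50_000_000)
print(example_value)
```

Normalizing by both gene length and library size is what makes values comparable between genes of different lengths and between samples sequenced at different depths, such as the 18 M and 55 M read-pair libraries mentioned above.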

RT-PCR for remodeling gene structure

Total RNA was extracted from the liver of 48 chickens (9-week-old broilers) and mice (C57BL/6) as described by Desert et al. [4] and reverse-transcribed using the High-Capacity cDNA Reverse Transcription kit (Applied Biosystems, Foster City, CA) following the manufacturer’s instructions. cDNAs were treated with DNAseI (kit Ambion – RNAse-free) and diluted 1:10 for specific PCR amplification using the primers defined in Additional file 2. The amplification specificity was confirmed by sequencing. The amplification fragment lengths were verified on a 2% agarose gel using the SmartLadder marker (Eurogentec MW1700_10).