Introduction

Obesity is a major global public health burden. It is characterized by excessive accumulation of fat mass in white adipose tissue, which occurs due to increased adipocyte volume, increased number of cells, or both1,2. Adipocytes develop from mesenchymal stem cells (MSCs), which become committed preadipocytes and then fully differentiated adipocytes. There are two main types of adipocytes—white and brown adipocytes. White adipose tissue (WAT) functions in fat storage and is characterized by adipocytes containing large unilocular lipid droplets. It is an active endocrine organ that regulates insulin sensitivity, lipid metabolism and satiety. It is distributed throughout the body in subcutaneous regions and surrounds visceral organs. The main function of brown adipose tissue (BAT) is thermogenesis. It is composed of multiloculated adipocytes that contain large numbers of mitochondria. Its location is restricted to the paravertebral, supraclavicular and periadrenal regions3. Our focus is on white adipogenesis since obesity is caused by WAT outgrowth4.

Murine cell lines, such as 3T3-L1, 3T3-F442A and OP9, are commonly used in adipogenesis research1, with 3T3-L1 cells being the most popular model. However, these murine cells cannot completely recapitulate the process in human cells due to several differences. For example, 3T3-L1 and 3T3-F442A cell lines express the adipocyte-secreted factor leptin at much lower levels than primary adipocytes. Recently, an increasing number of studies have used human adipose-derived stem cells (ASCs) and human preadipocytes to better capture characteristics of adipose tissue when addressing questions of adipose-related biology1. However, in-depth sequencing analysis of these human cells remains limited, even though it is critical to dissecting the molecular regulatory mechanisms that characterize human adipogenesis. In our study, we examine gene expression regulation during differentiation of human primary preadipocytes into white adipocytes by using high-depth sequencing libraries and advanced bioinformatic tools, and in this way expand our understanding of the role of long non-coding RNAs (lncRNAs), alternative splicing, and alternative polyadenylation (APA).

LncRNAs are RNA transcripts that are longer than 200 nucleotides, generally have a 5’-cap and 3’-polyA tail, and do not encode functional proteins. They regulate various processes, and some play important roles in adipogenesis based on studies using cell lines and primary cells5,6. Since lncRNAs have low conservation between human and mouse7, studies using primary human cells are of greater relevance to translational research. Studies using primary human adipocytes have characterized individual lncRNAs identified from microarray or RNA-sequencing8,9 or studied expression in cells from obese and lean individuals10. One study in 2015 examined global changes in lncRNA expression during differentiation of human adipose-derived stromal cells11. However, improvements in sequencing strategies may now allow the identification of novel lncRNAs with important roles in adipogenesis that were missed in the earlier studies. In our study, we sought to use global profiling to identify such lncRNAs and provide a resource for comparative analysis with future datasets.

APA during adipogenesis has been less extensively studied. APA is a pre-mRNA processing mechanism that generates different mRNA isoforms based on usage of alternative polyA (pA) sites. APA often results in isoforms with the same coding sequence but different 3’-UTR lengths, with downstream effects on mRNA stability, translatability, localization, and RNA-seeded protein/protein interactions, while intronic APA can alter both the coding sequence and 3’-UTR sequences12,13,14,15. Use of pA sites located in introns affects the coding sequence, leading to transcript degradation or expression of different protein isoforms12,13,15. Previously, analysis of 3’-end reads showed an overall lengthening during 3T3-L1 differentiation16. Also, in 3T3-L1 cells, the shorter transcript of heme oxygenase 1 (HO1) was demonstrated to have a stronger inhibitory effect on differentiation of preadipocytes17. In 2013, a different group used RNA sequencing to investigate APA in the 3’-UTR and found that differentiated human adipocytes have a modest trend towards longer 3’-UTRs compared to undifferentiated adipocytes18. However, a comprehensive analysis of APA at a genome-wide level with standard RNA-seq data is difficult as few reads are localized in the 3’-UTR. A goal of our study was to use 3’-focused sequencing data to accurately and quantitatively characterize alternative pA site usage and location and identify specific white adipogenesis-related genes undergoing such regulation.

Alternative splicing is another pre-mRNA processing mechanism that diversifies the transcriptome and regulates gene expression19. It is carried out by the spliceosome and various RNA-binding proteins (RBPs) that activate or repress splicing of regulated exons by binding to enhancer or silencer elements, respectively, in the pre-mRNA19. Spliced isoforms and splicing factors relevant to adipogenesis have for the most part been identified in 3T3-L1 cells and mouse animal models19. Some studies have characterized specific spliced isoforms in human adipogenesis20,21, but to our knowledge, only one paper addressed global alternative splicing during the process by inducing mesenchymal stem cells from bone marrow to differentiate into adipocytes22. In the current study, we sought to take advantage of a recent dataset23 on differentiation of stem cells derived from subcutaneous adipose tissue and analyze global alternative splicing with a commonly used tool, rMATS24.

By utilizing multiple sequencing datasets and newly available bioinformatics tools, combined with a thorough investigation into available experimental evidence, we have identified potentially new mechanisms of regulation of white adipocyte differentiation at the level of lncRNAs, alternative splicing and APA.

Methods

RNA samples and sequencing

Commercially available total RNA from human preadipocytes (Day 0) and their respective differentiated adipocytes (Day 14) was purchased from Zen-Bio Inc (Durham, NC, USA). The preadipocytes were isolated from human subcutaneous adipose tissue, plated and then differentiated into mature adipocytes for 14 days using the company’s Adipocyte Differentiation Medium (catalog number DM-2), yielding cells that were rounded with large lipid droplets apparent in the cytoplasm (Supplementary Fig. S1). Zen-Bio has also confirmed that the resulting mature adipocytes expressed the fatty acid binding protein aP2/FABP4, responded to lipolytic agents, and secreted leptin and adiponectin. We also tested expression of markers by reverse transcription of the RNA samples to cDNA using the NEB LunaScript RT SuperMix kit (New England Biolabs, M3010L) followed by qPCR using NEB Luna Universal qPCR Master Mix (New England Biolabs, M3003L) and primers listed in Supplementary Table S1. Expression of adipogenesis markers was normalized to that of RPL13A and the qPCR results were quantified using the ddCt method. Details of the donors of the adipose tissue samples are provided in Supplementary Table S2. These samples were sent to Admera Health LLC (NJ, USA) for 3’-end sequencing using the QuantSeq FWD library kit. The generated cDNA libraries were sequenced as 150 bp paired-end reads with a read depth of 60 million reads on an Illumina NovaSeq X Plus (Illumina, California, USA) instrument.
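As an illustration of the ddCt quantification mentioned above, the following is a minimal sketch rather than the exact calculation used in this study. It assumes roughly 100% primer efficiency (one doubling per cycle); the function name and the Ct values in the usage line are hypothetical.

```python
def fold_change_ddct(ct_target_day14, ct_ref_day14, ct_target_day0, ct_ref_day0):
    """Relative expression (Day 14 vs Day 0) by the ddCt method.

    Assumes ~100% amplification efficiency, so each cycle is a 2-fold change.
    """
    # dCt: marker Ct normalized to the reference gene (RPL13A in the text).
    d_ct_day14 = ct_target_day14 - ct_ref_day14
    d_ct_day0 = ct_target_day0 - ct_ref_day0
    # ddCt: difference between differentiated and undifferentiated samples.
    dd_ct = d_ct_day14 - d_ct_day0
    return 2.0 ** (-dd_ct)

# Hypothetical Ct values, for illustration only.
fold = fold_change_ddct(20.0, 18.0, 26.0, 18.0)
```

A lower Ct for the marker at Day 14 than at Day 0, with an unchanged reference Ct, yields a fold change above 1.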

Differential gene expression analysis

The raw sequence reads of the samples were processed to remove adapters using Cutadapt25 (v2.8) and the quality of the trimmed reads was assessed using FastQC26 (v0.11.8) and MultiQC27 (v1.7.0). The FastQC reports have been uploaded to the GEO database (GSE250525). R1 reads were aligned to the human hg38 genome using STAR28 (v2.6.1d). Aligned reads were then quantified using featureCounts29 (v1.6.3) and log2-fold changes were calculated with DESeq230 (v1.40.2) using default parameters and the UCSC annotation file (created on 2015-08-14). An absolute log2-fold change of at least 1 and an adjusted p-value of less than 0.05 were applied to select differentially expressed genes. The enrichment of Gene Ontology Biological Processes (GOBP) and Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathways for upregulated and downregulated genes was carried out using the DAVID Functional Annotation tools31 with default settings and with background genes restricted to those detected in our analysis.
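The selection of differentially expressed genes can be sketched as follows. This is an illustrative example, not the code used in this study: it reads the cutoffs as an absolute log2-fold change of at least 1 with an adjusted p-value below 0.05 (as in the Fig. 1 legend), and the function name and example values are hypothetical.

```python
def classify_gene(log2_fc, padj, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Label one gene as 'up', 'down' or 'unchanged'."""
    # A missing adjusted p-value (None) is treated here as not significant.
    if padj is None or padj >= padj_cutoff:
        return "unchanged"
    if log2_fc >= lfc_cutoff:
        return "up"
    if log2_fc <= -lfc_cutoff:
        return "down"
    return "unchanged"

# Hypothetical (log2 fold change, adjusted p-value) pairs, not study results.
example = {
    "GENE_A": (3.2, 1e-10),
    "GENE_B": (-1.5, 0.003),
    "GENE_C": (0.4, 0.01),
    "GENE_D": (2.0, None),
}
calls = {gene: classify_gene(lfc, padj) for gene, (lfc, padj) in example.items()}
```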

To obtain a comprehensive list of lncRNAs, we obtained a complete list of gene symbols from GeneCards32, along with their categories, from https://www.genecards.org/cgi-bin/cardlisttxt.pl. We then filtered for the “RNA Gene” category and then further filtered these “RNA Genes” by removing RNAs such as piRNAs and miRNAs, that are not lncRNAs. Using this list as reference, we extracted data for differentially expressed lncRNAs from the differential gene expression data of the DESeq2 analysis above.

For analysis of differential gene expression of lncRNAs during brown adipogenesis, we used the publicly available GEO dataset GSE16838733, which contains three replicates each of brown preadipocytes (Day 0) and differentiated adipocytes (Day 14) from the human immortalized brown adipocyte Paz6 cell line. Paired-end reads of samples (GSM5137833, GSM5137834, GSM5137835, GSM5137836, GSM5137837, GSM5137838) were aligned to the hg38 genome, quantified, and the log2-fold changes were calculated as above. Differentially expressed lncRNAs were selected based on adjusted p-value < 0.05 and log2-fold change > 1 or < −1.
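Once lncRNAs have been called as up- or downregulated in each dataset, the white and brown results can be compared by direction of change. The sketch below is an illustration under assumptions: each input is a dictionary mapping a lncRNA symbol to "up" or "down", and the placeholder symbols (LNC_X, LNC_Y, LNC_Z) are hypothetical. Only the LINC00312 directions follow what is reported in this paper.

```python
def compare_directions(white, brown):
    """Split differentially expressed lncRNAs by agreement between two datasets."""
    shared = set(white) & set(brown)
    return {
        "same": sorted(g for g in shared if white[g] == brown[g]),
        "opposite": sorted(g for g in shared if white[g] != brown[g]),
        "white_only": sorted(set(white) - set(brown)),
        "brown_only": sorted(set(brown) - set(white)),
    }

# LNC_X/Y/Z are placeholders; LINC00312 is down in white and up in brown.
white_calls = {"LINC00312": "down", "LNC_X": "up", "LNC_Y": "down"}
brown_calls = {"LINC00312": "up", "LNC_X": "up", "LNC_Z": "down"}
overlap = compare_directions(white_calls, brown_calls)
```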

Alternative polyadenylation analysis

APA analysis was performed using a polyA_DB-based approach. Adapter and poly(T) sequences were trimmed from the QuantSeq FWD raw read 2 data. Trimmed reads were then aligned to a human (hg38) polyA_DB34 reference (−100 nt to +25 nt around each polyA_DB-annotated polyA site (PAS)) using STAR (v2.7.7a). The last aligned position of each mapped read was compared to polyA_DB-annotated PASs, allowing ±24 nt flexibility. Matched reads were considered pA site-supporting (PASS) reads, which were used for further APA analysis. UCSC genome browser plots shown in this paper are based on PASS reads. For 3’-UTR APA analysis, the two pA sites with the highest usage in the 3’-UTR were compared, and for intronic APA analysis, all intronic isoforms were combined and compared to all isoforms using pA sites in the terminal exon. A RED (Relative Expression Difference) value was calculated as the difference in the log2-ratio of isoform abundance (isoforms using the distal pA site vs those using the proximal pA site) between undifferentiated and differentiated samples. A “shortening” or “lengthening” of a gene was called when RED < −log2(1.2) or RED > log2(1.2), respectively, with a BH (Benjamini and Hochberg)-adjusted P-value < 0.05 (Fisher’s exact test).
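The RED calculation and the shortening/lengthening call can be sketched as below. This is a minimal illustration, not the pipeline used in this study. It assumes RED is the differentiated log2(distal/proximal) ratio minus the undifferentiated one, so that positive values mean lengthening. The pseudocount that avoids division by zero is an added assumption, and the adjusted P-value is taken as an input computed elsewhere. Function names and read counts are hypothetical.

```python
import math

def red_value(prox_undiff, dist_undiff, prox_diff, dist_diff, pseudocount=1.0):
    """RED = log2(distal/proximal) in differentiated minus undifferentiated cells."""
    # The pseudocount is an assumption added to handle zero read counts.
    ratio_undiff = math.log2((dist_undiff + pseudocount) / (prox_undiff + pseudocount))
    ratio_diff = math.log2((dist_diff + pseudocount) / (prox_diff + pseudocount))
    return ratio_diff - ratio_undiff

def call_apa(red, padj, cutoff=math.log2(1.2), alpha=0.05):
    """Apply the RED and adjusted P-value thresholds described in the text."""
    if padj < alpha and red > cutoff:
        return "lengthened"
    if padj < alpha and red < -cutoff:
        return "shortened"
    return "unchanged"

# Hypothetical PASS read counts: proximal/distal, undifferentiated then differentiated.
red = red_value(99, 99, 49, 199)
status = call_apa(red, padj=0.01)
```

Note that log2(1.2) is approximately 0.26, which corresponds to a 1.2-fold shift in the distal-to-proximal ratio.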

The enrichment of GOBP and KEGG Pathways for shortened and lengthened genes was carried out using the DAVID Functional Annotation tools31 with default settings and with background genes restricted to those detected in our analysis.

The presence of miRNA-binding sites was identified using the “Custom prediction tool” in the miRDB35,36 database. miRDB reports target prediction scores between 50 and 100. For scores below 60, there is low confidence in the prediction, and additional supporting evidence is recommended. Therefore, miRNA-binding predictions with “target scores” less than 60 were excluded from our analysis. We also examined whether binding sites of these miRNAs are enriched in the 3’-UTRs of genes undergoing 3’-UTR APA relative to the 3’-UTRs of all genes regardless of their APA status. To do this, we used miRDB to determine the number of genome-wide targets for the miRNAs of interest (miRNA targets with a target score less than 60 were filtered out) and calculated the ratio of these miRNA targets to all genes expressed in the system. We then compared that to the ratio of miRNA-target genes undergoing APA to all genes that undergo APA in our analysis. Statistical analysis was done using a two-tailed Poisson test.
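The enrichment comparison can be illustrated with the sketch below. It is not the authors' implementation, and the exact form of the two-tailed Poisson test is not specified in the text. Here the background ratio sets the expected number of miRNA targets among APA genes, and the two-tailed P-value is taken as twice the smaller one-sided tail (capped at 1), which is one common convention. All function names and counts are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of observing exactly k events with mean lam (> 0)."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def poisson_two_tailed(observed, expected):
    """Two-tailed P-value as twice the smaller one-sided tail (an assumed convention)."""
    lower = sum(poisson_pmf(k, expected) for k in range(observed + 1))
    upper = max(0.0, 1.0 - sum(poisson_pmf(k, expected) for k in range(observed)))
    return min(1.0, 2.0 * min(lower, upper))

def apa_enrichment(n_targets_expressed, n_expressed, n_targets_apa, n_apa):
    """Compare the miRNA-target ratio among APA genes with the background ratio."""
    background_ratio = n_targets_expressed / n_expressed
    apa_ratio = n_targets_apa / n_apa
    expected = background_ratio * n_apa  # targets expected among APA genes by chance
    return apa_ratio, background_ratio, poisson_two_tailed(n_targets_apa, expected)

# Hypothetical counts: no enrichment (15 observed, 15 expected) vs strong enrichment.
apa_ratio, background_ratio, p_none = apa_enrichment(100, 1000, 15, 150)
_, _, p_enriched = apa_enrichment(100, 1000, 40, 150)
```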

The presence of RBP-binding sites was identified through RBPmap37,38 (v1.2), using the hg38 database assembly, all 223 human/mouse motifs included in RBPmap, the stringency level set to high, and default settings for other parameters. For analysis of RBP-binding, we overlapped results from RBPmap38 with CLIP-seq data from the POSTAR339 database to identify RBP-binding to 200 bp upstream and downstream of proximal, distal and intronic polyadenylation sites.

Alternative splicing analysis

We used the GEO dataset GSE17602023 for alternative splicing analysis. In this dataset, adipose tissue-derived stem cells (ASCs) were induced to differentiate in vitro. We analyzed data from undifferentiated (Day 0) and differentiated (Day 9) samples derived from abdominal subcutaneous adipose tissue (GSM5352869, GSM5352872, GSM5352874, GSM5352877, GSM5352879, GSM5352882). FASTQ files from two runs of each sample were merged, followed by alignment of the paired-end reads to the human hg38 genome using STAR28 (v2.6.1d). The BAM (alignment) files generated were used as input to the rMATS24 (v4.1.2) program, using all default parameters and the UCSC annotation file (created on 2015-08-14), to detect splicing events between undifferentiated and differentiated adipocyte samples. The “JCEC” output files containing both junction and exon counts were used to detect significant differentially spliced events. To increase confidence in the rMATS calls, events were only considered if the counts of each sample in both conditions were greater than 25. Differentially spliced events were called if dPsi [IncLevelDifference, the difference in psi (percent-spliced-in) values between undifferentiated and differentiated samples] was greater than 0.05 or less than −0.05, and if the FDR value was less than 0.05. The enrichment of GOBP and KEGG Pathways for differentially spliced genes was carried out using the DAVID Functional Annotation tools31 with default settings and with background genes restricted to those detected in our analysis. Sashimi plots for specific genes were generated using rmats2sashimiplot (v3.0.0) (https://github.com/Xinglab/rmats2sashimiplot).
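The filtering of rMATS events can be sketched as follows. This is an illustration, not the code used in this study. It uses the per-replicate count columns of the rMATS output (IJC_SAMPLE_1, SJC_SAMPLE_1, IJC_SAMPLE_2, SJC_SAMPLE_2) together with IncLevelDifference and FDR. Interpreting the count filter as inclusion plus skipping reads greater than 25 in every replicate is an assumption, and the example rows are hypothetical.

```python
def parse_counts(field):
    """rMATS stores per-replicate counts as a comma-separated string."""
    return [int(x) for x in field.split(",")]

def is_significant(row, min_count=25, dpsi_cutoff=0.05, fdr_cutoff=0.05):
    """Apply the count, dPsi and FDR filters to one rMATS event (dict of strings)."""
    for sample in ("1", "2"):
        inc = parse_counts(row[f"IJC_SAMPLE_{sample}"])
        skp = parse_counts(row[f"SJC_SAMPLE_{sample}"])
        # Assumption: total (inclusion + skipping) reads must exceed min_count
        # in every replicate of both conditions.
        if any(i + s <= min_count for i, s in zip(inc, skp)):
            return False
    dpsi = float(row["IncLevelDifference"])
    return abs(dpsi) > dpsi_cutoff and float(row["FDR"]) < fdr_cutoff

# Hypothetical events, for illustration only.
kept = is_significant({
    "IJC_SAMPLE_1": "40,55,61", "SJC_SAMPLE_1": "10,12,9",
    "IJC_SAMPLE_2": "20,25,22", "SJC_SAMPLE_2": "30,33,41",
    "IncLevelDifference": "0.31", "FDR": "0.001",
})
low_coverage = is_significant({
    "IJC_SAMPLE_1": "4,55,61", "SJC_SAMPLE_1": "1,12,9",
    "IJC_SAMPLE_2": "20,25,22", "SJC_SAMPLE_2": "30,33,41",
    "IncLevelDifference": "0.31", "FDR": "0.001",
})
```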

The splicing output files from rMATS were input into “Motif Map” analysis of rMAPS240,41 (http://rmaps.cecsresearch.org/MTool/) using default parameters. This analysis involves scanning for occurrences of binding motifs of known RBPs in the intronic sequences 250 bp upstream or downstream of the target exon or flanking exons, and the first 50 bp of the 5’- or 3’-end of exonic sequences of included, skipped and non-regulated or background exons. A motif score (density) within a 50 bp sliding window is calculated, along with a P-value given in comparison between regulated and control exons40,41. Motif maps generated by rMAPS2 were edited in Inkscape (www.inkscape.org).

Image generation

Volcano plots and bar graphs were generated using RStudio (2023.09.0) with the packages dplyr (v1.1.4)42, ggplot2 (v3.4.3)43, EnhancedVolcano (v1.18.0)44 and smplot2 (v0.1.0)45. Pie charts and bar graphs were generated using GraphPad Prism version 10.0.2 for Windows, GraphPad Software, Boston, Massachusetts USA, www.graphpad.com.

Results

Long non-coding RNAs LINC00312, LINC00607 and TYMSOS may be key regulators of white adipogenesis

We sequenced polyadenylated RNAs from three different sets of preadipocytes isolated from human subcutaneous adipose tissue and compared these 3’-end datasets to those from white adipocytes derived from these precursors. Differential gene expression analysis of the sequencing data shows upregulation of previously established adipogenesis markers FABP4, ADIPOQ and PPARG46 (Supplementary Fig. S2A), and this was confirmed by RT-qPCR analysis of the same RNA samples used for sequencing (Supplementary Fig. S3). This analysis supports the efficient differentiation of our samples. GOBP and KEGG enrichment analysis of upregulated genes shows significantly enriched terms related to adipogenesis, such as fatty acid metabolism, PPAR signaling, and lipid metabolic process, whereas that of downregulated genes shows enriched terms related to cell cycle and cell adhesion (Supplementary Figs. S2B and S2C).

LncRNAs have been reported to have deregulated expression in obese individuals, and some are known to play a role in the differentiation of both brown and white adipocytes5. Analysis of our dataset detected upregulation and downregulation of 158 and 77 lncRNAs, respectively, during differentiation into white adipocytes (Fig. 1A and Supplementary Table S3). Sixteen of the 158 upregulated lncRNAs were previously reported, based on integrative co-expression analysis, as enriched in adipocytes, and 5 of the downregulated 77 lncRNAs were enriched in adipose progenitor cells47. The others have not been identified before as enriched in any particular tissue. Among the top 5 downregulated and top 5 upregulated lncRNAs (Fig. 1A), only ADIPOQ-AS1 is known to function in adipogenesis through the regulation of murine adiponectin48. The large change in the expression of the nine novel lncRNAs warrants further investigation into their role in human adipogenesis. Most of these lncRNAs have deregulated expression in cancer, affecting properties like proliferation, migration, and invasion of cancer cells through their target miRNA/gene axis49,50,51,52,53,54,55. Considering that cell cycle and proliferation genes are downregulated during early adipogenesis56,57, these lncRNAs may promote adipogenesis by targeting cell proliferation.

Figure 1

Multiple lncRNAs are differentially expressed during white adipogenesis. (A) Volcano plot depicts differential gene expression of lncRNAs at Day 0 vs Day 14 of differentiation. Red represents upregulated genes and blue represents downregulated genes that have p-adjusted values less than 0.05 and log2-fold change of more than or equal to 1, or less than or equal to − 1, respectively. The top 5 upregulated and top 5 downregulated lncRNAs are highlighted. (B) Venn diagram shows overlap analysis of lncRNAs differentially expressed during white and brown adipogenesis. (C) Possible mechanisms of lncRNAs LINC00312, LINC00607 and TYMSOS in the regulation of adipogenesis.

To identify lncRNAs differentially expressed in the same or opposite manner during white and brown adipogenesis, we carried out differential expression analysis of a previously published human brown adipogenesis dataset33. Overlap analysis with our white adipogenesis dataset (Fig. 1B) showed upregulation of 13 and downregulation of 8 lncRNAs during differentiation of both brown and white adipocytes (Table 1). Additional lncRNAs were up- or down-regulated specifically in brown or white adipocytes.

Table 1 List of lncRNAs that are differentially expressed during brown and white adipogenesis in either the same or the opposite direction.

Among the lncRNAs upregulated or downregulated during both brown and white adipogenesis, only LIPE-AS1 and LINC01119 have been shown to have a functional role and are differentially expressed during either murine or human adipogenesis58,59. LncRNAs FOXD2-AS1, LINC00968, LINC01116 and SENCR regulate glioma stem cell60, osteogenic61, neuronal62 and endothelial63 differentiation, respectively, and may play similar roles in adipogenic differentiation via regulation of target genes and miRNAs. The other lncRNAs (LINC01085, LINC01003, LINC00663, LINC01140, RAMP2-AS1, NIFK-AS1, LINC00673, FRMD6-AS1, SLC8A1-AS1) have been associated with cancer, playing diverse roles in cancer proliferation and migration, cell cycle and signaling pathways64,65,66,67,68,69,70,71,72,73.

No lncRNAs were upregulated during white adipogenesis but downregulated during brown adipogenesis. Three lncRNAs, LINC00312, LINC00607 and TYMSOS, were upregulated during brown adipogenesis but downregulated during white adipogenesis. None of these have been characterized in adipogenesis, but we hypothesized their regulatory mechanisms based on literature review (Fig. 1C).

LINC00312 may regulate adipogenesis by two potential mechanisms.

(1) LINC00312 directly binds to and negatively regulates expression of miR-9 in breast cancer74. Also, miR-9 is upregulated during 3T3-L1 adipogenesis, but surprisingly, its overexpression inhibits 3T3-L1 differentiation75. LINC00312 potentially regulates adipogenesis by regulating availability of miR-9.

(2) LINC00312 directly binds to and negatively regulates expression of miR-21 in colorectal cancer cells76. miR-21 is reported to positively regulate adipogenesis and is upregulated in the adipose tissue of obese individuals77. miR-21 in 3T3-L1 cells decreased expression of adipogenic markers and increased expression of genes involved in thermogenesis and browning78. Therefore, miR-21 is a possible link between LINC00312 and adipogenesis.

LINC00607 may also regulate adipogenesis through miR-607. LINC00607 acts as a sponge of miR-607 in osteosarcoma79. In triple-negative breast cancer cells, miR-607 was shown to downregulate EGFR80, a gene required for white adipogenic differentiation81.

TYMSOS may regulate adipogenesis via three mechanisms.

(1) TYMSOS may affect adipogenesis via PI3K-AKT signaling, which is important for the differentiation and browning of white preadipocytes82. TYMSOS was shown to activate the PI3K/AKT signaling pathway in thyroid cancer cells83.

(2) TYMSOS sponges miR-4739 in gastric cancer cells84, and miR-4739 was shown to promote adipogenic differentiation of hBMSC (human bone marrow stromal cells)85.

(3) TYMSOS was also shown to act as a sponge for miR-214-3p in NSCLC (non-small cell lung cancer) cells86, and miR-214-3p was shown to promote adipogenesis of 3T3-L1 cells87.

In summary, our analysis suggests that LINC00312, LINC00607 and TYMSOS may regulate adipogenesis by acting as sponges to limit the availability of specific miRNAs.

Alternative polyadenylation could regulate gene expression during adipogenesis

APA is a co-transcriptional process that can generate multiple isoforms through differential usage of alternative pA sites in the 3’-UTR as well as the coding region14. We analyzed our 3’-end RNA-seq data of preadipocytes (Day 0) and adipocytes (Day 14) to identify differential usage of pA sites during adipogenesis. Our analysis of 4211 genes for significant 3’-UTR APA identified 70 shortened genes and 78 lengthened genes (Fig. 2A). Genes undergoing 3’-UTR APA did not show a correlation of shortening or lengthening with gene expression (Fig. 2B). GOBP and KEGG enrichment analysis (Supplementary Fig. S4) showed that genes undergoing 3’-UTR shortening are involved in multiple pathways and processes, among which are adipogenesis-related terms such as “fatty acid metabolism” and “PPAR signaling pathway”. The lengthened genes were enriched for more general terms, such as “protein modification” and “collagen fibril organization”.

Figure 2

Alternative polyadenylation in the 3’-UTR region is regulated during adipogenesis. (A) Volcano plot depicts shortened (red) and lengthened (blue) genes (p-adjusted value < 0.05, RED values ≥ 0.26 (lengthening) or ≤ −0.26 (shortening)). The clear circles are the 4063 genes that fell below these thresholds. The p-adjusted value and RED value thresholds are shown by dashed lines. (B) Scatter plot shows both RED and log2-fold expression change of shortened and lengthened genes. Genes with p-adjusted value < 0.05 and log2-fold change > 1 (upregulated) are shown by triangles pointed upward and those with p-adjusted value < 0.05 and log2-fold change < −1 (downregulated) are shown by triangles pointed downward. Genes unchanged in expression are shown by black circles. The color of the triangle outline represents shortened (red) and lengthened (blue) genes. The correlation between log2-fold changes and RED values is represented by the green dashed line of best fit, with the Spearman’s correlation value and associated p-value shown in the top right of the graph. (C) UCSC genome browser plots depict shortened and lengthened genes. The genes are oriented as 5’ (left) to 3’ (right). The tracks show the gene, annotated polyA sites (blue indicates RNAs transcribed from the minus strand, whereas red indicates RNAs transcribed from the plus strand), and peaks (or reads) of merged triplicates of undifferentiated (black) and differentiated (purple) cells. Ranges of normalized coverage signals (reads per million) are given on the right. The gene direction is represented by the green arrow and the exact proximal (P) and distal (D) pA site positions are shown.

Several key genes in adipogenesis undergo 3’-UTR APA during differentiation (Fig. 2C). Shortened genes SCD and ACSL5, which are involved in fatty acid metabolism88,89, are upregulated, whereas CDK16, a gene involved in cell cycle with known roles in differentiation90, is unchanged in expression. Lengthened genes WWTR1 and E2F1 play important roles in adipogenesis91,92 and are upregulated and downregulated, respectively, in gene expression. HADHB, another lengthened gene, is involved in fatty acid oxidation93 and is upregulated during adipogenesis. This suggests that adipogenesis is directly affected by APA-mediated regulation of genes.

To assess the gain or loss of miRNA sites due to changes in 3’-UTR length of genes in Fig. 2C, we used miRDB35 to investigate the presence of miRNA-binding sites in the 3’-UTR region between the differentially used pA sites. We then checked if any of the predicted miRNAs were known to play a role in adipogenesis based on literature review77,94,95, and if they are differentially expressed during adipogenesis96 (Table 2). Nine miRNAs known to be important for adipogenesis and multiple miRNAs that are differentially expressed had binding sites in the genes undergoing APA. However, among the nine miRNAs important for adipogenesis, none were significantly enriched for APA genes relative to all targets detected in the system (Supplementary Table S4).

Table 2 List of miRNAs that have binding sites in a 3’-UTR region gained or lost during APA lengthening or shortening, respectively.
Table 3 Predicted binding of RNA-binding proteins around pA sites of genes undergoing APA. The table is divided into three sections. The upper section shows RBPs that could bind around the proximal (Prox) or distal (Dist) pA sites of genes undergoing 3’-UTR APA and are known to function in adipogenesis (Known RBPs), and RBPs that bind around the pA site but do not yet have defined roles in adipogenesis (Other RBPs). Shaded boxes represent RBP binding around proximal (lighter shade) and distal (darker shade) pA sites of particular genes. The middle section shows RBPs that are predicted to bind at pA sites of only one of the examined genes undergoing 3’-UTR APA. The bottom section shows Known and Other RBPs that bind around intronic pA sites showing altered use during adipogenesis, and RBPs that are predicted to bind at intronic pA site of only one of the examined genes undergoing intronic APA (Unique RBPs). Shaded boxes in the bottom section represent RBP binding around the intronic site.

Based on Table 2 and literature review, we propose below how APA can modulate miRNA-mediated regulation of adipogenesis:

1. The shortening of SCD leads to loss of binding sites for miR-181a-5p, miR-16-1-3p, and miR-125b-2-3p. Both miR-181a-5p and miR-16-1-3p promote 3T3-L1 differentiation97,98, whereas miR-125b-2 inhibits lipogenesis and negatively regulates expression of SCD in mice fed with a high-fat diet99. Therefore, the shortening of SCD, and the consequent avoidance of miRNA-mediated regulation, may contribute to increased expression of SCD and its correct timing during adipogenesis.

2. Shortening of CDK16 leads to loss of binding sites for miR-23a-5p, which is a negative regulator of adipogenesis100, and for miR-23b-5p, which inhibits the thermogenic program of brown adipocytes101.

3. Shortened gene ACSL5 lost binding sites for miRNAs known to be differentially expressed during adipogenesis.

4. HADHB lengthening led to gain of binding sites for miR-30a-3p, which triggers anti-inflammatory responses in adipose tissue and increases insulin sensitivity in diet-induced obesity mice77. Overexpression of miR-30a can also activate the beige fat transcriptional program102. Extended 3’-UTRs can act as sponges to trap miRNAs and prevent their action at other mRNAs12. Therefore, lengthening of HADHB may not subject it to miR-30a-mediated regulation, since HADHB is upregulated during adipogenesis, but it could competitively bind to miR-30a-3p to prevent browning of differentiated white adipocytes.

5. Lengthening of WWTR1 leads to gain of binding sites for miR-9-3p, miR-17-3p and miR-10b-3p. Among these miRNAs, miR-17-3p103 may promote adipogenesis, while the others negatively regulate adipogenesis75,104. WWTR1 negatively regulates adipogenesis in mice by downregulating PPARγ activity91,105 and its expression is upregulated in our dataset. Therefore, the presence of binding sites of these miRNAs in the lengthened 3’-UTR suggests that it may be subjected to miRNA-mediated regulation that affects its protein levels. Alternatively, the extended WWTR1 sequence could competitively bind to miRNAs that inhibit adipogenesis.

6. Lengthened gene E2F1 did not gain any miRNA-binding sites of adipogenesis-related or differentially expressed miRNAs.

We also assessed which RBPs may regulate APA of genes shown in Fig. 2C. To do so, we checked for RNA-binding motifs by RBPmap38 and evidence of binding from CLIP-Seq data in POSTAR339 in the regions 200 bp upstream and downstream of proximal and distal pA sites. Table 3 lists RBPs satisfying both criteria that are known to function in adipogenesis19,106,107,108, as well as other RBPs that are not reported to have roles in adipogenesis.

Among the RBPs known to function in adipogenesis, IGF2BP2 was predicted to bind around the proximal pA site of CDK16, the distal pA site of HADHB, and around both proximal and distal pA sites of SCD and WWTR1. IGF2BP1 motifs were found around the proximal sites of SCD and HADHB, and SRSF1 bound around both the distal and proximal pA sites of E2F1 and only the distal site of CDK16. These RBPs have not yet been identified as polyadenylation regulators. While the other RBPs have not been shown to play a role in adipogenesis, they have been shown to regulate APA in other systems109 and may therefore regulate adipogenesis through APA. For example, both U2AF2 and SRSF7 regulate 3’-UTR length109, and in our adipogenesis dataset, U2AF2 bound around proximal and distal pA sites of multiple genes while SRSF7 bound only around the proximal pA site of the shortened CDK16 gene.

In addition to 3’-UTR APA, multiple genes, out of a total of 8449 genes, were identified as undergoing intronic APA. Shortening (increased intronic pA site usage) was found for 441 genes and lengthening (decreased intronic pA site usage) for 816 genes (Fig. 3A). Genes undergoing intronic APA showed a strong correlation with expression (read counts), whereby genes with suppressed usage of intronic pA sites (increased RED values) were upregulated, while those with increased usage (decreased RED values) were downregulated (Fig. 3B). GOBP and KEGG enrichment analysis shows that most genes with increased intronic APA are related to cell cycle and DNA replication, while those with suppressed intronic APA are enriched in categories involved in lipid and fatty acid metabolism and insulin signaling, consistent with the increased need of adipocytes for these gene products (Supplementary Fig. S4).

Figure 3

Alternative polyadenylation in introns is regulated during adipogenesis. (A) Volcano plot depicts shortened (red) and lengthened (blue) genes (p-adjusted value < 0.05, RED values ≥ 0.26 (lengthening) or ≤ −0.26 (shortening)). The clear circles are genes that undergo no APA. The p-adjusted value and RED value thresholds are shown by dashed lines. (B) Scatter plot shows both RED and log2-fold expression change of shortened and lengthened genes. Genes with p-adjusted value < 0.05 and log2-fold change > 1 (upregulated) are shown by triangles pointed upward and those with p-adjusted value < 0.05 and log2-fold change < −1 (downregulated) are shown by triangles pointed downward. Genes unchanged in expression are shown by black circles. The color of the triangle outline represents shortened (red) and lengthened (blue) genes. The correlation between log2-fold changes and RED values is represented by the green dotted line of best fit, with the Spearman’s correlation value and associated p-value shown in the top right of the graph. (C) UCSC genome browser plots depict shortened and lengthened genes. The genes are oriented as 5’ (left) to 3’ (right). The tracks show the gene, annotated pA sites (blue indicates RNAs transcribed from the minus strand, whereas red indicates RNAs transcribed from the plus strand), and peaks (or reads) of merged triplicates of undifferentiated (black) and differentiated (purple) cells. Ranges of normalized coverage signals (reads per million) are given on the right. The gene direction is represented by the green arrow and the exact intronic pA site (INT) position is shown.

Examples of genes with changes in intronic APA are shown in Fig. 3C. Usage of intronic pA sites increases in the NCAPG2, ORC1 and CDCA2 transcripts during adipogenesis. NCAPG2 and CDCA2 are involved in cell cycle regulation110,111 and ORC1 in DNA replication112, and all are downregulated in expression during adipogenesis. Intronic pA site usage is suppressed in NDUFB3, MLYCD, and PIK3R1. NDUFB3 has been implicated in brown adipogenesis113, while MLYCD is needed for fatty acid biosynthesis114 and PIK3R1 in the PI3K/AKT pathway is important for insulin signaling in adipocytes115. Both MLYCD and PIK3R1 increase in expression during adipogenesis. NDUFB3 expression is unchanged but the proportion of transcripts ending in the intron decreases.

We analyzed the presence of RBP motifs and binding as described earlier by considering the sequence 200 bp upstream and 200 bp downstream of the target intronic pA site of genes in Fig. 3C (Table 3). For most intronic pA sites, we did not detect binding of an RBP. However, among the RBPs known to function in adipogenesis, SRSF1 bound around intronic pA sites of NCAPG2 and CDCA2. The binding sites of RBPs HNRNPA1 and HNRNPC were found around the intronic pA sites of CDCA2, NCAPG2, NDUFB3, and PIK3R1 and both RBPs are known to function in APA116,117.
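As a rough illustration of the window-based search described above, the following sketch builds the region spanning 200 bp on either side of a pA site and reports which binding sites overlap it. All coordinates and RBP labels are hypothetical placeholders. The sketch assumes half-open intervals on a single chromosome and ignores strand, which a real analysis would need to handle.

```python
def pa_window(pa_site, flank=200):
    """Half-open interval covering `flank` bp on each side of the pA site."""
    return (max(0, pa_site - flank), pa_site + flank)

def overlaps(a, b):
    """True if two half-open intervals (start, end) share at least one base."""
    return a[0] < b[1] and b[0] < a[1]

def rbps_near_pa_site(pa_site, binding_sites, flank=200):
    """Names of RBPs with at least one binding site overlapping the window."""
    window = pa_window(pa_site, flank)
    return sorted({rbp for rbp, start, end in binding_sites
                   if overlaps(window, (start, end))})

# Hypothetical binding sites: (RBP label, start, end).
sites = [
    ("RBP_A", 9850, 9870),    # inside the upstream flank
    ("RBP_B", 10300, 10320),  # beyond the downstream flank
    ("RBP_C", 10190, 10210),  # straddles the downstream window edge
]
nearby = rbps_near_pa_site(10000, sites)
```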

Alternative splicing regulates key genes in adipogenesis

By changing which exons are included in the final mRNA, alternative splicing can regulate gene expression by generating more than one mRNA isoform per gene, hence contributing to transcriptomic and proteomic diversity. Alternative splicing occurs during adipogenesis, and obese and lean individuals have different alternative splicing profiles106. To analyze global changes in alternative splicing during adipogenesis, we used the program rMATS24 on a recently published RNA-seq dataset of the differentiation of human primary preadipocytes23. We identified 100 alternative splicing events, with “skipped exon” (SE) events being the most common type (Fig. 4A,B). One gene can undergo multiple events of the same event type as well as multiple event types.

Figure 4

Analysis of the global splicing profile reveals important genes being regulated by splicing during adipogenesis. (A) Volcano plots of distribution of significant splicing events (FDR less than 0.05 and dPsi value greater than 0.05 or less than − 0.05) in each type of splicing event. Red denotes splicing events where the dPsi value is greater than 0.05 and the target exon/region is included, whereas blue denotes splicing events where the dPsi value is less than − 0.05 and the target exon/region is skipped. For the graphical representation of splicing events, blank rectangles represent constitutive exons and shaded exons or introns represent alternatively spliced regions. (B) Percentages of splicing events distributed across the 5 types of splicing events. (C) Sashimi plots depict splicing of genes during adipogenesis. Each plot is labelled with the splicing event type and gene symbol above. Genomic coordinates of the exons in question are shown on top of the plot. Tracks show junction counts and the psi value, with the top track (purple) being the undifferentiated sample and the bottom track (red) being the differentiated sample. At the bottom, exons are shown as black boxes and introns as lines with arrows showing directions of genes.
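The significance filter and the per-type percentages described in this legend can be sketched as follows. The event records are hypothetical, the dictionary keys (`type`, `fdr`, `dpsi`) are our own naming rather than rMATS output column names, and the sign convention simply follows the legend (positive dPsi means inclusion).

```python
from collections import Counter

def classify_event(fdr, dpsi, fdr_cutoff=0.05, dpsi_cutoff=0.05):
    """Return "included", "skipped", or None for a non-significant event."""
    if fdr >= fdr_cutoff:
        return None
    if dpsi > dpsi_cutoff:
        return "included"
    if dpsi < -dpsi_cutoff:
        return "skipped"
    return None

def event_type_percentages(events):
    """Percentage of significant events falling into each event type."""
    counts = Counter(e["type"] for e in events
                     if classify_event(e["fdr"], e["dpsi"]) is not None)
    total = sum(counts.values())
    return {t: 100 * n / total for t, n in counts.items()} if total else {}

# Hypothetical events; SE, RI and A5SS are three of the five event types.
events = [
    {"type": "SE", "fdr": 0.001, "dpsi": 0.20},
    {"type": "SE", "fdr": 0.010, "dpsi": -0.15},
    {"type": "RI", "fdr": 0.020, "dpsi": 0.08},
    {"type": "A5SS", "fdr": 0.030, "dpsi": -0.30},
    {"type": "SE", "fdr": 0.400, "dpsi": 0.50},  # fails the FDR cutoff
]
percentages = event_type_percentages(events)
```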

To understand the roles the alternatively spliced genes might play in adipogenesis, we carried out GOBP and KEGG enrichment analyses (Supplementary Fig. S5). Although the enriched terms are not directly associated with adipogenesis, the spliced genes are involved in various processes important for adipogenesis such as “positive regulation of GTPase activity” and “ECM-receptor interaction”118,119. We analyzed some of the spliced genes in greater detail (Fig. 4C):

  1. Skipping of exon 6 of COL6A3: COL6A3 encodes the alpha-3 chain of type VI collagen, which is highly enriched in adipose tissue. The expression of COL6A3 in abdominal subcutaneous adipose tissue positively correlated with BMI and total body fat mass120. Exon 6 is skipped less frequently in preadipocytes compared to adipocytes. Studies have shown that the inclusion of exon 6 of COL6A3 is correlated with upregulation in cancer121,122. This suggests that the skipping of exon 6 may be responsible for the downregulation that we observe in adipocytes.

  2. Alternative 5’ splice site usage in MAP4K4: In mice, MAP4K4 negatively regulates adipogenesis and insulin sensitivity123. Also, during brown adipogenesis in mice, alternatively spliced MAP4K4 isoforms had different effects on the phosphorylation of JNK, which in turn was correlated with the differentiation and metabolic signature of brown adipocytes124. MAP4K4 has not been studied in human adipogenesis, but its silencing in myotubes was shown to prevent TNF-α-mediated insulin resistance and increase glucose uptake125. In our analysis, the expression of MAP4K4 decreases during adipogenesis, and alternative 5’ splice site usage in the target exon of MAP4K4 may regulate adipogenesis through changes in MAP4K4 expression and/or through signaling via JNK phosphorylation.

  3. Alternative 3’ splice site usage in FN1: FN1 is an extracellular matrix (ECM) protein with roles in cell adhesion, migration, and fibrosis. FN1 is also known to undergo alternative splicing that results in proteins with different tissue localization126. It is dysregulated in obese adipose tissue127, and we also observed that its expression is downregulated during adipogenesis.

  4. Mutually exclusive exons of RBPJ: RBPJ has been labelled as a “mutually exclusive exons” event by rMATS. It is involved in both transcriptional activation and repression of genes and is a downstream effector of the Notch signaling pathway128. Though RBPJ is not directly associated with adipogenesis, Notch signaling is a known regulator of the process129,130. Therefore, alternatively spliced transcripts of RBPJ may regulate adipogenesis through Notch signaling.

  5. Intron retention of SRSF1: SRSF1 is a splicing factor that negatively regulates adipogenesis19, but there have been no studies on the expression of its variants in the adipocyte system. Intron retention in the 3’-UTR region of SRSF1 is higher in differentiated cells in our analysis. With intron retention, the transcript was shown to be more stable and produce more protein131,132. Since SRSF1 is a negative regulator of adipogenesis, splicing out of the intron may be required for undifferentiated cells to undergo differentiation, but after differentiation, SRSF1 may have other functions in mature adipocytes.

To identify splicing factors that could contribute to global alternative splicing changes, we used rMAPS2 (RNA map analysis and plotting server)40,41. The program rMAPS2 carries out motif enrichment analysis (based on known consensus RBP motifs from the literature) on various regions of alternatively spliced exons, specifically intronic sequences 250 bp upstream or downstream of target and flanking exons, as well as the first 50 bp of the 5’- or 3’-end of the exons40. Since SE events had the highest number of regulated exons, we limited our analysis to enrichment of RBP motifs in SE events. Among the RBPs identified by rMAPS2, we observed nine RBPs that are already known to influence adipogenesis19,106,107,108,133. Predictions of where these known RBPs could bind (along with the motifs) to regulate global changes in splicing during adipogenesis are shown in Fig. 5 and Supplementary Fig. S6 and summarized in Fig. 5D. For example, SRSF1 could mainly promote exon skipping by binding in the intronic sequence upstream of the target exon. Sam68 (KHDRBS1), on the other hand, could lead to exon inclusion if it binds just downstream of the 3’-end of the upstream flanking exon, but can also cause a skipped exon event if it binds just downstream of the 3’-end of the target exon. IGF2BP2 could stimulate inclusion of exons by binding in the intronic sequence downstream of the target exon or promote skipping if it binds upstream of the target exon (Fig. 5B).

Figure 5

Specific splicing factors may regulate global “skipped exon” events. (A) Specific motif sequences used to search for binding sites in the vicinity of the skipped exon for the splicing regulators shown in panels (B) and (C) and key to the binding profiles of RBPs traced in the graphs. The solid lines are motif scores (Y axis, left side), calculated as the overall percentage of nucleotides covered by the motif in a 50 bp window, with red denoting the motif density score for exons with increased inclusion (38 events), blue for exons with increased skipping (22 events), and black for background (non-regulated) exons (70 events). The dotted lines are − log10 (P-value) (Y-axis, right side) based on comparison of motif scores between regulated exons against background exons. Only -log10 (P-value) greater than 1.3 are considered. The green exon is the target exon (that is skipped or included) and the flanking exons are shown in grey. (B) Maps of the RBPs SRSF1, Sam68 (KHDRBS1), and IGF2BP2, which are known to play a role in adipogenesis. (C) Maps of novel RBPs ESRP1, LIN28A, CPEB4, PABPC1, KHDRBS2, and HNRNPA1 that could alter the global alternative splicing profile during adipogenesis. (D) Summary of binding profiles of the RBPs, with proteins promoting inclusion indicated as red circles and those promoting skipping as blue circles.
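The motif score defined in this legend (percentage of nucleotides covered by the motif in a 50 bp window) can be approximated with the sketch below. This is our own minimal reading of that definition, not the rMAPS2 implementation. The motif `GGA` and the sequences are arbitrary placeholders that are not tied to any RBP. The last line shows why 1.3 is a natural cutoff on the −log10 scale: it is approximately −log10(0.05).

```python
import math
import re

def motif_coverage_percent(window_seq, motif_regex):
    """Percentage of positions in window_seq covered by at least one match."""
    if not window_seq:
        return 0.0
    covered = [False] * len(window_seq)
    # A lookahead with a capture group also finds overlapping matches.
    for match in re.finditer(f"(?=({motif_regex}))", window_seq):
        for i in range(match.start(1), match.end(1)):
            covered[i] = True
    return 100 * sum(covered) / len(window_seq)

def sliding_scores(seq, motif_regex, window=50, step=1):
    """Coverage score for each full-length window along the sequence."""
    return [motif_coverage_percent(seq[i:i + window], motif_regex)
            for i in range(0, len(seq) - window + 1, step)]

# Placeholder 50 nt window containing two copies of an arbitrary motif.
example_window = "GGAGGA" + "T" * 44
score = motif_coverage_percent(example_window, "GGA")

# -log10 of a P-value of 0.05, which is close to the 1.3 cutoff.
cutoff_equivalent = -math.log10(0.05)
```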

We also identified six other RBPs that are not known to have roles in adipogenesis but could potentially be major regulators of alternative splicing (Fig. 5C,D). Both ESRP1 and PABPC1 could cause skipping of exons. LIN28A and KHDRBS2 could mainly promote inclusion of exons by binding to intronic regions downstream and upstream of the target exon, respectively. CPEB4 could lead to skipping or inclusion of the target exon by binding upstream or downstream of the exon, respectively. HNRNPA1 is predicted to mainly bind to regions downstream of the target exon and promote inclusion and would lead to skipping only if it binds to a region within 125 bp upstream of the downstream flanking exon.

Discussion

Obesity is an increasing global health problem. It is therefore important to understand the basic mechanisms which regulate human adipogenesis because its dysregulation can lead to obesity and obesity-related disorders3. In this study, we have used multiple sequencing datasets to identify potentially important regulators of human adipogenesis at the level of lncRNAs, alternative splicing and alternative polyadenylation.

Long non-coding RNAs

Using the sequencing dataset that we generated from human primary preadipocytes, we first explored lncRNA-mediated regulation. We reasoned that the most important lncRNAs were likely to be those with a large change in expression during white adipogenesis, and those whose changes were either similar to or different from their changes during brown adipogenesis. The changes in lncRNA levels may drive adipogenesis or facilitate function of the differentiated adipocytes. It will be important to characterize their role in adipogenic systems as they may be crucial in maintaining the differentiated state or in determining susceptibility to develop obesity or related disorders and may therefore be important therapeutic targets. Additionally, adiposity has been associated with certain types of cancer4, and because some of the lncRNAs highlighted in our study have been associated with cancer, they may affect the susceptibility to developing cancer. LncRNAs that undergo a similar change in expression in both white and brown adipogenesis may function in the commitment of mesenchymal stem cells into the adipocyte lineage or the maintenance of the differentiated cell state. The lncRNAs that change in opposite directions during white and brown adipogenesis are especially interesting, since browning of white adipose tissue has received increased attention as a therapeutic approach to obesity and its related disorders134,135. Therefore, the lncRNAs LINC00312, LINC00607 and TYMSOS, which increase during brown adipogenesis but decrease during white adipogenesis, could be important therapeutic candidates.
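The grouping of lncRNAs by direction of change can be expressed as a simple comparison of signs. In the sketch below, only the directions for LINC00312, LINC00607 and TYMSOS follow the text above. The magnitudes are arbitrary placeholders, and `LNC_EXAMPLE` is a hypothetical entry included to show the concordant case.

```python
def compare_directions(white_log2fc, brown_log2fc):
    """Compare the sign of expression change in white versus brown adipogenesis."""
    if white_log2fc == 0 or brown_log2fc == 0:
        return "unchanged in at least one lineage"
    same_sign = (white_log2fc > 0) == (brown_log2fc > 0)
    return "concordant" if same_sign else "opposite"

# (white log2FC, brown log2FC); magnitudes are placeholders, signs as in the text.
lncrna_changes = {
    "LINC00312": (-1.0, 1.0),
    "LINC00607": (-1.0, 1.0),
    "TYMSOS": (-1.0, 1.0),
    "LNC_EXAMPLE": (1.0, 1.0),  # hypothetical lncRNA rising in both lineages
}
groups = {name: compare_directions(white, brown)
          for name, (white, brown) in lncrna_changes.items()}
```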

Our analysis of lncRNAs is limited by lack of time-course data. For example, we did not see a change in expression of HOTAIR between preadipocytes at Day 0 versus mature adipocytes at Day 14, but this finding is consistent with the previous report that HOTAIR increases during early adipogenesis and decreases at later stages23. There are also depot-specific46 as well as sex-specific136 differences in adipose tissue gene expression. For example, HOTAIR was shown to vary in expression in gluteal versus abdominal subcutaneous tissue137. Since our samples are not depot-specific or sex-specific and only examined the terminal differentiated state, our experimental design does not allow discrimination of these reported differences.

Alternative polyadenylation

APA during adipogenesis has not been extensively investigated. The two studies on global APA changes were published ten years ago and mainly focused on identifying a trend to longer 3’-UTRs during adipogenesis16,18, and the one examining human cells used total RNA-seq data and did not characterize intronic APA. Our study is the first to generate 3’-end sequencing data of human preadipocytes and adipocytes, which provides much more accurate and quantitative characterization of pA site usage. We observed only a slight trend towards 3’-UTR lengthening during human adipogenesis, consistent with the modest tendency towards longer 3’-UTRs that was previously reported18, but unlike the more notable trend seen during mouse 3T3-L1 adipogenesis16. This divergence may be due to differences in models used to study the process, as well as differences in experimental design and analysis. Interestingly, increased or decreased use of intronic pA sites was much more prevalent in our data set than changes in the 3’-UTR at the gene’s end. Suppression of an intronic pA site was most common, and a prime example is PIK3R1, a gene that regulates adipogenesis and insulin sensitivity115. This APA event is expected to yield more transcripts extending to the end of the gene and being translated into full-length protein. The strong enrichment in this group of genes involved in lipid and fatty acid metabolism and insulin signaling would be consistent with adipocytes needing more of these types of proteins.

Changes in 3’-UTR length could subject genes to different degrees of miRNA-mediated regulation. We found that several miRNAs with known functions in adipogenesis had potential binding sites in 3’-UTR sequences of the shortened or lengthened transcripts that we examined, suggesting miRNA-mediated regulation of these genes. We also identified several other miRNAs that are differentially expressed during adipogenesis and may play important, yet undefined roles in this transition. While none of the miRNAs that we tested were enriched in the 3’-UTRs of genes undergoing APA compared to all 3’-UTRs expressed in adipogenesis, it is still possible that APA regulates the miRNA action on the predicted targets.

To identify RBPs that may function in both 3’-UTR APA and intronic APA, we used motif enrichment analysis and global protein/RNA crosslinking data to identify binding sites in the regions 200 bp upstream and downstream of pA sites. We found that RBPs with known roles in adipogenesis bind around alternate pA sites of seven of the twelve genes that we examined, and also identified several other RBPs that could be regulating ten of these twelve genes. Interestingly, IGF2BP2, SRSF1, and HNRNPA1 are also possible regulators of skipped exon events during adipogenesis, and thus might be regulating both alternative splicing and APA.

For ORC1 and MLYCD, which undergo intronic APA, no RBPs were detected to bind around the intronic pA site. Besides RBPs, other factors such as transcriptional dynamics and expression levels of subunits of the cleavage and polyadenylation complex can regulate APA14, and may function in regulating intronic APA of these genes during adipogenesis. Regulation may also occur by suppressing recognition of the splice sites of the flanking exons15. It is important to note that the CLIP-seq evidence for our RBP-binding predictions was based on experiments with non-adipogenic cell lines, such as HEK293, which may not express all of the APA regulators present during adipogenesis, such as ones that would act on the ORC1 and MLYCD intronic pA sites. Further experiments are needed to demonstrate a direct function of the various RBPs in regulating APA during adipogenesis.

Alternative splicing

Spliced isoforms have been shown to play roles in adipogenesis19,20,21, and our analysis of a publicly available RNA-seq dataset revealed 100 alternative splicing events occurring during human adipogenesis. Like our study, the global analysis of alternative splicing in human adipogenesis published by Yi et al.22 also identified exon skipping as the most prevalent type of alternative splicing. However, they analyzed cells derived from bone marrow whereas the cells in our study originated from subcutaneous adipose tissue. The use of primary human preadipocytes, the stringent thresholds that we have used, and the application of a widely used and reliable alternative splicing analysis tool, rMATS24, make our analysis a valuable resource for future studies to dissect the importance of alternative splicing to adipogenesis.

Our study uncovered possible mechanisms of adipogenesis regulation by specific splicing events. Of note is splicing of MAP4K4 transcripts, which encode a potential therapeutic target that is alternatively spliced during brown adipogenesis in mice124. Alternative splicing of RBPJ, a gene involved in the Notch signaling pathway128, is interesting as this pathway is associated with adipocyte dedifferentiation, which has therapeutic potential in the field of obesity138. SRSF1, which shows increased intron retention in mature adipocytes, is an important regulator of alternative splicing in adipogenesis19, and in our analysis, it is a potential mediator of both exon skipping and APA. Additionally, by motif enrichment analysis, we showed how RBPs previously characterized in adipogenesis could regulate alternative splicing of multiple genes during differentiation, thus extending their roles in adipogenesis beyond the few targets examined in previous studies. We have also identified novel RBPs that could regulate the global alternative splicing profile during adipogenesis but are as yet uncharacterized.

To the best of our knowledge, our study presents the first comprehensive analysis of different modes of gene regulation during human adipogenesis. We generated 3’-sequencing datasets from matched preadipocyte and adipocyte RNA samples, in addition to using publicly available datasets for analysis with specialized tools. With these resources, we identified multiple potential mechanisms of regulation that can be addressed in future research. In addition to illustrating how lncRNAs, alternative polyadenylation and alternative splicing may regulate adipogenesis, we also identified novel lncRNA, miRNA, and RBP regulators that may play a crucial role in fine-tuning adipocyte differentiation. Further research into these potential regulators of mRNA and protein output during adipogenesis will aid in designing therapeutics to combat obesity and its related disorders.