Introduction

Housekeeping genes (HKGs) are a large class of constitutively expressed genes subjected to low levels of regulation under various conditions. They generally perform biological actions fundamental to basic cellular functions such as the cell cycle, translation, metabolism of RNA, and cell transport [1, 2]. Thus, the stable expression of HKGs is assumed in all cells of an organism independent of the tissue, developmental stage, cell cycle state, or presence/absence of external signals [3, 4].

The use of internal controls when performing quantitative gene expression analysis (such as microarrays, RNA-sequencing [RNA-seq], and quantitative reverse transcriptase-polymerase chain reaction [qRT-PCR]) represents the most common strategy to normalize gene expression to correct for intrinsic errors related to sample manipulation and the technical protocol. The gene expression profiles obtained depend significantly on the reference genes employed as internal controls; therefore, inappropriate controls can lead to inaccurate results.

Given their fundamental roles, HKGs tend to display medium-high expression levels; this characteristic makes these genes especially suitable as internal controls/reference genes to normalize gene expression data in quantitative gene expression analysis [2, 5, 6]. Ideally, internal controls should exhibit stable gene expression across most sample types and experimental conditions to minimize undesired experimental variation; however, the literature suggests that the expression of commonly used HKGs varies depending on the experimental conditions and chosen setup and the analyzed tissue [5,6,7,8,9,10,11,12,13]. Importantly, these limitations do not invalidate the use of HKGs as a normalization strategy; instead, they support the need for a deeper understanding of how HKGs behave under different conditions or in distinct tissues. The stability of HKG expression must be validated under the particular conditions of interest of each study as a mandatory step [5], considering all experimental, biological, or clinical variables [7, 14,15,16]. Importantly, this should include sex as an essential variable.

The role of sex in biomedical studies has often been overlooked, despite evidence of sexually dimorphic effects in biological studies. Karp et al. recently demonstrated that sex phenotypically influences a substantial proportion of mammalian traits, both in wild-type and mutant animals [17]. Meanwhile, Oliva et al. reported the impact of sex on gene expression in various human tissues through metadata analysis by the GTEx platform, generating a catalog of sex-based differences in gene expression and the regulatory pathways involved [18]. The authors revealed ubiquitous effects of sex on gene expression and highlighted significant sex-based differences in human visceral and subcutaneous adipose tissue. Historically, sex has not been treated as an important intrinsic variable. In a recent review of more than 600 animal research studies, 22% of publications did not specify animal sex [19]. Of the reports that specified animal sex, 80% included only males and 17% only females, leaving only 3% that considered animals of both sexes [20]. An analysis of the number of animals studied revealed an even greater disparity: 16,152 males vs. only 3,173 females. Only seven studies (1%) reported sex-based results. Moreover, the number of male-only studies and the use of male animals have become more disparate over time [20, 21]. Unfortunately, human studies are no more encouraging; while international institutions now consider sex a critical variable [22, 23], the male perspective predominates in past studies. Failing to consider sex as a variable can accentuate or attenuate apparent differences in gene expression analysis, with subsequent implications for biological or biomedical interpretation.

The quantitative analysis of gene expression data has allowed assessments of gene expression levels within different tissues and under various conditions, which has identified stable expression profiles/patterns [1, 9, 12, 24,25,26,27,28]. Public repositories of gene expression data have appeared in the last decades. The Gene Expression Omnibus (GEO) [29], a well-known international public repository, stores and allows access to gene expression data generated by different high-throughput technologies such as microarrays or next-generation sequencing. Exploiting and reusing the vast amount of data in these repositories has become a powerful tool for those searching for gene expression patterns across many diverse types of tissues and conditions.

A survey of 40 studies of human adipose tissue (AT) published since 2001 noted that 70% of papers employed the ACTB, GAPDH, and 18S HKGs as reference genes [14]. Related studies have supported the use of additional HKGs (i.e., PPIA, HPRT, RPS18, or RPL19) in human AT-based studies [16, 30, 31]. Importantly, these studies did not include sex as a biological variable, so these HKGs may not be as suitable as anticipated. In short, the lack of a sex perspective represents an important limitation of gene expression studies. In response, this study determines the expression variability of six HKGs commonly used in human and mouse AT, and of the genes included in various whole-transcriptome microarrays available at GEO, while considering sex as a covariate. Further, we identify novel candidate reference genes that do not display sex bias in human AT. We extend this analysis to experimental mouse model data sets deposited in GEO. Our findings reveal that studies generally do not report sex or employ mainly male animals; furthermore, certain conventional HKGs fail the requirement of being constitutively expressed in both sexes. We also establish new putative sex-unbiased HKGs (suHKGs) for gene expression analysis in male and female human AT, and putative orthologs for mouse AT. We present a general framework for reference gene selection that may be useful in gene expression studies and develop an open web tool to select adequate suHKGs according to customized experimental designs in AT.

Methods

The bioinformatics analysis strategy was carried out using R 3.5.0 [32] and Python 3.0 and is summarized in Fig. 1.

Fig. 1

Data-analysis workflow. This study consisted of seven main steps: (1) collection of public microarray information from the GEO (Gene Expression Omnibus) database with Python and R; (2) raw data pre-processing and probe annotation; (3) statistical data analysis with three different statistics to estimate gene expression variability in adipose tissue samples of Hsa and Mmu, considering biological sex as a variable; (4) meta-analysis by the Rank Product method; (5) functional annotation with Gene Ontology (GO) terms; (6) GTEx-based gene expression filtering to select potential reference genes suitable for comparing both sexes in gene expression analyses; (7) experimental validation by qPCR

Systematic review and data collection

A comprehensive systematic review was conducted to identify all available transcriptomics studies with adipose tissue samples at GEO. The review considered the following fields: sample source (adipose), type of study (expression profiling by array), and organism of interest (Homo sapiens [Hsa] or Mus musculus [Mmu]). The search was carried out during the first quarter of 2020, with the review period covering the years 2000-2019. From the returned records, the study GSE ID, the platform GPL ID, and the study type were extracted using the Python 3.0 library Beautiful Soup. The R package GEOmetadb [33] was then used to identify microarray platforms and samples from adipose tissue. The four most used platforms in Hsa (Table 1) and the five most used platforms in Mmu (Table 2) were selected. Given the complex nature of some of the studies, those with information regarding the sex of samples were identified manually, and the keywords used to annotate sex were homogenized. Finally, studies not meeting the following predefined inclusion criteria were filtered out: i) include at least 10 adipose tissue samples; ii) use one of the selected microarray platforms to analyze gene expression data; iii) present data in a standardized way; and iv) not include duplicate sample records (such as superseries).

Table 1 Processed data sets for selected studies of Hsa
Table 2 Processed data sets for selected Mmu studies

Data processing and statistical analyses

The normalized microarray expression data of the selected studies were downloaded from GEO using the GEOquery R package. All probe sets of each platform were converted to gene symbols, and the expression values of multiple probe sets targeting the same gene were summarized by their median.
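As an illustration of the probe-to-gene summarization step, the following minimal Python sketch takes the per-sample median across all probe sets that map to the same gene symbol. It is not the authors' code: the function name, probe identifiers, and expression values are hypothetical, and the sketch assumes that probe sets without a gene annotation are simply dropped.

```python
from collections import defaultdict
from statistics import median

def collapse_probes_to_genes(probe_expr, probe_to_gene):
    """Summarize probe-level expression as one median profile per gene symbol.

    probe_expr: dict probe_id -> list of expression values (one per sample)
    probe_to_gene: dict probe_id -> gene symbol (unannotated probes are absent)
    """
    grouped = defaultdict(list)
    for probe_id, values in probe_expr.items():
        gene = probe_to_gene.get(probe_id)
        if gene is not None:  # assumption: unannotated probes are discarded
            grouped[gene].append(values)
    # zip(*profiles) walks sample by sample across all probes of a gene
    return {gene: [median(column) for column in zip(*profiles)]
            for gene, profiles in grouped.items()}

# Hypothetical probe identifiers and values, for illustration only
probe_expr = {
    "probe_a": [8.0, 9.0, 10.0],
    "probe_b": [10.0, 9.0, 12.0],
    "probe_c": [5.0, 6.0, 7.0],
    "probe_d": [1.0, 1.0, 1.0],  # no annotation, so it is dropped
}
probe_to_gene = {"probe_a": "PPIA", "probe_b": "PPIA", "probe_c": "RPL19"}
gene_expr = collapse_probes_to_genes(probe_expr, probe_to_gene)
```

With two probe sets per gene the median equals the mean; the two summaries only diverge when three or more probe sets target the same gene.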

Three statistical stability indicators were calculated for each gene in each study to determine the relative expression variability: the coefficient of variation (CV), the IQR/median, and the MAD/median. The CV, computed as the standard deviation divided by the mean, is used to compare variation between genes with expression levels at different orders of magnitude; however, it is sensitive to extreme values. Therefore, the interquartile range (IQR) divided by the median and the median absolute deviation (MAD) divided by the median (two statistics based on the median) were also considered. These measures are more robust in skewed distributions [34]. The two statistics were multiplied by correction factors of 0.75 and 1.4826, respectively, to make them comparable to the CV under a normal distribution. Lastly, the gene variability score per platform was expressed as the median of each statistic across the studies analyzed with that platform. These median values were ranked by gene variability value for each platform, with lower ranks corresponding to higher stability levels.
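The three indicators can be sketched in a few lines of Python using only the standard library. This is a minimal illustration rather than the pipeline used in the study: the function names and example values are hypothetical, the sample standard deviation (n - 1 denominator) is assumed for the CV, and the quartile convention ("inclusive") is an assumption, since the text does not state which one was used.

```python
from statistics import mean, median, quantiles, stdev

IQR_FACTOR = 0.75    # correction factor for IQR/median quoted in the text
MAD_FACTOR = 1.4826  # correction factor for MAD/median quoted in the text

def variability_scores(values):
    """Return the three relative variability statistics for one gene in one study."""
    med = median(values)
    cv = stdev(values) / mean(values)  # assumption: sample standard deviation
    # Assumption: "inclusive" quartiles; other conventions give slightly different IQRs
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr_median = IQR_FACTOR * (q3 - q1) / med
    mad = median([abs(v - med) for v in values])
    mad_median = MAD_FACTOR * mad / med
    return {"cv": cv, "iqr_median": iqr_median, "mad_median": mad_median}

def platform_score(study_scores):
    """Median of one statistic across the studies run on one platform."""
    return median(study_scores)

# Hypothetical expression values for one gene across five samples
stable = variability_scores([10.0, 10.2, 9.8, 10.1, 9.9])
with_outlier = variability_scores([10.0, 10.2, 9.8, 10.1, 30.0])
```

In the second example a single extreme sample inflates the CV far more than the MAD/median, which is the motivation given above for including the two median-based statistics.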

The described analysis pipeline was performed on three different sample groups based on sex and species: female Hsa, male Hsa, and all Mmu samples. The analysis was not performed separately for male and female mice due to the lack of female Mmu samples.

Meta-analysis

The gene variability ranks for each platform were integrated using the Rank Product (RP) method [35, 36], a non-parametric statistic that identifies the elements that systematically occupy higher positions in ranked lists. This approach combines gene ranks rather than variability scores to achieve platform independence. The RankProd package [37, 38] was used to calculate the RP score for each gene (Eq. 1, where i is the gene, K the number of platforms, and $rank_{ij}$ the position of gene i in the ranking of platform j). Three final rankings were obtained (one for each sample group: Mmu, Hsa female, and Hsa male samples) by sorting the genes in increasing order of RP:

$$RP_{i} = \left( \prod_{j = 1}^{K} rank_{ij} \right)^{1/K} .$$
(1)
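Equation 1 is the geometric mean of a gene's ranks across platforms. The Python sketch below computes only this score; it does not reproduce the RankProd package, which offers additional functionality. The function name, platform labels, and ranks are hypothetical, and restricting the calculation to genes present on every platform is an assumption made for simplicity.

```python
from math import prod

def rank_product(ranks_by_platform):
    """Geometric mean of each gene's rank across K platforms (Eq. 1).

    ranks_by_platform: dict platform -> dict gene -> rank (1 = most stable)
    """
    platforms = list(ranks_by_platform.values())
    k = len(platforms)
    # Assumption: only genes ranked on every platform are scored
    common = set.intersection(*(set(ranks) for ranks in platforms))
    return {gene: prod(ranks[gene] for ranks in platforms) ** (1 / k)
            for gene in common}

# Hypothetical ranks on three hypothetical platforms
ranks_by_platform = {
    "platform_1": {"PPIA": 1, "RPL19": 2, "GAPDH": 3},
    "platform_2": {"PPIA": 2, "RPL19": 1, "GAPDH": 3},
    "platform_3": {"PPIA": 1, "RPL19": 2, "GAPDH": 3},
}
rp = rank_product(ranks_by_platform)
final_ranking = sorted(rp, key=rp.get)  # increasing RP = decreasing stability
```

Because only ranks enter the product, the score is unaffected by differences in the scale of the variability statistics between platforms.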

Selection of candidate HKGs

To identify appropriate sex-unbiased HKG (suHKG) candidates, male and female Hsa samples were randomly selected, and the Mmu group was discarded. Gene functional information was then incorporated to exclude genes involved in metabolic alterations. The AnnotationDbi and org.Hs.eg.db annotation packages were used to convert gene symbols to gene names. After removing pseudogenes and non-coding genes, the GO terms associated with the remaining genes were obtained using the GO.db annotation package. Related information from all three gene ontologies was included (Biological Process, Molecular Function, and Cellular Component). Genes related to physiopathological conditions were filtered out, and a unique ranking by sex was generated (the male and female MetaRankings) by averaging the three statistical rankings (Eq. 2):

$$MetaRanking\,position = \frac{positionCV + positionIQR/median + positionMAD/median}{3}.$$
(2)

The difference in the ranking positions occupied by males and females was also calculated to reveal sex-based stability differences at a gene level.
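Equation 2 and the between-sex comparison can be sketched as follows. The gene labels and positions are hypothetical, and the sign convention for the difference (female minus male) is an assumption, since the text does not specify one.

```python
def meta_ranking_position(position_cv, position_iqr, position_mad):
    """Average of a gene's positions in the three statistical rankings (Eq. 2)."""
    return (position_cv + position_iqr + position_mad) / 3

# Hypothetical positions (CV, IQR/median, MAD/median) for two hypothetical genes
female_positions = {"GENE_A": (10, 14, 12), "GENE_B": (300, 280, 320)}
male_positions = {"GENE_A": (11, 9, 13), "GENE_B": (40, 50, 60)}

female_meta = {g: meta_ranking_position(*p) for g, p in female_positions.items()}
male_meta = {g: meta_ranking_position(*p) for g, p in male_positions.items()}

# Assumed convention: positive values mean the gene is less stable in females
sex_difference = {g: female_meta[g] - male_meta[g] for g in female_meta}
```

In this toy example GENE_A occupies similar positions in both sexes, whereas GENE_B sits much lower in the female ranking, the kind of pattern that would flag a sex-biased reference gene.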

Stable suHKGs with high expression levels were selected in several steps: we (i) downloaded the "GTEx_Analysis_2017-06-05_v8_RNASeQCv1.1.9_gene_median_tpm.gct.gz" file from GTEx; (ii) selected the adipose tissue samples; (iii) took the gene median transcripts per million (TPM) value in visceral adipose tissue; (iv) filtered out genes with median TPM < 20 from our sex-specific MetaRankings; (v) selected the genes in the top 10% positions of each MetaRanking; and (vi) intersected the two top lists to find stable and highly expressed genes common to both sexes.
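Steps (iv) to (vi) amount to a filter, a cut, and a set intersection. The sketch below shows one possible implementation under stated assumptions: the function name, gene labels, positions, and TPM values are hypothetical; genes missing from the TPM table are treated as not expressed; and the size of the top list is rounded up. The toy example uses a 50% cut only because it contains four genes, whereas the study used 10%.

```python
import math

def top_stable_expressed(meta_ranking, median_tpm, min_tpm=20, top_fraction=0.10):
    """Return the most stable genes among those with sufficient expression.

    meta_ranking: dict gene -> MetaRanking position (lower = more stable)
    median_tpm: dict gene -> median TPM in the tissue of interest
    """
    # Step (iv): drop genes with median TPM below the threshold
    expressed = {gene: pos for gene, pos in meta_ranking.items()
                 if median_tpm.get(gene, 0.0) >= min_tpm}
    # Step (v): keep the top fraction of the remaining ranking (rounded up)
    ordered = sorted(expressed, key=expressed.get)
    n_top = math.ceil(len(ordered) * top_fraction)
    return set(ordered[:n_top])

# Hypothetical MetaRanking positions and median TPM values
female_ranking = {"A": 1, "B": 2, "C": 3, "D": 4}
male_ranking = {"B": 1, "A": 2, "D": 3, "C": 4}
median_tpm = {"A": 150.0, "B": 5.0, "C": 40.0, "D": 60.0}

top_female = top_stable_expressed(female_ranking, median_tpm, top_fraction=0.5)
top_male = top_stable_expressed(male_ranking, median_tpm, top_fraction=0.5)

# Step (vi): genes stable and highly expressed in both sexes
su_hkg_candidates = top_female & top_male
female_only = top_female - top_male
male_only = top_male - top_female
```

The set differences illustrate how the same lists also yield genes that are stable in one sex only.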

Experimental validation

Study selection and sample processing

Subjects were recruited by the endocrinology and surgery departments at the University Hospital Joan XXIII (Tarragona, Spain) in accordance with the Declaration of Helsinki. Human visceral and subcutaneous AT samples were obtained during surgery from lean and obese male and female individuals. Total RNA was extracted from adipose tissue using the RNeasy lipid tissue midi kit (Qiagen Science). One microgram of RNA was reverse transcribed with random primers using the reverse transcription system (Applied Biosystems) [39].

Mouse AT was obtained from wild type and Irs2−/− [40] (insulin resistance and type 2 diabetes model) C57BL/6 littermates. According to the criteria outlined in the “Guide for the Care and Use of Laboratory Animals”, all animals received humane care [22]. Total RNA was extracted from abdominal fat using a combined protocol including Trizol (Sigma) and RNeasy Mini Kit (Qiagen) with DNaseI Digestion. First-strand synthesis was performed using EcoDry Premix (Takara).

Gene expression analysis

Quantitative gene expression analysis was performed on 50 ng of cDNA template. Real-time PCR was conducted in a LightCycler 480 Instrument IIR (Roche) using SYBR Premix Ex Taq (Tli RNaseH Plus, Takara). The genes selected as potential HKGs in human and mouse white adipose tissue (WAT) were 18S, PPIA, and RPL19. Primers were designed in two consecutive exons, when possible, taking into consideration all mRNA reference sequences in NCBI (https://www.ncbi.nlm.nih.gov/gene/; https://www.ncbi.nlm.nih.gov/nuccore/) [41], which were aligned to search for common regions with Pairwise Sequence Alignment (https://www.ebi.ac.uk/Tools/psa/) [42]. Alternative transcript variants were analyzed with AceView (https://www.ncbi.nlm.nih.gov/IEB/Research/Acembly/index.html) [43], and primers (designed with either Primer3 or Primer-BLAST) amplifying the most represented sequence(s) were chosen (Additional file 1). All primers used in this study are listed in Additional file 2: Table S1. Crossing point (Cp) values were analyzed for stability between samples, and relative quantification was performed using the 2^-ΔCt method. Statistical analyses were performed with GraphPad Prism 8 (GraphPad Software, v8.0). The results are expressed as the arithmetic mean ± the standard error of the mean (SEM). When two data sets were compared, a Student's t-test was used. Differences were considered significant at p-value < 0.05 (*), p-value < 0.01 (**), and p-value < 0.001 (***).
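The effect of the reference gene on 2^-ΔCt quantification can be illustrated with a short Python sketch. All Cp values below are hypothetical, the function names are illustrative, and ΔCt is taken as Cp(target) minus Cp(reference), which is the usual convention but is not spelled out in the text.

```python
from statistics import mean, stdev

def relative_expression(cp_target, cp_reference):
    """Per-sample 2^-dCt, with dCt = Cp(target) - Cp(reference)."""
    return [2 ** -(t - r) for t, r in zip(cp_target, cp_reference)]

def cp_cv_percent(cp_values):
    """Coefficient of variation of raw Cp values, in percent."""
    return 100 * stdev(cp_values) / mean(cp_values)

# Hypothetical target gene with identical Cp values in males and females
target_male, target_female = [25.0, 25.0], [25.0, 25.0]

# A reference gene with the same Cp in both sexes leaves the comparison unbiased
stable_male = relative_expression(target_male, [20.0, 20.0])
stable_female = relative_expression(target_female, [20.0, 20.0])

# A reference gene shifted by one cycle in females creates an apparent 2-fold change
biased_male = relative_expression(target_male, [20.0, 20.0])
biased_female = relative_expression(target_female, [21.0, 21.0])
apparent_fold_change = mean(biased_female) / mean(biased_male)
```

A one-cycle shift in the reference gene between groups is enough to produce a spurious two-fold difference in a target whose raw Cp values are identical, which is why reference gene stability needs to be checked in each sex.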

Web tool

A freely available web tool, metafun-HKG (https://bioinfo.cipf.es/metafun-HKG), was created during this study to allow users to review and share the large volume of generated data and results. The front-end was developed using the Bootstrap library. This easy-to-use resource is organized into four sections: (1) a quick summary of the results obtained in each phase of the analysis pipeline; (2) the detailed results of the exploratory analysis for each study; (3) the detailed results of the variability assessment for each study; and (4) the gene stability meta-analysis by sex and organism, which integrates and summarizes all results. Users can interact with the web tool through graphics and tables and search for information on specific genes.

Results

Classic HKG selection

An extensive bibliographic review revealed that the reference genes chosen for qRT-PCR analysis of gene expression in human AT or various types of adipocytes generally included the metabolic genes GAPDH [7, 14,15,16, 39, 44], HPRT [7, 16], PPIA [14, 39, 44], and UBC [14, 45] and the ribosomal genes 18S [7, 14, 16, 39, 46,47,48] and RPL19 [49]. As these genes have commonly been used as reference genes under several experimental conditions (although sex was generally not considered), we selected these six classic human AT HKGs and evaluated their suitability as sex-unbiased HKGs (suHKGs). In the case of 18S, we specifically selected RNA18S5 for our analysis.

Systematic review and data collection

We searched the GEO by defining the sample tissue, type of study, and organism of interest and obtained a total of 187 and 214 candidate studies for Homo sapiens (Hsa) and Mus musculus (Mmu), respectively. We selected the main microarray platforms for each species that contained the greatest number of studies; this provided 4 and 5 platforms for Hsa (Table 1) and Mmu (Table 2), respectively. We excluded 138 and 171 studies of Hsa and Mmu, respectively, as they failed to meet the inclusion criteria. Finally, we selected 49 Hsa studies and 43 Mmu studies for sex-based evaluations (Fig. 2), which involved 2724 Hsa and 1072 Mmu samples.

Fig. 2

Flow diagram of the systematic review and selection of studies for meta-analysis according to PRISMA statement guidelines for database searches

In Hsa, 24 (49%) of the 49 selected studies included sample information regarding sex: 10 studies covered both sexes, 11 included females exclusively, and 3 contained only male samples (Fig. 3A). In Mmu, 22 (51%) of the 43 selected studies reported the sex of samples; only 1 study covered both sexes, while 2 included exclusively female samples and 19 contained only male samples (Fig. 3B). Finally, we selected human samples with known sex information (681 male and 875 female samples; Additional file 2: Table S2 and Fig. S1) and all mouse samples (1072 samples, of which 559 were known to be male and 34 female; Additional file 2: Table S3 and Fig. S2) for analysis. Due to the low number of known female samples in mice, we excluded Mmu studies from this sex-based analysis.

Fig. 3

Summary of sex as a variable during the review of Hsa and Mmu studies. A Out of 49 Hsa studies, 49% specified the sex of samples, and 20.5% used samples from both sexes in the experimental procedure. B In Mmu, 51% of studies presented information regarding sex but focused mainly on male samples; almost no female samples were found in these studies. Only one study included samples from both sexes

Stability data meta-analysis

After downloading and annotating normalized expression data sets for the selected studies, we calculated three estimators of variability: the coefficient of variation (CV), the interquartile range divided by the median value (IQR/median), and the median absolute deviation divided by the median value (MAD/median). Additional file 2: Figs. S3, S4, and S5 summarize the levels of variability of the six selected human AT (HAT) HKGs (UBC, RPL19, RNA18S5, PPIA, HPRT1, and GAPDH) for male and female Hsa and Mmu.

We conducted a meta-analysis based on the Rank Product (RP) method to integrate statistical results from different platforms; this approach combines gene ranks rather than variability scores (creating platform independence), assigns an RP score to each gene, and identifies the genes that systematically occupy higher positions in the ranked lists. We calculated the RP score of 41,975 Hsa and 47,203 Mmu genes and then sorted all genes; in this ranking, lower positions indicate higher expression stability. 18S displayed significant variability in Hsa in both males and females; however, this gene represented the second most stable selected HKG in Mmu. Figure 4 depicts the positions occupied by the six selected HAT HKGs in Mmu, Hsa males, and Hsa females. Surprisingly, HKG stability in humans differed between female and male samples, with females displaying greater instability. The Metafun-HKG web tool provides the complete rankings, with the positions and RP scores of all evaluated genes in each experimental condition.

Fig. 4

A Ranking of stability levels for classic HKGs evaluated in Hsa females (upper) and males (lower). The position in the ranking for each selected gene is described on the X-axis. This ranking was generated by taking the mean of the obtained RP values for the three statistical approaches (CV, IQR/median, and MAD/median) after filtering non-coding genes. Ranking based on 18,973 genes. B Ranking stability levels for classic HKGs evaluated in Mmu. This ranking was generated by taking the mean of the obtained RP values for the three statistical approaches (CV, IQR/median, and MAD/median). Ranking based on 47,203 genes

In order to decipher differences in gene expression stability between male and female AT samples, we conducted a deconvolution analysis. Overall, we did not find consistent differences between the sexes in cellular composition across datasets (Additional file 3).

To select sex-unbiased, highly expressed, and stable human AT HKG candidates, we combined the scores of the three statistical approaches in a unique list of positions for each experimental condition (metaRanking) and filtered out genes with low expression (TPM < 20) in the GTEx database. These steps provided a list of 5,315 genes. We next took the top 10% (532) most stable genes in the Hsa male and Hsa female metaRankings separately and intersected the two lists, which resulted in 195 candidate suHKGs (http://bioinfo.cipf.es/metafun-HKG/). This analysis revealed relative stability and expression values high enough for detection by different gene expression analysis technologies in total Hsa samples (Table 3, Fig. 5). From this list, we selected human AT HKGs that included the classical HKGs PPIA, UBC, RPL19, and RPS18 and the additional novel candidate suHKGs RPS8 and UBB. We also detected genes that were stable and highly expressed in one sex but not in the other, which may be used as sex-specific reference genes; such genes included ANXA2, DDX39B, and PLIN4 in males and DNASE2, NDUFB11, and RARA in females (Additional file 2: Table S4, Fig. 5). We did not find expression data for the 18S gene in GTEx, although we searched under several aliases (RNA18S5, RNA18S1, RNA18SN1, RNA18SN5, RN18S1).

Table 3 Candidate suHKGs for gene expression analysis
Fig. 5

MetaRanking of HKG stability levels for Hsa females and males. Dot shape indicates classical HKG (star) or new potential HKGs (circle). The color indicates if a gene is stable for both sexes (green), only in females (violet), only in males (red), or unstable (black). Dashed line indicates the limit position of the top 10% most stable genes with an expression of at least 20 TPM

Experimental validation

We selected PPIA, RPL19, and 18S for experimental validation according to our computational assessment of variability. We analyzed human AT mRNA from lean and obese male and female individuals by qPCR to validate the previous computational metadata analysis (Table 3; Fig. 6). Analysis of the coefficient of variation (CV) of raw crossing point (Cp) values revealed similar Cp values in male and female samples, with low CV values, for PPIA and RPL19 (Fig. 6A); however, 18S exhibited significant differences in Cp values between male and female samples, together with high CV values (Fig. 6A). Further, gene expression analysis of multiple experimental targets revealed differing patterns when using PPIA or RPL19 compared to 18S as an HKG (Fig. 6B). We analyzed several genes involved in physiological and metabolic adipose tissue functions (e.g., IRS1, LEPR, and PPARγ) in male and female human AT samples under two different physiological conditions using potential suHKG candidates. The results provided evidence for the suitability of RPL19 and PPIA as suHKGs and disqualified 18S as an HKG when considering sex as a variable (Fig. 6B). Overall, the experimental procedures validate the computational metadata analysis, discarding 18S and selecting PPIA and RPL19 as suHKGs for HAT analysis.

Fig. 6

Gene expression analysis in HAT from male and female samples using different HKGs. A Coefficient of variation (CV) in the Cp values of each candidate gene calculated in male and female for lean and obese samples. B IRS1, LEPR, and PPARγ expression analysis using PPIA, RPL19, and 18S as reference genes. Male Lean n = 3; Female Lean n = 7; Male Obese n = 10; Female Obese n = 10. Student's t-test applied for significance—*p-value < 0.05, and **p-value < 0.01

To circumvent the lack of sex-based Mmu data needed to compute a Mmu metaRanking, we experimentally evaluated the mouse orthologs (Ppia, Rpl19, and 18s) of the genes tested in human AT in male and female wild-type (wt) mice and in an insulin-resistant Irs2−/− knockout (ko) model. Relative gene expression analysis demonstrated that the choice of internal control affected the relative expression of different experimental targets in the different mouse models. Using 18s as an HKG altered the relative gene expression of InsR, Lepr, and Phb in male and female mouse AT samples, while Ppia and Rpl19 performed well as suHKGs in mouse AT samples (Additional file 2: Fig. S6). These results confirm that mouse homologs of suHKG candidates can be used in mouse-based gene expression studies.

Metafun-HKG web tool

We created the open platform web tool Metafun-HKG (https://bioinfo.cipf.es/metafun-HKG) to allow easy access to any information related to this study. This resource contains information related to the study samples, systematic review, gene variability scores, and stability rankings. The stability indicators for each gene evaluated by platform, species, and sex can be freely explored by users to identify profiles of interest.

Discussion

Assessment of suHKG candidates

The two main objectives of this work were (i) evaluating the suitability of a group of six classic HKGs as human AT suHKGs and (ii) identifying genes with a stable, high expression profile that represent new human AT suHKG candidates. Our novel strategy has reviewed the role of HKGs by considering sex, species, and platform as variables in the evaluated studies.

We performed our analysis on three different sample groups based on sex and species: female Hsa, male Hsa, and all Mmu samples. We did not analyze Mmu female and male samples separately due to the lack of reported female Mmu samples in the selected studies. HKGs displayed platform-dependent variability under all conditions, given that each microarray platform has its own probe design and technical protocol. Previous studies on technology dependence concluded that this factor has less determining power than the differences in transcript expression levels caused by varying cell conditions [24].

The results reveal considerable differences in gene stability, including differences in the six selected classical HKGs between Hsa female and male samples, with generally higher instability in females. PPIA, UBC, and RPL19 displayed high stability levels in samples from both sexes, while HPRT1 and 18S exhibited low stability levels in both sexes. Interestingly, GAPDH displayed high stability in male samples and low stability in female samples. In apparent contradiction, 18s presented high stability levels in Mmu, but this may be explained by the overwhelming presence of male samples in this group and the fact that this gene suffers a significant sex bias in mouse (Additional file 2: Fig. S6). The common absence of female samples in studies (as further evidenced by our systematic review) could explain the systematic reports of 18S as a stable HKG.

The results of this work showed different patterns of instability in HKG expression, and we wondered whether this might be related to a different distribution of cell types in males and females. To address this question, we performed a deconvolution analysis in each study, which allowed us to compare all male and female participants in each microarray dataset. Deconvolution showed the heterogeneous cell landscape characteristic of human adipose biopsies, including progenitor and differentiated adipocytes and immune cell lineages, among others. The joint evaluation of the results of all studies showed no differences in cell composition between males and females, so no relationship was identified between the patterns of instability in the expression of these genes and cell type distribution (Additional file 3). The analysis of single-cell RNA-seq data from adipose tissue samples may provide complementary information to evaluate these differential patterns by sex. Few such datasets are currently available, although their number is increasing and they will become an important and accurate source of information.

We propose a list of 195 suHKG candidates suitable for use as internal controls in HAT-based gene expression studies including male and female samples; these genes exhibit high expression (TPM > 20) and stability levels and a minimal influence of sex on expression patterns. As we could not reproduce the pipeline followed with human samples in mouse studies due to the lack of female mouse samples, we suggest the orthologs of proposed human suHKGs as mouse suHKGs.

We validated a selection of suHKG candidates experimentally to assess the robustness of our computational findings; overall, our gene expression analysis validated the in silico results (Table 3). PPIA, a widely used human AT HKG, and RPL19, used as an HKG in several cell types [30, 31, 50] and occasionally in human AT studies [49], have been validated as human AT suHKGs; however, experimental validation shows that 18S, which is widely used as a human AT HKG [7, 14, 16, 39, 46,47,48], displays significant levels of variability in both male and female samples and sex-specific expression patterns (Fig. 6). These results agree with the findings of other recently published studies [51] and correlate with those found in mouse adipose tissue. The use of 18s as an HKG induces apparent differences in the relative expression levels of several genes between males and females and between wild-type and Irs2−/− samples (Additional file 2: Fig. S6); instead, we suggest Rpl19 and Ppia as more suitable suHKGs for mouse adipose tissue analysis.

We identified several additional human AT suHKGs from the computational analysis, including RPS18, RPS8, and UBB (Table 3), that show stable and sufficiently high expression levels. We also suggest the mouse orthologs of these human suHKGs as mouse suHKGs. To this end, we designed a web tool to help select the best suHKG for a given human or mouse adipose tissue experimental design.

Strengths and limitations

Massive data analysis of gene expression represents a pivotal tool for understanding different biological scenarios, which may eventually help elucidate mechanisms relevant to basic and biomedical research. Data analyses must be assessed in the laboratory by studying relative gene expression normalized to an adequately chosen HKG. Selecting an ideal HKG remains a challenging process; this choice helps to ensure an accurate result and must consider all experimental conditions and biological variables. Incorporating sex-based analyses into research will improve reproducibility and experimental efficiency: sex influences the outcome of experiments and must be accounted for as a critical biological variable. Sex must be considered to monitor sex-based differences and similarities for all diseases and biological processes that affect both sexes, which may help reduce bias, enable social equality in scientific outcomes, and encourage new opportunities for discovery and innovation, as evidenced by several studies analyzing this issue [20, 22, 52,53,54,55].

Numerous lines of evidence suggest that the current status quo does not address fundamental issues of sex-based differences evident in gene expression. To date, many classic HKGs remain unevaluated with sex as a biological variable; these include those commonly used in human AT studies (e.g., ACTB, GAPDH, and 18S) and additional HKGs such as PPIA, HPRT, RPS18, or RPL19. Using a HKG to normalize samples without assessing its behavior under the specific experimental conditions of each study (including sex) may lead to a biased outcome. HKGs may remain stable in one sex but not in the other, as in the case of DDX39B and PLIN4 (stable in males) or NDUFB11 and RARA (stable in females), or may have stable yet distinct expression levels in both sexes, as for 18s in mouse. Ignoring sex and choosing a non-optimal HKG may introduce confounding variables and make it impossible to assess whether differences in the data derive from the experimental design or from the normalization process. This source of variability would reduce statistical power, thereby making it more difficult to find significant results. In this study, we analyzed the role of six conventional HAT HKGs considering sex as a variable for the first time.
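The effect of a sex-biased reference gene on normalized expression can be illustrated with a minimal sketch. It assumes the widely used 2^-ΔΔCt relative quantification method with 100% amplification efficiency, which is an assumption made for illustration and not necessarily the procedure used in this study. All Ct values, dictionary keys, and the function name are hypothetical.

```python
def relative_expression(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Fold change of a target gene versus a calibrator sample (2^-ddCt).

    Assumes 100% amplification efficiency for target and reference.
    """
    delta_ct = ct_target - ct_ref              # sample, normalized to reference
    delta_ct_cal = ct_target_cal - ct_ref_cal  # calibrator, normalized to reference
    return 2 ** -(delta_ct - delta_ct_cal)


# Hypothetical Ct values: the target gene is identical in both sexes,
# "stable_ref" is a sex-unbiased reference, and "biased_ref" is a reference
# whose expression is stable within each sex but differs between sexes.
male = {"target": 25.0, "stable_ref": 20.0, "biased_ref": 12.0}
female = {"target": 25.0, "stable_ref": 20.0, "biased_ref": 13.0}

# Female versus male (calibrator) fold change with each reference gene.
fold_stable = relative_expression(
    female["target"], female["stable_ref"], male["target"], male["stable_ref"]
)
fold_biased = relative_expression(
    female["target"], female["biased_ref"], male["target"], male["biased_ref"]
)

print(f"sex-unbiased reference: {fold_stable:.1f}-fold")
print(f"sex-biased reference:   {fold_biased:.1f}-fold")
```

In this toy example, a one-cycle difference in the reference gene between sexes yields an apparent two-fold difference in a target gene whose raw Ct is identical in males and females, which is the kind of normalization artifact described above.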

Many published studies do not include a sex-based perspective, either omitting the sex of the animals from reporting or using animals of only one sex (typically males). Our systematic review found that 51% of Hsa studies and 49% of Mmu studies failed to include information regarding the sex of samples, with just 19% of Hsa and a striking 2% of Mmu studies including samples from both sexes. Of note, Mmu studies including only female samples represented just 5% of the total. The small number of Mmu studies including female samples represented a significant limitation of the study and prevented the creation of a Mmu meta-ranking to select highly expressed, stable Mmu suHKG candidates as was done for Hsa. To overcome this limitation, we experimentally evaluated the Mmu orthologs of the selected Hsa suHKG candidates, which confirmed their suitability as Mmu suHKGs.

Despite the widespread use of 18S RNA as a HKG, its annotation represents another limiting factor of this study; we failed to find this gene in the GTEx platform under any alias listed in GeneCards. We also noted that identifiers for this gene are unstable or not included in reference assemblies. In addition, the DNA sequence of the RNA18SN5 gene (Accession Number NR_003286.4) has 99–100% identity with other ribosomal RNAs such as RNA18SN1, RNA18SN2, RNA18SN3, RNA18SN4, and RNA18SP3 (Accession Numbers NR_145820.1, NR_146146.1, NR_146152.1, NR_146119.1, NG_054871.1, respectively). Furthermore, the copy number of 18S rRNA differs among individuals and varies with age [56]. Together with the experimental data showing sex-dependent expression levels, these factors make the 18S gene less suitable as a HAT suHKG than the other suHKGs proposed in this study.

Other limitations of the study included the filtering and pre-processing of biological information in GEO to identify published studies with transcriptomic data of adipose tissue, and the classification of samples according to sex. A primary limiting factor was the absence of a standardized vocabulary to tag sex in the sample records of the studies. Even though the gene expression data in GEO are presented as a standardized expression matrix, the metadata (including sample source, tissue type, or sample sex) are reported through free-text fields written by the researcher submitting the study. The absence of standardized vocabulary and structured information constrains data mining power on large-scale data, and improvements in this regard could aid the processing of data in public repositories [57].
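To illustrate why free-text metadata complicates sample classification, the following minimal sketch maps free-text sample descriptions to a controlled sex label. It is not the curation procedure used in this study; the term lists, the regular expressions, and the function name `normalize_sex` are assumptions chosen for illustration, and real GEO records would require a far broader vocabulary and manual review.

```python
import re

# Hypothetical vocabularies; real submissions use many more variants.
FEMALE_TERMS = {"f", "female", "woman", "women"}
MALE_TERMS = {"m", "male", "man", "men"}

# Matches an explicit field such as "Sex: F" or "gender=male".
KEY_VALUE = re.compile(r"\b(?:sex|gender)\s*[:=]\s*([A-Za-z]+)", re.IGNORECASE)
WORDS = re.compile(r"[A-Za-z]+")


def normalize_sex(text):
    """Return 'female', 'male', or 'unknown' for a free-text description."""
    match = KEY_VALUE.search(text)
    if match:
        # An explicit field is trusted, including one-letter codes.
        tokens = {match.group(1).lower()}
    else:
        # Without an explicit field, ignore one-letter codes, which are
        # ambiguous in free text (e.g. the "M" in "M. musculus").
        tokens = {w.lower() for w in WORDS.findall(text)} - {"f", "m"}
    is_female = bool(tokens & FEMALE_TERMS)
    is_male = bool(tokens & MALE_TERMS)
    if is_female == is_male:
        # Neither sex or both sexes mentioned: cannot assign a single label.
        return "unknown"
    return "female" if is_female else "male"


for description in [
    "Sex: F",
    "gender=Male",
    "subcutaneous adipose tissue, female donor",
    "pooled male and female mice",
    "adipose tissue, M. musculus",
]:
    print(f"{description!r} -> {normalize_sex(description)}")
```

Even this small example needs special handling for one-letter codes and pooled samples, which shows how a structured, controlled field for sex would simplify large-scale reuse of repository data.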

For the first time, this study presents a computational strategy based on massive data analysis that is capable of assessing the sex bias in expression levels of classical and novel HKGs over a large volume of studies and samples. This strategy revealed that an accurate experimental design for adipose tissue requires the adequate selection of a suHKG, such as PPIA or RPL19, or new options such as RPS18 or UBB. In that context, the common practices of pooling males and females, or of including only males, could finally be avoided. This study presents the relative expression stability of six commonly used HKGs and the variability levels of other genes covered by the analyzed microarray platforms. This strategy is aligned with the FAIR principles [58] (Findability, Accessibility, Interoperability, and Reusability) to ensure the further utility and reproducibility of the generated information.

Although limited to adipose tissue, our findings suggest that the sex bias in commonly used HKGs could appear in other tissues, thereby affecting the normalization process of gene expression analysis of any kind. Incorrect normalization may significantly alter gene expression data, as shown in the case of 18S, and lead to erroneous conclusions. This study highlights the importance of considering sex as a variable in biomedical studies and provides evidence that thorough analyses of HKGs as internal controls in all tissues should be promptly addressed.

Perspectives and significance

Our results highlight the importance of taking sex into consideration as a biological variable when choosing the best HKG as a reference in HAT gene expression analysis. Our novel computational strategy includes massive data analysis capable of assessing the sex bias in expression levels of classical and novel HKGs in order to select sex-unbiased HKGs. Conventionally reported HKGs include several metabolic and ribosomal genes such as GAPDH, HPRT, PPIA, UBC, 18S, and RPL19. However, our computational strategy based on meta-analysis techniques indicates that certain classical HKGs, such as 18S, one of the most widely used, may fail to function adequately as reference genes because they are differentially expressed in males and females, while others, such as PPIA and RPL19, succeed as reference genes. Further, following the selection criteria, several additional markers, such as RPS8 and UBB, are proposed, and an open web resource (https://bioinfo.cipf.es/metafun-HKG) is offered for customized experimental design.

Together, these results provide new and useful insight for evaluating gene expression analysis in human adipose tissue under several experimental conditions and for biomedical purposes. Using an incorrect HKG may lead to inappropriate interpretation and application of results, while using a suHKG provides a better experimental approach, whether males and females are considered as separate groups or included in the same experimental group and properly analyzed. This study highlights the importance of considering sex as a variable in gene expression analyses in human AT and provides evidence supporting the selection of suHKGs in other tissues, which should be promptly addressed.