The term big data has been coined to refer to the large amounts of data that are increasingly available. In biomedicine, diverse novel technologies have been developed to generate such quantities of data at different scales, from molecules to clinical readouts (Table 1). Here we discuss two major areas, molecular “omics” and tissue imaging, although others, such as the data generated by wearables, may also be important for IgAN.
Our ability to measure diverse biomolecules at large scale and speed has increased dramatically in recent years. This ranges from DNA (genomics) and RNA (transcriptomics) sequencing to mass spectrometry applied to proteins (proteomics) and metabolites (metabolomics). These methods are collectively called omics. They can increasingly provide information at the single-cell level and even directly from tissues, preserving information on the location of the cells, and they are increasingly being used in nephrology [9,10,11,12]. Omics approaches in the kidneys and other organs involved in IgAN can identify novel biomarkers and improve our understanding of the disease mechanism (Fig. 1).
The pathogenic complexity of IgAN is highlighted by its complex genetic basis [13, 14]. Genome-wide association studies (GWAS) have found variants in genes involved in the immune system, including antigen presentation, the alternative pathway of the complement system, and mucosal immunity. These findings have provided a genetic basis for the 4-hit theory of IgAN. Collectively, GWAS in IgAN have found nearly 20 independent risk alleles, yet these explain only 7% of the disease risk, although this proportion is expected to increase with larger cohorts in the future.
As an immediate readout of our genome, transcriptome profiling is an attractive strategy to characterize diseases. Because these data are relatively easy to generate, transcriptome profiling has been broadly applied in kidney diseases [9, 10], including IgAN. Recent technological developments have made it possible to measure the transcriptome of individual cells, i.e., single-cell RNA sequencing (scRNA-seq). This substantially increases our capacity to examine disease mechanisms, including those involving the immune system. scRNA-seq has made it possible, for example, to dissect the key cells involved in scar formation in the kidney, understand the distribution of distinct immune cell populations in the kidney, and identify protective mechanisms mediated by nuclear receptors. A first study applied scRNA-seq to kidney cells and monocytes from peripheral blood of 13 IgAN patients and compared these to 6 controls. The analysis found upregulation of JCHAIN, a gene involved in the dimerization of IgA, in mesangial cells, and altered expression profiles of macrophages and CD8+ T-cells that could lead to a deregulation of inflammation. These results illustrate the value of these technologies, but must be interpreted with caution, given the limited number and the relative heterogeneity of the patients studied.
Besides the commonly measured messenger RNA, other forms of RNA with regulatory roles, such as microRNAs, can be measured with sequencing technologies. A recent study found four microRNAs (miR-150-5p, miR-155-5p, miR-146b-5p, and miR-135a-5p) to be differentially expressed between IgAN progressors and non-progressors. The most deregulated, miR-150-5p, was however found to be a general mediator of fibrosis rather than specific to IgAN.
Messenger RNAs are typically translated into proteins. Although measuring proteins at large scale, called proteomics, is more challenging than measuring nucleotide-based molecules, it has improved substantially. Besides the expression levels of proteins, their post-translational modifications can be informative, as they can regulate protein function. In IgAN, the aberrant glycosylation of IgA1, which leads to immune complex deposition and disease pathogenesis, is actively investigated. More generic profiling of blood proteins and peptides can provide biomarkers and molecular signatures. One study analyzed nine published urinary proteomics datasets and integrated them with transcriptomic data and literature knowledge to identify twenty proteins involved in IgAN in the kidney. The relevance of three of these proteins, adenylyl cyclase-associated protein 1 (CAP1), SHC-transforming protein 1 (SHC1), and prolylcarboxypeptidase (PRCP), was experimentally confirmed. Finally, proteomics, like transcriptomics, is becoming increasingly available with spatial resolution from tissues, paving the way to generate multiplexed histological data to complement classic pathological assessment.
Metabolomics analyses provide a snapshot of the metabolites within samples. This snapshot serves as a metabolic signature of the cellular processes driven by the transcriptome and proteome. Metabolomics approaches have been extensively applied in CKD to identify metabolic changes, biomarkers, and signatures. In an early retrospective metabolomics analysis, 16 plasma metabolites were associated with CKD incidence in a follow-up period of 8 years. Five of these (5-hydroxyindoleacetic acid, citrulline, kynurenic acid, kynurenine, and xanthosine) were identified as eGFR-independent CKD predictors and were used to improve the predictive ability of a logistic regression model with clinical risk factors, such as proteinuria and eGFR. Another profiling of the plasma metabolome reported 16 metabolites as possible predictive risk markers for primary outcomes of progression to ESKD and death in a longitudinal cohort. Metabolic alterations in the urine [26,27,28] and fecal samples of IgAN patients compared to healthy controls have also been described. These studies attempted to associate changes in free amino acids and p-cresyl levels with disease progression, aromatic amino acid metabolism and biosynthesis with disease severity, and betaine and citrate with regulation of the inflammatory marker TNF-α. These results are encouraging, but the reported metabolic changes, such as those in p-cresyl and derivatives of tryptophan metabolism, often overlap with alterations described in general CKD. Larger cohorts and follow-up studies are needed to characterize and confirm these metabolic alterations in IgAN and their specificity for the disease.
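To make the idea of adding metabolites to a clinical logistic regression model concrete, the following is a minimal sketch. All coefficients, the intercept, and the standardized input values are invented for illustration and are not taken from the cited study; only the general form of the model (a logistic function of a weighted sum of predictors) is assumed.

```python
import math


def logistic_risk(features, coefficients, intercept):
    """Predicted probability from a logistic regression model.

    features and coefficients are dicts keyed by predictor name.
    """
    linear = intercept + sum(
        coefficients[name] * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical, standardized (z-scored) predictors for one individual.
clinical = {"egfr_z": -0.5, "proteinuria_z": 1.0}
metabolites = {"kynurenine_z": 1.0}

# Hypothetical coefficients; a real model would estimate these from cohort data.
clinical_coef = {"egfr_z": -0.8, "proteinuria_z": 0.6}
combined_coef = {**clinical_coef, "kynurenine_z": 0.5}
intercept = -2.0

risk_clinical = logistic_risk(clinical, clinical_coef, intercept)
risk_combined = logistic_risk({**clinical, **metabolites}, combined_coef, intercept)

print(f"clinical factors only: {risk_clinical:.3f}")
print(f"clinical + metabolite: {risk_combined:.3f}")
```

In practice, the benefit of the added metabolites would be judged on held-out data, for example by comparing discrimination between the two models, rather than on a single predicted probability.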
The human microbiome encompasses the microbial communities that occupy distinct parts of the human body such as the skin, tonsils, and gut. Recent advances in high-throughput technologies have substantially expanded our understanding of the microbiome complexity and dynamics, as well as its alterations in various diseases, including CKD and inflammatory bowel disease (IBD).
A sequencing-based technique targeting the bacterial 16S ribosomal RNA gene has been widely used to profile the taxonomic composition of the microbiome in different conditions. 16S analyses in IgAN reported alterations in the microbiome’s structure in the saliva, tonsil swabs, and gut [33,34,35], while denaturing gradient gel electrophoresis showed compositional changes in tonsil tissues. These taxonomic alterations were noted in IgAN patients when compared to healthy controls [26, 31, 32, 34, 36] or other nephropathy patients [32, 33]. Some of the analyses also linked specific microbiota with remission rates [26, 36] and clinical measurements, such as proteinuria, serum albumin [32, 33], and inflammatory markers, thus highlighting the microbiome’s potential as a diagnostic and prognostic marker of IgAN.
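As a rough illustration of what profiling the taxonomic composition means computationally, the sketch below converts per-taxon 16S read counts into relative abundances and compares two samples. The taxon names and read counts are placeholders invented for the example, and real analyses involve additional steps (quality filtering, taxonomic assignment, statistical testing across cohorts) that are not shown.

```python
def relative_abundance(counts):
    """Convert raw per-taxon read counts into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def abundance_shift(sample, reference):
    """Per-taxon difference in relative abundance (sample minus reference)."""
    taxa = set(sample) | set(reference)
    return {t: sample.get(t, 0.0) - reference.get(t, 0.0) for t in taxa}


# Hypothetical read counts per genus for one patient and one control sample.
patient_counts = {"genus_A": 600, "genus_B": 300, "genus_C": 100}
control_counts = {"genus_A": 400, "genus_B": 400, "genus_C": 200}

patient_rel = relative_abundance(patient_counts)
control_rel = relative_abundance(control_counts)
shift = abundance_shift(patient_rel, control_rel)

for taxon in sorted(shift):
    print(f"{taxon}: {shift[taxon]:+.2f}")
```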
The role of bacteria as potential inducers of IgAN pathology is further supported by their association with exaggerated antibody production in IgAN patients when compared to controls [37, 38] and deposition of IgA antibodies and bacterial antigens in the glomeruli [39,40,41]. Mice overexpressing B cell–activating factor (BAFF) showed that the presence of microbiota was essential for the development of a phenotype mimicking IgAN pathophysiology. Moreover, the IgAN phenotype was delayed or prevented in mice expressing a human IgA1 variant prone to mesangial deposition, when raised under germ-free conditions or upon antibiotic-induced microbiome depletion, respectively [43, 44]. Recently, binding of polymeric IgA (pIgA) to certain microbiota was found to be enriched in the tonsil crypts of IgAN patients, and IgA binding intensity to the same taxa correlated with Gd-IgA1 serum levels. Yet, a preceding analysis reported no significant alterations between the tonsillar microbiome of IgAN and recurring tonsillitis patients. Taken together, these data suggest that an excessive mucosal immune response against particular taxa might underlie glomerular immune-complex deposition in IgAN.
Although promising, human microbiome analyses in IgAN have been performed on small cohorts of ethnically uniform patients, and data on key confounders, such as the use of immunosuppressants, are missing. Furthermore, more in-depth techniques, such as shotgun metagenomics, which attempts to quantify all genetic material within a sample, can be used to provide higher taxonomic resolution and pinpoint the metabolic or functional changes in the IgAN microbiome.
Several techniques enable the analysis of molecular expression patterns directly on tissue sections. Such techniques can be especially valuable for tissues that are available only in small amounts, such as kidney biopsies. We discuss selected examples that have also been used in nephrology and nephropathology, acknowledging that this overview is not comprehensive.
Multiplexing techniques enable visualization of multiple molecular targets at once, providing an advantage compared to traditional immunofluorescence techniques, which are usually limited to 4–5 markers (colors). Multi-epitope ligand cartography (MELC) is a high-throughput immunofluorescence method that relies on repeated cycles of staining and bleaching, making it possible to compile a so-called toponome map, i.e., the expression of target molecules in a cell or tissue. Theoretically, this approach can be used to visualize expression of any molecule for which a fluorescently labelled ligand is available.
Another technique, points accumulation for imaging in nanoscale topography (PAINT) [48, 49], also enables high-resolution multiplex tissue analyses. Exchange-PAINT uses fluorescently labeled oligonucleotides that bind to antibodies tagged with a DNA-PAINT docking sequence. To visualize several antigens, iterative cycles consisting of staining, imaging, applying a unique pseudocolor, and washing are performed. Importantly, Exchange-PAINT can be performed using a single dye and laser, so that the dye with optimal intrinsic properties for the imaging task can be chosen for all probes.
Co-detection by indexing (CODEX) uses dyed nucleotides for multiplex tissue analysis. CODEX uses DNA-antibody tags with specific 5′-overhangs that are sequentially extended by a polymerase in each cycle. In this way, only the tags of defined antibodies incorporate the dyed nucleotides in a given cycle. After incorporation, imaging is performed and the dyed nucleotides are removed by inter-cycle Tris(2-carboxyethyl)phosphine hydrochloride (TCEP) cleavage. This enables imaging of 66 markers in the same formalin-fixed and paraffin-embedded (FFPE) tissue section. Theoretically, the analyses can be performed using a standard immunofluorescence microscope.
A similar technology, imaging mass cytometry (IMC), can be used to visualize multiple proteins in FFPE sections at once and has recently been applied to kidney tissue. IMC uses antibodies that are conjugated to specific isotopes. A laser scans the tissue in a meandering pattern at a resolution of 1 μm, aerosolizing, atomizing, and ionizing it. The ablated material is then fed into a mass spectrometer for isotope abundance analysis, which identifies the respective antibodies at a given location, providing spatial expression information. For visualization, the final image must be constructed computationally. A recent study applied IMC to human kidneys and found a potentially novel cell type in the distal convoluted tubule (DCT) that does not express calbindin (a typical DCT marker) and is larger than an intercalated cell.
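The computational image construction step can be sketched as follows: each laser spot yields isotope counts at a known position, and a panel maps each isotope to the antibody target it was conjugated to. This is a simplified illustration rather than the pipeline of any specific instrument or study; the isotope labels, the panel assignment, and the counts are hypothetical.

```python
def build_channel_images(spots, panel, width, height):
    """Assemble one 2D image per marker from per-spot isotope counts.

    spots: iterable of (x, y, {isotope_label: count}) tuples, one per laser spot.
    panel: dict mapping isotope label -> marker name (antibody target).
    Pixels without a measured spot stay at 0.0.
    """
    images = {}
    for x, y, counts in spots:
        if not (0 <= x < width and 0 <= y < height):
            raise ValueError(f"spot ({x}, {y}) lies outside the image")
        for isotope, value in counts.items():
            marker = panel.get(isotope, isotope)
            if marker not in images:
                images[marker] = [[0.0] * width for _ in range(height)]
            images[marker][y][x] = float(value)
    return images


# Hypothetical panel: which isotope tag was conjugated to which antibody.
panel = {"metal_1": "calbindin", "metal_2": "marker_B"}

# Hypothetical measurements for three of the four spots of a 2 x 2 region.
spots = [
    (0, 0, {"metal_1": 12.0, "metal_2": 1.0}),
    (1, 0, {"metal_1": 3.0, "metal_2": 8.0}),
    (1, 1, {"metal_1": 7.0, "metal_2": 2.0}),
]

images = build_channel_images(spots, panel, width=2, height=2)
for marker, image in images.items():
    print(marker, image)
```

Each resulting per-marker image can then be displayed as a separate channel or overlaid with others using pseudocolors.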
Matrix-assisted laser desorption/ionization mass spectrometry imaging (MALDI-MSI) can analyze many analytes directly on tissue samples with reasonable spatial resolution. A molecule of interest can be identified using the mass-to-charge ratio (m/z). This technique was recently applied to IgAN. By comparing eleven IgAN cases to six non-IgAN cases with a mesangioproliferative glomerular injury pattern, the authors could identify proteomic signatures associated with progressive IgAN, e.g., increased glomerular vimentin expression.
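Identification by m/z usually amounts to matching an observed peak against reference values within a mass tolerance, often expressed in parts per million (ppm). The sketch below shows this matching step only; the reference m/z values, analyte names, and the 10 ppm tolerance are assumptions chosen for illustration and do not come from the cited study.

```python
def ppm_error(observed_mz, reference_mz):
    """Signed mass error of an observed peak in parts per million."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def match_peaks(observed_peaks, reference, tolerance_ppm=10.0):
    """For each observed m/z, list reference analytes within the tolerance."""
    matches = {}
    for mz in observed_peaks:
        matches[mz] = [
            name
            for name, ref_mz in reference.items()
            if abs(ppm_error(mz, ref_mz)) <= tolerance_ppm
        ]
    return matches


# Hypothetical reference list of analytes and their expected m/z values.
reference = {"peptide_A": 1000.0, "peptide_B": 1500.0}

# Hypothetical observed peaks: about 5 ppm and 20 ppm away from the references.
observed = [1000.005, 1500.03]

matches = match_peaks(observed, reference)
for mz, names in matches.items():
    print(mz, names if names else "no match")
```

A tighter tolerance reduces ambiguous assignments but requires an instrument with correspondingly high mass accuracy.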
The methods above were largely applied to 2D tissue sections. 3D tissue imaging represents an interesting alternative with some advantages over 2D section imaging, particularly for the assessment of structures like vessels or glomeruli. Such 3D tissue imaging can be destructive, i.e., the tissue must be fully processed for the method and is then unavailable for further analyses, or non-destructive, i.e., the tissue remains available and can be used for other “destructive” molecular methods. MicroCT imaging of tissues is one example of non-destructive imaging that has already been used in kidneys. Optical tissue clearing is another interesting approach for 3D organ visualization, the feasibility of which has already been shown in the kidney.
Finally, the non-invasive imaging methods of radiology and nuclear medicine, i.e., sonography, computed tomography (CT), magnetic resonance imaging (MRI), positron emission tomography (PET), and single-photon emission computed tomography (SPECT), provide spatial morphological information. Each of these imaging modalities is undergoing substantial technological development, such as super-resolution sonography or various specific MRI sequences and techniques. Another interesting development is the non-invasive molecular imaging of kidney diseases, as recently shown for imaging of fibrosis [56, 57]. Given that all these techniques provide images, AI approaches are increasingly being developed and implemented for augmented diagnostics and analysis.