tRigon: an R package and Shiny App for integrative (path-)omics data analysis

David L. Hölscher^1,3, Michael Goedertier^1,2, Barbara M. Klinkhammer^1, Patrick Droste^1,3, Ivan G. Costa^2, Peter Boor^1,3 & Roman D. Bülow^1

Abstract

Background

Pathomics facilitates automated, reproducible and precise histopathology analysis and morphological phenotyping. Like molecular omics datasets, pathomics datasets are high-dimensional, but they also show large outlier variability and inherent data missingness, which makes quick and comprehensible data analysis challenging. To facilitate pathomics data analysis and interpretation and to support broad implementation, we developed tRigon (Toolbox foR InteGrative (path-)Omics data aNalysis), a Shiny application for fast, comprehensive and reproducible pathomics analysis.

Results

tRigon is available via the CRAN repository (https://cran.r-project.org/web/packages/tRigon) with its source code available on GitLab (https://git-ce.rwth-aachen.de/labooratory-ai/trigon). The tRigon package can be installed locally and its application can be executed from the R console via the command ‘tRigon::run_tRigon()’. Alternatively, the application is hosted online and can be accessed at https://labooratory.shinyapps.io/tRigon. We show fast computation of small, medium and large datasets in a low- and high-performance hardware setting, indicating broad applicability of tRigon.

Conclusions

tRigon allows researchers without coding abilities to perform exploratory feature analyses of pathomics and non-pathomics datasets on their own using a variety of hardware.

Introduction

Histologic tissue analysis is vital for investigating disease states, understanding pathophysiological mechanisms and guiding diagnostics. Recent technological developments in digital and computational pathology have enabled automated large-scale histopathology analyses [1,2,3,4]. The expansion of digital pathology has been fueled especially by deep learning-based workflows [5,6,7,8]. While end-to-end approaches focus on outputs that are directly actionable clinically or diagnostically, pathomics mines histopathology data by extracting explainable, quantitative color or geometric features (e.g., circularity) at large scale from histological structures identified by semantic segmentation [9,10,11,12,13,14]. This approach is similar to molecular omics approaches and aims to better understand morphology by generating morphometric features for relevant tissue structures, allowing exploratory analyses [15]. The extracted features could be integrated into clinical decision-making, e.g., for patient risk stratification [16] or outcome prediction [17, 18].

Pathomics data can be generated at comparatively low cost relative to other omics methods, enabling broad implementation in many research groups. This makes pathomics analyses especially interesting for biomedical researchers performing histological analyses, but the datasets can be challenging for established conventional omics workflows due to large outlier variability and missingness caused by inconsistent occurrence of the analyzed structures. In addition, biomedical researchers who mostly perform tissue-based analyses often lack the specific coding skills needed for analyzing pathomics data and streamlining time-intensive data curation processes [19]. For these reasons, we have developed tRigon (Toolbox foR InteGrative (path-)Omics data aNalysis), an R Shiny application, to make exploratory pathomics data analyses more open, accessible and feasible for researchers and clinicians. While tRigon was designed mainly for pathomics data, it is also suitable for the analysis of other high- or low-dimensional data such as molecular omics or medical datasets.

Implementation

tRigon is a Shiny application [20] built in R [21] and is available both on CRAN (https://cran.r-project.org/web/packages/tRigon) and GitLab (https://git-ce.rwth-aachen.de/labooratory-ai/trigon). It includes various functions such as descriptive statistics, statistical tests and visualizations for analyzing large and complex datasets (Fig. 1). tRigon was tested on Windows, Linux and macOS.

Pathomics datasets typically consist of multiple .csv files, generated for example by our previously published framework for large-scale histomorphometry (FLASH) [9]. The datasets include structural morphometric measurements (e.g., diameter, area or shape descriptors) for major histological compartments and structures. For large human cohorts or animal experiments, such collections of files can be challenging to analyze. Furthermore, the data need to be integrated with additional metadata. For human specimens, all tissue pieces on a slide (e.g., two biopsy cores) typically belong to the same case and share the same clinical information, whereas some slides from animal experiments contain samples from multiple experimental conditions, e.g., specimens from several animals or a diseased specimen and its internal or contralateral control tissue on the same slide. Additionally, pathomics data can be analyzed at the specimen level (e.g., a single human pathology case) or at single-structure resolution.

tRigon can aggregate large numbers of pathomics files and link them, via metadata, with other (e.g., clinical) information on the analyzed samples. Depending on the desired analysis, the application offers human- or animal-type data workflows and supports specimen-level or structure-level calculations.
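To make the aggregation step more concrete, the following sketch joins structure measurements to a metadata table and then produces either structure-level rows or one summary row per specimen. It is written in Python purely for illustration and is not tRigon's R implementation; the column names (`slide_id`, `group`, `area`), the use of the median as the specimen-level summary and all function names are assumptions.

```python
import csv
import io
from statistics import median

# Hypothetical inputs: one measurement table and one metadata table.
MEASUREMENTS_CSV = """slide_id,structure,area
S1,glomerulus,10.0
S1,glomerulus,14.0
S2,glomerulus,20.0
S2,glomerulus,
"""

METADATA_CSV = """slide_id,group
S1,control
S2,disease
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def structure_level(measurements, metadata):
    """Attach metadata to every measured structure (one row per structure)."""
    by_slide = {row["slide_id"]: row for row in metadata}
    merged = []
    for row in measurements:
        meta = by_slide.get(row["slide_id"], {})
        merged.append({
            "slide_id": row["slide_id"],
            "group": meta.get("group"),  # None if the slide has no metadata
            "structure": row["structure"],
            # Empty cells become None so missingness stays visible downstream.
            "area": float(row["area"]) if row["area"] else None,
        })
    return merged


def specimen_level(rows):
    """Collapse structure rows to one median per slide, skipping missing values."""
    grouped = {}
    for row in rows:
        values = grouped.setdefault((row["slide_id"], row["group"]), [])
        if row["area"] is not None:
            values.append(row["area"])
    return [
        {"slide_id": slide, "group": group,
         "median_area": median(values) if values else None}
        for (slide, group), values in grouped.items()
    ]


structures = structure_level(read_rows(MEASUREMENTS_CSV), read_rows(METADATA_CSV))
specimens = specimen_level(structures)
```

Here `structures` keeps single-structure resolution, while `specimens` holds one row per slide. For animal-type slides that carry several conditions, the grouping key would presumably need an additional specimen identifier rather than the slide alone.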

For the aggregated feature files or for datasets loaded by the user, tRigon provides a toolbox of analytical methods, i.e., statistics, data visualizations and machine learning algorithms (Table 1). Each analysis tool is a tab in the application with an easily understandable user interface (Figs. 2, 3, 4, 5, 6, 7). Users can tailor all functions to their specific needs by choosing from various statistical tests, distribution plots, machine learning methods and output style options. To handle heterogeneous datasets effectively, missingness is automatically reported in the application, non-normally distributed features are supported by multiple non-parametric tests and outliers can be scaled in plots accordingly. Additionally, the application includes a help section with instructions and common pitfalls. All processed data, generated plots and computed statistical tests can be downloaded if desired. To enable reproducible analyses across user sessions and to keep a record of results, tRigon can generate and save markdown-based .html reports that include all relevant inputs (e.g., chosen features and group column, plot selection, etc.) and outputs for each task (Table 1). A full example analysis is provided in the supplementary material (Additional file 9: Tables S1–S3 and Figs. S1–S4).
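As an illustration of what a per-feature missingness report can look like, the sketch below computes the fraction of missing values for each feature column. It is a Python stand-in rather than tRigon's own code; the function names, the feature names and the rule for what counts as missing (`None`, an empty string or NaN) are assumptions.

```python
import math


def is_missing(value):
    """Treat None, empty strings and NaN as missing."""
    if value is None or value == "":
        return True
    return isinstance(value, float) and math.isnan(value)


def missingness_report(rows, features):
    """Return {feature: fraction of rows in which the feature is missing}."""
    report = {}
    for feature in features:
        missing = sum(1 for row in rows if is_missing(row.get(feature)))
        report[feature] = missing / len(rows) if rows else 0.0
    return report


# Hypothetical structure-level rows with gaps in both features.
rows = [
    {"area": 10.0, "circularity": 0.9},
    {"area": None, "circularity": 0.8},
    {"area": 12.0, "circularity": float("nan")},
    {"area": 11.0, "circularity": None},
]
report = missingness_report(rows, ["area", "circularity"])
```

Reporting the fraction rather than the raw count keeps features comparable across datasets of different sizes.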

Table 1 tRigon functions with explanations

In addition to running locally via the R console, tRigon is freely available online in the ShinyApps.io workspace (https://labooratory.shinyapps.io/tRigon), although memory for free use is limited to 1 GB of random-access memory (RAM). Users are therefore advised to process and analyze computationally expensive files, such as large pathomics datasets, locally.

Results

Nine datasets from different platforms were acquired to demonstrate the effectiveness, versatility, and limitations of tRigon. Five of these are pathomics datasets: four human kidney cohorts and one animal experiment on 2,8-dihydroxyadenine crystal nephropathy, a mouse model of diet-induced tubulointerstitial fibrosis and scarring [22]. The human kidney datasets comprise two in-house datasets, one of biopsies (AC_B) and one of nephrectomies (AC_N) [9], as well as the freely available Kidney Precision Medicine Project (KPMP) [23] and Human BioMolecular Atlas Program (HuBMAP) [24] datasets containing kidney biopsies and nephrectomies. Furthermore, we analyzed freely available aggregated specimen-level pathomics data from a recent study on breast cancer, replicating its results (Additional file 9: Table S4 and Figs. S5–S9) [14]. In total, the four human pathomics datasets include 3,287 instance-level files with a total size of 312.7 MB, while the 2,8-dihydroxyadenine crystal nephropathy pathomics dataset consists of 9 files with a total size of 13.0 MB. The aggregated breast cancer histomics data file has a size of 7.55 MB. Finally, three freely available non-pathomics medical datasets [25,26,27] with a total size of 4.62 MB from the Teaching of Statistics in the Health Sciences (TSHS) Resources Portal were included.

Computation time was evaluated on three datasets of different sizes in two settings representing low- and high-resource hardware (Table 2). Setting A refers to running the application on a hybrid tablet-notebook (Intel Pentium CPU 1.60 GHz with 8 GB RAM), while setting B refers to running tRigon on a workstation (Intel Xeon Gold 6128 CPU 3.40 GHz, 128 GB RAM). In general, running tRigon on the workstation was faster, but computation remained quick and performance smooth in setting A, even for large datasets (Table 2). Regardless of hardware, tRigon was especially fast for statistical analysis (summary statistics, pairwise Wilcoxon rank tests, and correlations) and visualizations (distribution plots, scatter plots, and correlation matrices). As expected, processing data frames and running machine learning algorithms were the more time-consuming operations (Table 2).
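For readers unfamiliar with pairwise rank tests, the sketch below shows the idea for every pair of groups. It assumes the rank-sum (Mann-Whitney U) variant with a two-sided normal-approximation p-value, without tie correction in the variance and without adjustment for multiple comparisons; whether tRigon makes the same choices is not stated here. The code is an illustrative Python stand-in, not tRigon's R code, and the group names and values are invented.

```python
import math
from itertools import combinations


def midranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (U statistic of x, two-sided normal-approximation p-value)."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    # Variance without tie correction; adequate only as an illustration.
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, p


def pairwise_rank_sum(groups):
    """Run the test for every unordered pair of group names."""
    return {
        (a, b): rank_sum_test(groups[a], groups[b])
        for a, b in combinations(sorted(groups), 2)
    }


# Hypothetical feature values per group.
groups = {
    "control": [1.0, 2.0, 3.0],
    "mild": [2.5, 3.5, 4.5],
    "severe": [6.0, 7.0, 8.0],
}
results = pairwise_rank_sum(groups)
```

With samples this small, an exact p-value would be preferable to the normal approximation, and the number of pairs grows quadratically with the number of groups, so a multiple-comparison adjustment becomes important in practice.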

Table 2 tRigon runtime based on data frame size and computational setting

Discussion

tRigon is a user-friendly Shiny application for high-throughput, simple and reproducible analysis of high-dimensional data including pathomics datasets.

An obvious limitation of tRigon is that it is not designed to generate pathomics data. It cannot be used to analyze whole slide images directly, so users must rely on other software for that step. However, tools are available that allow researchers, in some instances even without coding experience, to perform such analyses [28,29,30,31]. Another limitation is that tRigon is not designed as a full-scale statistical program, i.e., in-depth statistical analyses need to be performed with dedicated tools. However, the app allows new functionality to be added, so its range of analytical tools may grow in the future.

Conclusion

With tRigon, users can easily and effectively summarize or correlate features, visualize distributions, statistically test hypotheses, apply machine learning algorithms and cluster data. Markdown reports help users document each analysis step. tRigon can further accelerate pathomics research and facilitate the creation of valuable readouts for large (path-)omics datasets. We will continue to update and expand tRigon.

Availability of data and materials

Project name: tRigon; Project home page: https://git-ce.rwth-aachen.de/labooratory-ai/trigon; https://cran.r-project.org/web/packages/tRigon/index.html; https://labooratory.shinyapps.io/tRigon/; Operating system(s): tested on Windows 10 and 11, Linux and macOS; Programming language: R and CSS; Other requirements: all required packages are installed when installing via the command install.packages("tRigon"); License: GNU GPLv3; Any restrictions to use by non-academics: as detailed in GNU GPLv3. All datasets included in this study are deposited in the tRigon repository (https://git-ce.rwth-aachen.de/labooratory-ai/trigon/-/blob/main/demo_data.zip) or the respective study repository and are freely available for users to test the application.

References

1. van der Laak J, Litjens G, Ciompi F. Deep learning in histopathology: the path to the clinic. Nat Med. 2021;27:775–84.
2. Kather JN, et al. Pan-cancer image-based detection of clinically actionable genetic alterations. Nat Cancer. 2020;1:789–99.
3. Bulten W, et al. Automated deep-learning system for Gleason grading of prostate cancer using biopsies: a diagnostic study. Lancet Oncol. 2020;21:233–41.
4. Kers J, et al. Deep learning-based classification of kidney transplant pathology: a retrospective, multicentre, proof-of-concept study. Lancet Digit Health. 2022;4:e18–26.
5. Abels E, et al. Computational pathology definitions, best practices, and recommendations for regulatory guidance: a white paper from the Digital Pathology Association. J Pathol. 2019;249:286–94.
6. Niazi MKK, Parwani AV, Gurcan MN. Digital pathology and artificial intelligence. Lancet Oncol. 2019;20:e253–61.
7. Boor P. Artificial intelligence in nephropathology. Nat Rev Nephrol. 2020;16:4–6.
8. Barisoni L, Lafata KJ, Hewitt SM, Madabhushi A, Balis UGJ. Digital pathology and computational image analysis in nephropathology. Nat Rev Nephrol. 2020;16:669–85.
9. Hölscher DL, et al. Next-generation morphometry for pathomics-data mining in histopathology. Nat Commun. 2023;14:470.
10. Gupta R, Kurc T, Sharma A, Almeida JS, Saltz J. The emergence of pathomics. Curr Pathobiol Rep. 2019;7:73–84.
11. Bouteldja N, et al. Deep learning-based segmentation and quantification in experimental kidney histopathology. J Am Soc Nephrol. 2021;32:52–68.
12. Ginley B, et al. Automated computational detection of interstitial fibrosis, tubular atrophy, and glomerulosclerosis. J Am Soc Nephrol. 2021;32:837–50.
13. Hermsen M, et al. Convolutional neural networks for the evaluation of chronic and inflammatory lesions in kidney transplant biopsies. Am J Pathol. 2022;192:1418–32.
14. Amgad M, et al. A population-level digital histologic biomarker for enhanced prognosis of invasive breast cancer. Nat Med. 2023. https://doi.org/10.1038/s41591-023-02643-7.
15. Bülow RD, Hölscher DL, Costa IG, Boor P. Extending the landscape of omics technologies by pathomics. NPJ Syst Biol Appl. 2023;9:38.
16. Zimmermann M, et al. Deep learning-based molecular morphometrics for kidney biopsies. JCI Insight. 2021;6:15-28.
17. Chen Y, et al. Clinical relevance of computationally derived attributes of peritubular capillaries from kidney biopsies. Kidney360. 2023;4:648–58.
18. Yi Z, et al. Deep learning identified pathological abnormalities predictive of graft loss in kidney transplant biopsies. Kidney Int. 2022;101:288–98.
19. Dasu T, Johnson T. Exploratory data mining and data cleaning. New York: Wiley; 2003.
20. Beeley C. Web application development with R using Shiny. Birmingham: Packt Publishing Ltd; 2016.
21. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing. 2022.
22. Klinkhammer BM, et al. Current kidney function parameters overestimate kidney tissue repair in reversible experimental kidney disease. Kidney Int. 2022;102:307–20.
23. de Boer IH, et al. Rationale and design of the Kidney Precision Medicine Project. Kidney Int. 2021;99:498–510.
24. HuBMAP Consortium. The human body at cellular resolution: the NIH Human Biomolecular Atlas Program. Nature. 2019;574:187–92.
25. Nowacki AS. Hypoxia MAP Treatment Dataset. TSHS Resources Portal. 2022.
26. Nowacki AS, Homoki J. Core temperature Dataset. TSHS Resources Portal. 2023.
27. Nowacki AS. Surgery timing dataset. TSHS Resources Portal. 2016.
28. Pocock J, et al. TIAToolbox as an end-to-end library for advanced tissue image analytics. Commun Med. 2022;2:120.
29. Gutman DA, et al. The digital slide archive: a software platform for management, integration, and analysis of histology for cancer research. Cancer Res. 2017;77:e75–8.
30. Gutman DA, et al. Cancer digital slide archive: an informatics resource to support integrated in silico analysis of TCGA pathology data. J Am Med Inform Assoc. 2013;20:1091–8.
31. Pettersen HS, et al. Code-free development and deployment of deep segmentation models for digital pathology. Front Med. 2021;8:816281.

Funding

Open Access funding enabled and organized by Projekt DEAL. This work was supported by the START-Program of the Faculty of Medicine of the RWTH Aachen University [grant number 148/21]; the German Research Foundation [DFG; Project IDs 322900939, 454024652]; European Research Council [ERC Consolidator grant number 101001791]; the Federal Ministries of Education and Research [BMBF; grant number STOP-FSGS-01GM1901A & EMED Consortium Fibromap], Health [Deep Liver, grant number ZMVI1-2520DAT111] and Economic Affairs and Energy [EMPAIA, grant number 01MK2002A], and the Innovation Fund of the Federal Joint Committee [Transplant.KI, grant number 01VSF21048].

Author information

Peter Boor and Roman D. Bülow are shared senior authors.

Authors and Affiliations

Institute of Pathology, RWTH Aachen University Clinic, Aachen, Germany
David L. Hölscher, Michael Goedertier, Barbara M. Klinkhammer, Patrick Droste, Peter Boor & Roman D. Bülow
Institute for Computational Genomics, RWTH Aachen University Clinic, Aachen, Germany
Michael Goedertier & Ivan G. Costa
Department of Nephrology and Immunology, RWTH Aachen University Clinic, Aachen, Germany
David L. Hölscher, Patrick Droste & Peter Boor

Contributions

Conceptualization: DLH, RDB, PB; Resources: PB, RDB; Data Curation: DLH; Software: DLH, MG, RDB; Formal Analysis: DLH, RDB; Supervision: PB, RDB; Funding Acquisition: PB, RDB; Validation: DLH, BMK, PD, MG, RDB; Investigation: DLH; Visualization: DLH; Methodology: DLH, MG, IC, RDB; Project Administration: None; Writing – original draft: DLH, RDB; Writing – review and editing: DLH, MG, PD, BMK, IC, PB, RDB.

Corresponding author

Correspondence to Roman D. Bülow.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1. tRigon session report in HTML format for a k-means clustering analysis including all inputs, setting options and outputs.

Additional file 2. tRigon session report in HTML format for a correlation analysis including all inputs, setting options and outputs.

Additional file 3. tRigon session report in HTML format for loading data into the application including a detailed description of the loaded data frame.

Additional file 4. tRigon session report in HTML format for processing omics datasets including a detailed description of input files, processing settings and the processed data frame.

Additional file 5. tRigon session report in HTML format for descriptive statistics including all inputs, setting options and outputs.

Additional file 6. tRigon session report in HTML format for a feature importance analysis including all inputs, setting options and outputs.

Additional file 7. tRigon session report in HTML format for feature plots including all inputs, setting options and outputs.

Additional file 8. tRigon session report in HTML format for statistical testing including all inputs, setting options and outputs.

Additional file 9. Supplementary Material containing Supplementary Tables S1–S4 and Supplementary Figures S1–S9.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Hölscher, D.L., Goedertier, M., Klinkhammer, B.M. et al. tRigon: an R package and Shiny App for integrative (path-)omics data analysis. BMC Bioinformatics 25, 98 (2024). https://doi.org/10.1186/s12859-024-05721-w

Received: 21 November 2023
Accepted: 26 February 2024
Published: 05 March 2024
DOI: https://doi.org/10.1186/s12859-024-05721-w
