Background

Quantitative reverse transcription polymerase chain reaction (qRT-PCR) is a widely used technique for the detection and quantification of mRNA. Currently, with the spread of COVID-19, and despite the development of a number of interesting imaging and artificial intelligence-based diagnostic procedures [1, 2], qRT-PCR is still considered the gold-standard tool for SARS-CoV-2 detection by the World Health Organization (WHO) [3]. Depending on the amount of viral RNA detected by the qRT-PCR system, a sample is classified as Positive, Negative or Undetermined. Most healthcare providers rely on automated detection procedures that use proprietary reagents and software and offer limited options for parameter set-up, particularly those recommended by WHO (CFX Manager 3.1 from BioRad) and by the Food and Drug Administration (FDA) (the SDS 1.4 software from Applied Biosystems).

In this context, research laboratories worldwide are developing alternative diagnostic protocols that do not depend on commercial kits and their accompanying software. Researchers may thus choose from a wide variety of quantitative gene expression reagents (one- vs. two-step qRT-PCR or fluorescent probes vs. intercalating dyes) and adapt protocols to the real-time amplification systems available [4]. In addition to the need for independent experimental protocols, there is also a need for an open-source tool that can rapidly analyze qRT-PCR data irrespective of the protocol or equipment used. Moreover, in order to discard samples with unspecific amplification products, software for accurate qRT-PCR analysis should allow the inspection of amplification and melting curves.

Finally, the increase in the number of infections worldwide, driven by fast-spreading variants that are especially relevant in low-income countries, makes qRT-PCR the first choice for SARS-CoV-2 detection. There is a need for open-source tools for qRT-PCR data analysis that are easily customizable to the experimental set-up in each setting, without the prohibitive cost of proprietary licenses. Furthermore, the high risk in dense populations, together with close contact with animal hosts that act as SARS-CoV-2 reservoirs, makes qRT-PCR detection an urgent need with strong public health implications [5, 6].

In this context, and taking into account both WHO recommendations and the new challenges of the global COVID-19 pandemic, the objective of our work was to develop flexible, fast and non-proprietary software for large-scale qRT-PCR data analysis. Herein, we present shinyCurves, a Shiny-based, user-friendly, flexible, integrative, non-proprietary and free application that is able to:

  1. Process qRT-PCR raw amplification data obtained with either fluorescent probes or intercalating dyes, from different plate formats and qRT-PCR systems, including those recommended by the main public health agencies;

  2. Establish the settings that will classify samples into three categories (Positive, Negative or Undetermined), with the option of including a range of experimental controls and serial dilutions of viral RNA/DNA;

  3. Plot both amplification and melting curves, providing additional quality control of the specificity of the amplification and offering the possibility of visually inspecting the results obtained.

A COVID-19 toy dataset is also provided as a practical example (see Additional file 1).

Implementation

shinyCurves is designed for users with limited or no programming experience who wish to analyze qRT-PCR data in a simple and efficient manner. The application was written entirely in R [7] in combination with the ‘shiny’ package [8], and is available as a web application hosted on shinyapps.io (https://biosol.shinyapps.io/shinycurves/). The source code can be freely downloaded from the GitHub repository https://github.com/biosol/shinyCurves. Analysis tables are processed with ‘data.table’ [9] and ‘dplyr’ [10] because of their high efficiency, and plots are generated using ‘ggplot2’ [11] and ‘plotly’ [12]. Melting curves are plotted with the ‘qpcR’ R package [13].

Raw data can be uploaded directly to the application in different file formats, including csv, xlsx or xls files generated by the two most widely used qPCR systems (i.e. the BioRad and Applied Biosystems platforms), as shown in Fig. 1. Once the upload is complete, the intercalating dye pipeline starts by plotting one melting curve per reaction, so that reactions showing unspecific amplification products can be discarded before samples are classified into the Positive, Negative and Undetermined groups. The fluorescent probe pipeline instead first classifies the samples (the calling analysis) and then plots the amplification curves for visual inspection of the results.

Figure 1.

An overview example of two shinyCurves analyses (intercalating dye and fluorescent probe, in green and pink, respectively). Both analyses can be performed using qRT-PCR results from BioRad CFX or Applied Biosystems QuantStudio platforms, among others. The input data required for each analysis are specified in the black boxes. In green, the intercalating dye pipeline starts with the plotting of melting curves to discard unspecific amplification products. Then, in the calling analysis, each sample is assigned a final result (Positive, Negative or Undetermined). In pink, the fluorescent probe pipeline starts with the calling analysis and is followed by the plotting of amplification curves for the visual inspection of the results.

Finally, it is worth mentioning that shinyCurves allows for the use of sample duplicates and is also independent of the plate format, i.e. 96- or 384-well plates. The inclusion of experimental controls and serial dilutions of viral DNA is optional but, if included, they must follow specific formats, as described in the provided Manual (see Additional file 2).
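When serial dilutions of viral DNA are included, they are typically used to build a standard curve from which copy numbers can be estimated. The following is a minimal sketch in Python (not the R code of shinyCurves), assuming the usual linear relationship Ct = slope × log10(copies) + intercept. The function names and the dilution values are hypothetical and chosen only for illustration.

```python
import math

def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept.

    `copies` are the known copy numbers of the serial dilutions and
    `cts` the Ct values measured for them (hypothetical inputs).
    """
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_copies(ct, slope, intercept):
    """Invert the standard curve to estimate the copy number of a sample."""
    return 10 ** ((ct - intercept) / slope)

# Illustrative ten-fold dilution series (made-up values, not real data).
dilution_copies = [1e5, 1e4, 1e3, 1e2, 1e1]
dilution_cts = [20.00, 23.32, 26.64, 29.96, 33.28]

slope, intercept = fit_standard_curve(dilution_copies, dilution_cts)
sample_copies = estimate_copies(26.64, slope, intercept)
```

A sample whose Ct matches one of the dilutions is mapped back to the copy number of that dilution, which is a convenient sanity check for the fit.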

Results

To illustrate the shinyCurves pipeline, we provide data sets generated by the COVID-19 Basque Inter-Institutional Group (coBIG) in both fluorescent probe and intercalating dye experiments, run on the BioRad CFX and Applied Biosystems QuantStudio systems [4]. The genes chosen for these analyses are N1, RdRp and RNAseP (human genomic control) in the fluorescent probe assay, and N, S, RdRp and H30 (endogenous control) in the intercalating dye assay. These genes are included in the COVID-19 diagnostic panel described by the US Centers for Disease Control and Prevention (CDC, https://www.cdc.gov/).

Melting curves (in the intercalating dye analysis)

In the dye experiments, shinyCurves allows users to plot melting curves to exclude from the final calling those samples lacking reliable and unique melting temperature (Tm) peaks. These plots are generated using the meltcurve function from the ‘qpcR’ package in R [13]. As a result, users can download a table including only those plate wells that contain samples with unique Tm peaks that meet the established criteria, together with the Tm value and plot assigned to each well. This table must be included in the calling analysis as a prior filtering step (ID_well file).
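The idea of a unique Tm peak can be illustrated with a short sketch. shinyCurves itself relies on the meltcurve function of ‘qpcR’; the Python code below is only an illustrative stand-in that locates Tm peaks as local maxima of the negative derivative of fluorescence with respect to temperature (-dF/dT). The function names, the relative height threshold of 0.2 and the synthetic curves are assumptions made for this example.

```python
import math

def negative_derivative(temps, fluorescence):
    """Central-difference estimate of -dF/dT at the interior temperatures."""
    points, deriv = [], []
    for i in range(1, len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        points.append(temps[i])
        deriv.append(-slope)
    return points, deriv

def find_tm_peaks(temps, fluorescence, min_rel_height=0.2):
    """Return temperatures of local maxima of -dF/dT.

    Peaks lower than `min_rel_height` times the tallest peak are ignored
    (hypothetical threshold, not a shinyCurves parameter).
    """
    points, deriv = negative_derivative(temps, fluorescence)
    top = max(deriv)
    if top <= 0:
        return []
    peaks = []
    for i in range(1, len(deriv) - 1):
        is_local_max = deriv[i] > deriv[i - 1] and deriv[i] >= deriv[i + 1]
        if is_local_max and deriv[i] >= min_rel_height * top:
            peaks.append(points[i])
    return peaks

def synthetic_melt(temps, tms, width=1.0):
    """Synthetic melting curve: one sigmoidal fluorescence drop per product."""
    return [sum(1.0 / (1.0 + math.exp((t - tm) / width)) for tm in tms) for t in temps]

temps = [65.0 + 0.5 * i for i in range(61)]        # 65 to 95 degrees C
single_product = synthetic_melt(temps, [80.0])      # one specific product
two_products = synthetic_melt(temps, [75.0, 85.0])  # unspecific amplification

single_peaks = find_tm_peaks(temps, single_product)
double_peaks = find_tm_peaks(temps, two_products)
keep_well = len(single_peaks) == 1                  # well passes the filter
```

In this sketch, a well would be kept for the calling analysis only when exactly one peak is found; real melting data are noisier and usually need smoothing before differentiation.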

Calling analysis

After uploading the input data, users can fine-tune a series of parameters that will be used to classify samples as Positive, Negative or Undetermined. In general, two types of calling criteria are allowed, namely, the Ct value of the analyzed viral gene(s) (compulsory) and the estimated viral RNA copy number (optional). A human endogenous control is included to make sure that the nucleic acid extraction worked. Adjustable parameters include, among others: use of a viral DNA standard curve (yes/no), use of duplicates (yes/no) and number of “positive” viral genes required to consider a sample Positive. Calling results are presented in a downloadable table. For more details on the calling criteria, see Additional file 3, which describes the full Calling Analysis algorithm.
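The kind of rule set described above can be sketched as follows. This Python example is not the algorithm of Additional file 3; the decision rules, the function name call_sample and the Ct cut-offs of 35 are hypothetical placeholders that only show how Ct values, an endogenous control and a minimum number of positive viral genes could be combined.

```python
from typing import Dict, Optional

def call_sample(viral_cts: Dict[str, Optional[float]],
                control_ct: Optional[float],
                viral_ct_cutoff: float = 35.0,
                control_ct_cutoff: float = 35.0,
                min_positive_genes: int = 1) -> str:
    """Return 'Positive', 'Negative' or 'Undetermined' for one sample.

    A Ct of None means that no amplification was detected for that target.
    All cut-offs and rules are illustrative assumptions.
    """
    # A viral gene counts as positive when it amplified at or before the cut-off.
    n_positive = sum(
        1 for ct in viral_cts.values()
        if ct is not None and ct <= viral_ct_cutoff
    )
    # The endogenous control shows whether nucleic acid extraction worked.
    control_ok = control_ct is not None and control_ct <= control_ct_cutoff

    if n_positive >= min_positive_genes:
        return "Positive"
    if not control_ok:
        # Without a valid control, a negative result cannot be trusted.
        return "Undetermined"
    if n_positive == 0:
        return "Negative"
    # Some viral genes are positive, but fewer than required.
    return "Undetermined"

# Made-up Ct values for three samples in a probe assay (N1, RdRp, RNAseP).
calls = {
    "sample_A": call_sample({"N1": 24.1, "RdRp": 25.3}, control_ct=27.0,
                            min_positive_genes=2),
    "sample_B": call_sample({"N1": None, "RdRp": None}, control_ct=26.5,
                            min_positive_genes=2),
    "sample_C": call_sample({"N1": 33.0, "RdRp": None}, control_ct=27.2,
                            min_positive_genes=2),
}
```

Exposing the cut-offs and the required number of positive genes as function arguments mirrors the adjustable parameters mentioned in the text.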

Amplification curves (in the fluorescent probe analysis)

After the calling analysis has been performed in probe experiments, shinyCurves allows users to plot General Amplification Curves including all the samples, as well as individual Amplification Curves of samples classified as Undetermined. Each Undetermined sample is plotted independently, together with those samples that have a final calling. Upon visual inspection of the Undetermined sample curves, the user can decide whether or not they follow a sigmoidal shape and therefore represent a specific amplification.

Conclusions

To our knowledge, this is the first tool designed to automate the calling of clinical samples containing pathogen nucleic acids through qRT-PCR while remaining completely flexible as regards the user’s requirements and experimental settings. Several open-access software packages and tools for the analysis of qPCR data already exist (see review by [14]). However, some of them have been discontinued (CopyCaller), are no longer maintained [15], or require a subscription or license (Cy0 Method, https://www.cy0method.org/, [16]). Moreover, the alternatives to proprietary software for drawing melting curves are very limited. As far as we know, the qpcR R package is the only free package available, but it provides no graphical user interface [17].

In summary, shinyCurves is a user-friendly application that analyzes qRT-PCR data from different amplification methods and platforms and allows their visualization. It is easily accessible for any user profile, as no programming skills are required. shinyCurves is set up automatically on the shinyapps.io server, making a basic Internet connection its only requirement. These minimal requirements make it a ready-to-use tool applicable to the clinical routine. Therefore, we conclude that it is a significant improvement in analytical capacity, speed and reproducibility, which are key factors in pathogen detection analyses, especially in COVID-19 times.