
The original version of this article [1] unfortunately contained a publisher error in Fig. 4: the figure was incorrectly reproduced as a duplicate of Fig. 5. The correct Fig. 4 is published in this Erratum as Fig. 1.

Fig. 1

The ratio F_Hadoop/F_HPC as a function of the reciprocal dataset size in Gb. The pipelines were run on the Hadoop I and II clusters, as well as on a 16-core HPC node. The analytical curve f(x) = (a_1 x + b_1)/(a_2 x + b_2) was fitted to the data over the stretches where the calculation time scales linearly on the HPC platform. Outliers are marked with crossed symbols.
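
As a reading aid for the analytical curve in the caption, the following minimal Python sketch evaluates f(x) = (a1·x + b1)/(a2·x + b2) and illustrates two properties that follow directly from its form: for x → 0 (very large datasets, since x is the reciprocal dataset size) the ratio tends to b1/b2, and for large x it approaches a1/a2. The function name `rational_ratio` and the parameter values are hypothetical and chosen for illustration only; they are not the fitted values behind the figure.

```python
def rational_ratio(x, a1, b1, a2, b2):
    """Evaluate f(x) = (a1*x + b1) / (a2*x + b2)."""
    return (a1 * x + b1) / (a2 * x + b2)


# Hypothetical parameters for illustration only (NOT the fitted values).
params = dict(a1=3.0, b1=1.0, a2=1.0, b2=2.0)

# x is the reciprocal dataset size, so x = 0 corresponds to the
# infinite-dataset limit, where f reduces to b1 / b2.
large_dataset_limit = rational_ratio(0.0, **params)

# For very large x (very small datasets) f approaches a1 / a2.
small_dataset_limit = rational_ratio(1e9, **params)

# Multiplying all four parameters by the same constant leaves f unchanged,
# so only three of the four parameters are independent.
scaled = {name: 10.0 * value for name, value in params.items()}

print(large_dataset_limit, small_dataset_limit)
print(rational_ratio(0.25, **params), rational_ratio(0.25, **scaled))
```

Because of the scale redundancy shown in the last step, a fit of this form would typically fix one parameter (for example b2 = 1) before estimating the others; whether the original analysis did so is not stated here.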