1 Introduction

Supplementary cementitious materials (SCMs) play a critical role in cement and concrete [1,2,3]. Partial replacement of Portland clinker with SCMs (e.g., coal combustion fly ash, natural pozzolans, calcined natural pozzolans such as calcined clay, ground granulated blast furnace slag, silica fume) is a sustainable way of addressing the environmental concerns of the cement industry (i.e., CO2 footprint and energy consumption). The use of SCMs can also offer economic benefits and improve durability in certain environments [4,5,6,7]. However, the performance of SCMs, which is closely related to their chemical reactivity [7], varies widely depending on their type, source, composition, and processing/production conditions [5, 8]. For this reason, an accurate assessment of the reactivity of SCMs is critical. In 2017, RILEM TC 267-TRM (Tests for Reactivity of Supplementary Cementitious Materials) was created to evaluate and develop reactivity test methods for SCMs. In the first phase of its work, 21 participants evaluated the robustness of existing tests used to qualify SCMs based on chemical reactivity, and the correlation of their results to strength development. Eleven “conventional” SCMs covered by current standards for cement and concrete were included [9]. None of the known standardised tests showed an acceptable correlation (R2 > 0.85) with the 28 days compressive strength. In contrast, the R3 (rapid, relevant and reliable) test, which measures the reactivity of an SCM from the cumulative 7 day heat release or bound water of an SCM-lime-alkali-sulfate/carbonate mixture reacted at 40 °C, yielded promising correlation and robustness (i.e. R2 of 0.86 and 0.94 for bound water and cumulative heat release, respectively).

The objective of the second phase of the work by this technical committee was to improve the correlation between the R3 results and compressive strength, and to identify factors influencing the reproducibility between participating laboratories, with the aim of optimising the test protocol. The final R3 testing protocol was consolidated in a standardised test method, ASTM C1897 [10], as concisely presented in the Experimental section of this paper.

The aim of the present paper is to report the outcomes of the third phase of the activities of this technical committee, which was to apply the final standardised protocol to a wide range of materials, including both commonly used SCMs and emerging materials that are being considered for use as SCMs in the near future. This paper presents the results for 52 materials tested with both R3 protocols (heat release and bound water) by 13 laboratories and discusses threshold levels for SCM reactivity derived from this dataset.

2 Experimental

2.1 Materials

The materials used for the phase 3 study were two different Portland cements (PC) of type CEM I 42.5 N/R and a wide variety of test materials. These materials were collected through an open call offering third parties the opportunity to have their materials tested by the RILEM TC 267-TRM. For reasons of confidentiality and impartiality, the materials were pseudonymised during further distribution, testing and characterisation. For the same reasons no individually traceable results are discussed in this paper.

In total 52 materials were examined. These were assigned to 6 groups plus quartz as an inert reference, as described below:

  1. FA: 10 fly ashes: 5 siliceous and 3 calcareous coal combustion ashes, and 2 biomass ashes.

  2. SL: 13 slags: 6 slags were ground granulated blast furnace slags (GGBFS) from pig iron production. The other 7 slags were generated from other metallurgical processes. This variation is reflected in the relatively broad composition ranges in Table 1.

  3. CY: 10 calcined clays: all clays (before calcination) were kaolinitic, but varied in purity and other constituent minerals. All materials in this group were received already calcined.

  4. PZ: 10 natural pozzolans: 6 natural pozzolans meeting SCM specifications for use in cement and/or concrete (i.e. EN 197–1 or ASTM C618), and 4 other materials of natural origin.

  5. SF: 5 silica fumes: 4 silica fumes met the requirements for use as a cement and concrete constituent; one silica fume material had a lower purity and fineness.

  6. Others: 3 materials not falling within the above-mentioned groups.

Table 1 Ranges of oxide contents determined by X-ray fluorescence spectroscopy and Loss on Ignition (LOI) values per group of material

The ranges for the chemical (oxide) composition and the amorphous/crystalline contents of the SCMs are provided in Table 1 and Fig. 1, respectively. The X-ray powder diffraction (XRD) Rietveld refinement results show significant variations in the crystallinity of different groups of SCMs. The reported ranges are wider than usually expected for standardised SCMs because of the inclusion of non-conventional materials. These non-conventional materials were included to assess the scope of the R3 test and to identify certain material components that can generate false positives by yielding a test response indicating chemical reactivity, yet do not contribute to strength development.

Fig. 1 Ranges of crystalline and amorphous contents per group of materials determined by XRD Rietveld analysis

Suppliers were requested to deliver materials ready for blending with Portland cement, i.e. pre-milled to the fineness used in practice. Figure 2 shows the particle size distributions of the SCMs. These were measured by laser diffraction on ultrasonicated suspensions, using isopropanol as the dispersion medium for FA and SL samples, water with 3 wt.% of PCE superplasticiser for SF materials, and an aqueous sodium carbonate solution (pH > 10) for PZ, CY and Other materials [11]. The measurements were made using a Malvern MasterSizer S particle size analyser operated in Fraunhofer mode; refractive indices and absorbances were adjusted to the solvent and material appearance (colour).

Fig. 2 Particle size (PS) distribution per group of materials determined by laser diffraction. DVx designates the particle size value corresponding to the Xth percentile in the cumulative volumetric particle size distribution

2.2 Mortar compressive strength test

A cement replacement of 30 wt.% was used for the preparation of the mortars for all types of materials, with the exception of the silica fumes, for which a substitution of 10 wt.% was used. Mortar prisms were prepared with EN 196–1 standard sand, cured and tested according to EN 196–1 by the two laboratories carrying out the mortar strength tests, each using a local CEM I 42.5 N/R Portland cement. Compressive strength measurements were carried out at 2, 7, 28 and 90 days. In addition, a sulfate adjustment was applied for the calcined clays following the procedure described in [9]. A PCE superplasticiser was used to control the workability of the mortars with calcined clay and silica fume. Relative compressive strengths (compared to the PC control mortar) were used to reduce the variations in absolute strength due to the use of different cements. These relative compressive strengths were used as a benchmark for the assessment and interpretation of the R3 test results. The relative compressive strength \(R_{{\text{SCM,relative}}}\) was calculated according to Eq. (1) using the absolute compressive strength of the blended mortar containing SCM (denoted as \(R_{{{\text{SCM}}}}\)) and that of the reference mortar (denoted as \(R_{{{\text{PC}}}}\)). These results were then averaged over the two cements (both CEM I 42.5 N/R) for each SCM. As a reference for inert material, the relative strength of quartz-containing blended cements was also measured.

$$R_{{\text{SCM,relative}}} \, \left( \% \right) = \frac{{R_{{{\text{SCM}}}} - R_{{{\text{PC}}}} }}{{R_{{{\text{PC}}}} }} \times 100 \%$$
(1)
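
As an illustration of Eq. (1) and the subsequent averaging over the two cements, a minimal Python sketch is given below. The function names and the strength values are hypothetical and are not taken from the dataset of this study.

```python
def relative_strength(r_scm: float, r_pc: float) -> float:
    """Relative compressive strength in percent, following Eq. (1)."""
    if r_pc <= 0:
        raise ValueError("reference strength R_PC must be positive")
    return (r_scm - r_pc) / r_pc * 100.0


def mean_relative_strength(pairs) -> float:
    """Average of Eq. (1) over (R_SCM, R_PC) pairs, one pair per cement."""
    values = [relative_strength(r_scm, r_pc) for r_scm, r_pc in pairs]
    return sum(values) / len(values)


# Hypothetical 28 days strengths in MPa for one SCM tested with two cements.
example_pairs = [(35.0, 50.0), (42.0, 56.0)]
print(round(mean_relative_strength(example_pairs), 1))
```

A negative value indicates that the blended mortar is weaker than the Portland cement reference, which is the usual case at 30 wt.% replacement.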

2.3 R3 test

The R3 tests were carried out on paste samples incorporating SCMs following the mix design of the model paste shown in Table 2 [10, 12]. The pastes were tested for their bound water using a muffle furnace and for their heat release at 7 days by isothermal calorimetry. Thirteen laboratories contributed to the R3 testing of the 52 materials. To distribute the workload, each material was sent to at least two different laboratories for testing. For the R3 bound water test, the sampled pastes were cured in sealed plastic containers at 40 °C for 7 days. After this time, the samples were crushed and sieved on a 2 mm sieve and dried in an oven at 40 °C for 24 h. The dried samples were dehydrated at 350 °C for 2 h and cooled in a desiccator for 1 h. The bound water (for hydrates, excluding portlandite) was calculated according to Eq. (2), where \(w_{0}\) is the total mass of the dried paste and crucible, \(w_{{\text{h}}}\) is the total mass of the 350 °C-dehydrated paste and crucible, and \(w_{{\text{c}}}\) is the mass of the empty crucible.

$${\text{H}}_{{2}} {\text{O}}_{{\text{bound, dried}}} \left( {\frac{g}{{100{\text{g of dried paste}}}}} \right) = \frac{{w_{0} - w_{{\text{h}}} }}{{w_{0} - w_{{\text{c}}} }} \times 100$$
(2)
Table 2 Mass proportions of the R3 test model paste. [10, 12]
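
The bound water calculation of Eq. (2) can be sketched in Python as follows. The function name and the example masses are hypothetical; only the formula itself follows the text.

```python
def bound_water(w0: float, wh: float, wc: float) -> float:
    """Bound water in g per 100 g of dried paste, following Eq. (2).

    w0: mass of dried paste plus crucible (g)
    wh: mass of 350 °C-dehydrated paste plus crucible (g)
    wc: mass of the empty crucible (g)
    """
    dried_paste = w0 - wc
    if dried_paste <= 0:
        raise ValueError("w0 must exceed the crucible mass wc")
    return (w0 - wh) / dried_paste * 100.0


# Hypothetical weighing results: 10 g of dried paste losing 0.8 g at 350 °C.
print(round(bound_water(w0=40.0, wh=39.2, wc=30.0), 2))
```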

The R3 heat release test was carried out by isothermal calorimetry at 40 °C for 7 days. The cumulative heat release per gram of SCM (\({H}_{SCM}\)) was calculated using Eq. (3), where \(H\) is the cumulative heat from 75 min after mixing until 7 days, \(m_{{\text{p}}}\) is the mass of the paste in the calorimeter vial, and 0.101 is the mass fraction of the SCM in the paste specimen.

$$H_{{{\text{SCM}}}} \left( {\frac{J}{{\text{g of SCM}}}} \right) = \frac{H}{{\left( {m_{{\text{p}}} \times 0.101} \right)}}$$
(3)
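
Equation (3) amounts to a normalisation of the measured cumulative heat by the SCM mass in the vial. A minimal sketch is shown below; the function name and the example values are hypothetical, while the factor 0.101 is the SCM mass fraction stated above.

```python
SCM_MASS_FRACTION = 0.101  # mass fraction of SCM in the R3 paste, as used in Eq. (3)


def heat_per_gram_scm(cumulative_heat_j: float, paste_mass_g: float,
                      scm_fraction: float = SCM_MASS_FRACTION) -> float:
    """Cumulative heat in J per g of SCM, following Eq. (3).

    cumulative_heat_j: heat H released from 75 min after mixing until 7 days (J)
    paste_mass_g: mass of paste in the calorimeter vial, m_p (g)
    """
    if paste_mass_g <= 0:
        raise ValueError("paste mass must be positive")
    return cumulative_heat_j / (paste_mass_g * scm_fraction)


# Hypothetical measurement: 606 J released by 15 g of paste.
print(round(heat_per_gram_scm(606.0, 15.0), 1))
```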

R3 bound water and isothermal calorimetry tests were carried out in duplicate by 8 and 10 different participants, respectively.

Full details of the testing protocol can be found in ASTM C1897 [10].

2.4 Regression analysis

The objective of the regression analysis was to develop regression functions that predict the average expected relative strength of each SCM as a function of the cumulative heat release \(\left( {H_{7} } \right)\) or bound water \(\left( B \right)\). The relative 28 days compressive strength data, along with the cumulative heat release and bound water results, were used to develop two separate sets of linear regression models (one for each test method). These functions were then used to develop charts that determine the reactivity class of an SCM from its R3 test result. To this end, the SCMs were first classified into three groups based on their type, as shown in Table 3. The rationale for this classification was that the linear regression trends for fly ashes, natural pozzolans and slags (Group 1) differed from those for calcined clays (Group 2). Silica fumes were considered separately (Group 3) because of the lower level of cement replacement used in the mortars (10 wt.%). The materials classed as “other” were omitted from this analysis.

Table 3 The grouping system of the SCMs based on their type

A linear regression equation was developed relating the relative strength \(\left( R \right)\) to \(H_{7}\) or \(B\) for each of the three groups of SCMs (six possible combinations). The models were obtained using the Regression Model tool of Minitab 19 software. Details of constructing the model structures, the fitting process, and the evaluation of the models are provided in Appendix 1.

Once developed, the regression equations were used to categorize the SCMs into three reactivity levels: non-reactive (NR), moderately reactive (MR) and highly reactive (HR). This was done by calculating, for each group of SCMs, the probability that the relative 28 days compressive strength falls within specific ranges. SCMs with relative strength values equal to or less than −35% (the relative strength value associated with inert quartz powder) were considered to be non-reactive. Relative strength values ranging from −35 to 0% were associated with moderately reactive SCMs, and SCMs with relative strength values greater than zero were categorized as highly reactive. For each group of SCMs (Groups 1–3), probability plots were generated that estimate the chance of an SCM belonging to a reactivity level for a given test result (H7 or B).
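
To make the classification step concrete, the following Python sketch shows one way such probabilities could be computed. It assumes, for illustration only, that the relative strength at a given test result is normally distributed around the regression line with a constant standard deviation; the exact formulation used in this study is given in the appendix and may differ. The function names and the regression coefficients in the example are hypothetical and are not those of the fitted models.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def reactivity_probabilities(x: float, intercept: float, slope: float,
                             sigma: float, nr_limit: float = -35.0,
                             hr_limit: float = 0.0) -> dict:
    """Probability of each reactivity level for an R3 test result x.

    Assumption (not from the paper): relative strength ~ Normal(mu, sigma)
    with mu = intercept + slope * x. Uncertainty in the fitted coefficients
    is ignored in this simplified sketch.
    """
    mu = intercept + slope * x
    p_nr = normal_cdf((nr_limit - mu) / sigma)        # strength <= -35 %
    p_hr = 1.0 - normal_cdf((hr_limit - mu) / sigma)  # strength > 0 %
    p_mr = 1.0 - p_nr - p_hr                          # in between
    return {"NR": p_nr, "MR": p_mr, "HR": p_hr}


# Hypothetical coefficients for a heat release model (x in J/g SCM).
print(reactivity_probabilities(x=300.0, intercept=-50.0, slope=0.1, sigma=8.0))
```

Evaluating the function over a range of test results yields curves of the kind used in the classification charts.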

3 Results

3.1 Compressive strength benchmarking

Figure 3 shows the relative compressive strength results at 2, 7, 28 and 90 days of curing. In most cases the difference between tests from the different laboratories was low and trends were similar over time or when comparing different SCMs. In general, the differences between the laboratory results decreased with age (Table 4). The somewhat larger differences at early age can be related to the different strength class designation (N vs. R) of the Portland cements used.

Fig. 3 Relative compressive strength at 2, 7, 28 and 90 days per group of SCM-blended cements. Dashed lines indicate the relative compressive strength of quartz-blended mortars

Table 4 The correlation coefficients between the relative compressive strength results reported by the laboratories at different ages

3.2 R3 test results

Figure 4 shows the average R3 test results and their standard deviations for all SCMs. Overall, the R3 test results showed very good reproducibility between the participants, as well as a wide spread of averaged result values, which is beneficial for statistical analysis. The average coefficient of variation of the cumulative heat release results reported by the participants is 4.8%, while that of the bound water results is 19.4%. The larger interlaboratory variation observed for the bound water could be due to the sensitivity of the bound water result to how the protocol is followed, for example how well the sample is crushed and how much it is dried before going to the oven, as reported by Avet et al. [12]. Nonetheless, an analysis of variance (ANOVA) suggests that the variations between the participants are not statistically significant for either the cumulative heat release or the bound water (p-values = 0.909 and 0.435, respectively). Furthermore, the differences in results between participants were found mainly for the non-conventional materials.
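
For readers wishing to reproduce this type of reproducibility metric, a minimal Python sketch of an average interlaboratory coefficient of variation is given below. It assumes that the coefficient of variation is computed per material from the results of the different laboratories and then averaged over all materials; the exact procedure used in this study is not detailed here. The material labels and values are hypothetical.

```python
import statistics


def coefficient_of_variation(values) -> float:
    """Sample standard deviation divided by the mean, in percent."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


def average_cov(results_by_material: dict) -> float:
    """Mean coefficient of variation over materials with at least two results."""
    covs = [coefficient_of_variation(v)
            for v in results_by_material.values() if len(v) >= 2]
    return statistics.mean(covs)


# Hypothetical heat release results (J/g SCM) from different laboratories.
heat_results = {
    "material_A": [390.0, 400.0, 410.0],
    "material_B": [190.0, 210.0],
}
print(round(average_cov(heat_results), 2))
```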

Fig. 4 Average R3 test—Bound water (left) and heat (right) results for different SCMs. Green circle: conventional SCMs, Pink diamond: non-conventional SCMs

Figure 5 shows the range of R3 test results by group of SCMs for conventional samples. The ranges of bound water results for natural pozzolans and fly ashes (independently of their Ca-content) are similar, between 4.5 and 8.3 g/100 g of dried paste, while the conventional slags presented a narrow range between 7.2 and 7.9 g/100 g of dried paste. On the other hand, the calcined clays showed a higher average and a broader range of bound water values, between 6.8 and 14.8 g/100 g of dried paste. This wide range could be associated with the range of metaclay content present in the calcined clays (from 43 to 85 wt.%, see Fig. 1). In the case of the silica fume samples, the range was from 6.2 to 9 g/100 g of dried paste.

Fig. 5 Range of R3 test—Bound water (left) and heat (right) results for conventional SCMs

Regarding the hydration heat, each group of SCMs showed a specific range; from 50 to 300 J/g SCM for pozzolans, 160 to 360 J/g SCM for fly ashes, 350 to 550 J/g SCM for conventional slags, 250 to 960 J/g SCM for calcined clays, and 350 to 630 J/g SCM for silica fumes. As for bound water, the range of R3 results (by hydration heat) for calcined clays was found to be broader than that of other types of SCMs, and the group of conventional slags showed the narrowest range of variations.

Figure 6 shows the correlation between the cumulative heat release and the bound water (R2 = 0.87) for all tested materials. It implies that both techniques can be used interchangeably to predict the reactivity of a material.
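
The coefficient of determination of such a linear correlation can be obtained from an ordinary least-squares fit, as sketched below in Python. This is a generic illustration and not the software workflow used in this study; the function name and the data points are hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = intercept + slope * x and its R2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return slope, intercept, r2


# Hypothetical paired results: bound water (g/100 g) and heat (J/g SCM).
bound = [4.5, 6.0, 7.5, 9.0, 12.0]
heat = [120.0, 260.0, 380.0, 500.0, 820.0]
slope, intercept, r2 = linear_fit(bound, heat)
print(round(slope, 1), round(intercept, 1), round(r2, 3))
```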

Fig. 6 Linear correlation of Heat to Bound water (R3 test) for all tested materials. Green diamond: conventional SCMs, Pink diamond: non-conventional SCMs

3.3 Correlation of R3 test results and relative compressive strength

Figure 7 shows the linear correlations between the R3 test results and relative compressive strength for the different types of SCMs. Note that the SF group was not included in these analyses because of their 10 wt.% cement replacement as opposed to 30 wt.% for the other SCMs. As can be observed, there are two clear trends of results for each R3 test response parameter. One trend includes natural pozzolans, fly ashes and slags; the other trend only comprises calcined clays. In the case of bound water, R2 values of 0.70 and 0.86 were found for the cluster of natural pozzolans, fly ashes and slags, and the cluster of calcined clays, respectively. The lower R2 value in the first cluster could be attributed to the narrower range of response parameter values (bound water or cumulative heat) and the broader scope (i.e. material origin, chemical and phase composition, and physical properties). The cumulative heat release showed R2 values of 0.87 and 0.84 for the cluster of natural pozzolans, fly ashes and slags, and the cluster of calcined clays, respectively. The observed correlations between the R3 test results and the relative compressive strengths across a wide range of conventional SCMs provide support to the use of this method for assessing the chemical reactivity of candidate SCMs. More in-depth statistical analysis is carried out in the next section to demonstrate how the R3 test results can be used for predicting the strength performance of different types of materials and hence for deriving R3 test threshold values for chemical reactivity classes.

Fig. 7 Linear correlation of R3 test—Bound water (left) and heat (right) to relative compressive strength at 28 days for conventional SCMs

3.4 Regression analysis results

Table 5 shows the regression equations for the three groups of SCMs as a function of cumulative heat release and bound water. The standard deviation and the coefficient of determination of each of the two models are also included in the table. It is observed that in both models only the linear terms of the continuous variables are significant. Both models appear to have a relatively high coefficient of determination, which suggests the effectiveness of the defined structure for the regression models and the SCM grouping system in predicting the relative strength. An assessment of the regression models is provided in Appendix 2.

Table 5 The list of regression equations as a function of cumulative heat release and bound water for different groups of SCM

Using the regression formulas and Eqs. (6–8), classification charts estimating the probability of an SCM belonging to each of the three levels of reactivity were calculated and are shown in Fig. 8 for each predefined group. The lines are solid within the range of the experimental data and dashed where extrapolated outside the range of observed data. The dashed lines are thus associated with larger uncertainty. Comparing the classification charts across the three SCM groups, it can be clearly observed that the probability distributions are shifted to higher cumulative heat/bound water values for Group 2, i.e. for calcined clays, than for the other groups. This reflects the different trend line for calcined clays in the correlation plots of heat release (H7) and bound water (B) against relative strength in Fig. 7.

Fig. 8 Probability curves for the bound water (a–c) and hydration heat at 7 days (d–f) for Group 1 (slags, fly ashes and natural pozzolans): (a and d), Group 2 (calcined clays): (b and e), and Group 3 (silica fume): (c and f)

4 Discussion

The derived probability curves help to predict whether a candidate SCM is chemically reactive and contributes to strength development based on the R3 test results. Most current standards (e.g. EN 196–2 or EN 196–5) aim only to identify if materials are chemically reactive or not. If this is the need, a single minimum threshold value could be proposed to distinguish between non-reactive fillers and reactive SCMs based on a comparison of the test result to the inert quartz reference. For instance, for Group 1 SCMs, based on the statistical analysis of the dataset presented in this paper, threshold values can be estimated and are presented in Table 6 for confidence levels of 66 and 90%. Raising the desired confidence level comes with a drawback of increasing the likelihood of generating “false negatives”, in this case, candidate SCMs that do not reach the specified threshold in the R3 test but do contribute to strength development to some extent. In addition, it is important that conventional SCMs should not be excluded or wrongly classified because of threshold values that are too stringent. For instance, selecting a 90% confidence level would pre-emptively exclude several slowly reacting natural pozzolans. In this example, it would be preferable to select the 66% confidence level threshold values.

Table 6 Example of the derivation of threshold values for R3 heat at 7 days (H7) and R3 Bound water (B) for Group 1 Materials (slags, fly ashes, natural pozzolans) estimated for confidence levels of 66 and 90%

The proposed threshold values correspond roughly to a compressive strength of −30% relative to the Portland cement reference, which equals the proportion of neat Portland cement replaced by the SCM. This situation is very similar to that of standardised activity index testing. As per EN 450–1, a mix of 75% limestone-free Portland cement (CEM I) with SCM should achieve 75% of the strength of the corresponding neat Portland cement reference by 28 days, and 85% by 90 days. The classification into reactive and non-reactive materials based on the R3 test should therefore roughly correspond to that by the activity index testing, but the result is obtained in 7 days, not 28 days or later.

From the cement manufacturing perspective, it is crucial to identify the most appropriate material for the given purpose and thus save natural resources. The classification into non-reactive and reactive materials could greatly help in this task. The non-reactive materials could, for example, be considered as a replacement for limestone in areas of the world where this material is scarce or expensive, allowing limestone to be used where it is really needed for its synergistic effect with calcined clays, slags and fly ashes [13,14,15].

An additional advantage of the R3 test is the indication of how chemically reactive a material is. This is why the statistical analysis included a category that indicates “highly reactive” SCMs that can reach or surpass the reference cement strength by 28 days. Based on the probability curves and desired confidence levels, minimum threshold values can be estimated. The tested materials can also be assessed by comparison to currently used SCMs to find the most promising alternatives. Instead of classifying materials against discrete threshold values, Fig. 5 can be used to compare a material to the known value ranges for natural pozzolans, fly ashes, slags, silica fumes and calcined clays. These ranges can be used as reference in the assessment. Mixes of various SCMs have not been assessed in this study, but the reactivity test results of such mixes are expected to correspond with the mix proportion-weighted average of the individual SCM test responses. This way the R3 test could as well be applied directly to a fly ash-slag mix to check its performance in a blended ternary or quaternary cement, for example.
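
The expectation stated above for SCM mixes can be written as a simple mass-weighted average, sketched below in Python. This relationship has not been verified in this study, so the sketch only illustrates the assumed calculation; the function name and the example values are hypothetical.

```python
def blended_response(components) -> float:
    """Mass-weighted average R3 response of a mix of SCMs.

    components: iterable of (mass_fraction, test_response) pairs. The mass
    fractions are normalised, so they need not sum to exactly 1.
    """
    components = list(components)
    total = sum(fraction for fraction, _ in components)
    if total <= 0:
        raise ValueError("mass fractions must sum to a positive value")
    return sum(fraction * response for fraction, response in components) / total


# Hypothetical mix: 60 wt.% slag (450 J/g SCM) and 40 wt.% fly ash (250 J/g SCM).
print(round(blended_response([(0.6, 450.0), (0.4, 250.0)]), 1))
```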

While the models developed present a satisfactory level of precision in predicting the strength contribution of the conventional SCMs studied in this research, caution should be exercised when interpreting the results for non-conventional materials such as biomass fly ashes, or for other materials not included in this study. For non-conventional materials, separate in-depth studies are required to establish whether the R3 test is capable of properly measuring reactivity and predicting contribution to strength development. In addition, the R3 test and the statistical models derived from it are at present most suited for the estimation of strength activity up to the age of 28 days, and the correlation of the R3 results with later age activity of certain SCMs is somewhat lower [9]. Slowly reacting SCMs typically present a gradually increasing test response that can continue beyond 7 days. Inclusion of a kinetic parameter could provide a way to distinguish slowly reacting SCMs from non-reactive materials, in case of doubt. Such a kinetic parameter could consider the slope of the test response over time, or, more simply, compare the test response at 1 day and 7 days of curing.
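
As a purely illustrative sketch of the kinetic parameters suggested above, the following Python function computes the ratio of the 7 days to the 1 day test response and the average slope between these two ages. The function name and the example values are hypothetical, and no threshold for either indicator is proposed here.

```python
def kinetic_indicators(response_1d: float, response_7d: float):
    """Ratio and average daily slope of the test response between 1 and 7 days."""
    if response_1d <= 0:
        raise ValueError("the 1 day response must be positive")
    ratio = response_7d / response_1d
    slope_per_day = (response_7d - response_1d) / 6.0  # six days between the two ages
    return ratio, slope_per_day


# Hypothetical cumulative heat values in J/g SCM at 1 day and 7 days.
ratio, slope = kinetic_indicators(50.0, 200.0)
print(ratio, slope)
```

A response that is still rising clearly between the two ages would point to a slowly reacting material rather than an inert one.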

As a final note, it is worth mentioning that the R3 test is intended to provide an initial estimation of the chemical reactivity and strength performance of SCMs. At present it should not be directly used or otherwise interpreted as a means to predict other performance aspects of SCMs, such as their influence on the workability or the durability of cements produced with such SCMs.

5 Conclusion

This paper presents the results of the R3 reactivity test on a wide range of materials, which includes both standardised SCMs and non-conventional materials falling outside the scope of current standards or specifications. An excellent correlation between the 7 days heat release and the bound water (R2 = 0.87) was identified, indicating the interchangeability between both techniques to predict the chemical reactivity of a material. The R3 test results showed a high correlation (R2 ≥ 0.7) to mortar compressive strength, making it a rapid, relevant and reliable approach to classify different types of SCMs in terms of chemical reactivity while relating to their potential contribution to the strength development of blended cements. Comparison of the R3 test results to mortar compressive strength development showed that all conventional SCMs followed the same correlation trend, with the notable exception of very reactive calcined kaolinitic clays. In addition, a statistical analysis of the R3 test results was carried out to propose threshold values to classify the reactivity of candidate SCMs.