Report of RILEM TC 267-TRM phase 3: validation of the R3 reactivity test across a wide range of materials

RILEM TC 267-TRM, “Tests for Reactivity of Supplementary Cementitious Materials”, recommends the rapid, relevant and reliable (R3) test as a method for determining the chemical reactivity of supplementary cementitious materials (SCMs) in Portland cement blends. In this paper, the R3 test was applied to 52 materials covering a wide range of conventional and alternative SCMs in order to validate the test. An excellent correlation was found between the cumulative heat release and the bound water determined following the R3 test method. Comparison of the R3 test results to mortar compressive strength development showed that all conventional SCMs (e.g. blast furnace slag and fly ashes) followed the same trend, with the notable exception of very reactive calcined kaolinitic clays. An in-depth statistical regression analysis of the R3 test results and the 28 days relative compressive strengths is used to discuss how threshold values for classifying the chemical reactivity of SCMs could be proposed based on the R3 test results.

reason, an accurate assessment of the reactivity of SCMs is critical. In 2017, RILEM TC 267-TRM (Tests for Reactivity of Supplementary Cementitious Materials) was created to evaluate and develop reactivity test methods for SCMs. In the first phase of its work, 21 participants evaluated the robustness of existing tests used to qualify SCMs based on chemical reactivity, and their correlation to strength development. Eleven “conventional” SCMs covered by current standards for cement and concrete were included [9]. None of the known standardised tests showed an acceptable correlation with the 28 days compressive strength (R² > 0.85). In contrast, the R3 (rapid, relevant and reliable) test, which uses the cumulative 7 day heat release or bound water of an SCM-lime-alkali-sulfate/carbonate mixture reacted at 40°C to measure the reactivity of the SCMs, yielded promising correlation and robustness (R² of 0.86 and 0.94 for bound water and cumulative heat release, respectively).
The objective of the second phase of the work by this technical committee was to improve the correlation between the R3 results and compressive strength, and to identify factors influencing the reproducibility between participating laboratories, with the aim of optimising the test protocol. The final R3 testing protocol was consolidated in a standardised test method, ASTM C1897 [10], as concisely presented in the Methods section of this paper.
The aim of the present paper is to report the outcomes of the third phase of the activities of this technical committee, which was to apply the final standardised protocol to a wide range of materials, including both commonly used SCMs and emerging materials that are being considered for use as SCMs in the near future. This paper presents the results for 52 materials tested with both R3 protocols (heat release and bound water) by 13 laboratories, and discusses threshold levels for SCM reactivity based on the dataset presented.

Materials
The materials used for the phase 3 study were two different Portland cements (PC) of type CEM I 42.5 N/R and a wide variety of test materials. These materials were collected through an open call offering third parties the opportunity to have their materials tested by the RILEM TC 267-TRM. For reasons of confidentiality and impartiality, the materials were pseudonymised during further distribution, testing and characterisation. For the same reasons no individually traceable results are discussed in this paper.
In total 52 materials were examined. These were assigned to 6 groups plus quartz as an inert reference, as described below:
1. FA: 10 fly ashes. 5 siliceous and 3 calcareous coal combustion ashes, and 2 biomass ashes.
2. SL: 13 slags. 6 were ground granulated blast furnace slags (GGBFS) from pig iron production; the other 7 were generated from other metallurgical processes. This variation is reflected in the relatively broad composition ranges in Table 1.
3. CY: 10 calcined clays. All clays (before calcination) were kaolinitic, but varied in purity and other constituent minerals. All materials in this group were received already calcined.
4. PZ: 10 natural pozzolans. 6 natural pozzolans meeting SCM specifications for use in cement and/or concrete (i.e. EN 197-1 or ASTM C618), and 4 other materials of natural origin.
5. SF: 5 silica fumes. 4 met the requirements for use as a cement and concrete constituent; one had a lower purity and fineness.
6. Others: 3 materials not falling within the abovementioned groups.
The ranges for the chemical (oxide) composition and the amorphous/crystalline contents of the SCMs are provided in Table 1 and Fig. 1, respectively. The X-ray powder diffraction (XRD) Rietveld refinement results show significant variations in the crystallinity of the different groups of SCMs. The reported ranges are wider than usually expected for standardised SCMs because of the inclusion of non-conventional materials. These non-conventional materials were included to assess the scope of the R3 test and to identify material components that can generate false positives, i.e. that yield a test response indicating chemical reactivity yet do not contribute to strength development. Suppliers were requested to deliver materials ready for blending with Portland cement, i.e. pre-milled to the fineness used in practice. Figure 2 shows the particle size distributions of the SCMs. These were measured by laser diffraction on ultrasonicated suspensions, using isopropanol as solvent for FA and SL samples, water with 3 wt.% of PCE superplasticiser for SF materials, and a sodium carbonate aqueous solution (pH > 10) for PZ, CY and Other materials [11]. The measurements were made using a Malvern MasterSizer S particle size analyser operated in Fraunhofer mode; refractive indices and absorbances were adjusted to the solvent and material appearance (colour).
Fig. 2 Particle size (PS) distribution per group of materials determined by laser diffraction. D_Vx designates the particle size value corresponding to the Xth percentile in the cumulative volumetric particle size distribution

Mortar compressive strength test
A cement replacement of 30 wt.% was used for the preparation of the mortars for all types of materials, with the exception of the silica fumes, for which a substitution of 10 wt.% was used. Mortar prisms were prepared with EN 196-1 standard sand, cured and tested according to EN 196-1 by the two laboratories carrying out the mortar strength tests, each using a local CEM I 42.5 N/R Portland cement. Compressive strength measurements were carried out at 2, 7, 28 and 90 days. In addition, a sulfate adjustment was applied for the calcined clays following the procedure described in [9]. A PCE superplasticiser was used to control the workability of the mortars with calcined clay and silica fume. Relative compressive strengths (compared to the PC control mortar) were used to reduce the variations in absolute strength due to the use of different cements. These relative compressive strengths were used as a benchmark for the assessment and interpretation of the R3 test results. The relative compressive strength R_SCM,relative was calculated according to Eq. (1) using the absolute compressive strength of the blended mortar containing the SCM (denoted as R_SCM) and that of the reference mortar (denoted as R_PC). Afterwards, these results were averaged over both cements (CEM I 42.5 N/R) for each SCM. As a reference for inert material, the relative strength of quartz-containing blended cements was also measured.
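Eq. (1) is not reproduced in this text. The sketch below assumes that the relative strength is the percentage difference with respect to the PC reference, which would be consistent with the negative values (e.g. -35% for quartz) and the 0% level quoted later for the reactivity classes. The function names and the strength values are hypothetical and serve only as an illustration.

```python
def relative_strength(r_scm, r_pc):
    """Relative compressive strength in % of the PC reference (assumed form of Eq. 1)."""
    if r_pc <= 0:
        raise ValueError("reference strength must be positive")
    return (r_scm - r_pc) / r_pc * 100.0


def averaged_relative_strength(pairs):
    """Average the relative strength over (r_scm, r_pc) pairs, one pair per cement."""
    values = [relative_strength(r_scm, r_pc) for r_scm, r_pc in pairs]
    return sum(values) / len(values)


# Hypothetical 28 days strengths in MPa (not data from the paper):
# one (blended mortar, reference mortar) pair for each of the two cements.
example = averaged_relative_strength([(39.0, 52.0), (36.0, 48.0)])
print(f"{example:.1f} %")
```

Normalising each blended mortar against the reference made with the same cement, before averaging, is what removes most of the cement-to-cement difference in absolute strength.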

R3 test
The R3 tests were carried out on paste samples incorporating SCMs following the mix design of the model paste shown in Table 2 [10, 12]. The pastes were tested for their bound water using a muffle furnace and for their heat release at 7 days by isothermal calorimetry. Thirteen laboratories contributed to the R3 testing of the 52 materials. To distribute the workload, each material was sent to at least two different laboratories for testing. For the R3 bound water test, the sampled pastes were cured in sealed plastic containers at 40°C for 7 days. After this time, the samples were crushed, sieved on a 2 mm sieve and dried in an oven at 40°C for 24 h. The dried samples were dehydrated at 350°C for 2 h and cooled in a desiccator for 1 h. The bound water (for hydrates, excluding portlandite) was calculated according to Eq. (2), where w_0 is the total mass of the dried paste and crucible, w_h is the total mass of the 350°C-dehydrated paste and crucible, and w_c is the mass of the empty crucible.
The R3 heat release test was carried out by isothermal calorimetry at 40°C for 7 days. The cumulative heat release per gram of SCM (H_SCM) was calculated using Eq. (3), where H is the cumulative heat from 75 min after mixing until 7 days, m_p is the mass of the paste in the calorimeter vial, and 0.101 is the mass fraction of the SCM in the paste specimen.
R3 bound water and isothermal calorimetry tests were carried out in duplicate by 8 and 10 different participants, respectively. Full details of the testing protocol can be found in ASTM C1897 [10].
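Eqs. (2) and (3) are not reproduced in this text. The following sketch assumes the forms implied by the symbol definitions above and by the units used for the results (g/100 g of dried paste and J/g SCM). The function names and the example masses are hypothetical, and ASTM C1897 remains the reference for the actual calculation.

```python
SCM_MASS_FRACTION = 0.101  # mass fraction of SCM in the R3 paste, as stated in the text


def bound_water(w0, wh, wc):
    """Bound water in g per 100 g of dried paste (assumed form of Eq. 2).

    w0: mass of dried paste + crucible, wh: mass of 350 C-dehydrated paste + crucible,
    wc: mass of the empty crucible (all in the same mass unit).
    """
    dried_paste = w0 - wc
    if dried_paste <= 0:
        raise ValueError("w0 must exceed the crucible mass wc")
    return (w0 - wh) / dried_paste * 100.0


def heat_per_gram_scm(cumulative_heat_j, paste_mass_g):
    """Cumulative heat in J per g of SCM (assumed form of Eq. 3)."""
    if paste_mass_g <= 0:
        raise ValueError("paste mass must be positive")
    return cumulative_heat_j / (paste_mass_g * SCM_MASS_FRACTION)


# Hypothetical readings, for illustration only.
print(bound_water(w0=30.0, wh=29.2, wc=20.0))
print(heat_per_gram_scm(cumulative_heat_j=454.5, paste_mass_g=15.0))
```

Dividing by the SCM mass rather than the paste mass is what makes the heat results of different materials directly comparable.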

Regression analysis
The objective of the regression analysis was to develop regression functions that can predict the average expected relative strength of each SCM as a function of the cumulative heat release (H_7) or bound water (B). The relative 28 days compressive strength data, along with the cumulative heat release and bound water results, were used to develop two separate sets of linear regression models (one for each test method). These functions were then used to develop charts that determine the reactivity class of an SCM from its R3 test result. To this end, the SCMs were first classified into three groups based on their type, as shown in Table 3. The rationale for this classification was that the linear regression lines followed a different trend for fly ashes, natural pozzolans and slags (Group 1) than for calcined clays (Group 2). Silica fumes were considered separately (Group 3) because of the lower level of cement replacement used in the mortars (10 wt.%). The materials classed as “other” were omitted from this analysis.
A linear regression equation was developed relating the relative strength (R) to H_7 or B for each of the three groups of SCMs (six possible combinations). The models were obtained using the Regression Model tool of Minitab 19 software. Details of the construction of the model structures, the fitting process and the evaluation of the models are provided in Appendix 1.
After developing the regression equations, they were used to categorise the SCMs into three reactivity levels: non-reactive (NR), moderately reactive (MR) and highly reactive (HR). This was done by calculating the probability that the relative 28 days compressive strength of an SCM from a given group falls within a specific range. SCMs with relative strength values equal to or less than -35% (the relative strength value associated with inert quartz powder) were considered to be non-reactive. Relative strength values ranging from -35 to 0% were associated with moderately reactive SCMs, and SCMs with relative strength values greater than zero were categorised as highly reactive. For each group of SCMs (Groups 1-3), probability plots were generated that estimate the chance of an SCM belonging to a reactivity level for a given test result (H_7 or B).
Figure 3 shows the relative compressive strength results at 2, 7, 28 and 90 days of curing. In most cases the difference between the results from the different laboratories was small, and trends were similar over time and across SCMs. In general, the differences between the laboratory results decreased with age (Table 4). The somewhat larger differences at early age can be related to the different strength class designations (N vs. R) of the Portland cements used.

R3 test results
Figure 4 shows the average R3 test results and their standard deviations for all SCMs. Overall, the R3 test results showed very good reproducibility between the participants, as well as a wide spread of averaged values, which is beneficial for statistical analysis. The average coefficient of variation of the cumulative heat release results reported by the participants was 4.8%, while that of the bound water was 19.4%. The larger interlaboratory variation observed for the bound water could be due to the sensitivity of the bound water result to how the protocol is followed, for example how well the sample is crushed and how much it is dried before going to the oven, as reported by Avet et al. [12]. Nonetheless, an analysis of variance (ANOVA) suggests that the variations between the participants are not statistically significant for either the cumulative heat release or the bound water (p-values = 0.909 and 0.435, respectively). Furthermore, the differences in results between participants were found mainly for the non-conventional materials.
Figure 5 shows the range of R3 test results by group of SCMs for conventional samples. The bound water results for natural pozzolans and fly ashes (independently of their Ca content) span a similar range, between 4.5 and 8.3 g/100 g of dried paste, while the conventional slags presented a narrow range between 7.2 and 7.9 g/100 g of dried paste. On the other hand, the calcined clays showed a higher average and a broader range of bound water values, between 6.8 and 14.8 g/100 g of dried paste. This wide range could be associated with the range of metaclay content present in the calcined clays (from 43 to 85 wt.%, see Fig. 1). In the case of the silica fume samples, the range was from 6.2 to 9 g/100 g of dried paste.
Regarding the hydration heat, each group of SCMs showed a specific range: from 50 to 300 J/g SCM for pozzolans, 160 to 360 J/g SCM for fly ashes, 350 to 550 J/g SCM for conventional slags, 250 to 960 J/g SCM for calcined clays, and 350 to 630 J/g SCM for silica fumes. As for bound water, the range of R3 results (by hydration heat) for calcined clays was broader than that of the other types of SCMs, and the group of conventional slags showed the narrowest range. Figure 6 shows the correlation between the cumulative heat release and the bound water (R² = 0.87) for all tested materials. This implies that both techniques can be used interchangeably to predict the reactivity of a material.

Regression analysis results
Table 5 shows the regression equations for the three groups of SCMs as a function of cumulative heat release and bound water. The standard deviation and the coefficient of determination of each of the two models are also included in the table. In both models only the linear terms of the continuous variables are significant. Both models have a relatively high coefficient of determination, which suggests that the defined structure of the regression models and the SCM grouping system are effective in predicting the relative strength. An assessment of the regression models is provided in Appendix 2.
Using the regression formulas and Eqs. (6-8), classification charts estimating the probability of an SCM belonging to each of the three levels of reactivity were calculated and are shown in Fig. 8 for each predefined group. The lines are solid within the range of the experimental data and dashed where they extrapolate outside that range.
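Eqs. (6-8) are not reproduced in this text. As a minimal sketch of how such classification probabilities could be computed, the function below assumes normally distributed prediction errors with standard error s around the predicted relative strength, and uses the class limits stated in the Regression analysis section (-35% and 0%). The function name, parameter names and example values are hypothetical and are not taken from Table 5.

```python
from statistics import NormalDist


def class_probabilities(r_pred, s, nr_limit=-35.0, hr_limit=0.0):
    """Probabilities of the NR, MR and HR classes for one predicted relative strength.

    r_pred: predicted relative 28 days strength in %, s: standard error of the model in %.
    Assumes the true relative strength is normally distributed around r_pred.
    """
    if s <= 0:
        raise ValueError("standard error must be positive")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    p_below_nr = std_normal.cdf((nr_limit - r_pred) / s)
    p_below_hr = std_normal.cdf((hr_limit - r_pred) / s)
    return {
        "NR": p_below_nr,               # relative strength at or below -35 %
        "MR": p_below_hr - p_below_nr,  # between -35 % and 0 %
        "HR": 1.0 - p_below_hr,         # above 0 %
    }


# Hypothetical prediction: -20 % relative strength with a standard error of 6 %.
for level, p in class_probabilities(-20.0, 6.0).items():
    print(level, round(p, 3))
```

Evaluating the function over a range of H_7 or B values, with the predicted strength taken from the regression line of the relevant group, would give one curve per reactivity level, which is the general shape of a classification chart.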

Discussion
The derived probability curves help to predict, based on the R3 test results, whether a candidate SCM is chemically reactive and contributes to strength development. Most current standards (e.g. EN 196-2 or EN 196-5) aim only to identify whether materials are chemically reactive or not. If this is the need, a single minimum threshold value could be proposed to distinguish between non-reactive fillers and reactive SCMs based on a comparison of the test result to the inert quartz reference. For instance, for Group 1 SCMs, threshold values can be estimated from the statistical analysis of the dataset presented in this paper; they are given in Table 6 for confidence levels of 66 and 90%. Raising the desired confidence level has the drawback of increasing the likelihood of generating “false negatives”, in this case candidate SCMs that do not reach the specified threshold in the R3 test but do contribute to strength development to some extent. In addition, it is important that conventional SCMs are not excluded or wrongly classified because of threshold values that are too stringent. For instance, selecting a 90% confidence level would pre-emptively exclude several slowly reacting natural pozzolans. In this example, it would be preferable to select the 66% confidence level threshold values.
The proposed threshold values correspond roughly to -30% compressive strength relative to the Portland cement reference, which is also the proportion of neat Portland cement that has been replaced by an SCM, a situation very similar to standardised activity index testing. As per EN 450-1, a mix of 75% limestone-free Portland cement (CEM I) with SCM should achieve 75% of the strength of the corresponding neat Portland cement reference by 28 days, and 85% by 90 days. The classification into reactive and non-reactive materials based on the R3 test should therefore roughly correspond to that by activity index testing, but the result is obtained in 7 days, not 28 days or later.
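One way such a threshold could be derived from a linear regression line is sketched below. It assumes normally distributed prediction errors and a positive slope, and it solves for the test result at which the probability of exceeding the -35% quartz level equals the chosen confidence level. This is an illustration under those assumptions, not necessarily the procedure used for Table 6; the coefficients are hypothetical and are not the values of Table 5.

```python
from statistics import NormalDist


def reactive_threshold(intercept, slope, s, confidence, nr_limit=-35.0):
    """Smallest test result for which P(relative strength > nr_limit) >= confidence.

    Assumes R_pred = intercept + slope * x with slope > 0 and normal errors of
    standard error s. All coefficient values passed in are user-supplied.
    """
    if slope <= 0:
        raise ValueError("this sketch assumes a positive slope")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf(confidence)  # standard normal quantile
    r_required = nr_limit + z * s         # predicted strength needed at this confidence
    return (r_required - intercept) / slope


# Hypothetical regression line for a heat-release model (illustration only).
for level in (0.66, 0.90):
    print(level, round(reactive_threshold(-40.0, 0.1, 5.0, level), 1))
```

The sketch makes the trade-off discussed above visible: a higher confidence level shifts the threshold upwards, so more slowly reacting materials fall below it.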
From the cement manufacturing perspective, it is crucial to identify the most appropriate material for a given purpose and thus save natural resources. The classification into non-reactive and reactive materials could greatly help in this task. Non-reactive materials could, for example, be considered as a replacement for limestone in areas of the world where this material is scarce or expensive, allowing the limestone to be used where it is really needed for its synergistic effect with calcined clays, slags and fly ashes [13-15].
An additional advantage of the R3 test is that it indicates how chemically reactive a material is. This is why the statistical analysis included a category of “highly reactive” SCMs that can reach or surpass the reference cement strength by 28 days. Based on the probability curves and desired confidence levels, minimum threshold values can be estimated. The tested materials can also be assessed by comparison to currently used SCMs to find the most promising alternatives. Instead of classifying materials against discrete threshold values, Fig. 5 can be used to compare a material to the known value ranges for natural pozzolans, fly ashes, slags, silica fumes and calcined clays. These ranges can be used as a reference in the assessment. Mixes of various SCMs have not been assessed in this study, but the reactivity test results of such mixes are expected to correspond to the mix proportion-weighted average of the individual SCM test responses. In this way the R3 test could also be applied directly to, for example, a fly ash-slag mix to check its performance in a blended ternary or quaternary cement. While the models developed predict the strength contribution of the conventional SCMs studied in this research with satisfactory precision, caution should be exercised when interpreting the results for non-conventional materials such as biomass fly ashes, or for other materials not included in this study. For non-conventional materials, separate in-depth studies are required to establish whether the R3 test is capable of properly measuring reactivity and predicting the contribution to strength development. In addition, the R3 test and the statistical models derived from it are at present most suited to estimating strength activity up to the age of 28 days; the correlation of the R3 results with the later-age activity of certain SCMs is somewhat lower [9].
Slowly reacting SCMs typically present a gradually increasing test response that can continue beyond 7 days. Including a kinetic parameter could provide a way to distinguish slowly reacting SCMs from non-reactive materials in case of doubt. Such a kinetic parameter could consider the slope of the test response over time or, more simply, compare the test response at 1 day and 7 days of curing.
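The two ideas above, the proportion-weighted estimate for SCM mixes and a simple 1 day to 7 day comparison, can be sketched as follows. Both functions are hypothetical illustrations: the study did not test mixes, no kinetic parameter is defined in the R3 protocol, and the numbers are invented.

```python
def expected_mix_response(responses, mass_fractions):
    """Proportion-weighted average of individual R3 responses (heat or bound water)."""
    if len(responses) != len(mass_fractions) or not responses:
        raise ValueError("need one mass fraction per response")
    total = sum(mass_fractions)
    if total <= 0:
        raise ValueError("mass fractions must sum to a positive value")
    return sum(r * f for r, f in zip(responses, mass_fractions)) / total


def early_to_final_ratio(response_1d, response_7d):
    """Share of the 7 day response already reached after 1 day (hypothetical indicator)."""
    if response_7d <= 0:
        raise ValueError("7 day response must be positive")
    return response_1d / response_7d


# Hypothetical heat releases in J/g SCM for a 60/40 fly ash-slag mix.
print(expected_mix_response([250.0, 450.0], [0.6, 0.4]))
# Hypothetical slowly reacting material: low ratio, i.e. the response is still rising.
print(early_to_final_ratio(40.0, 200.0))
```

A low ratio combined with a 7 day response close to a threshold would suggest a material that is still reacting, which is the situation the kinetic parameter is meant to flag.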
As a final note, it is worth mentioning that the R3 test is intended to provide an initial estimate of the chemical reactivity and strength performance of SCMs. At present it should not be directly used or otherwise interpreted as a means to predict other performance aspects of SCMs, such as their influence on the workability or the durability of cements produced with such SCMs.

Conclusion
This paper presents the results of the R3 reactivity test on a wide range of materials, including both standardised SCMs and non-conventional materials falling outside the scope of current standards or specifications. An excellent correlation between the 7 days heat release and the bound water (R² = 0.9) was identified, indicating that the two techniques can be used interchangeably to predict the chemical reactivity of a material. The R3 test results showed a high correlation (R² ≥ 0.7) with mortar compressive strength, making the test a rapid, reliable and relevant approach to classifying different types of SCMs in terms of chemical reactivity while relating to their potential contribution to the strength development of blended cements. Comparison of the R3 test results to mortar compressive strength development showed that all conventional SCMs followed the same correlation trend, with the notable exception of very reactive calcined kaolinitic clays. In addition, a statistical analysis of the R3 test results was carried out to propose threshold values for classifying the reactivity of candidate SCMs.

Declarations
Conflict of interest The authors declare that they have no conflict of interest.
Consent for publication The contents of this paper reflect the views of the authors, who are responsible for the validity and accuracy of presented data, and do not necessarily reflect the views of their affiliated organisations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Appendix 1

Development of the regression models
A regression equation with the general structure shown in Eq. (4) was developed relating the relative strength (R) to H_7 or B for each of the three groups of SCMs (six possible combinations). In Eq. (4), i ∈ {1, 2} is an index identifying the continuous variable used to formulate the regression model (i.e., X_1 = H_7 and X_2 = B), and j ∈ {1, 2, 3} is an index indicating the level of the categorical variable (SCM group number). R(i, j) is the observed relative strength and c_ij is the constant term of each regression equation. a_ij is the coefficient of the continuous variable X_i, which differs between SCM groups if the interaction between the continuous variable and the SCM group is deemed significant in the regression analysis. a_ii is the coefficient of the quadratic term of the continuous variable, which is included in the regression functions if deemed significant, and ε is the residual of the regression, i.e. the difference between the observed and the predicted relative strength. For independent and identically distributed (i.i.d.) relative strength observations, it can be shown that the residual term is a random variable having a normal distribution with a mean of zero and a variance of σ², which is also the variance of the dependent variable R(i, j). Regression seeks the values of the unknown parameters (i.e., the constant term and the coefficients) that minimise the sum of squared residuals. If ĉ_ij, â_ij and â_ii are the best possible estimates of the regression function's unknown parameters, Eq. (5) gives the formula for predicting the relative strength for any given SCM group and X_i. The difference between the observed and predicted relative strength is denoted ê (referred to as the prediction error).
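Eqs. (4) and (5) are not reproduced in this text. A form consistent with the symbol definitions given in the paragraph above would be the following; it is an assumed reconstruction from those definitions, not a transcription of the published equations.

```latex
% Assumed reconstruction of Eqs. (4) and (5) from the definitions in the text:
% i in {1,2} selects the continuous variable (X_1 = H_7, X_2 = B),
% j in {1,2,3} selects the SCM group.
\begin{align}
R(i,j)       &= c_{ij} + a_{ij}\,X_i + a_{ii}\,X_i^{2} + \varepsilon, \tag{4}\\
\hat{R}(i,j) &= \hat{c}_{ij} + \hat{a}_{ij}\,X_i + \hat{a}_{ii}\,X_i^{2}, \tag{5}\\
\hat{\varepsilon} &= R(i,j) - \hat{R}(i,j). \notag
\end{align}
```

Since only the linear terms were reported as significant, the quadratic term would drop out of the fitted equations, leaving one straight line per group and test method.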
The main assumptions of linear regression are (1) the normality of the residuals, and (2) the homoscedasticity or the uniform scattering of the residuals, which pertains to equality of variance of prediction errors across all predicted values. These assumptions are to be tested and verified after the regression models are developed. Note that these assumptions are not critical in developing the regression equations. However, a violation of the underlying assumptions may cause some bias in estimating the significance of predictors and confidence intervals of the predicted values.
The regression equations were obtained using the Regression Model tool of Minitab 19 software. To this end, the observed relative strength was set as the response (dependent variable), the grouping (as per Table 3) as the categorical variable, and H_7 or B as the continuous variable. The linear and quadratic terms of the continuous variable, as well as the grouping and the linear interaction of the continuous variable and grouping, were included in the list of model predictors (i.e., potential regression equation terms). The regression analysis was run, the insignificant predictors were removed (at a significance level of 0.05), and finally the six regression equations were obtained.
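For readers without access to Minitab, the core of the fit can be illustrated with ordinary least squares for a single SCM group and the linear term only. This is a simplified sketch, not the authors' procedure: it omits the categorical variable, the interaction term and the significance-based removal of predictors, and the data points are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (intercept, slope, s, r2), where s is the standard error of the fit,
    sqrt(SSE / (n - 2)), and r2 is the coefficient of determination.
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - mean_y) ** 2 for y in ys)
    s = math.sqrt(sse / (n - 2))
    r2 = 1.0 - sse / sst if sst > 0 else float("nan")
    return intercept, slope, s, r2


# Hypothetical (heat release in J/g SCM, relative strength in %) pairs for one group.
heat = [100.0, 200.0, 300.0, 400.0, 500.0]
strength = [-31.0, -22.0, -9.0, -2.0, 11.0]
print(fit_line(heat, strength))
```

The standard error s returned here plays the role of the model standard error used when converting a predicted relative strength into class probabilities.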
After developing the regression equations and testing the normality of the prediction errors, the regression equations were used to categorise the SCMs into three reactivity levels: non-reactive (NR), moderately reactive (MR) and highly reactive (HR). This was done by calculating, for each value of X_i (i.e., H_7 or B), the probability that the relative 28 days compressive strength of an SCM from a given group falls within a specific range. SCMs with relative strength values less than -35% (the relative strength value associated with quartz powder) were considered to be non-reactive. Relative strength values ranging from -35 to 0% were associated with moderately reactive SCMs, and SCMs with relative strength values greater than zero were categorised as highly reactive. Equations (6)-(8) show how the probability of an SCM (with any given cumulative heat release or bound water) belonging to each reactivity class is calculated. P{NR}, for instance, is the probability of belonging to the non-reactive class, which is equivalent to the probability of the true relative strength of the SCM (from Group j with a known level of cumulative heat release or bound water, X_i) being less than -35%. In these equations, R̂(i, j) is the predicted relative strength calculated per Eq. (5), and the parameter s_i is the standard error of the model, which is the square root of the ratio between the sum of squared errors and its degrees of freedom. The parameter z can be shown to have a standard normal distribution (under the assumption of normality of residuals), and thus the probability terms can be calculated using standard normal probability distribution tables. Using Eqs. (6)-(8), all three probabilities were calculated for the three groups of SCMs at different values of H_7 and B. The results were then plotted as classification charts, as shown in Fig. 8.
Fig. 9 The homoscedasticity (a) and normality (b) plots of prediction errors for the regression model based on the cumulative heat release

Appendix 2

Assessment of the regression models
An assessment was made to verify the normality and homoscedasticity (the property of having equal statistical variances) of the prediction errors for both regression models. It was observed that while the prediction errors of the regression model based on the cumulative heat release are homoscedastic (see Fig. 9a), they do not closely follow a normal distribution (note the deviation of the data points from the normality line in Fig. 9b). Two attempts that are usually made to resolve this issue are the transformation of the dependent variable (referred to as the Box-Cox transformation) and the elimination of the outliers, which in this case did not fully rectify the deviation of prediction errors from the normal distribution. This suggests that the classification probability charts might be affected to some degree due to the non-normality of the residuals, but the effect should be limited.
The homoscedasticity and normality plots of the prediction errors of the regression model per the bound water are shown in Fig. 10a and b, respectively. It is observed that errors are homoscedastic and closely follow the normality line. As such, the underlying assumptions are confirmed and the classification probabilities will have less bias compared to the model based on the cumulative heat release.