Introduction

Wood mitigates global warming through the carbon stock, material substitution, and energy substitution effects [1,2,3]. Replacing concrete and steel with wooden materials has been shown to reduce carbon dioxide emissions [4,5,6]. Therefore, attempts have been made to replace steel and concrete used in architecture and civil engineering with novel wooden materials such as cross-laminated, nail-laminated, and dowel-laminated timber [7,8,9]. In addition to increasing the volume of the carbon stock, efforts have been made to prolong its duration by extending the service life of wooden materials. Hence, studies have been carried out to develop more durable wooden materials [10,11,12,13,14,15], and many methods have been proposed to evaluate them, such as laboratory decay tests [16, 17], stake tests [17,18,19], and fungus-cellar tests [17, 20]. Among these, the stake test is a fundamental method for estimating wood durability and has been widely adopted by many researchers. According to the International Research Group on Wood Protection (IRG), approximately 900 IRG documents related to stake tests have been released in the last two decades [21].

The stake test procedure is straightforward. Stakes are inserted into the ground to half their length and inspected visually at set intervals. The observed deterioration is rated according to the criteria of the applicable standard. Japanese Industrial Standard (JIS) K 1571 [17] assigns deterioration levels of 0 and 5 to a sound stake and a broken stake, respectively. Stakes in intermediate conditions are rated between 1 and 4. The durability of the treatment groups is compared based on the year in which the mean deterioration level of each group reaches 2.5.

Although the stake test is an easy way to predict wood durability in ground contact conditions, its data analysis process lacks statistical rigor, which makes it difficult to compare the durability of stake groups with each other on a sound basis. To overcome this issue, we applied survival analysis [22] to the data obtained in the conventional test. We present the results of applying survival analysis to stake test data and discuss the merits of adding survival analysis to the conventional stake test.

Material and methods

Preparation of stakes

Stakes of Japanese cedar (Cryptomeria japonica), Japanese cypress (Chamaecyparis obtusa), and Japanese larch (Larix kaempferi) were prepared by a preserving company. Green logs of the three species obtained from Gunma Prefecture were sawn to make planks. After the planks were dried naturally, they were sawn to prepare stakes of size 3 × 3 × 60 cm (L).

Test sites

Stake tests were performed at three test sites (Table 1). The locations of the sites were obtained from maps on the Geospatial Information Authority of Japan website [23]. Weather data for the sites were determined based on the monthly weather data at Tsukuba, Nara, and Toyama-city from 2012 to 2018 on the Japan Meteorological Agency website [24]. The soil types of the Ibaraki and Nara sites were obtained from the Japan soil inventory website constructed by the National Agriculture and Food Research Organization [25]; that of the Toyama site was determined by the specifications at the construction of the field test site by the Toyama Forest Products Research Institute.

Table 1 Characteristics of the three field test sites

Exposure to wood attacking organisms

At the three test sites, the prepared stakes were inserted into the ground up to half the stake lengths. The abbreviations for the stakes are listed in Table 2. The stake deterioration levels were evaluated annually according to JIS K 1571 [17] criteria (Table 3). The data obtained for 7 years were used for further analysis.

Table 2 Abbreviations used for each stake group
Table 3 Deterioration level criteria

Data analysis by conventional method

Data analysis and service life determination by the conventional method were carried out in accordance with JIS K 1571 [17]. Apparent arithmetic means of the annually collected data were calculated by treating the ordinal scale as a proportional scale. The time (in years) at which the apparent mean deterioration level reached 2.5 was determined according to Eq. (1) and designated as the service life of the stake group (Fig. 1):

$$\text{YSL} = \text{Y1} + \frac{2.5 - \text{DL1}}{\text{DL2} - \text{DL1}}, \tag{1}$$

where YSL is the service life of the stake group, Y1 is the last year in which the mean deterioration level of the stake was below 2.5, DL1 is the mean deterioration level of the stake observed at Y1, and DL2 is the mean deterioration level of the stake observed 1 year after Y1.

Fig. 1 Methodology for service life calculation

A stake that reached a deterioration level of 5 was considered to remain at level 5 thereafter. If a stake was lost in a certain year, the mean was calculated without the lost stake.
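The conventional calculation can be sketched in a few lines of Python. This is a minimal illustration of Eq. (1), not the procedure text of JIS K 1571: the function names `mean_levels` and `service_life` and the ratings are hypothetical, `None` is used here to mark a lost stake, and the mean levels are assumed not to decrease over time.

```python
def mean_levels(ratings_by_year):
    """Apparent mean deterioration level per year.

    ratings_by_year[y] lists the ratings (0-5) of all stakes in year y;
    None marks a lost stake, which is left out of that year's mean.
    """
    means = []
    for ratings in ratings_by_year:
        present = [r for r in ratings if r is not None]
        means.append(sum(present) / len(present))
    return means


def service_life(levels, threshold=2.5):
    """Eq. (1): YSL = Y1 + (2.5 - DL1) / (DL2 - DL1).

    levels[y] is the level observed in year y (year 0 = installation).
    Assumes the levels do not decrease over time.
    """
    for y1 in range(len(levels) - 1):
        dl1, dl2 = levels[y1], levels[y1 + 1]
        if dl1 < threshold <= dl2:
            return y1 + (threshold - dl1) / (dl2 - dl1)
    return None  # threshold not reached during the observation period


# Hypothetical ratings of four stakes, years 0 to 3 (not data from this study).
ratings_by_year = [
    [0, 0, 0, 0],
    [1, 1, 2, 0],
    [2, 2, 3, 1],
    [3, 4, 4, None],  # one stake lost in year 3
]
levels = mean_levels(ratings_by_year)
ysl = service_life(levels)
print(levels, ysl)
```

With these made-up ratings the mean rises from 2.0 in year 2 to about 3.67 in year 3, so the interpolated service life falls between years 2 and 3.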

Data analysis according to survival analysis

In the survival analysis, the service life of each stake was defined as the time at which its deterioration level reached 2.5. It was determined according to Eq. (1), except that the deterioration level of the individual stake was used instead of the group mean. The individual service life data were then used for survival analysis, which was carried out using R software [26]. Multiple comparisons of survival curves were carried out using the Peto & Peto modification of the Gehan–Wilcoxon test with Holm's p-value adjustment [27].
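As a rough illustration of how per-stake records could be converted into survival data, the sketch below applies Eq. (1) to a single stake and returns a time together with a flag saying whether the event was observed. The function name `stake_event`, the use of `None` for a lost stake, and the choice to censor a lost stake at its last rated inspection are assumptions made for this example; the analysis in this study was done in R.

```python
def stake_event(ratings, threshold=2.5):
    """Return (time_in_years, event_observed) for a single stake.

    ratings[y] is the integer rating (0-5) in year y, starting at
    installation (year 0); None means the stake was lost that year.
    """
    for year in range(1, len(ratings)):
        prev, cur = ratings[year - 1], ratings[year]
        if cur is None:
            # Lost before reaching the threshold: censored at the last
            # inspection where it was still rated (assumption).
            return float(year - 1), False
        if prev < threshold <= cur:
            # Eq. (1) applied to the individual stake.
            return year - 1 + (threshold - prev) / (cur - prev), True
    # Still below the threshold at the final inspection: censored.
    return float(len(ratings) - 1), False


# Hypothetical stakes (not data from this study).
print(stake_event([0, 1, 2, 4]))     # event between years 2 and 3
print(stake_event([0, 1, 2, None]))  # lost after year 2: censored
print(stake_event([0, 0, 1, 2]))     # never reached 2.5: censored
```

Because ratings are integers, only a limited set of event times can occur. For instance, a stake rated 0 at installation and 3 at the first inspection yields 2.5/3 ≈ 0.83 years, or about 10 months.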

Results and discussion

Characteristics of test sites

The weather data for the test sites are shown in Fig. 2. The monthly temperature patterns were similar across the three sites, but the precipitation patterns differed. Ibaraki and Nara showed higher precipitation from June to October, whereas Toyama had a rainy and snowy season from July to January.

Fig. 2 Monthly temperature and precipitation data for the three test sites. −Temperature, precipitation

As indicated by the soil type data (Table 1), the soil of the Nara site appeared to be moister than that of the Toyama site, which appeared to be slightly dry because of its high drainage capacity. In addition to the moisture conditions, the soil quality of the Toyama site was the poorest because this site was newly established in 2009 by amending the ground with sandy-loam soil.

Service life determination using the conventional method

The field test has been adopted as a standard testing method in many countries to determine the durability of wood and wood products in ground contact conditions. The procedure of the field test in JIS K 1571 [17] is as follows. First, stakes inserted into the ground to half their length are inspected visually after a certain period. Second, the deterioration observed on each stake is rated on a six-level scale from 0 (sound) to 5 (collapse). Finally, the service life of the stakes is determined by plotting the mean deterioration levels against the exposure period.

Figure 3 shows an example of service life determination by the conventional calculation using the cedar stake data obtained from the three sites. Closed and open circles indicate the mean deterioration levels of Ced-S and Ced-H, respectively, observed in each exposure period. As shown in the figure, the mean deterioration levels of Ced-S and Ced-H increased with the exposure period and reached 2.5 in approximately 2.0 and 2.7 years, respectively. Therefore, the service lives of Ced-S and Ced-H were 2.0 and 2.7 years, respectively (Table 4).

Fig. 3 Plot of deterioration level against exposure period. Closed circles, Ced-S; open circles, Ced-H

Table 4 Service life evaluated by the conventional method

The time at which the mean deterioration level reached 2.5 was also calculated for each stake group according to Eq. (1). A summary of the service lives determined for each group and site is presented in Table 4. The service life of the stakes ranged from 1.5 years (Ced-S-I) to 3.8 years (Lar-H-T). From the conventional calculation results, the wood deterioration rate appears to be highest at the Nara site and lowest at the Toyama site. In terms of tree species and position (sapwood or heartwood), Ced-S appears to have the lowest durability, whereas Ced-H, Cyp-H, and Lar-H appear to have high durability. This result is expected because sapwood is generally accepted to be less durable than heartwood [28].

Service life determination by the conventional method is simple and useful; however, mathematical ambiguity remains in the calculation process. The ambiguity derives from misuse of the ordinal scale. As shown in Table 3, the deterioration levels in JIS K 1571 [17] are designated from 0 to 5 according to the stake conditions. These values have a nonlinear relation to the actual degree of deterioration and cannot be treated arithmetically in the same way as values on a proportional scale. It is therefore mathematically inaccurate to calculate the mean of deterioration levels collected by annual observation on the ordinal scale, and it is also incorrect to discuss differences in durability based on the apparent mean values calculated by the conventional procedure.
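The problem with averaging ordinal ratings can be made concrete with a small, purely hypothetical example. On an ordinal scale only the order of the levels is meaningful, so any order-preserving relabeling of the levels is equally valid. In the sketch below, the two groups and the relabeling (squaring each level) are invented for illustration and are not part of JIS K 1571; the ranking of the two group means reverses when the levels are relabeled.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def recode(level):
    """A hypothetical order-preserving relabeling of levels 0-5."""
    return level ** 2


# Hypothetical ratings of two stake groups in the same year.
group_a = [2, 2, 2, 2]
group_b = [0, 0, 3, 4]

mean_a, mean_b = mean(group_a), mean(group_b)
mean_a_recoded = mean([recode(x) for x in group_a])
mean_b_recoded = mean([recode(x) for x in group_b])

print(mean_a, mean_b)                  # 2.0 1.75  -> group A looks worse
print(mean_a_recoded, mean_b_recoded)  # 4.0 6.25  -> group B looks worse
```

Because both labelings respect the same ordering of stake conditions, a conclusion that depends on which one is used is not supported by the ordinal data alone.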

Service life calculation by survival analysis

To overcome the above ambiguity, survival analysis was adopted to address the differences in the service lives of the stake groups. In contrast to the conventional method, survival analysis can reduce the mathematical ambiguity in stake tests because it handles data on a proportional scale, that is, the time at which a specific event occurs. Therefore, the data can be used for further mathematical calculations without ambiguity. Additionally, survival analysis methods such as the Kaplan–Meier method have the added advantage of handling missing data. In this study, some stakes were lost after a long exposure period, and others lost their lower part because they ruptured at ground level. In such cases, the Kaplan–Meier method can treat the lost data as censored data and draw a survival curve with mathematical rigor.
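The way the Kaplan–Meier method uses censored observations can be sketched as follows. This is a minimal product-limit estimator written for illustration, not the R routine used in this study; the function names `kaplan_meier` and `median_survival` and the input data are hypothetical. A censored stake stays in the risk set up to its censoring time and is then removed without counting as a failure.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times[i]  : event or censoring time of stake i (years)
    events[i] : True if the stake reached the threshold, False if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for tt, e in data if tt == t]
        failures = sum(tied)  # True counts as 1
        if failures:
            survival *= 1 - failures / at_risk
            curve.append((t, survival))
        # Failed and censored stakes both leave the risk set after t.
        at_risk -= len(tied)
        i += len(tied)
    return curve


def median_survival(curve):
    """First time at which the survival probability is 0.5 or less."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # survival never dropped to 0.5


# Hypothetical data for five stakes (not data from this study).
times = [1.0, 2.0, 2.0, 3.0, 4.0]
events = [True, True, False, True, False]
curve = kaplan_meier(times, events)
print(curve, median_survival(curve))
```

In this example the survival probability steps down to roughly 0.8, 0.6, and 0.3 at years 1, 2, and 3, so the median service life is 3 years. The stake censored at year 2 lowers the number at risk for the year-3 step without producing a step of its own.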

The Kaplan–Meier method was used to compare wood durability, for which the definition of the event is important. We defined the event as a stake reaching a deterioration level of 2.5. According to JIS K 1571 [17], the deterioration level is rated using integer values from 0 to 5, so a level of exactly 2.5 is never observed directly. Therefore, the event time was estimated by Eq. (1) from the last year in which the deterioration level was below 2.5 and the first year in which it was above 2.5.

The Kaplan–Meier curves for the cedar stakes tested at the three sites are shown in Fig. 4. The Y-axis indicates the survival probability, that is, the proportion of stakes that had not yet reached a deterioration level of 2.5. The first sapwood stake (Ced-S) reached a deterioration level of 2.5 after 10 months of exposure, and half of the stakes had reached this level after 2.2 years (Fig. 4). For Ced-H, the first stake reached this level after 1.25 years of exposure, and it took 2.5 years for half of the stakes to reach it. The Peto & Peto modification of the Gehan–Wilcoxon test revealed a significant difference between the durability of Ced-H and Ced-S (p = 0.0061).

Fig. 4 Kaplan–Meier curves observed in Ced-S and Ced-H deterioration. Dashed line, Ced-S; solid line, Ced-H. Color-coded areas indicate 95% confidence intervals

Figure 4 shows another advantage of survival analysis. A tick mark appears on the Ced-S data (dashed line) after the 4-year exposure; this indicates that one stake was lost before it reached the deterioration level of 2.5. Survival analysis was developed to handle data sets containing lost data (censored data).

Effect of test site on the Ced-S and Ced-H service lives

Since a significant difference was found between the service lives of Ced-S and Ced-H, we also compared their Kaplan–Meier curves at each of the three sites and found different characteristics (Fig. 5). At the Ibaraki site, Ced-S quickly reached a deterioration level of 2.5, whereas Ced-H took more than 1.5 times as long. In contrast, Ced-S and Ced-H tested at the Nara and Toyama sites showed similar Kaplan–Meier curves, which suggests that their service lives may not differ significantly at these sites. To test this, the Peto & Peto modification of the Gehan–Wilcoxon test was carried out with Holm's p-value adjustment. The results in Table 5 show no significant differences between Ced-H-N and Ced-S-N or between Ced-H-T and Ced-S-T. Although it is generally accepted that Ced-H is more durable than Ced-S, Fig. 5 and Table 5 indicate that this relationship does not hold in all cases. Previous conventional field tests carried out at the Nara site showed that Ced-S stakes deteriorated faster than Ced-H stakes [29, 30]. The difference between the previous studies and this work may be due to the origin of the stakes: the previous studies used logs from Nara Prefecture, whereas the logs in this work came from Gunma Prefecture. The application of survival analysis to estimate the effect of the logging prefecture on natural durability will be the focus of our subsequent study.
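When several pairs of groups are compared, each raw p value is adjusted before it is interpreted. The sketch below shows the Holm step-down adjustment in plain Python as an illustration of the rule; the function name `holm_adjust` and the input p values are hypothetical, and the pairwise test that produces the raw p values is not implemented here.

```python
def holm_adjust(p_values):
    """Holm step-down adjustment of a list of raw p values.

    The smallest p value is multiplied by m, the next by m - 1, and so
    on; a running maximum keeps the adjusted values in the same order
    as the raw ones, and each result is capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p values from three pairwise comparisons.
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
print(adjusted)
```

Here the smallest raw value is tripled, the second smallest is doubled, and the largest inherits the running maximum, so a comparison with a raw p value of 0.04 is no longer significant at the 0.05 level after adjustment.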

Fig. 5 Kaplan–Meier curves observed in Ced-S and Ced-H deterioration at different sites. Dashed line, Ced-S; solid line, Ced-H. Color-coded areas indicate 95% confidence intervals

Table 5 Adjusted p value between each stake group

Effect of test site on the service lives of all stakes

As discussed above, the test site affects the service lives of Ced-H and Ced-S. We therefore hypothesized that the wood deterioration rate differs significantly among the three sites. To test this hypothesis, the survival curves of all stakes at the three sites were plotted (Fig. 6). The results indicate that the survival probability decreased most quickly at the Nara site and most slowly at the Toyama site. To determine whether the differences among the survival curves were significant, the test described in the previous section was applied, and a significant difference was found between the Toyama site and each of the other two sites (Table 6).

Fig. 6 Kaplan–Meier curves observed for deterioration of all stakes at different sites. Color-coded areas indicate 95% confidence intervals

Table 6 Adjusted p value between each site

The cause of this difference is unclear; however, soil characteristics may play a role. The soil types at the Ibaraki, Nara, and Toyama sites are andosol, gleysol, and sandy-loam, respectively (Table 1). Among the three sites, the soil of the Toyama site may contain insufficient nutrients and water, which would slow the deterioration caused by wood-decaying microorganisms. Further research on these microorganisms is required because soil and air conditions influence their survival on the stake [31].

Effect of species on service life

Besides the difference between heartwood and sapwood, wood durability is also known to vary with species. For example, the Housing Quality Assurance Act considers Cyp-H to be more durable than Ced-H and Lar-H and allows Cyp-H to be used as a sill member without preservative treatment [32]. To verify the order of durability among the three species, survival analysis was applied to the Ced-H, Cyp-H, and Lar-H data from the three sites. The results shown in Fig. 7 suggest that the heartwood of the three species shows similar Kaplan–Meier curves, and thus that there may be no significant difference among Cyp-H, Ced-H, and Lar-H. To confirm this, a significance test was carried out using the Peto & Peto modification of the Gehan–Wilcoxon test with Holm's p-value adjustment. All adjusted p values were higher than 0.8, which indicates no significant difference in heartwood durability among the three species (Table 7). Further research is needed to establish the order of durability among the three species.

Fig. 7 Kaplan–Meier curves observed in the deterioration of heartwood at the three sites. Color-coded areas indicate 95% confidence intervals

Table 7 Adjusted p value between each species

Conclusion

We extended the conventional stake test by adding survival analysis to the conventional method.

The Kaplan–Meier curve is useful for estimating the stake group durability. Furthermore, it is useful for groups containing censored data, namely, lost data.

Differences in durability among stake groups can be tested in a statistically sound manner by the modified Gehan–Wilcoxon test with Holm's p-value adjustment.