1 Introduction

In large engineering structures, fatigue cracks typically occur at weld transitions where stress concentrations due to the joint geometry are relatively high. In addition, small imperfections and tensile residual stresses are created by the welding process. The fatigue strength of the weld is thus generally lower than that of the parent material. To mitigate these effects, various post-weld treatment techniques are being developed. These techniques are either based on weld geometry improvement or on altering of the residual stress state at the weld toes [1].

Common techniques that alter the residual stress state are high-frequency mechanical impact treatment, see [2, 3], or shot peening, see [4, 5]. Weld geometry techniques aim at smoothing the weld toe transitions, thereby reducing stress concentrations and removing possible welding defects [6,7,8,9]. Thus, fatigue strength results up to the level of the base material can be achieved, see [10]. Recently, a statistical assessment of burr grinding and weld profiling [11] showed the potential of grinding to improve the fatigue strength of butt- and fillet-welded joints. For butt-welded joints, however, burr grinding was found to decrease the fatigue strength if the plate thickness is low. This is related to the increase in nominal stress at the critical location due to grinding. Fulfilling the requirement of grinding to a depth of 0.5 mm reduces the plate thickness significantly in thin-plated structures. Alternatively, the fatigue strength of butt-welded joints may be improved by flush grinding.

Current rules and guidelines include fatigue design (FAT) classes for flush ground butt-welded joints from FAT110 to FAT155; however, in the majority of cases, the underlying database and specimen-related details are unclear or unknown. Thus, the aim of this paper is to investigate the effect of flush grinding on the fatigue strength of butt-welded joints and the main influencing factors (weld details, steel strength, stress ratio, etc.) in more detail. For this purpose, 1003 stress-life (S-N) fatigue test results from 29 publications [12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41] were collected—including flat specimens, circular hollow sections, and round specimens machined from butt joints by milling and turning. It is important to note that this study focuses exclusively on steel joints.

At first, the state of the art on current fatigue resistance curves is given in “Section 2.” “Section 3” serves to explain the obtained literature data and to determine a suitable slope exponent for the S-N curves. Subsequently, the results are statistically assessed for influencing factors in “Section 4.” In “Section 5,” the fatigue test results are used to determine new fatigue resistance curves according to the Eurocode 3 [42] and IIW best practice [43, 44]. The results are discussed in “Section 6” with respect to major influencing factors.

2 State of the art on current fatigue resistance curves for flush grinding

For the fatigue assessment of flush ground butt-welded joints, FAT classes are available in rules and guidelines, see Table 1 and Fig. 1. The FAT classes lie in a small range of 110 MPa ≤ Δ𝜎 ≤ 124 MPa. Only the JSSC regulation [45] allows a higher stress range of FAT 155. The slopes vary slightly between 3.0 ≤ m ≤ 3.5. The knee point varies between 2·10⁶ ≤ Nk ≤ 10⁷. For constant amplitude loading, this leads to a maximum deviation of f = 155 MPa/65.5 MPa ≈ 2.37 between the guidelines.

Table 1 S-N curves for flush ground butt-welded joints
Fig. 1 S-N curves for butt-welded joints flush ground for constant amplitude loading normal to the weld (up to t = 25 mm)

In addition to the differences between the FAT classes, the slope, and the position of the knee point, the guidelines differ in the exponent n that has to be applied for the correction of the thickness effect. This exponent varies between n = 0 in the JSSC [45] and BS7608 [49] standards and n = 0.2 in Eurocode 3 [42]. Going up to a thickness of t = 100 mm, the difference amounts to a factor

$$f=\left(155\ \mathrm{MPa}\cdot {\left(\frac{25\ \mathrm{mm}}{100\ \mathrm{mm}}\right)}^0\right)/\left(65.5\ \mathrm{MPa}\cdot {\left(\frac{25\ \mathrm{mm}}{100\ \mathrm{mm}}\right)}^{0.2}\right)=3.1$$
(1)

at N = 10⁷ cycles, see Fig. 2, where 65.5 MPa is derived by transforming FAT112 to N = 10⁷ cycles with a slope of m = 3.
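The arithmetic behind Eq. (1) can be retraced with a short sketch. The function names `shift_stress_range` and `thickness_factor` are hypothetical helpers introduced only for illustration; the sketch assumes a single-slope S-N line (S^m · N = const) and a thickness correction of the form (t_ref/t)^n, as used in the text.

```python
def shift_stress_range(delta_sigma, n_from, n_to, m):
    """Move a stress range along an S-N line of slope m (S^m * N = const)."""
    return delta_sigma * (n_from / n_to) ** (1.0 / m)


def thickness_factor(t_ref, t, n):
    """Thickness reduction factor (t_ref / t)^n."""
    return (t_ref / t) ** n


# FAT112 is defined at 2e6 cycles; move it to 1e7 cycles with m = 3.
s_fat112_at_1e7 = shift_stress_range(112.0, 2e6, 1e7, 3.0)

# Ratio of the highest curve (155 MPa, n = 0) to the lowest curve
# (FAT112 with n = 0.2) at a plate thickness of 100 mm, as in Eq. (1).
f_t100 = (155.0 * thickness_factor(25.0, 100.0, 0.0)) / (
    s_fat112_at_1e7 * thickness_factor(25.0, 100.0, 0.2)
)

print(f"FAT112 at 1e7 cycles: {s_fat112_at_1e7:.1f} MPa")
print(f"deviation factor at t = 100 mm: {f_t100:.2f}")
```

Running the sketch gives about 65.5 MPa for the transformed FAT112 value and a factor of about 3.1, in line with Eq. (1).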

Fig. 2 S-N curves for butt-welded joints flush ground for constant amplitude loading normal to the weld, thickness corrected to t = 100 mm

Within the FAT classes, a certain amount of axial and angular misalignment is included. It results from the fatigue data that build the basis for the derivation of the FAT classes. A quantification of the amount is difficult, since it is in most cases unclear on which test series from literature the FAT classes have been derived. Even if the test series are known, quality aspects like the amount of axial and angular misalignment are typically not properly documented. In the guidelines, generic amounts are included like an axial misalignment of, e.g., 5%, within the IIW recommendations [43] and in the Recommended Practice C203 of DNVGL [47].

In addition, there is some uncertainty in terms of the grinding procedure. It is expected that the resulting surface condition (i.e., not only the roughness but also the residual stress introduced by the grinding procedure) has a significant impact on the fatigue strength; in particular, grinding marks perpendicular to the loading direction lead to a pronounced decrease in fatigue strength [50]. As for the misalignment, a detailed description of the surface condition is not available in the publications that have been used to derive S-N curves for the rules and guidelines.

3 Literature data and estimation of a suitable slope exponent

3.1 Literature data

Grinding is one of the oldest and most applied post-weld improvement techniques in many industries, as it enables an easy improvement of fatigue strength of welded connections. In addition, the quality of grinding can be determined visually without in-depth investigations. Typical requirements for grinding specify the depth of grinding to remove weld defects [1] and the surface finish to ensure longevity of the corrosion protection coating, see [47]. The large variety of fatigue resistance curves for flush ground butt-welded joints inevitably leads to the question of which curve appropriately assesses the actual fatigue strength of this weld detail. Thus, the aim of this paper is to investigate the effect of flush grinding on the fatigue strength and the main influencing factors in more detail. For this purpose, 1003 fatigue test results on flush ground steel joints from 29 publications [12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41] were collected, see Table 5. This data includes flat specimens, circular hollow sections, and round specimens machined from butt joints by milling and turning.

In Fig. 3, the available test results are presented without run-outs to ease the interpretation. A large difference in fatigue strength for the different specimen types is apparent. The round (machined) specimens have by far the highest fatigue strength, while the circular hollow sections have the lowest fatigue strength of the three types.

Fig. 3 S-N data for butt-welded joints flush ground tested under constant amplitude loading with the IIW fatigue resistance curve for flush ground joints (without run-outs and specimens with reported weld defects)

The large scatter for the flat specimens in Fig. 3 is likely connected to the different (mostly undocumented) conditions of the specimens:

  (i) Axial and angular misalignment
  (ii) Inner porosities, cold laps, micro-cracks, etc.
  (iii) Different surface conditions
  (iv) Material strength and material combination
  (v) Welding process
  (vi) Thickness
  (vii) Residual stresses (due to welding and post-weld treatment)

As shown above, the class FAT112 is conservative and results from the evaluation of a large database that includes a variety of joints with different welding and grinding qualities. High-quality joints show a substantially higher fatigue strength, by up to a factor of 5 in terms of stress range at 2·10⁶ cycles, in comparison to, e.g., the circular hollow sections (CHS).

The information on important characteristics of the data available in literature is presented in Fig. 4. Therein, data of joints with weld defects are excluded. The main conclusions on the available data are summarized in the following. The majority of test specimens are small-scale specimens with thin plate thickness and small cross-sectional area. Gas metal arc welding was the most used welding method, and flat specimens contribute the largest share of specimen types. Interestingly, some specimens were only ground from one side. The other side consequently behaves as in the as-welded state, for which no fatigue strength improvement can be assumed.

Fig. 4 Distribution of main parameters reported for the data available in literature (without specimens with reported weld defects)

3.2 Evaluation of slope exponent

A statistical analysis of all S-N data sets has been conducted based on the maximum likelihood approach [51]. This approach also allows run-out data to be considered. This becomes especially important for the joints under consideration, since in many test series, individual specimens failed in the base material or the clamping area. These specimens have been evaluated as run-outs, identical to the specimens that have not failed during the tests. A few test series with fatigue tests on only one load level have been excluded. By varying the location of the knee point between 5·10⁵ ≤ Nk ≤ 10⁸ and evaluating each variant with maximum likelihood, the S-N curve with the highest probability was identified.
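How run-outs enter a maximum likelihood evaluation can be illustrated with a minimal sketch. It is not the implementation of [51]: it assumes that log N is normally distributed about the line log N = log a − m · log S with standard deviation s, treats run-outs as right-censored observations, and ignores the knee point. The function name `log_likelihood` and the example values are hypothetical.

```python
import math


def log_likelihood(log_a, m, s, data):
    """Log-likelihood of an S-N line with right-censored run-outs.

    data: iterable of (stress_range, cycles, is_runout).
    """
    total = 0.0
    for stress, cycles, is_runout in data:
        mean = log_a - m * math.log10(stress)
        z = (math.log10(cycles) - mean) / s
        if is_runout:
            # Run-out: only known that log N exceeds the observed value,
            # so the survival probability of the normal distribution is used.
            total += math.log(0.5 * math.erfc(z / math.sqrt(2.0)))
        else:
            # Failure: normal probability density of log N.
            total += -0.5 * z * z - math.log(s * math.sqrt(2.0 * math.pi))
    return total


# Hypothetical data lying on log N = 17 - 5 log S; the last point is a run-out.
example_data = [
    (200.0, 312500.0, False),
    (150.0, 1316872.0, False),
    (100.0, 1.0e7, True),
]

print(log_likelihood(17.0, 5.0, 0.2, example_data))
print(log_likelihood(16.0, 5.0, 0.2, example_data))
```

In a full evaluation, the parameters log a, m, and s would be varied by an optimizer until the log-likelihood reaches its maximum; the sketch only shows the objective function.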

For the evaluation of the slope m of the S-N curve, only test series have been included in which the minimum number of cycles to failure Nmin is at least 10 times smaller than the evaluated knee point Nk. This minimum range of one decade of cycles is important, since otherwise S-N curves may result that are illogical from an engineering point of view.

As a result of the statistical evaluation, a mean slope of mmean = 8.19 with a standard deviation of mstd = 3.56 is evaluated for the overall 18 exploitable test series on flat specimens, Fig. 5, which is significantly higher than the current recommendation of m = 3. This agrees with results of other studies on geometrical improvement of welds, see [9, 11]. This value is also higher than the slope exponent for base materials stated in many standards and recommendations, e.g., m = 5, in the IIW recommendations [43]. This is, however, reasonable as these standards have to be conservative for mild steels and include the possibility of minor surface irregularities during production and operation, e.g., scratches.

Fig. 5 Histogram of evaluated slopes of the S-N curves for all specimens, divided by specimen type

For the circular hollow sections test series, a much steeper S-N curve was determined. The round specimens showed slopes up to m > 50 and, therefore, have partly been excluded from plot and evaluation. Since the majority of test series at circular hollow sections and round specimens do not fulfill the abovementioned requirements, the determined slopes are not plotted in Fig. 5.

4 Statistical assessment of influencing factors

To assess the effect of the various influencing factors on the fatigue strength of flush ground butt-welded joints, a statistical assessment of the test results is performed. Correlations between number of cycles to failure Nf and different numeric influencing factors are determined to quantify the impact on fatigue strength. Only flat specimens are included in the statistical assessment due to the large difference in fatigue strength for the three test specimen types. In addition, only tests with number of cycles to failure between 2·10⁴ and 10⁷ are included in the assessment. First, a linear regression model (normal distribution) is fitted to the logarithm of number of cycles to failure and stress range of all available test data, as the tests have been performed at different stress ranges. The residual of each data point is calculated according to Eq. (2):

$${r}_i={y}_i-\hat{y_i}$$
(2)

with yi as the observed value and \(\hat{y_i}\) as the predicted value from the linear regression model. As typical for S-N curves, the stress range is considered to be the independent variable and the number of cycles to failure to be the dependent variable. Thus, the residuals are determined for the logarithm of number of cycles to failure, see Fig. 6.

Fig. 6 Theoretical background on the statistical assessment method

Secondly, the residuals between test data and regression model are used to determine the Pearson correlation coefficient:

$${r}_{xy}=\frac{\operatorname{cov}\left(x,y\right)}{\sigma_x{\sigma}_y}$$
(3)

with σx and σy as standard deviations, and cov(x, y) as covariance of x and y. A Pearson-type correlation was chosen, as literature typically assumes linear relations between (the logarithmized) fatigue strength and material strength as well as stress ratio. For plate thickness, an exponential relation is often assumed; however, it was found that the choice of correlation coefficient definition does not significantly influence the outcome of the evaluation.
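The two steps of Eqs. (2) and (3) can be sketched as follows. The data values are hypothetical and serve only to show the procedure; `fit_line` and `pearson` are illustrative helper names, and plate thickness is used as an example influencing factor.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def pearson(x, y):
    """Pearson correlation coefficient, Eq. (3)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical test data (not taken from the evaluated publications).
stress_range = [300.0, 250.0, 200.0, 180.0, 150.0, 120.0]  # MPa
cycles = [1.2e5, 3.5e5, 9.0e5, 1.1e6, 3.0e6, 8.0e6]
thickness = [10.0, 20.0, 10.0, 30.0, 20.0, 10.0]  # mm

# Step 1: regression of log N on log S and residuals, Eq. (2).
log_s = [math.log10(s) for s in stress_range]
log_n = [math.log10(n) for n in cycles]
b0, b1 = fit_line(log_s, log_n)
residuals = [yn - (b0 + b1 * xs) for xs, yn in zip(log_s, log_n)]

# Step 2: correlation between the residuals and an influencing factor.
r_thickness = pearson(thickness, residuals)
print(f"slope exponent m = {-b1:.2f}, r(thickness, residual) = {r_thickness:.2f}")
```

Working with residuals rather than raw cycle counts removes the dominant dependence on stress range, so that any remaining trend can be attributed to the influencing factor under study.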

To verify the assumption of normally distributed residuals, the residuals are visually compared with the fitted logarithmic number of cycles to failure, see Fig. 7, and assessed by an Anderson-Darling (AD) test [52]. The null hypothesis of the AD test is that the data is from a population that follows a normal distribution. This is assessed by computing the test value AD and comparing it with a critical (tabulated) value at a given significance level α = 0.05 [53]. Figure 7 shows that the residuals are symmetrically distributed and seem to follow a normal distribution well. Also, the AD test does not reject the null hypothesis. It is thus assumed that the data stems from a normal distribution, as none of the checks presents any evidence that a normal distribution can be ruled out; on the contrary, all three support this conclusion.
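For illustration, the Anderson-Darling statistic for normality can be computed in a few lines. The sketch below is not the procedure of [52, 53] verbatim: it assumes that mean and standard deviation are estimated from the sample and applies the commonly used small-sample correction factor (1 + 0.75/n + 2.25/n²), for which a critical value of about 0.752 at α = 0.05 is usually tabulated. Both the correction and the critical value are assumptions introduced here, and the function name is hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(sample):
    """Corrected Anderson-Darling statistic for a normality check.

    Mean and standard deviation are estimated from the sample itself.
    """
    n = len(sample)
    dist = NormalDist(mean(sample), stdev(sample))
    cdf = [dist.cdf(v) for v in sorted(sample)]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    a2 = -n - total / n
    # Small-sample correction for estimated parameters (assumption, see lead-in).
    return a2 * (1.0 + 0.75 / n + 2.25 / n**2)


# Two hypothetical samples: one shaped like a normal distribution,
# one strongly skewed (quantiles of an exponential distribution).
normal_like = [NormalDist().inv_cdf((i + 0.5) / 50) for i in range(50)]
skewed = [-math.log(1.0 - (i + 0.5) / 50) for i in range(50)]

print(anderson_darling_normal(normal_like))
print(anderson_darling_normal(skewed))
```

A statistic below the critical value means that the null hypothesis of normality is not rejected, which corresponds to the outcome reported for the residuals.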

Fig. 7 Difference between residuals and fitted logarithmic number of cycles to failure from the linear regression model (left), and a normal probability plot of residuals with the results of the Anderson-Darling test

The correlation coefficients are supported by calculations of probability values (p-values) of each fit, i.e., the result of a hypothesis test of no correlation against the alternative hypothesis of a non-zero correlation [53]. According to common practice, p-values below 0.05 are considered to indicate a significant correlation coefficient [54].

The results are presented in Fig. 8. Interestingly, no correlation was determined between the residuals and any of the influencing factors. Only the correlation coefficient for the stress ratio is close to a mild correlation (typically assumed to start at |rxy| ≥ 0.3).

Fig. 8 Determination of the correlation between the residuals of the logarithmic number of cycles to failure and various influencing factors: (a) ultimate tensile strength, (b) yield strength, (c) stress ratio, (d) plate thickness, (e) seam width, and (f) cross-sectional area

It is a well-known fact that the fatigue strength of plain specimens without notches is influenced by the parent material static strength [55]; yet, no correlation was determined in this study. It is assumed that the reason is related to multivariate relations and multi-modal distributions of specimen characteristics. For example, the majority of very high strength steels were tested at a stress ratio R = −1. In addition, the majority of tests were performed using small-scale specimens with thickness t < 20 mm. More importantly, Fig. 3 indicates a clear influence of the specimen type on the fatigue strength. Such non-numeric parameters cannot be included in a statistical assessment using correlation coefficients. Finally, influences from non-reported parameters, such as surface roughness, are likely. In summary, no clear insight is gained by the statistical assessment due to various reasons. Thus, more high-quality data are required to gain precise knowledge on the influencing factors of fatigue strength of flush ground butt-welded joints.

5 Determination of fatigue resistance curves based on different standards

5.1 Introduction

According to the rules of Eurocode 0 [56] and the IIW recommendations [43], different methods are used to evaluate test results. In the following, these methods are presented; however, before the data can be evaluated, some data scrubbing is required. For better comparability, only results taken from primary references have been evaluated, to avoid inheriting previously made mistakes. In addition, only data with a stress ratio of R ≥ 0, cycles N > 2·10⁴, under axial loading, and without any post-weld treatment such as stress-relief annealing is considered. Run-outs and specimens with severe welding defects are also removed.

For a precise evaluation, it is also advisable not to include test results that lasted more than 5·10⁶ cycles (Eurocode 3) or 10⁷ cycles (IIW), respectively. This is related to the difference between the rules of Eurocode and the IIW. Eurocode 3 assumes a fatigue limit at 5·10⁶ cycles for constant amplitude loading, whereas the IIW recommendations use a knee point at 10⁷ cycles. Thus, only test specimens that failed before those two limits are included in the S-N curve evaluations.

5.2 Eurocode 3 best practice

5.2.1 Overview

For the assessment of a characteristic fatigue strength, Eurocode 0 [56] defines the reliability and safety concept of all European standards for structural design in civil engineering. The informative Annex D contains rules for design assisted by testing.

5.2.2 Resistance model and regression analysis

For high cycle fatigue of steel structures, the applied stress range, S, and the corresponding number of stress cycles to failure, N, follow a power law [57]. On a log-log scale with decimal logarithm, the test data can generally be allocated to a straight line expressing a linear dependency of the logarithm of stress cycles on the logarithm of the stress range, Equation (4):

$$\log N=\log a-m\cdot \log S$$
(4)

Figure 9 shows the log-linear relationship in the finite life region. The S-N curve corresponds to the resistance model. The parameters a (intercept of the theoretical locus where the S-N curve of the finite life region intersects the horizontal axis S = 10⁰ = 1) and m (slope of the S-N curve) in Equation (4) can be calculated using a regression analysis.

Fig. 9 Linear dependency of the number of stress cycles on the stress range, based on [58]

Since both parameters are estimated based on the information of a limited number of fatigue tests, they have to be substituted by the estimates â and \(\hat{m}\). If the slope m of the S-N curve is pre-set by previous information (for example, m = 3 for welded details with sharp notches [59]), â respectively log â is given by Equation (5):

$$\log \hat{a}=\frac{1}{n}\cdot \left(\sum \log {N}_i+m\cdot \sum \log {S}_i\right)$$
(5)

where n is the sample size (number of fatigue test data) and i is the index of the single fatigue test. The standard deviation s of the population is either known or unknown. In the latter case, it is estimated by the sample. The standard deviation s in terms of log N (see Fig. 10) amounts to Equation (6):

$$s=\sqrt{\frac{\sum {\left[\log {N}_i-\left(\log\ \hat{a}-m\cdot \log {S}_i\right)\right]}^2}{n-1}}$$
(6)

Fig. 10 Schematic procedure for statistical evaluation of test data, based on [58]

“Section 3.2” shows that the slope of the S-N curve for flush ground butt-welded joints tends to m = 7 due to the smooth, i.e., notch-free, transition of the weld toes; however, in order not to exceed the slope exponent currently recommended for base materials, a slope exponent of m = 5 is used for the assessment. Following the design principle of the Eurocode rules, fatigue resistance curves for m = 3 are also required. The reason is that the general design method for the fatigue assessment of steel structures like bridges uses damage equivalent factors, which take account of the real cumulative traffic. At the moment, the damage equivalent factors only exist for S-N curves with slope m = 3. Thus, an assessment is performed for both slopes m = 3 and m = 5.

5.2.3 Distribution and prediction interval

Eurocode 0 [56] implicitly assumes that the distribution of the population is normal or log-normal. As there is no prior knowledge about the mean, it is estimated by the sample. In cases where the slope m is forced to a certain value and is not calculated from the sample, Eurocode 0, Annex D.7 [56] is applicable for the derivation of a characteristic S-N curve.

According to Eurocode 0 [56], the factor kn may be used to derive characteristic values with 95% probability of survival (see Table 2).

Table 2 kn factor for characteristic values with 95% survival probability (extract from Eurocode 0)

Since samples of normally distributed populations are t distributed, the lower row of Table 2 (which has to be used when the standard deviation is estimated by the sample, Equation (6)) considers the t distribution probability. The kn factors are based on the prediction method of fractile estimation (prediction interval) [60]. The characteristic value of the intercept ak is obtained by Equation (7). The procedure is shown schematically in Fig. 10.

$$\log\ {a}_k=\log\ \hat{a}-{k}_n\cdot s$$
(7)

The characteristic reference value Δσc of the fatigue strength at 2·10⁶ stress cycles amounts to:

$$\log {S}_c=\frac{\log \left(2\cdot {10}^6\right)-\log {a}_k}{-m}$$
(8)
$$\varDelta {\sigma}_c={10}^{\log {S}_c}$$
(9)

The procedure described meets the requirements of the background document 9.01 of Eurocode 3 Part 1-9 [61] and of Eurocode 0, Annex D [62]. It is only valid for the finite life region and cannot make any predictions about the constant amplitude fatigue limit; however, the procedure is easy to use and reliable for the evaluation of large amounts of data, see also [63].
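The chain of Equations (5) to (9) can be written as a short sketch for a fixed slope m. The function name `characteristic_strength` and the test data are hypothetical; the factor kn has to be taken from Table 2 for the actual sample size, and the value 1.64 used in the example is an assumption that applies only to very large samples.

```python
import math


def characteristic_strength(data, m, k_n, n_ref=2e6):
    """Characteristic stress range at n_ref cycles for a fixed slope m.

    data: list of (stress_range, cycles_to_failure), failures only.
    Returns (delta_sigma_c, s).
    """
    n = len(data)
    log_s = [math.log10(s) for s, _ in data]
    log_n = [math.log10(c) for _, c in data]
    # Eq. (5): estimate of the intercept for a pre-set slope m.
    log_a_hat = (sum(log_n) + m * sum(log_s)) / n
    # Eq. (6): standard deviation in terms of log N.
    resid = [ln - (log_a_hat - m * ls) for ln, ls in zip(log_n, log_s)]
    s = math.sqrt(sum(r * r for r in resid) / (n - 1))
    # Eq. (7): characteristic intercept.
    log_a_k = log_a_hat - k_n * s
    # Eqs. (8) and (9): characteristic stress range at n_ref cycles.
    log_s_c = (math.log10(n_ref) - log_a_k) / (-m)
    return 10.0 ** log_s_c, s


# Hypothetical failures (stress range in MPa, cycles); not data from the paper.
tests = [(300.0, 2.1e5), (250.0, 5.5e5), (220.0, 9.0e5),
         (200.0, 1.9e6), (180.0, 2.4e6), (160.0, 4.8e6)]

for slope in (3.0, 5.0):
    delta_sigma_c, s = characteristic_strength(tests, m=slope, k_n=1.64)
    print(f"m = {slope:.0f}: s = {s:.3f}, characteristic strength = {delta_sigma_c:.0f} MPa")
```

Because the slope is pre-set, only the intercept is estimated from the sample, which is why the scatter s, and with it the characteristic value, depends on how well the chosen slope matches the data.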

5.2.4 Evaluation

Using the above explained filter, all available test results of fatigue tests on flush ground butt welds were sorted. For the statistical analysis according to Eurocode EN 1993-1-9 [42], all data that can be assigned to the constructional detail “Splices in plates and flats, welded from both sides and flush ground” were considered. As it was already shown in “Section 3.2,” a slope of m = 7 is obtained on average, when each study is considered individually. This is higher than most standards recommend for base material specimens. The assessment was thus limited to a slope of m = 5, which agrees with many design standards for fatigue of plain specimens without process-related notches. In contrast, the current design practice of Eurocode 3 using damage equivalent factors is based on m = 3. Thus, an additional assessment for m = 3 is presented in Fig. 11.

Fig. 11 Test results and characteristic S-N curve for “Splices in plates and flats, welded from both sides and flush ground” with m = 3 (left) and m = 5 (right)

The statistical analysis shows that the characteristic fatigue strength of plates and flats, welded from both sides and flush ground, is significantly higher than given in the existing Eurocode 3 Part 1-9 [42] as well as the draft prEN 1993-1-9 of the second generation of Eurocodes [64]. Raising the detail category for this detail therefore seems feasible. A possible formulation for future versions of EN 1993-1-9 is presented in Table 3. Note that the condition of a misalignment < 5% is repeated from the code without proof that the evaluated tests actually satisfy it, or whether they are in general better aligned, because this feature is often not documented. Therefore, the proposed FAT125 (m = 3) is deliberately chosen well below the calculated value of 137.

Table 3 Formulation of a possible updated Eurocode 3 detail category for “Splices in plates and flats, welded from both sides and flush ground”

5.3 IIW best practice

5.3.1 Differences to the Eurocode 3 evaluation

In principle, the statistical assessment of fatigue test data according to the IIW best practice follows the same principle as Eurocode 3; yet, there are some minor differences. A detailed description is found in [43, 44]. The characteristic value ak (in IIW terminology xk) corresponding to 97.5% survival probability is also determined from Equation (7); however, the parameter kn is not determined from Table 2, but from the following equation due to the difference in survival probability:

$$k=1.645\left(1+\frac{1}{\sqrt{n}}\right)$$
(10)
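Equation (10) can be evaluated directly to see how the factor k depends on the sample size n. The function name `iiw_k` is a hypothetical helper introduced for illustration.

```python
import math


def iiw_k(n):
    """Factor k according to Eq. (10) for a sample of n test results."""
    return 1.645 * (1.0 + 1.0 / math.sqrt(n))


for n in (10, 50, 200, 1000):
    print(f"n = {n:4d}: k = {iiw_k(n):.3f}")
```

The factor decreases with increasing sample size and approaches 1.645 for very large n, so large databases are penalized only slightly for statistical uncertainty.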

5.3.2 Evaluation

Subsequently, the results obtained using the IIW best practice are presented. To this end, similar criteria as in “Section 5.1” for data scrubbing are applied. Contrary to the Eurocode 3 evaluation, fatigue test data up to 10⁷ cycles were included in the assessment. In addition, the IIW fatigue resistance curve is not limited to flat plates and to two-sided welding. Thus, circular hollow sections and curved specimens cut from circular hollow sections are also included in the assessment; however, all round specimens are removed as they do not comply with the stress ratio requirement (R ≥ 0).

The filtered fatigue test results obtained from the literature are plotted in Fig. 12, together with the S-N curves obtained by enforcing slope exponents of m1 = 3 and m1 = 5 for 50% and 97.5% survival probability (dashed and solid line, respectively). The IIW S-N fatigue resistance curves for the as-welded case (FAT90) are included for comparison (dash-dot line).

Fig. 12 Fatigue strength improvement of flush ground butt-welded joints for m1 = 3 and m1 = 5 in comparison with the fatigue resistance curve for as-welded butt joints (FAT90)

Using the IIW best practice for fatigue test data evaluation, a characteristic fatigue strength of 123 and 166 MPa was determined at 2·10⁶ cycles for m = 3 and m = 5, respectively. For both slope exponents, the observed characteristic fatigue strength is higher than the current recommendation; however, it is lower than determined using the Eurocode 0 best practice. The difference in determined characteristic fatigue strength is related to differences in detail description (Eurocode 3 is limited to flat plates) and in the assumed knee point (5·10⁶ vs. 10⁷ cycles). Subsequently, different aspects for practical applications are discussed before a proposal for an updated IIW fatigue design class is drafted.

6 Discussion

6.1 Weld defects

In order to derive FAT classes for high-quality flush ground butt welds, the quality needs to be taken into account. This comprises the misalignment as a geometrical parameter (which leads to an increase in stress amplitude) as well as the inner irregularities and the surface conditions (which both lead to a stress concentration). To ensure a high-quality joint, these properties need to be quantified by measurements, e.g., by non-destructive testing.

Additionally, if the quality is high, effects like the metallurgical notch (e.g., the transition between weld metal and heat-affected zone) will probably have an influence on the fatigue strength. If this metallurgical notch is not present, for example due to a heat treatment, the material strength is expected to affect the resulting fatigue strength.

There are seven typical failure mechanisms occurring for cyclically loaded flush ground butt-welded joints, see Fig. 13:

  (1) Failure might start from imperfections in the weld, such as porosities, micro-cracks, or areas with lack of fusion.
  (2) In some cases of one-sided welding, the weld root is not fully melted and a lack of penetration occurs.
  (3) If no or only very small weld defects are present, fatigue failure may initiate in the base material due to an increase in hardness and subsequently a higher (static) strength of the weld metal.
  (4) Due to the removal of the stress concentration at weld transitions, there is a higher risk of failure occurring in the clamping area of the test specimen.
  (5) In case of high strength materials, failure might also happen in the heat-affected zone, in which a drop in hardness and subsequently in fatigue strength occurs.
  (6) For joints with minor imperfections in the fusion zone, cracks might start at the metallurgical notch, i.e., in the transition between weld metal and heat-affected zone.
  (7) Depending on the grinding procedure, marks may be present from which failure can initiate. These are especially critical if perpendicular to the loading direction.

Fig. 13 Possible failure locations of flush ground butt-welded joints

The severity of welding defects on fatigue strength of flush ground butt joints is illustrated in Fig. 14. Herein, all data found in literature on flush ground joints is presented; however, only primary literature sources are considered. This presentation now includes all data that was previously removed due to violations of the strict requirements (stress ratio ≥ 0, only axial loading, no round specimens machined from butt joints, etc.) or because severe welding defects were reported for the specimens. In total, 318 specimens with information about the defect type were extracted from literature. Typically, these defects were larger than the acceptance criteria of the welding quality standard ISO 5817 [65]. Run-outs are still removed due to the large amount of data.

Fig. 14 Presentation of flush ground specimens with and without known welding defects

It is assumed that not all publications reported welding defects; however, the comparison between data with and without known welding defects presents strong evidence for the severity of welding defects in flush ground butt joints. Although some specimens contained porosity of up to 6%, or lack of fusion defects or cracks of several millimeters in length, the majority of specimens with severe welding defects still lie above the FAT112 curve with slope exponent m = 5. This clearly highlights the level of conservatism of this design curve.

6.2 Thickness effect

Within a current research project [66] combining a meta study of gathered test data and new experimental results, there is a tendency that the actual fatigue test results do not depend strongly on the plate thickness, see Fig. 15. For this analysis, the data, filtered as explained in “Section 5.2” and without the circular hollow sections, were analyzed. This covers flat butt joints welded from one side as well as from two sides. Regarding the grinding, some specimens are flush ground from one side only, but more than 70% are flush ground from both sides. These data were transformed to two million cycles with a slope of m = 5, as this matches the actual slope exponent better than m = 3, see “Section 3.2.”
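The transformation of individual test results to two million cycles can be sketched as follows, assuming that each data point is shifted along a line of slope m through the point itself. The function name `to_reference_cycles` and the example values are hypothetical.

```python
def to_reference_cycles(stress_range, cycles, m=5.0, n_ref=2e6):
    """Equivalent stress range at n_ref cycles for one test result.

    Uses S_ref^m * n_ref = S^m * N along a line of slope m.
    """
    return stress_range * (cycles / n_ref) ** (1.0 / m)


# Hypothetical results: (plate thickness in mm, stress range in MPa, cycles).
results = [(10.0, 250.0, 8.0e5), (20.0, 220.0, 1.5e6), (40.0, 200.0, 3.0e6)]

for t, s, n in results:
    print(f"t = {t:4.0f} mm: {to_reference_cycles(s, n):.0f} MPa at 2e6 cycles")
```

After this step, all results refer to the same number of cycles and can be plotted directly against plate thickness.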

Fig. 15 Dependency of the stress range (transformed to 2 million cycles with m = 5) on the plate thickness for flush ground butt-welded joints, with data from [13, 15,16,17, 22, 23, 25, 29,30,31,32, 34, 36, 37]

Although there is a high scatter of the data in Fig. 15, a lower limit seems applicable. The scatter may be the result of fabrication tolerances, different surface treatments, and other dominating influences. In summary, based on the available data, it is not possible to come to a clear conclusion regarding the thickness exponent n = 0.2 in Eurocode 3 [42]; however, prEN 1993-1-9 [64], Table 10.4 Note 1, allows the size effect for t > 25 mm to be considered with an exponent of 0.1 for details that are ground flush, where t is the thinner plate thickness in mm for which the stress range is calculated. Similarly, other standards include a thickness correction for flush ground joints, as can be seen from Table 1.

6.3 Specimen types and recommendations for practice

A clear difference in fatigue strength between the specimen types is already evident from Fig. 3. The circular hollow sections show by far the lowest fatigue strength of the three specimen types. One could argue that this is related to the difference in size and specimen preparation; yet, the investigation of the thickness effect and the statistical investigation ruled out a purely size-related effect. In contrast, the type of specimen preparation could be linked to the difference in fatigue strength if other related influencing factors are accounted for. Clearly, machined specimens show no effect of misalignment and will typically have lower surface roughness than flat specimens or circular hollow sections, which are treated with a regular grinding tool. Unfortunately, detailed information on surface quality and misalignment is rare in the literature; nevertheless, it can be expected that the larger the component, the more critical aspects such as misalignment become, since they are more difficult to control. Because fatigue resistance curves are primarily used to design large-scale structures, a correct estimation of the behavior of these structures is paramount.

To update the IIW recommendations, a link between the fatigue strength of flush ground butt-welded joints and the quality criteria is proposed. For joints that fulfill the general requirements of the as-welded state (e/t < 10% and FAT80), a FAT112 class is proposed with a slope exponent m = 5, which corresponds to a fatigue strength increase of three FAT classes. If joints fulfill the following quality criteria, a FAT125 class (also m = 5) is deemed suitable:

  (1) 100% non-destructive testing

  (2) No grinding marks transverse to the main loading direction

  (3) Misalignment < 5% of plate thickness

  (4) Full removal of the weld overfills with no remaining undercut

This increase is lower than the actual increase in fatigue strength observed in Fig. 12; however, as the majority of test specimens were small-scale specimens, it cannot be ruled out that the fatigue strength increase in full-scale components is lower than observed in this study. Even if axial misalignment and surface quality are better than observed for the circular hollow sections tested by Salama and Liu [34], granting an increase of more than three FAT classes currently does not seem justifiable. If more full-scale test results become available in the future, this recommendation could be updated.
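The proposed link between quality criteria and FAT class can be written as a simple decision rule. The sketch below is only an illustration of the proposal described above; the class and field names are hypothetical, and returning None for joints outside the as-welded misalignment requirement is an assumption of this sketch, not part of the proposal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JointQuality:
    misalignment_ratio: float        # axial misalignment e divided by plate thickness t
    ndt_coverage: float              # fraction of the weld tested, 0.0 to 1.0
    transverse_grinding_marks: bool  # marks transverse to the main loading direction
    overfill_fully_removed: bool     # weld overfill completely ground off
    undercut_remaining: bool         # any undercut left after grinding


def proposed_fat_class(q: JointQuality) -> Optional[int]:
    """Return the proposed FAT class (slope exponent m = 5), or None if out of scope."""
    if q.misalignment_ratio >= 0.10:
        return None  # as-welded requirement e/t < 10% not fulfilled
    high_quality = (
        q.ndt_coverage >= 1.0
        and not q.transverse_grinding_marks
        and q.misalignment_ratio < 0.05
        and q.overfill_fully_removed
        and not q.undercut_remaining
    )
    return 125 if high_quality else 112


# Hypothetical joint: all criteria met except misalignment (7% of plate thickness).
joint = JointQuality(0.07, 1.0, False, True, False)
print(proposed_fat_class(joint))
```

Keeping the four criteria in one boolean expression makes it explicit that the higher class requires all of them at once, while a single unmet criterion falls back to FAT112.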

The formulation of a higher FAT class for Eurocode 3 (FAT125 for m = 3) is reasonable, as the restriction to two-sided welding and to splices in plates and flats clearly leads to a higher fatigue performance. In this regard, permitting a higher FAT class than for the IIW detail category is justified by the stricter requirements (Table 4).

Table 4 Proposal for an updated IIW detail category for “Transverse loaded butt weld (X-groove or V-groove) ground flush to plate”

7 Conclusions

This study aimed at gaining a better understanding of the factors influencing the fatigue strength of flush ground butt-welded joints made of steel. To this end, data from 1003 fatigue tests were gathered from various literature sources. Next, statistical methods based on correlation analysis were employed to quantify the impact of typical influencing factors on fatigue strength and to determine a suitable slope exponent for this weld improvement technique. Then, fatigue resistance curves were determined based on Eurocode 3 and IIW best practice. Finally, proposals for updates of rules and guidelines were established by determining new FAT classes based on the current and the newly determined slope exponent. In addition, an overview of design standards and recommendations was given and the main influencing factors were discussed. The following conclusions can be drawn from the investigation:

  (i) A comparison of the results with the fatigue resistance curves for the as-welded state shows that flush grinding has a large positive effect on the fatigue strength of butt-welded joints, which exceeds current recommendations for this weld detail.

  (ii) During the data assessment on flush ground butt-welded joints, a large scatter of fatigue test results was observed, which was linked to different fatigue specimen types.

  (iii) From the statistical evaluation of the S-N curves determined by the maximum likelihood method, a mean slope of m_mean = 8.19 with a standard deviation of m_std = 3.56 was obtained for the 18 exploitable test series on flat specimens. The observed slope exponent is significantly higher than the current recommendation of m = 3. This agrees with results of other studies on geometrical improvement of welds by weld toe grinding and weld profiling [11], and by tungsten inert gas dressing [9]. To not exceed the slope exponent currently recommended for base materials, it is proposed to use a slope exponent of m = 5 for design. To not obstruct the current design practice of Eurocode 3, design values are also derived for m = 3.

  (iv) From the correlation analysis, no correlation is observed between the fatigue strength of the test specimens and typical influencing factors such as parent material strength, stress ratio, and specimen size-related properties (plate thickness, seam width, cross-sectional area); however, a slight negative effect of increasing stress ratio on fatigue strength improvement was observed.

  (v) Stress-life curves were determined according to Eurocode 0 and IIW best practice, following the specific requirements of both guidelines on stress ratio, range of number of cycles to failure, loading type, etc. Using Eurocode 0, a characteristic fatigue strength of 137 and 193 MPa was determined at 2·10^6 cycles for m = 3 and m = 5, respectively. Similarly, an increase to 123 and 166 MPa (m = 3 and m = 5) was observed following the IIW best practice for fatigue test data evaluation. The difference in determined characteristic fatigue strength is related to differences in detail description (Eurocode 3 is limited to flat plates) and in the assumed knee point (5·10^6 vs. 10^7 cycles).

  (vi) Assessing the available test data, no strong thickness effect was observed. This confirms the tendency toward a smaller thickness exponent for flush ground butt-welded joints compared to as-welded joints, as in the new draft of Eurocode 3.

  (vii) Finally, a possible update of Eurocode 3 to FAT125 (m = 3) and of the IIW detail category to FAT125/FAT112 (m = 5), respectively, is proposed. The reason for the higher FAT class for Eurocode 3 is linked to different detail descriptions. The two FAT values proposed for the IIW recommendations are related to the higher observed fatigue strength if high-quality criteria are met. For structures where the highest requirements on misalignment and surface quality cannot be met, the lower (current) FAT112 class should be applied, whereas for structures of excellent quality the higher FAT125 class may be permitted.
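The practical difference between the detail categories proposed in conclusion (vii) can be checked numerically. The sketch below assumes single-slope design curves N = 2·10^6 · (FAT/Δσ)^m under constant amplitude loading and ignores knee points and cut-off limits; the function name and the chosen stress ranges are hypothetical.

```python
def design_life(stress_range_mpa, fat_mpa, m, n_ref=2.0e6):
    """Cycles to failure on a single-slope design curve through (n_ref, fat_mpa)."""
    if stress_range_mpa <= 0 or fat_mpa <= 0:
        raise ValueError("stress range and FAT class must be positive")
    return n_ref * (fat_mpa / stress_range_mpa) ** m


# Proposed categories from the text; the stress ranges are hypothetical.
categories = {
    "Eurocode 3, FAT125, m = 3": (125.0, 3.0),
    "IIW, FAT125, m = 5": (125.0, 5.0),
    "IIW, FAT112, m = 5": (112.0, 5.0),
}
for stress_range in (150.0, 200.0):
    for label, (fat, m) in categories.items():
        life = design_life(stress_range, fat, m)
        print(f"{stress_range:.0f} MPa | {label}: {life:.3g} cycles")
```

Curves with the same FAT class but different slope exponents intersect at two million cycles. For stress ranges above the FAT value, the m = 5 curve gives a shorter design life than the m = 3 curve, and for stress ranges below it, a longer one.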