1 Introduction

Railway turnouts are high-value and maintenance-intensive parts of the railway superstructure. They are a limiting factor for a reliable and cost-effective railway infrastructure because of their short lifecycles and the difficulty of predicting the remaining useful life of turnout elements [1]. Despite significant advances in the application of automated measurement and diagnostic systems [2, 3] in the course of railway digitalization, the predictability of the crossing lifetime remains low. The reasons include not only the large uncertainty of inertial measurements caused by many random influences [4] but also the complexity of the degradation processes [5].

Many different systems [6] are used for the inspection of common crossing rolling surfaces: profile, surface scan and video inspection, microstructure imaging, eddy current and ultrasound, vehicle-based and track-based inertial measurements (Fig. 1). However, none of the systems can yet replace the conventional inspection method, which includes expert judgement based on visual estimation and acoustic perception of train impacts.

Fig. 1 Common crossing inspection methods

Profilometer methods deliver information on the longitudinal wheel trajectory and the cross section of common crossings, which is simple to interpret. Their main drawback is that the measurement takes place in an unloaded state. Surface scanning methods have similar limitations, and the additional measurement information they deliver can demand additional interpretation. The German Railways (DB AG) tested two laser surface scanning methods for common crossings, Scorpion and Lauros (Fig. 2). Both methods depict the wear state of the rolling surface and additionally provide instantaneous high-resolution imaging.

Fig. 2 Surface scanning methods for common crossings (left, Scorpion [7]; right, Lauros [8])

The conventional methods for rail fatigue assessment are eddy current and ultrasonic methods [9]. Both can primarily be considered fault-detection methods that are not able to detect crack origination. Vehicle-based inertial measurement, such as the axle box ESAH-F measurement system [3], which is installed on regular trains, allows a large number of turnouts to be inspected at low cost. The application of the system is limited to the detection of existing faults or wear, without prediction of common surface damage.

Magnetic particle inspection (MPI) is a reliable method to detect surface features. However, it is a very time-consuming inspection method with a low degree of automation. High-resolution photo inspection (HRPI) would be a promising alternative to MPI. It enables highly automated application in measuring cars as well as easy practical inspection with mobile devices. Figures 3 and 4 show MPI and HRPI images of a frog nose rolling surface during its normal operation at 33 Mt and after rolling surface fatigue damage at 52 Mt. The MPI image at 33 Mt clearly shows a different crack pattern in the zones that correspond to the fatigue fault at 52 Mt. Thus, it can be considered a characteristic pattern of the future fatigue zones. However, it is difficult to assess the remaining useful life of the rolling surface, since the patterns can appear long before the visible cracks.

Fig. 3 MPI (above) and HRPI (below) images of the rolling surface on a frog nose after 33 Mt

Fig. 4 MPI (above) and HRPI (below) images of the rolling surface on a frog nose after 52 Mt

Unlike the MPI images, the HRPI images give no indication of the imminent surface fault at first sight. However, experienced experts can clearly detect it and consider the HRPI images no less informative than the MPI images. The main problem of high-resolution photo inspection is the difficulty of automatically recognizing cracks in the early phase of their emergence. It is therefore necessary to develop image processing methods that are able to transform the HRPI images into a form that corresponds to MPI images without substantial loss of information.

Image processing and machine learning methods are successfully used in civil and transportation engineering. A dimensionality reduction method, like principal component analysis, is used in the study [10] to assess the condition of bridges using the data collected during visual inspections. Fatigue fracture diagnostics of building structure elements, including microhardness measurements and statistical processing, are used in [11]. An evaluation of railway ballast consolidation with discriminant and cluster analysis is proposed in [12]. A histogram-based image segmentation method, which is proposed in [13], is a promising technique for pre-processing HRPI images of the rail rolling surface. Raster image processing methods are used in [14] for the extraction of objects of interest from point clouds and their automatic classification.

Deep learning models are applied in [15] to improve the automated processing of crack images in concrete structures. The machine learning approach to predictive detection that is introduced in [16] could be used to improve the predictive detection of rolling surface degradation. A rail surface inspection method using deep learning and image processing is proposed in the study [17]. Paper [18] shows an approach for semi-supervised rail defect detection with the aim of improving the performance of squat detection. The authors of the study [19] propose an improvement of rail head surface defect detection based on video inspection and image processing methods. An early detection of common crossing rolling contact faults with vehicle-based inertial measurements and machine learning methods is studied in [20]. A mechanical model of the short-term dynamic interaction and long-term settlements of a common crossing is presented in [21]. The modelling results are compared with those of the on-board inertial measurements. The results can be used to improve common crossing lifecycle prediction.

The general feature of the reviewed studies of rolling surface diagnostics is that most of them consider fault detection at a late stage of fault development, without predicting the subsequent growth. The aims of the present research are the objectification and automation of the conventional human visual inspections, as well as the exploration of the possibilities for early prediction of rail contact fatigue in common crossings.

2 Approach Description in General

The main part of the paper addresses the following problem: detecting the feature changes in the crack images that are statistically related to the frog lifetime. The problem is solved by using image processing and statistical image analysis. A workflow diagram of the present research is shown in Fig. 5.

Fig. 5 Workflow diagram of image processing and statistical image analysis

The image processing and statistical image analysis are based on data collected from one frog during its lifecycle. The data consist of 5 MPI images of the frog’s rolling surface after 13, 22, 33, 43 and 52 Mt. The image processing and the subsequent statistical analysis are based on those MPI images. Additional information sources are the HRPI images after 13, 33 and 52 Mt, which are used as independent information for the validation of the statistical model.

3 Image Pre-processing and Feature Detection

To prepare the images for the feature detection, they are initially improved with morphological image enhancement techniques, which increase the signal-to-noise ratio of images. The subsequent dilation and erosion operations are used to remove small objects from an image and to smooth the border of large objects [22].
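The effect of combined erosion and dilation can be illustrated with a minimal sketch. It assumes a binary image and a 3 x 3 structuring element, and it applies erosion first and dilation second (a morphological opening); the source states neither the order of the operations nor the structuring element, so both are assumptions made for illustration only.

```python
# Minimal sketch of a morphological opening on a binary image.
# Assumptions (not from the source): binary pixels, 3x3 structuring element,
# erosion applied before dilation, pixels outside the image treated as 0.

def _neighbourhood(img, r, c):
    """Return the 3x3 neighbourhood values around (r, c)."""
    rows, cols = len(img), len(img[0])
    return [
        img[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
        for rr in range(r - 1, r + 2)
        for cc in range(c - 1, c + 2)
    ]

def erode(img):
    """A pixel survives only if its whole 3x3 neighbourhood is set."""
    return [[int(all(_neighbourhood(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]

def dilate(img):
    """A pixel is set if any pixel in its 3x3 neighbourhood is set."""
    return [[int(any(_neighbourhood(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]

def opening(img):
    """Erosion followed by dilation: removes objects smaller than the element."""
    return dilate(erode(img))

# A 3x3 object and one isolated noise pixel at row 2, column 5.
image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
cleaned = opening(image)
```

In this toy image the isolated pixel is removed while the 3 x 3 object is restored to its original extent, which is the behaviour described above for small objects.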

The extraction of the meaningful information from the improved images is carried out by using image analysis. During the analysis, the independent image objects are detected and their properties are measured. Table 1 lists all properties used, with a description of the shape measurements and their abbreviations.

Table 1 The measured properties and their abbreviations

After the enhancement techniques, the MPI images still show a large number of objects that are evidently not related to crack images. The line-shaped objects are extracted as a pre-processing step, using two simple conditions: the Ar feature is limited to a range between a minimal and a maximal value, and the PAR feature must be greater than 0.5. Figure 6 shows the MPI image, filtered using the two conditions, with object colours corresponding to their values of the PAR feature. A remarkable observation is that the cracks with a large PAR value, which are mostly small cracks, are randomly distributed over the rolling surface. Hence, the PAR feature alone cannot be considered a characteristic for the estimation of the rolling surface state.
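The two filter conditions can be sketched as follows. The class `ImageObject`, the function `select_line_shaped` and the Ar limits `AR_MIN` and `AR_MAX` are hypothetical names and values introduced for illustration; only the PAR limit of 0.5 is taken from the text.

```python
from dataclasses import dataclass

@dataclass
class ImageObject:
    """Hypothetical container for one detected image object."""
    label: int
    ar: float   # Ar feature as measured on the object
    par: float  # PAR feature as measured on the object

# Hypothetical Ar limits: the source does not state the actual values.
AR_MIN = 20.0
AR_MAX = 5000.0
# PAR limit as stated in the text.
PAR_MIN = 0.5

def select_line_shaped(objects, ar_min=AR_MIN, ar_max=AR_MAX, par_min=PAR_MIN):
    """Keep objects whose Ar lies within the limits and whose PAR exceeds par_min."""
    return [o for o in objects if ar_min <= o.ar <= ar_max and o.par > par_min]

# Made-up objects, for illustration only.
objects = [
    ImageObject(label=1, ar=5.0, par=0.9),     # rejected: Ar below the lower limit
    ImageObject(label=2, ar=150.0, par=0.7),   # kept
    ImageObject(label=3, ar=300.0, par=0.3),   # rejected: PAR not above 0.5
    ImageObject(label=4, ar=9000.0, par=0.8),  # rejected: Ar above the upper limit
]
kept_labels = [o.label for o in select_line_shaped(objects)]
```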

Fig. 6 Image objects with their PAR features (43 Mt)

4 Preliminary Statistical Analysis

The overall data set of crack objects from the 5 images taken during the frog’s lifecycle contains 940 observations with 12 predictors and 5 classes of the response variable. A preliminary statistical analysis is carried out to describe the general properties of the data. The features have different dimensions, e.g. linear, angular or area measures, which makes it difficult to compare them. To simplify the problem, all the data are reduced to dimensionless, normalized values. In their normalized state, the mean values of the features equal 1.
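One simple normalization that yields dimensionless features with a mean of 1 is division of each feature by its own mean. The sketch below assumes this scheme; the source does not state which normalization was actually applied, and the function name and sample values are hypothetical.

```python
def normalise_by_mean(columns):
    """Divide every feature column by its mean so that the normalised mean is 1.

    `columns` maps a feature name to a list of measured values.
    Assumption: normalisation by the mean; a feature with zero mean
    cannot be scaled this way and is rejected.
    """
    normalised = {}
    for name, values in columns.items():
        mean = sum(values) / len(values)
        if mean == 0:
            raise ValueError(f"feature {name!r} has zero mean")
        normalised[name] = [v / mean for v in values]
    return normalised

# Made-up values with different physical dimensions (area and length).
features = {
    "Ar": [200.0, 400.0, 600.0],
    "Per": [30.0, 60.0, 90.0],
}
scaled = normalise_by_mean(features)
```

After scaling, both hypothetical features take the values 0.5, 1.0 and 1.5, so they can be compared directly even though their original units differ.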

The second particularity of the data is a very large asymmetric variability. Figure 7, with boxplots for all 12 features, shows that for most of them the variability between the lower and upper quartiles is comparable with their median value. The Orn feature (the angle between the x-axis and the major axis of the ellipse), which corresponds to the crack orientation, has a much larger variability because it takes both positive and negative values.

Fig. 7 Box plots of features for image objects

The mean values of the features depending on the tonnage are depicted in Fig. 8. The trend of the mean values is rather random and ambiguous.

Fig. 8 The trend of the feature mean values

Figure 9 shows the variability of the data along the frog’s lifecycle. Many features show very similar values, which means that they do not carry new information and can therefore be considered redundant. The diagrams show no evident relation of the measured features to the frog’s lifetime.

Fig. 9 Normalised feature values of image objects

5 Principal Component Analysis and Feature Selection

To reveal the relations between the features that are still hidden in Figs. 8 and 9, principal component analysis (PCA) is used. PCA is mostly used as a means of exploratory data analysis and for making predictive models [23, 24]. It helps to reveal the internal structure of the data in a way that best explains the variance in the data. The idea of PCA is to replace a group of variables with new variables, called principal components. Each principal component is a linear combination of the original variables. All the principal components are orthogonal to each other, so there is no redundant information [25]:

$$z_{i1} = \varphi_{11} x_{i1} + \varphi_{21} x_{i2} + \cdots + \varphi_{p1} x_{ip}$$

where \(z_{i1}\) are the scores and \(\varphi_{11} , \ldots ,\varphi_{p1}\) are the loadings of the first principal component.
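The computation of the loadings and scores of the first principal component can be sketched as below. The sketch assumes that the features are centred before the linear combination is formed, and it uses power iteration on the sample covariance matrix instead of a full eigendecomposition; both choices are made for illustration, and the data are made up.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of the first principal component.

    `rows` is a list of observations, each a list of p feature values.
    Assumptions: features are centred here; power iteration is used, which
    converges unless the start vector is orthogonal to the first component.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix of the centred features.
    cov = [[sum(x[a] * x[b] for x in centred) / (n - 1) for b in range(p)]
           for a in range(p)]
    phi = [1.0 / math.sqrt(p)] * p  # unit-length start vector
    for _ in range(iterations):
        w = [sum(cov[a][b] * phi[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:
            break
        phi = [v / norm for v in w]
    # Scores: z_i1 = phi_11 * x_i1 + ... + phi_p1 * x_ip for each observation i.
    scores = [sum(phi[j] * x[j] for j in range(p)) for x in centred]
    return phi, scores

# Made-up data in which the second feature is exactly twice the first.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
loadings, scores = first_principal_component(data)
```

For these perfectly correlated toy data the loadings are proportional to (1, 2), so the single component carries the whole variance of both features.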

The following Pareto diagram (Fig. 10) shows the first 6 components, which explain 95% of the total variance. The first component explains a larger share of the variance than all other components together.
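The number of components needed to reach a given share of the total variance follows from the cumulative sum of the component variances. The sketch below uses made-up variances for 12 components, chosen only so that the outcome resembles the situation described above; they are not the values of the present data set.

```python
def components_for_variance(variances, target=0.95):
    """Return (k, share): the smallest number of components k whose
    cumulative share of the total variance reaches `target`."""
    total = sum(variances)
    cumulative = 0.0
    ordered = sorted(variances, reverse=True)
    for k, value in enumerate(ordered, start=1):
        cumulative += value / total
        if cumulative >= target:
            return k, cumulative
    return len(ordered), cumulative

# Hypothetical component variances (illustration only, not measured values).
variances = [6.5, 2.0, 1.2, 0.9, 0.6, 0.4, 0.2, 0.1, 0.05, 0.03, 0.01, 0.01]
k, share = components_for_variance(variances)
first_share = variances[0] / sum(variances)
```

With these hypothetical values, six components are needed to exceed 95%, and the first component alone accounts for more than half of the total variance.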

Fig. 10 The Pareto diagram of total variance explained with PCA components

However, Fig. 10 cannot explain the reason for the variance. A particularity of PCA is that it is an unsupervised approach, since it involves only a set of features. To find out which component corresponds to the response variable, i.e. the frog’s lifetime, a scatter plot of the data in the coordinates of the first two components is built (Fig. 11). A biplot is added that visualizes the magnitude and sign of each feature’s contribution to the first two components. The first principal component, on the horizontal axis, is strongly influenced by the features Ar, Per, ConvAr and EqDm and less strongly by MinAxLn and MajAxLn, all of them in the positive direction. The second principal component, on the vertical axis, has positive coefficients for the PAR and EN features and negative coefficients for Ext, Sol and Orn. The different lifetimes are depicted using colours from red to blue. Apparently, more blue points, with a tonnage of 43 and 52 Mt, are concentrated in the positive direction of the second component. Although the first component has a weight four times larger than that of the second, it has no relation to the frog’s lifetime.

Fig. 11 The first two principal components and feature vectors

The PCA shows that many of the 12 features are redundant. To select the meaningful features for the first two components, their weights are plotted for each feature (Fig. 12). The Pc1 has the following four most meaningful features and weights: Ar(0.21), ConvAr(0.94), FilAr(0.22) and Per(0.13). The Pc2 has the following: Ar(0.60), MajAxLn(0.11), ConvAr(0.33), FilAr(0.61) and Per(0.36).

Fig. 12 The features and weights of Pc1 and Pc2

Both Pc1 and Pc2 are influenced by a similar range of features: Ar, MajAxLn, FilAr and Per. The only difference between Pc1 and Pc2 lies in the feature ConvAr, which has a negative value for Pc2. The relation of the mean values of Pc1 and Pc2, computed with the selected features, to the frog’s lifetime is shown in Fig. 13. The Pc values shown are not normalised, but the initial values are shifted to 0 for convenient visualisation. Figure 13 shows a monotonic relation of Pc2 to the frog’s lifetime. Although Pc1 has a much larger weight than Pc2, it cannot provide an unambiguous relation to the lifetime. The variance of the Pc2 value could be explained by differences in the image acquisition with the MPI technique, such as contrast, brightness of illumination, etc.

Fig. 13 The mean Pc1 and Pc2 values versus the frog’s lifetime

The main advantage of PCA is that it provides an easily interpretable indicator that demonstrates systematic variation versus crossing lifetime and takes into account the most significant features. The disadvantage is that PCA is an unsupervised learning approach that does not take the response variable into account. Therefore, the relation of the principal components to the crossing lifetime is determined after the analysis. Supervised linear techniques, like partial least squares regression or lasso regularisation [24, 25], could provide additional improvements.

6 Validation of the Method

The discovered relation of Pc2 to the tonnage only indicates some systematic changes in the form of MPI cracks during the crossing’s lifetime. It should be verified whether the changes also have a relation to the faults observed on the rolling surface. The validation of the developed statistical method is performed using an independent information source: the available HRPI images. The problem of the validation using MPI images is that the crack images disappear as soon as the visual fault occurs. Therefore, the last MPI images before the fault occurrence are used for the validation. Figure 14 (above) shows the MPI image with crack ranking according to the Pc2 criterion for a lifetime of 43 Mt. Cracks that are most likely to cause surface faults are coloured red. Figure 14 (below) shows the HRPI image of the same part of the rolling surface at 52 Mt, after the first damage has appeared. Two prominent groups of cracks are found in Fig. 14 (below): the right group has grown into surface damage at 52 Mt, while the left group shows an evident initiation of damage. Serious damage could be expected within the next 10 Mt.

Fig. 14 The predicted faults with MPI imaging at 43 Mt (above) and appeared faults after 52 Mt (below)

7 Prediction of Remaining Useful Life Based on the Inspected Principal Component Value

To determine the statistically significant relation of Pc2 to the lifetime, a regression analysis is carried out. The regression diagram (Fig. 15) depicts the data points in Pc2 form, and a polynomial fit with the 95% function confidence bounds. The HRPI image for 52-Mt inspection shows a frog surface fault (zone highlighted in red in Fig. 15).

Fig. 15 The regression of Pc2 values versus the frog’s lifetime

To reveal the possibilities of the remaining life prediction, the following considerations are taken into account (Fig. 16). The inspection delivers a principal component (PC) value at some inspection time \(B_{R,0}\). The intersection of the value with confidence bounds delivers \(B_{R,\hbox{min} }\) and \(B_{R,\hbox{max} }\). The estimation error of the remaining useful life can be determined with the three values as well as the fault prediction horizon.
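The intersection of an inspected PC value with the confidence bounds can be sketched numerically. The sketch assumes that the bounds increase monotonically with tonnage, so that the value meets the upper bound at the smaller tonnage and the lower bound at the larger one. The linear bounds stand in for the polynomial fit of the text, and all numbers and function names are hypothetical.

```python
def invert_increasing(curve, target, lo, hi, tol=1e-6):
    """Bisection: tonnage in [lo, hi] at which an increasing curve reaches target."""
    if not curve(lo) <= target <= curve(hi):
        raise ValueError("target value is not bracketed by the search interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if curve(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def tonnage_interval(pc_value, lower_bound, upper_bound, lo, hi):
    """Return (b_min, b_max) where the horizontal line at pc_value meets the bounds.

    For increasing bounds the upper bound is met first (b_min),
    the lower bound last (b_max).
    """
    b_min = invert_increasing(upper_bound, pc_value, lo, hi)
    b_max = invert_increasing(lower_bound, pc_value, lo, hi)
    return b_min, b_max

# Hypothetical confidence bounds around a linear stand-in for the fitted curve.
def upper(b):
    return 0.01 * b + 0.05

def lower(b):
    return 0.01 * b - 0.05

# Hypothetical inspected PC value and tonnage search range in Mt.
b_min, b_max = tonnage_interval(0.40, lower, upper, lo=10.0, hi=60.0)
interval_width = b_max - b_min
```

With these made-up bounds the inspected value corresponds to a tonnage between about 35 and 45 Mt; the width of this interval is one way to express the uncertainty that enters the remaining useful life estimate.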

Fig. 16 Assessment of the remaining useful life and prognosis error

The first indication of the future crack development can already be found in the crack features of the 33-Mt MPI images. However, due to the prognosis error, a prediction earlier than 38 Mt (Fig. 15) is not possible. The beginning of visual rail surface faults is also difficult to determine exactly because of the relatively infrequent inspections. It can therefore be supposed that the significant fault appeared approximately in the middle between the two neighbouring inspections at 43 and 52 Mt. Consequently, the prediction of the fault is possible within a prediction horizon of not more than 11 Mt.

8 Conclusion and Subsequent Studies

The investigation of MPI images of the rolling surface on a crossing nose during a frog’s lifecycle has shown that they have a significant statistical relation to the frog’s lifetime. The relation is, however, not evident from simple statistical estimation because of the strong influence of random factors. The application of PCA helps to select the meaningful features carrying information about the fatigue state of the rolling surface. The method validation, using new independent data, shows that the method is able to find the cracks that lead to surface damage. The regression analysis shows that by using surface crack images it is possible to detect the state changes and forecast rolling surface damage within a prediction horizon of up to 11 Mt.

However, the presented method also has some drawbacks that could be considered as problems for subsequent studies. The major practical problem is that MPI imaging is a time-consuming inspection method for railway infrastructure with a low degree of automation. The HRPI inspection method could be used as an alternative, but its application raises the problem of image processing. The problem could be solved using deep learning image processing methods.

Another way to extend the prediction horizon lies in the improvement of the statistical information. The main problem here is that the number of cracks that lead to damage is small relative to the overall data set. The problem could be solved in two ways: either by extending the data set with new information, or by performing a qualitative improvement of the existing information. New feature measures could be used to describe the internal properties of the crack objects.