Research on Subhealth Diagnosis Method for Resistance of Urban Rail Transit Door System

The rail vehicle door system is one of the key components of rail vehicles, and its failures account for more than 30% of all vehicle failures. Analyzing the early warnings provided by subhealth data from the door system can effectively improve the efficiency and reliability of its maintenance and help guarantee stable operation. In this paper, early-stage resistance changes in the subhealth state of rail vehicle door systems are taken as the research object. Firstly, the distribution rules for the motor parameters are studied, and the time-domain and normal operating envelope features of the operating motor are extracted. Secondly, subhealth conditions with different resistances are simulated using a test rig, and the experimental data are used to summarize the rules. According to the subhealth types and the distribution of features, diagnostic rules for subhealth are formulated. To check the feasibility of the diagnosis, the rules are verified in MATLAB using data from a running rail vehicle door system. The results show that the misdiagnosis rate of resistance subhealth is 0%, while the missed diagnosis rate is 2%. Meanwhile, the diagnostic process based on the established rules is relatively efficient. The method is suitable for resistance subhealth diagnosis of urban rail vehicle door systems.


Introduction
As one of the key components of rail vehicles, the door system has an important impact on the safe operation of trains. Owing to the complexity of its structure and operating environment, many problems caused by passenger squeezing, train vibration, aging of electronic components, and wear of components arise during the operation of rail vehicles [1]. According to statistics, door system failures account for more than 30% of all rail vehicle failures [2], thus posing a serious threat to train safety and requiring urgent solutions. Plenty of work has been carried out on the problem of frequent failures of rail vehicle door systems. For example, Long et al. [3] collected the angle, speed, and current signals from the door system's motor to extract their time-domain features, providing a basis for door system fault diagnosis. Han et al. [4] designed an intelligent diagnosis system based on Big Data for a rail vehicle door system. Faults could be effectively classified and diagnosed by applying Big Data analysis and artificial intelligence diagnosis algorithms to process the data. To deal with the huge and redundant data provided by rail vehicle door systems, Chen et al. [5] proposed a fault diagnosis method based on the information gain rate. As the amount of data increases, this model automatically adjusts to improve its accuracy. The high accuracy of fault diagnosis was verified by experiments. Shi et al. [6] proposed an unsupervised anomaly detection method for a rail vehicle door system (RVDS) using a density peak clustering (DPC) algorithm. Based on Euclidean distance, multiple door systems are compared regularly, and those door systems exhibiting abnormal conditions are screened and labeled. This method has been applied on the North Extension of Guangzhou Metro Line 3. Chen et al. [7] proposed a fault diagnosis method for railway vehicle door systems based on a Bayesian network. Firstly, a fault model was established to calculate the prior probability. Secondly, the posterior probability was obtained on the basis of the Bayesian model. Finally, through simulation and experiment, the accuracy of the fault diagnosis was verified. Long et al. [8] proposed a health monitoring method based on door movement resistance analysis. Principal component analysis was applied to construct the health indicators for online monitoring. The experimental results showed that this method could accurately reflect the latent health status of the railway vehicle door system.
According to their causes, rail vehicle door faults can be divided into three categories: strong parameter changes, structural changes (changes in system information flow), and sensor actuator failures [9-11]. The research described above shows that most work on the diagnosis and maintenance of rail vehicle door systems has been based on diagnostic methods for strong parameter changes. After a failure occurs, the system analyzes the features to make a diagnosis [12], but this usually misses the best maintenance time. Moreover, in recent years more research has been carried out on fault diagnosis than on the subhealth state of rail vehicle door systems. The subhealth state is a failure in its early stage that changes the operating status of the door system slightly but does not affect its normal operation. There is thus an urgent need to study the abnormal state of door systems in the early stages of failure. The Shewhart control chart is a commonly used method for quality monitoring, in which statistical principles are applied to describe the distribution rules of the data and identify abnormal data. This method is widely used for quality control in laboratories, health status diagnosis in medicine, health monitoring of mechanical equipment, etc. Importantly, it does not require a complicated modeling process, and its data analysis efficiency is high, making it well suited to identifying abnormal data in the early stage of the subhealth state of door systems.
The Shewhart control chart method is thus applied herein to recognize the subhealth state of rail vehicle door systems based on early-stage (overall and local) resistance changes. Firstly, the time-domain features are extracted and analyzed to establish a threshold model. Secondly, two different experiments with abnormal resistances are conducted using an experimental platform, and the experimental data are used to formulate the subhealth rules based on Shewhart control chart theory. Finally, real-time data are collected on a running rail vehicle door system to verify the rules. The results confirm that this method can provide real-time warning of a subhealth state, thus effectively improving the efficiency and reliability of door system maintenance.

Fundamental Theory
The Shewhart control chart [13,14] is mainly used for quality control of industrial products and is also called a quality control chart. It is a graphical method that applies sample statistical principles to product quality control. Commonly used control charts can be divided into the mean-range ($\bar{X}$-$R$) control chart, mean-standard deviation ($\bar{X}$-$S$) control chart, single value-moving range ($X$-$R_m$) control chart, etc. Given that huge amounts of data are collected from the rail vehicle door motor and three variables in the online motor data (i.e., the motor speed, angle, and current) can be used to calculate quality indicators, the mean-standard deviation ($\bar{X}$-$S$) control chart is chosen for this research. The structure of the univariate Shewhart control chart is shown in Fig. 1. The horizontal axis represents the sample number of the time sequence sampling, whereas the vertical axis represents the measurement value of the monitored feature. The centerline (CL) is a standard used for the measurement values of the monitored feature, usually being the sample mean. The upper control line (UCL) and lower control line (LCL) are usually set as multiples of the standard deviation $\sigma$, usually being $3\sigma$. The monitored feature should be a random variable conforming to a normal distribution when using a multiple of $\sigma$ to set the upper and lower limits.
In Fig. 1, on the basis of the $3\sigma$ criterion, the probability that the sample feature falls within the range is 99.73%, leaving 0.27% for falling outside the range. There is therefore only a small probability that the monitored value will fall outside this range, and such a low-probability event is practically impossible in a single test. Consequently, once the monitored value falls outside the range, it is very likely that the process is out of control and must be adjusted. Applied to the control chart, the $\sigma$ criterion can detect random fluctuations and abnormal changes in resistance during the opening and closing of rail vehicle doors in a timely and accurate fashion. Control chart theory distinguishes two kinds of variation in statistical samples. One is random variation caused by accidental factors, while the other is an actual change in the process due to the causes under study. The result is judged to be abnormal when the measured values exceed the range of random error. An abnormal state of the collected sample can thus be detected by using control chart theory, which provides standards for identifying a statistical process that is out of control. Therefore, when using the Shewhart control chart to diagnose the subhealth state of an urban rail door system, it is necessary to obtain operating data for the healthy state of the door system in order to establish the upper and lower limit curves on the control chart. When examining data of unknown health status, if the number of data points exceeding the limits is too large, a performance degradation of the door system is identified.
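The three-standard-deviation limits described above can be illustrated with a minimal sketch. It assumes a single scalar feature, uses the sample standard deviation as the estimator, and runs on made-up reference values; the function names are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, stdev

def normal_coverage(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

def control_limits(healthy, k=3.0):
    """Return (LCL, CL, UCL) estimated from healthy-state reference samples."""
    cl = mean(healthy)
    s = stdev(healthy)  # sample standard deviation; the choice of estimator is an assumption
    return cl - k * s, cl, cl + k * s

def out_of_limit(samples, lcl, ucl):
    """Indices of monitored values that fall outside [LCL, UCL]."""
    return [i for i, x in enumerate(samples) if x < lcl or x > ucl]

# Made-up reference values standing in for a healthy-state feature.
healthy = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.05, 0.95]
lcl, cl, ucl = control_limits(healthy)
flagged = out_of_limit([1.0, 1.1, 2.5, 0.9], lcl, ucl)
print(round(normal_coverage(3), 4), flagged)
```

The first printed value reproduces the 99.73% coverage quoted for the three-standard-deviation range; the second lists the positions of the monitored values that would be treated as out of control.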

Subhealth Diagnosis Process
The diagnosis of the subhealth status of an urban rail door system is based on a data-driven approach. The key point is to formulate diagnostic rules through Shewhart control chart theory. The procedure is mainly divided into data preprocessing and model building steps. The specific processes are shown in Fig. 2.
1. Firstly, a signal acquisition system for the rail vehicle door system is established to obtain and save signals for detection, such as the motor speed, running angle, and output current.
2. The collected data are preprocessed, and fault and abnormal data are distinguished based on the overall characteristics. The normal motor speed data are then divided into segments according to the signal trend. This segmentation is the basis for the subsequent subhealth diagnosis.
3. Regarding the normal door opening and closing status, the upper and lower control limits are defined based on the current signal data. Simulated resistance equipment is used to obtain experimental data for the global and local resistance of the door system. The features are then segmented and refined to count the number of out-of-limit points corresponding to abnormal test samples and identify anomalies corresponding to the rules for the global and local resistance of the rail vehicle door.
4. Finally, the diagnostic rules established by the Shewhart control chart are verified against the running data of the main line, indicating the effectiveness of the subhealth diagnostic method presented herein.

3 Data Processing and Establishment of Subhealth Rules

Data Acquisition
The research object of this paper is the rail vehicle sliding plug door. Online information including the motor speed, angle, and current is collected from the door by a device integrated into the motor of the door system. The acquired data are transmitted to the host computer through a secondary computer and processed centrally by the host computer. Normally, the time for door opening or closing is about 3 s. Considering both the sampling accuracy and the transmission rate, the sampling interval is set to one recording point every 10 ms, so about 300 points are recorded during one door opening or closing.

Anomalous Data Filtering
The data obtained from the door system can be roughly divided into fault, subhealth, and normal data according to the severity of the fault. To reduce the interference from the fault component and ensure effective identification of the normal and subhealth data of the resistance variation, the signal collected from the door system must be filtered to remove fault and error points. Typical faults affecting the door system function are selected here according to existing door system faults and expert experience. They include (1) opening door obstacle monitoring, (2) closing door obstacle monitoring, (3) closed position switch not being triggered and off-position switch not being released, (4) door remaining shut in place without permission, and (5) door not unlocking in 3 s. These door system faults are then compared with the data for the whole and local resistance of the typical subhealth state. The motor's angle data are shown in Fig. 3, which plots the door's motor angle in different states using various colors and line types. It can be seen that, compared with the fault signal, the health and subhealth signals show obvious differences in terms of the time span and total travel.
From the speed, angle, and current of the door motor during door opening or closing, overall features are acquired, including the total time, total position, number of opening and closing directions, number of locked-rotor events, and stalling. The overall feature values are extracted as a basis for preliminary feature analysis. The rules for filtering the normal and subhealth data are based on differences in the overall feature values as presented in Table 1. Data that do not meet the rules in Table 1 are eliminated from the collected data. The normal and subhealth data are thus retained for subsequent feature extraction and rule establishment.

Segment Definition
After filtering the data, the normal and subhealth data for the rail vehicle door are obtained. The motor's angle, speed, and current data are obtained by the intelligent motor during the normal door opening or closing, as shown in Fig. 4. Based on the motor's speed trend during door opening or closing, the signal can be divided into four sections, including rising speed, uniform speed, slow speed, and the termination section. Figure 5 shows the speed curve of the motor in the four stages during the door opening and closing process.

Construction of Control Limits
The angle and speed data of the rail vehicle door motor cannot directly reflect the resistance change of the door system during door opening and closing, whereas the current quickly and intuitively reflects changes in state. Therefore, the statistical characteristics of the current signal are the main object of analysis.
To quantitatively measure the dispersion of the current data and effectively distinguish the severity of the resistance changes, multiple values of $\sigma$ are selected to set multiple limits for the statistical analysis of the data. Given the statistical data $X = \{x_1, x_2, \ldots, x_n\}$, three types of statistical distribution boundaries $Y_1$, $Y_2$, $Y_3$ are defined in formulas (1)-(3):

$$Y_1 = \mu \pm \sigma, \qquad Y_2 = \mu \pm 3\sigma, \qquad Y_3 = \mu \pm 6\sigma,$$

where

$$\mu = \frac{\sum_{i=1}^{n} x_i}{n} \tag{2}$$

and $\sigma$ is the standard deviation of $X$.

Real normal data collected during off-peak periods over more than 5 months are taken as the sample. First, the current sample data are aligned, and the probability density distribution of the statistical data is compared with a normal distribution. Figure 6 shows the probability density of multiple current curve data at 500 ms (in the rising section), at 1500 ms (in the uniform section), and at 3000 ms (in the slow section). In Fig. 6, the horizontal axis represents the current, while the vertical axis represents the probability density of the data distribution. The distributions of the data points at 500 ms, 1500 ms, and 3000 ms are compared with the normal distribution. Although they do not completely conform to the normal distribution, the symmetry and kurtosis of the probability density distribution at each location show that they are close to it. Therefore, most of the normal data can be contained by the distribution limits $Y_1$, $Y_2$, and $Y_3$.
The statistical probabilities of the data points sampled at 500 ms, 1500 ms, and 3000 ms lying within the distribution limits $Y_1$, $Y_2$, and $Y_3$ are presented in Table 2.
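The kind of coverage statistic reported in Table 2 can be sketched as follows. This is a minimal illustration on made-up values, assuming the curves are already aligned and using the population standard deviation; `coverage_at` is a hypothetical helper, not the authors' code. With the 10 ms sampling interval stated earlier, the 500 ms instant corresponds to sample index 50.

```python
from statistics import mean, pstdev

def coverage_at(curves, index, k):
    """Fraction of aligned curves whose value at `index` lies within mu ± k*sigma."""
    values = [curve[index] for curve in curves]
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation; the estimator is an assumption
    inside = sum(1 for v in values if abs(v - mu) <= k * sigma)
    return inside / len(values)

# Toy data: five one-sample "curves", one of them far from the rest.
curves = [[1.0], [1.0], [1.0], [1.0], [5.0]]
within_1_sigma = coverage_at(curves, 0, 1)
within_3_sigma = coverage_at(curves, 0, 3)
print(within_1_sigma, within_3_sigma)
```

Evaluating the same function with k = 1, 3, and 6 at the three sampling instants would give one row per instant and one column per limit.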
The fraction of the probability distribution of the data curves lying within $Y_2$ and $Y_3$ is above 99%, basically indicating normal data. The corresponding overall envelope boundaries defined as $Y_1$, $Y_2$, and $Y_3$ are now compared with normal and typical subhealth data. In Fig. 7, the $\sigma$ curve represents the envelope limit $Y_1$, the $3\sigma$ curve represents the envelope limit $Y_2$, and the $6\sigma$ curve represents the envelope limit $Y_3$. The curves of $Y_2$ and $Y_3$ contain the normal data, while the whole and local resistance data lie outside the envelope limit. Therefore, the upper and lower limits of $Y_2$ and $Y_3$ ($\mu \pm 3\sigma$, $\mu \pm 6\sigma$) are used to calculate the over-limit feature values. In consideration of the segmentation definition in Sect. 3.3, the current over-limit point features are thus defined as presented in Table 3.
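The over-limit point features can be illustrated with a short sketch: an envelope is built point by point from aligned normal current curves, and the points of a test curve lying above or below it are counted within one section. The data are made up, the section is passed as a plain index range, the population standard deviation is assumed, and both function names are hypothetical.

```python
from statistics import mean, pstdev

def envelope(curves, k):
    """Per-sample (lower, upper) limits mu ± k*sigma across aligned normal curves."""
    lower, upper = [], []
    for values in zip(*curves):  # one tuple of values per sampling instant
        mu, sigma = mean(values), pstdev(values)
        lower.append(mu - k * sigma)
        upper.append(mu + k * sigma)
    return lower, upper

def count_over_limit(curve, lower, upper, start, stop):
    """Count points above the upper and below the lower limit for indices in [start, stop)."""
    over = sum(1 for i in range(start, stop) if curve[i] > upper[i])
    under = sum(1 for i in range(start, stop) if curve[i] < lower[i])
    return over, under

# Toy normal curves (six samples each) and one test curve.
normal = [[1.0] * 6, [1.2] * 6, [0.8] * 6]
lower3, upper3 = envelope(normal, 3)
test_curve = [1.0, 1.6, 1.7, 0.4, 1.0, 1.0]
m_over, m_under = count_over_limit(test_curve, lower3, upper3, 0, 6)
print(m_over, m_under)
```

Calling `envelope` with k = 6 and repeating the count per section would give the corresponding six-standard-deviation features.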

Formulation of the Resistance Rules for the Whole Process
A test fixture was built to simulate the resistance changes throughout the whole process. A schematic diagram and a photograph are shown in Fig. 8a, b. As shown in these two pictures, the door control unit controls the direct-current (DC) motor during the operation of the door system, and the motor drives the screw to rotate, which drives the left and right door leaves to open and close the door. The magnetic powder brake is installed at the end of the screw, applying resistance to the screw to simulate resistance changes of different degrees throughout the whole process.
As the driving current of the motor is directly proportional to the torque, it can indirectly reflect the resistance changes of the door system. The subhealth resistance change consists of an increase and reduction in resistance. The resistance increase indicates that the door system is bearing an external force that affects the action of opening and closing the door, while the resistance reduction represents the instability of the motor's current. Since such resistance increases are a major problem and have a more serious influence on the door's function, this subhealth type is the focus in the following section. Figure 9 shows the extracted feature values, that is, the number of points over the $3\sigma$ and $6\sigma$ limits throughout the whole process of door opening and closing under different resistance conditions. In addition, manual opening and closing of the door with the corresponding resistances are compared. The normal manual door opening and closing force is about 45-50 N. If the manual opening and closing force of the door is about 60-160 N, the resistance of the door system is considered to exhibit a slight increase in the whole process. If the manual opening and closing force on the door is above 160 N, the resistance of the door system is considered to exhibit a serious increase in the whole process. When the force exceeds 60 N, the number of points over the $3\sigma$ limit in the whole process shows a clear upward tendency, being more than 100 points. When the force exceeds 160 N, the number of points over the $6\sigma$ limit in the whole process is significantly increased, clearly being more than 100 points. Based on the analysis and reasoning presented above, the rules for the resistance change throughout the whole process can be established.
When the door is opened with slight resistance during the whole process, the measured current curve shows more than 100 points exceeding the $3\sigma$ lower limit but fewer than 100 points exceeding the $6\sigma$ lower limit, which can be judged as a slight resistance increase. The other limits can be analyzed in a similar way. The resulting specific resistance change rules are presented in Tables 4 and 5.
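The whole-process rule stated in the text can be written as a small decision function. This is a sketch under stated assumptions: it uses only the 100-point threshold quoted above, treats a count of exactly 100 as not exceeding it, and does not reproduce the full rule sets of Tables 4 and 5. The function and label names are hypothetical.

```python
def classify_whole_process(points_over_3sigma, points_over_6sigma, threshold=100):
    """Grade the whole-process resistance from over-limit point counts.

    Both counts are taken over the whole opening or closing process, on the
    limit side relevant to the movement direction.
    """
    if points_over_6sigma > threshold:
        return "serious resistance increase"
    if points_over_3sigma > threshold:
        return "slight resistance increase"
    return "no whole-process resistance increase"

slight = classify_whole_process(150, 20)
serious = classify_whole_process(250, 130)
normal = classify_whole_process(40, 0)
print(slight, serious, normal, sep="\n")
```

Checking the stricter limit first means a curve that exceeds both thresholds is graded by its most severe condition.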

Formulation of Local Resistance Rules
The local resistance was also simulated on the test bench. The local resistance of the uniform section of the opening door, and the uniform section and slow section of the closing door, were simulated by fixing a cylinder loading device and a transverse follow-up loading device. The cylinder loading bench is shown in Fig. 10.
The feature values of the over-limit points under typical conditions were selected and are presented in Tables 6 and 7.
The test results showed more than 40 points exceeding the $3\sigma$ limit under local resistance in the uniform and slow speed sections of the opening and closing door. It can be initially concluded that there should be at least 30 such points in the rising speed section. The rules established to identify local resistance changes are presented in Tables 8, 9, 10, and 11. Specifically, an increase in the whole resistance of the opening door occurs when the total count $M_{24}$ of points beyond the $-3\sigma$ limit is between 50 and 100. Similarly, when the count $M_{21}$ of points beyond the $-3\sigma$ limit is between 30 and 60 in the rising speed section during the opening of the door, the local resistance increases in this phase. The analysis procedures for the other local resistance increase rules are analogous.
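A per-section check of this kind could be sketched as below. The 30 to 60 range for the rising speed section and the lower bound of 40 for the uniform and slow sections are taken from the text; the missing upper bounds for those two sections, the inclusive comparisons, and all names are assumptions, since Tables 8 to 11 are not reproduced here.

```python
# Count ranges per section as (low, high); None means no upper bound is assumed.
LOCAL_RULES = {
    "rising": (30, 60),     # range quoted in the text for the rising speed section
    "uniform": (40, None),  # lower bound from the test results; upper bound assumed absent
    "slow": (40, None),     # lower bound from the test results; upper bound assumed absent
}

def local_resistance_sections(counts, rules=LOCAL_RULES):
    """Return the sections whose over-limit point count falls inside its rule range."""
    hits = []
    for section, count in counts.items():
        low, high = rules[section]
        if count >= low and (high is None or count <= high):
            hits.append(section)
    return hits

# Made-up counts of points beyond the three-standard-deviation limit per section.
sections = local_resistance_sections({"rising": 45, "uniform": 10, "slow": 52})
print(sections)
```

Keeping the ranges in a table rather than in the function body makes it easy to replace them with the values from the published rule tables.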

Application Verification
Real running data from a door system on a certain line in Nanjing (a city in China) were analyzed and used to verify the typical subhealth rules proposed above. Firstly, some off-peak normal data were selected to construct the statistical envelope feature thresholds. Then, 3000 sets of real data were selected to verify the rules. The real-time $M_{11}$ and $M_{21}$ represent the number of points over the $3\sigma$ upper limit and below the $3\sigma$ lower limit, respectively, in the rising speed section. Similarly, $N_{11}$ and $N_{21}$ represent the number of points over the $6\sigma$ upper limit and below the $6\sigma$ lower limit, respectively, in the rising speed section. $M_{14}$ represents the number of points over the $3\sigma$ upper limit in the whole process. For the meanings of the other parameters, see the body text above.
The diagnostic results for the simulated door opening and closing processes in a MATLAB environment are shown in Fig. 11. The diagnostic results contain typical subhealth conditions, such as a local resistance increase in the rising speed section of the opening door, a serious resistance increase in the whole process of the opening door, a local resistance increase in the slow speed section of the closing door, and a local resistance increase in the uniform section of the closing door.
According to the maintenance history of door systems, the problem of local resistance increasing basically occurs during the peak commuting hours. At that time, passengers squeeze between the doors, but the function of the door system is not affected for a short period of time. Meanwhile, a serious increase in the resistance of the whole door is caused by poor lubrication of the long guide column, which returns to normal after relubrication following internal inspection.
Part of the typical subhealth data diagnosed is compared with the normal data according to the established rules in Fig. 12. In Fig. 12a, compared with normal data, the opening door current signals diagnosed with a resistance anomaly in the whole process and in the uniform speed section of the opening door increase at corresponding positions. Also, in Fig. 12b, for the resistance anomaly for slow speed and uniform speed sections of the closing door, the current signals are higher than the normal data signals at the corresponding points. Therefore, the results are consistent with actual working conditions.  The statistics of the diagnosis results using the subhealth rules are presented in Table 12. Note that there are 2995 door openings and closings in total, for which the subhealth state is completely diagnosed with a misdiagnosis rate of 0%.
From the data judged to be normal, 200 randomly sampled sets are used to test the subhealth diagnostic rules. Table 13 presents the missed diagnosis statistics, showing that four groups of subhealth data are not diagnosed, corresponding to a missed diagnosis rate of 2%.
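As a check on the arithmetic, the missed diagnosis rate quoted above follows directly from the sample counts given in the text:

```latex
\[
\text{missed diagnosis rate}
  = \frac{\text{undiagnosed subhealth groups}}{\text{sampled sets}}
  = \frac{4}{200}
  = 2\%
\]
```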
Overall, the diagnosis results based on the typical subhealth rules indicate that the rules are successful. Given the low rate of missed diagnoses, the rules can not only filter out typical faults and abnormal data but also cover most of the whole and local resistance subhealth states. The average diagnostic time for a single datum was only 0.1945 s, thus fully meeting the needs of online real-time diagnosis by the server.
In addition, the signals with missed diagnoses mainly correspond to data for local resistance increases in the slow section of the closing door. Due to the locked current, the peak distribution in the termination section of the closed door is unstable, which may result in a small probability of missed diagnosis or misdiagnosis. Further work will continue to optimize the rules and reduce the missed diagnosis rate. For the diagnosed subhealth data, the real-time subhealth levels could be divided according to their severity, which facilitates subsequent subhealth diagnosis and early warning processing for the door system.

Conclusions
To effectively capture the resistance changes in a vehicle door system, subhealth diagnosis rules are applied to recognize its health condition. Based on real-time data collected from the motor of the door system, various features are extracted, such as the time-domain features including the rotation angle, speed, and current and the current envelope features. Based on historical statistics for normal door system operating data, health thresholds are then established to formulate corresponding rules. The accuracy of the resistance subhealth rules is verified by applying them to running rail vehicle door system data.
The main conclusions are:
1. The verification results show that the typical subhealth rules proposed in this paper can filter out faults and abnormal data and accurately identify the subhealth state based on the resistance changes during the whole as well as part of the process.
2. The proposed method can distinguish the severity of the resistance changes and judge the general location where the local resistance occurs. The proposed subhealth diagnosis rules are suitable for real-time diagnosis processing by servers.
In the future, the rules will be continuously improved, and new subhealth types will be added to improve the diagnosis efficiency and accuracy of the subhealth rules and achieve early warning of failures. The reliability and safety of urban rail vehicle door systems will thereby be effectively improved.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.