Introduction

Monitoring and maintenance have always been practiced. Owing to the rapid increase in the complexity of engineering systems, especially transportation vehicles, a complete and integrated system for vehicle fault detection, diagnostics, failure prognostics, maintenance planning, operation decision support, and decision making became necessary, and building one became a huge challenge.

NASA was the first organization to take an interest in ISHM (or IVHM): the IVHM panel was established by the NASA SATWG in 1990.

The implementation of ISHM leads to the following benefits [1]:

  • improving system safety and reliability which increases the probability of mission success;

  • reducing processing and operation time, manpower, and costs;

  • increasing system availability and utility.

Prognostics is one of the most difficult and challenging aspects of ISHM. It is considered the game-changing technology that can push the boundary of systems health management [2]. Using prognostics implies a reduction in complex system O&S cost and life cycle TOC as well as improved safety [3]. Estimating the RUL of systems, subsystems, and components creates a paradigm shift in both system maintenance and operation. Prognostics has many benefits; the following are just a few:

  • it moves the strategy of maintenance and decision making from reactive to proactive [4];

  • secondary damage reduction;

  • reconfiguration and replanning in case of failure, to optimally use the RUL of the failed parts and complete the mission safely [5];

  • maintenance planning, enhancement of logistic support, and alerting the crew about the impending failure;

  • knowledge of hidden evolving faults (due to normal internal system wear, tear, and degradation) extracted from multidimensional and sparse sensor data.

Due to the lack of standardization, prognostics can be considered an art rather than a science. The need for standardization has inspired researchers to develop a standard framework for prognostics and its performance metrics. Saxena et al. [6] presented a comprehensive summary of almost all measures that can be used for prognostics performance evaluation based on end-user objectives. Saxena et al. [7] introduced four new prognostics-specific performance metrics. Standardization of prognostics research methods has attracted considerable attention due to its impact on technology development [8]. Voisin et al. [9] developed a global formalization of the generic prognosis business process.

So far, no single piece of literature gives a wide and complete vision of prognostics. Gaining a clear understanding of all topics related to prognostics is a difficult and time-consuming task: it may take months just to collect the relevant articles, and reading and understanding all of them, especially for new prognostics researchers, is not easy because each article addresses only one or a few topics. We therefore decided to write this literature review to help the research community find a thorough, clear, and complete overview of prognostics in a single paper. New prognostics researchers can start by reading this paper and then select the topics they are interested in. It will also help the discipline itself, since establishing its founding concepts is needed for any growing technology.

Fig. 1 Overall ISHM process

The paper is organized as follows.

Section 3 gives an overview of ISHM and its benefits, and discusses how the idea of ISHM started and the challenges it faces.

Section 4 discusses prognostics thoroughly: its relation to health management, how it has evolved and can be applied in several areas, the different prognostics approaches with the advantages and disadvantages of each, how multiple approaches can be combined to produce better results, and finally the challenges of prognostics and how to deal with them.

Section 5 presents the summary and conclusion.

Integrated systems health management (ISHM)

Due to the complexity of safety critical engineering systems, traditional ways of operating and maintaining systems are not efficient. Failure in such systems can be catastrophic, causing loss of life or at least mission aborts.

To ensure safe and reliable operation, systems must be continuously and fully monitored, and correct and timely decisions must be taken at all stages of the system life cycle, from design to O&S, in an integrated way.

ISHM helps in fault prevention, mitigation and recovery during operation [10].

Maintainers, logisticians, engineers, safety personnel, mission planners, and program managers all benefit from ISHM [11]. ISHM also helps system designers, developers, and testers to improve their systems using feedback data about system field operation and behavior.

ISHM has many incomplete definitions. Some of these definitions do not consider the design, development, and test stages as part of ISHM; others neglect logistics, resource allocation, and the decision-making process. The definition of ISHM must cover all stages of the whole system life cycle. We see ISHM as an integrated process applied to systems, subsystems, and components from their birth as an idea to their EoL to preserve their health and desired performance while ensuring safety, availability, reliability, and autonomy and minimizing cost. This process integrates system design, development, testing, and evaluation with fault detection, fault diagnosis, and failure prognosis, as well as decision support and decision making, into a comprehensive system that uses all gathered information, operational demands, and available resources to take appropriate decisions about mission planning, resource allocation, required reconfiguration, maintenance strategies, logistic support, and management strategies, closing the loop between consecutive steps as shown in Fig. 1.

Origin and evolution

Interest in ISHM started when the NASA Office of Space Flight identified IVHM as the highest priority technology for present and future space transportation systems [1]. That is, the concept started with, and was applied only to, vehicles and was known as IVHM. As systems grew and their complexity increased, the concept spread beyond transportation vehicles to general systems and subsystems, and the term changed to ISHM. To avoid confusion, note that the two terms IVHM and ISHM are both applicable.

In November 1989, the NASA Strategic Transportation Avionics Technology Symposium was held, and in 1990 the SATWG was established. The SATWG initiated some activities and formed panels to carry them out. One of these panels was the IVHM panel, which focused on IVHM planning and NASA/industry interaction. The IVHM panel held several meetings, through which the definition of IVHM requirements; the determination of NASA, DOD, and industry desires, needs, and capabilities; and the determination of IVHM technology needs, goals, and objectives were significantly built up [1].

After NASA published its first document on ISHM, “Research and Technology Goals and Objectives for Integrated Vehicle Health Management”, considerable attention from the research community, industry, and governments was drawn to the importance of this new technology. Some standards have been published in an attempt to unify the concepts and methodologies used. ISO-13374-1 [12] defines the functional blocks of a condition monitoring system and the input/output of each block (Fig. 2).

Fig. 2 Data processing and information flows [12]

OSA-CBM [13] is an implementation of ISO-13374. ISO-13372 [14] defines terms relating to condition monitoring and diagnostics of machines. ISO-17359 [15] sets out guidelines for the general procedures to be considered when setting up a condition monitoring program for machines. These standards are part of a large series of standards on condition monitoring and diagnostics of machines.

NASA Ames Research Center and the Jet Propulsion Laboratory play a major role in ISHM systems development. The United States DoD requires project managers to consider diagnostic, prognostic, system health management, and automatic identification technologies [16].

In 2009, NASA launched the IVHM project to develop validated tools, technologies, and techniques for automated detection, diagnosis, and prognosis that enable mitigation of adverse events during flight [17]. The goals of this project coincide with the goals of:

  • The aviation safety program.

  • The agency roles and responsibilities for NASA.

  • The 2007 national plan for aeronautics research and development-related infrastructure.

  • The 2007 next-generation air transportation system research and development plan.

This project lays out the technical approaches for ISHM at different levels (Fig. 3) and a 5-year roadmap with its major milestones.

Fig. 3 Levels of research within IVHM and the logical flow from foundational research to project-level goals [17]

The Air Force Research Laboratory defined an ISHM architecture and set near-term, mid-term, and far-term technology goals [18]. It also defined a roadmap to achieve these goals [19].

Although many ISHM programs have been initiated, few of them can be considered complete and perfect examples. This is due to the gap between health management user objectives and engineering development [5].

In an unprecedented step to increase understanding and produce a higher level of ISHM professionals, Cranfield University offered a Master of Science in IVHM [20].

A lot of effort has been exerted in ISHM to obtain mature and widespread systems. Additional effort is needed to enable widespread adoption of ISHM and resolve its challenges.

Challenges

Although ISHM is improving rapidly, it faces some challenges that hold it back. NASA addressed two challenges in its IVHM project [17] and considered them critical:

  • Developing tools and techniques that combine messages from a single aircraft's health management system with results from fleet-wide health management analysis into an integrated real-time automated reasoning and decision-making system.

  • Avoiding system/component malfunctions and failures, given the difficulty of detecting, diagnosing, and mitigating hardware faults and failures in flight with existing technologies. Such failures can lead to catastrophic accidents.

Wheeler et al. [5] presented the following ISHM challenges:

  • Deployment of the ISHM system due to the big difference between ISHM user objectives and engineering development.

  • Quantifying exactly the benefits of a newly developed ISHM.

  • Difficulty in providing aviation systems with an effective ISHM system.

  • Resolving tasks of aging and expected life and cost vs. benefit.

Prognostics

Prognostics lies at the heart of ISHM. It resides at level 3 of the technical approaches for ISHM (Fig. 3) and is one of the main functional blocks of the condition monitoring system (Fig. 2). It also appears on the long-duration time scale of ISHM systems [21] (Fig. 4).

Fig. 4 ISHM block diagram [21]

Prognostics is one of the top ten challenges in NASA's aviation safety program. It plays the most important role in improving system safety, reliability, and availability.

Prognostics itself is not a new concept: humans have always been anxious about what will happen in the future, to either avoid catastrophes or at least cope with them. In the mining business, coal miners used to take canaries into the mine to know in advance about the oxygen level: it is a bad sign if the canary dies. Prognostics has played a historical role in medicine, where it is considered a mature technology with a special impact on patient management tasks [22]. In engineering, on the other hand, prognostics is still a developing technology.

The word prognostics derives from the Greek word “progignôskein”, meaning to know in advance. In engineering, prognostics can be defined as the process of estimating the RUL of a system/subsystem/component that is degrading due to either normal operation (no fault symptoms) or a detected fault. This RUL estimation should:

  • guarantee safe operation to EoL;

  • output multiple RULs due to different failure modes;

  • combine RULs with an uncertainty index to be trusted.

The estimation should take into consideration:

  • historical normal and faulty operational data;

  • current and future scenarios (operating and environmental conditions, maintenance actions);

  • manufacturing data, e.g., failure modes effect and criticality analysis and material conditions and variations.

Since prognostics is concerned with future knowledge of system health and condition, it can drive an evolution in maintenance and operation support. It comes with new concepts such as CBM and PHM. Prognostics can be seen as a revolutionary discipline that can change the whole world of complex engineering system life cycle management.

Among its many benefits, prognostics introduces the following to the engineering world:

  • minimization of machine downtime and better productivity [23];

  • moving from fail and fix strategy to predict and prevent [9];

  • reduction of inventory due to the knowledge beforehand about the time to failure; this knowledge allows planners to order only the needed spare parts when required;

  • total life cycle cost management optimization due to improvement of CBM using prognostics [24];

  • detection and projection of unseen degradation from monitored system parameters.

Figure 5 shows some benefits of diagnostics and prognostics. Here, we are concerned about prognostics.

Fig. 5 Diagnostics and prognostics benefits [4]

Prognostics and health management (PHM)

Prognostics by itself is useful because it supplies the decision maker with early warning about the expected time to system/subsystem/component failure and lets them decide on appropriate actions to deal with this failure. The benefit of prognostics flourishes when its information is used as the main input to system health management. PHM is the emerging engineering discipline that links studies of failure mechanisms to system life cycle management [8].

PHM can be seen as the process involving system monitoring; fault detection, isolation, and identification (fault diagnostics); failure prognostics; and the actions taken (e.g., required logistics, maintenance, etc.) to improve safety and reduce maintenance cost. Pecht and Kumar [25] proposed a generic management methodology for PHM (Fig. 6).

Fig. 6 CALCE prognostics and health management methodology [25]

Prognostics is essential for PHM. It plays the most effective role because it represents the predictive part of PHM, which means no surprises for PHM users, especially maintainers. The location of prognostics in PHM is shown in Fig. 7.

Fig. 7 Prognostics in PHM [11]

PHM can also change the strategy of system design and development by achieving high system reliability without adding many redundant devices. High reliability is achieved by replacing the static reliability calculated in the design phase with online dynamic reliability calculation under actual operating conditions.

The main objective of creating the PHM system is to maximize ROI by combining different maintenance strategies (e.g., scheduled maintenance, condition-based maintenance, and predictive maintenance) to achieve optimum cost-effectiveness versus performance decisions [26].

Bonissone [27] gave a good representation of the PHM functional architecture (Fig. 8) based on domain knowledge and time. The architecture combines all PHM aspects in a descriptive way that shows the relation between the PHM functional blocks. It also relates the different PHM component parts to the relevant decision-making processes based on a segmentation of the decision time horizon: single (one-time) decisions; multiple, repeated decisions (tactical, operational, and strategic); and life cycle decisions.

Fig. 8 PHM functional architecture [27]

PHM is evolving quickly because many organizations have started recognizing the benefits of applying PHM systems. Rolls-Royce has a long history of applying PHM concepts in aeronautics, especially in engine health management [28]. BAE Systems established a project for fleet health monitoring and machine learning technology for CBM and applied this system to heavy-duty transit buses to enable remote fleet health management in different cities [29].

The Xerox Company found that embedded and remote PHM can be the key to achieving customer satisfaction with minimum after-sale service overhead [30]. General Electric developed a PHM project for aircraft turbine engines already in service [31]. This project focuses on satisfying user needs by applying all PHM aspects (sensing, diagnostics, prognostics, and decision making) to in-service engines, resulting in reduced system O&S cost, reduction or elimination of maintenance tasks, improved mission planning, and enhanced prognostics capability. UTC and Pratt & Whitney created a generalized PHM value model to identify the different values to customers and providers and correlate these values with concrete metrics [32].

The F-35 JSF aircraft is a complete and comprehensive example of applying PHM concepts. PHM for the JSF is the key enabler of its AL support [33]. AL allows a reduced logistics footprint, improved safety, an increased sortie generation rate, and reduced O&S cost. PHM allows real-time onboard fault detection and isolation for all main systems and subsystems, as well as failure prognostics for selected critical systems and components. Recommended actions are displayed to the pilot when needed to avoid the predicted failure. The PHM capability of the JSF is the main reason a single-engine aircraft can be used with reliability as high as that of dual-engine aircraft.

Prognostics approaches

Prognostics approaches are classified in different ways. Sometimes the classification is based on the type of data and knowledge available about the system; at other times, approaches are classified according to the type of methodology used. Prognostics system developers can benefit from these classifications when selecting an algorithm based on the available background about the system and suitable forecasting techniques. Classification also helps in identifying which techniques from other technologies can be used in prognostics algorithm development. A key point about classifying prognostics approaches is that it builds a path toward a standard methodology for developing prognostics applications within a standard framework.

In general, prognostics approaches can be classified into four types:

  • reliability-based approach;

  • physics-based approach;

  • data-driven approach;

  • hybrid approach.

We prefer this classification because it combines almost all used prognostics techniques.

The complexity, cost, and accuracy of prognostics techniques are inversely proportional to their applicability (Fig. 9). Increasing prognostics algorithm accuracy with low cost and complexity is a big challenge.

Fig. 9 Prognostic approaches [24]

Reliability-based approach

Experience-based prognostics, life usage model, and statistical reliability-based approach are different names for the same approach. This approach is used mainly for uncritical, unmonitored, mass-produced components that do not have a physical model. It does not assess the health of individual components in real time under their operating and environmental conditions; it depends only on massive historical data about the same component population and its average failure rate. The MTBF is obtained mainly from the original equipment manufacturer and updated during field operation. This method can drive scheduled maintenance, whose interval is calculated from the historical usage of a large set of components, or from accelerated tests when data about the MTBF of newly used components are insufficient. The techniques used for this method are based purely on statistics, e.g., Weibull analysis and the log-normal and Poisson laws [34]. Figure 10 is a simple representation of this methodology.
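As a hedged illustration of the statistical techniques just mentioned, the sketch below fits a two-parameter Weibull distribution to a hypothetical set of failure times and derives a mean life and a B10 life (the age by which 10 % of units are expected to fail), one common choice of replacement interval. All numbers are invented for illustration.

```python
# Minimal sketch (illustrative data): Weibull analysis of historical
# failure times, yielding a mean life and a B10 replacement interval.
import numpy as np
from scipy import stats
from scipy.special import gamma

# Hypothetical failure times (hours) for a population of identical parts.
failure_hours = np.array([812., 945., 1103., 1288., 1403., 1511.,
                          1689., 1730., 1894., 2075.])

# Fit a two-parameter Weibull distribution (location fixed at zero).
shape, loc, scale = stats.weibull_min.fit(failure_hours, floc=0.0)

# Mean life of a Weibull distribution: scale * Gamma(1 + 1/shape).
mean_life = scale * gamma(1.0 + 1.0 / shape)

# B10 life: age by which 10 % of the population is expected to fail.
b10 = stats.weibull_min.ppf(0.10, shape, loc=0.0, scale=scale)

print(f"shape={shape:.2f}, scale={scale:.0f} h, "
      f"mean life={mean_life:.0f} h, B10={b10:.0f} h")
```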

Fig. 10 Experience-based prognostics [25]

Fig. 11 Physics of failure prognostics [36]

The advantages of this approach dwell in its simplicity: it is easily applied and requires no knowledge about failure modes or system operation. Despite its simplicity, it has many drawbacks. The main problem with replacing a part at a fixed interval is that component-specific conditions are not considered, causing either early replacement of a working component or late replacement, i.e., component failure before replacement. It is also hard and inaccurate to apply this approach to newly developed components, because it requires massive historical failure data.

Physics-based prognostics

PoF-based prognostics is one of the major methodologies used for prognostics. It sits at the top of the pyramid of prognostics approaches (Fig. 9). In this approach, a physical model of the system or component is developed: a mathematical representation of its failure modes and degradation phenomena. Establishing this model requires a thorough understanding of the system/component physics, as well as knowledge of the operating conditions and the life cycle loads applied to the system/component.

Modeling can be done at the micro level, i.e., modeling the effect of stresses on the material by establishing, for example, a finite element model. Another level of modeling is the macro level, in which first-principles knowledge about the system is used to model the relations between its component parts with mathematical equations, such as modeling the degradation of turbofan engines as a function of efficiency loss and flow [35].

After the system model is established, in situ monitoring of the system is performed, and system diagnosis is used to assess its performance. The model can then use knowledge of the current system health and future load-exposure scenarios to forecast the RUL. Figure 11 shows a description of the PoF methodology for prognostics [36].
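As a concrete, hedged illustration of a micro-level PoF model, the sketch below integrates Paris' law for fatigue crack growth, a textbook degradation model rather than any of the specific models cited in this section, from a diagnosed crack length to a critical length. The material constants, stress range, geometry factor, and crack sizes are all illustrative assumptions.

```python
# Minimal PoF sketch: remaining load cycles from Paris' law crack
# growth, da/dN = C * (dK)^m with dK = Y * dsigma * sqrt(pi * a).
# All constants and crack sizes below are illustrative assumptions.
import math

C, m = 1e-12, 3.0     # material constants (illustrative)
dsigma = 120.0        # stress range per cycle, MPa
Y = 1.12              # geometry factor
a = 0.002             # current crack length in metres (from diagnosis)
a_crit = 0.025        # crack length taken to define failure, metres

cycles = 0
step = 100            # integrate 100 cycles at a time (Euler scheme)
while a < a_crit:
    dK = Y * dsigma * math.sqrt(math.pi * a)  # stress intensity range
    a += C * dK ** m * step
    cycles += step

print(f"estimated RUL: {cycles} cycles")
```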

Physics-based prognostics has been applied to systems whose degradation phenomena can be mathematically modeled, such as a gearbox prognostic module [24], residual-based failure prognosis in dynamic systems (applied to a hydraulic system) [34], and military LRU prognostics [37].

This methodology is very efficient and descriptive because the system degradation modeling rests on the laws of nature. It is also accurate and precise, although accuracy and precision depend on model fidelity [8]. A further advantage of this approach is that it is easy to validate, certify, and verify.

Fig. 12 The ISHM algorithm [2]

Some drawbacks and limitations hold this approach back from being widely adopted: developing a high-fidelity model for RUL estimation is very costly, time consuming, and computationally intensive, and sometimes such a model simply cannot be obtained. Even when this expensive model is obtained, it is component/system specific, and its reusability in other similar cases is very limited. For these reasons, the next approach (data-driven) is sometimes used instead of the physics-based one.

Data-driven prognostics

The data-driven prognostics approach is the recommended technique when the feasibility study indicates that a PoF degradation model would be difficult to obtain. Although the physics-based approach is preferable because of its accuracy, precision, and real-time performance, data-driven prognostics is more widespread in the PHM community. This wealth of available applications is due to quick implementation and deployment: the data-driven approach relies mainly on techniques from AI, which has ready-made tools that can be applied directly with minor modifications. The low cost of algorithm development and the little or no knowledge required about system physics make this approach preferable to prognostics system developers.

The idea of this approach is to use the measured performance parameters of the system, e.g., pressure, temperature, speed, vibration, and current, to create a model that correlates variations in these parameters with system degradation and fault progression, and then use this model for RUL estimation. The model is created using techniques from soft computing, e.g., ANN, fuzzy logic, neuro-fuzzy systems, support vector machines, and RVM, and sometimes techniques from statistics such as regression analysis. Soft computing techniques are preferred over statistics because of their noise rejection and their ability to learn hidden relations between parameters. Data-driven techniques can be classified into conventional numerical methods and machine learning methods (Fig. 12).

The key requirement for data-driven prognostics algorithm development is the availability of multivariate historical data about system behavior. These data must cover all phases of normal and faulty operation as well as degradation scenarios under given operating conditions. Obtaining such data for algorithm training is a challenging task, but once the data are available, creating the algorithm is not a problem.

Three methods can be used to obtain run-to-failure data for prognostics algorithm development: (1) fielded applications; (2) experimental test beds; (3) computer simulations [38]. Fielded applications suffer from the fact that continuously monitored systems rarely fail, whereas systems that do fail rarely have sufficient sensors; moreover, even when data are available, proprietary issues prevent their public use. Experimental test beds are costly, dangerous, and time consuming, and accelerated aging may not cover all failure modes. Computer simulation is complex and difficult, because building a high-fidelity simulation model is not an easy task, but once such a model is available, computer simulation can be considered the best way to acquire run-to-failure data [38]. The PCoE at the intelligent systems division of NASA Ames Research Center provides a huge data repository for prognostics algorithm development that is available for public use [39].

The data-driven methodology is used mainly at the level of systems and subsystems that experience gradual degradation and are equipped with multiple sensors that monitor their operating behavior.

There are two ways to estimate RUL using the data-driven approach: either use the developed model of system behavior to calculate the remaining useful life directly, or use the model to estimate the system health state and then extrapolate or project the system health to obtain the degradation curve until it intersects the failure threshold [40].
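The second route can be sketched in a few lines: estimate a health index, fit a trend to it, and extrapolate the trend to the failure threshold. The exponential degradation form, the noise level, and the threshold below are assumptions chosen only to make the sketch runnable.

```python
# Minimal sketch of trend extrapolation to a failure threshold.
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(0.0, 200.0, 5.0)                  # operating hours so far
health = 1.0 - 0.002 * np.exp(0.025 * t)        # synthetic health index
health += rng.normal(0.0, 0.005, t.size)        # measurement noise

d = 1.0 - health                                # degradation component
mask = d > 0.02                                 # ignore the noise floor
slope, intercept = np.polyfit(t[mask], np.log(d[mask]), 1)

threshold = 0.6                                 # health level defining EoL
t_eol = (np.log(1.0 - threshold) - intercept) / slope
print(f"projected EoL at {t_eol:.0f} h, RUL about {t_eol - t[-1]:.0f} h")
```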

Pecht and Kumar [25] presented a methodology for data-driven algorithm development. It starts with a functional consideration of the system, i.e., system analysis, including its limitations and its operating and environmental conditions, and a feasibility study for applying the algorithm. The next step is real-time data acquisition from sensors. These data represent system behavior and should cover all of its healthy and faulty modes. During this step, data preprocessing (data cleaning, normalization, and noise reduction) is performed to prepare the data for further use. After data preparation, features related to system health degradation are selected; monotonic feature behavior is preferable in this context. Then the baseline is defined, and a data model for system health assessment is created from the prepared training data. In real-time operation, the diagnostics system catches a variation in system performance and then performs fault isolation and identification. After fault identification, the prognostics system is triggered for RUL estimation (Fig. 13).
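A minimal sketch of the preprocessing and feature-selection steps described above might look as follows; the synthetic two-channel data, z-score normalization, moving-average smoothing, and the use of Spearman correlation with time as a monotonicity score are all illustrative choices, not the specific procedure of [25].

```python
# Minimal sketch of preprocessing and feature selection on synthetic
# two-channel data: z-score normalization, moving-average smoothing,
# then ranking features by monotonicity (Spearman rho against time).
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(5)
t = np.arange(500)
raw = np.column_stack([
    20 + 0.01 * t + rng.normal(0, 1.0, t.size),  # degrading channel
    5 + rng.normal(0, 1.0, t.size),              # stationary channel
])

z = (raw - raw.mean(axis=0)) / raw.std(axis=0)   # normalization
kernel = np.ones(25) / 25                        # noise reduction
smooth = np.column_stack([np.convolve(z[:, j], kernel, mode="valid")
                          for j in range(z.shape[1])])

idx = np.arange(smooth.shape[0])
scores = [abs(spearmanr(idx, smooth[:, j])[0])
          for j in range(smooth.shape[1])]
print("monotonicity scores:", np.round(scores, 2))  # channel 0 ranks first
```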

Fig. 13 Data-driven prognostics methodology [25]

Of course, these steps are not mandatory, because the situation changes from one system to another, but they can serve as a guideline for prognostics system developers. In some cases, the diagnostics system cannot catch the onset of system degradation, because the degradation phenomenon cannot be monitored directly. This happens when the degradation is due to internal system wear and tear whose deviation from normal values is too difficult to capture with direct sensor readings. In this case, using AI techniques to learn the relation between the monitored parameters and system health is the best solution. Many tools from the data mining community can be used to discover the hidden relationships between monitored parameters and explain otherwise strange degradation behavior.

The data-driven approach has many advantages: no system knowledge is required, it is fast and easy to implement, the algorithm can be tuned for use on another system, and hidden relations in the system behavior may be learned.

Many prognostics applications are based on the data-driven methodology. The most famous data-driven solutions were presented at the 2008 PHM conference (PHM08) data challenge competition, where training and test data were provided for an unknown complex engineering system. The objective was to estimate the RUL of this system on the test data, with no information provided about the system physics or even the system type: a pure data-driven problem. Heimes [41] used a recurrent neural network trained by an extended Kalman filter to solve this problem. Wang, Yu, Siegel, and Lee [42] used similarity-based prognostics to tackle it. The winning algorithm used a Kalman filter ensemble of multilayer perceptron neural networks for RUL estimation [43].
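To convey the flavor of the similarity-based idea (in the spirit of [42], not their actual algorithm), the sketch below slides the recent health history of a test unit along a small library of run-to-failure trajectories and reads off the RUL implied by the best match. All trajectories are synthetic.

```python
# Minimal sketch of similarity-based RUL estimation on synthetic data.
import numpy as np

rng = np.random.default_rng(1)

def trajectory(eol, n=None):
    """Synthetic health index decaying to zero at its end of life."""
    n = n or eol
    steps = np.arange(n)
    return 1.0 - (steps / eol) ** 2 + rng.normal(0, 0.01, n)

library = [trajectory(eol) for eol in (180, 210, 240)]  # run-to-failure
test = trajectory(200, n=120)                           # truncated unit

window = 30
query = test[-window:]
best = (np.inf, None)
for traj in library:
    for start in range(len(traj) - window):
        dist = np.sum((traj[start:start + window] - query) ** 2)
        if dist < best[0]:
            # RUL implied by this match: life left after the window.
            best = (dist, len(traj) - (start + window))

print(f"similarity-based RUL estimate: {best[1]} cycles")
```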

Data-driven approach faces the following challenges:

  • Usage of multivariate and noisy data requires a robust algorithm.

  • Because most of the techniques are based on approximation, uncertainty management must be taken into consideration, which is another challenge.

  • Sometimes, the results are not intuitive because of the absence of physical knowledge about the system.

  • It can be computationally intensive due to large datasets, which affects real-time performance. A well-designed algorithm and suitable resources can overcome this problem, but it remains a development challenge.

  • Overfitting and overgeneralization while training the algorithm can affect the results tremendously.

  • Data are often unavailable, especially for newly developed systems.

Wang et al. [44] proposed a generic probabilistic framework for structural health prognostics and uncertainty management. The proposed methodology solves two problems facing data-driven approaches: first, it establishes a generic framework that can be followed in developing data-driven systems instead of being application specific; second, it propagates uncertainty to the RUL estimate for uncertainty management.

Zio and Di Maio [45] developed similarity-based algorithms for online identification of failure modes and RUL estimation of nuclear systems. The computational performance analysis of the proposed methodology showed its applicability for online use. However, the algorithm was tested on an Intel® Core2 Duo at 1.83 GHz, a processor found in ordinary personal computers rather than in vehicle onboard computers.

Hu et al. [46] proposed an ensemble of multiple data-driven algorithms to achieve better performance than any individual algorithm. This method is flexible because it is not limited to the proposed algorithms but allows the addition of any other data-driven algorithm.

A good solution can be to combine the physics-based and data-driven methodologies into one hybrid approach, gaining the benefits of each and overcoming their limitations.

Hybrid approach

As mentioned above, each technique, whether PoF or data-driven, has limitations. A hybrid (or fusion) approach combines the data-driven and PoF approaches to get the best of each: PoF compensates for the lack of data, and the data-driven side compensates for the lack of knowledge about system physics. The fusion can be performed either before RUL estimation, called pre-estimate fusion, where the PoF and data-driven models are fused to perform RUL estimation, or after RUL estimation, called post-estimate fusion, where the results from each individual approach are fused to obtain the final RUL [47].
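A minimal sketch of post-estimate fusion, assuming two independent RUL estimates with known variances and using inverse-variance weighting (one simple fusion rule among many; [47] does not prescribe a specific one):

```python
# Minimal sketch of post-estimate fusion: two independent RUL
# estimates (say, PoF-based and data-driven), each with a variance,
# combined by inverse-variance weighting. Numbers are illustrative.
rul_pof, var_pof = 420.0, 60.0 ** 2     # hours, variance in hours^2
rul_dd, var_dd = 390.0, 35.0 ** 2

w_pof = (1.0 / var_pof) / (1.0 / var_pof + 1.0 / var_dd)
w_dd = 1.0 - w_pof

rul_fused = w_pof * rul_pof + w_dd * rul_dd
var_fused = 1.0 / (1.0 / var_pof + 1.0 / var_dd)
print(f"fused RUL = {rul_fused:.0f} h (sigma = {var_fused ** 0.5:.0f} h)")
```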

Cheng and Pecht [48] presented a nine-step fusion approach for RUL estimation of electronic products (Fig. 14). These steps can be used to develop a fusion approach for any other application; however, it is not the only way to implement this approach.

Fig. 14 Fusion approach [48]

Cheng and Pecht [48] presented a case study of RUL estimation for ceramic capacitors using the fusion approach. Another application based on this approach is the prognostics of lithium-ion batteries [49]. Goebel et al. [50] used a fusion approach for aircraft engine bearings, and the results show that this method gives more accurate and robust outcomes than using either the data-driven or the PoF approach alone.

Although this approach is used to eliminate the drawbacks of the PoF and data-driven methods while gaining their benefits, it also carries the disadvantages of both methods to a certain extent, though not to the same level as when each technique is used individually.

The Kalman filter, which is adaptive in nature, and the particle filter are used to implement this methodology.

One may ask how to select the appropriate prognostics approach for a given application. Goebel [47] answered this question by presenting a flowchart based on the requirements (Fig. 15). This flowchart is a handy guide for selecting a prognostics approach.

Fig. 15 Prognostics approach selection [47]

Prognostics applications

In the last few years, great attention has been given to prognostics due to its strong effect in improving the health management of complex engineering systems.

Prognostics has made great contributions in different fields, such as medicine, where the future course and outcome of disease processes are predicted after treatment [51], and everyday weather forecasting. Medicine and weather forecasting are mature prognostics applications that have already proved their worth. Here we are concerned with prognostics applications in engineering, which remain an Achilles' heel of CBM and need to mature as prognostics has in medicine and weather forecasting.

Prognostics applications can be online, working in real time or near real time, whether onboard or off-board; prognostics can also be applied off-line, independently of the operation time of the monitored system. Real-time prognostics takes online data from the data acquisition system, performs RUL estimation, and gives warning about impending failure to allow system reconfiguration and mission replanning. Off-line prognostics uses fleet-wide system data and performs deep data mining processes that cannot be run onboard in real time due to the lack of resources and the time criticality. The results from off-line prognostics can be used in maintenance planning and in decision making for logistics support management.

Prognostics is originally one of the forecasting applications (Fig. 16).

Fig. 16 Forecast applications [7]

Applying prognostics in the engineering field is not easy, because the system's EoL must be forecast accurately and sufficiently far in advance to allow the controller to react and prevent system failure. In this section, we demonstrate some prognostics applications in different engineering fields.

Vehicle prognostics applications

Since safety is one of the most important reasons prognostics was created, many prognostics applications are directed toward safety critical parts of vehicles, especially in aerospace.

In one US patent application, a vehicle diagnostics and prognostics system is described [52]. The system is composed of the VMC, which performs the diagnostics and prognostics functions; the VON, which provides communication between the VMC and the other onboard processors; sensors, which deliver readings of vehicle subsystem operational data to the VMC through the VON; a vehicle maintenance database, which provides the VMC with maintenance information; and a transceiver for communication with the central base station, which hosts a centralized diagnostics and prognostics system (Fig. 17).

Fig. 17 Vehicle diagnostics and prognostics system [52]

When an anomaly is detected by the VMC diagnostics system, the prognostics system is triggered for data trending based on the information available in the vehicle database. The vehicle driver is informed about the impending failure by a message sent to the vehicle display, and a file about the predicted failure is sent to the centralized diagnostics and prognostics system in the base station. If the VMC cannot trend the upcoming failure from the available vehicle data, it sends a maintenance message to the base station, which uses fleet-wide data to perform prognostics. The result from the central diagnostics and prognostics system is then uploaded to the VMC.

Fig. 18 TREPAN algorithm [55]

One of the most important prognostics systems has been developed for the EMA, which plays a dominant role in the control surfaces of new-generation fly-by-wire aircraft and spacecraft under severe conditions [53, 54]. The EMA is a safety critical part, and the ability to confidently monitor, diagnose, and prognose EMAs can save lives as well as millions of dollars. The NASA Ames Diagnostic & Prognostic Group, in collaboration with Impact, Moog, Georgia Institute of Technology, California Polytechnic State University, Oregon State University, and the US Army, developed a very useful PHM system for EMAs. The developed system can be used onboard in real time to provide the current and predicted EMA health, which allows safe reconfiguration. To achieve this goal, a flyable electromechanical actuator test stand was developed and used in laboratory experiments as well as in flight onboard a UH-60 Blackhawk. After the diagnostics system catches a fault, the prognostics system, which uses GPR, is initiated for RUL estimation based on the fault mode and the intersection of the fault progression with the fault threshold. Results show a prediction error of time to failure of less than 10 %.

The aircraft gas turbine engine is a safety critical system that needs health monitoring and proactive maintenance. Due to the complexity of such a system, creating a physical model for prognostics is very difficult and costly. An ANN can identify faulty and nominal system behavior if trained appropriately; it is also capable of novelty detection, but it needs massive training data and acts as a black box. Data mining rule-extraction tools can perform the same task and give more insight into the behavior than an ANN.

Brotherton et al. [55] used a combination of ANN and rule-based techniques, based on an algorithm called TREPAN (Fig. 18), to develop an online aircraft prognostics system that draws the benefits from both. The idea is to use DL-EBF neural networks to learn the system's nominal and faulty states as well as the stages of fault progression; the benefit of DL-EBF is that it gives more insight into the system dynamics. The rule-extraction module data mines the neural network via queries to generate the rules used for trending. The strengths of this system are that it does not require massive training data (especially at the beginning of fault evolution), good statistical performance, discovery of new rules, novelty detection, and real-time performance.

Pacific Northwest National Laboratory conducted a feasibility study for developing an embedded real-time prognostics system for the AGT1500 gas turbine engine used on the M1 Abrams tank [56]. The work was sponsored by the US Army Logistics Integration Agency to evaluate the ROI of prognostics technology. The system was developed as an ad hoc solution for already manufactured engines. It uses 25 sensors originally installed by the manufacturer, plus 13 sensors added for the system's development, an additional data acquisition system for sensor readings collection and processing, and a microprocessor for data analysis and EoL prediction. The system is called REDI-PRO and is an extension of TEDANN. RUL estimation is done using regression analysis. Results showed a benefit-to-cost ratio of about 11:1 for the prognostics system, which proves the applicability of prognostics in such areas.

Industrial applications

Prognostics plays a dominant role in industry, increasing system availability and utility.

Yan et al. [23] developed an online prognostics algorithm for machine performance assessment that monitors system health and predicts future failures, enabling proactive maintenance in various industrial settings such as elevator door motion. The algorithm is implemented in three steps:

  • Using logistic regression to build a model that maps performance parameters to the probability of failure.

  • Real-time system performance is evaluated by feeding online data into the model.

  • RUL estimation is obtained using an autoregressive moving average model, and the prediction is dynamically updated over time.

This approach can be applied in various industries to know the component health in real time as well as its predicted future state.
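The three steps above can be sketched on synthetic data as follows; the features, labels, 0.9 failure-probability threshold, and the ARMA-family model order are illustrative assumptions, not the settings of [23].

```python
# Minimal sketch of the three steps on synthetic data: logistic
# regression maps features to failure probability, and an ARMA-family
# model extrapolates that probability to a threshold.
import numpy as np
from sklearn.linear_model import LogisticRegression
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(2)

# Step 1: train on historical runs (features -> healthy 0 / failed 1).
X_hist = rng.normal(0, 1, (200, 3)) + np.linspace(0, 2, 200)[:, None]
y_hist = (np.linspace(0, 1, 200) + rng.normal(0, 0.1, 200)) > 0.5
clf = LogisticRegression().fit(X_hist, y_hist)

# Step 2: assess current performance from online data.
X_online = rng.normal(0, 1, (60, 3)) + np.linspace(0.5, 1.8, 60)[:, None]
p_fail = clf.predict_proba(X_online)[:, 1]

# Step 3: forecast the failure probability; RUL is the first forecast
# step crossing the threshold (trend term lets the forecast extrapolate).
fit = ARIMA(p_fail, order=(2, 0, 1), trend="ct").fit()
forecast = fit.forecast(steps=50)
over = np.nonzero(forecast > 0.9)[0]
if over.size:
    print(f"RUL about {over[0] + 1} time steps")
else:
    print("no threshold crossing within the 50-step horizon")
```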

Bonissone and Goebel [57] developed a very useful and practical online prognostics system using a hybrid soft computing model to estimate the time to break in the wet end of a paper-making machine and give an early indication of the break. The methodology is divided into two stages: a training stage performed off-line and a testing stage that runs online. In the training stage, historical web-break data are collected by sensors, preprocessed (data scrubbing, segmentation, filtering, and smoothing), and analyzed (variable selection and principal component analysis). After an ANFIS model is built and fitted to the training data, the online testing stage starts: data are collected online, preprocessed the same way as in training, and input to the ANFIS model for the time-to-break calculation. The output of the ANFIS model drives a stop-light metaphor, which stays green under normal conditions, turns yellow 90 min before a break, and turns red when the time to break falls to 60 min. A block diagram of this process is shown in Fig. 19.

Fig. 19 Hybrid soft computing model for prognostics [57]

Fig. 20 PoF methodology for embedded prognostics [58]

Electronics applications

Electronics are very important and widely used in complex systems such as aircraft and spacecraft, where the failure of the electronics can lead to failure of the whole system. Electronics in such systems are continually exposed to thermal cycle loads that affect their operation. Developing an embedded diagnostics and prognostics system that runs onboard in real time with low power and cost is a challenge.

Rouet et al. [58] presented a PWA as a case study of an embedded diagnostics and prognostics system for electronics. The system is implemented using a data logger of the Lifetime Assessment Monitoring System type. Data are collected by in situ smart sensors, and the prediction is made using the PoF technique (Fig. 20). The results are evaluated by comparing the output of the algorithm with the results of accelerated tests performed on the PWAs. The comparison showed very low discrepancies between the real experimental measurements and the model output, which confirms the applicability of this method.

Tuchband and Pecht [37] used prognostics for military LRUs exposed to severe flight conditions, as part of a complete interactive supply chain for the US military. The LRU is monitored online by an embedded sensor, and the sensor data are transferred remotely to the base station over a wireless link. After data analysis, the result is uploaded to a Web portal for RUL estimation. The integration of wireless communication, Web portals, and prognostics allows not only RUL estimation but also the availability of these data to multiple users worldwide.

Battery applications

Recent years have seen rapidly growing interest in research on Li-ion battery health monitoring and prognostics, with a focus on battery capacity estimation and RUL estimation. Saha and Goebel [59] laid a foundation for health management applications for energy storage devices by presenting an empirical model that describes battery behavior during individual discharge cycles and over the cycle life. This model is used further for RUL estimation.

Wang et al. [60] introduced a novel methodology for Li-ion battery prognostics based on RVM, which is used to find the RTVs. The RTVs are then used to calculate the parameters of a conditional three-parameter capacity degradation model using least squares regression. Finally, the RUL is obtained by extrapolating the fitted model to the failure threshold.
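Skipping the RVM stage, the fit-and-extrapolate part of such a capacity-based scheme can be sketched as below; the three-parameter exponential fade model, the synthetic data, and the 0.85 Ah failure threshold are illustrative assumptions, not the exact model of [60].

```python
# Minimal sketch of the fit-and-extrapolate stage only: least squares
# fit of a three-parameter capacity-fade model, extrapolated to an
# illustrative 0.85 Ah end-of-life threshold.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(3)
cycles = np.arange(0, 300, 10.0)
capacity = 1.0 - 0.12 * np.exp(0.004 * (cycles - 300))  # synthetic, Ah
capacity += rng.normal(0, 0.002, cycles.size)

def fade(n, a, b, c):
    """Three-parameter exponential capacity-fade model."""
    return a - b * np.exp(c * n)

(a, b, c), _ = curve_fit(fade, cycles, capacity,
                         p0=(1.0, 0.05, 0.005), maxfev=10000)

threshold = 0.85                                 # EoL capacity, Ah
n_eol = np.log((a - threshold) / b) / c          # invert fade() at EoL
print(f"EoL near cycle {n_eol:.0f}, RUL about "
      f"{n_eol - cycles[-1]:.0f} cycles")
```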

Hu et al. [61] proposed a multiscale framework with an extended Kalman filter for state-of-charge and capacity estimation. Hu et al. [62] then extended this work, using a Gauss–Hermite particle filter to project the capacity fade and calculate the RUL with high accuracy and an uncertainty representation of the estimate.

Hu et al. [63] proposed a data-driven methodology for estimating the capacity of a Li-ion battery based on the charge voltage and current curves. In this methodology, five characteristic features of the charge curves are defined to indicate the capacity, and a regression model based on k-nearest neighbors is developed to identify the relation between the five features and the capacity. Particle swarm optimization is used to find the optimal weight combination of the five features. This data-driven methodology was verified and accurately estimated the capacity of a Li-ion battery.

The onboard resources needed to run prognostics algorithms are always a barrier to the deployment of prognostics solutions. To address this challenge, Saha et al. [64] developed a distributed prognostics algorithm using GPR. All computing nodes run diagnostics routines; once an off-nominal situation is detected, the nodes running the prognostics module related to the fault mode engage in RUL estimation. Wireless communication between nodes is used, which imposes further difficulty on the system. A case study on battery health management was conducted to prove the concept. The distributed prognostics algorithm can be considered a large step forward in prognostics algorithm development.

The prognostics applications discussed here are only examples; the full set is too numerous to count. Prognostics is widely applied in engineering areas such as unmanned aerial vehicle propulsion, military aircraft turbofan oil systems, semiconductor manufacturing, cracks in rotating machinery, heating and air conditioning, wheeled mobile robots, electronics, gas turbines, actuators, aerospace structures, aircraft engines, clutch systems, batteries, bearings, and hydraulic pumps and motors. Prognostics is also involved in many projects related to the nuclear industry due to its criticality, e.g., the nuclear plant life prediction project NULIFE [4]. As prognostics technology improves, in the near future it will be part of almost all systems, from the very complex to household equipment.

Prognostics challenges

Like any other developing technology, prognostics faces some challenges. The PCoE identified the following prognostics challenges:

  • uncertainty management;

  • autonomic control reconfiguration based on prognostics output;

  • integration of different and sparse data collected from interconnected subsystems to be processed;

  • prognostics system validation and verification;

  • post-prognostics reasoning.

These are not the only challenges in prognostics. Long-term prediction, data trending correctness, variability in external factors that are difficult to quantify [65], availability of run-to-failure data, accelerated aging tests for off-line algorithm evaluation, development of real-time algorithms [66], prognostics requirement specification [11], and prognostics standardization are challenges as well.

Here, we discuss the following major challenges: uncertainty management, validation and verification, prognostics standardization, and post-prognostics reasoning.

Uncertainty management

Prognostics is by nature an uncertain process, because it involves projecting damage progression into the future.

The future loads and environmental conditions used in prognostics cannot be accurately predicted. Besides, several other parameters impose uncertainty on prognostics; these parameters exist across the whole system life cycle, from design to operation and support.

Assumptions made during system design and development, AIT equipment and tools, the system model, system inputs, disturbances, data processing, sensors, state estimation techniques, RUL estimation approaches, performance metrics, etc. are of course imperfect and contribute to uncertainty growth.

Sankararaman and Goebel [67] presented a very good overview of the state of the art in uncertainty quantification and management in prognostics and health monitoring. They classified uncertainty sources into four main categories:

  • Present-state uncertainty, which results from sensor noise, gain and bias, data processing, filtering, and estimation techniques.

  • Future uncertainty, which arises from loading, environmental, and operating conditions.

  • Modeling uncertainty: as its name indicates, it comes from all kinds of models used in the prognostics process, e.g., system model and failure model.

  • Prediction method uncertainty.

Figure 21 shows how uncertainties from multiple sources affect RUL estimation. Figure 21a shows how an indefinite failure criterion can move the intersection point between the extrapolated damage and the failure threshold, and how noisy measurements result in several damage propagation models. Figure 21b shows how the extrapolated trending parameter(s) vary according to model accuracy. Model inaccuracy can be represented by the probability density function (PDF) of the trending parameter and extrapolated into the future using methods such as Kalman or particle filters.

Fig. 21 a Effects of measurement uncertainty. b Effects of model and input uncertainties

Uncertainty in prognostics cannot be eliminated completely; instead, it can be managed by noise modeling, avoidance of algorithm overfitting, model training, and the use of hybrid forecasting techniques.
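The simplest way to see how these sources combine is a Monte Carlo sketch: sample the present state, model parameters, and future load from assumed distributions, propagate each sample to an EoL, and summarize the resulting RUL spread. The linear damage model and all distributions below are illustrative.

```python
# Minimal Monte Carlo sketch: propagate present-state, model, and
# future-load uncertainty to a RUL distribution.
import numpy as np

rng = np.random.default_rng(4)
n = 10_000

damage_now = rng.normal(0.55, 0.02, n)        # present-state uncertainty
rate = rng.lognormal(np.log(2e-3), 0.25, n)   # model-parameter uncertainty
load = rng.normal(1.0, 0.15, n).clip(0.5)     # future-load uncertainty

# Damage grows linearly at a load-scaled rate; failure at damage = 1.
rul = (1.0 - damage_now) / (rate * load)

lo, med, hi = np.percentile(rul, [5, 50, 95])
print(f"RUL median {med:.0f} h, 90 % interval [{lo:.0f}, {hi:.0f}] h")
```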

Accuracy and precision are the simplest ways to measure uncertainty in EoL prediction [68] (Fig. 22).

In practice, the error in RUL estimation is not normally distributed, and sometimes the parametric distribution is not even known. When the distribution is not normal, it is better to use the median instead of the mean to represent the location of the estimate, and the interquartile range instead of the variance to measure its spread. The data can be visualized using error bars and box plots (Fig. 23).

Saxena et al. [7] laid out a very good and efficient concept for incorporating uncertainty. Instead of using a single point estimate and measuring the difference between the estimated EoL and the actual EoL, an error bound called the \(\alpha \)-bounds is introduced (Fig. 24). The \(\alpha \)-bounds do not have to be symmetric, especially in prognostics, which prefers early prediction to late prediction. By integrating the area under the PDF curve from \(\alpha ^{-}\) to \(\alpha ^{+}\) and comparing the result with a predefined threshold \(\beta \), we can determine whether the prediction lies within the \(\alpha \)-bounds; this is called the \(\beta \)-criterion. The parameter \(\beta \) establishes a relationship between uncertainty and the risk tolerance of the system.

RUL estimation is very important to the ISHM decision-making process. The amount of uncertainty in the RUL estimate tells the decision maker how far he/she can rely on the prognostics system results. For this reason, researchers in the past few years have treated the RUL estimation task as an uncertainty propagation problem [69]. Sankararaman and Goebel [70] and Sankararaman et al. [71] proposed analytical methods, such as the most probable point concept and first-order reliability methods, to propagate different sources of uncertainty to the RUL estimate. These analytical methods are not as computationally expensive as sampling methods, which makes them useful for online applications, and their results do not change on repetition.

Although the trend of treating RUL estimation as an uncertainty propagation task is useful, it focuses only on mathematical methods and neglects the AI techniques that are commonly used in prognostics.

Validation and verification

Validation and verification of the prognostics process are essential, because a prognostics system cannot be deployed before its performance is assured. Developing a good prognostics algorithm without the ability to quantify its performance makes it useless.

Performance metrics are very important because:

  • They help in the creation and evaluation of the requirement specification needed for system design (Fig. 25).

  • They assess which parts of the prognostics system affect its performance, which helps in performance improvement.

  • They can be used to compare different algorithms in a standardized way [6].

  • They are also used to identify the weak areas in prognostics that require more research.

  • Performance metrics are used to quantify ROI, a very important aspect that determines whether to deploy the prognostics system or not [6].

Fig. 22 Accuracy and precision representation of uncertainty

Fig. 23 Representation of different types of distributions [7]

Fig. 24 Uncertainty concept [7]

Since prognostics concepts were first applied, the focus has been mainly on prognostics algorithm development. Recently, however, the ISHM community has paid close attention to the importance of having prognostics metrics.

Prognostics performance metrics can be classified as follows:

Functional classification This can be considered the most important and most widely used classification. It is based on the information that the metrics provide to fulfill certain functions (Fig. 26).

End user-based classification This classification is based on customer requirements. Each user sees the benefits of the prognostics system from a different point of view, which needs to be quantified by specific metrics. This classification is shown in Table 1.

The performance of a prognostics algorithm should improve with time as more data become available: a prediction at the beginning of life is normally less confident than a prediction just before failure. A good algorithm should give a confident prediction suitably far in advance, and the prediction confidence should improve with time.

That is why prognostics metrics should be dynamic and take into consideration the change of algorithm performance with time. Saxena et al. [7] presented four sequential prognostics-specific metrics that evaluate prognostics performance and include the effect of the time scale in the performance evaluation.

The four new prognostics performance metrics track the performance change over time and are organized as a waterfall model: the second metric is applied only if the algorithm meets the first, and so on (Fig. 27). These metrics are the prognostics horizon, \(\alpha -\lambda \) performance, relative accuracy, and convergence.

Fig. 25 Performance metrics help in the creation and evaluation of the requirement specification [7]

Fig. 26 Functional classification of performance metrics (adapted from [6, 7])

The first metric is the prognostics horizon (PH). PH determines how far in advance of EoL the prognostics algorithm predicts RUL with the desired performance, which is identified by the \(\alpha \)-bounds and evaluated by the \(\beta \)-criterion. Equation (1) is used for PH calculation

$$\begin{aligned} \mathrm{PH}=t_\mathrm{EoL} -t_{i_{\alpha \beta }}, \end{aligned}$$
(1)

where \(i_{\alpha \beta } =\mathrm{min}\{j\vert ( {j\in p})\wedge (\pi {[r(j)]}_{-\alpha }^{+\alpha } \ge \beta )\}\) is the first time index at which predictions satisfy the \(\beta \)-criterion for a given \(\alpha \), \(t_{i_{\alpha \beta }}\) is the corresponding time, p is the set of all time indexes at which predictions are made, l is the index of the lth UUT, \(\beta \) is the minimum acceptable probability mass, r(j) is the predicted RUL distribution at time \(t_{j}\), \(t_{\mathrm{EoL}}\) is the time of the actual EoL, and \(\pi [r( j)]_{-\alpha }^{+\alpha }\) is the probability mass of the prediction PDF within the \(\alpha \)-bounds. PH can be used to compare the performance of two algorithms (Fig. 28).
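
The following minimal sketch shows how PH could be computed when each RUL prediction is available as a set of samples. The way the \(\alpha \)-bounds are anchored around the ground-truth RUL is one common choice among the variants discussed in [7], and all names and values are illustrative.

```python
import numpy as np

def beta_criterion(rul_samples, lo, hi, beta):
    """True if at least beta probability mass of the predicted RUL
    distribution (given as samples) lies inside the bounds [lo, hi]."""
    mass = np.mean((rul_samples >= lo) & (rul_samples <= hi))
    return mass >= beta

def prognostic_horizon(pred_times, rul_samples_seq, t_eol, alpha, beta):
    """PH = t_EoL - t_{i_alpha_beta} (Eq. 1): the time between the first
    prediction that satisfies the beta-criterion and the actual EoL.
    Returns None if no prediction ever satisfies the criterion."""
    for t, samples in zip(pred_times, rul_samples_seq):
        true_rul = t_eol - t
        # alpha-bounds around the ground-truth RUL, taken here as +/- alpha * t_EoL
        lo, hi = true_rul - alpha * t_eol, true_rul + alpha * t_eol
        if beta_criterion(np.asarray(samples), lo, hi, beta):
            return t_eol - t
    return None
```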

Table 1 End user based classification (adapted from [6])
Fig. 27
figure 27

Waterfall design of prognostics performance metrics [7]

If the algorithm meets the PH requirements, we can apply the second metric, \(\alpha -\lambda \) performance. This metric checks whether the algorithm stays within the required accuracy margin (the \(\alpha \)-bounds) at a specified time instant \(t_\lambda \). As the algorithm approaches EoL, the required accuracy margin shrinks, which means that the algorithm's performance must improve with time as more data become available, becoming best just before EoL. The \(\alpha -\lambda \) performance metric thus creates an accuracy cone that converges with time (Fig. 29).

The \(\alpha -\lambda \) performance is a binary measure: a pass/no-pass concept is applied, as identified by (2).

$$\begin{aligned} \alpha -\lambda \,\mathrm{accuracy}=\left\{ {{\begin{array}{ll} 1&{}\quad \mathrm{if} \,\, \pi \left[ r(i_\lambda )\right] _{-\alpha }^{+\alpha } \ge \beta \\ 0&{} \quad \mathrm{otherwise} \\ \end{array} }} \right. , \end{aligned}$$
(2)

where \(\lambda \) is the time window modifier such that \(t_\lambda = t_{\mathrm{p}} + \lambda ( t_{\mathrm{EoL}} - t_{\mathrm{p}})\), \(\beta \) is the minimum acceptable probability for the \(\beta \)-criterion, \(r(i_{\lambda })\) is the predicted RUL distribution at time index \(i_{\lambda }\), and \(\pi [r( i_\lambda )]_{-\alpha }^{+\alpha }\) is the probability mass of the prediction PDF within the \(\alpha \)-bounds.
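
Continuing the sketch above under the same assumptions, the \(\alpha -\lambda \) accuracy of (2) reduces to a few lines; the shrinking cone appears because the bounds are proportional to the current true RUL.

```python
import numpy as np

def alpha_lambda_accuracy(t_p, t_eol, lam, rul_samples, alpha, beta):
    """Binary alpha-lambda accuracy (Eq. 2), evaluated at
    t_lambda = t_p + lam * (t_EoL - t_p)."""
    t_lam = t_p + lam * (t_eol - t_p)
    true_rul = t_eol - t_lam
    # alpha-bounds proportional to the current true RUL, so the cone shrinks
    lo, hi = (1 - alpha) * true_rul, (1 + alpha) * true_rul
    samples = np.asarray(rul_samples)
    mass = np.mean((samples >= lo) & (samples <= hi))
    return 1 if mass >= beta else 0

# e.g. halfway between the first prediction and EoL (lam = 0.5)
rng = np.random.default_rng(0)
print(alpha_lambda_accuracy(t_p=10, t_eol=100, lam=0.5,
                            rul_samples=rng.normal(45, 4, 1000),
                            alpha=0.2, beta=0.5))
```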

The third metric, used after the first and second metrics are satisfied, measures how the algorithm's prediction improves with time with respect to the actual RUL. This can be measured using the relative accuracy (RA) and the cumulative relative accuracy (CRA). This measure is similar to the \(\alpha -\lambda \) performance, but the calculation is performed with respect to the actual RUL. RA is a normalized error between the predicted RUL and the actual RUL at a specific time index \(i_\lambda \). RA ranges from 0 to 1, with 1 being the best result. It is calculated according to (3).

$$\begin{aligned} \mathrm{RA}_\lambda ^l =1-\left( \left| {r_{*}^l ( {i_\lambda })-r^l( {i_\lambda })} \right| /r_{*}^l ( {i_\lambda })\right) , \end{aligned}$$
(3)
Fig. 28
figure 28

PH based on point prediction, PH based on \(\beta \)-criterion [7]

Fig. 29
figure 29

a RUL versus EoL. b RUL error versus EoL [7]

where \(\lambda \) is the time window modifier such that \(t_\lambda = t_{\mathrm{p}} + \lambda ( t_{\mathrm{EoL}} - t_{\mathrm{p}})\), l is the index of the lth UUT, \(r_{*}^{l}( i_\lambda )\) is the ground-truth RUL at time index \(i_\lambda \), and \(r^l( {i_\lambda })\) is an appropriate central tendency point estimate of the predicted RUL distribution at time index \(i_{\lambda }\). The illustration of RA is given in Fig. 30.

Fig. 30
figure 30

Relative accuracy [7]
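
RA, together with its cumulative form CRA introduced next in (4), is simple enough that a short sketch may help; the uniform default weights below are an assumption made for illustration.

```python
def relative_accuracy(true_rul, pred_rul):
    """RA (Eq. 3): 1 - |r* - r| / r*, where r* is the ground-truth RUL and
    r is a central-tendency point estimate of the prediction; 1 is perfect."""
    return 1.0 - abs(true_rul - pred_rul) / true_rul

def cumulative_relative_accuracy(true_ruls, pred_ruls, weights=None):
    """CRA (Eq. 4): normalized weighted sum of the RAs over all prediction
    indices; [7] allows a RUL-dependent weight w(r(i)), uniform here."""
    if weights is None:
        weights = [1.0] * len(true_ruls)
    total = sum(w * relative_accuracy(rs, rp)
                for w, rs, rp in zip(weights, true_ruls, pred_ruls))
    return total / sum(weights)

# predictions made at three times before EoL; the algorithm improves near EoL
print(cumulative_relative_accuracy([90, 60, 30], [75, 52, 28]))  # ~0.88
```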

RA evaluates the algorithm at a specific time instant. To evaluate the general behavior of the algorithm over time, the CRA is a good method: it evaluates the algorithm at multiple time instances up to \(t_\lambda \). CRA is calculated as a normalized weighted sum of the RAs at multiple time instances according to (4).

$$\begin{aligned} \mathrm{CRA}_\lambda ^l =\frac{1}{\left| {P_\lambda } \right| }\mathop \sum \limits _{i \in P_\lambda } w( {r^l( i)})\mathrm{RA}_i^l, \end{aligned}$$
(4)

where \(w({r^l(i)})\) is a weight factor as a function of RUL at each time index, \(P_\lambda \) is the set of all time indexes before \(t_{\lambda }\) at which a prediction is made, and \(\vert P_\lambda \vert \) is the cardinality of that set.

The fourth metric is convergence. Convergence quantifies how fast the algorithm's performance improves with time, based on a specified measure such as accuracy or precision. It is defined as the distance between the centroid of the area under the prediction accuracy or precision curve and the point (\(t_{\mathrm{p}}\), 0). The shorter the distance, the better the convergence of the algorithm (Fig. 31). Equation (5) shows how to calculate convergence.

$$\begin{aligned} C_M =\sqrt{(x_c -t_p )^2+y_c^2}, \end{aligned}$$
(5)

where \(C_{M}\) is the Euclidean distance between the center of mass (\(x_{c}\), \(y_{c}\)) and (\(t_{\mathrm{p}}\), 0), M(i) is a nonnegative prediction accuracy or precision metric whose value varies with time, and (\(x_{c}\), \(y_{c}\)) is the center of mass of the area under the curve M(i) between \(t_{\mathrm{p}}\) and \(t_{\mathrm{EoUP}}\).
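
A minimal sketch of (5) follows, treating M(i) as piecewise-constant between successive prediction times (one reasonable discretization; [7] gives the exact formulas).

```python
import math

def convergence(times, M, t_p):
    """C_M (Eq. 5): distance between (t_p, 0) and the centroid (x_c, y_c)
    of the area under a nonnegative accuracy/precision curve M(i)."""
    area = num_x = num_y = 0.0
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        area += dt * M[k]
        num_x += 0.5 * (times[k + 1] ** 2 - times[k] ** 2) * M[k]
        num_y += 0.5 * dt * M[k] ** 2
    x_c, y_c = num_x / area, num_y / area
    return math.hypot(x_c - t_p, y_c)

times = [10, 20, 30, 40, 50]                              # up to t_EoUP
print(convergence(times, [0.8, 0.6, 0.3, 0.1], t_p=10))   # improving: ~13.3
print(convergence(times, [0.8, 0.8, 0.8, 0.8], t_p=10))   # static:    ~20.0
```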

Fig. 31
figure 31

Convergence of three different algorithms [7]

Fig. 32
figure 32

Integrated system of proactive maintenance ISPM [72]

The previously discussed metrics are very useful and can be considered an achievement in the establishment of prognostics-specific performance metrics. However, some problems still need to be resolved, such as the connection between top-level user requirements and performance metrics. Moreover, these metrics are designed for off-line algorithm evaluation, so there is a need for online evaluation methods for cases where ground-truth data are not available. Accurate and applicable prognostics performance metrics are needed to obtain a standardized methodology for algorithm validation, verification, and certification, and to compare different algorithms efficiently.

Prognostics standardization

Prognostics standardization is highly required to allow easy, fast, and effective prognostics system development and deployment. It will also unify the concepts within the community and help in identifying technology gaps that need more attention.

Prognostics standardization can be divided into three types: standardization in prognostics terms and definitions, standardization in prognostics system development, and standardization in prognostics metrics.

Standardization in prognostics terms and definitions is intended to remove ambiguity when using different terminologies. This supports a clear understanding when reading and discussing any prognostics topic.

ISO-13372 and ISO-13381-1 [72] present some of these terms and vocabularies. The prognostics national framework [7] contains a rich glossary of prognostics terms, such as RUL, UUT, and EoL, as well as definitions, such as time index, time of detection of fault, and prognostics features.

Standardization in prognostics system development is aimed at generalizing the prognostics process, information exchange within the prognostics system, implementation of the prognostics system, and prognostics system design methodology. Because prognostics is the key enabler of CBM, the standardization of prognostics system development is addressed within CBM system development.

ISO-13374 provides a general framework for CBM (Fig. 2). OSA-CBM uses the framework presented in [12] and introduces a standard way to implement it. It defines how to implement each CBM component by defining the data types and the information flow between components. OSA-CBM is written in UML to simplify the software engineers' task. The existence of such an open system architecture implies large cost savings, because there is no need to develop each CBM system from scratch; instead, one can use already developed standard components. Also, each vendor can master definite parts of the CBM system instead of developing the entire one, which allows both competition and cooperation between different vendors.
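
As an illustration only, the six functional blocks commonly used to summarize the ISO-13374 framework can be pictured as a processing chain. The stub implementations and the data passed between them below are invented; OSA-CBM specifies the real data types and interfaces in UML.

```python
def data_acquisition(raw):      return raw                    # DA: sensor readings
def data_manipulation(x):       return x                      # DM: filtering, features
def state_detection(x):         return {"features": x}        # SD: compare to baselines
def health_assessment(s):       return {**s, "health": 0.9}   # HA: current health grade
def prognostics_assessment(h):  return {**h, "rul": 120.0}    # PA: project health, RUL
def advisory_generation(p):     return f"plan maintenance within {p['rul']} h"  # AG

PIPELINE = [data_acquisition, data_manipulation, state_detection,
            health_assessment, prognostics_assessment, advisory_generation]

def run_cbm(raw):
    x = raw
    for block in PIPELINE:   # information flows one layer at a time
        x = block(x)
    return x

print(run_cbm([1.2, 1.3, 1.1]))
```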

Fig. 33
figure 33

Prognostics decision support system [73]

ISO-13381-1 correlates the prognostics process with the monitoring and decision-making processes within the e-maintenance architecture (Fig. 32).

Based on the work done in [13] and ISO-13381-1, Voisin et al. [9] presented a formalization of a generic prognostics process. This formalization is based on five steps:

  • Formalization of the prognosis process environment. This is achieved by defining the relation between the prognostics process and other business processes, such as monitoring, diagnostics, and decision support. It also defines the connections between these processes.

  • Formalization of the prognosis final purpose. A general method is defined for calculating the final output of prognostics, which is the RUL. This is done by defining two things: the first is the threshold of the component performance (usually its EoL); the second is the functional threshold of the whole system performance, which defines the limits of safe and useful system operation.

  • Formalization of the functional decomposition of the prognosis. For RUL calculation, the prognostics process is divided into four sub-processes: three of them run in sequence to compute the RUL ("To initialize state and performances", "To project", and "To compute RUL"), and the fourth one ("To pilot prognosis") coordinates the work of the other three (see the sketch after this list).

  • Formalization of the coordination of the sub-processes needed to fulfill the prognosis mission. This is achieved by creating a sequence diagram of the prognostics.

  • Formalization of the prognosis objects and data. Class diagrams are presented for prognosis data and objects based on the OSA-CBM standards [13]. This presentation is created using the UML class diagram representation.
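
A minimal sketch of the functional decomposition named above follows; the method bodies are toy placeholders (the actual formalization is expressed in UML class and sequence diagrams), and the linear degradation model is invented for illustration.

```python
class GenericPrognosis:
    RATE = 0.01  # toy degradation rate, for illustration only

    def initialize_state_and_performances(self, monitoring_data):
        """'To initialize state and performances'."""
        self.state = float(monitoring_data)

    def project(self, horizon):
        """'To project' the state into the future (toy linear projection)."""
        return self.state + self.RATE * horizon

    def compute_rul(self, projected, threshold):
        """'To compute RUL' against the failure threshold."""
        return max(0.0, (threshold - projected) / self.RATE)

    def pilot_prognosis(self, monitoring_data, horizon, threshold):
        """'To pilot prognosis' coordinates the three sub-processes above."""
        self.initialize_state_and_performances(monitoring_data)
        return self.compute_rul(self.project(horizon), threshold)

print(GenericPrognosis().pilot_prognosis(monitoring_data=0.3, horizon=10, threshold=1.0))
```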

The standardization of prognostics metrics allows a standardized methodology to evaluate, validate, and compare different prognostics algorithms. Prognostics metrics have already been discussed in the validation and verification section.

Another type of prognostics standardization is the unification of research objectives. This unification has not yet been considered by the research community, although it would help fill the gaps in prognostics technology and create an integrated effort to resolve prognostics challenges.

Post prognostics reasoning

The prognostics system provides one piece of information (the RUL with its corresponding confidence level) that the decision maker combines with other pieces of information to take appropriate decisions about system maintenance and operation, so as to increase system reliability, safety, and availability as well as to reduce total life cycle cost and the logistics footprint. That is, having valuable information is important, but using it correctly and efficiently is much more important.

Post-prognostics reasoning is a challenge because it requires developing an integrated information system that links operation, maintenance, logistics, decision support, and decision making together in a way that allows each user to benefit from the information that other users have without interrupting the system.

ISO-13381-1 defines what to do practically with prognostics information. It identifies the alert (alarm) point for the remaining life before failure, which allows taking the required counteraction to rescue the system function from failure. Another defined point is the trip (shutdown) limit, at which the system is turned off before failure. The trip limit is normally lower than the failure threshold of the system.
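
The alert/trip idea maps naturally onto simple threshold logic; the limit values and names in this sketch are illustrative and would be application-specific in practice.

```python
def recommended_action(rul, alert_limit, trip_limit):
    """Map an RUL estimate to an action following the alert/trip limits
    idea of ISO-13381-1 (trip is below alert by construction)."""
    assert trip_limit < alert_limit
    if rul <= trip_limit:
        return "TRIP: shut the system down before failure"
    if rul <= alert_limit:
        return "ALERT: plan and take the required counteraction"
    return "CONTINUE: normal operation"

print(recommended_action(rul=8.0, alert_limit=50.0, trip_limit=10.0))  # TRIP
```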

The limits defined in ISO-13381-1 do not show the full picture of post-prognostics reasoning. Iyer, Goebel, and Bonissone [73] developed a decision support system that uses information from a reliable prognostics system and presents different evaluated decisions to the decision maker to enhance the logistics of a fleet of assets. A block diagram of this system is shown in Fig. 33.

The information from the OBPHM is processed in the PDSM. The PDSM is composed of two modules, the IP and the MODSS. The IP handles the incoming information: it checks its consistency, deals with uncertainty, and aggregates all of this information so that it is more useful to the MODSS module. The MODSS module contains two submodules, the OpSIM module and the EMOO module. The MODSS module provides different ranked decisions and evaluates their impact on the operation. The output from the MODSS is presented to the user on an HMI.
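
Purely as a reading aid, the information flow just described can be sketched as below. Every type, field, and rule here is invented; [73] does not prescribe these interfaces.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Prediction:         # OBPHM output consumed by the IP (fields invented)
    asset_id: str
    rul_mean: float
    rul_std: float

@dataclass
class RankedDecision:     # MODSS output shown on the HMI (fields invented)
    action: str
    impact_score: float

def information_processor(preds: List[Prediction]) -> List[Prediction]:
    """IP stand-in: a crude consistency/uncertainty screen."""
    return [p for p in preds if p.rul_std < 0.5 * p.rul_mean]

def modss(preds: List[Prediction]) -> List[RankedDecision]:
    """MODSS stand-in: rank actions by a toy urgency score (the real
    OpSIM and EMOO submodules simulate and optimize operations)."""
    decisions = [RankedDecision(f"service {p.asset_id}", 1.0 / p.rul_mean)
                 for p in preds]
    return sorted(decisions, key=lambda d: d.impact_score, reverse=True)

fleet = [Prediction("engine-1", 120.0, 15.0), Prediction("engine-2", 40.0, 30.0)]
for d in modss(information_processor(fleet)):
    print(d)  # what the HMI would present to the decision maker
```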

More implementations of automated decision support systems based on prognostics information are needed to increase the benefit from prognostics systems and to increase their applicability and acceptance by the engineering community.

Conclusion

Prognostics is quickly evolving, but it still needs more attention from governments, industry, and academia to become less of an art and more of a science. This could be achieved if all efforts in this field were integrated to obtain clear and definite steps for prognostics system design, development, validation, and verification.

In this paper, we tried to present a complete vision of prognostics as a major component of ISHM. We gathered a lot of sparse information about prognostics and combined it into an integrated work that shows the importance of prognostics and its influential role in ISHM. We also clarified how maintenance strategies can shift from "fail and fix" to "predict and prevent" based on the proactivity of prognostics, and how prognostics is the main building block of CBM. The concept that relates prognostics to health management (PHM) has also been introduced. After that, we discussed the prognostics approaches, their advantages and disadvantages, and how to choose a suitable technique according to the prognostics problem definition. We also presented many prognostics applications, both already deployed and experimental. Finally, we addressed the most challenging aspects of prognostics and how the research community is trying to resolve these challenges.

This literature review about prognostics is mainly intended for new prognostics researchers. Professional prognostics researchers who delve into the details of different prognostics aspects can also benefit from it to recall the concepts.