Engineering Analysis of Failure: A Determination of Cause Method

The purpose of this article is to present an engineering method for the determination of cause by the identification of defects that lead to failure. Further discussion on this topic is of course warranted and this article is anticipated to catalyze such discussions. In order to facilitate the development and presentation of this cause determination method, the definition of defect as it relates to failure is first presented. While not all failures are the result of defects (and not all defects result in failures), identification of a defect may point to opportunities to prevent recurrence and assist in the determination of cause. Furthermore, use of this method also serves to identify causes of failure that are not attributable just to the actions of responsible parties: such as wear and tear, acts of nature, and the unknown. Once the cause is identified, the resolution, recovery, and recurrence prevention process then has the opportunity to move forward. To further demonstrate application of the cause determination method presented here, case studies of failures are provided.


Introduction
When consideration is given to the cause of a failure (with consequent loss in the form of property damage or personal injury), two primary aspects are at issue: reasonable preventability of the failure, and responsibility. A related aspect of preventability is whether information that comes to light as a result of analyzing the failure will make prevention of recurrence of the failure mode practical; that is, whether reasonable means to prevent such a failure can be determined.
The approach of this article is to apply the perspective of the engineer to the question of cause determination. As such this article is intended to facilitate the work of engineers in fulfilling their primary responsibility to ''Hold paramount the safety, health, and welfare of the public [1].'' A determination that a failure was preventable by reasonable means implies that some lack of reasonableness was present prior to the failure. However, it is important to note that reasonable means available after a failure has taken place may not have been available before the failure took place. That is, the occurrence of a failure may of itself bring to light unanticipated information that now allows reasonable preventative steps to be taken.
While much more can be said on this topic, the purpose of this article is to present an engineering method for the determination of cause by the identification of defects that lead to failure. The work of Charles O. Smith on product liability and design, published in the ASM Handbook 11 Failure Analysis and Prevention, is of particular pertinence to this discussion [2]. Further discussion on this topic is of course warranted and this article is anticipated to catalyze such discussions. In order to facilitate the development and presentation of this cause determination method, the definition of defect as it relates to failure is first presented. While not all failures are the result of defects (and not all defects result in failures), identification of a defect may point to opportunities to prevent recurrence and assist in the determination of cause. Furthermore, use of this method also serves to identify causes of failure that are not attributable just to the actions of responsible parties: such as wear and tear, acts of nature, and the unknown. Once the cause is identified, the resolution, recovery, and recurrence prevention process then has the opportunity to move forward. To further demonstrate application of the cause determination method presented here, case studies of failures are provided.

Definition of Defect
In common usage defect may be defined as ''an imperfection that impairs worth or utility'' or ''a lack of something necessary for completeness, adequacy, or perfection [3].'' A legal definition of defect is ''an imperfection or shortcoming, especially in a part that is essential to the operation or safety of a product [4].'' However, from an engineering perspective as related to failure analysis and cause determination, it is useful to define a defect as an identified deviation from reasonable efforts to prevent a failure or to mitigate the severity of a failure. Thus, it is the lack of reasonableness that is the key for the engineer in identifying the defect. Furthermore, using lack of reasonableness in identifying defects will serve to go beyond the frequently applied superficial (wrong) analyses that assert that the occurrence of a failure is prima facie evidence that there was a defect. The defect is the lack of reasonableness that existed prior to the failure and that resulted in the circumstances that led to the failure. However, it should be noted that a defect can exist that is not causal. As discussed later in this article, the defect (or lack of reasonableness) must have resulted in the failure to then allow the engineer to assert that the responsible person or entity caused the failure. Defect identification then allows the engineer to proceed with efforts to protect the public. In this regard, the following working definitions are set forth:
Hazard: a condition or situation that can result in property damage, personal injury, or death.
Risk: the probability that a hazard will become manifest.
Controlled Hazard: a hazard for which all reasonable steps have been taken to minimize the risk associated with the hazard (and for which no unreasonable steps have been taken that increase the risk associated with the hazard).
Defect: an uncontrolled hazard that is a lack of reasonable steps or a presence of unreasonable steps.
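The working definitions above can be restated as a small data structure. The following is only an illustrative sketch: the class name Hazard, its field names, and the example values are assumptions introduced here and are not part of the article. Risk (a probability) is deliberately left out, since the definitions of controlled hazard and defect depend only on the reasonableness of the steps taken.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hazard:
    """A condition that can result in property damage, injury, or death.

    Field names are hypothetical; they restate the working definitions.
    """
    description: str
    all_reasonable_steps_taken: bool   # every reasonable risk-reducing step was taken
    unreasonable_steps_taken: bool     # some unreasonable step increased the risk

    @property
    def is_controlled(self) -> bool:
        # Controlled hazard: all reasonable steps taken AND no unreasonable steps.
        return self.all_reasonable_steps_taken and not self.unreasonable_steps_taken

    @property
    def is_defect(self) -> bool:
        # Defect: an uncontrolled hazard (lack of reasonable steps, or
        # presence of unreasonable steps).
        return not self.is_controlled


# Example values are assumed for illustration only.
unguarded_hole = Hazard("open hole at a construction site, no barriers",
                        all_reasonable_steps_taken=False,
                        unreasonable_steps_taken=False)
guarded_hole = Hazard("open hole at a construction site, barriers in place",
                      all_reasonable_steps_taken=True,
                      unreasonable_steps_taken=False)

print(unguarded_hole.is_defect)  # True
print(guarded_hole.is_defect)    # False
```

Note that, as the article stresses, a defect in this sense need not be causal; the sketch only classifies the hazard and says nothing about whether it led to the failure.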
It should be acknowledged that, with respect to the concept of defect, another recognized definition of a defect is focused on physical aspects of a given component that was involved in a failure. For example, the ASM Materials Engineering Dictionary defines a defect as: ''(1) A discontinuity whose size, shape, orientation, or location makes it detrimental to the useful service of the part in which it occurs. (2) A discontinuity or discontinuities which by nature or accumulated effect (for example, total crack length) renders a part or product unable to meet minimum applicable acceptance standards or specifications. This term designates rejectability [5].'' In the ASM Handbook Volume 11, defect is defined as: ''(1) An imperfection (deviation from perfection) that can be shown to cause failure by quantitative analysis and that would not have occurred in the absence of the imperfection [6].'' In this situation one might say that there was a defect in a shaft that led to crack propagation and fracture of the shaft. This focused definition of defect is also valid and useful for describing the physical aspects that resulted in a failure. As such the question that is answered is, ''What happened?'' However, in another section of the ASM Handbook Volume 11 dealing with products liability and design, the above definition of defect is but one of several types of defect that an engineer who engages in failure analysis should consider [7]. Other defect types to be considered include manufacturing defects, design defects, marketing defects, etc. Thus, it is seen that a failure can result from a defect that is other than some physical aspect of the component involved in a failure. By defining defect in the way that is set forth above, that is ''an uncontrolled hazard (with the associated lack of reasonableness),'' two benefits are realized. First, defects other than physical defects are included and may be considered as a part of the analysis. 
Second, there is the further opportunity to address what caused the loss. The question that is answered is, ''Why did it happen?''

Determination of Cause Method
The cause of a failure may be categorized as one of four distinct types:
1. Wear and tear.
2. Unknown.
3. Actions of a person or entity.
4. Act of the natural world.
[...] reasonableness of the conclusions derived from the application of the method. The object here is to present a basic path of investigation.
A graphical representation of this method for determination of cause is shown in Fig. 1. At each decision point in the diagram, application of the scientific method is required. Given that the situation is analysis of a failure after it has taken place, the scientific method will proceed by making observations, applying inductive reasoning to formulate hypotheses as to how the observed conditions could have come about, utilizing the process of elimination (part of both inductive and deductive reasoning) to test and reject hypotheses, and then using abductive reasoning to reach a conclusion from among the hypotheses that were not rejected. Use of reference materials, experiments, and analytical techniques of engineering are expected.
Referencing Fig. 1, Decision Point D1 poses the question ''Is the loss/damage/injury consistent with reasonable care and use over a period of time of the involved object(s)?'' A yes answer to this question will lead to Cause C1: wear and tear. An example that would fall into this category is automobile tires that exhibit uniform wear consistent with use of the tires. Another example would be weathering to the exterior of a building that is consistent with the age of the building. In both instances, signs of abuse would not be observed and indication of proper maintenance would be confirmed. A no answer to the question then leads to Decision Point D2.
Decision Point D2 poses the question ''Are the hazards that resulted in the loss/damage/injury able to be identified?'' A no answer at this point will lead to Cause C2: unknown. An example that would fall into this category might be a piece of electronic equipment that cannot be tested since it is not functioning and where the history of the equipment is not known. A yes answer then leads to the Decision Point D3.
Decision Point D3 poses a compound question, the central point of which is to assess whether or not the actions leading up to the failure were reasonable. Lack of reasonableness may be manifest either in the absence of a reasonable act or in the presence of an unreasonable act. To wit: were reasonable steps taken to minimize the risk due to the identified hazard(s), and were there no unreasonable steps taken that would have increased the risk from the hazard? A no answer asserts an identifiable unreasonableness and will lead to Cause C3a, a defect due to actions of a person or entity (see footnote 1). An example of a lack of reasonable steps would be a contractor who dug a hole at a construction site but did not put up barriers or markers to prevent someone from falling into the hole. An example of the presence of unreasonable steps would be a modification to a machine or process operating system that exposes operating personnel to injury. A yes answer leads to Decision Point D4.
Decision Point D4 (and also Decision Point D2 discussed earlier) recognizes that, unsatisfying as it may be, there are times when the information available is insufficient to identify a cause and therefore leads us to Cause C2: unknown. Decision Point D4 poses the question ''Was the loss due to identified hazards that exceeded the control provided by reasonable steps taken?'' Consider a building damaged by a fire in which the fire damage is great enough that the pre-fire conditions could not be established, the origin of the fire could not be determined, and for which the pre-fire history of the building is not known. Whether or not the controls were exceeded cannot be assessed and a no answer is required. Another example would be a code-compliant building that has sustained wind damage from winds that were less than the design and construction of the building should have allowed the building to withstand. No deficiencies of materials, workmanship, or design are identified. In this case, adequate controls for the identified hazard (wind) were in place, but damage was sustained nevertheless. Either there was another unknown hazard (different from the identified wind load hazard) that resulted in the loss or some aspect of the history of the building created a deficiency that is not able to be identified. Regardless, there would be a no answer which leads to Cause C2: unknown. Another example leading to a no answer here would be a fractured part where design, choice of materials, and installation are known and confirmed, but where the service use history is not known. Further consideration of Decision Point D4 leads to the alternative in which the controls for the identified hazard are known to have been exceeded. A yes answer would lead to Decision Point D5. An example here would be a 500-year flood (hazard) that ruptured (damaged) a dam built to withstand a 100-year flood. Another example would be a consumer product that was manufactured according to the best knowledge and practices at the time of manufacture that later injured a user due to some previously unrecognized hazard. The protections provided by the control were exceeded by the hazard.
Fig. 1 Overview of method
Footnote 1: The ''person or entity'' cause is separated into two categories to acknowledge the fact that a person or entity can be responsible for a loss incident even when a defect does not exist.
Recall that in order to reach this point in the analysis, it has been established that the actions leading up to the failure were reasonable and that the controls that the reasonable actions put in place were exceeded by the hazard. Further, note that the example hazards presented in consideration of Decision Point D4 included a hazard of natural origin (the 500-year flood) and a hazard of human origin (injury from a manufactured product). Decision Point D5 seeks to differentiate between human-created hazards and natural hazards and, therefore, poses the question ''Was the hazard that resulted in the damage due to a human-created hazard?'' A no answer will lead to Cause C4: Act of the Natural World. A yes answer leads to Decision Point D6.
Arriving at this point in the analysis, if the hazard is now recognizable due to information brought to light by the failure, then a judgment may be made as to the reasonableness of the hazard. Given the information that exists due to the fact that the failure has taken place, Decision Point D6 then poses the question ''Were the conditions prior to the loss unreasonably hazardous?'' A yes answer will lead to Cause C3a: a defect due to actions of a person or entity. In a case such as this, even though due care was exercised and appropriate reasonable steps were taken to prevent loss from the hazard (based upon the information and practices available at the time), the loss still occurred as a result of a hazard that was not controlled. Further, a person or entity was involved.
This example defines a special case of assigning responsibility (i.e., assigning cause) that was ultimately defined into case law in the USA. The legal term that applies is ''strict liability [8],'' a concept that needs to be recognized and appreciated by the engineer. Referencing step D2 in Fig. 1, the hazard that resulted in the loss could not have been known in advance. The answer at this step is no. However, under the theory of strict liability, the cause, rather than being unknown, is attributed to a person or an entity under Cause C3a: defect due to action of person or entity. Two examples will suffice to explain. In the first example, a company creates a product that is identified as defective only after the fact (that is, without prior knowledge of the hazardous condition). The product later results in a loss or injury. Only then is the product identified as defective, and only then is there recognition of steps that could have been taken to control the hazardous condition of the product. A product, therefore, can be deemed defective even though the manufacturer had taken reasonable care in its production and there is no record of abuse. The second example is a premature service-related fracture of an axle on a machine part or on a vehicle. The design and manufacturing record confirms that all reasonable steps were taken to avoid premature fracture. The vehicle was not misused. Analysis determines that the fracture was due to a latent manufacturing deficiency in the material, present in spite of the record of design, manufacture, and care.
Alternatively, if the utility of an object is inseparable from a hazard and a benefit is derived from use of the object, then the object has an inherent hazard that is reasonable and the risk from the inherent hazard is borne by a person or entity. The answer to the question at Decision Point D6 is no. While in this instance there is no defect, because a benefit is derived from an inherent hazard, the risk is borne by a person or entity, which leads to Cause C3b: actions of a person or entity. Objects whose basic function is not separable from some hazardous feature would be deemed reasonably hazardous. A sharp knife would be a recognizable example that falls into this category. Flammable fuel for automobiles would be another.
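The sequence of Decision Points D1 through D6 described above can be summarized as a short function. This is a minimal sketch, not the authors' implementation: the function name, parameter names, and the Cause enumeration are assumptions chosen for illustration, each yes/no answer is supplied by the investigator after applying the scientific method at that decision point, and the strict liability discussion is not modeled separately.

```python
from enum import Enum


class Cause(Enum):
    C1 = "wear and tear"
    C2 = "unknown"
    C3A = "defect due to actions of a person or entity"
    C3B = "actions of a person or entity (no defect)"
    C4 = "act of the natural world"


def determine_cause(d1_consistent_with_reasonable_use: bool,
                    d2_hazards_identified: bool = False,
                    d3_actions_reasonable: bool = False,
                    d4_controls_exceeded: bool = False,
                    d5_human_created_hazard: bool = False,
                    d6_unreasonably_hazardous: bool = False) -> Cause:
    """Walk Decision Points D1..D6 in order (hypothetical parameter names)."""
    if d1_consistent_with_reasonable_use:   # D1 yes -> wear and tear
        return Cause.C1
    if not d2_hazards_identified:           # D2 no -> unknown
        return Cause.C2
    if not d3_actions_reasonable:           # D3 no -> defect (person or entity)
        return Cause.C3A
    if not d4_controls_exceeded:            # D4 no -> unknown
        return Cause.C2
    if not d5_human_created_hazard:         # D5 no -> act of the natural world
        return Cause.C4
    # D6: yes -> defect; no -> inherent (reasonable) hazard, no defect
    return Cause.C3A if d6_unreasonably_hazardous else Cause.C3B


# Examples taken from the text above.
worn_tires = determine_cause(True)
flood_over_dam = determine_cause(False, True, True, True, False)
sharp_knife = determine_cause(False, True, True, True, True, False)
unguarded_hole = determine_cause(False, True, False)

print(worn_tires.name, flood_over_dam.name, sharp_knife.name, unguarded_hole.name)
# C1 C4 C3B C3A
```

Answers to later decision points are ignored once a cause has been reached, which mirrors the way the flow chart terminates at the first cause encountered.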

Thoughts as to Putting the Method into Practice
Use of this method often leads to a well-defined single cause for a failure. However, in some instances the cause of a failure is attributable to a combination of underlying factors. That is, there may be more than one cause. For example, a product that is unreasonably hazardous may be used by someone in a way that is also unreasonable, and it is the combination of these conditions that results in an injury. Elimination of either the unreasonable hazard or the unreasonable use would have prevented the failure. Thus, it is incumbent upon the engineer in conducting the analysis to determine whether multiple factors were present.
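The multiple-factor situation can be illustrated with a simple elimination test: remove one condition at a time and ask whether the failure would still have occurred. The sketch below assumes, purely for illustration, that the failure requires all listed conditions to be present together; the function names and the condition labels are hypothetical.

```python
def failure_occurs(conditions: dict[str, bool]) -> bool:
    # Assumed model: the failure needs every listed condition at once.
    return all(conditions.values())


def contributing_factors(conditions: dict[str, bool]) -> list[str]:
    """Return the conditions whose elimination alone would prevent the failure."""
    if not failure_occurs(conditions):
        return []
    return [name for name in conditions
            if not failure_occurs({**conditions, name: False})]


injury = {
    "unreasonably hazardous product": True,
    "unreasonable use": True,
}
print(contributing_factors(injury))
# ['unreasonably hazardous product', 'unreasonable use']
```

Under this assumed model both conditions are reported, consistent with the observation that eliminating either one would have prevented the failure.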

Case Study 1: Rupture of a Pressure Vessel: Improper Maintenance
This incident concerns the rupture of a steam accumulator that was part of a steam-generating facility with consequent damage to a facility [9]. On a normal working day while being operated at its typical working pressure of 120 psi, the pressurized vessel ruptured without warning. The weld that connected the bottom head section of the pressure vessel to the main shell section had fractured separating the bottom head of the vessel from the shell. Manufacture of this pressure vessel falls under Section VIII, Division 1 of the ASME Boiler and Pressure Vessel Code. It had been designed for a maximum working pressure of 150 psi. An appropriate steel had been used for the manufacture of the pressure vessel. The code specifies that a hydrostatic test pressure of 225 psi be applied to the pressure vessel at the time of manufacture.
A review of the history of the pressure vessel revealed a drain coupling on the bottom of the pressure vessel had been repaired a few years prior to the incident. Repairs were carried out in accordance with the National Board Inspection Code (the NBIC), with a hydrostatic pressure test at 120 psi.
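The pressures quoted so far can be put side by side with a few lines of arithmetic. The sketch below uses only the figures stated in the text (150 psi design, 120 psi operating, 225 psi test at manufacture, 120 psi test after the drain coupling repair); the variable names are assumptions, and no claim is made here about what either code requires beyond what the article states.

```python
design_psi = 150.0            # maximum working pressure (design)
operating_psi = 120.0         # typical working pressure
manufacture_test_psi = 225.0  # hydrostatic test at time of manufacture
repair_test_psi = 120.0       # hydrostatic test after drain coupling repair

ratios = {
    "manufacture test / design": manufacture_test_psi / design_psi,
    "operating / design": operating_psi / design_psi,
    "repair test / operating": repair_test_psi / operating_psi,
}

for label, value in ratios.items():
    print(f"{label}: {value:.2f}")
# manufacture test / design: 1.50
# operating / design: 0.80
# repair test / operating: 1.00
```

The comparison shows that the test after the drain coupling repair was conducted at the operating pressure, whereas the test at manufacture was conducted at 1.5 times the design pressure.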
Eighteen months prior to the incident, a small leak was detected at a crack in the weld that connected the bottom head to the shell of the pressure vessel. Repair work was undertaken by welding the crack. The vessel was not subjected to a hydrostatic test prior to being placed back into service.
Examination of the fracture surfaces subsequent to the incident revealed that a crack (present since the time of original manufacture) had grown due to cyclic loading while the pressure vessel was in service, resulting in the rupture. The precise size of the crack at the time of manufacture could not be determined. However, the pressure vessel did pass the ASME code required hydrostatic testing conducted at the time of manufacture.
The leak that precipitated the repair work was determined to be from a crack in the weld that connected the bottom head to the shell of the pressure vessel, a crack that had been present from the time of original manufacture. The size of the crack, however, was not so great as to preclude passing the hydrostatic tests both at original manufacture and after replacement of the drain coupling. Returning now to the repair of the crack at the location where the vessel was found to be leaking: repair of such cracks is governed by the NBIC, which requires that the crack be removed as a part of the repair process. Post-loss examination demonstrated that, in fact, the crack was not removed as a part of the repair. The requirements of the NBIC were not fulfilled. As the pressure vessel continued in service, the crack continued to grow.
The crack had manifested as a leak 18 months prior to the rupture, consistent with the ''leak before rupture (or break)'' design philosophy appropriate for pressure vessels [10]. The severity of this earlier condition was modest (a release of steam with limited potential for injury or additional property damage). Repair of this crack created conditions that resulted in a ''rupture before leak'' with consequent greater severity (property damage and personal injury).
The flow chart in Fig. 1 is repeated as Fig. 2 to illustrate application of the method to this case. At Decision Point D1, the rupture was not a result of wear and tear as pressure [...]

Case Study 2: Vehicle Fire
Fire science teaches that in order for a fire to start three things must be brought together in the right combination to enable an uninhibited chemical reaction: a competent ignition source, an ignitable fuel, and oxygen (or an oxidizer). Unintended fires are prevented by keeping this combination from being brought together. In the context of the design and manufacture of internal-combustion engine powered vehicles, ignition sources, ignitable fuels, and oxygen are present throughout the vehicle. The prevention of fires is accomplished by preventing an ignitable combination from coming together.
A partial list of ignition sources in the form of heat sources/hot objects includes:
Wiring faults in the engine compartment.
Wiring faults in the passenger compartment.
Exhaust system.
Catalytic converter.
Exhaust manifold.
Exhaust pipe.
Friction from locked-up accessory drive pulleys.
Wheel bearing deprived of required lubrication.
Underinflated tires.
Dragging brakes.
A partial list of ignitable materials includes:
Fuel (gasoline, diesel, propane, natural gas).
Miscellaneous plastic components within the engine compartment.
Upholstery in the passenger compartment (these are often fire resistant but not fire proof, i.e., they will burn if flame is supported by another fuel).
Foreign objects/debris that has accumulated in the vehicle (grass, objects from the road, animals impacted while in motion or building nests, etc.).
This incident concerns a vehicle designed with an appropriate diesel fuel-handling system within the engine compartment. The hoses, connectors, diesel fuel pump, and storage tanks were selected with care. Also, the long-term wear properties of the components and the service environment were considered. The design took reasonable precautions for the hazards that were identified. Further, the manufacturing process for the assembly of the engine and its subsequent installation in the vehicle addressed issues associated with preventing an unintended fire from being started. However, in specifying the manufacturing process for workers to install the diesel fuel lines, a clamp that attached a hose in the engine compartment needed to be oriented such that a raised portion of the clamp would not rub on an adjoining hose that carried diesel fuel. Instructions were provided for the workers regarding the installation of the bracket and its clamp. Manufacturing test runs were performed. For workers who utilized their right hand to install the clamp, the needed clamp orientation was comfortable. However, for workers who utilized their left hand, a different (more comfortable) orientation for installing the clamp resulted in the raised portion of the clamp being in proximity to the adjoining diesel fuel hose. While the engine was in operation, the diesel fuel hose would vibrate and come into contact with the raised portion of these clamps on an intermittent basis. Over time, a hole was worn in the diesel fuel line. Fuel was released onto the vehicle's exhaust manifold with a consequent fire. After conducting investigations of fires that took place in these vehicles, the manufacturer determined that a different style of clamp, with no raised area that could rub on the adjoining diesel fuel line, could be utilized. However, prior to what was now an appreciation of the hazardous condition presented by the original diesel fuel hose clamp, the step of utilizing the alternative clamp was not recognized. Once the hazard of a fire in the vehicles as a result of the use of the original clamp was recognized, it was then realized that a reasonable means was available to minimize the risk of the fire hazard.
Fig. 3 Application of method to vehicle fire (J Fail. Anal. and Preven.)
The flow chart in Fig. 1 is repeated as Fig. 3 to illustrate application of the cause determination method to this case. At Decision Point D1, the fires were not a result of wear and tear: vehicles are not expected to catch fire in the natural course of their use. Cause C1 was eliminated. Move to Decision Point D2. At Decision Point D2, the hazard of a vehicle fire is identified. Move to Decision Point D3. At Decision Point D3, reasonable steps were taken and no unreasonable steps were taken with respect to the hazard of a fire. Move to Decision Point D4. At Decision Point D4, it is determined that the controls put in place for the identified hazard were exceeded. Move to Decision Point D5. At Decision Point D5, it is evident that, despite reasonable actions having been taken, there was a vehicle fire due to a human-created hazard. Move to Decision Point D6. At Decision Point D6, the hazard is deemed to be unreasonable, as it may be prevented by reasonable means. Cause C3a is affirmed.
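The path taken through the flow chart for the vehicle fire can be traced step by step. The sketch below is an assumed table-driven encoding of the decision points as described in the text (the dictionary FLOW, the function trace, and the answer labels are hypothetical names, and the actual layout of Fig. 1 is not reproduced here). The yes/no answers are those stated in the paragraph above.

```python
# Each decision point maps an answer (True = yes, False = no) to the next node.
FLOW = {
    "D1": {True: "C1", False: "D2"},   # consistent with reasonable care and use?
    "D2": {True: "D3", False: "C2"},   # hazards identified?
    "D3": {True: "D4", False: "C3a"},  # actions reasonable?
    "D4": {True: "D5", False: "C2"},   # controls exceeded by identified hazard?
    "D5": {True: "D6", False: "C4"},   # human-created hazard?
    "D6": {True: "C3a", False: "C3b"}, # conditions unreasonably hazardous?
}


def trace(answers: dict[str, bool]) -> tuple[list[tuple[str, bool]], str]:
    """Follow the answers from D1 until a cause (C-node) is reached."""
    node = "D1"
    path = []
    while node in FLOW:
        answer = answers[node]
        path.append((node, answer))
        node = FLOW[node][answer]
    return path, node


# Answers for the vehicle fire, as given in the text.
vehicle_fire = {"D1": False, "D2": True, "D3": True,
                "D4": True, "D5": True, "D6": True}

path, cause = trace(vehicle_fire)
for decision, answer in path:
    print(decision, "yes" if answer else "no")
print("Cause:", cause)  # Cause: C3a
```

All six decision points are visited before Cause C3a is reached, which matches the sequence narrated for this case.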

Conclusions
A determination of cause method, intended to assist the engineer in the investigation of failure, has been presented along with two example cases. Further, a definition of defect has been proposed to facilitate the investigation of failure. The engineering judgment in the application of the method and the definition of defect hinge upon a qualified assessment of what is reasonable and what is unreasonable. Reasonableness is at the core of the work of engineers in their efforts to protect the public. As noted at the outset of this paper, the results of the engineer's analysis are useful in addressing the question of loss resolution, recovery, and compensation.