1 Introduction

The increasing focus on carbon emissions in recent years has had a profound impact on many areas of technology, including a very rapid adoption of electrical propulsion systems for transportation. A key building block in the electrification of transport is the cost-effective and reliable manufacture and integration of electrical machines (EM) as a replacement for more traditional sources of propulsion power, most notably internal combustion engines, in order to provide a competitive alternative to the incumbent technologies. The move towards increased mass production of electrical machines, together with the search for greater reach into other applications or for alternatives to existing ones, brings with it pressure to increase productivity and to push designs towards manufacturing boundaries (light-weighting, reduced footprint, robustness and active component performance). A result of this is much tighter control of the tolerances needed to maintain the high standards and quality required for the performance and safety of the electrical machine, particularly in safety-critical applications such as those found within aerospace. This in turn creates a need for better systems for the tracking and traceability of the electrical machine manufacturing process across all key areas that affect the boundaries we wish to advance through better design and materials.

In parallel to the adoption of electrical systems in transport, there has been something of a revolution in manufacturing in recent years, with the adoption of ever-more digital technologies under the broad heading of ‘Industrie 4.0’. This digital revolution seeks to exploit recent advancements in electronic hardware, algorithms, data processing and machine intelligence to extract information throughout the manufacturing lifecycle and gain value from it. This ‘manufacturing digitisation’ process has seen notable success in a number of fields and applications, in particular by facilitating an improved understanding of the processes undertaken during the manufacture of a particular product, the ability to inspect these processes and detect any anomalous variation in them, and the ability to identify when such processes or activities produce errors that may lead to failure. A significant amount of this endeavour has been enabled by recent advances in machine learning and computer vision, with deep learning and deep neural networks (DNN) featuring prominently. Harnessing these powerful tools at various stages in the manufacturing process offers the prospect of being able to identify errors and their underlying causes, improve condition monitoring of tools and systems, provide real-time and reactive guidance to workers, and gain insight into the influence of design features on the manufacturing process and final outcomes.

A number of failure modes can arise in electrical machines over their lifecycle, including bearing failure, increased electrical resistance, insulation failure, and short-circuits that can lead to catastrophic failure during operation. Although the operating environment and duty can play a significant role in the initiation of these faults, many can be traced back to the manufacturing process in which the machines were constructed and assembled. One such area of concern in relation to process control is the winding of coils and their subsequent insertion into the stator core. Close tolerances and high repeatability are needed to ensure both that quality outcomes (fill factor, electrical resistance) are maintained and that the introduction of faults (insulation damage, wire crossing and geometry errors) does not go undetected. This paper investigates how machine vision, particularly drawing on DNNs, can be used to aid the classification of faults within the coil winding manufacturing process of electrical machines.

The main contributions of this paper include:

  • A demonstration of coil winding defect classification for electric motor production using state-of-the-art residual network convolutional models trained on a three-class image dataset.

  • We develop a methodology for the dimension reduction of coil windings to further improve the classification performance.

  • We explore the role generative models can have in augmenting training data to further improve accuracy over a base model trained using limited data. Applying StyleGAN2 trained on our original coil defect dataset, we are able to generate novel instances of coil windings to boost training set size and overall performance.

This paper begins with an overview of electrical machine manufacture and some of the key process control challenges, before looking at recent applications of manufacturing digitisation and machine learning to solve some of these challenges. Section 3 presents a brief outline of the field of few-shot learning and generative data models and their applicability to coil winding fault classification. Section 4 presents the coil winding failure use-case and discusses the methodology used to undertake the coil winding failure classification task. Section 5 then presents and discusses the experimental results, before the paper ends with the conclusions of this work.

2 Electrical machine manufacture and quality assurance

The review presented in [1] gives a wide-ranging overview of some of the key areas in which digitisation, IoT and Industry 4.0 (I4.0) can impact the manufacture of electric machines. This includes a discussion on the fundamentals of I4.0 as it relates to data acquisition, storage, processing and visualisation and associated technologies (sensors, data analysis, machine learning and modelling and simulation). This review included a discussion of how the integration of these tools can impact or enable more advanced shop floor interactions such as machine to machine communications (M2M), human-machine interaction or knowledge-based systems, and finally how cyber-physical systems and the move towards more robotic or automated systems can be aided through further manufacturing digitisation.

A number of papers [1, 2] outline the major steps in the manufacture and assembly of an electrical machine and the variants therein (asynchronous, permanent magnet, etc.), each containing common and unique processes and challenges. The core components of any electrical machine, however, consist of the stator and rotor cores, windings, and slot and additional insulation, all encased within a housing. The steps, activities and processes associated with the manufacture of these core components and their assembly can vary; however, a typical example begins with the production of laminated steel sheets, which are cut into the desired shape for both the stator and rotor core and then stacked together and bonded into a whole. The cutting can be achieved through a number of methods, such as standard cutting, electro discharge machining (EDM) or laser-based methods. After the stacking process, further bonding can occur, e.g. through welding, and then the winding or coil insertion process can begin.

2.1 Electrical machine manufacture and process control

A wide variety of processes and material transformations are employed during the manufacture of electrical machines, and all contribute to the final product quality. Establishing close control of any process variation is therefore highly important, as is tracking these tolerances over time. Of the many steps involved in manufacturing an electrical machine, arguably the most important, and the one with the greatest propensity for part-to-part variation, is the winding of the stator coils and their insertion into the core. Coil winding refers to the various processes involved in forming a continuous copper wire (usually circular in cross-section) into a complete coil or series of connected coils for integration into a machine stator or rotor. Traditional winding processes can be categorised into linear winding, needle winding, flyer winding, or pull-in winding [3]. All of these methods involve the seamless forming of an individual coil from a single continuous strand of wire, although the individual coils which are combined to form a particular phase winding are often joined to each other with some form of electrical joint. In contrast, so-called hairpin windings involve pre-forming short sections of each coil by bending them into hairpin-shaped components, which are inserted into the stator slots, with each individual section then being connected to adjacent coil sections by welding or crimping to form a complete winding [4]. Although this type of winding can involve hundreds of individual in-situ welding operations, it is attracting significant interest as a route to achieving a high slot fill factor, particularly for automotive traction machines [5].

In conventional coils wound with continuous wire, arguably the key target of the winding process is maximizing the slot fill factor, i.e. the proportion of the slot cross-section which is occupied by copper as opposed to insulation and inter-turn gaps, as this has a very significant bearing on the power density and efficiency of the machine. This high slot-fill factor must be achieved without compromising the integrity of the wire insulation coating.
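The fill factor defined above can be illustrated with a short calculation. The following is a minimal sketch that assumes round wire and counts only the bare copper cross-section of each turn; the function name `slot_fill_factor` and all numerical values are hypothetical and chosen purely for illustration.

```python
import math


def slot_fill_factor(n_turns, wire_diameter_mm, slot_area_mm2):
    """Ratio of total bare-copper cross-section to slot cross-section.

    Assumes round wire; insulation thickness is deliberately ignored,
    so wire_diameter_mm is the bare copper diameter.
    """
    if slot_area_mm2 <= 0:
        raise ValueError("slot area must be positive")
    copper_area_mm2 = n_turns * math.pi * (wire_diameter_mm / 2.0) ** 2
    return copper_area_mm2 / slot_area_mm2


# Hypothetical example values (not taken from any machine in this paper).
k_fill = slot_fill_factor(n_turns=100, wire_diameter_mm=1.0, slot_area_mm2=200.0)
print(f"slot fill factor = {k_fill:.3f}")
```

Under these assumptions, any gap, crossover or loose turn that forces a reduction in the number of turns that fit in the slot lowers the ratio directly, which is one reason why turn-level geometry control matters.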

In forming successive individual turns within the coil, which may number many hundreds in some cases, it is important to control the position and geometry of each and every individual turn. Typical errors in the coil geometry which arise during winding are crossover, gap, loose wire, and double winding, along with more aggregate faults such as bulging, convex or concave winding. These faults mainly result from an incorrect wire feeder turning point, improper wire tension, residual strain, and the stochastic nature of wire motion during winding [6]. In addition to discrepancies in the layout of the coil geometry, it is also important to monitor the condition of the wire insulation coating to detect damage and excessive residual tension. The main process parameters in automated winding include spindle speed ramp up, winding speed, machine stiffness, damping castor angle, wire feed rate, turning point, exit angle, wire tension, wire oscillation, and free wire length [7].

2.2 Machine learning for fault detection in EM manufacture

Over recent decades, the field of manufacturing has witnessed a seemingly inexorable growth in the availability of data captured throughout the manufacturing lifecycle [8, 9]. This trend has accelerated of late with the advent of ‘Big Data’ and the Internet of Things (IoT). The benefit gained from exploiting this resource ultimately relies on a capability to understand the underlying causal relationships that occur during manufacture and the aims or goals over which control is to be exercised, e.g. quality outcomes, cost estimations, process optimisation and customer requirements [10]. However, there are several challenges in this endeavour, as the data can vary widely in format, category, semantics and quality. Moreover, the data is often also high dimensional, dynamic, temporal and fragmented, and in the most challenging processes, also difficult to obtain [10]. As a consequence, there is a drive to incorporate more powerful data analytics and machine learning algorithms into the manufacturing lifecycle to address some of these manufacturing aims. Although data analytics, artificial intelligence and machine learning cover a broad spectrum of techniques, one technique in particular has caught the imagination of researchers in recent years: deep learning through the use of deep neural networks. These data-driven techniques are advantageous because they are capable of finding highly complex and non-linear patterns and relationships in a wide variety of data and source modalities and of extracting important features without the need for hand-crafted, engineered methods, with applications in prediction, classification, regression and anomaly detection, to name a few [10].

Examples can be found across many applications and systems within manufacturing, whether it is in attempts to automate or improve anomaly detection within industrial processes or provide feedback for corrective measures within these processes. The production and assembly of an electrical machine across the life-cycle of its manufacture presents many challenges for a fault detection system capable of tracking and identifying failure across a number of interdependent activities and tasks [11]. These activities are often spread across multiple manufacturing cells each performing a range of assembly operations and functional tests, possibly interspersed with manual activities undertaken by human operators [12]. The following section highlights recent work in the area of electrical machine manufacture and some of the challenges and solutions created through the use of machine learning and, in particular, machine vision.

2.2.1 Fault detection in electrical machine manufacture

Traditionally, the detection of processing faults or component failure in the manufacturing of electrical machines has been reliant on targeted inspection or end-of-line tests (resistance testing, partial discharge testing, vibration analysis, etc.). To complement this approach, a number of life-cycle analysis and fault detection methods have been employed to track and monitor the health of the machine during operation. The operating environment and duty of the machines can lead to the eventual accumulation of damage and faults through contamination of the system, humidity ingress, elevated temperature cycling, vibration, and partial discharge under high-frequency inverter voltages [6]. Though this accumulation of damage occurs over the duration of the machine service lifetime, the component failure (winding short circuit, insulation failure, etc.) and its cause can, in some cases, be linked back to the original manufacture and assembly. This has motivated many researchers and practitioners to develop in-process monitoring and fault detection methods for use as the electrical machine is being produced.

The processing of materials during the manufacture of electrical machines can introduce a number of factors that influence the overall quality of the machine. By way of example, the stamping of the electrical steel and its subsequent assembly into a stator or rotor core can cause the characteristic magnetic properties of the electrical steel to change. To address this well-recognized behaviour, work by [13] sought to develop a method for incorporating these process changes into a model that provides more accurate estimates of losses within the electrical steel at the end of manufacture. Work by Meyer et al. [14] looked to develop a calibrated 64-pole Hall sensor line array for measuring the magnetic field of an operating electric motor rotor during test at manufacture. By measuring this property, they hoped to be able to identify variations in cogging torque or targeted harmonics caused by dislocated magnets, dimensional variation of individual magnets and varying remanence flux density.

2.2.2 Vision systems for fault inspection

An approach that employs a vision system to inspect the thermal expansion of the electric steel in stator core laminations was developed by [15] to allow for tighter quality control and measurement of tolerances in stamping tool wear effect. Typically, stator parts are measured manually with both outer and inner diameters checked. As a means to avoid expensive coordinate measuring machines, useful for alignment of stator core laminations, this work developed a machine vision system that catalogues the stator sheet lamination dimensions and uses this information across multiple stator cores to establish a relationship between thermal expansion of the material and the measured shape, allowing it over time to identify where undesired sizing anomalies may occur due to this effect.

A common target for monitoring is the coil winding, assembly and termination step of electrical machine manufacture. Early work by [16] considered the application of vision systems for stator faults (lamination gaps, mechanical damage, etc.), with a focus on the effect of illumination on detection and how it influences the practical introduction of the technology for this processing step. Similar work by [17] looked to combine more traditional computer vision methods with a deep convolutional neural network (CNN) for the automatic detection of defects within micro-motor armatures during their manufacture. Focusing on detecting regions of copper wire crossing, a task initially undertaken by operators using microscopes, this approach is capable of rapidly identifying the region of interest and classifying the category of failure with an accuracy of over 90%. A machine vision system for the detection of copper wire insulation degradation in low-voltage electromagnetic coils has also been developed by [18], using an ensemble method to achieve respectable accuracy across six different degradation states.

One of the challenges when looking to apply machine vision methods is the need for data, often images, from which a model can be trained to accurately detect or quantify a particular error or fault during a manufacturing process. Data augmentation is one approach capable of increasing the amount of available data. In one example, demonstrated by [19], a multitask convolutional neural network was used for the detection of wire defects within the inner sleeve of a spring-wire socket. Here, data augmentation is also explored to boost the training data available and increase classification accuracy. This multi-task solution is able to classify a number of different defects on data gathered from a real industrial setting, achieving over 95% accuracy.

The introduction of automated and visual inspection systems capable of undertaking tasks often conducted by human operators has been of keen interest to researchers within electrical machine manufacture [16]. In many assembly tasks the control process is still done by operators on the shop floor where the final outcome is dependent on user skill, experience and decision making capabilities. In addition to assembly tasks a number of inspections are required to check the process has been completed satisfactorily and that the electric machine meets certification or test requirements at that stage of assembly.

The use of human operators to undertake quality control tests throughout the electrical machine manufacturing process is prevalent. This can often introduce a degree of variability in how each test is run or assessed, particularly if the requirements include visual inspections. As a result, a number of researchers have put forward solutions that look to machine vision systems to perform these tasks. Work performed by [20] investigated a number of vision-based inspection systems for different steps of electrical machine manufacture. These include the inspection of stator windings during motor assembly for faults missed in standard electrical testing, where the coil is in contact or imminent contact with the rotor core. During the lifecycle of the machine this contact can lead to a short circuit and/or an eventual conductor break. The vision-based system looks to characterize the coil and lacing cords of the windings and determine whether a contact has taken place, showing a high percentage accuracy in determining coil contact failure. Previous work [21] had investigated vision systems for detecting contacts before insertion of the rotor, and later moved to inspection after insertion. Similar work on tracking rotor assembly includes that of [22], who extracted texture-based features from images using a Local Binary Pattern (LBP) method and fed this information into a convolutional neural network (CNN) model for the final classification of rotor parts, in order to detect missing or broken wire windings.
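To make the LBP texture descriptor mentioned above concrete, the following is a minimal sketch of the basic 3x3, 8-neighbour variant computed with NumPy. It is not the implementation used in [22], whose exact LBP variant and parameters are not stated here; the function names `lbp_3x3` and `lbp_histogram` and the random test patch are assumptions made for illustration.

```python
import numpy as np


def lbp_3x3(image):
    """Basic 8-neighbour LBP code for every interior pixel of a 2-D grey image."""
    img = np.asarray(image, dtype=np.float64)
    if img.ndim != 2 or min(img.shape) < 3:
        raise ValueError("expected a 2-D image of at least 3x3 pixels")
    h, w = img.shape
    centre = img[1:-1, 1:-1]
    # Neighbour offsets (dy, dx), clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(centre.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= centre).astype(np.uint8) * np.uint8(1 << bit)
    return codes


def lbp_histogram(image):
    """Normalised 256-bin histogram of LBP codes, usable as a texture feature."""
    codes = lbp_3x3(image)
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()


# Hypothetical grey-level patch standing in for an image of a wound rotor.
rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(32, 32))
features = lbp_histogram(patch)
print(features.shape)
```

A histogram of this kind, or the per-pixel code map itself, is the sort of texture representation that can then be passed to a downstream classifier.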

Beyond coils and lacing cords, work in [23] investigated the detection of defects and anomalies in electrical connectors, in particular incomplete disconnection of the stator power cables during operation or manufacture, a defect that can be missed by human operators. A vision-based system which utilizes a thresholding algorithm to enhance the region of interest was used to characterize the connector surface area / circumference and perform an assessment based on any deviation from expectations.
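The general idea of thresholding a region of interest and comparing its measured area with an expected value can be sketched as follows. This is an illustrative toy, not the algorithm of [23]: the fixed global threshold, the relative tolerance, the helper names `region_area` and `within_tolerance`, and the synthetic image are all assumptions.

```python
import numpy as np


def region_area(gray, threshold):
    """Number of pixels at or above a fixed grey-level threshold."""
    mask = np.asarray(gray) >= threshold
    return int(mask.sum())


def within_tolerance(measured, expected, rel_tol):
    """True if the measured value deviates from the expected one by at most rel_tol."""
    return abs(measured - expected) <= rel_tol * expected


# Synthetic image: dark background with a bright 20x20 'connector' region.
image = np.zeros((64, 64), dtype=np.uint8)
image[10:30, 10:30] = 200

area = region_area(image, threshold=128)
ok = within_tolerance(area, expected=400, rel_tol=0.05)
print(area, ok)
```

In practice the threshold and tolerance would need to be chosen for the lighting and part geometry in question; the sketch only shows the structure of the check.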

Electric contacts are another component within electrical machines that has been targeted for machine vision-based quality assurance. Work by [24] looked to develop a machine vision approach for detecting a number of defects (burrs, scratches, cracks and breaks) that can reside on the surface of electric contacts. Such defects can reduce the lifetime of the product by diminishing the electrical conductivity and heat conduction of the contacts. Over the lifetime of an electrical machine these defects can lead to premature failure of the machine during operation. The proposed system consisted of three sub-systems, which inspect the top, side, and bottom surfaces of the electric contact respectively for different types of defects. Utilizing classical machine vision methods (edge detection, blob detection, component labelling, gamma transformations, etc.), the authors were able to develop an approach that achieved accuracies of over 95%, albeit on a limited dataset. Some early work on cables and connections within manufacturing and assembly processes looked to develop systems for identifying the order of coloured electrical wires in an industrial connector cable process [25]. A vision system was used to detect displacement or deformation of any wires within the connector, providing positional information on the centre of each wire along with colour matching for determining wire order. As mentioned previously, one of the challenges in applying machine vision systems, particularly those from the field of deep learning, lies in the need for large amounts of data. The next section discusses recent work that seeks to overcome this challenge, either by removing the need for large amounts of collected data or by providing methods to generate new data for training.

3 Few shot learning & generative data models

Within the realm of manufacturing there are many examples where it is not possible to obtain sufficient training samples for the development and application of machine learning methods, in particular those which require large amounts of data, i.e. DNNs. These examples include anomaly data on disparate events that occur infrequently; quality assurance data, such as images or videos tied to material or product failure; and process data tied to the lifecycle of a tool or machine that degrades over lengthy time periods (tool wear). Given that limited data is also a common feature of environments and fields outside manufacturing where machine learning is applied, it is unsurprising that solutions to this constraint have been put forward and developed.

The two main approaches can be divided into an algorithmic approach and a data sampling approach. The first mainly looks to the development of new algorithms designed specifically for the task at hand (classification, prediction, regression, etc.) but under conditions of limited data. This sets aside perhaps the most common methods for data augmentation (scaling, cropping, rotation, colour shift, etc.), which are often introduced as a means to aid generalisation as much as to increase the data sample size. For example, the approach outlined by Yang et al. [26] looked at rotating machinery fault diagnosis using limited raw time-domain vibration signals. Here the authors introduce a convolutional autoencoder layer for feature extraction that is capable of working under limited data regimes. The second approach looks at generating new data (or augmenting available data) to facilitate training.
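The common augmentation operations listed above (cropping, rotation, colour shift) can be sketched in a few lines of NumPy. This is a simplified illustration only: rotation is restricted to multiples of 90 degrees, the colour shift is a single intensity offset applied to all channels, scaling is omitted, and the function name `augment` and its default parameters are hypothetical.

```python
import numpy as np


def augment(image, rng, crop=56, max_shift=20):
    """Return one randomly rotated, cropped and intensity-shifted copy of an 8-bit image."""
    img = np.asarray(image)
    # Random rotation by 0, 90, 180 or 270 degrees about the image plane.
    img = np.rot90(img, k=int(rng.integers(0, 4)))
    h, w = img.shape[:2]
    if crop > h or crop > w:
        raise ValueError("crop size exceeds image size")
    # Random square crop.
    top = int(rng.integers(0, h - crop + 1))
    left = int(rng.integers(0, w - crop + 1))
    img = img[top : top + crop, left : left + crop]
    # Random intensity shift, clipped back to the 8-bit range.
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.clip(img.astype(np.int16) + shift, 0, 255).astype(np.uint8)


# Hypothetical 64x64 RGB image standing in for a photograph of a coil.
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
augmented = augment(original, rng)
print(augmented.shape, augmented.dtype)
```

Calling `augment` repeatedly on the same image yields different variants, which is how such transforms enlarge the effective training set without collecting new samples.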

Deep neural networks that are trained on limited (few-shot) or single (one-shot) samples have become more prevalent as a means to overcome the challenge of limited data in certain fields or applications. Wolf et al. [27] investigated the use of a bag of features representation to learn a similarity kernel for image classification of insects. Fei-Fei et al. [28] developed a variational Bayesian framework for one-shot image classification, while Qiao et al. [29] investigated the scenario where a large number of categories exist but the number of examples is limited. Here they propose a novel method that can adapt a pre-trained neural network to novel categories by directly predicting the parameters from the activations.

Few-shot learning can cover a range of typical machine learning applications, including image classification, video classification, sentiment classification and object recognition [28], along with regression and reinforcement learning domains.

There are a host of techniques within the realm of few-shot learning, which are listed below:

  • Semi-supervised learning [30] which learns from a small number of labelled samples in conjunction with a large number of unlabelled samples.

  • Weakly supervised learning [31] which broadly learns from data or experience that is incomplete, inexact, inaccurate or noisy [32].

  • Transfer learning [33], which often starts from a model pre-trained on a relevant or similar problem domain (image classification, regression), although cross-domain applications also exist. The goal is to transfer knowledge from a domain with abundant data to a task or domain that has very little data.

  • Imbalanced learning [34], which involves training a model on a dataset in which the categories are skewed, either as a result of the problem (anomaly or fraud detection) or a lack of data through other means (cost, time, accessibility).

  • Meta-learning [35], which attempts to learn meta-knowledge across a number of tasks as a form of generalization, before adapting the meta-learner to a new task using task-specific information / data.

Their application to the field of manufacturing and engineering has also been recently undertaken.

Authors in [36] have investigated how to use a ‘few-shot’ learning DNN algorithm for the diagnosis of rolling bearing failure. Fault diagnosis as a whole is a challenging task as the degree of variation within a working environment can be high, with the signals of the same faults often very different as a result of different working conditions and interdependencies [36]. The authors were able to demonstrate for the first time that a few-shot learning-based diagnosis model can boost the performance of fault diagnosis by making use of the same or different class sample pairs. A survey of the field of few shot learning can be found in [32].

One-shot learning takes the previous approach to the extreme. Here a single instance from a dataset, for example an image of a particular class, is presented to a network for inference, with the goal of recognizing whether it belongs to a specific class when paired with other samples.

Work undertaken by Deshpande et al. [37] demonstrated the use of one-shot learning for quality control in detecting surface defects found on steel. Using a Siamese Deep Neural Network (DNN) approach and a limited dataset of hot rolled steel strip containing six classes of varying surface defects, they were able to show a high degree of predictive accuracy given the limited available image data. The Siamese architecture can also be used for time-series data, in particular for condition monitoring or predictive analytics within manufacturing [38], by exploiting its capability for differentiating between data instances for anomaly detection.
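The pairing idea behind Siamese one-shot classification can be sketched as follows: both inputs pass through the same embedding branch, and the query is assigned the label of the single support example it is most similar to. This is a toy illustration and not the model of [37]; a fixed linear map stands in for a trained network, cosine similarity is an assumed choice of comparison, and the class labels and vectors are hypothetical.

```python
import numpy as np


def embed(x, weights):
    """Shared embedding branch (a fixed linear map stands in for a trained network)."""
    z = np.asarray(x, dtype=np.float64).ravel() @ weights
    return z / (np.linalg.norm(z) + 1e-12)


def one_shot_classify(query, support, weights):
    """Assign the query to the class whose single support example is most similar.

    support maps each class label to exactly one example of that class.
    """
    q = embed(query, weights)
    scores = {label: float(q @ embed(example, weights)) for label, example in support.items()}
    return max(scores, key=scores.get), scores


# Hypothetical 4-dimensional 'images' and class names, for illustration only.
weights = np.eye(4)
support = {
    "ok": [1.0, 0.0, 0.0, 0.0],
    "crossover": [0.0, 1.0, 0.0, 0.0],
}
query = [0.9, 0.1, 0.0, 0.0]

label, scores = one_shot_classify(query, support, weights)
print(label, scores)
```

In a trained Siamese network the embedding weights are learned from same-class and different-class pairs, so that similarity in the embedded space reflects class membership even when only one labelled example per class is available.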

Generative models are a class of deep neural networks built to produce novel samples, for example images, from a high-dimensional target data distribution. Many forms of generative model exist, such as the common autoregressive models [39] or variational autoencoders (VAEs) [40]. Another class of models, which has seen significant growth and application by exploiting adversarial learning, is that of Generative Adversarial Networks (GANs) [41].

A GAN consists of two networks. First, a generator network is trained to generate, from a latent code, a new sample that is ideally indistinguishable from the target data distribution it is being trained on. Second, a discriminator network then acts as a ‘critic’ of the produced data, trying to determine whether it is from the original training data, i.e. ‘real’, or from the generator model, i.e. ‘fake’. Both the generator and discriminator are trained in parallel, competing against each other in order to improve themselves and ‘outdo’ the other.
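This competition is commonly written as a two-player minimax game. As a sketch in the standard notation (the symbols $G$, $D$, $p_{\text{data}}$ and $p_z$ are introduced here for illustration and are not defined elsewhere in this section), the generator $G$ and discriminator $D$ optimise a shared value function:

```latex
\min_{G}\,\max_{D}\; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

Here $D(x)$ is the discriminator's estimate of the probability that $x$ is a real sample, $z$ is a latent code drawn from a prior $p_z$, and $G(z)$ is the generated sample. The discriminator is rewarded for scoring real samples high and generated samples low, while the generator is rewarded for producing samples that the discriminator scores as real.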

The general field of GANs has over the recent years found itself being applied to a number of interesting applications, from standard image synthesis [41], to image-to-image translation, semantic-image-to-photo translation, human face generation, and many more. The domain of manufacturing is one such area which has recently taken up the application of GANs to solve a number of challenging problems, whether it is to augment or generate new training data, or utilise the discriminator component for anomaly detection.

Surface inspection is one such example. Recent work by Liu et al. [42] utilized GANs to generate novel examples of defects on the target surface (button surface defects), which were later refined by wavelet fusion to improve pixel-wise accuracy. A surface defect segmentation method was proposed by Wei et al. [43] based on a defect sample simulation method, in which a two-stage simulation algorithm based on a GAN and a neural style transfer network creates a local defect and then blends it into the background to improve accuracy. A surface defect and crack detection framework was developed by Wang et al. [44] to detect cracks of various scales and types found during the inspection of rail lines using acoustic emission technology. Here a Least-Squares GAN is trained to improve the denoising of ‘burst-type’ crack signals, preserving more detail of the event (crack waveform) and allowing for improved classification / detection.

Data augmentation through the application of GANs has been applied to a number of processing steps. For example, a critical step in the semiconductor manufacturing process is defect detection and classification. Examples of defects can be limited and are often created manually to aid in developing models for their detection. Work by Singh et al. [45] looked to utilize GANs for the synthetic generation of defects to augment the training of a CNN-based machine vision model for failure / defect classification. An automatic surface abnormality detection and identification pipeline was developed by [46]. Here a GAN is used to generate exaggerated defect samples, which were used to enhance the accuracy of various classifiers in detecting surface defects on steel and ceramic. The need for domain-specific datasets within industrial settings for the creation of anomaly detection models, in particular those based around machine vision and images, is a challenge. To overcome this, researchers [47] looked towards the creation of a GAN model for the automatic annotation and insertion of visual anomalies into existing industrial datasets (industrial reactor images) to aid in classifier model development and testing.

Beyond image generation, a number of other applications have been trialled within manufacturing, ranging from simple anomaly detection to manufacturability and design. For example, to overcome the limited and unbalanced data of real-world examples of mechanical faults within manufacturing systems, a conditional GAN for the generation of failure signals from operating machines (electric machine) was proposed [48]. By augmenting the available data with the additional GAN-generated samples, the researchers were able to improve the classification of mechanical failures (bearing, gearbox) using a CNN-based classifier. Machinery plays a significant role in manufacturing systems, with even small deviations or deterioration of machinery components leading to product quality variation or even halting the process altogether. The limited availability of data pertaining to machinery failure makes the development of systems for fault detection difficult. Dai et al. [49] train a GAN model on samples of normal operation of manufacturing machinery, before extracting the discriminator and using it to detect anomalies that deviate from the expected normal operating frequencies. This work, applied to three different use-cases (bearing, gearbox or rotor operation), showcased the benefits that GANs can bring to fault diagnosis, and their generalisation to a number of different applications, over more traditional methods.

Manufacturability is an important constraint for any given product or component design. The use of GANs for the generation of synthetic 3D voxel designs that lie within the distribution of the training data, in this instance real manufactured designs, has also been researched [50]. The model is then able to output new 3D voxel designs from the latent space, allowing optimization for manufacturability to occur within the latent vector space. By way of a further example, the manufacture of semiconductors requires the use of extremely precise lithography processes to build the desired mask pattern. One step in this process is lithography simulation, which is used as a faster replacement for costly experimental verification; however, the steady decrease in feature sizes means that model complexity is on the increase. To overcome this challenge, Ye et al. [51] investigated the use of a GAN-based approach, coined LithoGAN, for the generation of the required output resist patterns from the supplied input masks. They demonstrated that it can produce accurate resist patterns with an order of magnitude increase in speed.

4 Methodology

Electrical machine failure caused by defects within the enamelled copper wire coils or the supporting bobbin / stator tooth can arise from many factors. Careful consideration has to be given to the tooth design, including in some cases the inclusion of a suitable inner groove geometry to aid in alignment of the first layer of copper wire. Manufacturing tolerances must also be tight, as the presence of a burr or other shape deviations can lead to errors in the alignment or geometry of the wires during the coil winding process [7]. There are also a number of possible geometrical and structural defects that can arise in the manufacture of electrical coils, as shown in Fig. 1. An interval of low wire tension can lead to loose wire ends in the winding or errors in individual layer structure, which can then give rise to further problems, such as reduced thermal performance due to loss of thermal conduction between the wiring and the stator tooth / bobbin.

Fig. 1 Typical coil winding failures

Fig. 2 Test rig used to build dataset for coil winding failure

Fig. 3 ResNet50 Architecture

A process error at a particular layer may give rise to a gap, wire crossing or double winding, whilst errors in tension may result in the development of a loose winding [7]. These defects can give rise to increased electrical resistance, increasing temperature and creating internal ‘hot-spots’ that may eventually cause catastrophic failure of the coil insulation. Changes in wire / coil geometry, for example a bulging, convex or concave winding, can also lead to similar failure through poor thermal conductance or loss of fill factor. Variation in the stresses applied during the coil winding process can also often be identified through wire damage or tears [7].

Taking into consideration the possible defects that can occur during the coil winding process, a number of defect types were considered, developed and then used to investigate the application of machine vision for quality inspection. Three classes of coil winding quality were chosen: two failure classes (gap and crossover, as shown in Fig. 1) and a single pass class which contained no defects. Each coil was hand wound using the available 1 mm diameter circular enamelled copper wire onto a mock-up of a simplified stator tooth (produced by 3D printing). Each coil consisted of two layers of winding, sufficient to demonstrate each class. The gap failure is created through a single or double width gap in turn spacing, whilst the crossover failure involves a single wire crossing situated on the topmost layer; both can be seen in Fig. 1. Following the creation of a number of examples of coil winding failure, an image dataset was captured using an imaging apparatus.

The imaging apparatus used to create the coil failure dataset, shown in Fig. 2, is composed of two main components: a test bed built to house a FLIR Blackfly S camera directly above the test plate, and a set of 3D printed single tooth ‘stator’ bobbins used as a core for the winding of 1 mm enamelled copper wire as discussed previously. In addition to the FLIR Blackfly S camera, an Intel RealSense D435i depth imaging camera was incorporated into the test bed; this camera will be used for future experiments. For every image created, the wound tooth module is placed in the centre of the imaging apparatus, and a number of operations are applied in order to increase the number of images generated. These are two different levels of camera lens zoom (close and far), three different orientations (90\(^{\circ }\), 180\(^{\circ }\) and 270\(^{\circ }\)), and two different light source directions (left side, right side). A total of 683 images were created across the three classes and split into training (crossover: 196, gap: 419, pass: 23) and validation (crossover: 20, gap: 20, pass: 5) sets. Armed with this image dataset, it is possible to train a machine vision model for fault classification of the coil winding process.
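
As a quick consistency check on these figures, the short sketch below enumerates the capture settings and tallies the reported split. The variable names are illustrative only, and the number of views per coil is an upper bound that assumes every combination of settings is captured, which the text does not state explicitly.

```python
from itertools import product

# Capture settings described in the text.
zoom_levels = ["close", "far"]
orientations_deg = [90, 180, 270]
light_directions = ["left", "right"]

# Upper bound on views per wound coil (assumes every combination is captured).
views_per_coil = len(list(product(zoom_levels, orientations_deg, light_directions)))

# Reported split sizes per class.
train_counts = {"crossover": 196, "gap": 419, "pass": 23}
val_counts = {"crossover": 20, "gap": 20, "pass": 5}

n_train = sum(train_counts.values())
n_val = sum(val_counts.values())
total = n_train + n_val

# Share of the training set held by each class, which exposes the imbalance.
train_share = {name: count / n_train for name, count in train_counts.items()}

print(views_per_coil, n_train, n_val, total)
print({name: round(share, 3) for name, share in train_share.items()})
```

The tally confirms that the training and validation counts sum to the stated 683 images, and that the ‘pass’ class makes up only a few percent of the training set.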

Machine vision methods for the classification of images come in many forms. For example, more traditional feature descriptors such as SIFT can be used to create a ‘bag of features’ that characterises the image or object class. These are often combined with K-Means or Support Vector Machines to partition or cluster these features into specific classes of object. An alternative and currently popular approach is the use of DNN methods such as convolutional neural networks (CNNs). As described in section 2B, the range of applications and the success of this approach within manufacturing are undeniable; however, it has some limitations. In some cases, DNN methods for machine vision require priors or embedded knowledge in the model architecture in order to work well, as evidenced by the success of convolutional neural networks, whose convolutional kernels and pooling layers are derived from biological processes. There is also the challenge of learning relationships between objects in an image, and their order, for example when a scene contains multiple objects that need to be identified. Finally, CNN methods are not as generalizable as some traditional methods and are reliant on training data: if this is limited, the model can overfit and will fail outside of the original distribution it is trained on. However, even with these constraints, the applicability and success of modern CNN models make them an ideal choice for this class of manufacturing problem.

Fig. 4 Residual block

In this instance the Residual Network architecture with 50 layers, ResNet-50 [52], is used, as shown in Fig. 3. The ResNet-50 model was pre-trained on the classic ImageNet [53] benchmark dataset, before being retrained through transfer learning by adapting the last softmax layer to our specific coil winding failure dataset.

The ResNet50 network is a deep convolutional network built from convolutional blocks with shortcut (skip) connections, as shown in Fig. 4. These basic blocks, named ‘bottleneck’ blocks, follow two simple design rules. Firstly, layers that produce output feature maps of the same size use the same number of filters, and secondly, if the feature map size is halved, the number of filters is doubled; this is shown in Fig. 3 by each set of coloured blocks / layers. Down sampling is performed directly by convolutional layers that have a stride of 2, and batch normalisation is carried out immediately after each convolution and before the ReLU activation. When the input and output are of the same dimensions, the identity shortcut is used. When the dimensions increase, the projection shortcut is used to match dimensions through 1 x 1 convolutions. In both cases, when the shortcuts cross feature maps of two different sizes, they are performed with a stride of 2. The network ends with a 1,000-way fully connected layer with softmax activation. The total number of weighted layers is 50, containing over 23,534,592 trainable parameters.
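
The two design rules and the layer count can be checked with a few lines of arithmetic. The sketch below assumes the standard ResNet-50 stage layout from the original ResNet publication (3, 4, 6 and 3 bottleneck blocks, each with three convolutions) and a 224 x 224 input; it is a bookkeeping aid, not a model definition.

```python
# Standard ResNet-50 stage layout (assumed from the original ResNet design).
blocks_per_stage = [3, 4, 6, 3]
stage_channels = [256, 512, 1024, 2048]  # output channels of each stage
convs_per_bottleneck = 3                 # 1x1, 3x3, 1x1 convolutions

# Weighted layers: stem convolution + bottleneck convolutions + final FC layer.
weighted_layers = 1 + convs_per_bottleneck * sum(blocks_per_stage) + 1

# Spatial size: a 224 input is halved by the stride-2 stem convolution and
# again by max pooling, then halved on entering each later stage.
size = 224 // 2 // 2
stage_sizes = []
for stage_index in range(len(blocks_per_stage)):
    if stage_index > 0:
        size //= 2
    stage_sizes.append(size)

print("weighted layers:", weighted_layers)
for s, c in zip(stage_sizes, stage_channels):
    print(f"{s}x{s} feature map, {c} channels")
```

Each time the feature map side length halves (56, 28, 14, 7), the channel count doubles (256, 512, 1024, 2048), which is the second design rule in numerical form.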

The first 49 layers of the pre-trained ResNet50 network are then frozen, and the final 1,000-way fully connected layer is replaced with a 3-way fully connected layer tied to the specific classes for which an accurate identification is sought.
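
To give a sense of how small the retrained part of the network is, the following NumPy sketch compares the size of the two heads and applies a 3-way softmax to stand-in features. It assumes the usual 2048-dimensional pooled feature vector that ResNet-50 passes to its final layer; the random features and weights are placeholders for the frozen backbone output and the trainable head, not values from this work.

```python
import numpy as np

FEATURE_DIM = 2048  # assumed pooled feature size entering the final layer
CLASSES = ["crossover", "gap", "pass"]

def dense_params(n_in: int, n_out: int) -> int:
    """Number of weights plus biases in a fully connected layer."""
    return n_in * n_out + n_out

def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

original_head = dense_params(FEATURE_DIM, 1000)     # ImageNet head
new_head = dense_params(FEATURE_DIM, len(CLASSES))  # coil winding head

rng = np.random.default_rng(0)
features = rng.normal(size=(4, FEATURE_DIM))  # placeholder frozen-backbone output
W = rng.normal(scale=0.01, size=(FEATURE_DIM, len(CLASSES)))  # trainable weights
b = np.zeros(len(CLASSES))                                    # trainable biases
probs = softmax(features @ W + b)

print(original_head, new_head, probs.shape)
```

Under these assumptions only the weights and biases of the new head are updated during transfer learning, a small fraction of the parameters in the full network.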

4.1 Proposed experimental method

The proposed set of experiments and methodology can be broken down into a number of steps:

  • An image dataset for the 3 classes to be classified was created using the test apparatus (crossover, gap and pass). This will form the base dataset for all future experiments in this research.

  • Convert each RGB image to be fed into the chosen ResNet50 architecture by rescaling to 224 x 224 dimension and subtracting the mean RGB value computed on the ImageNet dataset from each pixel, as proposed by Krizhevsky et al. [54].

  • Build a deep convolutional neural network (CNN) based on the ResNet50 architecture, replacing the final layer with a 3-way fully connected softmax layer and using the pre-trained weights for the initial 49 layers.

  • Undertake three separate experiments using this base ResNet50 model:

    • Original Dataset: Through transfer learning, freeze the preceding layers of the ResNet50 model, whilst allowing the new 3-way fully connected softmax layer to retrain.

    • Augmented Dataset: Incorporate a new pre-processing step to extract the region-of-interest (ROI) and dimension reduction for each image through the use of the GrabCut algorithm [55].

    • Generative Dataset: Expand the original dataset through the use of a Generative Adversarial Network (GAN) model to create new images, and investigate whether such newly generated images can boost the accuracy of the ResNet50 model through increased data.
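
The input conversion step in the list above can be sketched in NumPy as follows. The per-channel mean values are the widely used ImageNet RGB means on a 0 to 255 scale and are an assumption about the exact constants used here; the nearest-neighbour resize is a dependency-free stand-in for whatever interpolation an image library would normally provide.

```python
import numpy as np

# Widely used ImageNet per-channel mean (RGB order, 0-255 scale); assumed values.
IMAGENET_MEAN_RGB = np.array([123.68, 116.779, 103.939], dtype=np.float32)

def resize_nearest(image: np.ndarray, size: int = 224) -> np.ndarray:
    """Resize an H x W x 3 image to size x size by nearest-neighbour sampling."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return image[rows[:, None], cols[None, :]]

def preprocess(image: np.ndarray) -> np.ndarray:
    """Rescale to 224 x 224 and subtract the ImageNet mean from every pixel."""
    resized = resize_nearest(image).astype(np.float32)
    return resized - IMAGENET_MEAN_RGB

# Usage with a synthetic mid-grey frame standing in for a camera image.
frame = np.full((480, 640, 3), 128, dtype=np.uint8)
x = preprocess(frame)
print(x.shape, x[0, 0])
```

Subtracting the same mean that was used when the backbone was pre-trained keeps the inputs in the range the frozen layers expect.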

Algorithm 1 GrabCut algorithm

The augmented dataset was developed in order to pre-process the original images so as to reduce the complexity of the classification task. To begin with, the dimensionality of each image was reduced by flattening the colour channels so as to keep only the blue values of the RGB image. The next step was to apply the GrabCut [55] algorithm to isolate the region of interest (ROI), focusing on the actual coil and removing any additional structures (bobbin, background scene) that might affect the training and overall performance of the ResNet50 model. There are a number of alternatives for segmenting foreground from background, most recently state-of-the-art DNN methods trained to segment out the object of interest. Though supervised DNN methods can achieve very high accuracy, they require significant amounts of labelled data to train effectively. Traditional image processing methods such as the watershed algorithm can also be applied. This is based upon the analogy of a topographic surface, where hills and valleys denote background and foreground for segmentation, and the algorithm slowly defines each over a series of iterations. For clearly defined objects this approach can work well; however, noise or complex scenes often lead to over-segmentation. Alpha matting methods are supervised methods that require a labelled ‘trimap’ and use a linear colour model to predict the likelihood of a pixel being foreground or background. One problem with alpha matting is the requirement for an error-free mapping, otherwise distortions of the mask can occur.

Methods developed around the Gaussian Mixture Model (GMM) have also shown impressive performance. By maintaining a probability density function for each pixel, pixels from new images can be separated into background and foreground, and the model can be updated based upon recent information. It is here that we look for fast and accurate separation. The GrabCut algorithm, developed by Rother and Kolmogorov [55], takes as input a ROI that encloses the foreground to be extracted, with the pixels outside it treated as the background we wish to remove. A k-means clustering algorithm is then applied to cluster the pixels of the foreground and background respectively. Once the pixels have been clustered, a GMM is used to model the foreground and background pixels. As a result, the probability that each pixel belongs to the foreground or background can be calculated. A final energy minimisation step is used to extract the foreground pixels of interest. The overall process of the GrabCut algorithm can be seen in Algorithm 1.
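
The colour-modelling stage described above can be illustrated with a simplified NumPy sketch on a single-channel image. It follows the same outline (ROI initialisation, k-means clustering, one Gaussian per cluster, per-pixel likelihoods) but replaces the final graph-cut energy minimisation with a plain likelihood comparison, so it is not the GrabCut algorithm itself; a full implementation is available in OpenCV as `cv2.grabCut`. All function names and the synthetic test image are illustrative assumptions.

```python
import numpy as np

def kmeans_1d(values, k, iters=10):
    """Cluster scalar intensities with k-means; returns a cluster index per value."""
    centres = np.quantile(values, np.linspace(0.1, 0.9, k))
    for _ in range(iters):
        assign = np.argmin(np.abs(values[:, None] - centres[None, :]), axis=1)
        for j in range(k):
            if np.any(assign == j):
                centres[j] = values[assign == j].mean()
    return assign

def fit_gmm(values, k):
    """Build a 1-D Gaussian mixture with one Gaussian per k-means cluster."""
    assign = kmeans_1d(values, k)
    weights, means, variances = [], [], []
    for j in range(k):
        members = values[assign == j]
        if members.size == 0:
            continue
        weights.append(members.size / values.size)
        means.append(members.mean())
        variances.append(members.var() + 1.0)  # floor avoids zero variance
    return np.array(weights), np.array(means), np.array(variances)

def gmm_log_likelihood(values, gmm):
    """Log-density of each value under the mixture."""
    weights, means, variances = gmm
    diff = values[:, None] - means[None, :]
    dens = weights * np.exp(-0.5 * diff**2 / variances) / np.sqrt(2 * np.pi * variances)
    return np.log(dens.sum(axis=1) + 1e-300)

def segment(gray, rect, k=2):
    """Mark pixels inside rect as foreground where the foreground model fits better."""
    x0, y0, x1, y1 = rect
    inside = np.zeros(gray.shape, dtype=bool)
    inside[y0:y1, x0:x1] = True
    values = gray.astype(np.float64)
    fg_gmm = fit_gmm(values[inside], k)   # ROI pixels: probable foreground
    bg_gmm = fit_gmm(values[~inside], k)  # pixels outside the ROI: background
    flat = values.ravel()
    fg_better = gmm_log_likelihood(flat, fg_gmm) > gmm_log_likelihood(flat, bg_gmm)
    return fg_better.reshape(gray.shape) & inside

# Synthetic example: a bright square (stand-in for the coil) on a dark background.
rng = np.random.default_rng(0)
image = rng.normal(20, 3, size=(64, 64))
image[20:44, 20:44] = rng.normal(200, 3, size=(24, 24))
mask = segment(image, rect=(16, 16, 48, 48))
print(mask.sum(), "foreground pixels")
```

Because each pixel is labelled independently, a few isolated background pixels inside the ROI can be mislabelled; the energy minimisation step in the full algorithm adds a smoothness term that suppresses this kind of speckle.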

The generative dataset was developed as a means to expand the availability of training data, with the hope of increasing the accuracy of the trained ResNet50 model by improving generalisation. The limited size and imbalanced nature of the original dataset mean that there is a risk of overfitting and of skewing towards particular classes of failure. To overcome this, a generative adversarial network (GAN) was used to provide additional training samples that boost the under-represented class, negate any skewing and improve overall model accuracy.

Fig. 5 StyleGAN2 Architecture

Fig. 6 Generative Adversarial Network (GAN) training process

Fig. 7 GAN generated examples

To this end, a number of GANs exist within the literature that are capable of learning the data distribution of the original dataset and generating new image samples across our three classes. One popular GAN for natural image synthesis is the StyleGAN [56] model, in particular its most recent incarnation, StyleGAN2 [57]. This class of generative model incorporates changes such as weight demodulation, along with a progressive training regime that starts with low resolution images and gradually shifts focus to higher resolutions; its architecture can be found in Fig. 5. The approach adopted takes a pre-trained StyleGAN2 model, originally trained for the generation of high definition faces, and re-trains it through transfer learning using the original coil dataset. Over time the GAN learns to output images from the target dataset distribution, as shown in Fig. 6. Once trained, the model’s latent space can be sampled to generate new images and so increase the number of images available for training in each class. Some examples of the output of the trained model can be found in Fig. 7. The StyleGAN2 model was trained on the original dataset of 683 images for 100 epochs, until the model had converged and no further improvement in image quality could be discerned. This model was then used to generate 130 new coil image samples that did not exhibit any defects, which could be used to overcome the imbalance in the pass class by boosting the number available for training from 23 images to 153, more in line with the other classes.
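
The effect of the generated images on class balance can be quantified with a short calculation. The sketch assumes the 130 generated images are added to the training split only, as the text implies; the imbalance ratio used here (largest class divided by smallest class) is one simple illustrative measure among several.

```python
# Training-set class counts reported for the original dataset.
train_counts = {"crossover": 196, "gap": 419, "pass": 23}
generated_pass = 130  # GAN-generated defect-free images

# Generative dataset: original training images plus the generated pass images.
balanced = dict(train_counts)
balanced["pass"] += generated_pass

def imbalance_ratio(counts):
    """Largest class size divided by smallest class size."""
    return max(counts.values()) / min(counts.values())

ratio_before = imbalance_ratio(train_counts)
ratio_after = imbalance_ratio(balanced)

print(balanced["pass"], sum(balanced.values()))
print(round(ratio_before, 1), round(ratio_after, 1))
```

On this measure the ratio between the largest and smallest class falls from roughly 18 to under 3, although the ‘gap’ class remains the largest.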

Experimentation involved training an instance of the ResNet50 model over 100 epochs using a batch size of 20. The optimizer used was Adam, with a learning rate of \(1 \times 10^{-3}\). During the training process, standard augmentation techniques are applied to improve generalisation of the model. These are random resize and crop, random rotation, and random horizontal flip, using default settings. For each of the three datasets, an instance of the ResNet50 model is trained five times in order to gauge mean performance.
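
The training settings above can be collected into a small configuration object, which also makes it easy to see how many parameter updates each run involves. The class and field names are hypothetical, and the step count assumes the final partial batch of each epoch is kept rather than dropped.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainConfig:
    """Hyper-parameters stated in the text (field names are illustrative)."""
    epochs: int = 100
    batch_size: int = 20
    optimizer: str = "adam"
    learning_rate: float = 1e-3
    repeats: int = 5  # independent training runs per dataset

def steps_per_epoch(n_images: int, cfg: TrainConfig) -> int:
    """Mini-batches per epoch, keeping the final partial batch (an assumption)."""
    return math.ceil(n_images / cfg.batch_size)

cfg = TrainConfig()
# Training-set sizes: 638 original images; 768 once 130 generated images are added.
for name, n_images in {"original": 638, "generative": 768}.items():
    steps = steps_per_epoch(n_images, cfg)
    print(name, steps, steps * cfg.epochs)
```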

For all experiments, performance was evaluated using accuracy, F1-score, recall and precision. These evaluation metrics have been used extensively in the research community to provide detailed assessments of methods, and are computed from four outcome counts:

  • True Positive (TP): The true category is positive, predicted category is positive.

  • True Negative (TN): The true category is negative, predicted category is negative.

  • False Positive (FP): The true category is negative, predicted category is positive.

  • False Negative (FN): The true category is positive, predicted category is negative.

Using the above factors, each of the metrics mentioned can be calculated.

Accuracy is defined as the ratio of correctly predicted outcomes to the sum of all predictions:

$$\begin{aligned} Accuracy = \frac{TN + TP}{TP + TN + FP + FN} \end{aligned}$$
(1)

Precision is defined as the proportion of positive predictions that are actually correct:

$$\begin{aligned} Precision = \frac{TP}{TP + FP} \end{aligned}$$
(2)

Recall is defined as the proportion of actual positives that the model correctly detects:

$$\begin{aligned} Recall = \frac{TP}{TP + FN} \end{aligned}$$
(3)

F1-score is defined as the harmonic mean of precision and recall:

$$\begin{aligned} F1\text{-}score = \frac{2 \times Precision \times Recall}{Precision + Recall} \end{aligned}$$
(4)
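
For a three-class problem these quantities are usually computed per class in a one-vs-rest fashion from the confusion matrix. The sketch below shows one way to do this, using the standard harmonic-mean form of the F1-score, \(2 \times Precision \times Recall / (Precision + Recall)\). The confusion matrix is a hypothetical example chosen only to match the validation set sizes; it is not a result from this work.

```python
import numpy as np

def per_class_metrics(confusion: np.ndarray) -> dict:
    """One-vs-rest metrics; rows are true classes, columns are predicted classes."""
    total = confusion.sum()
    results = {}
    for k in range(confusion.shape[0]):
        tp = confusion[k, k]
        fp = confusion[:, k].sum() - tp
        fn = confusion[k, :].sum() - tp
        tn = total - tp - fp - fn
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results[k] = {
            "accuracy": (tp + tn) / total,
            "precision": precision,
            "recall": recall,
            "f1": f1,
        }
    return results

# Hypothetical confusion matrix (class order: crossover, gap, pass).
example = np.array([[18, 1, 1],
                    [2, 10, 8],
                    [0, 0, 5]])
metrics = per_class_metrics(example)
overall_accuracy = np.trace(example) / example.sum()

for k, name in enumerate(["crossover", "gap", "pass"]):
    print(name, {m: round(float(v), 3) for m, v in metrics[k].items()})
print("overall accuracy:", round(float(overall_accuracy), 3))
```

In this made-up example the ‘pass’ class has perfect recall but low precision, because many ‘gap’ images are predicted as ‘pass’; this is the kind of pattern that a single accuracy figure would hide.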

The next section provides an analysis and discussion of the results for each approach, looking at their benefits and drawbacks for coil winding failure classification.

5 Coil winding failure classification

The focus in this paper is an approach for classifying coil winding failures that lead to structural / geometric defects in the coil layout rather than, for example, very localized defects in the integrity of the insulation coating.

Beginning with the original dataset, which is imbalanced due to the lack of examples within the ‘pass’ class, it can be seen that the model is able to train up to an accuracy of just over 65%, as seen in Fig. 8 (mean across all five runs). The shortfall in performance is a consequence of the model being incapable of discerning the ‘gap’ failure class images, classifying them instead as ‘pass’ class images, as shown in Fig. 10 and Table 1. In this case, the model has a strong recall for the ‘pass’ class while exhibiting poor precision, indicating that it has, at least in this instance, simply learnt to over-assign images to the ‘pass’ class at the expense of the ‘gap’ class. Whereas the ‘crossover’ class of images often exhibits a more defined structural change, i.e. two wires overlapping and the resulting visual features that brings, the ‘pass’ and ‘gap’ classes are quite similar, with only the edge of the wound bobbin serving as a visible characteristic that distinguishes the two.

Fig. 8 Mean training (a) and validation (b) accuracy for original, GAN and processed dataset ResNet-50 model

Fig. 9 Mean training (a) and validation (b) loss for original, GAN and processed dataset ResNet-50 model

Fig. 10 Mean confusion matrix for original (a), GAN (b) and processed (c) dataset ResNet-50 model

There is some improvement in the results for the bolstered and more balanced generative dataset. Here the model, trained on the additional GAN-generated pass images, sees a modest increase in accuracy to just over 70%, though both models track each other closely in training and validation accuracy, as shown in Fig. 8. There is a distinct change as a result of the additional images in both training and validation loss over the course of the 100 epochs between the original and generative datasets. In Fig. 9, the original dataset model is able to improve its training loss over the course of training, but this improvement does not transfer to the validation set, where the loss oscillates around its starting value until the end. This is not the case for the generative dataset model, which is capable of improving on the validation set over time. Ultimately the introduction of new GAN-generated images within the generative dataset has had some positive impact, showing a modest improvement in both precision and recall when compared against the original dataset. The model is now more capable of differentiating between the ‘gap’ and ‘pass’ image classes, showing higher precision and recall as shown in Table 1.

Fig. 11 Example validation classifications for original model

Fig. 12 Example validation classifications for augmented model

Finally, the pre-processing of the images in the augmented dataset has had a significant impact on performance relative to the original dataset. To begin with, the overall accuracy of the model has increased to around 87%. Interestingly, there is also a marked change in the model’s training and validation accuracy and loss during the process. The training loss is much higher than for the other models, though with a similarly shallow and gradual improvement over time. The training accuracy follows a similar pattern, gradually increasing to around 89%, well short of the 95% reached by the other models. It is only in the validation set that the marked difference becomes apparent, with a much greater reduction in loss over time and a consistent improvement in accuracy that matches the levels found within the training set. This is echoed in Table 1, where the augmented dataset model shows a marked improvement in precision and recall across the three classes.

The model’s ability to generalise better and, it would seem, not overfit the training data is one possible explanation for the improvement. The reduction in the complexity of the input space, through reducing the colour channels and removing background features, also seems to have played a role. One of the key characteristics of the ‘gap’ class is its distinguishable deviation in the surface edge, as shown originally in Fig. 10. Under the lighting and zoom conditions used, it would seem that capturing these details in the original dataset was still a challenge. However, the application of the GrabCut algorithm means that such features are somewhat accentuated, through the removal of background noise and the emphasis of the foreground coil shape itself. Samples of images and the model’s predictions can be observed in Figs. 11 and 12 for the original and augmented datasets respectively.

5.1 Future directions and challenges

Several pointers to the overall picture of in-process monitoring of coil winding within the manufacture of electrical machines can be garnered from these findings. Firstly, the challenge is non-trivial: even when simplified to the use case presented in this paper, it requires additional processing to achieve useful results with deep neural network-based machine vision. The conditions in which the images are generated in situ within an industrial process can for the most part be controlled, and this paper demonstrates that it is possible to classify coil failures of a structural (gap, crossover) form. Naturally there are questions about the generalizability of such an approach, and whether CNN models such as ResNet50 can be applied in production, or whether the noise and variability of such processes is too much for them to handle. It would be expected that adapting the process to different light conditions, image capture angles or changes to the copper wire (such as thickness) would lead to a drop in accuracy, though not so great that such models could not be quickly re-trained back to their base performance. A challenge of out-of-distribution shift may arise when transferring to a whole new process, for example from linear winding to concentrated needle winding methods. The lay-up of the copper wire and its geometrical shape may ultimately be too great a change from the original for performance to be recoverable, though perhaps one-shot or few-shot methods as discussed earlier could provide a solution. The next challenge is in adapting such a framework for detection to one that can work in real time. Coil winding is a dynamic, real-time process, which may be performed at high speed with a variety of winding methods and resulting coil geometries, e.g. pre-formed or needle wound, distributed or concentrated. As discussed previously, the task of classification should be undertaken continuously throughout the process as successive layers are built up.

To the deeper question that threads through this work, viz. the challenge of limited data and how to overcome it to meet the aims of researchers or industrial practitioners, the answer is mixed. As discussed in depth at the beginning, there currently exist a number of strategies for aiding the training of machine learning models with limited or imbalanced data. The choice taken here was to explore the role GANs could play in generating new samples that might enhance classification and generalisation. This was a mixed success: it was demonstrated that it is certainly possible to incorporate such generated images, but doing so proved to be a challenging task in its own right. Ironically, the training process for GANs suffers from the same problem, in that large amounts of data are required to evolve a model that accurately captures the desired data distribution. In the case considered in this paper, this was the distribution of images across the three classes of coil winding failure. In addition, GANs can suffer from mode collapse, where the diversity of the generated images vanishes as the model collapses to a particular style or image. Thankfully, this was not the case here, although it was necessary to adapt the StyleGAN2 hyper-parameters and incorporate recent techniques (differentiable augmentation, self-attention) to have any prospect of converging to a model that output images of reasonable quality. This is arguably a challenge that will be faced by many going forward; however, data-limited GANs are a vibrant current research topic, so solutions will hopefully be forthcoming.

Table 1 Classifier Metrics

6 Conclusions

Industrial production and manufacturing require tight control of their underlying processes, with the aim of maintaining strict tolerances and avoiding variation as a product is being manufactured or assembled. When these processes begin to deviate from their respective targets, errors and defects can begin to accumulate, leading to downtime, repair, scrappage, or even in-service failure. Monitoring of these processes and their outcomes is therefore crucial to maintain the standards required at each step. The manufacture of high-value safety-critical assets, such as the electrical machines produced for the aerospace and automotive sectors, necessitates such actions.

This paper began with an overview of the process for electrical machine manufacture and quality assurance methods along with recent solutions to improve these processes through data analytics and machine learning, with a particular focus on fault detection. Modern methods such as those found within machine learning require large amounts of data in order to effectively model and detect errors that can occur during these processes. Unfortunately, access to such data is often limited, with examples of failure or defects, particularly when encapsulated through imaging, often scarce and hard to obtain. This paper explores this challenging problem within manufacturing through an investigation into available methods for handling sparse or unbalanced datasets for the training of machine learning models, particularly from the field of deep neural networks (DNNs).

This was followed by an investigation into the application of a Generative Adversarial Network (GAN) architecture that can be utilised to augment a limited and unbalanced dataset in order to better aid in the training of a state-of-the-art convolutional neural network (CNN) architecture for coil winding failure classification. In addition, an approach was proposed which utilised pre-processing to reduce the dimensionality and complexity of the source images as a means to aid the overall performance and accuracy of the trained models. In both instances there are clear benefits to model accuracy for the detection of coil winding failures using a deep convolutional neural network architecture. The ability to generate new samples from the data distribution of our targeted classes of coil winding failure provides additional images on which to train and helps overcome imbalances in the original dataset, whilst the pre-processing and dimensionality reduction method outlined was able to further increase the accuracy of the trained models. Overall the proposed solution is able to classify coil windings into our three classes (gap failure, crossover failure and pass) with a mean accuracy of 87%, compared with 65% for our standard CNN approach. This work presents a first look into attempts to characterise and classify coil windings and associated failure types during the electrical machine manufacturing process. The next steps are to investigate how such classification and quality measurement of the coil can be performed in real time as it is being produced, as a means to identify causal factors and apply corrective measures to reduce the need for scrappage or repair.