1 Introduction

Inspection is defined as a deliberate, in-depth, exacting process that requires more than mere looking or scanning (See 2012). Drury and Prabhu (1992) point out that precision, depth, and validity are critical elements of this definition. Inspection processes require a large amount of mental processing, concentration, and information transmission, along with extensive use of both short-term and long-term memory (Gallwey 1982).

Eye-tracking studies help to analyze human visual activity. An eye-tracker is a device for measuring eye positions and eye movements (Jacob and Karn 2003). Eye-tracking technologies have several applications, such as web usability, advertising, sponsorship, package design, and automotive engineering. In general, commercial eye-tracking studies focus on a sample of consumers, and the eye-tracker is used to record the activity of the eye. Michalski and Grobelny (2016) present eye-tracking results for box package designs. Michalski (2017) analyzes digital control panels in terms of effectiveness and efficiency. Ozkan and Ulutas (2017a) use eye-tracking to assess knowledge and behavior related to medicine information leaflets, and Ozkan and Ulutas (2017b) evaluate medicine leaflets presented on screen. Mobile eye-trackers can also be used in manufacturing environments: Ozkan and Ulutas (2016) use eye-tracking data as an indicator of cognitive workload for forklift drivers, and Ulutas and Ozkan (2017a) compare fixation data from expert and novice operators conducting visual control activities for ceramic tile inspection. Beyond manufacturing, the use of tower crane operators’ eye-tracking data to assess their performance is a pioneering application on construction sites (Ulutas and Ozkan 2017b).

Khasawneh et al. (2003) draw attention to the inspection task during quality control, particularly its search portion, and to the use of eye-tracking technology. See (2015) provides the first empirical data on the reliability of visual inspection for precision manufactured parts. In a recent study, See et al. (2017) summarize the factors influencing inspection performance and suggest exploring new research areas in visual inspection by considering automated inspection. Gramopadhye et al. (1997) focus on the importance of visual inspection and the principles of effective training. Speed (2015) summarizes the issues associated with quantifying expert, domain-specific visual search behavior in operationally realistic environments.

By examining a series of fixations and saccades, it is possible to generate information about the attention-shifting process that accompanies the execution of visual tasks. The position of a fixation depends on the location of the previous fixation. Ellis and Stark (1986) suggest that scan paths can be modeled by a stochastic first-order Markov process. Observable eye movements can be associated with overt attention, while the hidden states of a hidden Markov model (HMM) are coupled with shifts of covert attention (Findlay and Gilchrist 2003). HMM tools make it possible to find stochastic relations between observations and hidden states and are thus very useful in broadening basic knowledge about attentional visual processes.
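To make the first-order Markov view of a scan path concrete, the following minimal sketch estimates transition probabilities between regions by counting consecutive fixation pairs. The function name, the region labels "A", "B", "C", and the example sequence are hypothetical and serve only as an illustration; they are not taken from the cited studies.

```python
from collections import Counter

def transition_matrix(sequence, labels):
    """Estimate first-order transition probabilities by counting consecutive pairs."""
    counts = Counter(zip(sequence[:-1], sequence[1:]))
    matrix = {}
    for src in labels:
        row_total = sum(counts[(src, dst)] for dst in labels)
        # A region with no outgoing transitions keeps an all-zero row.
        matrix[src] = {
            dst: (counts[(src, dst)] / row_total if row_total else 0.0)
            for dst in labels
        }
    return matrix

# Hypothetical scan path over three regions (illustrative only).
scanpath = ["A", "A", "B", "C", "A", "B", "B", "C"]
P = transition_matrix(scanpath, ["A", "B", "C"])
print(P["A"])  # probabilities of moving from region A to each region
```

In such a plain Markov chain the states are the observed regions themselves; an HMM differs in that the states are not observed and must be inferred from the fixation sequence.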

Liechty et al. (2003) illustrate a two-state HMM whose states correspond to covert attention (local and global), based on saccade lengths registered during the visual analysis of printed advertisements. Chuk et al. (2014) identify two specific attention patterns (holistic and analytic) in a face recognition task. For textual information search, the HMM-based scan path investigation of Simola et al. (2008) provides evidence for three consecutive hidden states (scanning, reading, and the answer). Grobelny and Michalski (2016) estimate four HMMs with three hidden states for the digital control panels they examined.

One application of HMMs in the industrial management context concerns fall detection and localization of the operator. Kianoush et al. (2015) use HMMs to monitor various worker postures and to detect a fall event by analyzing changes in radio-frequency signal quality.

As mobile eye-trackers become more accessible and increasingly accurate, a more ecological approach is gaining appreciation in the psychology of human visual behavior (e.g., Tatler et al. 2011). Lukander et al. (2017) describe a number of further studies in this field, reviewing eye-tracking experiments in which visual behavior is recorded during everyday activities outside research laboratories. One direction within this trend focuses on providing classifications and models based on eye-tracking data. For example, Greene et al. (2012) use a linear discriminant classifier to verify whether complex mental states can be inferred from eye-tracking data, as suggested by Yarbus (1967), whereas Coutrot et al. (2018) provide a method for scan path modeling and classification using variational HMMs and discriminant analysis. Similar methods, also involving HMMs, are employed by Kit and Sullivan (2016) for naturalistic eye movement behavior, while a comprehensive review and comparative analysis is provided by Boisvert and Bruce (2016).

Although a number of recently published papers involve HMMs in eye-tracking data analysis, most of them focus on the classification problem. Moreover, studies on how these models can be used in work environment analyses are very scarce. Among the few studies in this area, Sodergren et al. (2010) use eye-tracking data from subjects performing natural orifice translumenal endoscopic surgery. Unfortunately, the HMMs are used only for profiling subjects, and the authors do not include any substantial analysis of participants’ visual attentional behavior. Another possible application of HMMs in the industrial environment may be related to automatic face recognition. However, the studies published in this domain are concerned with improving the effectiveness of the method in general (e.g., Chuk et al. 2014) rather than with examining whether and how it might be useful in the industrial environment.

Based on the accessible literature, eye-tracking data obtained in real-life manufacturing environments have not yet been analyzed by means of HMMs. Therefore, this study focuses on the visual activity of quality inspection operators. The basic research question concerns the qualitative differences between subjects’ visual activities, and the analyses are carried out from the HMM perspective. The second section summarizes the basics of HMMs. The following sections describe the control panel experiment and give an overview of the recorded visual activity. Then, the HMM simulation results are discussed, and the paper concludes with directions for future studies.

2 Hidden Markov models: a brief overview

Markov models, also called Markov processes or chains, were introduced by the Russian mathematician Andrei Andreyevich Markov (1913). Generally, they deal with states and with transitions between those states, each of which occurs with a specified probability. An HMM is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (i.e., hidden) states. The HMM can be represented as the simplest dynamic Bayesian network. Detailed HMM specifications can be found in various sources; one of the most comprehensive explanations is provided by Rabiner (1989).

A discrete, first-order HMM typically comprises a set of N hidden states S = {S_1, S_2, …, S_N}; a vocabulary of M observation symbols that can be emitted in every state, V = {v_1, v_2, …, v_M}; a state transition probability matrix A = {a_ij}, which gives the likelihood of switching from state i to state j; an emission probability matrix B = {b_j(k)}, where b_j(k) = P[v_k at t | q_t = S_j], 1 ≤ j ≤ N, 1 ≤ k ≤ M; and a starting probability distribution π = {π_i}, where π_i = P[q_1 = S_i], 1 ≤ i ≤ N, and q_t denotes the state at time t. An HMM is often written compactly as the parameter set λ = (A, B, π), which produces a sequence of observations O = O_1 O_2 … O_T, where every O_t is one of the symbols from V and T is the total length of the observation sequence. In Sect. 5 of the current paper, the HMM parameters (A, B, π) are estimated from observation sequences derived from the eye-tracking data, an assumed number of states, and a vocabulary corresponding to the defined areas of interest (AOIs).
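As an illustration of how the parameter set λ = (A, B, π) assigns a probability to an observation sequence O, the sketch below implements the scaled forward algorithm described by Rabiner (1989). All numerical values are hypothetical (two states, three symbols) and are not estimates from this study; the function name is likewise an assumption made for the example.

```python
import math

def forward_log_likelihood(obs, A, B, pi):
    """Return log P(O | lambda) using the forward recursion with per-step scaling."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through A, then weight by the emission of O_t.
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid numerical underflow
    return log_lik

# Hypothetical parameters: N = 2 hidden states, M = 3 observation symbols.
A = [[0.9, 0.1], [0.2, 0.8]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
pi = [0.6, 0.4]
print(forward_log_likelihood([0, 2, 2, 1], A, B, pi))
```

The log-likelihood returned by such a routine is the quantity that parameter estimation seeks to maximize and on which information criteria are based.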

3 Method

3.1 Company overview

The facility under study is located in Turkey and has a 10,000 m² enclosed area within a total area of 12,640 m². Currently, there are 24 white-collar employees and 222 blue-collar personnel. More than 400,000 tumble driers are manufactured every year, 99% of which are exported. The facility has three assembly lines: two of them have 22 workstations and the third has 6 workstations. The products undergo functional and electrical tests after assembly is completed.

3.2 Inspection area in concern

Depending on the model of the tumble drier, the plastic control panel may be produced in different colors (i.e., white, silver, and black). Each front panel must be checked by a quality control operator before it is transferred to the related assembly workstation. The inspection area is presented in Fig. 1. The picture is a snapshot from Tobii Pro Glasses Analyzer and shows the original viewpoint of the control operator.

Fig. 1 Quality control area from the operator’s point of view

The main steps of the visual quality control are as follows, based on the numbering in Fig. 1:

  1. Related information is printed on the front panels in another department of the facility. The panels are carried to the inspection area in deep carton boxes and placed on the left-hand side of the control operator.

  2. The smaller carton box holding the digital cardboards that need to be mounted on the back side of the panels is placed in front of the control operator. After approximately 60 panels, the box is replaced with a new one, whose seal needs to be opened.

  3. The operator takes one front panel from the deep box and places it on the control table front side up. After visual inspection of the front side is completed, the panel is turned over and its back side is inspected. Two screws are taken from box (4) and mounted on the back side. If a defect (e.g., a scratch or faded color) is detected, a colored tape (5) is stuck on the panel, and the panel is set aside in area (8) on the left-hand side of the control table.

  4. If the panel meets the defined specifications, it is first placed on area (6) and then placed in the carton box (7) on the right-hand side of the control table, to be sent to the related assembly workstation. When the box is full, it is carried to the material handling area and an empty box is brought.

3.3 Visual inspection task description and identification of areas of interest (AOIs)

The front control panel under inspection is shown in Fig. 2, and Fig. 3 illustrates the regions of the panel that are of special interest. In eye-tracking studies, such regions are usually called areas of interest (AOIs); they are the main independent variable in the present research.

Fig. 2 Panel in concern

Fig. 3 Representation of AOIs and their abbreviations

Details of the AOI characteristics are summarized in Table 1. The quality control engineers in the facility assign the highest importance to the area named 01_BN because it includes the brand and model of the tumble drier. Faded or misaligned printing in this area cannot be tolerated. Likewise, the area named 06_DP, which lies in the middle of the panel and holds the digital panel screen, has high importance because any scratch in this area is likely to be noticed quickly by the customer. The space area (03_SA) also attracts the customer’s attention if it is scratched and therefore needs to be inspected carefully.

Table 1 Characteristics of AOIs and their abbreviations for the panel in concern

The control operators are asked to focus on the printed information on the two sides of the black digital section. These areas contain several items of printed information, including brand name, loading capacity, energy class, and program settings. The printing in these areas may be faded or improperly aligned; therefore, adequate time is required for their visual inspection. The area 07_RH has the lowest importance and is smaller than the other defined AOIs.

The visual activities related to the AOIs are examined and analyzed separately for one novice and one experienced female quality control operator. We use the fixation sequences registered within the confines of the defined AOIs to model the possible hidden states of the visual quality inspection task by HMMs.
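A discrete HMM works on integer symbols, so a sequence of fixated AOIs has to be encoded before estimation. The sketch below shows one possible encoding. It lists only the AOI abbreviations mentioned in the running text (the complete set is defined in Table 1), and the fixation sequence, the function name, and the choice to drop fixations outside all AOIs are assumptions made for illustration.

```python
# AOI abbreviations mentioned in the running text; Table 1 defines the full set.
AOI_LABELS = ["01_BN", "03_SA", "06_DP", "07_RH", "08_RK", "09_RE", "10_PS"]

def encode_fixations(aoi_sequence, labels):
    """Map AOI labels to integer symbols 0..M-1, skipping fixations outside the AOIs."""
    index = {label: k for k, label in enumerate(labels)}
    return [index[a] for a in aoi_sequence if a in index]

# Hypothetical fixation sequence; None marks a fixation outside every AOI.
fixations = ["06_DP", "01_BN", None, "01_BN", "03_SA", "06_DP"]
symbols = encode_fixations(fixations, AOI_LABELS)
print(symbols)
```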

3.4 Data collection procedure

Several types of data can be obtained with an eye-tracker and evaluated for the purpose at hand. These include fixations (the length of a fixation is usually an indication of information processing or cognitive activity), saccades (the rapid eye movements occurring during reading and when viewing visual patterns), smooth pursuit (slow eye movements that stabilize the image of a slowly moving target on or near the fovea), and changes in pupil size (in response to light, emotional stimuli, attention, and working memory workload, as well as pupillary unrest). Such data are recorded by eye-tracking glasses whose components include illuminators, cameras, and a processing unit containing the image detection, 3D eye model, and gaze mapping algorithms. In this study, participants’ eye movement data are collected with a mobile eye-tracker (Tobii Pro Glasses 2), shown in Fig. 4.

Fig. 4 Representation of the eye-tracker used (https://www.tobiipro.com/product-listing/tobii-pro-glasses-2/)

The study is conducted during the daytime shift at a home appliance facility in Eskisehir, Turkey, between January and May 2016. Two participants representing the population of visual inspectors are considered. The participants, both female and aged 25 and 30, have no visual problems such as nearsightedness, farsightedness, or astigmatism. Each participant is tested individually in the workplace shown in Fig. 1. They received no extra payment for their participation. The sequence of the test is as follows:

  1. Introduction of the nature and aim of the test.

  2. Calibration of the Tobii Pro Glasses 2 device for the participant’s gaze.

  3. Performance of the inspection task for 60 panels to identify possible defects as summarized in Table 1.

The experimental eye-tracking data are first processed in the Tobii Pro Glasses Analyzer software, version 1.95 (https://www.tobiipro.com/learn-and-support/downloads-pro/), then exported to text files and further processed in the TIBCO Statistica 13.3 package (TIBCO Software Inc. 2017). The first-order, discrete-time HMMs are estimated according to the Baum-Welch procedure (Baum 1972). The functions developed by Murphy (1998, 2005) are used to calculate estimates of the initial, transition, and emission probabilities. The maximum number of iterations is set at 1000 and the convergence threshold at 0.0001. All HMM computations are performed in a 64-bit version of Matlab R2018b (9.5.0).
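For readers who wish to see the structure of the estimation step, the following is a minimal, self-contained sketch of Baum-Welch re-estimation for a single discrete observation sequence. It is not the Matlab code used in this study: it is a simplified Python illustration in which the iteration limit and convergence threshold mirror the settings reported above, while the function names, the random initialization scheme, and the observation sequence are assumptions.

```python
import math
import random

def normalize(row):
    total = sum(row)
    # Fall back to a uniform row if a state received no probability mass.
    return [x / total for x in row] if total > 0 else [1.0 / len(row)] * len(row)

def forward_backward(obs, A, B, pi):
    """Scaled forward-backward pass; returns alpha, beta, scales, log-likelihood."""
    n, T = len(pi), len(obs)
    alpha, scales = [], []
    for t in range(T):
        if t == 0:
            row = [pi[i] * B[i][obs[0]] for i in range(n)]
        else:
            row = [sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                   for j in range(n)]
        c = sum(row)
        scales.append(c)
        alpha.append([x / c for x in row])
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
                   / scales[t + 1] for i in range(n)]
    return alpha, beta, scales, sum(math.log(c) for c in scales)

def baum_welch(obs, n_states, n_symbols, max_iter=1000, tol=1e-4, seed=0):
    """Estimate (A, B, pi) from one sequence, starting from random values."""
    rng = random.Random(seed)
    def rand_row(k):
        return normalize([rng.random() + 0.1 for _ in range(k)])
    pi = rand_row(n_states)
    A = [rand_row(n_states) for _ in range(n_states)]
    B = [rand_row(n_symbols) for _ in range(n_states)]
    history, T = [], len(obs)
    for _ in range(max_iter):
        alpha, beta, scales, log_lik = forward_backward(obs, A, B, pi)
        history.append(log_lik)
        if len(history) > 1 and abs(history[-1] - history[-2]) < tol:
            break  # log-likelihood change fell below the convergence threshold
        # E-step: state posteriors gamma_t(i).
        gamma = [[alpha[t][i] * beta[t][i] for i in range(n_states)] for t in range(T)]
        # M-step: accumulate expected transition counts, then normalize each row.
        new_A = [[0.0] * n_states for _ in range(n_states)]
        for t in range(T - 1):
            for i in range(n_states):
                for j in range(n_states):
                    new_A[i][j] += (alpha[t][i] * A[i][j] * B[j][obs[t + 1]]
                                    * beta[t + 1][j] / scales[t + 1])
        new_B = [[0.0] * n_symbols for _ in range(n_states)]
        for t in range(T):
            for i in range(n_states):
                new_B[i][obs[t]] += gamma[t][i]
        pi = gamma[0][:]
        A = [normalize(row) for row in new_A]
        B = [normalize(row) for row in new_B]
    return A, B, pi, history

# Hypothetical symbol sequence over three symbols (illustrative only).
demo_obs = [0, 0, 1, 2, 2, 2, 1, 0, 0, 1, 2, 2, 0, 0, 0, 1, 2, 2, 2, 2]
A_hat, B_hat, pi_hat, log_liks = baum_welch(demo_obs, n_states=2, n_symbols=3)
print(len(log_liks), log_liks[-1])
```

Because the procedure only finds a local maximum of the likelihood, the result depends on the random starting values, which is why repeated runs with different seeds are commonly used.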

4 Overview of the recorded visual activity

The eye-tracker is used while the experienced and the novice control operators inspect one panel at a time. Table 2 summarizes the metrics generated by the Tobii Analyzer software for the ten AOIs: fixation count, total fixation duration, visit count, average visit duration, and time to first fixation for the experienced and novice operators. During the inspection performed by the novice operator, no significant data are recorded for the 09_RE and 10_PS AOIs.

Table 2 Eye-tracking metrics for the operators’ visual activity

Heat maps visualize how fixation duration is distributed. Red represents the areas where the operator spends the most time during inspection; yellow and green represent progressively shorter durations. Figure 5 shows the heat map for the experienced operator and Fig. 6 that for the novice operator. It is clear that the experienced operator inspects the areas determined to have the highest importance (01_BN, 03_SA, and 06_DP). The novice operator, however, completely fails to inspect the 09_RE and 10_PS AOIs, as also supported by the data in Table 2.

Fig. 5 Heat map for the experienced operator

Fig. 6 Heat map for the novice operator

5 Modeling visual activity by HMMs for novice and experienced operators

5.1 Simulation experiment

The research outcomes of Hayashi (2003) suggest that novice and expert pilots might use qualitatively different visual strategies. A simulation experiment is therefore designed and conducted to determine the most suitable number of hidden states for modeling the visual behavior of novice and expert workers. Overall, 10 conditions are examined. They are differentiated by five possible numbers of hidden states (from 2 to 6) and two types of workers (WT: novice and expert): 5 (States) × 2 (WT).

It is well known that HMM estimation results depend on the initial probabilities. To compensate for this, 100 single simulations with random starting values are run for each experimental condition. Every model is evaluated by Akaike’s (1973) Information Criterion (AIC) and Schwarz’s (1978) Bayesian Information Criterion (BIC). The values of these criteria obtained in the current experiment, i.e., their means, standard deviations (in brackets), and minimum values, as well as the log-likelihoods, are presented in Table 3.
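Both criteria combine the maximized log-likelihood with a penalty on the number of free parameters k: AIC = -2 ln L + 2k and BIC = -2 ln L + k ln n. The sketch below evaluates them for a discrete HMM under two assumptions that are not stated in the text: that k counts the free entries of A, B, and π (each probability row loses one degree of freedom), and that n is the number of observations in the sequence. The log-likelihood and sequence length are hypothetical.

```python
import math

def hmm_free_parameters(n_states, n_symbols):
    """Free parameters of a discrete HMM: rows of A, B, and pi each sum to one."""
    return (n_states * (n_states - 1)      # transition matrix A
            + n_states * (n_symbols - 1)   # emission matrix B
            + (n_states - 1))              # initial distribution pi

def aic(log_lik, k):
    return -2.0 * log_lik + 2.0 * k

def bic(log_lik, k, n_obs):
    return -2.0 * log_lik + k * math.log(n_obs)

# Hypothetical log-likelihood and sequence length; ten AOIs as in this study.
for n_states in range(2, 7):
    k = hmm_free_parameters(n_states, n_symbols=10)
    print(n_states, k, aic(-250.0, k), round(bic(-250.0, k, n_obs=100), 1))
```

With ten symbols, the parameter count under this assumption grows from 21 for two states to 35 for three, which shows how quickly the penalty term increases with the number of states.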

Table 3 The HMM simulation results for expert and novice workers

Smaller values of AIC and BIC signify more appropriate models, and the obtained results unanimously suggest two-state HMMs irrespective of the operator’s experience. This finding is somewhat surprising, as it contrasts with the results of Hayashi (2003) for pilots’ eye movement data registered during landing, which suggested that models with a larger number of states better explain the visual behavior of more experienced pilots.

These two criteria are known for their bias towards more parsimonious models, favoring solutions with a smaller number of parameters to be estimated. Such a property may lead to selecting models with too few states to adequately reflect the observed data structure. On the other hand, the log-likelihood itself increases with the number of states, suggesting the need for more and more parameters in the models; naturally, following it alone may deliver solutions that fit the noise. Unfortunately, the present simulation experiment does not provide a clear indication of how many states to choose. Although the smallest values of AIC and BIC suggest two-state HMMs for both the novice and the expert operator, one may doubt whether the smallest possible number of states is enough to explain the registered shifts in visual attention. Therefore, we present and analyze both two- and three-state HMMs in the next section.

5.2 Analysis of HMMs

In line with the AIC and BIC criteria, the HMMs with the smallest possible number of hidden states are considered. For comparison purposes, estimates of three-state HMMs are also presented. From among the hundred models computed for a given number of states, we select those with AIC and BIC values close to their minima and log-likelihoods near their maxima and, among these, the models that provide a logical and meaningful interpretation in the examined context. Tables 4 and 5 contain the two- and three-state HMMs determined to be the most appropriate for representing the visual attention shifts registered for the novice and expert operators. In each table, the second row contains the initial state probabilities (π), the following two (or three) rows contain the probabilities of transitions between each pair of states (A), and the rest of the table presents the emission probabilities (B).

Table 4 Two states HMMs for the experienced and novice operators

An analysis of the emission probabilities of the two-state HMMs makes it clear that the visual inspections performed by the novice worker ignore some AOIs (03_SA, 07_RH, 08_RK, 09_RE, 10_PS). In contrast, the HMM for the expert shows attention shifts to all the predefined AOIs. This outcome is fully consistent with the qualitative analysis of the heat maps presented in Figs. 5 and 6.

Another difference between these models is visible in the transition probability matrices. High values on the diagonal suggest that attention is very likely to stay in the same state, which may be interpreted as being more focused on the currently performed subtask. For the experienced worker, despite high diagonal values, there are still considerable probabilities of switching to the other state, which indicates that this worker jumps between the two states from time to time. The transitions of the novice operator, on the other hand, imply a different pattern of attention switching. In both cases, the two-state HMMs show that the operator starts in the second state, which is related to searching for scratched surfaces. However, the less experienced operator is more likely to shift to the first state than to stay in the current one. Once there, the novice operator is likely to stay in the first state, with very little chance of returning to the second state.
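One way to read the diagonal of a transition matrix is through the expected dwell time: in a first-order Markov chain the number of consecutive steps spent in state i follows a geometric distribution with mean 1 / (1 - a_ii). The short sketch below evaluates this for a few hypothetical diagonal values; they are not the estimates reported in Table 4, and the function name is an assumption.

```python
def expected_dwell(self_transition_prob):
    """Mean number of consecutive steps spent in a state before leaving it."""
    if not 0.0 <= self_transition_prob < 1.0:
        raise ValueError("self-transition probability must be in [0, 1)")
    return 1.0 / (1.0 - self_transition_prob)

# Hypothetical diagonal values of A (illustrative only).
for a_ii in (0.5, 0.8, 0.95):
    print(a_ii, round(expected_dwell(a_ii), 2))
```

A diagonal value close to one therefore corresponds to long uninterrupted runs of fixations generated by the same hidden state.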

The three-state HMMs presented in Table 5 confirm considerable differences in the visual attention shifts of the two operators. In the three-state models, one may notice a phenomenon similar to that in the two-state models: the visual inspections of the less experienced operator produce no fixations in a number of AOIs, as opposed to the expert’s model, where emission probabilities are estimated for all AOIs.

Table 5 Three states HMMs for the experienced and novice operators

The fact that no fixations are observed in many regions of the examined panel for the less experienced operator may result from the use of peripheral vision, which cannot be registered by eye-trackers. Such a general view of the whole panel may be associated with the zoom-lens model (Eriksen and James 1986), although the second phase of that model is not detected here. In turn, the visual attention behavior of the experienced operator suggests the application of the spotlight strategy (Posner et al. 1980) for a thorough and systematic visual inspection of panel quality.

As with the two-state HMMs, we can presume for the three-state HMMs that the states indicate searching for specific types of flaws, namely scratches, misalignments, and faint elements. The analysis of the models from Table 5 together with the information from Table 1 suggests that the S3 state may reflect scratch identification. However, a more detailed analysis shows that this proposal is not fully consistent with the information gathered from the operators, especially in terms of searching for scratches. Since the identified AOIs are differentiated by their importance to the manufacturer, the hidden states in the presented models may correspond to some combination of the defect type and importance components of the quality control process.

6 Conclusions

6.1 Summary of the findings

In the current research, we have presented and analyzed eye-tracking data of quality assurance workers performing visual inspection tasks in their natural working environment. As mentioned in the introduction of this article, such an approach is becoming increasingly popular in research involving both eye-tracking techniques (Tatler et al. 2011; Lukander et al. 2017) and HMMs (Kit and Sullivan 2016; Boisvert and Bruce 2016).

We examined basic eye-tracking metrics and qualitative heat maps and compared them with the obtained HMMs. We analyzed the operators’ visual attention patterns and searched for differences between the novice and the expert. The simulation studies aimed at determining the most appropriate number of hidden states to capture the registered visual characteristics. Our results in this respect did not confirm those of Hayashi (2003), where experienced pilots exhibited more hidden states than novice ones. In our case, the simulation outcomes suggested the same number of hidden states for both novice and experienced operators.

Further, two- and three-state HMMs were estimated and thoroughly analyzed. These models showed that, despite the same number of hidden states, the two types of operators employed considerably different visual strategies. It seems that the expert used the spotlight strategy (Posner et al. 1980), which corresponds well to a thorough and systematic visual inspection, whereas the inexperienced worker’s visual behavior involved a much more global assessment of the entire panel, associated with peripheral vision and the first phase of the zoom-lens model (Eriksen and James 1986).

The identified differences between attention shifting strategies may constitute a basis for optimizing the quality control operations. The provided models are logically interpreted in terms of specific sub goals accompanying the visual inspection tasks.

The paper shows how modern equipment for mobile eye-tracking may be of practical use in examining operations in a manufacturing facility. The gathered data can be analyzed in the classic way for specified AOIs and given visual activity measures. However, by taking advantage of a more sophisticated approach such as HMMs, one may try to discover less obvious patterns of human visual behavior. The present study illustrates that, even for a quite limited sample of eye-tracking data, such a methodology may provide qualitatively different results for novice and experienced quality assurance operators.

6.2 Limitations and future research

There are, naturally, specific limitations concerning the current investigation. First, the eye-tracking data are registered for only one novice and one experienced employee. Although they repeated the same experimental task multiple times, one should not draw any far-reaching conclusions about general differences between expert and inexperienced operators.

Since it is usually difficult to obtain consent to examine workers with eye-trackers in a real environment, most eye-tracking experiments in the literature are conducted in a laboratory. In the current study, data are collected while inspectors conduct their routine tasks in authentic working conditions. In this context, our contribution might be interesting for researchers and practitioners, since it is more ecologically valid than lab investigations.

As far as we know, no lab- or factory-based studies similar to ours have been carried out to analyze human visual attention recorded during the inspection process. Therefore, despite the above-mentioned limitation, we think that the present research is still valuable, as it provides some insight in this regard. The presented outcomes may be used as a benchmark in future studies.

From the perspective of the presented and described Markov models, the collected data were sufficient. Participants examined 60 panels, and the visual inspection tasks were relatively simple. Thus, we were able to estimate and provide significant models uncovering specific patterns that help to better understand subjects’ attentional behavior.

However, modeling human behavior is usually troublesome and may be very difficult. Selecting a meaningful model that can be reasonably interpreted seems to be a kind of art, even for a simple experimental setup like the one in the present research. It resembles, to some degree, the procedures used in factor analysis (rotations, oblique rotations, decisions about the number of factors, selecting which factor loadings should be assigned to which factors, etc.). Likewise, in classic regression analysis, the researcher has to decide how many independent variables to include, whether or not to keep variates that are correlated with each other, and to what extent a given variable has to contribute to the whole model in terms of R-squared to be kept in the regression equation. Similar problems occurred in the present study. For each simulation experiment condition, 100 models were produced. Some of the models that are good in terms of the information criteria are trivial and cannot be used as an explanation of human visual behavior. Moreover, the model selection criteria employed are not very helpful, since they do not point to a clearly optimal number of states. Additionally, there are qualitatively different models with similar values of the goal function (or criterion). Taking advantage of other measures in this regard may provide more concrete suggestions.

An extension of the present investigation may focus on the automatic identification of novice and experienced operators. It is also possible to apply HMM algorithms to the selection of the most promising candidates for positions where specific visual strategies are required. Further studies may also assess whether operator gender has an influence on visual behavior.

Ulutas and Ozkan (2017a) draw attention to the use of digital cameras and image processing algorithms that may be helpful in inspection tasks. However, workers’ decision-making capability and their ability to tolerate some types of defects are the main reasons why many firms still rely on human visual inspection. Moreover, automated control methods may be unsuitable or too costly for inspection activities such as panel inspection. Some future directions may, however, involve automated intelligent procedures combined with direct human supervision. The presented findings may be treated as a further step towards a better understanding of how humans perform visual inspection tasks. This could also facilitate the development of better algorithms and intelligent procedures for automated control systems.

6.3 Final remarks

Despite the limitations and problems outlined in the previous section, the HMMs seem to be an interesting way of modeling human visual behavior while performing work operations based on visual inspection.

The study also draws attention to eye-tracking techniques that are useful for supporting inspector training and improving subsequent performance by linking failure modes to visual search activity, such as the problem of “looking but not seeing” (Muczynski and Gucma 2013). The models obtained in this study are qualitatively different for the novice and the expert operator. Gaining insight into how experts perform visual inspection tasks may help in developing appropriate training procedures for novice operators or in enhancing existing ones. As a result, the overall efficiency and effectiveness of work operations can be improved.

In general, HMMs make it possible to analyze hidden states that can be associated with so-called covert attention (Findlay and Gilchrist 2003), whereas the classic scan path examination deals only with manifestations of overt attention changes.