We applied the methodology in a study of engineers who use multivariate time series data to diagnose the performance of devices throughout the production lifecycle. The goal of the study was to understand the experts’ abductive reasoning processes and the key features of the time series data used in these processes. This information can then be used to create and select input features for advanced data analytics that model and predict certain response variables.
In the work domain studied, access to experts was limited because their senior roles spanned multiple engineering teams. Therefore, knowledge elicitation sessions had to be as brief as possible, while still being thorough enough to acquire all data relevant to the work and the experts’ reasoning processes.
To design the knowledge elicitation sessions, we followed the multi-stage approach described in Sect. 2. In Step 1, we collaborated extensively with the ePOC to gain an overview of the entire work domain and to develop an elicitation methodology that would be compatible with the work culture and the availability of the expert engineers. In Step 2, we conducted observations of a subset of experienced engineers performing their day-to-day time series analysis work. From these observation sessions, we developed a list of specific questions and general talking points to use in subsequent interviews. In Step 3, we conducted one-hour interviews with each of the three most experienced engineers. After completing the interviews, we analyzed our notes and developed lists of commonly used domain-specific vocabulary, tools used to complete the time series analysis work, and common difficulties encountered during the analysis work. In Step 4, we used the information from Step 3 to design the instruments to be used in the knowledge elicitation study. We chose to include eye tracking as an objective measure of attention allocation during a domain-specific task. We also chose to create study instruments that could be cross-referenced during analysis to highlight both consistencies and discrepancies in the raw data. This design allows for comparisons between objective measures of attention and subjective measures of information collection and reasoning processes, which will help guide subsequent studies in this work domain.
Four instruments were developed: (1) a general demographics questionnaire, (2) a work domain-specific questionnaire, (3) a simplified, domain-specific abductive reasoning task equipped with eye tracking, and (4) a verbal walkthrough protocol (see Sect. 4.2 for a detailed description of each instrument). For each instrument, we created an initial draft and then reviewed and revised the content and instructions in collaboration with the ePOC. The ePOC also provided technical content (time series) to use as stimuli for the time series analysis task.
4.1 Participants
Thirteen employees at Sandia National Laboratories volunteered to participate in the study. Three of the participants in the study were classified as experts; that is, they diagnosed device performance using the multivariate time series data as part of their daily job. These experts had an average of 15.5 years’ experience performing this type of activity. Four participants were categorized as practitioners; that is, they were familiar with the multivariate time series data but did not use it to diagnose device performance. These practitioners had an average of 5.5 years’ experience interacting with the multivariate time series data. Six participants were classified as novices who had no experience with the multivariate time series data. This novice cohort was included to provide comparative performance baselines.
4.2 Procedure
The participants completed the study individually. The participant first read through and signed the study consent form and asked any questions he/she had about the study. Next, the participant filled out a demographic questionnaire which assessed the participant’s age, gender, years of experience, etc. The experts and practitioners then filled out a questionnaire which asked specific questions about their work with the multivariate time series data. The novices did not fill out this second questionnaire since they did not have any experience with this type of data.
Multivariate Time Series Task. A PowerPoint presentation that explained the study and described what the participant would be asked to do was displayed to the participant. The novices were given very detailed instructions since they did not have experience with the multivariate time series data. The experimenter calibrated the eye tracker (Footnote 4), and then the participant completed two blocks of trials; the first block consisted of 10 trials and the second block consisted of 5 trials. Each trial consisted of four images displayed on the screen that contained multivariate time series data from a single device test. The participant was asked to classify the images as anomalous or normal. If the participant indicated that the image was anomalous, another screen was displayed that asked the participant to indicate the type of anomaly. Eye tracking data (Footnote 5) and response times were recorded while the participant inspected the time series stimuli.
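To make the trial structure concrete, the following is a minimal sketch of how a single recorded response might be represented and summarized by cohort. Only the block sizes (10 and 5 trials), the two response options, and the conditional anomaly-type question come from the procedure above; the class name `TrialRecord`, its field names, and the helper `mean_response_time_by_cohort` are hypothetical and do not describe the study's actual data format.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional

# Block sizes stated in the procedure: 10 trials, then 5 trials.
BLOCK_SIZES = (10, 5)
VALID_RESPONSES = ("anomalous", "normal")


@dataclass
class TrialRecord:
    """One participant's response to one trial (hypothetical layout)."""
    participant_id: str
    cohort: str                 # e.g. "expert", "practitioner", "novice"
    block: int                  # 1 or 2
    trial: int                  # position within the block, starting at 1
    response: str               # "anomalous" or "normal"
    response_time_s: float      # assumed unit: seconds
    anomaly_type: Optional[str] = None  # asked only after "anomalous"

    def __post_init__(self) -> None:
        # Reject records that cannot occur under the described procedure.
        if not 1 <= self.block <= len(BLOCK_SIZES):
            raise ValueError(f"unknown block: {self.block}")
        if not 1 <= self.trial <= BLOCK_SIZES[self.block - 1]:
            raise ValueError(f"trial {self.trial} not in block {self.block}")
        if self.response not in VALID_RESPONSES:
            raise ValueError(f"unknown response: {self.response}")
        if self.response == "normal" and self.anomaly_type is not None:
            raise ValueError("anomaly type given for a 'normal' response")


def mean_response_time_by_cohort(records: List[TrialRecord]) -> Dict[str, float]:
    """Group response times by cohort and return the mean for each."""
    times: Dict[str, List[float]] = {}
    for record in records:
        times.setdefault(record.cohort, []).append(record.response_time_s)
    return {cohort: mean(values) for cohort, values in times.items()}
```

Validating each record at construction time is one way to catch transcription errors early, for example a sixth trial logged in the five-trial block.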
Verbal Walkthrough. Finally, the experts and practitioners provided a verbal walkthrough of the 15 trials. The experimenter opened a PowerPoint presentation that contained the 15 trials and, for each trial, asked the participant to explain his/her thought processes while examining the time series data, how he/she reached a decision, and what aspects of the images “popped out” or caught his/her eye. A second experimenter took notes while this discussion was taking place. The novices did not perform this task since they made decisions based on the detailed instructions that were given to them.
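One way the eye tracking data and the walkthrough notes could be cross-referenced, in the spirit of the study design, is to compare for each trial the image regions a participant fixated with the regions he/she mentioned verbally. The sketch below assumes both sources have already been coded into a shared set of region labels; that coding step, the labels, and the function name `cross_reference` are all hypothetical and are not taken from the study.

```python
from typing import Dict, Set


def cross_reference(fixated: Set[str], mentioned: Set[str]) -> Dict[str, Set[str]]:
    """Split region labels into agreement and the two kinds of discrepancy."""
    return {
        # Regions both looked at and talked about (consistent).
        "consistent": fixated & mentioned,
        # Regions looked at but never mentioned in the walkthrough.
        "fixated_only": fixated - mentioned,
        # Regions mentioned in the walkthrough but not fixated.
        "mentioned_only": mentioned - fixated,
    }
```

For example, `cross_reference({"image_1", "image_2"}, {"image_2", "image_3"})` would report `image_2` as consistent, `image_1` as fixated only, and `image_3` as mentioned only.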