Design evaluation by combination of repeated evaluation technique and measurement of electrodermal activity
Consumer product design needs design evaluation to obtain information about consumers’ preferences and liking in order to optimize market success. Such evaluations are usually conducted in simple single-shot studies in which consumers evaluate, for instance, the attractiveness of a design only once. However, innovative designs often break common visual habits by combining more or less familiar parts into a new concept (Carbon and Leder in Appl Cogn Psychol 19:587–601, 2005). Thus, when a design is too advanced in its innovation, it is expected to be rejected by perceivers at first glance due to low familiarity. However, from everyday experience, we know that consumers’ liking of products is often a dynamic process, which cannot be captured by simple single-shot studies. Carbon and Leder (Appl Cogn Psychol 19:587–601, 2005) have proposed the repeated evaluation technique (RET) for measuring such dynamic effects, which we have combined here with the measurement of electrodermal activity (EDA). The EDA data demonstrated that the RET captured dynamic effects, as the EDA showed specific sensitivity for highly innovative material only after the RET had been conducted; a cross-check with the same material analyzing item-specific boredom revealed that participants were much more bored by low innovative material over time than by highly innovative material. Thus, the RET seems to be a valuable tool for meeting the demands of design evaluation, particularly when innovative designs have to be evaluated.
Keywords: Design, Styling, Prediction, Innovation, Electrodermal activity, Adaptation, Attractiveness, Preference, Applied cognition, Innovativeness, Repeated evaluation technique, Mere exposure
Evaluating innovation in design not only is an important issue in basic research for a better understanding of underlying cognitive processes, but is foremost an imperative procedure for optimizing designs to perfectly fit consumers’ preferences. Here we show, by the measurement of electrodermal activity (EDA), the need for dynamic testing, realized by the repeated evaluation technique (RET; Carbon and Leder 2005). With such a dynamic test setting we can obtain critical information about ecologically valid preferences, and, therefore, predict liking of future products (Carbon and Leder 2007).
2 Simple evaluations versus repeated evaluation technique
Before a new design is presented to the public, it has to undergo several evaluative procedures in order to minimize design flops, and consequently to maximize market success (Urban et al. 1996, 1997). Standard procedures of this kind are questionnaires, customer clinics or simple ratings. All these techniques are usually conducted only once. However, a “single shot” measurement is only valid if the underlying construct is static, which does not seem to be the case with designs that are evidently innovative, such as those of cars, cell phones, hi-fi/video components, etc. Carbon and Leder (2005) have recently shown in an experimental study that innovation in design has a very dynamic impact on key attributes such as attractiveness ratings. They used a special technique in which participants deeply elaborate the stimulus material. The so-called RET not only exposes the participants massively to the stimuli but also prompts multiple evaluations of the presented material. This procedure of elaboration aims at simulating the time and exposure effects of everyday life, because in reality we also work, live and operate with our consumer products. The study showed that innovative material was relatively strongly disliked in an initial evaluation phase. However, after participants had examined and elaborated the entire material, there was a significant increase in attractiveness ratings for innovative designs, while there was a trend for decreasing attractiveness for low innovative designs. This dissociation demonstrates the important dynamic effect of innovativeness over time, which can only be measured validly in a dynamic measuring scenario.
Only recently, Carbon et al. (2006) have expanded this theory by replicating the effect behaviorally and by providing additional neurophysiological evidence. They used pupillometry and analyses of scan paths to investigate the dynamic nature of innovative designs. Pupillometry (i.e., the dilatation of the pupil) was assessed by the averaged horizontal diameters of the left and the right eye. Previous studies suggest that the pupil may be a good indicator for the intensity of attention focused on the current task. Beatty and colleagues (e.g., Beatty 1982; Beatty and Kahneman 1966), for example, have demonstrated that the more demanding a cognitive task is, the more dilated the pupils are. According to this rationale, the Carbon et al. (2006) study demonstrated that innovative designs were not only cognitively more demanding but also appeared to evoke more interest. Moreover, the analyses of scan paths revealed that innovative designs may be interpreted as more balanced in their conceptual structure. This was documented by an increased number of eye movements directed to the focus areas of the car (e.g., the steering regions around the steering wheel and the console). These effects were particularly strong when participants had experienced the innovative stimuli in an elaborative way via the RET. Thus, again, measurement techniques that evaluate innovative designs only with a single-shot technique do not seem to be capable of capturing such dynamic effects.
Interest in highly innovative stimuli, on the one hand, might be one reason for the increase of liking for such material (cf. Carbon et al. 2006). On the other hand, the dissociated data pattern between highly and low innovative material could also be explained by the fact that low innovative material has little inherent innovation and represents material that we consume and know from everyday life, which is quite boring for participants. As pointed out by Bornstein et al. (1990), boredom is a limiting condition for effects that are related to increasing liking over time. Boredom was identified as being triggered by (social) meaning (Barbalet 1999). An absence of meaning leads to an experience of boredom. Deeper meaning and cognitively demanding and insurgent attributes of features, assembled in a coherent and harmonic way (Carbon et al. 2006), might be the right mixture of innovative material to induce liking for products in the long run.
3 Measurements of the electrodermal activity
Electrodermal activity has proven to be a useful psycho-physiological tool with wide applicability, especially for studying attentional processes and stimulus significance (Dawson et al. 2000).
The neural control of the eccrine sweat glands is entirely under sympathetic control. Thus, unlike most responses of the autonomic nervous system (ANS), the measurement of EDA provides a direct and undiluted representation of sympathetic activity (Boucsein 1992; Dawson et al. 2000). The skin conductance response (SCR) is elicited by a controlled cognitive process that is preceded by an early automatic discrimination process (Dawson et al. 2000; Lyytinen et al. 1992).
The event related SCR is an integral part of an orienting response. This investigatory response is caused by several stimulus characteristics such as novelty, meaningfulness, surprise or conflict (Berlyne 1960). Because of its discriminating abilities, the processes underlying the orienting response are not only involved in an unspecific autonomic arousal, but also in a stimulus-specific function guiding activation, attention and exploration.
Despite this apparent lack of specificity, an SCR becomes interpretable by considering the experimental conditions in which it occurs (Dawson et al. 2000). These properties make EDA particularly interesting for research on consumer products, as EDA provides possibilities of revealing consumers’ preferences without penetrating such processes cognitively. Therefore, the measurement of EDA seems particularly adequate to analyze dynamic effects of innovativeness in design.
4 The present study
The aim of the present study was to further investigate the nature of innovation in design, particularly to better understand the dynamic aspects of innovation in relation to consumers’ preferences. In order to be able to systematically analyze such effects, we used car interior designs of different levels of innovation. This kind of material was systematized in a study by Leder and Carbon (2005), who varied their material with respect to several properties, such as innovativeness, curvature and complexity. Furthermore, to circumvent typical problems of the inherent dynamics of innovation, we utilized the RET propagated by Carbon and Leder (2005). Following the idea of multi-methodical measuring, we extended the methods already utilized with the RET, such as behavioral measures (Carbon and Leder 2005), pupillometry and analyses of scan paths (Carbon et al. 2006), by using the measure of EDA (Experiment 1). In order to control the EDA data, which do not reveal the valence of an autonomic reaction, we conducted a second experiment integrating an additional dependent variable that explicitly measured how boring the different stimuli were for the participants (Experiment 2). With these data, the pattern of the EDA data can be qualified and interpreted more specifically.
5 Experiment 1: measuring EDA due to RET
Sixteen undergraduate students, aged between 21 and 46 years (M = 27.9; SD = 6.3), volunteered to participate in the experiment. The 12 females and 4 males received either course credit or were paid six Euros for participation. All participants had normal or corrected to normal vision and were tested individually.
5.1.2 Apparatus and stimuli
The stimuli were presented on an Iiyama™ 19-inch CRT monitor with a screen resolution of 1024 × 768 pixels at 60 Hz. Skin conductance was recorded by the constant voltage method (0.4 V). Two Ag–AgCl electrodes (active area of 8 mm diameter) filled with 0.5% NaCl electrolyte were attached to the thenar and hypothenar eminence of the participants’ left hand by means of double-sided adhesive collars. The electrodes were connected to a PAR-PORT/F™ system linked to a microcomputer whose software (PARPORT Online 2.8™) facilitated visualization and storage of EDA. The electrodermal data were measured with a sampling rate of 50 Hz.
Each participant sat in a comfortable chair, approximately 70 cm in front of the computer monitor, within a constantly lit, sound-reduced, air-conditioned room with the temperature maintained at a thermoneutral level between 22 and 24°C.
The procedure was very similar to that of the original study investigating effects of innovation on attractiveness of car interiors (Carbon and Leder 2005). In an initial phase, the participants rated eight stimuli separately on scales of attractiveness and innovativeness (test phase 1: T1). All ratings were made on a 7-point Likert scale (from “1”: least significant, up to “7”: most significant). In this phase we also measured SCRs of the participants while they were looking at the stimuli without further instructions. Subsequently, an extended rating phase followed. This phase, which consisted of 25 rating blocks (footnote 1), is called the RET phase (cf. Carbon and Leder 2005). Participants were instructed to rate the same stimuli as in T1 on several dimensions (see footnote 1) on “yes” or “no” scales. After all RET ratings were given, there was a short break in which the participants were instructed to answer two final ratings as deliberately as possible, followed by the second rating phase for attractiveness and innovativeness (test phase 2: T2). In T2, SCRs were measured a second time. The order of the stimuli was fully randomized for each rating block. Moreover, the order of the rating blocks, with attractiveness ratings in the first place and innovativeness ratings in the second place, was constant. All stimuli were presented for 8 s, yielding approximately 25 min for each session.
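The block structure of the RET phase described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the software used in the study: the function name `build_ret_schedule`, the stimulus identifiers and the attribute labels are hypothetical placeholders, and the order of the attribute blocks is simply kept as given, since the text only specifies that the stimulus order was randomized within each block.

```python
import random


def build_ret_schedule(stimuli, attributes, seed=None):
    """Return one (attribute, stimulus order) pair per rating block.

    Each block presents every stimulus once, in a freshly shuffled order.
    """
    rng = random.Random(seed)  # seeded generator so a schedule can be reproduced
    schedule = []
    for attribute in attributes:
        order = list(stimuli)  # copy, so the caller's list is left untouched
        rng.shuffle(order)     # fully randomized stimulus order for this block
        schedule.append((attribute, order))
    return schedule


# Hypothetical placeholders: eight stimuli and 25 yes/no attribute blocks.
stimuli = [f"interior_{i}" for i in range(1, 9)]
attributes = [f"attribute_{k}" for k in range(1, 26)]

schedule = build_ret_schedule(stimuli, attributes, seed=1)
n_trials = sum(len(order) for _, order in schedule)
```

With eight stimuli and 25 blocks, such a schedule contains 200 yes/no judgments per participant, which gives an impression of how massively participants are exposed to the material between T1 and T2.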
5.2 Results and discussion
In the following, behavioral data (ratings of attractiveness and innovativeness) and SCR data will be reported.
5.2.1 Attractiveness ratings
[Table: Attractiveness and innovativeness ratings in test phases T1 and T2 for both levels of Innovation (low and highly innovative) in Experiment 1; columns: test phase T1, test phase T2]
The analysis revealed a significant main effect of Innovation, F(1,15) = 81.67, p < 0.0001, ηp2 = 0.845. No other effect was found to be significant. This is not in accordance with the general findings of Carbon and Leder (2005), who found an interaction between Phase and Innovation. However, in the original study of Carbon and Leder (2005) an alternative selection of stimuli was used.
5.2.2 Innovativeness ratings
Mean ratings for innovativeness evaluations for each participant were submitted to a two-way repeated measurement ANOVA with Phase (T1, T2) and Innovation (low innovative, highly innovative) as within-subjects factors. The analysis only revealed a significant main effect of Innovation, F(1,15) = 23.20, p = 0.0002, ηp2 = 0.607, but no other effects.
The innovativeness ratings replicated the original findings of Carbon and Leder (2005): The repeated evaluation phase between T1 and T2 did not affect the general innovativeness rating of the material, neither of the low innovative nor of the highly innovative one.
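As a hedged illustration of the analysis design used here: in a 2 × 2 fully within-subjects design, each effect has one numerator degree of freedom, so its F value can be obtained as the squared one-sample t statistic of a per-participant contrast score, with (1, n − 1) degrees of freedom. The sketch below uses only the Python standard library; the function names and the four data rows are invented for illustration and are not the study's data.

```python
from math import sqrt
from statistics import mean, stdev


def f_from_contrast(scores):
    """F test for a 1-df within-subjects effect from per-participant contrast scores."""
    n = len(scores)
    t = mean(scores) / (stdev(scores) / sqrt(n))  # one-sample t against zero
    return t * t, (1, n - 1)                      # F equals t squared, df = (1, n - 1)


def rm_anova_2x2(cells):
    """cells: one tuple per participant, ordered (low_T1, low_T2, high_T1, high_T2)."""
    phase = [(l2 + h2) - (l1 + h1) for l1, l2, h1, h2 in cells]
    innovation = [(h1 + h2) - (l1 + l2) for l1, l2, h1, h2 in cells]
    interaction = [(h2 - h1) - (l2 - l1) for l1, l2, h1, h2 in cells]
    return {
        "Phase": f_from_contrast(phase),
        "Innovation": f_from_contrast(innovation),
        "Phase x Innovation": f_from_contrast(interaction),
    }


# Invented mean ratings for four hypothetical participants (not the study's data).
example_cells = [
    (3, 3, 5, 6),
    (2, 3, 4, 4),
    (4, 3, 5, 6),
    (3, 4, 6, 6),
]
results = rm_anova_2x2(example_cells)
```

With 16 participants, as in Experiment 1, this approach yields the error degrees of freedom of 15 reported above.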
5.2.3 Electrodermal activity
The data were analyzed by a two-way repeated measurement ANOVA with Phase (T1, T2) and Innovation (low innovative, highly innovative). The analysis revealed a trend for a main effect of Innovation, F(1,15) = 4.43, p = 0.0526, ηp2 = 0.228, and a two-way interaction between Phase and Innovation, F(1,15) = 4.90, p = 0.0427, ηp2 = 0.246. The analysis of simple main effects of Innovation demonstrated a significant difference in EDA activity between highly and low innovative material only for T2, F(1,15) = 6.56, p = 0.0217, ηp2 = 0.304, but not for T1, F(1,30) < 1.0, n.s.
The interaction between Innovation and Phase, in combination with a significant simple main effect of Innovation at T2, demonstrates that, with respect to autonomic arousal, participants were quite insensitive to different levels of design innovation in the initial test phase. Probably, the full range of material looked relatively indifferent to them. However, at T2, after repeated evaluation of the stimuli, a differentiated pattern of autonomic arousal was generated. The question of what valence this arousal had was addressed in Experiment 2. There, a similar RET procedure was used, complemented by an additional explicit boredom measure. Participants were asked to evaluate how boring they found the material at T1 and T2. This enables us to qualify and interpret the EDA data more specifically and helps to identify the underlying cognitive processes of the autonomic reactions.
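The effect sizes reported for the EDA analysis can be cross-checked from the F values and degrees of freedom alone, using the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The following minimal sketch applies it to the three significant or near-significant EDA effects reported above; the function name `partial_eta_squared` and the list layout are illustrative choices, not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, reported effect size) taken from the EDA results.
reported = [
    ("Innovation (main effect)", 4.43, 1, 15, 0.228),
    ("Phase x Innovation", 4.90, 1, 15, 0.246),
    ("Innovation at T2", 6.56, 1, 15, 0.304),
]

checks = []
for label, f_value, df_effect, df_error, eta_reported in reported:
    eta_computed = partial_eta_squared(f_value, df_effect, df_error)
    checks.append((label, eta_computed, eta_reported))
    print(f"{label}: computed {eta_computed:.3f}, reported {eta_reported:.3f}")
```

The same function can be applied to the rating analyses of both experiments by substituting the respective F values and error degrees of freedom.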
6 Experiment 2: measuring boredom due to RET
Thirty-one undergraduate students, aged between 18 and 50 years (M = 24.4; SD = 8.3), volunteered to participate in the experiment. The 16 females and 15 males received course credit for their participation. All participants had normal or corrected to normal vision and were tested individually.
6.1.2 Apparatus and stimuli
The same stimuli as in Experiment 1 were used. Stimuli were presented by the experimental control software PsyScope 1.25 PPC (Cohen et al. 1993) on a 17-inch CRT monitor with a screen resolution of 1024 × 768 pixels at 89 Hz.
Each participant sat approximately 70 cm in front of the computer monitor, in a constantly lit, sound-reduced room, with the temperature maintained at a thermoneutral level of about 22°C. The procedure was very similar to that of Experiment 1, consisting of three phases: T1, RET, and T2. The RET phase was identical; T1 and T2 were complemented by a boredom rating. Here, participants were asked to rate on a 7-point Likert scale (from “1”: least significant, up to “7”: most significant) how boring the material was. In Experiment 2, EDA was not measured. The whole procedure lasted approximately 25 min.
6.2 Results and discussion
In the following, rating data, sampled across participants, are reported.
6.2.1 Attractiveness ratings
[Table: Attractiveness and innovativeness ratings in test phases T1 and T2 for both levels of Innovation (low and highly innovative) in Experiment 2; columns: test phase T1, test phase T2]
The analysis revealed a trend for a main effect of Innovation, F(1,30) = 3.19, p = 0.0844, n.s. No other effect was found to be significant. This data pattern is similar to that of Experiment 1.
6.2.2 Innovativeness ratings
Mean ratings for innovativeness evaluations for each participant were submitted to a two-way repeated measurement ANOVA with Phase (T1, T2) and Innovation (low innovative, highly innovative) as within-subjects factors. The analysis only revealed a significant main effect of Innovation, F(1,30) = 51.30, p < 0.0001, ηp2 = 0.635, but no other effects.
The innovativeness ratings replicated the findings of Experiment 1: the repeated evaluation phase between T1 and T2 did not affect the innovativeness rating of the material, neither for low innovative nor for highly innovative material.
6.2.3 Boringness ratings
Mean ratings for boringness evaluations for each participant were submitted to a two-way repeated measurement ANOVA with Phase (T1, T2) and Innovation (low innovative, highly innovative) as within-subjects factors. The analysis revealed a significant main effect of Innovation, F(1,30) = 8.37, p = 0.0071, ηp2 = 0.218, and, most interestingly, an interaction between Phase and Innovation, F(1,30) = 5.57, p = 0.0250, ηp2 = 0.157. Analysis of the simple main effects of Phase showed a significant effect for highly innovative stimuli, F(1,30) = 4.35, p < 0.05, ηp2 = 0.127, but not for low innovative stimuli, F(1,30) = 1.23, p = 0.2758, n.s. The analysis of simple main effects of Innovation demonstrated a significant difference in boringness ratings between highly and low innovative material only for T2, F(1,30) = 12.63, p = 0.0013, ηp2 = 0.296, but not for T1, F(1,30) = 1.47, p = 0.2346, n.s.
7 General discussion
In the present experimental study, we investigated dynamic effects of different levels of innovation in car interior designs. Based on the conceptual idea proposed by Carbon and Leder (2005) that “innovative designs often break common visual habits” (p. 587), highly innovative designs should initially be rejected by the perceivers. However, as perceivers become increasingly familiar with such highly innovative designs, these designs should also benefit from higher ratings of attractiveness, liking and interest after a while. In order to facilitate familiarization and elaboration of highly innovative material, we used the RET introduced by Carbon and Leder (2005). With that technique, we could show that dynamic effects of innovativeness and attractiveness can be captured, and thus, dynamics of real world scenarios can be simulated. For example, Carbon and colleagues (Carbon et al. 2006; Carbon and Leder 2005) have demonstrated that participants who initially disliked highly innovative design material evaluated such material as significantly more attractive after elaborate processing, whereas their attractiveness evaluations for low innovative material decreased over time. Interestingly, an experimental study with an additional measurement of scan paths and pupillometry revealed that highly innovative material not only benefited from familiarization and elaboration but also led to more balanced eye tracks (between main areas of interest) and more dilated pupils during test phase T2 (Carbon et al. 2006). This might be, on the one hand, an indicator for a more balanced conceptual structure or a higher degree of visual rightness (Locher 2003) of highly innovative designs. On the other hand, the EDA data from Experiment 1, together with results from former studies, might also indicate that highly innovative designs are cognitively more demanding and challenging, at least after repeated evaluation and elaboration of the entire material.
Nodine and Krupinski (2004) concluded that such demanding material appears to produce more tension, while at the same time generating longer lasting interest.
In the present experiment we could not replicate the behavioral findings observed for the RET in previous studies (Carbon et al. 2006; Carbon and Leder 2005). However, the analyses of EDA data showed that skin conductance was highly sensitive to the dynamic effects of innovation. Whereas the skin conductance during the evaluation of low innovative material decreased between test phase T1 (which was the initial test phase) and test phase T2 (which was the second test phase after the participants had been massively exposed to and had extensively evaluated the entire set of stimuli under the RET condition), it increased for the highly innovative material. These results correspond to the attractiveness data of Carbon and Leder (2005).
But how can the dissociation between the EDA of highly and low innovative material in test phases T1 and T2 be explained? Particularly, why is there an effect of increased EDA only for highly innovative material after the block of repeated evaluations? According to the rationale of the RET (Carbon and Leder 2005), participants must elaborate and familiarize themselves with new, highly innovative and atypical material in order to integrate it into their conceptual space before they can truly appreciate it. As pointed out by Hekkert et al. (2003), participants tend to reject such designs, because they are too advanced and therefore not acceptable. However, with the RET proposed by Carbon and Leder (2005), innovative design not only becomes more familiar, but also increases in cognitive fluency (Leder 2003). Carbon et al. (2006) assumed that after becoming highly familiar with innovative material, participants are capable of exploring its innovative structure. As an underlying cause, design that requires cognitively sophisticated processing appears more interesting than low innovative design, which is more familiar, but also rather boring in the long run. Following this line of argumentation, in Experiment 2, participants were explicitly asked how boring they found the different materials. Boredom ratings did not differ when participants were first exposed to the material in T1: low as well as highly innovative designs were rated similarly, at a medium level of boredom. After the RET, boredom ratings for low innovative material increased, whereas boredom ratings for highly innovative material decreased significantly. The boringness data were in accordance with the idea that innovative designs need time to become appreciated but are also less susceptible to boredom. Thus, designs that are more innovative and more advanced have a greater chance of becoming liked, popular, or even admired designs in the future.
Low innovative designs, on the other hand, when only tested in a single-shot study have a good chance of being fatally misinterpreted as being liked and not boring. The RET setting reveals that they are disliked and seen as boring after a deeper elaboration.
To sum up, we have shown that a single-shot measurement of appreciation is rather ineffective in capturing the dynamic effects caused by innovation. In order to avoid the inherent weakness of such studies, we have tested such dynamic effects by using the RET proposed by Carbon and Leder (2005) along with measurements of EDA and boringness ratings. Once again, the usage of the RET triggered design-sensitive processes. Although EDA initially, at T1, did not differ between low and highly innovative material, there was a clear and specific increase of EDA for the highly innovative material after a series of repeated evaluations in T2. In addition, boringness data qualified the underlying mechanisms by demonstrating that boredom ratings differed with respect to innovativeness only at T2. This underlines the importance and usefulness of the RET for simulating dynamics of the real world in an experimental setting. Consequently, the RET method can be a very helpful tool in predicting future preferences in design, and, therefore, for optimizing market success of consumer products.
Footnote 1: “hochwertig” (of high quality), “elegant” (elegant), “nüchtern” (plain), “angenehm” (pleasant), “erdrückend” (overwhelming), “komfortabel” (comfortable), “geschmackvoll” (tasteful), “flippig” (hip), “ansprechend” (appealing), “stilvoll” (stylish), “überladen” (overloaded), “bieder” (proper), “extravagant” (extravagant), “luxuriös” (luxurious), “verspielt” (playful), “durchdacht” (carefully designed), “kitschig” (kitschy), “übersichtlich” (clearly structured), “einladend” (inviting), “gediegen” (solid), “abschreckend” (disgusting), “konservativ” (conservative), “praktisch” (functional), “modern” (modern), “futuristisch” (futuristic)
This research was supported by a grant to H. Leder and C.-C. Carbon from the FWF “Fonds zur Förderung der wissenschaftlichen Forschung” (National Austrian Scientific Fonds; P18910). We thank Andrea Lyman for proof-reading this manuscript, Gernot Gerger, Stella Färber and Thomas Ditye for conducting parts of the experiments and Jenny Zeller for producing the stimuli. Most importantly, we thank an anonymous reviewer for inspiring us to explicitly test the variable boringness in Experiment 2.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
- Beatty J, Kahneman D (1966) Pupillary changes in two memory tasks. Psychon Sci 5:371–372
- Berlyne DE (1960) Conflict, arousal, and curiosity. McGraw-Hill, New York
- Boucsein W (1992) Electrodermal activity. Plenum Press, New York
- Carbon CC, Leder H (2007) Design evaluation: from typical problems to state-of-the-art solutions. Thexis 2007:33–37
- Carbon CC, Hutzler F, Minge M (2006) Innovation in design investigated by eye movements and pupillometry. Psychol Sci 48:173–186
- Cohen JD, MacWhinney B, Flatt M, Provost J (1993) PsyScope: a new graphic interactive environment for designing psychology experiments. Behav Res Methods Instrum Comput 25:257–271
- Dawson ME, Schell AM, Filion DL (2000) The electrodermal system. In: Cacioppo JT, Tassinary LG, Berntson GG (eds) Handbook of psychophysiology. Cambridge University Press, New York
- Nodine CF, Krupinski EA (2004) How do viewers look at artworks? Bull Psychol Arts 4:65–68