In the following subsections, we discuss the main studies, treating the hypotheses, findings, and discussion of each study separately. Study A examines taxonomy preferences (Section 5.2), and Study B addresses the overchoice effect within a chosen taxonomy (Section 5.3).
We recruited 326 participants through Amazon Mechanical Turk. Participation was restricted to those located in the United States with a very good reputation (≥95% HIT approval rate and ≥1000 HITs approved) to avoid careless contributions. Participants were recruited at various times of the day to balance nighttime and daytime music application usage. Several comprehension-testing questions were used to filter out fake and careless entries. This left us with 297 completed and valid responses. Age (19 to 68, with a median of 31) and gender (159 males and 138 females) information indicated an adequate distribution. Participants were compensated with $2 for their participation.
In Study A, we looked at how taxonomy preferences are related to different personality traits. To investigate this relation we simulated a music streaming service (Fig. 2). The application consists of a simple interface with three taxonomies (mood, activity, and genre) for participants to browse for music. A tooltip provided users with a description of each taxonomy. The order of the taxonomies was randomized to prevent order effects. Once a taxonomy was picked, participants continued by choosing a category within the chosen taxonomy (this is addressed in Study B in Section 5.3). As we are interested in users’ intrinsic taxonomy preferences, participants were not able to go back once a taxonomy was picked. For those who wanted to choose a different taxonomy, we included an additional option of “None of the items” among the available categories. For those who picked this option, we included an additional question in the concluding questionnaire where they could indicate what they would have picked otherwise in terms of taxonomy (i.e., mood, activity, or genre) as well as the category within a taxonomy.
In order to prevent an experience bias with one of the music taxonomies (i.e., mood, activity, or genre), participants were told during the instructions of the user study that they were going to test a new music streaming service, and that it was therefore important that they interact with the system in the way that suited them best.
As there is no strong evidence from the literature to form hypotheses, we decided to adopt an exploratory approach. In the discussion section, we try to relate our findings to what is known from prior research.
Using a chi-square test of independence, we explored the relationship between participants’ five personality dimensions and the chosen music taxonomy (mood, activity, and genre). We used a median split to divide each personality trait into a low and a high measure, and assigned a binary value to each taxonomy representing whether or not a participant chose that taxonomy. The distribution of the music taxonomy choices made by the participants is shown in Table 5. In general, participants most often chose the genre taxonomy, followed by the mood and activity taxonomies. In the following sections we discuss the relationship between personality traits and the music taxonomy chosen by the participants (see Table 4 for an overview).
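The analysis described above can be illustrated with a minimal sketch, assuming a strict "above the median" rule for the high group and a Pearson chi-square without continuity correction; neither choice is stated in the text, and the function names `median_split` and `chi_square_2x2` are hypothetical. For one degree of freedom the p-value follows from the complementary error function, so only the standard library is needed.

```python
import math
from statistics import median


def median_split(scores):
    """Return True for scores strictly above the sample median (assumed rule)."""
    m = median(scores)
    return [s > m for s in scores]


def chi_square_2x2(high_trait, chose_taxonomy):
    """Pearson chi-square test of independence for two binary variables (df = 1).

    No continuity correction is applied (an assumption). All row and column
    totals must be non-zero, otherwise an expected count of zero occurs.
    """
    n = len(high_trait)
    # Observed counts: rows = low/high trait, columns = not chosen/chosen.
    obs = [[0, 0], [0, 0]]
    for high, chose in zip(high_trait, chose_taxonomy):
        obs[int(high)][int(chose)] += 1
    row_totals = [sum(r) for r in obs]
    col_totals = [obs[0][j] + obs[1][j] for j in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs[i][j] - expected) ** 2 / expected
    # For df = 1, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

In use, `median_split` would be applied to each of the five trait scores, and `chi_square_2x2` would be called once per trait and taxonomy with the binary "chose this taxonomy" indicator.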
Results of the chi-square test indicated a positive relationship between openness to experience and mood, χ²(1, N = 297) = 3.117, p = .05. This means that those who scored high on the openness to experience dimension were more likely to choose the mood taxonomy than the activity or genre taxonomy. We did not find any significant effects of the other personality traits: conscientiousness χ²(1, N = 297) = .934, p = .334, extraversion χ²(1, N = 297) = .870, p = .351, agreeableness χ²(1, N = 297) = .044, p = .833, and neuroticism χ²(1, N = 297) = .703, p = .402.
When looking at the chi-square test results for the activity taxonomy, we found a positive significant effect of conscientiousness, χ²(1, N = 297) = 3.210, p = .05. Additionally, we found a positive relationship of neuroticism, χ²(1, N = 297) = 12.663, p < .001. These results indicate that those who scored high on neuroticism or conscientiousness were more likely to choose the activity taxonomy. We did not find significant effects for openness to experience χ²(1, N = 297) = .046, p = .830, extraversion χ²(1, N = 297) = .507, p = .477, and agreeableness χ²(1, N = 297) = .406, p = .524.
The chi-square test results for the genre taxonomy indicated a positive significant effect of neuroticism, χ²(1, N = 297) = 6.583, p = .01, which implies that those who scored high on neuroticism were more inclined to choose the genre taxonomy than the other taxonomies. None of the other personality traits were significant: openness to experience χ²(1, N = 297) = 3.079, p = .11, conscientiousness χ²(1, N = 297) = 0, p = .997, extraversion χ²(1, N = 297) = 1.506, p = .220, and agreeableness χ²(1, N = 297) = .266, p = .606.
Additionally, we looked for effects of gender and age. Controlling for gender and age did not result in any significant effects. However, as seen in Table 5, the distribution of gender is interesting and indicates some trends. The proportion of women is higher for the mood and activity taxonomies, whereas the converse holds for genre.
In this study, we investigated whether music taxonomy (mood, activity, and genre) preferences can be inferred from personality traits. We found that there is a relationship between personality traits and preferences for the taxonomies that are used by music streaming services. We visualized our findings in Fig. 4.
We found a positive relationship between openness to experience and the mood taxonomy. This indicates that those scoring high on openness to experience are likely to choose music organized by mood. Knoll et al. found that open individuals show reciprocal behavior towards emotional support. Those scoring high on openness to experience are more aware of their own emotions and better able to judge them. Therefore, music can play a supportive role for them, and they would benefit more from browsing for music by mood.
Furthermore, a positive relationship between conscientiousness and the activity taxonomy was found. In other words, highly conscientious people show an increased preference for activity, but not for genre. Conscientiousness refers to characteristics such as self-discipline. People who score high on the conscientiousness scale tend to be more plan- and goal-oriented, organized, and determined compared to those scoring low. As conscientious people are more plan- and goal-oriented, they would benefit from taxonomies that consist of concrete music categories (e.g., activities) to support their plans and goals.
Lastly, we found relationships between neuroticism and the activity and genre taxonomies. This indicates that those scoring high on neuroticism are more likely to choose activity or genre. The neuroticism dimension indicates emotional stability and personal adjustment. Those scoring high on neuroticism frequently experience emotional distress and wide swings in emotions, while those scoring low on neuroticism tend to be calm, well adjusted, and not prone to extreme emotional reactions. Additionally, those who are highly neurotic do not believe that emotions are malleable, but rather that they are difficult to control and strong in their expression. As neurotic people do not consider emotions to be easily changed, they will not benefit much from the mood taxonomy, but more from the activity or genre taxonomies instead.
In Study B we looked into how the number of categories presented within a chosen music taxonomy influences the user experience (i.e., category choice satisfaction and difficulty, perceived system quality and usefulness), and how this effect is moderated by the participant’s musical dimension expertise (i.e., active engagement, emotion, and perceptual abilities). The conditions (6- and 24-categories) of Study B originate from the behavior in Study A, where participants picked a music taxonomy to continue their music browsing (Study A; Section 5.2). In the following subsections we continue with hypotheses building, findings, and discussion.
Overchoice is not always bound to occur; the choice set needs to satisfy preconditions. We covered the choice set preconditions in Section 4. However, overchoice does not only depend on choice set characteristics, but the user’s characteristics play a role as well. A significant moderator for overchoice is the expertise of the user [11, 12, 64, 70, 85].
In line with findings showing expertise as a moderator for overchoice, we hypothesize that expertise also plays a role in the context of this study. In order to measure expertise, we rely on the different dimensions (i.e., active engagement, emotion, and perceptual abilities) of the Gold-MSI. The active engagement dimension depicts general music expertise (e.g., how much time and money one spends on music listening), while the dimensions emotion and perceptual abilities depict expertise related to the individual music taxonomies (mood and genre taxonomy respectively). For example, the emotion dimension is related to how often someone might choose music that will send shivers down their spine or how often music can evoke memories of past people and places, thereby mapping to the mood taxonomy. Similarly, the perceptual abilities dimension is related to how well someone can compare two pieces of music or how well someone can identify genres of music, thereby mapping to the genre taxonomy. Because the active engagement dimension depicts general behavior, we believe that it has a positive effect in both the mood and genre music taxonomies.
Furthermore, we do not only investigate the effects of overchoice on choice satisfaction, but assess other parts of the user experience as well. Besides satisfaction, we also include choice difficulty, perceived system quality, and perceived system usefulness. Unless otherwise specified, we will refer to these factors as the user experience.
H1: The number of categories within any of the taxonomies will have a positive effect on the user experience for dimension experts in active engagement, but not for non-experts.
The dimensions emotion and perceptual abilities are more specifically oriented towards the mood and genre music taxonomies. Therefore we hypothesize:
H2: The number of categories within the mood taxonomy will have a positive effect on the user experience for dimension experts in emotion, but not for non-experts.
H3: The number of categories within the genre taxonomy will have a positive effect on the user experience for dimension experts in perceptual abilities, but not for non-experts.
We do not hypothesize overchoice within the activity taxonomy as it depicts specific activities, and is unrelated to any kind of expertise or ambiguity.
A multivariate analysis of variance (MANOVA) was conducted to test for user experience (i.e., perceived system usefulness, perceived system quality, choice difficulty, and choice satisfaction) differences between 6- and 24-categories. With the MANOVA we first tested differences between the number of categories within each music taxonomy (i.e., without controlling for expertise). Tables 6 and 7 show the categories that the participants chose, and the total distribution across the music taxonomies respectively. Results show that for the mood, activity, and genre taxonomies, participants did not experience any significant difference whether it was the smaller choice set or the bigger choice set that they chose from (see Table 8 for means and standard deviations).
In order to investigate the effects of expertise, we conducted a moderated multiple regression (MMR) analysis. We used the dimensions of the Gold-MSI (i.e., active engagement, perceptual abilities, and emotions) to assess participants’ expertise level, and added these as moderators to the analyses. This allowed us to investigate how expertise influences the overchoice effect and the user experience factors (i.e., perceived system usefulness, perceived system quality, choice difficulty, and choice satisfaction).
The analyses were conducted in two steps. In the first step we tested for main effects. This allowed us to see the general effects of expertise on the user experience factors within each music taxonomy, regardless of the number of categories. The second step involved the moderators (i.e., emotion, perceptual abilities, and active engagement dimension expertise). By including the moderators, we were able to look at how expertise influences overchoice, and in turn the user experience.
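The two-step procedure can be sketched as two ordinary least-squares fits: one with main effects only, and one that adds the product of condition and expertise as the moderator term. This is a minimal illustration, assuming a 0/1 coding of the 6- versus 24-category condition and mean-centered expertise scores (common practice, but not stated in the text); the function names are hypothetical, and standard errors and t-tests are omitted.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def moderated_regression(large_set, expertise, outcome):
    """Fit step 1 (main effects) and step 2 (main effects plus interaction).

    large_set: 0 for the 6-category condition, 1 for 24 categories (assumed coding).
    expertise: one Gold-MSI dimension score per participant, centered here.
    Returns (step1, step2) coefficient lists:
      step1 = [intercept, condition, expertise]
      step2 = [intercept, condition, expertise, condition x expertise]
    """
    mean_e = sum(expertise) / len(expertise)
    centered = [e - mean_e for e in expertise]
    step1 = ols([[1.0, c, e] for c, e in zip(large_set, centered)], outcome)
    step2 = ols([[1.0, c, e, c * e] for c, e in zip(large_set, centered)], outcome)
    return step1, step2
```

In this sketch the last coefficient of `step2` carries the moderator effect: a negative value means that the effect of a larger choice set on the outcome decreases as expertise increases.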
We separately discuss the significant findings of each music taxonomy on the user experience factors (i.e., perceived system usefulness, perceived system quality, choice difficulty, and choice satisfaction) below. In each of the following result sections, we first start with the significant main effects (i.e., the effect of expertise on the user experience without taking into account the different number of categories). After that we continue with the significant moderator effects (i.e., the effect of expertise on overchoice and the user experience).
When looking at the results of perceived system usefulness within the mood taxonomy, we found a significant main effect of emotion expertise (t(1, 63) = 1.939, p = 0.05). This indicates that in general participants who are emotion experts found the system more useful than non-experts. For perceived system quality, we found a significant main effect of active engagement expertise (t(1, 63) = −2.379, p = 0.02), as well as emotion expertise (t(1, 63) = 2.285, p = 0.02). This means that actively engaged participants perceived the system to be of lower quality, while participants with emotion expertise rated the system as being of higher quality. Furthermore, we found a main effect on choice satisfaction of emotion expertise (t(1, 63) = 1.764, p = 0.08), indicating that those who use music for emotional activities are in general more satisfied with their category label choice.
When looking at differences between the number of categories while controlling for the expertise dimensions, we found the following moderator effects on the different factors of the user experience. For perceived system usefulness, we found a significant moderator effect of emotion expertise (t(1, 63) = −2.147, p = 0.03). The results of the moderator effect indicate that emotion experts perceived the system as less useful when given more choices, while non-emotion experts perceived the system as more useful when given more choices (Fig. 5). When looking at perceived system quality, we found a moderator effect of emotion expertise (t(1, 63) = −1.834, p = 0.07), indicating that emotion experts perceived the system to be of lower quality when given more choices (Fig. 6). Lastly, we identified moderator effects on choice difficulty by emotion expertise and active engagement expertise. Emotion experts show a decrease in choice difficulty when given fewer choices (t(1, 63) = −1.754, p = 0.08; Fig. 7), whereas active engagement experts show a decrease in choice difficulty when given more choices (t(1, 63) = 2.385, p = 0.02; Fig. 8). No significant effects were found on choice satisfaction.
As expected, no main or moderator effects were found for the categories within the activity taxonomy.
Within the genre taxonomy, no main effects were found of the different expertise dimensions on the user experience. However, moderator effects were observed on the user experience factors when looking at the differences between the number of categories. A significant moderator effect was found on perceived system usefulness when controlling for perceptual abilities expertise (t(1, 197) = 2.260, p = 0.02). Participants with expertise in perceptual abilities rated the system as more useful when given more choices. On the other hand, those with low perceptual abilities rated the system as more useful when given fewer choices (Fig. 9). For perceived system quality, we found a moderator effect of perceptual abilities expertise (t(1, 197) = 1.838, p = 0.06). The results show that perceptual experts rated the system as being of higher quality when given more choices, while it hardly made a difference for non-experts (Fig. 10). No significant effects were found on choice satisfaction or choice difficulty by expertise in perceptual abilities, nor did we find any effects on the user experience factors by active engagement.
Our results show that expertise plays a role in whether overchoice occurs or not. With regard to H1, we only found partial support. We hypothesized that general musical expertise (active engagement) would play a role in whether overchoice occurs. However, we only found an overchoice effect in the mood taxonomy on choice difficulty. Those who were more expert reported finding it more difficult to choose a category when they were given less choice, whereas non-experts reported experiencing more difficulty when given more choice. As this was the only effect found, the effect of expertise seems to be very specific, and cannot take any general form.
The effect of emotion expertise within the mood taxonomy is remarkable. Here, emotion expertise seems to have the opposite effect on overchoice. Therefore, we need to reject our hypothesis (H2). Instead of an increase in the user experience factors when given more choice, emotion experts show a decrease. In other words, they perceived the system as more useful and of higher quality, and reported having less difficulty picking a category, when provided with less choice. Non-experts showed the opposite effect and reported a better user experience when given more choices. A possible explanation for this could be that emotion experts are in general more emotionally aroused and therefore prefer less choice because it takes less cognitive effort. This is in line with findings that show that emotional arousal can have an adverse effect on decision making because of reduced cognitive processing [20, 66]. In other words, information processing decreases as a result of emotional arousal. Making a choice from a bigger choice set would then take more effort to assess every option. Also, especially for those who rely more on the emotional triggers of music, making a bad choice will have bigger consequences than making a good choice. Hence, as the choice sets within each music taxonomy were designed to be most attractive, choice difficulty within the mood taxonomy is exacerbated for the more experienced ones.
The effect of expertise in the genre taxonomy is partially in line with our hypothesis (H3). Prior research suggests that expertise is a moderator for overchoice [11, 12, 64, 70, 85]. Those who reported being experts in perceptual abilities rated the system as being of higher quality, and as more useful, when more choices were provided.
It is striking that we did not observe a clear overchoice effect on the choice that was made (i.e., choice difficulty and choice satisfaction), but only on the evaluation of the system (i.e., perceived system usefulness and perceived system quality). The necessary preconditions for overchoice to occur state that the user needs to lack familiarity with the items, and should not have a clear prior preference for an item. However, not meeting these preconditions should lead to preferring more choice [11, 12], whereas our results show no differences. Others argue that overchoice can only occur when all options are attractive. So, there should be no dominant option and the proportion of non-dominant options should be large [16, 17, 48, 77]. Otherwise, making a decision would be easy, regardless of the size of the choice set. In this study, we tried to control for that by creating choice sets with the most attractive items (see the preliminary study in Section 4). Also, looking at the distribution of the choices made by the participants (see Tables 6 and 7), no category stands out as being chosen excessively often. The most plausible explanation for why we did not observe the hypothesized effects comes from Hutchinson. He argued that overchoice seldom occurs among animals, because they seem to have adapted to the different sizes of choice sets that naturally occur in their environment. Although this hypothesis has not been verified on humans so far, it would explain best why the overchoice effect on the choices made (i.e., choice difficulty and choice satisfaction) was not found in our study. The sizes of the choice sets we used are not uncommon for music streaming services. We picked the size of our largest choice set (24 categories) to be in line with the original work on overchoice by Iyengar and Lepper. However, this was just a subset of what would be presented to actual users of such a service.
It could be that participants are accustomed to the sizes of the presented choice sets, as current music streaming services require them to deal with even larger choice sets than those used in this study.
Although we did not observe the overchoice effect on the category items, it does not mean that our choice sets did not have any effects. We did find effects on the factors evaluating the system (i.e., perceived system usefulness and perceived system quality). These are important factors that help to form users’ general perspective of the system as a whole.
Aside from the fact that our results contribute to knowledge on how to design online music systems (see Section 5.4), our results also contribute to knowledge in other domains (see Fig. 11 for a visualization of our results). We show that personality traits relate to interactions within online music systems and thereby provide insights on how personality relates to online behaviors through the new interaction methods that technologies are facilitating. Furthermore, our results provide additional insights important for decision making research by showing the versatility of expertise in the overchoice paradigm. Although expertise proved to be an important influencing factor, we show that whether it contributes to overcoming the overchoice effect is case dependent.
The results of these studies support the creation of personalized user interfaces by taking into account the user’s personality and expertise (a proposed user model can be found in Fig. 11). With applications getting more and more connected and sharing resources (e.g., applications connect with social networking sites such as Facebook, Twitter, or Instagram), the automatic extraction of personality and expertise becomes more feasible. A possible scenario could be:
A user has the music application connected to his Facebook account. Based on his Facebook profile, the application inferred that he is someone open to new experiences. Therefore, the music application adjusts the user interface by emphasizing the mood taxonomy to let the user continue browsing for music. By analyzing his profile (e.g., he filled in artists and bands that he likes) and postings (e.g., posting often that he goes to concerts), the system may infer that he is actively engaged with music. Based on this, the system decides to provide him more categories to choose from within the mood taxonomy.
In the last couple of years, it has been demonstrated that personality information can be extracted from social networking sites (SNSs) like Facebook (e.g., [2, 35, 42, 73, 81]), Twitter (e.g., [41, 75]), and Instagram (e.g., [29,30,31, 36, 65]), or a combination of these. Being able to extract personality traits from SNSs opens up the possibility for (music) applications to adjust their user interface based on our results. For example, when someone appears to be open to new experiences, the mood taxonomy could be emphasized while other taxonomies could be placed more in the background of the interface. In addition, music recommendations could be given based on the mood taxonomy (e.g., music with similar mood expression).
Although recent work has shown that personality can predict music sophistication [25, 44], we believe that the expertise dimensions (i.e., active engagement, perceptual abilities, and emotions) that we used in Study B can also be inferred from the same increased connectedness with SNSs. For example, active engagement can be inferred by extracting information on concert attendance (e.g., Facebook events, SongKick; http://www.songkick.com) as well as purchase behavior (e.g., iTunes store, Amazon; http://www.amazon.com). The “About” section, or the posted activities and status updates in SNSs, can provide cues to infer perceptual abilities. Analyzing postings of an SNS user could give an indication about the emotion expertise dimension (e.g., postings about induced feelings when listening to a song). Also, there seems to be some relationship between factors of the emotion expertise dimension and the openness to experience personality factor. This could serve as an additional indicator. Music applications could anticipate the choice set based on the expertise dimension of the user.
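To make the adaptation idea concrete, the findings of Studies A and B could be turned into simple interface rules. The sketch below is only an illustration under stated assumptions: the `UserProfile` fields are hypothetical binary flags (e.g., obtained by a median split), the fallback to the genre taxonomy reflects that it was chosen most often overall, and the choice-set sizes follow the moderator effects reported for perceived system usefulness. The default for the activity taxonomy, where no effects were found, is an arbitrary assumption.

```python
from dataclasses import dataclass


@dataclass
class UserProfile:
    # Hypothetical binary flags, e.g. derived from a median split.
    high_openness: bool = False
    high_conscientiousness: bool = False
    high_neuroticism: bool = False
    emotion_expert: bool = False
    perceptual_expert: bool = False


def emphasized_taxonomies(profile):
    """Taxonomies to emphasize, following the relationships found in Study A."""
    taxonomies = []
    if profile.high_openness:
        taxonomies.append("mood")
    if profile.high_conscientiousness or profile.high_neuroticism:
        taxonomies.append("activity")
    if profile.high_neuroticism:
        taxonomies.append("genre")
    # Assumption: fall back to genre, the taxonomy chosen most often overall.
    return taxonomies or ["genre"]


def category_count(taxonomy, profile, small=6, large=24):
    """Choice-set size, following the moderator effects found in Study B."""
    if taxonomy == "mood":
        # Emotion experts rated the system as less useful with more choices.
        return small if profile.emotion_expert else large
    if taxonomy == "genre":
        # Perceptual experts rated the system as more useful with more choices.
        return large if profile.perceptual_expert else small
    # Activity: no effects were found; the small set is an arbitrary default.
    return small
```

Active engagement is deliberately left out of `category_count`, because its only moderator effect concerned choice difficulty within the mood taxonomy; an application that weighs choice difficulty more heavily than system evaluation might prioritize it differently.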