1 Introduction

People have long tended to classify their fellow human beings. For example, the temperament theory, which has its roots in antiquity, was developed to differentiate individuals according to their temperaments (Merenda 1987). In modern times, anthropological racial theories as well as personality models can be found (e.g., Banks 1996). In the field of economics, the differentiation of products, markets, and market actors serves to simplify processes and predict certain outcomes. In the field of marketing, it is useful to know whether or not a consumer is likely to purchase a particular product without having to ask the consumer. There are several ways to relate user groups to user behavior. For example, collaborative filtering is used by online commerce websites such as Amazon (“People who bought books about statistics were also interested in econometrics”). This method attempts to infer the future behavior of a consumer from his or her past behavior (Das et al. 2007). Another way is to predict a certain attitude or behavior based on an individual’s personality traits. These could be, for example, personal human values or the Big Five personality traits of openness, conscientiousness, extraversion, agreeableness, and neuroticism (Bilsky and Schwartz 1994; Cieciuch and Schwartz 2017; McCrae and Costa 1997, 1999; Schwartz 2017). This approach posits that, if we know an individual’s personality traits, we can predict his or her behavior to a certain extent (Aral and Walker 2012). The difficult aspect of this is acquiring valid information about a consumer’s personality traits. Since the US presidential election in 2016, there has been growing interest in, and controversy about, whether the US election or the UK’s Brexit decision may have been influenced by personality-driven advertising, so-called micro-targeting.
The responsible company, Cambridge Analytica, claimed to have derived the personality profiles of US citizens from their digital footprints, especially their Facebook likes. Although this made for a striking media headline, it is unclear whether it is actually possible to determine the Big Five traits with sufficient accuracy using likes alone, or whether the contribution of likes to users’ profiles is sufficient. In marketing, and especially in election advertising, it has been common practice for many years to aggregate and use commercial demographic data to address individuals in a targeted way. Today, it should be clear that, even in the infamous cases of targeting with psychometric data in recent years, which created media scandals due in part to a lack of understanding, the Big Five were not the only decisive factor and Facebook likes were not the only data used to calculate them (González 2017). However, if individuals in a narrow target group are very similar in their socio-economic variables, it could well be that their individual personal dispositions play the decisive role. Such a narrow target group exists, for example, among owners of photovoltaic (PV) systems who have to decide whether or not to buy an electricity storage system. Jacksohn et al. (2019) found that, e.g., age, income, household size, and education level differed significantly between adopters and non-adopters of PV systems. The target group of the present study, which consists only of PV adopters, is therefore very similar in these parameters, and it can be suspected that individual personal dispositions play a greater role in differentiating within this group.
Although some literature examines the influence of personality on energy efficiency investments (Busic-Sontic and Brick 2018; Poier 2021), it has never been investigated whether the digital footprint of users, and the personality traits inferred from it, also contribute to explaining the adoption of electricity storage among owners of PV systems.

The aim of the present article is to narrow this research gap and to investigate whether the personality traits of PV users, predicted from their Facebook likes, are suitable for distinguishing between adopters and non-adopters of electricity storage in this target group. This contributes not only to the understanding of consumer behavior but also to the usefulness of data mining in social networks for consumer research. In a first step, it is tested whether the predictions match the users’ self-assessments. In a second step, it is examined whether PV system owners can be differentiated into adopters and non-adopters of electricity storage based on their personality traits derived from Facebook likes. The remainder of this article is organized as follows: a review of the literature is provided in chapter 2, followed by an explanation of the methodology, as well as the hypothesis formulation in chapter 3. Data collection and preparation are described in chapter 4. After that, the results are presented in chapter 5. This is followed by a discussion of the results and the limitations of the study in chapter 6. After an outlook on further research, the article ends with the conclusions.

2 Research background

Personality traits are a psychological construct used to describe individuals. Assuming a certain stability, this could be useful for describing or even predicting human behavior and, for marketers in particular, purchase behavior. In the scientific literature, a number of definitions of personality traits can be found. DeYoung (2015), for example, described them as “probabilistic descriptions of relatively stable patterns of emotion, motivation, cognition, and behavior, in response to classes of stimuli that have been present in human cultures over evolutionary time.” Following John et al. (2010) and Valchev et al. (2013), they are habitual patterns of behavior, thought, and emotion that are stable over time and in comparable situations. What all definitions of personality traits share is “the emphasis on the relative consistency of behavioral predispositions to behave in a particular manner across situations” (Fischer 2018).

Over the last three decades, researchers have developed several frameworks to describe the personalities of individuals using descriptive terms for patterns of behavior with different numbers of dimensions. Eysenck, for example, introduced his PEN model consisting of three elements: psychoticism, extraversion, and neuroticism; this later formed the basis for Costa and McCrae’s NEO personality inventory (Barrett et al. 1998; Parish et al. 1965). In the early 2000s, Ashton and Lee built on the research of Costa and McCrae (2008) and Goldberg (1993) and introduced Honesty-Humility as an additional factor to the five existing traits (Ashton et al. 2004; Ashton and Lee 2007). This six-factor model is known as the HEXACO model, derived from the initial letters of the factors Honesty-Humility, Emotionality, Extraversion, Agreeableness, Conscientiousness, and Openness to Experience (Ashton and Lee 2009). Although models with more or fewer than five factors exist, there is a broad consensus in the scientific literature that five-factor models make the greatest explanatory contribution. Thus, the most often used and best-known models in contemporary research comprise five personality traits or personality factors. They are known as five-factor models (FFM) or the Big Five (Goldberg et al. 2006; McCrae and Costa 1999; McCrae and John 1992). Costa and McCrae first identified neuroticism, extraversion, and openness to experience as three of 16 factors (Costa and McCrae 1976). Some years later, they added agreeableness and conscientiousness to the model, which, after several improvements, became known as the NEO-Personality Inventory Revised (NEO PI-R) (Costa and McCrae 2008).
The five traits can be measured with a number of inventories such as the original 44-item Big Five Inventory (BFI) (Benet-Martínez and John 1998), the revised 60-item version BFI-2 (Soto and John 2017), the 60-item NEO-FFI (McCrae and Costa 2004), and the 240-item NEO-PI-R (Costa 1996; Costa and McCrae 2008). Table 1 shows the five factors, each one comprising six facets.

Table 1 Big Five personality traits according to the NEO-FFI.

Numerous studies demonstrate the contribution of personality traits to behavior (Busic-Sontic and Brick 2018; Danielsbacka et al. 2019; Poier 2021; Rozgonjuk et al. 2021; Zhang et al. 2021). A recent study of Chinese students found that their information-seeking behavior depended significantly on their personality traits. Among other things, information seeking can reduce perceived risk in purchasing, a core construct of buyer behavior (Zhang et al. 2021). Thus, there is evidence of a contribution of personality traits to consumer behavior. Busic-Sontic and Brick (2018) and Poier (2021) investigated the direct and indirect effects of the Big Five on energy efficiency installations and photovoltaic adoption, respectively. In both studies, the effects were weak, but the Big Five were also set against very heterogeneous and significant socio-demographic control variables.

The possibility of drawing conclusions about the personality traits of users of social media platforms, especially Facebook, from their profiles has been examined and confirmed in several studies (Kosinski et al. 2016; Kosinski 2021; Marengo et al. 2020; Marouf et al. 2020b; Segalin et al. 2017; Youyou et al. 2015). One of the most popular articles in recent years has been that of Youyou et al. (2015). In their study, they looked at inferring the personality of users from their Facebook profiles and found that a user with more than 10 likes can be described better by his or her Facebook profile than by work colleagues and that more than 300 likes can describe the user better than his or her spouse. Segalin et al. (2017) were able to draw conclusions about the personality traits of Facebook users from their profile photographs. In contrast, Marengo et al. (2020) examined differences in personality traits between users of different social media platforms and between users and non-users. They found that, above all, the extraversion of social media users was significantly higher than that of non-users.

3 Methodology and hypotheses

The aim of this study is to explore whether consumers’ digital footprints—their Facebook likes, in particular—are suitable for predicting their purchase probability of a solar electricity storage system in Germany. Based on the literature introduced in chapter 2, the research question derived from this is as follows:

Is it possible to make a prediction about an owner of a photovoltaic system’s adoption of an electricity storage system using only the predicted Big Five personality profile derived from Facebook likes?

To answer this question, two hypotheses were tested:

  • H1: The predicted Big Five personality traits resulting from Facebook likes are significantly equivalent to the Big Five personality traits that emerge from self-reports.

  • H2: The Big Five personality traits between adopters and non-adopters of electricity storage systems are significantly different.

These considerations are based on the assumption that the sum of the users' activities reflects their online behavior, from which in turn their personality traits can be derived (Kosinski et al. 2016; Marouf et al. 2020a; Youyou et al. 2015). In this study, an online prediction application programming interface (API) provided by the Psychometrics Centre of the University of Cambridge is used as the basis for data processing (Popov et al. 2015). The developers of the API collected data about the participants’ personality traits and their Facebook likes and calculated correlations between their Facebook usage behaviors and personalities that could also be used the other way around—that is, to predict behavior based on personality (Kosinski et al. 2013).

For the first hypothesis, predictions provided by Apply Magic Sauce (AMS) (www.applymagicsauce.com) were used as the source of comparison. AMS is an online prediction service provided free of charge for academic purposes by the Psychometrics Centre of the University of Cambridge (Kosinski et al. 2019). It uses data from the myPersonality project (www.myPersonality.org), a Facebook app that was active from 2007 until 2012 and used by approximately 6 million users. About 30–40% of the participants donated their Facebook data voluntarily. To relate the psychological assessment to the Facebook pages liked by the participants, personality predictions are based on opt-in data from 260,000 participants who completed the 100-item International Personality Item Pool (IPIP) questionnaire in English (Popov et al. 2015; Stillwell and Kosinski 2019). The app was banned by Facebook in 2019, although it had not been active since 2012. The myPersonality dataset has since been withdrawn following several concerns regarding data protection. Thus, it is no longer possible to derive raw scores from the AMS results. AMS reports its estimates not as absolute scores but as percentiles. Both self-reports and predictions were therefore converted into z-scores in advance in order to achieve a common base for t tests.

4 Data

Data were collected through an online survey between April and June 2019. The questionnaire contained items regarding household and personal demographics, technical features of the PV system, and information about a possibly existing battery. In addition, it included two question batteries about psychological traits. The first block, concerning the Big Five personality traits, was mandatory and comprised 16 items that were taken directly from the SOEP questionnaire (Goebel et al. 2019). Prior to the online questionnaire, participants declared themselves to be of legal age and to be taking part voluntarily. After being provided with detailed information about data protection and the scientific use of personal data, all participants gave their written consent to the use of their data and information about the privacy policy associated with the survey. Facebook carefully reviewed the app and, finally, after a few months of coordination and negotiation, allowed its use for scientific purposes and activated the app. This Facebook app is the key element of the present study (Fig. 1). It enabled the data exchange between Facebook and AMS for data processing and the calculation of the predictions.

Fig. 1 Login to the like-exchange app. Note: Figure presents the login screen to the Facebook app. The user must consent in advance to his likes being accessed. After proceeding, the user’s likes were transferred to the AMS-API and processed. The resulting Big Five traits were re-transferred as percentiles.

Because the target group comprised Facebook users who were also owners of a PV system, the study was advertised directly on Facebook and addressed individuals who were interested in solar power, photovoltaics, renewable energies, and related topics. In addition, a call to participate in the study was posted in relevant groups with a total of about 26,566 members. During the survey period, the Europe-wide introduction of the General Data Protection Regulation was omnipresent, and at the same time several data-related scandals in connection with the Facebook platform became public. The result was an unexpectedly low participation rate; a technology-savvy target group had been expected to be more active on a platform of which its members are users. The reactions to the advertising and postings about the study were largely characterized by hostile rejection, including insults and insinuations. These, too, were unexpected because the target group should have comprised higher-earning and better-educated people, and a higher share of married couples (Table 12 in the Appendix). It was also unclear where the individuals who reacted with hostility to the postings came from since some of them did not belong to the target group. At the end of the questionnaire, the participants could compare their predicted personality profile with their self-assessment using a graphical compilation. This was intended to create an incentive to provide honest answers in order to receive a reasonable self-assessment. In addition, vouchers for an online department store were raffled among all participants. The ads reached about 55,448 Facebook users but with limited success. All in all, 3509 individuals visited the starting page of the survey, 213 (6.1%) of whom claimed to own a PV system, which was the basic criterion for participation in the study.
Of the 66 cases where participants were able to connect their Facebook account with the app (339 likes, on average), 43 could be used for a prediction because they had enough likes (453 likes, on average).

The German Socio-Economic Panel (SOEP) is a representative, nationwide survey across nearly 15,000 private households (Goebel et al. 2019; Liebig et al. 2019). In this wide-range longitudinal study, more than 25,000 respondents are interviewed year by year. The survey started in 1984, and the most recent data represent wave 35 from 2018 (Liebig et al. 2019). In addition to the questions that are components of every wave of the survey, there are also special topics that flow into the investigation. Among many other topics, the SOEP includes variables about the Big Five personality traits and other psychological items. The data can be retrieved from the German Institute for Economic Research (Deutsches Institut für Wirtschaftsforschung, DIW) at no cost and are reserved exclusively for academic use and for registered researchers. In the years 2005, 2009, 2013, and 2017, a self-completion questionnaire on the Big Five personality traits was part of the SOEP study (DIW Berlin 2007). A short version of the Big Five Inventory was used, called Big Five Inventory Short (BFI-S), with 16 questions. Before the BFI-S was added to the SOEP panel, its external validity was tested and it was considered to be sufficient for capturing users’ personality traits (Dehne and Schupp 2007). The internal consistency of the scales was determined by the reliability coefficient Cronbach's alpha (α). Although all values were below the recommended measure of 0.7, Dehne and Schupp argue that the low values are caused by the small number of items and that the mean inter-item correlation of the scales provides good results. Cronbach's alpha thus indicates how well the individual items are represented by the scale. The more items are used (the longer the measuring instrument), the better the α-values. However, for many participants, the inclination to answer decreases if too many questions are asked.
Thus, some researchers note the low reliability of such short scales as in the SOEP or the British Household Panel (Smith et al. 2021). While most studies concerning personality traits investigate student samples, which result in a bias toward young adults with a higher level of education, the great advantage of nationwide studies is their representativeness.
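
The dependence of Cronbach's alpha on scale length can be made concrete with a short sketch. The Python functions below are illustrative only and are not part of the original analysis; the function names and the example mean inter-item correlation of 0.3 are assumptions chosen for demonstration.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one scale.

    responses: list of respondents, each a list with one score per item.
    """
    k = len(responses[0])                     # number of items in the scale
    item_columns = list(zip(*responses))      # transpose: one tuple per item
    sum_item_var = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum_item_var / total_var)


def standardized_alpha(k, mean_inter_item_r):
    """Standardized alpha from item count and mean inter-item correlation."""
    return k * mean_inter_item_r / (1 + (k - 1) * mean_inter_item_r)


if __name__ == "__main__":
    # Hypothetical value: the same mean inter-item correlation, longer scales.
    for k in (3, 6, 12):
        print(k, round(standardized_alpha(k, 0.3), 3))
```

With the mean inter-item correlation held fixed at the assumed 0.3, the standardized alpha is about 0.56 for three items and about 0.84 for twelve, which is consistent with the argument that a short scale can fall below 0.7 even when its items correlate reasonably well.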

4.1 Construction of the working sample

After deletion of all cases where the requirement of a PV system was not answered positively, 159 cases remained. Of these, 16 cases were excluded from the survey because of obviously incorrect answers. Thus, the working sample consists of 143 PV users (mean age 44.3, 18.0% female), of whom there are 74 owners and 69 non-owners of a battery storage system. Of the 61 participants who managed to connect their Facebook profile to the app, there were 39 who had enough likes for a prediction of personality traits; among them, there were 20 owners and 19 non-owners of a battery storage system. Table 2 gives an overview of the self-assessments.

Table 2 Big Five personality traits of the 2019 self-reports

4.2 Construction of the control sample

For comparison, data from the German Socio-Economic Panel (SOEP) were used. In 2015 and 2016, the questionnaire included an item that asked if the household owned a PV system. A total of 13,083 individuals answered this question “yes” or “no.” For individuals who took part in both years, only the results of the second administration were left in the dataset (number (n) = 7286, 59.7% male, mean age 56.24). In survey year 2017 (n = 32,485, 51.4% female, mean age 45.98), a 16-item question battery was used to investigate the Big Five personality traits of the participants. For every trait, a score was calculated when at least one trait-related question was answered (Table 3).

Table 3 Big Five personality traits in the 2017 SOEP study

The Big Five scores were added to the PV dataset. The deviation of the total standardized scores from zero could be due to the fact that the majority of participants who answered the question regarding PV ownership were homeowners and were neither very young nor very old people. It is noteworthy, then, that all traits, except for conscientiousness, score below the sample mean of the SOEP study 2017. One reason could be the higher proportion of males and the significantly higher mean age.

5 Econometric analysis

For the prediction of personality traits through Facebook likes, a t test with repeated measures was used because the same individuals were assessed first by the AMS app and second through a self-report questionnaire. To evaluate whether battery owners and non-owners could be distinguished, an independent t test was conducted. The independence of the measurements was fulfilled because two different groups of individuals were assessed concurrently. The same applies to the comparison between self-reported scores and the data from the 2017 SOEP study.
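
The difference between the two test designs can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually run for this study: the function names are hypothetical, and the independent-samples version assumes equal variances (pooled Student's t), which the text does not state explicitly.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(first, second):
    """t statistic and degrees of freedom for repeated measures.

    Each position in `first` and `second` belongs to the same individual,
    e.g., a predicted and a self-reported z-score.
    """
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two groups.

    Assumption: equal variances in both groups (Student's t test).
    """
    n_a, n_b = len(group_a), len(group_b)
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, n_a + n_b - 2
```

The paired version works on within-person differences, so stable differences between individuals cancel out; the independent version compares two group means against the pooled within-group spread.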

5.1 Predictability of self-assessments through Facebook likes

The self-reported personality traits were supposed to be predicted through the Facebook like estimates. Thus, self-reported scores and Facebook predictions should be significantly similar for a participant’s Big Five personality traits and, in addition, both values should correlate positively. Because the AMS app provides percentiles while self-reports are given as absolute scores, both were computed into z-scores for comparison (Table 4). Z-scores (z) are a standardized measure to compare scores in terms of standard deviations. Regarding the self-assessments, the z-scores were calculated from raw values (x) by subtracting the mean (μ) from each raw value and then dividing by the standard deviation (σ):

$$z = \frac{x - \mu}{\sigma}$$

Table 4 Statistics of Z-scores

In this study, the mean and standard deviation of the 2017 SOEP data were used. To derive the z-scores from the percentiles, the SPSS inverse distribution function IDF.NORMAL was used with a mean of 0 and a standard deviation of 1. After these calculations, the distributions had not been altered. The z-scores were compared via paired t tests for each Big Five trait. For extraversion (p = 0.081) and neuroticism (p = 0.530), the null hypothesis of mean level equality could not be rejected, while the correlation was positive only for extraversion (r = 0.318) and agreeableness (r = 0.420), which is a necessary condition for predictability (Table 5). Lambiotte and Kosinski (2014) noted that a typical correlation was between r = 0.2 and r = 0.4. As an intermediate result, only for extraversion did AMS predictions and self-reports both show no significant mean difference and correlate positively. In other words, only for this trait did the Facebook likes predict the users’ self-assessments sufficiently at the 95% confidence level. Figure 2 provides three important comparisons: (1) A comparison between self-assessments from the SOEP study (blue) and all self-assessments from the present investigation (green) revealed the greatest mean deviations in agreeableness and conscientiousness. (2) When all self-assessments from the present investigation (green) and only the self-assessments from individuals whose Facebook profiles could be evaluated (orange) were compared, the largest mean deviations were found in openness and extraversion. (3) A comparison of self-assessments of individuals whose Facebook profiles could be evaluated (orange) and their predictions (yellow) showed that users rate themselves as considerably more open and extroverted but less agreeable and conscientious than their Facebook likes predicted.
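
The two conversions to z-scores can be illustrated with the Python standard library. This is a sketch rather than the SPSS procedure itself: `NormalDist.inv_cdf` is used here as a stand-in for the inverse normal distribution function, the function names are hypothetical, and percentiles are assumed to be expressed as fractions strictly between 0 and 1.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def raw_to_z(x, mu, sigma):
    """z-score of a raw self-report value given a reference mean and SD."""
    return (x - mu) / sigma


def percentile_to_z(p):
    """z-score of a predicted percentile via the inverse normal CDF.

    Assumption: p is a fraction with 0 < p < 1; inv_cdf raises
    StatisticsError for values outside that open interval.
    """
    return STANDARD_NORMAL.inv_cdf(p)


if __name__ == "__main__":
    print(raw_to_z(5.0, mu=4.0, sigma=2.0))   # one half SD above the mean
    print(percentile_to_z(0.5))               # the median maps to z = 0
```

If a prediction service reports percentiles on a 0 to 100 scale, they would need to be divided by 100 first, and values of exactly 0 or 100 would have to be handled separately.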

Table 5 Mean level comparison of self-reports and predictions
Fig. 2 Self-assessments and predictions compared to SOEP data. Note: Figure presents mean scores of the Big Five personality trait z-scores for photovoltaic adopters from the SOEP study (n = 425, blue), all self-reports from the present study (n = 139, green), self-reports from the present study with predictions (n = 39, orange), predictions from AMS (n = 39, yellow); n = number of individuals.

5.2 Distinguishability between user groups

Following the comparison of self-assessment and prediction, the next question was whether owners and non-owners of batteries differ significantly. For this purpose, a t test for independent samples was first conducted to test the group means of the predicted scores for differences. None of the p-values was significant, and thus the null hypothesis of equality of means could not be rejected (Table 6). As a result, the groups could not be distinguished by their mean values. To verify whether this lack of distinctness affected only the predicted values, the self-reported scores were also checked. A second t test was conducted, and again the p-values revealed no significant differences between the two groups. Owners and non-owners of batteries could not be distinguished by either the AMS predictions or the self-assessments of the Big Five personality traits. Thus, the Big Five alone were not suitable for determining group membership.

Table 6 Mean levels of battery adopters and non-adopters

A linear discriminant analysis (LDA) was used to examine whether the Big Five personality traits discriminate between the two user groups and whether enrichment with further variables enables differentiation. This procedure is common in the finance and insurance industry, where it is used to assess whether a consumer is creditworthy or not, depending on several predictor variables. The discriminant analysis was first conducted with only the predicted Big Five traits; in a second step, demographic variables were added; and in a third step, risk preferences and risk perceptions completed the set of independent variables. The first analysis gave an eigenvalue of only 0.090 with a canonical correlation of 0.288 and Wilks’ lambda of 0.917 (p = 0.703). Thus, the whole model was not significant. Neuroticism and openness could be determinants of battery ownership, and conscientiousness, extraversion, and agreeableness were possible predictors of non-ownership. The model was able to classify 51.3% of the cases correctly, which was marginally more than chance. When demographic variables like age, gender, education, and family status were added, the eigenvalue increased to 0.256 with a canonical correlation of 0.452 and Wilks’ lambda of 0.796 (p = 0.642). The model could classify 67.5% correctly but was still not significant. When the number of persons in the household and the household income and expenses were added, the model could classify 73.0% correctly but was still not significant. When risk propensity and risk perceptions in several domains were added, the eigenvalue increased to 2.787 with a canonical correlation of 0.858 and Wilks’ lambda of 0.264 (p = 0.022). Neuroticism and openness were still determinants of battery ownership. The model was able to classify 91.4% of the cases correctly. As a result, the Big Five personality traits contributed only to a small degree to the differentiation between the user groups.
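
With two groups there is a single discriminant function, so the three reported statistics are tied together: Wilks' lambda equals 1 / (1 + eigenvalue), and the canonical correlation equals the square root of eigenvalue / (1 + eigenvalue). The short Python sketch below, with hypothetical function names, recomputes both from the eigenvalues reported above as a consistency check; it does not reproduce the discriminant analysis itself.

```python
from math import sqrt


def wilks_lambda(eigenvalue):
    """Wilks' lambda for a single discriminant function."""
    return 1.0 / (1.0 + eigenvalue)


def canonical_correlation(eigenvalue):
    """Canonical correlation for a single discriminant function."""
    return sqrt(eigenvalue / (1.0 + eigenvalue))


if __name__ == "__main__":
    # Eigenvalues of the three nested models as reported in the text.
    for eigenvalue in (0.090, 0.256, 2.787):
        print(eigenvalue,
              round(wilks_lambda(eigenvalue), 3),
              round(canonical_correlation(eigenvalue), 3))
```

Small deviations in the third decimal of the canonical correlation are to be expected because the eigenvalues entered here are already rounded.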
Instead of using latent variables, i.e., the Big Five personality traits, as predictors, the like-IDs of the Facebook pages could be used as input for the discriminant analysis. A user-like matrix was created from a total of 19,335 different pages related to 61 individuals (mean number of likes = 335, SD = 542.4, min = 2, max = 3495), 29 of whom were battery owners and 32 of whom were not. A page liked by an individual was coded as 1, and any other page as 0. Both were equally weighted. To reduce complexity, the matrix was trimmed to users with at least 20 liked pages and pages liked by at least 2 users (see also Kosinski et al. 2016). The result was a matrix of 1846 pages by 43 users (4 users had to be deleted because of incorrect answers) with 79,378 cells. Although discriminant analysis actually requires continuous variables, it can also be conducted with 1/0 coded binary independent variables. According to the central limit theorem, a normal distribution of the independent variables could be assumed for a sample larger than 30. A step-wise LDA was conducted to explore the contribution of the likes to the ownership of a solar battery. Since ultimately only 16 variables were included in the equation, the condition that there should be more cases than parameters was also fulfilled. It was striking that, among the top 20 most-liked pages of battery owners and non-owners, 5 and 9, respectively, were comedy or satirical entertainment pages. The most popular fan page among PV owners was “Der Postillon,” a satirical news website. Pages with the most discriminant properties are listed in Table 7. The most important page for non-owners was “DFB Frauen,” a fan page for the German women’s soccer team. The page with the highest selectivity for owners was “Sonnen Batterie,” a manufacturer of solar batteries.
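
The construction and trimming of the user-like matrix can be sketched as follows. The function names and the input format (a dictionary from user ID to a set of page IDs) are assumptions for illustration. The sketch also assumes that trimming is repeated until both thresholds hold simultaneously, since removing rare pages can push a user below the minimum number of likes; the text does not say whether the original trimming was iterative.

```python
from collections import Counter


def trim_likes(user_likes, min_likes_per_user=20, min_users_per_page=2):
    """Drop rare pages and low-activity users until both thresholds hold.

    user_likes: dict mapping a user ID to the set of page IDs that user liked.
    """
    likes = {user: set(pages) for user, pages in user_likes.items()}
    while True:
        page_counts = Counter(page for pages in likes.values() for page in pages)
        kept_pages = {p for p, c in page_counts.items() if c >= min_users_per_page}
        trimmed = {user: pages & kept_pages for user, pages in likes.items()}
        trimmed = {user: pages for user, pages in trimmed.items()
                   if len(pages) >= min_likes_per_user}
        if trimmed == likes:          # nothing changed: thresholds are met
            return trimmed
        likes = trimmed


def to_binary_matrix(likes):
    """Return (users, pages, rows) with rows[i][j] = 1 if user i liked page j."""
    users = sorted(likes)
    pages = sorted(set().union(*likes.values()))
    rows = [[1 if page in likes[user] else 0 for page in pages] for user in users]
    return users, pages, rows
```

The resulting rows can serve as binary predictor variables, one column per page, with group membership (battery owner or not) held in a separate vector.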

Table 7 Discriminant coefficients of Facebook likes

The Facebook likes could classify 100.0% of all 43 cases correctly with an eigenvalue of the model of 2,178.473 (canonical correlation r = 1.000) and a Wilks’ lambda of 0.000 (p = 0.000). Even the cross-validated result revealed a 93% correct classification. Thus, LDA was suitable to predict the correct group of PV owners, according to the present data. In contrast, it was not possible to derive a prediction from the Big Five personality traits alone, nor could the Big Five be determined by single likes.

The results of the discriminant analysis could not be substantiated by a logistic regression. For a total of 61 users—29 owners and 32 non-owners of a storage battery—19,335 different pages were regressed on the dependent variable, and none of them provided even the slightest significant results.

6 Discussion

The t tests revealed that the means of the Big Five z-scores were only predicted sufficiently for neuroticism and extraversion. Extraversion and agreeableness had significant positive Pearson’s correlations between self-assessments and predictions. Thus, only extraversion was sufficiently correctly predicted by the Apply Magic Sauce API for all photovoltaic users. Although the alpha reliability of the traits measured was generally low, it was in the acceptable range for extraversion at 0.707. In the data of the SOEP study, good measurement characteristics could be demonstrated for extraversion for the same scale (Smith et al. 2021). The mean deviation for agreeableness with simultaneous positive correlation could indicate, on the one hand, that the predictions do not apply. On the other hand, it could also indicate that Facebook users regularly rate their own agreeableness lower than is actually the case (Table 5). This question should be investigated further. The results for extraversion and conscientiousness support the findings of Marengo et al. (2020). Among other things, they found that the self-assessments of users versus non-users of Facebook did not differ for conscientiousness, while there were significant differences for extraversion. Although extraversion could be predicted correctly by the Facebook likes, the hypothesis that the Big Five personality traits are significantly different between adopters and non-adopters of electricity storage systems failed. Between the two groups, significant differences could not be found for the self-assessments or the predictions (Table 6). Further, it was not possible to distinguish between battery adopters and non-adopters because the variances overlap in large parts for all Big Five traits. This can also be seen in Table 4, where the standard deviations of the users’ self-reports are up to five times higher than the standard deviations of the AMS predictions. 
The most likely cause is the rather small sample size, which leads to strongly varying standard deviations. Apart from personality traits, however, the digital footprint in the form of liked pages could be used to differentiate between user groups. Considering the Big Five alone yielded a prediction of group membership that was not much better than chance. The additional inclusion of demographic characteristics increased the proportion of correct classifications to 93.8%. Regarding the usefulness of Facebook likes as a distinguishing characteristic between adopters and non-adopters, a linear discriminant analysis identified 16 pages that discriminated between adopters and non-adopters of battery storage. One could say that if you are an owner of a PV system and you like CSI: Miami, then you are likely to own a battery system, and if you like football, especially the German women’s team, it is likely that you do not. However, the individual likes could not be clearly assigned to the Big Five.
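That a mean deviation and a positive correlation can occur together, as observed for agreeableness, follows from the fact that the two statistics measure different things: the correlation captures whether the ranking of individuals is preserved, while the t test captures a systematic shift. The minimal sketch below illustrates this with invented z-scores for five hypothetical users. It assumes a paired t test, which may differ from the test variant used in this study, and all names and values are illustrative.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def paired_t(x, y):
    """t statistic of the paired differences y - x (df = n - 1)."""
    diffs = [b - a for a, b in zip(x, y)]
    n = len(diffs)
    md = mean(diffs)
    sd = sqrt(sum((d - md) ** 2 for d in diffs) / (n - 1))
    return md / (sd / sqrt(n))


# Invented z-scores: predictions keep the ranking but are shifted upward.
self_report = [-1.0, -0.5, 0.0, 0.5, 1.0]
predicted = [0.4, 0.5, 0.9, 1.0, 1.3]

T_CRIT_DF4 = 2.776  # two-sided 5% critical value of Student's t, 4 df

r = pearson_r(self_report, predicted)
t = paired_t(self_report, predicted)
print(f"r = {r:.3f}, t = {t:.3f}, means differ: {abs(t) > T_CRIT_DF4}")
```

In this constructed case the predictions track the self-reports closely, yet the means differ significantly, which is the pattern that leaves both interpretations of the agreeableness result open.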

There are, of course, several further limitations. The applicability of linear discriminant analysis (LDA) should be tested with a much larger sample. Although the normality assumption can be regarded as fulfilled by appeal to the central limit theorem, the suggestions of Feldesman (2002) could be taken up. He recommends classification trees as a non-parametric tool for classifying user groups when the assumptions for LDA are not met.
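To indicate what such a non-parametric classifier does, the following sketch implements the basic building block of a classification tree: a single split on one feature, chosen by minimizing the weighted Gini impurity. It is a simplified illustration, not the procedure of Feldesman (2002) and not an analysis of the study data. The feature values and labels are invented, and the function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best single split.

    Candidate thresholds are the midpoints between neighbouring distinct
    feature values; no distributional assumption is made.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_score = None, float("inf")
    for i in range(1, n):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold between identical values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [lab for val, lab in pairs if val <= threshold]
        right = [lab for val, lab in pairs if val > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Invented example: one numeric characteristic, 1 = adopter, 0 = non-adopter.
feature = [25, 30, 35, 40, 45, 50, 55, 60]
adopter = [0, 0, 0, 0, 1, 0, 1, 1]

threshold, impurity = best_split(feature, adopter)
print(f"split at {threshold}, weighted Gini impurity {impurity:.4f}")
```

A full classification tree applies this search recursively to each resulting subgroup and over all available features. With a sample as small as the present one, the depth of such a tree would have to be restricted to avoid overfitting.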

The database of the Apply Magic Sauce API dates from 2012. This means that Facebook pages created later could not be used to estimate the Big Five personality traits. Furthermore, the API users come from all around the world, but mostly from the US, and include a large proportion of younger people. While this does not necessarily mean that the predictions are incorrect for German users, the pages that are suitable for a prediction relate mostly to the interests of American users. As a result, predictions can be made for a smaller share of AMS API users outside the US.

The sample’s personality traits are biased toward higher openness and extraversion scores and lower conscientiousness and agreeableness scores. This is likely because only individuals who are very open-minded toward new technologies and experiences are (a) members of a social network and (b) willing to take part in a survey that analyzes their personality, at a time when data protection issues and the dangers associated with using American online services were widely discussed in Europe.

More research with larger sample sizes is needed to infer self-reported Big Five scores from users’ digital footprints (e.g., liked pages) and thus to build a study-specific database, or to establish whether there really is no significant difference between the personality traits of battery adopters and non-adopters. Unfortunately, increased consumer awareness of data protection issues has severely limited the acceptance of empirical research in the online sector. Furthermore, hardly any company would risk making personal user data available for scientific purposes.

7 Conclusions

This research aimed to investigate whether the Facebook likes of owners of PV systems were suitable for assessing whether they own an electricity storage system. Although, according to Youyou et al. (2015), analysis of the digital footprint is well suited to predicting the Big Five personality traits, a satisfactory prediction of the mean value was obtained only for extraversion. Agreeableness showed a positive correlation, but the predictions differed from the self-assessments. The second hypothesis, that significant differences exist between adopters and non-adopters of battery storage, could not be confirmed.

Although the results did not correspond to the hypotheses, this study provides suggestions for further research in this area. Reliable results require, above all, larger samples and comparable data without having to take a detour via z-scores. For example, a suitable data source is the German core energy market data register (Marktstammdatenregister), which records all solar power generators in Germany. However, the European General Data Protection Regulation (GDPR) sets high hurdles for the usability of the data for scientific studies, especially in connection with social network analysis. Additionally, further research should be based on detailed scales rather than ultra-short scales. This would also allow an in-depth investigation using structural equation modeling.