Empathy is a personal competence that contributes greatly to the initiation and maintenance of desirable interpersonal relationships. Empathy has received great attention in psychological research and practice, and it was found to be related to desirable moral and social functioning (Romera et al. 2019). Empathy has been defined as an emotional response related to other people’s emotions and situation that is congruent with other people’s emotional states (Eisenberg et al. 1991). Traditionally, empathy is defined by two dimensions: cognitive empathy, described as understanding other people’s emotions, and affective empathy, described as experiencing other people’s emotional states (Davis 1983; Jolliffe and Farrington 2006). Thus, empathy includes both thinking and feeling in relation to another’s emotional state.

Low empathy is an important predictor of antisocial behaviors, and high empathy is an important predictor of prosocial behaviors. Recent meta-analyses showed that perpetrators of bullying (Zych et al. 2019b, c) and perpetrators of cyberbullying (Zych et al. 2019a) scored lower on both cognitive and affective empathy (see Zych et al. 2019b for a review of both meta-analyses). Similarly, other meta-analyses have found that low empathy was related to an increased likelihood of offending (Jolliffe and Farrington 2004; Van Langen et al. 2014).

Regarding prosocial behaviors, meta-analytic findings confirm that children who defend the victims of bullying score high on empathy (Nickerson et al. 2015; Zych et al. 2019b). In addition, in an experimental manipulation, it was found that empathy-induction using an emotional story was related to significantly higher cooperation with other people (Batson and Moran 1999).

Although the causal relationship between empathy and prosocial and antisocial behaviors still needs to be confirmed, increasing empathy is frequently considered to be an important component of interventions designed to prevent or reduce antisocial behaviors such as conduct problems (Durlak et al. 2011). For example, in their meta-analysis of 18 randomized controlled trials of empathy training programs, Teding van Berkhout and Malouff (2016) found that empathy training successfully increased later levels of empathy (Hedges’ g = .51), but this was clearly the case only for select populations (university students and health professionals).

To examine the relation of empathy to theoretically relevant constructs in different cultures, and changes in empathy that may result from empathy-enhancement programs in these cultures, the development of appropriate tools to measure empathy is an essential initial step. The Basic Empathy Scale (BES), designed and validated by Jolliffe and Farrington (2006), is probably the most popular instrument to measure empathy worldwide. According to a recent review focused on the measurement of empathy, up to 2017 (Basto-Pereira and Farrington 2020), this 2006 BES validation study has received almost 800 citations. The original scale consists of 20 items, 11 measuring affective empathy and 9 measuring cognitive empathy. The original instrument, validated with 363 English adolescents, showed good psychometric properties (Jolliffe and Farrington 2006). Examples of studies on the psychometric properties of the BES in different countries are shown in Table 1.

Table 1 Psychometric studies of the Basic Empathy Scale in Different Countries

As displayed in Table 1, most of the studies showed that females scored higher than males on both cognitive and affective empathy. Many studies conducted with children and adolescents found an adequate fit of the original two-factor structure of the BES, including 20 items (Albiero et al. 2009; D’Ambrosio et al. 2009; Cavojova et al. 2012; Jolliffe and Farrington 2006; Pechorro et al. 2015). Nevertheless, most of these studies found fit indices that are commonly considered adequate but not excellent. The recommended cutoff values for indices such as CFI and TLI are close to .95, with errors (e.g., RMSEA) close to .06 (Hu and Bentler 1999). Among the studies that tested the two-factor structure with 20 items, only Albiero et al. (2009) and Pechorro et al. (2015; with adolescent students but not with the incarcerated female subsample) met these strict cutoff criteria. Thus, the two-factor model with 20 items shows good psychometric properties, but it seems possible to find a model with a better fit, although the model fit depends greatly on the sample used in each study.

Several studies tested different factor structures of the BES. Among them, a three-factor model with 20 items was described (Herrera-López et al. 2017). Also, two-factor models with a reduced number of items were found to have a good fit (Geng et al. 2012; Heynen et al. 2016; Pechorro et al. 2015; Sánchez-Pérez et al. 2014). Among these studies, the strict cutoff criteria for an excellent fit (Hu and Bentler 1999) were met for a 12-item version with a two-factor structure that showed the best fit (Heynen et al. 2016) and for a 16-item version with a two-factor structure (Geng et al. 2012). Geng et al. (2012) used a sample aged 9 to 18 years old, an age range similar to the participants of the current study. Thus, it is possible that a shorter version of the BES could be more appropriate in certain settings.

The original version of the BES (Jolliffe and Farrington 2006) includes 12 positively-worded items and eight reverse-worded items. The short version developed by Heynen et al. (2016) with only two reversed items, using a sample of prisoners, showed the best fit indices. Importantly, the authors of this study suggested that some of the reversed items might not have been understood by the participants. Although combining positive and reversed items is common in psychological research, new developments in psychometrics suggest that this might negatively affect the psychometric properties of questionnaires. Suárez-Alvarez et al. (2018) conducted a repeated measures study, in which they administered a self-efficacy questionnaire with positive only, reversed only, and combined-item forms to general population adults. In comparison to the positive only and reversed only forms, the combined-item form with reversed and positive items showed lower discrimination indices, and also had lower reliability and worse fit indices in a confirmatory factor analysis. There seemed to be no differences regarding acquiescence response bias among the three forms. As a result, the authors recommended not using positive and reversed items combined in the same questionnaire.

It is possible that the use of only positively-worded items would improve the psychometric properties of the BES. Moreover, it is important to validate the BES in the geographic areas where research on empathy is at an earlier stage of development. To our knowledge, the BES has not been validated or used in Poland. At the same time, the rates of bullying in Poland are frequently found to be higher than in other countries (Twardowska-Staszek et al. 2018; Zych et al. 2017).

There are only a few measures of empathy that have been validated in Poland. Among them, Kliszcz et al. (2006) validated the Jefferson Scale of Empathy (Hojat et al. 2002). The Jefferson Scale of Empathy is specifically designed to measure empathy of health care providers with their patients. Jankowiak-Siuda et al. (2017) have recently validated the Polish version of the Empathy Quotient (Wakabayashi et al. 2006). This study was conducted with participants aged 15 to 80, and the psychometric properties of the scale for children and adolescents were not specifically tested. Some studies in Poland used an Internet application which induced either affective or cognitive empathy, finding that cognitive empathy was related to defending the victims of cyberbullying (Barlińska et al. 2018). These studies did not include self-report measures of empathy.

The BES is one of the most frequently used measures of empathy in the world, but it has not been validated in Poland. Empathy is an important psychological construct (e.g., Cohen and Strayer 1996), and it is essential to have a measure validated for Polish children and adolescents. This would make it possible to obtain a measure that can be used to evaluate programs to enhance empathy, and to understand if the relation of empathy with prosocial and antisocial behaviors in Poland is similar to the relations among these variables found in other countries. Therefore, the objective of this study is to analyze the psychometric properties of the Polish version of the BES using a broad sample of children and adolescents. The psychometric properties of the questionnaire will be tested using the 20-item version and a short version that includes only the positively-worded items. Given that empathy is related to desirable moral functioning (e.g., Romera et al. 2019), social and emotional competencies and prosocial behaviors (e.g., Nickerson et al. 2015), concurrent validity is tested through correlations with these constructs. It was hypothesized that the Polish version of the BES would have good psychometric properties.

Method

Participants

A total of 1052 students aged 9 to 16 years (M = 12.53, SD = 1.98; 54.4% females) participated in this study. Schools were located in the Lesser Poland geographic area; four of these were in a large city and two were in smaller towns. Students were enrolled in four Primary Schools, Grades 4 to 7 (N = 580) and two Middle Schools, Grades 2 and 3 (N = 472). All the participants were Caucasian with Polish nationality.

Procedure

This was a cross-sectional study conducted using a survey approach. The BES was translated into Polish by the first author (a Polish native speaker), reviewed by the last author (also a Polish native speaker) and back-translated into English by an official translation service. The final versions were compared, and minor disagreements were resolved. The minor disagreements were related to vocabulary that could mostly be interpreted as synonyms.

Schools were selected by convenience sampling through the head teachers who were invited to participate in this study. These schools were then contacted, and all agreed to participate in the survey. Within each classroom, students were informed about the objective of this study by a researcher and asked to fill in the survey. Participation was voluntary and anonymous, and participants had the right to decline or withdraw from the study at any point. Students filled in a pen and paper survey during their regular classroom hours, supervised by the researchers of this project who delivered and collected the questionnaires. None of the students declined to participate or withdrew their consent. The study met all national and international ethical standards, including the Declaration of Helsinki and the data protection regulations.

Instruments

Empathy was measured using the Basic Empathy Scale (Jolliffe and Farrington 2006) translated into Polish. The original and the Polish version of the BES use a 5-point Likert response scale ranging from 1 (totally disagree) to 5 (totally agree). The original scale contains 20 items: 11 items focused on affective empathy and 9 items focused on cognitive empathy. The final Polish version included 12 positively-worded items, with six focused on affective empathy (e.g., feeling sad after being with a friend who was sad) and six focused on cognitive empathy (e.g., understanding a friend’s happiness). The instrument showed good psychometric properties, which are described in the Results section.

Social and emotional competencies were measured using the Social and Emotional Competencies Questionnaire (SEC-Q) by Zych et al. (2018). This instrument (α = .90, in the current sample) contains 16 items with a 5-point Likert response scale ranging from 1 (strongly disagree) to 5 (strongly agree). It includes four factors: Self-awareness (α = .78; e.g., “I am aware of the thoughts that influence my emotions”), Self-motivation and management (α = .77; e.g., “I pursue my objectives despite the difficulties”), Social-awareness and prosocial behavior (α = .79; e.g., “I usually know how to help others who need that”) and Responsible decision making (α = .78; e.g., “I usually consider advantages and disadvantages of each option before I make decisions”). The CFA showed a good fit of the current data to this four-factor model (S/B χ2 = 291.1784; df = 98; p < .001; NFI = .98; NNFI = .98; CFI = .99; RMSEA = .047; 90% CI = .041–.053).

Moral disengagement was measured using The Mechanisms of Moral Disengagement Scale (Bandura et al. 1996). This instrument (α = .93) includes 32 items with a 5-point Likert response scale ranging from 1 (totally disagree) to 5 (totally agree) distributed across 4 domains: Dehumanization (α = .78; 8 items, e.g., “Some people deserve to be treated as animals”), Minimizing consequences (α = .67; 4 items, e.g., “Teasing someone does not really hurt them”), Reconstruction (α = .79; 12 items, e.g., “It is alright to beat someone who bad mouths your family”), and Disconnecting agency (α = .76; 8 items, e.g., “If kids are living under bad conditions they cannot be blamed for behaving aggressively”). The CFA confirmed the four-factor structure, showing an adequate fit of the current data (S/B χ2 = 1879.7569; df = 458; p < .001; NFI = .90; NNFI = .92; CFI = .92; RMSEA = .091; 90% CI = .087–.096).

A short 19-item version was developed for Primary Education after eliminating the items that were difficult to understand for Primary School children, with a 5-point Likert response scale ranging from 1 (totally disagree) to 5 (totally agree) distributed in three domains: Dehumanization (α = .82; 7 items), Minimizing consequences (α = .74; 4 items), and Reconstruction (α = .83; 8 items). The CFA for the three-factor model indicates a good fit of the data (S/B χ2 = 664.4819; df = 149; p < .001; NFI = .97; NNFI = .97; CFI = .98; RMSEA = .085; 90% CI = .079–.092).

Data Analysis

First, a Confirmatory Factor Analysis (CFA) with maximum likelihood, robust method and polychoric correlations (Satorra-Bentler chi-square) was performed with EQS 6.2. This was done with a 20-item model based on the original version of the BES (Jolliffe and Farrington 2006). Model fit was tested taking into account a combination of different indices such as the Normed Fit Index (NFI) (≥.90), the Non-Normed Fit Index (NNFI) (≥.90), the Comparative Fit Index (CFI) (≥.90) and the Root Mean Square Error of Approximation (RMSEA) (≤.08) (Bentler 1990). Factor loadings were examined and items with low factor loadings (lower than .20) were eliminated, and another Confirmatory Factor Analysis was run. The item-total correlation matrix was examined to check if all the items of the scale were correlated in the expected direction. Items with non-significant correlations and correlations in an unexpected direction were eliminated, obtaining a short version of the BES with positively-worded items only. Another Confirmatory Factor Analysis was run and fit indices of all the models were compared to choose the model with the best fit.
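The fit criteria listed above can be written as a simple screening rule. The following is a minimal sketch in Python, not the EQS procedure itself; the function name `meets_cutoffs` and the dictionary layout are assumptions made for illustration, and the example values are the Primary Education indices reported in the Results section.

```python
# Cutoffs as stated in the Data Analysis section (illustrative, not EQS output).
INCREMENTAL_MIN = 0.90  # applies to NFI, NNFI and CFI
RMSEA_MAX = 0.08


def meets_cutoffs(indices):
    """Return True if every supplied fit index satisfies its cutoff.

    `indices` maps index names ("NFI", "NNFI", "CFI", "RMSEA") to values.
    Indices that are not reported for a model can simply be left out.
    """
    for name, value in indices.items():
        if name == "RMSEA":
            if value > RMSEA_MAX:
                return False
        elif value < INCREMENTAL_MIN:
            return False
    return True


# Values reported in the Results section for Primary Education.
original_20_items = {"CFI": 0.69, "NFI": 0.64, "RMSEA": 0.09}
positive_12_items = {"CFI": 0.97, "NFI": 0.96, "RMSEA": 0.07}

print(meets_cutoffs(original_20_items))  # False
print(meets_cutoffs(positive_12_items))  # True
```

Such a rule only screens models against conventional thresholds; the final choice among models in this study also rested on factor loadings and theoretical considerations.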

Descriptive statistics were calculated using the PASW Statistics 20 software. Cronbach’s alphas were calculated for each factor and the total scale. Pearson item-total correlations and interitem correlations were also calculated. Concurrent validity was tested using Pearson correlations among the BES, Moral disengagement and Social and Emotional Competencies. Empathy was expected to be related to low Moral disengagement and high levels of Social and Emotional Competencies. Differences between males and females, and younger versus older participants, were tested using the Student’s t test. To estimate the construct reliability, composite reliability (CR) and average variance extracted (AVE) were computed. The cut-off points used for these indices are usually .70 for CR and .50 for AVE.
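For readers less familiar with these reliability indices, the sketch below shows the standard formulas for Cronbach’s alpha, CR and AVE in Python. It is an illustration only: the analyses in this study were run in PASW and EQS, the function names are assumptions, CR is computed under the usual assumption of uncorrelated errors, and the loadings in the example are hypothetical rather than taken from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def composite_reliability(loadings):
    """CR from standardized loadings, assuming uncorrelated errors."""
    squared_sum = sum(loadings) ** 2
    error_sum = sum(1 - loading ** 2 for loading in loadings)
    return squared_sum / (squared_sum + error_sum)


def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)


# Hypothetical standardized loadings for a six-item factor (not study data).
loadings = [0.7] * 6
print(round(composite_reliability(loadings), 2))       # 0.85
print(round(average_variance_extracted(loadings), 2))  # 0.49
```

The hypothetical example illustrates that CR can exceed the .70 cut-off even when AVE falls slightly below .50, because CR increases with the number of items whereas AVE depends only on the average squared loading.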

Results

Factor Structure and Items of the BES in Polish Children and Adolescents

A confirmatory factor analysis using the original 20-item two-factor structure of the BES showed a poor fit to the data for Primary Education participants (S/B chi-square = 794.52, df = 169, p < .01, CFI = .69, NFI = .64, RMSEA = .09) and for Middle Education participants (S/B chi-square = 1528.12, df = 169, p < .01, CFI = .66, NFI = .63, RMSEA = .14). Thus, other models were tested to find a better fit to the data.

Factor loadings of items 1, 6 and 13 (all negatively-worded) were below .20 in both Primary and Middle Education subsamples. Model fit improved after eliminating these items for Primary Education participants (S/B chi-square = 777.68, df = 118, p < .01, CFI = .88, NFI = .86, RMSEA = .11) and for Middle Education participants (S/B chi-square = 972.77, df = 118, p < .01, CFI = .76, NFI = .74, RMSEA = .13). Nevertheless, the model fit was still poor.

The item-total correlation matrix and interitem correlation analyses raised further concerns regarding the negatively-worded items. Item 7 (negative) had nonsignificant correlations with items 9 and 10; item 8 (negative) had nonsignificant correlations with items 4, 9 and 11; item 19 (negative) had a nonsignificant correlation with item 17; and item 20 (negative) had nonsignificant correlations with items 4, 5, 11, 15 and 17. In the Middle Education sample, all the negatively-worded items had loadings below .40. In the Primary Education sample, negatively-worded items such as 7, 8, and 20 had loadings below .40. Thus, a model without the negatively-worded items was tested.

Alternatively, a 12-item model without the negatively-worded items showed a good fit to data in Primary Education (S/B chi-square = 177.14, df = 53, p < .01, CFI = .97, NFI = .96, RMSEA = .07) and in Middle Education (S/B chi-square = 203.02, df = 53, p < .01, CFI = .94, NFI = .93, RMSEA = .08). In this case, all the factor loadings were above .40 (see Fig. 1). Thus, based on the fit indices and theoretical basis, the two-factor model with 12 items was considered the best.
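The degrees of freedom reported for the three models follow from the number of items alone. The sketch below reproduces them in Python under stated assumptions that are not spelled out in the text: each item loads on a single factor, errors are uncorrelated, the two factors are allowed to correlate, and each factor is scaled by fixing one parameter (its variance or one loading). The function name is hypothetical.

```python
def cfa_degrees_of_freedom(n_items, n_factors=2):
    """Degrees of freedom of a simple-structure CFA with correlated factors.

    Free parameters: one loading and one error variance per item (after
    fixing one parameter per factor for scaling), plus factor covariances.
    """
    observed_moments = n_items * (n_items + 1) // 2
    factor_covariances = n_factors * (n_factors - 1) // 2
    free_parameters = 2 * n_items + factor_covariances
    return observed_moments - free_parameters


for n_items in (20, 17, 12):
    print(n_items, cfa_degrees_of_freedom(n_items))
# 20 169
# 17 118
# 12 53
```

These values match the df reported for the 20-item, 17-item and 12-item models, which is consistent with the same two-factor specification having been used throughout.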

Fig. 1 Confirmatory Factor Analysis of the Basic Empathy Scale in Poland

Reliability of the BES in Polish Children and Adolescents

The Polish version of the BES had very good Cronbach’s alpha values for affective (Primary α = .75, Middle α = .76), cognitive (Primary α = .84, Middle α = .77) and total empathy (Primary α = .85, Middle α = .84). The Average Variance Extracted in the Primary School sample was .47 and in the Middle School sample was .41. The Composite Reliability in the Primary School sample was .91 and in the Middle School sample was .89.

Empathy in Males and Females in Primary and Middle Education

Table 2 shows that affective, cognitive and total empathy were higher in Primary Education compared to Middle Education. It also shows that affective, cognitive and total empathy were higher for females compared to males. Gender differences were consistent in both Primary and Middle Education. Nevertheless, the lower levels of empathy in Middle compared to Primary Education were significant only for males. Thus, older males had less empathy than younger males, but there were no age-related differences for females.

Table 2 Gender and Primary versus Middle Education Differences in Affective and Cognitive Empathy

Relations among Empathy, Moral Disengagement and Social and Emotional Competencies

Concurrent validity was tested by checking if empathy was related to social and emotional competencies and to moral disengagement. These relations were statistically significant and were in the expected direction, showing that empathy was positively related to high social and emotional competencies and low moral disengagement (see Table 3). The strongest relations were found between high affective empathy and high social awareness and prosociality, and between high cognitive empathy and a high total score in social and emotional competencies.

Table 3 Pearson Correlations between Empathy (final scales), Moral Disengagement and Social and Emotional Competencies

Discussion

Empathy is an important social and emotional skill that is related to low antisocial behavior (Jolliffe and Farrington 2004) and high prosocial behavior (Nickerson et al. 2015). As a result, many social and emotional learning programs target increasing empathy as a way of increasing desirable behavior and social cohesion (Durlak et al. 2011). Nevertheless, research focused on empathy is not equally advanced around the world and there are some geographic areas, such as Poland, where studies on empathy are urgently needed.

The objective of this study was to test the psychometric properties of the BES in Poland using a broad sample of Polish children and adolescents. The BES is one of the most popular measures of empathy in the world (see Basto-Pereira and Farrington 2019 for a review). This measure had not been validated in Poland. To our knowledge, this is the first study that validated a measure of empathy in Polish children and adolescents. Thus, we believe that the current results are useful in advancing knowledge about empathy in an understudied geographic area.

In this study, the original 20-item version of the BES (Jolliffe and Farrington 2006) was tested, but this factor structure did not show a good fit to the data. Based on other studies that used a shorter version (e.g., Geng et al. 2012; Heynen et al. 2016), and on statistical analyses of the items, a 12-item version of the BES was produced. Some studies suggest that the inclusion of both positively-worded items and negatively-worded items in a questionnaire weakens its psychometric properties (Suárez-Alvarez et al. 2018). The final Polish version of the BES, with the best psychometric properties, includes only positively-worded items. Thus, it is possible that the participants had difficulties in understanding the negatively-worded items or the Likert response scale for these items. The Polish 12-item version of the BES showed very good psychometric properties for Primary School children and Middle School adolescents.

Some gender differences were found regarding affective and cognitive empathy in Polish children and adolescents. Females scored higher than males in Primary and Middle school subsamples, in affective and cognitive empathy. Perceived affective and cognitive empathy were stable in females but decreased with age in males. Empathy was found to be related to theoretically similar constructs such as low moral disengagement and high social and emotional competencies. Nevertheless, these relations were stronger in males than in females. Previous research found that females tend to have more advanced perceptions of social and emotional competencies (Zych et al. 2018), and therefore, it is possible that they can distinguish between empathy and similar constructs whereas males treat them all as a single, less complex construct. Future studies should examine these possibilities.

Given that the rates of antisocial behaviors such as bullying in Poland are relatively high (Twardowska-Staszek et al. 2018), programs to decrease these behaviors are urgently needed. Increasing empathy should be a component of these programs, and the current study is especially useful for the evaluation of these programs, which should be specifically adapted to Polish culture. For example, compared to other countries, Poland is a country with a medium-high individualistic culture and medium-high emotional expressivity endorsement (Matsumoto et al. 2008). Thus, it is reasonable to suggest that programs to promote empathy in Poland should be based on both individual and social values, and can use relatively high, but not exaggerated, expressions of empathy, in line with the culture of emotional expressivity in Poland. Future studies could also include cross-national comparisons to study similarities and differences in the relation between empathy and antisocial or prosocial behaviors in Poland and other geographic areas. This could be useful for prevention of and intervention in these behaviors.

This study has several strengths, such as the use of a broadly validated questionnaire with a large sample of Polish children and adolescents. Nevertheless, it also has some limitations. Given that the Polish version of the BES only includes positively-worded items, it could be useful to conduct future studies that control for social desirability or acquiescence response bias. Future research could also use empathy measures that do not use self-reports, for example, other-reports focused on behavioral expressions of empathy. Despite these limitations, the current study is an important step towards filling the gaps in knowledge regarding empathy in Poland.

Given that research on empathy is not equally advanced around the world, and studies in Poland were urgently needed, the current study is an important contribution to the field. The Polish version of the Basic Empathy Scale has good psychometric properties and it can become a very useful tool for researchers and practitioners in psychology in Poland.