Introduction

The NEO Inventories were developed by Paul T. Costa, Jr. and Robert R. McCrae. Because it assessed Neuroticism, Extraversion, and Openness to experience, its original version, developed in 1978, is known as the NEO inventory (NEO-I). The NEO-I measured only three of the Big Five personality traits [1] and was subsequently revised in 1985 to include all five traits under the new title ‘NEO Personality Inventory (NEO-PI).’ It was further refined as the NEO-PI-R [2]. Its latest version is the NEO-PI-3 [3].

The NEO-PI-3 includes 240 items corresponding to the Big Five personality traits (Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Openness to experience) and their subordinate dimensions (facets). It is suitable for use with adolescents and adults (12 years or older). Item responses are made on a five-point scale, ranging from ‘strongly disagree’ to ‘strongly agree’. Electronic and print forms of the inventories are available. Administration of the full version of the NEO-PI-3 takes between 30 and 40 min. The assessment should not be scored if more than 40 items are missing.

The aim of the current study is to validate the Greek translation of the NEO-PI-3 in the general Greek population.

Material and methods

The study sample included 734 subjects from the general Greek population (436 females, 59.4%; 298 males, 40.6%). Their mean age was 40.80 ± 11.48 years (range 25–67 years): 39.43 ± 10.87 years (range 25–65 years) for females and 42.82 ± 12.06 years (range 25–67 years) for males.

The NEO-PI-3 was translated into Greek by KNF and back-translated into English by two other authors (MS and KM). The originators of the instrument and KNF verified the accuracy of the translation and its conformity to the original version. Discrepancies were discussed until an agreement was reached. This final version was then refined to ensure it is easily understandable.

Statistical analysis

All data were coded and analyzed using the Statistical Package for Social Sciences (SPSS) version 20 (SPSS Inc., Chicago, IL, USA). All tests were two-tailed. According to a Bayesian interpretation, the chance of replication in future studies is low for p values between 0.05 and 0.01, moderate for p values between 0.01 and 0.001, and high for p < 0.001 [4].

First, descriptive statistics (means, standard deviations, and frequency tables) were calculated for the items and subscales proposed by Costa and McCrae [5]. Second, to study the structure of the NEO-PI-3, a confirmatory factor analysis (CFA) was conducted at the facet level (see below); a targeted rotation of principal components was also performed and evaluated using congruence coefficients with the American normative sample.

Scale reliability was measured by Cronbach’s alpha. For group comparisons, reliability values of 0.7 are considered satisfactory, while subscale values of approximately 0.6 are considered acceptable [6]. However, it has been argued that internal consistency is less important than retest reliability [7].
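As an illustration of the reliability index described above, the following is a minimal Python sketch of Cronbach’s alpha computed from a respondents-by-items score table. The function name and the toy data are assumptions introduced here for illustration; the study itself computed alpha in SPSS.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Sample variance of each item across respondents
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of the summed scale score
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 3 respondents answering 2 items
toy_scores = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(toy_scores)
```

With real facet data, each row would hold one respondent’s answers to the items of a single facet, after reverse-keyed items have been recoded.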

The Pearson product–moment correlation was used to assess the correlation between variables. This method was chosen for its robustness with regard to normality assumptions and for its simple interpretability. For Pearson’s r, the suggested thresholds for effect sizes were r = 0.10 for a small effect, r = 0.24 for a medium effect, and r = 0.37 for a large effect [8].
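The thresholds above can be applied mechanically, as in the following minimal Python sketch. The function names and the label ‘negligible’ for values below 0.10 are assumptions made here; the source defines only the small, medium, and large cut-offs.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def effect_size_label(r):
    """Label |r| with the thresholds quoted in the text (0.10, 0.24, 0.37)."""
    size = abs(r)
    if size >= 0.37:
        return "large"
    if size >= 0.24:
        return "medium"
    if size >= 0.10:
        return "small"
    return "negligible"  # assumed label; not defined in the source


r = pearson_r([1, 2, 3], [1, 3, 2])
label = effect_size_label(r)
```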

Sociodemographic groups were compared using ANOVA.

Confirmatory factor analysis

CFA was carried out with the lavaan package [9] running in R [10]. The lavaan package has been shown to generate the same results as other software packages [11]. Mardia’s kurtosis was used to check for multivariate non-normality, which was present: Mardia’s kurtosis = 1,194, z = 30.78, p < 0.0001.
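For readers unfamiliar with the statistic, the following is a minimal Python sketch of one common formulation of Mardia’s multivariate kurtosis and its normal-approximation z value. It assumes NumPy, the maximum likelihood covariance (divisor n), and the expected value p(p + 2) under normality. Software packages differ in these conventions, so this sketch is not expected to reproduce the figures reported above.

```python
import numpy as np


def mardia_kurtosis(data):
    """Mardia's multivariate kurtosis b2p and an approximate z statistic.

    data: array-like of shape (n respondents, p variables).
    """
    x = np.asarray(data, dtype=float)
    n, p = x.shape
    centered = x - x.mean(axis=0)
    cov = centered.T @ centered / n  # ML covariance (assumed convention)
    inv_cov = np.linalg.inv(cov)
    # Squared Mahalanobis distance of each respondent from the centroid
    d2 = np.einsum("ij,jk,ik->i", centered, inv_cov, centered)
    b2p = float(np.mean(d2 ** 2))
    expected = p * (p + 2)  # value expected under multivariate normality
    z = (b2p - expected) / np.sqrt(8.0 * p * (p + 2) / n)
    return b2p, float(z)


# Hypothetical toy data: one variable, four respondents
b2p, z = mardia_kurtosis([[-1.0], [1.0], [-1.0], [1.0]])
```

In the study’s setting, the input would be the 734 × 30 matrix of facet scores.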

Maximum likelihood estimation with robust standard errors and the Satorra–Bentler scaled test statistic were used to test CFA models; this method was chosen because it was unlikely to be affected by deviation from normality in data [12]. Chi square is the traditional fit index used to evaluate an overall model as it assesses the magnitude of discrepancy between the sample and the fitted covariance matrices [13]. However, the use of the chi square test to assess this model fit was found unsatisfactory for a number of reasons [14], including its sensitivity to sample size. The ratio of chi square to the degrees of freedom (df) was calculated, with ratios larger than 3 indicating poor fit [15]. Additional parameters for fit estimation were the following: the comparative fit index (CFI), the root mean square error of approximation (RMSEA), and the standardized root mean square residual (SRMR). RMSEA values of 0.08 or lower, SRMR values of 0.09 or lower, and CFI values of 0.90 or higher are considered acceptable [13],[16].
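The acceptance criteria listed above can be collected into a single check, as in the following minimal Python sketch. The function name and the dictionary keys are assumptions introduced here; the example values are those reported in the Results for the unidimensional model.

```python
def evaluate_fit(chi2, df, cfi, rmsea, srmr):
    """Compare fit indices against the thresholds quoted in the text."""
    ratio = chi2 / df
    return {
        "chi2_df_ratio": ratio,
        "ratio_ok": ratio <= 3.0,   # ratios larger than 3 indicate poor fit
        "cfi_ok": cfi >= 0.90,      # CFI of 0.90 or higher is acceptable
        "rmsea_ok": rmsea <= 0.08,  # RMSEA of 0.08 or lower is acceptable
        "srmr_ok": srmr <= 0.09,    # SRMR of 0.09 or lower is acceptable
    }


# Values reported in the Results for the unidimensional model
unidimensional = evaluate_fit(chi2=4975.31, df=405, cfi=0.387, rmsea=0.124, srmr=0.137)
```

For these values the chi square to df ratio is about 12.3, and none of the four criteria is met.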

Two models were tested: a rather unlikely unidimensional model, which assumes that all facets load on a single factor, and the a priori expected five-factor model, in which each facet was linked to its own latent factor only (the so-called simple structure) [17],[18]. More complex models were not tested because they rely on one or more cross-loadings, which prevent a clear attribution of the predictor to the latent variable it is expected to measure. Indeed, it has been found that increasing the measure’s complexity to comply with CFA standards led to reduced convergent and discriminant validity [17].

Orthogonal Procrustes rotation has been proposed as a method to test the replicability of the NEO-PI-3 personality factors when CFA fails to reach acceptable fit [18]-[20]. A dedicated SPSS script that performs the orthogonal Procrustes rotation was used for the analysis (courtesy of Professor Robert R. McCrae).
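The core of an orthogonal Procrustes rotation can be sketched in a few lines of Python. This is a generic illustration that assumes NumPy and uses the standard singular value decomposition solution; it is not the SPSS script used in the study, and the function name and toy matrices are hypothetical.

```python
import numpy as np


def procrustes_rotate(loadings, target):
    """Rotate `loadings` orthogonally to best match `target` (least squares).

    Finds the orthogonal matrix T minimizing ||loadings @ T - target||_F,
    given by T = U @ Vt, where U, S, Vt is the SVD of loadings.T @ target.
    """
    a = np.asarray(loadings, dtype=float)
    b = np.asarray(target, dtype=float)
    u, _, vt = np.linalg.svd(a.T @ b)
    rotation = u @ vt
    return a @ rotation, rotation


# Hypothetical target structure: 4 scales, 2 factors
target = np.array([[0.8, 0.1], [0.7, 0.0], [0.1, 0.9], [0.0, 0.6]])
# The same structure seen from an arbitrarily rotated set of axes
angle = 0.5
spin = np.array([[np.cos(angle), -np.sin(angle)],
                 [np.sin(angle), np.cos(angle)]])
unrotated = target @ spin
rotated, rotation = procrustes_rotate(unrotated, target)
```

In this toy case the rotation recovers the target exactly; with empirical data the rotated loadings only approximate the target, and the closeness of the match is then quantified with congruence coefficients.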

According to a shared convention, factor loadings higher than 0.71 (accounting for 50% of variance or more) are considered excellent, 0.63 (40%) very good, 0.55 (30%) good, values around 0.45 (20%) fair, and values below 0.32 (10% of variance) poor [21].
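The percentages in parentheses follow from squaring the loading, since a squared loading is the share of a scale’s variance accounted for by the factor. A minimal Python check (variable names are illustrative):

```python
# Loading cut-offs quoted in the text
loadings = [0.71, 0.63, 0.55, 0.45, 0.32]

# Share of variance accounted for by the factor = squared loading
variance_share = [round(loading ** 2, 2) for loading in loadings]
```

The resulting shares are 0.5, 0.4, 0.3, 0.2, and 0.1, matching the 50%, 40%, 30%, 20%, and 10% figures above.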

Congruence between potentially homologous factors across samples was evaluated using the coefficient of congruence (CC). The CC index ranges from −1.00 (perfect negative similarity) to 1.00 (perfect positive similarity), with zero indicating complete dissimilarity [22]. Reported thresholds for agreements between factors are as follows: very high = 0.90 or above; high = 0.80 to 0.89; and moderate = 0.70 to 0.79 [23].
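The coefficient of congruence between two columns of factor loadings can be sketched as follows in Python. The function names are assumptions, as is the label ‘below moderate’ for values under 0.70, which the quoted thresholds do not cover.

```python
import math


def congruence_coefficient(loadings_a, loadings_b):
    """Coefficient of congruence between two loading vectors of equal length."""
    cross = sum(a * b for a, b in zip(loadings_a, loadings_b))
    norm = math.sqrt(sum(a * a for a in loadings_a) * sum(b * b for b in loadings_b))
    return cross / norm


def agreement_label(cc):
    """Label a coefficient using the thresholds quoted in the text."""
    if cc >= 0.90:
        return "very high"
    if cc >= 0.80:
        return "high"
    if cc >= 0.70:
        return "moderate"
    return "below moderate"  # assumed label; not defined in the source


# Hypothetical loadings of three scales on one factor in two samples
cc = congruence_coefficient([0.7, 0.6, 0.1], [0.7, 0.6, 0.1])
```

Unlike a Pearson correlation, the loadings are not centered before the cross-product is taken, so the coefficient is sensitive to the sign and overall level of the loadings as well as to their pattern.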

Results

The study sample was a convenience sample and was somewhat representative of the country’s active population, with some overrepresentation of younger ages and clerks (Tables 1 and 2).

Table 1 Composition of the study sample in terms of gender and age in comparison to the general population according to the Greek National Statistics Service for 2009
Table 2 Occupation characteristics of the study sample

Internal consistency reliabilities and mean scores for the Greek NEO-PI-3 facets

Means, standard deviations, skewness, kurtosis, and internal consistency scores (with 95% confidence intervals) for the 30 NEO-PI-3 facets are shown in Table 3.

Table 3 Mean values for the domains and the facets of the Greek NEO-PI-3

Most facets exhibited Cronbach’s alpha values above 0.60, the accepted limit of internal consistency reliability for subscales. A few facets exhibited Cronbach’s alpha values lower than 0.50. Overall, the internal consistency reliability measures of the Greek translation were somewhat lower than those observed in the original American sample.

Absolute skewness was always below 3.00 and absolute kurtosis was always below 8.00, indicating that there was no univariate non-normality in the distribution of facet scores.

Confirmatory factor analysis of the Greek NEO-PI-3

The unidimensional model was rejected on the basis of the fit indexes: chi square = 4,975.31, df = 405, p < 0.0001; CFI = 0.387; RMSEA = 0.124 (95%CI: 0.121–0.127); SRMR = 0.137.
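As a rough plausibility check, the RMSEA point estimate can be approximated from the chi square, the degrees of freedom, and the sample size (N = 734). The Python sketch below assumes the common formula with N − 1 in the denominator; because the study used scaled test statistics, the authors’ software may have computed the index somewhat differently.

```python
import math


def rmsea_point_estimate(chi2, df, n):
    """RMSEA = sqrt(max(chi2 - df, 0) / (df * (n - 1))), one common convention."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# Unidimensional model values reported above, with the study sample size
rmsea = rmsea_point_estimate(chi2=4975.31, df=405, n=734)
print(round(rmsea, 3))  # prints 0.124
```

Under this assumption the approximation agrees with the reported RMSEA of 0.124.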

The a priori expected five-factor model had a better fit for all indexes (Table 4).

Table 4 Confirmatory factor analysis of the facets of the Greek NEO-PI-3

Overall, the fit was still poor. However, loading of the facets on their own hypothesized factors was acceptable, and the estimated Cronbach’s alphas for the hypothesized factors were very good.

Procrustes rotation analysis of the Greek NEO-PI-3

The Procrustes rotation analysis revealed a good replication of the expected five-factor structure of the NEO-PI-3.

The loading of the facets on their own factors was good to excellent with few exceptions (Table 5).

Table 5 Factor loadings for Greek NEO-PI-3 facet scales after Procrustes rotation

Only a minority of facets also loaded on a different factor than their own with an absolute factor loading higher than 0.40.

CC values for potentially homologous factors across samples were in the high to very high range. The extracted factors in the Greek sample can be considered reasonably homologous to their counterparts in the American normative sample.

Scores on the five dimensions of the Greek NEO-PI-3

The pattern of raw mean scores is similar to that seen in the US and elsewhere (Table 6).

Table 6 Mean values and correlations for the big five factors of the Greek NEO-PI-3

As expected, Neuroticism was negatively related to the other factors, Extraversion was positively related to Openness, Agreeableness, and Conscientiousness, and Agreeableness was positively related to Conscientiousness. The links between Openness and Agreeableness or Conscientiousness were less evident but in the expected direction. Correlation between factors was never so high as to prevent discriminant validity.

Differences by gender, age, and education on the Greek NEO-PI-3

Females scored higher than males on the Neuroticism and the Openness factors. Males scored marginally higher than females on the Conscientiousness factor (Table 7).

Table 7 Differences by gender (females–males, after standardizing the scores as z-scores using the total sample’s M and SD) and correlation with age and education on the Greek NEO-PI-3

Overall, the pattern of gender differences is similar to what one sees around the world, except that Greek females did not score higher than males on Agreeableness.
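The standardized gender differences described in the caption of Table 7 can be sketched as follows in Python. The function name and the toy scores are hypothetical, and the use of the sample standard deviation (divisor n − 1) is an assumption, since the source does not state which form was used.

```python
from statistics import mean, stdev


def standardized_difference(female_scores, male_scores):
    """Female minus male mean after z-scoring with the total sample's M and SD."""
    pooled = list(female_scores) + list(male_scores)
    total_mean = mean(pooled)
    total_sd = stdev(pooled)  # sample SD (n - 1); an assumed convention
    z_female = [(score - total_mean) / total_sd for score in female_scores]
    z_male = [(score - total_mean) / total_sd for score in male_scores]
    return mean(z_female) - mean(z_male)


# Hypothetical toy scores on one factor
difference = standardized_difference([3, 4, 5], [1, 2, 3])
```

A positive value indicates that females scored higher than males, expressed in units of the total sample’s standard deviation.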

Age and education were modestly related to Greek NEO-PI-3 facets.

Discussion

The current paper reports on the results of the Greek translation of the NEO-PI-3. Most facets exhibited Cronbach’s alpha values above 0.60, though overall, the internal consistency reliability measures of the Greek translation were lower than those observed in the original American sample. Confirmatory factor analysis failed to reach the predefined fit. However, it showed acceptable loading of the facets on their own hypothesized factors and very good estimations of Cronbach’s alphas for these factors; therefore, it partly supports the five-factor structure of the NEO-PI-3. Principal components after Procrustes rotation closely resembled the factors of the American normative sample. Correlations between dimensions were as expected and similar to those reported in the literature.

The literature suggests that, overall, the psychometric properties of NEO-PI-R scales generalize across ages, cultures, and methods of measurement [7].

The internal consistency originally reported for both NEO-PI-R domains (N = 0.92, E = 0.89, O = 0.87, A = 0.86, C = 0.90) as well as facets (0.56–0.81) was high. The internal consistency of the NEO-PI-3 was similar to that of the NEO-PI-R, with alphas ranging from 0.89–0.93 for domains and 0.54–0.83 for facets [24],[25]. The literature appears to support the internal consistencies listed in the manual. The Filipino translation of the NEO-PI-R has internal consistency of domain scores ranging from 0.78–0.90 [26], with facet alphas having a median of 0.61 [27].

Test-retest reliability (administered 3 months later) of an early version of the NEO-PI domains was N = 0.87, E = 0.91, and O = 0.86 [28]. The test-retest reliability reported in the manual of the NEO-PI-R over 6 years was N = 0.83, E = 0.82, O = 0.83, A = 0.63, and C = 0.79. Costa and McCrae pointed out that this shows not only good reliability of the domains but also that they are stable over long periods of time (past the age of 30), as the scores more than 6 years apart were only marginally different from the scores measured a few months apart [5]. Other research has also shown acceptable test-retest reliability. A 2001 study by Kurtz and Parrish on short-term test-retest reliability yielded retest coefficients of 0.9–0.93 for domains and 0.70–0.91 for facets after a 1-week interval [29]. A 2006 study by Terracciano et al. [30] on long-term test-retest reliability yielded retest coefficients of 0.78–0.85 for domains and 0.57–0.82 for facets after a 10-year interval.

In terms of criterion validity, Conard (2006) found that Conscientiousness significantly predicted the grade point average (GPA) of college students, over and above what Scholastic Assessment Test (SAT) scores alone predicted [31]. Garcia et al. correlated a Spanish version of the NEO with predictors of teacher burnout in Sevilla, Spain. Neuroticism was related to the ‘emotional exhaustion’ factor of burnout with a correlation coefficient of 0.44. Agreeableness was related to the ‘personal accomplishment’ factor of burnout (which is negatively scored when predicting burnout), with r = 0.36 [32]. A 2006 study found that, in a minority student population, Extraversion was correlated with Career Decision Making Self-Efficacy (CDMSE) with r = 0.30, while Neuroticism was strongly related to Career Commitment after controlling for CDMSE (r = 0.42) [33]. Finally, in 2007, Korukonda reported that Neuroticism was positively related to computer anxiety, while Openness and Agreeableness were negatively related to it [34].

Cross-cultural stability of an instrument can be considered evidence of its validity. A large body of cross-cultural research has been carried out on the Five-Factor Model (FFM) of personality using the NEO-PI-R and its shorter version, the NEO-FFI. A collection of selected papers from researchers across the globe has been published, covering various issues in cross-cultural research on the FFM [35]. This monograph also presents data for the FFM from several cultures. The robustness of the FFM has been supported across different cultures; these include but are not limited to the following: Chinese [36],[37], Estonian and Finnish [38], Filipino and French [39], Indian [40], Portuguese [41], Russian [42], South Korean [43], Turkish [44], Vietnamese [45], sub-Saharan cultures such as Zimbabwean [46], and Austrian, former East and West German, and Swiss cultures [47]. On the basis of data from 16 cultures, it has been suggested that the concepts of Neuroticism, Openness, and Conscientiousness are cross-culturally valid, while Extraversion and Agreeableness are components of the interpersonal circumplex and are more sensitive to cultural context [48]. It is interesting to note that in the Zuckerman five-factor model ‘Openness to experience’ is deliberately excluded because Zuckerman suggested that it does not meet the criteria for a truly ‘basic’ factor of personality [49]. Furthermore, it seems that the age differences in the five factors of personality across the adult life span are paralleled in samples from Germany, Italy, Portugal, Croatia, and South Korea [50]. The age and gender differences and fluctuations found in the original American sample [3] were generally confirmed in an analysis of the data from 51 cultures [51]-[53]. These findings are paralleled by the results of the current Greek validation study.

In conclusion, we submit that the results of the current study confirm the reliability of the Greek translation and adaptation of the NEO-PI-3. The Greek version of the inventory has psychometric properties comparable to those of the original and other national versions, although with somewhat lower values, and it is suitable for clinical as well as research use in Greek-speaking populations.