Factor structure and test-retest reliability of the Polish version of the Clarity of Auditory Imagery Scale

Vividness of imagery usually refers to the degree of similarity between mental images and corresponding percepts of real objects. One of the recently developed questionnaires, proposed to measure the vividness of auditory imagery, is the Clarity of Auditory Imagery Scale (CAIS). The main goal of the present study was to assess the factor structure, internal consistency, and test–retest reliability of the Polish version of the CAIS. The study was conducted on musicians (N = 39) and non-musicians (N = 40) to establish differences between the two groups in the vividness (or more specifically, clarity) of their auditory images. A combination of the minimum average partial (MAP) test and parallel analysis (PA) was used to establish the number of factors and provided evidence that the CAIS is a one-factor questionnaire. Test–retest reliability was measured by the intraclass correlation coefficient (ICC) between the mean scores obtained in two measurements made one week apart. The ICC between the two measurements equaled .85, showing satisfactory stability of the measurement of the vividness of auditory images, at least for short time intervals. The internal consistency of the scale was also satisfactory (Cronbach’s α = .87). In sum, the psychometric properties of the Polish version of the CAIS indicate that the scale is a reliable measure of the vividness of auditory imagery. Vividness of auditory imagery measured by the CAIS was not influenced by sex or musical expertise.


Introduction
Auditory imagery is defined as an introspective, nonhallucinatory experience of hearing that occurs in the absence of real sound (Hubbard 2013). Although the number of studies of auditory imagery has increased in the past years (Hubbard 2018), the phenomenon has been much less investigated than visual imagery (Hubbard 2010; Jensen 2005). Vividness refers to the degree of similarity between mental images and corresponding percepts of real objects (Marks 1999). In the case of auditory imagery, vividness may be related to the strength of representations of auditory features within the phonological loop in working memory (Baddeley and Andrade 2000). Kosslyn et al. (1990) and Tinti et al. (1997) found that the vividness of auditory images was rated as higher than the vividness of visual images, but Tracy et al. (1988) suggest the reverse relation.
There is almost general agreement that vividness (in terms of similarity between the real sound and its image) is an attribute of auditory images (Andrade et al. 2014; Betts 1909; Gissurarson 1992; Halpern 2015; Hishitani 2009; Hubbard 2010, 2013, 2018; Lacey and Lawson 2013; Willander and Baraldi 2010). However, there is not one commonly accepted questionnaire for measuring the vividness of auditory images (Hubbard 2013). Currently, there are several questionnaires which are entirely or partly designed for measuring the vividness or clarity of auditory imagery: the Questionnaire on Mental Imagery (QMI) (Betts 1909), the shortened version of the Questionnaire on Mental Imagery (Betts' QMI) (Sheehan 1967), the Auditory Imagery Scale (AIS) (Gissurarson 1992), the Auditory Imagery Questionnaire (AIQ) (Hishitani 2009), the Clarity of Auditory Imagery Scale (CAIS) (Willander and Baraldi 2010), the Bucknell Auditory Imagery Scale-Vividness (BAIS-V) (Halpern 2015), and the Plymouth Sensory Imagery Questionnaire (Psi-Q) (Andrade et al. 2014).
Unlike the other questionnaires listed in the previous paragraph, the CAIS, the BAIS-V, and the Psi-Q have clear instructions and scale labels (Andrade et al. 2014; Halpern 2015; Willander and Baraldi 2010). The CAIS was developed to measure only the clarity of auditory imagery (Willander and Baraldi 2010), whereas the BAIS-V and the auditory imagery scale of the Psi-Q were designed to investigate the vividness of auditory imagery (Andrade et al. 2014; Halpern 2015). In all three questionnaires, the labels describing the scale points are appropriate to the measured property.
The CAIS is one of the most recent questionnaires developed for measuring auditory imagery in terms of clarity of mental imagery, which is considered a component of vividness (Marks 1999; McKelvie 1995; Willander and Baraldi 2010). The main benefit of using the CAIS in auditory imagery research is that it appears to be a reliable measure of clarity of auditory imagery and has clear instructions and unified, non-misleading scale labeling.
As Willander and Baraldi (2010) found, clarity of imagery is one of the components of the concept of vividness of mental imagery. However, many authors do not precisely distinguish between clarity and vividness (Hubbard 2010, 2013; Lacey and Lawson 2013). Lacey and Lawson (2013) and Hubbard (2018) consider that both terms may refer to the same phenomenon: the degree of similarity between the auditory image and corresponding real sound. We also consider that the two terms may be synonyms. Therefore, we decided to use the term "vividness" to refer to the phenomenon measured by the CAIS.
The first issue we would like to address in this study is the factor structure, internal consistency, and test-retest reliability of the Polish version of the CAIS. Previously, the authors of the CAIS carried out a study of a Swedish sample to determine the factor structure of the scale (Willander and Baraldi 2010). Willander and Baraldi extracted four factors with eigenvalues greater than 1 in the initial principal component analysis (PCA). Further, they carried out a combination of a minimum average partial (MAP) test (O'Connor 2000; Velicer 1976) and a parallel analysis (PA) (O'Connor 2000), which indicated that only one factor should be extracted. This single factor explained 31.63% of the scale's total variance. A similar analysis has also been applied to the Spanish version of the CAIS (Campos and Pérez-Fabello 2011). Campos and Pérez-Fabello conducted a PCA and extracted five components with eigenvalues greater than 1, which explained 57.4% of the CAIS total variance. Unfortunately, neither MAP nor PA was carried out in that study to verify the results of the PCA. Furthermore, Campos and Pérez-Fabello did not propose any interpretation of their five-factor solution. The internal consistency of the CAIS was determined by Cronbach's alpha: α = .88 (Willander and Baraldi 2010), α = .82 (Campos and Pérez-Fabello 2011), and α = .88 (Campos and Fuentes 2016). However, in order to evaluate the psychometric properties of the questionnaire more accurately, it is necessary to carry out a test-retest study on the same sample of people, which would allow the test-retest reliability of the scale's measurement to be assessed. The construct of auditory imagery vividness is assumed to be relatively stable over time (Willander and Baraldi 2010). Consequently, the score obtained on a well-designed questionnaire intended to measure the vividness of auditory imagery should also remain stable across measurements at different time points. So far, such an analysis has not been conducted for either the original version of the CAIS or its Spanish version.
Another important issue we would like to raise in this study concerns the influence of musical training on the level of auditory imagery vividness. The question is whether musicians and non-musicians differ in vividness of auditory imagery. A few studies show that musicians are significantly more successful than non-musicians in behavioral tasks involving auditory imagery (Aleman et al. 2000; Tużnik et al. 2018). Musically trained participants achieve much higher scores than the general public in assessments of vividness of auditory imagery (Campos and Fuentes 2016; Gissurarson 1992; Hishitani 2009; Janata and Paroo 2006; Seashore 1938). A similar relationship can therefore be expected for the Polish version of the CAIS. However, earlier studies related to the CAIS provide inconsistent results concerning differences between musicians and non-musicians in auditory imagery vividness (Campos and Fuentes 2016; Van Hedger et al. 2018).
The present study aimed to: (1) determine the factor structure of the Polish version of the CAIS, (2) assess the internal consistency of the scale, (3) assess the test-retest reliability of its measurement, and (4) investigate differences in vividness of auditory imagery between musicians and non-musicians. We expected the Polish version of the CAIS to be a reliable one-factor measure of the vividness of auditory imagery, as revealed earlier by Willander and Baraldi (2010). We also assumed that musical expertise may affect the CAIS score and that musicians should rate their images as more vivid than non-musicians do. A similar relationship was found in earlier studies (Campos and Fuentes 2016; Hishitani 2009; Janata and Paroo 2006; Seashore 1938).
The musicians were educated in playing musical instruments and had a minimum of ten years' musical training. Of the 39 musicians, 30 had attended primary music school (M = 5.63 years, SD = 1.71 years). Thirty-eight musicians had graduated from or attended secondary music school (M = 4.95 years, SD = 1.27 years). Twenty musicians had graduated in or studied music majors at universities. Practice in playing instruments ranged from 10 to 22 years (M = 13.77, SD = 3.26). Musicians participating in the study spent 3-50 h per week playing musical instruments (M = 16.91, SD = 13.15).
All of the non-musician participants reported no musical training. In the group of non-musicians, 38 participants had graduated from or studied at university (two participants did not answer), and six of them had studied two majors. Taking into account both majors (the first and the second), 32 participants studied social and law sciences, five studied humanities and arts, three studied natural sciences, and four studied other subjects.

Imagery Measure
The Polish version of the CAIS consists of 16 questions regarding the clarity of images of sounds that are well known from everyday life. Participants assess the clarity of their images on a five-point Likert scale with the anchors described as 1, "not very clearly", and 5, "very clearly" (the Polish version of the CAIS is included in the Appendix).
The instructions and all questions of the CAIS (Willander and Baraldi 2010) were translated into Polish on the basis of the Swedish and English versions of the questionnaire made available by Willander and Baraldi. Based on both versions, a single questionnaire was developed in Polish and later translated back into English and Swedish. Both retranslated versions of the questionnaire were compared with the contents of the Swedish and English versions of the CAIS made available by Willander and Baraldi (2010) and were used to design the final Polish version of the questionnaire.

Procedure
The first part of the study consisted of filling in a paper version of the CAIS. Participants were tested individually. The study was held in a well-lit, sound-isolated room. Each participant completed the questionnaire in the presence of the researcher. Filling in the questionnaire took about five minutes for each participant. After one week, all participants who had taken part in the first stage of the study were asked to fill in the questionnaire once again.
All participants were volunteers and were assured of the confidentiality of their responses. The research was approved by the university commission for research ethics.

Analyses
An analysis of the data collected from 79 participants taking part in the first stage of the study (test phase) was carried out to determine the factor structure of the Polish version of the CAIS. The subjects-to-variables ratio was 4.94:1 (79 participants to 16 items), which is close to the minimum ratio of 5:1 (Gorsuch 1983; Everitt 1975) required for conducting a factor analysis.
In addition, an analysis of the internal consistency of the items forming the questionnaire was carried out on the data obtained in the first stage of the study. The data obtained from 77 participants who took part in both stages of the study (test and retest phases) were used to assess the test-retest reliability of the Polish version of the CAIS. All statistical analyses were performed in IBM SPSS Statistics 22 (IBM Corp., Armonk, NY).

Factor Analysis
Factor analysis of the data obtained from 79 participants in the first stage of the study (test phase) was conducted on the 16 items included in the Polish version of the CAIS. The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy equaled .83, which may be interpreted as excellent (Field 2009; Kaiser 1974) and indicated that a factor analysis may be performed on the data collected in the first stage of the study (test phase). Similarly, all KMO values for individual items were greater than .72 and thus clearly exceeded the generally accepted minimum value of .50 (Kaiser 1974). Bartlett's test of sphericity, χ²(120) = 429.58, p < .001, indicated that the correlations between test items were sufficiently large for factor analysis. Also, the value of the determinant of the correlation matrix was very low (.003) and allowed factor analysis to be conducted.
In the non-musician group, the mean score in the Polish version of the CAIS ranged between 2.69 and 5.00 (M = 3.98, SD = .620, S = −.208, K = −.679). In the group of musicians, the mean score ranged from 3.06 to 5.00 (M = 4.19, SD = .540, S = −.129, K = −.827). Table 1 presents the mean, standard deviation, and values of skewness and kurtosis for each of the 16 items.
In order to estimate the number of components for the Polish version of the CAIS, we performed a combination of the MAP test and PA. Velicer et al. (2000) proposed that these factor-extraction methods are optimal and among the most reliable rules. To carry out both tests, we used the IBM SPSS Statistics syntax written by O'Connor (2000). Running the MAP test syntax generates a PCA and a series of partial correlation matrices. The initial PCA on the pool of 16 items yielded four components with eigenvalues greater than 1. Nevertheless, the MAP test indicated that the lowest average squared partial correlation (.0239) was at the first root. Consequently, the MAP test supported extracting only one factor instead of four.
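The MAP criterion described above can be sketched in a few lines of Python. This is a minimal illustration of Velicer's (1976) original criterion using NumPy, not the O'Connor (2000) SPSS syntax used in the study; the function name `velicer_map` and the simulated one-factor data are assumptions made only for this example.

```python
import numpy as np


def velicer_map(data):
    """Velicer's (1976) MAP criterion on the correlation matrix of `data`.

    Returns the suggested number of components and the average squared
    partial correlations for m = 0, 1, ..., p - 2 extracted components.
    """
    R = np.corrcoef(data, rowvar=False)
    p = R.shape[0]
    eigvals, eigvecs = np.linalg.eigh(R)            # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]               # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))

    n_off_diag = p * (p - 1)
    # m = 0: average squared zero-order correlation
    avg_sq = [(np.sum(R ** 2) - p) / n_off_diag]
    for m in range(1, p - 1):
        A = loadings[:, :m]
        C = R - A @ A.T                              # residual after m components
        scale = 1.0 / np.sqrt(np.diag(C))
        partial = C * np.outer(scale, scale)         # partial correlation matrix
        avg_sq.append((np.sum(partial ** 2) - p) / n_off_diag)
    return int(np.argmin(avg_sq)), avg_sq


# Hypothetical data: 500 simulated respondents, 10 items, one common factor.
rng = np.random.default_rng(0)
n, p = 500, 10
factor = rng.standard_normal((n, 1))
data = 0.7 * factor + np.sqrt(1 - 0.7 ** 2) * rng.standard_normal((n, p))

n_components, avg_sq = velicer_map(data)
print(n_components, np.round(avg_sq[:3], 4))
```

The number of components is the value of m at which the average squared partial correlation is smallest, which mirrors the statement that the minimum occurred "at the first root".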
In the next step, the PA was conducted. In this analysis, eigenvalues from a large number of random data sets are computed. These data sets have the same number of variables and cases as the observed data. Eigenvalues from the observed data are also computed and then compared, root by root, with the random-data eigenvalues at a chosen percentile. In our analysis, we generated 1000 random data sets with 79 cases and 16 variables and compared their eigenvalues at the 95th percentile with the eigenvalues from the observed data. The results of the parallel analysis agree with those of the minimum average partial test and indicate that one factor should be extracted from the data. For the first root, the eigenvalue of the observed data (5.52) is greater than the random-data eigenvalue at the 95th percentile (2.08). For the second root, the observed-data eigenvalue (1.60) is lower than the random-data eigenvalue at the 95th percentile (1.80).
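The parallel analysis procedure can be illustrated with the following sketch. It assumes normally distributed random data and eigenvalues of the correlation matrix, and it is not the O'Connor (2000) syntax used in the study; the function name `parallel_analysis`, the seed, and the simulated one-factor data are hypothetical.

```python
import numpy as np


def sorted_eigenvalues(data):
    """Eigenvalues of the correlation matrix, largest first."""
    return np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]


def parallel_analysis(data, n_sets=1000, percentile=95, seed=0):
    """Compare observed eigenvalues with a percentile of random-data eigenvalues."""
    n, p = data.shape
    rng = np.random.default_rng(seed)
    observed = sorted_eigenvalues(data)
    random_eigs = np.empty((n_sets, p))
    for i in range(n_sets):
        # Random data with the same number of cases and variables as the observed data
        random_eigs[i] = sorted_eigenvalues(rng.standard_normal((n, p)))
    threshold = np.percentile(random_eigs, percentile, axis=0)
    exceeds = observed > threshold
    # Retain leading roots until the first one that fails to exceed its threshold
    n_retain = p if exceeds.all() else int(np.argmin(exceeds))
    return n_retain, observed, threshold


# Hypothetical data with the same shape as in the study: 79 cases, 16 variables.
rng = np.random.default_rng(1)
factor = rng.standard_normal((79, 1))
data = 0.6 * factor + 0.8 * rng.standard_normal((79, 16))

n_retain, observed, threshold = parallel_analysis(data)
print(n_retain, np.round(observed[:2], 2), np.round(threshold[:2], 2))
```

A root is retained only while the observed eigenvalue exceeds the random-data percentile for that root, which is the comparison reported for the first and second roots above.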
The outcomes of the MAP test and PA do not provide information about the factor loadings of each item of the CAIS. Therefore, we carried out an exploratory factor analysis (EFA) to establish the factor loadings of all 16 items of the CAIS. PCA was chosen as the extraction method. The analysis was conducted by forcing a one-factor solution, as indicated by the results of the MAP test and PA. The factor loadings of the 16 items ranged between .46 and .74. All factor loadings were above the threshold of .32 (Tabachnick and Fidell 2007). The one component extracted with the principal component method explained 34.48% of the CAIS total variance.
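As a rough illustration of how loadings and explained variance follow from a one-component PCA of a correlation matrix, consider the sketch below. The function name `first_component_loadings` and the simulated data are assumptions for the example. With standardized items, the share of variance explained by the first component is its eigenvalue divided by the number of items, so a first eigenvalue of 5.52 over 16 items corresponds to about 34.5%, in line with the value reported above.

```python
import numpy as np


def first_component_loadings(data):
    """Loadings and explained-variance share of the first principal component."""
    R = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)     # ascending order, largest root is last
    lam, vec = eigvals[-1], eigvecs[:, -1]
    if vec.sum() < 0:                        # eigenvector sign is arbitrary; make it positive
        vec = -vec
    loadings = vec * np.sqrt(lam)            # item-component correlations
    explained = lam / R.shape[0]             # eigenvalue / number of items
    return loadings, explained


# Hypothetical data: 79 simulated respondents, 16 items, one common factor.
rng = np.random.default_rng(1)
n, p = 79, 16
factor = rng.standard_normal((n, 1))
data = 0.6 * factor + 0.8 * rng.standard_normal((n, p))

loadings, explained = first_component_loadings(data)
print(np.round(loadings, 2))
print(f"{100 * explained:.1f}% of total variance")
```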

Internal Consistency and Test-Retest Reliability Analyses
For the final version of the CAIS, which included all 16 questions, a reliability analysis was carried out. First, the internal consistency of the scale was assessed on the data collected in the test phase. Cronbach's alpha coefficient reached a satisfactory level of α = .87, which is greater than the most commonly accepted threshold values of α = .70 (DeVellis 2012) and α = .80 (Nunnally and Bernstein 1994).
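For readers unfamiliar with the coefficient, Cronbach's alpha can be computed from the item variances and the variance of the total score. The sketch below uses only the Python standard library; the function name `cronbach_alpha` and the small response matrix are made up for illustration and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a sequence of item scores."""
    k = len(rows[0])                                   # number of items
    item_vars = [variance(col) for col in zip(*rows)]  # sample variance of each item
    total_var = variance([sum(row) for row in rows])   # variance of the summed score
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents x 3 items on a five-point scale.
responses = [
    (2, 3, 3),
    (4, 4, 5),
    (3, 3, 4),
    (5, 4, 5),
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```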
Subsequently, test-retest reliability was assessed for two measurements carried out with the CAIS on the same sample of participants one week apart. The analysis used the intraclass correlation coefficient (ICC; Shrout and Fleiss 1979). The ICC and its 95% confidence interval (CI) were calculated based on a mean-rating (k = 2), absolute-agreement, two-way mixed-effects model (Field 2005; Koo and Li 2016). The ICC between the means obtained in the first and second measurements equals .85 with a 95% CI (.76 to .91), indicating that the reliability of the Polish version of the CAIS can be considered "good" to "excellent" (Koo and Li 2016). Over a short period of time, the measurement of auditory imagery vividness with the CAIS is characterized by relatively high repeatability.
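A mean-rating, absolute-agreement, two-way ICC can be obtained from the mean squares of a two-way ANOVA as (MSR − MSE) / (MSR + (MSC − MSE) / n), where MSR, MSC, and MSE are the mean squares for participants, sessions, and error, and n is the number of participants. The sketch below implements this formula with the standard library only; the function name `icc_absolute_mean` and the four test-retest pairs are hypothetical, and the confidence interval is not computed.

```python
def icc_absolute_mean(scores):
    """Two-way, absolute-agreement, mean-rating ICC.

    `scores` holds one row per participant and one column per session.
    """
    n = len(scores)                     # participants
    k = len(scores[0])                  # sessions (k = 2 for test-retest)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(col) / n for col in zip(*scores)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)


# Hypothetical (test, retest) scores for four participants.
scores = [(3, 4), (4, 4), (5, 5), (2, 3)]
icc = icc_absolute_mean(scores)
print(round(icc, 3))
```

Because the formula includes the session mean square, a systematic shift between test and retest lowers the coefficient, which is what distinguishes absolute agreement from consistency.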

Intergroup Differences
As in Willander and Baraldi (2010) and Campos and Pérez-Fabello (2011), the analyses of differences between groups were carried out on the data obtained in the test phase. These analyses aimed to compare the vividness of auditory images between women and men and between musicians and non-musicians. To this end, the mean scores obtained in the Polish version of the CAIS were subjected to two independent-samples t-tests with the grouping variables (1) sex (females, males) and (2) musical expertise (musicians, non-musicians). Neither t-test was statistically significant: neither the sex of the participants (p = .372) nor their level of musical expertise (p = .114) differentiated the scores obtained on the Polish version of the CAIS.
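The musical-expertise comparison can be approximately reconstructed from the group summaries reported in the Factor Analysis section (musicians: M = 4.19, SD = .540, N = 39; non-musicians: M = 3.98, SD = .620, N = 40). The sketch below assumes a pooled-variance (Student) t-test, which the text does not specify, and works from rounded summary values, so it can only approximate the reported result. The function names are hypothetical, and the p value is obtained by numerically integrating the t density with Simpson's rule so that only the standard library is needed.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)


def pooled_t_test(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t-test from summary statistics, equal variances assumed."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (m1 - m2) / se
    return t, df, two_sided_p(t, df)


# Rounded group summaries taken from the text above.
t, df, p = pooled_t_test(4.19, 0.540, 39, 3.98, 0.620, 40)
print(round(t, 2), df, round(p, 3))
```

Under these assumptions the test has 77 degrees of freedom and a t value of about 1.60, with a two-sided p value close to the reported .114. Where SciPy is available, `scipy.stats.ttest_ind_from_stats` would be the more usual route.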

Discussion
Willander and Baraldi (2010) did not address the issue of differences between clarity and vividness when developing their auditory imagery scale. Lacey and Lawson (2013) and Hubbard (2018) regard the CAIS as a method of measuring the vividness of auditory imagery. Note that Willander and Baraldi (2010) claim that the CAIS measures the clarity of auditory imagery simply because respondents are directly asked to generate clear images of sounds. Therefore, the question is whether the vividness of auditory imagery is actually something different from its clarity (Willander and Baraldi 2010) or whether they are rather synonymous terms that describe the same property of auditory images (Hubbard 2018; Lacey and Lawson 2013). We consider the CAIS and the other questionnaires mentioned in the present article to be measures of the same construct related to vividness of auditory imagery and, as stated earlier, we decided to use the term "vividness" to name the phenomenon measured by the CAIS. Future imagery research should take into account the question raised above.
Despite the ambiguity pointed out in the previous paragraph, the results of the factor analysis and reliability analyses indicate that the Polish version of the CAIS is a reliable, one-factor questionnaire. The one-factor solution explains 34.48% of the total variance. Campos and Pérez-Fabello (2011) drew a similar conclusion in applying a one-factor solution to the Spanish version of the CAIS, explaining 27.6% of the variance. Also, Willander and Baraldi (2010), using the PA method and the MAP method, extracted only one factor in the Swedish version of the CAIS. Using the same component-extraction methods as we did, Willander and Baraldi explained 31.63% of the total variance in the CAIS score. The proportions of variance explained by a one-factor solution are quite similar across the studies, regardless of the language version of the scale.
Also, the value of Cronbach's alpha obtained for the Polish version of the CAIS (α = .87) is comparable to those obtained for the original version of the scale (α = .88, Willander and Baraldi 2010) and for the Spanish one (α = .82, Campos and Pérez-Fabello 2011; α = .88, Campos and Fuentes 2016). The Cronbach's alpha coefficient reached a satisfactory level that testifies to the high internal consistency of the scale. Removing any single test item would not increase the value of Cronbach's alpha. The test-retest reliability measured by the ICC indicates that the Polish version of the CAIS is a questionnaire with a good level of repeatability of the scores obtained by respondents within a relatively short time interval. All the measures of internal consistency and test-retest reliability mentioned above show that the Polish version of the CAIS is a reliable scale designed to measure the vividness of auditory imagery.
The lack of a sex effect on scores obtained in the Polish version of the CAIS is consistent with results reported earlier by Willander and Baraldi (2010), Campos and Pérez-Fabello (2011), and Campos and Fuentes (2016). The lack of a musical-expertise effect is not in line with previous reports by Campos and Fuentes (2016). Also, many earlier studies reported that musicians have better auditory imagery abilities than non-musicians (Aleman et al. 2000; Gissurarson 1992; Hishitani 2009; Janata and Paroo 2006; Seashore 1938; Tużnik et al. 2018). All items of the CAIS correspond to very familiar sound events which are well known from our everyday lives. It is possible that the commonness of the sound events represented by items of the CAIS is responsible for the lack of the musical-expertise effect in the vividness of the corresponding auditory images. Both groups may know all of these sounds well and generate their auditory images with a comparable degree of vividness. Another possible explanation is that the lack of a musical-expertise effect is related to some kind of general tendency to overestimate the vividness of the generated images, for example as a result of imagery processes being influenced or modified by other, non-imagery processes (Hubbard 2018). It is possible that musicians and non-musicians assess their imagery capability in the same manner and that differences between the two groups show up only in specific behavioral tasks that require the use of auditory imagery abilities, for example, in an imagined pitch comparison task (Aleman et al. 2000) or in the case of imagining timbres that differ in their spectral properties (Tużnik et al. 2018). This topic should be investigated in future studies.
Summarizing the results of the current study, it can be concluded that the Polish version of the CAIS is a one-factor scale for testing vividness (or more specifically, clarity) of auditory imagery and has satisfactory psychometric properties. All items included in the scale reliably measure the phenomenon of auditory-imagery vividness. These results are similar to the psychometric characteristics of the Swedish and Spanish versions of the CAIS reported by Willander and Baraldi (2010) and by Campos and Pérez-Fabello (2011). In addition, the results of research conducted on the Polish version of the CAIS showed that the questionnaire is characterized by satisfactory repeatability of measurement. The CAIS score is not influenced by sex or musical-expertise factors.
Funding The study was supported by the National Science Centre (Poland), based on the decision no. DEC-2013/09/N/HS6/02835.

Compliance with Ethical Standards
Conflict of Interest On behalf of all authors, the corresponding author states that they have no conflict of interest.
Ethical Approval All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
Human and Animal Studies This article does not contain any studies with animals performed by any of the authors.
Informed Consent Informed consent was obtained from all individual participants included in the study.