Background

Psychiatry has an image problem, and difficulty recruiting medical students into the specialty has emerged as an important concern [1], referred to by some as a “Sisyphean task” [2]. Studies carried out in different parts of the world, in both developing and developed countries, indicate that medical students do not consider psychiatry a desirable career choice [3]-[10]. In a survey of 655 students from 6 Australian medical schools, for example, psychiatry was identified as the least respected specialty and the least likely to be chosen as a career [3]. In the Kingdom of Bahrain, only four of 140 (2.9%) medical students from the Arabian Gulf University who completed the survey questionnaire selected psychiatry as their first choice, the lowest of any specialty [9]. Similar proportions have been identified in the United States and the United Kingdom [11],[12]. In a large 20-country study, 4.5% of the medical students surveyed indicated they were likely to choose psychiatry as a career [10].

Most countries have a shortage of psychiatrists and many rely heavily on international graduates [1],[13]. In New Zealand, for example, the ratio of psychiatrists to population is 1 to 14,880, considerably lower than the often-cited benchmark of 1:10,000 [14]. In 53% of countries reporting to the World Health Organization, covering 69% of the world’s population, there is fewer than one psychiatrist per 100,000 population. This includes all countries in the South-East Asia Region and 96% of those in the African Region, where there is often no more than one psychiatrist per million population [15].

The bulk of research in this area has focused on the attitudes of medical students. Criticisms made by medical students are that psychiatry is too narrow in scope; it does not draw on all aspects of medical training; it is ineffective, unscientific and too emotionally demanding; and psychiatrists are unattractive role models. These negative attitudes persist even after contact with psychiatric educators and clinical rotations [7],[16]-[18].

Research has demonstrated that a proportion of medical students will change their career choices due to negative comments (“badmouthing”) from mentors and peers. In a survey of 1,114 senior students, three quarters had heard some badmouthing about their career choice and 17% indicated that they had decided against their initial career choice because of negative comments that they heard about it. In over half of the cases, badmouthing was identified as coming from teaching faculty [19]. This research suggests that the attitudes of medical educators may be an important and understudied source of stigma against psychiatry as a career choice.

A review of the literature indicated that there were no psychometrically validated scales to measure medical educators’ attitudes. Therefore, we developed a pool of 37 Likert-type survey questionnaire items to describe medical educators’ views of psychiatry and psychiatrists. (We originally had 39 items but two items pertaining to psychiatric rotations were eliminated, as rotations were not offered in all of the countries participating in the study.) Given the lack of psychometrically tested instruments measuring attitudes of teaching faculty, and the importance of understanding medical educators’ views as a potential barrier to recruitment of psychiatrists, our goals in this paper were to: (a) identify the basic factor structure underlying the 37 items using an exploratory factor analysis; (b) test the resulting factor structure using a confirmatory factor analysis; and (c) assess the internal reliability of each identified scale. To our knowledge, this is the first attempt to psychometrically evaluate a scale measuring medical educators’ attitudes toward psychiatry and psychiatrists.

Methods

Study setting

This study was conducted as part of the scientific activities of the World Psychiatric Association’s Stigma and Mental Health Scientific Section. It was developed with the participants attending an international course on leadership skills for young psychiatrists in Asian/Pacific countries organized by the Association for the Improvement of Mental Health Programs, a not-for-profit organization in Geneva. Individuals attending the course agreed to participate in a research study in order to develop and improve their research skills. Subsequently, several young psychiatrists from Europe petitioned to join the research group. Each participant functioned as the lead site investigator and was responsible for obtaining appropriate institutional approvals, conducting or coordinating the translation and back translation of the survey, and coordinating data collection.

Sampling plan

Data were collected from 23 academic teaching sites in 15 countries: Belarus, China, Croatia, India, Indonesia, Iran, Japan, Portugal, Romania, Russia, Scotland, Singapore, Thailand, Turkey, and Ukraine. The overall response rate was 65% (1060 of 1629), with site-specific response rates ranging from 42% to 100% and sample sizes ranging from 25 to 169.

In each participating site, all non-psychiatric medical educators were enumerated from staff directories and their career stage (early, middle, or late) was determined. Individuals were considered to be in the early stage of their career if they were Lecturers, Assistant Professors, under ten years in practice, or under 40 years old. They were defined as in mid career if they were Associate Professors, had been in practice 11-25 years, or were 41-55 years old. Individuals were considered to be in late career if they were Full Professors, Emeritus Professors, in practice more than 25 years, or over the age of 56 years. A stratified random sample of 10 faculty members from each career stage (representing approximately 20% of teaching faculty overall) was then drawn from each site. Some sites over-sampled to support more detailed country-specific analyses. More detailed information on the methods and descriptive results can be found elsewhere [20].
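The enumeration and sampling steps described above can be sketched in Python. This is a minimal illustration and not the procedure the sites actually ran: the function names, the record fields (`rank`, `years`, `age`), the order in which the three criteria are checked, and the handling of boundary values (exactly 10 years in practice, ages 40 and 56) are all assumptions, since the text does not specify them.

```python
import random
from collections import defaultdict


def career_stage(rank, years_in_practice=None, age=None):
    """Assign 'early', 'middle' or 'late' from the criteria in the text.

    Assumption: academic rank is checked first, then years in practice,
    then age. Boundary values are resolved with <= cut-offs, which is one
    possible reading of the published criteria.
    """
    rank = (rank or "").strip().lower()
    if rank in {"lecturer", "assistant professor"}:
        return "early"
    if rank == "associate professor":
        return "middle"
    if rank in {"full professor", "emeritus professor"}:
        return "late"
    if years_in_practice is not None:
        if years_in_practice <= 10:
            return "early"
        return "middle" if years_in_practice <= 25 else "late"
    if age is not None:
        if age <= 40:
            return "early"
        return "middle" if age <= 55 else "late"
    return None  # stage cannot be determined from the directory entry


def stratified_sample(staff, per_stratum=10, seed=0):
    """Draw up to `per_stratum` people at random from each career stage."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    strata = defaultdict(list)
    for person in staff:
        stage = career_stage(person.get("rank"), person.get("years"), person.get("age"))
        if stage is not None:
            strata[stage].append(person)
    return {
        stage: rng.sample(members, min(per_stratum, len(members)))
        for stage, members in strata.items()
    }
```

Sites that over-sampled would correspond to a larger `per_stratum` value in this sketch.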

Item development

Our work was informed by three survey instruments, all of which measured attitudes of medical students: the ATP 30 scale developed by Burra and colleagues in 1982 [21], the 39-item questionnaire developed by Balon and colleagues in 1999 [22], and its predecessor, a 22-item questionnaire originally developed by Nielsen and Eaton in 1981 [23]. Only the ATP 30 had been psychometrically tested. Its split-half reliability was high (0.9) and its six-week test-retest reliability in a control sample of first year medical students was 0.87.

Table 1 summarizes the genesis of our scale items. The first column shows the items that were obtained from the literature, with bolded references indicating the citation from which the item was drawn. In some cases, there were multiple similar items for a single idea. In others, there were gaps, which we filled with a new item. The second column shows the 37 revised scale items that were eventually tested and their corresponding survey item number. Five items were drawn from a single survey and used verbatim. Twenty-six items were adapted, and 6 items were added. Reworded and new items were reviewed by two of the authors (HS and NS). Items pertained to perceptions of psychiatry as a discipline (5); perceptions of the effectiveness of psychiatric treatments (7); perceptions of psychiatrists as role models (5); perceptions of psychiatry as a career (7); perceptions of psychiatric patients (7); and perceptions about the quality of psychiatric training (6).

Table 1 Genesis of scale items

Following the scoring approach recommended by Balon et al. to avoid non-committal response sets [22], items were rated on a 4-point Likert-type agreement scale ranging from strongly agree to strongly disagree, with no neutral option. Several items were reversed to avoid response patterns. To minimize social desirability bias, respondents were asked what they thought others in their medical school would endorse. This approach, originally recommended by Link and Cullen [24], has been used extensively in population studies of stigma to minimize the social desirability biases that can emerge when asking people to make declarations of personal prejudices.
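For readers unfamiliar with reverse scoring, the following Python sketch shows how reversed items on a 4-point scale might be recoded before analysis. The numeric coding (1 = strongly agree through 4 = strongly disagree), the function names, and the item labels are assumptions for illustration; the text does not state the coding used or list the reversed items here.

```python
def reverse_score(response, n_points=4):
    """Mirror a response on an n-point scale (1 <-> 4, 2 <-> 3 when n_points is 4)."""
    if not 1 <= response <= n_points:
        raise ValueError(f"response {response} is outside 1..{n_points}")
    return n_points + 1 - response


def score_items(responses, reversed_items, n_points=4):
    """Return a copy of `responses` with the listed items reverse scored.

    `responses` maps a (hypothetical) item label to the raw response.
    """
    return {
        item: reverse_score(value, n_points) if item in reversed_items else value
        for item, value in responses.items()
    }
```

Because the scale has no neutral midpoint, every response changes value when reversed.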

Items were translated and back translated by bilingual investigators in each setting. In some sites this was done by a single individual, and in others, by a small group. Two authors (HS and NS) independently reviewed the back translations for comparability to the original scale.

Data management and analysis

Completed surveys were returned via email (scanned .pdf files) or by courier to Queen’s University, Canada, where they were entered and analyzed. Queen’s University Health Sciences and Affiliated Teaching Hospitals Research Ethics Board granted ethics clearance. In addition, some study sites also obtained local ethics reviews and clearances.

We first conducted an exploratory factor analysis. Because the results of exploratory factor analysis may not be replicable in a new sample (there is a tendency for models to over-fit the data), we randomly split the sample. Exploratory factor analysis was conducted on the first half of the sample and confirmatory factor analysis was conducted on the second half. Osborne and Fitzpatrick [25] refer to this as internal replication and recommend that researchers examine their exploratory factor analysis solutions using replication samples to determine the extent to which their solutions are likely to be robust. Items with strong loadings in the exploratory analysis may not load strongly in the confirmatory analysis and so may need to be dropped once the confirmatory factor analysis is completed.
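A random split of this kind can be sketched in a few lines of Python. The function name and the seed are hypothetical, and the text does not say how the split was generated in practice, so this is one plausible approach.

```python
import random


def split_half(respondent_ids, seed=2014):
    """Shuffle respondent identifiers and cut the list into two halves.

    The first half would be used for the exploratory analysis and the
    second for the confirmatory analysis. The seed is arbitrary and is
    fixed only so that the split can be reproduced.
    """
    rng = random.Random(seed)
    ids = list(respondent_ids)
    rng.shuffle(ids)
    midpoint = len(ids) // 2
    return ids[:midpoint], ids[midpoint:]
```

Fixing the seed matters for internal replication, because the confirmatory half must never overlap with the cases used to derive the factor structure.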

Because the survey items were ordinal, we performed principal components factor analysis using the polychoric correlation matrix and varimax rotation. We examined eigenvalues (1.0 or greater), scree plots, and factor loadings to select two potential factors. Once the factors were selected, we conducted two confirmatory factor analyses on the remaining half of the sample with structural equation modeling in Stata 13 [26], following the procedures described by Acock [27]. In the confirmatory factor analysis, we allowed for correlation between the factors, as this occurs frequently even when varimax rotation has been used in the exploratory analysis to identify uncorrelated factors [28]. We examined a variety of goodness of fit statistics to assess the fit of the model to the data. We eliminated items with poor factor correlations (under 0.4) and respecified the model based on the modification indices. To obtain a better fitting model, we conducted a second confirmatory factor analysis allowing for correlated error variances between items within a scale whenever the modification indices were 10 or higher, on the assumption that highly correlated items may also have correlated errors. We examined the wording of all items with correlated errors to ensure that the correlations made conceptual sense. We used Cronbach’s alpha to assess internal reliability of the resulting scales.
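As an illustration of the reliability step, Cronbach’s alpha can be computed from a respondents-by-items matrix using the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of the total score). The Python sketch below assumes complete data and items that have already been reverse scored where needed; the function name and the toy data in the usage note are hypothetical, and the analyses reported here were run in Stata, not with this code.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances throughout. Assumes no missing values and at
    least two items; raises ValueError if the total score has no variance.
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    if any(len(row) != k for row in rows):
        raise ValueError("every respondent needs a score on every item")
    item_columns = list(zip(*rows))
    sum_item_variances = sum(variance(column) for column in item_columns)
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return k / (k - 1) * (1 - sum_item_variances / total_variance)
```

For example, `cronbach_alpha([[1, 2], [2, 2], [3, 4], [4, 3]])` returns a value of about 0.78 for this toy two-item scale. Alpha tends to rise with the number of items, other things being equal.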

Results

Table 2 describes the composition of the full sample. The majority of respondents were male (60.2%). Career stage was evenly distributed by design. Study sites were predominantly from Asia-Pacific and Eastern European countries, though a site in Scotland also provided data. In two instances (Russia and Thailand) more than one study site contributed data. We could determine the medical field for all but 32% (n = 339) of the sample; however, the largest group (19%) was from an undisclosed specialty. Otherwise, the most frequently occurring fields were family medicine and surgery.

Table 2 Sample composition

The Kaiser-Meyer-Olkin measure of sampling adequacy of .73 indicated that the survey items were sufficiently correlated to warrant conducting a factor analysis. Table 3 shows the results of the factor analyses. Column 2 of Table 3 describes the factor loadings associated with the exploratory factor analysis, excluding any items with loadings less than 0.4. Two factors emerged with strong eigenvalues over 1.0 accounting for 54% of the overall variance. Factor 1 had an eigenvalue of 7.3 and Factor 2 had an eigenvalue of 2.3. Three additional factors had eigenvalues exceeding 1.0 (1.8, 1.3, and 1.1 respectively); however, the item loadings on these factors were sparse and conceptually inconsistent, and the alpha values were disappointing. Therefore, we chose a two-factor solution as the best fit for this dataset. Thirteen items were eliminated at this stage.
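The eigenvalue screening step can be illustrated with a short Python sketch. It applies the eigenvalue criterion of 1.0 or greater and computes the share of variance carried by the retained components as their eigenvalues divided by the sum of all eigenvalues. The function names and the six-item toy eigenvalues are hypothetical and are not taken from this study, for which the full set of eigenvalues is not listed in the text.

```python
def kaiser_retained(eigenvalues, threshold=1.0):
    """Return the eigenvalues at or above the threshold, largest first."""
    return sorted((ev for ev in eigenvalues if ev >= threshold), reverse=True)


def variance_share(eigenvalues, n_keep):
    """Share of total variance carried by the n_keep largest eigenvalues."""
    ordered = sorted(eigenvalues, reverse=True)
    return sum(ordered[:n_keep]) / sum(ordered)


# Hypothetical eigenvalues for a six-item correlation matrix (they sum to 6).
toy_eigenvalues = [2.8, 1.4, 0.7, 0.5, 0.4, 0.2]
retained = kaiser_retained(toy_eigenvalues)
share = variance_share(toy_eigenvalues, len(retained))
```

As the results above show, the eigenvalue rule is only a first screen: components that pass it may still be set aside when their item loadings are sparse or hard to interpret.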

Table 3 Factor loadings from the exploratory factor analysis (EFA), the confirmatory factor analysis (CFA) and the confirmatory model respecified (CFA2)

Items loading on the first factor were those reflecting negative stereotypic images of psychiatry, psychiatrists, and psychiatric residents such as psychiatry is unscientific; students are attracted to psychiatry because of their own problems; or entering psychiatry is a waste of a good education. We designated this scale the Images of Psychiatry Scale. Cronbach’s alpha for the items composing this 16-item scale showed high internal reliability in this sample (.83). The eight items that loaded on the second factor portrayed psychiatry as a rewarding and efficacious branch of medicine, such as psychiatry is a rapidly emerging frontier of medicine; most people who receive psychiatric treatment find it helpful; or working with psychiatric patients is rewarding. We designated this scale the Efficacy of Psychiatry Scale and noted it had acceptable reliability in this sample (alpha = .68). Because we used varimax rotation, the factors were uncorrelated.

Column 3 of Table 3 summarizes the results of the first confirmatory factor analysis (CFA1), which tested a model with two scales and uncorrelated error variances between items. Several of the key goodness of fit statistics indicated that the model was a poor fit for the data. The chi-square statistic was significant (which often happens in samples over 200), the root mean square error of approximation (RMSEA) was well above the .05 threshold at .10, the comparative fit index of .54 was considerably less than the desired .90 threshold, and the standardized root mean squared residual was greater than .05. In addition, 5 items on the Images scale and 3 items on the Efficacy scale had correlations of less than .40. This model explained 96% of the variation in the data.
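For orientation, two of the fit indices named above can be computed from chi-square statistics using their common textbook definitions, sketched here in Python. The function names and the numbers in the usage note are hypothetical, the study’s chi-square values and degrees of freedom are not given in the text, and statistical packages differ in small details (for example, using N or N − 1 in the RMSEA denominator), so this sketch should not be read as a reproduction of the Stata output.

```python
from math import sqrt


def rmsea(chi2, df, n):
    """Root mean square error of approximation (N - 1 variant).

    chi2 and df belong to the fitted model; n is the sample size.
    """
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """Comparative fit index relative to a baseline (independence) model."""
    misfit_model = max(chi2_model - df_model, 0.0)
    misfit_baseline = max(chi2_baseline - df_baseline, misfit_model)
    if misfit_baseline == 0:
        return 1.0  # neither model shows misfit beyond its degrees of freedom
    return 1.0 - misfit_model / misfit_baseline
```

With hypothetical inputs, `rmsea(300, 100, 501)` is about 0.063 and `cfi(300, 100, 2100, 120)` is about 0.90. Lower RMSEA values and higher comparative fit index values indicate better fit.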

Column 4 of Table 3 summarizes the results of the second confirmatory factor analysis (CFA2) using the same second half of the sample with a respecified model allowing for correlated errors within scales. The fit of this model was considerably improved. Though the chi-square statistic remained significant (as expected), the root mean square error of approximation was reduced to .055, the comparative fit index rose to .80, and the standardized root mean squared residual dropped to .13. This model also explained 95% of the variation in the data. Internal consistency was high for the resulting 11-item Images Scale (.82) and acceptable but lower than desirable on the 5-item Efficacy Scale (.61), likely due to the smaller number of items. The final two scales were modestly correlated (.50).

Figure 1 summarizes the results of the exploratory and the confirmatory factor analyses in terms of the items retained at each step. It shows the results of the second and best fitting confirmatory factor analysis. Thirty-seven items were entered into the exploratory factor analysis. Thirteen were eliminated because their factor loadings were below .40. The remaining 24 items loaded on two factors: 16 items on the Images of Psychiatry Scale, and 8 items on the Efficacy of Psychiatry Scale. On the basis of the second confirmatory analysis, 11 items were retained on the Images of Psychiatry Scale and 5 were retained on the Efficacy of Psychiatry Scale. The specific items that were retained appear in Column 4 of Table 3. Items that were eliminated appear in bold in Column 3 of Table 3.

Figure 1 Summary of exploratory and confirmatory factor analyses (respecified model)

Discussion

There is growing concern over the shortage of psychiatrists worldwide and many organizations, such as the World Psychiatric Association, the American Psychiatric Association, and the European Psychiatric Association, have attempted to understand the reasons behind low recruitment levels among medical students [13]. For example, Farooq and colleagues recently studied the career plans of 2198 final year medical students in 46 medical schools from 20 countries. Across all countries, 4.5% of students definitely considered psychiatry as a career choice. Female gender, prior experience with a mental or physical illness, media portrayal of doctors, and positive attitudes to psychiatry were associated with a career choice of psychiatry. To keep the survey brief enough to support good response rates, a number of factors, including potentially stigmatizing attitudes towards psychiatry and psychiatrists, were not addressed [10].

The scales developed in this research can be used to examine the attitudes of medical educators and their effects on psychiatry career choices of medical students, which is an important gap in our knowledge. Prior to this study, a psychometrically tested scale using both exploratory and confirmatory factor analysis did not exist. We identified two unidimensional scales. The first measures stereotypic Images of Psychiatry and psychiatrists. The second measures positive aspects of Efficacy of Psychiatry. The Images Scale was the longer and stronger of the two. As alpha values are influenced by the number of scale items, it is likely that the Efficacy of Psychiatry Scale could be improved in future research with the addition of a broader range of items.

A persistent question faced by those measuring attitudes has been the extent to which responses reflect socially correct responses, as opposed to the more deeply held attitudes that are embedded in our cultural views and more likely to govern behaviours. Link and Cullen have operationalized these deeper attitudes by asking respondents to indicate how they think “most people” would respond to someone with a mental illness. They found that, when asked directly, respondents tended to report idealized responses reflecting cultural norms for politically correct responses. When deeper attitudes were measured, scores were higher, reflecting greater stigma [24]. Based on this research, an important strength of our measurement approach was to ask medical educators to indicate the view that they thought best reflected the attitudes of their colleagues in their medical schools. Similarly, as recommended by Balon and colleagues [22] we deliberately excluded a neutral category in the agreement scale to avoid non-committal response sets.

A second strength of our approach was that we based our results on a large heterogeneous sample of medical educators drawn randomly from 23 academic settings in 15 countries. Survey items were translated and back translated following a systematic process with final approval from the principal investigators. This means we can have some confidence that these scales will behave well in a wide assortment of samples and educational settings. However, the test sites in this research did not include centres from Canada, the United States, or Western Europe, where the bulk of this research tends to occur. Also, there was a high number of missing values pertaining to socio-demographic characteristics and medical field, as these were not always obvious from the staff directories. Therefore, future research is needed to assess the usefulness of these instruments in a wider sample of North American and European countries.

Conclusions

Using exploratory and confirmatory factor analyses, we identified two unidimensional scales to measure attitudes of medical educators toward psychiatry and psychiatrists: the Images of Psychiatry Scale (11 items) and the Efficacy of Psychiatry Scale (5 items). These constitute the first psychometrically tested scales to measure attitudes of medical educators—knowledge that is important if we are to better understand the determinants of low recruitment into psychiatry.