Teacher knowledge consists of different types of knowledge (general pedagogical knowledge, pedagogical content knowledge, content knowledge; Shulman 1987). To be able to consider multiple perspectives in the classroom, teachers need to associate and combine these different knowledge types. If, for example, students reveal problematic beliefs about their abilities in a specific subject, their teacher will benefit from possessing multiple knowledge types: General pedagogical knowledge helps to understand how general beliefs about cognitive abilities affect motivation, while pedagogical content knowledge helps to understand subject-specific beliefs. Ideally, the teacher can apply both knowledge types simultaneously.

However, the predominant structure of teacher education does not prepare future teachers for the integrated use of knowledge, as courses usually address knowledge types separately (Darling-Hammond 2006). Recent studies have investigated methods to support knowledge integration, for instance by presenting content in an integrated manner or by prompting future teachers to integrate knowledge from separate courses (Harr et al. 2014, 2015). Though beneficial for knowledge integration, these methods are associated with disadvantages, such as low practicability and substantially increased learning times. In the present study, we explored whether an instruction that makes knowledge integration relevant to preservice teachers (relevance instruction; McCrudden and Schraw 2006) would be a practicable and efficient method to support knowledge integration. Such an instruction could be easily implemented, for example, in online learning tools. We focused on music education and investigated whether a relevance instruction would enhance the integrated application of general pedagogical knowledge and music-specific pedagogical knowledge among preservice music teachers.

Music teacher knowledge

The classical model by Shulman (1987), which has also been applied to music education (e.g., Bauer 2013; Haston and Leon-Guerrero 2008; Kerin and Murphy 2015), identifies three areas of knowledge that are important for developing teachers’ professional competence: Pedagogical knowledge (or pedagogical–psychological knowledge, PPK; Voss et al. 2011) includes knowledge about teaching and learning across content domains, such as the knowledge of learners’ motivation and beliefs. Pedagogical content knowledge (PCK) includes the knowledge of teaching and learning in a specific subject. For example, teachers should know about subject-specific instructional methods and learners’ difficulties (e.g., Berry et al. 2016). Another fundamental component is knowledge of the subject matter (content knowledge; Shulman 1987). The present article focuses on PPK and musical PCK, which specifically provide music teachers with a solid base to handle pedagogical issues.

Music teachers regard PCK as a central part of their knowledge (Ballantyne 2006; Millican 2008). Yet there is a lack of clarity as to what constitutes musical PCK, in particular because music education sets special requirements for teachers with its practical, sensual, and esthetic character (e.g., Puffer and Hofmann 2016). Across definitions, two aspects appear to be the most relevant: First, music teachers need to know about instructional strategies for teaching music. For example, they should be able to explain and demonstrate musical concepts but also know how to teach skills such as singing or playing an instrument (e.g., Ballantyne 2006; Millican 2013; Millican and Forrester 2018). Second, music teachers need to know about the learners and their music-specific ways of learning. For example, a music teacher should engage the students in practical training and should know about typical difficulties that students have with singing or making music (e.g., Millican 2013; Puffer and Hofmann 2016). One such difficulty is that students often believe themselves to be “unmusical” (e.g., Asmus 1986; Hallam and Prince 2003; Spychiger 2017).

In short, music teachers need profound PPK, PCK, and content knowledge (CK) as a base for developing teaching expertise. To use these different knowledge types in the classroom, the knowledge needs to be well organized.

Knowledge integration in teacher education

Well-integrated knowledge is an important prerequisite for applying knowledge in the classroom. Teachers should thus build knowledge structures that facilitate the simultaneous retrieval of multiple types of knowledge as well as their flexible application in practice (e.g., Hammerness et al. 2005; Harr et al. 2015). However, there is a discrepancy between the knowledge structures that are desirable for teachers and the organizational structures in teacher education, as the different knowledge types are usually taught in separate courses, at different times, by different departments, and using different terminology (Darling-Hammond 2006; Hellmann et al. 2019; Kleickmann and Hardy 2019). For example, in music teacher education, music-specific content is often provided separately from pedagogical issues (Ballantyne 2006, 2007). Such departmentalization may prompt preservice teachers to encode different knowledge types separately with hardly any interconnections. A recent study by Tröbst et al. (2018) demonstrated the challenges of merging different knowledge types in the domain of mathematics: Preservice teachers were able to transform knowledge from separate lectures with pedagogical and mathematical content into PCK; however, they developed much more PCK when taught directly. These results indicate that, without assistance, preservice teachers have trouble forming coherent schemata from consecutive contents. Knowledge can remain fragmented, which creates the risk that only some contents will be activated during teaching, while others remain inert and inapplicable in practice (see Renkl et al. 1996). In contrast, when content is taught in an integrated way, learners are more likely to associate new units of information. Such associations facilitate the simultaneous retrieval of information (spreading activation; e.g., Anderson 1983).
In addition to these cognitive consequences, a fragmented curriculum affects the perception of coherence within teacher education (e.g., Henning-Kahmann and Hellmann 2019). Future teachers often perceive little coherence between courses that address different knowledge types, and they need support to use strategies for creating stronger coherence (Graichen et al. 2019; Joos et al. 2019). Recent studies have investigated two different approaches to fostering knowledge integration in teacher education.

The first approach was to present content in an integrated manner. For instance, an integrated presentation of PPK and PCK revealed benefits in a computer-based learning environment: Preservice mathematics teachers who had learned via an integrated presentation applied more PPK and PCK simultaneously than those who had learned via a separate presentation. The integration required no or little time investment (Harr et al. 2014). Likewise, in a study involving preservice biology teachers, an integrated text that combined pedagogical information and content-related information led to more coherent justifications about designing a lesson plan than did two separate texts (Janssen and Lazonder 2016). Koehler et al. (2007) drew similar conclusions from a semester-long intervention, which addressed the issue of integrating technological knowledge into teacher education. The participants attended a seminar in which pedagogical, content-related, and technological information about the design of online courses was taught in an integrated way. By the end of the seminar, the participants had developed an integrated conception of the three knowledge types—in contrast to considering pedagogy, content, and technology as separate constructs. Taken together, these studies provide evidence that an integrated presentation supports integration processes and the joint application of different knowledge types. However, the practicability of this approach seems rather low, as it is hard to transfer to real-world settings (see Harr et al. 2015). Teacher education programs would need to be radically restructured to offer combined courses. Although such new structures are now being slowly implemented (e.g., Federal Ministry of Education and Research [BMBF] 2017), they require considerable financial and organizational investment. Furthermore, educators’ (negative) attitudes towards integrated teaching may hinder structural changes (e.g., Thibaut et al. 2018).

The second approach was to present content in a segregated way, followed by specific prompts to support knowledge integration. For instance, integrative prompts supported preservice history teachers in writing a learning journal after having worked on three texts that contained PPK, PCK, and subject-specific knowledge (Wäschle et al. 2015). The prompts supported a more balanced application of the knowledge. Harr et al. (2015) used similar prompts to foster the integration of PPK and PCK. Their setting was analogous to that described above (Harr et al. 2014), but was supplemented with an experimental condition in which a separate presentation of PPK and PCK was followed by integration prompts. With respect to the simultaneous application, these prompts were as effective as the integrated presentation. However, the prompts lengthened learning times substantially. The authors thus argue that prompts are an effective, but inefficient means of supporting knowledge integration.

In general, both approaches (presenting content in an integrated manner or prompting teachers to integrate knowledge after a separate presentation) have proven to be beneficial for knowledge integration. However, these methods are associated with drawbacks: On the one hand, an integrated presentation seems hardly practicable on a larger scale, as structural changes in teacher education would be necessary. On the other hand, integrative prompts after learning require additional time and, consequently, reduce efficiency. One method to overcome these drawbacks may be prior instructions that highlight the relevance of knowledge integration and encourage students to connect new contents. This type of instruction refers to the idea of relevance instructions (McCrudden and Schraw 2006), which are presented prior to learning. Relevance instructions offer two advantages: First, they seem easier to implement in teacher education than major structural changes, for example, as a supplement in online learning tools (e.g., e-portfolio; Robnolt et al. 2017). Second, relevance instructions may require less additional time for integration. In contrast to prompts that need to be processed after learning, a prior instruction facilitates integration throughout the entire process. Below, we address the question of whether relevance instructions are a suitable means of fostering knowledge integration.

Relevance instructions

A relevance instruction is defined as a brief instruction that gives learners a particular goal prior to learning (McCrudden and Schraw 2006). Relevance instructions have been found to improve learning outcomes (e.g., Cerdán and Vidal-Abarca 2008; Kaakinen et al. 2002; Lehman and Schraw 2002). The crucial mechanism seems to be that such instructions provide learners with a specific goal that is transformed into a personal intention and, subsequently, stimulates the use of appropriate strategies (McCrudden et al. 2010). For instance, Bråten and Samuelstuen (2004) demonstrated that students adjust their use of cognitive and metacognitive strategies according to an instruction-related purpose. In the context of the present study, studies of particular interest are those that used relevance instructions to enhance the integration of new information from multiple sources. Prior instructions in the form of specific tasks (e.g., providing a summary of multiple texts; Gil et al. 2010; Wiley and Voss 1999) or a broader question that emphasized the importance of integrating different texts (Cerdán and Vidal-Abarca 2008) provoked more integrative strategies and coherent text writing. These results indicate that prior instructions are an effective tool to support the learning process—not just concerning general learning outcomes, but also with respect to integrating new information from multiple sources.

However, such instructions can come at the cost of increased cognitive load (e.g., Kaakinen et al. 2002; McCrudden et al. 2010). We consider cognitive load in terms of (a) perceived task difficulty, (b) mental effort, and (c) time-on-task (i.e., time invested in learning and testing). Perceived task difficulty refers to the characteristics of the material and the task, whereby learners perceive more demanding material as more difficult (e.g., Kalyuga et al. 2001). Integrating contents is more demanding than processing separate contents (e.g., Harr et al. 2015); thus, a corresponding instruction is likely to heighten the perceived difficulty of the materials or even induce overload. Mental effort refers to the learner’s internal cognitive activities (e.g., Paas 1992). Integrating knowledge requires particular processing: The task involves the formation of integrated schemata, which increases essential processing. Furthermore, the task requires that learners keep mental representations in their working memory for integration purposes (see Mayer and Moreno 2003). Time-on-task refers to the time that learners spend on a task and can be seen as an objective indicator of cognitive load and resource allocation (e.g., van Gog and Paas 2008; Schnotz and Kürschner 2007). Relevance instructions can increase the time spent on learning (e.g., McCrudden et al. 2010), and the task of integrating knowledge in particular requires additional time (Harr et al. 2015).

Furthermore, the effects of relevance instructions might be moderated by learners’ prior knowledge. Here, research findings suggest different assumptions: On the one hand, learners with high prior knowledge might benefit especially from relevance instructions as their knowledge provides a solid base for processing and, thereby, integrating new information (Bråten and Samuelstuen 2004; Gil et al. 2010). On the other hand, learners with low prior knowledge might benefit from relevance instructions as they help them to apply good strategies that they would not apply spontaneously (Bråten et al. 2017).

Knowledge about student beliefs as an example for knowledge integration

Knowledge integration becomes particularly relevant when information concerning the same topic is provided separately (Renkl et al. 1996). This problem is inherent in teacher education as the different knowledge types (e.g., pedagogical–psychological knowledge, PPK, and pedagogical content knowledge, PCK) address similar topics but are usually provided in separate courses (see Tröbst et al. 2019). For example, both knowledge types include knowledge about learners’ beliefs: PPK concerns learners’ general beliefs; PCK concerns subject-specific beliefs (e.g., Berry et al. 2016; Voss et al. 2011). The present study thus takes beliefs (more specifically, students’ beliefs about musical abilities) as an exemplary topic for demonstrating how separate information can be integrated.

There were two main reasons for choosing ability beliefs as a content: First, ability beliefs are tapped by both PPK and (musical) PCK, whereby the integration of both perspectives creates an additional benefit. From a pedagogical–psychological perspective, beliefs about abilities are conceptualized for instance as intuitive theories about intelligence (Dweck 2000) or as dysfunctional attributional beliefs (Weiner 1985). From the music educators’ perspective, beliefs about musical abilities can be understood as part of the musical self-concept (Spychiger 2017) or as society-shaped concepts of musical talent (Hallam and Prince 2003). While pedagogical–psychological research has stimulated studies about how to change dysfunctional beliefs (e.g., Yeager and Walton 2011), music-specific research has shed light on how music lessons unintentionally support certain beliefs (e.g., Austin and Vispoel 1998). Hence, the combination of both perspectives should help music teachers analyze beliefs thoroughly and react adequately when facing problematic beliefs. A second reason for selecting the learning content was that ability beliefs are a relevant issue in the music classroom. Dysfunctional beliefs such as the belief that abilities are fixed and unchangeable create long-lasting barriers to learning (e.g., Dweck 2000) and can impair the learning process in multiple ways. For example, fixed beliefs can decrease students’ motivation, engagement, emotional well-being, and achievement (e.g., Blackwell et al. 2007; Burnette et al. 2013; King 2016). An overestimation of innate ability as a predictor of good performance is particularly pronounced in the artistic domains, such as music education (e.g., Hallam and Prince 2003; Patterson et al. 2016). Consequently, it is important for music teachers to know about students’ music-related beliefs, attitudes, and emotions (see Macrides and Angeli 2018).

To counteract negative developments caused by dysfunctional beliefs, teachers should attend to beliefs and handle them competently. Blömeke et al. (2015) provide a framework for describing teacher competence as an interplay between dispositions, situation-related skills, and performance: Dispositional factors (for example, teachers’ knowledge) have an influence on teachers’ skills to handle classroom situations. These skills comprise noticing a critical event, interpreting it, and making an adequate decision about how to respond, which, in turn, determines actual behavior in the classroom. The model thus describes how declarative knowledge is transformed into usable (procedural) knowledge (see also Kersting et al. 2012). Interpreting and decision-making in particular are complex tasks that require profound knowledge (König et al. 2014; Seidel and Stürmer 2014).

With regard to ability beliefs, competent acting may be influenced by teachers’ knowledge and their beliefs. Topic-related knowledge supports teachers’ situation-related skills (e.g., Stahnke et al. 2016). Knowledge about ability beliefs may thus help to notice dysfunctional beliefs, interpret beliefs, and decide about an appropriate reaction. Furthermore, teachers’ beliefs affect how they perceive and interpret classroom situations (e.g., Lee and Cross Francis 2017; Meschede et al. 2017). For example, an incremental theory about abilities (i.e., the belief that abilities can change) is associated with better noticing of fixed beliefs and with helpful interpretations of student achievement (Rieche et al. 2019; Butler 2000; Rattan et al. 2012). With respect to the intervention in this study, however, previous research found that effects of relevance instructions were independent of content beliefs (Bohn-Gettler and McCrudden 2018). It can thus be assumed that teachers’ ability beliefs do not moderate the effect of a relevance instruction on knowledge integration.

The present study

Future teachers face the challenge of integrating knowledge from different disciplines (e.g., Ballantyne 2007; Darling-Hammond 2006; Tröbst et al. 2018). These knowledge types are often taught in separate courses, which can lead to inert knowledge (Renkl et al. 1996) and impede the practical application. In the present study, we investigated whether prior instruction that emphasizes the importance of knowledge integration would enhance the integrated application of pedagogical–psychological knowledge (PPK) and musical pedagogical content knowledge (PCK). Learners’ beliefs about cognitive abilities (i.e., PPK) and learners’ music-specific beliefs (i.e., PCK) were chosen as content. In terms of outcome measures, we assumed that integrated knowledge would be reflected by the simultaneous application of PPK and PCK. In accordance with Blömeke et al. (2015), we defined the application of teacher knowledge as a process consisting of noticing and interpreting relevant classroom aspects and making instructional decisions. These situation-related skills were assessed with tasks that resemble classroom situations (i.e., scenario-based tasks). As well-organized knowledge is particularly important for interpreting and decision-making, we focused on these two skills to investigate knowledge application. Additionally, noticing was measured before and after learning.

We addressed the following hypotheses: First, in line with previous research (e.g., Cerdán and Vidal-Abarca 2008), we expected a positive effect of the relevance instruction on the integration of information from two separate lectures. An instruction that makes knowledge integration relevant should foster the simultaneous application of PCK and PPK when preservice teachers interpret classroom situations and make decisions, as opposed to a control instruction (Simultaneous application hypothesis). This hypothesis represents our study’s central aim.

Second, we expected knowledge integration to increase cognitive load (see Harr et al. 2014, 2015). Consequently, an instruction that targets knowledge integration should stimulate higher cognitive load, compared to a control instruction (Cognitive load hypothesis). An increased load should be apparent as the integration takes place, that is, when content from more than one knowledge type is being processed (after the second lecture and in the posttest). As indicators for cognitive load, we considered (a) perceived task difficulty, (b) mental effort, and (c) time-on-task.

Third, as in previous studies (e.g., Bråten and Samuelstuen 2004; Gil et al. 2010), we expected that the effect of the relevance instruction might be moderated by prior knowledge (Moderation hypothesis). Two directions were possible: On the one hand, learners with high prior knowledge could particularly benefit from the instruction as they are well equipped to integrate different knowledge types. On the other hand, learners with low prior knowledge could particularly benefit from the instruction as they need support to apply appropriate strategies.

Finally, we explored whether the relevance instruction affects preservice teachers’ perception of coherence. A fragmented university curriculum often entails a weak feeling of coherence between areas of teacher knowledge (see Hellmann et al. 2019). The relevance instruction might prompt the preservice teachers to concentrate on overlaps and similarities between the lectures, and to use integrative strategies, which may, in sum, strengthen their perception of coherence. We thus explored whether there would be any differences between conditions regarding perceived coherence.


Participants and design

Seventy-two preservice music teachers from German universities participated in this computer-based experimental study (70% female; age M = 22.40, SD = 2.83). The preservice teachers were enrolled at a music conservatory or university of education in four German cities. Our sample included future teachers on the secondary level (64%), primary level (31%), and special needs education (4%). Most of the participants had little practical experience in the classroom (6 months or less: 75%; 7–12 months: 21%; more than 12 months: 4%). Their average time enrolled in a music teacher education program was five semesters (M = 5.04, SD = 3.60). On average, the participants had taken 2.10 (SD = 2.03) courses in educational psychology and 2.11 (SD = 2.03) courses in music education. The study focuses on preservice teachers because, due to the often fragmented university curriculum, the risk of fragmented and inapplicable knowledge is particularly high for this target group. Instructional support is needed to facilitate future teachers’ knowledge development (Hammerness et al. 2005; Hellmann et al. 2019; Kleickmann and Hardy 2019).

The participating preservice teachers worked on a learning environment that consisted of a PPK lecture and a PCK lecture on ability beliefs. Participants were randomly assigned to one of two conditions: In the integration condition (n = 36), participants received instruction that addressed the relevance of integrating knowledge before they learned new content (one full instruction before the first lecture and a short reminder before the second lecture). In the control condition (n = 36), participants received control instruction before the first lecture that did not address knowledge integration. As a main dependent variable, we assessed the simultaneous application of PPK and PCK in scenario-based tasks.
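
The random assignment to two equally sized conditions can be illustrated with a short Python sketch. The text states only that participants were assigned randomly; the shuffle-and-split procedure, the function name `assign_conditions`, the participant IDs, and the seed are assumptions made for illustration and do not describe the software actually used.

```python
import random


def assign_conditions(participant_ids, conditions=("integration", "control"), seed=None):
    """Randomly assign participants to equally sized conditions (hypothetical helper)."""
    rng = random.Random(seed)  # a seeded generator keeps the assignment reproducible
    ids = list(participant_ids)
    if len(ids) % len(conditions) != 0:
        raise ValueError("number of participants must be divisible by number of conditions")
    rng.shuffle(ids)  # random order of participants
    per_group = len(ids) // len(conditions)
    # The first block of shuffled IDs goes to the first condition, the next block to the second.
    return {pid: conditions[i // per_group] for i, pid in enumerate(ids)}


# Assumed IDs 1..72, matching the sample size reported above.
assignment = assign_conditions(range(1, 73), seed=2024)
```

Shuffling the full list and splitting it yields exactly 36 participants per condition, which independent coin flips per participant would not guarantee.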

We conducted our research in accordance with the German Psychological Society’s (DGPs) ethical guidelines (2004, CIII) as well as APA ethical standards. The study was funded by the German Federal Ministry of Education and Research (BMBF; grant number 01JA1518A), which did not require additional Institutional Review Board approval. We recruited the preservice teachers via bulletin boards, social media, and announcements in lectures. All participants took part voluntarily and received €15 for participating. Each participant read a consent document about the procedures and data protection, and provided informed consent. All were aware that they were taking part in research. The data were collected and analyzed anonymously.


Figure 1 provides an overview of the procedure. The study was conducted in a computer lab. Each participant worked on a computer individually and was equipped with headphones and a sheet of paper. First, participants agreed to the consent document and answered demographic questions. Afterwards, they worked on an open scenario that assessed prior noticing, a prior knowledge test, and a questionnaire about intuitive theories. Next, participants received either a relevance instruction or a control instruction. In the following learning phase, participants watched two lectures and took notes on a sheet of paper. They proceeded to the next video at their own pace but could not go backwards or repeat a video. Participants in the integration condition received a short reminder between the first and second lecture. All participants rated their cognitive load after finishing a lecture. After the learning phase, the participants handed in their notes and proceeded to the posttest that assessed post-noticing, post-knowledge, and knowledge application (via scenario-based tasks). Finally, the participants rated their cognitive load and perceived coherence. The whole procedure took about an hour and a half.

Fig. 1. Notes: CL = cognitive load


The procedure of this study follows the tradition of previous studies, which used laboratory settings that resemble university courses and investigated immediate effects of instructional methods on the development of preservice teachers’ knowledge (e.g., Evens et al. 2018; Graichen et al. 2019; Harr et al. 2014; Tröbst et al. 2018; Wäschle et al. 2015). Such experimental studies serve to estimate an intervention’s probability of success before implementing it in the field. The controlled setting allows for reducing confounding factors and thus investigating the mere effect of the intervention.


Relevance instruction and control instruction

The preservice teachers in the integration condition received the relevance instruction; the preservice teachers in the control condition received a control instruction. A table that presents a comparison of both instructions is included in the appendix (“Appendix 1”). The relevance instruction started with a short preface and, subsequently, provided a practical example for integrated knowledge. This example refers to a music teacher who realizes that both pedagogical–psychological knowledge and music-specific knowledge need to be considered in order to prepare a good lesson. Afterwards, the different types of teacher knowledge were described and information about the relevance of integrated knowledge was given. The relevance instruction ended with general instructions on the upcoming learning phase and with a specific instruction to integrate knowledge throughout the study (434 words). Thus, knowledge integration was made relevant. Before starting the second lecture, a short reminder of the instruction was presented (49 words). The control instruction contained only the preface, the description of the different types of teacher knowledge, and the general instructions on the upcoming learning phase without addressing the relevance of knowledge integration (93 words).

PPK and PCK lectures

We used videos to present PPK and PCK in the style of university lectures with slides (bullet points and pictures) and spoken text. To maintain the impression that the videos originated from different courses, we employed two speakers, one for the PPK lecture and another for the PCK lecture, as well as different designs of the slides (e.g., color use, font). Each lecture took about 14 min, and was divided into four or five short videos. The perceived task difficulty did not differ between lectures (p = .439). Furthermore, we presented the lectures in random order to analyze sequence effects. The sequence of the lectures did not affect learning outcomes (all ps > .12).

The PPK lecture addressed intuitive theories about intelligence (Dweck 2000) and explained how incremental and entity theories shape students’ motivation and learning. Five strategies for changing entity beliefs were presented (e.g., to give information about the malleability of intelligence; see Yeager and Walton 2011). The PCK lecture addressed students’ musical self-concept and its influencing factors (e.g., Müllensiefen et al. 2015; Spychiger 2017). The lecture explained how self-beliefs are associated with general music-related beliefs, whereby examples for typical music-related beliefs such as “musical ability is inborn” or “talented students do not need much practice” were provided (e.g., Asmus 1986; Hallam and Prince 2003). Furthermore, we explained how certain teaching methods might unintentionally support such beliefs. Appendix 2 illustrates how the contents from the lectures were connected to the knowledge application test.


Noticing of student beliefs

We selected text scenarios to measure noticing and knowledge application (i.e., interpreting and decision-making). Such scenarios are a valid and economical instrument to represent classroom practice (see approximations of practice; Grossman et al. 2009) and are commonly used to assess situation-related skills (e.g., Dreher and Kuntze 2014; Jacobs 2017). Furthermore, scenario-based tasks are seen as an indicator of usable knowledge and can predict teachers’ instructional practices (Kersting et al. 2012). Although text scenarios are less authentic than, for example, video scenarios, text has several advantages: Text allows for rereading and focusing on relevant information, whereas videos are fleeting and contain distracting information (see Leahy and Sweller 2011; Lowe 2003). Text examples thus require less processing time than video examples (Hefter et al. 2019). Recent findings demonstrate that text and video are equally effective for training and assessing teachers’ situation-related skills (Friesen and Kuntze 2016; Kramer et al. 2017). Moreover, as videos can be overwhelming for preservice teachers, text might be particularly useful for this target group (Schneider et al. 2016). We thus opted for text scenarios. To ensure a sufficient level of authenticity, we consulted experts in music education who approved of the fictitious scenarios.

To measure whether the participants noticed problematic beliefs, we used a text scenario before and after the learning phase. This scenario described a music lesson in which a student expressed a fixed belief about musical abilities. In addition, other task-related and class-related problems were mentioned (as distractors). The participants read the scenario and wrote about the difficulties they had observed. We coded whether an answer contained the problematic belief or not, and assigned one point if the belief was mentioned, and zero points if it was not mentioned. Two raters (the second author and a student assistant) coded all answers, revealing high interrater reliability for noticing prior to learning (Cohen’s Kappa κ = .94) and for noticing after learning (κ = 1.00). The raters were blind to condition and worked independently of each other.
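
To make the agreement statistic concrete, the following Python sketch computes Cohen’s kappa for two raters’ binary noticing codes (1 = belief mentioned, 0 = not mentioned). The function name `cohens_kappa` and the example codes are hypothetical and were chosen for illustration; they are not the study’s data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same, non-empty set of answers")
    n = len(rater_a)
    # Observed proportion of answers on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, based on each rater's category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes for ten answers (not the study's data).
codes_rater_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
codes_rater_b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
kappa = cohens_kappa(codes_rater_a, codes_rater_b)
```

In this toy example the raters agree on nine of ten answers, while chance agreement is .50, so kappa is lower than the raw agreement rate of .90.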

Intuitive theories about musical ability

We used a nine-item questionnaire to measure intuitive theories about musical abilities. The items were taken from the “Subjective Beliefs about the Conditions of Success in Learning Contexts Scales” (Spinath and Schöne 2003), whereby “intelligence” was changed to “musicality” (e.g., “Everybody has a certain degree of musical ability that cannot be changed/can be changed”). Participants rated their agreement on a five-point scale. Higher scores represented an incremental theory about musical ability; lower scores represented an entity theory. The internal consistency for the scale was satisfactory (α = .72).

Conceptual knowledge

Before and after learning, we assessed the conceptual knowledge of the learning content with six open-ended questions. Three questions tapped content from the PPK lecture (e.g., “What is the difference between an entity theory and an incremental theory about intelligence?”) and three questions tapped content from the PCK lecture (e.g., “Name three factors that influence the musical self-concept”). Two raters (the second author and a student assistant) assigned points for correct answers. The highest possible score was twelve points. Interrater reliability was determined by intraclass correlation coefficients (ICC two-way random, not adjusted) and showed high agreement, ICC = .93. Thus, we used the mean scores from both ratings for further analyses and aggregated them into a total score for prior conceptual knowledge and post-conceptual knowledge. Internal consistency was acceptable for prior knowledge (α = .63), but low for post-knowledge (α = .23). The low reliability occurred due to ceiling effects in single items and indicated that the test differentiated insufficiently for high knowledge. In our analyses, conceptual knowledge was merely included as a learning check.

Simultaneous application

Four text scenarios assessed how the participants applied the learning content to specific classroom situations. In all scenarios, students made utterances indicating the belief that musical ability is fixed. For example, a student said “I’ve never been able to hear if a melody is wrong or not. There’s nothing I can do about it.” The participants read the scenario and answered three questions: (1) “What difficulties do you see with regard to this student?”, (2a) “What would you do in this situation?”, and (2b) “What would you do in future lessons?”. The first question aimed at teachers’ interpretations of the situation; question 2a and question 2b aimed at teachers’ decision-making. Both sub-questions assessed the same skill (i.e., making instructional decisions) and were only split to obtain a wider range of responses. For the analyses, we merged the answers to questions 2a and 2b. Additionally, we instructed the participants to relate their answers to the learning content and use technical terms when applicable. Thus, we guided them to focus on the problematic beliefs. The rationale behind these scenarios was that both PPK and PCK content are useful for answering the questions. On the one hand, participants could interpret the belief as an entity theory about abilities and suggest interventions from the PPK lecture. On the other hand, they could interpret the belief as a sign of a low musical self-concept and suggest changing their teaching methods according to the PCK lecture. Ideally, participants would not choose one or the other, but rather apply both PPK and PCK in an integrated way. Appendix 2 provides examples of how the content from the lectures could be used for answering the scenario-based tasks.

In the first step of the analysis, we collated all answers into individual statements and coded whether they contained any PPK or PCK aspects. This allocation was bound to the content of the lectures: If a statement contained aspects from the PPK lecture, we assigned a point for PPK application (and vice versa); if it contained aspects from both lectures, we assigned one point for PPK and one point for PCK application each. Two blinded and independent raters coded all answers, revealing high interrater reliability for PPK application (ICC = .81) and for PCK application (ICC = .84). However, the absolute scores varied significantly between the four scenarios (ps < .001) and were, therefore, z-standardized before we aggregated them into mean scores.
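The standardization and aggregation step can be sketched as follows. This is an illustration under stated assumptions, not the original analysis script: the function names, the scenario labels, and the scores are hypothetical, and the sketch uses the population standard deviation although the sample standard deviation would be an equally plausible choice.

```python
from statistics import mean, pstdev


def z_standardize(scores):
    """Return z-scores for one scenario (population SD is an assumption here)."""
    m, sd = mean(scores), pstdev(scores)
    if sd == 0:
        raise ValueError("Scores without variance cannot be standardized.")
    return [(s - m) / sd for s in scores]


def aggregate_across_scenarios(raw):
    """raw maps a scenario label to a list of scores, one per participant, same order."""
    z_by_scenario = [z_standardize(scores) for scores in raw.values()]
    # Average each participant's z-scores over all scenarios.
    return [mean(participant) for participant in zip(*z_by_scenario)]


# Hypothetical PPK application points for three participants, not study data.
raw_ppk = {
    "scenario_1": [0, 1, 2],
    "scenario_2": [2, 4, 6],
    "scenario_3": [1, 1, 4],
    "scenario_4": [3, 5, 7],
}
ppk_scores = aggregate_across_scenarios(raw_ppk)
print([round(score, 2) for score in ppk_scores])
```

Standardizing within each scenario first keeps a scenario with generally higher absolute scores from dominating the mean score.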

In the second step, we rated the simultaneous application on a three-point scale for interpretations (question 1) and decision-making (questions 2a and 2b). The scale ranged from 0 (only one knowledge type used) through 1 (both knowledge types used, but not integrated) to 2 (both knowledge types used and integrated). Table 1 displays the coding scheme and provides examples of the simultaneous application of different qualities. The two ratings agreed moderately with regard to absolute values (ICC, not adjusted = .61), but offered acceptable agreement regarding consistency (ICC, adjusted = .73). Hence, combined scores were used for further analyses. We aggregated these combined scores for interpretation and decision-making over all four scenarios into a sum score for simultaneous application (maximum possible score = 16, i.e., four scenarios × two ratings × two points). The internal consistency of this measure was acceptable (α = .60; see Schmitt 1996).

Table 1 Coding scheme for PPK application, PCK application, and simultaneous application for four sample answers

Note that, unlike noticing, simultaneous application within interpreting and decision-making was only measured after learning. Previous studies (Rieche et al. 2018, 2019) had indicated that the level of preservice teachers’ spontaneous interpretations of beliefs and their decisions are of a rather low quality. In particular, there were few preservice teachers who applied theoretical knowledge in their answers. As the participants in these studies were comparable to the present study, we could expect a similarly low level of knowledge-based interpretations and decisions.

Perceived coherence

A nine-item questionnaire assessed to what extent the preservice teachers perceived coherence between the lectures. The items tapped the coherence during learning and during knowledge application (e.g., “The two lectures overlapped with regard to content”, “I used content from both lectures while working on the classroom scenarios”). Participants indicated their agreement on a four-point scale from 1 (do not agree) to 4 (agree). Internal consistency was good (α = .75).

Cognitive load

Cognitive load was measured with two subjective indicators (self-reported perceived task difficulty and mental effort) and an objective indicator (time-on-task). Subjective ratings of difficulty and effort are frequently used in cognitive load research and are considered to provide valid data (e.g., van Gog and Paas 2008; Sweller et al. 2019). The items were presented three times in total, that is, after the first and second lectures and after the posttest. Participants rated their agreement on a scroll-bar that ranged from 1 (not at all) to 9 (totally). Perceived task difficulty was assessed with two items (“How difficult did you find it to understand the learning content?”, “Do you feel that working on the learning content was overstraining?”; see Kalyuga et al. 2001). The items were aggregated into a mean score; the Spearman-Brown-corrected reliability values were good at all three measurement points (.78, .63, and .76). Mental effort was assessed with a single item (“How much effort did you invest to understand the learning content?”; see Paas et al. 2003; Paas 1992). There was a positive correlation for mental effort between the first and second lecture, r = .60, p < .001, and between the second lecture and posttest, r = .51, p < .001; this can be seen as a type of retest reliability for the single-item measure. Note, however, that the correlation between the second lecture and posttest was lower than that between the two lectures, as the ratings did not refer to the same activity. Time-on-task was calculated using participants’ log data. Three separate scores were calculated for the time spent on the first lecture, on the second lecture, and on the posttest.
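For readers unfamiliar with the Spearman-Brown correction applied to the two-item difficulty scale, the following sketch shows the standard formula and its inverse. The function names are illustrative; the three reliability values are those reported above, and the inter-item correlations the loop prints are merely implied by the formula, not reported in the study.

```python
def spearman_brown(r, k=2):
    """Predicted reliability of a scale k times as long, given the correlation r between parts."""
    return k * r / (1 + (k - 1) * r)


def inter_item_r_from_reliability(rel, k=2):
    """Invert the formula: the inter-item correlation implied by a corrected reliability."""
    return rel / (k - (k - 1) * rel)


# Reliability values reported for the three measurement points.
for rel in (0.78, 0.63, 0.76):
    r = inter_item_r_from_reliability(rel)
    print(f"reliability {rel:.2f} -> implied inter-item r of about {r:.2f}")
```

For a two-item scale the correction reduces to 2r / (1 + r), which is why it is often preferred over Cronbach's alpha in this case.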

Data analyses

Following preliminary analyses, we structured the data analyses according to our hypotheses. For all analyses, we used an alpha level of 0.05 and partial eta-squared (\(\eta_{p}^{2}\)) as an effect size index. Values of 0.01, 0.06 and 0.14 correspond to small, medium, and large effects (Cohen 1988). When a test of a hypothesis was non-significant, we estimated Bayes factors (BF01) to quantify the evidence for the null hypothesis given the present data (e.g., a value of 3 indicates that the data are three times more likely under the null hypothesis than under the alternative hypothesis). We used the JZS prior to estimate the Bayes factor, as recommended by Rouder et al. (2009).
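As a rough check on how the effect size index relates to the test statistics, partial eta-squared can be recovered from an F value and its degrees of freedom (for a t-test, F equals t squared with one effect degree of freedom). The function names below are illustrative and not part of the original analysis; the example values are the F and t statistics reported in the results for simultaneous application and for time spent on the second lecture.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def partial_eta_squared_from_t(t_value, df_error):
    """For a two-group comparison, F = t**2 with one effect degree of freedom."""
    return partial_eta_squared(t_value**2, 1, df_error)


# F(1, 69) = 7.71 for simultaneous application.
eta_application = partial_eta_squared(7.71, 1, 69)
# t(70) = 2.12 for time spent on the second lecture.
eta_time = partial_eta_squared_from_t(2.12, 70)
print(round(eta_application, 2), round(eta_time, 2))
```

Both values round to the effect sizes given in the text (0.10 and 0.06), which fall in the medium range of Cohen's (1988) benchmarks.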

First, an ANCOVA was conducted to test the simultaneous application hypothesis. We included total prior knowledge as a covariate, as it was significantly related to simultaneous application, r = .27, p = .021. There was no interaction between condition and covariate (p = .183), indicating that the assumption of homogeneity of regression slopes was met. Second, t-tests were used to investigate time-on-task (and later for analyzing perceived coherence). Third, ANOVAs with repeated measures were used to test the cognitive load hypothesis. Analyses included condition as the between-factor and time as the within-factor. Finally, we used a moderation analysis (Hayes 2013) to test whether the effect of the relevance instruction was moderated by prior knowledge, and we performed a mediation analysis (Hayes 2013) to test whether there was an indirect effect of time-on-task on knowledge application.


Preliminary analyses

Table 2 presents the descriptive values for pretest and posttest measures. With respect to noticing, there were no differences between conditions prior to learning or after learning (χ² tests; ps > .627). However, the percentage of participants who noticed problematic beliefs increased markedly from 38% (both groups) before learning to 97% after learning (McNemar test, p < .001). The scores for intuitive theories indicated a strong tendency towards incremental beliefs among all participants (M = 3.87, SD = 0.47, on a five-point scale). There were no substantial differences between conditions, t(70) = 0.86, p = .393. Regarding conceptual knowledge, an ANOVA with repeated measures revealed a large effect of time (p < .001; \(\eta_{p}^{2} = 0.81\)), but no difference between conditions (p = .983). The interaction between time and condition did not reach statistical significance, F(1, 70) = 3.37, p = .071, but suggests that the integration condition had a somewhat steeper learning trajectory than the control condition. With regard to the collected demographic data (age, gender, semester, practical experience, school type), no significant differences between the conditions occurred (ps > .10).

Table 2 Means (and standard deviations) of prior and post measures for both conditions

Simultaneous application hypothesis

As predicted, there was a significant difference in simultaneous application between groups, F(1, 69) = 7.71, p = .007, \(\eta_{p}^{2} = 0.10\). The integration condition outperformed the control condition in applying integrated PCK and PPK. This finding demonstrates that the instruction stimulated the preservice teachers to apply knowledge from different lectures in an integrated manner. Note that there were no differences between conditions with respect to mere PPK application or PCK application (ps > .153).

Cognitive load hypothesis

With respect to self-reported cognitive load, there were significant effects of time for perceived task difficulty and mental effort (ps < .001; \(\eta_{p}^{2}\)s > 0.29). The participants found the posttest more difficult than the two lectures, and invested more mental effort to understand the lectures compared to the posttest (see Table 3). However, in contrast to our hypothesis, we found no effect of condition (ps > .486; BF01s > 3.89), and no significant interaction between time and condition (ps > .576; BF01s > 7.88). These findings indicate that the students who had received the relevance instruction did not experience higher cognitive load, although they were more engaged in integration activities. The instruction evidently stimulated integrative processing without raising mental effort or perceived difficulty.

Table 3 Means (and standard deviations) of cognitive load for both conditions

Furthermore, we analyzed time differences between conditions (see Table 3). As expected, the integration condition spent more time on the second lecture than the control condition, t(70) = 2.12, p = .038, \(\eta_{p}^{2} = 0.06\). The time difference was about one minute (medium effect). However, the conditions did not differ regarding posttest time, t(70) = 1.21, p = .231, BF01 = 2.20. To explain these findings, we investigated the time spent on the first lecture: Although the participants could not (yet) integrate, the integration condition took descriptively more time for the first lecture than the control condition. That test did not reach the level of statistical significance, t(70) = 1.76, p = .083, BF01 = 1.11, but it indicates that the participants in the integration condition tended to take more time to learn.

We explored the relation between time-on-task and the learning outcome (i.e., simultaneous application) with an additional mediation analysis and bivariate correlations. The mediation analysis revealed that condition predicted time spent on the second lecture, b = 1.04, p = .038, but time did not predict simultaneous application, b = 0.13, p = .488, BF01 = 5.29, and there was no indirect effect of condition on simultaneous knowledge application via time, b = 0.14, 95% CI [−0.16, 0.63], BF01 = 2.69. These results demonstrate that higher simultaneous application in the integration condition was not driven by time-on-task during learning. Furthermore, there was only a weak, non-significant correlation between posttest time and simultaneous application (r = .19, p = .118), which indicates that, on the one hand, writing well-integrated answers tended to take more time, but on the other hand, time-on-task alone cannot explain the better results in simultaneous application. In summary, these findings partially support our hypothesis. The participants in the integration condition needed only a little longer for the second lecture, but not for the posttest. In line with the results regarding self-reported cognitive load, the relevance instruction stimulated integrative processing without a high cost in time.
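The indirect effect in such a mediation model is conventionally the product of the two path coefficients. The short sketch below, with illustrative variable names, assumes that the reported coefficients correspond to the path from condition to time (a) and the path from time to simultaneous application (b); it reproduces only the point estimate, not the confidence interval, which requires resampling the raw data.

```python
def indirect_effect(a_path, b_path):
    """Product-of-coefficients estimate of an indirect (mediated) effect."""
    return a_path * b_path


# Coefficients as reported: condition -> time (a), time -> simultaneous application (b).
a_path = 1.04
b_path = 0.13
indirect = indirect_effect(a_path, b_path)
print(round(indirect, 2))
```

Within rounding, the product matches the reported indirect effect of b = 0.14.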

Moderation hypothesis

Moderation analysis showed that knowledge integration was predicted by condition, b = 2.05, SE = 0.76, t = 2.69, p = .009, and total prior knowledge, b = 2.52, SE = 1.12, t = 2.26, p = .027. In contrast to our hypothesis, there was no significant interaction, b = −1.00, SE = 0.74, t = −1.36, p = .179, BF01 = 13.94. Hence, the relevance instruction supported knowledge integration independently of prior knowledge.

Perceived coherence

We detected no significant difference between conditions regarding their perceived coherence, t(70) = 0.51, p = .612, BF01 = 3.68. Hence, although the relevance instruction increased integrated application, it did not increase the students’ subjective feeling of coherence between the lectures or their evaluations of the extent to which they had integrated the learning material.


The present study investigated whether an instruction that makes knowledge integration relevant would foster the integrated application of different knowledge types (pedagogical–psychological knowledge and pedagogical content knowledge). The results reveal important contributions to research on teacher education as well as to research on relevance instructions: (1) The relevance instruction increased the integrated application of PPK and musical PCK in scenario-based tasks. (2) The relevance instruction did not increase self-reported cognitive load (perceived task difficulty and mental effort). Regarding time-on-task, the instruction slightly increased the time spent on learning but not the time spent on knowledge application. (3) The effect of the relevance instruction on knowledge application was not moderated by prior knowledge.

Simultaneous application, cognitive load, and prior knowledge

In line with our central hypothesis and with previous studies (e.g., Cerdán and Vidal-Abarca 2008), our findings indicate that relevance instructions can support knowledge integration. Preservice music teachers who were given a relevance instruction prior to learning used their knowledge in a more integrated way while interpreting classroom situations and making instructional decisions. In accordance with former research (Graichen et al. 2019; Harr et al. 2015; Wäschle et al. 2015), the intervention did not affect declarative learning but substantially improved the application of the knowledge. This outcome represents a central aim of teacher education: Instead of taking a pedagogical–psychological stance or a subject-specific stance, teachers should consider multiple perspectives in the classroom (Darling-Hammond 2006). Our study thus adds to the debate on how to improve teacher education in general (e.g., Hellmann et al. 2019; Kleickmann and Hardy 2019) and music teacher education in particular (e.g., Ballantyne 2007), and shows that relevance instructions may be a means of bridging different disciplines.

The results regarding cognitive load were contrary to our hypothesis: Apart from slightly increased time-on-task during learning, there were no signs of increased cognitive load. We consider this unexpected result from two perspectives: From one perspective, high cognitive load is desirable if it occurs due to the requirements of the task (i.e., intrinsic load) and due to the construction of cognitive schemata (i.e., germane load, see, e.g., Schnotz and Kürschner 2007). Time-on-task can then be regarded as an important predictor of learning outcomes (e.g., Goldhammer et al. 2014). Building a coherent mental representation from multiple sources is demanding and requires additional effort (Britt and Rouet 2012). Hence, the preservice teachers might have needed more time for integrating the PPK and the PCK lecture. This result is in line with Harr et al. (2015), who found that learners who had received support for knowledge integration took more time than those who had worked on separate lectures. In that study, however, the two integration conditions needed 116% and 188% of the time that the control condition spent on learning. In comparison, the time difference in the present study seems rather small: The integration condition took 107% of the time that the control condition spent on learning. Thus, the preservice teachers only needed a little longer and reported no higher mental effort or higher difficulty, but nevertheless showed a better integrated knowledge application. This result is striking, but good news for potential implementation in practice, where simple interventions are more appealing than those demanding much time and effort.

From another perspective, high cognitive load is undesirable if it reflects unnecessary processing (i.e., extraneous load), which can, for example, stem from the design of the learning material or from a waste of time and effort (Schnotz and Kürschner 2007). Time-on-task then does not predict learning outcomes (e.g., Goldhammer et al. 2014). From this perspective, it is encouraging that the relevance instruction did not increase cognitive load. Similarly, previous studies on relevance instructions found time and effort to be unaffected by the instruction (e.g., Bohn-Gettler and McCrudden 2018; Narvaez et al. 1999). McCrudden et al. (2010) argue that a relevance instruction helps learners focus on the relevant information, which reduces their processing of irrelevant information. Presumably, the relevance instruction in our study reduced extraneous load by focusing the learners on the integration of information. Learners who did not receive the instruction may have experienced more unnecessary load from processing the different information and the varying design of the lectures, thus wasting time and effort. In summary, we assume that knowledge integration increased necessary cognitive load due to the construction of coherent schemata but, at the same time, the relevance instruction reduced unnecessary load caused by processing the multiple sources. As our instruments do not allow for testing these mechanisms, further research is needed.

Finally, we found no moderating effects of prior knowledge on knowledge application. More specifically, the preservice teachers’ prior knowledge did not influence the effect of the relevance instruction on the integrated application of PPK and PCK. This finding contradicts our hypothesis but corresponds with former studies (e.g., Cerdán and Vidal-Abarca 2008; Harr et al. 2015). Thus, the intervention was independent of background knowledge; both learners with high prior knowledge (who have better preconditions for integrating new information, see Gil et al. 2010) and learners with low prior knowledge (who may need support to apply good strategies, see Bråten et al. 2017) profited from the instruction. This result indicates that an instruction that makes knowledge integration relevant can be used for a broad target group, which facilitates its practical implementation.

Perceived coherence and noticing of student beliefs

With respect to the debate on coherence in teacher education (e.g., Federal Ministry of Education and Research [BMBF] 2017; Hellmann et al. 2019), we explored the preservice teachers’ perception of coherence. Yet, although the relevance instruction fostered the integrated application of knowledge (objectively), it did not increase the subjectively perceived coherence. One possible explanation lies in the generally high level of perceived coherence (mean scores about 3.6 on a 4-point scale). Perhaps the two lectures were so closely related that they encouraged integration to some degree even without a specific instruction. Hence, ceiling effects could explain the missing difference. Furthermore, the preservice teachers in the integration condition might have set higher standards concerning successful integration. They may well have assessed their own integration activities more critically than those who had not focused on integration before. Thus, the higher coherence between the lectures may have been offset by a more critical judgment of one’s own creation of coherence. To shed light on this issue, reliable and valid measurements of perceived coherence are needed (see Henning-Kahmann and Hellmann 2019).

A positive side effect was that the preservice teachers improved their ability to notice problematic student beliefs. After receiving information on ability beliefs, almost all preservice teachers identified the problematic belief in an open scenario. This finding adds to recent studies on noticing for equity (e.g., Hand 2012; Kalinec-Craig 2017), which investigated whether teachers notice social aspects such as students’ status, body language, or participation in the classroom. First studies on teachers’ noticing of belief-related problems had indicated that preservice mathematics and music teachers were mostly unaware of fixed ability beliefs (Rieche et al. 2018, 2019). This lack of awareness is problematic as fixed beliefs can put students at a long-lasting disadvantage (e.g., Blackwell et al. 2007). Moreover, fixed beliefs are at risk of becoming even stronger if they remain unattended (e.g., Dai and Cromley 2014). Detecting dysfunctional beliefs from the teachers’ perspective could thus be a first step towards developing more positive beliefs. In this regard, our study implies that conceptual knowledge about beliefs helps to heighten awareness. It seems important to provide such knowledge during teachers’ education in order to improve their skills in handling problematic beliefs in the classroom.


A first limitation refers to the limited ecological validity of the study. On the one hand, our results were obtained from a short-term experiment with medium effect sizes. The effect of the relevance instruction was only measured immediately after the intervention but not with a follow-up test. Delayed testing is needed to investigate whether relevance instructions can induce a sustained improvement. On the other hand, there were no time lags between the lectures or between the second lecture and posttest. This setting is unlike a genuine setting, where courses addressing similar topics are unlikely to be taught in close temporal proximity. In this respect, it is important to mention that we presented not just the relevance instruction before the first lecture, but also a reminder before the second lecture. Future studies are needed to investigate whether a single instruction without a reminder yields similar effects and whether the effect occurs when lectures and tests are spread out over time. Our study thus shares the restrictions of former studies (e.g., Evens et al. 2018; Graichen et al. 2019; Harr et al. 2014; Tröbst et al. 2018; Wäschle et al. 2015) in that they all follow a strict laboratory setting with high internal validity, but suffer from a limited external validity. Nevertheless, these studies contribute to the field of teacher education by investigating effects that can later be transferred to the field. We thus regard our study as a kick-off for future research.

Second, with just 72 participants our sample was rather small, a factor that limits the statistical power and generalizability of our findings. However, a post hoc analysis indicated that the ANCOVA with prior knowledge as a covariate had sufficient power (.80) to test our focal hypothesis (i.e., the simultaneous application hypothesis). With respect to generalizability, we see particular strength in that we recruited the participants from different universities. These universities represented the two types of colleges that provide music teacher education in southern Germany. We thus aimed to avoid sample-specific results. In summary, we argue that our sample was appropriate for the present analyses. Nevertheless, further investigations with larger samples are necessary to produce robust findings.

Third, there are restrictions with respect to the assessment of knowledge application. We had the students work on text scenarios that described classroom situations. Such scenarios are a legitimate and easy-to-use approximation of practice (Grossman et al. 2009) and come close to actual performance (Blömeke et al. 2015; Kersting et al. 2012). Nevertheless, assessment methods that more closely resemble authentic teaching situations would enable stronger conclusions. In addition, we assessed knowledge application only after learning, not before. Our data therefore do not allow for observing the progress that the preservice teachers made regarding simultaneous application. As previous studies had indicated a low level of belief-related interpretations and decisions among preservice teachers (Rieche et al. 2018, 2019), we expected that the participants in the present study would have a similarly low level concerning belief-related knowledge and skills. The data confirm that prior knowledge was low (3 points out of 12 on average), which can be regarded as an insufficient prerequisite for applying knowledge in an integrated way. Yet, future studies could include a pretest measure of knowledge application to shed light on the progress that is achieved through the intervention.


This study presents initial evidence that relevance instructions can foster knowledge integration among preservice teachers. We investigated instructions that highlighted the importance of knowledge integration and that were presented before preservice music teachers worked on two separate lectures. These lectures contained general pedagogical knowledge and music-specific knowledge of ability beliefs, whereby the separate presentation resembled the predominant structures in teacher education. Our results indicate that, indeed, the instruction enhanced the combined application of knowledge in a follow-up test. Thus, the principle of relevance instructions (providing learners with a specific goal before they start learning) seems to be a fruitful way to enhance knowledge integration.

With respect to implementing the intervention in teacher education, we see the intervention’s particular strength in its practicability, efficiency, and self-regulated character. In terms of practicability, relevance instructions are a rather simple intervention to support knowledge integration. Unlike interventions in previous studies (e.g., Koehler et al. 2007), they are an “add-on” that instructors can use at the beginning of their courses or easily include in online learning tools. Furthermore, the intervention is independent from learning content and can be used for a broad target group with different prior knowledge. In terms of efficiency, the intervention itself is brief and did not lengthen time-on-task substantially. Compared to Harr et al. (2015), it appears to be more efficient to foster integration before learning instead of afterwards. In terms of self-regulated learning, the relevance instruction provided a goal (i.e., knowledge integration) but put the learners in charge of attaining this goal. In contrast to previous interventions (Harr et al. 2015; Koehler et al. 2007; Wäschle et al. 2015), the intervention did not provide assistance for integration. Relevance instructions could thus be a tool for encouraging self-regulated knowledge integration. To test whether relevance instructions prove successful in practice, a design-based research approach seems useful. Such a research process with several cycles of implementing and analyzing the intervention may help to customize the instruction to the students’ needs. With a view to such usability, we find “making it relevant” to be a promising approach to encourage the integrated use of knowledge in teacher education.