1 Introduction

In the twenty-first century, information technology has developed rapidly in terms of theory, technology, systems, and tools, driven by advances in information equipment. Its influence has penetrated many fields, bringing great convenience to our lives. Therefore, it is recommended that all students learn computational thinking (CT) (Wing, 2006) so that they can adapt to the technological era. Design thinking (DT) has been viewed as a process of solution-based thinking to produce creative future outcomes or to innovatively solve problems (Pusca & Northwood, 2018). Both DT and CT can be put into practice in the different stages of interdisciplinary learning such as science, technology, engineering, art, and mathematics (STEAM).

New course guidelines were proposed (Ministry of Education, 2014) and have been put into practice in primary and secondary education in Taiwan since 2019. The learning objectives are the core literacies of each domain. Therefore, literacy-oriented (LO) learning is emphasized to enable students to adapt easily to the future world. The new course guidelines introduce one new domain for secondary school students in compulsory education, namely the technology domain. The technology domain comprises two disciplines: information technology and living technology. The core literacy of the learning performance in information technology is CT, while the core literacy of the learning performance in living technology is DT. In particular, the subject of living technology puts emphasis on interdisciplinary application and curriculum design. Accordingly, subjects such as science, technology, engineering, art, and mathematics can be appropriately combined with the learning focus of the technology domain. The reform of the new course guidelines emphasizes interdisciplinary integration (e.g., STEAM), with DT as the foundation of living technology and CT as the foundation of information technology, so as to ultimately cultivate students’ literacies.

When assessing STEAM practice, scholars have indicated that teachers tend to rely on their professional judgment and support student self-advocacy (Dubek et al., 2021). Previous studies have developed assessment tools for evaluating aspects of the learning process such as efficacy (Herro et al., 2017) or collaboration (Chen et al., 2019) in K-12 STEAM activities, or have emphasized career readiness assessment in the STEAM activities of higher education (Sarmiento et al., 2020). However, few studies have proposed an assessment tool for evaluating the instructional tools of STEAM products or creations in terms of the critical indicators such as design thinking, computational thinking, and literacy-oriented learning, so that the users, teachers, or parents can refer to the results of the assessment to choose the appropriate products for their students or children to learn by doing.

Accordingly, in order to evaluate the creations of interdisciplinary activities or the products for interdisciplinary learning, this study developed and validated assessment indicators for STEAM education products designed for K-12. The higher the score a STEAM creation gains, the more useful it is for K-12 STEAM education. As a result, STEAM-related products or practical creations can be assessed according to the indicators developed for each construct in the current study. Each indicator is rated on a 4-point Likert scale, where 1 refers to the lowest level of quality and 4 refers to the highest. As STEAM teaching materials and practical creations are likely to become increasingly diversified, the development and validation of assessment indicators is essential to assist educators or parents in selecting suitable products, STEAM teaching aids, and creations for their students or children, or in evaluating their own STEAM inventions and instruction at school.

2 Literature review

2.1 STEAM

STEAM education emphasizes the concept of interdisciplinary learning, including the interdisciplinary combination of Science, Technology, Engineering, Art, and Mathematics. Land (2013) deeply analyzed the core concepts of the four disciplines which are science, technology, engineering, and mathematics (STEM), then integrated the concept of Art into STEM, and conducted theoretical practice such as value evaluation, created literacy opportunities, and provided examples. In addition, Bequette and Bequette (2012) found that focusing on engineering and art is an important creative design thinking process. Therefore, some studies have advocated the expansion of the combination of artistic and humanistic concepts, forming STEAM education (Maeda, 2013). As a result, when people use the term STEM, it does not mean that the interdisciplinary activity does not include art or design. On the contrary, it implies that the interdisciplinary activity naturally encompasses art or design (Reeve, 2015). The current study will uniformly adopt the term “STEAM.”

There is a growing body of research focusing on STEAM education, and it is noted that cross-domain learning methods bring learners better knowledge cultivation. To achieve this goal, traditional education needs to undergo reasonable adjustments and create and incorporate innovative technologies. It is expected that students would have the ability to solve complex problems in human society and to engage in creative thinking through interdisciplinary learning (Madden et al., 2013).

STEAM Education not only aims to develop students’ problem-solving skills, but also to cultivate their job search skills, and to enable them to feel interested and enthusiastic (Land, 2013). Through cross-domain learning, the learning tools can be more in line with real-life scenarios and problems, so that students try their best to effectively use the skills they have learned to solve those problems or tasks. McAuliffe (2016) mentioned that it is necessary to cultivate students’ different creative design thinking and cognitive abilities in the process of learning, so as to improve their learning effectiveness in cross-field domains such as STEAM learning.

2.2 Literacy-oriented learning

In order to cultivate the ability of continuous self-learning, literacy-oriented learning is gradually becoming more highly valued in many countries. Therefore, literacy is listed as an important education policy. In the past, instructional skills were often the core of education. Studies have pointed out that many teachers do not understand the difference between literacy education and technical familiarity. They only teach students skills rather than cultivating their critical thinking and problem-solving abilities in depth. The results may cause obstacles for students to truly understand the content (Smith Macklin, 2001). It is thus very important to use literacy-oriented learning to cultivate students’ problem-solving skills.

In curriculum design, there are many studies on literacy-oriented learning. Through appropriate curriculum planning, literacy concepts can be effectively integrated for curriculum development and design, such as the nature and characteristics of project-based learning, which can effectively correspond to the core literacy, and can foster the ability of students to solve problems and collaborate and create together (Lestari et al., 2020; Markic et al., 2008; Meijer et al., 2020; Rahmawati et al., 2020).

As for technology education, STEAM literacy can be promoted by exploring the essence of science and technology education. The technology domain includes the understanding of science and technology, the realization of personal goals, the development of intelligence and communication skills, the promotion of individual character and positive attitudes, and the achievement of goals in the field of social education, while emphasizing cooperative learning and decision-making. Accordingly, STEAM education contributes to the improvement of literacy (Holbrook & Rannikmae, 2007).

The new course guidelines of each domain in Taiwan regard literacy-oriented learning as a final strategy by combining the concepts of knowledge, affection, and skills to solve the problems in our daily lives. The purpose of the core literacy is to enable every student to appropriately develop talents and lifelong learning. In the interdisciplinary learning of STEAM, in order to enable students to solve daily life problems and engage in higher order thinking, two aspects of learning performance, computational thinking and design thinking, are cultivated to enable students to achieve technological literacy in the process of problem solving and implementation.

2.3 Design thinking (DT)

Design thinking (DT) is regarded as a development process which is involved in the elements of inspiration, ideation, and implementation (Brown, 2008). The process involves effective design and can be used as a general innovation process. DT is applied for multiple aspects of design and interaction. DT results in useful design from the innovative design process such as describing and taking examples (Beckman & Barry, 2007).

First, the steps of DT require "designer empathy" to facilitate the development of a design which meets the needs of the problem. Then, the designers look for creative solutions to the problem, and finally solve the problems via the actual problem-solving process of trial and error and continuous iteration.

In real teaching situations and course development, many studies are designed around practical problems in daily life and integrate DT into addressing these challenges. For example, one project was designed to detect the real-life problem of the breakdown of African water wells. The designers organized groups of students from engineering, business, design, and other majors to develop low-cost sensor systems. The research used DT to overcome social-related open innovation challenges and proposed effective designs which required empathizing with users in order to increase the impact of the solution (Charosky et al., 2018).

Regarding using human-centered experimental projects in combination with the participation of multiple disciplines, students can ultimately solve problems to meet users’ needs. The structure of the STEAM course ranges from discovery, design, to production, and corresponds to the process of DT. Ultimately it needs to be highly compatible with user needs to achieve real problem solving (Hassi et al., 2016). Research should consider the DT factor and also needs to consider curriculum and pedagogy to help students produce creative results or to solve problems (Pusca & Northwood, 2018).

2.4 Computational thinking (CT)

Whatever field one is engaged in, computational thinking (CT) is regarded as an indispensable competence (Wing, 2008). CT refers to the ability to understand how information is processed and operated, and to use a systematic and logical way to think and solve problems. Wing (2006) stated four main steps of CT, namely decomposition, recognition, abstraction, and algorithm. The four phases of CT are used to analyze and solve problems (Wing, 2006).

With the advancement of science and technology, the understanding of and emphasis on CT have gradually increased. CT education has improved greatly in the last decade: a growing number of studies have pointed out the importance of CT ability (Román-González et al., 2017), developed courses (Kong, 2016; Kong et al., 2020), designed instructional materials to cultivate students’ CT literacy (So et al., 2020), and promoted teachers’ CT self-efficacy (Avcı & Deniz, 2022). Research problems and instructional tools have also become more diversified (Hsu et al., 2018; Tsai et al., 2022). Problem solving is required not only in the information field but is also an important ability in all fields (Herro et al., 2017). Therefore, many studies have been designed to involve CT in different domains. CT has developed positively whether it is integrated into different domains, such as the interdisciplinary learning of mathematics, biology, and music, or combined with different instructional tools and teaching methods (Hsu et al., 2018).

Meanwhile, there are studies targeting specific learning tools, analyzing the learning process and the behavioral patterns corresponding to computational thinking phases. The research further explores whether CT is positively applied and effectively learned from understanding the logic and concepts used by the students in the learning process (Berland & Lee, 2011).

However, how to evaluate students' CT ability and learning process in depth needs to be further explored (Tsai et al., 2022). Previous research has developed a CT scale and designed plans for the evaluation of CT competence (Korkmaz et al., 2017). In addition to core algorithmic thinking, some CT assessment research also includes creativity, critical thinking, cooperation, problem solving, and other important proficiencies required when people encounter problems. The CT competence scale can measure both overall performance and the level of each individual dimension, which supports the validity and reliability of the scale (Korkmaz et al., 2017). Based on this CT assessment tool, the current study could adopt relevant aspects and improve the reliability and validity of the scale.

Based on the cross-disciplinary characteristics of STEAM while considering the development goals of the new course guideline in 2019, this study adopted the above-mentioned dimensions, STEAM, literacy-oriented learning, DT, and CT, to examine STEAM creations or products. In other words, the new assessment tool will be used to distinguish whether the STEAM interdisciplinary learning activities or creations achieve the important competences related to cultivating students' multi-disciplinary and continuous self-directed learning.

2.5 STEAM creation assessment indicators

Three experts were invited to validate the content of the indicators, which were initially proposed in this study based on the integration of the abovementioned literature review. A summary is shown in Table 1.

Table 1 Factors and definitions

3 Research purposes and research hypotheses

This study attempted to propose the basic assessment items for evaluating the level of STEAM creations. The value or scores of the STEAM creations could provide users, teachers, and parents with a reference when they are choosing a creation as an instructional tool. The research framework and hypotheses are shown in Fig. 1.

  • H1: The interdisciplinary level of STEAM has an impact on the levels of LO.

  • H2: The CT level has an impact on the levels of LO.

  • H3: The CT level has an impact on STEAM levels.

  • H4: The DT level has an impact on STEAM levels.

Fig. 1 The structural model

Because the objective of this research is theory development and prediction with a small sample size, Partial Least Squares Structural Equation Modeling (PLS-SEM) was an appropriate method for handling the formative measurement model and testing the hypotheses (Hair et al., 2011).

4 Method

4.1 Participants

Three experts were invited to check the descriptions of each indicator. A total of 102 teachers (67 males and 35 females) from different learning domains in compulsory education who had experienced STEAM instruction joined the assessment of the STEAM creations. Their average teaching experience was 13 years. In terms of their previous field of teaching experience, of the 102 STEAM education teachers, 15 had taught science, 31 had taught technology, 34 had taught engineering, 10 had taught art, and 12 had taught mathematics.

4.2 The practice of using the developed scales to assess a STEAM creation

STEAM creations have been frequently employed in K-12 education. In order to explore the reliability and validity of the assessment of STEAM creations, the 102 teachers used the assessment tool developed in this study to evaluate a creation which is well known in teacher training workshops in Taiwan. The creation is named the “START!” intelligent car. People who make the car have to experience the interdisciplinary process of science, technology, engineering, art, and mathematics, as shown in Fig. 2.

Fig. 2 The STEAM product used for assessment in the practical measurement in education

After the teachers experienced this STEAM creation, they all used the following items in Table 2 which had been reviewed by three experts to check the level of this product.

Table 2 Assessment scales for STEAM education products in K-12

After the feedback from the 102 teachers was collected, Partial Least Squares Structural Equation Modeling (PLS-SEM) was used for the data analysis. Because the STEAM creation assessment indicators comprise the four constructs of CT, DT, STEAM, and literacy orientation (LO) based on the literature review and theoretical foundations, no indicator can be removed under the formative measurement model.

4.3 Instrument design and development

The measurement development procedure in the current study is shown in Fig. 3. After reviewing the literature and finding the four constructs (CT, DT, STEAM, LO) for assessing the STEAM creations in education based on the theoretical foundations, 16 indicators were formed. Then, three experts were invited to review them for content validity.

Fig. 3 The measurement development process

The three experts confirmed their understanding of the definition of each assessment dimension. These three STEAM-related experts reviewed each indicator for every scale based on the theoretical basis. They defined the importance of each indicator with semantic terms and then defined the ranking range of each term. Finally, the three experts reached a consensus and confirmed that all 16 indicators are indispensable. Based on the experts’ validation, this study utilized the Analytic Hierarchy Process (AHP) (Saaty, 1980) to examine the consistency of their judgments and found that the consistency ratio (CR) of each construct (DT = 0.065; CT = 0.023; STEAM = 0.019; LO < 0.001) was less than 0.10, the standard threshold below which the judgments on indicators and criteria are considered reasonably consistent.
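The consistency check described above can be illustrated with a minimal sketch. The pairwise comparison matrix below is hypothetical (the experts' actual judgments are not reported here), the principal eigenvalue is approximated by power iteration, and the random index (RI) values are those commonly tabulated for AHP; the function names are illustrative only.

```python
# Random index (RI) values commonly tabulated for AHP, keyed by matrix size.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}


def principal_eigenvalue(matrix, iterations=200):
    """Approximate the largest eigenvalue of a positive matrix by power iteration."""
    n = len(matrix)
    vector = [1.0 / n] * n
    eigenvalue = float(n)
    for _ in range(iterations):
        product = [sum(row[j] * vector[j] for j in range(n)) for row in matrix]
        eigenvalue = sum(product)  # valid estimate because `vector` sums to 1
        vector = [value / eigenvalue for value in product]
    return eigenvalue


def consistency_ratio(matrix):
    """CR = CI / RI, where CI = (lambda_max - n) / (n - 1)."""
    n = len(matrix)
    if n < 3:
        return 0.0  # matrices of size 1 or 2 are always consistent
    lambda_max = principal_eigenvalue(matrix)
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical expert judgments for three indicators (not data from this study).
judgments = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]
cr = consistency_ratio(judgments)
print(f"CR = {cr:.3f}; below 0.10: {cr < 0.10}")
```

A CR below 0.10 for such a matrix would be read in the same way as the construct-level values reported above.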

A total of 102 teachers from different learning domains in compulsory education who had experienced STEAM instruction used the developed scales to assess the STEAM creation named the “START!” intelligent car. A power analysis with G*Power indicated that a sample size of at least 76 respondents was required, implying that the current sample size of 102 was sufficient. The following section presents the formative measurement model of PLS-SEM, based on the data collected from the 102 STEAM teachers who experienced the “START!” intelligent car product.

5 Results

5.1 Formative Measurement Model

The formative model was evaluated in three steps. First, redundancy analysis (RDA) was used to confirm the convergent validity (Chin, 1998; Legendre & Legendre, 1998); the cross-validated communality of each indicator and of each construct is shown in Table 3. The rho_A of each construct was used to confirm the construct reliability and validity. When rho_A is larger than 0.7 (Chin, 1998; Fornell & Larcker, 1981), the construct reliability and validity are acceptable. The rho_A results are also shown in Table 3.

Table 3 Analysis of construct reliability and validity

Manley et al. (2021) noted that the average variance extracted (AVE) needs to be examined only for reflective models, not for formative models. Therefore, for discriminant validity, only the correlations between constructs need to be reported. The discriminant validity of the constructs was evaluated using the approach recommended by Fornell and Larcker (1981). The discriminant validity is acceptable when the value of the Fornell-Larcker Criterion is larger than 0.4. As a result, the construct validity (i.e., CT, DT, LO, STEAM) was confirmed based on the convergent validity and discriminant validity. The Heterotrait-Monotrait Ratio (HTMT) was also confirmed to be smaller than 0.9 in this study, indicating that the discriminant validity was good (Henseler et al., 2015).
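As an illustration of how an HTMT value is obtained, the following sketch computes the ratio of the mean between-construct indicator correlation to the geometric mean of the mean within-construct correlations. The correlation matrix is hypothetical and uses only two indicators per construct; it is not the data of this study, and the function names are illustrative.

```python
from itertools import combinations
from math import sqrt


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def htmt(corr, items_a, items_b):
    """HTMT ratio for two constructs given an indicator correlation matrix."""
    # Mean correlation between indicators of different constructs.
    hetero = mean(corr[i][j] for i in items_a for j in items_b)
    # Mean correlation among indicators of the same construct.
    mono_a = mean(corr[i][j] for i, j in combinations(items_a, 2))
    mono_b = mean(corr[i][j] for i, j in combinations(items_b, 2))
    return hetero / sqrt(mono_a * mono_b)


# Hypothetical correlations among four indicators (not data from this study):
# indicators 0-1 belong to one construct, indicators 2-3 to another.
corr = [
    [1.0, 0.8, 0.4, 0.3],
    [0.8, 1.0, 0.5, 0.4],
    [0.4, 0.5, 1.0, 0.6],
    [0.3, 0.4, 0.6, 1.0],
]
ratio = htmt(corr, items_a=[0, 1], items_b=[2, 3])
print(f"HTMT = {ratio:.3f}; below 0.9: {ratio < 0.9}")
```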

Second, collinearity was assessed using the variance inflation factor (VIF). All the VIF values, shown in Table 3, were smaller than 5, the commonly accepted threshold (Neter et al., 1990; Ringle et al., 2015).
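For readers unfamiliar with the VIF, the sketch below shows one standard way to obtain it: the VIF of each indicator equals the corresponding diagonal element of the inverse of the indicator correlation matrix, which is equivalent to 1 / (1 - R²) from regressing that indicator on the others. The correlation values are hypothetical, and the small Gauss-Jordan inversion routine is included only to keep the example self-contained.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif_from_correlations(corr):
    """VIF of each indicator: diagonal of the inverse correlation matrix."""
    inverse = invert(corr)
    return [inverse[i][i] for i in range(len(corr))]


# Hypothetical correlations among three indicators of one construct.
indicator_corr = [
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.4],
    [0.3, 0.4, 1.0],
]
vifs = vif_from_correlations(indicator_corr)
print([round(v, 3) for v in vifs], all(v < 5 for v in vifs))
```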

Third, the outer weights shown in Table 3 are the result of a multiple regression, expressing each indicator’s relative contribution to the construct. When an indicator’s outer loading is high (i.e., above 0.5), the indicator should be interpreted as being absolutely important. Table 4 shows that all the outer loadings achieved significance, implying that all the indicators were important. The PLS Algorithm was used for the formative scales to find out the path coefficients of the structure model.

Table 4 Outer Loadings

5.2 Structural equation modeling analysis for hypothesis testing

PLS-SEM bootstrapping was run with 5,000 subsamples. The structural model was first evaluated for collinearity, as shown in Table 5. The VIF values were all smaller than 5. Second, Table 5 also shows the path coefficients, which are interpreted relative to one another. When the t-value of a path reaches significance, the relationship along that path is confirmed, and the total effect size indicates the strength of the relationship. The structural model showed a significant path from CT to STEAM and from STEAM to LO. It also showed a significant path from CT to LO, indicating that STEAM partially mediates the effect of CT on LO.
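The bootstrapping logic can be sketched as follows. This is a deliberately simplified illustration, not the PLS-SEM procedure itself: it treats a single standardized path with one predictor (where the path coefficient equals the Pearson correlation), uses synthetic construct scores for 102 raters, and divides the estimate by the bootstrap standard error to obtain a t-value. All names and data are assumptions made for the example.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation, i.e. the standardized slope with one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def bootstrap_t(xs, ys, subsamples=5000, seed=0):
    """Return (estimate, bootstrap standard error, t-value) for one path."""
    rng = random.Random(seed)
    n = len(xs)
    estimate = pearson(xs, ys)
    draws = []
    for _ in range(subsamples):
        # Resample respondents with replacement, keeping x and y paired.
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(pearson([xs[i] for i in idx], [ys[i] for i in idx]))
    mean_draw = sum(draws) / subsamples
    se = sqrt(sum((d - mean_draw) ** 2 for d in draws) / (subsamples - 1))
    return estimate, se, estimate / se


# Synthetic construct scores for 102 raters (not data from this study).
data_rng = random.Random(1)
ct_scores = [data_rng.gauss(0, 1) for _ in range(102)]
steam_scores = [0.7 * x + data_rng.gauss(0, 0.5) for x in ct_scores]

estimate, se, t_value = bootstrap_t(ct_scores, steam_scores)
print(f"path = {estimate:.3f}, SE = {se:.3f}, t = {t_value:.2f}")
```

A t-value above roughly 1.96 would correspond to significance at the 5% level in a two-tailed test.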

Table 5 Causal relationships and the results of the hypotheses

The structure model is shown in Fig. 4. Figure 4 shows that the outer weights of CT are 0.665, 0.246, 0.010, and 0.205, which are the results of a multiple regression of the construct, CT, on its set of indicators. Those weights are the primary criterion to assess each indicator’s relative importance in the formative measurement model (Hair et al., 2017). For example, CT1 has the most weight and is significantly important for the construct CT in comparison with the other three indicators (i.e., CT2, CT3, CT4). Those four indicators, CT1 to CT4, compose a common construct named CT. After the evaluator makes a decision regarding the four items in the CT construct, the evaluator can know the results of the assessment of the creation in the CT scale.
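To make the role of the outer weights concrete, the sketch below combines one evaluator's four CT ratings using the weights reported above. Dividing by the sum of the weights so that the result stays on the 1 to 4 scale is an assumption made here for readability; PLS-SEM itself computes construct scores from standardized indicators, and the ratings shown are hypothetical.

```python
# Outer weights of CT1-CT4 as reported for Fig. 4.
CT_WEIGHTS = [0.665, 0.246, 0.010, 0.205]


def weighted_construct_score(ratings, weights):
    """Weight-normalized average of 1-4 Likert ratings (illustrative rescaling)."""
    if len(ratings) != len(weights):
        raise ValueError("one rating per indicator is required")
    if any(not 1 <= r <= 4 for r in ratings):
        raise ValueError("ratings must lie on the 1-4 scale")
    return sum(w * r for w, r in zip(weights, ratings)) / sum(weights)


# Hypothetical ratings given by one evaluator to CT1-CT4.
ct_score = weighted_construct_score([4, 3, 2, 3], CT_WEIGHTS)
print(f"Weighted CT score: {ct_score:.2f}")
```

Because CT1 carries the largest weight, a change in its rating moves the composite far more than the same change in CT3.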

Fig. 4 Path coefficients of the SEM analysis

In PLS-SEM, the standardized root mean square residual (SRMR) is used for measuring the fit of a model. An SRMR smaller than 0.08 is preferable (Hair et al., 2017; Hu & Bentler, 1999). The SRMR of the saturated model and estimated model in this study was 0.067, which is smaller than 0.08. Therefore, the model is considered a good fit. Henseler et al. (2016) introduced the SRMR as a goodness of fit measure for PLS-SEM that can be used to avoid model misspecification.
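The SRMR can be illustrated with a short sketch: it is the square root of the mean squared difference between the observed and the model-implied correlations, taken over the lower triangle of the matrix including the diagonal. Both matrices below are hypothetical and serve only to show the calculation.

```python
from math import sqrt


def srmr(observed, implied):
    """Root mean square residual between two correlation matrices."""
    p = len(observed)
    squared_residuals = [
        (observed[i][j] - implied[i][j]) ** 2
        for i in range(p)
        for j in range(i + 1)  # lower triangle including the diagonal
    ]
    return sqrt(sum(squared_residuals) / (p * (p + 1) / 2))


# Hypothetical observed and model-implied correlations (not data from this study).
observed = [
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.4],
    [0.3, 0.4, 1.0],
]
implied = [
    [1.0, 0.45, 0.35],
    [0.45, 1.0, 0.4],
    [0.35, 0.4, 1.0],
]
fit = srmr(observed, implied)
print(f"SRMR = {fit:.3f}")
```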

6 Discussion

The PLS-SEM results show how the experienced teachers evaluated the example STEAM creation, the “START!” intelligent car, with the assessment scales. Literacy-oriented (LO) learning is directly influenced by two constructs, STEAM and CT. According to the results of the structural model in this study, STEAM is a mediator between CT and LO and has a partial mediation effect. In addition, both CT and DT are crucial foundational factors for conducting STEAM learning (Bati et al., 2018; Henriksen, 2017). STEAM is a mediator between DT (Kijima et al., 2021) and LO (Lee, 2015). Therefore, it is very important to carry out STEAM education in K-12, so as to allow students to practice CT and DT in their learning activities at school. Scholars have also encouraged instructors to provide early exposure to STEAM through both informal and formal learning environments (Jackson et al., 2021).

There is a growing number of STEAM learning tools. The indicators developed and validated in this study can therefore help instructors or students evaluate the STEAM learning tools they buy. Some instructors are faced with too many choices and do not know which tool is better for their students, or which tools are designed in accordance with the interdisciplinary and literacy-oriented learning emphasized in current education and the new course guidelines. In addition to the STEAM learning tools on the market, future studies can encourage teachers and students to use these indicators to assess the STEAM products they make themselves at school. Based on the evaluation, the evaluators can understand how many aspects the interdisciplinary learning involves. The present study considers that all the indicators for each construct are indispensable. However, a limitation of this study is that the constructs considered are limited to CT, DT, STEAM, and LO, according to the new course guidelines and the theoretical foundations collected in this study. The formative model may be extended in the future if more essential factors are found. A number of studies have explored causality through PLS-SEM (Ringle et al., 2012), and most have adopted the reflective measurement model and conducted exploratory factor analysis. Future studies could further develop mixed models of reflective and formative constructs, like Ali and Park’s (2016) study, so as to assess STEAM creations from different points of view.

CT is an important foundation for STEAM and literacy-oriented learning in K-12 education (Lenke & Tenberge, 2022). The model of this study also revealed that STEAM mediates the effect of CT on literacy-oriented learning. In other words, some of the effect of CT on literacy-oriented learning comes through STEAM learning. Scholars need to continue developing assessment tools that address how to measure STEAM creations and to check whether CT can be learned as an independent subject to reach literacy-oriented learning or indirectly through STEAM (Grover & Pea, 2013). The formative model in the current study found that CT had a causal effect on STEAM, which in turn led to literacy-oriented learning; this echoes another recent study (Yin et al., 2020). Scholars have confirmed that CT literacy can be enhanced by incorporating CT in interdisciplinary activities (Hadad et al., 2020; Yin et al., 2020). Accordingly, CT influences STEAM activities, while STEAM activities promote literacy-oriented learning.
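The idea that part of the effect of CT on LO runs through STEAM can be expressed as a simple decomposition: the indirect effect is the product of the CT to STEAM and STEAM to LO paths, and the total effect is the direct effect plus the indirect effect. The coefficients in this sketch are placeholders chosen for illustration; they are not the values reported in Table 5.

```python
def decompose_effects(a, b, c_direct):
    """Split a total effect into direct and indirect parts.

    a: path CT -> STEAM, b: path STEAM -> LO, c_direct: path CT -> LO.
    """
    indirect = a * b
    total = c_direct + indirect
    return {
        "direct": c_direct,
        "indirect": indirect,
        "total": total,
        "proportion_mediated": indirect / total,
    }


# Placeholder path coefficients (hypothetical, not taken from Table 5).
effects = decompose_effects(a=0.5, b=0.4, c_direct=0.3)
for name, value in effects.items():
    print(f"{name}: {value:.2f}")
```

Partial mediation corresponds to the case in which both the direct and the indirect effect are significant, so that the proportion mediated lies strictly between 0 and 1.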

This study found that CT is not only an influencing factor for STEAM creations, but can also be integrated directly into any literacy-oriented curriculum. Even when a curriculum does not integrate Science, Technology, Engineering, Art, and Mathematics, CT can be generalized to any domain of literacy-oriented curriculum. For example, on the one hand, previous scholars assessed students’ application of CT to solve problems (Chen et al., 2017) and critical thinking (Yağcı, 2019) in daily life. On the other hand, some CT assessment scale research tends to evaluate computer science practices such as conditional logic, algorithm building, debugging, simulation, and distributed computation (Berland & Lee, 2011), computer science knowledge in middle schools (Buffum et al., 2015), or Java programming self-efficacy in particular (Askar et al., 2009). In contrast, Araujo et al. (2019) developed an assessment from the perspective of CT without programming, and finally grouped abstraction, generalization, and decomposition as the first factor, and logical inference as the second factor. The Bebras cards, another CT measurement tool, do not require familiarity with any coding platform (Sung, 2022). In sum, no single assessment tool can fulfill all the requirements of users; the choice depends on the assessment needs of different scenarios.

Tang et al. (2020) reviewed 96 CT assessment papers and drew the following four conclusions: (a) more assessment scales or tools need to be developed; (b) most CT assessment tools currently focus only on programming skills or computer techniques; (c) many traditional tests, questionnaires, or implementation performance forms have been used for assessing CT competence; (d) future research needs to develop more assessment tools with reliability and validity. The current study contributes to assessment development for technology domain learning in K-12. Its main contribution is to integrate the assessment of STEAM and CT, identify the relations between them, and show how they lead to literacy-oriented learning, which little research has done.

7 Conclusion

The scale was developed on the basis of a literature review and was then employed in educational practice, with a total of 16 indicators in four constructs in the formative model. The first to the fourth items belong to the DT construct, while the fifth to eighth items form the CT construct. The STEAM construct contains five items (the ninth to the 13th), which reflect the degree to which each subject is involved in the STEAM creation. Finally, the 14th to 16th items are the indispensable aspects of literacy-oriented learning (i.e., the LO construct). The reliability and validity of the scale comply with the standard specifications, which means that this is a good tool for assessing STEAM learning creations in K-12. Accordingly, the evaluation tool developed in this study gives instructors a tool with acceptable reliability (rho_A > 0.7) and discriminant validity (HTMT < 0.9) to determine the level of STEAM creations or learning tools. A higher level means that the creation or tool more completely meets current education requirements. This study also confirmed the structural model in which the CT construct has significant impacts on the STEAM construct and the LO construct, and DT has a significant influence on the results of STEAM learning, while CT and STEAM learning together contribute to the performance of literacy-based learning (i.e., the LO construct).

When the theoretical foundation is expanded for other interdisciplinary learning purposes, it is possible to extend the model proposed in the current study in the future. As the current study was a formative model, every indicator is indispensable (Bollen & Lennox, 1991). In the future, when researchers design more items for an indicator using different wording but with similar meanings, the reflective model can be further examined. Most present studies employed students as the main research subjects in the research of STEAM assessments, such as career readiness assessed according to students’ participation in STEAM activities in higher education (Sarmiento et al., 2020), or the efficacy or collaboration of the learning process evaluated according to primary or secondary school students’ performance in STEAM activities (Herro et al., 2017). There are, however, few studies which have evaluated the levels of STEAM products based on the required factors such as CT, DT, LO, and the interdisciplinary nature of STEAM from the perspective of teachers. The research limitation of this study is that the assessors in this study had rich qualifications with more than 10 years of teaching experience on average. It was confirmed that the participants had sufficient relevant experience. Because not all teachers in other regions have such plentiful teaching experience, it was inferred that whether teachers themselves have sufficient STEAM teaching ability and experience could affect the assessment results (Ku et al., 2022). Therefore, it is suggested that future studies can compare the assessment results according to different levels of seniority.