Exploring the relationship between surface features and explaining quality of YouTube explanatory videos

Physics education research on explanatory videos has experienced a boost in recent years. Due to the vast number of explanatory videos available online, e.g. on YouTube, finding videos of high explaining quality is a challenging task for learners, teachers, and lecturers alike. Prior research on the explaining quality of explanatory videos on classical mechanics topics has uncovered that the surface features provided by YouTube (e.g. number of views or likes) do not seem to be suitable indicators of the videos' explaining quality. Instead, the number of content-related comments was found to be statistically significantly correlated with the explaining quality. To date, these findings have only been observed in the context of explanatory videos on classical mechanics topics. The question arises whether similar correlations between the explaining quality and YouTube surface features can be found for videos on topics that are difficult to access visually and verbally, for example from quantum physics. Therefore, we conducted an exploratory study analyzing the explaining quality of N = 60 YouTube videos on quantum entanglement and tunnelling. To this end, we made use of a category-based measure of explanatory videos' explaining quality from the literature. We report correlations between the videos' explaining quality and the surface features provided by YouTube. On the one hand, our results substantiate earlier findings for mechanics topics. On the other hand, partial correlations shed new light on the relationship between YouTube's surface features and explaining quality of explanatory videos.

Explanatory videos are brief videos, typically up to 10 minutes long, aimed at introducing and explaining a certain topic of interest (cf. Wolf & Kratzer, 2015). They have increasingly been discussed in science education research in recent years (e.g., Kulgemeyer & Wittwer, 2022; Pekdag & Le Marechal, 2010; Schroeder & Traxler, 2017), both in the context of formal and informal learning environments, in particular on YouTube (e.g., Beautemps & Bresges, 2021; Kulgemeyer & Peters, 2016; Pattier, 2021). The literature has identified factors that seem to be conducive to the success and popularity of explanatory YouTube videos on scientific topics (Beautemps & Bresges, 2021; Welbourne & Grant, 2016), e.g., regarding the structure of a video (Beautemps & Bresges, 2021). While it is desirable to reach as many people as possible, the main goal associated with the development of explanatory videos is, of course, to support student learning. Besides ensuring the success of the video, creators thus have to increase the quality of the explanations offered in their explanatory videos.
From the physics education research perspective, it is crucial to assist learners, teachers and university lecturers in selecting videos with high explaining quality from the plethora of (online) resources. In the case of YouTube explanatory videos, their popularity is publicly shown by means of different surface features, such as the number of views, the ratings of the video (e.g., the number of likes), or the comment section. However, it remains open whether these surface features indeed correlate with an explanatory video's explaining quality and hence may serve as some kind of quality indicator. In other words: Can teachers and students rely on them? This question has already been posed by Kulgemeyer and Peters (2016). The authors presented a measure of explaining quality to investigate this question in the context of YouTube explanatory videos on two topics from classical mechanics, namely Newton's third law of motion and Kepler's laws (Kulgemeyer & Peters, 2016). In their exploratory study, the number of content-related comments given by users below a specific video turned out to be the only variable that was statistically significantly correlated with the explaining quality of explanatory videos; neither the number of views nor the number of likes or dislikes showed statistically significant correlations with explaining quality (Kulgemeyer & Peters, 2016). Kulgemeyer and Peters (2016) see the need for further studies on the relationship between surface features provided by YouTube and explaining quality, in particular regarding other topics. They developed a hypothesis on this relationship that requires further evidence. Videos on topics from quantum physics seem to add a valuable perspective here.
Quantum physics differs fundamentally from classical mechanics, especially since its concepts are not directly visible to the naked eye. Thus, explanations of quantum physics topics arguably need to be particularly varied. As a result, the question arises whether the metrics of YouTube explanatory videos about quantum concepts show similar correlations to an established measure of explaining quality as have previously been revealed by Kulgemeyer and Peters (2016) for explanatory videos on classical mechanics topics. This is where this research project comes in: We investigate the explaining quality of YouTube explanatory videos on two genuine quantum physics topics without classical analogies, namely quantum entanglement and quantum tunnelling. To this end, we adopted the research methods used by Kulgemeyer and Peters (2016). The objective of the research project presented in this article is to expand on Kulgemeyer and Peters' results by exploring correlations between the YouTube surface metrics (e.g., likes, dislikes, views, number of days since release, number of relevant comments) of explanatory videos on these two quantum topics and the explaining quality of these videos (cf. Kulgemeyer & Peters, 2016).

Research Questions
The present study addresses the following research questions:
1. How is the explaining quality of YouTube explanatory videos on quantum entanglement and quantum tunnelling correlated with the videos' metrics such as the number of views, the number of likes, or the number of dislikes?
2. How is the number of content-related comments correlated with the explaining quality of YouTube explanatory videos on quantum entanglement and quantum tunnelling?
These factors have been expanded to a total of nine factors in a 2019 review addressing instructional explanations in science teaching (Kulgemeyer, 2019, p. 90). An important criterion for effective instructional explanations is the adaptation to the explainee, because this criterion reflects that explaining is to be regarded as a constructivist process (Kulgemeyer & Peters, 2016).
The constructivist nature of explanations is reflected in the communication model for explaining physics presented by Kulgemeyer and Schecker (2013). This model consists of four pillars, namely the explainer, the explanation itself, the explainee, and the explainee's feedback. The fact that a good explanation requires (1) constant evaluation of the explainee's feedback and (2) prompt adaptation of the explanation based on that feedback is at the heart of this model (Kulgemeyer & Schecker, 2013). According to the communication model for explaining physics, "the explainer can vary the explanation on four levels based on this feedback, ranging from the language code, the graphic representation form and the mathematic code, to using examples and analogies" (Kulgemeyer & Peters, 2016, p. 3).

Design principles for explanatory videos
Cognitive load theory (e.g., Sweller, 1988, 1994; Sweller et al., 1998) assumes that working memory has a limited capacity and that learning environments impose a cognitive load on learners. In its modern view (cf. Sweller et al., 2019), this load is composed of
• intrinsic cognitive load, which depends on the concrete learning task, the students' prior knowledge, or the teaching materials used, and
• extraneous cognitive load, which stems from irrelevant cognitive processes that tie up working memory capacities and thus hinder the learning process.
According to Sweller et al. (2019), Cognitive load theory "provides evidence-informed principles that can be applied to the design of instructional messages or relatively short instructional units, such as lessons, written materials consisting of text and pictures, and educational multimedia" (p. 274).
The Cognitive Theory of Multimedia Learning (cf. Mayer, 1999) builds upon the above-mentioned Cognitive load theory. This theory is based on three fundamental assumptions that, taken together, describe how auditory-verbal or visual-imagery information is processed toward long-term memory:
• The Dual-Channel assumption describes that "humans possess separate channels for processing visual and auditory information" (Mayer, 2009, p. 63).
• The Limited-Capacity assumption describes that each of the above-mentioned channels can only process a limited number of "chunks" (Mayer, 2009, p. 67) of information simultaneously.
• The Active-Processing assumption describes that students' active engagement is necessary for them to construct knowledge (Mayer, 2009).
Kruger & Doherty, 2016; Noor et al., 2014). In addition, different studies have derived design principles that may influence the effectiveness of explanatory videos against the backdrop of the above-mentioned theories (e.g., Brame, 2016; Kay, 2014; Muller, 2008). For example, it has been indicated that the integration of interactive elements into explanatory videos (Delen et al., 2014) or the use of a first-person perspective in explanatory videos (Fiorella et al., 2017) might have a positive impact on students' performance. Findeisen et al. (2019) reviewed and systematized studies dealing with potential effects of explanatory videos' design principles on student learning, and derived guidelines for the development of explanatory videos based on the overall picture emerging from current empirical findings.

Explaining quality of explanatory videos
In the previous sections, we reviewed the current state of research both on explaining physics and on design criteria for the development of explanatory videos. In this section, both perspectives are merged in order to shed more light on the state of research on the explaining quality of explanatory videos. Kulgemeyer (2020) presented a framework for effective explanation videos. This framework
• is consistent with guidelines on the quality of explanatory videos published elsewhere in the literature (e.g., Brame, 2016; Findeisen et al., 2019), and
• acknowledges research on multimedia learning (Kulgemeyer, 2020), while building upon state-of-the-art research on instructional explanations (e.g., Geelan, 2012; Wittwer & Renkl, 2008).
In this framework, seven factors comprising a total of 14 features are described as having an impact on the effectiveness of explanatory videos (Kulgemeyer, 2020, p. 2450). Examples are the use of summaries (factor: structure of the video), the use of an appropriate language-level (factor: tools for adaption), the avoidance of digressions (factor: minimal explanation), or the adaption to prior knowledge, misconceptions and interest (factor: adaption). An overview of the whole framework for effective explanation videos is presented in Kulgemeyer (2020, p. 2450).
The above-mentioned framework has been tested empirically in order to clarify whether an explanatory video developed according to the framework leads to higher student achievement than a video that has not strictly been developed according to it (Kulgemeyer, 2020). The results of this study revealed that students learning with an explanation video adhering strongly to this framework showed significantly more declarative knowledge in a post-test than students learning with a video that had not strictly been developed according to the framework (d = 0.42). However, no statistically significant difference in the post-test scores regarding conceptual knowledge was observed.

Evaluation of explanatory videos' explaining quality
An online test which allows for the assessment of physics explanatory skills has been published by Bartels and Kulgemeyer (2019).This test has been developed both for its usage in teacher education and for self-assessment.
Moreover, based on the communication model for explaining physics (Kulgemeyer & Schecker, 2013), Kulgemeyer and Tomczyszyn (2015, p. 121) developed a process-oriented and category-based measure for the assessment of explanation skills. Kulgemeyer and Peters (2016) adopted this category-based measure for the evaluation of explanatory videos' explaining quality. The category system to evaluate explanatory videos' explaining quality (cf. appendix) consists of seven main categories (content, structure, use of language, contexts and examples, mathematics, interrogation, non-verbal elements) comprising a total of 31 subcategories. Each of these subcategories is either assigned to a certain explanatory video (= 1 point) or not (= 0 points). Four out of the 31 subcategories (1. scientific mistake, 2. ignoring students' comment, 3. leaving new technical term uncommented, 4. without context) are related to a decrease of explaining quality, and hence a negative point (= -1 point) is allocated to the video for their occurrence.
Within the scope of evaluating the explaining quality of explanatory videos (i.e., in the course of categorisation), each category is considered uniformly and successive occurrences of the same category are not counted, "since repetitions of the same wording or the repeated use of a similar explaining aid without any variation are not considered a rich and varied explanation" (Kulgemeyer & Peters, 2016, p. 6). By summing up the points received on the basis of the categories assigned, a specific number of "category points" (Kulgemeyer & Peters, 2016, p. 6), referred to as CP, can be calculated for a given explanatory video (Kulgemeyer & Peters, 2016, p. 6):
$$CP = X_{+} - X_{-},$$
where $X_{+}$ denotes the number of positive categories assigned to a video, and $X_{-}$ stands for the number of all negative categories assigned to a video. The category points (with the upper limit of 28 CP) serve as a measure of an explanatory video's explaining quality as has been shown by Kulgemeyer and Peters (2016).
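The scoring rule described above (one point per positive category, minus one point per negative category, repeated occurrences ignored) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the function name `category_points` and the set `NEGATIVE_SUBCATEGORIES` are hypothetical, and the positive labels in the usage note are placeholders rather than entries from the category system in the appendix.

```python
# Labels of the four negative subcategories as named in the text.
# The set name and the use of plain strings are assumptions for illustration.
NEGATIVE_SUBCATEGORIES = {
    "scientific mistake",
    "ignoring students' comment",
    "leaving new technical term uncommented",
    "without context",
}


def category_points(assigned_categories):
    """Return the category points of one video (positive minus negative).

    `assigned_categories` is any iterable of category labels coded for
    the video. Repeated labels count only once.
    """
    unique = set(assigned_categories)  # ignore repeated occurrences
    x_minus = len(unique & NEGATIVE_SUBCATEGORIES)
    x_plus = len(unique - NEGATIVE_SUBCATEGORIES)
    return x_plus - x_minus
```

For instance, a video coded with two distinct positive labels (one of them twice) and one negative label, such as `["summary", "analogy", "analogy", "scientific mistake"]`, receives 2 - 1 = 1 CP under this sketch.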
It is important to note that the category points assigned to a specific explanatory video neither judge the video's overall quality (e.g., a video's technical design is not taken into account), nor do they help to find the best explanation of a specific topic among multiple explanatory videos. Instead, the rationale underlying this measure is "to distinguish between rich and varied explanations on the one hand and those with fewer variations on the other" because "those with fewer variations in their explanations may be less suitable for a wider range of viewers as some learners' needs may not be considered" (Kulgemeyer & Peters, 2016, p. 9).

Methods
In this section, we outline the methodology applied in our exploratory study to address the research questions. We aim at expanding on Kulgemeyer and Peters' study, according to which none of the correlations between the surface features provided for YouTube explanatory videos and their explaining quality was statistically significant, except for the number of content-related comments (Kulgemeyer & Peters, 2016). In a further study, Kocyigit and Akaltun (2019) even conclude that the "number of views, likes, dislikes, and comments per day is not a predictor of high-quality videos on YouTube" (p. 1267).

Content domain
We decided to analyze YouTube explanatory videos on two topics: (a) quantum entanglement, and (b) quantum tunnelling. We analyzed videos addressing these topics because neither quantum entanglement nor quantum tunnelling has any classical analogy and the quantum physics formalism does not enable a space-time description of these concepts (cf. Ubben & Bitzenbauer, 2022). In this way, our study is well suited to being contrasted with the previous findings of Kulgemeyer and Peters (2016), who analyzed explanatory videos on topics of classical mechanics.

Inclusion-exclusion criteria and search procedure
Following Kulgemeyer and Peters (2016), we identified the videos to be included in our sample via YouTube's search engine, applying the search strings "quantum entanglement" and "quantum tunnelling", respectively. We used the following inclusion-exclusion criteria for selecting videos appropriate for data analysis:
• The video is published in the English language.
• The video exclusively covers one of the two topics quantum entanglement or quantum tunnelling, respectively.Videos that covered both topics were excluded.
• The video has a maximum duration of 10 minutes.
In addition, the videos had to be free of serious scientific errors. This last criterion was important because it only makes sense to compare "the explaining quality of scientifically correct explanations" (Kulgemeyer & Peters, 2016, p. 5). Applying the above-mentioned search strings, we found more than 100,000 videos on both topics. A title-caption screening of the search results led to the exclusion of the majority of these videos since they did not fulfill the inclusion criteria (in this stage most often due to a duration above 10 minutes, the coverage of topics beyond the ones under investigation, or representing recorded lectures). In a next step, we reviewed about 200 videos on each of the topics quantum entanglement and quantum tunnelling in detail. Again, we excluded the videos that did not fulfill the inclusion criteria (in this stage most often due to serious scientific errors). Lastly, for our final sample, we (a) settled on videos with comparable run-times of around 5 minutes, as has been done in the prior study conducted by Kulgemeyer and Peters (2016), and (b) aimed for a sample size comparable to the one of the earlier study in the classical mechanics context (Kulgemeyer & Peters, 2016). The final sample consists of 60 YouTube explanatory videos that were included for data analysis, 30 of which address the topic of quantum entanglement, and 30 of which focus on quantum tunnelling.

Description of the sample
The mean duration of the selected videos is m = 4.97 min with a standard deviation of SD = 2.43 min. The explanatory videos on quantum entanglement (m = 4.74 min, SD = 2.38 min) were similar in length to those on quantum tunnelling (m = 5.20 min, SD = 2.48 min). Moreover, the videos in our sample are similar in length to the ones included in the prior study (cf. Kulgemeyer & Peters, 2016).

Data collection
The explanatory videos included in the final selection were analyzed in August and September 2021. For the exploration of our research questions, the data collection comprised three aspects: In a first step, we collected each video's surface features, i.e., the number of likes and dislikes, the number of views, and the publication date to calculate the video's time online (in days). Additionally, we recorded the number of subscribers to the channels by which the videos were published. The average view duration was a further surface feature included in Kulgemeyer and Peters' study on explanatory videos on classical mechanics topics (Kulgemeyer & Peters, 2016, p. 5). However, at the time of our data collection this feature was no longer publicly accessible, and hence it is not included in our analysis. In addition, the dislike statistic has no longer been publicly available since the end of 2021. Since our data collection was conducted in August and September 2021, however, we kept the number of dislikes for each video in our dataset and also included it in the data analysis. This allows for a more comprehensive comparison to the earlier results published by Kulgemeyer and Peters (2016) and may help to better understand the interaction of users with explanatory videos. For a description of all the above-mentioned YouTube metrics, we refer the reader to the YouTube Analytics and Reporting APIs (2022).
In a second step, we categorised the comments given below the videos in order to obtain the number of relevant comments for each video. We provide a proper description of (a) the term relevant comment and (b) the categorisation procedure in the data analysis section. We explored relevant comments because they "provide by far the most intense communication channel between explainer and addressee" (Kulgemeyer & Peters, 2016, p. 5).
Lastly, following the data collection method from Kulgemeyer and Peters (2016), we used the category system described above (cf. appendix) to assess the explaining quality of the explanatory videos included in our sample. The coding was performed by two independent raters. The inter-rater reliability expressed via Cohen's Kappa can be regarded as substantial (κ = 0.79) according to Cohen (1988). Against this backdrop, the category system used in this study allows for an objective assessment of the explaining quality of explanatory videos. Furthermore, the reliability of the measure has been found to be satisfactory (Cronbach's α = 0.58; in the earlier study by Kulgemeyer and Peters (2016), a comparable value of α = 0.69 has been reported). Moreover, the category system used for this study allows for a valid measure of explanatory videos' explaining quality, as has been justified by Kulgemeyer and Peters (2016).
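To make the agreement statistic concrete, the following sketch computes Cohen's Kappa for two raters from first principles (observed agreement corrected for chance agreement). It is only an illustration: how the two raters' codings were arranged for the reported κ is not stated in the text, so the flat lists of 0/1 decisions used here are an assumption, and the function name `cohens_kappa` is hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two equally long sequences of categorical codes."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)

    # Share of items on which both raters chose the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / n**2

    # Undefined (division by zero) if both raters always use one identical code.
    return (observed - expected) / (1 - expected)
```

With made-up codings in which the raters agree on 8 of 10 binary decisions and each assigns the category half of the time, the chance agreement is 0.5 and the sketch yields κ = 0.6.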
As a last step of data collection, we calculated the category points CP for each explanatory video included in our sample. These category points were then used in the data analysis.

Data analysis carried out to answer research question 1
We report descriptive statistics (range, median Mdn, mean m, standard deviation SD) regarding the category points of the explanatory videos on quantum entanglement and quantum tunnelling, respectively.
We conducted a correlation analysis in order to explore relationships between the videos' explaining quality (in category points CP) on the one hand, and the surface features provided by YouTube on the other hand. We report Pearson's correlation coefficient r because the data are of metric scale. We interpret correlation coefficients according to Cohen (1988): weak correlation for 0.1 < |r| < 0.3, moderate correlation for 0.3 ≤ |r| < 0.5, strong correlation for |r| ≥ 0.5. In addition, we report partial correlations to verify that observed relationships are not artefacts caused by
• the videos' time online, i.e. the time that has passed between the publication of a video and the data collection, and
• the number of subscribers to the channels by which the videos were published.
The latter control variable seems particularly important due to the fact that the YouTube algorithms promote videos published by popular channels which in turn leads to high numbers of views for these videos.This might influence the results, and hence, deserves special attention.
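The analysis steps described above (Pearson's r, its interpretation according to Cohen's thresholds, and a partial correlation controlling for one variable) can be sketched as follows. This is a sketch under stated assumptions and does not reproduce the software used in the study: the function names are hypothetical, the partial correlation uses the standard first-order formula for a single control variable, and the label "negligible" for |r| ≤ 0.1 is our own placeholder, since the text does not name that range.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def partial_r(x, y, z):
    """Correlation of x and y after controlling for one variable z.

    First-order partial correlation (an assumption about the method):
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    """
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


def cohen_label(r):
    """Verbal label for |r| using the thresholds quoted in the text."""
    size = abs(r)
    if size >= 0.5:
        return "strong"
    if size >= 0.3:
        return "moderate"
    if size > 0.1:
        return "weak"
    return "negligible"  # placeholder label, not taken from the source
```

In this scheme `x` would hold the category points, `y` a surface feature such as the number of likes, and `z` the control variable (time online or subscriber count). If `z` is uncorrelated with both other variables, the partial correlation coincides with the plain correlation.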

Data analysis carried out to answer research question 2
The comments below each video included in our sample were categorised. For the categorisation, we used the category system presented by Kulgemeyer and Peters (2016, p. 8), which consists of four categories:
1. Comment on content: "further question or comment on notations" (Kulgemeyer & Peters, 2016, p. 8).
4. Comment on use: description of "the viewer's use of the video, e.g., revising, preparing a talk or learning for a test" (Kulgemeyer & Peters, 2016, p. 8).
All comments that could be assigned to at least one of these categories were considered relevant comments. Comments that could not be assigned to any of these categories, conversely, were excluded from further analysis because they were not related specifically to the content presented in the respective video or to the explanation offered within. For the further analysis, we refrained from the deeper differentiation between the categories made by Kulgemeyer and Peters (2016), because research question 2 only addresses relevant comments in general.
The categorisation of all comments underneath the N = 60 explanatory videos included in our sample led to a total of 1452 relevant comments. The number of relevant comments for each video was included in our data set as a metric variable and was used for correlation analysis. Again, we additionally calculated partial correlations to verify that observed relationships are not artefacts caused by the videos' time online or the number of subscribers to the channels by which the videos were published.

Descriptives
The median value of the explanatory videos' explaining quality (measured in CP) was Mdn = 11 CP for our total sample, ranging from 2 CP (one video) to 18 CP (two videos). In table 1, descriptive statistics on the category points assigned to the videos in our sample are reported separately for the two subject areas under investigation, namely quantum entanglement and quantum tunnelling.

Correlation analysis
The correlation analysis results are summarized in table 2. Within the total sample, we find statistically significant correlations between the videos' explaining quality and the number of views (r = 0.27, p < 0.05), the number of likes (r = 0.37, p < 0.01), and the number of dislikes (r = 0.32, p < 0.05). The highest correlation is found between the videos' explaining quality and the number of relevant comments (r = 0.46, p < 0.01), whereas the correlation between the videos' explaining quality and their time online does not differ from 0 with statistical significance. A striking observation is the positive correlation between the number of dislikes and the measure of explaining quality, both in the total sample (r = 0.32, p < 0.05) and in the two sub-samples including videos on quantum entanglement (r = 0.37) and quantum tunnelling (r = 0.30). In order to better understand the underlying principles, we decided to introduce three further variables into our analysis:
1. We calculated the likes-to-dislikes ratio for each video included in our sample. This variable makes it possible to contrast the frequency of likes with that of dislikes for a given video, and thus could potentially be a more accurate measure of the quality of an explanatory video. A similar approach has already been taken by Meyer (2019).
2. We assumed that the interaction with a specific explanatory video, i.e., giving a like or a dislike to a video, requires the user to be cognitively activated to some extent.
We therefore introduced the variable interactions calculated via interactions = likes + dislikes, to explore the relationship between explaining quality and the number of interactions.This might provide further insights into how users interact with explanatory videos depending on their explaining quality.
3. Lastly, to check whether the numbers of likes and dislikes are really relevant variables with respect to explanatory videos' explaining quality, we calculated the likes-to-interactions ratio via likes/interactions, because for a high-quality explanatory video one could expect a high number of likes compared to the total number of interactions. Note that the dislikes-to-interactions ratio does not contain any further information, and hence leads to mathematically equivalent results in the correlation analysis (up to sign).
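The remark that the dislike-based ratio leads to equivalent results up to sign can be made explicit with a short derivation. The symbols ℓ (likes divided by interactions) and d (dislikes divided by interactions) are introduced here for illustration only and do not appear in the original analysis; the derivation assumes interactions = likes + dislikes, as defined above.

```latex
\begin{align*}
d &= \frac{\text{dislikes}}{\text{interactions}}
   = \frac{\text{interactions} - \text{likes}}{\text{interactions}}
   = 1 - \ell, \\
\operatorname{cov}(CP, d) &= -\operatorname{cov}(CP, \ell),
   \qquad \sigma_d = \sigma_\ell, \\
r(CP, d) &= \frac{\operatorname{cov}(CP, d)}{\sigma_{CP}\,\sigma_d}
   = -\,r(CP, \ell).
\end{align*}
```

Because d is an affine function of ℓ with slope -1, every correlation involving d equals the corresponding correlation involving ℓ with the sign reversed, and the p-values are identical.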
The correlations between the measure of explanatory videos' explaining quality and the three above-mentioned variables are shown in table 3: It becomes apparent that neither the likes-to-dislikes ratio (r = 0.11) nor the likes-to-interactions ratio (r = −0.03) is correlated statistically significantly with the explaining quality of the explanatory videos included in our sample. The only variable showing a moderate and statistically significant correlation with the videos' explaining quality is the number of interactions itself (r = 0.39, p < 0.01). From these results, of course, no particular (and a fortiori no causal) relationship between (1) the number of likes or dislikes (and their ratio) and (2) the videos' explaining quality can be inferred.

Table 3
Pearson's correlation coefficient r of the measure of explaining quality (in CP) with the variables likes-to-dislikes ratio, interactions, and likes-to-interactions ratio, respectively. For all correlations, we report 95% confidence intervals (95%-CI).

However, it seems that the total number of user interactions is correlated significantly with the videos' explaining quality, regardless of whether these interactions result in a like or a dislike in the end. To explore this in more detail, it is necessary to control the correlations presented in tables 2 and 3 for the videos' time online (in days) and for the number of subscribers to the channels by which the videos were published.
Therefore, we report partial correlations in the next subsection.

Partial correlations
In this subsection, we report partial correlations which refer to the entire sample.
This means that we do not distinguish between the sub-samples here for the sake of clarity.
Controlling the correlations between our explanatory videos' explaining quality (measured in CP) and the YouTube surface features for the videos' time online, we observe the following (cf. table 4): Besides significant correlations between explaining quality and the number of views (r = 0.33, p < 0.05), the number of likes (r = 0.40, p < 0.01), the number of dislikes (r = 0.33, p < 0.05), and the number of relevant comments (r = 0.55, p < 0.001), only the number of interactions (r = 0.41, p < 0.01) shows a significant correlation with the explaining quality of our videos. These partial correlations uncover similar relationships between YouTube's surface metrics and the videos' explaining quality as the ones presented earlier (cf. table 2).

Table 4
Partial correlations (controlled for the time online) between the measure of explaining quality (in CP) and YouTube's surface metrics as well as the likes-to-dislikes ratio, the number of interactions, and the likes-to-interactions ratio.

In a next step, we controlled for the number of subscribers to the channels by which the videos were published. The corresponding partial correlations are shown in table 5: Only three of the correlations remain statistically significant in this case, namely the ones between the explanatory videos' explaining quality and the number of likes (r = 0.43, p < 0.01), the number of relevant comments (r = 0.47, p < 0.001), and the number of interactions (r = 0.43, p < 0.01). In contrast, the correlations of the videos' explaining quality with the number of views and with the number of dislikes are no longer statistically significant. We will discuss these observations in the discussion section.

Discussion
In our exploratory study, we investigated how the explaining quality of YouTube explanatory videos on genuine quantum topics such as quantum entanglement and quantum tunnelling is correlated with the surface features provided by YouTube alongside each online video. In this section, we discuss the results of our study with regard to our research questions, and against the backdrop of a study published earlier that explored similar questions for explanatory videos on classical mechanics topics (cf. Kulgemeyer & Peters, 2016).

Discussion of research question 1
A correlation analysis revealed statistically significant correlations between the explanatory videos' explaining quality and the surface features provided by YouTube (cf. table 2):
• The correlation between the number of views and the explanatory videos' explaining quality is weak and statistically significant for the total sample (r = 0.27, p < 0.05) but not statistically significant for the videos on quantum entanglement and quantum tunnelling.
• The correlation between the number of likes and the explanatory videos' explaining quality is moderate and statistically significant for the total sample (r = 0.37, p < 0.01) and for the sub-sample including quantum entanglement videos (r = 0.42, p < 0.05).For the videos on quantum tunnelling, however, the correlation is not statistically significant.
• The correlation between the number of dislikes and the explanatory videos' explaining quality is moderate and statistically significant for the total sample (r = 0.32, p < 0.05).In contrast, it is not statistically significant for the videos on quantum entanglement and quantum tunnelling.
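Whether a coefficient counts as statistically significant depends on the sample size as well as on r. As a rough check, and assuming the usual two-sided t-test for Pearson's r (the test used is not restated in this section), the coefficient for the number of views in the total sample (N = 60) gives:

```latex
t = \frac{r\,\sqrt{N-2}}{\sqrt{1-r^{2}}}
  = \frac{0.27\,\sqrt{58}}{\sqrt{1-0.27^{2}}}
  \approx 2.14
```

This exceeds the two-sided 5% critical value of roughly 2.00 for 58 degrees of freedom, which is consistent with p < 0.05. For a fixed r, a smaller sample yields a smaller t, which is one reason why a correlation can be significant for the total sample but not for each topic separately.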
Our results compare well with the findings reported earlier for the mechanics context (cf. Kulgemeyer & Peters, 2016): While the correlations presented in both studies seem different at first glance (cf. table 6), we note that most of the correlations reported by Kulgemeyer and Peters (2016) fall within the 95% confidence intervals of our correlation coefficients (or vice versa).
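The comparison via confidence intervals can be sketched as follows. The sketch assumes the common Fisher z-transformation for the interval of a Pearson coefficient; how the intervals in table 6 were actually computed is not stated here, so the resulting bounds may differ from the reported ones. The function names are hypothetical; the coefficients used are the ones for the number of relevant comments quoted in this article (r = 0.46 with N = 60 in this study, r = 0.38 in Kulgemeyer & Peters, 2016).

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for Pearson's r via Fisher's z."""
    if n <= 3 or not -1 < r < 1:
        raise ValueError("need n > 3 and -1 < r < 1")
    z = math.atanh(r)                         # Fisher z-transform of r
    se = 1 / math.sqrt(n - 3)                 # standard error on the z scale
    q = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for 95%
    return math.tanh(z - q * se), math.tanh(z + q * se)


def contains(interval, value):
    """True if value lies inside the closed interval (low, high)."""
    low, high = interval
    return low <= value <= high


ci_this_study = fisher_ci(0.46, 60)   # relevant comments, this study
print(f"95% CI: [{ci_this_study[0]:.2f}, {ci_this_study[1]:.2f}]")
print(contains(ci_this_study, 0.38))  # coefficient from the earlier study
```

A coefficient from one study lying inside the other study's interval does not prove that the two are equal; it only means the difference is within what sampling variation alone could produce.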
In addition, our results shed new light on the underlying relationships: Kulgemeyer and Peters (2016) found no statistically significant correlation between the videos' explaining quality and the number of likes, although the authors expected such a correlation due to the 'illusion of understanding': "Students do not realise the possible inconsistencies in their understanding and feel as if they have understood a topic" (Kulgemeyer & Peters, 2016, p. 11). This assumption is supported by empirical evidence from a recently published experimental study by Kulgemeyer and Wittwer (2022).
For the explanatory videos on quantum topics included in our sample, we indeed uncovered a statistically significant correlation between the number of likes and the videos' explaining quality (r = 0.37, p < 0.01).
Moreover, we find the number of dislikes (r = 0.32, p < 0.05) and the number of views (r = 0.27, p < 0.05) to be statistically significantly correlated with the explaining quality of the videos on quantum entanglement and tunnelling. In contrast, Kulgemeyer and Peters (2016) did not find the corresponding correlations to be statistically significant for the videos on classical mechanics topics. The analysis of partial correlations, though, puts these differences between the two studies into perspective: We controlled the correlations between the videos' explaining quality and the surface features provided by YouTube for the number of subscribers to the channels by which the videos were published.
As a result, the correlation between explaining quality and views (r = 0.23) loses its statistical significance. To explain this observation, we follow Kulgemeyer and Peters (2016), who state that "the number of views is more influenced by [...] the popularity of the YouTube channel than the explaining quality" (p. 5). Likewise, the correlation between explaining quality and dislikes (r = 0.26) loses its statistical significance, though remaining moderate (cf. table 5).
Note to table 6: a N = 60, this study. b N = 51 (cf. Kulgemeyer & Peters, 2016, p. 10).
Lastly, we introduced the number of interactions, i.e., the sum of likes and dislikes for a given YouTube explanatory video, into the analysis (cf. table 3). The number of interactions correlates statistically significantly with the explaining quality of the explanatory videos on entanglement and tunnelling (r = 0.39, p < 0.01). The partial correlation, when controlling for the number of subscribers to the channels by which the videos were published, was even higher (r = 0.43, p < 0.01).
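The derived metrics can be sketched as below. The number of interactions follows the definition given above (likes plus dislikes). The two ratios are assumed here to be plain quotients, since their exact definitions are not restated in this section, and returning `None` for a zero denominator is a choice made for this illustration only. The class name and example counts are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VideoMetrics:
    """Like and dislike counts of one video (hypothetical container)."""
    likes: int
    dislikes: int

    @property
    def interactions(self) -> int:
        # Number of interactions as defined in the text: likes + dislikes.
        return self.likes + self.dislikes

    @property
    def likes_to_dislikes(self) -> Optional[float]:
        # Assumed definition: likes / dislikes; undefined without dislikes.
        return self.likes / self.dislikes if self.dislikes else None

    @property
    def likes_to_interactions(self) -> Optional[float]:
        # Assumed definition: share of likes among all interactions.
        total = self.interactions
        return self.likes / total if total else None


video = VideoMetrics(likes=90, dislikes=10)  # made-up counts
print(video.interactions, video.likes_to_dislikes, video.likes_to_interactions)
```

Unlike the likes-to-dislikes ratio, the likes-to-interactions ratio stays defined for videos without any dislikes, which matters for small channels.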

Discussion of research question 2
Compared to the metrics provided by YouTube alongside each video (e.g., the number of views), the number of relevant comments turned out to be most strongly correlated with the explaining quality of the explanatory videos, both for the total sample (r = 0.46, p < 0.01) and for the videos on (a) quantum entanglement (r = 0.59, p < 0.01) and (b) quantum tunnelling (r = 0.31, p < 0.1). Similarly, Kulgemeyer and Peters (2016, p. 10) report a correlation of r = 0.38 (p < 0.01) between explaining quality and the number of relevant comments for videos on Newton's third law and Kepler's laws.
We controlled the correlations between the videos' explaining quality and the number of relevant comments for the videos' time online (in days). As a result, the partial correlation between explaining quality and the number of relevant comments for the total sample increased (r = 0.55, p < 0.001). This result is comparable to the one reported for the mechanics context, where a partial correlation coefficient of r = 0.40, p < 0.01 was found (Kulgemeyer & Peters, 2016).
The medium to high correlation between the explanatory videos' explaining quality and the number of relevant comments might be explained by the users' cognitive activation: "Hence, videos that accumulate plenty of those relevant comments are more successful in catching viewers' attention as these videos might use either a more stimulating explanation or the explanation delivered is considered as a starting point for further learning progress" (Kulgemeyer & Peters, 2016, p. 12).

Conclusion
Our results support the findings presented earlier for YouTube explanatory videos on mechanics (cf. Kulgemeyer & Peters, 2016), according to which
• there is a statistically significant correlation between explaining quality and the number of content-related comments (r = 0.46, p < 0.001 in our study, cf. table 2), and
• YouTube's surface metrics (e.g., likes) might not be fruitful indicators for the explaining quality of explanatory videos (cf. table 5).
However, focusing on YouTube explanatory videos addressing quantum entanglement and tunnelling, our study contributes to extending previous results presented by Kulgemeyer and Peters (2016) in two respects:
1. We find a statistically significant correlation between the number of likes and the explaining quality of explanatory videos on the quantum topics entanglement and tunnelling (r = 0.37, p < 0.01, cf. table 2). Although such a correlation had already been assumed in the previous study (cf. Kulgemeyer & Peters, 2016), it could not be found at that time in the context of explanatory videos on topics of classical mechanics.
2. Our study hints that the number of interactions (i.e., the sum of likes and dislikes) might be an indicator for videos of high explaining quality (r = 0.39, p < 0.01, cf. table 3). We argue that this result fits well with the number of relevant comments being statistically significantly correlated with the explaining quality of explanatory videos (cf. table 2).

Limitations
It is important to note that the results presented in this article should be interpreted with caution for the following reasons:
1. We could only include a small number of N = 60 videos in our sample due to the huge amount of data and the great effort required for data analysis (e.g., categorization of all comments underneath each video).
2. Classical correlations, as presented in this article, allow for the exploration of relationships between variables, but not for the identification of causal connections.
3. The data analysis is largely based on the metrics provided by YouTube, which are not fully transparent to users (cf. Kulgemeyer & Peters, 2016).
4. In this study, we only analyzed explanatory videos on the topics quantum entanglement and tunnelling, and hence, the correlations found are not generalizable to different topics.

Outlook
Despite the above-mentioned limitations, our results may serve as a valuable starting point for future research, in particular with respect to teaching and learning quantum concepts: While only scientifically sound explanatory videos have been included in the analysis of this study, the internet is crowded with scientifically misleading or mystifying explanatory videos on quantum concepts, such as quantum entanglement and quantum tunnelling. Since YouTube's surface features, however, are not likely to provide reliable quality indicators, future educational research should (a) explore widespread misconceptions in explanatory videos on quantum concepts, and (b) make further efforts towards the derivation of evidence-based selection criteria that support both students and teachers/lecturers in detecting high-quality content out of the dark noise.
Both the Cognitive load theory and the Cognitive Theory of Multimedia Learning have been the basis for prior research on explanatory videos aimed at fostering student learning (cf.

Table 1
Descriptive statistics on the measure of explaining quality of the videos included in our sample (expressed in category points CP).

Table 2
Pearson's correlation coefficient r between the measure of explaining quality (in CP) and the surface features (incl. number of relevant comments) for the total sample, the videos on quantum entanglement and the ones on quantum tunnelling, respectively. For all correlations, we report 95% confidence intervals (95%-CI).

Table 5
Partial correlations (controlled for the number of subscribers) between the measure of explaining quality (in CP) and YouTube's surface metrics as well as the likes-to-dislike ratio, the number of interactions, and likes-to-interactions ratio. Note. Statistical significance of the correlations is denoted by an asterisk: * p < .05. ** p < .01.

Table 6
Pearson's correlation coefficient r between the measure of explaining quality (in CP) and the surface features provided by YouTube. For the correlations calculated in our study, we report 95% confidence intervals (95%-CI).