Introduction

Many internet users turn to online video platforms for medical information [1]. The internet has shifted patients’ roles from passive information recipients to active information searchers, thereby increasing patient activation in managing their health care [2]. However, inaccurate information may mislead patients into making poor decisions and may even affect their outcomes [3, 4]. Therefore, improving the quality and content of health-related online videos can improve the public’s understanding of health [5].

YouTube, recognized as the most extensive global long-video platform [6, 7], remains inaccessible in China. Bilibili and TikTok, which dominate the long-video and short-video segments of the Chinese market, respectively, have filled this gap [5, 8, 9].

Although laryngeal cancer accounts for only 1% of total cancer cases and related deaths, it is one of the most prevalent types of head and neck cancer [10]. There are many videos related to laryngeal carcinoma on these platforms. Nevertheless, the quality evaluation of these videos remains sparse [11]. This study aims to identify upload sources, contents, and feature information of these videos on YouTube/ Bilibili/ TikTok and further evaluate the video quality. We expect to provide suitable directions for the public to learn about laryngeal carcinoma from online videos and give comprehensive advice to content creators and platforms.

Materials and methods

Ethical considerations

All information was obtained from publicly released YouTube, Bilibili, and TikTok videos, and none of the data involved personal privacy concerns. Therefore, no ethics review was needed.

Video collection

On January 1, 2024, a search was performed on YouTube with the keywords “laryngeal carcinoma” and “throat cancer” in English. Bilibili and TikTok were searched using the term “喉癌” (laryngeal carcinoma in Chinese; this is both the scientific name and the common colloquial term, and it is written identically in simplified and traditional Chinese characters). Before searching, we logged out of all accounts and cleared the search history to avoid bias from personalized recommendations. The search results were presented in the default order without any filtering criteria. We skipped videos published within the preceding week because their view and like counts were still unstable and could not accurately reflect audience engagement. Advertising videos were also skipped (Additional file 1). The top 100 videos on each platform were collected.

Video characteristics

On January 1, 2024, various attributes of the videos were systematically documented, including video length, duration, views, views/30 days, thumbs up, thumbs up/30 days, comments, comments/30 days, coins, collections, collections/30 days, shares, and shares/30 days. However, the following data were unavailable: ① views on TikTok, ② collections and shares on YouTube.
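The text does not state how the per-30-day metrics were derived. One plausible reading, shown here only as a minimal sketch, is that each raw count was scaled by the number of days between upload and data collection. The function name `per_30_days` and the scaling rule itself are assumptions for illustration, not the authors’ documented procedure.

```python
from datetime import date

def per_30_days(count, upload_date, collected_on):
    """Scale a raw engagement count to a 30-day rate.

    Assumed rule (not taken from the paper): count * 30 / days online.
    """
    days_online = (collected_on - upload_date).days
    if days_online <= 0:
        raise ValueError("the video must be uploaded before the collection date")
    return count * 30 / days_online

# Hypothetical video uploaded 60 days before the January 1, 2024 collection date.
views_rate = per_30_days(100, date(2023, 11, 2), date(2024, 1, 1))
```

Normalizing in this way makes older and newer videos comparable, which matters because the platforms differ in how much their default ordering favors recent uploads.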

Uploader characteristics

On the same day, details regarding the uploaders were gathered, including ID, number of fans, certification status, and type. Certification was determined based on specific criteria (Additional file 1). The video uploaders were categorized as doctors, other medical workers/students, hospitals/ departments/ associations (also regarded as non-profit organizations), for-profit companies, official media (the media under government regulation, such as BBC), and self-media. Self-media uploaders were regarded as non-professionals, and all others as professionals.

Video review and categorization

Between January 2 and 7, 2024, two authors (ZY.L. and YW.C.) independently reviewed the videos and excluded similar or irrelevant videos (Additional file 1). The topics of the videos were categorized as anatomy, etiology/ prevention, pathology, epidemiology, symptoms, examinations/ diagnosis, treatment, and prognosis. The number of topics covered by each video was recorded, as some videos addressed multiple topics. Videos not covering any of these topics were deemed irrelevant and were excluded.

Originality and style assessment

Videos that were direct reprints, translations, or rough re-edits of existing material were not considered original (Additional file 1). The style of video shooting was classified as solo narration, questions and answers (Q&A), PPT/class, animation/action, medical scenarios, TV show/documentary, and others (Additional file 1).

Quality assessment

Two authors (ZY.L. and YW.C.) independently assessed the quality of the remaining videos from January 8 to 18, 2024. A third arbitrator (Y.L.) assigned the final score if the two raters’ scores were inconsistent. Furthermore, we used Cohen’s kappa (κ) to quantify the agreement between the two raters. The Patient Education Materials Assessment Tool (PEMAT), Video Information and Quality Index (VIQI), Global Quality Score (GQS), and modified DISCERN (mDISCERN) were utilized to evaluate the video quality.
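As an illustration of the agreement statistic named above, the following minimal sketch computes unweighted Cohen’s κ for two raters in plain Python. The authors ran their analysis in SPSS, and the text does not say whether a weighted or unweighted κ was used or which scores it was computed over, so the unweighted form and the function name `cohen_kappa` are assumptions.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items that received identical scores.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions,
    # summed over every category.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one and the same category throughout
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical yes/no ratings of four videos by two raters.
kappa = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

In this hypothetical example the raters agree on three of four videos (observed agreement 0.75) while chance agreement is 0.5, so κ is 0.5.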

The PEMAT [12] consists of 25 questions, with 21 representing the understandability of health information and 4 evaluating the actionability of recommendations by videos. Each item is scored as “agree = 1, disagree = 0, N/A”. The total score (PEMAT-T), the understandability score (PEMAT-U), and the actionability score (PEMAT-A) are each calculated as “Total Points/Total Possible Points × 100”. Higher scores indicate better performance.
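The percentage formula above can be sketched as follows. This is a minimal illustration, assuming that items rated N/A are left out of “Total Possible Points”; the function name `pemat_score` and the use of `None` to encode N/A are choices made for this example only.

```python
def pemat_score(ratings):
    """Percentage score from item ratings: 1 = agree, 0 = disagree, None = N/A."""
    # Assumption: N/A items do not count toward the total possible points.
    applicable = [r for r in ratings if r is not None]
    if not applicable:
        raise ValueError("no applicable items to score")
    if any(r not in (0, 1) for r in applicable):
        raise ValueError("ratings must be 1, 0, or None")
    return sum(applicable) / len(applicable) * 100

# Hypothetical actionability ratings for one video: two agree, one disagree, one N/A.
actionability = pemat_score([1, 0, None, 1])
```

The same function serves for PEMAT-U, PEMAT-A, and PEMAT-T; only the subset of items passed in changes.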

The VIQI tool [13] encompasses four dimensions: information flow (VIQI 1), information accuracy (VIQI 2), quality (videos including one point for each image, animation, interview, video captions, and summary) (VIQI 3), and precision (level of coherence between video title and content) (VIQI 4). Each criterion is rated on a scale of 1 to 5, with higher scores indicating better quality.

The GQS [14], a 5-point scale, assessed overall video quality, ranging from poor (1) to excellent (5).

The mDISCERN was adapted from the DISCERN tool and is more suitable for assessing video material [6, 15]. It consists of five questions related to the reliability of the video. Each question is scored 1 for “yes” and 0 for “no”. Higher scores correspond to greater reliability.

Previous studies have validated the above tools, particularly in the context of social media platforms [5,6,7,8,9, 12,13,14]. Additional files 2 and 3 provide detailed descriptions of these tools.

Statistical analysis

We used IBM SPSS version 24.0 to analyze the data. The Shapiro–Wilk test was applied to test the normality of continuous variables. Normally distributed continuous variables were presented as mean ± SD (standard deviation), while those without a normal distribution were presented as median, min–max values, and 25th–75th percentiles (M[P25, P75]). We used Cohen’s κ to quantify the agreement between the two raters. The κ values were interpreted as follows: κ > 0.8 indicated excellent agreement; 0.6 < κ ≤ 0.8 suggested substantial agreement; 0.4 < κ ≤ 0.6 signified moderate agreement; and κ ≤ 0.4 was indicative of poor agreement. The Mann–Whitney U test was applied to compare continuous variables without normal distribution. Categorical variables were reported as numbers and rates. The Chi-square test, continuity correction, or Fisher’s exact test was used to compare categorical variables. Pairwise comparisons were used to identify differences among the three platforms. We performed Spearman correlation analysis to evaluate the relationship between audience interaction and video quality. Spearman’s rank correlation coefficient (r) was used, with r > 0 denoting a positive correlation and r < 0 indicating a negative correlation. The strength of the correlation was classified as follows: |r| ≤ 0.2 represented no relationship; 0.2 < |r| ≤ 0.4 implied a weak relationship; 0.4 < |r| ≤ 0.6 indicated a moderate relationship; 0.6 < |r| ≤ 0.8 suggested a strong relationship; and |r| > 0.8 denoted a very strong relationship. A value of P < 0.05 was considered statistically significant.
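The interpretation thresholds stated above for κ and for Spearman’s r can be written as two small lookup functions. This is a minimal sketch: the cut-offs come from the text, while the function names and return labels are illustrative choices.

```python
def interpret_kappa(kappa):
    """Map Cohen's kappa to the agreement label used in this study."""
    if kappa > 0.8:
        return "excellent"
    if kappa > 0.6:
        return "substantial"
    if kappa > 0.4:
        return "moderate"
    return "poor"

def interpret_spearman(r):
    """Return (strength, direction) for a Spearman coefficient r."""
    strength = abs(r)
    if strength <= 0.2:
        label = "none"
    elif strength <= 0.4:
        label = "weak"
    elif strength <= 0.6:
        label = "moderate"
    elif strength <= 0.8:
        label = "strong"
    else:
        label = "very strong"
    direction = "positive" if r > 0 else "negative" if r < 0 else "zero"
    return label, direction

# A kappa of 0.78 falls in the 0.6 < kappa <= 0.8 band.
agreement = interpret_kappa(0.78)
```

Because each band is closed at its upper bound, a coefficient exactly on a boundary (for example |r| = 0.4) is assigned to the lower category.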

Results

Video characteristics

Our study included 99 videos from YouTube, 76 from Bilibili, and 73 from TikTok after excluding duplicates and irrelevant content (Fig. 1). The irrelevant content included videos about celebrities with laryngeal cancer and videos of patients seeking sympathy. All videos on YouTube were in English or had English subtitles. Similarly, all videos on Bilibili and TikTok were in Chinese or had Chinese subtitles. The characteristics of the videos from YouTube, Bilibili, and TikTok are detailed in Table 1. None of the continuous variables was normally distributed according to the Shapiro–Wilk test. TikTok videos (42[28.5–74] seconds) were notably shorter than those on YouTube (193[102–467] seconds) and Bilibili (136[89.25–276] seconds), and were generally newer based on their upload dates. TikTok led in audience interaction, showing the highest numbers of thumbs up and comments, while Bilibili exhibited the least interaction across all metrics.

Fig. 1

Search strategy for videos on laryngeal cancer

Table 1 Characteristics of videos about laryngeal carcinoma on YouTube/ Bilibili/ TikTok

Uploader characteristics

In this study, there were 71 uploaders on YouTube, 65 on Bilibili, and 42 on TikTok (Table 2). TikTok authors owned the largest number of followers and uploaded videos more frequently, contrasting with the lower activity among Bilibili authors. The categories of uploaders varied significantly across platforms (Fig. 2A). Nearly three-quarters of the authors on YouTube were from non-profit organizations (hospitals/departments/associations), followed by individual doctors. Bilibili uploaders mainly consisted of self-media, with doctors as the second-largest group. On TikTok, over half of the accounts belonged to doctors, followed by official media. TikTok uploaders had the highest rate of certification (83.3%), with all doctors being certified (Fig. 2B). In addition, seven uploaders on Bilibili were doctors of traditional Chinese medicine (TCM), six of whom were certified.

Table 2 Characteristics of video uploaders about laryngeal carcinoma on YouTube/ Bilibili/ TikTok
Fig. 2

Numbers of video uploaders about laryngeal carcinoma on YouTube/ Bilibili/ TikTok. A All authors. B Certified authors

Video categorization

Video categorization is shown in Table 3. Original content was predominant on YouTube (98.0%) and TikTok (94.5%) but less so on Bilibili (69.7%). Topic variety differed across platforms, with YouTube and Bilibili videos covering more topics than TikTok videos, which we attribute to their longer duration. Treatment and symptoms were the most and second most common topics on YouTube and Bilibili. Many videos addressed how the symptoms of laryngeal cancer differ from those of chronic pharyngitis. Transoral laser microsurgery, an ideal treatment for early-stage laryngeal cancer [16], was mentioned in several videos. Prognosis was the most common topic on TikTok, followed by treatment, mainly because patients shared their postoperative experiences (especially how to talk after surgery). Some doctors also filmed postoperative follow-up visits in the outpatient department.

Table 3 Categorization of videos about laryngeal carcinoma on YouTube/ Bilibili/ TikTok

The style of video shooting also varied, with solo narration being the most common across platforms and PPT/class more prevalent in long-video platforms.

Additionally, 21 videos on Bilibili focused on TCM, but only 12 were from professional uploaders. Only one video on TikTok mentioned TCM, uploaded by a self-media account.

Video quality

The κ value indicating interobserver reliability was 0.78. Overall, YouTube videos were of the best quality, with the highest VIQI-sum, GQS, and mDISCERN-sum scores, and these differences were statistically significant (Table 4). Despite similar PEMAT-T scores across all the platforms, TikTok videos scored highest in PEMAT-U and lowest in PEMAT-A. The scores of Bilibili and TikTok differed little overall. Professional authors generally outperformed non-professionals in GQS and mDISCERN (Table 5).

Table 4 Quality assessment of videos about laryngeal carcinoma on YouTube/ Bilibili/ TikTok
Table 5 Quality comparison between the videos uploaded by professionals and non-professionals

Correlation analysis

No strong relationships were found between video quality and audience interaction (Table 6). In general, VIQI, GQS, and mDISCERN had weak to moderate positive relationships with audience interaction. PEMAT scores appeared mostly unrelated to audience interaction. Unexpectedly, on TikTok, mDISCERN correlated negatively with thumbs up (moderate), comments (weak), and shares (weak).

Table 6 Spearman correlation between video quality and audience interaction on YouTube/ Bilibili/ TikTok

Discussion

The use of social media in public health education has been increasing due to its ability to remove physical barriers that traditionally impede access to healthcare support and resources [17,18,19]. In recent decades, digital video has been widely used as an important information carrier for patients’ education.

In the field of otorhinolaryngology head and neck surgery, YouTube and TikTok content has been investigated for the educational value of videos about cholesteatoma [20], pediatric tonsillectomy [21], middle ear ventilation tubes [22], rhinoplasty [23], tinnitus [24], nasopharyngeal carcinoma [25], and thyroid cancer [8, 26, 27]. However, the overall quality of these videos was not satisfying. Similar studies also raised concerns about the misinformation on social media and called for the responsibility of health specialists to improve health-related content [4, 19].

Few articles concerning laryngeal cancer-related videos have been published, except for Narwani’s research in 2016 [11]. However, his research only involved 54 videos from Google/ Yahoo/ Bing/ MSN and suggested much of the laryngeal cancer information was of suboptimal quality and written at a level too difficult for the average adult to read comfortably [11]. With the rapid growth of social media platforms, the findings may have varied recently.

Principal findings

This study is the first comprehensive evaluation of laryngeal cancer-related videos across major video platforms. Research on Bilibili remains sparse, and we filled this gap. Our study also revealed the differences between YouTube, Bilibili, and TikTok through detailed statistical pairwise comparisons. We used four tools to reach the overall judgment that YouTube videos performed best but still needed improvement. We considered factors such as originality, certification, video shooting style, and TCM, elements often overlooked in previous research. Our findings not only guide public access to health information but also provide helpful insights for both content creators and the platforms.

Video characteristics

Differences in platform history and in search algorithms explain the variance in video duration and relevance, respectively. YouTube, established in 2005 [28], may give less weight to publication date in its search ranking (default order according to “relevance”), whereas Bilibili (established in 2009 [9]) and TikTok (established in 2016 [29]) use a complex algorithm incorporating recency and user engagement (default order according to “comprehensiveness” [9]).

Interestingly, despite lacking health information, some irrelevant videos on TikTok/Bilibili achieved high viewership and engagement. This phenomenon was also found in studies on other diseases [30,31,32]. Those videos were most likely to feature hot topics (such as celebrities). In addition, videos from patients might arouse public compassion and attract substantial traffic. This suggests a potential strategy for video creators to increase audience interaction.

Despite being the newest platform, TikTok exhibited the highest engagement levels in our study. This aligns with studies suggesting that shorter videos tend to be more addictive and disseminate rapidly due to their suitability for consumption during brief intervals [33, 34]. This may also explain why TikTok had the highest VIQI-1 (information flow) score and the largest number of followers.

These findings highlight the evolving landscape of online health information dissemination, where platform-specific characteristics significantly influence content relevance and audience engagement.

Uploader characteristics

Our study reveals that TikTok authors tended to upload more videos, likely because longer videos can be segmented into shorter clips, inflating the number of uploads.

The diversity in uploader types across platforms could be attributed mainly to the varying certification policies of each platform. YouTube’s algorithm favors group accounts, making over half of its uploaders non-profit organizations. This finding on YouTube agrees with previous studies on other diseases [5, 6, 20, 35]. TikTok, with its stringent certification requirements, only allows certified attending/associate/chief doctors from grade 3, first-class hospitals (the top level of China’s hospital ranking system) to use the title “doctor,” as corroborated by multiple studies [5, 8, 27, 36]. While enhancing content authenticity, this policy restricts the participation of resident doctors, grassroots doctors, and medical students in China, leading them to gravitate towards Bilibili, which has more relaxed certification processes. Meanwhile, Bilibili’s leniency in certification standards allows a greater number of non-professionals to publish health-related content, raising potential concerns about the quality and reliability of these videos. As creators receive more support from the platforms after certification, we suggest that all professionals apply for it.

It is said that China is now vigorously promoting TCM on several platforms, but few studies have mentioned TCM [8, 9]. According to our research, Bilibili was aligned with Chinese health policy favoring TCM, but the videos posted by non-professionals raised quality and authenticity concerns. Zheng’s 2023 study on TikTok liver cancer videos noted lower-quality TCM content [9]. Unexpectedly, TikTok had only one TCM video. This contrasted sharply with Yang’s 2022 study, in which TCM videos comprised a quarter of TikTok’s top 100 thyroid cancer search results [8]. This discrepancy may be due to a recent strict certification rule on TikTok (Additional file 1). Many TCM doctors who work in relatively lower-ranking hospitals are not allowed to post TCM videos on TikTok because they cannot get certified. In conclusion, there is a clear need for improvements in the quality and representation of TCM content on these platforms.

Video content

While previous studies have overlooked originality, it is crucial for content creators and platforms. Bilibili’s copyright policy (Additional file 1) mainly contributes to its lowest originality rate. Video uploaders should also be aware of copyright issues. Watermarks are recommended for original content. Uploaders should also avoid infringement when reposting or translating others’ works. All video platforms should protect and support authors who produce high-quality original content.

Longer videos on YouTube and Bilibili encompass a broader range of topics due to their capacity to convey more information. Consistent with our study, treatment and symptoms were the most common topics in some previous studies about other diseases [5, 6, 8]. Prognosis was the most popular topic on TikTok, mainly because short videos are better suited to sharing everyday life. The TikTok videos about prognosis suggest that patients with laryngeal cancer can usually receive ideal treatment in China.

There are no studies examining the style of video shooting. Based on our experience, solo narration has always been the predominant video style across all platforms, likely due to its ease of production, particularly for individual creators like doctors with limited time and resources for video production. However, solo narration often relies heavily on auditory information, potentially limiting the amount of visual content. In contrast, other styles, like PPT or class, provide more comprehensive audio-visual information but are more common on long-video platforms. These formats, though information-rich, tend to be longer (often more than 10 min) and may not be ideally suited for the general public, targeting mainly medical professionals. It is challenging for video creators to find an ideal style to deliver high-quality medical information concisely and understandably.

Video quality

The choice of tools is crucial in assessing video quality. Though PEMAT, VIQI, GQS, and mDISCERN are widely utilized [6, 12,13,14,15, 37, 38], our experience suggests certain limitations. The origin, contents, and limitations of these tools are shown in Additional files 2 and 3.

Short videos such as those on TikTok, while more accessible and easier to understand, often contain less information, leading to a higher PEMAT-U but lower PEMAT-A, VIQI-2 (accuracy), VIQI-3 (video shooting aids), mDISCERN-4 (references), and mDISCERN-5 (uncertainty) scores. Despite their brevity, TikTok videos scored well in mDISCERN-1 (clear target) and mDISCERN-3 (balance), which is attributable to the platform’s rigorous doctor certification. TikTok’s high user engagement contributed to its strong performance in VIQI-1 (flow) and helped balance its overall VIQI and GQS. Bilibili’s higher proportion of non-professional uploaders potentially affected its VIQI-2 (accuracy), mDISCERN-2 (reliability), and mDISCERN-3 (balance) scores. Table 5 also supports the inferior quality of videos uploaded by non-professionals. TikTok’s lowest VIQI-4 (appropriate title) score is due to its interface, where videos auto-play without user selection based on titles, unlike YouTube and Bilibili. Thus, titles are not very important on TikTok.

In general, videos on laryngeal cancer on YouTube are better than those on Bilibili and TikTok due to YouTube’s prioritized algorithms for health-related videos, the highest ratio of professionals (especially non-profit organizations), and a more comprehensive range of video styles. These findings align with other comparative studies between YouTube and TikTok on different medical topics [39,40,41]. However, Bilibili and TikTok were found to have higher-quality gastric cancer videos than YouTube [5]. Additionally, the average quality of laryngeal cancer videos across all platforms was no better than that of videos about other diseases.

For Chinese audiences who cannot access YouTube, we recommend more translation (or secondary creation) of high-quality YouTube content for local platforms. We also encourage translating high-quality Chinese videos into English and uploading them to YouTube.

Relationship between video quality and flow

Our study uncovered that the relationship between video quality and audience interaction was not strong. This observation, consistent with previous similar studies, suggests that viewers often cannot discern the quality of health-related videos [9, 32]. Public health education needs to be strengthened. Platforms should not rely solely on views and likes for video recommendations but rather enhance oversight and algorithm optimization to promote high-quality health content. This approach is crucial for ensuring viewers receive accurate and reliable health information.

Limitations

First, the tools for video evaluation remain to be refined despite their wide application. Though we used four tools and three well-trained doctors rated the videos, we could not rule out potential systematic bias. Secondly, certain data (such as video views on TikTok) were unavailable due to platform restrictions. None of the platforms provides a thumbs-down button to express opposition, so negative opinions remain unknown. Thirdly, these findings might not be fully applicable in different linguistic contexts. Fourthly, additional text information (such as the textual introduction under the video page and the content of comments) was not included. Finally, this study’s findings represent a snapshot in time and from a particular region, based on a limited sample size. With the rapid growth of social media platforms, the findings may vary greatly over time.

Conclusions

This study provides reliable information for the public to understand the present state of laryngeal cancer-related online videos on social media platforms. The findings are helpful for the public, content creators, and the platforms. Videos on social media platforms can help the public learn about laryngeal cancer to some extent. Short-video platforms such as TikTok have strong interactive functions but carry less information. Long-video platforms such as YouTube and Bilibili have less traffic but provide more information. In general, videos on YouTube are of the best quality but still need improvement. Chinese uploaders are encouraged to translate high-quality YouTube videos and post them on Chinese platforms. More professional content creators are needed to enhance the quality of videos related to laryngeal carcinoma, as some non-professionals might degrade the overall video quality. Video creators face the challenge of balancing the length and richness of content while delivering high-quality medical information concisely and understandably. They should also be aware of certification, originality, and the style of video shooting, which may help them make better videos and achieve more audience engagement. Finally, given the public’s limited ability to discern video quality, enhanced oversight and algorithm optimization by platforms are needed to promote high-quality health-related videos.