From 1990 to 2019, the global population increased substantially, from 5.3 billion to 7.7 billion (about 44.6%) [1, 2]. Over the same period, the number of deaths among people under 20 years of age increased from 6.22 million to 13.93 million, an increase of 124% [1, 2, 3]. This means that the number of deaths among children/adolescents grew almost three times as fast as the global population. Additionally, the disparities in adolescent mortality between countries and between sexes are widening [3, 4]. Given this alarming trend, scientific research is crucial for addressing the global health challenges of children/adolescents and can aid in decision-making.

The current priorities in funding and policies in the field of child health research are increasingly focused on innovation in an endeavor to improve clinical outcomes. The United States and the United Kingdom have established relevant working groups in this area and issued several guidelines [5, 6]. However, summarizing all existing research based on published literature is challenging, given the broad areas of research in pediatric medicine.

The journal classification systems currently used are neither refined enough to identify and classify all relevant studies nor capable of elaborating on the topics or concepts [7]. Semi-automated methods that identify key topics based on text analysis offer an alternative solution to this issue [8, 9]. Software such as “CiteSpace” and “VOSviewer” has demonstrated similar concepts [8, 9]. Researchers have previously summarized the hotspots and analyzed the research trends in cardiovascular medicine during 2004–2013 using artificial intelligence (AI)-based natural language processing (NLP) technology, involving the analysis of more than 400,000 literature reports [7]. Machine learning models have also been used to classify over 200,000 pediatric research publications [10]. However, these conclusions require updating and will benefit from meticulous, large-scale, and long-term analytical studies. Therefore, this study aimed to identify the themes and evolutionary trends in pediatric medical research by assessing studies published during 1940–2021 (particularly in the last 40 years) and evaluating the change in these trends over time.

The dataset used in this study included titles, abstracts, and keywords from 2,580,642 pediatric medical publications from 1940 to 2021. These were obtained by searching with 14 pediatric medical search terms based on Medical Subject Headings (MeSH) phrases provided by PubMed (the specific search terms are shown in Supplementary Table 1) and eliminating duplicates. All the titles and abstracts from these publications were used, and the time information and noun phrases (pieces of text of various lengths) were extracted using an NLP framework developed in Python 3.9. Generic terms commonly used in most documents were filtered out to generate a series of highly specific text fragments and themes, improving accuracy and highlighting class features. Five experts from the field of pediatric medicine ranked the topics based on the first 30 text fragments representing the topics; they then cross-reviewed them and eventually merged similar topics. Two complementary approaches, latent Dirichlet allocation (LDA) and the k-means clustering algorithm (k-means), were used to categorize the literature [11, 12]. Supplementary Fig. 1 illustrates the specific data analysis flow.
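The filtering of generic terms can be illustrated with a minimal sketch. It assumes that a term counts as "generic" when it occurs in more than a chosen share of documents; the function name `filter_generic_terms`, the cutoff `max_doc_ratio`, and the toy documents are hypothetical, since the source does not state the criterion actually used.

```python
from collections import Counter


def filter_generic_terms(docs, max_doc_ratio=0.5):
    """Drop phrases that occur in more than max_doc_ratio of the documents.

    docs: list of lists of noun phrases (one inner list per publication).
    max_doc_ratio: hypothetical cutoff; the source does not state one.
    """
    n_docs = len(docs)
    # Document frequency: in how many publications each phrase occurs.
    doc_freq = Counter(phrase for doc in docs for phrase in set(doc))
    generic = {p for p, df in doc_freq.items() if df / n_docs > max_doc_ratio}
    filtered = [[p for p in doc if p not in generic] for doc in docs]
    return filtered, generic


# Toy data for illustration only (not taken from the study).
docs = [
    ["child", "malaria", "treatment"],
    ["child", "asthma", "allergy"],
    ["child", "obesity", "nutrition"],
    ["child", "malaria", "parasite"],
]
filtered, generic = filter_generic_terms(docs)
print(generic)      # {'child'}
print(filtered[0])  # ['malaria', 'treatment']
```

Here "child" occurs in every toy document and is removed, whereas "malaria" occurs in exactly half of them and is kept because the cutoff is strict.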

This study was conducted based on three time periods: pre-2010, 2010–2020, and post-2020. During these periods, word clouds were drawn for keywords, and the top five keywords with the highest frequency of occurrence (in a larger font on the figure) in each period were identified. For the pre-2010 period, the major keywords were “developing countries,” “population,” “demographic factors,” “infant,” and “adolescence”; for the 2010–2020 period, they were “pediatrics”, “epidemiology”, “adolescents”, “obesity” and “adolescent”. The post-2020 period keywords with the highest frequency were “coronavirus disease 2019 (COVID-19)”, “pediatrics”, “adolescents”, “epidemiology” and “obesity”; common words such as “child” and “children” were not included (Fig. 1a).
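The per-period keyword ranking behind the word clouds can be sketched as follows. This is an illustration under assumptions: the helper names `period_of` and `top_keywords`, the lowercasing of keywords, and the toy records are hypothetical and are not described in the source.

```python
from collections import Counter

# Common words left out of the ranking, as described in the text.
EXCLUDED = {"child", "children"}


def period_of(year):
    """Map a publication year to one of the three periods used in the study."""
    if year < 2010:
        return "pre-2010"
    if year <= 2020:
        return "2010-2020"
    return "post-2020"


def top_keywords(records, n=5):
    """records: iterable of (year, [keywords]).

    Returns {period: [(keyword, count), ...]} with the n most frequent keywords.
    """
    counts = {}
    for year, keywords in records:
        counter = counts.setdefault(period_of(year), Counter())
        counter.update(k.lower() for k in keywords if k.lower() not in EXCLUDED)
    return {period: c.most_common(n) for period, c in counts.items()}


# Toy records for illustration only.
records = [
    (2005, ["Infant", "Child"]),
    (2008, ["infant", "population"]),
    (2021, ["COVID-19", "children"]),
]
print(top_keywords(records))
```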

Fig. 1
Hot topics and trends in pediatric medicine research. a Frequency of each keyword in the research in the three periods; the higher the frequency, the larger the corresponding word. (The top five keywords with the highest frequency of occurrence in each period are marked in red.) b Trends in the number of studies in the three fields of clinical, population, and development during 1982–2021 and projections over the next 5 years. c–e Increasing trends in clinical, growth, and population studies in the last 10 years. Here, 9, 10, and 12 of the fastest-growing research hotspots are displayed (c–e, respectively). Each has at least doubled in number over the past 10 years. f Distribution of the document clusters during 2017–2018 and 2020–2021. The top nine largest document clusters presented in 2017–2018 accounted for 98.1% of the total documents in that period. The top seven largest document clusters presented in 2020–2021 accounted for 97.9% of the total documents in that period. COVID coronavirus disease, ADHD attention deficit and hyperactivity disorder, MRI magnetic resonance imaging

In total, 111 pediatric medical topics were identified and are listed alphabetically in Supplementary Table 2. These topics were further categorized into “clinical”, “growth,” and “population” studies, e.g., the characteristics and treatment of pediatric malaria in clinical studies, cognitive development of children in developmental studies, and the health status of children in developing countries in population studies. AutoReg was used to predict the growth trend of the three categories of studies in the next 5 years based on the number of publications in the preceding 40 years (Fig. 1b) [13]. The number of clinical studies was observed to be increasing rapidly.
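The idea of an autoregressive projection can be illustrated with a simplified stand-in. The sketch below fits a first-order model y[t] = c + phi * y[t-1] by ordinary least squares and iterates it 5 steps ahead. It is not the AutoReg configuration used in the study (the lag order and settings are not stated in the text); the function names and the toy series are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1]; returns (c, phi)."""
    x = series[:-1]  # lagged values
    y = series[1:]   # current values
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast(series, steps=5):
    """Iterate the fitted AR(1) recursion `steps` times past the last value."""
    c, phi = fit_ar1(series)
    predictions = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions


# Toy yearly publication counts growing by 10% per year (illustration only).
counts = [100, 110, 121, 133.1, 146.41]
print(forecast(counts, steps=5))
```

A first-order model is the smallest member of the autoregressive family; a real analysis would typically select the lag order from the data and report uncertainty around the projection.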

The “hotspots” with significantly increased numbers of publications in clinical, growth, and population studies over the last decade are displayed in Fig. 1c–e, respectively. Allergy and immunotherapy studies were found to be extremely “popular” in clinical studies, which may be due to the high incidence of such diseases [2, 15]. More than 50% of children had at least one allergic disease, significantly reducing their quality of life [16]. Conventional fields of study including infectious diseases, congenital diseases, and pediatric imaging were observed to be expanding. Children’s psychological health, exposure to toxic substances, and the impact of coronavirus emerged as the latest research hotspots. In population studies, studies related to health education, healthcare, and fertility policies were extremely “popular.” This may be because large-scale, comprehensive public health interventions can effectively improve the health of children [4, 17]. Our results indicated that the number of studies targeting socioeconomic factors and people in developing nations increased rapidly. Research on child behavior, family relationships and child health, child nutrition, and child maltreatment and neglect showed a continuous growth trend. Moreover, research on child development and physical and mental health grew rapidly. Supplementary Table 3 compares the rapidly growing themes in 2021 with those in 2012, showing the number and growth rate of publications.
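The selection of fast-growing topics can be sketched as a simple ratio test between two years. This is a minimal sketch assuming that growth is measured as the end-year count divided by the start-year count and that a topic qualifies when the count has at least doubled; the function name `fast_growing` and the counts are hypothetical.

```python
def fast_growing(counts_start, counts_end, min_ratio=2.0):
    """Return {topic: ratio} for topics whose count grew by at least min_ratio.

    counts_start / counts_end: {topic: number of publications} for the
    earlier and later year. Topics with no publications in the earlier
    year are skipped because the ratio is undefined.
    """
    result = {}
    for topic, start in counts_start.items():
        end = counts_end.get(topic, 0)
        if start > 0 and end / start >= min_ratio:
            result[topic] = end / start
    return result


# Made-up counts for illustration only.
earlier = {"allergy": 10, "imaging": 10, "new topic": 0}
later = {"allergy": 25, "imaging": 15, "new topic": 5}
print(fast_growing(earlier, later))  # {'allergy': 2.5}
```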

The complementary LDA and k-means clustering approaches were applied to two publication datasets, 2017–2018 and 2020–2021, generating nine and seven large document clusters, respectively. Figure 1f shows a comparison of the document clusters between the two epochs. Infectious diseases in children, mental health, family relations, child welfare, and the diagnosis and treatment of specific diseases played crucial roles in both periods. Research on family relations, child welfare, and physical and mental health was found to gain increased attention. COVID-19 infection in children was newly defined during 2020–2021. Moreover, research on child nutrition and food safety was prominent. Autism spectrum disorders, child maltreatment, and special healthcare needs were separately identified areas during 2017–2018; however, they were not recognized as separate clusters during 2020–2021.
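The k-means side of the clustering can be illustrated with a plain implementation of Lloyd's algorithm. This is a sketch only: the document representation (two-dimensional toy vectors), the seeding with the first k points, and the fixed iteration count are assumptions made for illustration and do not reflect the features or settings used in the study.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; seeds the centroids with the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy vectors: (count of "infection" terms, count of "behavior" terms).
points = [(3, 0), (0, 3), (4, 1), (1, 4)]
labels, centroids = kmeans(points, k=2)
print(labels)     # [0, 1, 0, 1]
print(centroids)  # [[3.5, 0.5], [0.5, 3.5]]
```

In practice the result of k-means depends on the initial centroids, so repeated runs with different seeds are commonly compared.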

The results obtained by the two methods were compared and found to be highly consistent (Supplementary Table 4), and detailed subclusters for the 16 document clusters were obtained through the LDA method. For example, the cluster “family relationships, child welfare, and physical and mental health” exhibited different areas of focus, i.e., child behavior and parent–child relationships, child-rearing practices and potential impact, education, and social adaptation of children. Moreover, the “infectious disease” cluster was further divided into malaria, parasites, viruses, and other fields.

In addition to focusing on the primary research areas, understanding the significance of research findings in fields with a lower focus is essential. Inverse document frequency (IDF) calculations were performed for keywords by year of research [14] to observe the research trends in relatively “niche” areas. Keywords with extensive research focus in a specific year compared with that in previous years were identified. We calculated the annual IDF for each keyword. This value indicated the year in which a particular keyword appeared most frequently, irrespective of whether it had appeared in previous research, and thus helped locate topics that vary greatly over time despite their small number. Supplementary Table 5 indicates the keywords that have appeared most frequently in their field each year (2012–2021), including keywords like “ultrasound contrast agents,” “2019 novel coronavirus disease,” and “oral habits” in the last 3 years. A complete list of keywords from the last 76 years is presented in Supplementary Table 6.
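One possible reading of the annual IDF calculation is sketched below. It assumes that each year is treated as its own corpus and that the IDF of a keyword in a year is log(N / (1 + df)), where N is the number of publications in that year and df is the number of those publications containing the keyword; the lowest value then marks the year in which the keyword was relatively most frequent. The exact formula used in the study is not given in the text, and the function names and toy data are hypothetical.

```python
import math


def annual_idf(pubs_by_year):
    """pubs_by_year: {year: [set_of_keywords_per_publication, ...]}.

    Returns {keyword: {year: idf}} with idf = log(N_year / (1 + df_year)).
    Assumes every year has at least one publication.
    """
    vocabulary = {k for pubs in pubs_by_year.values() for kws in pubs for k in kws}
    table = {}
    for year, pubs in pubs_by_year.items():
        n = len(pubs)
        for kw in vocabulary:
            df = sum(1 for kws in pubs if kw in kws)
            table.setdefault(kw, {})[year] = math.log(n / (1 + df))
    return table


def peak_year(table, keyword):
    """Year with the lowest IDF, i.e. the highest relative frequency."""
    years = table[keyword]
    return min(years, key=years.get)


# Toy data for illustration only.
pubs = {
    2019: [{"asthma"}, {"asthma"}, {"malaria"}, {"obesity"}],
    2020: [{"covid-19"}, {"covid-19", "asthma"}, {"covid-19"}, {"malaria"}],
}
table = annual_idf(pubs)
print(peak_year(table, "covid-19"))  # 2020
print(peak_year(table, "asthma"))    # 2019
```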

NLP techniques enable the identification of themes and the detection of clusters across pediatric medical publications, allowing us to view trends in pediatric medical research at the macro level. The research hotspots in the field of pediatric medicine have been gradually changing over time, particularly in the last decade. A previous study used a dataset of publications retrieved from PubMed to classify research results in pediatrics, and this work required extensive expert review of a subset of published articles [10]. Despite the use of advanced automated analysis techniques, experts still played a key role in interpreting and linking concepts, as well as validating results. Moreover, experts refer to terms on the list of classical specificity in the field and incorporate their insights with pediatrics classification while naming current topics and clusters. This approach has aided in the identification of many emerging research areas, such as “health inequities and child mortality study” and “abuse and neglect of children” [10, 18].

This study found that allergy and immunotherapy were the most prominent and rapidly expanding areas in clinical research, which could be due to the high prevalence of these diseases and their substantial impact on life [2, 15, 16]. Furthermore, it may indicate an increased number of ongoing clinical trials in this field exploring novel therapeutic tools. Infectious diseases continue to dominate pediatric medicine, indicating that children, as immunocompromised individuals, remain the focus of prevention and control of infectious diseases. In response to the new coronavirus pandemic, coronavirus-related studies in the field of pediatric medicine have emerged as a distinct research cluster, demonstrating the high sensitivity of the analytical approach in this study.

In recent years, there has been an increasing focus on research related to child behavior, women's welfare, child maltreatment, physical and mental health, and social family factors. This reflects a positive trend towards providing comprehensive care and attention to the physical and psychological well-being of children. Additionally, the emphasis on family factors is consistent with the current shift in the focus of childcare away from hospitals and toward communities and families [19]. Child nutrition is a hot topic, and the number of related studies is growing [20]. This may be because some communities and families continue to struggle to provide safe and adequate nutrition to children. The number of studies on population health conditions, health education, and fertility policy has grown rapidly over the last decade as the related clinical studies have increased. Meanwhile, research topics are gradually shifting towards developing regions such as Africa.

Traditional research on specific childhood diseases remains noteworthy, with an emphasis on the importance of evidence-based treatment guidelines and outcome studies. The rapid growth of topics such as health economics and quality of medical services suggests current concerns about long-term prognosis and disease care. The importance of preventive medicine is being highlighted by increased research on risk factors. Notably, areas such as metabolic syndrome, environmental pollution, and obesity have shown an increasing trend in population, clinical, and growth studies, implying that pediatric medicine research is highly susceptible to cross-domain integration and concatenation.

In conclusion, our findings suggest that pediatric medicine is currently promoting the development of overall physical and mental health in children. While clinical research focuses on infections and immune-related diseases, growth and population studies are increasingly concerned with the development of physical and mental health and welfare in children. Furthermore, there has been an increased focus on the impact of food safety and family factors on the health of children. The requirement for the overall healthy development of children, new technologies, and social support are the potential driving factors for these changes [21]. The results of our study, along with expert insights, funding allocation, and policy formulation, will help pave the way for future research.