Introduction

Inflammatory bowel disease (IBD), also known as chronic idiopathic inflammatory disease, comprises two major forms: ulcerative colitis (UC) and Crohn’s disease (CD). The two subgroups are clinically and endoscopically distinct. However, when the clinical and endoscopic signs of chronic colitis lack the features specific to UC or CD but show features of both, the disease cannot be clearly classified, and the term indeterminate colitis or unclassified IBD (IBD-U) is used. IBD-U is a form of IBD and not an independent disease.

In 1859, Samuel Wilks gave the first detailed and precise description of UC, as “simple ulcerative colitis” (Wilks 1859). Later, in 1932, Burrill B. Crohn published a detailed description of CD, which was subsequently named after him (Crohn et al. 1952). CD is a chronic, usually lifelong, but intermittent disease. It can affect all areas of the digestive tract from the oral cavity to the anus, and occurs most frequently in the terminal ileum and colon (Gajendran et al. 2018), at the transition from the small intestine to the large intestine. It does not spread continuously. UC, while also a chronic and intermittent disease, affects only the colon, most frequently the rectum, and stops at the terminal ileum. It spreads continuously.

The two diseases are distinct, both clinically and histologically. Clinically, CD patients frequently present with abdominal pain and perianal disease, while UC is commonly associated with gastrointestinal bleeding. Histologically, UC is a more superficial bowel inflammation, centered on epithelial cell damage and dominated by goblet cell loss and disordered crypt architecture. In contrast, CD is characterized by crypt abscesses and granulomas, with discontinuous mucosal inflammation and transmural involvement. However, if only the mucous membrane is affected, a diagnosis of indeterminate colitis (IBD-U) would be made (Guindi and Riddell 2004). This diagnostic difficulty occurs especially in young children and at the onset of the disease (Mamula et al. 2002; Tremaine 2012).

IBD is lifelong and intermittent, alternating between acute relapses and phases of remission. Therapy and prognosis vary greatly between individuals and depend on the disease’s location, extent, onset, and care. There is currently no cure for IBD; medical and surgical therapy are the current treatment modalities. The primary goal of treatment is first to induce remission in acute relapses (with mucosal healing confirmed by endoscopy) and then to maintain it, so as to achieve a better quality of life for the patient. In practice, the maintenance of remission is often the biggest problem (Hoffmann and Zeitz 2000).

The therapy of IBD in childhood differs from that in adults. Certain types of medication are restricted for children, and the therapy intensity and dosage also require adjustment (Däbritz et al. 2017). The choice of therapy also differs: for example, exclusive enteral nutrition is the first choice for inducing remission in children with CD, and aminosalicylates are used to treat mild to moderate UC until at least the end of puberty (Däbritz et al. 2017).

Surgical interventions in childhood, on the other hand, follow the same indication spectrum as in adults (Däbritz et al. 2017). It has been reported that patients with juvenile onset (age < 20 years) have a significantly shorter life expectancy compared to the general population (median of IBD population, 64 years; median of general population, men 71 years and women 78 years) (Canavan et al. 2007).

The etiology and pathogenesis of IBD also remain unclear. In general, IBD is considered a multifactorial disease caused by multiple genetic and environmental risk factors that affect the immune system, such as smoking, breastfeeding, economic conditions, ethnicity, and psychological factors (Timmer 2009). The factors also interact in complex ways, which makes the etiology very difficult to explain. For example, smoking shows a stronger effect in women and in patients with a later onset, but a weaker effect in young people and in those with a positive family history (Tuvlin et al. 2007). The well-known hygiene hypothesis suggests that IBD results from an imbalance between the adaptive immune response and intestinal bacteria, especially in childhood, as a consequence of improved hygiene, including less contact with animals and fewer infections.

Thus, pediatric IBD has specific characteristics, and children are always considered separately in etiologic IBD studies.

According to a recent trend analysis covering 1990 to 2016, adults in Europe are particularly susceptible to IBD, e.g., in Norway and Germany (Ng et al. 2017). In Germany, the most recent report on incidence rates in all age groups, from a rural area between 2004 and 2006, showed that the incidence of IBD was relatively stable compared with previous data from 1991 to 1995 (Ott et al. 2008). In other European countries and North America, the incidence of IBD was also stable or slightly decreasing in recent years (Ng et al. 2017). However, in newly industrialized countries in Africa, Asia, and South America, the incidence was increasing (Ng et al. 2017). Although IBD can occur at any age, it is most frequently diagnosed between the ages of 15 and 40 (Gajendran et al. 2018; Shivashankar et al. 2017). Assessing changes in incidence only in a whole population may, however, conceal variations in specific age groups. Children and adolescents need to be considered separately because they are going through puberty and growth. For example, another trend analysis of incidence from 1985 to 2018 revealed that the highest incidence of IBD in children and adolescents was reported in Europe (23/100,000) (Sýkora et al. 2018). By disease, the highest incidence of CD in children and adolescents was reported in the USA (13.9/100,000), whereas the highest incidence of UC was reported in Europe (15/100,000) (Sýkora et al. 2018). Overall, the incidence of IBD in children and adolescents increased steadily over time and varied greatly by geographical area (Sýkora et al. 2018). In Germany, the Saxon Pediatric IBD Registry also showed an increasing trend in pediatric-onset IBD between 2000 and 2009 (Kern et al. 2021).

The Saxon Pediatric IBD Registry was founded in 2000 with the aim of collecting data to better describe the epidemiology of IBD in Germany. Specifically, it was established to determine epidemiological measures such as the incidence, prevalence, and diagnostic latency of IBD in Saxony, Germany. This should not only allow comparisons with international studies but also provide information on patterns of disease involvement, symptoms, and diagnostics, as well as on the therapeutic response of pediatric IBD patients. This information can help to establish guidelines and quality standards for pediatric IBD. The registry provides a database covering the German federal state of Saxony that can help identify risk clusters and problems in the care of pediatric IBD patients; it also collected psychosocial parameters, among others. These results will be used to answer questions about care structures and quality of care, based on international guidelines, to standardize the diagnosis and therapy of pediatric IBD patients, and to provide a basis for education and outreach. The aim is to improve the quality of life of these pediatric patients and their families.

When assessing an individual data source and its content, data quality and completeness should also be considered (Weiskopf and Weng 2013). Verifying the completeness of a registry provides feedback to researchers in several ways. It is a form of self-monitoring of registry data collection that not only reflects scientific transparency but also helps to determine the effectiveness of the study design and methodology, which could then be used for similar projects in the future. To ensure the data quality of the Saxon Pediatric IBD Registry, its completeness was determined regularly (e.g., Zurek et al. 2017). In this publication, the final completeness of the registry was examined and updated. The results of the completeness assessment have far-reaching implications for further publications from the registry dataset, in particular for the scientific credibility of the published results.

Methods

Data sources

First data source (the Saxon pediatric IBD registry)

All pediatric IBD (CD, UC, and IBD-U) inpatients and outpatients of all 31 pediatric hospitals in Saxony were registered in this database between 2000 and 2014. The completeness of the Saxon Pediatric IBD Registry was regularly ascertained using the capture–recapture method (C–R method).

Second data source

To update the ascertainment of completeness of the Saxon Pediatric IBD Registry, a second data source between 2008 and 2014 was collected in 2019 in a Saxon region.

Data collection of the second data source

A delimited region in Saxony with postal codes from 010XX to 017XX was selected. Questionnaires were developed and sent by post to all pediatric, internal medicine, and gastroenterological practices in these districts. The addresses and telephone numbers were searched using Google Maps and Jameda.de. Jameda is the largest web portal for physicians and other healthcare professionals in Germany. The search for practices to be surveyed took place between the end of February and the beginning of March 2019. At the beginning of March, a total of 424 letters containing the questionnaire were sent out. Where practitioners shared the same address and telephone number, a separate letter was sent to each of them.

Questionnaires were then sent out again in several waves to collect even more responses. In the last phase, practices were also contacted by telephone or visited to emphasize the importance of the study and to achieve an even higher response rate.

Criteria for medical practices

Practices included in the second data survey had to be located in the delimited region of Saxony with postal codes between 010XX and 017XX and had to belong to one of the following categories: gastroenterological practices, pediatric practices, or internal medicine practices without a special field together with general practitioners (GPs) certified in internal medicine (the latter two were counted together).

Internal medicine practices with other special fields, such as diabetology, nephrology, or cardiology, were excluded.

The inclusion criteria for patients included in the second data source were age of onset less than 15 years, initial diagnosis of UC, CD, or IBD-U between 2008 and 2014, and confirmed IBD case according to the ICD-10 code (International Statistical Classification of Diseases and Related Health Problems).

Exclusion criteria for patients were suspected cases without a confirmed diagnosis and patients not resident in Saxony. In accordance with the German Data Protection Act, the names of patients were pseudonymized for collection. The initials of the patient’s first and last name and the postal code of the patient’s residence were used to match records between the two data sources. In addition, the dates of birth and diagnosis were used to confirm each match.
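The matching step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the field names, and the requirement of an exact match on all four fields are hypothetical, and the records shown are entirely fictitious; the study does not describe how matching was implemented.

```python
from typing import NamedTuple


class PseudonymKey(NamedTuple):
    """Pseudonymized identifier; the field names are illustrative assumptions."""
    initials: str        # first letters of first and last name, e.g. "AB"
    postal_code: str     # postal code of the patient's residence
    birth_date: str      # ISO date, e.g. "2001-05-17"
    diagnosis_date: str  # ISO date of the initial diagnosis


def split_by_match(survey, registry):
    """Return (matched, unmatched) survey records, compared against the registry."""
    registry_keys = set(registry)
    matched = [rec for rec in survey if rec in registry_keys]
    unmatched = [rec for rec in survey if rec not in registry_keys]
    return matched, unmatched


# Entirely fictitious records, for illustration only.
registry = [
    PseudonymKey("AB", "01067", "2001-05-17", "2010-03-02"),
    PseudonymKey("CD", "01589", "1999-11-30", "2009-08-21"),
]
survey = [
    PseudonymKey("AB", "01067", "2001-05-17", "2010-03-02"),
    PseudonymKey("EF", "01705", "2003-01-09", "2012-06-14"),
]

matched, unmatched = split_by_match(survey, registry)
print(len(matched), len(unmatched))
```

In capture–recapture terms, the matched records correspond to the recaptured individuals, and the unmatched records are cases missing from the first data source.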

The capture–recapture method

The C–R method, also called the mark–release–recapture method, is a statistical tool from ecology that was originally developed to estimate the total size of an animal population. A number of individuals from the whole population (the first data source) are captured, marked, and released back to mix with the total population. After some time, a second sample is captured in a smaller region with a known population (the second data source), and the marked individuals in it are counted as recaptured. Assuming that the proportion of marked individuals is the same in the second sample as in the whole population, the total population size can be derived by comparing the number of recaptured individuals in the second data source with the number of individuals captured in the first data source (Chao et al. 2008; McCarty et al. 1993):

$$\frac{a}{A}=\frac{n}{N}$$

Legend:

a: Number of recaptured individuals that were marked

A: Number of individuals captured on the second visit

n: Number of individuals marked

N: Number of individuals in the population
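Solving the proportion above for N gives N = n × A / a, and the completeness of the first data source is then n / N, which simplifies to a / A. The following minimal Python sketch illustrates this; the function names and the counts are hypothetical and are not taken from the study.

```python
def petersen_lincoln(n: int, A: int, a: int) -> float:
    """Estimate the population size N from a/A = n/N, i.e. N = n * A / a."""
    if a <= 0:
        raise ValueError("at least one recaptured individual is required")
    return n * A / a


def completeness(n: int, A: int, a: int) -> float:
    """Share of the estimated population held in the first source: n / N = a / A."""
    return n / petersen_lincoln(n, A, a)


# Hypothetical counts, for illustration only.
n, A, a = 200, 50, 40
N_hat = petersen_lincoln(n, A, a)
print(N_hat)                  # 250.0
print(completeness(n, A, a))  # 0.8
```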

The C–R method has been developed for two-source models and for multi-source models. To ascertain the completeness of the registry, the most classical formulation of the two-source model, namely the Petersen–Lincoln index, was chosen for this study (Chao et al. 2008) (Table 1).

Table 1 Ascertainment of completeness according to Petersen–Lincoln index (Hook and Regal 1995)

The Petersen–Lincoln index is based on the following assumptions (Hook and Regal 1995; Southwood and Henderson 2000): the investigated population is closed, the identifying marks cannot be lost or changed, every individual has the same probability of being captured in a source, and the two sources are independent of each other. Since its introduction in the late twentieth century, this method has frequently been used in epidemiology (Hook and Regal 1995; Southwood and Henderson 2000), with personal information, such as name, sex, and age, being treated as the marks of individuals and used to identify the captured or recaptured individuals. In some studies confirming completeness, the confidence interval of the estimated population of the first data source (registry) is also used to estimate the confidence interval of the completeness value (Howitz et al. 2008; Schrauder et al. 2007). The 95% confidence intervals (CI) were estimated with the following formula (cf. legend above):

$$95\%\;CI=\pm1.96\times\sqrt{\frac{\left(n+1\right)\left(A+1\right)\left(n-a\right)\left(A-a\right)}{\left(a+1\right)^2\left(a+2\right)}}$$
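The formula can be evaluated directly, as in the hedged Python sketch below. The half-width function follows the formula as printed. The second function, which turns the interval for N into an interval for completeness by evaluating n/N at both bounds, is an assumption about how such a conversion could be done and is not necessarily the authors' procedure. All counts are hypothetical.

```python
import math


def ci_half_width(n: int, A: int, a: int, z: float = 1.96) -> float:
    """Half-width of the 95% CI for the estimated population size N."""
    variance = ((n + 1) * (A + 1) * (n - a) * (A - a)) / ((a + 1) ** 2 * (a + 2))
    return z * math.sqrt(variance)


def completeness_interval(n: int, A: int, a: int):
    """Completeness n/N at the upper and lower bound of N (assumed conversion)."""
    N_hat = n * A / a
    hw = ci_half_width(n, A, a)
    lower = n / (N_hat + hw)
    # N cannot be smaller than the n cases already registered,
    # so the upper completeness bound never exceeds 1.
    upper = n / max(N_hat - hw, n)
    return lower, upper


# Hypothetical counts, for illustration only.
n, A, a = 200, 50, 40
hw = ci_half_width(n, A, a)
lower, upper = completeness_interval(n, A, a)
print(round(hw, 1), round(lower, 3), round(upper, 3))
```

When every individual in the second source is also found in the first (a = A), the factor (A − a) is zero and the interval collapses to the point estimate.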

The completeness was ascertained using the C–R method with the two data sources (Petersen–Lincoln index).

Statistical analysis

The age at onset was converted to an exact numerical value, and the two-tailed Welch’s t-test (alpha = 0.05) was used to detect differences between subgroups. Shapiro–Wilk and Kolmogorov–Smirnov tests were run to test for normal distribution. The calculations were performed with the Statistical Package for the Social Sciences (SPSS Statistics, version 26).
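For readers unfamiliar with Welch’s t-test, the following Python sketch computes its test statistic and the Welch–Satterthwaite degrees of freedom using only the standard library. It is an illustration, not the SPSS procedure used in the study: the ages are fictitious, the function name is hypothetical, and the p-value, which requires the t distribution from a statistics library, is not computed.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    # Squared standard errors of the two group means (unequal variances allowed).
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Fictitious ages at onset in years, for illustration only.
group_a = [9.5, 11.2, 12.8, 13.1, 10.4]
group_b = [8.9, 12.0, 13.5, 11.7]
t, df = welch_t(group_a, group_b)
print(round(t, 3), round(df, 1))
```

Unlike Student’s t-test, this variant does not pool the variances, which is why it is suited to small subgroups of unequal size.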

Results

The addresses of 424 physicians were found in the selected region according to the criteria described. The exact process of data collection, including the corresponding response rates, is illustrated by the flowchart in Fig. 1.

Fig. 1 Flowchart of data collection

Thirty-seven practices no longer existed, and 64 practices had more than one physician but the same list of patients. In the case of internal medicine practices, a differentiation had to be made: 68 internal medicine practices with known specializations such as cardiology, respiratory medicine, and diabetology (thus no gastroenterology) were excluded, but this did not affect internal medicine practices with unknown specializations, which remained included.

In total, 254 practices responded and were enrolled in the second data source. Of these, 174 (response rate 68.9%) were internal medicine practices, 66 (response rate 71.2%) pediatric practices, and the remaining 14 (response rate 92.8%) gastroenterological practices (Fig. 1).

Distribution of second data source

A total of 23 patients was included in the second data source, split into 19 patients reported by pediatricians, two patients by internists, and two patients by gastroenterologists. The geographical distribution of reported patients by practitioners is shown in Fig. 2. The entire map represents Saxony, a German federal state (Fig. 2). The colored shades define the surveyed region and the numbers refer to the postal codes of practitioners.

Fig. 2 Distribution of reported IBD patients by practitioners in different regions of Saxony, Germany

Completeness of registry

A total of 23 patients was found in the second data source between 2008 and 2014. One patient was not listed in the registry’s database. According to the Petersen–Lincoln index, the completeness level of the Saxon Pediatric IBD Registry between the years 2008 and 2014 was 95.7% (95% CI 90.2–100) (Table 2).
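As a quick check of the reported point estimate: the proportion a/A = n/N from the Methods implies that the completeness, n/N, equals a/A. With A = 23 patients in the second data source, of whom a = 22 were already listed in the registry:

$$\text{completeness}=\frac{n}{N}=\frac{a}{A}=\frac{22}{23}\approx 0.957$$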

Table 2 Ascertainment of completeness of Saxon Pediatric IBD Registry

Tables 3 and 4 show the exact distribution of reported patients. Most pediatric IBD patients were between the ages of 10 and 15 years.

Table 3 Distribution of reported IBD patients by sex in the second data source
Table 4 Distribution of reported IBD patients by age at onset in the second data source

Age at onset in subgroups

Of the 23 included IBD patients, eight were girls and 15 were boys. The Shapiro–Wilk and Kolmogorov–Smirnov tests indicated that the age at initial diagnosis was normally distributed. Only for one subgroup, the UC group with only four patients, could a normal distribution not be confirmed, owing to the small number of cases. The t-test showed no significant difference between the sexes (p = 0.62) or between the individual IBD diagnoses (p = 0.75) (Fig. 3).

Fig. 3 Age at onset by sex

The Saxon Pediatric IBD Registry between 2000 and 2014

The comprehensive data of the Saxon Pediatric IBD Registry between 2000 and 2014 are still being assembled and analyzed for this publication. Over the entire 15-year period, a total of 532 patients with IBD resident in the federal state of Saxony, Germany, and under 15 years of age at the time of initial diagnosis were registered. The ratio of males to females between 2000 and 2014 was about 4:3 for CD and nearly 1:1 for UC (Table 5). The proportion of IBD-U was very low (3.4%). The age at onset of 153 (28.8%) registered patients was less than 10 years. Tables 5 and 6 show the basic registry data.

Table 5 Distribution of registered IBD patients by sex and disease in the Saxon Pediatric IBD Registry between 2000 and 2014
Table 6 Distribution of registered IBD patients by age at onset and disease in the Saxon Pediatric IBD Registry between 2000 and 2014

Discussion

Until the Saxon Pediatric IBD Registry was established, there were no valid long-term data on the epidemiology of pediatric IBD in Germany. The age-standardized incidence rates (ASRs) of the Saxon registry between 2000 and 2009 showed an increasing trend of pediatric IBD from 4.6/100,000 (95% CI: 2.8–6.3) in 2000 to 10.5/100,000 (95% CI: 7.5–13.6) in 2009 (Kern et al. 2021). This increasing trend was more pronounced in male patients and patients with UC, although there were more patients with CD (61.6%) than UC (35.6%) (Kern et al. 2021). The ASR in Saxony in 2000–2009 seems to be much higher than the incidence reported for the region of Upper Palatinate in Bavaria, Germany, between 2004 and 2006. In that study, the ASR, which was reported in a graph for all age groups including childhood, was less than 2.5/100,000 for both CD and UC (Ott et al. 2008). However, the significance of these results may be very limited, since only a few patients under 15 years of age were included, the region studied was relatively small compared with Saxony, and the number of patients included in individual age groups was not published. Unfortunately, other epidemiological studies with pediatric patients and comparable study designs do not exist in Germany.

In North America and the UK, there were also more CD cases reported than UC in children under 17 and 20 years, respectively, but the increase in UC incidence was steeper than for CD (Abramson et al. 2010; Virta et al. 2017). Between 1996 and 2006, the incidence of CD in North America increased from 2.2/100,000 to 4.3/100,000, and of UC from 1.8/100,000 to 4.9/100,000 (Abramson et al. 2010). The mean annual incidence of CD in the UK increased from 6/100,000 in 2000–2004 to 8/100,000 in 2010–2014, and of UC from 10/100,000 in 2000–2004 to 15/100,000 in 2010–2014 (Virta et al. 2017). In contrast, in Finland, more UC cases were reported, accounting for 52% of all IBD patients under 17 years, but the mean annual incidence of IBD also increased from 3.9/100,000 (95% CI 2.5–5.8) in 1987 to 7.0/100,000 (95% CI 5.0–9.4) in 2003 (Turunen et al. 2006).

The second data source is necessary for future publications, since it quantifies the completeness of data in the registry and helps to supplement it with the newly discovered cases (Zurek et al. 2017). Furthermore, it also offers a different perspective to understand the data. The completeness was determined to be almost 96%.

Pediatric IBD patients in Saxony are treated regularly in hospital outpatient clinics after being discharged from inpatient treatment. They visit the nearest pediatric practice only when they have other acute health problems. All pediatric clinics in Saxony report to the Saxon Pediatric IBD Registry. This explains why the completeness of our registry, at 95.7%, is so high, with only one additional patient reported in the second data source. In addition, the high completeness may also be attributable to the complex diagnosis of IBD, which is only possible in a hospital. Many pediatricians stated that they never treat IBD patients alone but always refer them to a hospital.

The data of the whole Saxon Pediatric IBD Registry between 2000 and 2014 are representative of the child population in Saxony. The results of the second data source provide insight only into the surveyed region, which may differ from the entire registry. For example, the male-to-female ratio in the whole registry was 4:3 for CD and 1:1 for UC, but 10:7 and 4:1, respectively, in the second data source. In Europe and North America, the male-to-female ratio for adults was reported as 1:2 for CD and 1:1 for UC (Hausmann and Blumenstein 2015). At the same time, the second data source also reflects certain characteristics of the complete registry, such as the low proportion of IBD-U. In addition, the Saxon Pediatric IBD Registry between 2000 and 2009 showed that the mean age at onset decreased significantly from 11.5 years to 9.6 years during this period (Kern et al. 2021). A similar downward trend was also observed in the second data source between 2008 and 2014. Although previous reports have stated that IBD is rarely diagnosed before the age of 10 years (Timmer 2009), more than 50% of the pediatric patients detected in the second data source, namely 12, had an age at onset below 10 years. In the entire registry between 2000 and 2014, approximately 29% of registered patients were younger than 10 years when IBD was diagnosed.

This may be related to improvements in diagnostic accuracy, as is also suggested by the low rate of IBD-U (3.4%) in the Saxon Pediatric IBD Registry between 2000 and 2014, since IBD-U is diagnosed only when it is difficult to distinguish between CD and UC. Similarly, there was only one patient with IBD-U in the second data source. Previously, this rate was considered to be 10%, or even 50% (Tremaine 2012). The second data source is less representative than the entire registry, but it reflects and confirms certain characteristics of the whole registry. The extent of the data in the second data source plays a decisive role here. However, the main goal of data collection was not to describe patient characteristics but to determine the completeness of the registry. This aim was fulfilled by this study.

Considering the limitations in internal validity of the capture–recapture method with multiple data sources, regularly ascertaining completeness using this method with two data sources was considered more reliable and accurate. Estimating the confidence interval of completeness from the confidence interval of the estimated population is a further advantage. This approach differs from other studies, which ascertain completeness by comparing two incomplete, already existing registries (Howitz et al. 2008; Schrauder et al. 2007). Our first data source, the Saxon Pediatric IBD Registry, is a nearly complete database, since it collects data on all pediatric patients with IBD from all pediatric hospitals throughout Saxony; therefore, to ascertain completeness, we needed to collect a separate, independent data source. A second data source was collected several times during the lifetime of the registry, each time for a different region in Saxony, to verify the registry better. Our validation methodology does not differ substantially from past surveys of second data sources. In earlier publications (Kern et al. 2021; Zurek et al. 2017) the capture–recapture method was mentioned but not explained in detail, so this paper presents the methodology in more detail. In addition, in the second data source from Leipzig, all patients under 26 years were also recorded in practices of adult gastroenterologists. This led to an important conclusion: the Saxon Pediatric IBD Registry has excellent completeness only for the age groups up to 15 years, but not for older adolescents. Thus, these surveys have made a great contribution to correct data analysis and to upcoming publications of the registry. However, the completeness of the Saxon Pediatric IBD Registry determined using the second data source from Leipzig is lower than in our survey, which may be related to the fact that the Saxon Pediatric IBD Registry has been intensively edited and supplemented in recent years.

For this study, a classical theoretical model was used (Chao et al. 2008; McCarty et al. 1993) in compliance with all methodological criteria, thus fulfilling the four assumptions of the Petersen–Lincoln index (Hook and Regal 1995; Southwood and Henderson 2000): the investigated population was closed in time and region; patients were safely and unambiguously pseudonymized; all patients were diagnosed identically according to the applicable guidelines and reported to the registry or survey; and both sources were independent and collected separately, at clinics for the Saxon Pediatric IBD Registry and at practices outside of hospitals for the second data source.

Strengths

A high value was placed on the quality of the study, and therefore an elaborate and proven design was chosen. The definition of the surveyed practices was set very broadly in order to include as many patients as possible. To identify the medical practices, two independent sources were searched so that omissions in either source could be compensated. After an initial query by post, an extensive follow-up was carried out, mostly by telephone. Telephone reminders have been reported to improve response rates by 13% (Asch et al. 1997). Through the consistent application of suitable methods, a relatively high response rate was achieved compared to the usual level (Asch et al. 1997), namely over 70% in total and, in the most relevant practices, those of gastroenterologists, even over 90%.

The patients reported in the second data source could be compared with patients in the Saxon Pediatric IBD Registry without difficulty and unambiguously. Only one patient was identified who had not previously been reported to the Saxon Pediatric IBD Registry. It was found (mostly by telephone) that the most common reason for non-response of a practice was that it did not treat pediatric patients. Given the high overall response rate achieved, it is unlikely that further patients would have been found. Completeness was thus very probably determined correctly.

The reported completeness level of the registry of almost 96% is plausible and also fully consistent with previous second data sources on the registry in Saxony for this age group of children younger than 15 years (Zurek et al. 2017).

Limitations

The second data source was limited by the number of patients, as IBD is not a very frequent disease. The epidemiological characteristics displayed in the results represent only the selected region in the second data source and may differ from the rest of the region or the overall population (Hausmann and Blumenstein 2015; Kern et al. 2021).

Except for the final query, data in the registry were collected prospectively, while the data in the second data source were collected retrospectively, from a time period several years ago.

All pediatric IBD patients in Germany are regularly treated in hospitals, since it is only possible to make a reliable diagnosis there. The probability that the pediatrician was not informed about the diagnosis of the child patient is small, but theoretically possible. However, because only pediatric clinics reported to the Saxon Pediatric IBD Registry, this could not influence the result regarding the completeness of the registry as determined in this publication.

Conclusions

The second data source between 2008 and 2014 offered a small insight into the characteristics of the epidemiology of pediatric IBD in the studied region. The completeness level of the Saxon Pediatric IBD Registry was determined to be high.

We have presented how to determine the completeness of a dataset using the Saxon Pediatric IBD Registry as an example. The main focus is the transparency of the data collection process and the question of data quality. This publication can also serve as a concrete example for other researchers applying the described C–R method.

The initial results of the Saxon Pediatric IBD Registry 2000–2014 were presented in this publication. The registry contains information on the diagnosis, diagnostic latency, symptoms, and patterns of disease involvement, as well as diagnostic methods and treatment response, and thus maps the care of pediatric IBD patients in Saxony. In addition to providing comprehensive epidemiological data, it allows comparison with international studies. The determination of the high level of completeness of the registry data also has consequences for public health. The Saxon Pediatric IBD Registry can generate reliable data for both clinicians and health policy makers. The publications resulting from this high-quality registry dataset can be used as a basis for health policy planning and decision-making processes in the health care system. Since IBD is usually a lifelong disease, information on adequate care for pediatric patients is important not only for prognostic planning of health care, but especially for the daily quality of life of affected children and their families.

The outcomes and endpoints of IBD patients with onset in childhood in Saxony may be interesting topics for the future, for example, the risk of cancer in pediatric-onset IBD (Komaki et al. 2021). The international Pediatric IBD Ahead Program with prognostic factors for outcome prediction (Ricciuto et al. 2021) has also been discussed recently. The completeness level of the Saxon Pediatric IBD Registry was verified using the capture–recapture method. Since it was found to be 95.7% complete, it can be regarded as a valid basis for future analyses.