Introduction

The prevalence of chronic sleep onset insomnia (CSOI) is approximately 10% among the non-disabled school-aged population (Blader et al. 1997). CSOI is the inability to fall asleep at the desired sleep time. A stable sleep schedule that is substantially later than the conventional or desired time is one of the main symptoms of delayed sleep phase disorder (DSPD) (Sack et al. 2007). CSOI, combined with the finding of an (age-relative) late dim light melatonin onset (DLMO), is suggestive of DSPD. Late melatonin onset in children between 6 and 12 years old is defined as a DLMO later than the mean DLMO in children without CSOI, being 19:45 ± 60 min in children of 8.2 ± 2.1 years (Smits et al. 2003).

In the Netherlands, melatonin is increasingly used for the treatment of children with idiopathic CSOI and late melatonin onset. In the second half of 2008, melatonin capsules ranked fifth among the extemporaneous mixtures compounded in Dutch pharmacies (Anonymous 2009).

A recent meta-analysis showed the short-term efficacy and safety of melatonin in adults as well as in children with a mean reported DLMO between 20:37 and 21:06 (van Geijlswijk et al. 2010a). Also in children with attention deficit hyperactivity disorder (van der Heijden et al. 2007; Weiss and Salpekar 2010) or autistic spectrum disorder (Cortesi et al. 2010), melatonin effectively improved quality of sleep.

However, treatment of children with melatonin has been controversial, because of its effects on reproduction in animals (Arendt 1997; Szeinberg et al. 2006).

Scherbarth and Steinlechner concluded that the normal variation in melatonin levels produced different gonadotrophic and reproductive effects, depending on the animal species involved. Melatonin is solely the mediator of the environmental cue that prompts a seasonal breeder to respond to the seasonal changes in a species-appropriate way (Scherbarth and Steinlechner 2010).

Even though humans are not seasonal breeders, there is still concern that enduring high nocturnal levels due to exogenous melatonin uptake might postpone puberty onset (Srinivasan et al. 2009).

Case reports of precocious puberty (Waldhauser et al. 1991) and delayed puberty development associated with a disturbed melatonin rhythm (Verhoeven and Massa 2005) have led to different hypotheses about puberty and lowered nocturnal melatonin levels (Waldhauser et al. 1991; Debruyne 2006; Srinivasan et al. 2009; Silman 2010). The lower nocturnal levels of melatonin in adults and in children after onset of puberty might stem from growth (Silman 2010). However, lower nocturnal melatonin levels are also found in children with precocity in comparison with preadolescent children of the same age and stature (Waldhauser et al. 1991). This suggests that lower levels of melatonin are not a result of a larger volume of distribution due to growth. Moreover, after suppression of the pituitary–gonadal axis, resulting in normalized gonadotropin and sex steroid levels, these lower melatonin levels do not normalize to age-related preadolescent levels. This implies that melatonin levels are not determined by hormonal status.

Long-term data on efficacy and safety are scarce and are needed, especially with respect to pubertal development. A recently published long-term study (Hoebert et al. 2009) suggested that long-term use of melatonin is safe. However, these authors did not study the influence on pubertal development.

The objective of this study was to evaluate the long-term efficacy and the long-term safety of melatonin therapy in pre-pubertal children. This is, to our knowledge, the first study evaluating pubertal development in children using melatonin for a long period of time in pre-puberty as compared with pubertal development in the general Dutch population (controls).

Methods and materials

Study design

For this follow-up study, all participants of the Meldos trial, a melatonin dose-finding investigation in children with CSOI, were invited (van Geijlswijk et al. 2010b). The study consisted of a written interview with inventory questions about demographic data and melatonin use and three international questionnaires about sleep habits, mental health and pubertal development, adapted to the population of Dutch children. The children who had used melatonin for 6 months or longer were asked to complete the full questionnaire. The Meldos protocol was approved by the institutional review board, as a mono-centre trial by the Central Committee on Research Involving Human Subjects, and registered in the International Standard Randomized Controlled Trial Number Register (ISRCTN20033346). The protocol included the possibility to assess health several years after finishing the placebo-controlled part of the study.

Participants

All children who participated in the Meldos trial between May 2004 and February 2007 (van Geijlswijk et al. 2010b) were eligible to participate in this follow-up study. After finishing the Meldos trial, the participants were prescribed melatonin; the dose was determined by the subjective results of the last trial week. In the previous Meldos trial, eligible participants were children aged between 6 and 12 years who were in good general health but had suffered from sleep onset insomnia more than four nights a week for more than 1 year, based on parental reports. Sleep onset insomnia was defined as sleep onset later than 8:30 p.m. in children aged 6 years and, for older children, 15 min later per year until age 12 (10:00 p.m.). Furthermore, the latency between lights-off time and sleep onset (sleep onset latency) had to be more than 30 min on average, and sleep hygiene interventions had not resulted in better sleep. Exclusion criteria were CSOI due to psychiatric or pedagogic problems, known intellectual disability, pervasive developmental disorder, chronic pain, known disturbed hepatic or renal function, epilepsy, prior use of melatonin, and use of stimulants, neuroleptics, benzodiazepines, clonidine, antidepressants, hypnotics, or beta-blockers within 4 weeks before enrolment.
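The age-dependent sleep onset criterion can be written out as a small calculation. The sketch below is illustrative only: the function names are hypothetical, age is assumed to be given in whole years, and sleep onset is assumed to fall before midnight.

```python
from datetime import time


def insomnia_threshold(age_years: int) -> time:
    """Clock time after which sleep onset counts as late for a given age.

    Assumption: age is expressed in whole years between 6 and 12.
    """
    if not 6 <= age_years <= 12:
        raise ValueError("criterion is defined for ages 6 to 12 only")
    # 8:30 p.m. at age 6, plus 15 minutes for every further year of age.
    minutes = 20 * 60 + 30 + 15 * (age_years - 6)
    return time(hour=minutes // 60, minute=minutes % 60)


def meets_onset_criterion(age_years: int, sleep_onset: time) -> bool:
    """True if sleep onset is later than the age-specific threshold.

    Assumption: sleep onset is before midnight, so clock times compare directly.
    """
    return sleep_onset > insomnia_threshold(age_years)
```

With these assumptions the threshold runs from 8:30 p.m. at age 6 to 10:00 p.m. at age 12, matching the end points given in the text. The other inclusion criteria (frequency, duration and sleep onset latency) are not modelled here.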

In December 2008, questionnaires were sent to all 69 children who completed the placebo-controlled part of the Meldos trial.

Outcome measures

The questionnaire consisted of four distinct parts.

  1. Demographic data and melatonin use

    Part I comprised 24 multiple-choice, open and scaled questions about continuance of melatonin usage, the way the melatonin prescription was obtained, the applied dose(s), therapy compliance, drug holidays, other medication, height and weight, school, sports, reading activities, gaming and watching TV, and the occurrence and severity of headache.

  2. Mental health

    Part II consisted of 25 questions about mental health, asked to assess social development. Mental health was assessed by means of the self-administered Dutch version of the Strengths and Difficulties Questionnaire (SDQ) for adolescents and children. The SDQ is a questionnaire that is suitable as an index of therapy outcome (Garralda et al. 2000; Goodman 2001; Muris et al. 2003).

    It includes 25 symptom items and measures both negative and positive behavioural and emotional attributes of the child or adolescent. There are five sub-scales: emotional symptoms, conduct problems, hyperactivity–inattention, peer relationship problems and pro-social behaviour. Every item has three categories: ‘not true’ (0), ‘somewhat true’ (1) or ‘certainly true’ (2). The scores were summed for each scale. A total difficulties score was calculated by summing the scores of all the items, except those of the pro-social behaviour scale. This Dutch version is validated as a parents' questionnaire in the Dutch population (Muris et al. 2003, 2004), and it is also used as a self-administered version in 13- and 14-year-old children (Havas et al. 2010). The SDQ in this study addressed the parents and children, irrespective of age.

  3. Sleep habits

    Part III consisted of the Children's Sleep Habits Questionnaire (CSHQ), which is a retrospective, 45-item parent questionnaire that has been used in a number of studies to examine sleep behaviour in young children. The CSHQ covers sleep complaints in this age group: bedtime behaviour and sleep onset; sleep duration; anxiety around sleep; behaviour occurring during sleep and night wakings; sleep-disordered breathing; parasomnias; and morning waking/daytime sleepiness. Parents are asked to recall sleep behaviours occurring over a ‘typical’ recent week. Items are rated on a three-point scale: ‘usually’ if the sleep behaviour occurred five to seven times/week; ‘sometimes’ for two to four times/week; and ‘rarely’ for zero to one time/week (Owens et al. 2000). The Dutch version was recently validated in the Dutch population (Waumans et al. 2010). The tools used to objectify sleep in the Meldos study were dim light melatonin onset (DLMO) and sleep onset and sleep onset latency, obtained by actigraphy. Additionally, sleep hygiene measures in Meldos were evaluated by means of a questionnaire based on the Sleep Disturbance Scale for Children (SDSC) (Bruni et al. 1996). Since these parameters change with age, especially in puberty, repeating these measurements for this study seemed of less value than comparing the outcomes of this questionnaire with outcomes in controls.

  4. Pubertal development

    Pubertal development was assessed by three Tanner score questions, one additional question for girls (mother's age at menarche) and two additional questions for boys (age at oigarche [the age at first ejaculation (Laron et al. 1980; Carlier and Steeno 1985)] of the boy and of his father).

The Tanner scores consist of three scores for boys and three for girls, describing size of genitals, testicles and growth of pubic hair in boys, and breasts, pubic hair and menarche in girls (Marshall and Tanner 1969; Marshall and Tanner 1970). The Tanner scores were self-reported, based on photographs and sketches of testicle volumes added to the questionnaire (Vlaamse Groeicurven 2004 [Flemish Growth Charts]: http://www.vub.ac.be/groeicurven/pubemodel.html). Results in our population were compared with the general Dutch population (Mul et al. 2001) to assess pubertal development.

Timing of pubertal development is influenced by genetic factors. Comparison of pubertal timing between generations is difficult because of the absence of a distinct criterion apart from menarche in girls (Carskadon and Acebo 1993; Sedlmeyer et al. 2002; Wehkalampi et al. 2008). The age of oigarche was added as a menarche equivalent for boys for its distinct value, if attainable. The parents' ages at menarche and oigarche were retrieved as indicators for genetic predisposition for early or late puberty onset.

A successful pilot of the questionnaire was done in five children.
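The SDQ scoring rule described under part II (item scores of 0, 1 or 2, summed per sub-scale, with the total difficulties score excluding the pro-social scale) can be sketched as follows. This is a minimal illustration, not the official scoring procedure: the scale keys and function name are hypothetical, responses are assumed to be already grouped by sub-scale with five items each, and any reverse-coding of positively worded items is assumed to have been applied beforehand.

```python
# Hypothetical short keys for the five SDQ sub-scales named in the text.
DIFFICULTY_SCALES = ("emotional", "conduct", "hyperactivity", "peer")
ALL_SCALES = DIFFICULTY_SCALES + ("prosocial",)


def score_sdq(responses: dict[str, list[int]]) -> dict[str, int]:
    """Sum item scores per sub-scale and derive the total difficulties score.

    Assumption: each sub-scale holds five items, each already coded 0, 1 or 2.
    """
    scores = {}
    for scale in ALL_SCALES:
        items = responses[scale]
        if len(items) != 5 or any(v not in (0, 1, 2) for v in items):
            raise ValueError(f"scale {scale!r} needs five items scored 0, 1 or 2")
        scores[scale] = sum(items)
    # The total excludes the pro-social behaviour scale, as stated in the text.
    scores["total_difficulties"] = sum(scores[s] for s in DIFFICULTY_SCALES)
    return scores


# Made-up responses for one child, for illustration only.
example = score_sdq({
    "emotional": [0, 1, 2, 0, 1],
    "conduct": [0, 0, 1, 0, 0],
    "hyperactivity": [2, 1, 1, 0, 2],
    "peer": [0, 0, 0, 1, 0],
    "prosocial": [2, 2, 1, 2, 2],
})
```

With four difficulty sub-scales of five items each, the total difficulties score ranges from 0 to 40.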

Data analysis

Primary outcomes are SDQ score, CSHQ score and Tanner scores of children under melatonin treatment for more than 6 months. Secondary outcomes are percentage of persistent use of melatonin in this group of children, mean effective dose, reported adverse events, menarche/oigarche related to parental menarche/oigarche.

For the analysis of the SDQ and CSHQ scores, a one-sample t test was applied to compare scores obtained in (subgroups of) this population with previously published scores of the general Dutch population of the same age and/or sex (controls). Additionally, the CSHQ score was compared with the score in a subpopulation identified as without sleeping problems, and with the score in a subpopulation with sleeping problems.
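For readers less familiar with the one-sample t test, the statistic can be computed from summary values alone. The sketch below is an illustration under stated assumptions, not the analysis code used in the study: the function name is hypothetical, and the reference mean in the example is a made-up placeholder rather than a published control value.

```python
import math


def one_sample_t(sample_mean: float, sample_sd: float, n: int,
                 reference_mean: float) -> tuple[float, int]:
    """Return the t statistic and degrees of freedom for a one-sample t test.

    The published reference mean is treated as a fixed constant.
    """
    if n < 2 or sample_sd <= 0:
        raise ValueError("need n >= 2 and a positive standard deviation")
    standard_error = sample_sd / math.sqrt(n)
    t_value = (sample_mean - reference_mean) / standard_error
    return t_value, n - 1


# Sample summary of the kind reported in the Results (mean 9.75, SD 5.72),
# assuming n = 57; the reference mean of 10.0 is hypothetical.
t_value, df = one_sample_t(9.75, 5.72, 57, reference_mean=10.0)
```

The resulting t value is then compared with the t distribution on n − 1 degrees of freedom to obtain a p value.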

Tanner scores were analyzed using the web application (http://vps.stefvanbuuren.nl/puberty) (van Buuren and Ooms 2009). This tool calculates standard deviation scores (SDS) of individual observations of Tanner scores, and additionally plots those scores in a stage line diagram. The traditional way to evaluate an individual's score against a population is the application of a nomogram, which gives the relative position of the individual as compared with his or her age peers. This is suitable for continuous parameters like height or weight, with the standard deviation of the mean determining the relative position between 0 and 100: p0 means that 0% of the general population is smaller or weighs less, p100 that 0% of the general population is taller or weighs more. But it is not very convenient for processes that occur in a limited period of time, like breast growth, and that are characterized by discrete measures (stages) instead of a continuous parameter. Determining the deviation of a distinct stage at a specified age in relation to the distribution (expressed in SDS) of this stage over ages in controls (data of 1997 (Mul et al. 2001)) allows puberty development to be evaluated in a more continuous way. The stage line diagram is created by modelling the probabilities of successive category transitions in the reference data (general population) as functions that are smooth in age. Then, the mid-p value for each category is calculated and transformed into the Z scale by a probit transformation (van Buuren and Ooms 2009). The resulting scores can be plotted against age to produce the stage line diagram. A first stage, like B1 (breasts-1, meaning no sign of growth yet), is present in 100% (mid-p value) of the general population of 8-year-old girls, but in 12-year-old girls, this stage occurs only in 10% of the girls (late), SDS = 1.645. For the most mature breast stage, B5, the opposite is true; for 13-year-olds, this stage occurs only in 10% of the girls (early), whereas the mid-p value of 100% is reached in nearly all 20-year-olds.
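The mid-p and probit step can be illustrated for a single age group. The sketch below is a simplification under stated assumptions and is not the implementation behind the web application: it takes the stage probabilities at one age as given instead of smoothing them over age, it assumes the mid-p value is the cumulative probability below a stage plus half the probability of the stage itself, and it assumes the sign convention that later-than-average development yields a negative SDS. The function name is hypothetical.

```python
from statistics import NormalDist


def stage_sds(stage_probabilities: list[float], stage_index: int) -> float:
    """SDS of an observed stage, given reference stage probabilities at one age.

    stage_probabilities are ordered from least to most mature and sum to 1.
    """
    if abs(sum(stage_probabilities) - 1.0) > 1e-9:
        raise ValueError("stage probabilities must sum to 1")
    p_stage = stage_probabilities[stage_index]
    if p_stage <= 0:
        raise ValueError("observed stage must have positive probability")
    # Mid-p: everything below the stage plus half of the stage itself.
    mid_p = sum(stage_probabilities[:stage_index]) + 0.5 * p_stage
    # Probit transformation: map the mid-p value onto the Z scale.
    return NormalDist().inv_cdf(mid_p)


# Illustration: 10% of 12-year-old girls still in B1, with all later
# stages lumped together (a hypothetical simplification).
sds_b1 = stage_sds([0.10, 0.90], 0)
```

Under these assumptions the B1 example gives a mid-p value of 0.05 and an SDS of about −1.645, the same magnitude as quoted for B1 at age 12.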

Results

Demographics and melatonin use

The questionnaire was returned by 59 of the 69 children (Fig. 1). Two children continued melatonin use for less than 6 months, and refrained from completing the questionnaire. The mean age of the remaining 57 children was 12.0 (min 8.6, max 15.7 years). Nine children discontinued melatonin therapy after more than 6 months and did complete the questionnaire. Melatonin was still used by 48 children at the time of questioning, mean duration of use was 3.1 years (min 1.0, max 4.6 years) and mean dose was 2.7 mg (min 0.3, max 10 mg) (Table 1).

Fig. 1 Justification of obtained outcome data

Table 1 Demographic characteristics of participants

Effect of cessation

Eleven children stopped therapy: one because the family physician decided that 6 months of therapy was enough, one boy because even 10 mg of melatonin did not have a sustained effect, one girl because of the adverse event of apathy combined with weight gain, and eight children because the need for early sleep had disappeared. Of those eight, one child reported having adopted a delayed sleeping pattern and one child accepted an increased sleep onset latency. The remaining six children indicated that the sleep onset insomnia had disappeared.

Drug-free intervals

Melatonin therapy was interrupted during holidays by 31 children. Reasons for interruption were the advice of the prescribing physician, a delayed rhythm during holidays and checking the continuing need for melatonin to advance the sleep rhythm. Four children skip(ped) medication on a regular basis every week, for instance at weekends, or on 1 or 3 days a week. Eighteen children never skip(ped) medication (unless forgotten).

Adverse events

Several adverse events were reported: nausea at start (one child), apathy combined with weight gain (one child), weight gain (one child), nocturnal diuresis (three children), short temper during high-dose therapy (10 mg) (one child).

We explicitly asked about the occurrence of headache. Twenty-one (38%) children reported suffering from headache regularly (once a week to once a month), 11 (20%) seldom and 23 (42%) never experienced headache.

Mental health

Mean SDQ score was 9.75 (min 2, max 29, SD = 5.72). This score and the sub-scores of age-related subgroups were compared, using the one-sample t test, with previously published scores in controls (Muris et al. 2004; Havas et al. 2010). Muris et al. (2004) studied the validity of the self-report SDQ in 1,111 Dutch, non-clinical children with a mean age of 10.6 years; Havas et al. (2010) studied mental health problems of Dutch adolescents aged 13 and 14 years. Nineteen children (33.3%) of this study population were 13 years or above.

We compared the subgroup of children <13 years with the primary school controls (Muris et al. 2004) and those ≥13 years with the adolescent controls (Havas et al. 2010). No significant differences in SDQ (sub)scores were found (Table 2).

Table 2 Mental health development in comparison with Dutch primary school children and with Dutch adolescents

Muris et al. (2004) reported boys' and girls' scores separately. The conduct problem score of the girls subgroup aged 8–13 in the Meldos population is statistically significantly lower than that of the primary school controls, as is their SDQ total score. Other subscores of the girls and all (sub)scores of the boys did not deviate from controls (Table 3).

Table 3 Mental health development of boys and girls in comparison with a Dutch primary school population

Sleep habits

The CSHQ score of our study population was 42.9 (min 34, max 62) (Table 4). This score and the subscores were compared with previously published Dutch population scores (van Litsenburg et al. 2010) (Table 4). Two subscores (sleep onset delay and daytime sleepiness) and the CSHQ total score were significantly higher than in the controls, indicative of worse sleep. Sleep-disordered breathing was significantly lower. In the general population, a subpopulation of problem sleepers (PS) was defined based on (subjective) parental report endorsing at least one of the CSHQ items as a problem (van Litsenburg et al. 2010). In comparison with these PS (23%) in the general population, the participants of this study had significantly deviant scores on all eight subscores (indicating better sleep). When compared with the 77% non-problem sleepers (NPS), the participants of this study showed, in addition to the three deviant subscores mentioned earlier, higher subscores for sleep duration, sleep anxiety and night wakings. In conclusion, five out of eight subscores indicated statistically significantly worse sleep in this group in comparison with the NPS in the controls; one score indicated better sleep (Table 4).

Table 4 Sleep habit results compared with problem sleepers and non problem sleepers in a Dutch primary school population

In accordance with findings in controls, girls in general have higher scores than boys. In contrast, in this study population, boys' scores for sleep onset delay were significantly higher in comparison with controls (1.62 vs 1.25, t value = 2.08), and even higher than the girls' scores (1.53 vs 1.34 in the general population; t value = 1.26, NS).

The questionnaire was developed for parental report on sleep behaviour of children between 4 and 10 years old. For this reason, we compared the subgroup of children <11 years with the controls. In this group of 17 children, 8–10 years old, no significant deviation of CSHQ scores from the controls was detected (Table 5). The complementary group of children between 11 and 15 years old showed three statistically significantly higher subscores and a total score that differed significantly from the age-related controls, indicating worse sleep.

Table 5 Sleep habit results of two age subgroups compared with a Dutch primary school population

Pubertal development

Tanner stage standard deviation scores could be determined for 16 boys and 30 girls. Female breasts/pubic hair/menarche SDS were 0.003 (min −1.9, max +1.5)/0.013 (min −1.0, max +1.4)/0.143 (min −0.87, max +2.47) (see Fig. 2 for all individual female scores). Male genital development/pubic hair/testis volume SDS were 0.038 (min −2.1, max +2.8)/0.171 (min −1.8, max +2.55)/0.299 (min −1.83, max +2.67) (see Fig. 3 for all individual male scores). Two boys had all three SDS outside the central 80% of the reference distribution: one boy in the 1–5 percentile range for early development and one boy in the 5–10 percentile range for late development.

Fig. 2 Pubertal development: stage line diagram with SDS plot for girls (n = 30). This figure depicts the SDS values for development of breast (blue lines, five stages), pubic hair (green lines, five stages) and menarche (red lines, yes or no) of 30 girls. The black circles depict the three SDS values for three individual girls: the first circle represents a girl with an early menarche and nearly p90 values for breast and pubic hair, the middle circle represents a girl with p50 development, and the third circle represents a girl with normal menarche but late breast development

Fig. 3 Pubertal development: stage line diagram with SDS plot for boys (n = 15). This figure depicts the SDS values for development of genitals (blue lines, five stages), pubic hair (green lines, five stages) and testis (red lines, eight volumes) of 15 boys. The black circles depict the three SDS values for three individual boys: the first circle represents a boy with an early development, almost p99 values for all parameters, the middle circle represents a boy with p50 development, and the third circle represents a boy with late development of all parameters (p5)

Comparison of maternal menarche (median 13 years) with the menarche of the girls (median 12 years) revealed an earlier menarche in the girls, in accordance with Mul et al. (2001), who reported a 0.25-year reduction in age at menarche in the period between 1965 and 1997 in Dutch girls. Oigarche data in the population of this study and in the general population are too scarce to draw conclusions.

Discussion

Six of the 11 children who stopped therapy indicated that the delayed sleep onset had disappeared. This suggests that successful cessation of therapy is attainable after a longer period of melatonin usage, without the rebound phenomena seen with typical hypnotics. Adverse events occurred infrequently and were acceptable in most cases, leading to cessation of melatonin use in 1.6%. Upon explicit inquiry, though, most children (58%) reported headache. The finding of regular headaches in 38% of the children was disturbing at first glance. However, Arruda et al. (2010) recently reported a prevalence of low-frequency episodic headaches (suffering from headaches in the past year but less than 5 days of headache in the past month) of 38.9% in preadolescent children from the general population.

The social development of the 57 former participants of Meldos as measured by SDQ did not deviate from the children without sleep problems of the general population (controls), thanks to or in spite of long-term melatonin use.

The sleep habit questionnaire indicated that the sleep patterns of these long-term melatonin users were not as good as the sleep patterns of healthy sleepers, but better than the patterns of 23% of the controls that were categorized as PS. In fact, all melatonin users were still satisfied with the results of the therapy.

The CSHQ was recently validated in Dutch children (Waumans et al. 2010). It was concluded that this questionnaire is appropriate for ages 4–10, but not for older children. This finding might restrict our valid results to 17 scores, which, however, did not deviate from the controls.

Nineteen children (33.3%) in our population reached the age of 13 during the follow-up period, which allowed us to evaluate the effects of melatonin therapy on timing of puberty development. The Tanner results indicated undisturbed puberty onset, as did the comparisons of menarche of the girls and their mothers. However, only 62% of the boys and 91% of the girls answered the Tanner score questions. Fourteen children (three girls and 11 boys) did not answer this part of the interview. The reluctance to answer the Tanner questions could be caused by (real or perceived) deviating pubertal development, but might also stem from the strong religious background of some of the children, as is demonstrated, for instance, by the absence of TV watching or gaming in some of the children (concerning only boys, not shown) (Table 1). The self-assessment of the scores instead of an intrusive physical examination by a physician might have increased the response rate at the expense of the objectivity of the results.

The phenomenon of loss of response after an initial good response, with repeated dose escalation in response to the effect wearing off, was described in three case reports by Braam et al. (2010). The loss of response was associated with CYP1A2 poor metabolism, resulting in the loss of rhythmicity of melatonin levels due to saturation kinetics. The incidence of slow CYP1A2 metabolism ranges from 12% to 14% (Butler et al. 1992; Nakajima et al. 1994; Zhou et al. 2009). The boy in whom even 10 mg melatonin failed might be a slow metaboliser for CYP1A2.

Some other limitations of our study need to be addressed. The children in our study came from one sleep clinic, and the sample with long-term results is small (57 children). This population might not be representative of otherwise healthy Dutch children with CSOI. A recently published long-term study (Hoebert et al. 2009) examined efficacy and safety in a larger population (n = 99). In that study, only 8% of the children stopped therapy after 4 years of treatment. However, most of these children suffered from other health problems, remained under supervision of a specialist and used other medication as well.

For evaluation of long-term effects and results of any intervention, one would ideally apply the same measurements in due course. The Meldos study applied a sleep hygiene questionnaire that was deemed inappropriate for the older children of the present evaluation. Moreover, the applied CSHQ itself was recently found to be inappropriate for children older than 10 years (Waumans et al. 2010). So, the CSHQ scores may only be valid for 17 children.

In conclusion, we found that melatonin was still used by 81% of children, after a mean term of usage of 3.1 years. Six (10%) children stopped therapy successfully, two others adopted a delayed sleep pattern after cessation. One girl quit melatonin therapy because of apathy and weight gain, one boy quit because of loss of response. One girl was forced to stop therapy by her GP after 6 months. The CSHQ results indicate that the sleep habits in melatonin users are better than in PS without medication, but worse than in NPS. Social development assessed by SDQ indicates a normal development. Puberty onset, as assessed by Tanner scores, seems to be undisturbed after 3.1 years of exogenous melatonin usage in this limited population.