Teacher Nominations of Preschool Children at Risk for Mental Health Problems: How False Is a False Positive Nomination and What Makes Teachers Concerned?

Identification attempts in populations with a low prevalence of problems usually result in a considerable number of false positives. Thus, the aim of the current study was to investigate the false positive rate following nomination of developmental concerns by preschool teachers and the reasons for which teachers raise developmental concerns about children who display non-clinical levels of mental health problems. A total of 1430 children aged 1 to 6 years in Norwegian childcare centers were classified as true positive, false positive, true negative, or false negative by comparing preschool teachers’ nominations with their ratings on the Caregiver-Teacher Report Form, resulting in 127 (9%) false positives and 1142 (80%) true negatives. Compared to the true negative group, the false positive group received significantly higher scores on internalizing problems, externalizing problems, and conflict, and significantly lower scores on closeness. Children’s internalizing and externalizing problems and age were the main factors that increased the likelihood of teachers raising concerns, while increased closeness in the teacher-child relationship reduced the likelihood of being nominated. Children’s gender and conflict level were not significant when adjusting for other factors. These findings suggest that preschool teachers’ concerns about children’s development should not be discarded, as the false positive group did show elevated levels of problem behavior and poorer teacher-child relationships compared to the true negative group. Scrutinizing concerns in collaboration with parents and other mental health professionals may be beneficial to ensure healthy development for children with elevated problem levels.


Introduction
Globally, approximately 20% of all children experience mental health difficulties (Belfer 2008) and 13% meet the criteria for a psychiatric disorder (Polanczyk et al. 2015). In Norway, the prevalence rates are slightly lower, with 15% to 20% of preschool children exhibiting some mental health problems (Skogen et al. 2014) and 7% showing a symptom load that would qualify for a psychiatric disorder. However, very few preschoolers who meet diagnostic criteria are referred for mental health evaluation or receive treatment (Egger and Angold 2006; Horwitz et al. 2003; Horwitz et al. 2007), with approximately half of all children with behavioral disabilities not being identified before school entry (Glascoe and Marks 2011). Additionally, in Norway, only 10% of four-year-old children with emotional and behavioral difficulties have received help (Wichstrøm et al. 2014).
At the community level, parents and other caregivers, such as preschool teachers, are the only viable source of information regarding young children's development (Sveen et al. 2013). Parents are the primary initiators of contact with health services when there are concerns about a child's development (Ellingson et al. 2004). Usually, the task of identifying children with emotional and behavioral difficulties has been carried out by pediatric practitioners in collaboration with parents. However, many cases may go undetected if responsibility is placed solely on parents and pediatric practitioners (Lavigne et al. 2016a). Alternatively, preschool may provide a valuable context to screen for early developmental concerns. Thus, more attention should be directed towards preschool teachers' perceptions of children's difficulties (Poulou 2015), as their observational accuracy may be an important factor in connecting children in need of help with relevant mental health services (Berkhout et al. 2012; Eklund et al. 2009). However, surprisingly little research has been carried out on the ability of preschool teachers to identify children with emotional and behavioral difficulties.
There is an increasing awareness of the great importance of early identification of emotional and behavioral difficulties (Council on Children with Disabilities et al., 2006; Essex et al. 2009; Glascoe and Marks 2011; Njoroge and Bernhart 2011; Radecki et al. 2011). The rapid development occurring in preschool-aged children may place them at risk of developing emotional/internalized problems and behavioral/externalized problems. While some children overcome these difficulties, others struggle to return to a normal developmental trajectory (Essex et al. 2009; Lavigne et al. 1998; Poulou 2015). Early emotional and behavioral difficulties are predictors of later maladjustment, underlining the importance of identifying those children with high levels of internalized and externalized problems or those with continuity in their problem behavior over time (Basten et al. 2016; Briggs-Gowan and Carter 2008; Essex et al. 2009; Fanti and Henrich 2010; Gilliom and Shaw 2004). Well-timed and targeted interventions may disrupt these negative trajectories and enhance the probability of better adjustment (Masten and Cicchetti 2010). This presumes that children in need of intervention are identified at an early stage; however, such identification may be challenging in a period during which children's development proceeds rapidly (Keenan et al. 1998). Given the potentially serious consequences of early difficulties in children's development and lifelong health (Center on the Developing Child Harvard University, 2010), developing procedures to identify those in need of help should be a public health priority (Essex et al. 2009; Sawyer et al. 2013).
In contrast to diagnostic assessment, the main purpose of identifying children at risk through screening is to detect which children are in need of further assessment to decide whether treatment is necessary or not. In other words, screening may be viewed as a first step that indicates the need for a more thorough diagnostic assessment and identifies children who do not meet diagnostic criteria but may still be at risk developmentally. Screening procedures are usually shorter and less costly to administer than diagnostic assessments, and the briefest and most low-cost screening procedure is the nomination method. Simply put, this approach involves informants nominating the children who they perceive to meet a given criterion. The criterion can be either a certain number of risk factors being present or the informant's perception of developmental concerns. The nomination method may be regarded as pre-screening or as a subjective judgment call, as it can direct attention towards subsets of children in need of further screening.
Preschool teachers are a logical source for the developmental screening of children during early development. However, accuracy in identifying children with difficulties has predominantly been investigated for teachers of school-aged children rather than teachers of preschool children. Several studies of children in elementary school have shown that teachers are more likely to express concern and exhibit higher precision regarding externalizing symptoms compared with internalizing symptoms (Dwyer et al. 2006; Loades and Mastroyannopoulou 2010; Soles et al. 2008). Furthermore, children identified as needing mental health services show significantly more adjustment problems than their peers (Layne et al. 2006; Roeser and Midgley 1997), and children nominated by a teacher as at-risk for problems differ significantly from non-nominees with respect to academic grades, sociometric status, and social behavior five years after nomination (Ollendick et al. 1990). However, previous studies have reported low to moderate accuracy when teachers are asked to identify children with anxiety and depression difficulties (Dadds et al. 1997; Moor et al. 2000). Similar to elementary school teachers, preschool teachers also tend to under-report internalizing symptoms (e.g., anxiety, depression, and withdrawal) compared with parent reports. In addition, if preschool teachers perceive their relationship with a child as conflictual, they tend to over-report both internalizing and externalizing symptoms. That said, it has been reported that preschool teachers can identify a considerable portion of children rated with a clinical symptom load (true positives), especially among the oldest preschool children, and leave very few false negative cases (children rated with a clinical symptom load for which teachers have no concerns).
However, preschool teachers' nominations can also produce a high rate of false positive cases, with about every other nomination identifying a child rated with a non-clinical symptom load (Stensen et al. 2021). Being classified as a false negative, that is, having a negative screening test with positive follow-up results or having problems that are not identified in a timely manner, may deprive children of appropriate help through non-referral for a more thorough assessment. Misclassifications in the form of false positives, that is, a positive screening test with negative follow-up results indicating an absence of problems, may have a stigmatizing effect for the child and create unnecessary anxiety for the child and parents. They also waste clinical resources, which ideally should be allocated to those clearly needing services.
The high rates of false positive cases associated with screening in a population with a low base rate of difficulties, such as a normal population of preschool children, remain problematic (Lavigne et al. 2016a). The ability of screening procedures to classify cases correctly tends to decline as the base rate of the target problem declines, thereby leading to a higher misclassification rate (Lavigne et al. 2016a; Young and Takala 2018). However, some findings suggest that the validity of a false positive screening result can be debated. By drawing from various screening instrument validation studies, Glascoe (2001) recruited 512 parents and their children, aged 7 months to 8 years, to undergo screening followed by diagnostic assessment for all. The results showed that the children classified as false positives performed significantly worse on diagnostic measures than the children classified as true negatives (i.e., those who showed negative screening results and an absence of problems). In another study, Jensen and Watanabe (1999) used DSM criteria and symptom checklists, as well as other survey measures, to compare true positive, false positive, true negative, and false negative cases. They found that the false positive cases exhibited higher levels of a range of risk factors than the true negative cases.
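The dependence of misclassification on the base rate can be illustrated with a short calculation. The sketch below applies Bayes' rule to compute the share of positive screens that are true positives (the positive predictive value) at several prevalence levels. The sensitivity and specificity values are hypothetical and held fixed purely for illustration; they are not estimates from any study cited here.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Share of positive screens that are true positives (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical screening accuracy, held constant while the base rate varies.
SENSITIVITY = 0.80
SPECIFICITY = 0.90

for prevalence in (0.50, 0.20, 0.10, 0.05):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}")
```

With accuracy held constant, the positive predictive value falls as prevalence falls, so a larger share of positive screens are false positives in low base rate populations.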
Even though reports of estimated accuracy are somewhat mixed, the nomination method can be used to identify a considerable proportion of children with emotional and behavioral problems. Children classified as false positive cases may still present problems, although not necessarily at a clinical level. Indeed, these children have been reported to receive poorer outcome scores on various measures than children classified as true negative (Glascoe 2001; Jensen and Watanabe 1999). In addition, it has been reported that children displaying symptoms of psychopathology experience considerable impairment even when they do not meet DSM criteria for disorders (Angold et al. 1999). Subthreshold conditions may be effective targets for preventive interventions, as these can be precursors for disorders later in life (Shankman et al. 2009). Because preschool teachers can potentially play an important role in identifying and helping children with emotional and behavioral problems, the validity of false positive teacher nominations needs to be investigated by comparing the characteristics of those classified as false positive with those classified as true negative. The current study thus sought to test the following hypotheses in a sample of preschool children: (H1) Children with false positive teacher nominations will receive higher scores for problem behaviors and lower scores for teacher-child relationship. (H2) Children with an elevated but non-clinical symptom load will have higher odds of a false positive classification. (H3) Negative teacher-child relationships, represented by either high levels of conflict or low levels of closeness, will increase the odds of a false positive classification.

Methods
Data were collected from 2012 to 2014 as part of the Children in Central Norway study, which aimed to enhance teacher competence in addressing preschool children's mental health and to improve the quality of relationships between teachers and children. The study was approved by the Regional Committee for Medical and Health Research Ethics.

Procedure and Participants
Parents with children in childcare centers serving children from one to six years old in three municipalities in Central Norway received recruitment letters with information regarding the project as well as an informed consent form. Information was also provided in parent meetings before the project started. The recruitment letter provided the option for parents to consent either by logging in with a personal invitation code or by returning the consent form to the childcare center. Participation was voluntary, and parental consent could be withdrawn without reprisal at any time until the participation registry was deleted. Parental consent gave the teacher in the childcare center who was most familiar with the child permission to complete a survey regarding that child.
Children are usually enrolled in childcare centers in the autumn, and the data were collected in January the following year. Thus, most teachers would be expected to have known the child for at least a few months. The teachers provided consent electronically with their own invitation codes. Of the invited parents, 1631 (77%) consented to enroll their child in the study, and 169 teachers (7% male) reported on 1431 children (88% of eligible). The gender distribution of the children was 51% boys and 49% girls, and the mean age was 45 months. In the survey, the preschool teachers were first asked to decide whether or not to nominate the child as being a subject of developmental concern. Next, the teachers were asked to respond to the Student-Teacher Relationship Scale (S-TRS) and the Caregiver-Teacher Report Form (C-TRF).
The teachers responded for all children in the same sitting.

Teacher Nomination
Teachers were asked to make a global subjective judgment concerning each child's risk status by answering "yes" or "no" to indicate whether they perceived that the child had any developmental concerns. This question was located at the start of the survey before the standardized questionnaires were presented. If "yes" was answered, teachers could specify their nomination by checking one or more reasons for the nomination, including aggression, attention, emotional, social, motoric, language, and home situation. However, only those nominated with specifications of aggression, attention, emotional, or social concerns were considered in the analyses to have been nominated as being at risk, to match the types of problems addressed in the criterion (see below).

Student-Teacher Relationship Scale (S-TRS)
The S-TRS (Pianta 2001) is a teacher-report form developed to measure teachers' perception of their relationship to a child or student. It contains the subscales closeness, conflict, and dependency with item responses ranging from 1 ("definitely does not apply") to 5 ("definitely applies"). In the current study, only the closeness (11 items) and conflict (12 items) subscales were used. A total score for each subscale is obtained by summing the individual items, where higher scores in closeness indicate a higher degree of warmth in the relationship and higher scores in conflict indicate a higher degree of problem interactions in the relationship. The closeness and conflict subscales have been demonstrated to have high internal consistency (α = .86 for closeness and .92 for conflict) and test-retest reliability (4-week r = .88 for closeness and .92 for conflict) (Pianta 2001). In addition, these two subscales have been shown to have good concurrent and discriminant validity in Norway and have been shown to have good factor validity in a slightly modified version (Drugli and Hjemdal 2013; Solheim et al. 2012).

Caregiver-Teacher Report Form (C-TRF)
Teachers also completed the C-TRF (Achenbach and Rescorla 2000), which contains 100 items describing problem behaviors for children who are between 1.5 and 5 years old. Each item has three response options: "not true (as far as you know)", "somewhat or sometimes true", and "very often or often true". These answers correspond to scores from zero to two. The C-TRF contains the following subscales: emotionally reactive (7 items), anxious/depressed (8 items), withdrawn (10 items), somatic complaints (7 items), attention problems (9 items), and aggressive behavior (25 items). A total problem score (ranging from zero to 200) can be calculated by adding the scores across all items. In addition, two broadband scales can be calculated by adding certain subscales for internalizing problems (emotionally reactive, anxious/depressed, withdrawn, and somatic complaints) and externalizing problems (attention problems and aggressive behavior). The validity, reliability, and factor structure of the C-TRF have been demonstrated to be excellent across cultures (de Groot et al. 1994; Ivanova et al. 2007; Ivanova et al. 2010; Ivanova et al. 2011; Koot et al. 1997; Rescorla et al. 2012; Rescorla et al. 2014; Verhulst and Koot 1992). A score at or above the 90th percentile defines the clinical range on the C-TRF total problem score and has been shown to discriminate well between referred and non-referred children (Achenbach and Rescorla 2000). In addition, the parent-reported counterpart to the C-TRF, the Child Behavior Checklist (CBCL), has been reported to show good correspondence and predictive validity to DSM diagnoses both for preschool children and older children (Bellina et al. 2013; Ebesutani et al. 2010; Krol et al. 2006; de la Osa et al. 2016).
We defined children with a score at or above the 90th percentile on the C-TRF's Total Problem, Internalizing, or Externalizing scale as having a clinical level of mental health problems. In addition, children in the top 2% on at least one C-TRF subscale (except somatic complaints, as this scale does not match the concern specification options in the nomination process and, thus, would have created mismatched data) but who were not rated in the clinical range (90th percentile) on the C-TRF broader scales were also considered to be at risk. Consistent with recommendations by Achenbach and Rescorla (2000), this was done because the subscales comprise a smaller and more homogeneous set of problems, which are believed to require more stringent cutoff values to indicate a need for professional help. Thus, we ensured that children scoring very high on a specific set of problems were classified in the clinical range and in need of mental health services, even though they might have scored below the cutoff value on a broader scale. Because teachers tend to score boys higher than girls on the C-TRF (Achenbach and Rescorla 2000; Kristensen et al. 2010), separate norms for girls and boys were used to establish gender-specific cutoffs. The cutoff values used are shown in Table 1.
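The classification rule described above can be sketched as a small function. This is a minimal illustration, not the study's actual procedure: the dictionary keys, the function name, and all cutoff numbers are hypothetical placeholders (the real gender-specific cutoffs are those in Table 1), and "top 2%" is assumed here to mean a score at or above a 98th percentile cutoff.

```python
from typing import Mapping

BROADBAND_SCALES = ("total", "internalizing", "externalizing")
# Somatic complaints is left out, as described in the text.
SUBSCALES = ("emotionally_reactive", "anxious_depressed", "withdrawn",
             "attention_problems", "aggressive_behavior")

# Placeholder cutoffs only; these numbers are NOT the values from Table 1.
HYPOTHETICAL_CUTOFFS = {
    "boy": {
        "broadband": {"total": 30, "internalizing": 10, "externalizing": 18},
        "subscale": {"emotionally_reactive": 6, "anxious_depressed": 7,
                     "withdrawn": 9, "attention_problems": 12,
                     "aggressive_behavior": 25},
    },
    "girl": {
        "broadband": {"total": 24, "internalizing": 9, "externalizing": 13},
        "subscale": {"emotionally_reactive": 5, "anxious_depressed": 7,
                     "withdrawn": 8, "attention_problems": 10,
                     "aggressive_behavior": 19},
    },
}


def in_clinical_range(scores: Mapping[str, int], gender: str,
                      cutoffs: Mapping = HYPOTHETICAL_CUTOFFS) -> bool:
    """True if any broadband scale reaches its 90th percentile cutoff, or any
    retained subscale reaches its (assumed) 98th percentile cutoff."""
    norms = cutoffs[gender]
    if any(scores[s] >= norms["broadband"][s] for s in BROADBAND_SCALES):
        return True
    return any(scores[s] >= norms["subscale"][s] for s in SUBSCALES)


# A child below every broadband cutoff but very high on one subscale.
child = {"total": 20, "internalizing": 8, "externalizing": 5,
         "emotionally_reactive": 0, "anxious_depressed": 0, "withdrawn": 8,
         "attention_problems": 3, "aggressive_behavior": 2}
print(in_clinical_range(child, "girl"))  # True, via the withdrawn subscale
```

The second check is what lets a child with a narrow but severe problem profile count as clinical even when the broader scales stay below their cutoffs.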

Statistical Analyses
Before establishing cutoff values defining the clinical range on the C-TRF, one child was excluded due to missing age information. Based on teacher nominations of children with developmental concerns (246/1430 = 17%) and the C-TRF cutoff values described above (161/1430 = 11% in the clinical range), children were placed in one of the following categories: true positive (119/1430 = 8%), false positive (127/1430 = 9%), true negative (1142/1430 = 80%), and false negative (42/1430 = 3%) (Table 2). Independent sample t-tests (equal variance not assumed) were carried out to test for group differences between the true negative cases and the false positive cases, followed by two-level (children nested within preschool teachers) binomial logistic regression analyses to investigate the covariates of group membership for the false positive group (target group, n = 127) compared to the true negative group (reference group, n = 1142). None of the 1269 total children in these two groups had missing data, and all were thus included in the current study. Due to the relatively low number of false positive cases, the number of covariates was limited to children's age and gender, S-TRS conflict score, S-TRS closeness score, C-TRF internalizing problems score, and C-TRF externalizing problems score. The covariates were entered in the analytic model in three blocks to yield unadjusted (single covariate entry), adjusted (single covariate adjusted for children's age and gender), and fully adjusted (full model with all covariates) odds ratio (OR) estimates. The analyses were performed using SPSS25 and STATA16.
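The four-way classification and the counts reported above can be checked with a few lines of arithmetic. The sketch below cross-tabulates nomination against C-TRF clinical status and derives standard accuracy indices from the reported cell counts; the function and variable names are illustrative, and the derived indices are simple ratios of the counts rather than figures quoted from the text.

```python
def classify(nominated: bool, clinical: bool) -> str:
    """Cross-tabulate teacher nomination against C-TRF clinical status."""
    if nominated:
        return "true positive" if clinical else "false positive"
    return "false negative" if clinical else "true negative"


# Cell counts as reported in the text (n = 1430).
TP, FP, TN, FN = 119, 127, 1142, 42

total = TP + FP + TN + FN          # 1430 children
nominated = TP + FP                # 246 nominated by teachers
clinical = TP + FN                 # 161 in the clinical range

sensitivity = TP / (TP + FN)       # clinical children who were nominated
specificity = TN / (TN + FP)       # non-clinical children not nominated
ppv = TP / (TP + FP)               # nominations that were clinical
npv = TN / (TN + FN)               # non-nominations that were non-clinical

print(f"n={total}, nominated={nominated}, clinical={clinical}")
print(f"sensitivity={sensitivity:.2f}  specificity={specificity:.2f}")
print(f"PPV={ppv:.2f}  NPV={npv:.2f}")
```

The positive predictive value of roughly 0.48 corresponds to about every other nomination concerning a child in the non-clinical range.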

Results
The true negative group contained 50% (569/1142) boys, while the false positive group contained 60% (76/127) boys. The mean age for the true negatives was 3.70 years and the mean age for the false positives was 4.25 years. Both gender (p = .03) and age (p < .001) yielded significant group differences. In general, as seen in Table 3, the false positive group had a significantly higher mean score on all covariates except the closeness scale, which was significantly lower, all indicating more negative evaluations for this group, as well as larger variation. In addition, when comparing the false positive group's mean scores for internalizing (M = 5.76) and externalizing (M = 8.54) problems (Table 3) with the clinical cutoff values for the same scales (Table 1), the estimates indicate that, on average, the false positive cases would have to display approximately twice the symptom load to approach a clinical level on the C-TRF. Table 4 shows that all covariates in the unadjusted and age- and gender-adjusted analyses were significantly associated with group membership, while male gender and conflict were non-significant covariates in the fully adjusted analyses. With the exception of closeness, all covariates (age, male gender, conflict, internalizing problems, and externalizing problems) showed ORs greater than one, indicating that as these covariates increased, the odds of being classified as a false positive increased. For closeness, the OR was significant and less than one, indicating that as closeness increased, the odds of being classified as a false positive decreased. More specifically, the fully adjusted analysis revealed that for each year of age, the odds of being classified as a false positive increased by 68%, and a one-unit increase in internalizing or externalizing problem score was associated with a 55% and 21% increase in the odds of being classified as false positive, respectively.
A one-unit increase on the closeness scale was associated with an 8% decrease in the odds of false positive classification. Male gender was associated with a 1% decrease in the odds of being classified as a false positive, while a one-unit increase on the conflict scale was associated with a 2% increase in the odds of being classified as a false positive. Neither gender nor conflict was a significant covariate in the fully adjusted analysis.
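The percentages above are changes in odds per unit of each covariate. The sketch below shows how such figures follow from an odds ratio, how they compound over several units, and how a change in odds translates into a change in probability. The odds ratios are those implied by the percentages in the text (Table 4 is not reproduced here), and the 10% baseline probability is a hypothetical value chosen only for illustration.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Percentage change in odds for a one-unit increase in the covariate."""
    return (odds_ratio - 1.0) * 100.0


def odds_ratio_over(odds_ratio_per_unit: float, units: float) -> float:
    """Odds ratios multiply, so a k-unit increase gives OR ** k."""
    return odds_ratio_per_unit ** units


def probability_after(baseline_probability: float, odds_ratio: float) -> float:
    """Apply an odds ratio to a baseline probability."""
    odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Odds ratios implied by the percentages reported in the text.
IMPLIED_OR = {
    "age (per year)": 1.68,
    "internalizing (per point)": 1.55,
    "externalizing (per point)": 1.21,
    "closeness (per point)": 0.92,
}

for name, value in IMPLIED_OR.items():
    print(f"{name}: {percent_change_in_odds(value):+.0f}% change in odds")

# Two additional internalizing points compound multiplicatively.
print(f"OR for +2 internalizing points: {odds_ratio_over(1.55, 2):.2f}")

# Hypothetical 10% baseline probability: a 55% rise in odds is a smaller
# relative rise in probability.
print(f"probability after OR 1.55: {probability_after(0.10, 1.55):.3f}")
```

Because odds and probabilities diverge as the baseline grows, a 55% increase in odds should not be read as a 55% increase in the probability of a false positive classification.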

Discussion
The current study aimed to investigate, in essence, how false a false positive teacher nomination is for preschool children at risk for mental health problems and to compare the characteristics of those classified as false positive with those classified as true negative. In support of our initial hypotheses, our findings indicate that children in the normal range of the C-TRF who were nominated by preschool teachers with developmental concerns (false positive) were reported to have significantly higher internalizing and externalizing problem scores compared with children who were not nominated. Children identified as false positive cases were also perceived by teachers to have poorer teacher-child relationships, with higher levels of conflict and lower levels of teacher-child closeness. Neither child's gender nor degree of conflictual teacher-child relationship was significantly associated with false positive classification when adjusting for other factors. Age, internalizing problems, and externalizing problems increased the odds of false positive classification, while increased closeness in the teacher-child relationship reduced the odds of false positive classification.

Children's Internalizing and Externalizing Problems and Teacher-Child Relationships
The findings of the current study support results from previous research on school-aged children, showing that teacher-nominated children differ significantly from their non-nominated peers (Layne et al. 2006; Ollendick et al. 1990; Roeser and Midgley 1997). Although not reaching the clinical cutoff, the teacher-nominated preschool children still received higher scores for internalizing and externalizing problems, and lower scores for teacher-child relationship quality compared to their non-nominated peers (Table 4). These results suggest that even when the teachers' concerns are classified as false positives and referrals for a more thorough assessment appear to be unnecessary, children may still be in need of extra monitoring and support to ensure that development proceeds normally. Contrary to previous research (Loades and Mastroyannopoulou 2010), internalizing rather than externalizing problems made teachers more likely to report concerns about children's development, at least for children in the non-clinical range of the C-TRF. Although preschool teachers tend to underreport internalizing problems, they may be more vigilant in raising concerns when asked specifically to report this type of problem. If they have more experience dealing with externalizing problems than internalizing problems, a lower threshold for deviancy from what they perceive as normal behavior in the internalizing domain may occur, thus causing them to raise concerns more readily when dealing with children with internalizing problems.
As seen in Table 4, characteristics of the children themselves (i.e., age, internalizing problems, externalizing problems) were the main factors that increased the odds of teacher nomination, while teachers' perception of teacher-child relationships did not have the same impact. This may be because positive relationships can and do develop even in the presence of problem behavior (Myers and Pianta 2008). For each one-unit increase in the closeness scale, the odds of false positive classification were reduced significantly by 8%, indicating that teachers' proximity is important when teachers raise concerns. It seems plausible that with increased closeness, teachers are more able to accurately assess development and thus provide more reliable information. Children with externalizing problems are more likely to develop conflictual relationships with their teachers, which may lead to a maladaptive spiral (Sabol and Pianta 2012). This reciprocal relationship has been found in several studies (e.g., Skalická et al. 2015; Zhang and Sun 2011). Internalizing problems also exhibit the same reciprocal relationship with conflict, but neither externalizing nor internalizing problems show this bidirectional effect for teacher-child closeness (Zhang and Sun 2011). These findings may indicate that two distinct mechanisms are involved in these two relationship dimensions. It may be that preschool teachers' perception of conflict is child-driven, while the perception of closeness is more teacher-driven (Silver et al. 2005). This could explain the stronger link between problem behavior and conflict, while factors such as teachers' self-efficacy may play a more important role in teacher-child closeness. In this study, the false positive group received significantly higher scores in internalizing and externalizing problems and conflict as well as lower scores in closeness than the true negative group.
Even though the reciprocity of children's problem behaviors and teacher-child conflict has been demonstrated in previous studies (Skalická et al. 2015; Zhang and Sun 2011), conflictual relationships do not seem to be a source of concern for teachers, although closeness slightly but significantly reduced the odds of stating concern.
Table note: All values are mean (SD); the group difference was significant at p < .001 for all covariates.
Table note: OR = odds ratio (significant associations in bold); CI = 95% confidence interval. ªAdjusted for all covariates in first column.

Children's Age and Gender
As seen in Table 4, even though boys are more prone to false positive classification, this association was not significant when adjusting for other covariates, such as internalizing and externalizing problems. Although preschool teachers tend to report boys with more problem behaviors than girls on the C-TRF (Achenbach and Rescorla 2000; Kristensen et al. 2010), it does not seem to bias teacher concerns when children fall in the non-clinical range. As mentioned previously, different teacher thresholds for internalizing and externalizing problems may be in play when teachers raise developmental concerns and the same mechanism may also be at play regarding child gender. If teachers' perception of normal behavior differs for boys and girls and they operate with different thresholds for raising concern, it may result in girls and boys having the same odds for teacher nomination, even when boys display more symptoms. Future studies should investigate the interaction effect between gender and type of problems regarding teacher concerns and false positive classification. Children's age was a significant predictor of teacher nomination for children in the non-clinical range of the C-TRF. Preschool teachers were more likely to raise concerns for older children than for younger children, making a false positive classification more likely with increased age. One explanation may be that teachers feel more capable of discriminating between normal and abnormal behavior for older children, as symptom expression in older children can be more distinguishable than the more subtle expressions in younger children. Thus, teachers may have a better reference base for normal rather than abnormal behavior when reporting concern. Another explanation may be that teachers perceive younger children to have more transient problems that are more likely to normalize before school entry.
As children age and school entry approaches, teachers may grow increasingly concerned if there is a dissonance in the perception of developmental skills and school readiness. As the current study shows, false positive cases received significantly higher problem scores than the true negative cases, indicating that teachers' concerns should not be disregarded out-of-hand.
Some behaviors are more likely to be considered deviant or unusual as they raise concerns more easily. However, most behaviors are displayed on a continuum and depend on context, which makes the discrimination of normal and abnormal behavior more difficult to establish. In addition, internalizing and externalizing problems exhibit a low to modest correlation to functional impairment (Gordon et al. 2006; McKnight and Kashdan 2009; McKnight et al. 2016). Thus, some children may display an elevated symptom load without significant impairment, while other children may display few symptoms and significant impairment. Consequently, blindly relying on categorical criteria may hinder the identification of children with emerging psychopathology. As children with sub-clinical levels of symptoms continue to display impairment later in life (Finsaas et al. 2018; Shankman et al. 2009), it is important to identify relevant cases below clinical cutoffs and among children not meeting diagnostic criteria. As symptoms of psychiatric disorders reflect normal behaviors that change phenotypically as children develop, approaches that capture the full range of behaviors relevant to psychopathology are needed (Dougherty et al. 2015). A probabilistic approach to clinical cutoff values was proposed by Sheldrick et al. (2015) in which children with a very high symptom score are assumed to be more likely to have some psychopathology than children with a low score. In addition, children approaching a clinical cutoff would also be more likely to display some psychopathology than children with low scores. Thus, this approach assumes that increases in symptom scores indicate an increased probability of psychopathology. In support of Sheldrick et al. (2015), the current study reports that increases in internalizing and externalizing problems were associated with an increased likelihood of nomination by preschool teachers.
Further, the nominated children were reported to have significantly more symptoms of problematic behavior compared to the children not nominated, indicating that children classified as false positive through preschool teachers' nominations should be developmentally monitored rather than regarded as completely behaviorally healthy. Our results and those of previous studies indicate that, although both categorical and dimensional approaches to psychopathology are capable of discriminating between normal and abnormal behaviors in preschoolers (Moreland and Dumas 2008), dimensional approaches are better suited for monitoring developmental trajectories because of their flexibility (Coghill and Sonuga-Barke 2012). Consequently, dimensional approaches to identify and intervene for sub-clinical problems should be initiated before more stable patterns of psychopathology emerge.
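To make the contrast between categorical and dimensional views concrete, the following is a minimal Python sketch, not the analysis used in this study. It crosses a teacher nomination with a score cutoff to obtain the four classification groups and then compares mean scores between groups. The `Child` record, the cutoff of 60, and all scores are hypothetical values chosen for illustration only.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Child:
    nominated: bool  # teacher raised a developmental concern
    score: float     # hypothetical problem score from a rating scale


def classify(child: Child, cutoff: float) -> str:
    """Categorical view: cross the teacher nomination with the cutoff."""
    above = child.score >= cutoff
    if child.nominated:
        return "true positive" if above else "false positive"
    return "false negative" if above else "true negative"


def mean_score_by_group(children, cutoff):
    """Dimensional view: average score within each classification group."""
    groups = {}
    for child in children:
        groups.setdefault(classify(child, cutoff), []).append(child.score)
    return {label: mean(scores) for label, scores in groups.items()}


# Hypothetical data and cutoff, for illustration only.
CUTOFF = 60.0
children = [
    Child(True, 70.0), Child(True, 55.0), Child(True, 50.0),
    Child(False, 65.0), Child(False, 40.0), Child(False, 30.0),
]
means = mean_score_by_group(children, CUTOFF)
print(means)
```

In this made-up data the false positive group falls below the cutoff but scores well above the true negative group, which is the kind of difference a purely categorical reading would hide.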

Clinical Implications
It is reasonable to assume that a false positive classification may be more correct for some children and less correct for others. Further, better training that helps preschool teachers recognize what constitutes a developmental concern might be expected to lower the misclassification rates associated with teacher nomination. Findings from this study suggest that the term false positive may in some cases be misleading or inaccurate, which could potentially hinder at-risk children from getting appropriate help in a timely manner. When a preschool teacher raises concerns that a child may have clinical problems, it is important and worthwhile to scrutinize the concern promptly, even for children who are found to be in the non-clinical range following further tests. Further, preschool teachers should be encouraged to express their concerns once they arise, preferably in a forum which includes colleagues, parents, and other mental health professionals. The Norwegian Kindergarten Act (2017) states that preschool teachers are responsible for following up concerns they might have regarding a child's development. Thus, one approach may be to apply teacher concerns as a pre-screening method to identify children of interest, supplemented by a psychometrically sound screening instrument, which will either confirm or dismiss the concern. It may also be beneficial to monitor child development dimensionally by scrutinizing scores and/or establishing a symptom profile rather than blindly using categorical cutoffs. This may ensure that children with elevated (but non-clinical) problem scores are monitored and get help for their problems, which in turn may increase the likelihood of healthy development. The importance of this task is underlined by findings indicating that young children who display internalizing and externalizing problems at a subclinical level will continue to exhibit poorer functioning throughout childhood and adolescence (Finsaas et al. 2018).
Since few preschool children are referred and receive treatment for existing mental health problems (Egger and Angold 2006;Horwitz et al. 2003;Horwitz et al. 2007;Wichstrøm et al. 2014), scrutinizing preschool teachers' concerns may lead to an increase in referral rate. Currently, it is known that parents are a strong trigger for the initiation of contact with health services for young children not meeting developmental expectations within the family context (Ellingson et al. 2004).
The results of the current study and previous studies suggest that preschool teachers may play a similar role within the context of childcare centers, as their concerns seem to capture a considerable portion of children with a clinical symptom load (Stensen et al. 2021), as well as those with a subclinical symptom load.

Strengths and Limitations
To the best of our knowledge, this study is the first to investigate false positive classification rates and the factors that predict teachers' developmental concerns in a large sample which includes the full age range of preschool children. Although this study was strengthened by its large sample size, an even larger sample yielding more false positive cases would have been beneficial, allowing more covariates to be investigated. Future studies should investigate interactions among the covariates examined in the current study to further illuminate the factors leading to teachers being concerned about children without obvious clinical problems. One limitation of the current study is that the ordering of measures in the survey may have increased the susceptibility to confirmation bias. Preschool teachers nominated children prior to completing the C-TRF, which could have primed them to respond to the survey differently than if the nomination item had been omitted or placed elsewhere in the survey. Although the C-TRF might be regarded as a "gold standard" for measuring children's mental health problems based on its psychometric properties, it does not exhibit perfect accuracy compared with diagnostic interviews (Lavigne et al. 2016b). Future studies could investigate preschool teachers' concerns against classifications from a diagnostic interview, thus not solely relying on one informant. The inclusion of other informants, such as parents, would have been beneficial when investigating the nomination method. As preschool teachers provide one perspective, future studies should include other informants to investigate the psychometric indices of the nomination method, as the "gold standard" for clinical decisions in developmental psychopathology uses multi-informant reporting. It would also be valuable to investigate the relationship between preschool teachers' concerns and measures of functional impairment, rather than symptom scales in isolation.
Finally, as this study is cross-sectional and represents only a snapshot, an important focus for future research should be to longitudinally investigate whether scrutinizing preschool teachers' concerns, preferably in combination with a psychometrically sound screening instrument and in collaboration with parents and other mental health professionals, actually leads to increased support for children displaying an elevated level of mental health problems. As children's mental health problems may be a precursor for emerging psychopathology, early identification and support can be of great importance to ensure their healthy development.

Conclusion
The results of the current study emphasize the importance of seriously considering preschool teachers' concerns about children in their care, as even children classified as false positive by teacher nomination displayed significantly poorer outcomes compared with true negative cases. Because an elevated level of mental health problems and a decreased quality of teacher-child relationship may be precursors to emerging psychopathology and later maladjustment, it is important to scrutinize concerns when they arise and supplement them with a psychometrically sound screening instrument that may confirm or dismiss the initial concern. Even in cases where follow-up instruments dismiss the initial concern, scrutinizing the preschool teachers' concern in collaboration with others may still reveal that the child needs support to reduce the level of problem behavior and thus increase the likelihood of healthy development.
Funding Open access funding provided by NTNU Norwegian University of Science and Technology (incl St. Olavs Hospital - Trondheim University Hospital).

Declarations
Conflict of Interests None of the authors have declared any competing or potential conflicts of interest.
Informed Consent Informed consent was obtained from all parents and teachers of included children.
Experiment Participants All procedures performed in studies involving human participants were conducted in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.