A Scoping Review of Mental Health and Wellbeing Outcome Measures for Children and Young People: Implications for Children in Out-of-home Care

Purpose: One of the challenges for mental health research is the lack of an agreed set of outcome measures that are used routinely and consistently between disciplines and across studies in order to build a more robust evidence base for how to better understand young people’s mental health and effectively address diverse needs.
Methods: This study involved a scoping review of reviews on consensus of the use of mental health and wellbeing measures with children and young people. We were particularly interested to identify if there are differences in measures that are recommended for children and young people with care experience including those with developmental disabilities.
Findings: We identified 41 reviews, of which two had a focus on child welfare settings, three on childhood trauma and 14 focused on children and young people with developmental disabilities. Overall, our review highlights a lack of consensus and a diversity of measures within the field. We identified 60 recommended measures, of which only nine were recommended by more than one review.
Conclusions: Our review highlights the need for greater agreement in the use of mental health outcome measures. While our review highlights that there is value in identifying measures that can be used with any child or young person, researchers need to take into account additional considerations when working with children and young people with care experience and those with developmental disabilities, to ensure measures are accessible and sensitive to their life experiences.


Introduction
There is a growing recognition of a lack of agreement and consistency in the use of mental health measures in research and practice in relation to children and young people, which makes it difficult to compare research findings (Krause et al., 2021). For example, over 280 measures for depression have been developed since 1918 (Santor et al., 2006). How researchers assess mental health outcomes varies. Different measures reflect different mental health outcome domains that researchers decide to assess. Most commonly, mental health is defined by looking at mental health problems, although there has also been an increase in positive mental health measures, utilising concepts such as wellbeing or quality of life (Losada-Puente et al., 2019). While it is important that there are robust measures for all aspects of mental wellbeing, common mental health problems and severe mental illness, the lack of consistency in the measures used means that important opportunities to compare findings across studies, settings and time are being missed. Additionally, the increase in the number of measures available has led to concerns about the quality of the measures that are used and the absence of independent evaluation of their psychometric properties (Addington et al., 2015; Howe et al., 2020), as well as about whether measures accurately reflect the mental health outcome domains they claim to assess (Krause et al., 2022).
The need for greater consensus has been highlighted by a number of initiatives. In 2005, the COSMIN initiative (COnsensus-based Standards for the selection of health Measurement INstruments) was set up by a multidisciplinary team of international researchers to provide guidance on assessing and selecting suitable outcome measures. More recently the International Alliance of Mental Health Research Funders, the National Institute of Mental Health and the Wellcome Trust (Farber et al., 2020) have suggested a set of common data items and measures which should be routinely used in mental health research. These include: Age; Sex at Birth; WHO Disability Assessment Schedule (WHODAS) 2.0 (for adults); Patient Health Questionnaire (PHQ-9) (for adults); Generalised Anxiety Disorder Assessment (GAD-7) (for adults); and the Revised Children's Anxiety and Depression Scale (RCADS-25) (for youth). The International Consortium for Health Outcomes Measurement (ICHOM) has recommended a standard set of outcomes specifically for child and youth anxiety, depression, obsessive compulsive disorder, and post-traumatic stress disorder (Krause et al., 2021) which are: the RCADS-25; the Obsessive Compulsive Inventory for Children (OCI-CV); the Children's Revised Impact of Events Scale (CRIES); the Columbia Suicide Severity Rating Scale (C-SSRS); the KIDSCREEN-10; the Children's Global Assessment Scale (CGAS); and the Child Anxiety Life Interference Scale (CALIS).
One of the reasons for a lack of agreement on which measures to use is that different measures are often validated to be administered in specific groups and populations and within defined settings. Specific populations of interest include children and young people with care experience (also referred to as looked after children), including those with developmental disabilities. While there are increasing concerns about the mental health and wellbeing of children and young people in general (Frith, 2016), there are concerns in particular about looked after children (Bazalgette et al., 2015) and children with developmental disabilities, such as autism or ADHD (Sayal et al., 2018; Lecavalier et al., 2014). Children and young people in care are always, first and foremost, children and young people and there is a risk of othering those with care experience and/or developmental disabilities when viewing them as an entirely distinct and different group. Yet, some of their experiences will be unique and it is important for researchers to be aware of this. Young people who are looked after have consistently been found to have much higher rates of mental health difficulties than the general youth population, with almost half of looked after children (and three quarters of those living in residential group care) meeting the criteria for a psychiatric disorder in the UK (Fleming et al., 2021; McKenna et al., 2023). There are many reasons for this, including the adversities experienced by children before coming into state care, such as abuse, neglect, exploitation and poverty, along with the difficulties children may experience during their time in care, which can both add to and exacerbate their needs.
Given the accumulation of experiences, it is important to understand trajectories and outcomes of poor mental health and wellbeing, and recovery, for these young people to inform policy and practice. Reviews of the extant literature and research (Luke et al., 2014; NICE, 2015, 2021) have highlighted a number of challenges in ensuring that the needs of looked after young people are better understood and addressed. The NICE (2015, 2021) Guidelines on Looked After Children and Young People concluded that further work was needed to develop robust methods for evaluating services. This included, for example, developing standardised, validated and reliable measures and robust tools to evaluate quality of life outcomes for use with all looked after children and young people from birth to 25 years, regardless of where they live.
Children and young people with developmental disabilities have also been found to be at greater risk of mental health difficulties due to an interplay of differences in individual functioning and environmental risk factors such as higher prevalence of bullying, experiences of stigma, lack of social inclusion and school exclusion (Sterzing et al., 2012; Honey et al., 2011; Emerson & Hatton, 2007). Additionally, there is now a growing recognition of the presence of developmental disabilities within the care population, but this is often overlooked in mental health research with this group, despite potential implications for how such young people should be cared for and supported (Banerjee et al., 2021). Recognising diversity in individual functioning and life experiences for children with care experience, including children with developmental disabilities, raises the question of which measures are, can and should be used across populations, which need to be adapted and what population or domain-specific measures are needed.

Aims and Objectives
The overall aim of the scoping review is to explore variability in the use of mental health outcome measures and to identify measures that have been recommended to be routinely used in research with children and young people in different contexts, with a specific focus on children with care experience including those with developmental disabilities. The scoping review is part of a bigger project (blinded for peer review) and findings from this review informed the development of a Delphi study to identify and agree a common core set of measures to be used in mental health research with young people who are care experienced. Additionally, as part of our project we conducted participatory work with young people and adults with care experience, including some with developmental disability, to help us think about how we should define and understand mental health, given the criticism that young people in out-of-home care are rarely asked about their perspectives on their own health (Smales et al., 2020). The scoping review was the first step of the process and aims to map the literature on the development and implementation of agreed outcome measures.

Defining Mental Health Outcome Measures
For this review, outcome measures were defined as psychometrically validated measures of mental health. We aimed to take a broad conceptualisation of mental health and included related concepts such as wellbeing and quality of life, to capture clinical definitions, as well as broader social perspectives on mental health (Berghs et al., 2021). Additionally, it was important for the review team not to equate developmental disability with mental health problems and we decided not to include tools that facilitate a diagnosis of developmental disabilities such as autism or ADHD. The focus of this review is not on diagnosis, but on outcome measures that can be used in research to assess and understand young people's mental health, identify risks or capture change over time.

Methods: A Review of Reviews
Reviews of reviews are helpful in areas of research and practice that are rapidly growing and have an extensive evidence base that makes the synthesis of primary studies too burdensome (Smith et al., 2011). An initial database search combining terms for measures, mental health and children in PsycInfo showed over 70,000 results of primary studies. The search results were then filtered to include systematic reviews published in the last 10 years. A further preliminary database search showed that there were several existing reviews that explored the use of mental health measures in research with children and young people with a focus on different age groups, settings and outcome domains. Thus, we made the decision to conduct a review of existing reviews to map recommendations for different populations and outcome domains.

Research Questions
Our primary research question for the scoping review was: What outcome measures are currently used to assess the mental health and wellbeing of children and young people in research? Our aim was to map measures recommended by existing reviews for use in research with children and young people.
Sub-questions of interest were:

Search Strategy
The Joanna Briggs Institute (https://jbi.global/) recommends using PCC (Population, Concept, Context) to develop search strategies for scoping reviews, and the PCC format guided the development of our search strategy. The process was supported by an expert support librarian, who was a member of the research team (RJ). The search strategy was developed in PsycInfo, where the subject headings were likely to be the most detailed for mental health related terms, and the sensitivity of the search was tested using a set of papers already identified as relevant. The search strategy was then translated to Medline, Embase and ERIC.
A combination of subject headings and keyword (free text) searches was used. The search was conducted between March and April 2021. An overview of our search terms can be found below (Table 1) and details of the full strategy with truncations and search filters can be obtained from the first author on request.

Eligibility Criteria
The following eligibility criteria were developed to guide the screening process:
• Is it a published review?
• Is it a review of measures?
• Is it a review of mental health measures?
• Does it focus on children and young people (0-26)?
• Is it available in English or German?
• Has it been published in the last 10 years (2011 to 2021)?
The age range was chosen to reflect current policy and practice recommendations, reflecting an understanding that the period of transition to adulthood can take several years after young people leave school. Additionally, the mid-twenties have been identified as an age by which most mental health conditions will have manifested (Kessler et al., 2007). Reviews that included studies with children/young people, as well as adult populations, were only included if they specifically referred to children or young people as a distinct group in their assessment and recommendation of measures.
We excluded specific populations such as children and young people with diabetes or terminal illness. Reviews were defined as following a systematic and transparent search process and included scoping, systematic and narrative reviews. The focus on English and German publications reflects the languages spoken by the research team; however, we recognise that the exclusion of other languages adds bias to the review.
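As an illustration only, the checklist above can be read as a conjunctive filter: a record passes screening only if every criterion is met. The sketch below is not the review team's tooling; the `Review` class and its field names are hypothetical and were introduced purely for this example.

```python
from dataclasses import dataclass


@dataclass
class Review:
    """Hypothetical record; the field names are assumptions, not from the study."""
    is_published_review: bool
    reviews_measures: bool
    mental_health_focus: bool
    child_youth_focus: bool  # population aged 0-26, treated as a distinct group
    language: str
    year: int


ACCEPTED_LANGUAGES = {"English", "German"}


def is_eligible(review: Review) -> bool:
    """Return True only if the record satisfies every screening criterion."""
    return (
        review.is_published_review
        and review.reviews_measures
        and review.mental_health_focus
        and review.child_youth_focus
        and review.language in ACCEPTED_LANGUAGES
        and 2011 <= review.year <= 2021
    )


# Invented example records, for demonstration only.
candidate = Review(True, True, True, True, "English", 2018)
too_old = Review(True, True, True, True, "English", 2009)
print(is_eligible(candidate))  # True
print(is_eligible(too_old))    # False
```

In practice, several of these judgements (for example, whether a review treats children as a distinct group) require reading the full text, which is why the fields here are simple booleans filled in by a human screener.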

Screening Process and Data Extraction
Overall, 25,438 results were identified across the four databases and, after removing 3,544 duplicates, 21,894 were screened against our eligibility criteria. A total of 21,387 records were excluded after screening all titles and abstracts, and 506 were assessed for full-text eligibility, as we were unable to retrieve the full text for one record. A team of six researchers conducted the screening (PJ, LP, CMC, JD, GD, PMC). As a standardisation exercise, 20% of results were assessed by two reviewers at the title and abstract stage, before moving to single-reviewer screening. All full-text records were assessed independently by two reviewers. Conflicts were resolved by a third reviewer, and particularly difficult decisions were taken after discussions with the whole review team. An overview of the screening process can be found in the PRISMA diagram (Fig. 1).
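The record counts reported in the screening paragraph can be checked with simple arithmetic. The short sketch below uses only the figures stated in the text; the variable names are our own labels for this example.

```python
# Record counts as reported in the screening paragraph.
identified = 25_438
duplicates_removed = 3_544
excluded_title_abstract = 21_387
full_text_not_retrieved = 1

# Each stage of the flow is the previous stage minus the records dropped.
screened = identified - duplicates_removed
sought_for_full_text = screened - excluded_title_abstract
assessed_full_text = sought_for_full_text - full_text_not_retrieved

print(screened)              # 21894
print(sought_for_full_text)  # 507
print(assessed_full_text)    # 506
```

The intermediate figure of 507 is not stated in the text but follows from the reported numbers, and accounts for the one record whose full text could not be retrieved.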
The main reasons for exclusion at the full-text stage were reviews which did not meet our definition of being a systematic review of measures. This included reviews which focused on one or two specific measures and where the selection of those measures was not transparent or systematic. It also included reviews that reported the frequency of use of measures, but failed to provide an assessment of the psychometric properties or the acceptability or utility of identified measures. Additionally, 138 studies were identified as not primarily relating to mental health outcomes. This included reviews which focused only on physical health or physical functioning (e.g. mobility), IQ tests or standardised diagnostic assessments.
Forty-one reviews were deemed to meet our eligibility criteria. The data extraction process followed several steps. Firstly, we extracted information about each included review, including information about authors, country of origin, methods and the number of included studies and measures. Secondly, we identified recommended measures across the 41 reviews, and information about each measure and the context of its use (recommended for which purpose, which setting, context and which age group) was collated by two members of the research team (PJ and LP). To answer sub-questions of interest we used a framework of four mental health typologies to group reviews and measures (Slade, 2002). These were (i) condition-specific measures, (ii) behaviour associated with poor mental health such as self-harm or substance misuse, (iii) general mental health and (iv) positive mental health. We paid particular attention to reviews that discussed measures in relation to children with care experience and children with disabilities, comparing whether different measures were recommended or whether other differences were noticeable, such as in the use of outcome domains.

Results
The results section will provide an overview of included reviews, discuss findings in relation to outcome domains being used, identify recommended measures and lastly highlight findings in relation to children with care experience and children with developmental disabilities.

Overview of Included Reviews
An overview of the 41 included reviews can be seen in the tables below (Tables 2, 3, 4, 5 and 6), which are presented according to the four typologies (condition-specific measures, behaviour, general mental health, positive mental health). The tables include a description of the methods and aims of each included review, alongside a summary of key findings and recommendations made by the authors (including use of measures with specific age ranges, populations or in specific settings). Twenty-one reviews recommended specific measures as part of their findings, while 20 reviews felt unable to provide a recommendation. Those reviews often noted that the choice of measure depends on specific research questions, aims, settings and groups of interest. Notably, most endorsed measures were recommended to be used in clinical or mental health specific settings, with none of the reviews exploring use of measures in community settings (such as schools). This seemed to be because authors felt that there was not enough evidence on the use of measures with diverse populations (Eklund et al., 2018). Additionally, few measures were identified that could be used in early childhood.

Dimensions of mental health
The included reviews were based on different concepts of mental health. These included (i) nine reviews of symptom and condition-specific measures (e.g. depression, anxiety, psychosis), which focused on the presence of symptoms, were often closely related to diagnostic criteria and used in clinical settings; (ii) nine reviews that focused on behaviour associated with poor mental health, including substance use, aggression, disruptive behaviour, self-harm and suicide; (iii) ten reviews that focused on general mental health measures, combining an assessment of multiple dimensions such as cognition, social and emotional development and functioning in different environments; and (iv) ten reviews that utilised positive mental health perspectives, assessing wellbeing, quality of life and resilience through concepts such as life satisfaction, participation and sense of belonging, in combination with consideration of the impact of environmental factors such as relationships or housing. Both general mental health measures and positive mental health measures included examples of one-dimensional measures, providing an overall score across domains, as well as multidimensional ones considering individual domains alongside each other. Three of the reviews had a wider scope and reviewed measures across typologies (Becker-Haimes et al., 2020; Krause et al., 2021; Newton et al., 2017). Examples of measures for each typology included the Revised Children's Anxiety & Depression Scale as a condition-specific measure for depression and anxiety; the Columbia Suicide Severity Rating Scale, which evaluates severity of behaviour and ideation; the Paediatric Symptom Checklist, which involves an assessment of psychosocial problems, as well as overall functioning including school and peer relationships; and KIDSCREEN as an example of a measure of wellbeing that includes questions about physical and psychological wellbeing, mood and emotions, autonomy, home life, relationships, social support and school.
Interestingly, we had initially thought that measures of wellbeing and quality of life would reflect a more positive perspective on mental health. However, during the review and data extraction process we became aware that authors were highlighting that some wellbeing and quality of life measures are often applied in studies that take a deficit view to highlight limitations or difficulties (Davis et al., 2018; Mierau et al., 2020). Thus, wellbeing and quality of life measures were often found to be used within narratives that focus on psychopathology, rather than identifying what helps children and young people to be well (Losada-Puente et al., 2019).
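The counts of reviews per typology given above can be tallied against the overall total of 41 included reviews. The sketch below uses only the numbers stated in the text and assumes that the three reviews with a wider scope are counted separately from the four typology groups, since that is the reading under which the figures sum to 41.

```python
# Number of reviews per group, as stated in the text.
# Assumption: the three cross-typology reviews are counted separately
# from the four typology groups rather than within them.
reviews_by_group = {
    "condition-specific": 9,
    "behaviour": 9,
    "general mental health": 10,
    "positive mental health": 10,
    "across typologies": 3,
}

total_reviews = sum(reviews_by_group.values())
print(total_reviews)  # 41
```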

Overview of recommended measures
Overall, 60 measures were recommended by 21 reviews. Interestingly, a number of reviews had the same areas of interest (e.g. measures of anxiety or risk of suicide) but came to different conclusions and recommendations. This appeared to be because of different foci in relation to the exact purpose of the measures or their use with different age groups or populations and different priorities in the assessment of the measures. For example, some reviews had a stronger consideration of predictive values when making recommendations in relation to the identification of early risk (Harris et al., 2019), while others focused on the sensitivity of measures in relation to using them as screening tools or to capture change over time (Newton et al., 2017). Reviews assessing the use of measures in schools or clinical practice tended to include a stronger consideration of their utility and acceptability to children and young people (McConachie et al., 2015; Rosanbalm et al., 2016). Yet, it was still striking how little consistency there was across reviews on which measures to use. For example, we identified 15 different recommended measures in relation to the assessment of anxiety. Similarly, Bear et al. (2020) identified 15 different measures in their systematic review of outcome measures of anxiety and depression in young people.
To narrow down the list of recommended measures we looked at which measures were recommended by more than one review. Only nine measures were recommended more than once and these are presented in the table below (Table 7). An overview of the full 60 measures, including information on recommended populations, settings and number of items, can be found in Appendix 1. The Revised Children's Anxiety & Depression Scale (RCADS, long and short version) was the most recommended measure, with four reviews recommending it as a measure for anxiety and depression, and it is also included in the set of measures recommended by ICHOM and the Wellcome Trust. Reviews recommended it for the age range of 6 to 18 years, within clinical and community settings. Strengths that were noted included its use in different cultural contexts, but Krause et al. (2021) noted that they did not find evidence of its sensitivity to change.
Two measures were recommended by three reviews. The Paediatric Symptom Checklist (PSC) was recommended as a general mental health measure, assessing internalising, externalising and general mental distress for the ages 4 to 16 years (Becker-Haimes et al., 2020; Zima et al., 2019; McCrae & Brown, 2017). The Screen for Child Anxiety Related Emotional Disorders (SCARED) was recommended as another assessment and outcome measure of anxiety, with reviews highlighting its strong psychometric properties (Becker-Haimes et al., 2020; Lecavalier et al., 2014). All other measures were recommended by two reviews. This included a further two anxiety measures. The Anxiety Disorders Interview Schedule (ADIS) was recommended specifically to be used with autistic children for the ages of 6 to 18 years, to detect treatment effect (no longer meeting diagnostic criteria) and for characterisation of research participants. However, Lecavalier et al. (2014) noted that administration burden makes it unsuitable as a repeat measure.
The Spence Children's Anxiety Scale (SCAS) was recommended to be used in community mental health settings, as well as with autistic children and young people for the ages 8 to 15 years (Becker-Haimes et al., 2020).
KIDSCREEN was recommended as a wellbeing and quality of life measure for young people between 8 and 18 years. Identified strengths included its sensitivity to change over time, as well as accessibility and ease of use in practice, having been developed with input from children, young people and their families (Davis et al., 2018; Krause et al., 2021). The Pediatric Quality of Life Inventory (PedsQL) was recommended as another quality of life measure for the ages 8-18 years. Limitations included its poor quality when used with younger children (Mierau et al., 2020), as well as its high cost (Davis et al., 2018).
The Strengths and Difficulties Questionnaire (SDQ) was recommended as a general mental health measure for the ages 3-16 years, with Becker-Haimes et al. (2020) emphasising evidence for its use as a routine measure of progress over time.
Lastly, the Columbia Suicide Severity Rating Scale (C-SSRS) was recommended to evaluate the severity of suicidal behaviour and ideation. Krause et al. (2021) noted that there had been no validation of the recent self-report version of the C-SSRS for use with children and young people, but that the clinician-rated C-SSRS had strong evidence of good internal consistency, inter-rater reliability, and sensitivity to change in adolescent samples.
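The degree of overlap described in this section can be summarised with a small tally. The sketch below uses the number of recommending reviews per measure as reported in the text for the nine measures recommended more than once; the remaining measures are represented only by the overall total of 60, since their individual counts are listed in Appendix 1 rather than here.

```python
from collections import Counter

# Number of reviews recommending each measure, as reported in the text.
recommendations = Counter({
    "RCADS": 4,
    "PSC": 3,
    "SCARED": 3,
    "ADIS": 2,
    "SCAS": 2,
    "KIDSCREEN": 2,
    "PedsQL": 2,
    "SDQ": 2,
    "C-SSRS": 2,
})
total_recommended_measures = 60  # overall figure reported in the review

multi_review = [name for name, n in recommendations.items() if n > 1]
single_review = total_recommended_measures - len(multi_review)
share_multi = len(multi_review) / total_recommended_measures

print(len(multi_review))               # 9
print(single_review)                   # 51
print(f"{share_multi:.0%}")            # 15%
print(recommendations.most_common(1))  # [('RCADS', 4)]
```

On these figures, 51 of the 60 recommended measures were endorsed by a single review only, which is the quantitative basis for the lack of consensus described above.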

Recommendations in Relation to Children and Young People with Care Experience and Those with Developmental Disabilities
We identified two reviews with a focus on children and young people with care experience in relation to general mental health measures and measures of wellbeing (McCrae & Brown, 2017;Rosanbalm et al., 2016) and three that focused on trauma related experiences in relation to general mental health (Atazadeh et al., 2019;Eklund et al., 2018) and resilience (Satapathy et al., 2020).
In relation to developmental disabilities, we included five reviews that focused on symptom or condition-specific measures, all of which related to autism and anxiety (Kreiser & White, 2014; Lecavalier et al., 2014; Wigham & McConachie, 2014; Tulbure et al., 2012; Grondhuis & Aman, 2012); three reviews that focused on the assessment of specific behaviours, which included aggression and self-harm in autism (Hanratty et al., 2015; Howe et al., 2020; Matson & Cervantes, 2014); one that discussed medication management and symptom changes in ADHD (Hall et al., 2016); three that focused on quality of life and general mental health outcomes in relation to disabilities as a general concept (Davis et al., 2018; Janssens et al., 2016; Losada-Puente et al., 2019); and two that focused on autism and quality of life (Ikeda et al., 2014; McConachie et al., 2015). This shows that while developmental disabilities include a very diverse group of children and young people, there appears to have been greater focus on autism over other disabilities.
Only one of the reviews that focused on children and young people with care experience made recommendations, which included the SDQ and the PSC, which were also recommended to be used with young people in mental health settings. Satapathy et al. (2020), in their review on resilience measures, further discussed that the Child and Youth Resilience Measure and Connor-Davidson Resilience Scale included small samples of children from welfare homes. Three anxiety measures (RCADS, SCARED, SCAS) were recommended for young people in the general population as well as autistic youth, with the ADIS being specifically recommended for autistic children and young people (Lecavalier et al., 2014; Grondhuis & Aman, 2012). Additionally, Lecavalier et al. (2014) highlighted that one study had evaluated the use of RCADS with autistic children (Hallett et al., 2013), which has since been repeated, adding further support for its use with autistic children and young people (Sterling et al., 2015). The KIDSCREEN (long and short versions) was recommended for use with children and young people in clinical care, youth with disabilities and children with ADHD. The PedsQL was recommended to be used with young people in mental health services, as well as autistic youth, young people with ADHD and intellectual disabilities (Mierau et al., 2020).
All reviews on children and young people with care experience focused on general mental health or positive mental health measures. This reflected a view that holistic assessments would help capture the complexity of experiences in this population. Additionally, in relation to outcome domains, all reviews on children and young people with care experience and some of the reviews that focused on children and young people with developmental disabilities highlighted the value of measures that included an assessment and questions on strengths alongside difficulties or deficits, as well as assessments that included a consideration of environmental factors alongside individual ones (Davis et al., 2018; McConachie et al., 2015; McCrae & Brown, 2017). Authors argued that, for both populations, environmental factors often contribute to and sustain poor mental health, and that it is important for researchers and practitioners to understand and capture poor mental health as a response to trauma and experiences of social exclusion or stigma (Davis et al., 2018; Ikeda et al., 2014; McConachie et al., 2015; McCrae & Brown, 2017). Similarly, two of the reviews which involved children and young people in the assessment process of measures with a focus on autism (McConachie et al., 2015) and developmental disabilities (Janssens et al., 2016) highlighted discrepancies between what was being measured and what children, young people or their families identified as important to them, as well as highlighting the importance of measures being accessible. These discrepancies included a dominant focus on deficits and difficulties, overlooking the strengths and abilities of children and young people.

Discussion
Of the 60 recommended measures we identified, only nine were recommended by more than one review, which adds to the evidence for the lack of consensus on the use of mental health measures in research with children and young people. Across the included reviews, the tension between having specific measures that are validated for use in particular settings, with specific age groups and populations, and measures that can also address defined research aims and questions (such as measuring change over time or having predictive power) was evident. Reviews which focused on developmental disabilities emphasised that many measures were not designed with children and young people with disabilities in mind, which was also true in relation to children and young people with care experience and those who have experienced adversities (McCrae & Brown, 2017; Satapathy et al., 2020). Yet, authors argued that instead of developing new measures it can be more helpful to adapt and develop existing measures to build on existing knowledge. This allows researchers to make comparisons, while remaining aware of specific needs and circumstances, particularly as mental health tools can subsequently be validated for their use with children or young people with developmental disabilities (Biederman et al., 2005; Sterling et al., 2015) or those with care experience. Similarly, Krause et al. (2021) argue for the piloting of existing measures in new populations and contexts to adapt or exchange them in light of new evidence and knowledge.
Only two of the nine measures that were recommended by more than one review were recommended to be used with young children under 6 years of age (proxy versions). These measures (the PSC and SDQ) were also recommended to be used with children and young people in care. None of the nine measures that were recommended more than once were recommended for both children and young people with care experience and children and young people with developmental disabilities, neglecting the intersectionality of both (Gajwani & Minnis, 2023). A focus on autism over other developmental disabilities was noticeable and, when considering intersections of care experience and developmental disabilities, it will be important for future research to consider other conditions such as FASD and ADHD (Gajwani & Minnis, 2023).
In addition to a lack of consensus on which measures to use, our review also identified a lack of consensus on how to assess or review existing measures and how to report psychometric properties. Reviews differed in their reporting of psychometric properties. For example, some reviews only reported psychometric data from the original studies of the development of measures, while others synthesised information from subsequent independent studies. This made it difficult to include and compare information on psychometric data. There are existing guidelines on the reporting of psychometric properties, including frameworks by the American Psychological Association (Gehrig, 2019), and our findings point to a poor use of those frameworks. The importance of having consensus in how we assess outcome measures is furthermore highlighted by the COSMIN initiative, which provides guidelines and standards, and to which a number of the reviews in this study referred. Alongside reliability, validity and responsiveness, COSMIN advocates for a consideration of interpretability, which is also sometimes referred to as acceptability or utility. There was less consideration of utility and acceptability within our review of measures, compared to reliability, validity and responsiveness. Similarly, in their review of psychosocial interventions for maltreated children and young people, Macdonald et al. (2016) found that researchers often fail to consider issues of accessibility and acceptability.
While high quality and evidence-based research relies on reliable and valid outcome measures, researchers have started to pay attention to their acceptability as well. This reflects the importance of children and young people understanding the questions and items asked and feeling that those reflect their experiences. Thus, alongside psychometric assessments, researchers have started to involve service users and experts by experience to evaluate and adapt assessment and treatment processes (Krause et al., 2021; Macdonald et al., 2016). Equally, the reviews focusing on developmental disabilities also highlighted the importance of involving children and young people (Davis et al., 2018). Reviews with a focus on autism discussed the significance of adapting self-report items and questions to ensure measures are accessible and inclusive (Ikeda et al., 2014; McConachie et al., 2015). Similar considerations apply to children and young people with care experience. Questions around family life and relationships need to be able to capture the diverse experiences of children and young people who might have experienced multiple placement changes and family conflict, and for whom the concept of 'family' might be ambiguous or sensitive. Research with children and young people in care has highlighted that other significant people such as teachers, sports coaches or friends can be their closest relationships (Frederick et al., 2023), and it might be important to be more inclusive in the assessment process, asking young people who they trust and identifying who the key people in their life are. As McCrae and Brown (2017) suggest: "Perhaps more of an issue than choosing screening tools with valid scientific properties is ensuring that instruments meet the needs of children and families." (p. 784). Involving children and young people with care experience in the process of adapting and assessing measures is an important next step (Smales et al., 2020). This will also help us to understand children and young people's experiences of assessment processes, and the extent to which these processes help researchers and practitioners to understand their experiences and facilitate engagement (Bradford & Rickwood, 2012; Tsang et al., 2012).
Additionally, in relation to children and young people with care experience and their families, it is important to understand that the process of conducting assessments is a relational one. Children might find it difficult to engage in overly restrictive processes and may mistrust professionals due to past experiences (McCrae & Brown, 2017; Macdonald et al., 2016). Similarly, McConachie et al.'s (2015) work with autistic young people and professionals stressed how measures that take a deficit view can negatively affect the relationship and engagement between children and young people and the professionals who undertake a problem-focused assessment with them. Previous research has shown that clinical definitions of mental health can often be restrictive and not fully supported by the experiences of young people themselves or by research (Macdonald et al., 2016; Zhang & Selwyn, 2019). This highlights the importance of thinking not only about which measures are used, but also about whether what is being measured matters to children and young people, how measures are used, and how the assessment process impacts on children and young people.

Conclusion
It is hoped that this review adds to the ongoing consideration and development of approaches to more effectively and consistently measure the mental health outcomes of young people, including those who are care experienced and those who have developmental disabilities. Research designs which enable links across settings and countries will facilitate comparison, although there should be some caution about what is appropriate to compare. It should also be acknowledged that these are not all of the outcomes that may be important, but by seeking to use an internationally agreed set of mental health and wellbeing measures in research involving young people there is a greater likelihood of building a comprehensive understanding of the diversity and totality of needs, and of how to meet these needs effectively. While a tension remains between having recommended outcome measures to enable consistency in the application of questions, items and scores, and ensuring that measures are sensitive to the contexts of different populations and settings, we agree with Krause et al. (2021) that it will be important to create greater consensus and to understand mental health measures as evolving tools that are co-owned and co-produced with those who should benefit from them, while upholding the value of reliability, validity and responsiveness.

Funding The review was funded as part of a larger project by the Medical Research Council.

Fig. 1 PRISMA flowchart

• How is mental health and wellbeing defined and what typologies and dimensions underlie existing measures?
• What outcome measures are used for children and young people in care and care-leavers?
• What outcome measures are used for children and young people with developmental disabilities?
• What are the age groups for which outcome measures have been designed and used?

Table 2 Overview of reviews on condition/symptom-specific mental health measures

Table 4 Overview of reviews on positive mental health measures