Introduction

Atypical eating behaviour, referred to as food selectivity (FS), is considerably more prevalent in children with autism spectrum disorder (ASD; 46–89%) than in typically developing (TD; 25%) children (Johnson et al. 2014). Food selectivity is a collective term used to refer to an insufficient variety of food (Tarbox et al. 2010), characterised by a range of feeding issues including food refusal, limited food repertoire, and a high frequency of single food intake (Marí-Bauset et al. 2014). Factors underlying FS in autistic children include sensory sensitivities to taste, texture, and smell (Suarez et al. 2014b), along with gastrointestinal symptoms such as reflux and constipation (Field et al. 2003; Cuffman and Burkart 2021). In some cases, the food restrictions may even meet the threshold for a diagnosis of avoidant/restrictive food intake disorder (ARFID), which manifests in persistent failure to meet nutritional or energy needs (Bourne, Mandy, and Bryant-Waugh 2022).

Although FS can be typical and developmentally appropriate in some TD children (Crist and Napier-Phillips 2001), it has been found that children with ASD are more likely to eat a limited range of foods within a specific food group, consuming half the amount of foods of TD peers (Schreck et al. 2004). More recently, a meta-analysis concluded that the nutritional intake of children with ASD consisted of a significantly lower amount of protein and calcium than in TD children (Esteban-Figuerola et al. 2018).

The impact of FS depends on the severity of the food restrictions but can include nutrient deficiencies (Bandini et al. 2010; Herndon et al. 2009), obesity (Egan et al. 2013), and medical complaints associated with poor nutritional status, such as iron deficiency anaemia (Latif et al. 2002), scurvy (Swed-Tobia et al. 2019), and constipation (Field et al. 2003). In one rare case, an autistic child had such severe symptomatic vitamin A deficiency that he developed a very painful eye condition and lost his eyesight, an outcome that does not usually occur in developed countries (Uyanik et al. 2006) and may be indicative of ARFID.

Despite the well-established high prevalence and significant nutritional consequences of FS in children with ASD, the UK guidance for the management of autism only acknowledged the need for feeding-related interventions in June 2021 (National Institute for Health and Care Excellence (NICE) 2021). Prior to this, the only reference to diet was to ‘not routinely use exclusion diets’ in the treatment of ASD (NICE 2013). The updated guidance now stresses the importance of nutritional assessment and monitoring, which may include blood tests for nutritional deficiencies, and onward referrals, although clarity is needed as to the destination of any referrals. This is in contrast to the Healthcare Improvement Scotland (HIS) (2016) guidance document (Scottish Intercollegiate Guidance Network 145) that specifies that a referral to a dietitian may be warranted for children and young people with significant food selectivity and dysfunctional feeding behaviours, or who are on restricted diets that may be adversely impacting growth or producing physical symptoms of recognised nutritional deficiencies or intolerances.

Beyond the nutritional consequences of FS, inappropriate mealtime behaviour is common in autistic children, causing stress for the child, caregiver, and siblings (Crowe et al. 2016; Sharp et al. 2018). Furthermore, changes to the foods served or mealtime routine often heighten challenging behaviours, with consequences for the whole family (Rogers et al. 2012; Marquenie et al. 2011). These difficulties include the inability for families to eat together, leading to unfulfilled hopes for mealtime as family time (Suarez et al. 2014a); increasing feelings of pressure, worry, and stress regarding their child’s nutritional intake (Ausderau and Juarez 2013; Marquenie et al. 2011); and difficulties during holidays and family gatherings (Rogers et al. 2012). The impact on TD children may also include supporting their autistic sibling with feeding, taking on extra household responsibilities, and having different mealtime rules, which can result in conflict (Ausderau and Juarez 2013; Marquenie et al. 2011).

It has also been reported that mothers of children with FS, both with and without ASD, can experience more mental health and wellbeing issues, including low self-esteem and social isolation (Blissett et al., 2007), higher levels of emotional distress (Budd et al. 1992), anxiety and depression (Blissett et al. 2007; Coulthard and Harris 2003; Whelan and Cooper 2000) and parental/caregiver stress (Greer et al. 2008 and Spender et al. 1996). In the case of children with ASD, this may reflect the lack of perceived support not only from family members such as fathers, extended family, and friends (Ausderau and Juarez 2013), but also from professionals (Rogers et al. 2012).

Given the wide-ranging consequences of FS, an intervention needs to be sensitive not only to the child’s sensory preferences but mindful of the home environment, family dynamics, and parental/caregiver well-being. The latter, in part, is because interventions for children with ASD have been less effective when parents/caregivers are experiencing high levels of stress (Osborne et al. 2008). As such, there has been an increased focus on parent/caregiver-implemented feeding interventions within a natural context for children with ASD and FS (Sharp et al. 2014). It has been suggested that interventions implemented in the natural context may be more efficient than interventions carried out in specialised clinical settings (Mueller et al., 2003 and Sharp et al. 2014). In addition, parent/caregiver involvement in intervention implementation has been deemed beneficial in addressing anxiety incurred when their child has atypical eating habits (Wood et al. 2009), better generalisation to other mealtimes, creating more positive parent/caregiver-child interactions, and enhancing self-efficacy (Feldman and Werner 2002). Furthermore, Cheng et al. (2022) recognised that caregiver interventions can enhance the effectiveness of other interventions (initially delivered by health professionals) and provide opportunity for a greater intensity of intervention due to significant time spent with their child.

Despite this, successful implementation and continuation of the intervention by the caregiver are likely to be influenced by the acceptability of the intervention, the caregivers’ confidence in implementing the intervention (Murphy and Zlomke 2016), and the impact the intervention has on the stress levels and the quality of life of the family (Brookman-Frazee and Koegel 2004). The method of training given to parents and caregivers to deliver the intervention may also influence the successful implementation of the caregiver intervention. As an example, group education can be an effective mode of delivery, resulting in a reduced sense of self-blame and facilitating better feeding practices (Mitchell et al. 2013); however, if there is mismatch between the group characteristics and the needs of the individual caregiver, success is unlikely. This would undermine the positive gains resulting from the group dynamics (Festinger 1950).

To date, literature reviews have characterised the state of food selectivity (Marí-Bauset et al. 2014; Sharp et al. 2013), the types of feeding issues and interventions to improve FS (Ledford and Gast 2006; Diaz and Cosbey 2018), and the short-term effectiveness of interventions for improving food intake, eating behaviour, and secondary outcomes such as parent/caregiver stress (Marshall et al. 2015; Ledford et al. 2018 and Aponte et al. 2019). The most recent review (Aponte et al. 2019) focused specifically on interventions with parents or caregivers as interventionists; however, little attention was paid to evaluating the meaningfulness of reported improvements in these outcome measures for the child or family (e.g. did changes in food intake improve the nutritional status of the child). Instead, there was considerable discussion of the variety of parental/caregiver interventions available, the timing and consistency of parent/caregiver training and fidelity of parental/caregiver implementation of the intervention. As such, the aim of this review was to critically evaluate the effectiveness of caregiver-led interventions with specific attention on the meaningfulness of any reported improvements and the practicality of the intervention, given the current financial limitations in health and social care. Effectiveness was considered in relation to (i) the child’s food intake and mealtime behaviours, (ii) family outcomes, and (iii) acceptability of the caregiver-led intervention.

Method

Scoping Search

Informal scoping of the literature of studies of ASD feeding interventions found a predominance of case study and case series designs. Given the above-noted research questions, the authors chose to undertake a systematic literature review and narrative synthesis of the existing evidence.

Systematic Search Strategy

Systematic searches were conducted in four key academic databases (Medline, PsycINFO, CINAHL, and ERIC) from inception to January 2021. Search terms were developed drawing on those used in similar systematic reviews (e.g. Ledford et al. 2018) and terms employed in ASD feeding studies identified through informal scoping. Search terms were categorised into those for autism and related conditions (e.g. autistic, autism spectrum disorders, ASD, Asperger, and pervasive developmental disorder), food sensitivity and mealtime behaviour (e.g. food choice, refusal, acceptance, selectivity, and preference), and intervention method or implementation approach (e.g. shaping, fading, scheduling, desensiti*, escape extinction, non-removal, behaviour modification, parent-implemented, and caregiver-implemented). The combination of terms and their spellings was customised according to the requirements of individual databases, including the use of MeSH terms and Boolean operators. An example of the full search strategy as developed for Medline is included in Online Resource 1. Results from the searches were combined into a single EndNote X9 library, and duplicate references were removed through the software’s facilities. Two authors then screened the titles and abstracts of the remaining references for further duplicates, obviously irrelevant studies, or studies that clearly did not meet the inclusion criteria (below). Full versions of the remaining references were downloaded and independently screened against the inclusion/exclusion criteria, with a third author helping to resolve any disagreements or uncertainty.
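
The way the three categories of search terms combine can be sketched as follows. This is a minimal illustration only, assuming that terms within a category are joined with OR and that the categories are joined with AND. The term lists are abbreviated from the examples given above, the helper names `or_group` and `build_query` are hypothetical, and the actual Medline strategy (including MeSH headings and database-specific syntax) is the one provided in Online Resource 1.

```python
# Illustrative only: abbreviated term lists taken from the examples in the text.
# The real strategy (Online Resource 1) also uses MeSH terms and
# database-specific field syntax, which are not reproduced here.
autism_terms = ["autistic", "autism spectrum disorders", "ASD", "Asperger"]
feeding_terms = ["food choice", "refusal", "selectivity"]
intervention_terms = ["shaping", "fading", "desensiti*", "escape extinction"]


def or_group(terms):
    """Join the terms of one category with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*groups):
    """Combine the categories with AND (assumed logic, hypothetical helper)."""
    return " AND ".join(or_group(g) for g in groups)


query = build_query(autism_terms, feeding_terms, intervention_terms)
print(query)
```

A record is retrieved by such a query only if it matches at least one term from every category, which is why broadening a single category (for example, adding spelling variants) widens the search without relaxing the other two.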

Study Selection

To be included, studies had to (i) focus on children with a diagnosis of autism or related conditions (e.g. pervasive developmental disorder), (ii) report an empirical study of caregiver-led interventions for improving the child’s nutritional intake and mealtime behaviours, and (iii) report child feeding outcomes, family outcomes, and/or acceptability of the caregiver-led intervention. Studies were excluded if they (i) were conducted solely in a clinical or university setting or (ii) were conducted solely with TD children.
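
The eligibility logic above can be expressed as a simple screening rule. The sketch below is illustrative rather than a description of the tooling actually used: the `StudyRecord` fields and the `is_eligible` function are hypothetical names, and each field is assumed to be a yes/no judgement made by a reviewer during full-text screening.

```python
from dataclasses import dataclass


@dataclass
class StudyRecord:
    """Hypothetical screening record; field names are illustrative."""
    autism_or_related_diagnosis: bool
    empirical_caregiver_led_intervention: bool
    reports_child_feeding_outcomes: bool
    reports_family_outcomes: bool
    reports_acceptability: bool
    solely_clinical_or_university_setting: bool
    solely_td_children: bool


def is_eligible(study: StudyRecord) -> bool:
    """Apply inclusion criteria (i)-(iii) and exclusion criteria (i)-(ii)."""
    # Inclusion (iii) is an and/or criterion: any one outcome type suffices.
    reports_outcome = (
        study.reports_child_feeding_outcomes
        or study.reports_family_outcomes
        or study.reports_acceptability
    )
    meets_inclusion = (
        study.autism_or_related_diagnosis
        and study.empirical_caregiver_led_intervention
        and reports_outcome
    )
    meets_exclusion = (
        study.solely_clinical_or_university_setting
        or study.solely_td_children
    )
    return meets_inclusion and not meets_exclusion


# Two made-up records: one home-based study and one clinic-only study.
home_based = StudyRecord(True, True, True, False, False, False, False)
clinic_only = StudyRecord(True, True, True, False, False, True, False)
print(is_eligible(home_based), is_eligible(clinic_only))
```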

Quality Assessment

In assessing the quality of the evidence, this systematic review aimed to go beyond using critical appraisal tools to justify inclusion/exclusion or to generate subjective numeric quality scores that place emphasis on well-designed randomised controlled trials (RCTs) while giving little context to clinicians making their ‘informed’ clinical decisions. While RCTs are considered the gold standard approach to assess causal relationships, excluding other evidence may omit studies that provide evidence of feasible, acceptable, and meaningful interventions that could be built on to assess effectiveness. Furthermore, health professionals are interested in broader evidence that relates to experience of health and healthcare. As such, this review adopts Audi’s (1995) view of evidence as all the information a person has, positive and negative, that is relevant to a proposition. In doing so, it considers whether the caregiver-led interventions are not only effective, but also relevant and sensitive to the health needs of the consumer, using the feasibility, appropriateness, meaningfulness, and effectiveness (FAME) framework (Table 1), modified from Pearson et al. (2007) and Pearson (2004). Therefore, this systematic review implemented the FAME criteria as a quality assessment tool, which informed the narrative analysis.

Table 1 Quality assessment criteria using the feasibility, appropriateness, meaningfulness, and effectiveness framework (Pearson et al., 2007 and Pearson, 2004).

Data Extraction and Synthesis

Data extraction was completed by two authors independently and checked and confirmed by a third author. Disagreements were resolved by discussion between the three authors. The overall quality and heterogeneity of existing studies precluded a meta-analysis; therefore, a narrative synthesis was conducted to analyse the findings related to the three research aims.

Results

In total, the systematic review included 29 case studies/series and seven experimental studies, four of which included a control group. The latter included data for 212 of the 264 children initially recruited, indicating an average attrition rate of 19.7% (range 11.9 to 36.7%). Combined with the case studies/series, a total of 336 participants who met the threshold for food selectivity were included in this review; however, 50 children (~ 15%) did not have a diagnosis of ASD or pervasive developmental disorder. These individuals included typically developing children (n = 18) (Najdowski et al. 2010; Seiverling et al. 2020) and those with social communication difficulties (n = 4) (Miyajima et al. 2017), developmental disability (n = 3) (Suarez and Bush 2020), or other special needs (n = 25) (Seiverling et al. 2020; Taylor et al. 2020).
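
The attrition and subgroup figures quoted above can be checked with simple arithmetic. The sketch below uses only the counts reported in this paragraph. The variable names are illustrative, and the attrition rate is computed from the pooled totals, which is an assumption about how the reported average was derived.

```python
# Counts reported in the text for the experimental studies.
recruited = 264
retained = 212

# Pooled attrition: children lost as a percentage of those recruited.
attrition_pct = 100 * (recruited - retained) / recruited

# Participants without a diagnosis of ASD or pervasive developmental disorder.
total_participants = 336
non_asd_counts = {
    "typically developing": 18,
    "social communication difficulties": 4,
    "developmental disability": 3,
    "other special needs": 25,
}
non_asd_total = sum(non_asd_counts.values())
non_asd_pct = 100 * non_asd_total / total_participants

print(f"Attrition: {attrition_pct:.1f}%")
print(f"Without ASD/PDD diagnosis: {non_asd_total} ({non_asd_pct:.1f}%)")
```

Under these assumptions the pooled attrition agrees with the reported 19.7%, and the four subgroups sum to the 50 children (roughly 15% of 336) stated in the text.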

The majority of studies (n = 22) did not comment on the intellectual function of the children. Of the studies that did, most referred to communication skills, which ranged from non-verbal (Seiverling et al. 2018) or echolalic speech, requiring communication with pictures (Muldoon and Cosbey 2018), to verbal communication using single (Silbaugh et al. 2018) or multiple words (Tanner and Andreone 2015) or complete sentences (Taylor 2020a). Medical history was explicitly referred to in a quarter of studies, of which constipation was most common (Muldoon and Cosbey 2018; Taylor et al., 2020; Taylor 2020), followed by genetic/chromosomal abnormalities in five participants (Taylor, Blampied, and Roglic 2020). In contrast, Sharp et al. (2019) and Seiverling et al. (2020) employed an exclusion criterion for specific medical conditions. Furthermore, Tarbox, Schiff, and Najdowski (2010) and Najdowski et al. (2012) reported no significant medical problems in their participants.

Overall, the children in the reviewed studies were aged from 2 to 15 years, and the gender profile was approximately 80% male. Ethnicity was reported for 10 studies, with ~ 72% of children classified as Caucasian. The characteristics of parents and caregivers were not always clearly stated, but 14 papers conducted the intervention with mothers alone, six with both parents, six papers had a mix of family members, and the remaining 10 papers did not specify which family members were involved in the intervention.

Severity of Food Selectivity

The severity of food selectivity was assessed in the majority of studies (n = 35); however, classifications varied, and only five studies reported the need for oral nutrition supplementation or enteral feeding (Binnendyk and Lucyshyn 2009; Seiverling et al. 2018; Tanner and Andreone 2015; Hoyo and Kadlec 2020; Taylor et al., 2020). Most commonly reported (n = 10) was the repertoire of foods accepted by the child participant, which consisted of few foods and commonly excluded all items from at least one food group. As an example, Anderson and McMillan (2001) reported that intake was limited to mashed potato, yoghurt, and applesauce, excluding vegetables and protein from animal and plant sources. A further eight studies specified the number of foods consumed, which ranged from five (Fu et al. 2015 and Penrod et al. 2010) to 15 (Suarez and Bush 2020), but gave no further details about the types of foods consumed; it is therefore unclear whether these children excluded an entire food group. Less commonly described was avoidance of specific textures, brands, or cooking methods, limited variety, or previous unsuccessful efforts to improve variety. A more detailed clinical assessment of FS was completed by Taylor et al. (2020), using the criteria for avoidant/restrictive food intake disorder (ARFID), which include failure to thrive, nutritional deficiencies, dependence on artificial feeding methods and/or dietary supplements, and marked impairment in psychosocial functioning. Finally, three studies focused on reporting the behavioural aspects of FS alone.

Feeding Therapy and Caregiver Involvement

Feeding interventions designed to improve food outcomes for autistic children experiencing food selectivity varied between studies (Online Resource 2). All but three interventions were multi-component in nature (Ewry and Fryling, 2016; Seiverling et al. 2018; Tarbox et al., 2010). The most common feeding therapies employed were escape extinction (n = 18), differential reinforcement of alternative behaviour (n = 12), and stimulus control and fading (n = 12). Escape extinction generally consists of no longer allowing a child to escape or avoid something non-preferred when they engage in challenging behaviour (e.g. non-removal of a spoon that presents a target food). Differential reinforcement of alternative behaviour is a procedure in which one behaviour is reinforced and another behaviour is on extinction (Tarbox and Tarbox 2017). Finally, stimulus control and fading is a behavioural procedure that entails gradually moving the feared stimulus (i.e. an unfamiliar food) closer to the child, allowing time for habituation (or adjustment) to the stimulus prior to each move closer (Furr et al. 2020).

These feeding therapies were implemented by the caregivers either as the initial interventionist or after successful implementation by a therapist/researcher as a way to generalise the desired behaviours to the child’s natural environment. The level of involvement of the caregiver in the intervention design and delivery varied. As an example, Cosbey and Muldoon (2017) demonstrated high caregiver involvement throughout their study, with caregivers involved in selecting primary intervention goals, developing individualised intervention plans, and delivering initial therapy with the presence of the researcher (phase one coaching). In contrast, caregiver involvement was less pervasive in a slightly earlier study (Barnhill et al. 2016), where nutritional staff instructed the mother what foods to bring to the therapy session and to sit quietly and observe the session with the therapist. Only after appropriate feeding behaviour was observed for 80% of presentations did the parent take ownership of the feeding therapy.

Caregiver Training

All the included studies reported some degree of parent/caregiver training in the feeding therapies (Table 2). Five studies (Johnson et al. 2015; Johnson et al. 2019; modelled after the Research Units on Pediatric Psychopharmacology Autism Network, 2007, Sharp et al. 2014; Sharp et al. 2019; Suarez and Bush 2020) employed training manuals that combined didactic teaching with a range of training methods (e.g. role play, modelling, video vignettes, homework/worksheets, coaching, and feedback). Those studies not employing manuals used different combinations of training methods. The two most commonly employed training methods were verbal instruction and oral/written feedback, which were reported in 18 papers. Sixteen papers reported the use of written instructions, followed by modelling/demonstration (14 studies), roleplay/rehearsal (11 studies), and observations and audio-video recording of training or earlier sessions (both reported in seven studies). Only a single study reported using goal setting as part of the training process.

Table 2 Method of parental training for the parent-led intervention

The majority of studies provided little detail concerning the implementation of parent/caregiver training, with only 13 of the 36 studies providing any details of the time, number, and frequency of training sessions. Where reported, these varied widely, from one study (Bui et al. 2013) reporting one session of 45 min, to another (Muldoon and Cosbey 2018) reporting sessions of 50 min twice weekly for six months. There is, however, a lack of clarity in a number of papers, stemming from parent/caregiver training being an integral part of the intervention process. For example, in Muldoon and Cosbey (2018), the sessions included registered behaviour technicians modelling the session feeding strategy, which was then repeated by the caregiver, who then received feedback on the fidelity of their implementation.

FAME Quality Assessment

The median score for each of the FAME criteria (Online Resource 3) was 3, indicating that the caregiver-led interventions were largely practical with limited local training or modest additional resources, acceptable and justifiable after minor revisions, provided a rationale for local, regional, or national reform, and were effective to a degree that suggests application. Despite this, the highest score was awarded to just two studies for feasibility (Bui et al. 2013; Miyajima et al. 2017) and one study for appropriateness (Cosbey and Muldoon 2017). The latter study included an individualised plan that fitted the family’s needs and strengths and gave caregivers ownership of the intervention in the home environment. In contrast, no studies were awarded the highest score for meaningfulness or effectiveness. The lowest score possible was awarded to three studies for feasibility (Taylor 2020a; Taylor 2020b; Taylor et al. 2020), one for appropriateness (Seiverling et al. 2018), and three for meaningfulness (Bui et al. 2013; Marshall et al. 2015; Suarez and Bush 2020). Of note, appropriateness was questionable mainly due to ethical concerns related to feeding practices that appeared to cause distress or conflict with child autonomy. As an example, interventionists continued to place food in the child’s mouth even if the child was crying or screaming (Seiverling et al. 2018) or physically manipulated the child’s jaw to insert the target food (Silbaugh et al. 2018). Of the studies reviewed, effectiveness was rated lowest (score of two) for Sharp et al. (2014), Silbaugh et al. (2018), Sira and Fryling (2012), and Clarke et al. (2020).

Food Outcomes

All of the included articles measured changes in the child’s food-related outcomes (Tables 3 and 4) at numerous phases of the study (baseline, during training, second baseline, during the caregiver-led intervention, and at a variety of different follow-up periods); however, Johnson et al. (2019) failed to report the outcomes of the 3-day dietary record, which could be considered an ethical issue given the participant burden associated with keeping dietary records (Holmes et al., 2008). Food outcomes included variables associated with food consumption (n = 16), food acceptance (n = 18), and non-acceptance (n = 4), along with bite response rate per minute (n = 1), diet variety (n = 3), and quality (n = 4). These variables were measured using a variety of methods; for example, consumption was reported as the number (n = 2) or percentage of bites consumed during an eating occasion (n = 6), the percentage of meals consumed (n = 1), total number of grams consumed (n = 3), percentage of foods consumed (after caregiver instruction or self-initiated) (n = 1), total number of foods and fruit and vegetables consumed (n = 1), bites swallowed (n = 2), and mouth clean (n = 3).

Table 3 Overview of food, behaviour, family, and acceptability outcomes for case-based studies
Table 4 Overview of food, behaviour, family, and acceptability outcomes for intervention studies

Interestingly, the terminology used to characterise variables associated with food intake also varied; however, there was some overlap between definitions. As an example, bites consumed typically referred to swallowing a bite of food and leaving the mouth clean, either within a specified timeframe (5 to 30 s from acceptance) or without one (Barnhill et al. 2016; Fu et al. 2015; Gentry 2011; Penrod et al. 2012). Similar definitions were applied to bites swallowed (Najdowski et al. 2003), percentage of foods consumed (Binnendyk and Lucyshyn 2009), and mouth clean (Najdowski et al. 2010), allowing some comparison of study findings. Despite this, the difference in timeframe for measuring food consumption could be a source of variability between studies. In addition, some definitions associated with food intake did not account for key sources of error. As an example, in one study, the percentage of meal consumed was measured by weighing the meal before and at the end of the meal, without consideration of mouth cleaning or expulsion (Tarbox et al., 2010). In contrast, Seiverling et al. (2018) adjusted the weight (grams) of foods consumed for expelled foods.
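
The measurement error described above can be made concrete with a short calculation. The following is a minimal sketch, assuming food weights in grams. The function names and the example weights are hypothetical and are not taken from any of the reviewed studies; the sketch simply contrasts a before/after weighing with an estimate that also subtracts expelled food.

```python
def percent_consumed(served_g, leftover_g):
    """Percentage of the meal consumed from pre- and post-meal weights only."""
    if served_g <= 0:
        raise ValueError("served_g must be positive")
    return 100 * (served_g - leftover_g) / served_g


def percent_consumed_adjusted(served_g, leftover_g, expelled_g):
    """As above, but food expelled from the mouth is not counted as eaten."""
    if served_g <= 0:
        raise ValueError("served_g must be positive")
    return 100 * (served_g - leftover_g - expelled_g) / served_g


# Hypothetical meal: 200 g served, 120 g left on the plate, 30 g expelled.
unadjusted = percent_consumed(200, 120)
adjusted = percent_consumed_adjusted(200, 120, 30)
print(unadjusted, adjusted)
```

In this made-up example the unadjusted method reports 40% of the meal consumed, whereas accounting for expelled food gives 25%, showing how ignoring expulsion can overstate intake.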

The definitions of food acceptance also varied between studies. The majority of authors specified that in order to be recorded as a successful occurrence of food acceptance, the food must be swallowed (within or without a specified timeframe) and the mouth clean on inspection. In contrast, Gale et al. (2011) and Penrod et al. (2010) counted acceptance separately to mouth clean, indicating that swallowing the food was not essential to food acceptance. As such, foods accepted in some studies may overestimate the success of the intervention in relation to food intake. Furthermore, the majority of studies measured changes in food intake and acceptance at specific eating occasions (lunch, dinner, and/or snacks) rather than changes in the adequacy of the overall diet. In contrast, Marshall et al. (2015), Johnson et al. (2015, 2019), and Sharp et al. (2019) reported changes in food intake over a 3-day period, which likely gives a better understanding of the habitual diet quality and variety. Similarly, food intake and acceptance were measured more broadly, using questionnaires (Miyajima et al. 2017; Taylor 2020a; Taylor 2020b; Taylor et al. 2020; Seiverling et al. 2020; Suarez and Bush 2020) and a combination of food frequency questionnaire (FFQ) and 24 h recall in studies conducted by Muldoon and Cosbey (2018) and Cosbey and Muldoon (2017). The FFQ asked caregivers to indicate whether their child ate or rejected foods from a list of 150 foods within the previous 6 months, while the 24 h recall gave an estimation of foods consumed in the previous 24 h.

All 29 case-based studies reported improvements in food-related outcomes; however, the magnitude of the improvements was not always clear without further calculations or inspection of graphs displaying data for each eating occasion. This revealed that food acceptance or intake ranged from 0% (Binnendyk and Lucyshyn 2009; Cosbey and Muldoon 2017; Ewry and Fryling 2015; Fu et al. 2015; Gale et al. 2011; Najdowski et al. 2003; Penrod et al. 2012; Seiverling et al. 2012; Silbaugh et al. 2018; Sira and Fryling 2012) to between 33 and 69% at baseline (Bui et al. 2013) and increased to the maximum of 100% of foods presented at individual eating episodes (Aclan and Taylor 2017; Anderson and McMillan, 2001; Bui et al. 2013; Cosbey and Muldoon 2017; Ewry and Fryling, 2016; Fu et al. 2015; Najdowski et al. 2010 and Najdowski et al., 2012; Penrod et al. 2012; Seiverling et al. 2012; Tarbox et al., 2010).

A single case series (Cosbey and Muldoon 2017) reported a statistically significant improvement in acceptance of non-preferred foods (p < 0.001), which the authors concluded represented a 90% improvement from baseline. This should be interpreted with caution, as acceptance in this study was calculated using a personalised hierarchy (scored from 0 to 10–12) for each child, and only the highest scores were associated with swallowing the food. As such, the improvement may not have translated into a meaningful change in energy or nutrient intake. In contrast, only two of the case-series studies reported improvements that translated into meaningful benefits for nutritional status. This included weight gain (Muldoon and Cosbey 2018; Seiverling et al., 2018) or reduced reliance on oral nutritional supplements to meet the child’s energy and nutrient needs (Seiverling et al. 2018).

The majority of these studies supplemented the description of the numerical data with graphical representations of the food outcomes for each eating occasion throughout the study, providing some indication of the stability and longevity of the improvements in food acceptance or intake between eating episodes and intervention periods (at times including generalisation to other eating environments). On inspection of the graphical representation of the data, it appeared that food intake was subject to daily fluctuations and that variability between eating environments and participants was common. As an example, Penrod et al. (2010) recorded that the target number of bites consumed was achieved after ~ 30 sessions for Patrick, ~ 50 sessions for Jack, and > 100 sessions for Matt. Furthermore, progression from accepting food on the tongue to swallowing the food can be slow (Penrod et al. 2012); this could be important for healthcare professionals to acknowledge in order to manage caregiver expectations.

In a small number of studies, the graphs accompanying the authors’ descriptions were a substitute for some key numeric data, meaning it was difficult to obtain a clear indication of the precise magnitude of improvement in food outcomes, mainly due to the complexity of the graphs. As an example, Barnhill et al. (2016) described the number of bites consumed for the first meal, but the remaining data had to be extracted from the figure, which was challenging due to the scale of the units presented on the y-axis. Furthermore, in a study conducted by Aclan and Taylor (2017), there appeared to be some inconsistency between the authors’ evaluation of the data and the values reported for one of the children in the study. The authors indicated that bite consumption increased for both novel and mastered foods; however, the average consumption at baseline of 87% (80–100%) and 95% (90–100%) was not dissimilar to post-feedback consumption of 85% (50–100%) and 83% (70–90%), for novel and mastered foods, respectively.

The remaining studies (n = 6) used an experimental design that allowed statistical analysis, employing a single-arm (pre/post) intervention study (n = 3), a parallel intervention with waiting list control (n = 1), a randomised controlled trial with PEP control (n = 1), or a comparison of two treatments (operant conditioning or systematic desensitisation) (n = 1). Two-thirds of these studies reported statistically significant improvements in food outcomes; however, direct comparison between studies was not possible (Marshall et al. 2015; Miyajima et al. 2017; Sharp et al. 2019; Suarez and Bush 2020). Both Marshall et al. (2015) and Sharp et al. (2019) reported increases in the volume of food consumed (measured by 3-day food diaries), which equated to an increase in total energy intake of 9.7% (2.1–17.4%; p = 0.01) and 30.76 ± 6.75 g per meal (p = 0.001), respectively. Furthermore, Marshall et al. (2015) observed increases in fruit and vegetable counts (mean difference 2.3 (0.4–4.1), p = 0.02) and protein counts (mean difference 4.7 (3.3–6.1), p < 0.01). In contrast, Miyajima et al. (2017) and Suarez and Bush (2020) noted improvements in the number of foods the child would eat (mean increase in eatable foods: 2.56, p < 0.001, and 10.5, p = 0.018, respectively). In the former, there was also an increase in the caregiver’s subjective view of dietary imbalance (mean difference 13.66, p < 0.001); however, it is not clear if this translated into clinically meaningful improvements in dietary intake or the child’s nutritional status.

Behaviour Outcomes

Twenty-four studies measured and reported changes to mealtime behaviours, which included observations of inappropriate mealtime behaviour (IMB) (such as self-injurious behaviour, gagging/vomiting, vocal protests, facial grimace, and throwing utensils), self-injurious behaviour, quantifying meal duration, or reported mealtime behaviours using questionnaires (Tables 3 and 4). A further two studies measured mealtime behaviours but did not include the findings in the analysis (Bui et al. 2013; Najdowski et al. 2010), and three studies did not measure mealtime behaviour formally but reported anecdotal accounts of behaviour (Tanner and Andreone 2015; Seiverling et al. 2012; Valdimarsdottir et al. 2010). This inconsistency in documenting behavioural outcomes could suggest reporting bias.

In total, 23 studies reported improvements in mealtime behaviours, related to self-injurious behaviour, negative vocalisation, IMB, and, to a lesser extent, reductions in meal duration (Tarbox et al. 2010) and the timeframe for snack acceptance (Binnendyk and Lucyshyn 2009). Improvements reached statistical significance for four out of five studies that were able to make statistical comparisons (p < 0.05). In contrast, Sharp et al. (2014) and Johnson et al. (2019) reported no change in mealtime behaviours or global impressions between intervention and waiting list controls, respectively (p > 0.05). Furthermore, Cosbey and Muldoon (2017) reported an increase in IMB for one of the three children in their study, although the authors attributed this increase to the caregiver’s greater awareness of inappropriate behaviours at follow-up. Anecdotal accounts suggested improved behaviour in the form of no longer needing support at mealtimes (Valdimarsdottir et al. 2010), ease of transition to the chair for mealtimes (Seiverling et al. 2018), and mothers perceiving mealtime behaviour as good or excellent (Seiverling et al. 2012).

Family Outcomes

Only seven studies reported outcomes related to the parents or wider family (Tables 3 and 4). Family quality of life measured using versions of a Family Quality of Life Scale (Hoffman et al. 2006 and Park et al., 2003) increased (Binnendyk and Lucyshyn 2009) or remained high (Cosbey and Muldoon 2017) from baseline to conclusion of two studies. In addition, Miyajima et al. (2017) reported that difficulty experienced by caregivers reduced (p < 0.001), while self-efficacy increased (p = 0.018). In contrast, parental stress, caregiver strain, and sense of competence did not change (p = 0.17 to p = 0.25) for 50% of studies that reported these variables (Marshall et al. 2015; Johnson et al. 2019).

Acceptability Outcomes

The acceptability of the caregiver-led interventions was formally assessed in 15 studies (Tables 3 and 4) and anecdotally reported in a further three studies (Gentry and Luiselli 2008; Tarbox et al. 2010). Formal assessment tools measured goodness of fit (n = 3), acceptability (n = 3), social validity (n = 4), and parental/caregiver satisfaction (n = 9); however, there was some ambiguity and inconsistency in how the tools were described or used to measure these concepts. Firstly, Cosbey and Muldoon (2017), Muldoon and Cosbey (2018), and Binnendyk and Lucyshyn (2009) all employed a goodness of fit survey (adapted from Albin et al. 1996); however, Cosbey and Muldoon reported the results in terms of social validity. In contrast, social validity was measured using a range of different tools, each using a Likert scale to rate between six and 10 items, the origins of which were not reported (Binnendyk and Lucyshyn 2009; Najdowski et al. 2010; Clark et al. 2020).

Secondly, acceptability was measured independently using what appear to be different tools (3–16 item scale), one of which was reported to be similar to the Intervention Rating Profile, developed by Martens et al. (1985). In contrast, acceptability formed part of the social validity and caregiver satisfaction tools developed for two studies (Sharp et al. 2014, Sharp et al., 2019). Parental/caregiver satisfaction was also assessed using either a Behaviour Rating Scale (developed by Elliott and Treuting, 1991) or a parental/caregiver satisfaction questionnaire (developed by Hoch et al. 1994 and the Research Units on Pediatric Psychopharmacology Autism Network, 2007).

While these tools could be criticised for providing an arbitrary numeric value that may have different meanings to each caregiver, there appears to be a general consensus that caregivers were satisfied with the intervention they undertook and that the interventions were suitable to the environment, along with the family’s needs and goals. In addition, analysis of individual items on the questionnaire can provide insight into the aspects of the intervention that are valued most. Individual studies have indicated that videos are instrumental to change (Clark et al. 2020), and the most helpful components of caregiver-led interventions were modelling (Seiverling et al. 2012) and behavioural principles and prevention (Johnson et al. 2015, Johnson et al., 2019).

Discussion

The purpose of this systematic review was to evaluate the effectiveness of caregiver-led interventions for FS in autistic children with respect to the following outcomes: (i) the child’s food intake and mealtime behaviours, (ii) family outcomes, and (iii) acceptability of the caregiver-led interventions. Effectiveness was considered within the context of the health needs of the child, family environment, and financial limitations of health and social care.

In relation to the outcome measures, the study design for all but one of the case studies and case series (Cosbey and Muldoon 2017) prevented statistical analysis; therefore, the reported improvements need to be interpreted with caution. As noted in a previous systematic review of interventions designed to improve feeding behaviours in ASD (Ledford et al. 2018), procedural variations are not conducive to calculating the magnitude of change for outcome measures. This was particularly evident in the current review with regard to differences in definitions for the same outcome. As an example, acceptance of a target food did not always include mouth clean (Gale et al. 2011; Penrod et al. 2010); therefore, viewing all ‘acceptance’ outcomes together could result in overestimation of the success of caregiver-led intervention studies.

In light of the procedural variations, the earlier review conducted by Ledford et al. (2018) used functional relation analysis to determine the effectiveness of caregiver-led interventions. Functional relation analysis is a method of assessing the effect of an independent variable on a dependent variable for single-case designs using visual analysis. This analysis typically focuses on trends, variability in each phase, consistency between similar phases, overlap between different phases, and comparisons of projected and observed data (Manolov et al. 2014). The main outcome of the functional relation analysis conducted by Ledford et al. (2018) was that 75% of the studies (with sufficient demonstrations or data points) had a functional relation for food acceptance, but only 45% of studies had a functional relation for problematic behaviour. Interestingly, only 11% of the studies they reviewed included functional relations as part of their analysis process; therefore, practitioners need to be cautious when interpreting the findings of individual studies. It is also important to note that visual analysis techniques have been criticised due to variability between analysts (Danov and Symons 2008), which could cast doubt on the reliability of the review findings as the functional relation analysis was performed by a single researcher (Ledford et al. 2018). Consequently, future research with single-case designs should seek to include functional relation analysis with two or more researchers to enhance the reliability of findings.

While the functional relation analysis conducted by Ledford et al. (2018) provided a sense of the overall proportion of studies that were successful at improving food and behaviour outcomes, it did not consider the magnitude of improvement, so it is not clear how meaningful the findings are for the child or family. This is concerning, as a review of previous studies has indicated that 25% of autistic children with severe FS omit two food groups from their typical diet, with a high proportion (67.1%) omitting vegetables, placing them at increased risk of nutritional inadequacies, including vitamin E and fibre (Sharp et al. 2018). It is also reported that nutritional inadequacies due to FS in autistic children can lead to iron deficiency anaemia, constipation, and in extreme cases to loss of eyesight (Uyanik et al. 2006).

Food Outcomes

In relation to the food outcomes, most studies focused on reporting outcomes of discrete eating episodes rather than the overall diet and therefore provide little insight into the nutritional status of the child. As an example, Cosbey and Muldoon (2017) reported a statistically significant improvement (92–93%, p < 0.001) in acceptance of less-preferred foods compared to baseline, using Tau-U as a measure of non-overlap between phases. Furthermore, visual analysis and the percentage of non-overlapping data points (75–100%) between baseline and both caregiver-intervention and maintenance phases were considered to be evidence of effectiveness. This should be interpreted with caution, as food acceptance was measured via an individualised hierarchy ranging from 0 to between 10 and 12, depending on the child. Using the hierarchy designed for one child as an example (Blake), a score of between 66.7 and 91.7% reflected that the child had expelled the less-preferred food. Visual inspection of the data for this child indicated that this score was exceeded on only one occasion throughout the study. Furthermore, a score of 100% was required to demonstrate that each child (n = 3) had swallowed a typical bite of food; however, only one child (Dominic) consistently achieved 100%, and only within the maintenance phase of the study. Therefore, despite the reported success of this caregiver-led intervention, at least two of the three participants were unlikely to see any meaningful improvement in their nutritional status.
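The distinction drawn above, between a high non-overlap statistic and a clinically meaningful endpoint, can be illustrated with a minimal sketch. It computes the percentage of non-overlapping data (PND) in its conventional form for an outcome expected to increase: the share of intervention-phase points that exceed the highest baseline point. All data values are hypothetical and are not taken from Cosbey and Muldoon (2017); the helper names are illustrative, and the mapping of 66.7% and 91.7% to steps 8 and 11 of a 12-step hierarchy is an arithmetic inference rather than a detail reported in the source.

```python
def percentage_non_overlapping(baseline, intervention):
    """PND for an outcome expected to increase: percentage of
    intervention-phase points above the highest baseline point."""
    if not baseline or not intervention:
        raise ValueError("both phases need at least one data point")
    ceiling = max(baseline)
    exceeding = sum(1 for score in intervention if score > ceiling)
    return 100.0 * exceeding / len(intervention)


def hierarchy_percent(step, top_step):
    """Express a step on a 0..top_step food acceptance hierarchy as a percentage."""
    return 100.0 * step / top_step


# Hypothetical session scores (% of hierarchy), NOT data from any reviewed study.
baseline_scores = [25.0, 33.3, 25.0, 41.7]
intervention_scores = [41.7, 50.0, 58.3, 66.7, 58.3, 75.0, 66.7, 75.0]

pnd = percentage_non_overlapping(baseline_scores, intervention_scores)
swallowed_typical_bite = any(score >= 100.0 for score in intervention_scores)

# Assumed 12-step hierarchy: steps 8 and 11 bound the "expelled food" range.
expel_range = (round(hierarchy_percent(8, 12), 1), round(hierarchy_percent(11, 12), 1))

print(f"PND: {pnd:.1f}%")
print(f"Any session reaching 100% (typical bite swallowed): {swallowed_typical_bite}")
print(f"Assumed expel range on a 12-step hierarchy: {expel_range}")
```

In this hypothetical series the PND is high even though no session reaches the 100% score that would indicate a swallowed bite, which is the pattern that warrants caution when non-overlap statistics are read as evidence of improved intake.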

In contrast, two of the 29 case-based studies provided some indication of an improvement in nutritional status. In these studies, weight gain (5 to 8 lb) was reported for two out of five participants, which is likely to be meaningful as both children had previously experienced faltering growth (Muldoon and Cosbey 2018; Seiverling et al. 2018). This appeared to be achieved via improvements in food intake, as there was a reduced reliance on chocolate milk (Muldoon and Cosbey 2018) and oral nutritional supplements (Seiverling et al. 2018) to meet the children’s energy needs. Despite these improvements, it is not clear whether the children in these two studies were at risk of malnutrition in the form of micronutrient deficiencies.

Inspection of specific studies with detailed nutritional data suggests micronutrient deficiency is probable even after successful caregiver-led interventions. As an example, Muldoon and Cosbey (2018) used a single 24 h recall describing the types of foods consumed, at key stages of the study (pre/post), albeit without quantifying the volume of solid foods consumed. This revealed marginal improvements in the variety of foods consumed, especially for one child (Juan). Before the intervention, this child consumed chocolate milk (reduced from 48 to 9 oz post intervention), yoghurts, and rice. These foods were supplemented with applesauce, beans, spaghetti, and oral nutritional supplement (8 oz prescribed by a medical professional) after the intervention. If this reflects a typical day, it is likely this child would be at risk of micronutrient deficiencies due to the absence of vegetables, and low intake of fibre and iron-rich foods.

In addition, Marshall et al. (2015) reported an increase in fruit, vegetables, and protein using counts, but they did not report how this was done or how this related to reference nutrient intakes for children. Despite this, they did report an increase in energy intake, from 92.0 ± 20.6% of energy requirements to 101.7 ± 24.2%. While the increased energy intake is close to the average estimated energy requirements for these children, there was marked variability between participants, which could have consequences for under/over nutrition for some children. Nonetheless, daily fluctuations in energy intake have been reported to be between 16 and 34%, during a dietitian study completed over a 17-day period (Champagne et al. 2013), and therefore energy intake needs to be interpreted alongside the child’s growth trends.

Sharp et al. (2019) reported an increase in the quantity of food consumed in grams (30.76 g, range 13.4 to 48.12 g, p = 0.001) without any indication of the composition of this additional intake. As such, it is difficult to ascertain if this increase is meaningful for the child. As an example, if this additional intake was from a banana, it would result in an increase in energy of ~26 kcal, which would likely be insufficient to improve the nutritional status of a child who is not meeting their predicted growth trajectory. In addition, it would contribute little to iron intake, leaving individuals with low iron stores susceptible to iron deficiency anaemia. As such, it is recommended that future caregiver-led interventions for children with ASD and FS should include a nutritional analysis and growth measurements alongside food outcomes associated with individual eating episodes.
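The banana example above is a simple conversion from grams to energy, sketched below. The energy density of 85 kcal per 100 g is an assumed, illustrative value chosen because it is consistent with the ~26 kcal quoted above; it is not reported by Sharp et al. (2019), and food composition tables may give slightly different figures. The function name is hypothetical.

```python
def added_energy_kcal(extra_grams, kcal_per_100g):
    """Energy contributed by an extra quantity of a single food."""
    if extra_grams < 0 or kcal_per_100g < 0:
        raise ValueError("quantities must be non-negative")
    return extra_grams * kcal_per_100g / 100.0


# Assumption: illustrative energy density for banana, not a value from the source.
ASSUMED_BANANA_KCAL_PER_100G = 85.0

# Mean increase per meal and the reported range (Sharp et al. 2019).
mean_g, low_g, high_g = 30.76, 13.4, 48.12

for label, grams in (("mean", mean_g), ("lower", low_g), ("upper", high_g)):
    kcal = added_energy_kcal(grams, ASSUMED_BANANA_KCAL_PER_100G)
    print(f"{label}: {grams:.2f} g of banana adds about {kcal:.0f} kcal")
```

Swapping in the energy density of a different food shows how strongly the interpretation of a gram-based outcome depends on the composition of the additional intake, which the study did not report.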

Sharp et al. (2014) and Johnson et al. (2015) failed to detect significant improvements in preferred foods or nutrient intake, respectively. In the latter study, nutritional analysis of a 3-day food diary indicated that more than three quarters of children had inadequate intakes of fibre (93%), vitamin D (93%), vitamin K (93%), and potassium (100%) at baseline. This pattern continued after the intervention, with the addition of manganese (75%). Furthermore, there were no significant changes in the percentage of children below recommended intakes for vitamin A (36 to 42%), calcium (64 to 67%) and iron (50 to 42%), which are essential for vision (Clifford et al. 2013), bone mineral content (Lamas et al. 2019) and iron status (Wang 2016), respectively.

Despite the enhanced reporting of nutritional intake by Johnson et al. (2015), the clarity of the dietary analysis is lacking. Firstly, insufficient details about the analysis were provided in the methods, thus the reliability of the data generated is unclear. Secondly, the authors failed to report the magnitude of any nutrient deficits; therefore, it is important to acknowledge that these deficits may not translate into suboptimal nutritional status. As such, research (and healthcare services) for FS may need to include a clinical assessment of children presenting with symptoms of nutritional deficiency or dietary intakes below dietary reference values, along with input from a dietitian to caregiver-led interventions. Furthermore, it is clear from this review that improvements in FS as a result of feeding therapy are slow and variable between participants; therefore, detection of meaningful change likely requires long-term monitoring throughout the child's growth period (Ranjan & Nasser, 2015). As a guide, the just right challenge feeding protocol has indicated that when using a food interaction hierarchy, a single point improvement can take 5–10 weeks (Suarez 2022). This is likely to be important to communicate with caregivers, to manage expectations about treatment gains.

Behaviour Outcomes 

Similar to food outcomes, methods used to measure changes in mealtime behaviours were variable, and statistical analysis was not possible in the majority of studies, making it difficult to form strong conclusions about the efficacy of the caregiver-led interventions. Behavioural outcomes reported by the majority of case studies/series were quantified against study-specific definitions of inappropriate behaviour (typically frequency of occurrence via direct observations of mealtimes) or using standardised tools. Regardless of the measurement, all studies reported improvements in behaviour following intervention; however, a further two studies that collected behavioural outcomes failed to report their findings post intervention (Bui et al. 2013; Najdowski et al. 2010). This suggests that reporting bias may have prevented negative outcomes from being included in these two publications. Furthermore, some studies were unable to report interobserver analysis for IMB (Sira and Mitch 2012) or described low interobserver agreement in relation to specific behaviours (i.e. gagging) (Silbaugh et al. 2018), casting doubt on the reliability of key findings.

In contrast, the experimental studies used tools that had been designed specifically for autistic children, such as the Brief Autism Mealtime Behaviour Inventory (BAMBI), which is reported to have good internal consistency (Cronbach alpha = 0.88) and test–retest reliability (r = 0.78) (Lukens and Linscheid 2008). Despite this, the scores generated provide limited information about the severity of inappropriate mealtime behaviours. As an example, mealtime behaviours within the BAMBI tool are scored from one to five in relation to the frequency of occurrences (1 = never to 5 = always), rather than the impact these behaviours have on the child and family or whether the behaviour(s) is transient or persistent during each mealtime. Therefore, improvements in mealtime behaviours need to be interpreted with caution, as items such as ‘self-injurious behaviour’, even if experienced less frequently, could still put the child at risk of harm. Equally, the item ‘Closes mouth tightly when food is presented’ may have little impact on the child or mealtime experience if it presents towards the end of the meal, even if it occurs during every meal. Future research may benefit from displaying changes in individual items to aid interpretation of the total behaviour score.
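For readers less familiar with the internal consistency statistic quoted above, the following minimal sketch shows how Cronbach's alpha is conventionally calculated from item-level ratings, using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The ratings are hypothetical 1 to 5 frequency scores on three made-up items, not BAMBI data, and the function name is illustrative.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a list of respondents, each a list of item ratings.

    Uses sample variances throughout; requires at least two items,
    at least two respondents, and non-zero variance in total scores.
    """
    n_items = len(ratings[0])
    if n_items < 2 or len(ratings) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in ratings):
        raise ValueError("every respondent must rate every item")

    item_columns = list(zip(*ratings))                   # one tuple per item
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in ratings])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return (n_items / (n_items - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical ratings (1 = never ... 5 = always) from five caregivers on three items.
example_ratings = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 4],
]
print(f"alpha = {cronbach_alpha(example_ratings):.2f}")
```

Because alpha summarises only how consistently items vary together, a high value says nothing about the severity or timing of any single behaviour, which is why item-level reporting remains informative alongside the total score.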

To facilitate triage and clinical decisions, cut-off scores of 54 (BAMBI-R), 45, and 34 (four-factor BAMBI) have been proposed to be indicative of substantial feeding problems, significant behaviour difficulties, and problematic feeders, respectively (Johnson et al. 2019; Lukens 2005; DeMand et al. 2015). Despite this, not all studies report the total behavioural scores (Cosbey and Muldoon 2017; Muldoon and Cosbey 2018), making direct comparisons between studies near impossible. Observations of the behaviour data (BAMBI-R) reported by Johnson et al. (2019) indicated that, despite the significant improvement in feeding behaviour, scores were still indicative of substantial feeding problems after 10 weeks. Furthermore, at 20 weeks, the average score fell only slightly below the threshold for substantial feeding problems, and therefore improvements may not be clinically meaningful. This is concerning as inappropriate mealtime behaviour has been reported to prevent families from sharing mealtimes (Suarez et al. 2014a) and enjoying holidays and family gatherings (Rogers et al. 2012).

Family Outcomes

Although the aforementioned negative impact of inappropriate mealtime behaviours is common in the family unit, relatively few studies (19.4%) have measured changes in family outcomes post caregiver-led interventions for FS. In addition, there is considerable variability in the family outcomes being measured, including quality of life, parental stress, self-efficacy, and parental/caregiver strain. The most commonly reported outcome was parental/caregiver stress, measured in all instances using the Parenting Stress Index-Short Form (Abidin 1995) (n = 4); however, as 50% of these studies demonstrated no improvement (Johnson et al. 2019; Marshall et al. 2015), there is insufficient evidence to draw conclusions about the efficacy of these interventions for family outcomes.

Despite this, the change in parental/caregiver stress reported by Sharp et al. (2014) and Johnson et al. (2015) reflects a change from clinically significant (89.3 ± 7.8 and 89.3 ± 24.7) to borderline clinical severity (81.0 ± 14.1) and below (79.6 ± 18.4), respectively (p < 0.05). This is particularly important as reductions in stress have been associated with improved parent/caregiver and child interactions (Brookman-Frazee and Koegel 2004), which is likely to be meaningful to the family. It is not clear why stress was reduced in the absence of improved food and behaviour outcomes in the Sharp et al. (2014) study, but previous research has suggested that this could be due to caregiver involvement in treatment (Brookman-Frazee and Koegel 2004). This possibly reflects the perception held by parents that a diagnosis of autism for the child is also a diagnosis for the wider family (Gentles et al. 2018). As such, future interventions should consider the interdependent autism unit, including the wider family.

Acceptability Outcomes

Similar to the preceding outcome measures, acceptability of caregiver-led interventions has been measured using a variety of techniques, adding to the complexity of this literature review; however, it has never been measured from the child’s perspective. This may, in part, reflect the challenges associated with assessing acceptability in autistic children, given the variability in communication skills reported in some of the papers. Nevertheless, in all instances, the measures were superficial in nature (i.e. abstract numerical score or anecdotal feedback from caregivers) and gave little indication of the components of the interventions that caregivers found most useful. In the limited studies that presented parental/caregiver satisfaction for specific techniques implemented in the intervention, modelling (Seiverling et al. 2012) and behavioural principles and prevention (Johnson et al. 2015; Johnson et al. 2019) were considered most valuable.

Outside of this review, Vazquez et al. (2019) reported that differential reinforcement of alternative behaviour was the most acceptable strategy and escape extinction (EE) was least preferred. The latter may be unsurprising, as EE can involve re-presenting expelled food, which parents may not be comfortable with (Anderson and MacMillan 2001). Based on the FAME criteria (i.e. appropriateness), the acceptability of the caregiver-led interventions reported by Seiverling et al. (2018) and Silbaugh et al. (2018) is likely to be low due to the ethically questionable strategies implemented to influence the child to ingest a non-preferred food. Despite the limited insight into the parent and caregiver-preferred intervention components, these findings could be used to design future parent-led interventions. This would include the preferred modelling and behaviour techniques and exclude EE tactics that do not fit with the family’s values. Alignment between interventions and family values likely impacts caregiver engagement with the intervention and therefore the potential for successful improvements in outcome measures.

Interestingly, Sharp et al. (2014) reported high parental/caregiver satisfaction (4.8 out of 5) for acceptability despite no change in food preferences or behavioural outcomes. In this case, acceptability could reflect reduced stress experienced by parents/caregivers post intervention. Possibly more surprising is that despite reporting no change in stress, caregiver strain, global impressions, and relatively small improvements in mealtime behaviour, parents/caregivers in the study by Johnson et al. (2019) reported high levels of satisfaction (94%). This could suggest responses to the satisfaction questionnaire were due to social desirability, as it is recognised that self-reported measures are particularly susceptible to this form of bias, along with expectancy effects following treatment (Karst and Van Hecke, 2012).

A qualitative approach to evaluating acceptability may provide a better opportunity to explore the disconnect between child or family outcomes and parental/caregiver acceptability of an intervention, which is not possible with standardised questionnaires and rating scales. Furthermore, supplementing numeric data with narrative data may allow researchers to explore the acceptability of key aspects of the intervention and any challenges faced by parents/caregivers in the implementation phase. This may be especially true for pilot studies (Johnson et al. 2015), allowing researchers to refine the intervention prior to roll out on a wider scale. It may also be advisable to engage parents/caregivers in the development of interventions to ensure their needs are being met. This is especially important when supporting caregivers to translate skills learned from professionals into real-life situations that are specific to the family, with parents regarded as experts on the unique needs of their child (Gentles et al. 2018).

Despite the breadth of this literature review, this study did not include unpublished work or conference presentations. In addition, the variability in research design and reporting made direct comparisons of studies challenging, and therefore it is not possible to conclude with any certainty what type of caregiver-led intervention provides the most meaningful improvements in the desired outcomes. Furthermore, only four studies used a randomised controlled trial design, which is considered the gold standard for assessing cause and effect (Hariton and Locascio 2018). While this may also limit the conclusions that can be drawn, real-world interventions and case studies/series, especially those that report intervention fidelity, provide some insight into the feasibility of caregiver interventions within the natural environment.

Conclusion

Caregiver-led interventions show promise for improving food acceptance and mealtime behaviour in autistic children with food selectivity and, to a lesser extent, quality of life and parental/caregiver stress. Based on caregiver perceptions of the most beneficial components of such interventions, future studies should incorporate modelling and behaviour techniques that align with family values. This applies both when the caregiver is the initial interventionist and when they are involved in generalisation to the home environment, after a period of intervention with clinicians or researchers. Despite this, improvements in food outcomes may not be sufficient to mitigate potential inadequacies in nutritional status that are often associated with FS in this population group. This is important because, despite their general acceptability, the majority of interventions are time and resource intensive, for what appear to be marginal gains in food intake. Involving the caregiver earlier in an intervention could reduce the healthcare resources required, allowing more children to be seen in a timely manner. Future research should therefore include health economic outcomes alongside assessment of nutritional status to better assess the efficacy and viability of such interventions in the current context of diet-related diseases, escalating healthcare costs and long waiting lists.