Background

The role of the home environment in shaping a child’s diet and growth is an area of increasing interest, particularly among those working in child obesity prevention and treatment. The home environment has significant influence on child socialization [1], including adoption of eating behaviors [2]. This is particularly true for younger children (2-12 years old) given their limited autonomy and dependence on adult caretakers, who influence dietary intake and eating behaviors through the foods they provide as well as the social environment they create [3].

Parent food practices and feeding style represent a large component of parent behaviors that influence child diet and/or weight. Parent food practices are the specific techniques or behaviors used by parents to influence children’s food intake [4]. Traditionally, food practice constructs have included pressure to eat, restriction, monitoring of the child’s food intake, or the use of rewards for food consumption. More recently, constructs have been expanded to include parent food modeling, family mealtime environments, food preparation practices, involvement of children in food planning and preparation, and control allowed to children over when, where, what and how much they eat. While food practices are specific behaviors or actions, they are often used to categorize parent feeding style [5]. A parent’s feeding style reflects the emotional climate in which these practices occur, or the balance between demanding versus responsive feeding practices [6].

Reviews of family environmental correlates have found fairly consistent associations between child fruit and vegetable consumption and parent food practices such as dietary modeling, food rules, and encouragement [7-9]. However, these reviews have also highlighted gaps in the literature with regard to measurement. How constructs are defined and measured is highly variable across studies, making it difficult to draw clear conclusions. Additionally, studies tend to assess only a limited number of constructs, thus hampering efforts to understand the relative importance of factors and how they might interact. While there have been two recent reviews on measurement of home food availability and accessibility [10, 11], there has not been a similar review focused on measurement of the parent behaviors that influence child diet.

This paper addresses this gap in the literature by presenting results from a comprehensive, systematic review designed to identify and evaluate instruments or specific scales assessing parent food practices. It captures the full array of parental food practices thought to shape the sociocultural food environment of the home in an attempt to bring some order to a field of measurement that has become increasingly complex and confusing.

Methods

This review was conducted in two phases (depicted in Figure 1), beginning with an extensive systematic review of the literature to identify factors within the home environment hypothesized to relate to children’s diet and/or eating behaviors. During this first phase, both social and physical characteristics of the home environment and any evidence of their relationship to child diet, eating behaviors, or weight were explored. This initial review was conducted as part of a larger study to identify potential constructs and items for consideration in the development of a comprehensive measure of the home food environment (known as the Home Self-administered Tool for Environmental assessment of Activity and Diet, or HomeSTEAD, R21CA134986). The search terms used and inclusion and exclusion criteria employed reflect this goal. During the second phase, results of the initial review were used to identify articles describing development of instruments assessing parent food practices.

Figure 1. Overview of two-phased literature review.

The initial systematic literature review was conducted in October of 2009 using four search engines: PubMed/Medline, PsycINFO, Web of Knowledge (ISI), and ERIC. Search terms were identified to capture the following topic areas: (1) home environment or parent behaviors and (2) feeding practices, dietary habits, or eating behaviors. (A detailed description of search terms is available in Additional file 1.) No limits were placed on date of publication, but articles had to be in English.

Titles and abstracts were reviewed to narrow results. Percent agreement between reviewers (AV, RT, MB), based on a 5% sample of search results, ranged from 93% to 95%. Disagreements were discussed by all authors, discrepancies were resolved via consensus, and inclusion/exclusion criteria were refined. Following completion of the title and abstract review, full articles were retrieved and reviewed (by either AV or RT) to determine whether or not the paper met the full inclusion/exclusion criteria.
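As a minimal sketch of how percent agreement between two reviewers can be computed, the example below compares include/exclude decisions item by item. The function name and the screening decisions are hypothetical and are not data from this review; with three reviewers, the same calculation would be repeated for each pair.

```python
def percent_agreement(decisions_a, decisions_b):
    """Percentage of items on which two reviewers made the same decision."""
    if len(decisions_a) != len(decisions_b):
        raise ValueError("Both reviewers must rate the same items")
    if not decisions_a:
        raise ValueError("At least one rated item is required")
    matches = sum(a == b for a, b in zip(decisions_a, decisions_b))
    return 100.0 * matches / len(decisions_a)


# Hypothetical decisions for 20 screened titles (illustrative only).
reviewer_1 = ["exclude"] * 17 + ["include"] * 3
reviewer_2 = ["exclude"] * 16 + ["include"] * 4

# The reviewers disagree on exactly one title (the 17th), so 19 of 20 match.
agreement = percent_agreement(reviewer_1, reviewer_2)
print(f"{agreement:.1f}% agreement")
```

Percent agreement does not correct for agreement expected by chance, which is one reason it is usually reported alongside a description of how disagreements were resolved.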

Inclusion and exclusion criteria

During the first phase, inclusion criteria specified that the methods section had to describe the measurement of physical and/or socio-cultural characteristics of the home environment related to diet and/or eating behaviors in children aged 2-18 years. A content map (Figure 2) based on the ANGELO framework [12] guided the review and ensured inclusion of all relevant topics. The ANGELO framework identifies four types of environments – physical, socio-cultural, political, and economic – which were then conceptualized and defined specifically in terms of factors within the home environment. The economic environment is often captured by assessing household income, parent occupation, parent education, and similar demographic variables. Identifying demographic surveys was not the focus of this review; therefore, the economic environment was viewed as outside the scope. Additional constructs outside the scope of this review included: individual level determinants of behavior (e.g., knowledge, attitudes, self-efficacy, barriers, food security, acculturation), child eating behaviors (e.g., picky eating), parent or child dietary intake, food expenditures, time use, body image, and factors not specific to the home (e.g., restaurant meals, purchasing behaviors). While these factors may influence the home food environment, they are not a direct measure of that environment.

Figure 2. Content map used to guide review.

Articles were also excluded at this stage if they were not peer reviewed (e.g., editorials and dissertations), if they would not aid in the identification of close-ended items (literature reviews, qualitative studies, case reports), or if they referenced use of an existing measure and offered no further development. In cases where existing and relevant instruments were reused, reviewers verified that the original measure development article had been retained in the original search. Articles could also be excluded if the original measure could not reasonably be obtained (e.g., surveys administered in another language with no translation of items provided within the article, articles published before 1995 that provided insufficient detail to recreate items).

In the second phase, additional selection criteria were added to narrow results to articles that described development and/or evaluation of an instrument assessing parent food practices in families with 2-12 year old children. Parent food practices was defined broadly, based on the original content map, to include constructs related to the home’s social, cultural and political environment around food. Measures of the home’s physical environment (food preparation space, food consumption areas, food availability and accessibility) were eliminated, but have been described elsewhere [10, 11].

Articles had to contain details regarding instrument development and/or evaluation. This could include steps such as developing items based on formative data, using cognitive interviews to assess item clarity, engaging experts to evaluate content coverage, and at least one method of reliability or validity testing (e.g., test-retest reliability, internal validity, construct validity). Measures also had to include at least one relevant scale or theory-generated category of items.

Data extraction and quality assessment

A data extraction form and quality assessment protocol were developed to facilitate the full appraisal of each measure. While quality assessment protocols of patient-reported outcomes do exist [13-16], review and pilot testing of these tools with papers from this review showed that modifications would be required. Therefore, a new protocol was developed based on common elements from existing protocols and DeVellis’ scale development standards [17]. This process was fully piloted by all authors to ensure accuracy of reporting. Percent agreement between reviewers across items averaged 83.4% (range: 49.1-100%). Additionally, any differences in scoring were discussed until agreement on a final score could be reached. Data extracted included:

  • General descriptive characteristics of the measurement tool: reference, name of measure, purpose, total number of items

  • Details about sample used for development: sample size, age range and gender (of children), race/ethnicity, SES, country, completed by parent/child/both, subject burden, translation and/or testing in additional populations

  • Content: theory or conceptual model employed, list of scales/categories assessed, number of items in each scale/category

Quality was evaluated for the following six key elements:

  • Conceptualization of instrument purpose: Instruments were scored 1-4 depending on how clearly the paper conceptualized the purpose of the tool and defined constructs intended to be measured (4 = strongly agree, concepts are named and clearly defined, 3 = agree, concepts are named and generally described, 2 = disagree, concepts only named, but not defined, and 1 = strongly disagree, concepts are not clearly named or defined). Additionally, reviewers captured whether or not a theory or conceptual model helped inform this conceptualization (yes/no).

  • Development of item pool: Instruments were scored on how systematic the developers’ process was for developing a pool of potential items, taking into consideration the use of multiple methods (e.g., pulling items from existing instruments, consulting expert opinion, extrapolating from qualitative data, and extracting from the literature) and an iterative process. Scores ranged from 1-3 where 3 = fully systematic processes were used, 2 = systematic processes were weak or only used for pieces (but not the whole instrument), and 1 = no systematic process used/reported.

  • Refinement of item pool: Reviewers extracted information about the methods employed to refine the item pool (e.g., expert review, pilot testing or cognitive interviews with draft instrument, assessment of item performance, and use of exploratory factor analysis). When applicable and available, factor loadings were recorded so that they could be compared against generally recognized statistical standards to retain only items with factor loadings greater than 0.4 and to address any items with cross-loadings greater than 0.32 [18].

  • Reliability: To capture evidence of reliability, reviewers extracted information regarding the evaluation of test-retest, inter-rater, and/or internal consistency testing. Results of test-retest and inter-rater reliability testing, which generally present correlation analysis, were extracted so that results could be compared against generally accepted standards where 0-0.2 indicates poor agreement, 0.3-0.4 indicates fair agreement, 0.5-0.6 indicates moderate agreement, 0.7-0.8 indicates strong agreement, and >0.8 indicates almost perfect agreement [19]. Results of internal consistency, which generally report Cronbach’s alpha, were extracted so that results could be compared against generally accepted standards where 0.6-0.7 is questionable (but often considered sufficient in exploratory analyses), 0.7-0.8 is acceptable, 0.8-0.9 is good, and ≥0.9 is excellent [20].

  • Validity: Reviewers extracted information about three types of validity: construct validity, structural validity, and criterion validity. Construct validity was defined as evidence that the new scale(s) “behaves the way that the construct it purports to measure should behave with regard to established measures of other constructs” (DeVellis, pg. 46). This can include evidence of associations/correlations between the new scale(s) and established measures of general parenting practices, child dietary intake or eating habits, and/or child weight. Evaluation of construct validity could employ simple correlations or t-tests, or more complex methods like regression models. While correlations ≥0.3 are considered acceptable, the significance of results must be interpreted in light of the underlying theory. Evidence of structural validity, specifically results from confirmatory factor analysis (CFA), was extracted so that results could be compared against generally accepted cutoffs for “acceptable” fit indices: maximum likelihood-based Tucker-Lewis Index, Bollen’s Delta, Comparative Fit Index, Relative Noncentrality Index, and Gamma Hat ≥0.95, McDonald’s Centrality Index ≥0.90, Standardized Root Mean Squared Residual ≤0.08, and Root Mean Squared Error of Approximation ≤0.06 [21]. Evidence of criterion validity was also extracted, generally assessed by correlational analysis between the new scale and a gold standard. The criterion used for the gold standard in this review had to be an objective assessment of food parenting practices (e.g., observation protocols completed by trained research staff).

  • Responsiveness: Evidence of responsiveness was also extracted. Responsiveness testing is usually conducted using Effect Size statistics or Standardized Response Means (with values greater than 0.5 considered moderate [22]) or by the Reliable Change Index (with 1.96 considered as a minimally important difference [23]).
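The reliability standards listed above can be expressed as simple lookup functions. The sketch below is illustrative only: the function names are hypothetical, and two assumptions are made that the text does not state. First, correlations that fall between the listed bands (e.g., 0.25) are assigned to the next band up. Second, for Cronbach's alpha the lower bound of each band is treated as inclusive, and values under 0.6 are simply flagged as below the listed range.

```python
def band_agreement(r):
    """Label a test-retest or inter-rater correlation.

    Assumption: values in the gaps between listed bands go to the next band up.
    """
    if r <= 0.2:
        return "poor"
    if r <= 0.4:
        return "fair"
    if r <= 0.6:
        return "moderate"
    if r <= 0.8:
        return "strong"
    return "almost perfect"


def band_alpha(alpha):
    """Label a Cronbach's alpha value.

    Assumption: lower bounds are inclusive (e.g., 0.7 counts as acceptable).
    """
    if alpha >= 0.9:
        return "excellent"
    if alpha >= 0.8:
        return "good"
    if alpha >= 0.7:
        return "acceptable"
    if alpha >= 0.6:
        return "questionable"
    return "below the listed range"


# Hypothetical scale-level results for one instrument (illustrative only).
for scale, retest_r, alpha in [("Scale A", 0.75, 0.85), ("Scale B", 0.55, 0.65)]:
    print(scale, band_agreement(retest_r), band_alpha(alpha))
```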

An updated literature search was conducted in July 2012 to identify additional measures published since the original search. Given the broad scope of the original search, terms were refined to focus the search on food parenting practices (using the diversity of terms uncovered during the original search) and specifically articles describing the development of measures.

Results

Results from the four search engines were combined and duplicates were identified and removed, resulting in 28,378 unique titles. Review of titles and abstracts narrowed the search to 1,352 articles, and full articles were located and retrieved for all but six. The initial selection criteria narrowed the search to 242 articles; the additional criteria added in the second phase further narrowed the pool to 74 articles; and a review of citations identified 8 additional papers. These 82 articles described development of 57 unique instruments. The updated search identified 18 additional articles, 14 of which represented new instruments. Table 1 provides a description of each instrument identified and Table 2 describes the development processes employed.

Table 1 Description of instruments assessing parental feeding practices (in ascending order by year of publication)
Table 2 Description of development and testing methods for parental feeding practice instruments

Among the food parenting practice questionnaires included in this review, final surveys had between 6 and 221 items (44 items on average). While all instruments had at least one relevant scale or categorical grouping of items to assess parent food practices, items within these scales or categories represented less than half of the items in the instrument. These instruments had between 2 and 76 relevant items (19 relevant items on average) and anywhere between 1 and 12 relevant scales or categories (3 to 4 on average). As described in Table 1, the constructs measured varied widely from one instrument to another. Often instruments focused on measuring either controlling feeding practices or supportive and encouraging feeding practices.

Conceptualization of instrument’s purpose

The terms used to describe what the instruments were intended to measure varied depending, in part, on the background from which the instrument arose. In addition to parent food practices, common terms included: parent-child feeding practices, feeding strategies, feeding style, feeding dimensions, feeding relationship, mealtime environment, mealtime actions, mealtime interactions, parent-child mealtime behaviors, food socialization practices, and home food environment, among others. Each of these terms has a slightly different definition; however, all of the instruments included items that measured parent food practices. Despite differences in terms, 87% did conceptualize and define what they intended to measure, with 35 instruments receiving the maximum score of 4 and 27 instruments receiving a 3 for conceptualization. Just under half (33 of 71) noted a theoretical basis for the development of their instrument. More commonly referenced theories included: Social Cognitive Theory (n = 12) [43, 46, 61, 62, 91, 93, 98, 102, 104, 107, 114, 121], Social Ecologic Framework (n = 3) [97, 98, 101], Theory of Planned Behavior (n = 3) [35, 74, 98], Social Learning Theory (n = 2) [37, 82], Costanzo and Woody’s Domain Specific Parenting or Baumrind’s Parenting Styles (n = 4) [6, 52, 64, 123], and Satter’s model of the feeding relationship (n = 3) [27, 29, 113].

Development of item pool

The processes used to develop a pool of items varied widely. Only 14 instruments (20%) received the maximum score of 3, indicating a fully systematic process was employed. Common methods employed for item development included: pulling or modifying items from existing instruments (n = 44), extrapolating from qualitative formative data such as focus groups or interviews (n = 36), creating items based on a review of the literature (n = 22), seeking expert guidance (n = 19), or some combination of methods (n = 33). Seven instruments had no description regarding how items were created.

Refinement of item pool

About one third (n = 24) of the instruments identified in this review did not report any attempts to refine the pool of items once created. Among those who did attempt to refine their pool of items, factor analysis was the most commonly used method (n = 36). Those who employed factor analyses generally used widely accepted criteria for cut-offs for factor loadings and cross loadings. Only 16 instruments had items reviewed by experts to assess content validity, and only 30 piloted the instrument or conducted cognitive interviews to assess clarity of items and face validity. Item performance was also noted as a means to reduce the item pool for 7 instruments.

Reliability

Some form of reliability was reported for a majority of instruments (n = 57 or 80%). Internal consistency was the most common form of reliability reported (n = 56). Those that employed such methods generally retained only scales that met established cut-off criteria, with 38 reporting Cronbach’s alphas of at least 0.6. None of these 38 instruments had alphas greater than 0.9 for all scales, only 5 had alphas consistently above 0.8, and an additional 23 had alphas consistently above 0.7. Test-retest was reported for 27 instruments, typically using a 1-3 week interval. The two notable exceptions were the GEMS’ Diet-Related Psychosocial Questionnaire [69], which administered test-retest over a 12-week period (during which time there was also an intervention delivered), and the Behavioral Pediatrics Feeding Assessment Scale [31], which administered test-retest over a 2-year interval. Correlations reported for test-retest were generally acceptable (>0.6) for most scales within a given instrument. However, when looking at test-retest correlations of all scales on a given instrument, only 5 had correlations for all scales above 0.8; 6 additional had correlations >0.7; and 5 more had correlations >0.6. Inter-rater reliability was reported for 6 instruments, but only 1 instrument reported that all correlations were >0.8; the remaining 5 included correlations less than 0.6 for at least one scale.

Validity

The majority of instruments (n = 61 or 86%) reported some type of validity evidence. Construct validity was by far the most common type of validity evidence evaluated (n = 59), often testing for relationships between food parenting practices and child diet or child weight. Most instruments had one or more scales that were significantly associated with one of these outcomes; however, correlations were generally in the range of 0.15-0.45. While all papers including this type of evidence were given credit for evaluating construct validity, these tests were not always presented as construct validity within the articles. Confirmatory factor analysis was reported for only 10 instruments. Those that did attempt to explore structural validity were generally successful with only minor modifications to their original model. Only two studies attempted to establish criterion validity.

Responsiveness

The development paper for the Family Eating and Activity Habits Questionnaire [37] was the only one that formally assessed the instrument’s responsiveness to treatment results. The questionnaire was administered to families taking part in a weight loss program both at baseline and follow-up. Changes in questionnaire scores as well as changes in weight were observed in the intervention group, and weight loss in the child was highly correlated with improvement in the questionnaire score.

Completeness of development process

Ideally, instrument development would involve all 6 components described thus far: (1) clear conceptualization of what the instrument is intended to measure, (2) systematic process for developing item pool, (3) refinement of the item pool (through at least one method: factor analysis, expert review, cognitive interviews, and/or piloting), (4) some type of reliability testing (inter-rater, test-retest, and/or internal consistency), (5) at least one type of validity testing, and (6) responsiveness or stability testing. On average, instruments reported only 2 or 3 of these 6 steps (range: 0 to 4).
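The completeness tally described above amounts to counting how many of the six components a development paper reports. The following is a minimal sketch of that bookkeeping; the class name, field names, and the example record are hypothetical and do not correspond to any specific instrument in this review.

```python
from dataclasses import dataclass, fields


@dataclass
class DevelopmentRecord:
    """One flag per development component (names are illustrative)."""
    conceptualization: bool      # (1) clear conceptualization
    systematic_item_pool: bool   # (2) systematic item pool development
    item_refinement: bool        # (3) refinement by at least one method
    reliability: bool            # (4) some type of reliability testing
    validity: bool               # (5) at least one type of validity testing
    responsiveness: bool         # (6) responsiveness or stability testing


def steps_reported(record):
    """Count how many of the six components were reported."""
    return sum(bool(getattr(record, f.name)) for f in fields(record))


# Hypothetical instrument reporting four of the six components.
example = DevelopmentRecord(
    conceptualization=True,
    systematic_item_pool=False,
    item_refinement=True,
    reliability=True,
    validity=True,
    responsiveness=False,
)
print(steps_reported(example), "of 6 components reported")
```

A simple count like this treats all components as equally important and says nothing about how well each step was carried out, so it should be read as a description of reporting rather than as a quality score.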

Discussion

In the current review, 71 instruments were identified that included assessment of parent food practices. The quality of processes used and reported for instrument development varied widely, but there are instruments that demonstrate reasonably thorough development work. The quality assessment of the 71 instruments in this review highlights many key lessons that should inform future research in the areas of conceptualization of constructs, development and refinement of the item pool, collection of multiple types of reliability and validity evidence, and planning for responsiveness or stability testing.

Conceptualization of constructs

Parent food practices is a rapidly growing area of research that would benefit greatly from a common conceptual model. The content map (Figure 2) represents an initial effort to capture relevant constructs that should be included in this conceptual model. It served as a useful guide for the current review and may help inform future work to develop a conceptual model. Consensus is required in order to develop a clear conceptual model including an indication of what constructs should be included and how those constructs should be defined. The current lack of consensus has resulted in scales from different instruments that may share similar names, but include items measuring very different behaviors. Further, other instruments may include similar items, but employ different names for their scales. For example, the Restriction subscale from the Child Feeding Questionnaire [52] includes items about ensuring the child does not eat too many sweets or high fat foods and items about guiding and regulating child’s intake of certain foods – both of which reflect how “restriction” is typically defined. However, this subscale also includes items regarding offering sweets as a reward for good behavior, which other measures call “instrumental feeding” [65]. Researchers working in this field need consensus and a clear conceptual model based on current knowledge. Future research can then expand upon or clarify components of the model.

The use of theory to guide instrument development helps ensure clear conceptualization of all relevant constructs. Unfortunately, just under half of developers noted a theoretical basis for their instruments. Social Cognitive Theory (SCT) [124] and the Social Ecologic Framework [125, 126] were two of the most commonly referenced theories, both of which recognize the influence that the environment, and the shared environment in particular, has on behavior. A number of instruments originated from family psychology, drawing on Social Learning Theory [127], Parenting Dimensions [128-130], and Domain Specific Parenting [131] to guide development of their instruments. These theories generally recognize that parents play a central role in the socialization of their children and hence the behaviors that a child adopts, including eating behaviors. All of the theories provided useful guidance to instrument development, and should be considered in efforts to develop a conceptual model for parent food practices.

Development and refinement of the item pool

Ideally, development and refinement of the item pool uses a systematic approach that involves multiple methods and allows for multiple iterations. Consulting the current literature is a good starting point, but less than one third of instruments reported reviewing the literature as part of their process. Many instruments reported pulling items from existing instruments, but it is unknown how systematically existing instruments were reviewed before selecting which instruments and items to use for the new measure. Development of a new measure should also address gaps in measurement, creating items and scales for constructs not being measured by current instruments. Informed processes are needed to guide creation of new questions. Qualitative data (e.g., focus groups and interviews) can provide such a resource, but only about half of instruments (36 of 71) reported drawing on such data sources. Once an initial item pool is created, it is also important to evaluate and refine that item pool; however, about a third of instruments reported no details on item refinement. Among those that did, factor analysis was the most common strategy employed. Expert review, cognitive interviews and piloting are important steps for ensuring complete content coverage and inclusion of items that are easily interpreted by the target audience. However, relatively few instruments reported assessment of content or face validity. The development article for the Comprehensive Feeding Practices Questionnaire [87] provides a useful example of a thorough and iterative process combining multiple strategies to generate and refine an item pool. To create an initial item pool, these researchers drew items from the most widely used instruments, adapted items from adult measures where no suitable items existed, and reviewed the literature to gather information about additional constructs. The original item pool was piloted and factor analysis was used to identify constructs needing additional items. Then, open-ended questions were given to another sample of parents to help generate these additional items. Future researchers interested in instrument development should aim to adopt similar methodologies and incorporate multiple strategies into their own plans for developing and refining the item pool for their new instruments.

Reliability and validity evidence

While most instruments received credit for performing some evaluation of reliability (80%) or validity (86%), there was clear reliance on statistical approaches that use data collected at a single time point to supply such evidence. For reliability, many instruments presented only a Cronbach’s alpha. While this provides information on how well items within a scale group together, it does not provide evidence of repeatability. Assessment of test-retest and inter-rater reliability provides this type of evidence, but requires more investment in data collection. Not surprisingly, few instruments included these latter types of evaluation. Similarly, most validity evidence came from assessment of construct validity. Structural validity and criterion validity require greater investment in data collection to administer the instrument in multiple samples or to collect a gold standard measure. Use of these latter types of validity was limited. A thorough assessment of reliability and validity should include multiple strategies for each, which will require researchers to devote more time to instrument development by collecting data across multiple time points or in multiple samples or incorporating use of a gold standard.
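To illustrate why Cronbach's alpha needs only a single administration, the sketch below computes it from one respondents-by-items matrix using the standard formula alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the response data are hypothetical; sample variances are used throughout, which is an implementation choice rather than something specified in this review.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("A scale needs at least two items")
    if any(len(row) != k for row in responses):
        raise ValueError("Every respondent must answer every item")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total scale score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("Total scores do not vary; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical two-item scale answered by four parents (illustrative only).
responses = [[1, 2], [2, 1], [3, 4], [4, 3]]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))  # approximately 0.75 for this made-up data
```

Because the calculation uses only the covariation among items at one time point, a high alpha says nothing about whether the same parents would give the same answers a few weeks later; that requires a second administration and a test-retest correlation.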

Responsiveness testing

The area of instrument development that clearly needs the most attention is instrument responsiveness. Researchers seeking to evaluate interventions need evidence regarding the level of change that these instruments are able to detect. This type of information is essential when trying to calculate power and sample size needed for a study. While many of these instruments have indeed been used in studies to evaluate interventions, responsiveness testing is almost never reported as part of the development.
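As a rough sketch of the responsiveness statistics named in the Methods, the example below computes a Standardized Response Mean (mean change divided by the standard deviation of change) and a Reliable Change Index. The RCI is written in the commonly cited form, change divided by SD × sqrt(2 × (1 - reliability)); treating that as the formulation intended here is an assumption. Function names and all scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """Mean of paired change scores divided by their standard deviation."""
    if len(baseline) != len(follow_up):
        raise ValueError("Scores must be paired")
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def reliable_change_index(before, after, sd_baseline, reliability):
    """Individual change divided by the standard error of the difference.

    Assumes the form SD * sqrt(2 * (1 - reliability)) for that standard error.
    """
    standard_error_diff = sd_baseline * sqrt(2 * (1 - reliability))
    return (after - before) / standard_error_diff


# Hypothetical scale scores for four families before and after an intervention.
baseline = [10, 12, 14, 16]
follow_up = [12, 13, 17, 18]
srm = standardized_response_mean(baseline, follow_up)

# Hypothetical single family: 6-point change, baseline SD of 5, reliability 0.82.
rci = reliable_change_index(10, 16, sd_baseline=5, reliability=0.82)
print(round(srm, 2), round(rci, 2))
```

An estimate such as the SRM from a development study gives later researchers a starting value for the expected effect size when planning sample size for an intervention trial.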

Additional issues

Another important consideration when selecting an instrument is its relevance for the target population. The feeding relationship changes as children get older, and hence the feeding practices parents employ change as well. At younger ages, children are more dependent on their parents to provide food choices. As they get older, they become more independent and peers are thought to exert a greater influence on eating habits [132], which may in turn influence the feeding approaches parents employ. All of the instruments included in this review were developed for families with children between the ages of 2 and 12 years old. Similarly, parent food practices may vary across different cultural groups [44, 51]. Some practices may appear to be detrimental to healthy eating habits in certain populations, but those same practices are found to be protective in others. For this reason, Table 1 describes the population in which each instrument was tested.

Limitations

This review provides a comprehensive inventory and assessment of existing measures of parent food practices. However, the current review is limited to instruments developed for families with children 2-12 years old. Additional instruments that were developed for families with adolescent children are not included. We limited this review to younger children because the parents and the home environment are the predominant influence on child eating behaviors at this age. Similarly, this review is limited to articles written in English. While it includes instruments that were developed in other languages, there are additional non-English instruments that have undoubtedly been left out of the current inventory. The results focus on presenting the primary development articles for each of the identified instruments; however, many of these instruments have been used in later studies with different populations. During the review process, 244 articles were identified that described studies in which existing instruments were used. Some of these may provide additional information about construct validity (e.g., association with child diet or weight), but were not included or summarized here. However, articles in which there was clear development work to adapt and evaluate scales for new populations are captured in Table 1. Also, no attempt was made to provide an overall quality score for each instrument. To be truly informative, a scoring rubric would need to take into account not just attempts to complete the various development steps, but also the appropriateness of tests used, and the significance of the outcomes across factors measured within an instrument. Such a scoring tool is expected to be complex and was not available at the time of writing. Therefore, the authors have summarized the development work that has been done, the reliability and validity evidence reported, and well-accepted criteria for assessing those results. Readers are thus able to judge for themselves the strength of the evidence in light of other factors.

Conclusions

This review identified 71 different measures of parent food practices. However, these existing instruments measure a variety of different constructs. Additionally, the rigor with which they were developed varied widely. Ideally, instrument development and evaluation are multi-staged processes that require time and patience. Researchers or practitioners who do not have the resources to dedicate to instrument development should be encouraged to look for existing instruments that measure the specific constructs needed for their study. Future work should focus on further evaluation of appropriate instruments where possible. Undoubtedly, new instruments will need to be developed; however, this future development work should take into account the lessons learned from the current review and address all stages of development needed to create a valid and reliable measure.