Participants were children enrolled in the Generation XXI birth cohort. Recruitment took place between April 2005 and August 2006 in all public maternities of the metropolitan area of Porto (northern Portugal). Demographic and socioeconomic characteristics, obstetric history and previous personal diseases were collected in the maternity, within 72 h after delivery, through face-to-face interviews performed by trained interviewers and from clinical records [20, 21]. After the initial evaluation at birth (8647 children and 8495 mothers), follow-ups were conducted when children were 4 (2009/2011; participation proportion 86%), 7 (2012/2014; participation proportion 80%) and 10 years of age (2015/2017; participation proportion 76%). Of the 8647 children enrolled at baseline, the current study included only twins (n = 288 children). Children with no information about zygosity (n = 36) and without data on eating behaviors at 10 years (n = 80) were excluded, resulting in a final sample of 172 children (86 twin pairs).
Zygosity was assessed through a nine-item questionnaire. This questionnaire was completed by the mothers and assessed their opinions about the children’s zygosity (e.g., “Do you think your twins are identical (monozygotic)?”), global similarity and twin confusion (e.g., “Are the children as alike as two peas in a pod?”) and specific similarity (e.g., “Are there differences in your twins’ eye colors?”). In addition, zygosity was assessed by an independent observer, who classified the children as monozygotic (MZ), dizygotic (DZ) or unknown. One point was given for each answer indicating similarity between the twins for a trait, and one point was subtracted for each answer indicating dissimilarity. A final score below 0 corresponded to DZ twins, and a score equal to or above 0 corresponded to MZ twins. This questionnaire is a non-invasive tool for zygosity assessment and has previously shown a high degree of accuracy (95.4%).
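The scoring rule above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and skipping unanswered items (`None`) is an assumption, since the text does not say how missing answers were handled.

```python
def zygosity_score(answers):
    """Sum +1 for each answer indicating similarity and -1 for each
    answer indicating dissimilarity.

    `answers` holds True (similar), False (dissimilar) or None.
    Ignoring None is an assumption made for this sketch only.
    """
    score = 0
    for similar in answers:
        if similar is True:
            score += 1
        elif similar is False:
            score -= 1
    return score


def classify_zygosity(score):
    """Scores below 0 are classified as DZ; 0 or above as MZ."""
    return "DZ" if score < 0 else "MZ"


# Hypothetical nine-item response: six "similar", three "dissimilar" answers.
example = [True] * 6 + [False] * 3
print(classify_zygosity(zygosity_score(example)))
```

Note that a score of exactly 0 falls on the MZ side of the threshold, as stated in the text.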
Eating behaviors at 10 years were assessed with the Children’s Eating Behavior Questionnaire (CEBQ), a widely used parent-report questionnaire that was previously validated among the Generation XXI school-aged children. This questionnaire also showed good internal consistency at 10 years (Cronbach’s α coefficients ranging from 0.76 to 0.84). Parents or main caregivers were asked to respond to the 35-item questionnaire, which assesses eight subscales. Four subscales assess the child’s avoidance of, and lack of interest in, food and are hereafter called “food avoidant behaviors”: satiety responsiveness (CEBQ-SR; 5 items, e.g., “My child leaves food on his/her plate at the end of a meal”), slowness in eating (CEBQ-SE; 4 items, e.g., “My child eats slowly”), food fussiness (CEBQ-FF; 6 items, e.g., “My child is difficult to please with meals”) and emotional undereating (CEBQ-EUE; 4 items, e.g., “My child eats less when s/he is upset”). The other four assess the child’s general appetite for, and interest in, food and drinks and are hereafter called “food approach behaviors”: food responsiveness (CEBQ-FR; 5 items, e.g., “If allowed to, my child would eat too much”), enjoyment of food (CEBQ-EF; 4 items, e.g., “My child loves food”), desire to drink (CEBQ-DD; 3 items, e.g., “My child is always asking for a drink”) and emotional overeating (CEBQ-EOE; 4 items, e.g., “My child eats more when annoyed”). Answers were given on a 5-point Likert scale, ranging from 1 (“Never”) to 5 (“Always”), such that the higher the score, the more frequent the eating behavior. In accordance with the original scale, five of the items were reverse-scored. For questionnaires with fewer than 50% of items missing (approximately 3% of the sample), subscale scores were calculated by replacing missing items with the mean of the available items. Adequate internal consistency was also observed in the current twin sample, with Cronbach’s α coefficients ranging from 0.77 to 0.86 (data not shown).
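A minimal sketch of the item handling described above (reverse scoring and mean replacement of missing items) is given below. Several details are assumptions made for illustration and are not taken from the text: the subscale score is computed as the item mean, the 50% rule is applied per subscale, and the positions of the reverse-scored items are passed in by the caller rather than hard-coded.

```python
def reverse_score(value, low=1, high=5):
    """Reverse an item on a low..high Likert scale (1 <-> 5, 2 <-> 4)."""
    return low + high - value


def subscale_score(items, reverse_idx=(), max_missing=0.5):
    """Mean item score for one subscale, with mean replacement.

    `items` is a list of responses (1-5) or None for a missing item.
    `reverse_idx` holds the positions of reverse-scored items
    (hypothetical; the actual item list is not given in the text).
    Returns None when the share of missing items is >= `max_missing`.
    """
    scored = [
        None if v is None else (reverse_score(v) if i in reverse_idx else v)
        for i, v in enumerate(items)
    ]
    available = [v for v in scored if v is not None]
    n_missing = len(scored) - len(available)
    if not available or n_missing / len(scored) >= max_missing:
        return None
    item_mean = sum(available) / len(available)
    # Replace each missing item with the mean of the available items.
    imputed = [item_mean if v is None else v for v in scored]
    return sum(imputed) / len(imputed)


# Hypothetical 4-item subscale with one missing response.
print(subscale_score([4, None, 2, 3]))
```

When the subscale score is an item mean, replacing missing items with the mean of the available items leaves that mean unchanged; the replacement step would matter if scores were computed as item sums.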
In most cases (92.6%), the CEBQ at the 10-year follow-up was answered by the children’s mother.
At each follow-up, trained researchers weighed participants in underwear and without shoes using a digital scale, and weight was recorded to the nearest 0.1 kg. Height was also measured without shoes, to the nearest 0.1 cm, using a fixed stadiometer. Children’s BMI was calculated, and cutoff points were applied to the sex- and age-specific BMI z-scores. Weight status was then defined as ‘underweight’ for z-scores below −2 standard deviations (SD), ‘normal weight’ for z-scores ≥ −2SD and ≤ 1SD, ‘overweight’ for z-scores > 1SD and ≤ 2SD and ‘obesity’ for z-scores above 2SD, according to the World Health Organization (WHO) child growth references.
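The cutoffs above translate directly into a small classification function. The sketch below assumes the sex- and age-specific BMI z-score has already been computed from the WHO reference tables; the function name is hypothetical.

```python
def child_weight_status(bmi_z):
    """Classify weight status from a sex- and age-specific BMI z-score.

    Cutoffs follow the text: < -2 SD underweight, -2 to 1 SD normal
    weight, > 1 to 2 SD overweight, > 2 SD obesity.
    """
    if bmi_z < -2:
        return "underweight"
    if bmi_z <= 1:
        return "normal weight"
    if bmi_z <= 2:
        return "overweight"
    return "obesity"


# Boundary values belong to the lower category (e.g., z = 1 is normal weight).
for z in (-2.5, -2, 1, 1.5, 2, 2.1):
    print(z, child_weight_status(z))
```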
Maternal characteristics, such as age, education, marital status, pre-pregnancy weight and height, gestational age and household monthly income, were obtained through face-to-face interviews conducted by trained researchers. Mothers’ weight status was classified as follows: BMI < 18.5 kg/m² for ‘underweight’, ≥ 18.5 and < 25 kg/m² for ‘normal weight’, ≥ 25 and < 30 kg/m² for ‘overweight’ and ≥ 30 kg/m² for ‘obesity’, according to WHO cut-offs. Child birth weight was recorded in grams after birth and was retrieved from medical records by trained researchers. Birth weight categories were defined as ‘extremely low’ (< 1000 g), ‘very low’ (1000 to 1499 g), ‘low’ (1500 to 2499 g) and ‘normal’ (≥ 2500 g), according to WHO thresholds, and were used for descriptive purposes only.
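For illustration, the maternal BMI and birth weight categories above can be written as follows. The function names are hypothetical, and the sketch assumes weight in kilograms, height in meters and birth weight in grams.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def maternal_weight_status(bmi_value):
    """Adult BMI categories as listed in the text."""
    if bmi_value < 18.5:
        return "underweight"
    if bmi_value < 25:
        return "normal weight"
    if bmi_value < 30:
        return "overweight"
    return "obesity"


def birth_weight_category(grams):
    """Birth weight categories as listed in the text."""
    if grams < 1000:
        return "extremely low"
    if grams < 1500:
        return "very low"
    if grams < 2500:
        return "low"
    return "normal"


# Hypothetical mother (60 kg, 1.60 m) and newborn (2350 g).
print(maternal_weight_status(bmi(60, 1.60)), birth_weight_category(2350))
```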
Continuous variables were expressed as mean and standard deviation (M(SD)) for symmetrically distributed variables, or as median and interquartile range (Md(IQR)) for non-symmetrically distributed variables. The distribution of each continuous variable was evaluated using the Kolmogorov–Smirnov test and graphically, through Q–Q plots. Categorical variables were described as counts and percentages (n(%)).
To explore genetic and environmental contributions to variation in the CEBQ subscales, intra-class correlation coefficients (ICCs; two-way mixed model, single measures) and the twin method were used. First, to assess the resemblance within identical and fraternal twin pairs, ICCs for each CEBQ subscale were calculated. Greater similarity (i.e., greater ICCs) between MZ twins than between DZ twins indicates a greater genetic contribution to the variation of the trait, because the only difference between these two types of twins is that identical (MZ) twins are twice as similar genetically.
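As an illustration of the ICC step, the sketch below computes a two-way mixed, single-measure ICC from a list of twin-pair scores. The text does not state whether the consistency or the absolute-agreement form was used; the consistency form, ICC(3,1), is assumed here, and the data are invented.

```python
def icc_two_way_mixed_single(pairs):
    """ICC(3,1): two-way mixed model, single measures, consistency.

    `pairs` is a list of rows, one per twin pair, each holding the
    score of every twin in the pair (k = 2 for twins).
    """
    n = len(pairs)
    k = len(pairs[0])
    grand = sum(sum(row) for row in pairs) / (n * k)
    row_means = [sum(row) / k for row in pairs]
    col_means = [sum(row[j] for row in pairs) / n for j in range(k)]

    # Two-way ANOVA sums of squares: pairs (rows), twin order (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in pairs for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Invented subscale scores for four twin pairs (twin 1, twin 2).
mz_pairs = [(2.0, 2.2), (3.5, 3.4), (4.0, 4.4), (1.8, 2.0)]
print(round(icc_two_way_mixed_single(mz_pairs), 3))
```

The same function would be applied separately to MZ and DZ pairs for each subscale, and the two coefficients compared.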
The twin method consists of a formal comparison of the resemblance of identical (MZ) and fraternal (DZ) twins for a trait of interest. MZ twins are genetically identical (100%), whereas DZ twins share, on average, half of their genes (50%); if reared together, both types of twins also share the same environment. Maximum likelihood structural equation modeling (MLSEM) was conducted to estimate the genetic and environmental variances in the measured eating behaviors from the twin method. The genetic component comprises additive genetic influences (A). The environmental component comprises common or shared environmental influences (C), which represent factors common to both twins (e.g., socioeconomic status), and child-specific or non-shared environmental influences (E), which refer to factors in the environment that make members of a twin pair different (e.g., an illness in only one twin). Parameter estimates, their respective 95% confidence intervals (CI) and the models’ goodness-of-fit statistics were evaluated and described. Although we tested the full ACE model for each appetitive trait, considering the small sample size and the fit index, we only reported bivariate estimates of A, C and E (i.e., AC, AE and CE). These models were then compared according to the following fit indices: Bayesian information criterion (BIC) value, log likelihood ratio (−LL), Chi-squared value (χ2) and p value. Only the best-fitting models, i.e., those that explained the observed variance and covariance with the fewest parameters for each appetitive trait, were described (models with the lowest BIC value, smallest χ2 and p > 0.05). Details of the remaining fitted models are described in Supplementary Table 1.
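The decomposition described above is conventionally written in terms of expected variances and covariances. The block below uses the standard ACE parameterization; the symbols $a^2$, $c^2$ and $e^2$ are introduced here for illustration and do not appear in the original text.

```latex
% P_1, P_2: trait scores of the two twins in a pair
\begin{aligned}
\operatorname{Var}(P_1) = \operatorname{Var}(P_2) &= a^2 + c^2 + e^2 \\
\operatorname{Cov}_{\mathrm{MZ}}(P_1, P_2) &= a^2 + c^2 \\
\operatorname{Cov}_{\mathrm{DZ}}(P_1, P_2) &= \tfrac{1}{2}\,a^2 + c^2
\end{aligned}
```

Under this parameterization, the AE, CE and AC sub-models correspond to fixing $c^2 = 0$, $a^2 = 0$ and $e^2 = 0$, respectively.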
Descriptive statistics and ICCs were computed in SPSS (IBM Corp. Released 2017. IBM SPSS Statistics for Windows, Version 25.0. Armonk, NY: IBM Corp.), and the MLSEM was performed in R 3.0.1 using the structural equation modeling package NlsyLinks.