The Generation R Study: design and cohort update 2017

The Generation R Study is a population-based prospective cohort study from fetal life until adulthood. The study is designed to identify early environmental and genetic causes and causal pathways leading to normal and abnormal growth, development and health during fetal life, childhood and young adulthood. This multidisciplinary study focuses on several health outcomes including behaviour and cognition, body composition, eye development, growth, hearing, heart and vascular development, infectious disease and immunity, oral health and facial growth, respiratory health, allergy and skin disorders of children and their parents. Main exposures of interest include environmental, endocrine, genomic (genetic, epigenetic, microbiome), lifestyle-related, nutritional and socio-demographic determinants. In total, 9778 mothers with a delivery date from April 2002 until January 2006 were enrolled in the study. Response at baseline was 61%, and general follow-up rates until the age of 10 years were around 80%. Data collection in children and their parents includes questionnaires, interviews, detailed physical and ultrasound examinations, behavioural observations, lung function, Magnetic Resonance Imaging and biological sampling. Genome-wide and epigenome-wide association screens are available. Ultimately, results from the Generation R Study contribute to the development of strategies for optimizing health and healthcare for pregnant women and children.


Introduction
The Generation R Study is a population-based prospective cohort study from fetal life until young adulthood. The background and design have been described in detail previously [1][2][3][4][5][6][7]. Briefly, the Generation R Study is designed to identify early environmental and genetic causes of normal and abnormal growth, development and health from fetal life until young adulthood. This multidisciplinary study focuses on several health outcomes including behaviour and cognition, body composition, eye development, growth, hearing, heart and vascular development, infectious disease and immunity, oral health and facial growth, respiratory health, allergy and skin disorders of children and their parents. Main exposures of interest include environmental, endocrine, genomic (genetic, epigenetic, microbiome), lifestyle-related, nutritional and socio-demographic determinants. Full lists of exposures and outcomes are presented in Tables 1 and 2. An important focus of the study is the identification of new early life determinants of common non-communicable diseases in adulthood or their risk factors, on which various papers have been published recently in this journal [8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26]. A detailed and extensive data collection has been conducted over the years, starting in the early prenatal phase and currently continuing in early adolescence (age 13 years). Data collection in parents and their children included questionnaires, interviews, detailed physical and ultrasound examinations, behavioural observations, lung function, Magnetic Resonance Imaging (MRI) and biological sampling. In this paper, we give an update of the data collection in the children and their parents until the child's age of 13 years.

Study design
The Generation R Study is conducted in Rotterdam, the second largest city in the Netherlands. Rotterdam is situated in the Western part of the Netherlands. The study is a population-based prospective cohort study from fetal life onwards. Pregnant women with an expected delivery date between April 2002 and January 2006 living in Rotterdam were eligible for participation in the study. Extensive assessments are performed in mothers, fathers and their children. Measurements were planned in early pregnancy (gestational age < 18 weeks), mid pregnancy (gestational age 18-25 weeks) and late pregnancy (gestational age > 25 weeks). The fathers were assessed once during the pregnancy of their partner. The children form a prenatally recruited birth cohort that will be followed at least until young adulthood. In the preschool period, which in the Netherlands refers to the period from birth until the age of 4 years, data collection was performed by a home visit at the age of 3 months, and by repeated questionnaires and routine child health center visits. Information from these routine visits was obtained and used for the study. Additional detailed measurements of fetal and postnatal growth and development were conducted in a randomly selected subgroup of Dutch children and their parents at a gestational age of 32 weeks and postnatally at the ages of 1.5, 6, 14, 24, 36 and 48 months in a dedicated research center.
Around the ages of 6 and 10 years all children and their parents were invited to visit our research center in the Erasmus MC-Sophia Children's Hospital to participate in hands-on measurements, advanced imaging modalities, behavioural observations and biological sample collection. MRI scans of all participating children were made in order to image abdominal composition, brain, lungs, cardiovascular system, fat tissue, kidney, liver, and hip development. Furthermore, the parents received 6 questionnaires during this period. Children also received their own questionnaire around the age of 10. Information from municipal health services, schools and general practitioners has also been collected.
In the current adolescence period, all children and their parents will be re-invited around the child's age of 13 and 16 years. We will again assess their growth, development and health in our research center and with questionnaires. We will perform MRI scans of the abdominal composition (fat), brain, and hip development.

Study cohort
Eligibility and enrolment
Eligible mothers were those who were resident in the study area at their delivery date and had an expected delivery date from April 2002 until January 2006. We aimed to enrol mothers in early pregnancy, but enrolment was possible until the birth of their child. The enrolment procedure has been described previously in detail [1][2][3][4]. In total, 9778 mothers were enrolled in the study. Of these mothers, 91% (n = 8879) were enrolled during pregnancy. Partners of mothers enrolled in pregnancy were invited to participate.
In total, 71% (n = 6347) of all fathers were included. A total of 1232 pregnant women and their children form the subgroup of Dutch children for additional detailed studies. The overall response rate based on the number of children at birth was 61%. The study group is a multi-ethnic cohort. Ethnicity was defined according to the classification of Statistics Netherlands [27][28][29][30][31][32]. Ethnic background was assessed in accordance with the country of birth of the participants themselves and their parents. A participant was considered to have a non-Dutch ethnic origin if one of the participant's parents was born abroad. If both parents were born abroad, the country of birth of the participant's mother determined the ethnic background [33]. The largest ethnic groups were the Dutch, Surinamese, Turkish and Moroccan groups. We also constructed a dichotomous variable "Western/non-Western" ethnicity. Western ethnicity included Dutch, European, American Western (including North American), Asian Western (including Indonesian and Japanese) and Oceanian. Non-Western ethnicity included Turkish, Moroccan, Surinamese, Antillean, Cape Verdean, African, Asian (except Indonesia and Japan) and South American and Central American [33,34].
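The parental country-of-birth rule and the Western/non-Western dichotomy described above can be sketched as follows. This is a minimal illustration only: the function names, the string labels and the label "Asian non-Western" are assumptions made for the example, the participant's own country of birth is not modelled, and the mapping from individual countries to ethnic groups used by Statistics Netherlands is not reproduced here.

```python
NETHERLANDS = "Netherlands"

# Group labels as named in the text; "Asian non-Western" is a hypothetical
# label for "Asian (except Indonesia and Japan)".
WESTERN_GROUPS = {
    "Dutch", "European", "American Western", "Asian Western", "Oceanian",
}
NON_WESTERN_GROUPS = {
    "Turkish", "Moroccan", "Surinamese", "Antillean", "Cape Verdean",
    "African", "Asian non-Western", "South American", "Central American",
}


def parental_origin(mother_birth_country: str, father_birth_country: str) -> str:
    """Return the country that determines ethnic background under the parental rule.

    One parent born abroad: that parent's country of birth.
    Both parents born abroad: the mother's country of birth.
    Both parents born in the Netherlands: the Netherlands.
    """
    if mother_birth_country != NETHERLANDS:
        # Covers "only mother abroad" and "both abroad" (mother decides).
        return mother_birth_country
    if father_birth_country != NETHERLANDS:
        return father_birth_country
    return NETHERLANDS


def is_western(group: str) -> bool:
    """Dichotomise an ethnic group label into Western (True) or non-Western (False)."""
    if group in WESTERN_GROUPS:
        return True
    if group in NON_WESTERN_GROUPS:
        return False
    raise ValueError(f"Unknown ethnic group label: {group!r}")
```

For example, `parental_origin("Turkey", "Morocco")` returns the mother's country of birth, while `parental_origin("Netherlands", "Suriname")` returns the father's.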
Response and follow-up
Figure 1 shows the enrolment and follow-up rates of the children and parents included in the Generation R Study. The 9778 mothers enrolled in the study gave birth to 9749 live-born children. During the preschool period (0-4 years), the logistics of the postnatal follow-up studies were embedded in the municipal routine child care system and restricted to only part of the study area. In total, 1166 children lived outside this defined study area at birth and were therefore not approached for the postnatal follow-up studies during the preschool period. Of the remaining 8583 children, 690 (8%) parents did not give consent, or their [...] This invitation was independent of their home address and participation in the preschool period. In total, 8305 children (90% of those who were invited (n = 9278) and 85% of the original cohort (n = 9749)) still participated in the study at this age, of whom 6690 visited the research center at a median age of 6.0 years. For the follow-up phase at the age of 10 years (mid childhood period), 730 children of the 9278 could not be invited. In total, 7393 children (86% of those who were invited (n = 8548) and 76% of the original cohort (n = 9749)) participated in the study in mid childhood, of whom 5862 visited the research center at a median age of 9.7 years. Of the 8548 children invited in the mid childhood period, 456 had withdrawn and 124 children were lost to follow-up during this period, leaving 7968 children for invitation around the age of 13 (early adolescence period).
Table 3 shows the general characteristics of the mothers who were enrolled in the study at baseline, and who remained in the study until the child's age of 13 years. The median age of the women at enrolment was 30.5 (95% range, 19.3-39.6) years; 58% of these mothers were of Dutch nationality, 43% were highly educated and 55% had a high household income. The mean birth weight of the children was 3397 (SD 582) grams and they were born at a median gestational age of 40.0 (95% range, 34.9-42.3) weeks. Compared with the baseline characteristics, the mothers who still participated in the study at follow-up were older, more frequently of Dutch nationality and more highly educated.
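The follow-up percentages and invitation counts reported above follow from simple arithmetic on the counts in the text. The short sketch below recomputes them; the variable and function names are chosen for this illustration only, and all numbers are taken from the paragraph above.

```python
def pct(part: int, whole: int) -> int:
    """Percentage of `part` in `whole`, rounded to the nearest whole number."""
    return round(100 * part / whole)


# Counts as reported in the text.
ORIGINAL_COHORT = 9749          # live-born children
OUTSIDE_STUDY_AREA = 1166       # not approached in the preschool period
NO_CONSENT_PRESCHOOL = 690
INVITED_EARLY_SCHOOL = 9278
PARTICIPATING_EARLY_SCHOOL = 8305
NOT_INVITABLE_MID_CHILDHOOD = 730
PARTICIPATING_MID_CHILDHOOD = 7393
WITHDRAWN_MID_CHILDHOOD = 456
LOST_MID_CHILDHOOD = 124

# Derived counts.
approached_preschool = ORIGINAL_COHORT - OUTSIDE_STUDY_AREA
invited_mid_childhood = INVITED_EARLY_SCHOOL - NOT_INVITABLE_MID_CHILDHOOD
invited_early_adolescence = (
    invited_mid_childhood - WITHDRAWN_MID_CHILDHOOD - LOST_MID_CHILDHOOD
)

# Derived percentages.
early_school_of_invited = pct(PARTICIPATING_EARLY_SCHOOL, INVITED_EARLY_SCHOOL)
early_school_of_cohort = pct(PARTICIPATING_EARLY_SCHOOL, ORIGINAL_COHORT)
mid_childhood_of_invited = pct(PARTICIPATING_MID_CHILDHOOD, invited_mid_childhood)
mid_childhood_of_cohort = pct(PARTICIPATING_MID_CHILDHOOD, ORIGINAL_COHORT)

print(approached_preschool, invited_mid_childhood, invited_early_adolescence)
print(early_school_of_invited, early_school_of_cohort)
print(mid_childhood_of_invited, mid_childhood_of_cohort)
```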

Measurements
Data collection during pregnancy and fetal life
Physical examinations were planned at each visit in early pregnancy, mid pregnancy and late pregnancy and included height, weight and blood pressure measurements of both parents (Table 4).
Mothers received four postal questionnaires and fathers received one postal questionnaire during pregnancy. Topics in these questionnaires were: [...]
Blood samples were collected in early pregnancy (mother, father), in mid pregnancy (mother) and at birth (child). A detailed overview of the design and response of the biological sample collection and available measurements is given elsewhere [5,7].
Fetal ultrasound examinations were performed at each prenatal visit. These ultrasound examinations were used to establish gestational age and to assess fetal growth patterns [35,36]. These methods have previously been described in detail [37][38][39]. Longitudinal curves of all fetal growth measurements (head circumference, biparietal diameter, abdominal circumference and femur length) were created, resulting in standard deviation scores for all of these specific growth measurements. Placental hemodynamics, including resistance indices of the uterine and umbilical arteries, have been measured in the second and third trimester [40][41][42]. Detailed measurements of fetal brain, heart and kidney development were done in the subgroup [40][43][44][45][46][47][48].
The obstetric records of mothers have been retrieved from hospitals and midwife practices to collect information about pregnancy progress and outcomes. Specialists in the relevant field coded items in these records [49].

Data collection during the preschool period
At the age of 3 months, home visits were performed to assess neuromotor development using an adapted version of Touwen's Neurodevelopmental examination and to perform a home environment assessment [50][51][52][53]. Information about growth (length (height), weight, head circumference) was collected at each visit to the routine child health centers in the study area using standardized procedures [54] (Table 5).
During the preschool period, parents received 8 questionnaires, of which one was specifically for fathers. Items included in these questionnaires and their references are listed in Tables 6 and 7. Response rates based on the number of sent questionnaires are shown in Fig. 2. Not all children received each questionnaire due to logistical constraints and delayed implementation of some of the questionnaires after the first group of children reached the target age for those questionnaires. Thus, although response rates may be similar, the absolute number of completed questionnaires differs between ages.
During the preschool period, children participating in the subgroup were invited six times to a dedicated research center. Measurements at these visits included physical examinations (height, weight, head circumference, skinfold thickness and waist-hip ratio, Touwen's Neurodevelopmental Examination) and ultrasound examinations (brain, cardiac and kidney structures) [44][55][56][57][58][59]. Dual-energy X-ray Absorptiometry (DXA) scanning and Fractional exhaled Nitric Oxide (FeNO) measurements have been performed in a smaller subgroup [60,61]. Blood pressure was measured at the age of 24 months [62,63]. Observations of parent-child interaction and behaviour, such as executive function, heart rate variability, infant-parent attachment, moral development and compliance, have been performed repeatedly with mother and child and once with father and child [64][65][66][67][68]. Biological materials were collected if parents gave consent [69][70][71].
Data collection during the early school age, mid childhood and adolescence period
From the age of 6 years onwards, we invite all participating children to a well-equipped and dedicated research center at the Erasmus MC-Sophia Children's Hospital every 3-4 years. Visits at ages 6 and 10 years have been completed, visits at age 13 years are ongoing and visits at age 16 years are being planned. Currently, the total visit takes about 3 h and all measurements are grouped in thematic 35 min blocks. Clinically relevant results are discussed with the children and their parents and, if needed, children or parents are referred to their general practitioner or other relevant health care provider.
At each age, we use questionnaires to collect data on growth, health and physical and mental development of the children. We also collect information on childhood diet and behaviour (Tables 6 and 7). These questionnaires are sent to the primary caregiver.
We use various advanced imaging techniques including ultrasound and Doppler (GE LOGIQ E9, Milwaukee, WI, USA) for measuring thoracic and abdominal structures, Dual-energy X-ray Absorptiometry for measuring body composition and bone mineral density (iDXA scanner, GE Healthcare, Madison, WI, USA) and Peripheral Quantitative Computed Tomography (pQCT, Stratec Medizintechnik, Pforzheim, Germany) for measuring bone mineral density and geometry of the tibia. We use orthopantomograms (OP 200 D, Instrumentarium Dental, Tuusula, Finland) for measuring dental development.
MRI has been used for brain imaging in a subgroup (n = 801) of 6- to 8-year-old children using a hospital-based 3.0 Tesla MRI scanner (Discovery MR750, GE Healthcare, Milwaukee, WI, USA) [80][81][82][83]. From 2014 onwards, we use a dedicated 3.0 Tesla MRI scanner (Discovery MR750, GE Healthcare, Milwaukee, WI, USA) for brain and total body imaging of all children participating in the study at the mid childhood visit (age 10 years) (see Table 9 for the MRI outcome measures). We use a mock MRI scanner to familiarize the children with the scanning procedures. Children are scanned using standard imaging and positioning protocols, wearing light clothing without metal objects while undergoing the scanning procedure. Total scanning time amounts to approximately 60 min. The scanner is operated by trained research technicians and all imaging data are collected according to standardized protocols.

Imaging is performed without administration of contrast agents. All imaging data are stored on a securely backed-up research picture archiving system, using programmed scripts to check for completeness of the data received. We will repeat the MRI scans of abdominal composition (fat), brain and hip development during adolescence (age 13 years) in all participating children in Generation R. Brain MRI scans will also be conducted in the parents of a subgroup of Generation R participants. This research focuses on the effects of aging on the brain in young adults and on the follow-up of mothers who experienced gestational hypertensive complications.
Blood and urine samples are collected from the mothers and their children during every visit. A detailed overview of the design and response of the biological sample collection and available measures is given elsewhere [5,7].

Fig. 2 Response to the questionnaires and visits in the Generation R Study
Genomics: genetic, epigenetic and microbiome biobank
DNA from parents and children has been extracted and genotyped using TaqMan analyses for individual genetic variants and, in the children, using a genome-wide association scan (GWAS) on the Illumina 670K platform [5,7]. For genotyping, we used the infrastructure of the Human Genomics Facility (HuGe-F) of the Genetic Laboratory of the Department of Internal Medicine (www.glimdna.org). The GWAS dataset underwent a stringent QC process, which has been described in detail previously [5,7,84]. Most GWAS analyses are strongly embedded in the Early Growth Genetics (EGG) (http://egg-consortium.org/) and Early Genetics and Longitudinal Epidemiology (EAGLE) Consortia, in which several birth cohort studies combine their GWAS efforts focused on multiple outcomes in fetal life, childhood and adolescence. These efforts have already led to successful identification of various common genetic variants related to birth weight, infant head circumference, childhood body mass index, bone development, obesity and atopic dermatitis [85][86][87][88][89][90][91]. DNA from parents is used for genotyping for candidate gene or replication studies.
DNA methylation was measured on a genome-wide level in a subgroup of Dutch children, using the Illumina Infinium HumanMethylation450 BeadChip (Illumina Inc., San Diego, USA). We used cord blood samples of 1339 children, blood samples of 469 children aged 6 years and blood samples of 425 children aged 10 years. Quality control and normalization of analyzed samples were performed using standardized criteria.
Many of the epigenome-wide association analyses are performed in the context of the Pregnancy And Childhood Epigenetics (PACE) Consortium (http://www.niehs.nih.gov/research/atniehs/labs/epi/pi/genetics/pace/index.cfm), which brings together studies with epigenome-wide DNA methylation data in pregnant women, newborns and/or children. Recent studies have identified differentially methylated sites in association with maternal smoking, maternal folate levels, maternal stress and air pollution during pregnancy [92][93][94][95].
Gut microbiota profiles were determined by Next Generation Sequencing (on Illumina MiSeq) of the V3 and V4 variable regions of the 16S ribosomal RNA gene in DNA extracted from faecal samples. Samples were collected at mid childhood in 2414 children. Phylogenetic de novo profiling was performed using the QIIME [96] and USEARCH [97] software packages and resulted in an operational taxonomic unit table with 239 species, 109 genera and 8 phyla. These samples can be used, for example, to study associations of the faecal microbiota with overweight or obesity [98][99][100].

Follow-up and retention strategies
Thus far, loss to follow-up has been lower than 10%. Major efforts are made to keep the children and parents involved in the study and to minimize loss to follow-up. Several strategies have been implemented and are currently part of the study design:
• Addresses: new addresses of participants, which are known by the municipal health service, can be retrieved by the study staff;
• Newsletters: participants receive two to four newsletters per year, in which several results of the study are presented and explained, questions of participants are answered and new research initiatives are presented;
• Facebook: every week we post a short news update about the ongoing research on our Facebook page;
• Website: we have an up-to-date website where participants can find information about the ongoing research, the procedures at the dedicated research center and our contact information;
• Presents and discounts: all children who visit our research center receive small presents. Also, discount offers are regularly presented in the newsletter;
• Transport costs: all costs for transport and parking related to visits to the research center are reimbursed;
• Reminders for questionnaires: when the questionnaire has not been returned within 3 weeks, a friendly reminder letter is sent to the parents. After 6 weeks, if the questionnaire still has not been returned, the parents receive a phone call. If necessary, help with completing the questionnaire is offered and the importance of filling out the questionnaire is explained once more during this phone call;
• Individual feedback: if clinically relevant, results of measurements are discussed with the parents and children at the visit. If necessary, follow-up appointments with the general practitioner are planned;
• Support for non-Dutch speaking participants: all study materials such as questionnaires, newsletters, website, and information folders are available in three languages (Dutch, English, and Turkish). Furthermore, staff members from different ethnic backgrounds are available to verbally translate these materials into Arabic, French and Portuguese. As such, the study staff is able to communicate with all participants;
• Additional help: children and parents who showed low response rates for different measurements, showed difficulties in completing questionnaires or require additional explanation or support are pro-actively contacted by one dedicated member of the study staff;
• Home visits: we visit children and parents who cannot be contacted by phone, e-mail or letter. Most visits are planned in the evenings to increase the chance that both parents and children are at home.
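The questionnaire reminder schedule listed above (letter after 3 weeks, phone call after 6 weeks) can be expressed as a simple decision rule. The sketch below is illustrative only: the function name and return labels are hypothetical, and it assumes that both intervals are counted in days from the date the questionnaire was sent, which the text does not state explicitly.

```python
REMINDER_LETTER_AFTER_DAYS = 21   # 3 weeks
PHONE_CALL_AFTER_DAYS = 42        # 6 weeks


def reminder_action(days_since_sent: int, returned: bool) -> str:
    """Return the follow-up action due for one questionnaire.

    Assumes both thresholds are measured from the day the questionnaire
    was sent; this is an assumption of the sketch, not a documented rule.
    """
    if returned:
        return "none"
    if days_since_sent >= PHONE_CALL_AFTER_DAYS:
        return "phone call"
    if days_since_sent >= REMINDER_LETTER_AFTER_DAYS:
        return "reminder letter"
    return "wait"
```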

Power, data management, privacy protection
Power calculations for the Generation R Study are shown in Tables 10 and 11. Due to missing values and loss to follow-up, most analyses in the study are not based on data in all subjects. Therefore, the power calculations presented are based on 7000 subjects in the whole cohort and 700 subjects in the subgroup. The presented power calculations are conservative since most studies will assess the effects of continuous instead of dichotomous exposures and studies may be focused on outcomes collected in more than only 1 year.
From 2016 onwards, data collected during the measurements at the research center are entered directly into an electronic database. Data collected by questionnaires are scanned and manually entered into an electronic database by a commercial company. Random samples of all questionnaires are double checked by study staff members to monitor the quality of this manual data entry process. The percentage of mistakes does not exceed 3% per questionnaire. Open text fields are entered into the electronic database exactly as they are filled in on the questionnaires. In a secondary stage, these open text fields are cleaned and coded by a specialist in the relevant field.
All measurements are centrally checked by examination of the data including their ranges, distributions, means, standard deviations, outliers and logical errors. Data outliers and missing values are checked against the original forms. The data of one specific measurement are only distributed for analyses after data collection and preparation are completed for that measurement for the whole cohort.
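A central check of this kind could look like the following minimal sketch. It is not the study's actual procedure: the function name, the plausibility limits passed in by the caller and the cut-off of 3 standard deviations for flagging outliers are all assumptions made for illustration. Flagged record positions would then be compared with the original forms.

```python
from statistics import mean, stdev


def check_measurement(values, lower, upper, z_cutoff=3.0):
    """Summarise one measurement and flag records for manual review.

    values   : list of numbers, with None for missing values
    lower    : smallest plausible value (hypothetical, set per measurement)
    upper    : largest plausible value (hypothetical, set per measurement)
    z_cutoff : outlier threshold in SD units (assumed; 3.0 is a common choice)

    Requires at least two non-missing values to compute the SD.
    """
    present = [v for v in values if v is not None]
    m = mean(present)
    sd = stdev(present)
    return {
        "n_missing": len(values) - len(present),
        "mean": m,
        "sd": sd,
        # Positions of values outside the plausible range.
        "out_of_range": [
            i for i, v in enumerate(values)
            if v is not None and not (lower <= v <= upper)
        ],
        # Positions of values more than z_cutoff SDs from the mean.
        "outliers": [
            i for i, v in enumerate(values)
            if v is not None and sd > 0 and abs(v - m) > z_cutoff * sd
        ],
    }


# Hypothetical birth weights in grams; the fifth value mimics a typing error.
report = check_measurement([3400, 3550, None, 2900, 33970, 3100], lower=300, upper=7000)
print(report["n_missing"], report["out_of_range"])
```

Note that in very small samples a single extreme value inflates the standard deviation so much that it may not exceed the SD cut-off, which is why the range check is kept as a separate test.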
Datasets needed for answering specific research questions are centrally constructed from different databases. All information in these datasets that enables identification of a particular participant, including names and dates of birth, is excluded before distribution to the researchers. The datasets for researchers include unique identification numbers for each subject that enable feedback about individuals to the data manager but do not enable identification of that particular subject. Currently, we are exploring possibilities for a remote access environment, in which researchers can access centrally stored research data from their own computer without storing such data locally.
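To make the power calculations mentioned at the start of this section more concrete, the sketch below computes the smallest detectable difference in a continuous outcome, expressed in standard deviations, between exposed and unexposed subjects. It uses the standard normal-approximation formula for comparing two group means; whether Tables 10 and 11 were produced with exactly this formula is an assumption. The sample sizes (7000 and 700), the type I error of 5% and the power of 80% come from the text; the exposure prevalence of 10% is a hypothetical input.

```python
from math import sqrt
from statistics import NormalDist


def detectable_difference_sd(n_total, exposed_fraction, alpha=0.05, power=0.80):
    """Smallest detectable mean difference, in SD units, for a two-sided test.

    Normal approximation for two independent groups:
        d = (z_{1-alpha/2} + z_{power}) * sqrt(1/n_exposed + 1/n_unexposed)
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    n_exposed = n_total * exposed_fraction
    n_unexposed = n_total * (1 - exposed_fraction)
    return (z_alpha + z_power) * sqrt(1 / n_exposed + 1 / n_unexposed)


# Hypothetical exposure prevalence of 10% in the whole cohort and the subgroup.
whole_cohort = detectable_difference_sd(7000, 0.10)
subgroup = detectable_difference_sd(700, 0.10)
print(round(whole_cohort, 2), round(subgroup, 2))
```

Because the detectable difference scales with the inverse square root of the sample size, the subgroup of 700 can only detect effects about 3.2 times larger than the whole cohort of 7000 at the same exposure prevalence.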

Collaboration
The Generation R Study is conducted by several research groups from the Erasmus MC in close collaboration with the Erasmus University Rotterdam and the Municipal Health Service Rotterdam area. Since the data collection is still ongoing and growing, the number of collaborating research groups in and outside the Netherlands is expected to increase. Various research projects are performed as part of ongoing European or worldwide collaboration projects.
The study has an open policy with regard to collaboration with other research groups. Requests for collaboration can be sent to Vincent Jaddoe (v.jaddoe@erasmusmc.nl). These requests will be discussed in the Generation R Study Management Team regarding their study aims, overlap with ongoing studies, logistic consequences and related finances. After approval of a project by the Generation R Study Management Team and the Medical Ethical Committee of Erasmus MC, the collaborative research project is embedded in one of the research areas supervised by the corresponding principal investigator.
The Generation R Study is conducted by the Erasmus Medical Center in close collaboration with the School of Law and Faculty of Social Sciences of the Erasmus University Rotterdam, the Municipal Health Service Rotterdam area, Rotterdam, the Rotterdam Homecare Foundation, Rotterdam and the Stichting Trombosedienst & Artsenlaboratorium Rijnmond (STAR-MDC), Rotterdam. We gratefully acknowledge the contribution of children and parents, general practitioners, hospitals, midwives and pharmacies in Rotterdam. We thank Sylvie van den Assum, Ronald van den Nieuwenhof and Patricia van Sichem-Maeijer for their study coordination. The general design of the Generation R Study is made possible by financial support from the Erasmus MC, University Medical Center, Rotterdam, the Netherlands Organization for Health Research and Development (ZonMw) and the Ministry of Health, Welfare and Sport. Liesbeth Duijts received [...]
The presented effect sizes are detectable proportions of the standard deviation with a type I error of 5% and a type II error of 20% (power 80%).