Overall study objectives
The primary objectives of the YWHHS were to provide insight into modifiable early life and life-course factors associated with young-onset (< 50 years) BC risk and to understand racial and socioeconomic inequities in BC risk in the U.S. [40, 44,45,46,47]. We are investigating (1) the association between early life and life-course factors and risk for BC overall and by tumor subtype among young NHB and NHW women [9, 27,28,29,30,31,32] and (2) the potentially modifying effects of the socio-historic context of race/ethnicity (hereafter “race”) and life-course socioeconomic position (SEP) on BC risk. We have also (3) created a biorepository of blood (or saliva) and breast tumor tissue for current and future study of the contribution of biomarkers, gene-environment interactions, and gene expression to BC risk in young women.
Overall study design
BC cases diagnosed between 2010 and 2015 were identified from the metropolitan Detroit (Oakland, Wayne, and Macomb counties) and Los Angeles County Surveillance, Epidemiology and End Results (SEER) registries. Controls were identified through area-based sampling from the 2010 Census and matched to cases by study site, age, and race. Primary data collected included: (1) an in-person computer-assisted personal interview (CAPI) conducted with a life history calendar, (2) anthropometric measurements, (3) blood collection (or saliva collection when blood was not available) and a related questionnaire, (4) SEER tumor type information, including ER, PR, and HER2 status, and (5) breast tumor tissue collected from participants’ BC surgeries. Additional collected data included: (6) an interviewer-completed built environment survey of participants’ neighborhoods, (7) a survey completed by participants’ primary childhood caregiver, and (8) childhood photos of body size. We also requested permission to obtain information from the health department(s) (9) where participants gave birth and (10) where they were born, and (11) most recent mammogram reports from healthcare providers. Participation in the main study questionnaire was necessary for enrollment; all other study components were optional. This study protocol was approved by the Institutional Review Boards at the University of Wisconsin–Milwaukee (UWM); Michigan State University (MSU); Wayne State University (WSU); the Michigan Department of Community Health; University of Southern California (USC); and the California Committee for the Protection of Human Subjects (CPHS). For the Medical College of Wisconsin (MCW), IRB oversight was deferred to UWM. The California Cancer Registry also approved the study.
Study organization
The YWHHS Coordinating Center (initially hosted at MSU, moved to UWM in 2014) was responsible for study design, development, and oversight of the study tracking system. Westat, a research services corporation, and study collaborators developed the control sampling design, oversaw identification and recruitment of control participants, and created final study sample weights. Final recruitment, in-person interviews, and biospecimen collection were conducted at two field sites: Los Angeles County (at USC) and metropolitan Detroit (at WSU). A community advisory panel was assembled and consulted about data collection materials and study methodologies.
Eligibility criteria (see Table 1)
Table 1 Eligibility criteria for cases of breast cancer and controls, Young Women’s Health History Study
Study tracking system
A centralized computer system that tracked all corresponding study data and biospecimens was adapted and managed for YWHHS by the USC Cancer Research Informatics Core (CRIC).
Ascertainment, sampling, recruitment, and screening
Ascertainment, sampling, recruitment, and screening activities for cases and controls are outlined in Fig. 2.
Cases
Potentially eligible cases were identified by the Metropolitan Detroit Cancer Surveillance System (MDCSS) SEER registry and the LA County Cancer Surveillance Program (CSP) SEER registry. For both sites, cases were identified through rapid case ascertainment (RCA), which aims to identify cases within 3–6 months after diagnosis.
Case sampling.
Because of budgetary constraints, we selected only a sample of eligible NHW women 45–49 years of age. Given that there is a paucity of studies among NHB women, the youngest women (< 45 years of age), and women diagnosed with estrogen receptor-negative tumors, we retained all NHB women diagnosed at 20–49 years of age and all NHW women diagnosed at 20–44 years of age, and, among NHW women aged 45–49 years, oversampled women with estrogen receptor-negative tumors. Thus, all eligible NHB cases 20–49 years of age and NHW cases 20–44 years of age were included, along with a sample of NHW cases aged 45–49 years (n = 829 of 2,527 in Detroit; n = 883 of 2,782 in LA). NHW cases aged 45–49 years were sampled as follows: between 09/01/2010 and 08/31/2012, 30.5% of all cases; between 08/31/2012 and 08/31/2015, 84.5% of ER- cases and 40.8% of ER+ cases.
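The sampling fractions above imply a selection probability for each case, and the inverse of that probability is the usual starting point for a sampling weight. The following is a minimal sketch only, not the study's implementation: the function names are hypothetical, the date compared is assumed to be the diagnosis date, and a case dated exactly 08/31/2012 (which the text lists in both periods) is assigned to the first period as an assumption. The resulting value is a base weight before any non-response adjustment.

```python
from datetime import date

# Sampling fractions reported in the text for NHW cases aged 45-49 years.
RATE_PERIOD_1 = 0.305                          # 09/01/2010 to 08/31/2012
RATE_PERIOD_2 = {"ER-": 0.845, "ER+": 0.408}   # later period, by ER status
PERIOD_1_END = date(2012, 8, 31)               # boundary handling is an assumption


def selection_probability(race, age, er_status, diagnosis_date):
    """Return the probability that an eligible case was kept in the sample."""
    # All NHB cases aged 20-49 and all NHW cases aged 20-44 were retained.
    if race == "NHB" or age < 45:
        return 1.0
    # NHW cases aged 45-49: one overall rate in the first period.
    if diagnosis_date <= PERIOD_1_END:
        return RATE_PERIOD_1
    # Later period: rate depends on ER status (ER- oversampled).
    return RATE_PERIOD_2[er_status]


def base_weight(race, age, er_status, diagnosis_date):
    """Inverse-probability base weight, before non-response adjustment."""
    return 1.0 / selection_probability(race, age, er_status, diagnosis_date)
```

Under these assumptions, an NHW case aged 47 with an ER+ tumor sampled in the later period would represent about 2.45 eligible cases (1 / 0.408), while every NHB case would carry a base weight of 1.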
Case screener interview.
All sampled cases were screened to determine final eligibility status. Cases not successfully screened by a study site team were checked against the updated SEER Registry to determine eligibility status. Cases initially sampled were considered ineligible for the following reasons: not U.S.-born (n = 373), self-identified as neither White nor Black (n = 153), self-identified as Hispanic (n = 151), previous cancer diagnosis (n = 117), resided outside of the study areas at reference date (see definition of reference date in Table 1; n = 50), tumor had ineligible histology (n = 44), did not speak English (n = 29), updated age or reference date was out-of-range (n = 17), physically or mentally unable to complete the interview (n = 14), or institutionalized at reference date (n = 7). Two percent of cases were ineligible for screening for one or more of these reasons. In Detroit, a letter was sent to each eligible case’s physician before cases were contacted; if the physician did not respond within three weeks the case could be contacted, except for a few Detroit hospitals that required active physician approval.
Controls
YWHHS investigators and the Westat team developed the area-based control sampling strategy and Westat developed the statistical sampling methodology [48, 49]. Westat also oversaw control identification and recruitment, household rostering, and screener interviews, and initiated control recruitment efforts. Once potentially eligible controls were identified, their contact information was provided to the YWHHS Coordinating Center to be entered into the study tracking database for recruitment.
Control sampling.
A three-stage area probability sample was drawn to cover metropolitan Detroit and LA County, the areas from which YWHHS case participants were identified (see Supplemental Materials). The first stage was the selection of primary sampling units (PSUs), each consisting of one or more Census blocks as identified in the 2010 U.S. Census. Within sampled PSUs, the second stage was the sampling of more than 24,000 addresses from listings based on addresses served by the U.S. Postal Service. Households within occupied sampled addresses were rostered to identify members who were potential controls for the study. The third stage involved randomly selecting women from among those potentially eligible. The sampling rates were designed to obtain a set of controls frequency matched to the expected case distribution within study site by race (NHB/NHW) and 5-year age intervals.
Control household roster.
A total of 24,612 households were sampled (Table 2) and 21,668 were determined eligible for roster. An introductory letter, brief roster, and a $2 bill were mailed to all sampled residential addresses. Follow-up household contact then used the same recruitment protocol as the National Health and Nutrition Examination Survey [50]. A total of 18,612 households were rostered. The roster asked for the initials/name, age, and race/ethnicity of all adult women 20–50 years old in the household (see Supplementary Materials for additional details).
Table 2 Overall ascertainment numbers by race and site, Young Women’s Health History Study
Control screener interview.
An in-person screener interview was conducted to determine the final eligibility of potentially eligible women identified and sampled from the household roster. Those who completed the screener received $5. Respondents willing to participate or interested in learning more were asked to provide their contact information for a study site (WSU/USC) interviewer to contact them.
Data collection
In-home case–control interview recruitment.
An introductory letter and study brochure were sent to all sampled case and control women. After sending the introductory letter, study staff (WSU/USC) telephoned women to determine (cases) or confirm (controls) eligibility, answer questions, and identify a location and time for an in-person interview. Women not reached by phone were sent follow-up letters and reminder postcards and, in some cases, were visited in person. Women who declined to participate were asked to complete a brief questionnaire about demographic characteristics to characterize non-respondents.
In-person interview scheduling and informed consent.
Study participants were interviewed at their selected location. Prior to the interview, participants were mailed a confirmation letter and their interviewer’s business card with a photograph. Before the interview, the participant was asked to read and sign a consent form that described the study and participant rights and safeguards; it also requested permission to conduct the interview and each component of the study. Women were informed they could refuse any questions and terminate the interview at any time. Women who had had a mammogram were asked to complete a separate consent form that requested permission to obtain information from their healthcare provider about their last mammogram before the reference date. Additionally, case participants were asked to provide consent to obtain tumor tissue sampled at the time of diagnosis or thereafter. A thank you gift of $75, which was later increased to $100, was provided for the main interview.
Main questionnaire.
The YWHHS questionnaire captured information about energy balance factors (e.g., childhood and adult diet, physical activity, and adult body size), factors known to affect life-course energy balance (e.g., food security, sleep patterns, built environment), known risk factors for BC (e.g., reproductive and family history), as well as race/ethnicity and life-course socioeconomic indicators. Collected information related to race/ethnicity includes self-reported race and Hispanic ethnicity, as well as the race/ethnicity others typically ascribe to the participant. We also asked about early life discrimination, experiences of everyday discrimination, and the source of discrimination. Life-course socioeconomic indicators include residential history, household percent poverty (HPP), educational attainment, and occupational status [51, 52]. HPP was calculated using household net income adjusted for household size. Other collected factors associated with social context include life-course experiences of adversity (including childhood experiences), financial status and use of governmental subsidies, food insecurity, occupational status, and health insurance status. Other information on factors potentially associated with BC risk includes prenatal exposures, medical history, non-steroidal anti-inflammatory medication use, contraceptive use, hormone medication use, fertility history, life-course personal and secondhand tobacco exposure, and alcohol use. Study questions were developed based on existing questionnaires [53,54,55,56,57].
Multiple tools were used throughout the questionnaire to assist participants with recall, including a life history calendar of key life events [58], showcards, which also provided a non-verbal method of responding to sensitive questions, and a photobook of oral contraceptive, hormone, and thyroid medications [58].
Additional components of the in-person interview
Anthropometric assessment.
Height, weight, waist circumference, and body composition (assessed by Tanita bioelectrical impedance analysis (BIA)) were measured.
Diet.
A modified version of the full 100-item Block Food Frequency Questionnaire (FFQ) was developed by NutritionQuest (Berkeley, CA) with the study PI (Velie) to capture total diet and foods suspected to be associated with BC risk (e.g., cruciferous vegetables) in the 12 months prior to reference date. The FFQ was administered on paper or verbally during the interview; those who did not complete it at the interview returned it via mail or at the phlebotomy visit. Childhood diet was assessed with a food list.
Childhood photographs.
Participants provided photos from “head to toe” at ages 6, 9, 12, 15, and 18 years to validate recalled relative body size (assessed by somatotype); photos were scanned and de-identified by digitally masking the participant’s eyes/face, if requested.
Built environment survey.
Interviewers conducted a survey of neighborhood characteristics, primarily at the time of the interview [59, 60]. Surveys not completed by the end of study recruitment (6.5%) were conducted remotely via Google Maps Street View using photos collected at the date closest to the interview date [61].
Primary caregiver survey.
Participants were asked to mail their primary childhood caregiver a brief survey. Caregivers were given $10. The survey included respondent’s demographics, biologic mother’s pregnancy with the participant, and the study participant’s childhood body size, physical activity, and SEP.
Biospecimen collection
Blood.
All study participants were asked to provide a blood sample. Samples were collected by a phlebotomist, generally at the second visit (96%, 4% at first visit). Phlebotomists attempted to obtain 30 mL (approximately 2 tablespoons) collected in four 10-mL vacutainers: two with no additive and two with EDTA. For cases, our protocol indicated samples should not be collected until at least two months after last treatment (average days post treatment = 376 days; 95% CI 353.9, 398.6). Participants who provided blood samples were originally given a $20 thank you gift, which was later increased to $25. Samples were processed at the MSU Cytogenics laboratory and MCW Tissue Bank.
Blood Questionnaire.
Phlebotomists administered a questionnaire to each participant at the time of blood draw. Questions addressed recent medication use; medical history; menstrual, pregnancy, and lactation status; and recent food, beverage, alcohol, and tobacco consumption.
Menstrual calendar.
During the main interview, if a participant reported menstruating within the past year and if they consented to have their blood drawn, they were asked to complete a menstrual calendar that indicated each day they experienced menstrual bleeding until the date their blood was drawn. If participants had not completed this calendar at the time of blood draw, the phlebotomist completed it with the participant for the preceding two months.
Menstrual postcard.
At the end of the blood draw, menstruating participants were given a pre-addressed stamped postcard, and asked to record the date of the first day of their next menstrual cycle and mail it; this information was used to determine the participant’s menstrual phase at the time her blood was drawn.
Saliva.
Participants unwilling or unable to provide a blood sample were asked to provide a saliva sample with the Oragene OG-500 DNA kit. Saliva samples were collected immediately after administration of the main questionnaire, by the phlebotomist at the second visit, or mailed to the participant after the first visit and returned by mail.
Tumor SEER Information.
Tumors were characterized by ER, PR, and HER2 molecular subtypes, and histological grade to differentiate luminal A and luminal B tumors using data from SEER registries [11]. SEER reports also included ICD-O codes, tumor size, laterality, lymph node involvement, and initial treatment and surgical history.
Tumor Tissue.
To evaluate other tumor characteristics, e.g., Ki-67 status [11], tumor tissue from consenting cases was requested from the hospitals or clinics where it was stored; when possible, tumor samples were taken before treatment. When adequate tissue was provided, tumor microarrays (TMAs) were created.
Biospecimen storage.
All blood, saliva, and tumor tissue biospecimens are stored at the MCW Tissue Bank as part of the YWHHS Biorepository. Separate biomarker studies will be conducted with all collected biospecimens.
Interviewer Training and Quality Control Measures
Control recruitment interviewer training.
Control field interviewers were employees of Westat. Interviewers from both study sites were trained together to synchronize data collection. Once they demonstrated adherence to all protocols they were certified for data collection.
Study site interviewer and phlebotomist training.
Training was conducted by the YWHHS Coordinating Center to synchronize data collection. All field staff completed appropriate IRB-mandated training and field safety training and were certified by the YWHHS Coordinating Center once they demonstrated adherence to all protocols and competence in a complete study interview.
Main interview and phlebotomy quality control.
Interviews and phlebotomy visits of consenting participants were audio recorded for quality control. A trained evaluator reviewed the first five recorded interviews completed by each interviewer, plus additional interviews as needed based on performance (4.8% of completed interviews in Detroit; 2.6% in LA). The evaluator documented discrepancies in recorded responses, deviations from protocol, and appropriate probing, and provided detailed feedback to each interviewer.
Study response and cooperation rate calculations
Response and cooperation rates were calculated using imputation methods in accordance with the American Association for Public Opinion Research (AAPOR) guidelines [62] (see Supplemental Tables 1 and 2).
Sample weights
Sample weights were created for both cases and controls to account for sampling design and non-response. Weights reflect probabilities of selection and adjustments for non-response. Adjustments for non-response were done at the screener and main interview levels. To achieve the frequency matching of controls to cases, a weighted distribution of cases for each study site was established across cells of age and race. The sample weights of controls were then post-stratified to the weighted totals within each of these cells [63]. Additionally, replicate weights were created to develop estimates of variability, including standard errors. Demographic characteristics were obtained for 86% of sampled controls (complete roster information), and 100% of sampled case participants (age, race, site, county, ER status) to inform non-response weights. Replicate weights were created for case–control analyses and case-only analyses. A second set of weights was created for control-only analyses, to weight controls to the source population. Replicate weights were also created for blood sample analyses.
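The post-stratification step described above can be illustrated with a short sketch. This is not the code used to build the YWHHS weights; the record layout (`site`, `age_group`, `race`, `weight`) and the function names are hypothetical, and the sketch assumes the non-response-adjusted weights have already been computed. Within each site-by-age-by-race cell, every control weight is multiplied by the ratio of the weighted case total to the weighted control total, so that the controls' weighted distribution matches that of the cases.

```python
from collections import defaultdict


def cell_totals(records):
    """Sum weights within each (site, age_group, race) cell."""
    totals = defaultdict(float)
    for r in records:
        totals[(r["site"], r["age_group"], r["race"])] += r["weight"]
    return totals


def poststratify_controls(controls, cases):
    """Scale control weights so each cell's weighted total equals the cases'.

    Returns new records; the input lists are not modified. A control cell
    with no cases receives a weight of zero in this simplified sketch.
    """
    case_totals = cell_totals(cases)
    control_totals = cell_totals(controls)
    adjusted = []
    for r in controls:
        cell = (r["site"], r["age_group"], r["race"])
        factor = case_totals[cell] / control_totals[cell]
        adjusted.append({**r, "weight": r["weight"] * factor})
    return adjusted
```

For example, if two controls in a cell carry weights of 10 and 30 and the cases in that cell have a weighted total of 2, both control weights are multiplied by 2 / 40, giving 0.5 and 1.5. Replicate weights would repeat the same adjustment within each replicate, which this sketch does not attempt.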
Statistical analyses
Primary analyses are conducted using survey-weighted multiple logistic regression to account for study design and potential confounding. Where appropriate, potential effect modification by study site, race, and/or socioeconomic position is being evaluated. For some analyses, structural equation modeling (SEM) with latent variables is being conducted to evaluate exposures over the life-course [64]. Additionally, for some analyses we are using survey-weighted polytomous logistic regression to assess heterogeneity in risk by tumor subtypes.
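To make the idea of a weighted logistic model concrete, the sketch below fits a logistic regression with an intercept and a single covariate by maximizing the weighted log-likelihood with Newton's method. It is an illustration under simplifying assumptions, not the study's analysis code: the function name is hypothetical, only point estimates are produced, and design-based standard errors (for example from the replicate weights) are not computed. In practice such models would normally be fitted with dedicated survey software and would include confounders.

```python
import math


def weighted_logistic(x, y, w, iterations=25):
    """Fit logit P(y = 1) = b0 + b1 * x using observation weights w.

    x, y, w are equal-length sequences; y is 1 for cases and 0 for controls.
    Returns the point estimates (b0, b1).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Weighted score (gradient) and information matrix entries.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, wi in zip(x, y, w):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += wi * (yi - p)
            g1 += wi * (yi - p) * xi
            v = wi * p * (1.0 - p)
            h00 += v
            h01 += v * xi
            h11 += v * xi * xi
        # Newton step: solve the 2x2 system (information) * delta = score.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1
```

With a single binary exposure and no other covariates, `math.exp(b1)` equals the odds ratio computed from the 2x2 table of summed weights, which offers a simple check on the fit.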