Melanocortin-1 receptor, skin cancer and phenotypic characteristics (M-SKIP) project: study design and methods for pooling results of genetic epidemiological studies
- Cite this article as:
- Raimondi, S., Gandini, S., Fargnoli, M.C. et al. BMC Med Res Methodol (2012) 12: 116. doi:10.1186/1471-2288-12-116
For complex diseases like cancer, pooled-analysis of individual data represents a powerful tool to investigate the joint contribution of genetic, phenotypic and environmental factors to the development of a disease. Pooled-analysis of epidemiological studies has many advantages over meta-analysis, and preliminary results may be obtained faster and with lower costs than with prospective consortia.
Design and methods
Based on our experience with the study design of the Melanocortin-1 receptor (MC1R) gene, SKin cancer and Phenotypic characteristics (M-SKIP) project, we describe the most important steps in planning and conducting a pooled-analysis of genetic epidemiological studies. We then present the statistical analysis plan that we are going to apply, giving particular attention to methods of analysis recently proposed to account for between-study heterogeneity and to explore the joint contribution of genetic, phenotypic and environmental factors in the development of a disease. Within the M-SKIP project, data on 10,959 skin cancer cases and 14,785 controls from 31 international investigators were checked for quality and recoded for standardization. We first proposed to fit the aggregated data with random-effects logistic regression models. However, for the M-SKIP project, a two-stage analysis will be preferred to overcome the problem regarding the availability of different study covariates. The joint contribution of MC1R variants and phenotypic characteristics to skin cancer development will be studied via logic regression modeling.
Methodological guidelines to correctly design and conduct pooled-analyses are needed to facilitate application of such methods, thus providing a better summary of the actual findings on specific fields.
Keywords: Genetic epidemiology; Melanoma; Meta-analysis; Pooled-analysis; Skin cancer; Study design
Since millions of Single Nucleotide Polymorphisms (SNPs) were identified by the SNP Consortium, a growing number of studies have reported the association of SNPs in candidate genes with several diseases. However, individual studies of typical size usually have low statistical power to find true associations given the polygenic nature of most common diseases, let alone the various forms of potential interactions between genetic, phenotypic and environmental factors. The advent of genome-wide association studies allowed genotyping of hundreds of thousands of SNPs across the genome on a usually large number of subjects, but information on a wide spectrum of epidemiological and lifestyle factors was seldom collected, although the role of these factors in complex diseases is undoubtedly crucial.
Meta-analysis of genetic epidemiological studies has been adopted to increase the power of smaller candidate gene studies by summarizing results from multiple studies. However, the lack of access to individual data precludes in-depth investigations, including analyses of gene-gene and gene-environment interaction, and appropriate stratified analyses. This may potentially lead to false-positive or false-negative results, or biased magnitudes of associations, as previously pointed out.
Pooled-analysis of the primary data has been shown to have critical methodological advantages over meta-analysis [3, 4] and has been applied successfully in the genetic epidemiology field [4, 5, 6, 7, 8, 9, 10, 11]. Pooled-analysis uses standardized definitions of cases, outcomes and covariates, as well as the same analytical methods, thus limiting potential sources of heterogeneity across different studies. It also allows investigators to better control for confounding factors, evaluate alternative genetic models and estimate the joint effect of multiple genes. Finally, population-specific effects and gene-gene and gene-environment interactions may be better assessed using pooled-analysis. The pooling of data from observational studies has become more common recently, and different approaches to data analysis have been applied. Methodological guidelines to correctly design and conduct pooled-analyses are needed to facilitate application of such methods, thus providing a better summary of the actual findings on specific fields. Moreover, awareness of the potential problems connected with the establishment of international collaborations and data pooling might help investigators to avoid or overcome them.
We describe here our experience with the study design of an international pooled-analysis on Melanocortin-1 receptor gene, SKin cancer and Phenotypic characteristics (M-SKIP project). In the first part of the paper, we explain the procedures that were used to identify studies and to collect and standardize data. In the second part we describe the statistical analysis plan that we are going to apply, giving particular attention to methods of analysis recently proposed to account for between-study heterogeneity and to explore the joint contribution of genetic, phenotypic and environmental factors in the development of a disease.
The M-SKIP project: rationale and aims
Melanocortin-1-receptor (MC1R, MIM#155555) is one of the major genes that determine skin pigmentation and it has been reported to be associated with risk of melanoma, possibly through the determination of the tanning response of skin to UV radiation [15, 16, 17]. However, the relationship between some MC1R variants and melanoma, observed also in darkly-pigmented European populations, suggests that MC1R signaling may have an additional role in skin carcinogenesis beyond the UV-filtering differences between dark and fair skin. In previous meta-analyses [14, 19, 20] authors found evidence of a significant association between melanoma, red hair and fair skin and the five MC1R variants R151C, R160W, D294H, D84E and R142H, and suggested a possible role in melanoma development, via non-pigmentary pathways, for the I155T and R163Q variants. However, the specific contribution of each MC1R variant to melanoma development via pigmentary and non-pigmentary pathways could not be evaluated in meta-analyses because joint individual-level information on MC1R variants and phenotypic characteristics was lacking.
The aim of the M-SKIP project is therefore to perform a pooled-analysis of individual data on sporadic skin cancer cases and controls with information on MC1R variants, in order to: 1) assess the association of MC1R variants with melanoma, basal cell carcinoma (BCC) and squamous cell carcinoma (SCC); 2) assess the association between MC1R variants and phenotypic characteristics, including hair and eye color, skin color, skin type, common and atypical nevi, freckles, and solar lentigines; and 3) perform stratified analyses on MC1R variants and skin cancer by phenotypic characteristics, and evaluate MC1R-phenotype interaction in skin cancer risk.
Data collection and creation of the standardized dataset
The identification of data sets and data collection
Published epidemiological studies on MC1R variants, melanoma, non-melanoma skin cancer (NMSC) and phenotypic characteristics associated with melanoma [21, 22] were searched until April 2010 in the following databases: PubMed, ISI Web of Science (Science Citation Index Expanded) and Embase, using the keywords “MC1R” and “melanocortin 1 receptor” alone and in combination with the terms “melanoma”, “basalioma”, “basal cell carcinoma”, “squamous cell carcinoma”, “skin cancer”, “hair color”, “skin color”, “skin type”, “eye color”, “nevi”, “freckles”, and “solar lentigines”, with no search restriction. The computer search was supplemented by consulting the bibliographies of the articles and reviews. We also tried to identify unpublished datasets by personal communication with participant investigators, members of the Advisory Committee, and with attendees of scientific meetings. Unpublished datasets were evaluated by an internal peer-review process before inclusion.
We selected papers according to the following inclusion criteria: 1) observational studies on single-primary sporadic skin cancer cases with information on any MC1R variant or 2) control series with information on any MC1R variant and at least one phenotypic characteristic under study. Permanent exclusion criteria were: 1) populations selected for MC1R status or for other genetic factors, 2) studies including only familial and/or multiple-primary melanoma cases, because we wanted to study the MC1R-melanoma association at a population level, therefore excluding cases for whom the role of genetics is probably stronger. In the first step of the project, we also excluded genome-wide association studies (GWAS), because their different study design and genotyping methodology would significantly increase the heterogeneity of our data; however, GWAS with epidemiological data will be included in a later step of the project and their results will be compared with those of classical genetic epidemiological studies.
The original search provided 748 papers; among them, 111 were considered potentially relevant, and their full-text articles were retrieved and evaluated. We excluded 49 articles for the following reasons: duplicate populations (N = 20), no data on outcome (case/control status or any of the studied phenotypic characteristics) or on MC1R variants (N = 12), case reports, commentaries or reviews (N = 6), GWAS (N = 6), populations selected for genetic factors (N = 4) and multiple primary melanoma cases only (N = 1). The remaining 62 independent studies were considered eligible for inclusion in the pooled-analysis.
For each independent study, we identified the corresponding investigator and retrieved his/her contact information. Each investigator was invited to join the M-SKIP project: this required them to sign a participation form and a document attesting to approval of the study guidelines, and then to provide their data in electronic form without restrictions on format. A detailed list of variables relevant for skin cancer was provided and, for each available variable in the list, the authors were required to complete a form with a clear and complete description of how it was collected and coded. Investigators did not send any personal identifier with data, but only identification codes. Finally, investigators were asked to send a signed statement declaring that the original study was approved by an Ethics Committee and/or that study subjects provided written consent to participate in the original study.
Data collection started in May 2009 and was closed in December 2010. During this period, 43 investigators were contacted and invited to share data. Thirty-one (72%) agreed to participate and provided data on 28,998 subjects, including 13,511 skin cancer cases (10,182 melanomas) and 15,477 controls from 37 independent published [19, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62] and 2 unpublished studies. Both the unpublished datasets came from investigators who were originally contacted for their published data and who had further data of (still) unpublished studies. Among the 12 non-participant investigators, seven did not reply to our invitation letter, three were not able to retrieve the original dataset and two were not interested in the project. The total number of skin cancer cases and controls from the 25 independent studies [63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95] of non-participant investigators was 5,135 and 8,262, respectively. The study design was case–control for 13 studies, control-only for 11 studies, and case-only for one study.
Quality control, data coding and creation of the standardized dataset
We inspected the data for completeness and resolved inconsistencies with the investigator of each study. A number of subjects were excluded for the following reasons: multiple-primary melanoma cases (N = 1596), missing data on MC1R variants (N = 1081), non-skin melanoma cases (N = 150), subjects with atypical mole syndrome and no skin cancer (N = 58), non first-primary melanoma cases (N = 24), familial melanoma cases, defined as subjects with two first-degree relatives or three or more any-degree relatives with melanoma (N = 25), and other reasons, including unknown case/control status, duplicate subjects, or inappropriate controls (N = 232).
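As a quick consistency check on the figures reported in this paper, the exclusions listed above can be subtracted from the 28,998 subjects originally provided and compared with the size of the final dataset (7,806 melanoma cases, 3,151 NMSC cases and 14,875 controls). The short Python sketch below only reproduces that arithmetic; the variable names and dictionary labels are illustrative shorthand, not part of the study.

```python
# Subjects provided by the participating investigators (figure from the text).
subjects_received = 28_998

# Exclusions applied during quality control (labels are shorthand).
exclusions = {
    "multiple-primary melanoma": 1596,
    "missing MC1R data": 1081,
    "non-skin melanoma": 150,
    "atypical mole syndrome, no skin cancer": 58,
    "non first-primary melanoma": 24,
    "familial melanoma": 25,
    "other reasons": 232,
}

total_excluded = sum(exclusions.values())
remaining = subjects_received - total_excluded

# Final dataset as reported: melanoma cases + NMSC cases + controls.
final_dataset = 7806 + 3151 + 14875

print(total_excluded, remaining, final_dataset)
```

The number of subjects remaining after exclusions matches the size of the final dataset.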
Table: List of the main variables, number of original studies and related subjects per variable.
Columns: Studies (%), N = 39; Melanoma cases (%), n = 7806; NMSC cases (%), n = 3151; Controls (%), n = 14875.
Rows: Body mass index; Intermittent sun exposure; Continuous sun exposure; Artificial UV exposure; Family history of skin cancer; Family history of cancer other than skin; Melanoma body site.
Some variables were collected in different ways in different studies. We report here as an example the rules we used to standardize sun exposure variables, in order to provide suggestions on how to recode variables with highly heterogeneous assessment among studies.
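- 1) calculate the variable mean (μ) over all the study subjects as: $\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$ (1), where $x_i$ is the value recorded for subject $i$ and $n$ is the number of study subjects;
- 2) calculate the average hours of exposure/day (ν) over all the datasets with the variable coded (or recoded) in this way, as in 1);
- 3) recode each observation on the basis of the proportion as: $y_i = x_i \cdot \nu / \mu$ (2), where $y_i$ is the recoded value in hours of exposure/day;
- 4) set to 6 (maximum hours of exposure per day) all calculated values greater than 6.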
The assumption underlying this coding was that the average sun exposure pattern for study subjects was similar for different studies (and countries). Since we will use this variable only for confounding adjustment and/or effect modifier analyses, the purpose was to regroup subjects with a similar pattern of sun exposure, although the precise individual amount of sun exposure could not be estimated.
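The recoding rule can be illustrated with a minimal Python sketch. It assumes that "recoding on the basis of the proportion" means rescaling each value by ν divided by the study mean, which is one reading of the rule rather than a confirmed formula; the function name, the example scores and the value ν = 2.5 are hypothetical.

```python
from statistics import mean

MAX_HOURS_PER_DAY = 6.0  # cap stated in the text


def recode_sun_exposure(values, reference_hours_mean):
    """Rescale one study's exposure values to hours of exposure per day.

    values: raw exposure values for one study (any unit or score).
    reference_hours_mean: average hours/day (nu) over the studies
        whose variable is already expressed in hours/day.
    """
    study_mean = mean(values)  # mean over the study subjects
    if study_mean <= 0:
        raise ValueError("study mean must be positive to rescale")
    recoded = []
    for x in values:
        hours = x * reference_hours_mean / study_mean  # assumed proportion
        recoded.append(min(hours, MAX_HOURS_PER_DAY))  # cap at 6 hours/day
    return recoded


# Hypothetical example: an exposure score, with nu = 2.5 hours/day.
scores = [1, 2, 3, 10]
print(recode_sun_exposure(scores, reference_hours_mean=2.5))
# prints [0.625, 1.25, 1.875, 6.0]
```

The last subject would be assigned 6.25 hours before capping, which shows why the upper limit is needed for skewed score distributions.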
As a general rule, when a variable (e.g. common nevi count) was collected in classes, we recoded each class by using its median. The maximum values for open-ended categories were chosen according to the available M-SKIP data.
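A small sketch of this class recoding follows. It assumes that the median of a class is taken as the midpoint of its bounds, which may differ from the authors' procedure; the class limits and the maximum of 100 chosen for the open-ended class are hypothetical.

```python
def class_midpoint(lower, upper=None, open_max=None):
    """Return one representative value for a class such as 11-50 or '>50'."""
    if upper is None:  # open-ended class: an assumed maximum is required
        if open_max is None:
            raise ValueError("open-ended class needs an assumed maximum")
        upper = open_max
    return (lower + upper) / 2


# Hypothetical nevus-count classes; 100 is an assumed maximum for '>50'.
classes = {"0-10": (0, 10), "11-50": (11, 50), ">50": (51, None)}
recoded = {label: class_midpoint(lo, hi, open_max=100)
           for label, (lo, hi) in classes.items()}
print(recoded)
# prints {'0-10': 5.0, '11-50': 30.5, '>50': 75.5}
```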
Brief description of the collected data and statistical power
The final dataset was created in June 2011 and included data on 7,806 melanoma cases, 3,151 NMSC cases (2,211 BCC, 788 SCC and 152 with both), and 14,875 controls.
Table: Summary of data included in the M-SKIP project by geographical location. Columns: Participant investigators (studies).
Table: Main characteristics of the included studies. Columns: Studies (%), N = 39; Melanoma cases (%), n = 7806; NMSC cases (%), n = 3151; Controls (%), n = 14875. Rows: Source of controls; Population or healthy; Examination by an expert.
We calculated that the minimum required sample size to find a statistically significant association between an MC1R variant and melanoma, assuming a similar association to that observed in our previous meta-analysis (Odds Ratio (OR) = 1.5), is around 7,500 cases and 7,500 controls for rare variants (1-2% allele frequency in controls), and 1,400 cases and 1,400 controls for common variants (8-10% allele frequency in controls), with 90% statistical power. Sample size for gene-environment interaction analysis was also calculated with the program POWER, version 3.0. Considering the study of a simple two-way interaction between an environmental factor and a rare MC1R variant, around 5,000 cases and 5,000 controls would be needed to observe a multiplicative interaction effect of 2.0, rising to 16,000 cases and 16,000 controls to observe a smaller multiplicative effect of 1.5, both with 90% statistical power. For common MC1R variants, the same gene-environment interaction effects of 2.0 and 1.5 could be observed with around 1,200 cases and 1,200 controls, and with around 3,500 cases and 3,500 controls, respectively. Our sample size is therefore appropriate for the purpose of the analysis, and large enough to allow stratified and interaction analyses: even small interaction effects can be detected for the most frequent variants, and larger interaction effects for less common variants.
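The order of magnitude of main-effect sample sizes can be explored with the standard normal-approximation formula for comparing two proportions in an unmatched case-control study with equal numbers of cases and controls. The sketch below is illustrative only: whether the authors used this formula is not stated, and the function names, the two-sided significance level of 0.05 and the exposure frequencies passed in are assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist


def exposed_fraction_in_cases(p0, odds_ratio):
    """Exposure frequency among cases implied by p0 in controls and an OR."""
    return odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))


def cases_needed(p0, odds_ratio, alpha=0.05, power=0.90):
    """Cases (and as many controls) needed for a two-sided comparison."""
    p1 = exposed_fraction_in_cases(p0, odds_ratio)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p0 + p1) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p0 * (1 - p0) + p1 * (1 - p1))) ** 2
    return ceil(numerator / (p1 - p0) ** 2)


# Hypothetical inputs: exposure frequencies of 1.5% and 9% in controls.
for frequency in (0.015, 0.09):
    print(frequency, cases_needed(frequency, odds_ratio=1.5))
```

Running the loop shows how sharply the required number of subjects grows as the variant becomes rarer, which is the point made in the paragraph above.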
Statistical analysis plan
Appropriateness and representativeness of the collected data
Comparability of the main study population characteristics and results between studies included in and excluded from the pooled-analysis will be assessed. Funnel plots to evaluate participation bias will be drawn and Egger's test will be performed.
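Egger's test is commonly implemented as an ordinary least-squares regression of the standardized effect (estimate divided by its standard error) on the precision (one over the standard error), with an intercept far from zero suggesting funnel-plot asymmetry. The following minimal sketch assumes this standard unweighted formulation; the function name and the example values are hypothetical.

```python
from math import sqrt


def egger_regression(effects, standard_errors):
    """Regress effect/SE on 1/SE; return the intercept and its standard error.

    The ratio intercept / SE is compared with a t distribution on n - 2
    degrees of freedom (the p-value is not computed here).
    """
    y = [b / s for b, s in zip(effects, standard_errors)]
    x = [1 / s for s in standard_errors]
    n = len(y)
    if n < 3:
        raise ValueError("need at least three studies")
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residual_ss = sum((yi - intercept - slope * xi) ** 2
                      for xi, yi in zip(x, y))
    sigma2 = residual_ss / (n - 2)
    se_intercept = sqrt(sigma2 * (1 / n + x_bar ** 2 / sxx))
    return intercept, se_intercept


# Hypothetical log-odds ratios and standard errors from five studies.
log_ors = [0.41, 0.35, 0.60, 0.22, 0.75]
ses = [0.10, 0.15, 0.25, 0.12, 0.40]
print(egger_regression(log_ors, ses))
```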
Departure of genotype frequencies of each MC1R variant from expectation under Hardy-Weinberg equilibrium will be assessed by chi-square test among controls for each study, in order to detect any possible genotyping error or stratification problem in the datasets.
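For a single biallelic variant, the Hardy-Weinberg chi-square test compares observed genotype counts in controls with those expected from the estimated allele frequency. The sketch below is a generic implementation with one degree of freedom; the function name and genotype counts are hypothetical, and for rare variants with very small expected counts an exact test may be more appropriate.

```python
from math import erfc, sqrt


def hwe_chi_square(n_ref_ref, n_ref_var, n_var_var):
    """Chi-square test (1 df) of Hardy-Weinberg proportions.

    Arguments are counts of subjects with 0, 1 and 2 variant alleles.
    Returns the chi-square statistic and its p-value.
    """
    n = n_ref_ref + n_ref_var + n_var_var
    q = (2 * n_var_var + n_ref_var) / (2 * n)  # variant allele frequency
    p = 1 - q
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ref_ref, n_ref_var, n_var_var)
    chi2 = sum((o - e) ** 2 / e
               for o, e in zip(observed, expected) if e > 0)
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p_value


# Hypothetical control genotype counts for one variant in one study.
print(hwe_chi_square(810, 180, 10))
```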
Combining data into a single dataset with random effects models
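Written out as a sketch, a random-effects logistic model of the kind described in this section has the form below. The outcome $Y_{ik}$, the exposure indicator $X_{ik}$ and the common intercept $\alpha$ are assumed notation (the original formulation may, for instance, use study-specific intercepts), with $i$ indexing subjects and $k$ indexing studies, $Y_{ik} = 1$ for a skin cancer case and $X_{ik} = 1$ for a carrier of the MC1R variant:

```latex
\operatorname{logit}\,\Pr(Y_{ik}=1 \mid X_{ik}) = \alpha + (\beta + b_k)\,X_{ik},
\qquad b_k \sim N(0, \sigma_b^2) \tag{3}
```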
In this model the transformed regression coefficient exp(β) is the odds ratio of skin cancer for a subject with the MC1R variant compared with a subject without the MC1R variant, and the $b_k$ are the study-specific coefficients accounting for the random selection of studies, with $b_k \sim N(0, \sigma_b^2)$, where $\sigma_b^2$ represents the between-study variance of the MC1R effect.
The logistic regression model described above could be applied to different inheritance models and could include covariates, in order to adjust the studied associations for possible confounding factors. In order to include the available information from all the studies, missing values could be estimated in the model with multiple imputation and/or the creation of a missing-data indicator variable. However, when the majority of missing data result from the non-availability of certain variables in some studies, as for the M-SKIP project, the use of either multiple imputation or the missing-data indicator would be likely to introduce bias in comparison with the complete case method [98, 99], and a two-stage approach would be preferred.
Two-stage analysis with random effects models
The two-stage analysis method will allow us to overcome the problem of the availability of different study covariates. The pooled-estimates of the association of MC1R variants with each skin cancer type and each phenotypic characteristic will be calculated as follows.
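Under assumed notation ($Y_{ik}$ for case status, $X_{ik}$ for the MC1R exposure and $\mathbf{Z}_{ik}$ for the covariates available in study $k$), the two stages can be sketched as a study-specific logistic model followed by a random-effects model for the study-specific estimates. The equation numbers are chosen to agree with the reference to equation 4 in the next paragraph:

```latex
\operatorname{logit}\,\Pr(Y_{ik}=1 \mid X_{ik}, \mathbf{Z}_{ik})
  = \alpha_k + \beta_k X_{ik} + \boldsymbol{\gamma}_k^{\top}\mathbf{Z}_{ik} \tag{4}

\hat{\beta}_k = \beta + b_k + e_k \tag{5}
```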
where β is the pooled-exposure log-odds ratio, $b_k$ are random effects with $b_k \sim N(0, \sigma_b^2)$, where $\sigma_b^2$ represents the variability of the study-specific exposure effects $\beta_k$ about the population mean β, and $e_k$ are independent errors with $e_k \sim N(0, \sigma_k^2)$, where $\sigma_k^2$ describes the within-study variation of the $\hat{\beta}_k$. In the first stage, $\hat{\beta}_k$ and its variance are estimated from equation 4, separately for each study.
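The second stage, in which study-specific log-odds ratios and their variances are combined, can be sketched as follows. The text does not state which estimator of the between-study variance will be used; the DerSimonian-Laird moment estimator is assumed here as a common choice, and the function name and input values are hypothetical.

```python
from math import exp, sqrt


def pool_random_effects(estimates, variances):
    """Pool study-specific log-ORs with a random-effects model.

    Uses the DerSimonian-Laird estimate of the between-study variance
    (an assumption of this sketch). Returns pooled log-OR, its standard
    error and the between-study variance.
    """
    if len(estimates) < 2:
        raise ValueError("need at least two studies")
    w = [1 / v for v in variances]  # inverse-variance weights
    fixed = sum(wi * b for wi, b in zip(w, estimates)) / sum(w)
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, estimates))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(estimates) - 1)) / c)
    w_star = [1 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * b for wi, b in zip(w_star, estimates)) / sum(w_star)
    se = sqrt(1 / sum(w_star))
    return pooled, se, tau2


# Hypothetical first-stage output: log-ORs and variances from three studies.
log_ors = [0.40, 0.55, 0.25]
variances = [0.04, 0.09, 0.02]
pooled, se, tau2 = pool_random_effects(log_ors, variances)
print(exp(pooled), exp(pooled - 1.96 * se), exp(pooled + 1.96 * se))
```

Because each study contributes only an estimate and a variance, the first-stage models can be adjusted for whichever covariates each study collected, which is the practical advantage of the two-stage approach noted above.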
Investigation of heterogeneity among studies
Homogeneity among the study estimates will be measured by the Q statistic and I-Square, the latter representing the percentage of total variation across studies that is attributable to heterogeneity rather than to chance. Meta-regression analysis will be performed to investigate heterogeneity among study estimates, by evaluating the role of methodological characteristics of the studies and the characteristics of study populations.
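Both heterogeneity measures can be computed directly from the study-specific estimates and their variances. The sketch below uses the usual definitions, Cochran's Q with inverse-variance weights and I-Square = max(0, (Q - df) / Q) × 100; the function name and example values are hypothetical.

```python
def heterogeneity(estimates, variances):
    """Return Cochran's Q, its degrees of freedom and I-Square (percent)."""
    w = [1 / v for v in variances]  # inverse-variance weights
    fixed = sum(wi * b for wi, b in zip(w, estimates)) / sum(w)
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


# Hypothetical example: two studies with equal variance.
print(heterogeneity([0.0, 1.0], [0.25, 0.25]))
# prints (2.0, 1, 50.0)
```

Under homogeneity, Q is compared with a chi-square distribution on df degrees of freedom.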
Joint association of MC1R and phenotypic characteristics with skin cancer risk
Stratified analysis for the association of MC1R variants with each skin cancer type will be performed for different phenotypic characteristics. The hypothesis of homogeneity of ORs among strata will be verified using the Breslow-Day test.
The search for the optimal combinations of MC1R variants and phenotypic characteristics most strongly associated with skin cancer will be undertaken with a (stochastic) simulated annealing algorithm [104, 105, 106]. This algorithm has a good chance of finding a model with the best, or close to the best, possible score but, in the presence of noise, typically overfits the data. In order to select the best model, a combination of cross-validation and randomization tests has been suggested [104, 105].
In an explanatory setting, at-risk gene-phenotype combinations will be identified among a very large number of possible combinations by recently proposed logic regression-based methods [106, 107]. The skin cancer risk of the identified subpopulations will be estimated within the pooled-analysis context using the two-stage analysis previously described.
Structural equation models will also be applied to clarify, where possible, the independent and dependent role of MC1R variants on skin cancer by phenotypic characteristics.
Finally, the role of environmental exposure will be investigated by entering new covariates in the models, by subgroup analyses and by studying gene-environment and phenotype-environment interactions using traditional and newly proposed methodologies.
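The idea of a stochastic search over combinations of binary predictors can be shown with a deliberately simplified sketch. It is not the logic regression procedure cited above: it searches only over AND-combinations of features, scores them by misclassification rate, and uses a linear cooling schedule, all of which are assumptions made for illustration, as are the function names and the toy data.

```python
import math
import random
from itertools import product


def misclassification(rule, rows):
    """Fraction of subjects whose case status differs from the prediction.

    rule: set of feature indices combined with logical AND.
    rows: list of (features, is_case) pairs with 0/1 features.
    """
    errors = 0
    for features, is_case in rows:
        predicted = all(features[i] for i in rule) if rule else False
        errors += predicted != is_case
    return errors / len(rows)


def anneal(rows, n_features, steps=2000, start_temp=1.0, seed=0):
    """Simulated annealing over AND-rules; returns best rule and its score."""
    rng = random.Random(seed)
    current = set()
    current_score = misclassification(current, rows)
    best, best_score = set(current), current_score
    for step in range(steps):
        temp = start_temp * (1 - step / steps) + 1e-9  # linear cooling
        candidate = set(current)
        candidate ^= {rng.randrange(n_features)}  # add or drop one feature
        score = misclassification(candidate, rows)
        # Accept improvements; accept worse moves with Boltzmann probability.
        if (score <= current_score
                or rng.random() < math.exp((current_score - score) / temp)):
            current, current_score = candidate, score
        if current_score < best_score:
            best, best_score = set(current), current_score
    return best, best_score


# Toy data: a subject is a case only when features 0 and 1 are both present.
rows = [((a, b, c), bool(a and b)) for a, b, c in product((0, 1), repeat=3)]
print(anneal(rows, n_features=3))
```

On noisy data the best-scoring rule found this way tends to be too complex, which is why model size is usually chosen by cross-validation or randomization tests rather than by the training score alone.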
Use of MC1R data
In all the proposed analyses, each of the nine most frequently investigated MC1R variants (V60L, D84E, V92M, R142H, R151C, I155T, R160W, R163Q, D294H), as well as known rare mutations affecting MC1R function, will be evaluated assuming different inheritance models and choosing the one that fits the data best. Haplotype frequencies will be estimated using the iterative Expectation-Maximization algorithm [110, 111], and their association with each skin cancer type and phenotypic characteristics will be evaluated. Moreover, for the studies that sequenced the entire gene, we will evaluate the impact on skin cancer and phenotypic characteristics of the total number of MC1R variant alleles and of the scores obtained from appropriate classification of MC1R variants.
Based on our experience with the study design of the M-SKIP project, we have described here the most important steps in planning, conducting and analyzing pooled individual data from genetic epidemiological studies. A previously published commentary highlighted the advantages and limitations of this kind of analysis, but did not describe the statistical methods that could be used to pool datasets. Some methods for pooling results of epidemiological studies were suggested [10, 100, 112, 113, 114], but specific problems related to genetic epidemiology – such as the evaluation of different genotyping methodology, the Hardy-Weinberg equilibrium testing, the hereditary model assumption, and the assessment of gene-phenotype and gene-environment interaction – were not discussed.
Within the M-SKIP project, we collected a large amount of data in which multiple hypotheses can be examined with greater statistical power than is possible in individual studies. The response rate of invited investigators was high (72%), probably due to the well-defined criteria of data collection and use, the clear publication policy, and the presence of an Advisory Committee tasked with monitoring adherence to project guidelines and scientific quality. Another strength of the pooled-analysis described here is the carefully planned approach to standardizing the demographic, epidemiological and phenotypic information obtained from individual studies, giving the opportunity to perform appropriate and detailed subgroup and interaction analyses. Because the inclusion of an individual study in a particular analysis does not depend on whether those investigators have published findings on that association, and because unpublished datasets are included, our pooled-analysis should not be affected by publication bias, as a meta-analysis of the published literature might be. Finally, we plan to analyze data by conventional and recently proposed statistical methods, and will compare and integrate the results obtained with these different approaches.
The main limitation of a pooled analysis, especially with respect to prospective consortia, is that it is planned retrospectively, and hence there is no a priori standardization of data collection. On the other hand, pooled-analysis may be feasible with fewer funds than those required for a prospective consortium, and it takes less time to obtain results because the original data have already been collected. The quality of genotyping methodology may be heterogeneous among different participant laboratories. We will take this possible problem into account both by testing Hardy-Weinberg equilibrium and by meta-regression analysis. Finally, while we will try to assess the existence of participation bias, we cannot completely rule out that the results could be affected by the exclusion of the studies from the investigators who declined or were unable to participate in this pooled analysis.
In conclusion, the data collected within the M-SKIP project are a valuable resource for investigating associations between MC1R variants and skin cancer, particularly for population subgroups, and may be an appropriate setting to better investigate the genetics of sporadic skin cancer. A pooled-analysis of epidemiological studies is feasible, has many advantages over meta-analysis in making it possible to adjust for confounders and assess interactions, and in addition preliminary results may be obtained at lower cost and in less time than with prospective consortia. We are convinced that its success depends upon the initial definition and approval of clear guidelines necessary for conducting such studies. The diffusion of pooled-analysis in the genetic epidemiology field will assist epidemiologists and other health professionals in synthesizing the vast amount of available data on specific gene-disease associations, and a common database would be the source of possible future investigations.
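The Expectation-Maximization approach to haplotype frequency estimation can be illustrated for the simplest case of two biallelic variants, where only subjects heterozygous at both sites have ambiguous phase. The sketch below is a textbook two-locus version, not the multi-variant procedure of the cited references; the function name, the fixed number of iterations and the genotype counts are hypothetical.

```python
def em_haplotype_frequencies(counts, n_iter=100):
    """Estimate two-locus haplotype frequencies by EM.

    counts[i][j]: subjects with i variant alleles at locus 1 and
    j variant alleles at locus 2 (i, j in 0..2).
    Returns frequencies of haplotypes (0,0), (0,1), (1,0), (1,1),
    where 1 denotes the variant allele.
    """
    # Haplotype pairs that are unambiguous given the genotype.
    known = {
        (0, 0): [(0, 0), (0, 0)], (0, 1): [(0, 0), (0, 1)],
        (0, 2): [(0, 1), (0, 1)], (1, 0): [(0, 0), (1, 0)],
        (1, 2): [(0, 1), (1, 1)], (2, 0): [(1, 0), (1, 0)],
        (2, 1): [(1, 0), (1, 1)], (2, 2): [(1, 1), (1, 1)],
    }
    base = {h: 0.0 for h in [(0, 0), (0, 1), (1, 0), (1, 1)]}
    for (i, j), haplotypes in known.items():
        for h in haplotypes:
            base[h] += counts[i][j]
    double_het = counts[1][1]
    total = 2 * sum(sum(row) for row in counts)  # number of chromosomes
    if total == 0:
        raise ValueError("no genotypes supplied")
    freq = {h: 0.25 for h in base}  # start from equal frequencies
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote carries
        # haplotypes (0,0)/(1,1) rather than (0,1)/(1,0).
        cis = freq[(0, 0)] * freq[(1, 1)]
        trans = freq[(0, 1)] * freq[(1, 0)]
        p_cis = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: update frequencies from expected haplotype counts.
        expected = dict(base)
        expected[(0, 0)] += double_het * p_cis
        expected[(1, 1)] += double_het * p_cis
        expected[(0, 1)] += double_het * (1 - p_cis)
        expected[(1, 0)] += double_het * (1 - p_cis)
        freq = {h: c / total for h, c in expected.items()}
    return freq


# Hypothetical genotype counts for two variants in one control series.
counts = [[40, 10, 1], [10, 20, 5], [1, 5, 8]]
print(em_haplotype_frequencies(counts))
```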
The M-SKIP study was supported by the Italian Association for Cancer Research [MFAG 11831]. The GEM study was supported by National Cancer Institute [R01 CA112243, R01 CA112243-05 S1].
The M-SKIP study group consists of the following members: Principal Investigator: Sara Raimondi (European Institute of Oncology, Milan, Italy); Advisory Committee members: Philippe Autier (International Prevention Research Institute, Lyon, France), Maria Concetta Fargnoli (University of L’Aquila, Italy), José C. García-Borrón (University of Murcia, Spain), Jiali Han (Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA), Peter A. Kanetsky (Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA), Maria Teresa Landi (National Cancer Institute, NIH, Bethesda, MD, USA), Julian Little (University of Ottawa, Canada), Julia Newton-Bishop (University of Leeds, UK), Francesco Sera (UCL Institute of Child Health, London, UK); Consultants: Saverio Caini (ISPO, Florence, Italy), Sara Gandini and Patrick Maisonneuve (European Institute of Oncology, Milan, Italy); Participant Investigators: Albert Hofman, Manfred Kayser, Fan Liu, Tamar Nijsten and Andre G. Uitterlinden (Erasmus MC University Medical Center, Rotterdam, The Netherlands), Rajiv Kumar and Dominique Scherer (German Cancer Research Center, Heidelberg, Germany), Eduardo Nagore (Instituto Valenciano de Oncologia, Valencia, Spain), Johan Hansson and Veronica Hoiom (Karolinska Institutet, Stockholm, Sweden), Paola Ghiorzo and Lorenza Pastorino (University of Genoa, Italy), Nelleke A. 
Gruis (Leiden University Medical Center, The Netherlands), Terry Dwyer (Murdoch Childrens Research Institute, Victoria, Australia), Leight Blizzard and Jennifer Cochrane (Menzies Research Institute, Hobart, Australia), Ricardo Fernandez-de-Misa (Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain), Wojciech Branicki (Institute of Forensic Research, Krakow, Poland), Tadeusz Debniak (Pomeranian Medical University, Polabska, Poland), Niels Morling and Peter Johansen (University of Copenhagen, Denmark), Ruth Pfeiffer (National Cancer Institute, NIH, Bethesda, MD, USA), Giuseppe Palmieri (Istituto di Chimica Biomolecolare, CNR, Sassari, Italy), Gloria Ribas (Fundacion Investigation Hospital Clinico Universitario de Valencia- INCLIVA, Spain), Alexander Stratigos and Katerina Kypreou (University of Athens, Andreas Sygros Hospital, Athens, Greece), Anne Bowcock and Lynn Cornelius (Washington University, St. Louis, MO, USA), M. Laurin Council (St. Louis University, St. Louis, MO, USA), Tomonori Motokawa (POLA Chemical Industries, Yokohama, Japan), Sumiko Anno (Shibaura Institute of Technology, Tokyo, Japan), Per Helsing and Per Arne Andresen (Oslo University Hospital, Norway), Terence H. Wong (University of Edinburgh, UK), and the GEM Study Group.
Participants in the GEM Study Group are as follows: Coordinating Center, Memorial Sloan-Kettering Cancer Center, New York, NY, USA: Marianne Berwick (PI, currently at the University of New Mexico), Colin Begg (Co-PI), Irene Orlow (Co-Investigator), Urvi Mujumdar (Project Coordinator), Amanda Hummer (Biostatistician), Klaus Busam (Dermatopathologist), Pampa Roy (Laboratory Technician), Rebecca Canchola (Laboratory Technician), Brian Clas (Laboratory Technician), Javiar Cotignola (Laboratory Technician), Yvette Monroe (Interviewer). Study Centers: The University of Sydney and The Cancer Council New South Wales, Sydney (Australia): Bruce Armstrong (PI), Anne Kricker (co-PI), Melisa Litchfield (Study Coordinator). Menzies Centre for Population Health Research, University of Tasmania, Hobart (Australia): Terence Dwyer (PI), Paul Tucker (Dermatopathologist), Nicola Stephens (Study Coordinator). British Columbia Cancer Agency, Vancouver (Canada): Richard Gallagher (PI), Teresa Switzer (Coordinator). Cancer Care Ontario, Toronto (Canada): Loraine Marrett (PI), Beth Theis (Co-Investigator), Lynn From (Dermatopathologist), Noori Chowdhury (Coordinator), Louise Vanasse (Coordinator), Mark Purdue (Research Officer). David Northrup (Manager for CATI). Centro per la Prevenzione Oncologia Torino, Piemonte (Italy): Roberto Zanetti (PI), Stefano Rosso (Data Manager), Carlotta Sacerdote (Coordinator). University of California, Irvine (USA): Hoda Anton-Culver (PI), Nancy Leighton (Coordinator), Maureen Gildea (Data Manager). University of Michigan, Ann Arbor (USA): Stephen Gruber (PI), Joe Bonner (Data Manager), Joanne Jeter (Coordinator). New Jersey Department of Health and Senior Services, Trenton (USA): Judith Klotz (PI), Homer Wilcox (Co-PI), Helen Weiss (Coordinator). University of North Carolina, Chapel Hill (USA): Robert Millikan (PI), Nancy Thomas (Co-Investigator), Dianne Mattingly (Coordinator), Jon Player (Laboratory Technician), Chiu-Kit Tse (Data Analyst). 
University of Pennsylvania, Philadelphia, PA (USA): Timothy Rebbeck (PI), Peter Kanetsky (Co-Investigator), Amy Walker (Laboratory Technician), Saarene Panossian (Laboratory Technician). Consultants: Harvey Mohrenweiser, University of California, Irvine, Irvine, CA (USA); Richard Setlow, Brookhaven National Laboratory, Upton, NY (USA).
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.