Introduction

Formaldehyde (CH2O) is a simple one-carbon molecule, found in most human and other living cells as a normal product of the metabolism of serine, glycine, methionine, and choline, and is generated in the demethylation of N-, O-, and S-methyl compounds. It is also an essential intermediate in the biosynthesis of purines, thymidine, and various amino acids [1]. Consequently, formaldehyde is present in virtually all cells in the body at varying concentrations.

Formaldehyde is also produced commercially and is valuable as a biocide, preservative, and basic chemical in the manufacture of common materials such as plastics, building materials, glues and fabrics, and many household and consumer products, including medicines, health, and beauty aids. Formaldehyde is also a product of organic matter combustion.

Common exposure sources include some laboratories, indoor air (e.g., carpets), vehicle emissions, cigarette smoke, and workplaces manufacturing or using resins, various wood products (e.g., particle board), adhesives, textiles, and numerous other consumer products [2]. High concentrations of formaldehyde were found inside some of the temporary housing units built for victims of Hurricane Katrina in the US in 2008, which raised public awareness of the chemical and its potential acute health effects [3, 4].

Inhalation is the predominant route of exposure to exogenous formaldehyde. Following inhalation, formaldehyde rapidly reaches cells in the upper respiratory tract and reacts virtually instantaneously with primary and secondary amines, thiols, hydroxyls, and amides [5]. Formaldehyde is swiftly metabolized by erythrocytes [6–9]. Formaldehyde forms adducts with DNA and proteins and also produces DNA cross-links [10].

The most common acute health effects of exposure to formaldehyde include eye and upper respiratory tract irritation. Reversible declines in lung function have also been observed, although the evidence that it causes asthma and other chronic respiratory diseases is inconsistent [11]. There is inadequate evidence to assess other potential adverse effects of formaldehyde in humans, such as immunotoxicity, neurotoxicity, and reproductive and developmental toxicity [12, 13].

Carcinogenicity of formaldehyde

Concerns about the carcinogenicity of formaldehyde were prompted in the early 1980s by the induction of nasal tumors in rats exposed at high concentrations [14–17]. As a consequence, the focus of early epidemiologic studies was on nasal cancer, based on the understanding that formaldehyde is rapidly metabolized at the site of contact (i.e., nasal passages and cavity) [18–20]. Subsequently, associations between formaldehyde exposure and other malignancies in humans were reported, including nasopharyngeal carcinoma (NPC), lung cancer, lymphohematopoietic malignancies (LHM), mainly leukemias, and other cancers such as brain, colon, and prostate [21, 22]. Epidemiologic studies on formaldehyde exposure and LHM risk are reviewed in detail below.

In 2006, the International Agency for Research on Cancer (IARC) conducted a comprehensive review of the literature and classified formaldehyde as a known (i.e., Group 1) human carcinogen, based on sufficient evidence for NPC. The evidence for leukemia was considered suggestive [23]. In 2009, IARC conducted an abbreviated updated review of all Group 1 chemicals, including formaldehyde [24], in which the epidemiologic evidence for leukemia—specifically myeloid leukemias—was classified as sufficient. The US National Toxicology Program similarly classified formaldehyde as a known human carcinogen [25]. The US Environmental Protection Agency (EPA), in its draft Integrated Risk Information System (IRIS) report on formaldehyde, concluded that existing epidemiologic evidence supported a causal association with LHM as a group and specifically for myeloid leukemia [26]. A special committee of the US National Research Council of the National Academies critically reviewed the EPA draft IRIS report and found the causal conclusions for LHM to be inadequately supported [27].

We undertook a critical, systematic, and comprehensive review and synthesis of the epidemiologic literature on formaldehyde and risks of the LHM. Our review is more thorough than that produced by the National Research Council [27], which focused on literature summarized in the EPA draft IRIS document. Our objectives were to characterize the overall strength and consistency of the evidence to guide causal interpretations and to recommend research improvements that would extend knowledge on this important public health and scientific issue.

Methods

Our methods were consistent with those used by IARC [28] and others [29–31]. Briefly, we identified published, peer-reviewed epidemiologic studies specifically addressing formaldehyde exposure and risk of the LHM. Searches were conducted in PubMed, the US National Library of Medicine’s primary research tool that indexes most of the world’s health and medical peer-reviewed journals since at least 1966. All years indexed were searched to identify these studies using the following key words in various combinations: cancer, leukemia, non-Hodgkin’s lymphoma, lymphoma, lymphocytic, Hodgkin’s lymphoma, hematopoietic, multiple myeloma, hematological neoplasm, formaldehyde, embalmer, garment, laboratory workers, epidemiology, case–control, cohort, case-referent, occupational, chemical, exposure, risk, review, meta-analysis, and commentary. We identified a total of 1,441 potentially relevant articles from the literature searches. Of these articles, 126 were retained as relevant to formaldehyde exposures and LHM. Articles were excluded if they (1) were not epidemiological studies, (2) did not focus on formaldehyde, (3) focused on outcomes other than cancer, or (4) did not present results specifically for LHM. Additionally, references cited in other publications, including reviews, were checked to ensure the thoroughness of the literature review. We did not attempt to identify unpublished reports. The final review included a total of 37 articles: 22 cohort studies and 17 case–control studies.

We comprehensively reviewed the identified literature, including studies of occupational groups and population-based case–control studies of specific LHM that presented results for formaldehyde-related exposures. Most emphasis was placed on findings from occupational cohort studies, which, because of the greater potential for exposure to substantial concentrations of formaldehyde, provide the best evidence for possible associations. We limited the review to the most recent updates of occupational studies, although we include findings from earlier reports where results have changed materially with successive updates.

Defining the outcome of interest is an important aspect of the design of epidemiologic studies, and the LHM are particularly challenging in this regard. Much of the information about LHM and formaldehyde exposure derives from mortality data in occupational cohort studies that spanned several LHM classification schemes. The principles of the nosological classification of this group of neoplasms have changed during the past 40 years, following the increasing understanding of the pathological and clinical characteristics of the different diseases. The most substantial changes in the International Classification of Diseases (ICD) have occurred for the non-Hodgkin lymphomas (NHL). Until the 9th Revision of the ICD, NHL was classified under two rubrics: “lymphosarcoma and reticulosarcoma” and “other neoplasms of the lymphoid tissue” (Hodgkin lymphoma had a separate code) [32]. In ICD-10, which follows a new WHO classification, chronic lymphocytic leukemia (CLL), the most common type of leukemia among the elderly, is classified as a form of NHL, and other changes were made to the classification of NHL. The InterLymph Consortium of lymphoma epidemiology has made an effort to adapt the last two versions of the WHO classification to epidemiologic studies, following a hierarchical approach [33, 34]. Unfortunately, the majority of epidemiologic studies, in particular occupational cohort studies that based outcomes on death certificates, do not follow the WHO classification (or its InterLymph adaptation).

We present and discuss findings for specific LHM to the extent allowed by published data. We do not discuss results for all LHM combined because diseases in this group are clinically and pathologically heterogeneous, and thus probably etiologically distinctive.

We did not perform meta-analyses because our evaluation of the individual studies determined that the literature is too heterogeneous (that is, inconsistent) with respect to disease classification and exposure assessment for quantitative risk estimates to be appropriately combined. Moreover, the number of independent studies with comparable exposure circumstances (i.e., the same industry or occupation) and similar exposure assessments was too small to justify meta-analyses of these subsets of results. We were especially concerned about combining studies of different groups of workers with poorly characterized circumstances of exposure to formaldehyde. Several previous meta-analyses [35–38] have been performed, yielding variable conclusions, which may result from different methods and from the underlying heterogeneity in the specificity and validity of exposure and health outcome data among published studies. In our opinion, the apparent gain in precision from a meta-analysis would be offset by problems in the interpretation of the summary results. We do, however, provide forest plots of overall study findings as Figs. 1, 2, 3, 4, and 5.

Fig. 1 Forest plot of formaldehyde exposure and leukemias

Fig. 2 Forest plot of formaldehyde exposure and myeloid leukemia

Fig. 3 Forest plot of formaldehyde exposure and chronic lymphocytic leukemia

Fig. 4 Forest plot of formaldehyde exposure and lymphomas

Fig. 5 Forest plot of formaldehyde exposure and non-Hodgkin lymphoma
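As a rough illustration of how a forest plot is laid out, the following Python sketch draws a text version using the three all-leukemia SMRs quoted in the Results for the NCI producers, UK producers, and US garment workers cohorts. It is not a reproduction of Figs. 1–5. The axis limits (0.5 to 2.0), the plot width, and the function names are assumptions chosen for illustration only.

```python
import math

# (label, estimate, lower, upper): all-leukemia SMRs quoted in the Results text
STUDIES = [
    ("NCI producers", 1.02, 0.85, 1.22),
    ("UK producers", 0.91, 0.62, 1.29),
    ("US garment workers", 1.09, 0.70, 1.62),
]

def column(value, lo=0.5, hi=2.0, width=41):
    """Map a ratio to a character column on a log-scaled axis (illustrative limits)."""
    frac = (math.log(value) - math.log(lo)) / (math.log(hi) - math.log(lo))
    return max(0, min(width - 1, round(frac * (width - 1))))

def forest_row(label, est, lower, upper, width=41):
    """One study per row: '-' spans the CI, '|' marks the null value, 'o' the estimate."""
    cells = [" "] * width
    for i in range(column(lower, width=width), column(upper, width=width) + 1):
        cells[i] = "-"
    cells[column(1.0, width=width)] = "|"
    cells[column(est, width=width)] = "o"
    return f"{label:<20}{''.join(cells)}  {est:.2f} ({lower:.2f}-{upper:.2f})"

def forest_plot(studies=STUDIES):
    return [forest_row(*study) for study in studies]

if __name__ == "__main__":
    print("\n".join(forest_plot()))
```

Ratios are placed on a log scale so that, for example, 0.5 and 2.0 sit equally far from the null value of 1.0.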

Epidemiologic literature

Associations between formaldehyde exposure and the LHM have been investigated among anatomists, pathologists, embalmers, and industrial workers involved in the manufacture and use of formaldehyde and formaldehyde-containing products, such as resins, adhesives, wood products, fabrics, and garments. Formaldehyde has also been examined as a risk factor in numerous studies conducted in the general population, including population-based case–control studies and analyses correlating occupations with LHM incidence and mortality. Accordingly, we present summaries of literature in tabular form separately for the following categories: cohort studies of industrial workers, cohort studies of professional workers, and population-based cohort and case–control studies.

Among all available literature, we regard two large occupational cohort studies as most informative because of the cohort design, greatest likelihood of exposure, quantification of exposure, and minimized bias and confounding. These are mortality studies of (1) a cohort of employees of ten US factories that produced or used formaldehyde, conducted by the US National Cancer Institute (henceforth termed the “NCI producers study”) [39] and (2) a cohort of employees of six UK factories engaged in the production of resins, adhesives, and formalin (henceforth termed the “UK producers study”) [48].

A second group of occupational studies that we regard as less informative includes a cohort of US garment workers [51–53] and a case–control analysis of deaths among US embalmers and funeral directors [54] that was based on a series of earlier proportionate mortality studies [21, 55, 56]. The study base for the nested case–control study of LHM among US embalmers and funeral directors was poorly defined [54], and the formaldehyde exposure assessment in the garment workers study [53] was less specific and detailed than in the two “producers” cohort studies.

The remaining occupational studies reviewed were those conducted among cohorts of undertakers [57], pathologists [58], anatomists [59, 60], wood industry workers [61–63], and general chemical industry workers [20, 64–67]. In these studies, formaldehyde exposure was less certain than in the aforementioned occupational cohort studies and, in many cases, was inferred from job title or work area.

The other major categories of epidemiologic studies reviewed were community-based cohort and case–control studies and general population surveys, which also provide limited information on formaldehyde exposure and LHM risks. Exposure assessment in these studies was generally based on crude exposure metrics, such as “low” versus “high” exposure probability combinations of heterogeneous job titles. Details of study design and exposure assessment for the studies reviewed are summarized in Table 1.

Table 1 Studies of formaldehyde and lymphohematopoietic malignancies and exposure metrics

Results

Summary of leukemia findings

The findings for the occupational cohort studies with leukemia outcomes are summarized in Table 2. The two most influential studies are considered first. Based on comparisons with national rates, no excesses for all leukemia (standardized mortality ratio (SMR) 1.02, 95 % confidence interval (CI) 0.85–1.22) or myeloid leukemia (SMR 0.90, 95 % CI 0.67–1.21) were found in the most recent follow-up of the NCI producers’ study. Among the formaldehyde-exposed portion of the cohort, there was a weak trend of relative risk (RR) with peak exposure, for both all leukemias and myeloid leukemia, largely influenced by elevated RRs of 1.78 (95 % CI 0.87–3.64) for myeloid leukemias and 1.42 (0.92–2.18) for “other” (non-myeloid) leukemias in the highest peak exposure category. However, most of the trends and individual RR estimates were not remarkable or precise. The association for peak exposure and myeloid leukemia was considerably attenuated from the previous follow-up of the cohort, RR 2.79 (95 % CI 1.08–7.21, 14 cases, p-trend 0.02) at the highest peak category. Beane Freeman [39] corrected the results published in Hauptmann [47] that inadvertently omitted 1,006 deaths, including 22 LHM deaths. No clear associations with average or cumulative exposure were found in the corrected data for any of the leukemias. Null findings were reported for lymphatic leukemia and “other and unspecified leukemia” [39].
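The imprecision of these estimates can be made concrete by back-calculating the standard error of log(RR) from a published interval. The sketch below assumes the intervals are Wald-type and symmetric on the log scale, which is an assumption rather than something stated in the source studies. The function names are illustrative.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(RR) implied by a 95 % CI, assuming log-scale symmetry."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def implied_estimate(lower, upper):
    """Geometric midpoint of the interval; matches the RR if the CI is log-symmetric."""
    return math.sqrt(lower * upper)

def z_score(rr, lower, upper):
    """Approximate Wald z statistic for the null hypothesis RR = 1."""
    return math.log(rr) / se_from_ci(lower, upper)

if __name__ == "__main__":
    # Myeloid leukemia, highest peak-exposure category, as quoted in the text
    for label, rr, lo, hi in [
        ("latest follow-up", 1.78, 0.87, 3.64),
        ("previous follow-up", 2.79, 1.08, 7.21),
    ]:
        print(label,
              "SE:", round(se_from_ci(lo, hi), 3),
              "midpoint:", round(implied_estimate(lo, hi), 2),
              "z:", round(z_score(rr, lo, hi), 2))
```

Under this assumption, the geometric midpoint of each interval should agree closely with the reported RR, and an interval that includes 1.0 corresponds to a z statistic below 1.96.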

Table 2 Studies of formaldehyde exposure and leukemia, myeloid leukemia, and other/unspecified leukemias

Leukemia mortality was not elevated overall (SMR 0.91, 95 % CI 0.62–1.29) or in the most highly exposed (i.e., jobs with >2 ppm formaldehyde) segment (SMR 0.71, 95 % CI 0.31–1.39) of the UK formaldehyde producers study [48]. No separate results for myeloid leukemias were presented.

Among other occupational studies, the nested case–control analysis of US embalmers reported odds ratios for myeloid leukemias and for acute myeloid leukemias in the range of 2.0–3.2 for number of embalmings, and for cumulative and peak formaldehyde exposure categories, relative to the referent group that performed <500 career embalmings. However, the underlying sample of death certificates evaluated in this analysis demonstrated no excess of myeloid leukemias: the 29 myeloid leukemias reported in this study generated a proportionate mortality ratio (PMR) of 1.08 (95 % CI 0.70–1.56), and the subset of 20 acute myeloid leukemias generated a PMR of 1.16 (95 % CI 0.71–1.79) [68]. Moreover, there was little evidence of increasing exposure–response trends in the non-reference exposure categories [54]. In the study of US garment workers, the SMR for leukemia deaths was 1.09 (95 % CI 0.7–1.62), based on 24 total leukemia deaths. For the 15 observed myeloid leukemias, the SMR was 1.44 (95 % CI 0.8–2.37), and for the nine acute myeloid leukemias, the SMR was 1.34 (95 % CI 0.66–2.54). In the US garment workers study, SMRs were increased among workers with ≥10 years exposure (SMR for myeloid leukemia 2.19, 95 % CI 0.95–4.32) [69] and ≥20 years since first exposure (SMR 1.91, 95 % CI 1.02–3.27) [53].
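For readers less familiar with the SMR, the following sketch shows one common way such an estimate and its interval can be computed: observed deaths divided by expected deaths, with exact Poisson limits on the observed count. The expected count used here is back-calculated from the reported garment workers SMR (24 deaths, SMR 1.09) and is an assumption, as is the choice of exact limits; the original study may have used a different method.

```python
import math

def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu)."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total

def _solve(f, lo, hi, iters=200):
    """Bisection for a root of f on [lo, hi]; f(lo) and f(hi) must differ in sign."""
    flo = f(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_poisson_limits(observed, alpha=0.05):
    """Exact two-sided confidence limits for a Poisson count."""
    top = observed + 10 * math.sqrt(observed + 1) + 10  # generous upper bracket
    if observed == 0:
        lower = 0.0
    else:
        lower = _solve(lambda mu: 1 - poisson_cdf(observed - 1, mu) - alpha / 2,
                       0.0, float(observed))
    upper = _solve(lambda mu: poisson_cdf(observed, mu) - alpha / 2,
                   float(observed), top)
    return lower, upper

def smr_with_ci(observed, expected, alpha=0.05):
    """SMR = observed / expected, with limits scaled from the Poisson count."""
    lower, upper = exact_poisson_limits(observed, alpha)
    return observed / expected, lower / expected, upper / expected

if __name__ == "__main__":
    expected = 24 / 1.09  # assumption: back-calculated, not reported in the text
    print([round(x, 2) for x in smr_with_ci(24, expected)])
```

With these inputs the limits should land close to the reported interval of 0.7–1.62, which suggests the published interval is compatible with a Poisson-based calculation.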

No excesses were observed for all leukemia or for leukemia subtypes among persons classified as exposed to formaldehyde in the population-based case–control studies [70, 71]. In the remaining occupational studies, risk estimates for leukemia compared with the national or regional populations were consistently close to the null value and unstable due to small numbers.

The RR estimate was 5.79 (95 % CI 1.44–23.25) for leukemia among the combined exposure group of “formaldehyde-exposed and wood-related occupations” in the American Cancer Society Cancer Prevention Study II; however, this result was based on only two deaths. The RR for those with formaldehyde exposure only was 0.96 (95 % CI 0.54–1.71), based on 12 leukemia deaths [63].

Summary of lymphoma findings

The lymphoma results, including those for chronic lymphocytic leukemia (CLL) when reported separately, are summarized in Table 3. With the exception of Hodgkin lymphoma (HL), there were no overall excesses of the lymphomas among exposed workers in the NCI producers cohort; HL risk was associated with peak exposure, with relative risk reaching 3.96 (95 % CI 1.31–12.02) only at the highest exposure category (≥4.0 ppm), based on 11 deaths. A similar, but weaker, trend was observed for HL and average exposure (RR 2.48, 95 % CI 0.84–7.32) at the highest category [39]. The only overall excess for any of the lymphomas reported in the UK producers study was a weak association for multiple myeloma (MM) in the subgroup classified as most highly exposed workers (SMR 1.18, 95 % CI 0.48–2.44) [48]. Quantitative exposure–response findings were not presented.

Table 3 Studies of formaldehyde exposure and chronic lymphocytic leukemia, Hodgkin’s lymphoma, non-Hodgkin’s lymphoma, multiple myeloma, and all lymphomas

Results of the nested case–control study of embalmers presented for all neoplasms of lymphoid origin, rather than for non-Hodgkin lymphoma (NHL) or MM specifically, did not suggest an association with any indices of formaldehyde exposure [54]. SMRs for lymphoma were less than 1.0 in the US garment workers study [53]. None of the other occupational cohort studies reported a significantly increased risk of NHL, HL, or MM (Table 3). Risk estimates for NHL, HL, and MM in community-based studies also suggested no association, with RR estimates ranging between 0.5 and 1.3, although positive results were reported in one NHL study from Connecticut [72]. Several community-based studies provided results for NHL subtypes, but there were no consistent associations [59–62].

Discussion

The main considerations pertinent to assessing epidemiological evidence for a causal relation between formaldehyde exposure and the leukemias or other specific LHM are consistency of findings across studies, evidence for exposure–response associations, accuracy of exposure and health outcome assessment, and minimal confounding and bias. The extent to which exposure assessment in a given study is valid, accurate, and, ideally, permits quantitative dose–response estimation is a critical aspect of research quality. Secondarily, epidemiologic findings suggestive of an association should be interpreted in relation to available evidence of mechanisms of pathogenesis.

The epidemiologic literature provides little or no evidence indicating excess risks overall or exposure–response associations between formaldehyde and any of the LHM, including leukemias, myeloid leukemias, and acute myeloid leukemias. In the majority of occupational cohort studies, which we regard as most informative, specific LHM risk estimates were consistent with the null value, with few exceptions, where the excesses were generally small (i.e., RR < 1.5) and statistically imprecise.

The NCI producers cohort [39] and the nested case–control analysis of the embalmers and funeral directors group [54] found elevated risk estimates based on some exposure metrics compared with an internal reference group. However, the increased relative risk for myeloid leukemia noted in an earlier follow-up of the NCI producers cohort [47] had diminished in the most recent update [39].

The strongest associations for myeloid leukemia observed in this cohort were with peak exposures, whereas cumulative exposure and average exposure intensity were unrelated to risk. As described in the original publication on the exposure assessment of the NCI producers study [73], there was no uniform definition of peak exposure. Instead, peak was defined on a job-specific basis as an excursion (usually of short duration, e.g., <15 min) relative to the estimated average exposure for the job. Moreover, epidemiologic associations of a specific disease with peak exposure can be difficult to interpret in the absence of prior mechanistic support, such as the requirement for acute above-threshold exposures. In general, established human carcinogens show strong and consistent associations between unbiased measures of cumulative exposure and cancer risk, and cumulative exposure is the default dose metric most often used to assess cancer risk for etiologic exposures. A re-analysis of the data from the previous follow-up [47] corroborated the absence of associations with cumulative exposure and also indicated no consistent associations between myeloid leukemia and either duration of time worked at the highest peak or time since highest peak exposure [45]. Findings from similar re-analyses have not been reported for the most recent follow-up. In the other relatively strong occupational cohort study [48], there was no association between formaldehyde exposure and leukemia.
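The distinction between these exposure metrics can be illustrated with a minimal sketch. The job histories, field names, and values below are entirely hypothetical and are not taken from the NCI producers study; they only show how cumulative, average, and peak metrics can rank the same workers differently.

```python
from dataclasses import dataclass

@dataclass
class JobSpell:
    years: float     # duration of the job (hypothetical)
    avg_ppm: float   # estimated average intensity for the job, in ppm
    peak_ppm: float  # estimated short-term excursion level for the job, in ppm

def exposure_metrics(history):
    """Summarize a work history into the metrics discussed in the text."""
    duration = sum(s.years for s in history)
    cumulative = sum(s.years * s.avg_ppm for s in history)  # ppm-years
    average = cumulative / duration if duration else 0.0
    highest_peak = max((s.peak_ppm for s in history), default=0.0)
    return {
        "duration_years": duration,
        "cumulative_ppm_years": cumulative,
        "average_ppm": average,
        "highest_peak_ppm": highest_peak,
    }

# Two hypothetical workers with equal cumulative exposure but different peaks
worker_a = [JobSpell(years=20, avg_ppm=0.5, peak_ppm=1.0)]
worker_b = [JobSpell(years=4, avg_ppm=2.5, peak_ppm=6.0)]

if __name__ == "__main__":
    print(exposure_metrics(worker_a))
    print(exposure_metrics(worker_b))
```

Both hypothetical workers accumulate 10 ppm-years, yet only the second falls in a high peak category, which is why an association seen with peak exposure alone need not appear with cumulative exposure.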

Among the other occupational studies, the US embalmers study generated elevated odds ratios for some formaldehyde exposure metrics [54]. However, as noted by Cole et al. [68], this study has notable limitations, including a lack of overall excess leukemia risks (based on PMR analysis), exposure assessment uncertainties, and a poorly defined study base originating from a convenience sample of death certificates obtained from previous proportionate mortality studies. In the study of US garment workers [53], the only support for an association with formaldehyde was the observation of moderately elevated relative risks for myeloid leukemia associated with long-term exposure and longest follow-up, which are very crude exposure metrics correlated with older age. The results of the remaining lower-quality studies are not supportive of an association between formaldehyde exposure and leukemia risk. Another recent review of the literature reached similar conclusions for associations with the leukemias [74].

The pattern of epidemiological results for the lymphomas is inconsistent. In the NCI producers cohort, there were some notably elevated relative risks (in the range of 2.5–4.0) observed for exposure categories of highest peak for HL and MM [39, 47], yet null or at most very small excesses for these diseases were reported in the other studies of occupational formaldehyde exposure.

Consistency with toxicological and mechanistic evidence

Studies of workers in China have evaluated a potential association between exposure to formaldehyde and a change in one or more blood parameters indicative of hematotoxicity [75–77]. Evidence suggestive of pancytopenia and leukemia-specific chromosome changes was reported from a study of Chinese formaldehyde melamine resin–exposed workers [78]. However, the blood cell parameters among exposed workers were largely within the normal range for Chinese populations [79–82], and the chromosome findings were based on the progeny of circulating stem cells from a small number of workers (n = 10–12) after 14 days of culture. Overall, the available data do not provide evidence of a clinically or biologically relevant impact on blood cell parameters in humans following exposure to formaldehyde.

Although mechanisms for the development of leukemia or lymphoma following exposure to formaldehyde have been hypothesized [75], they remain speculative. Notably, proposed mechanisms rely heavily on the assumption that formaldehyde can have direct effects on cells or tissues beyond the portal of entry. One fundamental mechanistic question critical to these hypotheses is whether exogenously derived formaldehyde can enter the circulating bloodstream and subsequently damage circulating precursor cells or the bone marrow. Recent experimental research, using extremely sensitive assays with the power to detect as little as one exogenous DNA adduct in 10 billion deoxyguanosines, demonstrated identical endogenously formed DNA formaldehyde adducts in all rat and nonhuman primate non-portal-of-entry tissues, including bone marrow. No exogenous adducts were detected in any distant tissue [83–85]. These considerations call into question the plausibility of causal links between formaldehyde and the LHM.

Conclusions and recommendations

Existing epidemiologic evidence does not provide convincing support that formaldehyde causes any of the LHMs, including myeloid leukemia. Findings among the highest quality occupational cohort studies are largely null, the positive findings are inconsistent in terms of strength and specificity of association, and there are only isolated instances of exposure–response relations. Epidemiologic evidence from other formaldehyde-exposed occupational cohorts is similarly inconsistent, is often based on small numbers of events, and suffers from a greater likelihood of exposure misclassification and other potential limitations than the two large industrial cohort studies that we regard as highest quality. Available community-based studies, which generally have superior diagnostic classification but poorer quality exposure assessment than in the occupational cohort mortality studies, provide no support for etiologic associations of formaldehyde with any of the LHM.

Although we conclude that a causal connection between formaldehyde exposure and LHM is not supported by existing epidemiologic findings and that the evidence is further weakened by the absence of established carcinogenic mechanisms for the LHM, we nevertheless encourage further epidemiologic research on this topic. We make this recommendation with the caveat that, in order to be informative, further research should offer substantive improvements over the existing body of studies, especially in terms of application of modern diagnostic criteria for specific LHM and individual level quantitative exposure assessment. Well-defined occupational cohort studies should offer the best opportunities to evaluate associations between formaldehyde exposure and LHM risks. Because formaldehyde exposure is ubiquitous, accurately characterizing exposures from the many possible sources, including combustion, household furnishings, automobiles, and consumer products, is essentially impossible. Workplace exposures, on the other hand, are typically substantially higher than exposures from other environmental sources. Continued follow-up of the established high-quality occupational cohorts would be worthwhile, although the scientific yield may be limited because exposure and health outcome misclassification limitations can probably not be remedied. Re-analyses, including sensitivity analysis, of existing datasets may add insight into reported findings, as evidenced by previous re-analyses of the NCI producers cohort data [45]. Specifically, additional statistical analyses of risks of specific LHM in relation to the various exposure metrics in the original NCI producers study [73] are warranted.

A more attractive, but also more complicated and expensive, option would be to enumerate and follow new occupational cohorts exposed to formaldehyde. Professional groups, such as anatomists, pathologists, funeral directors, and embalmers, may be the most appropriate study populations because their exposures are frequent, generally remain at relatively high intensity, and may not be confounded by other potential exposures to leukemogens, such as benzene. Another advantage to studying such professions is that they are composed of persons with comparable socioeconomic status, a characteristic often associated with baseline rates of LHM in the population.

In contrast, new cohort studies of industrial workers would likely encounter problems related to vastly reduced exposures in large workplaces during the past several decades in many high-income countries, and the resulting reduced capacity to test exposure-related associations rigorously. New occupational cohort studies in developing economies may offer opportunities for further research. Any new occupationally based studies should strive to obtain incidence data with modern LHM classification, and to incorporate valid, thorough exposure assessments for formaldehyde and potential confounders. Cross-sectional and, preferably, prospective investigations of biomarkers of bone marrow toxicity relevant to carcinogenesis that have adequate statistical power would also be worthwhile and might be incorporated into cohort studies where feasible (e.g., on subsets of workers).

In summary, we find insufficient epidemiologic evidence to support a causal relation between formaldehyde exposure and leukemia, including myeloid leukemia. We find no clear evidence of an excess risk of leukemia or myeloid leukemia in any large, well-conducted study. Furthermore, we find the occasional positive associations between various exposure metrics and leukemia or myeloid leukemia risk to be inconsistent, and in some instances, contradictory to results based on more conventional exposure characterization approaches. We also find no epidemiologic basis on which to conclude that formaldehyde causes any of the lymphomas. Further weakening arguments for causal associations is the absence of well-defined plausible models of pathogenesis. Nevertheless, in view of the ubiquitous presence of formaldehyde in the population and experimental evidence indicating high-dose carcinogenic potential, at least for portal-of-entry sites, we recommend improved epidemiologic research on potential risks for the LHM.