Contribution to the ongoing discussion on fluoride toxicity

Since the addition of fluoride to drinking water in the 1940s, there have been frequent and sometimes heated discussions regarding its benefits and risks. In a recently published review, we addressed the question of whether current exposure levels in Europe represent a risk to human health. This review was discussed in an editorial asking why we did not calculate benchmark doses (BMD) of fluoride neurotoxicity for humans. Here, we explain why it is problematic to calculate BMDs based on the currently available data. Briefly, the conclusions of the available studies are not homogeneous, reporting negative as well as positive results; moreover, the positive studies lack control of confounding factors such as the influence of well-known neurotoxicants. We also discuss the limitations of several further epidemiological studies that did not meet the inclusion criteria of our review. Finally, it is important not to focus only on epidemiological studies. Rather, risk analysis should consider all available data, including epidemiological, animal, and in vitro studies. Despite remaining uncertainties, the totality of evidence does not support the notion that fluoride should be considered a human developmental neurotoxicant at current exposure levels in European countries.


Introduction
Since the 1940s, fluoride has been added to drinking water in many countries as a means of caries prophylaxis. Fluoride prevents caries at low exposure levels, whereas excessive fluoride exposure causes dental and skeletal fluorosis in humans, and developmental toxicity in animals. Against this background, the European Food Safety Authority (EFSA) defined an adequate intake (AI) level for fluoride of 50 µg/kg b.w. per day, at which the caries preventive effect approached its maximum whilst the risk of dental fluorosis approached its minimum (EFSA 2013). In recent years, the benefits and risks of fluoride exposure to the general population, e.g. by drinking water, fluoridated salt or dental care products, have been heavily debated, with a special focus on potential adverse health effects, such as neurodevelopmental toxicity.

What type of data is needed to assess fluoride developmental neurotoxicity?
To adequately address potential human health concerns caused by exposure to fluoride, the available evidence from all sources should be included. Thus, it is crucial to critically review the evidence from epidemiological, as well as from animal and in vitro studies. Recently, we published a comprehensive review considering the available data from all the study types mentioned above, particularly focusing on developmental toxicity (Guth et al. 2020). Another factor to consider when assessing the potential health risks of fluoride is the expected level of exposure. The focus of our review was on studies investigating the developmental effects of fluoride levels in drinking water in the range of community water fluoridation (CWF) of 0.7-1.0 mg/L, as well as naturally occurring exposure scenarios in Europe which generally do not exceed the AI defined by EFSA. Since our aim was to evaluate whether fluoride exposure in European countries is of potential health concern, we did not address other exposure scenarios, e.g. in areas with endemically occurring high fluoride concentrations in ground and drinking water.
In comparison, other reviews evaluating a potential developmental toxicity of fluoride (e.g. Choi et al. 2012; Grandjean 2019; Grandjean and Landrigan 2014) (i) focused on the evidence from epidemiological studies, but did not include experimental evidence, and/or (ii) included results from endemically high fluoride areas. Thus, it is important to recognize that our review, in comparison to others recently published on fluoride toxicity, aimed to address different questions, and this is reflected in the different inclusion criteria applied. It is therefore not surprising that the conclusions drawn by the authors differ in some respects.
Below, evidence from animal, in vitro and epidemiological studies is briefly summarized primarily focusing on European exposure scenarios as discussed in our review by Guth et al. (2020).

Evidence from animal studies
Chronic toxicity studies in rats, mice, and rabbits that focused on systemic effects of fluoride resulted in lowest-observed-adverse-effect levels (LOAELs) ranging between 4.3 and 7.6 mg/kg b.w./day fluoride, and no-observed-adverse-effect levels (NOAELs) between 2.5 and 7.6 mg/kg b.w./day fluoride. Four well-conducted developmental toxicity studies (Collins et al. 1995, 2001; Heindel et al. 1996) are available which are in accordance with standard guidelines, used adequate numbers of animals, and administered sodium fluoride in drinking water. These studies resulted in NOAELs of 8.5-13.7 mg/kg b.w./day fluoride for rats and rabbits. It should be noted that the influence of specific fluoride doses on plasma levels may vary between different species. For example, it has been suggested that approximately fivefold higher doses in drinking water might be required for rats to achieve serum concentrations similar to those in humans (Dunipace et al. 1995; NRC 2006). However, it must also be taken into account that numerous variables could influence these relationships in both animal and human studies and the factor to calculate plasma concentrations is largely uncertain, in part because it could change with age or duration of exposure (NRC 2006).
To our knowledge, there are currently no further developmental studies that were performed according to standard guidelines. A search of the literature published between 2005 and 2018 revealed a number of animal studies that reported an effect of fluoride exposure on various endpoints in offspring during development (see Guth et al. 2020). We reviewed the quality of these studies and identified various limitations (see Box 1) that hamper their interpretation, thus reducing their value for risk assessment. Studies investigating neurobehavioral toxicity in animals produced conflicting results (NTP 2016). A systematic review by the US National Toxicology Program (NTP) reported a low to moderate level of evidence of adverse effects on learning and memory in rats and mice exposed to fluoride concentrations substantially higher than 0.7 mg/L (NTP 2016). After the publication of the NTP report in 2016, several studies became available that investigated the impact of fluoride exposure on memory and learning in experimental animals. We reviewed the quality of the available studies and found that only two fulfilled the criteria listed in Box 1, although even these had limitations (McPherson et al. 2018; Pulungan et al. 2018). In both studies, no exposure-related differences in motor, sensory, or learning and memory performance were observed at the exposure levels investigated (up to 20 mg fluoride/L or 9 mg fluoride/kg b.w./day, respectively). Of note, one study (McPherson et al. 2018) primarily investigated the influence of fluoride exposure during development on neurobehavioral aspects. The other 11 studies identified in our search had various and strong limitations, and did not meet key quality criteria discussed in detail in our review (see Guth et al. 2020).

Human exposure in relation to adverse effects in animal experiments
The mean intake of fluoride from water, food, beverages and oral hygiene products in European populations is usually below the AI recommended by EFSA. Recently, there has been some debate as to whether exposure in the range of the AI, i.e. 50 µg fluoride/kg b.w./day, is sufficient to cause an increased risk of adverse effects in humans. It has also been suggested that fluoride, at current exposure levels, should be categorized as a human developmental neurotoxicant, and be placed in the same category as lead, methyl mercury, arsenic and polychlorinated biphenyls (Grandjean 2019; Grandjean and Landrigan 2014). To evaluate the situation, we calculated a margin of exposure (MoE) between doses showing no adverse effects in animal studies and the AI. The lowest NOAEL for systemic toxicity from a well-designed chronic animal study was 2.5 mg/kg b.w./day. The lowest NOAEL for developmental toxicity was 8.5 mg/kg b.w./day. Compared to the AI of 50 µg/kg b.w./day, the MoE is ~ 50 (systemic toxicity) or ~ 170 (developmental toxicity); both are high MoEs.
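The MoE arithmetic described above can be reproduced with a minimal sketch. The NOAELs and the AI are the values quoted in the text; the function and variable names are illustrative assumptions and not part of the original assessment.

```python
# Margin of exposure (MoE) = NOAEL / human exposure, with both in the same units.
# All names below are illustrative; the numbers are those quoted in the text.

def margin_of_exposure(noael_mg_per_kg_day: float, exposure_ug_per_kg_day: float) -> float:
    """Divide the NOAEL (mg/kg b.w./day) by the exposure after converting it from µg to mg."""
    exposure_mg_per_kg_day = exposure_ug_per_kg_day / 1000.0
    return noael_mg_per_kg_day / exposure_mg_per_kg_day

AI_UG_PER_KG_DAY = 50.0      # EFSA adequate intake, µg/kg b.w./day
NOAEL_SYSTEMIC = 2.5         # lowest chronic (systemic) NOAEL, mg/kg b.w./day
NOAEL_DEVELOPMENTAL = 8.5    # lowest developmental NOAEL, mg/kg b.w./day

moe_systemic = margin_of_exposure(NOAEL_SYSTEMIC, AI_UG_PER_KG_DAY)
moe_developmental = margin_of_exposure(NOAEL_DEVELOPMENTAL, AI_UG_PER_KG_DAY)

print(f"MoE (systemic toxicity): {moe_systemic:.0f}")
print(f"MoE (developmental toxicity): {moe_developmental:.0f}")
```

The unit conversion is made explicit because the NOAELs are reported in mg and the AI in µg.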

Evidence from in vitro studies
Recent findings suggest that in vitro data should also be considered in the risk evaluation of chemicals (Godoy et al. 2013; Leist 2017). Therefore, we compared the highest reported fluoride concentrations in plasma of healthy individuals (3 µM; summarized by Guth et al. 2020; e.g. Rugg-Gunn et al. 2011) to cell culture medium concentrations causing cytotoxic effects in neuronal and stem cells of rodent and human origin, which occurred at ~ 1 mM in most studies (range: 0.1-4 mM). This results in a ratio of ~ 300, which demonstrates that human plasma concentrations of fluoride are far below cytotoxic levels.
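The ratio quoted above follows from a simple unit conversion, sketched below. The concentrations are those given in the text; the names are illustrative assumptions. The ratios for the two ends of the reported cytotoxic range are computed only to show how the figure depends on which in vitro concentration is chosen.

```python
# Ratio of in vitro cytotoxic concentration to the highest reported human plasma level.
# Names are illustrative; the concentrations are those quoted in the text.

PLASMA_MAX_UM = 3.0               # highest reported plasma fluoride, µM
CYTOTOXIC_TYPICAL_MM = 1.0        # cytotoxic concentration in most studies, mM
CYTOTOXIC_RANGE_MM = (0.1, 4.0)   # reported range across studies, mM

def concentration_ratio(cytotoxic_mm: float, plasma_um: float) -> float:
    """Convert mM to µM, then divide by the plasma concentration."""
    return (cytotoxic_mm * 1000.0) / plasma_um

typical_ratio = concentration_ratio(CYTOTOXIC_TYPICAL_MM, PLASMA_MAX_UM)
range_ratios = tuple(concentration_ratio(c, PLASMA_MAX_UM) for c in CYTOTOXIC_RANGE_MM)

print(f"Typical ratio: {typical_ratio:.0f}")
print(f"Ratios across the reported range: {range_ratios[0]:.0f} to {range_ratios[1]:.0f}")
```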

Evidence from epidemiological studies
Since our review (Guth et al. 2020) addressed the exposure scenarios relevant for European countries, we focused on epidemiological studies conducted in non-endemic fluoride areas or areas with CWF. Furthermore, we based our assessment on prospective studies in which cohorts were followed over a period of time (see inclusion criteria, Box 3). Two prospective cohort studies conducted in CWF areas that considered possible confounding factors (Broadbent et al. 2015; Green et al. 2019, Box 5) were included in our evaluation; their results conflicted. In our review, we also noted that the majority of epidemiological studies conducted in areas with endemically occurring high fluoride levels in ground and drinking water reported an association between lower measures of intelligence and high fluoride exposure.
Other reviews (e.g. Choi et al. 2012; Grandjean 2019; Grandjean and Landrigan 2014) did not only focus on community water fluoridation and prospective cohort studies, but also included cross-sectional epidemiological studies, as well as studies performed in areas with endemically occurring high fluoride concentrations in drinking water. In these reviews, it was concluded that recent epidemiological evidence suggests that elevated fluoride intake during early development can result in considerable IQ deficits (Grandjean 2019).
While the present letter was under review, an article was published on a retrospective cohort study performed by a research institute under the Swedish Ministry of Employment (Institute for Evaluation of Labour Market and Education Policy; IFAU), which estimated a zero effect on cognitive ability for fluoride levels in Swedish drinking water (Aggeborn and Oehman 2021). This article is based on data from a comprehensive retrospective cohort study already discussed in our previous review (Aggeborn and Oehman 2017; see also Guth et al. 2020).

Limitations of epidemiological studies and inclusion criteria
Our analysis of the epidemiological studies repeatedly identified the limitations summarized in Box 2. In line with our goal to assess possible effects of fluoride at current exposure levels in Europe, we used the inclusion criteria summarized in Box 3. However, in a recent editorial (Spittle 2020), the author wrote that we omitted specific studies in the epidemiological section of our manuscript (Guth et al. 2020). This was indeed the case, because these studies did not meet our inclusion criteria. We provide a standardized profile and brief discussion of these studies with their strengths and limitations, while simultaneously addressing the comments of Spittle (2020) (Box 4). These studies do not change the overall conclusion that the totality of currently available scientific evidence does not support the concept that fluoride should be assessed as a human developmental neurotoxicant at the current exposure levels in Europe. Furthermore, the authors of the studies (Box 4) were aware of these limitations and usually addressed them in the corresponding discussions.

Box 2 Limitations observed in previously published epidemiological studies about fluoride exposure and intelligence (IQ)
• Studies were often performed in relatively poor rural regions with unusually high concentrations of fluoride in drinking water, and were compared to 'reference populations' with fluoride drinking water concentrations in the recommended range. It should be considered that unusual fluoride concentrations in drinking water may be associated with a less developed health-care system and lower socioeconomic status.
• Studies were performed in fluoride endemic areas and did not consider the influence of other neurotoxicants, such as arsenic (and other contaminants) in groundwater.
• The IQ of the parents was not considered in some studies. The 'outcome' (intelligence) may influence the 'input' (fluoride exposure).
Explanation: It is possible that parents with higher IQ read or inform themselves about the possible health hazards to children, and therefore avoid fluoride exposure. In this case high maternal/parental intelligence would be causally linked to lower fluoride exposure rather than high fluoride exposure causing lower intelligence in children.
• IQ analysis was not optimally performed to allow a reliable evaluation of intelligence.
Example: IQ was analyzed only once between the ages of 3 and 4 when the IQ of children is known to change rapidly, but the exact age was not considered.
• Statistical significance was reported, although removing only one or two cases with extreme IQ scores from the models would result in non-significant associations.
• In some studies, which were based on previous larger programs, the authors did not describe why only a fraction of the individuals was chosen for the particular study.

Key message
Higher prenatal fluoride exposure, in the general range of exposures reported for other general population samples of pregnant women and non-pregnant adults, was associated with lower scores on tests of cognitive function in the offspring at age 4 and 6-12 years.

Characteristics

Strengths
• Prospective study design
• Individual exposure assessment (urinary fluoride concentrations)
• Urinary fluoride concentrations were corrected for dilution (creatinine or specific gravity)
• Data on maternal IQ were available
• Consideration of numerous covariates (adjustment for gestational age, weight at birth, sex, parity (being the first child), age at outcome measurement, and maternal characteristics including smoking history (ever smoked vs. nonsmoker), marital status (married vs. others), age at delivery, IQ, education, socioeconomic status, cohort and exposure to other neurotoxicants such as lead and mercury)
• 6-12 years of follow-up

Limitations
• Lack of data on fluoride content in water
• A relatively large part of the participants (around 40 %) provided only one spot urine sample
• Conducted in an endemic fluoride area, but the impact of known potential neurotoxins occurring in the study region, e.g. arsenic, manganese, PCBs, was not considered
• Covariates not considered: fluoride exposure via diet, iodine in salt, alcohol consumption of the mothers, breastfeeding, maternal exposure to the neurotoxicant arsenic

Explanation: Participants from the Early Life Exposures in Mexico to Environmental Toxicants (ELEMENT) project were studied. Fluoride was measured in archived spot urine samples taken from mothers during pregnancy and from their children when 6-12 y old, adjusted for urinary creatinine and specific gravity, respectively. Intelligence of children was measured by the General Cognitive Index (GCI) of the McCarthy Scales of Children's Abilities at age 4 and full-scale intelligence quotient (IQ) from the Wechsler Abbreviated Scale of Intelligence (WASI) at age 6-12. Higher prenatal exposure to fluoride, in the general range of exposures reported for other general population samples of pregnant women and non-pregnant adults, was associated with lower GCI scores and with lower full-scale IQ scores. The authors noted as a limitation of the study that fluoride was measured in spot (second morning void) urine samples instead of 24-hr urine collections. However, it was also emphasized that a close relationship between the fluoride concentrations of early morning samples and 24-hr specimens has been reported. Other limitations mentioned by the authors include the lack of information about iodine in salt, which could have an influence on cognition; the lack of data on fluoride content in the water consumed; and the lack of information on exposure to the neurotoxicant arsenic. Of note, the ELEMENT birth cohort was conducted to investigate early life exposures to environmental toxicants in Mexico City.
Relatively high concentrations of numerous contaminants were detected in urinary and/or blood samples of pregnant women and their children. For example, potential developmental neurotoxicants such as manganese (Claus Henn et al. 2018) and various other metals (arsenic, lead, cadmium, aluminium etc.; Lewis et al. 2018), or biomarkers of pyrethroid exposure (3-phenoxybenzoic acid) (Watkins et al. 2016) were found to be present in urine or other biological samples. Furthermore, phthalate metabolites and bisphenol A were detected in maternal urinary samples (Watkins et al. 2017).

Key message
Mental development (MDI) showed an inverse association with fluoride levels in maternal urine during pregnancy suggesting that cognitive alterations in children born from fluoride exposed mothers could start in early prenatal stages of life.

Characteristics

Strengths
• Prospective study design
• Individual exposure assessment (individual water samples, urinary fluoride concentrations)

Limitations
• Small sample size
• Short period of follow-up
• Only a small percentage of the participants provided biological samples
• Maternal IQ was not reported
• Conducted in an endemic fluoride area, but the impact of other neurotoxicants (lead, mercury, arsenic) was not considered
• Only a small set of covariates and sensitivity variables was used (gestational age, type of water for consumption, marginalization index and age of child)
• Urinary fluoride concentrations were not corrected for creatinine

Explanation: The association between in utero exposure to fluoride and Mental and Psychomotor Development (MDI and PDI) was evaluated through the Bayley Scale of Infant Development II (BSID-II) in infants at age 3-15 months (average of 8 months). The sample included 65 mother-infant pairs from two endemic hydrofluorosis areas in Mexico (Durango City and Lagos de Moreno, Jalisco, Mexico). Environmental exposure to fluoride was quantified in tap and bottled water samples and in maternal urine; samples were collected during the 1st, 2nd and 3rd trimester of pregnancy. The mean values of fluoride in tap water for the 1st, 2nd and 3rd trimester were between 2.6 and 3.7 mg/L; more than 80% of the samples exceeded the reference value of 1.5 mg/L with a maximum value of 12.5 mg/L. Maternal urinary fluoride levels were found to be associated with statistically significantly lower MDI scores after adjusting for gestational age, age of child, a marginality index, and type of drinking water.
As also noted by the authors, this study has some limitations, including the small sample size of evaluated children, the short period of follow-up and the low proportion of participants who provided biological and environmental samples in the last trimester. It was conducted in residents from endemic fluoride areas, and the impact of other neurotoxicants, e.g. mercury, arsenic, and lead, was not considered. Furthermore, only a small set of covariates and sensitivity variables was used (gestational age, age of child, marginalization index and type of water for consumption) and urinary fluoride measures were not corrected for creatinine (which, by adjusting for urinary dilution effects, provides a more reliable measure of internal fluoride exposure). It should also be noted that although BSID-I and -II are validated as end points for primary neurodevelopment measures and can be used to assess early developmental delays, neither is designed to predict long-term neurocognitive outcome (Sun et al. 2015). Furthermore, BSID-I and -II do not have adequate measures to eliminate confounding resulting from cultural influence (Sun et al. 2015). It is well known that during the assessment procedure the responses can be significantly altered depending on the individual cultural background and/or ethnicity (Sun et al. 2015).

Key message
In endemic fluorosis areas, drinking water fluoride levels greater than 1.0 mg/L may adversely affect the development of children's intelligence.

Strength Limitations
•  (NRC, 2006). In the opinion of the NRC, among the studies, the one by Xiang et al. (2003) had the strongest design. However, it was noted that overall the significance of these Chinese studies is uncertain (NRC, 2006). The mean IQ of children in Wamiao did not differ significantly from the mean IQ of children in Xinhuai in the 0.75 mg/L fluoride group, but was statistically different in the 1.53 mg/L fluoride group (p< 0.05) and also in groups with a mean fluoride concentration of 2.46 mg/L fluoride and higher (p < 0.01). The children's IQs were not related to urinary iodine, family income, or parent's education level. Furthermore, the authors of the study determined a reference value concentration of 0.925 mg fluoride/L and stated that this is very close to the current national fluoride standard for China of <1.0 mg F/L suggesting that the current national standard is safe enough to protect 90% of children, aged 8-13 years, from adverse effects on their intelligence development. This reference value also reflects the range of CWF in several countries or is even somewhat higher.

Key message
The study results suggest that exposure to fluoride reduces the prevalence of dental caries, but no association was found to the intelligence of children.

Characteristics

Strengths
• Individual exposure assessment (water fluoride concentrations; urinary fluoride concentrations)

Limitations
• Cross-sectional study design
• Only a small percentage provided biological samples
• Conducted in an endemic fluoride area but the impact of other neurotoxicants (lead, mercury, arsenic) was not considered
• Incomplete set of covariates and sensitivity variables was used (diet, oral hygiene, body mass index, and socioeconomic status)
• Urinary fluoride measures were not corrected for dilution

Explanation: The study by Soto-Barreras et al. is a cross-sectional study which investigated 161 children in Chihuahua, Mexico, from 9 to 10 years of age. This study has the general limitations of cross-sectional studies already discussed above. The concentrations of fluoride in drinking water (range: 0.79-1.48 mg/L) and urine were analyzed individually, but the urinary fluoride concentrations were not corrected for dilution by creatinine or specific gravity. The intellectual ability of children was evaluated through the Raven's Colored Progressive Matrices. Variables such as diet, oral hygiene, body mass index, and socioeconomic status were included. However, important factors known to influence children's IQ such as breastfeeding, low birth weight, and exposure to other neurotoxic chemicals were not analyzed. No relationship was found between intellectual ability and fluoride exposure variables such as dental fluorosis, levels of fluoride in drinking water and urine, and exposure dose. According to the authors, the results suggest that fluoride exposure above 1.0 mg/L reduces the prevalence of dental caries, but no association was found with the intelligence of children. Thus, the study results are in agreement with the conclusions drawn by Guth et al. (2020), which do not support the presumption that fluoride should be assessed as a human developmental neurotoxicant at the current exposure levels in Europe.

Key message
Exposure to increasing levels of fluoride in tap water was associated with diminished non-verbal intellectual abilities; the effect was more pronounced among formula-fed children.  Guth et al. (2020) and used the same MIREC cohort in Canada reported already by Green et al. (2019). 398 mother-child dyads who reported drinking tap water were examined. The water fluoride concentration was estimated using municipal water reports. Linear regression was used to analyze the association between fluoride exposure and IQ scores, measured by the Wechsler Primary and Preschool Scale of Intelligence-III at 3-4 years. It was examined whether feeding status (breast-fed (BF) versus formula-fed (FF)) modified the impact of water fluoride and if fluoride exposure during fetal development attenuated this effect. Covariate adjustment included child's sex and age at testing, maternal education (dichotomized as either a bachelor's degree or higher versus trade school diploma or lower), maternal race (white or other), second-hand smoke in the home (yes, no), and quality of the child's home environment (measured at time of testing using the Home Observation for Measurement of the Environment (HOME) -Revised Edition). A 0.5 mg/L increase in water fluoride concentration was associated with a decrease of 4.4 Full Scale IQ (FSIQ) points (95% CI: −8.34, −0.46, p=.03) in the FF group, but it was not significantly associated with FSIQ in the BF group (B=−1.34, 95% CI: −5.04, 2.38, p=.48). However, removing only two cases with extreme IQ scores from the models resulted in non-significant associations between water fluoride concentration and FSIQ in both groups. According to the authors the association between water fluoride concentration and FSIQ must be interpreted with caution, because the association became non-significant when the two outliers were removed. 
Controlling for fetal exposure by adding maternal urinary fluoride (MUF) to the model resulted in non-significant associations between water fluoride concentration and FSIQ in both the FF and BF groups. The water fluoride concentration was significantly associated with lower Performance IQ (PIQ) in the FF (B=−9.26, 95% CI: −13.77, −4.76, p < .001) and the BF groups (B=−6.19, 95% CI: −10.45, −1.94, p=.004). The association between water fluoride concentration and Performance IQ remained significant after controlling for fetal fluoride exposure among formula-fed children. As mentioned by Till et al., breastmilk contains extremely low concentrations of fluoride (0.005-0.01 mg/L) due to the limited transfer of fluoride in plasma into breastmilk. Therefore, it seems unlikely that water fluoride concentration plays a role in fluoride exposure among exclusively breastfed children, or that it could have an effect on their IQ. Children in the breastfed group had higher FSIQ and Verbal IQ (VIQ) scores relative to the formula-fed group, regardless of fluoridation status. This is consistent with prior studies showing a positive effect of breastfeeding on cognition, among them the Broadbent et al. study (2015), which found that breastfeeding was associated with higher child IQ irrespective of residence in CWF or non-CWF areas. The authors recognize that higher education and income levels in the breastfed group (and so potentially higher IQ of the mothers) likely account for part of this association. Overall, Till et al. present a well-conducted prospective cohort study based on the same cohort as the Green et al. study, which we discussed in detail in our review (Guth et al. 2020). The authors suggested that the developing brain may also be adversely affected by fluoride exposure during infancy but also discussed the limitations of their study. Our review has pointed out some further limitations of the Green et al. study which also apply to the Till et al. study.
A limitation of both studies is the lack of IQ data of the mothers. An additional limitation also applying to Green et al. is that the intelligence tests have been performed only once between the age of 3 and 4 years, but the exact age of the children at the time point of the test has not been considered in the statistical analysis. This may be problematic, because the IQ of children changes strongly between 3 and 4 years.

Discrepancy between experimental and epidemiological evidence
We observed a discrepancy between experimental and epidemiological evidence, which may be explained by deficiencies that were inherent to most of the current epidemiological studies, e.g. insufficient consideration of potential confounders. The majority of epidemiological studies which reported an association between lower measures of intelligence and high fluoride exposure was conducted in areas with endemically occurring high fluoride levels in ground and drinking water. In contrast, the experimental evidence suggests that current exposure to fluoride, even for individuals with relatively high fluoride intake, is clearly below levels that have led to adverse effects in vitro or in animals.

Reasons why it is problematic to calculate benchmark doses (BMD) for humans
A main criticism of our review was that we made 'no attempt to calculate the threshold for fluoride neurotoxicity using the standard benchmark dose method'. Grandjean used the regression coefficients and their standard deviations as provided in the published reports to estimate tentative BMD values. A BMDL of about 0.2 mg/L or below was suggested (Grandjean 2019), which was similar to the result calculated from a previous study (Xiang et al. 2003) by Hirzy et al. (2016) (for a brief discussion of studies calculating BMDs for humans see Box 6). It should be considered that even without fluoridation, the fluoride concentration in drinking water in Europe often ranges around 0.5 mg/L and is therefore higher than the BMDL of 0.2 mg/L derived by Grandjean (2019). It was concluded that the benchmark dose of fluoride neurotoxicity is clearly below commonly occurring fluoride exposure levels.

Fig. 1 Correlation of maternal urinary fluoride concentration and full-scale IQ (FSIQ), reproduced from Green et al. (2019). Using this set of data, the authors concluded: "An increase from the 10th to 90th percentile of maternal urinary fluoride was associated with a 3.14 IQ decrement among boys." (Green et al. 2019). However, because of the relatively high variability of the IQ data, recently calculated benchmark doses of human neurotoxicity (Grandjean 2019) should be treated with caution
We did not follow this approach to calculate the BMD, because the results of such calculations would be questionable due to the inadequate quality of the available input data. It remains unclear why two studies (Bashash et al. 2017; Green et al. 2019) were finally selected to calculate the BMD (Grandjean 2019), whereas others with a negative result (Broadbent et al. 2015) were omitted. The studies by Valdez Jimenez et al. (2017) and Bashash et al. (2017) have limitations, such as the lack of control of the influence of other neurotoxicants and small sample size (Box 4). Green et al. (2019) openly discussed the limitations of their own study directly in their publication; these are briefly summarized in Box 5. The difficulty to 'calculate the threshold of fluoride neurotoxicity' (Spittle 2020) is illustrated using a scatter plot of IQ (FSIQ) versus maternal urinary fluoride, where each dot represents the IQ of a child (Fig. 1; reproduced from Fig. 3A of Green et al. 2019). The trend line of IQ of the girls slightly increased with higher maternal urinary fluoride, although this effect was not statistically significant. In contrast, a significantly lower IQ was observed for boys, which depended on the two individuals with the highest urinary fluoride (Fig. 1). This difference led to the suggestion that there may be a sex difference in response to fluoride (Grandjean 2019). However, considering the high overall variability of IQ among the children in the study, this interpretation should be made with caution. Rather, further studies are required before such conclusions can be drawn.
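How strongly a fitted trend line can depend on a few observations at the upper end of the exposure range can be illustrated with a minimal sketch. The numbers below are purely synthetic and were chosen for illustration; they are not data from Green et al. (2019) or any other study, and all names are hypothetical. The sketch fits an ordinary least-squares slope with and without the two highest-exposure points.

```python
# Synthetic illustration only: these values are NOT taken from any study.
# It shows how two high-exposure points can determine an ordinary
# least-squares (OLS) slope when the remaining data show no trend.

def ols_slope(xs, ys):
    """Return the OLS slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Ten hypothetical children: exposure 0.1 to 1.0 (arbitrary units),
# IQ scattered symmetrically around 100 with no underlying trend.
exposure = [0.1 * i for i in range(1, 11)]
iq = [97, 103, 98, 102, 100, 100, 102, 98, 103, 97]

# Two additional hypothetical children with the highest exposure and low IQ.
exposure_all = exposure + [2.4, 2.5]
iq_all = iq + [85, 83]

slope_all = ols_slope(exposure_all, iq_all)
slope_trimmed = ols_slope(exposure, iq)

print(f"Slope with all 12 points: {slope_all:.2f}")
print(f"Slope without the two highest-exposure points: {slope_trimmed:.2f}")
```

In this constructed example the negative slope is produced entirely by the two added points; a comparable sensitivity check on real data would require the original individual-level values.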

Key message
No clear differences in IQ because of fluoride exposure via community water fluoridation (CWF) were noted. The findings of the study do not support the assertion that fluoride in the context of CWF programs is neurotoxic.
Explanation: We analyzed this study in detail in our review article (Guth et al. 2020). The study was performed with a general population sample of 1037 children born in Dunedin, New Zealand. The participants were followed for 38 years, and their fluoride intake via drinking water (residence in a CWF area versus non-CWF area; 0.7-1.0 mg fluoride/L vs. 0.0-0.3 mg fluoride/L), fluoride dentifrice, and/or 0.5 mg fluoride tablets in early life (prior to age 5 years) was deduced. IQ was assessed repeatedly between ages 7 and 13 years and at age 38 years. No statistically significant differences in IQ due to fluoride exposure were reported, also after adjustment for potential confounding variables, including sex, socioeconomic status, breastfeeding, and birth weight (as well as educational attainment for adult IQ outcomes). It was one of two available studies with a suitable study design (prospective cohort study) conducted in a non-endemic CWF area that appropriately considered confounding factors. However, we also identified some limitations, already described in our review. For example, parental IQ is a strong confounder, but the IQ data of the mothers were lacking. Furthermore, individual water intake was not directly measured and dietary fluoride was not considered. The study by Broadbent et al. was classified as a high quality study not only in our review but also by other bodies, e.g. the Health Board of Ireland (Sutton et al. 2015) and the National Health and Medical Research Council (NHMRC) of the Australian Government (NHMRC 2017). These bodies stated, in agreement with our review, that the study by Broadbent et al. is a high quality prospective cohort study with a low risk of bias (NHMRC 2017). Moreover, this study is one of the very few prospective cohort studies conducted in a CWF area, and its design is appropriate for inferring causality (Sutton et al. 2015).
The Broadbent study has been critiqued (e.g. by Spittle 2020) on the grounds that the estimated difference in the exposure to fluoride from drinking water, food, toothpaste, beverages, and fluoride supplements between the CWF group and the non-CWF group, less than 0.2 mg/day, was so small that it was unlikely to lead to a detectable difference in IQ (Hirzy et al. 2016 was cited). Broadbent et al. reported residence in an area with or without CWF (0.7-1.0 ppm and 0.0-0.3 ppm fluoride, respectively), coded from residential address data (Broadbent et al. 2015). Hirzy et al. (2016) estimated a daily intake of 1.36 and 1.19 mg/day for the CWF and non-CWF group, respectively. This was based on several assumptions, e.g. that a relatively high proportion of the overall exposure results from dietary sources or supplements. Osmunson et al. also calculated total fluoride intake for the CWF and non-CWF Dunedin Cohort participants using publicly available data (Osmunson et al. 2016) and estimated that lifetime CWF children had a mean total fluoride intake of 0.7 mg/day, while non-CWF children averaged 0.5 mg/day. Broadbent et al. responded that this estimate is similar to their own estimate of an average difference in total daily fluoride intake of 0.3 mg/day through the first five years of life between study members from CWF versus non-CWF areas. According to Broadbent et al. (2016), these differences are consistent with the literature.

Other researchers have estimated that the increase in fluoride intake among children aged one to three years attributable to CWF is 0.2 mg/day (Rojas-Sanchez et al. 1999) or 0.3 mg/day (Cressey et al. 2010). Even when considering that the difference between fluoride exposure in CWF and non-CWF areas is small, the results of the Broadbent study showed that living in a CWF or non-CWF area did not have a significant effect on the IQ of children, suggesting that fluoride concentrations in drinking water in the range of CWF do not represent a health concern. It is reasonable that the exposure difference is much smaller in CWF areas than the difference occurring in endemic fluoride areas with a very high fluoride concentration in drinking water.
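The intake estimates quoted above can be placed side by side with a few lines of Python. The figures are those cited in the text; the variable names are chosen here for illustration only.

```python
# Daily fluoride intake (mg/day) as (CWF group, non-CWF group), as quoted above
paired_estimates = {
    "Hirzy et al. (2016)": (1.36, 1.19),
    "Osmunson et al. (2016)": (0.7, 0.5),
}

# Differences that the cited sources report directly (mg/day)
reported_differences = {
    "Broadbent et al. (2016)": 0.3,
    "Rojas-Sanchez et al. (1999)": 0.2,
    "Cressey et al. (2010)": 0.3,
}

# CWF minus non-CWF intake, rounded to the precision of the inputs
differences = {
    source: round(cwf - non_cwf, 2)
    for source, (cwf, non_cwf) in paired_estimates.items()
}
differences.update(reported_differences)

for source, diff in differences.items():
    print(f"{source}: {diff:.2f} mg/day")
```

All five estimates fall between about 0.2 and 0.3 mg/day, which is the range of exposure contrast that the Broadbent study was able to examine.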
Box 5 Two prospective studies already considered in Guth et al. (2020)

Key message
In this study, maternal exposure to higher levels of fluoride during pregnancy was associated with lower IQ scores in children aged 3 to 4 years. Only a fraction of the individuals from a larger program was chosen for the study.
Explanation: We analyzed this study in detail in our review article (Guth et al. 2020). It was one of two available studies with a suitable study design (prospective cohort study) conducted in a non-endemic CWF area that also appropriately considered confounding factors (even though there are still some limitations, see below). A limitation of the study is the lack of IQ data of the mothers, because parental IQ is a strong confounder. Moreover, it cannot be excluded that the 'outcome' (intelligence) influenced the 'input' (fluoride exposure). This is an important possible confounder which is frequently not considered. It is possible that more intelligent mothers inform themselves more about possible health hazards for children and therefore avoid fluoride exposure. In this case, high maternal intelligence would be causally linked to lower fluoride exposure, rather than high fluoride exposure causing lower intelligence of children. An additional limitation is that the intelligence tests were performed only once between the age of 3 and 4 years, but the exact age of the children at the time of the test was not considered in the statistical analysis. This may be problematic, because the IQ of children changes strongly between 3 and 4 years. Moreover, only 610 of 2001 pregnant women from the MIREC program were considered, and information on maternal urinary fluoride was missing for a relatively high fraction of the mothers of children whose IQ was determined. This may represent a possible source of bias. Green et al. (2019) considered some of the relevant confounders (city, socioeconomic status, maternal education, race/ethnicity, prenatal secondhand smoke exposure) but did not adjust for others (breastfeeding, low birth weight, alcohol consumption and further dietary factors, other sources of fluoride exposure, exact age of children at the time of testing). Furthermore, assessment of children's postnatal fluoride exposure via, e.g., diet, fluoride dentifrice, and/or fluoride tablets was not included.
Hirzy et al. (2016) applied the benchmark approach to the study by Xiang et al. (2003). The authors discussed the limitations of their study and stated that the calculated reference dose is based on a limited amount of quantitative data, most of which is of ecological design. Furthermore, they were unable to find any data on human intellectual performance as a function of fluoride exposure in the USA. Nor were there studies, other than those by the Xiang et al. research group, which provided any useful dose-response information. In estimating RfD values, Hirzy et al. used mean water consumption rates, except as noted, and mean IQ measurements that were derived from different testing methods, recognizing the limitations of these uses and those inherent in ecological studies generally. The authors stated that the data they used for the food component in estimating total fluoride intake were mean values from one study that were not accompanied by standard deviations. They were, however, somewhat higher than the values for children's food fluoride exposures in the USA, indicating a conservative approach. Overall, Hirzy et al. acknowledged that it would have been useful to have a more robust data set on which to base the risk analysis, but in their opinion waiting for more such data, which are unlikely to be developed in the near future, did not seem reasonable.
Because of these limitations, already discussed by the authors, we recommend not using the data for calculation of a reference dose and exposure limit. As concluded in our review article, further research is needed for a comprehensive risk assessment, for example, high-quality prospective epidemiological studies that adequately control for any confounding factors.
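Why adequate control of confounders matters, for example of parental IQ, can be illustrated with a small simulation in Python. In this sketch, maternal IQ lowers fluoride exposure and raises child IQ, while fluoride itself has no effect on child IQ. All effect sizes, distributions, units, and names are hypothetical assumptions chosen for illustration; they do not describe any of the cited cohorts.

```python
import random


def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Part of y that is not linearly explained by x."""
    slope = ols_slope(x, y)
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return [b - (mean_y + slope * (a - mean_x)) for a, b in zip(x, y)]


rng = random.Random(0)  # fixed seed so that the sketch is reproducible
n = 2000

# Assumed data-generating model (fluoride in arbitrary units):
maternal_iq = [rng.gauss(100, 15) for _ in range(n)]
# higher maternal IQ -> lower fluoride exposure
fluoride = [1.0 - 0.01 * (m - 100) + rng.gauss(0, 0.2) for m in maternal_iq]
# child IQ depends on maternal IQ only; fluoride has NO direct effect
child_iq = [100 + 0.5 * (m - 100) + rng.gauss(0, 10) for m in maternal_iq]

# Crude association, ignoring maternal IQ
crude = ols_slope(fluoride, child_iq)

# Association adjusted for maternal IQ (regression of residuals on residuals)
adjusted = ols_slope(residuals(maternal_iq, fluoride),
                     residuals(maternal_iq, child_iq))

print(f"crude slope:    {crude:.1f} IQ points per unit fluoride")
print(f"adjusted slope: {adjusted:.1f} IQ points per unit fluoride")
```

Under these assumptions, the unadjusted analysis yields a clearly negative slope although fluoride has no effect in the simulated data, and the slope moves close to zero once maternal IQ is taken into account. A study lacking maternal IQ data cannot make this adjustment.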

Grandjean (2019)
Grandjean (2019) evaluated epidemiological studies and reported that, based on these studies, early-life exposures were negatively associated with children's performance in cognitive tests. According to the conclusions drawn by Grandjean, neurotoxicity appeared to be dose-dependent, and he conducted a tentative benchmark dose calculation, which suggests that safe exposures are likely to be below currently accepted or recommended fluoride concentrations in drinking water. As a result, he concluded that the recent epidemiological studies support the notion that elevated fluoride intake during early development can result in IQ deficits that may be considerable. Considering the high overall variability of IQ, this interpretation should be treated with caution. Rather, further studies are required before such conclusions can be drawn.

Conclusion
There are varying opinions on the health effects of high fluoride exposure. Our recent assessment was based on evidence from animal, in vitro and epidemiological studies focusing on exposure scenarios relevant for the population in Europe. Others included epidemiological evidence from endemic areas in their assessment. Moreover, we critically discussed the insufficient consideration of confounding factors and the deficiencies of study design and statistical evaluation in the available epidemiological studies. Thus, the differences in the study populations considered and the different standards applied in evaluating the quality of epidemiological studies may at least in part explain the different assessments. Even when considering the additional studies which did not meet the inclusion criteria of our first review article (see Box 4), we still arrive at the same conclusions: the available epidemiological evidence does not provide sufficient arguments to raise concerns with regard to CWF in the range of 0.7-1.0 mg/L, nor does it justify categorizing fluoride as a human developmental neurotoxicant, which would signify that it is similarly problematic as lead or methylmercury at current exposure levels.
Of course, the conclusions may have to be reconsidered if new comprehensive findings from epidemiological or animal studies are presented.

Final recommendations
Calculation of a threshold for human fluoride neurotoxicity based on selected epidemiological studies may be problematic since the available data are not considered to be sufficient to perform a dose response assessment. For risk evaluation, it is important to consider all available data, including animal experiments and in vitro studies. Further animal studies and prospective epidemiological studies would be