Background

Worldwide, there are around 11 million welders and 110 million additional workers exposed to welding fumes (Guha et al. 2017). In Sweden, there are around 25,000 welders and 200,000–250,000 additional workers who perform welding alongside their main work task (Sjögren 2013). Welding results in exposure to high levels of fine and ultrafine welding particles and UV radiation, and, in some cases, co-exposure to asbestos and solvents (Guha et al. 2017). The composition of welding fumes depends strongly on the materials used for welding (e.g., mild or stainless steel electrodes) and the type of welding process (e.g., gas or arc welding).

The International Agency for Research on Cancer (IARC) recently classified welding fumes as Group 1, i.e., “carcinogenic to humans” (Guha et al. 2017; IARC 2019), based on epidemiological evidence showing increased risk of lung cancer in welders. The risk for lung cancer was higher in welders who had a longer or higher cumulative exposure (Matrat et al. 2016; Siew et al. 2008; Sorensen et al. 2007; ’t Mannetje et al. 2012). The assessment is valid for welding fumes in general, i.e., it includes both mild and stainless steel welding (IARC 2019), and the key carcinogenic components of welding fumes are considered to be the particles. Strategies to identify the cancer risk associated with exposure to welding fumes are, therefore, needed, and a sound approach would be to develop methods to measure both the cancer-relevant particles at the workplace and early biomarkers of welding-related lung cancer.

The mechanisms behind the carcinogenicity of welding fumes are not fully understood. Several studies reported, for example, an increase in oxidative stress markers (Han et al. 2005; Hoffmeyer et al. 2012a, b), systemic inflammation (Kim et al. 2005; Shen et al. 2018; Wang et al. 2005), and immune suppression (Grigg et al. 2017; Marongiu et al. 2016) following exposure to welding fumes. In contrast, in cross-sectional studies of Swedish welders and controls, we found limited evidence for increased oxidative stress and inflammation, by measuring 8-oxodG in urine as well as cytokines and CRP in the blood (Li et al. 2015a, b). An in vivo study indicated that welding fumes derived from stainless steel welding act as a cancer promoter and lead to lung cancer in mice previously exposed to 3-methylcholanthrene, a tumor inducer (Zeidler-Erdely et al. 2013).

Other mechanisms of welding-induced cancer could involve interactions with the DNA, without preceding inflammation. In a cross-sectional approach, we found slightly shorter telomeres in welders compared with controls (Li et al. 2015b). In the same study sample, we found that welders had lower DNA methylation (hypomethylation) of the F2RL3 gene (Hossain et al. 2015), which has been suggested to be a prospective marker for risk of lung cancer (Zhang et al. 2015).

The goal of this study is to identify early biomarkers for welding-related cancer risk. To this end, we evaluated a panel of 91 putative cancer-related proteins in a cohort of welders and controls from southern Sweden, sampled on two occasions 6 years apart. In addition, the significant associations observed in the longitudinal data were further evaluated in welders and controls sampled once, i.e., in a cross-sectional setting (cross-sectional group; flowchart of the study participants in Fig. 1). The evaluated proteins are involved in processes critical for cancer initiation and progression such as cell adhesion, motility, proliferation, differentiation, angiogenesis, and immune response.

Fig. 1

Flowchart of the study design. Incomplete data refer to incomplete questionnaire and/or low quality of the protein data

Results

Characteristics of the study participants

All study participants were males with a relatively healthy lifestyle: the majority were non-smokers and had low alcohol consumption and a medium–high intake of vegetables (Table 1). A comparison between welders and controls across the three study groups (main study cohort examined on two occasions, timepoint 1 and timepoint 2, and cross-sectional group examined once) revealed that welders were more likely to be born outside Sweden (p < 0.005) and to live in towns and the countryside rather than in large or small cities (p < 0.005). There were no other significant differences between welders and controls in the evaluated characteristics. An additional comparison between the three study groups revealed that age differed significantly (p < 0.001); the age of the cross-sectional group was closer to that of the cohort at timepoint 1. There were no other significant differences between the main study cohort and the cross-sectional group.

Table 1 Characteristics of the main study group (cohort measured twice) and cross-sectional group (measured once) of welders and controls

Between the two sampling occasions, both the years of welding and the cumulative exposure increased significantly for welders (Table 1). There was no significant change in the respirable dust levels measured in the workplace (1.1 mg/m3 at timepoint 1 and 0.9 mg/m3 at timepoint 2). The median exposure to respirable dust among the welders (i.e., respirable dust adjusted for personal protective equipment, PPE) was slightly higher at timepoint 1 (0.7 mg/m3) than at timepoint 2 (0.5 mg/m3) (p value = 0.052). The median area levels of respirable dust in the control companies were 0.09 mg/m3 (0.02–0.2) at timepoint 1 and 0.03 mg/m3 (0.02–0.06) at timepoint 2. The cross-sectional group of welders had a median respirable dust level of 1.6 mg/m3 (0.8 mg/m3 when adjusted for PPE). The correlations between years of welding and respirable dust levels (adjusted for PPE) were in general very low (rS = −0.07 for timepoint 1, rS = −0.14 for timepoint 2, and rS = −0.12 for the cross-sectional group). Body-mass index increased significantly at timepoint 2 compared with timepoint 1 in both welders and controls (Table 1). None of the other characteristics of the study participants changed significantly between the two sampling occasions.

Principal component analysis was performed to evaluate to which extent characteristics of the study participants could explain the variation in serum levels of cancer-related proteins. The complete description of the 91 measured proteins is appended in Supplementary Table 1. The three groups, cohort timepoint 1 and 2 and the cross-sectional group, showed similar patterns, with age, body-mass index, and use of snus being the most significant parameters explaining protein variation (Fig. 2). For this reason, age, body-mass index and use of snus were used as covariates in the statistical analyses. The use of snus explained to a large extent cornulin (CRNN) levels in serum; removing CRNN from the analysis reduced the impact of snus on explaining the overall variation in the data set (data not shown). The principal component analysis did not reveal any association with variables such as residence, personal history of cancer, or particle exposure from hobby activities.

Fig. 2

Heatmaps of the principal components (PC) that explain the variation in the study groups. Numbers on the horizontal axis of the heatmaps indicate the number of the principal component, with the percentage of the explained variation in the protein dataset in brackets. Heatmaps were constructed using input data from linear regression of the association between the principal components of the data and the biological annotations. The influence of the biological annotations on the overall variation is plotted in a heatmap based on the p value of the association. Input data were the normalized protein expression values (on a log2-scale). Lines in the key indicate p value thresholds (blue: unadjusted; green: Bonferroni-adjusted for multiple comparisons). ‘group’ refers to welders and controls; ‘residence’ refers to current residence in large and small cities as compared with towns and countryside; ‘hobby exposure’ refers to exposure to welding fumes, dust, engine exhaust, or diesel engine during leisure activities; ‘country of birth’ is categorized as Sweden or outside Sweden; ‘education’ is assigned to five categories from secondary school to university studies; ‘vegetables’ refers to frequency of intake of vegetables and is assigned to eight categories from 3 per day or more to never; ‘physical activity’ is based on four categories from sedentary to intensive physical activity; ‘ever smoking’ stands for current or previous smoking and is categorized as ‘yes’ and ‘no’; ‘alcohol’ stands for frequency of alcohol intake and is stratified in six categories from every day to never (color figure online)

Differentially expressed cancer-related proteins in welders compared with controls

Using a longitudinal analysis (linear mixed models), we identified 14 out of 91 serum proteins that were significantly different in welders compared with controls (Table 2). Syndecan 1 (SDC1) was also significant after adjusting for multiple testing. Complete output data from the differential expression analysis between welders and controls are appended in Supplementary Table 2. Sensitivity analyses were performed (1) including only the non-smokers (n = 74 welders, n = 89 controls) and (2) excluding welders with respirable dust levels > 2.5 mg/m3 (n = 10). In the sensitivity analyses, all proteins identified in the linear mixed models remained significant (p value < 0.05), with the exception of KLK8 (p value = 0.059).

Table 2 Differentially expressed proteins in serum between welders and controls in the main study group (linear mixed models) and corresponding data for the cross-sectional group (linear models)

Three of the proteins identified by linear mixed models, i.e., syndecan 1 (SDC1), folate receptor 1 (FOLR1), and secreted protein acidic and cysteine rich (SPARC), were also differentially expressed in the cross-sectional group, which served as a replication set, and two additional proteins, i.e., kallikrein-related peptidase 8 (KLK8) and ephrin-receptor A2 (EPHA2), showed p < 0.1 in the cross-sectional group. The majority of the significant proteins (12 out of 14) in the study cohort were altered in the same direction in the cross-sectional group (Table 2). SDC1, FOLR1, and SPARC were downregulated in welders as compared with controls in both the longitudinal and the cross-sectional analysis (Table 2, Fig. 3). In addition, FOLR1 seemed to decrease from timepoint 1 to timepoint 2 both in welders and in controls (Fig. 3).

Fig. 3

Box plots of normalized protein expression (NPX) values for SDC1, FOLR1, and SPARC in the six study groups. The p values (t test) for the comparisons between welders and controls at the different timepoints were significant (< 0.05) for all proteins, apart from the SPARC comparisons of welders versus controls at timepoint 2 and of welders versus controls in the cross-sectional group

Disease and function analysis of the top 20 differentially expressed proteins (p < 0.1) indicated lung cancer as top disease and cell movement (cell migration and cell invasion) as top functions (Fig. 4). Based on our data set, cell migration and cell invasion were predicted to be inhibited in welders versus controls.

Fig. 4

Networks of the most significant diseases and functions defined by the differentially expressed proteins between welders and the controls in the linear mixed models. Analysis was performed using Ingenuity Pathway Analysis for the top 20 proteins (p < 0.1). Proteins represented in red are upregulated, whereas proteins represented in green are downregulated in welders versus controls (color figure online)

Next, we evaluated the association between the three differentially expressed proteins (SDC1, FOLR1, and SPARC, identified by linear mixed models and linear models of welders versus controls) and measures of exposure to welding, i.e., respirable dust (n = 56), years of welding (n = 77), and cumulative exposure (n = 56), in welders only (Table 3). Since years of welding and cumulative exposure were correlated with age (rS = 0.53 and rS = 0.30, respectively), we included models both with and without age as a covariate, as well as models in which age was the independent variable (Table 3). For SDC1, age alone was associated with the protein levels (in models adjusted for BMI and snus) and it is, therefore, difficult to disentangle the contribution of age from the contribution of welding years or cumulative exposure. However, for FOLR1, the protein levels were associated with years of welding irrespective of age. Similar associations between age and the respective circulating proteins were identified in controls.

Table 3 Associations of the differentially expressed proteins in welders and controls (SDC1, FOLR1, SPARC) with exposure to welding expressed as respirable dust (adjusted for personal respiratory protective equipment), years welding and cumulative exposure in welders only (linear mixed models)

Discussion

The current study identified 14 putative cancer-related proteins that were differentially expressed in a longitudinal cohort of welders working with mild steel. Three of these proteins (SDC1, FOLR1, and SPARC) were similarly altered in an additional cross-sectional set of welders and controls. Importantly, these associations were observed at moderate levels of exposure to welding fumes, clearly below the current Swedish occupational exposure limit (2.5 mg/m3). Further studies are needed to conclude whether these protein alterations are true markers of later disease.

The top 20 differentially expressed proteins were found to be related to functions such as cellular movement (cell migration and invasion), which is a potentially novel mechanism of toxicity for welding fumes. There was no association with proteins involved in inflammation or immune response, e.g., interleukin 6 or the Toll-like receptor 3. However, annexin A1, an anti-inflammatory glucocorticoid-regulated protein (Sugimoto et al. 2016), was upregulated in welders compared with controls. Overall, studies evaluating associations between welding fumes and inflammation show inconsistent results, and the contribution of inflammation to welding-induced lung cancer is still unclear. Changes in inflammatory status have been observed following acute exposure to welding (Kauppi et al. 2015; Kim et al. 2005; Shen et al. 2018; Wang et al. 2005), but only a few studies have addressed inflammation after long-term exposure to welding fumes, and these found no clear associations (Li et al. 2015a, b). Indeed, the role of inflammation in welding-related cancer needs to be evaluated in future studies. In addition, several studies in welders found that smoking can profoundly affect the status of inflammation (Kim et al. 2005; Wang et al. 2005), and therefore, it might be difficult to disentangle the contribution of smoking from the effect of exposure to welding fumes. In the current study, we can exclude the potential contribution of smoking to the observed associations, since, with very few exceptions, study participants were non-smokers. The few smokers were excluded in a sensitivity analysis, without altering the results.

SDC1, the most significant protein differentially expressed in welders, was downregulated in welders versus controls, and was the only protein that passed the significance threshold after adjusting for multiple testing. SDC1 is expressed in different tissues of epithelial origin, including the lung (proteinatlas.org). It is a cell surface bound heparan sulfate proteoglycan involved in numerous cell processes such as growth, adhesion, and migration (Szatmari et al. 2015). SDC1 is constitutively shed from the surface of the cells, a process that is mediated by proteases (e.g., heparanase) and is accelerated in connection with tumors (Ramani et al. 2013). SDC1 can act both as a tumor suppressor and as a tumor promoter, and this dual effect depends on the tissue of origin and form (cell surface bound or soluble) (Garusi et al. 2012). High levels of SDC1 in serum are associated with a poor prognosis in, for example, multiple myeloma (Seidel et al. 2000) and lung cancer (Joensuu et al. 2002). The latter study reported, however, no association between SDC1 tissue expression and serum levels, albeit with a small sample size (n = 45) (Joensuu et al. 2002). On the other hand, SDC1 in serum was significantly lower in individuals with malignant pleural mesothelioma as compared with unexposed controls or individuals exposed to asbestos but without mesothelioma (Demir et al. 2016).

FOLR1 was another protein differentially expressed in welders (downregulated) and, in addition, negatively associated with years of welding. FOLR1 is expressed in different epithelial tissues, with a high expression in the lung (proteinatlas.org). It is a cell membrane receptor involved in folate transport and predominantly present on the surface of epithelial cells (Elnakat and Ratnam 2004). It can also be shed from the surface of the cells and reach the blood stream in a soluble form (Kelemen et al. 2014). Increased serum levels of FOLR1 have been associated with ovarian cancer and there was a good correlation between expression in tumor tissue and soluble FOLR1 (Kurosaki et al. 2016). Increased expression of FOLR1 at tissue level was associated with malignant pleural mesothelioma and lung adenocarcinoma (Bueno et al. 2001; O’Shannessy et al. 2012).

SPARC was downregulated in welders compared with the controls, but without any association with exposure to welding measured as welding years, respirable dust, or cumulative exposure. SPARC is predominantly found as a secreted glycoprotein, but it is also present in the cell membrane and intracellularly (Wong and Sukkar 2017). Extracellular SPARC mediates cell–extracellular matrix interactions as well as angiogenesis, and in the lung, SPARC is involved in the development of fibrosis and the progression of lung cancer (Wong and Sukkar 2017). SPARC expression in lung squamous cell carcinoma tissue was reported to be higher than in non-cancerous tissue and was significantly correlated with poor outcome in lung squamous cell carcinoma patients (Xu et al. 2019). SPARC levels were found to be higher in plasma of lung cancer patients compared with healthy heavy-smoking controls, and SPARC was suggested as a diagnostic tool to discriminate between the two groups (Andriani et al. 2018).

The directions (lower in welders versus controls, or in relation to years welding) found in our study for SDC1, FOLR1, and SPARC were in several cases opposite to the direction found for protein levels in serum of cancer patients versus controls, or in relation to poor cancer prognosis. It is, therefore, unclear at this point to which extent the associations that we found represent toxic or adaptive responses following exposure to welding fumes, or if these putative cancer biomarkers are predictive of future cancer risk. In addition, it should be stressed that the study participants are healthy, even though they are exposed to potentially carcinogenic welding fumes and the associations might be different for healthy individuals compared to cancer patients. Moreover, the interplay between tissue and serum levels of proteins is highly dynamic and complex, and renders it difficult to compare studies that evaluate tissue levels with studies that evaluate serum levels of proteins.

It is not yet clear why FOLR1 seems to decrease from timepoint 1 to timepoint 2 both in welders and in controls. One possible explanation could be that serum levels of this protein decrease with age. We could only find one reference in the literature where FOLR1 was reported to be associated with age; however, the study indicated a positive association that was only valid for women (Kelemen et al. 2014). In addition, we investigated whether the changes could be related to storage time of the samples, but we found no indication of such associations. It should be noted that the sampling was performed using the same protocol, by the same nurse, and using the same collection tubes. Since the sampling was performed approximately 6 years apart, we cannot exclude a batch effect. Even if that were the case, we do not expect that such a batch effect would have an impact on the observed associations with serum levels of cancer-related proteins, since both welders and controls were sampled at both timepoints.

Downstream analysis, i.e., disease and function analysis, performed on the top 20 proteins differentially expressed in welders indicated lung cancer as top disease and cell movement as one of the top functions. While the analysis predicted an inhibition of cell movement based on the input data, the direction of this prediction bears a great deal of uncertainty. This is mainly because the background data underlying this prediction include few data for serum samples. Furthermore, the correlations between the tissue and serum levels of the respective cancer-related proteins are not yet well established. We emphasize that these enrichments are merely prediction-based and additional validation should be performed in mechanistic studies of welding fumes.

Using a similar approach as in our study, cornulin [CRNN, member of the ‘fused gene’ family (Contzler et al. 2005)], levels in plasma were previously reported to be associated with consumption of snus (Sundkvist et al. 2018). CRNN was also differentially expressed in welders in our longitudinal analysis, albeit this was not replicated in the cross-sectional analysis. Interestingly, the variance explained by the model for CRNN is relatively high (34%) and this is likely related to the presence of snus in the model. Moreover, snus was one of the factors that was indicated in the PCA heatmap as being significant in explaining the variation in the data set, and this was conditioned by the presence of CRNN. The mechanisms behind the association between CRNN and snus have yet to be clarified.

One of the strengths of this study is the combination of a longitudinal approach with a cross-sectional approach. The longitudinal analysis takes into account intra-individual variations and time-dependent changes and provides the basis for hypothesis generation, while the cross-sectional approach increases the study validity by allowing for replication of the longitudinal findings in an additional sample set. Moreover, we had access to extensive data on the background and lifestyle of the study participants (e.g., diet, physical activity, alcohol intake, cancer history, and hobbies) that could confound the statistical analyses. The number of smokers was marginal, as smoking was one of the exclusion criteria at recruitment. This is important, since smoking is a frequent confounder when cancer-related outcomes are evaluated. We also had information on hobby exposure to welding or other hobbies that incur increased exposure to carcinogenic particles. In addition to the group analysis, welders versus controls, we also modeled the relationships with different types of exposure measurements in welders only. This was facilitated by having information on the number of years welding and welding frequency, as well as by measuring the amount of respirable dust that individual welders are exposed to. One of the limitations of the latter approach is the lower power due to the reduced number of welders that had their respirable dust assessed on two different occasions (n = 56).

Conclusions

In this study, we identified associations between moderate exposure to welding fumes and cancer-related proteins in blood. It is yet unclear to which extent these changes are adaptive or predictive of cancer risk, and further studies are warranted.

Methods

Study design

This investigation is based on a cohort of welders established in 2010 in the south of Sweden (Södra sjukvårdsregionen) (Li et al. 2015a). At baseline, we recruited 101 welders working in small- and medium-size welding companies and 127 age-matched controls working in small companies without occupational exposure to particles, including welding fumes. The inclusion criteria were that participants should be male and non-smokers for at least 6 months. However, at follow-up, we identified, based on the questionnaire data, a few individuals who were in fact current smokers. The follow-up after 6 years was performed in 2016–2017 with a dropout of 23% (n = 23) among welders and 24% (n = 31) among controls; the main reasons for dropout were retirement and the closure of one welding company. At follow-up, we recruited new participants (67 welders and 38 controls) with the same inclusion criteria. The follow-up survey included examination of 145 welders (78 welders re-examined) and 134 controls (96 controls re-examined). In this study, the participants were divided into two groups: one group with repeated measurements, i.e., the cohort group (2010/2011, cohort timepoint 1, and 2016/2017, cohort timepoint 2), and one group that had measurements performed at either baseline or follow-up, i.e., the cross-sectional group (90 welders and 69 controls). Out of the latter group, three welders and two controls were excluded from the analysis due to an incomplete questionnaire and/or low quality of the protein measurements. In addition, two controls and one welder from the cohort were re-attributed to the cross-sectional group due to an incomplete questionnaire and/or low quality of the protein measurements for one of the cycles. This amounts to a total of 171 individuals in the cohort group (77 welders and 94 controls) and 157 individuals in the cross-sectional group (88 welders and 69 controls). A flow diagram of the study design is included in Fig. 1.
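The group sizes can be re-derived from the recruitment, dropout, exclusion, and re-attribution counts given above. The following is a minimal arithmetic sketch, not part of the original analysis; all variable and function names are illustrative.

```python
# Counts as reported in the study design text; names are illustrative.
baseline = {"welders": 101, "controls": 127}
dropout = {"welders": 23, "controls": 31}
new_at_follow_up = {"welders": 67, "controls": 38}
excluded_from_cross_sectional = {"welders": 3, "controls": 2}  # incomplete data
moved_to_cross_sectional = {"welders": 1, "controls": 2}       # one cycle unusable


def group_sizes():
    """Return (cohort, cross_sectional) participant counts per group."""
    cohort, cross_sectional = {}, {}
    for group in baseline:
        re_examined = baseline[group] - dropout[group]
        cohort[group] = re_examined - moved_to_cross_sectional[group]
        # Measured once: baseline dropouts plus new recruits at follow-up.
        measured_once = dropout[group] + new_at_follow_up[group]
        cross_sectional[group] = (
            measured_once
            - excluded_from_cross_sectional[group]
            + moved_to_cross_sectional[group]
        )
    return cohort, cross_sectional


cohort, cross_sectional = group_sizes()
print("cohort:", cohort, "total:", sum(cohort.values()))
print("cross-sectional:", cross_sectional, "total:", sum(cross_sectional.values()))
```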

Study participants were asked to fill out a questionnaire regarding country of birth, education (five categories: primary school, high-school, professional school, university < 3 years, and university > 3 years), medical history, personal/family history of cancer, diet (frequency of intake of fruit, vegetables, and fish), physical activity (from sedentary to intense exercise), current as well as previous smoking history, use of snus (Swedish moist tobacco, yes/no), alcohol consumption (six categories from never to every day), current residence (four categories: large city, small city, town, countryside), and exposure to particles/smoke (e.g., welding fumes, dust, engine exhaust, diesel engine) during leisure activities.

Blood samples were collected in BD vacutainers for serum, allowed to clot at room temperature for 10 min, and then centrifuged at 2400 rcf for 10 min. Upon separation, serum samples were aliquoted and kept on dry ice for transportation to the laboratory in the Division of Occupational and Environmental Medicine at Lund University, and then stored at −80 °C until analysis.

Exposure assessment

For occupational history, we used, both for welders and controls, a structured questionnaire inquiring about the present and past workplaces, type, and duration of job and explicitly whether workers were exposed to welding or diesel fumes at past or present workplaces.

We asked the welders, in addition to the above-mentioned questions, about the type of welding, total hours welding during the usual working week, place of welding, area-level or point source exhaust use, as well as use of personal respiratory, noise, and eye protection devices while welding.

Personal respirable dust measurement

Personal sampling of respirable dust was performed for welders and area-level respirable dust monitoring was mainly performed for controls. For the personal sampling, a filter cassette was placed within the personal breathing zone of welders. Respirable dust was collected on pre-weighed 37 mm mixed cellulose ester filters with 0.8 μm pore size, fitted in conductive cassettes attached to personal sampling cyclones (BGI4L, BGI, Mesa Labs, USA; cut-off 4 µm). The airflow through the sampler was set at 2.2 L/min and regularly checked before, during, and after sampling with primary calibrators (TSI Model 4100 Series, TSI Incorporated, USA). Personal sampling was coordinated with shift working hours; the average sampling time was 6 h and 33 min (2 h and 22 min to 8 h and 33 min) for timepoint 1 and 6 h and 33 min (3 h and 5 min to 8 h and 29 min) for timepoint 2. If respiratory protection was used, the measured concentration (outside the respiratory protection) was corrected by a factor of three to estimate the actual exposure (Hedmer et al. 2014; Li et al. 2015a). At timepoint 2, one welder used a half-mask, for which a correction factor of two was used, and four welders used newer versions of powered air-purifying respirators with double visors, for which a factor of 50 was used. The filter samples were analyzed gravimetrically according to a validated method for determination of respirable dust. The limit of detection was set to 0.05 mg/sample. For welders missing measured exposure data, the exposure was assessed individually using exposure data from welders working at the same location, engaged in similar tasks, and in the same company. Exposure data previously collected at the welding companies were also used in the exposure assessments (Hedmer et al. 2014; Li et al. 2015b).

In total, 101 welders had respirable dust data at timepoint 1 (measured respirable dust n = 53; estimated respirable dust n = 48) and 123 had respirable dust data at timepoint 2 (measured respirable dust n = 103; estimated respirable dust n = 20). Fifty-six welders had respirable dust data at both timepoints (timepoint 1: measured n = 28, estimated n = 28; timepoint 2: measured n = 46, estimated n = 10).
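The correction for respiratory protection described above can be sketched as a simple division of the measured concentration by a protection factor. This is an illustrative sketch only: the factors (3, 2, and 50) are taken from the text, whereas the category labels, the function name, and the reading of "factor" as a divisor are assumptions.

```python
# Protection factors as stated in the text; the category labels are hypothetical.
PROTECTION_FACTORS = {
    "none": 1.0,
    "respiratory_protection": 3.0,  # default factor when protection was used
    "half_mask": 2.0,               # one welder at timepoint 2
    "papr_double_visor": 50.0,      # newer powered air-purifying respirators
}


def adjusted_exposure(measured_mg_m3, protection="none"):
    """Estimate actual exposure (mg/m3) from the concentration measured
    outside the respiratory protection (assumes the factor is a divisor)."""
    if measured_mg_m3 < 0:
        raise ValueError("concentration must be non-negative")
    return measured_mg_m3 / PROTECTION_FACTORS[protection]


# Hypothetical measurement of 1.5 mg/m3 outside the respiratory protection.
print(adjusted_exposure(1.5, "respiratory_protection"))
```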

In control companies at timepoint 1, full-shift personal breathing zone samples of respirable dust were collected in two companies for 19 controls. In four companies, area-level air pollution monitoring of respirable dust was performed using a direct reading monitor, SidePak Model AM510 (TSI Incorporated) with a Dorr–Oliver cyclone (Li et al. 2015a). In control companies at timepoint 2, area-level air pollution monitoring of respirable dust fractions was performed using DustTrak DRX monitor (TSI Incorporated). The monitors were placed at breathing zone height in the area where workers spent the most time during the work shift. In companies where workers spent time at two different workstations, measurement was performed in both areas, but divided before and after the lunch break. All statistical analyses and calculation of cumulative dose were performed using respirable dust values that were adjusted for personal respiratory protection, as they better reflect the actual exposure.

Cumulative dose

For timepoint 1, the cumulative dose was estimated by multiplying respirable dust (adjusted for PPE) by the reported years of welding experience. For timepoint 2, a similar calculation was performed for the years of welding accrued between the two timepoints, and the estimate from timepoint 1 was added to obtain the cumulative dose:

$$ \text{Cumulative dose}_{\text{timepoint 1}} = \text{Respirable dust}_{\text{timepoint 1}} \times \text{Years welding}_{\text{timepoint 1}} $$
$$ \text{Cumulative dose}_{\text{timepoint 2}} = \text{Cumulative dose}_{\text{timepoint 1}} + \left[ \text{Respirable dust}_{\text{timepoint 2}} \times \left( \text{Years welding}_{\text{timepoint 2}} - \text{Years welding}_{\text{timepoint 1}} \right) \right] $$
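As a worked illustration of the two equations, the sketch below computes the cumulative dose for a hypothetical welder. The function names and the input values are assumptions for illustration and do not correspond to any participant.

```python
def cumulative_dose_t1(dust_t1, years_t1):
    """Cumulative dose at timepoint 1 (mg/m3 x years)."""
    return dust_t1 * years_t1


def cumulative_dose_t2(dust_t1, years_t1, dust_t2, years_t2):
    """Cumulative dose at timepoint 2: the timepoint 1 dose plus the dose
    accrued during the years of welding between the two timepoints."""
    if years_t2 < years_t1:
        raise ValueError("years of welding cannot decrease between timepoints")
    return cumulative_dose_t1(dust_t1, years_t1) + dust_t2 * (years_t2 - years_t1)


# Hypothetical welder: 0.7 mg/m3 and 10 years at timepoint 1,
# 0.5 mg/m3 and 16 years at timepoint 2 (both PPE-adjusted).
dose_t1 = cumulative_dose_t1(0.7, 10)
dose_t2 = cumulative_dose_t2(0.7, 10, 0.5, 16)
print(dose_t1, dose_t2)
```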

Measurement of proteins in serum

Serum samples were analyzed for 92 unique proteins using the Proseek Multiplex Oncology II kit coupled with the Fluidigm BioMark HD real-time PCR platform (Olink Proteomics, Uppsala, Sweden). The kit contains proteins that are curated to ontologies relevant for cancer development: angiogenesis (n = 20), apoptotic process (n = 34), cell adhesion (n = 35), cell differentiation (n = 42), cell motility (n = 30), cell proliferation (n = 43), cellular metabolic process (n = 47), cellular response to stress (n = 23), chemotaxis (n = 14), extracellular matrix organization (n = 9), immune response (n = 27), MAPK cascade (n = 25), proteolysis (n = 19), response to hypoxia (n = 3), and other gene ontology terms (n = 9). The analysis is based on the proximity extension assay (PEA); processing, quality control, and normalization have been described previously (Assarsson et al. 2014). Protein levels are reported as normalized protein eXpression (NPX) values on a log2-scale. The cut-off for both the intra- and inter-assay coefficients of variation (CV) was < 20%. All proteins had an intra-assay CV lower than 13%. The majority of the proteins had an inter-assay CV lower than 16%. XPNPEP2 had an inter-assay CV of 33% and was excluded from the analysis. CRNN had an inter-assay CV of 22%, but was kept in the analysis due to the relevance of CRNN for the use of snus (Sundkvist et al. 2018). All downstream analyses were performed on 91 proteins.
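The quality-control rule and the log2 scale of the NPX values can be illustrated with a short sketch. The 20% cut-off, the two inter-assay CVs, and the CRNN exception come from the text; the function names and the explicit exception set are assumptions for illustration.

```python
CV_CUTOFF_PERCENT = 20.0

# Inter-assay CVs (%) reported in the text for the two proteins above the cut-off.
inter_assay_cv = {"XPNPEP2": 33.0, "CRNN": 22.0}

# Proteins retained despite exceeding the cut-off (CRNN, for its link to snus use).
KEPT_DESPITE_CV = {"CRNN"}


def retained(protein, cv_percent):
    """True if the protein stays in the downstream analysis."""
    return cv_percent < CV_CUTOFF_PERCENT or protein in KEPT_DESPITE_CV


def npx_difference_to_fold_change(delta_npx):
    """NPX is on a log2 scale, so a difference of 1 NPX is a two-fold change."""
    return 2.0 ** delta_npx


for name, cv in inter_assay_cv.items():
    print(name, "retained" if retained(name, cv) else "excluded")
print(npx_difference_to_fold_change(-0.5))  # fold change for a 0.5 NPX decrease
```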

Statistical analyses

Evaluation of differences between the study groups

Characteristics are presented as median and 5th–95th percentiles for continuous variables and as percentages for categorical variables. For continuous variables, differences between groups were evaluated with the Kruskal–Wallis rank sum test followed by Dunn’s post hoc test (when comparing three groups) or with the paired samples Wilcoxon test (when comparing two groups). For categorical variables, Fisher’s exact test was used.

Data exploration using principal component analysis

PCA heatmaps were constructed using the prince.plot function in the swamp package in R. The function generates principal components that explain part of the variation in the protein data set and then tests each variable against these components to evaluate possible associations. Heatmaps depict p values (− log10-transformed) of these associations. Hierarchical clustering of the variables was generated using the hclust function.

Evaluation of differentially expressed proteins

In the longitudinal analysis, linear mixed models were fitted using the lmer function in the lme4 package in R to evaluate associations between group (welder or control) and cancer-related proteins. The mixed models included participants as random factors (random intercepts) and age, body-mass index, use of snus, and group as fixed factors. The variance explained by the fixed factors (marginal R2, R2m) and by the fixed and random factors together (conditional R2, R2c) was calculated using the RsqGLM function from the R package MuMIn. Models were assessed by plotting Pearson standardized residuals. Sensitivity analyses were performed (1) on the non-smokers only (n = 74 welders, n = 89 controls) and (2) by excluding welders with respirable dust levels > 2.5 mg/m3 (n = 10).
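
As a sketch of the model structure described above, the random-intercept model for one protein can be written as follows. The notation is an assumption introduced here for illustration (i indexes participants, j indexes sampling timepoints) and does not appear in the original analysis:

$$ \text{NPX}_{ij} = \beta_0 + \beta_1\,\text{Group}_{i} + \beta_2\,\text{Age}_{ij} + \beta_3\,\text{BMI}_{ij} + \beta_4\,\text{Snus}_{ij} + u_i + \varepsilon_{ij}, \qquad u_i \sim N(0, \sigma_u^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2) $$

Here u_i is the participant-specific random intercept and β1 is the adjusted difference in NPX (log2 scale) between welders and controls. In lme4 syntax, a model of this form would typically be specified as NPX ~ group + age + BMI + snus + (1 | participant), where the variable names are hypothetical.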

For the cross-sectional group, we used multivariable-adjusted linear models (adjusted for age, body-mass index, and use of snus) to evaluate the associations between group (welder or control) and cancer-related proteins.

Similar analyses using linear mixed models were performed in welders only by replacing the group variable with respirable dust (in mg/m3), years of welding (in years), or cumulative exposure. The mixed models included participants as random factors (random intercepts) and age, body-mass index, use of snus as well as respirable dust, years of welding, or cumulative exposure as fixed factors. The longitudinal analysis was performed on 54 welders (sampled twice) for the association with respirable dust and cumulative exposure and on 77 welders (sampled twice) for the association with years of welding.

Bioinformatics analysis of the differentially expressed proteins

Disease and function analysis was performed using Ingenuity Pathway Analysis (license obtained from Ingenuity Systems, Redwood City, CA, USA) on the top 20 differentially expressed proteins between welders and controls from the linear mixed model analysis. All of these proteins had a p value < 0.1. We raised the significance threshold from 0.05 to 0.1 in the disease and function analysis to increase the number of proteins and the statistical power for the downstream bioinformatics analysis.

All analyses were performed using R v.3.3.2 (R Foundation for Statistical Computing, Vienna, Austria) unless otherwise stated.