1 Introduction

Job Exposure Matrices (JEMs) are tools used to assign exposure to occupational hazards to employed populations in studies that have information on job title but lack information on exposure. JEMs have been used mainly to study the relationship between exposure to a hazard and the occurrence of a health outcome in epidemiological studies, but they can also be used to estimate the prevalence of exposure to a given hazard in population surveys that lack explicit questions on work-related exposures (Fadel et al. 2020). JEMs are essentially cross-tabulations of jobs, normally listed in the rows, and measures of exposure to different occupational hazards, in the columns (Kauppinen et al., 1998). The information may be simply the presence or absence of exposure in each occupation, or it can be more detailed, including frequency, duration or intensity of exposure as ordinal measures (e.g., no exposure, low, medium, high) or quantitative ones (e.g., mg/m3 for a chemical hazard). National JEMs are comprehensive descriptions of the exposures of all jobs present in a country. They are mainly based on expert ratings or on workers’ self-reports collected through questionnaires or interviews, because objective measures, such as direct measurements or observations of exposures, are costly and therefore generally available only for a limited number of jobs or industries where measurement campaigns have been conducted for specific reasons, mainly to assess compliance.

In the last decade, JEMs on physical and psychosocial work-related factors have been developed in the US and in several European countries (Solovieva et al., 2012; García et al., 2013; Dale et al. 2015; Evanoff et al. 2019a; Dalbøge et al. 2016; Descatha et al. 2018). In particular, the JEMs on physical, organizational, and psychosocial exposures constructed from the US Occupational Information Network (O*NET) have been widely used in research (for a review: Cifuentes et al. 2010). O*NET contains information on hundreds of physical and mental descriptors of almost one thousand occupations, classified according to the US Standard Occupational Classification (SOC), in terms of skills, knowledge, activities, work context, etc. (www.onetcenter.org). For each job characteristic, ratings are assigned by job experts or collected through workers’ self-reports. These ratings allow exposure to multiple workplace hazards, especially physical ones, to be approximated. O*NET demonstrated good to moderate agreement with several observed and self-reported ergonomic exposures (Gardner et al. 2010) and was found to be predictive of the risk of osteoarthritis (Dembe et al. 2014) and carpal tunnel syndrome (Dale et al. 2018).

An Italian O*NET database was also developed to map the work-related exposures of jobs in Italy, using a direct translation of the US O*NET items in a national survey on professions conducted in 2013. The Italian O*NET JEM rates exposure to 21 physical factors for almost 800 jobs, following the same methods employed by Evanoff et al. (2019b) for the US O*NET JEM (www.onetcenter.org). In a previous study, good to moderate concordance was observed between the Italian and the US JEM in the exposure scores for most physical items (d’Errico et al., 2022). In the present study we introduce a novel composite index of exposure to physical factors, constructed from the Italian O*NET JEM, and we evaluate its predictive validity for the risk of developing work-related musculoskeletal disorders. The study population is composed of workers employed during 2015–2019 in the Piedmont Region, Italy.

2 Materials and methods

2.1 Data collection

2.1.1 Study population

The study population comprised all employed subjects aged 20–75 years resident in the Piedmont Region (a north-western Italian region accounting for 7% of the national economy) during 2015–2019, aggregated by gender, 10-year age class and job code according to the 2011 version of the Italian Statistical Office Classification of Professions (CP-2011). Jobs were classified at the 3-digit level, yielding 126 job titles. Data from the Labour Force Surveys collected from 2015 to 2019 were used to reconstruct the average employed population of Piedmont (N = 1,810,936), applying interview-level sampling weights to the sampled data.

2.1.2 Outcome

Two outcomes were examined in the study, i.e., work-related musculoskeletal disorders (WRMSD) notified to the Italian National Institute for Insurance against Accidents at Work (INAIL) in Piedmont in 2015–2019, as well as those compensated by the Institute during the same period.

Notified WRMSD should not be considered as self-reported musculoskeletal disorders, since the notification of a WRMSD to INAIL is a legal act, in which the disorder is ascertained through in-depth diagnostic procedures. Compensated WRMSD constitute a subsample of notified WRMSD and result from the evaluation of the cases by the commission of INAIL medical experts, which assesses the link between the occupational risk and the pathology reported, taking also into consideration the risk assessment conducted by the company where the worker is employed.

WRMSD were selected through their ICD-10 codes and included neurological disorders that can arise from compression, such as carpal tunnel syndrome (G56.0); arteriopathies, arthropathies or dorsopathies that can arise from exposure to repetitive movements, standing, awkward positions or vibration, such as Raynaud syndrome (I73.0); and soft tissue disorders caused, for example, by application of force leading to inflammation and muscle fatigue (Table 1). Table 1 reports the number of events notified to INAIL over follow-up by group of musculoskeletal disorders. The number of WRMSD notified to and compensated by INAIL was then aggregated by gender, 10-year age class and job code (CP-2011, 3-digit) and assigned to the study population through record linkage based on these variables. One WRMSD occurring in the “Fishermen and hunters” group (CP code: 645) and one event among “Motorized farm plant operators” (CP code: 731) were deleted because these jobs were not present in the Piedmont workforce sampled by the Labour Force Surveys (motorized farm plant operators were systematically classified as food processing machine operators, CP code: 732). Furthermore, farmers and breeders (CP code: 643) were excluded from the analysis because the rate of WRMSD notified to INAIL was unrealistically high. This finding appears mainly attributable to a campaign conducted in recent years by farmers’ entrepreneurial organizations to promote the notification of occupational diseases to INAIL: notifications increased fivefold between 2013 and 2016, before returning almost to 2013 levels in 2019 (data not shown). In contrast, the rate of compensated occupational diseases in this occupational group remained quite stable during this period, suggesting that the rate of notified WRMSD was artificially inflated by socio-political circumstances.

Table 1 ICD-10 codes of the musculoskeletal disorders considered and number of events

2.1.3 Exposure assessment

The Italian O*NET database contains information on the same descriptors as the US O*NET database (www.onetcenter.org) but, unlike in the US O*NET, ratings for all dimensions are constructed from workers’ self-reports, based on interviews of approximately 20 workers for each of the 796 jobs of the Italian classification (CP-2011, 5-digit level) in a national survey conducted in 2013 (https://inapp.org/it/archivio_rilevazioni/indagine-campionaria-sulle-professioni). For each job, the O*NET database contains scores for each descriptor, rated by importance, frequency, or level of a certain workplace characteristic, averaged at the job code level. The JEM on physical exposures was constructed using data from three O*NET domains: Work Abilities, Activities, and Work Context. Items belonging to the Work Abilities and Activities domains are scored both for the importance of a certain characteristic in a job and for the level of the characteristic, such as the level of an ability needed to perform a job or the level of an activity typical of that job. For these items, importance is scored from 1 to 5, whereas level ranges from 1 to 7. In contrast, the Work Context domain, which focuses on aspects of both job content and workplace characteristics, includes items collected on a frequency scale from 1 to 5 (from never to all the time, or every day).

Of the 21 variables defined in the Italian O*NET JEM on physical factors, 17 items potentially associated with the risk of musculoskeletal disorders were identified through Principal Component Analysis. For all these items, good reliability against the same items of the corresponding US O*NET JEM has been shown (d’Errico et al., 2022). Table 2 provides details of the 17 items: 3 concern force exertion (static strength; dynamic strength; trunk strength); 6 concern activity level and repetitive movements of the upper limb (manual dexterity; finger dexterity; wrist-finger speed; handling and moving objects; time spent making repetitive motions; time spent using hands to handle, control, or feel objects, tools or controls); 4 concern postures (awkward positions; standing; time spent kneeling, crouching, stooping, or crawling; time spent bending and twisting the body); 2 concern activities involving the whole body (performing generalized physical activity; walking and running); and 2 concern exposure to vibration (whole-body vibration; driving vehicles or other types of moving machinery).

Table 2 Description of the items in the Italian O*NET databases used to construct the composite ergonomic index

Level scores of items in the Work Abilities and Activities domains were set to zero if their importance score on the original response scale was 1 or lower, as suggested by Evanoff et al. (2019b).

The level scores of each item were standardized on a 0–100 scale and averaged to compute a composite exposure score (Cronbach’s alpha = 0.90). This composite ergonomic exposure index (Ergo-Index) was then averaged at the 3-digit job level (CP-2011) and linked on this job code to the data of the study population. The Ergo-Index had a mean of 25.7, a standard deviation of 14.2 and a range of 2.9–60.8. In the appendix, Table 6 reports the derived Ergo-Index for each job (CP, 3-digit).
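The scoring steps described above can be sketched in Python. This is only an illustration under stated assumptions, not the authors’ code: the text does not give the rescaling formula, so a min-max rescaling is assumed, with assumed bounds of 0–7 for level items (after zeroing) and 1–5 for Work Context frequency items. The function names and the sample values are hypothetical.

```python
from statistics import mean

def zero_unimportant(level, importance):
    """Set the level score to 0 when the importance score is 1 or lower."""
    return 0.0 if importance <= 1 else level

def rescale(score, lo, hi):
    """Min-max rescaling of a score to the 0-100 range (assumed formula)."""
    return 100.0 * (score - lo) / (hi - lo)

def ergo_index(level_items, context_items):
    """Average the rescaled item scores for one job.

    level_items: (level, importance) pairs; level on 1-7, importance on 1-5.
    context_items: frequency scores on a 1-5 scale.
    """
    scores = [rescale(zero_unimportant(lv, imp), 0, 7) for lv, imp in level_items]
    scores += [rescale(freq, 1, 5) for freq in context_items]
    return mean(scores)

# Hypothetical job: two level items (the second one unimportant) and one context item
example = ergo_index([(3.5, 2.0), (4.0, 1.0)], [3.0])
```

In the study the index uses 17 items per job and is then averaged again across the 5-digit jobs belonging to each 3-digit code; that aggregation step is omitted here.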

2.2 Data analysis

As mentioned above, of the 21 variables investigating physical factors, 4 were excluded from the construction of the index because of their low concordance and their low or even negative correlation with the other variables. The excluded variables are the two on microclimate (exposure to high or low temperatures), the variable on the repetitiveness of the same task and the variable on computer work. The microclimate variables were discarded because they are indirect risk factors and, in any case, had a very low correlation with the other variables. Although important, the variable on the repetitiveness of the same tasks was discarded because it showed very low concordance. Finally, the variable on computer work was the only one with a negative correlation with all the other variables. Principal component analysis was performed as statistical support for the choice of the 17 variables. The results indicate a reduction of all variables to a single component, with very high correlations between each variable and the component (from 0.65 to 0.95).

The association between the Ergo-Index and the incidence of notified and compensated WRMSD was assessed through negative binomial regression models, adjusted for age class and gender, with the index treated both as a continuous and as a categorical variable. Two categorizations of the composite exposure index were performed: one based on quartiles, with cut-offs at 15.94, 27.40 and 35.7, and one that also took the outcome distribution into account, using the Loess method (Cleveland et al. 1992). In the latter, an interpolation curve was first fitted through non-parametric regression, using iterative weighted least squares with a tricubic kernel function (with higher weights assigned to points closer to the curve); the index was then divided into four categories, with cut-offs identified visually at the inflection points of the curve. Following this procedure, the first cut-off was identified at 15.52, the second at 25.63 and the third at 39.8 (Fig. 1).
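The two categorizations can be illustrated with a short Python sketch. The cut-off values are those reported above; the function names are hypothetical, and the choice to place a score exactly equal to a cut-off in the upper category is an assumption, since the text does not state how ties were handled. The tricube function is the standard form of the kernel weight used in Loess, shown only to make the weighting concrete.

```python
from bisect import bisect_right

QUARTILE_CUTS = [15.94, 27.40, 35.7]  # quartile cut-offs reported in the text
LOESS_CUTS = [15.52, 25.63, 39.8]     # cut-offs read from the Loess curve

def exposure_category(score, cuts):
    """Return the ordered exposure category (1 = lowest, 4 = highest)."""
    # Assumption: a score equal to a cut-off falls in the upper category.
    return bisect_right(cuts, score) + 1

def tricube(u):
    """Standard tricube weight for a scaled distance u (0 outside |u| < 1)."""
    a = abs(u)
    return (1 - a ** 3) ** 3 if a < 1 else 0.0

# A job scoring at the index mean (25.7) is classified differently by the two schemes
mean_job_quartile = exposure_category(25.7, QUARTILE_CUTS)
mean_job_loess = exposure_category(25.7, LOESS_CUTS)
```

With these cut-offs, a job at the mean of the index falls in the second quartile but in the third Loess category, which shows how the two schemes can assign the same job to different exposure groups.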

Fig. 1 Interpolation curve between the Ergo-Index and the standardized rate of WRMSD (*10,000)

As the offset variable, we used the average employed population of Piedmont in the relevant cell, identified by the intersection of 3-digit CP job, gender, and age group, based on data from the 2015–2019 Labour Force Surveys.
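The role of the offset can be made concrete with a minimal Python sketch. It assumes the usual log link, with the logarithm of the cell population entering the linear predictor with a coefficient fixed at 1; the text does not spell this out. The function names, coefficient values and populations are hypothetical.

```python
import math

def expected_cases(population, intercept, coefs, covariates):
    """Mean count of a log-link model with log(population) as the offset."""
    linear = intercept + sum(b * x for b, x in zip(coefs, covariates))
    # exp(log(population) + linear) is the same as population * exp(linear)
    return population * math.exp(linear)

def rate_per_10000(cases, population):
    """Express a count as a rate per 10,000 workers."""
    return 10000.0 * cases / population

# Two hypothetical cells with identical covariates but different sizes
small_cell = expected_cases(1000, -7.0, [0.1], [20.0])
large_cell = expected_cases(2000, -7.0, [0.1], [20.0])
```

Because of the offset, doubling the population of a cell doubles its expected number of cases while leaving the modelled rate unchanged, so the regression coefficients describe rates rather than raw counts.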

As a considerable proportion of subjects (around one quarter) had more than one WRMSD notified to INAIL, we conducted a sensitivity analysis to assess whether the risk associated with the Ergo-Index was influenced by the presence of multiple events per worker. In this analysis we assessed the association between the incidence of the first WRMSD notified for each subject (N = 3,406) and the Ergo-Index, both continuous and categorized, as in the main analysis.

3 Results

During 2015–2019, 4,416 WRMSD were notified to INAIL, of which 1,311 (29.6%) were compensated by the Institute.

3.1 Notified WRMSD

Overall, the average rate of WRMSD notified to INAIL during 2015–2019, standardized by sex and age class, was 6.05 per 10,000 workers (range: 0.36–69.03).

Table 3 reports the age- and sex-adjusted negative binomial IRRs of notified WRMSD associated with the Ergo-Index, included as a continuous variable (model 1), divided in quartiles derived directly from the distribution observed in the Italian O*NET JEM (model 2), and divided in four categories identified by the Loess method (model 3).

Table 3 Incidence Rate Ratios of notified WRMSD, for exposure to the Ergo-Index adjusted by gender and age (Negative Binomial Regression)

In model 1, where the Ergo-Index was examined as a continuous variable, the index was significantly associated with the incidence of WRMSD notified to INAIL, with a 12% increase in risk for each 1-point increase in the score (Table 3).
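Because the model is log-linear in the index, the per-point IRR compounds multiplicatively over larger differences in score. The short Python sketch below illustrates this arithmetic; the function name is hypothetical, and the calculation assumes that the log-linear relationship of model 1 holds across the whole range considered.

```python
def irr_for_increase(irr_per_point, points):
    """IRR for a given increase in score under a log-linear model."""
    return irr_per_point ** points

# A 10-point difference in the Ergo-Index with a per-point IRR of 1.12
ten_point_irr = irr_for_increase(1.12, 10)
```

Under this assumption, a 10-point difference in the index corresponds to an IRR of about 3.1 rather than 2.2, which is what simply adding ten increments of 12% would suggest.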

Categorizing the Ergo-Index in quartiles according to the actual distribution in the Italian O*NET JEM (model 2), all three upper quartiles showed significantly increased risks of WRMSD, compared to the lowest, with a significant trend in risk across ordered exposure categories (2nd quartile: IRR = 7.25; 3rd quartile: IRR = 24.3; 4th quartile: IRR = 61.0).

When the index was divided into four categories using the cut-offs obtained with the Loess method (model 3), the IRR for the second category changed only slightly (IRR = 6.93), whereas the IRRs for the third and fourth categories increased by approximately one third (IRR = 32.19 and IRR = 77.67, respectively) compared to model 2.

In all the three models, the risk of WRMSD was significantly higher among women (from IRR = 1.63 to IRR = 1.73 in the different analyses) and 5–6 times higher among subjects older than 55 years, compared to those 30–44 years.

3.2 Compensated WRMSD

The rate of WRMSD compensated during 2015–2019, standardized by sex and age class, was less than one third that of the notified WRMSD (1.46 per 10,000 workers).

Like Table 3, Table 4 reports the adjusted negative binomial IRRs of compensated WRMSD with the Ergo-Index treated as a continuous variable (model 1), in quartiles of the observed distribution (model 2) and in four categories obtained with the Loess method (model 3).

Table 4 Incidence Rate Ratios and 95% CI of compensated WRMSD, for exposure to the Ergo-Index adjusted for gender and age class (Negative Binomial Regression)

As shown in Table 4, the incidence of compensated WRMSD was also significantly associated with the Ergo-Index, and the association was stronger than for notified WRMSD. When the Ergo-Index was treated as a continuous variable, a 15% increase in risk was estimated for each 1-point increase in the score (model 1). When the Ergo-Index was categorized in quartiles or by the cut-offs set through the Loess method, risks in all exposed categories were higher than for notified WRMSD, with a stronger trend in risk. In these analyses, the category with the highest exposure had a risk of compensated WRMSD almost 250 times higher than that estimated for the least exposed, while the risk for the middle-high exposure category was almost 90 times higher (Table 4, models 2 and 3).

3.3 Sensitivity analysis

Restricting the analysis to the first WRMSD notified by each worker, similar associations as in the main analysis were found with the Ergo-Index, both when treated as a continuous variable (IRR = 1.12 for an increase of 1 point in the score, model 1), or when categorized in quartiles (2nd quartile: IRR = 6.87; 3rd quartile: IRR = 22.3; 4th quartile: IRR = 59.7, model 2), or by cut-offs set through the Loess method (middle-low exposure: IRR = 6.31; middle-high exposure: IRR = 29.4; high exposure: IRR = 71.6, model 3) (Table 5).

Table 5 Incidence Rate Ratios of the first WRMSD notified to INAIL, for exposure to the Ergo-Index adjusted for gender and age class (IRR not shown) (Negative Binomial Regression)

4 Discussion

In this study, we found a very strong association between the Ergo-Index, a novel composite index of exposure to ergonomic factors at work, constructed from the Italian O*NET JEM on physical factors, and the incidence of work-related musculoskeletal disorders in the Piedmont region. Compared to the least exposed category, risks of notified WRMSD were 60–70 times higher in the high exposure group and 20–30 times higher in the middle-high exposure category. For compensated WRMSD, risks were even higher, being almost 90 times higher in the middle-high and almost 250 times higher in the highest exposure category, compared to the group with the lowest exposure. Small differences in the risk of compensated WRMSD were observed when the ergonomic index was categorized in quartiles or in four categories obtained through the Loess method. For notified WRMSD, exposure categories obtained from the Loess method appeared more predictive, being associated with higher incidence ratios compared to the least exposed.

The study is the first one in Italy that has evaluated the incidence of notified and compensated WRMSD associated with exposure to ergonomic factors at work in the general working population.

Our results suggest that the Ergo-Index constructed from the Italian O*NET JEM on physical factors estimates workers’ exposure quite accurately. Therefore, this index could be usefully employed to assign exposure to physical factors at work in epidemiological studies that have information on job titles but lack information on exposure, in order to study its association with workers’ health outcomes. Moreover, the Ergo-Index can be used in empirical analyses in other social science disciplines to control for occupational exposure to ergonomic factors or to evaluate its potential role as an effect modifier.

Another possible application of this JEM and of the Ergo-Index is the assessment of the diffusion and intensity of exposure at the local or national level, for priority setting in the control of ergonomic hazards. The JEM alone may be usefully employed to rank the 796 jobs in the Italian classification by exposure to physical factors, in order to identify the jobs with the highest exposure, on which to prioritize preventive interventions or even compensatory interventions such as exemption from the mandatory retirement age. Furthermore, if linked to a working population through job code, as done in the present study with the Labour Force Survey, the JEM may serve to estimate the prevalence of exposure to physical factors in the workforce, both as a whole and by other worker characteristics that may be available in the survey, such as gender, age class, economic sector, firm size and geographical area. For example, using the lowest cut-off of the Ergo-Index (score: 15.52), we were able to estimate, for regional programming purposes, the prevalence and the absolute number of workers exposed to ergonomic factors and at risk of developing WRMSD in Piedmont.

WRMSD incidence estimated in our study (about 6 per 10,000 for notified WRMSD and 1.5 per 10,000 for compensated WRMSD) from Italian National Institute for Insurance against Accidents at Work (INAIL) data appears much lower than that observed in epidemiological studies conducted on working populations in Europe and the US, usually based on self-reported measures. In a recent systematic review, incidence rates around or above 1% per year have been mainly observed for upper extremity musculoskeletal disorders (MSD) in each specific anatomic region (da Costa et al. 2015). Also for work-related chronic low back pain, an incidence of almost 1% per year would be expected, based on incidence rates around 5% in the general population (Bot et al., 2005; Smith et al., 2004; Kopec et al., 2004) and a proportion of 14% attributable to work exposures (Palmer et al., 2008). Incidence rates of more than 10% per year have been found for upper extremity MSD alone (not considering MSD in other regions) in the general working population in a French study, too (Nambiema et al., 2020). Incidence data on self-reported or clinically ascertained WRMSD are almost lacking in Italy except for the 2013 national survey INSuLa (INAIL, 2014). In this survey, high MSD prevalences have been reported for pain in the back (51%), in the upper extremities (46%) and in the lower limbs (29%) during the previous year (Russo et al., 2020), which suggest an underreporting of WRMSD, also considering the much lower MSD prevalences observed among workers not exposed to biomechanical factors at work (low back pain: 12%; upper extremity disorders: 4%; knee pain: 10%) (Stucchi et al., 2018). 
However, the lower incidence of notified WRMSD in this study, compared to that estimated in epidemiological studies, appears attributable to the fact that the outcome was not self-reported musculoskeletal symptoms but physician-certified musculoskeletal disorders, diagnosed through clinical examination and often instrumental assessment, which had likely lasted for months or years, with symptoms severe enough to induce workers to ask for compensation.

The incidence of WRMSD notified to INAIL is also lower than that of WRMSD notified to compensation authorities in other countries, which calls into question the completeness of the Italian data (Chen et al., 2006; Yang et al., 2021; Marcum and Adams, 2017; Mustard et al., 2015; Ha et al., 2009; Van der Molen et al., 2012). It is known that most occupational diseases, and in particular WRMSD, are strongly affected by underreporting, due to organizational, socioeconomic and personal factors (Rosenman et al., 2000; Morse et al., 2005; Rivière et al., 2014; Park and Yoon, 2021). The low rate of WRMSD notified in this study, compared to other countries, suggests that even stronger underreporting affected the claims rate in Italy (Marinaccio et al. 2018), possibly due to various factors that should be investigated, such as workers’ unawareness of the occupational nature of the disorder, reduced unionization (Morse et al., 2005), lack of disease recognition by the occupational physicians in charge of health surveillance, and workers’ fear of negative consequences from the employer in case of a claim.

The low proportion of WRMSD cases compensated seems to indicate that the INAIL policy on compensation is quite strict, also considering that 57.6% of the WRMSD cases notified occurred in work processes where a high risk is presumed by INAIL, for which workers’ compensation should be almost automatic. In particular, lack of risk assessment or of registration of workers’ exposure to biomechanical factors may be a leading cause of refusal of WRMSD compensation, as INAIL requires exposure to be well documented before compensating an occupational disease. Another important determinant of compensation refusal may be the lack of objective clinical signs or instrumental alterations clearly pointing to a diagnosis, as often happens for enthesopathies or for carpal tunnel syndrome without severe median nerve compression.

4.1 Strengths

A main strength of this study is the large set of data on occupational diseases employed, which provided the analysis with great statistical power, as it encompasses all WRMSD notified to INAIL, as well as those compensated by the Institute, in a population of almost 2 million workers for a period of 5 years.

Furthermore, assignment of the exposure through a JEM, being independent from workers’ perception, prevents the occurrence of differential misclassification of the exposure due to health status, which may cause an overestimation of the associated risk (Peters, 2020). Also, the use of administrative data on WRMSD precludes the possibility of differential reporting of the outcome by occupation or by exposure category, which may create spurious associations.

Lastly, although one of the aims of the INAIL reporting system for occupational diseases is to conduct epidemiological surveillance, it has been shown that the incidence of both notified and compensated work-related musculoskeletal disorders has a wide geographical variability that is not explained by differences in the economic structure but seems rather to be determined by contextual factors, such as the strength of workers’ unions, active search by OSHA inspectors, bargaining of work organizations, notification campaigns from large companies, as well as a different attitude of local INAIL commissions toward the level of evidence needed for compensation (Fontana 2018). Therefore, the use of a JEM for epidemiologic surveillance of exposure to ergonomic factors, being independent of such local factors, may provide a more reliable picture of the distribution of exposures potentially increasing the risk of musculoskeletal disorders in the employed population than that obtained by examining the incidence of notified or compensated WRMSD.

4.2 Limitations

A limitation of the study is that the O*NET JEM was constructed from Italian national data, whereas the outcome was WRMSD occurring among Piedmont workers only, as data on notified and compensated WRMSD were not available from INAIL at the national level. However, in spite of this misalignment, very strong associations were observed between exposure assessed through the Ergo-Index and the incidence of both notified and compensated WRMSD, suggesting that the JEM also has good validity when employed at the regional level.

Although it prevents differential misclassification bias, the use of a JEM may introduce non-differential misclassification of the exposure, as a JEM assigns the same exposure to all workers in the same job without taking into account the variability of exposure within each job, which may result in an attenuation of the exposure-outcome relationship.

5 Conclusion

A composite index of exposure to physical factors at work, constructed from the Italian O*NET database, was highly predictive of the incidence of WRMSD notified to and compensated by the Italian Workers Compensation Authority in Piedmont. This finding suggests that the index provides a quite accurate measure of exposure to physical factors at the job level, which could be employed in epidemiologic studies and for establishing priorities in planning preventive and compensatory interventions and enforcing the control of ergonomic hazards. The index can also be useful to scholars interested in labour market outcomes, dynamics, and reforms, to assess potential heterogeneities and/or control for exposure to physical factors at work.