Participants and methods
This was a prospective, longitudinal cohort study of individuals who signed up for the Richmond Marathon 2019. The study was approved by a Research Ethics Committee (REC reference: 13,823/001). All participants gave written informed consent before taking part in the study.
Inclusion criteria were: no present or previous lumbar spine injuries or surgeries; no symptoms related to a musculoskeletal condition; no previous marathon runs; no contraindications to MRI. The main exclusion criteria were: pregnancy, active breastfeeding, age < 18 years, claustrophobia, a history of panic attacks or anxiety, and known lumbar spine problems.
Twenty-eight volunteers who had registered to run their first marathon, the Richmond Marathon 2019, were recruited to the study (14 males, 14 females; median age: 30 years, range: 18–58 years). Basic demographics were collected at baseline: weight (70.4 ± 9.6 kg), height (174 ± 10.2 cm), and body mass index (BMI). All participants reported similar previous running experience: they had participated in races ranging from 10 km up to half-marathon (21 km) distances and had never run a marathon (42 km) before. Specifically, 5/28 had run a 10 km race as their longest distance and 23/28 had run a half-marathon as their longest distance. Participants ran ≥ 2 times/week (median: 3; range: 2–5 times/week), for a total of 3–4 h of running/week.
All participants started a formal 4-month training programme for the marathon provided by the race organiser (with a gradual increase in mileage/week, available online on the Richmond Marathon website). All underwent lower lumbar spine MRI scans prior to the start of the training plan (time point 1).
Of the 28 participants, 21 completed both the training for the marathon and the marathon run itself. Following the marathon run, participants were invited to attend a second MRI scan (time point 2).
The participants had lower lumbar spine 3.0 T MRI scans (Magnetom Vida, Siemens Healthineers, Erlangen, Germany) before and after running a marathon, using a dedicated 18-channel ultraflex coil. The spine section captured by MRI was L3–S1 (comprising lumbar vertebrae L3, L4, and L5, and sacral vertebra S1). The MRI protocol included the following sequences: coronal fat-suppressed proton-density-weighted turbo spin-echo (FS PDw TSE) [repetition time (TR): 4190 ms/echo time (TE): 44 ms; image size/acquisition matrix: 512 × 512 pixels; field of view (FOV): 70.8 × 30 cm]; sagittal bilateral FS PDw TSE [TR: 4420/TE: 35; 320 × 320 pixels; FOV: 82.6 × 35 cm], where 'bilateral' means that the sagittal slices were acquired from right to left in a single acquisition; axial T1 TSE covering the lower lumbar spine (TR: 27/TE: 10; FOV: 82.6 × 35 cm); coronal PD TSE (TR: 3290/TE: 39; FOV: 69.8 × 29.6 cm); axial PD FS TSE [TR: 4400/TE: 36; 384 × 384 pixels; FOV: 82.6 × 35 cm]; axial Dixon in 4 phases (in-phase, out-of-phase, water only, and fat only; TR: 4220/TE: 45; FOV: 70.8 × 30 cm); and coronal T1 VIBE 3D (TR: 0.1/TE: 4.92; FOV: 70.8 × 30 cm). The slice thickness was 3 mm for all non-Dixon sequences and 1.5 mm for the Dixon sequence. The interslice gap was 0.3 mm. The scanning time per individual was 30 min.
The MRI scans were evaluated on a picture archiving and communication system (PACS) workstation by 2 senior musculoskeletal radiologists with 10 years of experience at consultant level, at both time point 1 and time point 2. One radiologist reported the full set of scans, and the second independently co-reported the images from 20% of the study participants (n = 6 participants × 2 time points). Double-reporting was done to verify the reproducibility of the readings, and the participants whose scans were double-reported were randomly selected. The proportion for co-reporting was decided internally: in previous studies, the scans of 10% of the total number of subjects were co-reported [6], whereas in this study the subset was doubled to 20% for increased reliability.
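The random selection of the co-reported subset could be sketched as follows. This is an illustration only: the participant identifiers, the function name, the seed parameter, and the choice to round the 20% fraction up to a whole number of participants are assumptions and do not describe the study's actual procedure.

```python
import math
import random

def select_co_reported(participant_ids, fraction=0.20, seed=None):
    """Randomly pick the participants whose scans receive a second, independent read."""
    ids = list(participant_ids)
    # Assumption: round up, so 20% of 28 participants gives 6 rather than 5.
    n_subset = math.ceil(fraction * len(ids))
    rng = random.Random(seed)  # hypothetical seed, only to make the draw repeatable
    return sorted(rng.sample(ids, n_subset))

# 28 hypothetical pseudonymised participant identifiers
participants = [f"P{i:02d}" for i in range(1, 29)]
subset = select_co_reported(participants, seed=2019)
print(len(subset), "participants,", 2 * len(subset), "scans co-reported")
```

With 28 participants and two time points, this yields 6 participants and 12 co-reported scans, matching the subset size stated above.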
Time point 1 MRI scans were examined at that time point by each radiologist separately. Then, at time point 2, both MRI scans of each participant were compared for changes between time point 1 and time point 2 by each radiologist, again independently. The order of the scans was known, but the examinations were pseudonymised and systematically analysed. Radiologists were blinded to all of the participants’ clinical information.
If there were any differences between the 2 radiologists’ reports, a second MRI reporting session determined the final scores based on a consensus reading.
MRI findings of the lumbar spine were assessed based on validated scoring systems and specific measurements. The following lumbar spine features were assessed: intervertebral disc height (IVD height), intervertebral disc width (IVD width), intervertebral distance, and disc degeneration. The presence of other findings, such as insufficiency fracture, facet joint effusion, or other sacroiliac joint findings was specified.
Measurements of IVD dimensions were based on the method of Kingsley et al. The margins of the vertebral bodies were digitised on all MRI slices where the vertebral endplate and IVD could be detected. The points were interpolated, and the resulting coordinates were used to calculate the distances between adjacent vertebral endplates and, from these, the mean vertical IVD height and width.
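As an illustration only, the height and width calculation could be sketched as follows for a single slice. The sketch assumes that each endplate was digitised as (x, y) points in millimetres, with x running across the disc and y running vertically, and it uses simple linear interpolation. The function names, the number of samples, and the example coordinates are hypothetical and are not taken from Kingsley et al. or from the study software.

```python
def interp_y(points, x):
    """Linearly interpolate y at position x along a polyline sorted by x."""
    if x <= points[0][0]:
        return points[0][1]
    if x >= points[-1][0]:
        return points[-1][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1 and x1 > x0:
            return y0 + (x - x0) / (x1 - x0) * (y1 - y0)
    raise ValueError("endplate points must have increasing x values")

def disc_height_and_width(upper_endplate, lower_endplate, n_samples=50):
    """Return (mean vertical gap, width) between two digitised endplates."""
    upper = sorted(upper_endplate)
    lower = sorted(lower_endplate)
    # Measure only over the x-range covered by both endplates.
    x_min = max(upper[0][0], lower[0][0])
    x_max = min(upper[-1][0], lower[-1][0])
    width = x_max - x_min
    xs = [x_min + width * i / (n_samples - 1) for i in range(n_samples)]
    gaps = [abs(interp_y(upper, x) - interp_y(lower, x)) for x in xs]
    return sum(gaps) / len(gaps), width

# Hypothetical digitised points (mm) for one slice of one disc
upper = [(0.0, 10.0), (17.0, 11.0), (34.0, 10.0)]
lower = [(0.0, 0.0), (17.0, -1.0), (34.0, 0.0)]
height, width = disc_height_and_width(upper, lower)
print(f"mean height {height:.1f} mm, width {width:.1f} mm")
```

Per-slice values obtained this way would then be averaged over all slices on which the endplates and disc were visible.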
We assessed the frequency and severity of lumbar disc degeneration using Pfirrmann’s classification, as in Table 1 below:
Demographics and characteristics of study participants were evaluated, including gender, age, and BMI. Changes between the time point 1 and time point 2 MRI-reported datasets were assessed using paired t-tests. Lumbar spine outcomes were compared between male and female participants, between those aged < 40 years and ≥ 40 years, and between those with BMI < 25 kg/m² and BMI ≥ 25 kg/m², using unpaired t-tests.
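For readers unfamiliar with the paired comparison, a minimal sketch is given below. It computes the t statistic from the per-participant differences and obtains a two-sided p-value by numerically integrating the Student t density. The measurements are invented for illustration, the function names are hypothetical, and the analyses in this study were run in GraphPad Prism rather than with this code.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for paired measurements."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)  # undefined if all differences are equal
    return mean(diffs) / standard_error, n - 1

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson integration of the Student t density."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central_area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central_area)

# Hypothetical IVD heights (mm) for four participants at time points 1 and 2
before = [8.0, 9.0, 10.0, 11.0]
after = [8.5, 9.0, 10.5, 12.0]
t_stat, df = paired_t(before, after)
p_value = two_sided_p(t_stat, df)
print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.3f}")
```

The integration helper is adequate for the small degrees of freedom typical of a cohort of this size; for large samples a statistics package would be the safer choice.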
Differences between marathon finishers and training non-finishers were analysed using an unpaired t-test. Marathon finishing times of participants with and without disc degeneration were compared using an unpaired t-test.
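A minimal sketch of an unpaired comparison is shown below. It assumes the classical Student's t-test with a pooled (equal) variance; the source states only that an unpaired t-test was used, so Welch's unequal-variance version would be an equally plausible reading. The finishing times and the function name are hypothetical.

```python
import math
from statistics import mean, variance

def unpaired_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for a pooled-variance unpaired t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# Hypothetical marathon finishing times (min)
with_degeneration = [250, 260, 270]
without_degeneration = [230, 240, 250, 260]
t_stat, df = unpaired_t(with_degeneration, without_degeneration)
print(f"t = {t_stat:.3f}, df = {df}")
```

The p-value would then be read from a t distribution with the returned degrees of freedom, for example with a statistics package.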
Interreader agreement (between the scores reported by the 2 radiologists) was calculated using kappa statistics. Kappa values were interpreted as follows: kappa < 0, less than chance agreement; 0.010–0.200, slight agreement; 0.210–0.400, fair agreement; 0.410–0.600, moderate agreement; 0.610–0.800, substantial agreement; 0.810–0.990, almost perfect agreement; 1.000, perfect agreement. Statistical significance was defined as p < 0.05 (GraphPad Prism, V.6.0c).
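The agreement calculation and the interpretation bands above can be sketched as follows. The sketch assumes unweighted Cohen's kappa for two readers, which the text does not state explicitly, and it assigns values that fall between the listed bands (for example 0.205) to the lower band. The grades and the function names are hypothetical.

```python
from collections import Counter

def cohen_kappa(reader_a, reader_b):
    """Unweighted Cohen's kappa for two readers scoring the same scans."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers must score the same non-empty set of scans")
    n = len(reader_a)
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    counts_a, counts_b = Counter(reader_a), Counter(reader_b)
    # Agreement expected by chance from each reader's marginal frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers gave every scan the same single grade
    return (observed - expected) / (1 - expected)

def interpret_kappa(kappa):
    """Map a kappa value to the agreement bands listed in the text."""
    if kappa < 0:
        return "less than chance agreement"
    if kappa >= 1.0:
        return "perfect agreement"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return f"{label} agreement"
    return "almost perfect agreement"

# Hypothetical Pfirrmann grades given by two readers to 12 co-reported scans
reader_1 = [1, 1, 2, 2, 2, 3, 3, 1, 2, 3, 1, 2]
reader_2 = [1, 1, 2, 2, 3, 3, 3, 1, 2, 2, 1, 2]
kappa = cohen_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.3f} ({interpret_kappa(kappa)})")
```

In this invented example the readers agree on 10 of 12 scans, and the chance-expected agreement is 50/144, which gives a kappa of 70/94 (about 0.745), in the substantial agreement band.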