Correlates and trajectories of relapses in relapsing–remitting multiple sclerosis

Background and aims In people with relapsing–remitting multiple sclerosis (pwRRMS), data from studies on non-pharmacological factors which may influence relapse risk, other than age, are inconsistent. There is a reduced risk of relapses with increasing age, but little is known about other trajectories in real-world MS care.

Methods We studied longitudinal questionnaire data from 3885 pwRRMS, covering smoking, comorbidities, disease-modifying therapy (DMT), and patient-reported outcome measures, as well as relapses during the past year. We undertook Rasch analysis, group-based trajectory modelling, and multilevel negative binomial regression.

Results The regression cohort of 6285 data sets from pwRRMS over time showed that being a current smoker was associated with 43.9% greater relapse risk; having 3 or more comorbidities increased risk, and increasing age reduced risk. Those diagnosed within the last 2 years showed two distinct trajectories, both reducing in relapse frequency, but 25.8% started with a higher rate and took 4 years to reduce to the rate of the second group. In the cohort with at least three data points completed, there were three groups: 73.7% followed a low stable relapse rate, 21.6% started from a higher rate and decreased, and 4.7% had an increasing then decreasing pattern. These trajectory groups showed significant differences in fatigue, neuropathic pain, disability, health status, quality of life, self-efficacy, and DMT use.

Conclusions These results provide additional evidence for supporting pwRRMS to stop smoking and underline the importance of timely DMT decisions and treatment initiation soon after diagnosis with RRMS.

Supplementary information The online version contains supplementary material available at 10.1007/s10072-023-07155-3.


Methods of Rasch Analysis
Data from each (sub)scale were tested against the requirements of the Rasch measurement model [1]. Briefly, these requirements include i) unidimensionality; ii) monotonicity; iii) homogeneity; iv) local independence; and v) group invariance [2,3].
Whichever set of items is to be added together to provide a score, those items should satisfy all of these requirements. They should: i) measure one thing (domain/construct/trait); ii) show an increasing probability of a positive response (or, for polytomous items, of the transition from one response category to the next) with increasing underlying ability, as should the total score [4]; iii) retain the same hierarchical ordering of items at each level (or grouping) of the score [5]; iv) be conditionally (on the score) independent of one another [6]; and v) elicit the same responses across groups such as age or gender, conditional on the total score, referred to as (the absence of) Differential Item Functioning (DIF) [3].
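The monotonicity requirement in (ii) follows from the form of the dichotomous Rasch model itself, in which the probability of a positive response depends only on the difference between person ability and item difficulty on the logit scale. A minimal numerical sketch (illustrative values only, not the RUMM2030 implementation):

```python
import math

def rasch_prob(ability, difficulty):
    """Dichotomous Rasch model: probability of a positive response,
    given person ability and item difficulty in logits."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Probability of a positive response increases monotonically with
# ability for a fixed item difficulty of 0 logits.
probs = [rasch_prob(theta, 0.0) for theta in (-2.0, 0.0, 2.0)]
```

When ability equals difficulty the probability is exactly 0.5, which is why item difficulties can be located on the same logit scale as person abilities.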
Each requirement is tested. A t-test is used to determine whether two separate groups of items deliver significantly different estimates, following the procedure given by Smith [7]. The hierarchical ordering of items across the scale is determined through a chi-square test of fit based on grouped scores. Monotonicity is evaluated through inspection of the item-category ordering. Conditional item dependence is determined through the correlation of residuals, where pair-wise correlations should not exceed the average residual correlation by more than 0.2 [8]. Should clusters of locally dependent items be found, consideration is given to grouping these into 'super items' or testlets (simply adding them together to make one larger item, the latter based on a priori defined groups) to absorb the local dependency [9]. In the RUMM2030 software, this gives a bi-factor equivalent solution retaining a specified proportion of the variance. This Explained Common Variance (ECV) is reported: a value below 0.7 indicates that a multidimensional model is required, a value above 0.9 supports a unidimensional model, and the grey area in between is undetermined, requiring further evidence [10]. Consequently, an ECV value of 0.9 and above is considered acceptable in the current analysis. If two parallel forms are created, either from a subscale structure, if present, or from the pattern of local dependency in the item set, a latent correlation ≥0.9 between the forms is required. This is consistent with the reliability required for individual use [11]. Consequently, valid parallel forms would require both their latent correlation to be ≥0.9 and the ECV to be ≥0.9.
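The local-dependence screen can be sketched as follows; `residuals` stands for a hypothetical persons-by-items matrix of standardized Rasch residuals (simulated numbers here, not study data), and the cut-off of 0.2 above the average residual correlation follows the rule cited above [8]:

```python
import numpy as np

def flag_dependent_pairs(residuals, margin=0.2):
    """Flag item pairs whose residual correlation exceeds the average
    off-diagonal residual correlation by more than `margin`."""
    corr = np.corrcoef(residuals, rowvar=False)
    n = corr.shape[0]
    avg = corr[~np.eye(n, dtype=bool)].mean()
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if corr[i, j] > avg + margin]

# Simulated residuals: items 0 and 1 share common noise (local
# dependence); items 2 and 3 are independent of everything else.
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 4))
residuals = base.copy()
residuals[:, 1] = 0.8 * base[:, 0] + 0.6 * base[:, 1]
flagged = flag_dependent_pairs(residuals)
```

Pairs returned by this screen are the candidates that would be combined into a 'super item' or testlet to absorb the dependency.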
Group invariance (DIF) is tested through an ANOVA of residuals for age, gender, duration since diagnosis, education level, and whether the patient is self-employed or employed, and working full-time or part-time. Should DIF be identified, it is tested by a comparison of person estimates from split and unsplit solutions to see whether it is 'substantive' [12]. Where the difference is significant (by paired t-test), the result is reported as an effect size, where a value higher than 0.1 is considered to represent substantive DIF [13]. If this is present, the scale works in different ways for the contextual factor under consideration, and results are reported separately. Finally, reliability is reported both as a Person Separation Index (PSI) and as Cronbach's alpha. If data are normally distributed the two are equivalent, but otherwise the PSI tends to be lower the more the data are skewed.
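Of the two reliability indices, Cronbach's alpha can be computed directly from the raw persons-by-items score matrix; a minimal sketch is below (the PSI is computed analogously, but from the Rasch person estimates and their standard errors, which RUMM2030 supplies):

```python
import numpy as np

def cronbachs_alpha(scores):
    """Cronbach's alpha for a persons-by-items matrix of item scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    sum_item_var = scores.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = scores.sum(axis=1).var(ddof=1)        # variance of total score
    return (k / (k - 1)) * (1.0 - sum_item_var / total_var)

# Two perfectly parallel items give alpha = 1.0; values below the
# 0.7 threshold discussed below would count as low reliability.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
alpha_parallel = cronbachs_alpha(np.column_stack([x, x]))
```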
Both indices are interpreted against the same thresholds, so values below 0.7 would be described as low, as they do not support group use.

Methods of Trajectory Analysis
An inception cohort with a disease duration since diagnosis of two years or less was identified to give an insight into the early trajectory of disability. A group-based trajectory model was applied, which is designed to identify groups of individuals following similar developmental trajectories [17,18]. It was implemented through the traj.ado plugin in Stata 17 [19].
The number and shape (via polynomial functions) of the trajectories were determined by analysing one- to five-group models without covariates. To accommodate attrition, a 'dropout' model was applied, specified in its basic form of variable dropout across assessment occasions [20]. The Bayesian Information Criterion (BIC) was used to determine the best-fitting model, with consideration also given to usefulness and parsimony. Average posterior probabilities above 0.7 were also deemed to indicate optimal fit [21]. Missing data were handled using a maximum likelihood approach based on a missing-at-random assumption.
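The selection logic described above (compare BIC across the one- to five-group candidates, and require adequate average posterior probabilities of group membership) can be sketched generically. The fitted models themselves came from traj.ado, so the BICs and posterior matrices below are hypothetical inputs; traj reports BIC on the log-likelihood scale, where higher (less negative) values indicate better fit:

```python
import numpy as np

def avg_posterior_by_group(post):
    """Average posterior membership probability per group, taken over
    the subjects assigned (by maximum posterior) to that group."""
    assign = post.argmax(axis=1)
    return np.array([post[assign == g, g].mean()
                     for g in range(post.shape[1])])

def select_model(bics, posteriors, min_avg_post=0.7):
    """Index of the candidate with the highest BIC, restricted to
    candidates in which every group clears the posterior cut-off."""
    admissible = [i for i, p in enumerate(posteriors)
                  if (avg_posterior_by_group(p) > min_avg_post).all()]
    return max(admissible, key=lambda i: bics[i])

# Hypothetical candidates: a 1-group model, a well-separated 2-group
# model, and a 3-group model whose extra group is poorly separated
# (average posterior 0.55) despite a slightly better BIC.
one = np.ones((6, 1))
two = np.array([[0.95, 0.05]] * 3 + [[0.05, 0.95]] * 3)
three = np.array([[0.90, 0.05, 0.05]] * 2
                 + [[0.05, 0.90, 0.05]] * 2
                 + [[0.05, 0.40, 0.55]] * 2)
best = select_model([-500.0, -480.0, -470.0], [one, two, three])
```

This is a sketch of the decision rule only, not of the mixture estimation itself, which traj performs by maximum likelihood.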