Identifying definite patterns of unmet needs in patients with multiple sclerosis using unsupervised machine learning

Introduction People with multiple sclerosis (PwMS) exhibit a spectrum of needs that extend beyond solely disease-related determinants. Investigating unmet needs from the patient perspective may address daily difficulties and optimize care. Our aim was to identify patterns of unmet needs among PwMS and their determinants.
Methods We conducted a cross-sectional multicentre study. Data were collected through an anonymous, self-administered online form. To cluster PwMS according to their main unmet needs, we performed agglomerative hierarchical clustering. Principal component analysis (PCA) was applied to visualize cluster distribution. Pairwise comparisons were used to evaluate the distribution of demographic and clinical characteristics among clusters.
Results Out of 1764 mailed questionnaires, we received 690 responses. Access to primary care was the main contributor to the overall unmet need burden. Four patterns were identified: cluster C1, ‘information-seekers with few unmet needs’; cluster C2, ‘high unmet needs’; cluster C3, ‘socially and assistance-dependent’; cluster C4, ‘self-sufficient with few unmet needs’. PCA identified two main components in determining the patterns: the ‘public sphere’ (access to information and care) and the ‘private sphere’ (need for assistance and social life). Older age, lower education, longer disease duration and higher disability characterized clusters with more unmet needs in the private sphere. However, demographic and clinical factors did not fully explain the four identified patterns.
Conclusion Our study identified four unmet need patterns among PwMS, emphasizing the importance of personalized care. While clinical and demographic factors provide some insight, additional variables warrant further investigation to fully understand unmet needs in PwMS.
Supplementary Information The online version contains supplementary material available at 10.1007/s10072-024-07416-9.


Introduction
Multiple sclerosis (MS) is one of the most common neurological diseases impacting the central nervous system. Nowadays, MS is a major cause of permanent disability, contributing in 2016 to 0.04% of global disability-adjusted life-years (DALYs), with an estimated 2.2 million cases worldwide [1]. As a chronic disease, MS requires multifaceted interdisciplinary management, involving several clinical and administrative figures working together to guarantee the most appropriate path of care.
Elisabetta Maida and Gianmarco Abbadessa contributed equally.
The term 'unmet needs' is used to describe 'a situation in which individuals or groups fail to obtain benefits for various reasons, although they may do so from interventions or health service delivery' [2]. The underlying nature of unmet needs is far from static, as it may change considerably according to the healthcare system and support services available in each country. The prevalence and perception of unmet needs among individuals with chronic diseases, such as MS, are likely to be significantly influenced by differences in access to services, healthcare policies and resource allocation across countries.
People with MS (PwMS) might have various needs based on their disability, unique life experiences, individual traits and disease severity. Whenever these needs remain unaddressed, patients are left alone to struggle with the difficulties of their illness. Confronting the unmet needs of PwMS should prompt efforts to improve understanding and awareness of the patients' perspective. Indeed, investigating unmet needs from the patients' perspective can help address daily difficulties and guide the optimization of PwMS care. The final goal should be the implementation of an integrated, person-centred path of care [3].
To gain a more comprehensive understanding of unmet needs among PwMS, our objectives were (i) to delineate the core patterns characterizing these needs, using a machine learning clustering approach to categorize individuals based on their reported unmet needs, and (ii) to investigate whether these patterns are associated with specific sociodemographic and clinical features.

Study design
We conducted a cross-sectional multicentre study involving six specialized MS centres equally distributed from North to South on the Italian peninsula (Hospital San Pietro Fatebenefratelli, Rome; University 'Magna Graecia', Catanzaro; University of Catania, Catania; University of Cagliari, Cagliari; City of Health and Science University Hospital of Turin, Turin; University of Campania 'Luigi Vanvitelli', Naples). PwMS were invited to participate in the study by the Chief of each MS centre. An e-mail was sent to all PwMS within each MS centre, followed by a reminder midway through the study period. Data were collected from December 2022 to May 2023 through an anonymous online self-administered questionnaire, presented as a Google Form, and the resulting data were extracted for analysis. The study was conducted in accordance with the guidelines of the Declaration of Helsinki for human subjects' research, and each patient's informed consent was obtained at the beginning of the survey. The study was approved by the Ethical Committee of the University of Campania Luigi Vanvitelli (protocol number 0014460/i).

Questionnaire
The full Italian version of the questionnaire and its English translation are reported in the Supplementary Information.
Information about demographic status (gender, age, area of residence, education, living status and employment) and MS clinical features (disease duration; clinical phenotype, recorded as relapsing-remitting multiple sclerosis (RRMS), secondary-progressive multiple sclerosis (SPMS), primary-progressive multiple sclerosis (PPMS) or 'I don't know'; current and past disease-modifying therapy (DMT); and global disability) was collected. To assess PwMS' disability, the Italian online version of the Patient Determined Disease Steps (PDDS) was employed [4], as it is a very intuitive self-administered disability scale highly correlated with the Expanded Disability Status Scale (EDSS).
The EQ-5D-5L (EuroQol-5 Dimension-5 Levels) questionnaire [5,6] was used to explore perceived quality of life (QoL), as it measures health-related QoL (HRQoL). The test consists of a descriptive system and a visual analogue scale (EQ-5D VAS), which are administered together. The descriptive system includes five domains of everyday life (mobility, self-care, usual activities, pain/discomfort and anxiety/depression), each measured on five levels (1, no problems; 2, mild problems; 3, moderate problems; 4, severe problems; 5, extreme problems). The combined five-digit number represents the patient's health status and is further converted into a utility index (with the maximum value being 1.000, representing the best possible health) that is calculated from a population preference-based value set [7].
Finally, unmet needs were assessed with a 23-item questionnaire developed jointly by a team of expert MS neurologists (S.B., E.M. and L.L.). The questionnaire was designed to examine different aspects of perceived comprehensive care and covered five domains: 'access to information' (about MS features, therapeutic opportunities and the role of digital technologies); 'access to primary care' (access to recommended therapies; to visits with specialists other than neurologists or to laboratory and imaging tests; to prescribed medical devices, physiotherapy and psychological support; and to guaranteed government aid); 'social life' (with a focus on social activities and participation in patient associations, satisfaction with physical well-being, and any discrimination experienced in everyday life); 'need for assistance' (in transportation, in daily activities, at work and in their own homes); and 'doctor-patient relationship' (satisfaction with the relationship with the neurologist and any occasions when there were difficulties in communication or other issues). Every question within each domain allowed respondents to indicate whether the need was satisfied (score of 0) or unmet (score of 1). Subsequently, we summed the scores from all questions, obtaining an overall score for each domain (ranging from 0 to 3 for 'access to information', 0 to 7 for 'access to primary care', 0 to 4 for 'social life', 0 to 5 for 'need for assistance' and 0 to 4 for 'doctor-patient relationship') and a total score (ranging from 0 to 23). An additional question investigated PwMS' satisfaction with their treatment and focused on possible reasons for dissatisfaction (mode or frequency of administration, side effects, clinical worsening).
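The scoring rule described above can be illustrated with a minimal sketch. The item counts per domain follow from the score ranges given in the text; the domain keys, the function name `score_respondent` and the example respondent are assumptions introduced only for illustration and are not taken from the study materials.

```python
# Number of yes/no items per domain, inferred from the score ranges in the text.
DOMAIN_ITEMS = {
    "access_to_information": 3,
    "access_to_primary_care": 7,
    "social_life": 4,
    "need_for_assistance": 5,
    "doctor_patient_relationship": 4,
}


def score_respondent(answers):
    """Sum binary item scores (0 = satisfied, 1 = unmet) per domain and overall.

    `answers` maps each domain name to a list of 0/1 item scores.
    """
    scores = {}
    for domain, n_items in DOMAIN_ITEMS.items():
        items = answers[domain]
        if len(items) != n_items or any(v not in (0, 1) for v in items):
            raise ValueError(f"{domain}: expected {n_items} items scored 0 or 1")
        scores[domain] = sum(items)
    # The total score is the sum of the five domain scores (0 to 23).
    scores["total"] = sum(scores[d] for d in DOMAIN_ITEMS)
    return scores


# Hypothetical respondent, invented for illustration only.
example = {
    "access_to_information": [1, 0, 1],
    "access_to_primary_care": [0, 1, 1, 0, 0, 1, 0],
    "social_life": [1, 1, 0, 0],
    "need_for_assistance": [0, 0, 0, 0, 0],
    "doctor_patient_relationship": [0, 1, 0, 0],
}
example_scores = score_respondent(example)
```

The five domain scores produced this way are the variables later used for clustering.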

Cluster and statistical analysis
Continuous variables were presented as mean and standard deviation (SD), while categorical variables were presented as number and percentage.
To assess which class of unmet needs contributed most to the total score (total number of unmet needs), Spearman's correlation coefficient was calculated between each class and the total number of unmet needs.
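As a sketch of how such a coefficient can be obtained, Spearman's correlation is the Pearson correlation of the ranked values, with tied values sharing their average rank. The version below uses only the Python standard library; the helper names and the two short data vectors are invented for illustration, and in practice a statistics package would normally be used.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical domain scores and totals for six respondents (illustration only).
primary_care = [0, 3, 5, 1, 7, 2]
total_unmet = [2, 9, 15, 4, 20, 6]
rho = spearman_rho(primary_care, total_unmet)
```

Because each domain score is itself part of the total, some positive correlation is expected by construction; the coefficients are useful mainly for comparing domains with one another.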
The clustering variables were the five domains previously mentioned: access to information, access to primary care, social life, need for assistance and doctor-patient relationship. Our principal clustering algorithm was agglomerative hierarchical clustering (AHC), complemented by a subsequent analysis using k-means clustering. Because the clustering variables are count variables with different ranges, we applied a normalization approach to scale them. Subsequently, we conducted AHC using Ward's method (specifically, the Ward2 algorithm), with the Euclidean distance as the dissimilarity measure.
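The two steps described above, scaling followed by Ward clustering, can be sketched as follows. The text does not state which normalization was used, so min-max scaling is an assumption here; the merge rule is Ward's minimum-variance criterion on squared Euclidean distances, which is what the Ward2 variant is generally understood to implement. The implementation is deliberately naive, the function names are illustrative, and the six rows of domain scores are invented.

```python
def min_max_scale(rows):
    """Rescale each column to [0, 1] (assumed normalization); constant columns become 0."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    span = [(max(c) - min(c)) or 1 for c in cols]
    return [[(v - l) / s for v, l, s in zip(row, lo, span)] for row in rows]


def centroid(points):
    """Component-wise mean of a list of points."""
    return [sum(col) / len(points) for col in zip(*points)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clusters(rows, n_clusters):
    """Naive agglomerative clustering with Ward's criterion.

    Starts from singletons and repeatedly merges the cheapest pair until
    n_clusters remain. Returns a list of sorted row-index lists.
    """
    members = [[i] for i in range(len(rows))]
    while len(members) > n_clusters:
        best = None
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                cost = ward_cost([rows[k] for k in members[i]],
                                 [rows[k] for k in members[j]])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        members[i] = members[i] + members[j]
        del members[j]
    return [sorted(m) for m in members]


# Hypothetical domain scores (information, primary care, social life,
# assistance, doctor-patient relationship); invented for illustration only.
scores = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [3, 6, 4, 4, 3],
    [3, 7, 3, 5, 4],
    [0, 2, 4, 5, 0],
    [0, 3, 3, 4, 1],
]
scaled = min_max_scale(scores)
groups = ward_clusters(scaled, 3)
```

This brute-force version recomputes every pairwise cost at each step and is only suitable for small examples; library implementations use distance-update formulas to handle cohorts of several hundred respondents efficiently.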
To select the optimal number of clusters, we utilized two metrics: the silhouette score and the elbow method (Supplementary Figs. S1 and S2). Additionally, we examined the dendrogram structure (Fig. 1). Both metrics suggested that the ideal number of clusters was two. However, the silhouette score ranged between 0.2 and 0.3 for cluster numbers from two to five, suggesting a reasonable separation without clear dominance of any specific number of clusters. Based on the dendrogram structure and supported by the metrics, we decided to proceed with four clusters to explore the patterns of unmet needs.
For dimensionality reduction, we utilized principal component analysis (PCA) with an initial component count set to five, equivalent to the number of patient experience variables. To determine the most representative number of components, we examined the eigenvalues on a scree plot and employed the 'elbow' method. As a result, two components were determined to be optimal, a decision also supported by the Kaiser criterion (Supplementary Fig. S3). After identifying the optimal number of components, a PCA retaining two components was performed, and a varimax rotation was applied to maximize the variance of the squared loadings. The loadings from this analysis showed how the original variables contributed to the retained components. Lastly, a biplot was generated to visually represent the PCA results. This biplot displayed the scores of the observations on the principal components and the loadings of the variables.
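For readers unfamiliar with the silhouette score used above, the following minimal sketch computes it for a given partition. For each point, a is the mean distance to the other members of its own cluster and b is the lowest mean distance to the members of any other cluster; the silhouette width is (b - a) / max(a, b), and the score is the mean width over all points. The function name and the four toy points are illustrative assumptions, and at least two clusters are required.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette width over all points, using Euclidean distance."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Singleton cluster: the width is conventionally set to 0.
            widths.append(0.0)
            continue
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        b = min(
            sum(dist(points[i], points[j]) for j in idxs) / len(idxs)
            for other, idxs in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)


# Toy data, invented for illustration: two well-separated pairs of points.
points = [(0, 0), (0, 1), (5, 5), (5, 6)]
good = silhouette_score(points, [0, 0, 1, 1])  # partition matching the structure
bad = silhouette_score(points, [0, 1, 0, 1])   # partition cutting across it
```

Values close to 1 indicate compact, well-separated clusters, while values near 0 indicate overlapping clusters, which is why scores in the 0.2 to 0.3 range do not single out one cluster number.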
To characterize the identified clusters, we assigned a label to each cluster and examined the differences in reported unmet needs among the clusters. Subsequently, we described and compared the demographic, social and clinical characteristics of the four clusters, to identify the factors characterizing the identified patterns. To compare categorical variables, we employed the chi-square test, while for continuous variables, we utilized the Kruskal-Wallis test. We conducted pairwise comparisons and adjusted the p values using the Bonferroni correction method.
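The Bonferroni adjustment mentioned above can be sketched in a few lines: each unadjusted p value is multiplied by the number of comparisons and capped at 1. With four clusters there are six possible pairwise comparisons; treating six as the number of tests per variable is an assumption, as the text does not state it, and the p values below are invented for illustration.

```python
from itertools import combinations


def bonferroni(p_values):
    """Multiply each p value by the number of tests, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


clusters = ["C1", "C2", "C3", "C4"]
pairs = list(combinations(clusters, 2))  # the six possible cluster pairs

# Hypothetical unadjusted p values, one per pair (illustration only).
raw_p = [0.001, 0.004, 0.30, 0.0005, 0.02, 0.25]
adjusted = dict(zip(pairs, bonferroni(raw_p)))
```

The correction is conservative: a comparison with an unadjusted p value of 0.02 would no longer be significant at the 0.05 level after adjustment for six tests.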

High prevalence of unmet needs among PwMS
A total of 1764 e-mails were sent from the six centres located throughout the Italian peninsula. Six hundred ninety subjects answered the questionnaire in its entirety and were, therefore, included in the study, while an additional six individuals did not accept the privacy informed consent and were consequently excluded. The overall response rate was 39.12%, in line with previous studies [8]. Mean age was 43.60 years (± 11.43 SD), and 70.14% of subjects were female (n = 484). Demographic and disease features are reported in Table 1. Overall, 655 out of 690 PwMS (94.93%) reported at least one unmet need, with only 35 (5.07%) reporting no unmet needs; 400 out of 690 (58.08%) reported having six or more unmet needs. Specifically, 65.80% and 69.28% reported one or more unmet needs in the domains of access to information and access to primary care, respectively; 83.05% reported at least one unmet need in social life; 54.50% reported at least one unmet need related to assistance; lastly, 37.40% reported one or more unmet needs in the doctor-patient relationship.
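As a small worked check, the headline percentages can be reproduced from the counts reported in the text; the helper name `percent` is an illustrative assumption.

```python
def percent(part, whole, digits=2):
    """Express part/whole as a percentage rounded to the given number of digits."""
    return round(100 * part / whole, digits)


mailed, completed = 1764, 690

response_rate = percent(completed, mailed)  # completed questionnaires / e-mails sent
female = percent(484, completed)            # female respondents
any_unmet = percent(655, completed)         # at least one unmet need
none_unmet = percent(35, completed)         # no unmet needs
```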
Among the five classes of unmet needs addressed by the questionnaire, difficulty in accessing primary care was the major contributor to the total number of unmet needs (R = 0.744, p < 0.001), followed by social life (R = 0.718, p < 0.001) and the need for assistance and support (R = 0.663, p < 0.001). Correlation coefficients were interpreted according to Mukaka [9]. Spearman's correlation results are reported in Table 2.

Unmet needs in access to information, need for assistance and social life are key factors in defining cluster patterns
Four clusters were identified, labelled from C1 to C4 (Fig. 1). Cluster C1, referred to as 'information-seekers with few unmet needs', consisted of 166 patients (24.06%) and was characterized by a moderate to low prevalence of unmet needs in all domains, except for a high number of unmet needs related to acquiring information about their condition. Cluster C2, named 'high unmet needs', was also composed of 166 patients (24.06%) and was characterized by a high number of unmet needs in all the explored domains, with the highest scores in access to information, access to care and social life. Cluster C3, 'socially and assistance-dependent', included 112 individuals (16.23%) with significant unmet needs in self-care, autonomy and social life; conversely, subjects in this cluster reported only a moderate number of unmet needs in access to primary care and few unmet needs related to access to information. Cluster C4, 'self-sufficient with few unmet needs', consisted of 246 PwMS (35.65%) who experienced no particular struggles or unmet needs. The mean number of reported unmet needs for each domain within each cluster and the pairwise comparisons between clusters with adjusted p values are reported in Supplementary Table S1. Overall, as also shown in the heatmap depicting the mean value of each domain within each cluster (Fig. 2), access to information, need for assistance and social life were the most prominent contributors to the cluster definition.
K-means clustering produced clusters that were overall similar to those of the hierarchical cluster analysis. The cluster sizes changed, with 206 individuals (29.86%) in C1, 104 (15.07%) in C2, 177 (25.65%) in C3 and 203 (29.42%) in C4. However, the mean value of each clustering variable within each cluster remained consistent, allowing the same labels (C1 to C4) to be applied, as shown in Supplementary Fig. S4. Indeed, C1 encompassed subjects reporting a high number of unmet needs exclusively in access to information, while C2 consisted of subjects with the highest number of unmet needs in all the explored domains. In contrast, C3 reported difficulties in self-care, autonomy and social life. The only difference observed with this approach compared to AHC was that C3 reported a lower number of unmet needs for assistance compared to C2 (Supplementary Fig. S4).
Finally, we conducted standard PCA to visualize the clusters (Fig. 3). Eigenvalue analysis indicated that two principal components explained 63.48% of the variance in the data and were optimal in representing the data. Analysis of the loadings after varimax rotation showed the weight of each patient experience metric on the two principal components. The heatmap of factor loadings after varimax rotation (Supplementary Fig. S5) revealed that component 1 (PC1) predominantly relates to access to information, access to care and the doctor-patient relationship, which might collectively be termed the public sphere. This component may reflect needs concerning support from external sources. Component 2 (PC2) focuses on the need for assistance (autonomy) and social life, aligning more with the private sphere. Looking at the biplot displaying the clusters' distribution in a two-dimensional space (Fig. 3), we observed that C1 and C2 differ from the other clusters mainly in PC1 (public sphere), while PC2 (private sphere) contributes more to differentiating C2 and C3 from C1 and C4, with individuals in the latter clusters having fewer needs related to the private sphere.
Table 2 Spearman's correlation. The heatmap depicts the direction of the correlation, with red tones trending towards a stronger association, negative or positive (± 1).

Demographic and clinical characteristics partially explain the identified patterns
C2 and C3 showed a significantly higher mean age than C1 and C4. Individuals in C1 had a higher level of education than those in C2 and C3. The unemployment rate was relatively low for C1 and C4, while it was quite high in C2 and C3, with nearly half of the population without a stable job. C2 and C3 had longer disease duration and a higher degree of disability, measured by PDDS, than C1 and C4 (C1, 1.05 ± 1.20 SD; C2, 3.27 ± 2.07 SD; C3, 3.99 ± 1.95 SD; C4, 0.97 ± 1.22 SD). C2 and C3 had a significantly higher prevalence of persons with a progressive disease (SPMS and PPMS) than C1 and C4.
In C1 and C4, more than half of the individuals were on oral disease-modifying therapies (DMTs). In contrast, the proportion of individuals receiving intravenous medication was significantly higher in C2 and C3, as was the number of individuals not following any treatment. The majority of individuals in C1 and C4 were satisfied with their DMTs. The opposite trend was observed in C2 and C3, with C2 being the least satisfied group.
Lastly, QoL, as measured by both the EQ-5D utility index and the EQ-VAS, was significantly lower in C2 and C3 compared with C1 and C4. Demographic and disease characteristics per cluster and all pairwise comparisons with adjusted p values are reported in Tables 3 and 4.

Discussion
Providing proper and continuous care for PwMS presents several challenges due to the variability of clinical features at disease onset and during disease evolution, which often leads to several needs being unmet. In our study, we comprehensively assessed the demographic and clinical characteristics of PwMS and their unmet needs in Italy to (i) provide a snapshot of PwMS unmet needs in Italy; (ii) identify distinct clusters of subjects according to their unmet needs and their main determinants and (iii) evaluate how demographic and clinical factors were distributed among these clusters.
A significant finding in our study was that nearly all patients reported unmet needs, highlighting potential deficiencies in the healthcare system's ability to manage a chronic disease such as MS. The most pronounced category of unmet needs was related to difficulties in accessing care, closely followed by social aspects and the need for assistance in daily life. This limited accessibility to healthcare services poses significant challenges in obtaining entitled care, often necessitating out-of-pocket expenses for patients. Additionally, our research suggests a pervasive sense of social inadequacy among PwMS. Therefore, neurologists should encourage PwMS to maintain their routines and introduce them to tailored exercise programs to boost their confidence in their physical well-being [10]. In summary, we observed a high prevalence of unmet needs among PwMS, with the primary care domain being the main contributor to the overall burden. This may underscore the shortcomings of the Italian healthcare system in managing chronic diseases.
The unsupervised clustering approach identified four main patterns of unmet needs among PwMS. These differed from each other in the total number and type of reported unmet needs. C1 and C4 emerged as the clusters with the fewest unmet needs. Specifically, PwMS belonging to C4 did not report significant unmet needs in any of the five categories. In C1, PwMS reported a high number of unmet needs exclusively in the domain of access to information. C2 appeared to be the cluster with the highest cumulative number of unmet needs, especially in the domains of access to information and social life. Finally, C3 included individuals reporting an overall moderate number of unmet needs, with the highest scores in need for assistance and social life. The domains that contributed most to cluster definition were access to information, need for assistance and social life. This finding should prompt MS neurologists to carefully investigate these domains to provide a more personalized approach to MS management.
Moreover, PCA enabled us to condense our variables into two primary components, indicating that they can be categorized into two spheres: the public sphere (pertaining to access to information and access to primary care) and the private sphere (related to disability and social life). Among the demographic and clinical variables examined, there was a marked distinction between clusters C2 and C3, which were characterized by a higher prevalence of unmet needs (especially in the private sphere) and a lower QoL, and clusters C1 and C4, which reported fewer unmet needs. The populations of C2 and C3 predominantly consisted of older individuals, who typically had lower education levels and faced higher unemployment rates. In terms of disease characteristics, the individuals in C2 and C3 exhibited longer disease duration, greater disability and a higher frequency of subjects reporting a progressive phenotype, compared to those in C1 and C4. This observation is in line with existing research [11], highlighting that older age, lower educational attainment, extended disease duration and higher disability levels are critical determinants of reduced QoL and more frequent unmet needs. As it is plausible that a better clinical condition may alleviate some needs, it is essential to address the potential disparities in care for patients with more severe conditions. Doctors should play a key role in bridging this gap by implementing personalized treatment strategies and providing comprehensive support services. However, demographic and clinical factors alone do not fully elucidate the four distinct patterns identified. Indeed, no significant differences in demographic, social or clinical variables were found when comparing C1 with C4 or C2 with C3. In summary, factors such as older age, lower education level, longer disease duration and higher disability predominantly characterize clusters with a moderate to high range of unmet needs, especially in domains like need for assistance and social life (private sphere). However, these factors fail to fully explain the varying patterns observed, as they do not account for differences in unmet needs pertaining to the public sphere.
This observation warrants particular emphasis, as it demonstrates that this approach has provided a level of granularity that allows for the identification of unmet need patterns not accounted for by conventional clinical, demographic and disease-related variables. Indeed, our findings have revealed that, regardless of motor disability and other collected information, there are patients with specific unresolved needs. The similarity within these two pairs of clusters (C1-C4 and C2-C3) may be attributed to the presence of additional variables that were not investigated in the present study. These could include cognitive status, depressive or anxious symptoms, possible comorbidities and lifestyle factors related to the environment and economic status. Consequently, it is imperative for future research to further explore the underlying factors contributing to the differences among the clusters, thus allowing for a more accurate definition. Such refinement would enable highly customized strategies for resolving unmet needs based on individual characteristics.
Another interpretation suggests that the two clusters with a high degree of unmet needs may represent a temporal progression from one of the clusters with a low degree of unmet needs. For example, C2 may arise from C1 and C3 may arise from C4. Although this hypothesis cannot be confirmed, the particular distribution of unmet needs categories suggests its plausibility. Future longitudinal studies may be able to provide further details on this connection and on potential predictive factors related to the transition between clusters. For example, a recurring theme for clusters C1 and C2 is the high number of unmet needs in the category of access to information. This finding should prompt neurologists to increase the engagement of their patients by (1) improving doctor-patient communication to enable PwMS to be fully informed and active in healthcare decisions; (2) increasing awareness of symptoms, especially the 'hidden' ones and (3) placing greater emphasis on patient-reported outcomes (PROs) to establish a patient-centred approach [12,13]. These results can be achieved by devoting more effort to informing patients, organizing in-person or remote meetings and instructing PwMS on how to properly use the resources available online.
This study provided an updated overview of the unmet needs of Italian PwMS by considering a broad and widely heterogeneous sample. Our findings showed that demographic and clinical variables only partially explain the observed patterns. Indeed, although sociodemographic and clinical characteristics differ between patients with a higher number of unmet needs in the private sphere (autonomy and social life) and the others, they do not discriminate between the observed heterogeneous patterns. Therefore, the neurologist should pay special attention to the presence of specific unmet needs in all PwMS, regardless of the presence or absence of unfavourable sociodemographic and clinical features. This approach will provide a deep understanding of each individual's specific needs, facilitating an effective and personalized approach to address the needs and improve the overall QoL of PwMS.
Limitations include not having incorporated the rural-urban area differentiation or socioeconomic status and not having assessed cognitive status and depressive or anxiety symptoms, which might have influenced the results. Indeed, more studies are needed to investigate additional variables that were not examined in the present research. Furthermore, Google Form lacks a feature to detect duplicate entries without compromising respondent anonymity. Despite recognizing this constraint, we prioritized respondent anonymity over the possibility of excluding duplicates. However, since the length of a questionnaire can impact response rates, we believed that only a very small number of users would attempt to respond multiple times [14,15]. Finally, having employed e-mail and digital media as a means of recruiting PwMS may have limited the study population, thus creating a selection bias. Therefore, as the response rate was 39.12%, it is possible that the study only partially reflects the entire population of PwMS.

Conclusion
Our study highlights the significant prevalence of unmet needs among PwMS, pointing to potential shortcomings within the healthcare system in managing chronic diseases such as MS. We identified four distinct patterns of unmet needs, underscoring the necessity of tailoring care to each patient's specific requirements. While clinical and demographic factors offer some insight, unexplored variables, including cognitive status and socioeconomic factors, could play a role in shaping unmet needs and should be further explored in future research. Addressing these unmet needs on an individual basis has the potential to significantly improve the overall quality of life of PwMS.

Fig. 1 Hierarchical clustering dendrogram. The figure displays the hierarchical clustering dendrogram generated using the Ward linkage method. The dendrogram provides a visual representation of how the subjects in the dataset are grouped into clusters based on their similarities in the selected features. The x-axis represents the subjects.

Fig. 2 Mean values of normalized variables within clusters generated by hierarchical clustering algorithm (HCA). The heatmap presents the mean values of selected variables within distinct clusters, after the normalization process. The figure offers insights into how variables vary across different clusters. The x-axis represents the clusters: C1: cluster 1, C2: cluster 2, C3: cluster 3 and C4: cluster 4. The y-axis displays the variables used to cluster the subjects ('access to information', 'access to care', 'social life', 'need for assistance', 'doctor-patient relationship').

Fig. 3 Biplots of principal component analysis (PCA) scores and component loadings with cluster assignments. A biplot showing both the PCA scores for individual data points and the PCA loadings for the selected variables. Data points are coloured based on their cluster assignments, with four distinct clusters being represented. The arrows represent the loadings of each selected variable on the two principal components.

Table 1 Demographic and clinical characteristics (N = 690). y years, DMT disease-modifying therapy, MS multiple sclerosis, n number, PDDS Patient-Determined Disease Steps Scale, SD standard deviation

Table 3 Pairwise comparisons between each class of unmet need among each cluster. SD standard deviation; bold formatting within the table highlights significant values

Table 4 (
DMT disease-modifying therapy, MS multiple sclerosis, PDDS Patient-Determined Disease Steps Scale, SD standard deviation; the use of bold formatting within the table was employed to highlight significant values