, 20:20,
Open Access This content is freely available online to anyone, anywhere at any time.
Date: 02 Jul 2012

Identifying clinical course patterns in SMS data using cluster analysis



Recently, there has been interest in using the short message service (SMS or text messaging), to gather frequent information on the clinical course of individual patients. One possible role for identifying clinical course patterns is to assist in exploring clinically important subgroups in the outcomes of research studies. Two previous studies have investigated detailed clinical course patterns in SMS data obtained from people seeking care for low back pain. One used a visual analysis approach and the other performed a cluster analysis of SMS data that had first been transformed by spline analysis. However, cluster analysis of SMS data in its original untransformed form may be simpler and offer other advantages. Therefore, the aim of this study was to determine whether cluster analysis could be used for identifying clinical course patterns distinct from the pattern of the whole group, by including all SMS time points in their original form. It was a ‘proof of concept’ study to explore the potential, clinical relevance, strengths and weakness of such an approach.


This was a secondary analysis of longitudinal SMS data collected in two randomised controlled trials conducted simultaneously from a single clinical population (n = 322). Fortnightly SMS data collected over a year on ‘days of problematic low back pain’ and on ‘days of sick leave’ were analysed using Two-Step (probabilistic) Cluster Analysis.


Clinical course patterns were identified that were clinically interpretable and different from those of the whole group. Similar patterns were obtained when the number of SMS time points was reduced to monthly. The advantages and disadvantages of this method were contrasted to that of first transforming SMS data by spline analysis.


This study showed that clinical course patterns can be identified by cluster analysis using all SMS time points as cluster variables. This method is simple, intuitive and does not require a high level of statistical skill. However, there are alternative ways of managing SMS data and many different methods of cluster analysis. More research is needed, especially head-to-head studies, to identify which technique is best to use under what circumstances.