The impact of depression forums on illness narratives: a comprehensive NLP analysis of socialization in e-mental health communities

While depression is globally on the rise, the mental health sector struggles with handling the increased number of cases, especially since the pandemic. These circumstances have resulted in an increased interest in the e-mental health sector. The dataset is constituted of 67 857 posts from the most popular English-language online health forums between 15 February 2016 and 15 February 2019. The posts were first automatically labelled (biomedical vs. psy framing) via deep learning; second, the time series of framing types of recurring forum users were analysed; third, the clusters of biomedical and psy patterns were analysed; fourth, the discursive characteristics of each cluster were analysed with the help of topic modelling. Five ideal-typical patterns of forum socialization are described: the first and the second clusters express the developing of a ‘recovery helper’ role, either by opposing expert discourses or by identifying with the psy discourses; the third cluster expresses the acquiring of a substantively diffuse, uncertain role; the fourth and fifth clusters refer to a trajectory leading to the incorporating of a biomedically framed patient role, or a therapeutic psy subjectivity. Elements of data collection that potentially undermine representativeness: online forum users, open and public forums, keyword search. The trajectories identified in our study represent various phases of a general forum socialization process: newcomers (cluster 3); settled patient role (cluster 4) or psy subjectivity (cluster 5); recovery helpers (cluster 1 and 2).


Introduction
While depression is globally on the rise, the mental health sector struggles to handle the increased number of cases. This is not only because of the general under-financing of mental health institutions, but also because of the extra difficulties originating from the pandemic: while COVID generated an extra wave of anxiety and depression, in the last few years vulnerable social groups had less chance to participate in psychotherapy in person [34]. Overall, these circumstances resulted in an increased interest in the e-mental health sector, which has become a central platform not only for gathering information about various mental disorders, but also for providing support online [43,50,51].
These conclusions served as the starting point of our investigations: while the content of online depression forums is relatively well mapped [5,23,32,42], the impact of these online interactions remains uncharted territory. Little is known about the online 'roles' available in e-mental health communities or the impact of these virtual interactions on the depressed 'self'. In the present study, the socialization function of online depression forums is analysed in a broad Goffmanian framework [12]: the 'symbolic e-interactions' are understood as opportunities for experimenting with various roles and, by distancing from or identifying with these roles, developing self-narratives. These narratives not only ground identity, but also outline various ways of relating to one's depression: it is either internalized as part of the self (i.e. illness narratives [20]), or distanced as a defeated existential obstacle (i.e. recovery narratives [25]), or lingers as a resolvable enigma (i.e. explorative narratives [15]).
While there are several studies exploring the interactions of online depression forums with qualitative methods (e.g. [27,41]), the ones relying on comprehensive datasets and appropriate natural language processing (NLP) methods are still rare. From a methodological perspective, our study aims at filling this gap as well: the online depression forums are analysed with various NLP methods (i.e. algorithmic annotation, topic modelling) and traditional statistical methods (i.e. time series of odds ratios), so that the socialization process can be grasped in a comprehensive manner.

Theoretical background
According to the classics of symbolic interactionism, the self is constructed in the engagements with the other [28]. Identity is embedded in those social interactions which provide opportunity to disclose the self and reflect upon it from the perspective of the other. Social roles provide frames and reference points for these processes: by expressing identification with or distancing from a given role, the subjects might construct and display their self-identity [12]. From the perspective of these general ideas, the depressed self can be reinterpreted as well. Depression is not merely an objective 'mental illness'; it also affects the interactive construction of the self. It is enacted through various roles, which are, however, not necessarily incorporated: subjects might also contest the social role expectations and attributions related to depression. The way depressive self-identities are constructed is crucial in the process of overcoming depression.
On the level of expert discourses, depression is predominantly framed by a biomedical model emphasizing bodily malfunctions and pharmaceutical treatment [37]. The biomedical approach is complemented by the psychological one, which emphasizes maladaptive developmental patterns, automatic thoughts and traumas, while recommending psychotherapeutic interventions [4]. These expert discourses outline a horizon for the lay interpretations as well. The actors do not blindly follow the biomedical or psychological semantics; rather, they reinterpret them according to their own lifeworld. While elaborating complex narratives, they embed the various expert discourses in their life story [17]. These 'illness narratives' [9,20] and 'recovery narratives' [25] indicate the ways leading into and out of depression.
Journal of Computational Social Science (2023) 6:781-802
Similarly to other chronic illnesses [48], depression is also embedded in a temporal structure of identity construction [8]. Various phases can be differentiated: the time before depressive symptoms; the first disturbing signs; the dramatic turning points; the certainty of 'something wrong'; the battle with the suffering; the 'coming out'; the compromises; the overcoming of depression; the wisdom gained [15]. These temporally structured narrative panels hold the potential of dealing with the 'existential crisis' [6] threatening the depressed subjects. Appropriate self-narratives require a resocialization process, wherein the self-blaming [52] and self-objectifying [18] framings (implied by one-sided psy and biomedical discourses) are equally avoided.
From a phenomenological perspective, which complements the psy and biomedical approaches, depression can be understood as an embodied negative relation to the world characterized by a horizon full of obstacles (instead of opportunities), narrowed time consciousness (instead of an open future and past), disrupted interaffectivity and intercorporeality (instead of the ability to resonate with others), passivity and helplessness (instead of agency) [11,36]. To transform these phenomenological dimensions, an interactive space is needed, wherein one can work not only on their inner experiences (i.e. the aim of most therapeutic relationships), but also on the social roles contextualising their identity. From this perspective, peer group interactions are particularly important: not being pre-determined by hierarchies, they enable experimenting with roles (unlike psychotherapeutic interactions, which imply the medicalized role of a 'patient'). This explains the importance of mental health communities: they provide opportunity for constructing depressed self-identities in a relatively non-deterministic setting.
Our research question is further refined according to the literature on virtual communities. Face-to-face and online interactions differ in many important aspects: although the latter are stretched out in space and time, the narrative universe is narrowed, while the co-presence of interaction partners is replaced by parallel channels of communication [45]. Due to these differences, online interactions provide an alternative space for role playing and identity construction. The roles are underdefined and the expectations are broader, which implies greater freedom for the subjects, while also depriving them of mutual reference points. Furthermore, as the narrative universe is narrowed and the subjects navigate in a relatively unpredictable interactive realm, freer experimenting becomes possible at the cost of losing the chance of direct intersubjective feedback. As the circle of potential partners is wide, attempts at disclosure are not limited by space, time or social circles; however, it is more difficult to establish intimacy and maintain long-term attention or communication. Overall, online interactions further expand the ambiguity of interactions in modernity: they provide greater freedom, which, however, comes with increased contingency and indifference [24]. The interactions in e-mental health communities are characterized by such constraints: the subjects aiming at negotiating their depressed identities have access to a virtually unlimited space of interactions, which are, however, generally more unpredictable and fragile [22].
Despite these limitations, according to previous research, these platforms hold great potential: the anonymity of online forums proves to be particularly important in the case of stigmatized mental disorders [30]; the contribution of mediators (who have often recovered from mental disorders themselves) can enhance the quality of peer support beyond the scope of expert therapeutic intervention [27]; and online 'ritual healing' may occur as a latent community praxis [41]. Based on these results, it may be argued that online interactions do not simply represent a reduced interactive space, but rather an alternative one. Role playing and identity construction, socialization, and individualization occur in e-mental health communities in a slightly different manner: the inclusiveness is extended, yet the impact is more subtle.
Our research questions were formulated within these frames. On the one hand, we were interested in the impact of online depression forums on the subject's attributive patterns. Previous research identified two main patterns of framing depression in lay discourses: one embedded in biomedical discourses, the other relying on psychotherapeutic discourses [5,31,42]. These discursive frameworks have implications for the construction of depressed self-identity: the exclusive application of the former carries the risk of self-objectification [18], while the over-emphasizing of the latter carries the risk of self-blaming [52].
Based on these premises, we were interested in the transformation of the attributions of long-time forum users. Three research questions were raised:
- RQ1: how can the ideal-typical trajectories of acquiring biomedical or psy framing be described?
- RQ2: what are the semantic and performative characteristics of the users following different socialization trajectories?
- RQ3: what are the consequences of online forum socialization for the depressed self-identities and illness narratives?

Data and methods
The dataset consists of 67 857 posts from the most popular English-language online health forums¹ between 15 February 2016 and 15 February 2019, covering only publicly available posts, which were shared willingly by their authors. The posts were selected using the Google search terms "depression forum" and "depression online". Our goal was to collect only posts that specifically discussed depression; to do this, we selected threads that included the word "depression" or "depressed" in the title or in at least one post, followed by posts whose link, topic, or content included a depression-related term, such as "unipolar depression", "mood disorder", or "depressant". Only posts that were long enough (> 20 letters) were retrieved, since in shorter posts, after removing semantically non-informative words (e.g. is, was, the), only one or two words of semantic value remain, making a judgement on the framing of depression (as discussed below) unreliable.² The mean number of letters in retrieved posts is 984 (with a standard deviation of 1459), while the median is 568 letters (with an interquartile range of 893). The dataset was collected by SentiOne, a web-based social listening and text analytics platform, in full compliance with all EU regulations. Several studies have demonstrated the usefulness and adequacy of this tool for scientific data collection, e.g. Burzyńska et al. [2] and Kmetty et al. [21], while Healy and Neag [16] discuss potential ethical considerations. Because no identifiable information was collected, neither a specific ethical approval nor consent for publication was required [47].
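The selection rules above can be illustrated with a small filter. This is only a sketch and not the SentiOne collection pipeline: the names `keep_post`, `letter_count` and `DEPRESSION_TERMS` are hypothetical, the thread-level and post-level keyword checks are collapsed into a single test on title and content, and "letters" is read as alphabetic characters.

```python
# Hypothetical term list assembled from the examples given in the text.
DEPRESSION_TERMS = ("depression", "depressed", "mood disorder", "depressant")
MIN_LETTERS = 20  # posts must contain more than 20 letters


def letter_count(text: str) -> int:
    """Count alphabetic characters only (one possible reading of 'letters')."""
    return sum(ch.isalpha() for ch in text)


def keep_post(title: str, content: str) -> bool:
    """Keep a post if it mentions a depression-related term and is long enough."""
    haystack = f"{title} {content}".lower()
    mentions_term = any(term in haystack for term in DEPRESSION_TERMS)
    return mentions_term and letter_count(content) > MIN_LETTERS
```

A post such as "Me too!" would be dropped by the length rule even in a thread about depression, which mirrors the reasoning given for the threshold.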
The unit of our analysis is the individual, but at the fundamental level, our dataset consists of forum posts written by various accounts in various threads; overall, this resulted in a three-level data structure. Since we wanted to track changes in posting, we chose some salient features of the posts that are amenable to time series modelling. As we have already established in our previous research that posts can be automatically categorized according to the type of framing used [33], it is a straightforward choice to associate posts with their frames and look at the temporal patterns of framing used by individuals.

Frame labelling by a deep learning language model
Previously, we employed coders to read a randomly selected sub-corpus (n = 4454 posts) and to assign one of four labels (bio-medical, psychological, sociological, other) to the posts according to the type of framing their authors used. In [33], we showed that using a DistilBERT deep learning model [38], we can extend this knowledge to the whole corpus, and we can predict the frame label of a post with 68% validation set accuracy and recall, so we used this model in the present study to label the whole corpus. Gupta et al. [14] demonstrate that BERT-based text classification models work well in several similar use cases. Even though label prediction was not perfect, accuracy and recall values were very similar for the 'biomedical' and 'psychological' labels separately (the 'sociological' label, which we did not use in the present study, was the exception), thus bias toward either of these two labels is unlikely to occur (see Appendix Table 3).
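The comparison of per-label recall that underlies the bias argument can be sketched as follows. The code does not reproduce the DistilBERT model or the reported figures; the function names and the idea of passing plain lists of gold and predicted labels are assumptions made for illustration.

```python
from collections import Counter


def accuracy(gold, predicted):
    """Share of posts whose predicted frame equals the human-coded frame."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def per_label_recall(gold, predicted):
    """Return {label: recall} for every label present in the human-coded data."""
    totals = Counter(gold)
    hits = Counter(g for g, p in zip(gold, predicted) if g == p)
    return {label: hits[label] / totals[label] for label in totals}
```

If the recall values for the two substantive labels are close to each other, misclassification should affect both binary series to a similar degree.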

Time series modelling of labels
Using the frame labels, posts for individuals are represented as a categorical time series. To further simplify the analysis, we modelled each label separately as a binary time series, contrasting each substantive label with all other labels. Because of the relatively small number of 'sociological' labels, we chose to focus on the 'bio-medical' and the 'psychological' labels as two binary time series modelled separately.
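As a brief illustration, the one-versus-rest recoding of a categorical label series might look like this (the function name and the label strings are hypothetical):

```python
def to_binary_series(labels, target):
    """Contrast one frame with all others: 1 if the post uses `target`, else 0."""
    return [1 if label == target else 0 for label in labels]
```

Applying it once with the 'bio-medical' label and once with the 'psychological' label yields the two binary series that are modelled separately.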
As the discussion forum socialization of users was the focus of our study, the analysis was restricted to those individuals who posted at least ten times during the study period (n = 794). A lower number of posts would make the assessment of time dependency for frame usage unfeasible. The histogram in Fig. 5 (see Appendix) shows the number of posts per user. We can see that few individuals have many posts, and most have only a few (Fig. 1).
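The restriction to recurring users can be sketched as a simple grouping step. The tuple layout `(author, timestamp, label)` and the function name are assumptions for this example, not the structure of the original dataset.

```python
from collections import defaultdict

MIN_POSTS = 10  # threshold stated in the text


def active_authors(posts, min_posts=MIN_POSTS):
    """Group (author, timestamp, label) tuples by author and keep only authors
    with at least `min_posts` posts; each author's posts are sorted by time."""
    by_author = defaultdict(list)
    for author, timestamp, label in posts:
        by_author[author].append((timestamp, label))
    return {a: sorted(p) for a, p in by_author.items() if len(p) >= min_posts}
```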
For each individual and each binary time series, a logistic regression is fitted to extract the underlying probabilities of posting using either a 'biomedical' or a 'psychological' framing. A more flexible generalized additive model with regularized splines [49] was also considered, but the results showed that most individuals do not post frequently enough for us to be able to identify complex patterns, and a logistic curve is enough to model changes in frame probabilities.
The logistic regression models must include time as the primary explanatory variable. We wanted to know how the time spent on the forum changes the users' framing, and how the dominant framing of a thread affects the user, but these effects cannot be separated in the one-predictor model. Hence, an additional explanatory variable is needed to account for a possible selection effect in choosing which thread to post in. This second variable, which we call 'background ratio', measures the ratio of the given frame in the thread of the post up till the time of the post. As an example, when modelling the probability of using the 'biomedical' frame, if post X was written in thread A in a forum at time T, the ratio of posts with the 'biomedical' frame in thread A up till time T gives the background ratio for post X. As an illustration, Fig. 6 (see Appendix) shows the time series of 20 individuals focusing on the predicted probability of the 'biomedical' frame. Lines not spanning the entire time interval show that not all individuals had posts for the complete study period.
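A minimal sketch of the background ratio for a single thread is given below. Two details are assumptions, since the text does not settle them: the post itself is excluded from its own background, and the first post in a thread, which has no earlier posts, is assigned a ratio of 0.0. The function name and input layout are hypothetical.

```python
def background_ratios(thread_posts, frame):
    """thread_posts: list of (timestamp, label) for ONE thread.
    Returns (timestamp, ratio) pairs, where ratio is the share of EARLIER posts
    in the thread that use `frame` (assumption: the post itself is excluded)."""
    ratios = []
    seen = 0        # number of earlier posts in the thread
    with_frame = 0  # number of earlier posts using the frame
    for timestamp, label in sorted(thread_posts):
        # Assumption: a thread-opening post has no background, so use 0.0.
        ratio = with_frame / seen if seen else 0.0
        ratios.append((timestamp, ratio))
        seen += 1
        with_frame += label == frame
    return ratios
```

Multiplying the ratio by 100 would express it in percentage points, the unit used when the regression coefficients are interpreted.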

Segmentation of posting patterns
Using the parameter space of the logistic regression model, each individual's time series for each frame can be represented with three values, the three regression coefficients: one is the constant, and the other two are the slopes for the two explanatory variables (time and background ratio) in the logits of the probabilities. Exponentiating the latter gives odds ratios for the outcome, so these can be interpreted as the multiplicative changes in odds for a one-unit increase in the explanatory variable. One unit for time is a year in this case, and one unit for the background ratio is one percentage point.
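For illustration, the per-author model can be written as logit(p) = b0 + b1 * time + b2 * background ratio, with exp(b1) and exp(b2) as the two odds ratios. The sketch below fits this model by Newton-Raphson in plain Python. The estimation routine, the function names (`solve`, `fit_logistic`, `odds_ratios`) and the input layout are assumptions made for this example; the text does not state which software or optimiser was used.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_logistic(rows, y, iterations=25):
    """Newton-Raphson maximum likelihood fit.
    rows: list of (time_in_years, background_ratio_in_pct); y: list of 0/1.
    Assumes the outcome is not perfectly separable, otherwise estimates diverge."""
    X = [(1.0, t, r) for t, r in rows]
    beta = [0.0, 0.0, 0.0]  # constant, time slope, background slope
    for _ in range(iterations):
        grad = [0.0] * 3
        hess = [[0.0] * 3 for _ in range(3)]
        for x, yi in zip(X, y):
            z = sum(b * v for b, v in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-z))
            w = p * (1.0 - p)
            for j in range(3):
                grad[j] += (yi - p) * x[j]
                for k in range(3):
                    hess[j][k] += w * x[j] * x[k]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


def odds_ratios(beta):
    """Exponentiate the two slopes: odds ratio per year and per percentage point."""
    return math.exp(beta[1]), math.exp(beta[2])
```

With only ten or so posts per author, perfect separation can occur in practice, so a real analysis would need a guard (or a penalised fit) that this sketch does not include.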
To investigate the presence of typical posting patterns, we clustered the authors based on their modelled time series curves. First, we constructed a two-dimensional embedding of the curves using t-distributed stochastic neighbour embedding (t-SNE [46]) of four of the regression coefficients, namely the two slope coefficients for each of the two frames; then we employed hierarchical clustering (run with Euclidean distance and complete linkage) to identify distinct groups.
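The clustering step can be sketched with a naive agglomerative procedure using Euclidean distance and complete linkage. The t-SNE embedding is omitted here, so the sketch operates directly on whatever coordinates are passed in (for example, the four slope coefficients per author). The function name is hypothetical, and the pairwise search is far slower than library implementations; it is meant to show the logic only.

```python
import math


def complete_linkage(points, n_clusters):
    """Agglomerative clustering with Euclidean distance and complete linkage.
    Returns a list of clusters, each a sorted list of indices into `points`."""
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        # Complete linkage: distance between the two farthest members.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = cluster_distance(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        # Merge the closest pair of clusters.
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return clusters
```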

Semantic analysis of the clusters
In a previous research phase, we built a topic model on the posts [32] to computationally discover abstract "topics" that occur in depression forums, using an NLP tool, latent Dirichlet allocation (LDA) topic modelling [1]. The exact steps of text processing and model development can be found in [32]. In the present study, we worked further with these previously identified topics, linking them to the clusters established based on time series modelling. These user clusters can be described by the proportion of the (main) topics of the posts written in them: this describes what kind of framing the users in the cluster generally apply. To test whether the distribution of main topics in each cluster differs significantly from the overall distribution in the population of posts, we used Pearson's Chi-square tests. We considered the observed overall proportions per topic as the approximation of the distribution in the population. Since we conducted multiple comparisons, we used Bonferroni-corrected p values. We found that the distribution in every cluster is significantly different from the overall distribution. After establishing that the distributions differ significantly, we conducted post hoc binomial tests to find the topics whose proportion in a cluster differs significantly from their overall proportion.
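The testing sequence can be sketched in plain Python. Several points are assumptions: the function names are hypothetical, the binomial test is implemented in its exact two-sided form (the text does not say which variant was used), and only the Chi-square statistic is computed, because converting it to a p value needs the Chi-square distribution, which a statistics package would supply. The exact binomial sum is only practical for small counts; with thousands of posts a log-space or library implementation would be needed.

```python
import math


def chi_square_statistic(observed, expected_props):
    """Pearson goodness-of-fit statistic of a cluster's topic counts
    against the overall topic proportions."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, expected_props))


def binomial_two_sided_p(k, n, p):
    """Exact two-sided binomial p value: total probability of all outcomes
    that are no more likely than the observed count k."""
    pmf = [math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= threshold))


def bonferroni(p_values):
    """Multiply each p value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```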

Results and discussion
In the following section, first the process of segmenting the biomedical and psychological posting patterns is described, which is followed by the process of differentiating between various clusters constituted of the odds ratios of the time series of biomedical and psychological posting (RQ1). Secondly, each socialization pattern is characterized from the perspective of its semantic and performative dimensions (RQ2). Finally, an attempt is made to theorize about a comprehensive ideal type of online socialization processes and the related patterns of depressed self-identity (RQ3).

Acquiring biomedical and psy framing: the ideal-typical socialization trajectories (RQ1)
After fitting all the logistic regression models separately for each author's time series of biomedical posts and of psychological posts, as described above, we could characterize authors using the coefficients of their two regressions. The two-dimensional scatterplot in Fig. 2 shows each individual as a point, for which the coordinates are the two (exponentiated) slope coefficients from the logistic regression models for the 'biomedical' frame.
An exponentiated coefficient (odds ratio) of one means there is no change in the odds of using the 'biomedical' frame. We can see that 95% of authors have a coefficient over one for either the time (45% of authors) or the background ratio (90%) or both (39%), which suggests that the odds of using the 'biomedical' frame increase, because (1) individuals use the 'biomedical' frame relatively more often over time regardless of the background ratio found in the thread they are posting in, but also because (2) they increasingly tend to favour threads with a higher ratio of posts with 'biomedical' framing. The scatterplot in Fig. 3 shows the same individuals for the 'psychological' frame.
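The shares reported here can be derived from the per-author odds ratios by simple counting, as in the sketch below (the function name and input layout are hypothetical). By inclusion-exclusion, the 'either' share equals the time share plus the background share minus the 'both' share; with the rounded figures above this gives 45 + 90 - 39 = 96, which agrees with the reported 95% up to rounding.

```python
def share_over_one(pairs):
    """pairs: list of (or_time, or_background) per author.
    Returns the share of authors whose odds ratios exceed one."""
    n = len(pairs)
    time_up = sum(t > 1 for t, _ in pairs)
    background_up = sum(b > 1 for _, b in pairs)
    both = sum(t > 1 and b > 1 for t, b in pairs)
    either = time_up + background_up - both  # inclusion-exclusion
    return {
        "time": time_up / n,
        "background": background_up / n,
        "both": both / n,
        "either": either / n,
    }
```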
90% of authors have an odds ratio over 1 for either time or background ratio in the use of the psychological frame. 48% have increasing odds of using this frame as they spend more time on the forum, and 82% mirror the background ratio, meaning that as that ratio increases, so do the odds of using the psychological frame in a post. For 13% of authors, all four odds ratios exceed one, which is possible since the use of the biomedical and the psychological frame can both increase at the same time (with a reduction in the usage of the sociological and "other" frames).
Fig. 2 Odds ratios from the logistic regression models for the biomedical framing. Each dot represents an author. The odds ratios are the exponentiated regression slope coefficients (omitting the third one, the constant)
Looking at the patterns in the regression coefficients more rigorously, we could identify five groups. We used t-distributed stochastic neighbour embedding (t-SNE) to reduce the four original dimensions of the data, given by the four regression slope coefficients, to two. Then we clustered authors in the reduced space into five distinct groups, shown in Fig. 4. It must be noted that clustering directly on the four regression coefficients yielded the same results, while using t-SNE for dimension reduction aids visual interpretation.
Fig. 3 Odds ratios from the logistic regression models for the psychological framing. Each mark represents an author. The odds ratios are the exponentiated regression slope coefficients (omitting the third one, the constant)
Fig. 4 Two-dimensional t-SNE embeddings of the four regression coefficients for authors. Symbols and numbering of groups correspond to the groups given by the hierarchical clustering
The first cluster in Table 1 is characterized by the gradual decrease of both biomedical and psy framing, conformity with biomedical threads and slight opposition to psy threads. Although the difference is small, it needs to be mentioned that the psy discourses are denied more than the biomedical ones. Usage of both frames can decrease simultaneously, since posts may have social framing or lack a distinct frame. The more time these users spend on depression forums, the more they distance themselves from expert discourses of depression and seek alternative narratives. This trajectory expresses a pattern of identity construction based on the denial of both typical forms of discursive power, that is, an attempt at regaining control over defining the depressed self [19].
The second group is characterized by the decrease of biomedical framing, the increase of psy framing and conformity with both discourses. The members of this cluster gradually identify with the roles provided by the psy discourse, while distancing themselves from biomedical narratives. This trajectory refers to an identity construction which is based on a differentiation within the expert discourses on depression: the refusal of an illness role is paired with the acceptance of a therapeutic subjectivity [44].
The third cluster is characterized by a major increase of biomedical framing and a smaller increase of psy framing (and no particular trend in conformity). This cluster can be understood as the opposite of the first: instead of distancing themselves from expert discourses, these subjects identify with both versions. Their identity is constructed according to the roles provided by biomedical and psy paradigms, without any claims of independence or autonomy [39].
The fourth group is characterized by a major increase of biomedical framing and a major decrease of psy framing (especially in biomedical discursive environments). Those who belong to this cluster gradually distance themselves from psy roles, while identifying with biomedical roles. Their identity is based on the illness paradigm, which is also used as a basis for criticizing therapeutic subjectivities [3].
The fifth group is characterized by a major increase of psy framing, a smaller increase of biomedical framing and slight non-conformity. The more time they spend in depression forums, the more they internalize a psy subjectivity, although that does not mean completely giving up biomedical roles. Instead, an identity is developed which is mostly based on therapeutic discourses, partly based on biomedical ones, but (unlike cluster 2) also remains quite reflective. Similarly to cluster 4, these subjects are ready to confront dominant views, in both biomedical and psy threads, which indicates their insistence on autonomy. In this case, the therapeutic subjectivity is complemented with doubt and criticism and the need for experimenting with identity [7].

The semantic and performative characteristics of socialization trajectories (RQ2)
The next step of the analysis was the semantic and performative characterization of socialization trajectories. The initial description of ideal-typical socialization trajectories was complemented with the analysis of the semantic and performative patterns of each cluster. Our earlier topic modelling [42] revealed various monological and interactive posts: the first group included attributions related to depression (e.g. consequence of an illness, work or education stress, or dysfunctional family and intimate ties) and disclosures (e.g. suffering and well-being monologues); the second group included consultations (e.g. about drugs or therapies) and supportive interactions (e.g. recovery helpers' advice or relations based on unconditional positive regard). For each cluster, the relative weight of each semantic pattern was measured.
Table 2 summarizes the proportion of the main topics per user cluster. For each group, the numbers represent the proportion of posts whose main topic is shown in the column. The Bonferroni-corrected p values for the binomial tests are listed where this proportion is significantly different from the overall proportion of the main topics (typed in bold).
The first cluster is characterized by a below average frequency of suffering monologues and of consultations about psy discourses, the depressive experience and drugs; and by an above average frequency of support provided by recovery helpers and unconditional recognition. Overall, this pattern expresses a very restrictive discursive universe: instead of trying to reinterpret the depressive experience from either a biomedical or a psy perspective, or disclosing personal suffering, the emphasis is on pragmatic advice and direct emotional support. This socialization pattern expresses not only disillusionment about expert discourses and the claim of retaking the autonomy of constructing the depressed self, but also includes the emergence of a supporting role. The members of the first cluster follow a trajectory that leads to a recovery helper role based on a pragmatic perspective, which relies less and less on expert discourses [40].
The second cluster is characterized by above average suffering and well-being monologues and drug consultations, and by below average recovery helpers' counselling. Similarly to the first trajectory, in this case there is a turning towards experiences of suffering and towards others in need. However, unlike in the case of the first cluster, this supportive role is not based on distancing from expert discourses; instead, it is embedded in biomedical and psy discourses. The more time these subjects spend in the depression forums, the more they internalize a psy perspective, while relying less on their personal experiences. These two aspects are gradually combined into a new role of a 'psy-advocate recovery helper' [27].
The third group is characterized by below average well-being monologues and interactions implying unconditional recognition, and by above average consultations about psy discourse and religious support. This trajectory is defined mainly in a negative way: positive experiences of well-being or recognition are missing from the discursive space. It seems that by spending more time in the online depression forums, these subjects identify with the expert discourses in an uncertain way: they accept their biomedical, psy and even religious versions alike, while not being able to identify with any of them. They experiment with various roles, as if they were not sure what discourse fits their needs [29].
Table 2 The prevalence of semantic topics within each socialization cluster
The fourth group is characterized by below average well-being monologues and religious support; otherwise, the members of the group do not stand out on any dimension. The denial of these two semantic patterns, paired with a general indifference, indicates the incorporation of a positivist perspective. Both the forced expressions of happiness and the transcendental consolations are refused, while the biomedical framing is cultivated more and more and the psy attributions are distanced. The members of this group not only identify with the patient roles assigned by biomedical discourse, but also strictly refuse any alternatives, such as relying on psy discourses or experimenting with peer support [10].
The fifth group is characterized by below average partnership and work-related attributions and well-being monologues, and by above average consultations about the depressed experience. The more time the members of this cluster spend in online depression forums, the less they rely on social explanations; instead, they engage with internal experiences and initiate discussions about them. This expresses a full commitment to psy discourses and the incorporation of related roles [26]. Similarly to group four, they identify with a certain set of roles; however, in this case that does not imply complete passivity. Unlike the patient role assigned by biomedical discourse, the psy subjectivity is entitled to experiment with the narratives, to be reflective (as expressed by the non-conformity) and to engage in interactions (even if only about internal experiences).

The impact of socialization trajectories on identity and illness narratives (RQ3)
Besides describing the ideal-typical trajectories of socialization (RQ1, 2), to elaborate a comprehensive model (RQ3), the average age and the average number of forum accounts belonging to each group were also characterized. Based on the qualitatively gathered data of the five to six most active users in each cluster (see Table 4 in the Appendix), it may be argued that the accounts belonging to groups 1 and 2 are the oldest, the accounts of the fourth and fifth groups are younger and the accounts of the third group are the youngest. Furthermore, the most posts per user belong to groups 1 and 2, followed by group 3, and the fewest belong to groups 4 and 5.
Having reviewed these results, an attempt can be made to answer our last research question concerning the comprehensive socialization dynamics of depression forums (RQ3). If we look at online depression forums as a special interaction setting, wherein certain roles are assigned and identities are constructed (by internalizing or distancing from these roles), it may be argued that five ideal-typical trajectories can be differentiated. The first and the second clusters express the development of a recovery helper role [35], either by opposing expert discourses or by identifying with the psy discourses (becoming a biomedical recovery helper is improbable due to the relative inaccessibility of the biomedical discourse compared to the psy discourses). The third cluster expresses the acquisition of a substantively diffuse, uncertain role. The fourth and fifth clusters refer to a trajectory leading to the incorporation of a biomedically framed patient role or a therapeutic psy subjectivity.
While at first it may seem that these trajectories represent distinct, alternative socialization paths, it may also be asked how they relate to each other. According to previous qualitative research, those who become regular users of depression forums do not necessarily remain for the purpose of gathering more and more information about their problem; instead, they rely on the positive feedback coming from the online interactions, while also getting involved in the rituals of providing support (e.g. via unconditional positive regard [41]). This implies that the trajectories identified in our study may also represent various phases of a general forum socialization process.
According to our hypothesis, the first phase of socialization affects the 'newcomers', who are disoriented (lacking autonomy and incorporated discursive frames) and mainly preoccupied with their own depressive experiences (expressed by cluster 3). This phase can evolve in several directions: if the newcomer is generally disappointed with the forum interactions, they leave after or before gathering the required bits of information. However, if some sort of commitment emerges, then it may lead to various trajectories: clusters 4 and 5 represent this second potential phase. The original disorientation may either turn into the internalization of biomedical discourse (that is, the emergence of a patient identity, expressed by cluster 4), or it may result in the incorporation of psy discourse (that is, the emergence of a psy subjectivity, expressed by cluster 5). In the second phase, the focus is still on the depressed self, which poses a narrative challenge. Thus, attempts are made to find an appropriate configuration of roles and identity patterns, which are negotiated within the frames of biomedical or psy discourses. These explorations require time, until the related interactions eventually saturate. Those forum participants who find their answers detach from the e-community; only those who develop further roles remain. This indicates the beginning of the third phase, wherein the recovery helper roles emerge. The non-expert and psy versions of recovery helpers refer to the last step of the transformation of an originally disoriented depressed self, which later turns into a discursively settled depressed self: at this last transition point, the depressed self is complemented with a quasi-therapeutic dimension (expressed by clusters 1 and 2).
To test this hypothesis, the account age and the average number of posts of each group are compared. Based on Table 4, the data seem consistent with our generalized hypothesis of forum socialization. The oldest and most active members are the recovery helpers (clusters 1 and 2). They acquired a special competence and skillset by walking through phases 1 and 2, and the sharing of this unique knowledge is recognized by their peers. The youngest, but similarly active, group is cluster 3: they represent phase 1, where the disoriented newcomers try to find out how they can profit from the online interactions. Between these two phases, there is a more limited phase of the settled patient role or psy subjectivity: the subjects from clusters 4 and 5 contribute the least, while they are the second oldest group. Phase 2 also represents a potential way out of the forums: those who are satisfied with the newly found patient role or psy subjectivity do not need the online community anymore; only those who move towards phase 3 remain.
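The comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the records are invented toy values, not the figures from Table 4, and the names `users`, `PHASE_OF_CLUSTER` and `summarize_by_phase` are hypothetical. The only element taken from the text is the hypothesised mapping of clusters to phases (cluster 3 to phase 1, clusters 4 and 5 to phase 2, clusters 1 and 2 to phase 3).

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records: (cluster, account_age_in_days, number_of_posts).
# These values are invented for illustration and are NOT taken from Table 4.
users = [
    (1, 900, 120), (2, 850, 110),
    (3, 100, 90), (3, 140, 95),
    (4, 400, 30), (5, 450, 25),
]

# Hypothesised mapping of clusters to socialization phases, as described in the text.
PHASE_OF_CLUSTER = {3: 1, 4: 2, 5: 2, 1: 3, 2: 3}


def summarize_by_phase(records):
    """Return {phase: (mean account age, mean posts per user)}."""
    grouped = defaultdict(list)
    for cluster, age, posts in records:
        grouped[PHASE_OF_CLUSTER[cluster]].append((age, posts))
    return {
        phase: (mean(a for a, _ in rows), mean(p for _, p in rows))
        for phase, rows in grouped.items()
    }


summary = summarize_by_phase(users)
for phase in sorted(summary):
    age, posts = summary[phase]
    print(f"phase {phase}: mean age {age:.0f} days, mean posts {posts:.1f}")
```

With data shaped like this, the hypothesis would be supported if mean account age increases from phase 1 to phase 3 while mean posts per user is lowest in phase 2. With only five to six users per cluster, such a comparison remains descriptive rather than a statistical test.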

Limitations
Our analysis also has its limitations. Those who actively seek support through an online forum are not representative of people directly or indirectly affected by depression. Additionally, due to data protection and ethical regulations, our search was restricted to those forums which were public and accessible without registration. Password-protected support groups may be felt to be safer places. However, a study by Gulliver et al. [13] showed that participants believed that the forums should be accessible to view content without registration. Presumably, lower-threshold forums have greater reach, but are less likely to include those struggling with the most serious problems.
We collected only posts including specific keywords, a decision justified by methodological reasoning. This criterion ensures that only posts which explicitly discuss depression are selected; however, the procedure may lead to the over-representation of conversations which consider depression as a pathological mood disorder, while under-representing cultural or social references.
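The selection effect of such a keyword criterion can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the keyword list, the example posts and the helper `mentions_keyword` are hypothetical, and the study's actual search terms and collection tooling are not reproduced here.

```python
import re

# Hypothetical keyword list; the study's actual search terms are not reproduced here.
KEYWORDS = ["depression", "depressed"]

# Whole-word, case-insensitive match on any of the keywords.
PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, KEYWORDS)) + r")\b",
    re.IGNORECASE,
)


def mentions_keyword(post: str) -> bool:
    """Return True if the post explicitly contains one of the keywords."""
    return PATTERN.search(post) is not None


# Invented example posts, for illustration only.
posts = [
    "I have had depression since I started this job.",
    "Work drains me and I cannot get out of bed anymore.",
    "Feeling DEPRESSED again after the breakup.",
]

selected = [p for p in posts if mentions_keyword(p)]
excluded = [p for p in posts if not mentions_keyword(p)]
```

In this toy example the second post describes a comparable experience in everyday, work-related terms but never names the condition, so it is excluded. This is the kind of under-representation of social or cultural framings that the limitation refers to.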
Finally, the relatively low number of recurring forum users could also limit our findings.

Conclusion: elaborating illness narratives in the online forums
The broader stakes of our analysis can be highlighted by referring to the phenomenological descriptions of depression. To overcome the 'disrupted interaffectivity' causing isolation [11] and to recover the lost agency causing helplessness [36], one needs to find illness and recovery narratives that outline a space of interactions enabling connectivity and agency. Our findings might orient clinical praxes by revealing how depression forums can contribute to the acquisition of new roles and identities.
Recovery helper roles establish an intersubjective setting, wherein the depressed self can be turned into a self of unique experiences, which hold exceptional potential. By sharing first-hand experiences of the personal struggle with depression, the subjects may become supporters themselves; that is, they gain the chance of re-establishing intersubjectivity and becoming actors once again. In this sense, the e-mental health communities do not simply provide a reduced version of interactions, but rather represent a unique chance: due to their socialization structure, they enable the transformation of the depressed self not only into a biomedical patient role or a psy subjectivity, but also into a recovery helper. This possibility provides not just a new identity, but also a way out of the paralysed and isolated existence of the depressed self (clusters 1 and 2).
However, this trajectory is far from being general: many forum users quit as soon as they receive answers to their questions. These socialization trajectories are represented by the acquisition of a biomedical or psy framing, which is applied in the process of identity construction (clusters 4 and 5). Besides these ideal-typical patterns, it should also be emphasized that the forums are not for everyone: the relatively low number of regular users indicates that many actors turn away from the online forums, probably even before establishing an illness narrative with their help (cluster 3). Exploring the explanatory factors behind these divergent trajectories is a potential direction for further research.

Appendix
See Tables 3 and 4 and Figs. 5 and 6.

Fig. 1
Flowchart of the data analysis process. The Text database contains all forum posts collected using SentiOne, a subset of which were hand annotated for depression framing. DistilBERT was used to expand the framing labels to all posts, which were then inductively clustered according to frame use patterns in time. Clusters were described by their topic composition based on the LDA topic model

Table 1
Mean odds ratios (OR) for the five clusters

Table 3
Labels for all posts
living it more fully. My depression is non-existent now and I can live in a happy mood And after years and years of struggle I figured it out. I'm letting go of all my fears and redefining everything. I'm rewriting my story I had depression since I started to work here. And It is getting worse. When do you know that your job isn't worth the stress anymore? at

Table 4
The temporal characteristics of the most active forum users in each cluster