1 Introduction

Technology is now an important component of the healthcare ecosystem, and both computer and management science are playing a greater role in analyzing data to measure efficiencies (Chaudhry et al. 2006; Harrison et al. 2007; Holden and Karsh 2010). The digital health industry is maturing, and as a result, highly tailored evidence-based behavior change programs are increasingly available to consumers through pharmaceutical companies, non-profit organizations, insurers, private corporations, and government entities.

A component of these programs is Digital Health Social Networks (DHSNs), otherwise known as bulletin boards or peer-to-peer support groups. While there are still no firm conclusions on how to determine their efficacy (Eysenbach et al. 2004; Graham et al. 2015), the general consensus is that social support and knowledge sharing increase patient education, enhance self-management, and decrease the burden on existing health services (Bender et al. 2013; Brennan et al. 1995; Cobb et al. 2011; Conrad et al. 2016; Ploderer et al. 2013; Takahashi et al. 2009; Wicks et al. 2012; Wright 2002).

Hypothetically, an ideal DHSN would consist of members who are equally engaged. In reality, network participation is unequal. Beyond counting a network’s actors and posts, however, there are few metrics that can be used to identify participation inequality.

To address this issue, some research has sought to define actor roles (Carron-Arthur et al. 2015; Cleary and Stanton 2015; Cunningham et al. 2008; Jones et al. 2011; Selby et al. 2010; van Mierlo et al. 2012). By systematically categorizing participants, taxonomies can give insight into how various actors in complex networks function in relation to one another.

Other research has explored network topologies. Some of these studies have employed traditional social network analysis, a method that focuses on nodes, ties, density of relationships, and degree centrality (Cobb et al. 2010; Urbanoski et al. 2016). Other streams have examined marketing rules of thumb (van Mierlo 2014), latent semantic analysis (Myneni et al. 2013), natural language processing (Wang et al. 2015), or the phenomenon of power laws (Carron-Arthur et al. 2014; van Mierlo et al. 2015).

As DHSNs shift into mainstream healthcare delivery it will be important to develop metrics that help managers assess growth, sustainability, and participation equality (Healey et al. 2014; Stearns et al. 2014). While the quality of DHSN content is important and is rooted in behavioral science, quantitative methods to analyze the health of a network may come from established theoretical constructs in economics and computer science.

Through measuring 222 quarters of participation from 15,181 actors from four separate DHSNs, this paper investigates whether the Gini coefficient, an economic measure of statistical dispersion traditionally used to measure income distribution in populations, can be employed as a management tool to help measure inequality of member participation over time.

1.1 The Lorenz curve

In economics, the Lorenz curve is a popular method of illustrating income distribution (Lorenz 1905). Graphically, the y-axis represents the cumulative percentage of income in an economy, and the x-axis represents the cumulative percentage of households, ordered from lowest to highest income (Fig. 1).

Fig. 1 A Lorenz curve

In a perfectly equal economy, all citizens share the same income, and the Lorenz curve would resemble the red line in Fig. 1, where y = x. Most economies are not equal. In Fig. 1, the Lorenz curve (blue line) illustrates economic inequality. Here, the lowest-earning 20 % of households receive approximately 1 % of income, the lowest 40 % receive 3 %, the lowest 60 % receive 8 %, and the lowest 80 % receive 25 %.
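As a rough illustration of how the points on such a curve are obtained, the following Python sketch sorts a list of incomes and accumulates their shares. The function name `lorenz_points` and the five income values are hypothetical; the values were chosen only so that the cumulative shares resemble those read from Fig. 1, and they are not data from this study.

```python
def lorenz_points(values):
    """Return (cumulative population share, cumulative value share) pairs."""
    ordered = sorted(values)  # lowest values first
    total = sum(ordered)
    if not ordered or total <= 0:
        raise ValueError("need at least one value and a positive total")
    n = len(ordered)
    points = [(0.0, 0.0)]  # every Lorenz curve starts at the origin
    running = 0
    for i, value in enumerate(ordered, start=1):
        running += value
        points.append((i / n, running / total))
    return points


# Hypothetical incomes for five equally sized household groups.
incomes = [1, 2, 5, 17, 75]
for households, income in lorenz_points(incomes):
    print(f"lowest {households:.0%} of households: {income:.0%} of income")
```

If every value in the list were identical, each point would fall on the line y = x, which corresponds to the line of perfect equality.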

1.2 The Gini coefficient

Developed by Corrado Gini in 1912 (Gini 1912), the Gini coefficient is an inequality measure based on the Lorenz curve. Specifically, it measures the distance between the Lorenz curve and perfect equality (Bellu and Liberati 2006). A Gini coefficient of 1 represents an economy where a single individual generates all income, whereas a Gini coefficient of 0 represents an economy where all citizens share the same income.

If translated to DHSNs, a Gini coefficient of 1 would represent a network where one individual created all posts. Alternatively, a Gini coefficient of 0 would represent a social network where all members authored the same number of posts.
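The two extremes can be checked with a short Python sketch. The function below uses one common rank-based formula for the Gini coefficient of a finite sample; it is an assumption for illustration and is not necessarily the exact computation used in this study. The function name `gini` and the post counts are hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 means perfect equality)."""
    ordered = sorted(values)  # ascending, so rank 1 is the smallest value
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    weighted = sum(rank * v for rank, v in enumerate(ordered, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


print(gini([10, 10, 10, 10]))  # every member authors the same number of posts
print(gini([0, 0, 0, 40]))     # one member authors every post
```

With this formula, the first call returns 0.0 and the second returns 0.75. In a finite population of n members the maximum is (n - 1) / n, so the value approaches 1 only as the network grows large.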

To our knowledge, Gini coefficient numerical and visual outputs have yet to be applied to assess participation inequality in DHSNs. However, the method has been utilized in other studies.

For example, a 2005 study accessed U.S. census data to measure Personal Computer (PC) ownership inequalities amongst whites and African Americans at national, regional, and state levels (Chakraborty and Bosman 2005). Results indicated that, although decreasing overall, PC ownership inequality was substantially smaller among white households. A strength of the study was the use of the Lorenz curve and Gini coefficient to graphically illustrate variations in income and PC ownership within and between the two groups.

Gini coefficient numerical scores and their visual representations were also utilized in a 2002 Statistics Canada research study depicting the digital divide, or cumulative internet usage amongst differing household income deciles (Sciadas 2002). Results indicate that as time progresses, the Gini coefficient is decreasing and the digital divide is closing. However, graphical outputs show that the shift is mainly attributable to middle-income groups, while lower-income and upper-income groups remain fairly stable.

Although there are other uses in research, a final example is a 2010 study analyzing university rankings. This study employed the Gini coefficient to assess whether academic institutions were becoming increasingly unequal (Halffman and Leydesdorff 2010). Ranking data mainly consisted of weighted contributions of total number of publications and citations, and findings indicate that contrary to popular belief, the 500 universities that publish at greatest frequency were becoming more equal in terms of output.

2 Methods

Data were extracted from the DHSNs of four interventions: AlcoholHelpCenter.net (AHC), DepressionCenter.net (DC), PanicCenter.net (PC), and StopSmokingCenter.net (SSC). All four digital health programs are based on state-of-the-art best practice guidelines, and have been extensively evaluated in the literature (Cunningham 2009, 2012; Cunningham et al. 2006a, b; Cunningham and van Mierlo 2009; Cunningham et al. 2009, 2010; Davis 2007; Doumas et al. 2009; Farvolden et al. 2003, 2005, 2009; McDonnell et al. 2011; Rabius et al. 2008). Table 1 outlines study duration, number of participants, and other descriptive data.

Table 1 Intervention and DHSN characteristics

For the study duration, a staff of trained moderators monitored and maintained the four networks. Key moderator roles included ensuring compliance with network policies and user agreements, answering actor questions, guiding discussions when deemed appropriate, and offering technical assistance. For the purpose of analysis, moderator posts were removed from the dataset.

For each network, data sets were divided into annual quarters. Within each quarter, the total number of posts by each unique actor was calculated, and actors were ranked according to level of contribution. Next, the population in each quarter was divided into quintiles. Finally, Gini coefficients for each quarter were calculated. Pearson correlations and linear regression were used to assess the strength of relationships between Gini coefficient, actors, and posts.
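The quarterly procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the input format (one `(actor_id, quarter_label)` pair per post), the function names, the rank-based Gini formula, and the handling of quintiles whose sizes do not divide evenly are all hypothetical choices.

```python
from collections import Counter, defaultdict


def gini(values):
    """Rank-based Gini coefficient of non-negative values."""
    ordered = sorted(values)
    n, total = len(ordered), sum(ordered)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    weighted = sum(rank * v for rank, v in enumerate(ordered, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def quintile_shares(counts):
    """Share of posts contributed by each fifth of actors, lowest first."""
    ordered = sorted(counts)  # actors ranked by contribution
    n, total = len(ordered), sum(ordered)
    shares = []
    for q in range(5):
        lo, hi = q * n // 5, (q + 1) * n // 5  # integer cut points
        shares.append(sum(ordered[lo:hi]) / total)
    return shares


def quarterly_summary(posts):
    """posts: iterable of (actor_id, quarter_label) pairs, one per post."""
    per_quarter = defaultdict(Counter)
    for actor, quarter in posts:
        per_quarter[quarter][actor] += 1  # posts per unique actor
    summary = {}
    for quarter in sorted(per_quarter):
        counts = list(per_quarter[quarter].values())
        summary[quarter] = {
            "actors": len(counts),
            "posts": sum(counts),
            "gini": gini(counts),
            "quintile_shares": quintile_shares(counts),
        }
    return summary


# Hypothetical post log; moderator posts are assumed to be removed already.
sample_posts = (
    [("a", "2010-Q1")] * 1 + [("b", "2010-Q1")] * 1 + [("c", "2010-Q1")] * 2
    + [("d", "2010-Q1")] * 2 + [("e", "2010-Q1")] * 4
    + [("a", "2010-Q2")] * 3 + [("f", "2010-Q2")] * 3
)
for quarter, row in quarterly_summary(sample_posts).items():
    print(quarter, row["actors"], row["posts"], round(row["gini"], 3))
```

Each row of the returned summary corresponds to one quarter of one network, which is the unit of analysis for the correlations and regressions that follow.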

All data collection policies and procedures adhered to international privacy guidelines (European Parliament and of the Council on the Protection of Individuals with Regard to the Processing of Personal Data and on the Free Movement of Such Data 2002; Office of the Privacy Commissioner of Canada 2000; US Department of Health and Human Services 2003) and were in accordance with the Helsinki Declaration of 1975, as revised in 2008 (World Medical Association 2008). The study was consistent with University Research Ethics Committee procedures at Henley Business School, University of Reading, and was exempt from full review.

3 Results

Table 2 displays the relationships between each quarter’s Gini coefficient, number of actors, and number of posts. For each network, quarterly fluctuations in the Gini coefficient were graphed (Figs. 2, 3, 4, 5).

Table 2 Actors, posts, Gini coefficient

Fig. 2 AHC Gini coefficient over 41 quarters

Fig. 3 PC Gini coefficient over 56 quarters

Fig. 4 DC Gini coefficient over 52 quarters

Fig. 5 SSC Gini coefficient over 52 quarters

Summary statistics were computed to identify means and standard deviation for each network’s number of actors, number of posts, and Gini coefficient (see Table 3).

Table 3 Summary statistics

The DHSN with the fewest actors in any single quarter was the DC (n = 8), and the SSC was the DHSN with the greatest number of actors in any given quarter (n = 2323). The mean number of actors per quarter varied across networks (40.8–304.7), as did the standard deviation (19.6–347.3).

The AHC and DC had the fewest posts in any single quarter (n = 29), and the SSC was the DHSN with the greatest number of posts in any given quarter (n = 28,684). The mean number of posts per quarter varied across networks (405.7–9002.8), as did the standard deviation (308.6–9049.5).

With regard to the Gini coefficient, three of the four DHSNs had at least one quarter at the highest observed level of inequality (0.37). Interestingly, each of the four social networks had quarters with similar lowest levels of inequality (0.15–0.19). The range of the Gini coefficient varied slightly (0.15–0.22); however, the mean (0.287 and 0.332) and standard deviation (0.032–0.058) did not.

Pearson correlations were then calculated to compare number of actors, posts, and Gini coefficient (see Table 4).

Table 4 Pearson correlations between Gini coefficient, actors, and posts

Pearson correlations showed positive and statistically significant relationships between number of actors and number of posts (r = 0.527–0.835, p < .001), and between Gini coefficient and number of posts (r = 0.342–0.725, p < .001). However, the relationship between Gini coefficient and number of actors was positive and statistically significant only for the addiction networks (r = 0.619 and 0.276, p < .036).
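For readers unfamiliar with the statistic, a Pearson correlation between two quarterly series can be computed as in the Python sketch below. The function name `pearson_r` and the sample series are hypothetical and are not values from Table 2; significance testing is omitted.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length (at least 2 values)")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical quarterly values for one network (illustrative only).
actors = [20, 35, 50, 42, 60, 75]
posts = [150, 400, 900, 500, 1200, 2000]
ginis = [0.21, 0.25, 0.30, 0.26, 0.31, 0.35]
print("actors vs posts:", round(pearson_r(actors, posts), 3))
print("Gini vs posts:  ", round(pearson_r(ginis, posts), 3))
print("Gini vs actors: ", round(pearson_r(ginis, actors), 3))
```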

Multiple regressions were then computed to examine the association between the Gini coefficient (dependent variable) and the independent variables, number of actors and number of posts (see Table 5).

Table 5 Linear regression

Linear regression models had mixed R² results (0.333–0.527). In all four regression models, the association between Gini coefficient and posts was statistically significant (t = 3.346–7.381, p < .002). However, in contrast to the Pearson correlations, the relationship between Gini coefficient and number of actors was only statistically significant in the two mental health networks (t = −4.305 and −5.934, p < .001).
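A regression of this form, with the Gini coefficient as the dependent variable and actors and posts as predictors, can be sketched with ordinary least squares. The closed-form solution below is one standard way to fit two predictors and is offered as an assumption for illustration; the statistical package used in the study is not stated here. The function name `fit_two_predictor_ols` and the sample data are hypothetical, and t statistics are omitted.

```python
def fit_two_predictor_ols(x1, x2, y):
    """Fit y = b0 + b1*x1 + b2*x2 by least squares; return (b0, b1, b2, r2)."""
    n = len(y)
    if not (len(x1) == len(x2) == n) or n < 3:
        raise ValueError("need three equal-length series (at least 3 values)")
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]  # centered predictors and response
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 * s12
    sst = sum(c * c for c in dy)
    if det == 0 or sst == 0:
        raise ValueError("collinear predictors or constant response")
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    sse = sum((yi - (b0 + b1 * a + b2 * b)) ** 2 for a, b, yi in zip(x1, x2, y))
    return b0, b1, b2, 1 - sse / sst


# Hypothetical quarterly values for one network (illustrative only).
actors = [20, 35, 50, 42, 60, 75]
posts = [150, 400, 900, 500, 1200, 2000]
ginis = [0.21, 0.25, 0.30, 0.26, 0.31, 0.35]
b0, b_actors, b_posts, r2 = fit_two_predictor_ols(actors, posts, ginis)
print(f"intercept={b0:.4f} actors={b_actors:.5f} posts={b_posts:.6f} R2={r2:.3f}")
```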

With regard to collinearity, tolerance values were all above 0.10 (0.303–0.723) and variance inflation factors were low to moderate (1.384–3.297), indicating that multicollinearity was not a concern. However, the Durbin–Watson statistics (0.312–1.638) did not approach a score of 2.0, indicating the presence of autocorrelation.
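The diagnostics above follow from simple formulas, sketched here in Python. With two predictors, tolerance equals 1 minus the squared correlation between them, and the variance inflation factor is its reciprocal; the reported ranges are consistent with that reciprocal relationship up to rounding. The Durbin–Watson statistic is computed from consecutive regression residuals. The function names and the residual series are hypothetical.

```python
def tolerance_and_vif(r):
    """Tolerance and VIF for two predictors whose correlation is r."""
    tolerance = 1 - r ** 2
    if tolerance <= 0:
        raise ValueError("predictors are perfectly collinear")
    return tolerance, 1 / tolerance


def durbin_watson(residuals):
    """Durbin-Watson statistic: near 2 suggests no first-order autocorrelation."""
    denominator = sum(e ** 2 for e in residuals)
    if len(residuals) < 2 or denominator == 0:
        raise ValueError("need at least two residuals that are not all zero")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    return numerator / denominator


# Reported tolerance bounds and the VIF values they imply.
for tolerance in (0.303, 0.723):
    print(f"tolerance {tolerance} implies VIF of about {1 / tolerance:.2f}")

# Hypothetical residuals that drift slowly, as successive quarters often do.
print(round(durbin_watson([1.0, 1.0, 1.0, -1.0, -1.0, -1.0]), 3))
```

Values well below 2, as in the drifting example, point to positive autocorrelation, which is plausible when each observation is a quarter that closely resembles the one before it.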

4 Discussion

The results of this study generate several unique contributions to DHSN research, all of which require further investigation.

4.1 Detecting shifts in social network inequality

From both visual and quantitative perspectives, the Gini coefficient was effective in identifying shifts and trends in inequality. However, as a standalone metric, the Gini coefficient can be deceptive, as a coefficient of 0.33 can be calculated from a network of 29 actors who created 321 posts (see Table 2, AHC period 10-Q2), or from a network of 119 actors who created 2867 posts (see Table 2, SSC period 10-Q2).

Future research in developing metrics to determine social network inequality should incorporate the Gini coefficient, but also test ratios pertaining to number of posts per actor, number of posts per Gini score, number of actors to Gini score, or various combinations.
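As a small illustration of one such ratio, the Python sketch below computes posts per actor for the two quarters cited above, using the figures reported in the text. The function name `posts_per_actor` is hypothetical, and the ratio is shown only as an example of a companion metric, not as a validated measure.

```python
def posts_per_actor(actors, posts):
    """Average number of posts contributed by each actor in a quarter."""
    if actors <= 0:
        raise ValueError("a quarter needs at least one actor")
    return posts / actors


# (label, actors, posts, Gini coefficient) as reported for the two quarters.
quarters = [
    ("AHC 10-Q2", 29, 321, 0.33),
    ("SSC 10-Q2", 119, 2867, 0.33),
]
for label, actors, posts, gini in quarters:
    ratio = posts_per_actor(actors, posts)
    print(f"{label}: Gini {gini:.2f}, {ratio:.1f} posts per actor")
```

Although both quarters share a Gini coefficient of 0.33, the averages differ by more than a factor of two (roughly 11 versus 24 posts per actor), which the coefficient alone does not reveal.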

4.2 Use of the Gini coefficient as a management tool

During the study, informal in-person interviews were conducted with moderators and other management. The visual depictions of quarterly shifts in the Gini coefficient were deemed particularly helpful when recalling the effects of technical outages, policy changes, management issues and techniques, and the dynamics of personalities within groups of actors.

In these informal meetings, discussion often centered on the status and intensity of participation from Superusers, actors who generate the greatest amount of network externalities (van Mierlo et al. 2012). Strategies to increase equality and engage non-Superusers were also discussed. Future research may investigate the value of utilizing the Gini coefficient as a tool to assist with network management.

4.3 Insights into social network utility and function

While there were consistent and statistically significant associations between Gini coefficient and number of posts in both the Pearson correlations and the regressions, the associations between Gini coefficient and number of actors were inconsistent. Notably, the pattern split along clinical lines: results were similar within the two addiction DHSNs (AHC and SSC) and within the two mood disorder DHSNs (DC and PC), but differed between the two groups. This may lend insight into quantifying the utility and function of DHSNs that promote different therapeutic approaches.

For example, the AHC and SSC interventions focus on addictions, and the theoretical approaches to treatment in these programs are the Stages of Change (Prochaska et al. 2008) and Structured Relapse Prevention (Sanchez-Craig 1995). Both treatment approaches are designed to help users develop coping skills for strong, yet relatively short-term cravings. This is reflected in program content, which consists of short exercises that offer brief, yet tailored feedback on how to deal with addiction issues in specific situations. Prior research indicates that the content and tone within the interventions’ DHSNs reflect this.

A 2010 mixed-methods study on the SSC social network (Selby et al. 2010) analyzed demographic and smoking characteristics for both actors and non-actors, and qualitative analyses were conducted to explore themes in message content. Results indicated that the most frequent first posts were from new actors who were struggling with their quit attempts, and 90.6 % of responses to these posts were from experienced actors. The authors concluded that social support in the network was particularly beneficial to the many new actors who needed short-term, time-sensitive assistance.

Conversely, the two mental health interventions (DC and PC) are heavily focused on Cognitive Behavioral Therapy (CBT) (Herbert and Forman 2011). At the core of each intervention are nine sessions that are designed to take a minimum of 9 weeks to complete. Members are given homework between each session, which involves intensive experiments, journaling, and self-reflection.

A 2013 mixed-methods doctoral thesis on the DC social network (Sugimoto 2013) found that DC actors generally sought informational support, emotional support, coaching support, and social companionship over long periods of time. Actors providing emotional support created an average of 8.3 posts, and those actors seeking support created an average of 5 posts. The author concluded that actor participation and social support in the network was long-term, and contributed to the everyday lives of actors.

4.4 Future directions

Future research may investigate potential relationships or patterns between Gini coefficient and number of actors in DHSNs leveraging differing therapeutic approaches. While this study focused on calculating the Gini coefficient over annual quarters, future research may experiment with calculating the Gini coefficient over shorter or longer time periods.

4.5 Strengths and limitations

The main strengths of this paper were the number of study participants, the longevity of the DHSNs, the number of posts, and the inclusion of four separate indications, with half of the social networks focused on mental health and the other half on addictions. We are not aware of any other study in the healthcare literature with such an extensive and complete data set.

Both a strength and limitation is that the populations analyzed are self-selecting populations that actively sought help. In the context of this study, it was helpful to have data sets of active and engaged participants. However, these results will not be indicative of populations of patients in health plans, hospital networks, or mass public health campaigns.

5 Conclusion

The Gini coefficient is helpful in measuring shifts in DHSN inequality. However, as a standalone metric, the Gini coefficient may be misleading, as it does not indicate optimal numbers or ratios of actors to posts, or effective network engagement.

From a management perspective, the Gini coefficient may be leveraged as a tool to assist moderators in detecting trends, or as a training tool to help explain how network inequality fluctuates.

The results of this study may support prior mixed-methods research on two of the four social networks (Selby et al. 2010; Sugimoto 2013), which found differences in social network utility, functionality, and tone.

Further research investigating quantitative scoring techniques, performance metrics, and optimization benchmarks is required.