Keywords

1 Introduction

Innovations in information and communication technologies, and more specifically the extensive proliferation of social media tools, have proved capable of contributing greatly to “open government” by providing not only forums for disseminating public information, but also avenues for stakeholder participation [1]. As the influence of social media grows, inquiry into who takes the lead in information sharing and, more importantly, into the patterns of information exchange among these users is becoming more prevalent. The data from online social networks therefore afford fresh opportunities and views regarding the creation, influence and evaluation of large-scale social networks and communities. In recent times, social networks have evolved into influential platforms for human communication, conducting business, information sharing and various other facets of everyday life [33].

Social media in general has received substantial attention from researchers across the information systems and marketing disciplines, as social media affects a range of stakeholders. A review of 132 social media research papers published between 1997 and 2017 shows that the literature ranges mostly from the examination of social media behaviour to its marketing possibilities and resultant organisational impact [37]. Social media’s information sharing and exchange capabilities are also unanimously acknowledged. However, there is a particular cluster of studies which distinguishes its efficiency during important events [37]. This paper falls into this cluster and, in line with earlier work, selects Twitter for analysis from among the available social media information sources [37,38,39,40].

In particular, this study makes the following practical and theoretical contributions. Firstly, whereas most studies utilise social exchange theory, network theory and organisation theory [37], this study uses social network theory but adds a graph-theoretical lens, a less frequently used theory, to enrich the existing body of knowledge. Secondly, it adds to the existing group of studies which focus on social media research during an important event: in this case, the emergence of a new idea (the South African National Health Insurance Bill hand-over to parliament - #NHI). Thirdly, it considers both the network topology and the behaviour of network actors in the comprehensive #NHI networked system. This study therefore not only undertakes the predominant behavioural examination of social data, but also combines knowledge of network structure and behaviour, which is substantially distinct from the “straight-forward analysis” of single limited graphs [11, 12]. The fundamental research objectives of this study are:

  1. To model the emergence of a new idea (i.e. the South African National Health Insurance Bill announcement in parliament - #NHI), based on Twitter users and their interactions on Twitter.

  2. To determine the key influential actors in Twitter as social network in the #NHI conversation.

  3. To describe the degree distributions of the relationships between Twitter users in the #NHI Twitter conversation.

The remainder of the paper is structured as follows. The next section briefly reviews the broader social media literature and how this study is positioned therein, followed by the methodological framework of this inquiry, with a brief discussion of the #NHI case. The subsequent section describes the data collection and analysis of this inquiry. This is followed by the results of the SNA of the #NHI case. The paper concludes with a discussion of the findings and their implications for using social media to offer not only public information dissemination platforms, but also stakeholder participation and interaction opportunities.

2 Literature Review

Social media has developed beyond mere platforms for socialisation or virtual congregation, to being acknowledged for its ability to encourage aggregation. Similarly, information systems are developing beyond organisational boundaries to become part of the larger societal context, necessitating strategic information system research to explore the competitive setting of dynamic social systems. Literature on social media has abounded in recent years, yet an agreed definition of the concept remains elusive [37]. In this study, social media is defined as a collection of user-defined platforms which allow for information interactivity and diffusion between users of open platforms, and which enable them to expand social relationships with their social networks [2, 9, 10, 34, 37]. Social media literature has been synthesised into twelve clusters [37]: (1) Social media usage, behaviours and consequences; (2) Reviews and recommendations on social media platforms; (3) Organisational impact of social media; (4) Social media for marketing; (5) Participation in social media communities; (6) Social media risks; (7) Stigmatisation of social media usage behaviour; (8) Value creation through social media; (9) Social media during an important event; (10) Support-seeking through social media; (11) Social media in the public sector; and (12) Traditional/social media divide.

From these clusters, it is evident that clusters one to eight have received considerable attention in information systems research. Cluster nine is where this paper is situated, as indicated in the introduction. However, it could also be argued that there is an overlap with cluster 11, as the event studied for the purpose of this paper falls within the realm of the public sector. Little research has been carried out lately in cluster 12, which could be the result of the widespread acceptance of social media beyond the traditional media age.

This inquiry therefore draws on the field of social network analysis (SNA), focusing on the connections and the exposure of the relations between networks and actors. SNA models “relations and associations, developments and associations and dynamic forces in networks and activities on social media platforms” [2]. Although SNA has been used mainly in the social and behavioural sciences [3, 4], more recently it has also been applied to “more complex areas” including economics, business and medicine [5]. SNA is also regarded as a group of theories, practices and instruments [6]. It is typically rooted in three main beliefs [7]: (1) Networks’ structure and characteristics influence system performance; (2) Actors’ position in a network impacts their behaviour; and (3) Actors’ behaviour is in conformity with their network environment.

Moreover, in this study, SNA (based on graph theory) is proposed to facilitate the identification of social networks consisting of nodes, in which actors are linked to each other through their shared ideas, values, visions, social contacts and disagreements. This study argues that when social networks are successful, they have a wider societal impact, which can affect programmes, projects, policies, strategies and partnerships (including their design, implementation and results) [8] through access to human, social and financial information [5]. Social media has therefore become central to civil society discourse: a platform where public debate and disputes, as well as knowledge exchange, occur. As the public pulpit, social media exchanges are as important to note as any other large public gathering. Network maps of public social media discussions in services like Twitter can provide insights into the role social media plays in our society. These features, and the size of online social networks, make SNA central to addressing many problems globally. The increased prevalence of activity among social media users allows people across the globe to be more connected than ever before [13].

3 Methodology

3.1 Methodological Framework

This quantitative inquiry follows an instrumental, single case study design. In an instrumental case study, the case (#NHI) is selected because it represents some other issue under investigation (i.e. social network analysis) and can provide insights into that issue [14]. However, as a case study design could be regarded as a rather loose design, the methodological choices are addressed in a principled manner [15]. These choices are outlined in Table 1:

Table 1. Methodological considerations and choices for this inquiry

The following section offers a brief overview of the South African National Health Insurance case, followed by an overview of NodeXL Pro as SNA tool applied for sampling, data collection and data analysis in this inquiry.

3.2 The Case: Announcement of the South African NHI Bill in Parliament

The functioning of the South African two-tier health system (public and private) has long been deteriorating. Politicians attributed the decline, specifically in the public health sector, to countless problems (claiming no responsibility). “However, the real reasons place the blame firmly at their door” [17]. In December 2015 the South African Government’s White Paper on the NHI was announced, which proposed a single, compulsory medical aid scheme that would cover all South African citizens and permanent residents, with private medical schemes being reduced to “complimentary services” [16]. On 8 August 2019, the highly anticipated and controversial South African National Health Insurance Bill was unveiled in parliament by the Health Minister, Zweli Mkhize. The Bill proposes that the government will provide a package of comprehensive health services for free at both private and public health facilities, in its bid for more equitable access to quality healthcare [17]. Since the introduction of the NHI Bill in the South African parliament, however, an “enormous amount” [18] of commentary, analysis, interpretation and trepidation has played out across media platforms and society. Within hours of the unveiling of the NHI Bill, Twitter users were actively tweeting, with #NHI as the most popular hashtag.

3.3 NodeXL Pro for SNA

To address the research questions, this inquiry conducted an SNA using NodeXL Pro, the licence-based software developed by the Social Media Research Foundation. NodeXL (Network Overview for Discovery and Exploration in Excel) comes in two versions, NodeXL Basic and NodeXL Pro, which enable social network and content analysis [19]. It is a well-structured workbook template in Microsoft Excel, consisting of the multiple worksheets required to denote a network graph. An ‘edge list’ denotes the network relationships (named ‘graph edges’) and contains all the pairs of entities which are linked in the network. The template also includes matching worksheets with information about each cluster and vertex [2]. The visualisation features of the NodeXL software can illustrate various network graph representations, and can map data attributes to visual aspects such as shape, size, colour and location [20].

NodeXL Pro offers more advanced features, building on those in NodeXL Basic. These features include, inter alia [19], advanced network metrics, content analysis, sentiment analysis, time series analysis, text analysis, top items and, most importantly, access to the application programming interfaces (APIs) of various social networks. For this study, only network visualisation, social network APIs, the data import and export functions and SNA were used.

3.4 Data Collection and Analysis

Data Description and Dispersion.

For the purpose of this inquiry, the Twitter data was imported on 20 August 2019 through NodeXL Pro’s Twitter Importer, which passes a query (in this case #NHI) to the Twitter API focusing on relevance, not completeness [21].

The mined #NHI data are then automatically entered into the NodeXL Pro Excel template as edges and vertices. Edges and vertices are central concepts in network theory [22], one of the theories grounding this inquiry. Firstly, ‘edges’ (also termed ‘links’, ‘ties’, ‘connections’ or ‘relationships’) involve social interactions, organisational structures, physical proximities or abstract connections (for example hyperlinks). Secondly, ‘vertices’ (also termed ‘agents’, ‘nodes’, ‘items’ or ‘entities’) can include individuals, locations, events, social structures and content (for example keyword tags, videos or web pages) [23]. From a network theory perspective, an edge therefore links two vertices in the social network [24].
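As a minimal sketch of this representation, the following Python fragment stores a directed edge list and derives the vertex set from it. The account names and relationship labels are hypothetical, and the convention of storing a tweet that mentions nobody as a self-loop is an assumption made for illustration rather than a description of the NodeXL Pro template.

```python
from collections import namedtuple

# A directed edge: `source` replied to or mentioned `target`.
Edge = namedtuple("Edge", ["source", "target", "relationship"])

# Hypothetical accounts and interactions, for illustration only.
edges = [
    Edge("alice", "minister", "mentions"),
    Edge("bob", "minister", "replies to"),
    Edge("alice", "bob", "mentions"),
    Edge("carol", "carol", "tweet"),  # assumed convention: a plain tweet is a self-loop
]

# The vertex set is every account that appears at either end of an edge.
vertices = sorted({e.source for e in edges} | {e.target for e in edges})

print(vertices)
print(len(edges))
```

Keeping the edge list as the primary structure and deriving the vertices from it mirrors the idea that an edge always links two vertices.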

Network Structure Analysis.

After the #NHI data had been captured, the next step was to analyse the network structure quantitatively and represent it visually. The network was presented visually using the Clauset-Newman-Moore clustering algorithm and the Harel-Koren Fast Multiscale layout algorithm to reduce the number of visible elements, so as to lessen the visual complexity of the graph [26, 27]. This improved intelligibility and, at the same time, sped up layout and interpretation [25]. The next step in the social network analysis involved the calculation of the relevant network metrics for each of the vertices. For the purpose of this inquiry, the following metrics were calculated to describe the network structure of the gathered #NHI data.

One of the key aims of SNA is to find prominent, influential “players” in social media networks. Identifying the important vertices in a graph by ranking them according to the values of a network metric is called centrality [28]. As the #NHI network is directed, it calls for the calculation of both in-degree and out-degree centrality. The number of other accounts that have arrows pointing towards a Twitter account is known as its in-degree centrality. In this context, in-degree is regarded as a measure of popularity [29]. Out-degree centrality, in turn, refers to the number of arrows directed away from the Tweeter. The Tweeter with the highest out-degree is then referred to as the main influencer in the network.
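The two degree measures can be illustrated with a short sketch in Python. The edge list below is hypothetical, and the choice to count each distinct neighbour once and to ignore self-loops is an assumption of this sketch, not a statement about how NodeXL Pro counts.

```python
from collections import Counter

# Hypothetical directed edges (source, target): source mentions or replies to target.
edges = [
    ("alice", "minister"),
    ("bob", "minister"),
    ("carol", "minister"),
    ("alice", "bob"),
    ("alice", "carol"),
]

def degree_centrality(edge_list):
    """Return (in_degree, out_degree) counters over distinct neighbours."""
    # Assumption of this sketch: duplicates and self-loops are not counted.
    unique = {(s, t) for s, t in edge_list if s != t}
    in_degree = Counter(t for _, t in unique)    # arrows pointing towards an account
    out_degree = Counter(s for s, _ in unique)   # arrows pointing away from an account
    return in_degree, out_degree

in_degree, out_degree = degree_centrality(edges)
most_popular = in_degree.most_common(1)[0]   # highest in-degree
most_active = out_degree.most_common(1)[0]   # highest out-degree
print(most_popular, most_active)
```

In this toy network the account that is mentioned most and the account that mentions most are different, which is why the two measures are reported separately.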

From a social network theory perspective, another centrality metric that should be considered is betweenness centrality. Betweenness centrality is a measure of how often a given vertex lies on the shortest path between two other vertices [20]. The Tweeters with the highest betweenness centrality are referred to as the bridges in the network. Closeness centrality describes the mean distance between a vertex and every other vertex in the social network [2]. Presuming that vertices can only deliver messages to, or affect, their existing linkages (vertices), a low closeness centrality requires the Tweeter to be directly linked to, or “just a hop away” [20] from, the majority of other vertices in the social network. Eigenvector centrality (contrary to degree centrality) explicitly favours vertices that are connected to other well-connected vertices. The eigenvector centrality metric considers not only the number of connections a vertex has (its degree), but also the degree of the vertices to which it is connected [30]. Lastly, with NodeXL Pro, the clustering coefficient was calculated and the network was analysed using a community detection algorithm [31], which resulted in visible clusters. The results of the data analysis, and the discussion thereof, follow below.
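To make the notion of a ‘bridge’ concrete, the sketch below computes unnormalised betweenness centrality for a small directed graph using Brandes' algorithm, a standard technique that is assumed here for illustration; the source does not state which algorithm NodeXL Pro uses. The four-vertex chain is hypothetical.

```python
from collections import deque, defaultdict

def betweenness_centrality(vertices, edges):
    """Brandes' algorithm for an unweighted directed graph (unnormalised)."""
    successors = defaultdict(list)
    for s, t in set(edges):          # ignore duplicate edges
        if s != t:                   # ignore self-loops
            successors[s].append(t)

    score = dict.fromkeys(vertices, 0.0)
    for source in vertices:
        # Breadth-first search that counts shortest paths from `source`.
        order = []
        predecessors = defaultdict(list)
        n_paths = dict.fromkeys(vertices, 0)
        distance = dict.fromkeys(vertices, -1)
        n_paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in successors[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Accumulate path dependencies from the farthest vertex back to the source.
        dependency = dict.fromkeys(vertices, 0.0)
        for w in reversed(order):
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    return score

# Hypothetical chain a -> b -> c -> d: only b and c lie between other vertices.
vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d")]
scores = betweenness_centrality(vertices, edges)
print(scores)
```

The end points of the chain score zero because no shortest path passes through them, whereas the two middle vertices each sit on two shortest paths.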

4 Results and Discussion

4.1 Prevalence and Patterns of #NHI Twitter Users

NodeXL Pro’s sophisticated ‘crawling’ (extraction of data) of ‘#NHI’ resulted in the mining of 4 112 tweets. The resultant data set of 4 112 tweets was “cleaned” by eliminating tweets which were not applicable to the tweet relationships vital to the study. The mined #NHI network contained 1 902 distinct vertices and 4 110 edges among them. The mined edges in this inquiry included original tweets, comments and mentions, and were all directed. Figure 1 illustrates the ‘overall graph’, showing the #NHI social network according to the Harel-Koren multiscale layout algorithm [32]. Fig. 1 is therefore a visual representation of the overall networked data generated by the #NHI Twitter users, and Table 2 provides a summary of the overall graph metrics of the case.

Fig. 1. Overall social media network structure of #NHI Twitter users

Table 2. Overall graph metrics of #NHI Twitter case (Source: NodeXL Pro version 1.0.1.419)

4.2 Influence and Network Analysis Results

This section reports on the internal connectivity and the size of the #NHI social network. It further reports on the characteristics of every vertex, based upon in-degree and out-degree, closeness, betweenness, and eigenvector centrality.

In-Degree and Out-Degree Centrality Results.

Tables 3 and 4 present the in-degree and out-degree centrality of #NHI.

Table 3. #NHI: In-degree centrality
Table 4. #NHI: out-degree centrality

The in-degree value of an account is the number of Twitter users that replied to or mentioned that account in the #NHI conversation. Based on the in-degree values generated by NodeXL Pro, the top 3 vertices had over 100 arrows pointing towards them. From highest to lowest, the 3 most popular accounts in this inquiry were: (1) the Minister of Health, with an in-degree of 215; (2) what appears to be a general citizen, “Leon”, with an in-degree of 164; and (3) a highly regarded South African investigative journalist, with an in-degree of 160. Therefore, the Minister of Health, Dr Zweli Mkhize, appears to be the most popular account in this inquiry. The remaining members of this social network occupy various “in-between” positions.

Popularity is not the only indication of impact in a social media network. For the purpose of this inquiry, the influential accounts (out-degree centrality) were also considered. Firstly, there were only 10 accounts that interacted directly with the Minister of Health on Twitter. However, when the Twitter accounts were ranked by out-degree, the top account was “Velloccerosso”, which appears from the Twitter account data to be a citizen. This indicates an influential account which is quite vocal and mentions many others in its discussion of #NHI. By referring to others, the authoring account draws them into the network, or engages with them again if they were already in the network. The out-degree of a Twitter account refers to its mentions in the network, i.e. the number of arrows pointing away from it, or the number of accounts it replies to or mentions. It is thus an indication of the attention which an account directs at others [30].

Closeness Centrality Results.

As indicated earlier in this paper, closeness centrality measures calculate the shortest paths between all nodes and then assign each node a score based on the sum of its shortest paths. This type of centrality is used to find the individuals who are best placed to influence the entire network most quickly. Closeness centrality can therefore help to find good ‘broadcasters’ in a social network. Of the 1 902 #NHI Twitter users, only 3.6% had a closeness centrality score of 1, whereas 93.3% had a score of 0. In this inquiry of #NHI, it can therefore be deduced that the network is complex but not well connected.
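One convention that yields scores of exactly 0 and 1 is to report closeness as the reciprocal of the summed shortest-path distances from a vertex to every vertex it can reach. The sketch below assumes this convention, and assumes that edge direction is ignored, purely for illustration; the source does not state the exact formula applied by NodeXL Pro, and this convention differs from the mean-distance description given in the methodology. All account names are hypothetical.

```python
from collections import deque, defaultdict

def closeness_centrality(vertices, edges):
    """Reciprocal of the summed shortest-path distances to reachable vertices.

    Assumptions of this sketch: edges are treated as undirected, and a vertex
    that reaches nobody scores 0.
    """
    neighbours = defaultdict(set)
    for s, t in edges:
        if s != t:
            neighbours[s].add(t)
            neighbours[t].add(s)

    scores = {}
    for start in vertices:
        # Breadth-first search gives shortest-path distances from `start`.
        distance = {start: 0}
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in neighbours[v]:
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
        total = sum(distance.values())
        scores[start] = 1 / total if total else 0.0
    return scores

vertices = ["a", "b", "c", "d", "e", "pair1", "pair2", "loner"]
edges = [
    ("a", "b"), ("a", "c"), ("a", "d"), ("a", "e"),  # a small star around "a"
    ("pair1", "pair2"),                               # two users who only talk to each other
    ("loner", "loner"),                               # a tweet that mentions nobody
]
scores = closeness_centrality(vertices, edges)
print(scores["pair1"], scores["loner"], scores["a"])
```

Under this convention an isolated account scores 0, two accounts that interact only with each other each score 1, and accounts inside a larger component receive a small fraction that shrinks as the component grows.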

Betweenness Centrality Results.

Table 5 represents the betweenness centrality results of the #NHI inquiry. This measure shows which #NHI Twitter users act as ‘bridges’ between vertices in the social media network, by identifying all the shortest paths and then counting how many times each vertex falls on one.

Table 5. Betweenness centrality

On Twitter, information spreads through relatively short paths. Consequently, those Twitter accounts that lie on short paths control the dissemination of information through the social media network. Twitter accounts that lie on many short paths thus have a high betweenness centrality and are considered influential information gatekeepers. In the #NHI case, the Twitter account with the highest betweenness centrality was that of the Minister of Health, followed by the journalist and then ‘Leon’, the Twitter users identified in the in-degree centrality discussion above. These three Twitter users can therefore be regarded not only as the most popular, but also as the most influential in the #NHI social network.

Eigenvector Centrality Results.

Eigenvector centrality is regarded as a “higher-level” type of centrality. With eigenvector centrality, a Twitter user with few connections could still hold a very high score. However, those few connections need to be very well linked themselves for the user to attain a high value. This implies that connecting to certain vertices is more beneficial than connecting to others. In the #NHI inquiry, the eigenvector centrality scores were notably low, implying insufficient evidence that connecting to some #NHI Twitter users is more beneficial than connecting to other users in the social network.
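The effect described here can be reproduced with a small power-iteration sketch in Python. The graph is hypothetical, edges are treated as undirected, and the iteration adds each vertex's own score at every step (an identity shift that avoids oscillation on tree-like graphs without changing the ranking); these are assumptions of the sketch rather than details taken from NodeXL Pro.

```python
import math
from collections import defaultdict

def eigenvector_centrality(vertices, edges, iterations=500, tolerance=1e-12):
    """Power iteration on the undirected adjacency matrix with an identity shift."""
    neighbours = defaultdict(set)
    for s, t in edges:
        if s != t:
            neighbours[s].add(t)
            neighbours[t].add(s)

    score = dict.fromkeys(vertices, 1.0)
    for _ in range(iterations):
        # Each vertex keeps its own score and adds the scores of its neighbours.
        updated = {v: score[v] + sum(score[w] for w in neighbours[v]) for v in vertices}
        norm = math.sqrt(sum(x * x for x in updated.values()))
        updated = {v: x / norm for v, x in updated.items()}
        change = sum(abs(updated[v] - score[v]) for v in vertices)
        score = updated
        if change < tolerance:
            break
    return score

# Hypothetical network: "e" has one connection (to the hub),
# while "far" has two connections, both to poorly connected accounts.
vertices = ["hub", "a", "b", "c", "d", "e", "p", "far", "q"]
edges = [
    ("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d"), ("hub", "e"),
    ("a", "p"), ("p", "far"), ("far", "q"),
]
scores = eigenvector_centrality(vertices, edges)
print(round(scores["e"], 3), round(scores["far"], 3))
```

Although "far" has twice as many connections as "e", it scores lower, because the single connection of "e" leads to the best-connected account in the network.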

4.3 Analytics and Visualisation

The layout of the sociogram in Fig. 2 is presented as groups. The groups cluster vertices by means of a chosen clustering algorithm, according to their relative network density. These clusters assist in combining groups of vertices (network users) that display high network density, that is, network users who exhibit high in-degree and/or out-degree centrality. It is also these network users who are considered to be network influencers. The groups further assist in clustering the network users with a lesser degree of network density and in setting them aside as isolated cases which are not significant in the visualisation of the clusters, mainly because they do not communicate with others in the network. For the purpose of this analysis and visualisation, the Clauset-Newman-Moore algorithm [35] was applied to display these vertices’ connections to each other. This algorithm uses modularity, a network property, to divide the network into communities.
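The modularity property mentioned above can be sketched as follows. The function computes Newman's modularity Q for a given division of an undirected, unweighted graph; it is only the quantity that a modularity-based algorithm seeks to maximise, not an implementation of the Clauset-Newman-Moore algorithm itself. The graph and community labels are hypothetical.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Newman's modularity Q for an undirected, unweighted graph without self-loops."""
    m = len(edges)
    internal = defaultdict(int)    # edges with both end points in the community
    degree_sum = defaultdict(int)  # summed degree of the community's vertices
    for s, t in edges:
        degree_sum[community_of[s]] += 1
        degree_sum[community_of[t]] += 1
        if community_of[s] == community_of[t]:
            internal[community_of[s]] += 1
    # Q = sum over communities of (share of internal edges - expected share).
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)

# Two hypothetical triangles joined by a single bridging edge.
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("x", "y"), ("y", "z"), ("x", "z"),
    ("c", "x"),
]
split = {"a": 1, "b": 1, "c": 1, "x": 2, "y": 2, "z": 2}
single = dict.fromkeys(split, 1)

q_split = modularity(edges, split)
q_single = modularity(edges, single)
print(q_split, q_single)
```

Dividing the graph into its two triangles yields a clearly positive Q, whereas placing every vertex in one community yields zero, which is why a modularity-maximising procedure would prefer the first division.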

Fig. 2. Groups of clusters and the direction of cluster communication of #NHI Twitter users

The groups were arranged in separate boxes in order to present the isolates in a separate group. NodeXL Pro then computes the clusters based on the parameters used in choosing the groups [36]. In the #NHI case, NodeXL Pro generated 78 groups. The resulting sociogram (Fig. 2) displays the clusters in a variety of colours in different boxes, with links to other clusters. The isolates are positioned in separate boxes at the top and bottom right-hand corners of Fig. 2. These isolates do not affect the overall visualisation, because they do not communicate within the network. That is also the reason why their connections are shown in a circular form in the figure. The communication between the groups should also be noted. In Fig. 2, the largest clusters are concentrated on the left-hand side, with references to many other nodes in the social network.

It could be argued that the primary limitation of this study is the restricted impact of the case. More specifically, this refers to the seeming lack of public interaction on social media regarding the #NHI case. This seeming lack of social media traction, more specifically on Twitter, might have been influenced by the many other big news events, both global and local, at the time. Information overload, specifically via social media, is moreover a reality that is not going to change in the near future. This could therefore present an opportunity for further research to explore the despondence of social media users during critical events, especially those which directly affect the individual.

5 Conclusions

Social media networks’ big data are among the most influential, yet they remain an under-researched phenomenon. In this inquiry, Twitter, a very widely used social media platform, was used to gather big data using the hashtag #NHI. This hashtag was chosen because of the controversies and uncertainties created among South Africans by the unveiling of the National Health Insurance Bill for discussion in parliament. This inquiry was grounded in graph and network theory in order to conduct a social network analysis of the national conversation on #NHI. This resulted in a visual graph model based on 4 112 #NHI tweets, posted by 1 902 Twitter users (vertices) and indicating 4 110 Twitter interactions (edges). The key influential actors in this SNA were the Minister of Health, the media and, to a limited degree, citizens of South Africa who will be affected by the bill. The degree distributions revealed that relationships between the major #NHI Twitter users were limited, as the majority of closeness and eigenvector centrality scores indicated low connectivity. This could indicate a lack of involvement of South African citizens in the public discourse around the NHI Bill, which will affect all South Africans. From the preceding discussions, it is clear that social media is inevitably central to modern-day society, with a widespread influence that cannot be refuted or disregarded. This paper demonstrated how big, real-time data from Twitter can be employed, using NodeXL Pro, to draw insights through social media metrics and visualisations. The paper further demonstrated NodeXL Pro as an enabler for harnessing big unstructured data, which are mass-produced on a daily basis. Moreover, through the use of appropriate analytic techniques, it enables inference from seemingly uncoordinated microblogs that may assist businesses and governments alike in decision-making. The #NHI case study further reinforces that emerging economies are part of the social media race.