1 Introduction

E-commerce has been booming in recent years. The pandemic drove an enormous uptick in e-commerce’s share of total retail spending around the world. The Global Commerce Forecast estimated that by 2025, digital shoppers will spend $7.391 trillion online. To put that in context, 10 years ago, total worldwide retail spending amounted to just a little over $16 trillion [32]. The pandemic has also brought new e-commerce opportunities to consumer goods. In e-marketplaces, customers are increasingly finding and sharing useful information about products, such as photos, recommendations, reviews, and opinions, to help others make buying decisions.

Sharing information about products and services is one of the greatest potentials of e-commerce, seen as a data driven type of business. Information systems provide tools that make suggestions for customized products or services, such as books, music, transportation or even people, based on information about products and users. Besides, online reviews are playing an increasingly important role in consumers’ online shopping decisions [25]. Reviews and recommendations are used in companies that deal with large amounts of information, like Google News, Twitter, LinkedIn, Netflix [75], Amazon.com and Alibaba [52], providing results of customized recommendations for products of interest based on historical customer data. When used effectively, this information provides suggestions to users based on matched preferences of other users, on the customer profile built from their previous purchases, or on the search history collected from websites.

Research indicates that these systems increase sales and consumer satisfaction [35]. Thus, even a small improvement in this type of system can yield larger revenues [19, 52] and minimize sale risks [54]. Accordingly, academics and companies keep researching ways to increase the effectiveness of these systems and provide users with better recommendations [41, 45, 51].

We can delineate two main ways of communicating suggestions in e-marketplaces [18]: recommendation systems and reviews (in the form of scores, ratings, etc.). Recommendation systems are important tools in e-commerce, as they provide users with machine-generated personalized recommendations and allow them to discover new products for the same purpose through simplified search [14, 51]. Although recommendation systems usually achieve good results, the core recommendation strategies must also be adapted to deal with unique user tastes, providing personalized recommendations, and to deal with trusted peer influence. Moreover, Filieri [34] states that consumer involvement and experience, as well as the type of website, affect the way consumers assess trustworthiness in online reviews.

Reviews are comments on products posted by users in e-marketplaces, and they have become increasingly important in the consumer’s decision-making process. Online product review websites help consumers make informed decisions about purchasing new products and have become a major driving force in new product sales, making effective e-marketing a critical success factor for new product launches [24]. For instance, e-marketplaces such as Amazon.com allow users to comment on the products available on the platform by providing feedback to other users about product attributes, quality, or performance [62]. Notably, about 93% of U.S. adults read reviews before making online purchases, suggesting that the vast majority of consumers understand the benefits of reliable reviews and would be motivated to write accurate ones themselves [86].

Awareness of reviews may influence the behavior of users because they assume that the criticisms of the respective products/services are written by the consumers themselves [76]. For this reason, users rely on messages from other users and prefer them to those created commercially [16]. There are a number of reasons why people read reviews, from getting information to building relationships in an online community or affecting other readers’ behavior [15, 46].

For e-retail managers, consumer reviews are a valuable asset [36], as demonstrated by the positive relationship between positive consumer reviews and willingness to pay [4], by increased sales [21, 39], and by the way managers proactively use and organize these “free assistants” for marketing initiatives [40], reducing sales costs on customer assistance and support.

The literature on rating and on the determinants of reviews and utilities has made gradual progress in designing and validating algorithms for calculating review utility scores and product ratings [28]. Given the great impact of product reviews on consumer purchases [64], companies can manipulate reviews to increase sales, posting favorable reviews and/or eliminating negative reviews [17]. For these reasons, it is relevant to study the helpfulness of the reviews.

In this research we develop a new theoretical framework to analyze and explain product rating and the perceived helpfulness of online customer reviews (the terms helpfulness and usefulness being used interchangeably) from a social network analysis perspective. We consider that the analysis of complex networks is an appropriate technique in this context, as it allows for capturing the true amount of reviews, not by their absolute number, but by measuring the importance (assessed by centrality measures) of the products reviewed. Centrality measures help us to identify nodes (products) that are important in a network, either because they have many connections or because they are located in strategic positions. These nodes might be key products in the network or have a disproportionate influence on the network’s behavior. In addition, by identifying the products that are important because they are central, we can make predictions about how a network might behave under different conditions. The more central a product is in the review network, the more important it will be. The network is originally a bipartite network [6], containing two types of nodes: reviewers and products. By projecting the bipartite network onto a one-mode network of products, it is possible to measure the products’ centrality. Network science has been used to deal with reviews and ratings, as we can see in the related work in Sect. 2.2. However, very little is known about the interplay between product rating, number of reviews and the helpfulness of the reviews in a network of products [83]. Our research aims to fill this gap by analyzing the importance of the products through the network centrality of reviews. So, one of the most innovative aspects of this work is the focus on the relationship between centrality and rating/helpfulness. Our motivation is that centrality may reflect an increase in rating and helpfulness of product reviews.

We focus on centrality measures as a way of capturing the importance of the products being reviewed and relate centrality to the helpfulness and the rating of the products reviewed. For that purpose, we use a data set openly available from Amazon.com [42], and generate a network of products of the category “musical instruments”, by linking products with posted reviews. The network is based on the principle that two products are linked if they are reviewed by the same reviewer. More than 2214 reviews that originated 5562 relationships between 717 different products have been analyzed using Machine Learning algorithms, such as Clustering and Regression Trees.

The paper is structured as follows: In Section 2, we discuss related work concerning reviews, ratings, and helpfulness and we describe the research questions and hypotheses. Section 3 presents the methodology, data set and introduces the first concepts of network science and the mathematical formulations of the main metrics of network centrality and modularity. Section 4 presents the analysis of the results and Sect. 5 contains a final discussion of the main findings with conclusions. The limitations of the study and future research challenges are also presented.

2 Theoretical framework

2.1 Customer reviews, helpfulness and rating

Most e-platform interactions involve a variety of—often-heterogeneous—entities, such as customers, vendors, and public/private institutions that generally have no relationship history. These relationships are built in the form of feedback—reviews, helpfulness and ratings.

2.1.1 Reviews

Online customer reviews provide new potential customers with relevant information about a product or service [3]. Online consumer reviews are popular sources of information about products and services: 72% of consumers aged 25–34 seek information for recommendations and opinions before buying goods and services (Mintel, 2015). Reviews can be defined as any comment on a product or service written by a consumer [34]. It has been empirically shown that the type of online consumer reviews assigned to a product significantly impacts its future sales.

According to [1, 2], customer reviews help customers to learn more about the product and decide whether it is the right product for them. Consumers rely more and more on reviews to assess product quality when making purchasing decisions, and the criticisms set forth are an unbiased reflection on product quality. The literature indicates that the quality, reliability, and helpfulness of reviews are critical factors in their impact on sales volume and that the negative effect of reviews is greater than the positive effect [24].

Thus, a large number of researchers agree that reviews influence decision-making processes and affect individuals’ behavior [3, 31, 48, 58]. However, several authors [37, 67] found that users give more importance to the criticisms written by real clients than to statistical summaries [16]. This finding highlights the importance of truthful and unbiased peer-to-peer information when consumers rely on reviews to make wise buying decisions.

Reviews also have an impact on advertising. Hollenbeck et al. [47] studied the relationship between online reviews and advertising spending in the hotel industry. They have combined a data set of TripAdvisor reviews with other data sets describing these hotels’ advertising expenditures, and show that online ratings have a causal demand-side effect on ad spending. Some researchers also stress that fake reviews can result in unfair competition, where a product’s ranking is artificially inflated or deflated [43], and the usefulness of online reviews is impeded by false reviews that give an untruthful picture of product quality [74].

Therefore, helpful reviews could be a signal of truthful reviews, as sincere consumers write reviews to share their experiences, either positive or negative, that help other consumers in their buying decisions [74]. This suggests that the measure of review helpfulness is of utmost importance in online marketplaces.

More often, decision-making is carried out within a social networking framework, in which individuals rely on the opinions and support of their closest friends or people with similar interests. For this purpose, the reviews are published in electronic portals that are intended to collect opinions to aid decision making [65].

However, the proliferation of reviews and the wealth of information available generates a great information overload [60], making it difficult for consumers to orient themselves and determine the most useful information for them. As useful reviews can increase sales [38], several e-commerce organizations allow consumers to vote on the helpfulness of each review, signaling to other consumers which reviews are the most useful for assessing the performance of the product.

In the following, we elaborate on the importance of the helpfulness of reviews, from both the customers’ and the companies’ points of view.

2.1.2 Helpfulness

Numerous studies have examined the determinants, outcomes and influencing factors of the helpfulness of online reviews. Kim et al. [51] looked at the association of different online product review features (i.e. review valence, length, pros and cons, helpfulness, authorship, and product recommendation) with purchase probabilities and offered theoretical contributions to the literature on information processing, as well as managerial insights regarding how advertisers can use reviews and how firms should manage their online recommendation systems to better serve existing and potential consumers.

A helpful review reveals the diagnosis, i.e., the ability for other consumers to better understand the quality and performance of the product or service [50]. The measure of helpfulness plays a critical role in the review and recommendation [38], and its importance arises from the fact that a popular product usually has many reviews for the consumer to read. Therefore, assessments need to be classified and recommended for consumers. For example, Amazon.com asked readers to vote on the helpfulness of product reviews, with the ultimate goal of influencing consumer decisions by offering more useful reviews. According to Yang et al. [87], Amazon.com raises profits annually by $2.7 billion with this question: “Was this review useful?”.

Therefore, the helpfulness of a review relies on reliable and unbiased customer-based information, helping consumers in the online buying decision-making process. From a business point of view, it is important to implement a scale to classify the helpfulness of user assessments to understand their perceptions of products and/or services [7].

Other studies have examined how the features of online consumer reviews influence the level of usefulness or helpfulness (or utility) of online reviews. Mudambi and Schuff [65] investigated review helpfulness using data from Amazon.com. The authors showed that review extremity and word count positively affect consumers’ perceptions of review helpfulness. They also demonstrated that the positive outcomes of helpful online customer reviews seem to reduce the damaging impact of malicious reviews for vendors, increase reliability and usefulness for consumers, and alleviate decision risk and uncertainty when gathering the needed information, which is time-consuming and energy-consuming. Thus, actively providing helpful reviews can help consumers make quick purchase decisions and improve their shopping experience. Besides, they also showed that the product type plays a mediating role in influencing review helpfulness.

Afterward, more studies focusing on review helpfulness have contributed to identifying its determinants [83], including the severity of language used in the review, reviewers’ identities and backgrounds [20], balance and presentation order [71], and truthful reviews [72]. Recently, Cui and Wang [25] have demonstrated that review presentation format (e.g., product videos and images) is also an influential factor in the helpfulness of reviews, as it allows consumers to obtain more product details that are difficult to describe in text-based reviews, such as color, movement, and sounds.

The literature on the determinants of review helpfulness has presented gradual developments in designing and validating algorithms to calculate the helpfulness score of a review and the classification of the product. Some researchers started by examining what makes reviews useful and found that the source of the review (e.g., characteristics of the reviewer) influences a consumer’s decision to vote on the helpfulness of a review [38, 68]. Moreover, Chua & Banerjee [23] found that the relationship between the quality of the information and the helpfulness of the review varies according to the product category and the review type (e.g., favorable, unfavorable, and mixed). These findings indicate that the type of product being studied is also an important factor when studying the helpfulness indicator of product reviews.

Some authors study how helpfulness may prioritize online product reviews by quality. Du et al. [30] proposed a deep neural architecture to learn the explicit content-rating interaction (ECRI) for automatic helpfulness prediction. Experimental results demonstrate that exploiting the explicit content-rating interaction improves automatic helpfulness prediction.

2.1.3 Rating

Rating is defined by Steck [75] as a measure of accuracy of the quality of a product. Typically, customers who purchase a product or use a service are invited to leave a review or rating based on their experience. These ratings are usually expressed on a scale of 1 to 5, with 1 being the lowest and 5 being the highest. The rating system serves several purposes [21, 69, 90]: (i) provide feedback to the seller; (ii) inform potential buyers, who can use the ratings and reviews to make informed decisions about whether to purchase a product or service; (iii) establish credibility for the seller or platform; (iv) provide a sense of community among buyers and sellers.

According to Bonchi et al. [10], rating is a key measure to ensure the long-term success of e-commerce and to manage Customer Relationship Management (CRM) activities. In addition to reviews, positive ratings can change people’s attitudes about the related product review [48].

In most e-marketplaces, customers can leave comments, feedback, and ratings after an order from a third-party. This lets other customers know about experiences with products and services. According to Amazon [2], customers can rate third-party sellers from one to five stars, with five stars being the best. The seller’s average rating appears beside their name on Amazon’s site.

Some authors explore the relationship between product rating and reviews for predicting helpfulness, without introducing network concepts. For example, Dash et al. [27] introduced P2R2 (Product feature based Personalized Review Ranking), a framework to predict review helpfulness for individual consumers based on their preferences in product features using a latent class regression model. Ping et al. [70] developed a methodology for enhancing the quality and usefulness of online reviews using a machine learning approach. From a different perspective, Lee et al. [56] analyzed online reviews on Amazon.com to identify review types and key drivers of perceived usefulness of reviews.

We will see later on why ratings are important to our research and we will link them to centrality of a product in a social network.

2.2 Network centrality and rating/helpfulness

To define a framework to relate centrality of reviews (an essential metric in network science) with rating/helpfulness, it is important to refer to the existing related work.

Other authors introduce and explore network concepts in this context. For example, Wang [84] examines the association between centrality and reviews by analyzing the differences in reviewer characteristics by network structural positions. In other words, the author identified a relationship between the centrality of reviewers and reviewer characteristics. Lee et al. [55] studied the relationship between the herding effect and ratings, where users’ ratings are influenced by prior ratings depending on movie popularity. Wang et al. [82] exploited the temporal sequence of social-networking events and ratings. They conclude that rating similarity between friends is significantly higher after the formation of the friend relationship, indicating that with social-networking functions, online rating contributors are socially nudged when giving their ratings. Li et al. [57] explored social influence in online restaurant reviews and concluded that prior average review rating exerts a positive influence on subsequent review ratings for the same restaurant, although the effect is reduced by the variance in existing review ratings. Su et al. [77] showed that complex networks of user relationships could be used with the proposed similarity measure to design a rating prediction algorithm for recommender systems (using MovieLens and Netflix data). De Meo et al. [62] studied helpfulness-based reputation (HBR) scores and centrality-based reputation (CBR) scores. As they mention in their research, the identification of users featuring large HBR scores is one of the most important research issues in the field of Social Networks, as a critical success factor of many Web-based platforms. The authors conclude that CBR scores allow for predicting HBR ones, and Eigenvector Centrality was found to be the most important predictor. So it is important to draw on trust relationships to spot those users producing the most helpful reviews for the whole community. In Table 1 we summarize the main contributions around the links between centrality, rating and helpfulness.

Table 1 Existing literature exploring the links between centrality of reviews and rating/helpfulness

There are also some authors that relate rating and helpfulness, such as Chua and Banerjee [22], who examine review helpfulness as a function of reviewer reputation, review rating, and review depth. They conclude that helpfulness is positively related to reviewer profile and review depth but is negatively related to review rating.

Although previous literature contains some associations between the concepts we address in our work, we could not find simultaneous links between centrality, rating and helpfulness. Our rationale is that centrality may reflect an increase in rating and helpfulness. In other words, the centrality of a product in a social network of products can lead to a positive feedback loop, where increased visibility and exposure lead to higher ratings and perceived helpfulness, which in turn can lead to even greater visibility and exposure within the network. The literature does point to an influence of centrality on rating and on helpfulness, but considers each separately. In Fig. 1 we summarize the most important contributions of the literature considering the relationship between centrality, rating and helpfulness.

Fig. 1 Most important contributions of the literature on the relationship between centrality, rating and helpfulness. Source: the authors

2.3 Research questions and hypotheses

As introduced above, the originality of this work lies in the relationship between both centrality and rating/helpfulness. We use a network of reviews connecting the products to each other. It is therefore a network in which the centrality of a product reflects its importance, because the product has been commented on by many users. Then, centrality measures obtained from networks of products are used: betweenness centrality and eigenvector centrality. Modularity is also used to measure the strength of division of the network into modules. The other variables of analysis—the “quality” measures—rating, and helpfulness, are obtained directly as a product’s features in the Amazon dataset. The relationship between the centrality measures, rating and helpfulness is explored in this work. Our central question is the following: do centrality measures improve product rating and helpfulness of reviews? This central research question is supported by three research hypotheses:

H1

Correlation between the centrality of reviews, product rating and the helpfulness of reviews is significant.

H2

The centrality of reviews (and the Rating) has an impact on the helpfulness of the reviews.

H3

The centrality of reviews (and the Helpfulness) has an impact on product rating.

Figure 2 illustrates the research framework and hypotheses and positions our work relating Centrality, Rating, and Helpfulness.

Fig. 2 Research framework and hypotheses relating centrality, rating and helpfulness

To assess H1, we first identify the products with higher levels of centrality in the network and then compute a correlation matrix. To test H2 and H3, we apply regression tree models (Breiman, 1984) and Random Forests.

3 Methodology

In this section, we introduce the Measures of Centrality in Social Networks, namely Eigenvector Centrality and Betweenness Centrality. Afterward, in Sect. 3.3, we introduce Bipartite Networks and explain how a projection of a bipartite network onto a one-mode network can be used to produce a network of products.

3.1 Measures of centrality in social networks

Social networks are important systems for spreading flows through edge connectivity. Social Media is based on social networks, and involves publishing flows of information and shared content. Different types of flows include products or services recommendations, sharing posts about specific issues, or ratings. For example, in Spotify, users can recommend songs to their friends, including Facebook friends [51].

Generally speaking, a social network is a set of individuals or groups connected by some relationships. Links can be created online or offline [26]. Individuals are linked together by social bonds (often called relationships), which may be formal or informal [13]. In theory, a simple network or graph (G) is defined as a set of discrete social entities (called nodes or vertices), represented by V(G), and a set of links (also called edges), represented by E(G). In symbolic terms, we may write a graph as a whole entity G, namely the tuple G = (V(G), E(G)). Entities or nodes in a social network are often called actors, and these can be individuals, organizations etc. [80]. Two nodes (vertices) connected to each other are considered neighbors, and the number of elements of the system corresponds to the number of nodes [6].

For example, in Fig. 3 there are eight nodes or vertices, \(V(G)=\left\{v_{1},v_{2},v_{3},v_{4},v_{5},v_{6},v_{7},v_{8}\right\}\), and eight links or edges, \(E(G)=\left\{e_{1},e_{2},e_{3},e_{4},e_{5},e_{6},e_{7},e_{8}\right\}\).

Fig. 3 Representation of a social network G. Source: adapted from Das et al. [26]; the node labels vi (i = 1, …, 8) are abbreviated to i = 1, …, 8 for simplicity

The information about the existence (or not) of links between the nodes is usually stored in an adjacency matrix A. Let A be an adjacency matrix containing values aij, such that aij = 1 if node i is connected to node j and aij = 0 otherwise. The adjacency matrix is therefore a square |V|×|V| matrix such that:

$$a_{ij} = \begin{cases} 1, & \text{if node } i \text{ is connected to node } j \\ 0, & \text{otherwise} \end{cases}$$
(1)

The diagonal elements of A are zero, since edges from a node to itself (loops) are not allowed in simple graphs.
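As a minimal sketch of Eq. 1, the following Python fragment builds the adjacency matrix of a small undirected simple graph from an edge list. The four-node graph, the function name `adjacency_matrix` and the node labels are assumptions made for illustration only; they do not reproduce the network of Fig. 3.

```python
def adjacency_matrix(nodes, edges):
    """Return the |V| x |V| adjacency matrix of a simple undirected graph."""
    index = {node: k for k, node in enumerate(nodes)}
    size = len(nodes)
    matrix = [[0] * size for _ in range(size)]
    for u, v in edges:
        if u == v:
            raise ValueError("loops are not allowed in a simple graph")
        i, j = index[u], index[v]
        matrix[i][j] = 1
        matrix[j][i] = 1  # undirected: a_ij = a_ji
    return matrix


# Hypothetical four-node example (not the network of Fig. 3).
nodes = ["v1", "v2", "v3", "v4"]
edges = [("v1", "v2"), ("v2", "v3"), ("v3", "v1"), ("v3", "v4")]
A = adjacency_matrix(nodes, edges)

# The row sums of A give the degree of each node.
degrees = [sum(row) for row in A]
```

The row sums recover the degree of each node, which is the quantity used by degree centrality.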

3.2 Eigenvector centrality and betweenness centrality

One of the most popular measures of centrality is the node or degree centrality. The degree centrality of a node is simply its degree—the number of edges it has. The higher the degree, the more central the node is. However, degree centrality does not capture the real importance of a node in a network. While degree centrality only considers the number of connections a node has, eigenvector centrality considers the quality of those connections by taking into account the centrality of neighboring nodes. Betweenness centrality, on the other hand, focuses on the node’s position in the network and its ability to control information flow. These measures provide a more nuanced understanding of node importance by considering both local and global network characteristics. For these reasons, we think that Eigenvector Centrality and Betweenness Centrality are more complete and adequate than node centrality. We will introduce them in the following.

Eigenvector centrality is based on notions of influence, ranking, and prestige of the neighbors of the node that we intend to analyze [6]. That is to say, the centrality of a node is measured by the importance of the neighbors to which the node is connected, since they have easy access to the information and sources of influence. This index of centrality describes the general influence of a node throughout the network, which makes its importance and its impact on the reviews something to be further studied [66].

The Bonacich [9] approach is quite adequate for the calculation of centrality, since it not only takes into account the centrality of the neighbors, but also their quality. The eigenvector centrality of node i, \({x}_{i}\), is then given by:

$$x_{i} = \frac{1}{\lambda} \sum_{j = 1}^{n} a_{ij} x_{j}, \quad i = 1, 2, \ldots, n$$
(2)

where \({a}_{ij}\) is 1 or 0 depending on whether there is (or is not) a link between nodes i and j, respectively. The measure \({x}_{j}\) represents the centrality of node j, and \(\lambda\) is a constant. It is wise to choose \(\lambda\) as the largest eigenvalue in absolute value of matrix A. This measure of centrality is based on the idea that a node is important if its neighbors are important too. We assume that the cardinality of V(G) is n.

Eigenvector centrality has been used because it proves to be an important measure of centrality, since it considers the centrality of the neighbors [9].
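To make Eq. 2 concrete, the sketch below approximates the eigenvector centrality of a small graph by power iteration in plain Python. It is an illustration under stated assumptions, not the procedure used in this study: the function name `eigenvector_centrality`, the four-node example and the choice to iterate on A + I (which keeps the eigenvectors of A but avoids oscillation on bipartite graphs) are ours, and the graph is assumed to be connected and undirected.

```python
import math


def eigenvector_centrality(matrix, iterations=1000, tol=1e-12):
    """Approximate the solution of x = (1/lambda) A x by power iteration.

    The iteration uses A + I instead of A: both share eigenvectors, but the
    shift prevents the oscillation plain power iteration shows on bipartite
    graphs. Assumes a connected, undirected graph.
    """
    n = len(matrix)
    x = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        # y = (A + I) x, then rescale to unit Euclidean length.
        y = [x[i] + sum(matrix[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(value * value for value in y))
        y = [value / norm for value in y]
        converged = max(abs(a - b) for a, b in zip(x, y)) < tol
        x = y
        if converged:
            break
    # Rayleigh quotient x^T A x (x has unit length) estimates lambda.
    ax = [sum(matrix[i][j] * x[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(a * b for a, b in zip(ax, x))
    return x, eigenvalue


# Hypothetical example: a triangle (nodes 0, 1, 2) with a pendant node 3
# attached to node 2.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
centrality, lam = eigenvector_centrality(A)
```

In this toy graph node 2 receives the highest score and the pendant node 3 the lowest, even though node 3 is attached to the most central node; the returned vector satisfies Eq. 2 up to numerical tolerance.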

The centrality of intermediation, also known as betweenness centrality, is one of the most used measures of centrality [85]. It measures the extent to which a node is an important intermediary between the links of other nodes in the network; that is, it reflects the number of shortest paths connecting pairs of nodes that pass through a specific node. Nodes of this kind have a very high centrality because they connect communities that would otherwise not be linked [33]. The intermediation centrality \(C_{B}(x)\) of a vertex x in the network is given by:

$$C_{B}(x) = \sum_{s \ne t \in V(G)} \frac{\sigma_{st}(x)}{\sigma_{st}}$$
(3)

where \({\sigma }_{\mathrm{st}}\)(x) denotes the number of shortest paths between s and t containing x, and \({\sigma }_{\mathrm{st}}\) denotes the number of all the shortest paths between s and t in the network.
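Eq. 3 can be evaluated efficiently with Brandes' algorithm, sketched below for unweighted, undirected graphs. The function name `betweenness_centrality`, the adjacency-dictionary input and the five-node example are illustrative assumptions. Since Eq. 3 does not fix the counting convention, the sketch assumes that each unordered pair {s, t} is counted once and that the endpoints s and t themselves are excluded.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Brandes' algorithm for unweighted, undirected graphs.

    `adjacency` maps each node to the set of its neighbours. Every unordered
    pair {s, t} is counted once and endpoints are excluded.
    """
    scores = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search: count shortest paths starting at `source`.
        sigma = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        predecessors = {node: [] for node in adjacency}
        sigma[source] = 1
        distance[source] = 0
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    sigma[w] += sigma[v]
                    predecessors[w].append(v)
        # Back-propagate pair dependencies, farthest nodes first.
        delta = {node: 0.0 for node in adjacency}
        for w in reversed(order):
            for v in predecessors[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                scores[w] += delta[w]
    # Each unordered pair was visited once from each endpoint.
    return {node: value / 2.0 for node, value in scores.items()}


# Hypothetical example: two triangles that share the bridging node "c".
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"c", "e"},
    "e": {"c", "d"},
}
bc = betweenness_centrality(graph)
```

In the example, node "c" lies on the only shortest path of the four pairs that join the two triangles, so it obtains a score of 4 while all other nodes obtain 0. This mirrors the description of high-betweenness nodes as connectors between communities.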

3.3 Bipartite networks

In this research there are two different types of nodes, products and reviewers, as shown in Fig. 4.

Fig. 4 A bipartite or two-mode social network G′, drawn in two ways. a The solid filled nodes (1, 2, 3, 7 and 8) correspond to one type of node (for example the products) and the empty nodes (4, 5 and 6) correspond to another type (for example the reviewers). Note that there are no links between nodes of the same type. b The same network represented in a more “conventional” way, where the two node types are drawn as circles (products) and squares (reviewers)

This is the case of Bipartite networks, or Bipartite graphs whose vertices can be divided into two disjoint and independent subsets V1 and V2. In a bipartite graph every edge E links one node of V1 and one node of V2 [6]. Mathematically, the definition can be stated as follows:

Definition 1

(Adapted from Banerjee et al. [5]). G(V1, V2, E) will be called a bipartite graph if V(G) = V1(G) ∪ V2(G) and V1(G) ∩ V2(G) = \(\varnothing\), and each edge (v1, v2) ∈ E(G) connects a node v1 ∈ V1(G) to a node v2 ∈ V2(G). G will be a complete bipartite graph if (v1, v2) ∈ E(G) for all v1 ∈ V1(G) and all v2 ∈ V2(G).

In the case of a Bipartite Network, assuming that r = #V1(G) and s = #V2(G), and ordering the nodes of V1 before those of V2, the adjacency matrix takes the following block form:

$$A = \begin{pmatrix} 0_{r,r} & B \\ B^{T} & 0_{s,s} \end{pmatrix}$$
(4)

where B is the r × s bi-adjacency matrix, and 0r,r and 0s,s represent the r × r and the s × s zero matrices.
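As a small worked illustration of this block structure, consider a hypothetical bipartite network (not taken from the data set) with r = 2 nodes in V1 and s = 2 nodes in V2, in which the first node of V1 is linked to the first node of V2 and the second node of V1 is linked to both nodes of V2. Under these assumptions:

$$B = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \qquad A = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}$$

The two zero blocks on the diagonal express that no links exist between nodes of the same type, and A is symmetric because the lower-left block is the transpose of B.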

3.3.1 Projection of a bipartite network into a one–mode network

Since we are interested in analyzing the importance of the products, measured by the corresponding centrality, we need to project the original bipartite network into a one-mode network. Bipartite networks can be transformed into unipartite networks through one-mode projections [5, 81, 85]. This means that the resultant network contains nodes of only one set: in our case, the products’ network. Application of a one-mode projection to a bipartite network generates two unipartite networks, one for each layer, G1 and G2, so that vertices with common neighbors are connected by edges in their respective projection.

Definition 2

(Adapted from Banerjee et al. [5]). Let G(V1, V2, E) be a bipartite graph with |V1(G)| = r, |V2(G)| = s and |E(G)| = m, where |·| (also written #) denotes cardinality. The projection of G onto the vertex set V1 with respect to the vertex set V2 is the unipartite (one-mode) network G1(V1, E′), where V(G1) = V1 and (v1i, v1j) ∈ E(G1) if N(v1i) ∩ N(v1j) ≠ \(\varnothing\). Likewise, the projection of G onto the vertex set V2 with respect to the vertex set V1 is the unipartite (one-mode) network G2(V2, E′′), where V(G2) = V2 and (v2i, v2j) ∈ E(G2) if N(v2i) ∩ N(v2j) ≠ \(\varnothing\).

Here N(v) denotes the neighborhood of a vertex v, and the cardinality of this neighborhood (the degree of the vertex) is denoted by deg(vi) = |N(vi)|. Figure 5a and b show, respectively, the one-mode projections onto V1 (i.e., G1) and V2 (i.e., G2), taken from the same example of Fig. 3, based on the two types of nodes of the data set used in this work.
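The projection rule of Definition 2 (two nodes of the same set are linked when their neighborhoods intersect) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `project` and the product/reviewer labels are hypothetical, and the projection is unweighted.

```python
from itertools import combinations


def project(edges):
    """One-mode projection onto the first node set of (v1, v2) edges.

    Two first-set nodes are linked when they share at least one neighbor.
    """
    members = {}  # second-set node -> set of first-set nodes linked to it
    for v1, v2 in edges:
        members.setdefault(v2, set()).add(v1)
    projected = set()
    for group in members.values():
        for a, b in combinations(sorted(group), 2):
            projected.add((a, b))
    return projected


# Hypothetical bipartite edges (product, reviewer).
edges = [("p1", "r1"), ("p2", "r1"), ("p2", "r2"), ("p3", "r2"), ("p4", "r3")]
product_edges = project(edges)                               # G1
reviewer_edges = project([(v2, v1) for v1, v2 in edges])     # G2
print(sorted(product_edges))
print(sorted(reviewer_edges))
```

Swapping the order of each edge tuple yields the projection onto the other node set, so a single function covers both G1 and G2.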

Fig. 5

a One-mode projection G1 (e.g., products). b One-mode projection G2 (e.g., reviewers)

3.4 Modeling and analysis

3.4.1 Centrality measures

To identify the most important products in terms of centrality, two measures are used: betweenness centrality and eigenvector centrality. To this end, a projected one-mode network, the network of products, has been created, in which two products (network nodes) are connected (through an edge) when the same reviewer has commented on both of them. It is important to note that the projected network of products is also the projected network of reviews, because it is based on the reviewers. In this sense, the names “network of products” and “network of reviews” are used interchangeably.

To test the research hypotheses identified in Section 2, we first identify the products with higher levels of centrality in the network, using the Gephi software [8]. A correlation matrix and linear dependency models are used to test H1.

3.4.2 Cluster analysis

A cluster analysis is performed to find patterns of nodes in the data. Detecting patterns of nodes based on their connectivity provides an idea of how node communities emerge in the network. On the other hand, it is also important to detect patterns of nodes based on their attributes. In network science, clustering is related to the consistency of certain patterns based on the similarity of the nodes. A cluster is a subset of nodes in which the products are similar to each other and distinct from the products in other clusters. In this work we started by computing the number of clusters appropriate for our data, using the elbow rule and the scree plot, which take into account the variance (within-group sum of squares): as the number of clusters increases, the variance decreases. The elbow at five clusters represents the most parsimonious balance between minimizing the number of clusters and minimizing the variance within each cluster. It is important to consider coherent sizes for the clusters, since a larger cluster increases redundancy, which makes each loop less important and compromises network conciseness [66].

For this purpose, we used K-means [59], a method for quantitative variables that iteratively groups n observations into k clusters, where each observation belongs to the cluster with the closest centroid. The objective is to minimize the sum of squared errors within the groups generated; that is, the smaller the sum, the more homogeneous the groups will be. This method requires choosing the number of initial points (centroids) in advance, giving rise to a number of groups predefined by the analyst [44]. After selecting the five initial centroids, the program repeats two steps until it reaches the minimum of the established criterion: (a) form five clusters by associating each product with the nearest centroid, and (b) recalculate the centroid of each cluster. The variables chosen for analysis are: Betweenness Centrality, Eigenvector Centrality, Rating and Helpfulness. Modularity was not included since it is a measure of global interconnectedness and does not provide important information at the level of individual nodes (except for the modularity class, which is a qualitative variable).
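Steps (a) and (b), together with the within-group sum of squares used by the elbow rule, can be sketched as follows. This is a minimal pure-Python illustration of the standard K-means iteration, not the R routine used in this work; the function names and the two-dimensional toy points are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Step (a): associate each point with its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        distances = [sq_dist(p, c) for c in centroids]
        clusters[distances.index(min(distances))].append(p)
    return clusters


def kmeans(points, initial_centroids, max_iter=100):
    """Alternate steps (a) and (b) until the centroids stop changing."""
    centroids = [list(c) for c in initial_centroids]
    for _ in range(max_iter):
        clusters = assign(points, centroids)
        # Step (b): recompute each centroid as the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else c
            for cl, c in zip(clusters, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    clusters = assign(points, centroids)
    # Within-group sum of squares, the quantity plotted for the elbow rule.
    wss = sum(sq_dist(p, c) for cl, c in zip(clusters, centroids) for p in cl)
    return centroids, clusters, wss


# Hypothetical toy data with two obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids, clusters, wss = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
_, _, wss_one = kmeans(points, [(0.0, 0.0)])
print(centroids, wss, wss_one)
```

Running the function for increasing k and plotting the resulting `wss` values gives the curve on which the elbow is read.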

3.4.3 Regression trees and random forests

In order to test H2 and H3, we applied Regression Trees and Random Forests algorithms to study the impact of centrality on the other variables under study (rating and helpfulness).

Regression trees are increasingly used as a predictive modeling approach in statistics, data mining and machine learning. They are supervised learning algorithms that use a tree structure in which each internal (non-leaf) node is labeled with an input feature and each leaf predicts a value of the target variable. The algorithm we use is CART (Classification And Regression Trees), first introduced by Breiman et al. [11]. Regression trees, including the CART algorithm, do not directly measure causal effects. They are primarily used for predictive modeling and for identifying relationships between predictor variables and the target variable, although they may suggest some sort of causal effect.

The difference between trees used for regression and trees used for classification is the type of target variable (quantitative or qualitative, respectively). In this work, we use regression trees, as the variables to be predicted are quantitative. This research uses four variables: two centrality measures (Betweenness and Eigenvector) and two quality measures (Rating and Helpfulness). The latter will be used as dependent variables in the models presented later on.
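To make the idea of recursive binary splitting concrete, the sketch below searches one quantitative predictor for the threshold that minimizes the summed squared error of the two resulting child nodes, which is the usual least-squares criterion for regression trees. It is an illustrative single-split search under that assumption, not a full CART implementation; the function names and toy values are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty node)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, sse_after_split) for the best binary split on x.

    The threshold is None when no split improves on the unsplit node.
    """
    pairs = sorted(zip(x, y))
    best = (None, sse(list(y)))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical predictor values cannot be separated
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [target for _, target in pairs[:i]]
        right = [target for _, target in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best


# Hypothetical toy data: one predictor and a quantitative target.
predictor = [1, 2, 3, 6, 7, 8]
target = [2.0, 2.0, 2.0, 4.0, 4.0, 4.0]
threshold, error = best_split(predictor, target)
print(threshold, error)
```

A regression tree repeats this search over all predictors at every node and predicts the mean of the target in each leaf.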

3.5 Data set

To build the products’ network, we used a data set containing information about products, reviews, and reviewers (users) provided by Amazon.com. The data set is openly available [42] and contains a great number of reviews in several product categories [61]. Being considered one of the Big Five companies in the U.S. information technology industry, Amazon is an American multinational technology company focusing on e-commerce, cloud computing, digital streaming, and artificial intelligence.

The data set contains more than 150 million reviews on products in various categories, ranging from “books and technology” to “beauty articles”, registered from May 1996 to July 2014. For the sake of practicality, we selected the category of musical instruments. The selection of this category is due to the fact that it includes several subcategories of products with technical characteristics that require previous information to assist buying decisions, and consequently, the reviews are likely to be very useful for the future buyers to make their choices.

At the time of this study, there were 15 subcategories of musical instruments on the platform, including guitars, bass guitars, ukuleles, keyboards, microphones, strings, and accessories. The musical instruments category accounts for about 10,261 reviews and 500,176 ratings, covering 717 products. For each review, the data set provides the attributes described in Table 2.

Table 2 Attributes of the data set

The initial data set includes about 10,261 product reviews (see Table 3). In the data cleansing step, records with missing values were removed (e.g., “helpful” fields without information, or “undefined” fields generating errors in the imported database); as a result, the effective sample accounts for 2214 reviews (see Footnote 1). A primary analysis of these reviews generates 5562 relations among 717 different products, between 2010 and 2014.

Table 3 Data set summary

4 Results

In this section, we present the main results, namely the centrality measures, the cluster analysis and the regression-based analysis.

4.1 Network and centrality analysis

The original bipartite network contains the links between the reviews and the products and has been compressed into a one-mode projection. A new data set was created for further analysis, with aggregate data of network measures by product/review, as well as the rating and helpfulness. The network measures (Betweenness Centrality and Eigenvector Centrality) give us complementary perspectives of the importance of each product. Eigenvector centrality measures the transitive influence of nodes. Therefore, a node with higher Eigenvector centrality is connected to many nodes that themselves have high scores. On the other hand, Betweenness centrality measures the extent to which a node is an important intermediary between the links of other nodes in the network. Both centrality measures have been operationalized using igraph, a popular R package for network analysis and visualization.
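The notion of transitive influence can be illustrated with a small power-iteration sketch of eigenvector centrality. This is not the igraph routine used in this work: the function name, the star-shaped toy graph, the identity shift (iterating with A + I, which has the same eigenvectors as A but avoids oscillation on bipartite-like graphs) and the scaling of the largest score to 1 are all assumptions made for illustration.

```python
def eigenvector_centrality(adj, max_iter=200, tol=1e-10):
    """Power iteration on (A + I) for an undirected graph.

    adj maps each node to the list of its neighbors.
    Scores are rescaled so that the largest score equals 1.
    """
    nodes = sorted(adj)
    score = {v: 1.0 for v in nodes}
    for _ in range(max_iter):
        # Each node keeps its own score and adds the scores of its neighbors.
        new = {v: score[v] + sum(score[u] for u in adj[v]) for v in nodes}
        norm = max(new.values())
        new = {v: s / norm for v, s in new.items()}
        if max(abs(new[v] - score[v]) for v in nodes) < tol:
            return new
        score = new
    return score


# Hypothetical toy graph: one hub product linked to three other products.
adj = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
scores = eigenvector_centrality(adj)
print(scores)
```

In this toy graph the hub obtains the largest score because every other node points to it, while each leaf inherits part of the hub's score through its single link.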

As we will see in the Regression Tree results and in the conclusions, Eigenvector Centrality is much more important than Betweenness Centrality for explaining Rating and Helpfulness. The reason is that more connected products, where connectedness is measured by the importance of the neighbors to which nodes are connected, tend to have a higher impact on Rating and Helpfulness. Betweenness Centrality does not have the same impact on these quality measures.

Based on the one-mode projection network, it is possible to proceed with further analyses: a Cluster analysis and Regression-based analysis to establish relationships between Helpfulness, Rating and Centrality, that we present in the next sections.

4.2 Cluster analysis

The R software [73] and its kmeans function have been used for this task. The data have been standardized beforehand, since otherwise machine learning algorithms such as clustering will be dominated by the variables that use a larger scale, adversely affecting model performance. We used a normalization procedure in R (function scale) that transforms the original values into the [0,1] interval. At the end, each cluster can be described by the corresponding means of the different attributes (see Table 4).
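A transformation into the [0,1] interval corresponds to min-max scaling, sketched below in Python. This is an assumption about the procedure: with its default arguments, the R function scale centers each column and divides by its standard deviation, so obtaining a [0,1] range requires passing the column minimum as the center and the column range as the scale. The function name and values below are hypothetical.

```python
def min_max_scale(column):
    """Rescale a list of numbers linearly so that min -> 0 and max -> 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column: no spread to rescale
    return [(v - lo) / (hi - lo) for v in column]


# Hypothetical values on very different scales.
ratings = [1.0, 3.0, 5.0]
betweenness = [0.0, 10.0, 40.0]
print(min_max_scale(ratings))
print(min_max_scale(betweenness))
```

After rescaling, both variables span the same range, so neither dominates the Euclidean distances used by K-means.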

Table 4 Clusters’ means (final centroids) obtained with the original variables (before normalization), after applying K-means

Products in Cluster 1 have the highest average Rating and Helpfulness values. Cluster 2 contains the highest Betweenness and Eigenvector Centrality means, together with higher Rating and Helpfulness. Cluster 3 contains mean values that are relatively low for all attributes. Cluster 4 contains higher mean values for Helpfulness and Betweenness Centrality. Cluster 5 seems to be residual, as no attribute stands out compared to the other clusters (with the exception of Rating).

Attribute “Rating” is well represented in all clusters. Betweenness centrality and Helpfulness are combined to form clusters 2 and 4. Higher values of Rating and Helpfulness also emerge together in the clustering process, namely in clusters 1 and 2.

4.3 Testing research hypotheses

4.3.1 Relationship between helpfulness, rating and centrality

We start by computing correlation between variables in order to capture the strength of the relationship between Helpfulness, Rating and Centrality. Values are shown in Table 5.

Table 5 Correlation matrix (p-values in brackets; most significant values are in bold)

The highest correlation (0.63) stands between the two measures of centrality: Betweenness and Eigenvector. Additionally, the correlation between Helpfulness and Rating is also relatively high (0.35) when compared to the other values of the correlation matrix. It means that these two attributes are associated, providing insights that the reviews of higher rated products are also the most useful ones. On the other hand, the correlation between the centrality measures (Betweenness and Eigenvector), and the quality measures (Rating and Helpfulness) is very low.
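For reference, a correlation coefficient of this kind can be computed as in the sketch below. It assumes the Pearson product-moment coefficient, which the text does not state explicitly; the function name and toy vectors are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical toy vectors: perfectly increasing and perfectly decreasing.
r_up = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
r_down = pearson([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
print(r_up, r_down)
```

Values close to 1 or -1 indicate a strong linear association, while values close to 0, such as those between the centrality and quality measures, indicate a weak one.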

Previous research (Landherr et al., 2010) also finds a weak relationship between the centrality of reviews and the rating, which provides empirical evidence that users publish reviews regardless of whether products obtain high or low ratings. What is revealed, however, is that more reviews, corresponding to higher centrality, are not synonymous with better quality. Products may be central in the network, although with low ratings.

As discussed in the literature review, the question “Was this review useful?” has been playing an increasingly important role in helping consumer decision-making, so that the user receives information from someone who has already used the product and decided to share their experience spontaneously and free of charge. However, our results show that products may have high centrality while their reviews provide little help to users.

4.3.2 Using regression trees and random forests to assess the impact of the centrality measures on the rating and helpfulness

To answer hypotheses H2 and H3, we have developed a regression tree using the implementation of the CART algorithm available in the R package rpart [79]. The rpart algorithm (recursive partitioning and regression trees) is an implementation of CART by Breiman et al. [11]. Although rpart is not exactly the same as CART, both share the same underlying principles of recursive binary splitting, building decision trees, and pruning to balance model complexity and prediction accuracy. The package rpart.plot has been used for the plots.

4.3.2.1 Rating

We start by measuring the impact of the Centrality variables on Rating. For that purpose, all variables have been used as explanatory and Rating has been used as dependent.

In Fig. 6a, we can see from rows (4) and (5) that when Helpfulness is higher (>= 0.3541667), then Rating is also higher, on average (78.99027). It is also possible to see that when Eigenvector Centrality is higher, then it has a positive impact on Rating. This means that more connected products, measured by the importance of the neighbors to which nodes are connected, tend to have a higher Rating. We also computed the feature importance measure provided by rpart, based on the mean decrease in node impurity (Gini index or deviance) caused by a particular predictor variable.

Fig. 6

a Text-based regression tree using rating as dependent variable and b tree plot

Higher values indicate greater importance, suggesting that the variable has a stronger impact on the target variable within the tree. Eigenvector Centrality is the most important variable to predict Rating, followed by Helpfulness.

A pruning procedure has been used with the prune() function, as a way to reduce complexity and the size of the tree by removing parts (branches) that do not provide power to classify instances. The tree was indeed very much reduced but, as a consequence, the outcome is almost uninterpretable. We then calculated model accuracy by creating a procedure based on the Holdout Method. Model evaluation aims to estimate the generalization accuracy of a model on future (unseen/out-of-sample) data. We took the usual procedure of splitting the data using 70% of the original data as training data and the remaining as test data. Test data has been used to get predictions from the model trained on the training data. To evaluate the differences between the predictions from the model and the original data, we compute two measures of accuracy: mean absolute error (MAE) and root mean square error (RMSE).

$$MAE = \frac{1}{n}\sum_{i = 1}^{n} \left| P_{i} - T_{i} \right|$$
(5)
$$RMSE = \sqrt{\frac{1}{n}\sum_{i = 1}^{n} \left( P_{i} - T_{i} \right)^{2}}$$
(6)

where n is the number of observations in the test data, Pi is the i-th predicted value for the test data and Ti is the i-th original value. After 100 model iterations, we got an average MAE of 0.522 and an RMSE of 0.707.
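The holdout evaluation can be sketched in Python as follows, assuming the standard definitions of the two measures: MAE as the mean of the absolute errors and RMSE as the square root of the mean squared error. The function names, the fixed random seed and the toy values are hypothetical, and the sketch does not reproduce the R procedure used in this work.

```python
import random
from math import sqrt


def mae(predicted, observed):
    """Mean absolute error between predictions and observed values."""
    return sum(abs(p - t) for p, t in zip(predicted, observed)) / len(observed)


def rmse(predicted, observed):
    """Root mean square error between predictions and observed values."""
    squared = sum((p - t) ** 2 for p, t in zip(predicted, observed))
    return sqrt(squared / len(observed))


def holdout_split(rows, train_fraction=0.7, seed=0):
    """Shuffle the rows and split them into training and test parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = round(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Hypothetical toy example.
train, test = holdout_split(range(10))
print(len(train), len(test))
print(mae([1.0, 3.0], [2.0, 2.0]), rmse([0.0, 4.0], [0.0, 0.0]))
```

Repeating the split with different seeds and averaging the two measures gives the kind of averaged accuracy reported over 100 iterations.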

In order to explore an alternative to Regression Trees, we tested Random Forests [12]. This is an ensemble learning method, also used for regression and other tasks, that works by creating many regression trees at training time and outputting the mean prediction of the individual trees.

We used the R package randomForest and ran the model 100 times, obtaining an average accuracy of MAE = 0.366 and RMSE = 0.522. Random Forests also help to understand how much an explanatory (independent) variable contributes to accuracy, in terms of the percentage increase in Mean Square Error (%Increase MSE). A second measure of importance is based on the decrease in node impurity, measured by the residual sum of squares, when a variable is chosen to split a node (IncNodePurity).

Therefore, we can state from the results that Helpfulness can be a (weak) predictor of Rating, as it increases the corresponding predictive capacity by 0.13%. Betweenness is a poorer predictor of Rating, and Eigenvector centrality does not work well as a predictor at all, since the increase in prediction is negative.

4.3.2.2 Helpfulness

We ran the model again, but now taking all variables as explanatory and Helpfulness as dependent.

From the regression tree (Fig. 7) we learn that when Rating is higher than 3.3, Helpfulness is also higher on average. The same type of relationship (though weaker) occurs with Eigenvector Centrality, from which we can conclude that these variables have a positive association. After running the evaluation procedure, we obtained MAE = 3.458 and RMSE = 3.535.

Again, we computed the variable importance provided by rpart based on the mean decrease in node impurity (Gini index or deviance) caused by a particular predictor variable.

Eigenvector Centrality is once more the most important variable, this time for predicting Helpfulness (see Table 8).

Using Random Forests, we obtained an average accuracy of: MAE = 3.460 and RMSE = 3.537.

The impact of both centrality measures (Betweenness and Eigenvector) on Helpfulness, seen from the perspective of the mean increase in accuracy, is very low (see Tables 6, 7, 8, 9). A summary of the accuracy measures (MAE and RMSE) for Regression Trees and Random Forests is presented in Table 10.

Table 6 Variable importance in the regression tree for predicting rating
Table 7 Accuracy measures of random forest algorithm performance taking rating as dependent variable
Table 8 Variable importance in the regression tree for predicting helpfulness
Table 9 Accuracy measures of random forest algorithm performance taking helpful as dependent variable
Table 10 Summary of accuracy measures (MAE and RMSE) for regression trees and random forests

When Rating is used as a target (dependent) variable, the accuracy is higher than with Helpfulness. On the other hand, Random Forests are more accurate (MAE and RMSE are smaller) than CART for predicting Rating but not for predicting Helpfulness (Fig. 7).

Fig. 7

a Text-based regression tree using helpfulness as dependent variable, and b tree plot

4.3.3 Hypotheses outcomes

Having reached this stage, and after checking the research hypotheses, we are able to recapitulate the following conclusions:

H1

Our results present significant evidence that there is a clear relationship between product Rating and the Helpfulness of the reviews. That is, the higher the Rating of a product, the higher the Helpfulness of the corresponding review. In addition, there is also a strong association between the two measures of centrality, Betweenness and Eigenvector centrality, meaning that influence, ranking, and prestige of the neighbors of a product are also important in placing it as an intermediary between the links of other products in the network. We could not find any significant correlation between centrality measures and quality measures.

H2

Although Betweenness Centrality has a low impact on Rating, it may nevertheless be a predictor of Rating. Eigenvector Centrality has a positive impact on Rating but cannot be considered a predictor of Rating.

H3

Measures of centrality (Betweenness Centrality and Eigenvector Centrality) have a positive (weak) impact on Helpfulness, although they cannot be considered good predictors of Helpfulness.

5 Discussion and conclusions

Online customer reviews provide new potential customers with relevant information about a product or service, helping them in complex and risky buying decisions. In this research, we used an original bipartite network containing the links between the reviews and the products, which has been compressed into a one-mode projection, corresponding to a single-mode network of products linked by the respective reviewers. We used centrality measures to assess the reviews, not by their number, but by the importance in the network, measured by centrality, of the products reviewed.

Our results present significant evidence that there is a clear relationship between product Rating and the Helpfulness of the reviews. That is, the higher the Rating of a product, the higher the Helpfulness of the corresponding review. This relationship operates in both directions, meaning that Helpfulness and Rating can each be used to predict the other. This result is in line with previous research that identifies a similar positive relationship regarding the impact of review rating on review helpfulness. More specifically, reviews with two-star ratings are the most helpful, while helpfulness drops dramatically for three-star reviews and increases slightly again for four- and five-star ones (Ping et al. [70]). Lee et al. [56] have shown that reviews with both higher star ratings and longer text are usually perceived to be more helpful to potential customers and therefore have positive impacts on the purchase decision, particularly for experience goods.

Furthermore, we found that Betweenness and Eigenvector centrality are correlated, meaning that the influence, ranking, and prestige of the neighbors of a product are also important in placing it as an intermediary between the links of other products in the network.

On the other hand, our results also show significant evidence that there is no clear relationship between the measures of centrality (Betweenness and Eigenvector) and product quality (Rating and Helpfulness). In other words, consumers comment on a product regardless of its quality. Therefore, a high centrality of reviews does not imply a highly rated product, for several potential reasons, such as customer dissatisfaction with product performance, or heterogeneous customer value expectancy. Additionally, Ping et al. [70] indicate that review volume for the product becomes much less important to review helpfulness after some initial reviews have been accumulated. Moreover, not all customer reviews provide valuable and credible feedback, and the sheer volume of online reviews also creates the problem of information overload.

Despite the relationship between the herding effect and ratings, our results do not show a similar effect of review centrality on the quality measures, rating and review helpfulness. This finding might suggest that products may be central (e.g., most popular) in the network while having low ratings or low review helpfulness, offering no valuable information or credible feedback to help consumer decision making.

Even so, measures of centrality can be used as predictors of the helpfulness of the reviews. This means that the more central the products stand in the network of reviews, the more useful they can be. This does not mean that Betweenness Centrality and Eigenvector Centrality are good predictors of Helpfulness, which they are not, but rather that a relationship exists, as confirmed by the positive correlation and the patterns found in the cluster analysis. For Rating, the particular influence of Betweenness Centrality is interesting: the power of intermediation of the product reviews is somewhat connected to higher rated products. We know this relationship is weak and not significant, but it opens up the exploration of new possibilities of seeking relationships between centrality and quality measures.

5.1 Theoretical contributions

In this research we develop a new theoretical framework to analyze product rating and perceived helpfulness of online customer reviews. The study provides a one-mode projection-based approach to a bipartite network of products sold by the Amazon.com e-Marketplace in the category “musical instruments”, a network that contains two types of nodes, reviewers and products, and in which products are linked through the reviews. The results of this study contribute to the current understanding of review centrality measures (betweenness and eigenvector) and quality measures (rating and helpfulness) within a network of product reviews. The main findings are the existence of a clear relationship between product rating and the helpfulness of the reviews and a weak relationship between the centrality measures (betweenness and eigenvector) and the quality measures (rating and helpfulness). This explains why a high number of reviews does not necessarily imply a high product rating. On the other hand, when reviews are helpful for consumer decision-making, we observe an increase in the number of reviews. In other words, products may be central to the network, although with low ratings and with reviews providing little usefulness to consumers.

5.2 Practical contributions

The findings in this study have many important implications for e-commerce businesses’ improvement of the review service management to support customers’ experiences and online customers’ decision-making.

First, online firms need to surface the most pertinent reviews to help ensure that customers find the most relevant information to meet their needs. Review helpfulness reflects the extent to which potential consumers perceive other consumers’ reviews as informative and helpful, and it is an important factor in assisting consumers’ decision-making and mitigating the information overload problem [70].

It is important to leverage product rating, as this measure is considered the best way to exhibit product quality information, grab consumer attention, reduce decision-making risk and persuade consumption. In addition, actively providing helpful reviews can help consumers make quick purchase decisions and improve their shopping experiences.

Second, motivating and rewarding reviewers to post credible ratings and long reviews, containing clear definitions, specific explanations, and more precise descriptions of the reviewers’ experiences with the product, is of great help to potential customers of experience products in making their purchase decisions.

Most e-commerce websites today give reviewers the opportunity to post product videos and images, so that consumers can obtain more product details that are difficult to describe in text-based reviews, such as color, movement, and sounds [88]. This is of critical relevance for assisting buying decisions for experience products when compared with search products [25].

Online vendors may encourage and reward their consumers in different ways (e.g., cashback, vouchers, and member points) to write reviews with images or videos for marketing. This suggestion is in line with Woolley and Sharif [86], who conclude that “Simply knowing you’ll receive a reward for writing a review makes the process more enjoyable, which makes you more likely to write a positive review”. Therefore, offering incentives can be an effective strategy for improving customers’ review-writing experience and increasing the positive and helpful content in product reviews. Additionally, review helpfulness can be used to monitor reviewer qualifications, conferring a badge on top quality reviewers.

Third, our findings suggest redesigning review sorting interfaces and displaying the consumer rating distribution by helpfulness on the product page, fostering consumer trust, which is of instrumental support for consumer decision making. Such a service design is especially vital for online retailers, such as the Amazon marketplace, where online customer reviews are extremely voluminous and overwhelming.

Finally, this study might inspire online businesses to provide diverse review tools while understanding the impact of online reviews and social networks on the brand reputation and reliability of the seller. They can build a customer reviewer community to develop effective strategies that help build and strengthen their relationships with customers, with long-term effects on revisits to the site and product/service repurchases [56].

5.3 Limitations and future research

Although this research is yet another step in the review of reviews, we recognize that this approach has some limitations. First, the main one concerns the analysis of a single product category rather than the full product range available on the Amazon e-Marketplace. Therefore, as a challenge for future work, we suggest analyzing a different category of “search products”, such as music or books, to see if they show similar results to those obtained in this research and how these relationships vary between search products and experience products. Notably, Mudambi and Schuff [65] showed that the product type plays a moderating role in influencing review helpfulness.

Second, this study examined only online reviews posted on Amazon.com. This limitation provides opportunities for future research to explore other factors including online/offline retailers advertising and specific situations faced by potential customers, which can affect the helpfulness of online reviews. For example, future studies can investigate the dominant determinants of review helpfulness and examine its implicit dependency on various review tools (e.g. text/description length, photos, video), reviewer characteristics (e.g. cultural, technical) and product/service category (e.g. search versus experience goods).

Another limitation relates to the focus of our study: it does not distinguish true and false reviews. We do not address fake reviews and false scores, although we assume they may exist. A lot of research highlights the importance of truthful and unbiased peer-to-peer information when consumers rely on reviews to make wise buying decisions. However, malicious consumer reviews and fake reviews provided by vendors disguised as “consumers” have become major problems that interfere with consumers making the right choices [72]. This is a future research area that calls for Machine Learning algorithms to detect both computer- and human-generated fake reviews.