Consumer-generated media (CGM) are the most active information-sharing platforms, in which users generate content through voluntary participation. For example, Facebook provides an information-sharing platform on which users freely post comments and enjoy responses from their readers. CGM reflect positive traits of the Internet because the aggregation of users’ voluntary participation creates value; they therefore exhibit network externality, in which the more active the users are, the greater the value of the CGM. Although users’ motivations to generate content include not only rational thinking but also intrinsic psychological factors such as self-disclosure [27], brand image [13], and communication [24], institutional designs should be considered from the perspective of rational and logical incentives, because inactive and unsuccessful CGM are not rare.

CGM rely on user-provided information and thus fail if information is not provided. Providing information generally imposes effort costs on users, including time costs and click costs [18]. Therefore, CGM users are given incentives to discourage free riding, a situation in which users receive information but do not provide it. While huge CGM need not worry about free riding, many managers of small CGM pay attention to it. CGM can be regarded as a kind of public goods game, that is, a social dilemma game in which users may refrain from paying costs (free riding) although they could benefit substantially if they contributed.

To avoid the free-rider problem, many CGM adopt incentive systems in which users receive comments as appreciation for posting articles. These comments are considered rewards for contributing to the public goods game. Moreover, many real CGM systems provide Like buttons for reacting to comments, which can be regarded as meta-rewards. This is because comments give psychological benefits to the original article providers, just as Likes give psychological benefits to their receivers.

The public goods game framework is a strong analytical tool for understanding the content-generating behaviors of users. Some studies argue that information-sharing services in online networks have public goods game features [7, 8, 15]. Empirical studies have analyzed the influences of reciprocity and network structures on cooperative behaviors in social media [10, 17, 21], and some data analyses consider the effects of reputation systems on online markets [4, 30].

In public goods game research, kinship and reciprocity promote cooperation [19]. Sanctions, including punishments for free riding and rewards for cooperative behavior, also encourage cooperation [3, 6, 16, 22, 28]. Theoretical analyses have pointed out that cooperation through sanctions cannot be maintained because of second-order free riders, who cooperate but shirk sanctioning non-cooperative others [20]. To avoid second-order free riders, Axelrod [1] introduced a meta-punishment to punish those who did not participate in punishing free riders. Despite this approach, some studies have pointed out that meta-punishments alone cannot maintain a stable cooperative regime [9, 31, 32]. Okada et al. [20] extended the meta-punishment concept and exhaustively explored all combinations of meta-incentive systems, including meta-rewards as well as meta-punishments.

Toriumi et al. [26] used a public goods game model to show that meta-rewards are required to maintain cooperation. A meta-reward is a reward for those who gave a reward to cooperative users. Many CGM implement a function that allows other users to express their gratitude to those who provided information, and the users who expressed their gratitude can also be given something as a reward. For example, Facebook and blog users can post comments to information providers, who typically reply to these comments. Here, we regard comments as rewards and replies as meta-rewards.

However, when we consider Facebook and blog sites, there are three challenges to applying these theoretical second-order sanction system studies to real CGM empirically.

  1. Linkage hypothesis: whoever performs the first-order sanction (rewards and punishments) also performs the second-order one.

  2. No limitation of meta-reward doers: all users can give meta-rewards to all others.

  3. Sanctions without expectation: users give rewards without considering their expectations for meta-rewards.

First, the linkage hypothesis, first adopted by Axelrod [1], states that there is a positive correlation between the probability of imposing the first-order sanctions and that of imposing the second-order sanctions. This hypothesis is needed for the theoretical rationale of meta-sanctions, because, if the second-order sanctions are independent of the first-order sanctions, third-order free riders who shirk only the second-order sanctions become possible, and thus cooperation through meta-sanctions collapses. Experimental studies have reached no consensus on this linkage hypothesis. Some experiments support the linkage between the first-order sanctions and cooperative behaviors [11, 12], while others deny it [5, 29]. The linkage hypothesis between the first-order and second-order sanctions is partially supported by an experiment of a one-shot public goods game [14].

We will model our CGM public goods game without assuming the linkage hypothesis between the first- and second-order rewards. While a previous model [26] uses the same parameter, \(r_i\), as the probabilities of giving rewards and giving meta-rewards, our model separates the former probability from the latter.

Second, many of the theoretical models differ from real CGM in terms of meta-rewards. The theoretical models assume that all users can give meta-rewards, while in real CGM only those who post an article give meta-rewards. For example, Facebook and blog sites allow all users to reply to comments, but in many cases only the user who posted the original article replies to them. That is, the meta-reward actions are performed by the information providers, because the comments are addressed to them, not to the others. No study has tested the effect of permitting only users who receive the first-order rewards to give meta-rewards. We, therefore, undertake this challenge in this study.

Third, we assume that people expect the consequences of their own actions. This tendency can apply to a reward action in CGM. When CGM users give rewards to others, they are confident that the receiving users will respond with meta-rewards to them. They would not give rewards if they did not expect a meta-reward in return. This tendency is natural in terms of reciprocal altruism; people act altruistically if they expect returns. In our model, we introduce a belief on giving meta-rewards to capture this tendency.

Models and methods

For this study, we developed a new model that considers the three challenges of the previous models discussed above and propose a required definition of a system or an institution for resolving social dilemma problems in real CGM. In this section, we develop a model that reflects real CGM by extending the CGM model proposed by Toriumi et al. [25]. We then define an adaptive process of players in the model to explore feasible solutions of strategies for promoting and maintaining cooperation. Third, we introduce several scenarios to provide insight for managing real CGM by comparing their performances. Finally, we set parameter values to perform our simulation.

A restricted meta-reward game model

We consider N agents playing a restricted meta-reward game. The game is run for a discrete time and each period is referred to as a round. In each round, all agents play three sequential steps in serial order. Using the case of Agent i as an example, Agent i has its own strategy denoted by \((b_i,r_i,rr_i)\), which we will explain later.

In the first step, the agent provides its own token into a public pool with probability \(b_i\) and otherwise does not. In CGM, a contribution and a non-contribution are, respectively, regarded as an information-providing behavior and a non-providing behavior. If a token is provided by Agent i, Agent i must pay a cost, \(\kappa _0\), and each of the other \(N-1\) players receives a benefit, \(\rho _0\).

In the second step, rewards for providing a public good may occur. In CGM, posting a comment to an information provider is regarded as a reward. If and only if Agent i provides a token, the other \(N-1\) agents consider whether or not they will give a reward to Agent i. Agent \(j (\ne i)\) gives a reward to Agent i with probability \(p_{r_{i\rightarrow j}}\) and otherwise does not. This probability is calculated as \(p_{r_{i\rightarrow j}} = \varepsilon \cdot r_j\), where \(r_j\) is j’s own reward parameter and \(\varepsilon\) is an expected rate of meta-rewards newly introduced in this model to consider the third challenge of the above-mentioned prior studies. If a reward is given, Agent i gains a constant benefit, \(\rho _1\), while Agent j must pay a constant cost, \(\kappa _1\).

In the third step, meta-rewards for giving rewards may occur. In our model, only the agents who contributed in the first step can give meta-rewards, which addresses the second challenge of the previous studies and makes this model a restricted game. In CGM, a reply to comments is regarded as a meta-reward. If and only if Agent i received a reward from Agent j, Agent i gives a meta-reward to Agent j with probability \(rr_i\), and otherwise does not. While Toriumi et al. [25] assume that \(r_i=rr_i\), our model assumes that these are independent of each other to address the linkage hypothesis. If a meta-reward is given, Agent j gains a constant benefit, \(\rho _2\), while Agent i must pay a constant cost, \(\kappa _2\).

Each agent plays the above three steps four times in each round. When all agents complete these steps, each agent’s final payoff at each round is regarded as its fitness value.
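
The three steps above can be sketched in code. The following is a minimal illustration rather than the authors’ implementation: the function name `play_round`, the callback `eps_fn` (which returns the expected rate of meta-rewards \(\varepsilon\)), and the agent-by-agent loop order are assumptions made for this sketch, and the value of \(\rho _0\) must be supplied by the caller because it is not stated in the text.

```python
import random

def play_round(strategies, kappa, rho, eps_fn=None, repeats=4, rng=random):
    """Play one round of the restricted meta-reward game (illustrative sketch).

    strategies: list of (b, r, rr) tuples, one per agent.
    kappa, rho: sequences (kappa_0, kappa_1, kappa_2) and (rho_0, rho_1, rho_2).
    eps_fn(i, strategies): expected meta-reward rate used when rewarding
        contributor i; defaults to 1.0 (hypothetical helper, not from the paper).
    Returns the list of payoffs (fitness values) for this round.
    """
    if eps_fn is None:
        eps_fn = lambda i, strategies: 1.0
    n = len(strategies)
    payoff = [0.0] * n
    for _ in range(repeats):
        for i, (b_i, _r_i, rr_i) in enumerate(strategies):
            # Step 1: Agent i contributes with probability b_i.
            if rng.random() >= b_i:
                continue
            payoff[i] -= kappa[0]
            for j in range(n):
                if j != i:
                    payoff[j] += rho[0]
            # Step 2: every other agent j may reward contributor i.
            for j in range(n):
                if j == i:
                    continue
                r_j = strategies[j][1]
                if rng.random() < eps_fn(i, strategies) * r_j:
                    payoff[j] -= kappa[1]
                    payoff[i] += rho[1]
                    # Step 3: only contributor i may give a meta-reward to j.
                    if rng.random() < rr_i:
                        payoff[i] -= kappa[2]
                        payoff[j] += rho[2]
    return payoff
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible.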

Figure 1 illustrates the conceptual diagram of the model.

Fig. 1 Outline of Extended CGM Model

Adaptive process of strategy

At the end of each round, each agent evolves their own strategy. Although many evolution algorithms have been tested [9], we employ roulette selection as a selection mechanism, because it is one of the most basic selection methods adopted by Axelrod [1, 2].

To do so, a strategy, \((b_i, r_i, rr_i)\), is encoded as a binary code and regarded as a locus. A parent agent is selected at random by roulette selection. The probability that Agent i is chosen is defined as follows:

$$\begin{aligned} \Pi _i = \frac{\exp \left( \frac{v_i-{\overline{v}}}{\sigma }\right) +\varepsilon }{\sum _j \exp \left( \frac{v_j-{\overline{v}}}{\sigma }\right) +\varepsilon }, \end{aligned}$$

where \(v_i, {\bar{v}}\), and \(\sigma\) are, respectively, Agent i’s fitness value, the average fitness value of all agents, and the standard deviation of the fitness values of all agents. The constant \(\varepsilon\) in this equation is set to 0.0001 to avoid division by zero; it is distinct from the expected rate of meta-rewards, \(\varepsilon\), used in the game model. This probability function shows that a strategy with a higher payoff tends to spread in the next generation. Next, the strategy parameters are converted to binary code as in Axelrod’s procedure [1].

After this selection, each bit in the new locus may flip its value (either 0 to 1 or 1 to 0) with a constant probability of \(1\%\). The focal agent adopts this new locus as its own strategy in the next round.
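
The adaptive process can be sketched as follows. This is only an illustration under stated assumptions: the 3-bit resolution per parameter (`bits=3`), the uniform fallback when all fitness values are equal, and the choice to add the small constant to every weight before normalizing (so that the probabilities sum to one) are not specified in the text, and all function names are hypothetical.

```python
import math
import random

def selection_probabilities(fitness, eps=0.0001):
    """Roulette-selection probabilities from standardized fitness values."""
    n = len(fitness)
    mean = sum(fitness) / n
    sigma = math.sqrt(sum((v - mean) ** 2 for v in fitness) / n)
    if sigma == 0.0:
        # Assumption: with identical fitness values, select uniformly.
        return [1.0 / n] * n
    weights = [math.exp((v - mean) / sigma) + eps for v in fitness]
    total = sum(weights)
    return [w / total for w in weights]

def encode(strategy, bits=3):
    """Encode (b, r, rr) as a flat list of bits (assumed resolution)."""
    levels = 2 ** bits - 1
    locus = []
    for p in strategy:
        k = round(p * levels)
        locus.extend((k >> s) & 1 for s in reversed(range(bits)))
    return locus

def decode(locus, bits=3):
    """Decode a flat list of bits back into a strategy tuple."""
    levels = 2 ** bits - 1
    values = []
    for start in range(0, len(locus), bits):
        k = 0
        for bit in locus[start:start + bits]:
            k = (k << 1) | bit
        values.append(k / levels)
    return tuple(values)

def mutate(locus, rate=0.01, rng=random):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in locus]

def next_generation(strategies, fitness, rng=random):
    """Select parents by roulette selection, then mutate their loci."""
    probs = selection_probabilities(fitness)
    parents = rng.choices(strategies, weights=probs, k=len(strategies))
    return [decode(mutate(encode(p), rng=rng)) for p in parents]
```

Calling `next_generation(strategies, payoffs)` at the end of each round yields the strategies for the next round.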

Simulation scenarios

In the restricted meta-reward game, there is no incentive to give meta-rewards, and thus, players never provide meta-rewards. To address this point, we introduce player expectations of meta-rewards. We then explore how these expectations are reflected in the probability of providing rewards using the following three scenarios, which differ in the value of the expected rate of meta-rewards, \(\varepsilon\).

  1. No reference (\(\varepsilon =1.0\)): players do not use any reference.

  2. Social reference (\(\varepsilon =\frac{1}{N}\sum _k rr_k\)): players use the average rate of meta-rewards in the group.

  3. Individual reference (\(\varepsilon =rr_i\)): players use cooperator i’s probability of meta-rewards.

Scenario 1 is a baseline. Scenario 2 describes a situation in which players can obtain information on the overall rate at which meta-rewards are provided in the CGM. For instance, we suppose a system in which all meta-rewards given in response to other users’ rewards are visible. Scenario 3 describes a situation in which the CGM visualizes the rate at which each information provider gives meta-rewards. In this scenario, we assume that players decide whether or not to give rewards to a cooperator after they check the meta-reward rate of the focal cooperator.
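
As a compact restatement of the three scenarios, the expected rate \(\varepsilon\) and the resulting reward probability \(\varepsilon \cdot r_j\) could be computed as below. The function names and the integer scenario codes are assumptions introduced for this sketch.

```python
def expected_meta_reward_rate(scenario, contributor, strategies):
    """Return epsilon for the given scenario.

    scenario: 1 (no reference), 2 (social reference), 3 (individual reference).
    contributor: index i of the agent who provided information.
    strategies: list of (b, r, rr) tuples, one per agent.
    """
    if scenario == 1:
        return 1.0
    if scenario == 2:
        # Average meta-reward rate over the whole group.
        return sum(rr for _b, _r, rr in strategies) / len(strategies)
    if scenario == 3:
        # Meta-reward rate of the focal contributor only.
        return strategies[contributor][2]
    raise ValueError("scenario must be 1, 2, or 3")

def reward_probability(scenario, contributor, rewarder, strategies):
    """Probability that `rewarder` (Agent j) rewards `contributor` (Agent i)."""
    eps = expected_meta_reward_rate(scenario, contributor, strategies)
    return eps * strategies[rewarder][1]
```

Under Scenario 3, a contributor with \(rr_i = 0\) is never rewarded, regardless of the rewarder’s own reward rate.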

Parameter setting

For simplicity, we set the values of the parameters above by introducing two new intervening parameters, \(\delta\) and \(\mu\):

$$\begin{aligned} \kappa _0 &= 1.0 \end{aligned}$$
$$\begin{aligned} \rho _n&= \mu \cdot \kappa _n \end{aligned}$$
$$\begin{aligned} \kappa _n &= \delta \cdot \kappa _{n-1}, \end{aligned}$$

where \(n = 1,2\).

At first, we simulate the case of \(\mu =2\) and \(\delta =0.8\) to clarify the performances of each scenario. Then, we investigate the influences of the cost–reward ratios in Sect. 3.2. Table 1 shows the values of the other parameters in the simulation.
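
To make the parameterization concrete, the sketch below derives the costs and benefits from \(\mu\) and \(\delta\). The function name is hypothetical, and \(\rho _0\) is deliberately omitted because the equations above define \(\rho _n\) only for \(n = 1, 2\).

```python
def derive_costs_and_benefits(mu, delta, kappa0=1.0):
    """Return (kappa, rho) derived from mu and delta.

    kappa: list [kappa_0, kappa_1, kappa_2] with kappa_n = delta * kappa_{n-1}.
    rho:   dict {1: rho_1, 2: rho_2} with rho_n = mu * kappa_n.
    """
    kappa = [kappa0]
    for n in (1, 2):
        kappa.append(delta * kappa[n - 1])
    rho = {n: mu * kappa[n] for n in (1, 2)}
    return kappa, rho

kappa, rho = derive_costs_and_benefits(mu=2.0, delta=0.8)
```

For \(\mu =2\) and \(\delta =0.8\), this gives \(\kappa _1 = 0.8\), \(\kappa _2 = 0.64\), \(\rho _1 = 1.6\), and \(\rho _2 = 1.28\), up to floating-point rounding.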

Table 1 Simulation parameters

Simulation results

Comparison of three scenarios

We simulate 100 runs with different random seeds in each scenario, and show the averages and the variances of values using error bars in Figs. 2, 3, and 4. In these figures, the vertical axes show the step numbers, while the horizontal axes show the average parameter values: Cooperation indicates cooperation rates, \(b_i\); Reward indicates reward rates, \(r_i\); and MetaReward indicates meta-reward rates, \(rr_i\).

As shown in Fig. 2, the cooperation rate in Scenario 1 increases at the beginning but decreases at about 100 steps. This is due to the decrease in reward rates: the reward rate gradually decreases immediately after the beginning and reaches 0.1 at 20 steps. Without rewards, cooperation cannot be sustained.

Scenario 2 faces the same mechanism, and thus, neither scenario can maintain a cooperative regime.

In Scenario 3, on the other hand, the cooperation rate increases from the beginning, and then, the meta-reward rate also increases and, finally, the reward rate increases, therefore maintaining a stable cooperative regime, as shown in Fig. 4.

Why does Scenario 3 promote cooperative regimes while Scenario 1 does not? This is quite surprising, because parameter value \(\varepsilon\) is 1 in Scenario 1, while it is at most 1 in Scenario 3. We therefore analyzed the time series of cooperation rates, reward rates, and meta-reward rates in Scenario 3 in comparison with Scenario 1. At the beginning of the simulation, cooperation rates increased in both scenarios. However, what follows differs. In Scenario 3, the meta-reward rates increased before the reward rates increased. This is because players with high meta-reward rates tend to receive more rewards than those with low meta-reward rates. If the number of players who give rewards is sufficiently large, the benefit of the additional rewards that a high meta-reward rate attracts exceeds the cost of giving meta-rewards. Therefore, players with high meta-reward rates benefit more than those with low meta-reward rates.

The more players with high meta-reward rates there are, the greater the probability of receiving meta-rewards when giving rewards. Therefore, players who tend to give rewards gain more benefit than those who do not, and thus, the reward rates increase. High reward rates enhance the benefit of cooperation and, therefore, cooperative players have an advantage over defective players. Cooperative regimes stay robust.

In both Scenarios 1 and 2, on the other hand, a cooperator’s meta-reward rate is independent of the rate of receiving rewards, and thus, players with high meta-reward rates do not always have higher probabilities of receiving rewards than those with low meta-reward rates. Despite this, the former players must give meta-rewards by paying costs. Therefore, neither scenario has an incentive for raising meta-rewards.

In Scenario 3, rewards are given by referring to the individual rate of meta-rewards and, indirectly, players with low meta-reward rates receive a kind of punishment, i.e., they are not given rewards. Players with low meta-reward rates gain relatively small payoffs, and thus, they become extinct sooner or later under the selection pressure. Eventually, players with high meta-reward rates become the majority. To receive meta-rewards, players tend to give rewards, and to receive rewards, players tend to cooperate. Therefore, a cooperative regime emerges if players can observe the other individual’s meta-reward rate. In the context of CGM, information provision would increase if commenters could choose information providers according to the rate at which they respond to posted comments.

Influence of cost–reward ratios

In our model, the ratio of the reward benefit to the reward cost is important for promoting cooperative regimes [26]. Therefore, we simulated many cases with different values of \(\mu\) and \(\delta\). Figure 5 shows the average cooperation rate at the 1000th step over 50 runs per case. In this figure, the x-axis indicates \(\mu\), the y-axis indicates \(\delta\), and the color bar indicates the average cooperation rates. Figures 6 and 7 show those in Scenarios 1 and 2, respectively.

The ranges of \(\mu\) and \(\delta\) are, respectively, \(0.0\le \mu \le 5.0\) and \(0 \le \delta \le 1.0\). These figures show that

  1. cooperative regimes emerge only in Scenario 3;

  2. cooperative regimes never emerge if \(\mu <1.4\); and

  3. cooperative regimes emerge if approximately \(\mu \cdot \delta > 1.0\).

Among these, Result 2 is consistent with a previous study [25] that demonstrated that cooperative regimes require a substantially large benefit of rewards compared with their costs. Our result adds the insight that our model requires a larger value of \(\mu\) than the previous study’s model. This is because the expected values of meta-rewards are small if \(\mu\) is small, and thus, the incentive to give rewards vanishes.

Next, we consider Result 3. Our simulation indicates that the condition \(\mu \cdot \delta > 1.0\) is necessary for promoting cooperation. In terms of the relationship between rewards and meta-rewards, if the benefit of meta-rewards is greater than the cost of rewards, players may receive a net benefit through giving rewards, and thus, there are incentives to give rewards. This indicates that

$$\begin{aligned} \rho _2 > \kappa _1 \end{aligned}$$

is required. Since \(\rho _2 = \mu \cdot \kappa _2 = \mu \cdot \delta \cdot \kappa _1\), dividing both sides by \(\kappa _1 > 0\) gives the necessary condition for reward behaviors:

$$\begin{aligned} \mu \cdot \delta = \frac{\rho _2}{\kappa _1} > 1.0. \end{aligned}$$

Strictly speaking, players do not always receive meta-rewards, and thus, we should consider the average rate of meta-rewards, \(\overline{rr_i}\). Therefore

$$\begin{aligned} \overline{rr_i} \cdot \mu \cdot \delta > 1.0 \end{aligned}$$

is the necessary condition.
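
As a worked illustration of this condition, the sketch below computes the smallest average meta-reward rate that the inequality permits for given \(\mu\) and \(\delta\). The function name is hypothetical, and the threshold is simply \(1/(\mu \cdot \delta )\) rearranged from the inequality above, not a result reported in the simulations.

```python
def min_average_meta_reward_rate(mu, delta):
    """Lower bound on the average meta-reward rate implied by
    rr_bar * mu * delta > 1.0.

    Returns None when mu * delta <= 1.0, because a rate in [0, 1]
    can then never satisfy the strict inequality.
    """
    product = mu * delta
    if product <= 1.0:
        return None
    return 1.0 / product

threshold = min_average_meta_reward_rate(mu=2.0, delta=0.8)
```

For the baseline setting \(\mu =2\) and \(\delta =0.8\), the product is 1.6, so the average meta-reward rate must exceed \(1/1.6 = 0.625\) for giving rewards to pay off on average.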

If this condition is satisfied, players who give rewards to other players with sufficiently high meta-reward rates have an advantage. This also means that cooperative agents have an incentive to adopt high meta-reward rates. Through this mechanism, players with both high reward rates and high meta-reward rates have survival advantages and, finally, cooperative regimes emerge.

We note why the cooperation rate does not converge to zero in these figures. This is because, in our model, mutants invade the population in each period. Under the adaptive process in our model, cooperative agents do not become extinct quickly, and thus, a non-negligible number of them survive even if the society is dominated by defectors.

Fig. 2 Result of Scenario 1

Fig. 3 Result of Scenario 2

Fig. 4 Result of Scenario 3

Fig. 5 Change \(\mu , \delta\) in Scenario 3

Fig. 6 Change \(\mu , \delta\) in Scenario 1

Fig. 7 Change \(\mu , \delta\) in Scenario 2


While our main results support the importance of meta-rewards for activating CGM, we must acknowledge other important drivers of real posting, including brand image [13], attention seeking, communication, archiving, and entertainment [24]. Moreover, we must leave to future work the empirical verification that original article providers’ responses to commenters sustain posting on CGM.

We developed a restricted public goods game model to overcome the mismatches found between previous models and actual CGM. Our model reveals that restricted public goods games cannot produce cooperative regimes when players are myopic and have no strategy regarding their actions. Cooperative regimes emerge if players that give first-order rewards are given information that reveals whether cooperative players will give second-order rewards to the first-order rewarders. In the context of CGM, if users who post articles reply to commenters/responders, active posting of articles occurs if potential commenters/responders can ascertain that the user posting the article will respond to their comments.

Furthermore, we tested a different adaptive process from the one described in Sect. 2.2. In this paper, we explained the case of Axelrod’s selection rule. However, readers may wonder whether the results hold under a different selection rule. Therefore, we used a genetic algorithm instead of Axelrod’s rule and confirmed that the essential points do not change.

We adopt the linkage hypothesis [20], and thus, the first-order reward appears to be affected by the opponent’s past second-order rewards. In other words, the model can be regarded as a kind of probabilistic ‘direct reciprocity’. The original model by Axelrod [1] applies the linkage hypothesis strictly, while our model applies it probabilistically.

Although everyone can post meta-comments in reply to comments on Facebook and blog sites, meta-comments are typically posted only by the original information providers. We wanted to explore the conditions for cooperation even when the system restricts meta-rewards, and thus, we adopted this assumption. However, this point should be extended in future work.

This study should be extended. First, the present version of our model describes two types of player actions: cooperation as posting information and defection as non-posting. However, defection in CGM can be divided into two types: doing nothing and posting inadequate information. This issue should be introduced in a future version. Second, while our model assumes that all players can observe all information, this is not realistic. We are interested in the influence when the frequency of information accessibility depends on the quality of the information. In this version, we assume that agents’ payoffs are observable to the others, because we think that another agent’s payoff can be inferred by observing its actions. However, this assumption can be loosened and should be tested in a future extension.