1 Introduction

In two-sided markets, an intermediary provides a platform enabling two different user groups to interact, for instance to make a transaction to satisfy their interdependent demands (Bakos and Katsamakas 2008; Ellison and Ellison 2005; Rochet and Tirole 2003, 2006). Some two-sided online markets have expanded at a furious pace in recent years (Tucker and Zhang 2010). eBay, for example, brings together sellers and prospective buyers of different kinds of goods, Google brings together advertisers and web users, and Prosper brings together lenders and borrowers of private loans (Berger and Gleisner 2009). Eisenmann et al. (2006) provide a comprehensive list of examples for online and offline two-sided markets. Often, a neutral third party manages the platform (Yoo et al. 2002, 2007) with the commercial objective of maximizing its own profits by optimally monetizing one or both user groups.

Previous research on two-sided markets indicates that the two user groups exhibit different kinds of network effects (Katz and Shapiro 1985; Liebowitz and Margolis 1994). Users may derive positive cross-side network effects (CNEs) from the participation of members on the other side of the market, which means the larger the installed user base on one side of the platform, the more attractive the service for the opposite side’s users (Armstrong 2006; Li et al. 2010; Tucker and Zhang 2010). Network effects can also emerge within one user group, known as same-side network effects (SNEs). For example, a new eBay seller can have a negative effect on other sellers because he or she increases competition between sellers and may snatch away potential buyers (Kraemer et al. 2012; Li et al. 2010).

Utilizing positive network effects and mitigating negative ones is an important challenge for providers of two-sided markets. In recent years, the number of scientific studies which empirically assess such effects has been rapidly increasing (Chu and Manchanda 2013). Yoo et al. (2002) highlight the importance of identifying the magnitude of the network effects for both user groups, and state that it is difficult to estimate these effects. Knowledge of the direction and the magnitude of network effects can be used to support customer acquisition, pricing, monetization, and IT investment strategies for two-sided markets (Bakos and Katsamakas 2008; Kraemer et al. 2012; Sridhar et al. 2011).

Our empirical study examines a leading European online dating platform. Although online dating is one of the example industries in the literature on two-sided markets and seems theoretically very promising for identifying network effects (Armstrong 2006; Caillaud and Jullien 2003; Ellison and Ellison 2005; Rochet and Tirole 2003, 2006), this paper is the first to examine this industry empirically. In our case, the two user groups are heterosexual men and women. The platform enables them to search for each other, to communicate and to initiate real-life dates.

For an intermediary of a two-sided market, it is of interest to know how much future revenue and/or profit can be expected from a given user group; this datum informs effective and efficient use of limited budgets (Malthouse and Blattberg 2005; Borle et al. 2008). Prior to our research, between 35 and 41 % of the users on the platform in question were women, and the intermediary aimed to reach a 50/50 split in the near future. Naturally, one might think that equal numbers of men and women on such a platform yield the best user experience (then, every woman matches with a man) and thus the highest revenue per user for the platform intermediary. However, this does not take into consideration the differences in the user groups’ willingness to pay and how CNEs and SNEs impact user behavior.

Our research aims to determine the direction and the magnitude of the different kinds of network effects on the platform and their impact on revenue, both in aggregate and of each user group individually. In addition to this empirical validation of existing theory, we propose an approach to determine the revenue-optimal ratio of men to women on the platform in light of the various existing network effects. We show that the online dating platform in question can significantly increase its revenue with the proper balance of male and female users.

2 Network effects in two-sided markets

2.1 Previous research

Katz and Shapiro (1985) state that for many technologies, users may benefit from a growing user base. Services such as the telephone, e-mail and social networks exhibit positive network effects. These occur if two or more individuals are able to interact within this network, changing their utility of the network.

Two-sided markets have two different user groups. The intermediary provides a platform for the interaction between these groups (Berger and Gleisner 2009; Kraemer et al. 2012; Yoo et al. 2002, 2007). Usually, a user interacts only with participants from the other user group. For example, a retailer aims to sell his or her products on eBay to a certain consumer (not to another retailer), and a heterosexual man looks only for a potential female partner on Match.com.

Such two-sided markets possess network effects across user groups (CNEs) and within a single user group (SNEs). CNEs exist if the number of users on one market side influences the utility of the opposite group’s users. On eBay, for example, an increased number of sellers improves the product selection and makes the platform more attractive to buyers. Similarly, having more buyers increases sellers’ chance of successfully selling their items, thereby making the platform more attractive to them. SNEs exist if a user’s utility is affected by the installed user base of his or her own user group (Armstrong 2006; Eisenmann et al. 2006). For example, more eBay sellers competing for a given number of potential buyers reduce each other’s chances of transacting with a buyer. Depending on the investigated market, SNEs can possess either a negative effect (Dai and Kauffman 2006; Villanueva et al. 2008; Yoo et al. 2002, 2007) or a positive one (Bakos and Katsamakas 2008; Eisenmann et al. 2006) on users’ utility.

To date, the literature on two-sided online markets has concentrated mainly on two research paths. The first path focuses on pricing considerations that are specific to two-sided markets experiencing network effects and examines which price structure to apply at which price level, and which user group to charge for using the services provided (Armstrong 2006; Chao and Derdenger 2013; Eisenmann et al. 2006; Jullien 2005; Parker and Van Alstyne 2005; Rochet and Tirole 2003, 2006; Rysman 2009). In the case of eBay, the platform may charge sellers, buyers, or both user groups for using the platform.

The second research path investigates the effectiveness of the intermediary’s investment decisions and is more closely related to our work. Bakos and Katsamakas (2008) analyze design choices and investments such as the quality of technology, the services offered to each side, and the rules of interaction between the two user groups that create network effects in two-sided markets. Yoo et al. (2002) offer different strategies to optimize the intermediary’s revenue, depending on the ownership model of the platform. Kraemer et al. (2012) find asymmetric network effects on an eBay-like platform and assess the effectiveness of various IT and design investment features in increasing the platform value. Tucker and Zhang (2010) examine the impact of advertising the size of the user base on further participation of buyers and sellers in two-sided markets.

While these studies aim to find the intermediary’s optimal strategy (e.g., in terms of revenue or platform value) to invest in IT improvements, quality, or marketing strategies, our paper searches for the revenue-optimal split between the two user groups, after empirically establishing the directions of all corresponding network effects, considering fixed user fees. In spite of substantial theoretical and methodological work on network effects, Wilbur (2008) as well as Kraemer et al. (2012) state that empirical analyses are still scarce due to a lack of real-life data to properly identify the effects within and across the user groups. Table 1 summarizes the results of these empirical studies and highlights the research gap and the contribution of our paper.

Table 1 Empirical studies assessing the economic results of network effects across user groups

Our paper is related to the studies shown in Table 1, but with some notable differences. Chu and Manchanda (2013) state that previous work often focused on the benefits (or costs) a user obtains from additional users from either the same or the opposite user group, but not simultaneously from both sides. As a consequence, many studies thoroughly quantify CNEs, yet do not consider SNEs (e.g., Brynjolfsson and Kemerer 1996) or use lagged sales as a proxy for SNEs (Sridhar et al. 2011). The few existing studies that investigate both direct CNEs and SNEs use their results to model individual behavior (Tucker and Zhang 2010) or network value (Asvanund et al. 2004), while our study examines the direct impact of network effects on the intermediary’s revenue, number of users and subscribers. We also notice that most studies employ data on a market level, while our study, like few others, uses unique transactional data provided by a company. In addition, our work is the first empirical paper that studies network effects in the online dating industry.

2.2 Expected network effects on an online dating platform

2.2.1 Online dating

Three parties are involved in such a market, namely the intermediary that provides the platform and the two user groups, women and men, looking for potential partners. For reasons of simplicity (see likewise Armstrong 2006), we focus our analysis on participants looking for users of the opposite gender. Men searching for men and women searching for women are both homogeneous user groups without interaction with other user groups, and thus form a one-sided market, which is not part of our study.

Users of a dating platform clearly belong to one market side. When registering, a new user provides information on his/her gender and whether he/she is interested in meeting men or women. After this, the user typically does not change his/her role. In contrast, an eBay user can both sell and buy items at the same time, which makes it more difficult to identify the occurring network effects.

Most online dating platforms use a ‘freemium’ pricing model. On platforms with this model, new users can create a profile for free, browse through the profiles of other users, see who visited their own profile, and send preset short messages known as ‘winks’ (such as “your picture looks nice”) to other users. However, only paying users who have purchased a subscription can start full-text conversations with others and reply to winks. This means that at least one person (man or woman) needs to be a paying user to initiate the contact and possibly a ‘first date’ later on. A look at the 100 top-grossing dating and social networking apps for iPhone in each of the US, Japan and Germany (AppAnnie 2015) shows that 30 out of 34 such apps (i.e., 88 %) follow such a freemium strategy.

Kinsey et al. (1948) describe that the traditional gender role expects men to initiate contacts and women to respond. In real-life dating, women usually receive more offers from men than vice versa (Gutek et al. 1990). In addition, Fisman et al. (2006) report from a speed dating experiment that men respond more strongly to their counterparts’ physical attractiveness. If this holds true for online dating, one can expect men’s willingness to subscribe to the paid service to be stronger than that of women.

2.2.2 CNEs

The main purpose of using an online dating platform is to look for, find, and contact potential partners of the opposite gender. Hence, users of one user group (e.g., men) care especially about the number of users on the other side (in this case: women) (Armstrong 2006; McIntyre and Subramaniam 2009; Tucker and Zhang 2010). Two-sided markets yield effects in which users in one group choose a good that affects another group’s choice of a different good (Parker and Van Alstyne 2005). For example, a woman joining the dating platform may motivate men to contact her. Thus, the utility of the platform to a paying male (female) user increases when he (she) can communicate with more women (men) (Yoo et al. 2002), which means that paying users on both market sides enjoy positive network effects from the installed user base on the opposite market side. This network effect can reflect the increased probability of finding a satisfactory match among the other side’s users (Bakos and Katsamakas 2008). Keeping the number of men constant, more women offer men a wider variety of matches (Ellison and Ellison 2005; Gehrig 1998), a greater chance of finding a unique fitting match (Caillaud and Jullien 2003), and reduce the competition between men for a specific woman (Dai and Kauffman 2006; Wang and Seidmann 1995; Yoo et al. 2002).

Previous research has shown that positive network effects leading to increased user enjoyment of the underlying service also have a positive impact on customers’ willingness to pay (e.g., Borgatti et al. 2009; Brynjolfsson and Kemerer 1996; Farrell and Saloner 1985; Katz and Shapiro 1985). Eisenmann et al. (2006) as well as Ellison and Ellison (2005) show that both user groups in two-sided markets are willing to pay more for access to a bigger network.

CNEs can also positively influence user acquisition (Villanueva et al. 2008) and retention (Chen and Xie 2007; Nitzan and Libai 2011). Single men or women are more likely to join a platform that possesses a large number of relevant users than one that does not (Li et al. 2010). The impact of CNEs on retention or churn, however, is not trivial in the given freemium context: On the one hand, a smaller number of users of the opposite gender makes a dating service less attractive because the chances of finding a partner are lower. Thus, many users would be frustrated and may sign off earlier. On the other hand, a smaller number of users of the opposite gender may hinder people from signing up to the service (if they know in advance) or aggravate existing users’ search for a fitting match, which may then lead to a longer usage lifetime for both sides.

2.2.3 SNEs

While CNEs are usually positive (but not always, as Sridhar et al. 2011 show), SNEs can be commonly found both ways in two-sided markets. For example, positive effects on each user’s network utility can be found if game console owners appreciate co-playing and trading games with friends who possess the same console (Eisenmann et al. 2006), or if the platform users create a community that can provide support, collaborate, and share information with other users (Bakos and Katsamakas 2008). However, in most cases, SNEs have a negative effect on users’ utility, especially in markets where users prefer fewer rivals (e.g., sellers on eBay competing for the same buyers) (Dai and Kauffman 2006; Li et al. 2010; Tucker and Zhang 2010; Wang and Seidmann 1995). Following the aforementioned idea that the utility of the online dating platform to a specific user increases when he/she can contact more users of the opposite gender, the utility of the service should decrease when there are more users of the same gender (i.e., rivals) competing for the users of the other group.

At any point in time, men and women on the platform can use the search function to check the number of users of each gender. Here, the number of users of the opposite gender is more relevant as users are looking for a partner, not a rival. Still, men/women have the option to search the platform for users of their own gender. In practice, however, they often estimate the number of rivals and the chance to find a match based on ‘weak signals’ such as the number of profile visits they receive from interested users of the opposite gender (Bapna et al. 2012) or the share of received messages or winks. Having too many rivals may eventually lead to fewer registrations from that user group, faster churn, and/or fewer subscriptions to the charged service. It will be interesting to see if we find substantive negative SNEs at all in our empirical study and how they differ between men and women.

3 Theoretical validation: identifying direction and magnitude of network effects

3.1 Platform and data description

In this section, we aim to empirically examine the existence and measure the magnitude of various network effects. To do so, we use customer and payment data from a leading European online dating platform that has been operational for approximately 10 years. Users can register and create a profile for free. Every user must provide a nickname, his/her gender, place of residence and whether he or she would like to meet men or women. In addition, users can share profile pictures, age, hobbies and other personal details. All users can actively search and browse through the profiles of other (male or female) users in their vicinity. As mentioned above, we focus on the cases of men searching for women and vice versa.

The online dating platform applies the industry-typical freemium model described in Sect. 2.2: signing up and searching for other users is free of charge; however, users have to subscribe to one of the two available premium packages to be able to initiate conversations with other users. The premium packages, which we refer to as ‘Silver’ and ‘Gold’, can be purchased as a subscription and can be renewed at any time. Prices lie between €20 and €60 per month, depending on the length of the subscription (1–12 months; longer term packages incur a lower monthly price) and the chosen package (Gold is more expensive than Silver). The subscription prices were not changed during the entire investigated timeframe and are the same for both women and men. The Silver package allows subscribers to send messages, initiate chats and see which members are interested in their profile. The Gold package additionally highlights its subscribers in the search results and recommends users of the opposite gender in the same city with similar interests and hobbies.

Our data set covers two and a half years, from July 1st, 2010 to December 31st, 2012 (i.e., 915 consecutive days). We examine the data from one sample city of approximately 100,000 inhabitants. A total of 8923 users registered within our sample period, of which 40.8 % were women.

The analyzed payment data covers all transactions (i.e., subscriptions to a premium package) including the start and end date of the subscription, the product type (Silver or Gold), and the price. All incomplete transactions, such as fraud, chargebacks, and free upgrades (“try our Gold membership for free for 1 month!”) are excluded from the sample. The dating platform generated a total revenue of approximately €90,000 from paying users. The majority (89.3 %) of the revenue is produced by male users. Not only do men spend more money on the platform, they are also more loyal to it. The median lifetime (i.e., interval between registration and sign-off) is 102 days for a male user and 75 days for a female user. Table 2 provides a data summary and Table 3 shows the key figures on a daily basis. In addition, we show in Table 4 that Gold customers have significantly higher daily and total revenue compared to Silver users.

Table 2 Descriptive data
Table 3 Descriptive data on a daily basis, N = 915 days
Table 4 Silver versus Gold subscriptions: comparison of duration and revenue

3.2 Model and variables

Kraemer et al. (2012) summarize that network effects in two-sided markets can be measured in several ways. Among them are choice models (e.g., Pavlou 2002; Rysman 2009; Stock and Yogo 2005), diffusion models (e.g., Gandal et al. 2000; Gupta et al. 2009; Chu and Manchanda 2013), vector autoregressions (e.g., Chen et al. 2001) and linear regressions (e.g., Hendel et al. 2007; Seamans and Zhu 2013). To account for the specifications of both user groups, we estimate simultaneous equation models (SURE; seemingly unrelated regression equations), as used by Mantrala et al. (2007) and Sridhar et al. (2011). To address the challenges that outliers pose for some statistical models, we use the Huber–White sandwich estimators (Huber 1967; White 1980) in all our models, thereby obviating minor concerns about potential violations of assumptions such as normality and homoskedasticity, and about observations that exhibit large residuals, leverage, or influence.

3.2.1 Dependent variables

For the purposes of our study, we consider revenue maximization on a per-user level and in total to be the primary economic variable, and aim to assess to what extent network effects (both CNEs and SNEs) describe the investigated platform’s total revenue within a given timeframe. We break down the activity and revenue data on a daily basis. When a user subscribes to a premium package, we split the relevant revenue evenly over the entire subscription period. An example: a free user subscribes to a premium package from January 1st, 2012 to March 31st, 2012 (i.e., roughly 90 days) for a total of €180, and returns to using the service for free afterwards. In our data set, he/she shows daily revenue of roughly €2 in these 3 months, and daily revenue of zero before and after. Using this approach, revenue can be stated (for any day) as the product of the average revenue per user and the installed base (per-user group). In our regression models, we will consecutively check for network effects, first describing the daily revenue per user (DailyRevenuePerWoman/Man), second the net user gains (NetGainWomen/Men, i.e., variation of the installed base compared to the previous day), and finally the total revenue (DailyRevenueAllWomen/AllMen/AllUsers).
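The even split of subscription revenue over days can be sketched as follows. This is a minimal illustration rather than the procedure applied to the platform data: the function name `daily_revenue`, the tuple layout, and the sample subscription (90 days for €180) are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date, timedelta


def daily_revenue(subscriptions):
    """Spread each subscription's price evenly over its days.

    `subscriptions` is an iterable of (start_date, n_days, total_price_eur).
    Returns a dict mapping each calendar day to the revenue booked on it.
    """
    revenue = defaultdict(float)
    for start, n_days, total in subscriptions:
        per_day = total / n_days
        for offset in range(n_days):
            revenue[start + timedelta(days=offset)] += per_day
    return revenue


# Hypothetical subscription: 90 days for EUR 180, starting 1 January 2012.
rev = daily_revenue([(date(2012, 1, 1), 90, 180.0)])
first_day = rev[date(2012, 1, 1)]                 # revenue on the first paid day
day_before = rev.get(date(2011, 12, 31), 0.0)     # no revenue before the start
```

Summing such per-day values over all users of one gender gives the daily total for that group; dividing by the installed base gives the per-user figure.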

3.2.2 Independent variables

Most research models (e.g., Armstrong 2006; Bakos and Katsamakas 2008; Fudenberg and Tirole 2000; Katz and Shapiro 1985; Pang and Etzion 2012; Yoo et al. 2002) consider network effects to be linear in the size of the relevant user base. For our models in Sect. 3.3, we also employ a linear specification of network effects, counting the number of active Men and Women as the relevant user bases. Later, in Sect. 4, we use a modified model to ascertain the optimal user split between men and women.

According to the intermediary, most dating customers register in the evening and need some time to set up their profile and upload appropriate profile pictures. We therefore assume that they start affecting other users with a time lag of 1 day (see likewise Chu and Manchanda 2013); Men and Women are therefore the number of users at the end of the previous day. In addition, we consider several control variables such as the platform lifetime in days as well as dummy variables for extraordinary TV events, seasonality, and major updates to the platform. These dummy variables are set at 1 if applicable to a certain case, and 0 if not. For example, Update2 went live on day 4088; all cases prior to the update have been labeled with 0 and with 1 as of that day. During the sample period of two and a half years, the platform underwent ten permanent updates such as design changes and the introduction of new features. Table 5 describes the independent variables used in our model.
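The one-day lag and the step-style update dummies can be sketched as below. The helper names `lagged` and `step_dummy` and the user counts are hypothetical and serve only to illustrate the construction; only the go-live day 4088 for Update2 is taken from the text.

```python
def lagged(series, lag=1):
    """Shift a daily series by `lag` days; the first `lag` days have no value."""
    if lag < 1:
        raise ValueError("lag must be at least 1")
    return [None] * lag + list(series[:-lag])


def step_dummy(days, go_live_day):
    """Return 0 for days before the update went live and 1 from that day on."""
    return [1 if d >= go_live_day else 0 for d in days]


# Hypothetical end-of-day counts of active women on five platform days.
platform_days = [4086, 4087, 4088, 4089, 4090]
women_end_of_day = [310, 312, 315, 314, 318]

women_lagged = lagged(women_end_of_day)      # regressor "Women" for each day
update2 = step_dummy(platform_days, 4088)    # Update2 went live on day 4088
```

The first observation has no lagged value and would be dropped before estimation.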

Table 5 Description of the independent variables

3.3 Identification of CNEs and SNEs

Our first analysis investigates how network effects describe the average daily revenue per user. The employed SURE model treats the average DailyRevenuePerWoman and DailyRevenuePerMan (both in eurocent) as dependent variables. We estimate several models to ensure the robustness of our results. We begin by considering only the number of users of the same gender as independent variables (SNEs; model 1) and successively include additional parameters: the number of users of the opposite gender (CNEs; model 2), platform parameters (model 3), and eventually seasonal parameters (complete model 4). Table 6 summarizes the results. We detect no change of algebraic signs for the significant variables from one model to another and thus conclude that our findings are robust. We emphasize that these results are of a descriptive nature only and that we can only assume causality due to the strong theoretical background available in the domain of network effects.
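For orientation, the complete model 4 can be written as the two-equation system sketched below. The coefficient symbols and the control vector $x_t$ (platform and seasonal parameters) are illustrative notation introduced here, not labels taken from Table 6; $\text{Women}_{t-1}$ and $\text{Men}_{t-1}$ denote the installed bases at the end of the previous day.

$$\begin{aligned}
\text{DailyRevenuePerWoman}_{t} &= \alpha_{w} + \beta_{ww}\,\text{Women}_{t-1} + \beta_{wm}\,\text{Men}_{t-1} + \gamma_{w}^{\top} x_{t} + \varepsilon_{w,t},\\
\text{DailyRevenuePerMan}_{t} &= \alpha_{m} + \beta_{mm}\,\text{Men}_{t-1} + \beta_{mw}\,\text{Women}_{t-1} + \gamma_{m}^{\top} x_{t} + \varepsilon_{m,t}.
\end{aligned}$$

In this notation, $\beta_{ww}$ and $\beta_{mm}$ capture SNEs, $\beta_{wm}$ and $\beta_{mw}$ capture CNEs, and the SURE estimation allows the error terms $\varepsilon_{w,t}$ and $\varepsilon_{m,t}$ to be correlated across the two equations.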

Table 6 Results from SURE model (dependent variables: DailyRevenuePerWoman and DailyRevenuePerMan in eurocent)

As we expected, we see that male users generate a much higher basic daily revenue (constant is 17.03, p < 0.01) compared to female users (8.35, p < 0.01). Without considering any network effects, adding more men to the platform would therefore be much more remunerative than adding additional women. However, our model also finds support for negative SNEs on both sides: we can see that the installed base of female users Women is negatively correlated (p < 0.01) with DailyRevenuePerWoman, as Men is with DailyRevenuePerMan. We can also find positive correlations between Men and DailyRevenuePerWoman as well as between Women and DailyRevenuePerMan. Both are highly significant (p < 0.01) and support positive CNEs. Looking at the magnitude of the network effects, we see that the positive CNEs that women exert on men are stronger than vice versa (0.00664 vs. 0.00274). Moreover, the negative SNEs effected by women are weaker than those by men (−0.00495 vs. −0.00831). While such positive CNEs could be expected, it is interesting to see that we find significant negative SNEs in both cases. Users—and primarily men—are indeed affected by stronger competition, which leads to reduced user expenditures on the focal service.

Next, we estimate a SURE model with NetGainWomen and NetGainMen as dependent variables. For each day, NetGainWomen/Men describes the change of the installed user base (per-user group) compared to the previous day (i.e., new registrations minus churners). Table 7 shows the results.

Table 7 Results from SURE model (dependent variables: NetGainWomen and NetGainMen in number of users)

While we cannot find any significant CNEs (in the final model), we observe significant negative SNEs on both user sides. The more women (men) on the platform, the higher the number of churning women (men). We interpret this result as a competition effect that strengthens the negative SNEs we have seen regarding DailyRevenuePerUser: in case of strong competition, users do not only tend to stay free users, but they are also more likely to leave the platform. We do not observe positive reputation or popularity effects (i.e., the site growing faster as prospective customers learn that more people are using it; see Table 8 for a respective analysis).

Table 8 Results from SURE model (dependent variables: NetGainWomen and NetGainMen)

These results are especially interesting as they indicate that each user group has a reasonable maximum size. With additional users, it becomes increasingly hard (and probably expensive) for the intermediary to acquire and keep users of a certain gender. At such a point, it may become more effective to acquire new users of the opposite user group (which brings us to the determination of the optimal split between men and women in Sect. 4.2).

We will now examine the impact that additional Women and Men have on total daily revenue. Table 9 displays the results of this analysis. Consistent with our previous analyses, we employ a SURE model to estimate DailyRevenueAllWomen and DailyRevenueAllMen, while we apply a separate OLS regression model to estimate DailyRevenueAllUsers.

Table 9 Results from SURE model (dependent variables: DailyRevenueAllWomen, DailyRevenueAllMen in €) and OLS regression (dependent variable: DailyRevenueAllUsers in €)

We can see that additional women always generate additional revenue. Despite the negative SNEs leading to lower daily revenue per woman (Table 6), higher churn of female users (Table 7) and lower total revenue from women (Table 9), more women still have a positive revenue effect because of the positive CNEs they exert. While one additional female user reduces DailyRevenueAllWomen by €0.21, she increases DailyRevenueAllMen by €0.69, which results in a net increase of DailyRevenueAllUsers of €0.48 per day. Here, a user’s basic willingness to pay together with the positive CNEs more than compensates for the negative SNEs.

On the other side, although men are the main payers (independent of network effects), increasing the number of men does not always mean additional revenue. While we can find significant positive CNEs on DailyRevenueAllWomen, the effect of purely adding male users on DailyRevenueAllMen and DailyRevenueAllUsers is insignificant, mostly because of the aforementioned negative SNEs. Knowing of the existence of these effects, we aim to find the revenue-optimal split of men and women in the next section.

4 Practical application: determining the revenue-optimal share of men and women

4.1 Motivation and numeric example

Most two-sided markets are managed with the objective of maximizing profit generation via the paying installed user base (Yoo et al. 2002, 2007). Due to budgetary constraints, intermediaries are only able to acquire and serve a finite number of users. Such intermediaries may fail to maximize their revenue if they do not consider the network effects present on their platform. To demonstrate how such circumstances can lead to mismanagement of the platform, we will now examine a numerical example using the results from our previous analyses.

In Sect. 3.3, we found that male users spend more money on average than female users, but female users carry an additional indirect revenue potential because the positive CNEs they exert on revenue generated per male user are stronger than vice versa. In addition, the existing negative SNEs are stronger for men than for women. We apply these results to the simplified numerical example in Table 10.

Table 10 User characteristics in a fictitious two-sided dating market

For our example, we assume the network provider possesses a budget to acquire 100 users of any gender and is looking for the split between men m and women w yielding the highest overall revenue. Total revenue equals the sum of the revenue generated by both men and women:

$$\text{Rev}_{m,w} = m \cdot \text{prob}_{m}\,\text{fee}_{m} + w \cdot \text{prob}_{w}\,\text{fee}_{w} \quad \text{with}\; m + w = 100.$$
(1)

In this stylized two-sided market, men are more likely to become paying users than women (prob_m = 6 % vs. prob_w = 2 %). In both groups, paying users pay the same average fee (fee_m = fee_w = €100). This means an average man generates revenue of 6 % · €100 = €6, while a woman generates on average only 2 % · €100 = €2. An intermediary who does not consider network effects would thus conclude that they should only acquire men as users and not a single woman.
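Substituting the constraint $w = 100 - m$ and the values above into Eq. (1) makes this conclusion explicit:

$$\text{Rev}_{m,w} = 6m + 2\,(100 - m) = 200 + 4m.$$

Revenue in this simplified setting increases linearly in $m$ and therefore peaks at the corner solution $m = 100$, $w = 0$, with a total of €600.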

This strategy, however, is clearly questionable, as a dating platform without women offers men no reason to become paying users. We will now consider the impact of CNEs and SNEs upon the basic purchase likelihood. As shown in the following quadratic equation, each user’s expected revenue is influenced by positive CNEs from all users of the opposite gender and negative SNEs from all other users of the same gender.

$$\text{Rev}_{m,w} = m \cdot \left(\text{prob}_{m}\,\text{fee}_{m} + \text{CNE}_{w}\, w + \text{SNE}_{m} \cdot (m - 1)\right) + w \cdot \left(\text{prob}_{w}\,\text{fee}_{w} + \text{CNE}_{m}\, m + \text{SNE}_{w} \cdot (w - 1)\right) \quad \text{with}\; m + w = 100.$$
(2)

Differentiating the revenue formula (2) with respect to m and setting the derivative to zero allows us to determine the revenue-optimal split of male and female users (we expand on this process in Sect. 4.2). Figure 1 shows the total revenue in our example of all male and female users combined when considering user split-dependent network effects. Changing the share of women (i.e., the horizontal axis in Fig. 1) shows two effects: First, as men generally have a higher probability of becoming paying users, the revenue stemming from this basic likelihood is highest with more men, even in light of the negative revenue impact of SNEs. Second, an elevated proportion of female users exerts CNEs, which have the highest positive revenue impact at 50 % of the user base. The CNE-induced revenue curve follows an inverted U-shaped form, and the revenue surplus is incrementally reduced with a lower/higher share of women. Taken together, these effects lead to a revenue optimum at circa 67 % men and 33 % women. Given the same total number of users, the intermediary’s revenue is 5.4 % higher than in a 50/50 user split (€567.99 vs. €538.75).
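The optimization of Eq. (2) can be sketched numerically. Substituting w = n − m and setting the derivative with respect to m to zero yields a closed-form stationary point, which the sketch below compares with a brute-force search over all integer splits. The parameter values are hypothetical placeholders, not the Table 10 values, so the resulting split should not be expected to match the optimum reported for the example; the function and argument names are likewise our own.

```python
def revenue(m, n, base_m, base_w, cne_w, cne_m, sne_m, sne_w):
    """Total revenue from Eq. (2) with m men and w = n - m women.

    base_m and base_w stand for prob * fee of each group.
    """
    w = n - m
    per_man = base_m + cne_w * w + sne_m * (m - 1)
    per_woman = base_w + cne_m * m + sne_w * (w - 1)
    return m * per_man + w * per_woman


def optimal_men(n, base_m, base_w, cne_w, cne_m, sne_m, sne_w):
    """Stationary point of Eq. (2) in m after substituting w = n - m.

    It is a maximum only if cne_m + cne_w - sne_m - sne_w > 0, and it is
    a feasible split only if it lies between 0 and n.
    """
    numerator = (base_m - base_w + n * (cne_m + cne_w)
                 - sne_m - (2 * n - 1) * sne_w)
    denominator = 2 * (cne_m + cne_w - sne_m - sne_w)
    return numerator / denominator


# Hypothetical parameters (NOT the Table 10 values).
params = dict(n=100, base_m=6.0, base_w=2.0,
              cne_w=0.05, cne_m=0.03, sne_m=-0.04, sne_w=-0.02)

best_m = max(range(params["n"] + 1), key=lambda m: revenue(m, **params))
m_star = optimal_men(**params)
```

With positive CNEs and negative SNEs the revenue function is concave in m, so the best integer split lies next to the closed-form stationary point.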

Fig. 1 Revenue-optimal share of women in the numerical example

This simple example demonstrates that intermediaries with knowledge of network effects can make better business decisions, for example by identifying and profitably acquiring those customers who promise the highest revenue contribution to the network; such an identification enables the intermediary to optimize the user split on their platform.
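
The stylized example can be sketched in a few lines of Python. The purchase probabilities and fees are taken from the text above; the CNE and SNE coefficients in `PARAMS` are hypothetical placeholders chosen only for illustration, so the sketch is not expected to reproduce the 67/33 optimum or the revenue figures reported for the example. All function names are likewise our own.

```python
# Stylized example of Eq. (2). Purchase probabilities and fees are taken from
# the text; the CNE and SNE coefficients are HYPOTHETICAL placeholders.
PARAMS = {
    "prob_m": 0.06, "fee_m": 100.0,   # from the text
    "prob_w": 0.02, "fee_w": 100.0,   # from the text
    "cne_w": 0.10,    # hypothetical: revenue added per man by each woman
    "cne_m": 0.05,    # hypothetical: revenue added per woman by each man
    "sne_m": -0.03,   # hypothetical: revenue lost per man for each other man
    "sne_w": -0.03,   # hypothetical: revenue lost per woman for each other woman
}


def total_revenue(m, n_users=100, p=PARAMS):
    """Total revenue from Eq. (2) for m men and n_users - m women."""
    w = n_users - m
    rev_men = m * (p["prob_m"] * p["fee_m"] + p["cne_w"] * w + p["sne_m"] * (m - 1))
    rev_women = w * (p["prob_w"] * p["fee_w"] + p["cne_m"] * m + p["sne_w"] * (w - 1))
    return rev_men + rev_women


def best_integer_split(n_users=100, p=PARAMS):
    """Grid search over all integer splits; returns (men, women)."""
    m = max(range(n_users + 1), key=lambda k: total_revenue(k, n_users, p))
    return m, n_users - m


def stationary_point(n_users=100, p=PARAMS):
    """Real-valued m where the derivative of Eq. (2) with w = n_users - m is zero."""
    a = p["prob_m"] * p["fee_m"]
    b = p["prob_w"] * p["fee_w"]
    c = p["cne_w"] + p["cne_m"]
    numerator = a - b + c * n_users - p["sne_m"] + p["sne_w"] - 2 * n_users * p["sne_w"]
    return numerator / (2 * (c - p["sne_m"] - p["sne_w"]))


men, women = best_integer_split()
print(men, women, round(total_revenue(men), 2))
```

The grid search and the closed-form stationary point should agree to within rounding to whole users, which gives a quick consistency check on any set of coefficients plugged into `PARAMS`.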

4.2 User split optimization for the investigated platform

We will now use authentic data to empirically determine the optimal ratio of male to female users with regard to the highest possible revenue generation. Our approach is usable for platform intermediaries in two-sided markets that aim at an effective use of their limited user acquisition budgets.

In this section, we use the same data set as in Sect. 3 with slightly adjusted variables in the OLS model. First, we now use DailyRevenuePerUser as the dependent variable (i.e., DailyRevenueAllUsers/Users) to ascertain the proportion of women that yields the best results. We also replace the previously used absolute user numbers Men and Women with an independent variable that represents the proportion of female users, both in linear and quadratic form (ShareOfWomen and ShareOfWomenSquared, each as a percentage of total users). All other variables remain the same as those in Table 6. Table 11 shows the results of the employed regression model.

Table 11 Results from OLS regression
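
As an illustration of the specification only, the following sketch fits a quadratic in the share of women by ordinary least squares on synthetic data. It is not the authors' estimation: the control variables from Table 6 are omitted, the data are simulated, and the "true" coefficients, the noise level, and all function names are assumptions made for this example.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_quadratic_ols(shares, revenues):
    """OLS of revenue on [1, share, share^2] via the normal equations."""
    rows = [[1.0, s, s * s] for s in shares]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, revenues)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Synthetic daily observations (hypothetical coefficients, true optimum at 0.36).
rng = random.Random(0)
TRUE_B0, TRUE_B1, TRUE_B2 = 1.0, 3.6, -5.0
shares = [rng.uniform(0.30, 0.45) for _ in range(915)]
revenues = [TRUE_B0 + TRUE_B1 * s + TRUE_B2 * s * s + rng.gauss(0, 0.01) for s in shares]

b0, b1, b2 = fit_quadratic_ols(shares, revenues)
estimated_optimum = -b1 / (2 * b2)   # vertex of the fitted parabola
print(round(estimated_optimum, 3))
```

In practice a statistics package would also report standard errors; because the share of women varies within a narrow band, the linear and squared terms are strongly correlated, so the precision of the estimated optimum deserves attention.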

Table 11 shows a positive linear and a negative quadratic influence of the proportion of female users on total revenue per user, which yields a single revenue-maximizing share of women. The regression formula (3) differs slightly from the equation used in the numerical example (2), as we are not restricted to a total of 100 users and use different variables than in the simplified model. Differentiating the abridged regression formula (3) shown below with respect to ShareOfWomen and setting the derivative to zero (4) yields a critical point at a proportion of female users of 36.2 % (5). As Eq. (4) shows, the function’s second derivative, 2 · (−479.6163), is negative, so the identified point is a maximum in terms of revenue: the desired revenue-optimal proportion of female users.

$$\text{DailyRevenuePerUser} = \beta_{0} + \beta_{1}\,\text{ShareOfWomen} + \beta_{2}\,\text{ShareOfWomenSquared} + \cdots + \varepsilon$$
(3)
$$\frac{d\,\text{DailyRevenuePerUser}}{d\,\text{ShareOfWomen}} = 346.8351 + 2 \cdot (-479.6163) \cdot \text{ShareOfWomen} = 0$$
(4)
$$\text{(Optimal) ShareOfWomen} = \frac{-346.8351}{-2 \cdot 479.6163} = 36.2\,\%.$$
(5)
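
The arithmetic in Eqs. (4) and (5) can be checked with a short sketch. The two coefficients are the values reported in Eq. (4); the function names and the `revenue_gap` helper are our own additions. The gap is expressed in the units of the dependent variable and holds all other regressors fixed; converting it into a percentage would additionally require the fitted revenue level, which the abridged formula does not report.

```python
BETA_1 = 346.8351    # coefficient on ShareOfWomen, as reported in Eq. (4)
BETA_2 = -479.6163   # coefficient on ShareOfWomenSquared, as reported in Eq. (4)


def optimal_share(beta_1, beta_2):
    """Vertex of beta_1 * s + beta_2 * s^2; a maximum only if beta_2 < 0."""
    if beta_2 >= 0:
        raise ValueError("no interior maximum: quadratic coefficient must be negative")
    return -beta_1 / (2.0 * beta_2)


def revenue_gap(share, beta_1, beta_2):
    """Revenue per user lost relative to the optimum, other regressors held fixed."""
    return -beta_2 * (share - optimal_share(beta_1, beta_2)) ** 2


s_star = optimal_share(BETA_1, BETA_2)
print(f"{s_star:.1%}")  # prints 36.2%
```

Because the fitted curve is a parabola, the loss from deviating from the optimum grows with the square of the deviation, which is why a 50/50 split is penalized far more heavily than the historical maximum share of 41.1 %.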

Similar to the previous approach, the logit regression model in Table 12 aims to assess the proportion of women yielding the highest share of paying users. In both analyses, we find similar results: a share of women of 34.6 % leads to the highest share of premium subscribers, while 36.2 % maximizes the intermediary’s revenue. Below this optimum, additional female users contribute higher utility to the overall network (through positive network effects) than men, leading to either more subscribers or additional revenue. When the optimum is surpassed, adding more men to the platform will be more valuable than adding more women.

Table 12 Results from logit regression analysis (dependent variable: IsPayer)

Figure 2 illustrates the relationship between the share of female users and the expected total revenue. As in our numerical example, the curve has an inverted U-shape. As shown previously in Table 3, the proportion of women fluctuated between 35.5 and 41.1 % over the course of our 915-day observation period. Our results show that, given a constant number of users, the intermediary’s total daily revenue will be approximately 2 % higher with a women’s share of 36.2 % compared to the historical maximum of 41.1 %, and a full 17.2 % higher compared to a 50/50 split. These results indicate that the intermediary in our study should abandon its previous goal of reaching a 50/50 user split.

Fig. 2 Revenue-optimal share of women

5 Discussion

5.1 Research contributions

Our study investigates users’ spending behavior on an online dating platform. Despite progress in gender equality, findings from more than 60 years ago (Kinsey et al. 1948) still seem to apply. Asymmetric societal norms still exist in people’s mate searching behavior that prevent women from making the first move (Bapna et al. 2012; Fisman et al. 2006; Piskorski 2012). This aspect certainly accounts for the results of our study, in which men are more willing to pay for online dating services (assuming a sufficient installed base of women) than vice versa.

Our study identifies the existence and the magnitude of the various network effects in this market. Estimating a SURE model reveals positive CNEs in both directions: having more female users increases the average revenue per male user and the total revenue generated by men, and vice versa. A larger choice set of potential partners increases the chance of a free user finding someone he/she is interested in and eventually subscribing to a premium package that allows him/her to send messages to other platform users.

Furthermore, we find negative SNEs for both men and women. Increasing the number of women reduces the revenue per woman, leads to a higher churn of female users, and reduces the total revenue generated from the installed base of women. Increasing the number of men reduces the revenue per man and leads to a higher churn of male users while there is no significant negative impact on total revenue. In the given freemium model, free users might be deterred from purchasing the premium package if there are too many people competing for a given number of users of the opposite gender. Existing premium users who send messages to potential partners may receive fewer answers if there is too much competition, and become frustrated. As a consequence, the share of customers who renew their subscription may decrease and the number of users leaving the platform may increase.

In addition, we observe that the positive CNEs that women exert are stronger than the SNEs on the women’s side. As long as the number of male users clearly exceeds the number of female users, additional women always mean extra revenue. For men, we could not find statistical support for a corresponding effect: because positive CNEs combine with substantial negative SNEs, the total revenue impact of solely increasing the number of male users is not significant.

5.2 Practical contributions

We have shown that operators of two-sided markets who aim to optimize their revenue can use information on network effects to acquire, manage and monetize their user base more effectively. We present an approach that determines the optimal split between the two user groups in terms of revenue and the number of premium subscribers. We find a positive linear and a negative quadratic influence of the proportion of women on revenue and number of subscribers. Thus, the utility of incremental women (who are the user group that exerts the strongest CNEs) for the entire network follows an inverted U-shape. This is in line with the work by Bapna and Umyarov (2012), who discovered in an online social music network that the strength of influence decreases with a user’s number of friends. Our model can be easily extended, for example to other regional markets, or to any other two-sided platform with network effects.

We find that a female proportion of the user base of 34.6 % leads to the highest share of premium subscribers, while the revenue-optimal proportion of women was 36.2 %. As the platform’s share of female users was at circa 40 % and therefore above the revenue optimum at the end of our observation timeframe, acquiring more male users promised higher future revenue at that time. Our findings are in conflict with the intermediary’s initial strategy to achieve a 50/50 user split between women and men. Our results indicate that the optimal user split generates 17.2 % more revenue with the same number of users than the targeted, intuitive 50/50 split.

The company that provided us with the data used the results from our analysis to develop a decision support system that assesses the expected customer lifetime value (CLV) of an additional male or female user. The system continuously collects information to provide an updated assessment of the revenue-optimal share of women at any time. This allows the company to identify those users who promise the highest incremental value for the platform and to adapt its customer acquisition strategy accordingly. Based on our static results from the end of our observation timeframe, the network intermediary reallocated its marketing budgets to acquire more male users; it adapted the cost per install (CPI) for new users according to the CLV projection for new users and launched a marketing campaign that primarily targeted male singles.

5.3 Limitations

The limitations of our study offer several avenues for interesting future research. First, there are other possible ways to measure the influence of network effects on revenue. Besides the employed SURE, OLS and logit regression models, a random-effects or fixed-effects panel data model would also be appropriate. Alternatively, a hazard model could be used to better understand the dynamics of the development. In our work, we employed a linear—and for the optimization problem additionally a quadratic—specification of network effects. Apart from these forms, logarithmic and polynomial relationships (or combined functions; Asvanund et al. 2004) between the installed user base and dependent economic variables are also possible.

Second, unobserved causes may exist that could bias the estimates of network effects (Liu et al. 2007). It is hard to imagine that omitted variables could easily reverse the assessed direction of the network effects. However, the estimated coefficients may still be biased in terms of their magnitude (Kraemer et al. 2012). We tried to employ instrumental variables but failed to identify valid orthogonal variables. Valid instruments would have given us additional confidence in our results.

Third, the observed network effects are likely to strongly depend on the underlying price model. In our study, we observe that additional female users increase male users’ willingness to pay more strongly than vice versa. Depending on the magnitude of the (asymmetric) CNEs, several researchers suggest increasing the price difference between the two user groups on a two-sided market (e.g., Strauss 1999) or charging only one user group and giving away the service to the other under certain conditions (e.g., Armstrong 2006; Bakos and Katsamakas 2008; Caillaud and Jullien 2003). The latter is a possible (but in practice hardly used) strategy for dating platforms that might yield a different revenue-optimal user split. Jullien (2005) provides a list of possible price models for intermediaries in two-sided markets. Empirical testing of the impacts of a price model change (e.g., moving from a subscription-based to a transaction-based pricing model) upon the revenue-optimal user split would be a worthwhile supplement to the examination carried out in our study.

Lastly, not only price model changes but also price level changes may alter the network effects and thereby the revenue-optimal proportion of women in the user base of such a platform. In our case, the intermediary kept prices fixed during the entire observation period; however, a price change may result in a new optimal user split, depending on each user group’s respective price sensitivity.

6 Summary and conclusion

This study’s objectives were to empirically assess the influence of CNEs and SNEs on revenue in a two-sided online network and to derive the revenue-optimal split between the two user groups, men and women. To this end, we investigated a leading online dating platform’s user activity and payment data over a period of two and a half years. Our sample covered 8923 users in one city who spent approximately €90,000 by subscribing to one of the premium packages offered by the platform provider.

In general, men are more willing to pay for dating services than women (if the installed base of women is sufficiently large). In addition, we observed that both user groups (i.e., male and female users) exert positive CNEs with regard to revenue and user enrollment of the other group; however, the positive CNEs women exert on revenue generation per man are stronger than vice versa. Moreover, we identified negative SNEs, which lead to lower revenue per user and an increased churn rate on a market side when only that side grows.

Operators of two-sided markets can use information regarding asymmetric network effects such as these to acquire, manage, and monetize their user base more effectively. For the online dating platform in our study, we calculated the revenue-optimal user split and found that a female proportion of the user base of 36.2 % yielded 17.2 % more revenue than a 50/50 split for the same total number of users. Our model is transferrable not only to other online dating platforms, but to all kinds of two-sided markets with network effects. Platform intermediaries can use the results from this optimization problem to develop more efficient user acquisition and monetization strategies.

7 Executive summary

In two-sided markets such as the online dating industry in question, two different user groups interact and generate various network effects, which can be either positive or negative. Users may derive positive cross-side network effects from the participation of the other user group; for example, the more women are on a dating platform, the more attractive the service is for men. In addition, same-side network effects can (usually negatively) impact the utility of the platform for the group’s users if the size of that group becomes too large. Capitalizing on positive network effects and mitigating negative ones is an important challenge for providers of two-sided platforms. In our work, we analyze activity and payment information for over 8900 online dating users over a two-and-a-half-year period. We show that positive cross-side and negative same-side network effects have a significant impact on the revenue generated per user. We use these results to determine the revenue-optimal ratio of women to men on the platform. There is a natural inclination to think that an equal number of men and women (i.e., a 50/50 split) yields the best user experience and thus the highest revenue per user for the platform intermediary. However, our analysis shows that the revenue-optimal proportion of female users on the platform is a mere 36.2%, mainly because (a) men have a higher basic willingness to pay for the service than women, and (b) women exert stronger positive cross-side network effects on the on-platform spending habits of men than vice versa. The identified optimum yields 17.2% higher revenue than the 50/50 split the platform provider initially aimed for. Academics and practitioners can use our framework to quantify network effects, determine the revenue-optimal ratio of users in any two-sided market, and develop more effective customer acquisition and monetization strategies.