A relatively recent phenomenon, online dating is becoming an increasingly relevant site of investigation spanning disciplines as varied as sociology, economics, evolutionary biology, and anthropology [5]. Foundational work on mate preferences in online dating, matching markets, and the role of physical attractiveness in online dating has been done by Finkel et al. [9], Hitsch et al. [14], and Fiore et al. [10]. Zhang and Yasseri [31] explored the latent asymmetries in messaging between men and women on these platforms, and Holme et al. [15] studied the community dynamics of their users. Though the aforementioned literature is rich and sets a foundation for robust discussion of online dating, no existing study takes a longitudinal approach to online dating. The contribution of this work is an expansive dataset encompassing over 12 years of user activity, which allows us to better understand not only how these phenomena of interest work in extraordinary detail, but also how they have changed over time.
As the Internet rose as a social medium used to facilitate communication, it eventually adapted to specialist functions, including online dating sites. Online dating is the practice of using dating sites, which are made specifically for users to meet each other with the end goal of finding a romantic partner [9]. As Michael Norton put it, “Finding a romantic partner is one of the biggest problems that humans face and the invention of online dating is one of the first times in human history we’ve seen some innovation” [13]. In fact, online dating has emerged as one of the most widely used applications on the Internet. Online dating has an annual growth rate of 70% in the United States. It has also developed into a highly profitable business, with growing numbers of people worldwide willing to pay for access to services that will find them a romantic partner. Online dating is now a $2.1 billion business in the US and is expected to continue growing in the foreseeable future [22]. Considering that three-quarters of US singles have tried dating sites and up to a third of newly married couples originally met online [32], online dating seems to have shed its old stigma and is ostensibly here to stay as the new normal.
When considering online dating, it may be useful to think of these platforms, and marriage in general, as markets [25]. As economist Alvin Roth explains in his book Who Gets What and Why, matching markets can be thick or thin: thick markets have lots of buyers and sellers (single people in this case) and little differentiation, while thin markets have fewer buyers and sellers and considerable differentiation [25]. For instance, we can imagine that there was a thick market for marrying one’s high school sweetheart before women started going to college. However, as more and more women decided to pursue higher education and enter the workforce, each side gained a wider selection of potential spouses, which reduced the thickness of the market.
The increased variety of potential mates gave way to dating phenomena like speed-dating, a pre-Internet predecessor of modern apps whose market design has singles meet many people very quickly, indicate whom they are interested in, and receive each other’s contact information only if the interest is mutual. However, with the rise of the Internet, there is now a thick market for finding love online again. More specifically, we can think of these Internet-based dating platforms as two-sided matching markets (if we exclude niche platforms for polyamory and non-traditional relationships). This means that there are two sides of the market to be matched, participants on both sides care about to whom they are matched, and money cannot be used to determine the assignment [1]. This class of markets includes high-end management consulting firms competing for college graduates, where a firm must attract candidates who also choose it, as well as home buyers and sellers and many more important markets. Two-sided matching markets have been extensively studied, with the literature splitting them into two categories: the “marriage” model and the “college admissions” model [1].
Becker’s (1973) marriage model [33] assumes simple preferences, with men and women ranked vertically from best to worst. This model and its assumptions have been applied to diverse problems such as explaining gender differences in educational attainment, changes in chief executive officer wages, and the relationship between the distribution of talent and international trade [34,35,36,37,38,39].
Another line of research follows Gale and Shapley’s college admissions model [41], which allows for complex heterogeneous preferences. This model is a cornerstone of market design and has been applied to the study and design of market clearing houses such as those matching residents to hospitals and students to charter schools. This raises the question: who gets matched with whom in the online dating matching market? Are differences in dimensions of type mostly horizontal (i.e., some pairs make better matches than others, following the college admissions model), or vertical (i.e., there are some people that we can universally agree are more desirable mates than others, following the marriage model)? Work on a small sample of online dating users provides limited support for the latter [3].
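To make the matching framework concrete, the following is a minimal sketch of deferred acceptance, the procedure introduced by Gale and Shapley, restricted here to the one-to-one case. It is an illustration only and not the method used in any of the cited studies; the function name, the agents `a`, `b`, `c`, `x`, `y`, `z`, and their preference lists are all hypothetical.

```python
from collections import deque


def deferred_acceptance(proposer_prefs, receiver_prefs):
    """One-to-one deferred acceptance.

    Both arguments map an agent to a list of acceptable partners,
    most preferred first. Returns a dict {receiver: proposer}.
    """
    # rank[r][p] is the position of proposer p in receiver r's list (lower is better).
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # index of the next receiver to try
    free = deque(proposer_prefs)                  # proposers without a tentative match
    engaged = {}                                  # receiver -> tentatively held proposer

    while free:
        p = free.popleft()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue  # p has exhausted their list and stays unmatched
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if p not in rank[r]:
            free.append(p)  # r finds p unacceptable; p tries the next option later
            continue
        current = engaged.get(r)
        if current is None:
            engaged[r] = p            # r tentatively accepts a first proposal
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p            # r trades up and releases the previous proposer
            free.append(current)
        else:
            free.append(p)            # r keeps the current proposer; p is rejected
    return engaged


# Hypothetical heterogeneous ("horizontal") preferences.
proposers = {"a": ["x", "y", "z"], "b": ["x", "z", "y"], "c": ["y", "x", "z"]}
receivers = {"x": ["b", "a", "c"], "y": ["a", "c", "b"], "z": ["a", "b", "c"]}

matching = deferred_acceptance(proposers, receivers)
print(matching)
```

In this toy example no two agents would both prefer each other to their assigned partners, which is the stability property that makes the procedure attractive for clearing houses. With purely vertical preferences (every agent ranking the other side identically), the same procedure would simply pair agents rank by rank.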
Earlier work suggests that there are “superstar” users who attract lots of attention and matches on any given platform. In some cases, the top 5% of all men on a platform receive twice as many messages as the next 5% and several times as many messages as all the other men [23]. However, it would be incorrect to assume that these superstars are universally appealing to all users and that popularity alone determines matches. Instead, it could be useful to consider the economic concept of assortative mating observed in offline marriage markets, and how online matching reflects or deviates from this behaviour.
Positive assortative mating or matching occurs when people choose mates with similar characteristics. Empirical evidence strongly suggests that spouses tend to be similar in a variety of characteristics, including age, education, race, religion, physical characteristics, and personality traits [24, 34, 41, 43, 44]. This phenomenon can be measured and observed in online dating markets when we inspect the pairs. Using data from an online dating site, Hitsch et al. found that although physical attractiveness and income are largely vertical attributes, preferences concerning a partner’s age, education, race, and height tend to sort assortatively. Likewise, the examination of “bounding” characteristics shows that life course attributes, including marital status, whether one wants children, and how many children one has already, are much more likely than chance to be the same across the two users in a dyadic interaction [10].
In other words, mate preferences are not simply vertical; that is, we do not always want the mate with the highest level of education, income, etc. Rather, horizontal preferences, and preferences for similarity in particular, play an important role [14]. Overall, users with similar education levels are three times as likely to match. As we can observe, assortative mating occurs in both online and offline contexts and can partially help explain why these markets still tend to be efficient. Lewis [18] provides evidence for the co-existence of both similarity-based and universal desirability (status)-based mechanisms.
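One common way to quantify the positive assortative matching described above, for a numeric attribute, is the correlation between the two partners’ values across matched pairs. The sketch below is illustrative only: the `pairs` data (years of education for each member of a dyad) are hypothetical and are not drawn from any of the cited studies.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical matched pairs: (years of education of user 1, of user 2).
pairs = [(12, 12), (16, 16), (16, 14), (18, 16), (12, 14), (20, 18)]

r = pearson([a for a, _ in pairs], [b for _, b in pairs])
print(f"education correlation across pairs: {r:.2f}")
```

A value near 1 would indicate strong sorting on similarity, a value near 0 would indicate no sorting on that attribute, and a negative value would indicate that dissimilar users tend to pair up. Categorical attributes such as marital status would need a different measure, for example the share of same-category pairs compared with what random pairing would produce.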
Newer niche dating apps that only admit users from certain echelons of society may be changing the way we sort and may actually exacerbate existing assortative tendencies. A recent Bloomberg report argues that dating apps, particularly elite ones like the League and Luxy, may be worsening economic inequality by making it easier for couples to pair by socioeconomic status. The League famously only admits graduates from top universities, while Luxy purports that the median income of users on its platform is $500,000. Instead of meeting someone at a bar or other social setting, singles can now use apps to find their economic and educational equivalent. While one might argue that this phenomenon already occurs offline, according to Bloomberg, “these services help facilitate unions between educated, affluent Millennials who are clustering in such cities as San Francisco and New York,” indirectly intensifying economic inequality.
While those may be exceptional cases, some combination of an individual’s attributes and potential partners’ preferences dictates market dynamics in both online and offline contexts. This means that an individual may have high desirability for one person and low desirability for another, and the preferences may not necessarily be monotonically related to their attributes. Efficient matching in this market thus relies on the existence of pairs of mutually desirable agents in a setting where preferences are heterogeneously distributed. As Hitsch et al. note, these markets tend to naturally resolve into pairs of mutual desirability [14].
Online platforms provide us with a unique opportunity to study the economic and evolutionary concepts of sorting and matching. While part of this is due to the ability to observe and classify user attributes, preferences, and behaviour in great detail, it is also due to the unique lack of search frictions in online dating markets. Certainly, a main reason for the existence of online dating sites is to make the search for a partner as easy as possible. Yet, despite the wealth of insight that user-generated data from online dating has revealed about latent and stated mate preferences, there remains significant uncertainty regarding the way these preferences have evolved over time.
Sociologists often assume that society has become more egalitarian, and that pluralist ideals have translated into a more equal quest for love [7]. It would then follow that people’s mate preferences have become more pluralist, switching from sorting based on ascribed traits to sorting based on acquired traits. Ascribed characteristics, as used in the social sciences, refer to properties of an individual attained at birth. The individual has very little, if any, control over these characteristics. In other words, based on the progress we have reportedly seen over the past decade in social integration, we would expect to observe users placing less importance on inherited traits like ethnicity and height, and more importance on characteristics achieved through merit, such as education.
RQ1: How have stated and revealed mate preferences evolved over the last decade and are the claims of a more egalitarian society in fact reflected in online dating and mate selection?
In mate selection, and especially in online dating, there seems to be a preoccupation with physical beauty [24]. Historically, theories of interpersonal attraction and interpersonal judgments have emphasized the importance of physical attributes over other factors such as personality and intelligence [44, 45]. Accordingly, online dating sites often urge their users to post photos of themselves to increase the chances that potential dates will contact them. Dating services like Grindr and Tinder have gone even further by doing away with detailed profile descriptions altogether, allowing users to base their dating decisions on physical appearance alone, or at least in the first instance [12]. Indeed, 85% of interviewees in a study of Australian online dating users said that they would not contact someone without a photo on his or her profile [30].
Only a few studies so far have considered how users judge attractiveness online generally, or in online dating in particular, and how this translates into messaging strategy. Ellison et al. [5] describe the strategies employed by online dating users to interpret the self-presentations of others. Primarily, the participants they interviewed made substantial inferences from small cues, lending support to Walther’s theory of Social Information Processing [58]. For example, one woman felt that people who were sitting down in their online dating profile photos were trying to disguise that they were overweight [5]. Fiore et al. found that, in line with past research on the psychology of attraction, the attractiveness of the photograph was the strongest predictor of whole-profile attractiveness in online dating [59].
However, while it is evident that the attractiveness of one’s photo is important in determining overall perceived attractiveness of an online dating profile as a whole, predicting popularity based on looks alone is much more ambiguous. Rudder [26] explored the importance of attractiveness in online dating and found that how good-looking you are does not dictate how popular you are on an online dating website. In fact, having some people think that you are ugly can work in your favor [12].
To test how attractiveness might predict popularity, the OkCupid team took a random sample of 5,000 female users and compared the average attractiveness scores they each received from other users with the number of messages they were sent in a month. They found that it is not just the better-looking people who receive lots of messages. Using the spread of attractiveness ratings, they identified people who divide opinion on their attractiveness. These polarizing users ended up being far more popular on internet dating sites than universally attractive people [26]. In essence, the most beautiful users will always do well, but users whose attractiveness divides opinion are better off than those whom everyone agrees are just quite cute.
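The distinction between average attractiveness and the spread of ratings can be illustrated with a short sketch. This is not the OkCupid team’s procedure, whose details are not given here; it only assumes that spread is summarized by the standard deviation of the scores a user receives. The user labels and the ratings on a 1 to 5 scale are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical ratings received by three users on a 1-5 scale.
ratings = {
    "user_a": [3, 3, 3, 3, 3, 3],  # consensus: everyone rates them as average
    "user_b": [1, 5, 1, 5, 1, 5],  # polarizing: same average, opinions divided
    "user_c": [5, 5, 4, 5, 5, 4],  # consensus: rated highly by nearly everyone
}


def summarize(scores):
    """Return (average rating, population standard deviation of ratings)."""
    return mean(scores), pstdev(scores)


summary = {user: summarize(scores) for user, scores in ratings.items()}
most_polarizing = max(summary, key=lambda user: summary[user][1])

for user, (avg, spread) in summary.items():
    print(f"{user}: mean={avg:.2f}, spread={spread:.2f}")
print("most polarizing:", most_polarizing)
```

Here `user_a` and `user_b` share the same mean rating but differ sharply in spread, which is the kind of difference a mean-only comparison would miss. Relating spread to popularity would additionally require message counts per user, which are omitted from this sketch.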
Fiore and Donath [60] also explored this question of predicting popularity, but used self-reported attractiveness instead of attractiveness scores given by other users. They found that men received more messages when they were older, more educated, and had higher levels of self-reported attractiveness. Women received more messages when they did not describe themselves as “heavy,” had higher levels of self-reported attractiveness, and posted a photo on their profiles.
Among online daters, sending a signal such as a “Superlike” or a “Smile,” or “favoriting” a user, can be a way to let that user know of one’s interest. In a notable study using a Korean dating/marriage site, researchers found evidence that the most sought-after people on the website were not very responsive to “virtual roses” [17], because their attitude was “well, of course, that person is interested in me.” Instead, the virtual rose was most effective on the middle desirability group, which did not have as many great dating options and was almost twice as likely to accept a proposal sent with the costly signal of a rose.
This brings to light issues with signaling optimization: despite the positive effect of sending roses, a considerable portion of participants did not use their roses, and even those who exhausted their supply did not use them in a way that maximized their dating success. It seems there are substantial trade-offs in preference signaling. Reminiscent of the bar scene with John Nash in A Beautiful Mind, a user could send their signal to the ‘blonde’, the most attractive woman on the platform, who would be their number one pick. However, if everyone uses this strategy, chances of success are low. Instead, users would be better off using their costly signal on a medium-quality mate where chances of reciprocity are higher. By the same token, it seems like success could be almost guaranteed by seeking out the least desirable mate and sending a signal, but this is obviously not optimal. Therefore, choosing whom to send a costly signal, such as a favorite or a message, involves a trade-off that goes back to the aforementioned differences in user “quality” or desirability.
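The trade-off described above can be sketched as a toy expected-payoff comparison. This is a simplifying assumption and not a model taken from the cited study: the payoff of a signal is taken to be the value of a match multiplied by the chance that the signal is reciprocated, and all values and probabilities below are hypothetical.

```python
# Hypothetical targets for a single costly signal:
# (label, value of a match to the sender, assumed chance of reciprocation).
candidates = [
    ("most desirable", 10, 0.05),
    ("medium desirability", 6, 0.40),
    ("least desirable", 1, 0.95),
]


def expected_payoff(value, p_accept):
    """Expected payoff of spending the signal on one target."""
    return value * p_accept


for label, value, p_accept in candidates:
    print(f"{label}: {expected_payoff(value, p_accept):.2f}")

best = max(candidates, key=lambda c: expected_payoff(c[1], c[2]))
print("best target under these assumptions:", best[0])
```

Under these illustrative numbers the intermediate target yields the highest expected payoff, mirroring the intuition that neither the most nor the least desirable user is the best use of a scarce signal. Different assumed values or probabilities could of course change the ranking.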
RQ2: What is the impact of user attractiveness on messaging patterns and is it a powerful predictor of “success” in online dating?
In the social sciences, gender is a built-in variable that can account for measurable differences in behaviour [46]. While non-binary users and same-sex dyads are a growing segment of online dating users, the dataset examined in this work consists exclusively of heterosexual dyads. One of the main research areas related to online dating systems is the difference in messaging behaviour between men and women on these platforms. However, to meaningfully investigate computer-mediated communication between genders, it is important to first understand underlying patterns of offline communication between heterosexual dyads that may be reflected, moderated, or exacerbated online.
Examining single women’s use of the telephone in heterosexual dating relationships, Sarch found that, in line with gender norms at the time of the study, subjects expected men to pursue women [47]. Additionally, on the occasions when a woman did take the initiative and start a conversation, she expected her partner to “overcompensate” by reaching out with more frequency. Subjects also reportedly saw how often their dates called as an indicator of how well the relationship was going or how often their date was thinking about them.
In keeping with these two indicators, subjects did not want to be perceived as the pursuer, so they limited the frequency of their own calls by ensuring that each one was “carefully executed so that sufficient time elapsed between multiple phone calls” ([47], p. 141). This phenomenon has not entirely disappeared: Ansari and Klinenberg observe “the fear of coming off as desperate or overeager through texting” as a common concern in recent focus groups [32]. Despite coming 22 years after Sarch’s study, Ansari and Klinenberg’s research shows that the significance of initiator status, and the equation of contact frequency with interest, have carried over from telephone calls to modern online messaging culture.
Besides the stigma against female initiators, another reason initiators tend to be male has to do with the way incentives are structured in online dating. About 60% of the men in Whitty and Carr’s study saw online dating as a “numbers game” [30]. Given the seemingly endless number of profiles available, individuals could keep trying until they get a response, meaning that they are not fully interested in some of the profiles they send messages to. Instead, they would send a large number of initiations regardless of actual interest and see which women reciprocate, filtering at the response level.
The result is staggeringly lopsided activity levels for men and women. Men are on average twice as active as women in online dating apps, skewing an already imbalanced gender ratio; taking activity level into consideration, the gender ratio of the active user base is about 80:20 [13]. Rudder [26] confirms this, showing that even the most attractive men receive fewer messages than women on average. In turn, since women are often inundated with date requests, they are less compelled to respond to each request [28]. Fiore et al. confirm this, finding that women responded more selectively than men, answering 16% of the time compared to men’s 26% reciprocation rate [10].
Zhang and Yasseri found that messages were five times more likely to have been initiated by a man than by a woman, even in mobile dating applications that allow users to communicate only after they have mutually signaled their interest [31], in line with previous work that found men to be the main initiators in heterosexual conversations [9, 28, 29, 49]. Fiore et al. also confirm this, finding that rates of initial contact differed by gender. Men initiated a median of 1 contact per day compared with 0.875 for women [10]. Given this difference combined with the greater number of men on the site, women tended to be contacted much more often than men, a median of 2 times per day compared to 0.5 for men. Finally, more popular men and women (those who were contacted more often per day) initiated contact with others slightly less often, consistent with the economic theory that “high quality” users need not pursue others as actively.
RQ3: Has gender asymmetry in online dating messaging behaviour remained stable, lessened, or grown over time?
We integrate the previously mentioned literature on attractiveness and selectivity to investigate how user behaviour and strategy vary across different facets of communication: searching for partners to initiate contact with, and selecting which users to reply to when they have some awareness of their attractiveness or signals of their success. As well as studying variations of behaviour in the population, we are also motivated by research around Dunbar’s number [4] to study what limits and commonalities might be present in the data around users’ communication.
RQ4: How do different facets of online daters’ success relate to their selectivity?
Referring back to the “college admissions” model that suggests strong homophily in seeking partners, most studies have overlooked whether a match based on homophily actually translates into initiation of contact and communication between users in a liquid market and in the absence of search friction. Given the abundance of inactive users and the asymmetry in activity between male and female users, matching alone is insufficient to determine whether online dating is driven by homophilic tendencies. Hence, we form our next research question as follows.
RQ5: Does similarity between the parties involved in a computationally made match map into initiation of contact and successful communication?
Moreover, homophily is unlikely to be uniformly distributed across all characteristics for all users. For instance, some users will weigh age differences more heavily than others. While there are some hints that demographic and socioeconomic features in particular play an important role, the exact relationships and relevant variables are still ambiguous.
RQ6: Given the presence of homophily, which are the decisive dimensions and variables predicting successful communication?