Monocentric Cyberspace: The Primary Market for Internet Domain Names
Cyberspace is no different from traditional cities, at least in economic terms. Urban economics governs the creation of new space on the Internet and explains location choices and price gradients in virtual space. This study explores registration dynamics in the largest primary market for virtual space: Internet domain names. After developing a framework for domain registrations, it empirically tests whether domain registrations are constrained by the depletion of unregistered high quality domain names. Estimations based on registrations of COM domain names suggest that the number of domains expands substantially slower than the growth in overall demand for domain space. Supplying alternative domain extensions can relax the shortage in domains in the short term.
Keywords: Virtual land-rush markets; Primary market for cyberspace; Registrations of internet domain names; Supply constraints
One of the defining myths of the World Wide Web is that ever-increasing levels of electronic communication and interconnectedness will ultimately overcome physical distance. Higher and higher degrees of virtual omnipresence hypothetically raze traditional boundaries, reshape the spaces we live and work in (among others, Gaspar and Glaeser 1998; Ihlanfeldt 1995; Pascal 1987; Tranos and Nijkamp 2013) and will eventually culminate in the “death of distance” (Cairncross 1997). Put more provocatively, the world is said to “flatten” (Friedman 2005).
Putting aside all information technology exuberance, location will continue to matter. Even if we relocated all our work, social interactions, shopping, and leisure activities into cyberspace1 and left our physical shells in self-sufficient containers that provide the same amenities anywhere – our world would still be governed by distances and locations, albeit in new dimensions.
One of the largest markets for locations in spaces beyond the traditional coordinate systems is the market for Internet domain name registrations. Domain names map natural language character strings, which can be easily memorized by humans, to technical network addresses on the Internet, which tend to be hard to recall.2 Strictly speaking, they are just pointers to data and online services, but effectively they provide locations that enable humans to navigate the web (Mueller 1998).
So far, the primary market for domain names is uncharted territory in the academic literature. This paper is the first to explore the determinants of domain registrations using an adaptation of the bedrock of Urban Economics, the classic monocentric city model (Alonso 1964; Mills 1972; Muth 1969). In this model, a city is located in a featureless plain and all employment is located in the central business district (CBD), to which all residents commute regularly. When selecting a location to live, residents trade off the cost of commuting as a function of the distance to the CBD, vis-a-vis the consumption of other goods. In equilibrium, rent levels rise as the distance to the CBD falls.
In virtual space, a comparable rent gradient can be observed: Sought-after virtual “downtown locations” transact for several million USD (The Economist Online 2010), similar to locations in the centers of brick-and-mortar cities, while domains located in the cyber-periphery trade for a fraction of the central spaces’ values only (Lindenthal 2014). Analogous to the monocentric city model, differences in prices of virtual locations can be explained by the differences in distance to a central location. In virtual space, however, all users reside in the same location and travel from this universal origin to different virtual destinations to access information or services. While it takes only a few keystrokes to direct a browser to a new location, the associated effort varies between different destinations. The cost of cyber-commuting depends on the linguistic attributes of the destinations, which differ in familiarity, pronounceability, and memorability of the domain names (for an excellent review of proper name memory see Cohen and Burke 1993). Neuroscience studies have shown that the human memory stores and retrieves different types of concepts, words and names not only through a variety of neural systems but also in separate locations within the brain (e.g. Binder et al. 2009; Damasio et al. 2004; Humphreys and Forde 2001). Cyberspace may have overcome traditional concepts of physical distances, but not all locations are equally easy to communicate, memorize, recall, and type. This paper develops measures of some of these distances and empirically shows that word-specific cyber-commuting costs determine domain registrations. Documenting virtual distances and an equivalent to commuting costs is a prerequisite when applying the monocentric city model to virtual space.
At first sight, the seemingly unlimited3 number of domains that still can be registered against a low and exogenously determined registration fee appears to be at odds with the monocentric city model, where land in the CBD is naturally scarce: If supply of space were unconstrained, every website could have the most desirable location and should be valued at the same nominal registration fee. Again, showing that supply of space is constrained is another necessary step towards a monocentric virtual city model.
Second, the re-sale prices of registered domain names have risen 63 % from 2006 through 2012 (Lindenthal 2014). These rising prices are indicative of the demand for virtual locations outpacing the supply of available attractive names and that competition drives up prices for “central” domains.
Third, the Internet Corporation for Assigned Names and Numbers (ICANN), the non-profit organization regulating the Domain Name System (DNS), has initiated the release of 1400 new Top Level Domains (TLD) that will augment the current selection of 22 global extensions like COM, EDU, NET or ORG (Internet Corporation for Assigned Names and Numbers 2014). Entrepreneurs have made large bets on the right to monetize these new swathes of cyberspace, putting down about 350 million USD in application fees (Internet Corporation for Assigned Names and Numbers 2013) and investing billions more for the necessary infrastructure needed to manage the new space. Clearly, they have trust in being able to serve a previously unmet demand.
Estimating the extent of supply constraints is not only interesting for urban economists, but also a timely and relevant challenge for policy makers pondering whether the current domain name system serves the Internet optimally and for the business community trying to serve any unmet demand for space.
The next section of this paper suggests a framework that can empirically test supply constraints, price gradients and demand curves for Internet domain names. Subsequently, data on domain registrations will be introduced before the last two sections present the results of the empirical estimations and a conclusion.
The Determinants Of Domain Name Registrations
Three factors determine the total number of registered domains: First, the ultimate driver of domain name registrations is the demand for virtual space by businesses, organizations or individuals that offer Internet-based information and services to Internet end users. More than 20 years after the inception of the Internet, the total head count of these virtual dwellers (Pop) is still expanding rapidly. While the exact amount of space demanded per dweller is difficult to quantify and also likely to change over time, it is safe to assume that Pop and total demand for virtual space are positively correlated: For instance, doubling the number of virtual dwellers is expected to lead to twice the demand for space (keeping all other factors equal).
Second, the registration fees and other fixed costs (Kreg) associated with owning a domain name are negatively related to total registration numbers. The fixed costs are comprised of the wholesale domain registration fees charged by the company that administers the domain registry, the markup added by competing middlemen re-selling domains to end users, and by additional costs for hosting and related services. While the direct costs of owning and hosting a domain have been falling year after year due to intense competition between service providers,4 it is safe to assume that K is identical for all registrations in a cross-sectional study.5
Third, the marginal effort Emarginal is assumed to increase in registration numbers. Domain name registrations exhibit a pecking order regarding domain quality: names that were registered relatively early tend to be of higher quality than those registered later. Marginal domain registrations, on average, consist of more characters, are less descriptive and more difficult to memorize than the existing stock, requiring higher efforts by end users as registrations increase. Those high quality and easy to access locations that are claimed first in land rush markets tend to trade for higher values in secondary markets subsequently.
This study assumes the same level of use intensity for all domains. While the classical, traditional monocentric model does include variable density, that is not a necessary feature of the monocentric model. All the essential elements of the monocentric model still come through with a fixed, constant density as shown in Geltner et al. (2001, Chapter 4). In addition, Lindenthal and Loebbecke (2014) have already documented that more valuable domains are more likely to be developed into more extensive websites, which represents a higher use intensity or “density” compared to registrations of lower quality domains without further development.
Owner-operated websites are not the only form of cyberspace available to virtual dwellers. Alternatively, they can connect with their audiences through shared spaces like social media platforms, wikis, online market places or direct communications and marketing. For instance, the increasing role of social networks in connecting companies and their customers could weaken the demand for domains in general. A local business might find it more cost-effective to promote its Facebook profile instead of steering customers towards its website. Conversely, changes to search engine algorithms could make it easier for users to find relevant content on millions of individual websites, tilting the balance in favor of owning domains. The competitive position of domains versus other options is, among other factors, accounted for in factor m.
This framework also allows investigating demand levels for segments of domains by employing subset specific values for E. For instance, the relative commuting costs for a domain under the COM Top Level Domain (TLD) could differ from the cost of accessing a NET or ORG domain, resulting in the different demand levels for each TLD, documented by Lindenthal (2014).
The empirical part of the paper splits a cross-section of domain registrations into subsets for which the level of demand Pop is quantifiable and the number of registered domains is known. The fixed cost K is identical for all domains in the cross-section and can therefore be omitted. In a future study, the price sensitivity of domain registrations could be estimated by analyzing variation in K over time.
The regression coefficient β estimates the elasticity of Registrations with respect to Pop, α is a constant, and εi is an independently and identically distributed error term.
H1: The relative increase in the number of registered domains is smaller than the relative increase in the number of potential domain registrants.
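The specification described in this passage can be written out explicitly. The following is a reconstruction from the surrounding definitions, not the typeset equation of the paper; the index i (one observation per surname or MSA) and the variable name Registrations are notational assumptions.

```latex
\ln(\mathit{Registrations}_i) = \alpha + \beta \, \ln(\mathit{Pop}_i) + \varepsilon_i
```

Under this log-log form, the hypothesis stated above corresponds to β < 1: a 1 % increase in Pop is associated with less than a 1 % increase in registrations.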
Defining E at the domain level also allows accounting for any differences in linguistic quality between domains and the resulting differences in commuting costs and registrations. Zipf (1936) shows that shorter words tend to appear more frequently in natural languages than long expressions. Similarly, the length of Twitter hashtags is inversely related to their usage frequencies (Cunha et al. 2011). If Zipf's principle of least effort also holds true in domain space, shorter domains will be registered more often than long domain names. For instance, bearers of long surnames are less likely to register a domain containing their name than somebody with a relatively short name. To give a simplifying example, domains derived from the keywords “Pennington Associates Milwaukee” might be more tedious to type than any from “Carr Associates Miami”, making the former less likely to appear in registrations.
H2: The likelihood of a character string being registered as a domain name decreases in the length of the string.
A negative estimate for the regression coefficient β2 can be interpreted as evidence for different levels of effort required by users – or for the equivalent to commuting costs required by the monocentric city model.
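A specification consistent with this description adds the logarithm of name length to the log-log regression. This is a sketch inferred from the text (the coefficient name β2 and the use of ln(Length) both appear in the paper); the subscripts and the omission of the MSA control variables for income and education are simplifying assumptions.

```latex
\ln(\mathit{Registrations}_i) = \alpha + \beta_1 \, \ln(\mathit{Pop}_i) + \beta_2 \, \ln(\mathit{Length}_i) + \varepsilon_i
```

In this form, β2 < 0 means that, holding Pop constant, a 1 % longer name is associated with roughly |β2| % fewer registrations.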
In a similar fashion, the number of keywords within a domain name can be interpreted as an additional measure for commuting costs, as more keywords require more effort when memorizing and recalling. However, combining multiple keywords results in a trade-off between brevity and descriptiveness. In case the domains “pizza.com” or “pizzaboston.com” are already taken, “tastypizzaboston.com” might still be available as the electronic storefront of a local pizza place. Theoretically, each additional keyword increases the number of potential domain names by several orders of magnitude: If the total number of viable single keywords is W, then W^2 two-keyword combinations, or W^3 three-keyword combinations are possible. Whoever is willing to accept the higher effort required to access a domain consisting of many keywords has plenty of choice. This trade-off between availability and domain quality reconciles the view of seemingly unconstrained domain supply and the observation that short, low-effort domains are not easy to come by: Just add a few more keywords and you can have any domain you want.
H3: Domain space is less constrained for combinations of multiple keywords than for single-keyword domains.
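To illustrate how quickly the pool of candidate names grows with each added keyword, the short sketch below counts ordered keyword concatenations. The vocabulary size of 10,000 is a purely hypothetical value chosen for illustration; the paper does not state a figure for W.

```python
def potential_names(vocabulary_size: int, keywords: int) -> int:
    """Count ordered concatenations of `keywords` words drawn from the vocabulary."""
    return vocabulary_size ** keywords


W = 10_000  # hypothetical number of viable single keywords (assumption)

for k in (1, 2, 3):
    # Each extra keyword multiplies the number of candidate names by W.
    print(f"{k} keyword(s): {potential_names(W, k):,} potential names")
```

With this assumed vocabulary, every additional keyword multiplies the candidate pool by four orders of magnitude, which is the sense in which multi-keyword space is far less scarce than single-keyword space.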
The domain name system was designed in a distributed fashion6 with as little information as possible managed in centralized registries. Each Top Level Domain, for instance, administers separate databases for its domain registrations and delegates the actual task of managing the information linked to a domain to a large number of decentralized domain name servers. So-called TLD zone files keep track of the authoritative domain name servers for each domain under a specific TLD. The zone file for the COM domains can be downloaded from Verisign (2014b). Strictly speaking, this zone file does not contain all registered domains, but only active COM domain names with a DNS entry, which account for more than 99.5 % of all domains (Verisign 2014a).
Domain names are often composed of multiple keywords linked together into one character string, complicating any analysis of the names’ meaning. This concatenation is reversed and all domains are split into their base keywords, employing an automatic programming interface described in Huang et al. (2010) and hosted by Microsoft Research (Microsoft Research 2014).7 The next step identifies groups of domain names that contain popular surnames or city names as a keyword. This segmentation builds on the premise that the number of domains per city resident or bearer of a surname is (on average) the same across all cities or names: Why would there be a different number of domain registrations containing “Miller” versus “Smith”, for example, after accounting for the total number of citizens named Miller or Smith and the length of the name?
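The keyword-splitting step can be illustrated with a toy segmenter. This is not the method of Huang et al. (2010) or the Microsoft Research service; it is a simple dictionary-based stand-in that returns the split with the fewest words, and the function name and the tiny vocabulary are hypothetical.

```python
from typing import List, Optional, Set


def split_keywords(name: str, vocabulary: Set[str]) -> Optional[List[str]]:
    """Split a concatenated name into the fewest vocabulary words, or return None."""
    n = len(name)
    # best[i] holds the shortest known word list covering name[:i].
    best: List[Optional[List[str]]] = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            prefix = best[start]
            word = name[start:end]
            if prefix is None or word not in vocabulary:
                continue
            candidate = prefix + [word]
            if best[end] is None or len(candidate) < len(best[end]):
                best[end] = candidate
    return best[n]


# Toy vocabulary for illustration only (assumption).
VOCAB = {"tasty", "pizza", "boston", "miller", "law", "firm"}

print(split_keywords("tastypizzaboston", VOCAB))
print(split_keywords("millerlawfirm", VOCAB))
```

Once a name is split this way, counting its keywords and checking whether one of them is a surname or city name is a simple lookup, which is all the subsequent grouping step requires.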
Similarly, the demand for domains is also expected to be equal across cities after controlling for the size and the socio-economic composition of the city populations. The cities’ average income per capita and the share of the population holding university degrees are added to the regression equation as control variables since Goldfarb and Prince (2008) found high-income and well-educated people to be overrepresented among early Internet users. Beyond education and income, why would there be higher or lower per capita demand for domains in, e.g., Houston versus Dallas?
The US Census (US Census 2014b) provides an overview of the most popular surnames from the year 2000, including frequency counts and basic demographic information. In addition, this source also lists the population numbers of all US Metropolitan Statistical Areas (MSA) (US Census 2014a). Before linking population numbers and domain registrations, MSAs with the same name (like Portland Oregon and Portland Maine) are aggregated into one observation and the corresponding population numbers are added up.
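The aggregation of same-named MSAs could be sketched as follows. The input layout ("City, ST" strings paired with populations) and the population figures are assumptions made for illustration; they are not taken from the census files.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def aggregate_by_city_name(msas: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Sum the populations of all MSAs that share the same city name."""
    totals: Dict[str, int] = defaultdict(int)
    for name, population in msas:
        # Drop the state suffix so that "Portland, OR" and "Portland, ME" collide.
        city = name.split(",")[0].strip().lower()
        totals[city] += population
    return dict(totals)


# Hypothetical population figures, for illustration only.
sample = [
    ("Portland, OR", 2_000_000),
    ("Portland, ME", 500_000),
    ("Boulder, CO", 300_000),
]

print(aggregate_by_city_name(sample))
```

Merging is necessary because a domain such as one containing "portland" cannot be attributed to either city on the basis of its name alone.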
Table (column groups): All COM domains; COM domains containing one of the 1000 most frequent US surnames (top 50 %, bottom 50 %); COM domains containing US MSA names (top 50 %, bottom 50 %). Row: Registrations (in million).
As demand for a certain group of domains increases, registration of domains in these groups also increases, including an increasing number of long names. For the 500 most frequent surnames in the US census, 60 % more domains have been registered than for the following 500 surnames in the frequency ranking (5.93 million vs. 3.7 million). The higher registration numbers come at a cost. The more sought after domains are on average longer and contain more keywords than domains containing less frequent surnames. All differences in length and keyword counts are statistically significant with t-values above 2.6.
For cities, the differences become even more pronounced. When splitting the sample of MSAs at the population median, the most populous cities account for more than 4 times the number of domain registrations originating from the lower half of MSAs (2.49 million vs. 0.6 million) and the magnitude of the average length-differences is also substantially bigger.
The variable Length captures the number of characters for each surname or MSA name in the sample. On average, a top-1000 surname is 6.08 characters long. The maximum is 11 and the minimum is only 2 characters (US Census 2014b). For MSA names, the mean of Length is 8.67, and the range is 4 to 16 characters (US Census 2014a).
For the MSAs, the share of population having attained a bachelor's or graduate degree ranges from 7 % (Merced, CA) to 34 % (Boulder, CO), and the minimum income per capita is USD 14,126 (McAllen, TX) while the maximum value is USD 48,900 (Bridgeport, CT).
Table: Regression coefficient estimates. Rows: ln(% with degree); D3 Keywords (vs. 2); D4 Keywords (vs. 2); D5 Keywords (vs. 2); D3 Keywords × ln(Pop); D4 Keywords × ln(Pop); D5 Keywords × ln(Pop).
The elasticity estimate of 0.80 for MSAs may be conservative, as the market may in fact be even more constrained, for two reasons. First, some MSA names undoubtedly have marketing appeal to users from other parts of the world. Global brand names like “New York” or “Los Angeles” are not exclusive to residents from these MSAs only. Any non-native usage could inflate domain registration numbers for larger MSAs and bias the elasticity estimates upwards. Second, large MSAs like New York could have disproportionately more small businesses and retail shops given their consumption variety. On the other hand, smaller MSAs might depend more heavily on branches of larger chains that do not require their own web presences. Analyzing the link between the industrial composition of MSAs and domain registrations could be an interesting aspect for a future study.
Longer domains are less desirable than shorter names. Panel (b) of Table 2 lists the estimated regression coefficients from (5). The coefficients for ln(Length), β2, are negative and significantly different from 0 for surnames and MSA names alike. Increasing the length of a surname from the median (6 characters) to the 75th percentile (7 characters) reduces the number of domain registrations by a staggering 24 %.10 For MSA names, adding one more letter to the median of 9 results in a 14 % lower number of registrations. H2 is clearly confirmed.
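The 24 % figure for surnames follows directly from the log-log form of the regression. The sketch below reproduces the arithmetic using the coefficient of −1.787 given in the paper's footnote; the function name is a hypothetical helper, not part of the original analysis.

```python
import math


def registration_change(beta_length: float, old_length: int, new_length: int) -> float:
    """Relative change in registrations implied by a log-log length coefficient."""
    log_length_change = math.log(new_length) - math.log(old_length)
    return math.exp(beta_length * log_length_change) - 1


# Surname coefficient on ln(Length) as reported in the paper's footnote.
BETA_LENGTH_SURNAMES = -1.787

change = registration_change(BETA_LENGTH_SURNAMES, old_length=6, new_length=7)
print(f"Change in registrations from 6 to 7 characters: {change:.1%}")
```

The result is a decline of roughly 24 %, matching the figure in the text. Because the model is in logs, the same one-character step matters less for longer names: moving from 9 to 10 characters is a smaller proportional change in length than moving from 6 to 7.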
H3 hypothesizes that registrants circumvent the problem of their desired domain already being registered by adding more keywords. The regression coefficients from (5) are displayed in Panel (c) of Table 3. The negative coefficients for D3 Keywords, D4 Keywords, and D5 Keywords confirm that domains with fewer keywords are more popular than longer alternatives. The base case, D2 Keywords, comprises the shortest and most sought-after group of domains, consisting of only the MSA name or surname in conjunction with one more keyword. The clear preference for less complex names weakens for larger values of Pop, owing to the positive coefficient estimates for the interaction terms Dk × ln(Pop). For example, in all MSAs the population is larger than 55,000, so the −0.436 for D3 Keywords is fully offset by 0.132 times ln(55,000).
The estimated elasticity for two-keyword MSA domains is 0.667, which is below the overall elasticity of 0.795 estimated earlier in model (a) for all keyword counts. The elasticity for three-keyword MSA domains is already higher at 0.799 (0.667 + 0.132), while four-keyword domains and five-keyword domains are basically unconstrained. For surnames, the elasticities also increase each time a new keyword is added. These estimates support Hypothesis 3: adding more keywords to a domain name reduces supply constraints.
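The arithmetic behind the three-keyword MSA results can be checked with a few lines. The three coefficients are the values quoted in the text; the helper name and the break-even calculation are illustrative additions, not part of the paper.

```python
import math

# Coefficients for MSA domains as quoted in the text.
BASE_ELASTICITY = 0.667      # ln(Pop) coefficient, two-keyword base case
D3_INTERACTION = 0.132       # D3 Keywords x ln(Pop)
D3_INTERCEPT_SHIFT = -0.436  # D3 Keywords dummy

# Elasticity for three-keyword domains: base elasticity plus interaction term.
elasticity_3kw = BASE_ELASTICITY + D3_INTERACTION


def d3_combined_term(pop: float) -> float:
    """Dummy plus interaction contribution for three-keyword domains at a given Pop."""
    return D3_INTERCEPT_SHIFT + D3_INTERACTION * math.log(pop)


# Population at which the negative dummy is exactly offset by the interaction term.
break_even_pop = math.exp(-D3_INTERCEPT_SHIFT / D3_INTERACTION)

print(f"Three-keyword elasticity: {elasticity_3kw:.3f}")
print(f"Combined D3 term at Pop = 55,000: {d3_combined_term(55_000):.3f}")
print(f"Break-even population: {break_even_pop:.1f}")
```

The combined term is already positive at a population of 55,000 because the break-even population implied by these two coefficients is far below the size of any MSA in the sample.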
Conclusion and Discussion
This study investigates whether the primary market for Internet domain names can be analyzed using standard urban economic theories. It shows that, indeed, core prerequisites are met which allow the application of the monocentric city model to virtual space: an equivalent to commuting costs can be constructed from differences in linguistic properties of domain names and the supply of “central” virtual locations that exhibit low commuting costs is constrained.
The empirical results show that the number of registered domain names has been increasing at a slower pace than fundamental demand for domain space. The magnitude of this effect is large in economic terms: If the elasticities found for domains containing surnames or city names are a valid estimate for the overall elasticity, then the total number of domain names could be up to a quarter higher if more domains were available for registration.
The introduction of new global top level domains has the potential to serve some of the demand not met by the current domain extensions (additional benefits of more competition in TLD space, like technical innovation, lower prices, joint marketing efforts and overall more choice for consumers, have been described by Mueller and others). It is too early to tell whether the new space will be accepted as a viable alternative to the established space. Replacing the ubiquitous COM or the country-specific TLDs like NL, DE, or CO.UK with a new TLD is comparable to adding one more keyword to a name: It increases choice massively but additional keywords are only viable in very crowded segments of domain space (see Table 3, Panel [c]).
In land markets, external factors like the topography of a metropolitan area can exacerbate land scarcity (Saiz 2010): cities that are physically constrained by water or mountainous terrain exhibit steeper land price gradients than places in open landscapes. Analogously, policy choices in the administration of the web’s address system can partially remove existing constraints on domain supply. Launching additional top level domains will alleviate scarcity in web locations, but not fully overcome it. Figuratively speaking, the new TLDs flatten hills and fill in water, but once these obstacles are removed, the overarching constraints will kick in again: The set of catchy keywords that appeal to humans is still bound by the way we process language, even if we had unlimited choice in top level domains. Legend has it that Mark Twain advised to buy land, since “they have stopped making it”.11 Similarly, one can argue that investing in (top level) domains is a promising business venture, since we have stopped inventing new languages, at least at a large scale.
Additionally, the paper confirms old market wisdom: Longer names are indeed registered less frequently than shorter names. In follow up studies, the virtual equivalent to commuting costs can be extended to analyze the effect of other linguistic characteristics like the keyword types, special characters, numerals or hyphens. Additional keywords in a name might provide more choice of possible domains but they come at the cost of longer overall names.
These findings can be generalized to other location systems based on natural language like Twitter handles, identifiers in online communities or names of existing companies, individuals or even cities. Latino and Hispanic immigrants display a strong preference for places with Spanish names in the south-western USA, after accounting for county fixed effects and other locational variables (Saiz 2014). Since city names influence residential choices, it would not be surprising if a city bearing a short, tongue-friendly name attracts more new residents than less appealingly named alternatives: Why settle with a linguistic challenge like Schenectady if Albany is so close by?
Finally, by applying the monocentric city model to virtual space, one can transfer long-established findings from traditional land markets to domain markets: For instance, the intensity of space use is predicted to be higher for domain names with low commuting costs than for locations in the linguistic periphery – a prediction which could be tested in follow up studies.
More research is also needed to understand the determinants of domain registrations over time. The current study draws its conclusions from one cross-section only. Technological change could reduce the effort required to navigate to a website and might channel more demand to peripheral locations. Examples of relevant technological change include the auto-complete function in the browser’s address bar, which automatically fills in long domain names that have been visited in the past. Additionally, it would be interesting to investigate where unsuccessful registrants turn after not having found a suitable name. The substitution between “owner-occupied” domain space and “rented” locations on, e.g., social media platforms is not yet understood in the academic literature.
Batty’s (1997) virtual geography terminology distinguishes between cyberplaces, in which the built form and electronic networks are interdependent, and cyberspace, which is fully detached from physical space.
The Internet standard RFC 1035 (http://tools.ietf.org/html/rfc1035) and its subsequent updates specify the rules for domain name supply. Each label a domain name consists of (labels are separated by dots) can contain up to 63 octets, allowing for, e.g., 26^63 combinations of the Latin characters A to Z in the Second Level Domain. Adding numbers, dashes and millions of internationalized domain names further increases this already massive number of potential domain names.
Anecdotal evidence on long term trends in web hosting is presented by Royal Pingdom (http://royal.pingdom.com/2008/02/19/web-hosting-now-vs-10-years-ago/).
Registration fees should not be confused with resale values in secondary markets. At any given time, wholesale registration fees are similar and do not depend on the inherent quality of the domain. Resale prices in the secondary market, however, vary greatly and the variation in values can be partially explained in hedonic regressions (Lindenthal and Loebbecke 2014) or a repeat sales analysis (Lindenthal 2014).
The list of domains and the corresponding keywords is available from the author on request.
Moss and Townsend (1997) used a different specification when calculating the geographic domain registration density by dividing the number of domains registered by residents of a city over the city’s total population.
The results are robust with regard to the population size of MSAs or the frequencies of the surnames: When splitting the sample at the median of Pop, all estimated elasticities remain significantly smaller than 1.
1 − exp((ln(7) − ln(6)) × (−1.787)) = 0.24
Multiple versions of this quote are widely distributed – a source has not been handed down, however.
My postdoctoral research project received substantial financial contributions from the Internet Corporation for Assigned Names and Numbers (ICANN). I would like to explicitly thank Cyrus Namazi and Mike Zupke for supporting this research project and for giving me the opportunity to discuss preliminary results with ICANN. Albert Saiz, David Geltner, Piet Eichholtz, and Liao Wen-Chi are thanked for their insightful feedback on earlier versions of this paper.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.