Monocentric Cyberspace: The Primary Market for Internet Domain Names

Cyberspace is no different from traditional cities, at least in economic terms. Urban economics governs the creation of new space on the Internet and explains location choices and price gradients in virtual space. This study explores registration dynamics in the largest primary market for virtual space: Internet domain names. After developing a framework for domain registrations, it empirically tests whether domain registrations are constrained by the depletion of unregistered high quality domain names. Estimations based on registrations of COM domain names suggest that the number of domains expands substantially slower than the growth in overall demand for domain space. Supplying alternative domain extensions can relax the shortage in domains in the short term.

1 Batty's (1997) virtual geography terminology distinguishes between cyberplaces, in which the built form and electronic networks are interdependent, and cyberspace, which is fully detached from physical space.
provide the same amenities anywhere, our world would still be governed by distances and locations, albeit in new dimensions.
One of the largest markets for locations in spaces beyond the traditional coordinate systems is the market for Internet domain name registrations. Domain names map natural-language character strings, which can be easily memorized by humans, to technical network addresses on the Internet, which tend to be hard to recall.² Strictly speaking, they are just pointers to data and online services, but effectively they provide locations that enable humans to navigate the web (Mueller 1998).
So far, the primary market for domain names is uncharted territory in the academic literature. This paper is the first to explore the determinants of domain registrations using an adaptation of the bedrock of urban economics, the classic monocentric city model (Alonso 1964; Mills 1972; Muth 1969). In this model, a city is located on a featureless plain and all employment is located in the central business district (CBD), to which all residents commute regularly. When selecting a location to live, residents trade off the cost of commuting, which is a function of the distance to the CBD, against the consumption of other goods. In equilibrium, rent levels rise as the distance to the CBD falls.
In virtual space, a comparable rent gradient can be observed: Sought-after virtual "downtown locations" transact for several million USD (The Economist Online 2010), similar to locations in the centers of brick-and-mortar cities, while domains located in the cyber-periphery trade for only a fraction of the central spaces' values (Lindenthal 2014). Analogous to the monocentric city model, differences in prices of virtual locations can be explained by differences in distance to a central location. In virtual space, however, all users reside in the same location and travel from this universal origin to different virtual destinations to access information or services. While it takes only a few keystrokes to direct a browser to a new location, the associated effort varies between destinations. The cost of cyber-commuting depends on the linguistic attributes of the destinations, which differ in the familiarity, pronounceability, and memorability of their domain names (for an excellent review of proper name memory see Cohen and Burke 1993). Neuroscience studies have shown that human memory stores and retrieves different types of concepts, words and names not only through a variety of neural systems but also in separate locations within the brain (e.g. Binder et al. 2009; Damasio et al. 2004; Humphreys and Forde 2001). Cyberspace may have overcome traditional concepts of physical distance, but not all locations are equally easy to communicate, memorize, recall, and type. This paper develops measures of some of these distances and shows empirically that word-specific cyber-commuting costs determine domain registrations. Documenting virtual distances and an equivalent to commuting costs is a prerequisite for applying the monocentric city model to virtual space.
At first sight, the seemingly unlimited³ number of domains that can still be registered for a low and exogenously determined registration fee appears to be at odds with the monocentric city model, where land in the CBD is naturally scarce: If the supply of space were unconstrained, every website could have the most desirable location and every location should be valued at the same nominal registration fee. Showing that the supply of space is constrained is therefore another necessary step towards a monocentric virtual city model.
Three stylized facts suggest that domain names are indeed a limited resource. First, the annual increase in the number of registered domain names has been slowing in recent years while the global Internet user base has been growing at an increasing pace. Figure 1 shows that up to 2007, the universe of COM domain names expanded at a similar speed to the number of world residents with Internet access. From 2007 onwards, the annual additions to the domain stock lag the overall trend, which could be caused by fewer and fewer high-quality keywords remaining available for registration. With more than 271 million registered domains at the end of 2013 (Verisign 2014a), it is not easy to find a good domain that has not been claimed yet.
Second, the re-sale prices of registered domain names rose 63 % from 2006 through 2012 (Lindenthal 2014). These rising prices indicate that the demand for virtual locations is outpacing the supply of available attractive names and that competition drives up prices for "central" domains.
Third, [...] investing billions more for the infrastructure needed to manage the new space. Clearly, they have trust in being able to serve a previously unmet demand.
Estimating the extent of supply constraints is not only interesting for urban economists, but also a timely and relevant challenge for policy makers pondering whether the current domain name system serves the Internet optimally and for the business community trying to serve any unmet demand for space.
The next section of this paper suggests a framework that can empirically test supply constraints, price gradients and demand curves for Internet domain names. Subsequently, data on domain registrations are introduced before the last two sections present the results of the empirical estimations and a conclusion.

The Determinants Of Domain Name Registrations
Three factors determine the total number of registered domains: First, the ultimate driver of domain name registrations is the demand for virtual space by businesses, organizations or individuals that offer Internet-based information and services to Internet end users. More than 20 years after the inception of the Internet, the total head count of these virtual dwellers (Pop) is still expanding rapidly. While the exact amount of space demanded per dweller is difficult to quantify and also likely to change over time, it is safe to assume that Pop and total demand for virtual space are positively correlated: For instance, doubling the number of virtual dwellers is expected to lead to twice the demand for space (keeping all other factors equal).
Second, the registration fees and other fixed costs ($K_{reg}$) associated with owning a domain name are negatively related to total registration numbers. The fixed costs comprise the wholesale domain registration fees charged by the company that administers the domain registry, the markup added by competing middlemen re-selling domains to end users, and additional costs for hosting and related services. While the direct costs of owning and hosting a domain have been falling year after year due to intense competition between service providers,⁴ it is safe to assume that K is identical for all registrations in a cross-sectional study.⁵

Ultimately, each end user of a website needs to exert an effort E to access an online location. This commuting cost depends on domain-specific factors like the recognizability and ease of recollection of a specific name and also on the general competitive position of domains versus other forms of virtual space. If the required effort of commuting to a location is low, owning this location is desirable as it is possible to attract end users easily. Locations with high required efforts are less attractive since fewer end users visit. New domains get registered as long as the utility gained from owning the marginal domain, given the marginal effort level $E_{marginal}$ required of any user, exceeds registration costs. In sum, the total number of registrations of domain names can be formalized as in (1), with a, b, c, and d being parameters that account for the overall attractiveness of domain names versus other forms of virtual space (a) and the elasticities of registrations with respect to general demand (b), registration costs (c), and marginal efforts or commuting costs (d).
The marginal effort $E_{marginal}$ is assumed to increase in registration numbers. Domain name registrations exhibit a pecking order regarding domain quality: names that were registered relatively early tend to be of higher quality than those registered later. Marginal domain registrations, on average, consist of more characters, are less descriptive, and are more difficult to memorize than the existing stock, requiring higher efforts by end users as registrations increase. Those high-quality and easy-to-access locations that are claimed first in land rush markets tend to trade for higher values in secondary markets subsequently.
The marginal level of effort required by users as more and more lower-quality domains get registered can be generalized as

$$E_{marginal} = g \cdot \mathit{Registrations}^{h} \quad (2)$$

where g and h are scaling factors. The choice of a power function is motivated by Zipf's observation that the frequency at which a word is used is inversely proportional to this word's rank in the frequency table (Zipf 1936) and that the rank is a particular power function of word frequency (Zipf 1949). If domains are registered in the order suggested by the keyword frequency table, the marketing potential of domains will also follow a power law. This notion is supported by Cunha et al. (2011), who find that the frequencies of Twitter hashtags are governed by a Zipfian power distribution. Assuming identical registration costs for all domains and plugging (2) into (1), the resulting expression can be solved for Registrations and simplified to

$$\mathit{Registrations} = m \cdot \mathit{Pop}^{n} \quad (3)$$

where m and n are products of earlier used constants (and therefore constants as well), with $m = (ab/cKg)^{-(1+h)}$.

This study assumes the same level of use intensity for all domains. While the classical monocentric model does include variable density, that is not a necessary feature of the model. All the essential elements of the monocentric model still come through with a fixed, constant density, as shown in Geltner et al. (2001, Chapter 4). In addition, Lindenthal and Loebbecke (2014) have already documented that more valuable domains are more likely to be developed into more extensive websites, which represents a higher use intensity or "density" compared to registrations of lower-quality domains without further development.
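The step from (1) and (2) to a reduced form in Pop can be sketched under an assumed functional form. The constant-elasticity expression in the first line below is an assumption chosen to match the verbal description of a, b, c, and d; it is not taken from the paper, and the constants derived from it are illustrative rather than a restatement of the paper's own expressions for m and n.

```latex
\begin{align*}
% Assumed constant-elasticity form of (1); illustrative only
\mathit{Registrations} &= a \cdot \mathit{Pop}^{b} \cdot K^{-c} \cdot E_{marginal}^{-d} \\
% Power law for the marginal effort, as described for (2)
E_{marginal} &= g \cdot \mathit{Registrations}^{h} \\
% Substitute the second line into the first and collect Registrations
\mathit{Registrations}^{1+dh} &= a \, g^{-d} \, K^{-c} \, \mathit{Pop}^{b} \\
% Reduced form with constants m and n
\mathit{Registrations} &= m \cdot \mathit{Pop}^{n},
\qquad m = \left(a \, g^{-d} \, K^{-c}\right)^{1/(1+dh)},
\qquad n = \frac{b}{1+dh}
\end{align*}
```

Under this sketch, with b = 1 the exponent n falls below 1 whenever dh > 0, that is, whenever marginal effort rises with the number of registrations and higher effort deters registrations. With h = 0 the supply of equally accessible names is unconstrained and n = b.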
Owner-operated websites are not the only form of cyberspace available to virtual dwellers. Alternatively, they can connect with their audiences through shared spaces like social media platforms, wikis, online market places or direct communications and marketing. For instance, the increasing role of social networks in connecting companies and their customers could weaken the demand for domains in general. A local business might find it more cost-effective to promote its Facebook profile instead of steering customers towards its website. Conversely, changes to search engine algorithms could make it easier for users to find relevant content on millions of individual websites, tilting the balance in favor of owning domains. The competitive position of domains versus other options is, among other factors, accounted for in factor m.
This framework also allows investigating demand levels for segments of domains by employing subset specific values for E. For instance, the relative commuting costs for a domain under the COM Top Level Domain (TLD) could differ from the cost of accessing a NET or ORG domain, resulting in the different demand levels for each TLD, documented by Lindenthal (2014).
The empirical part of the paper splits a cross-section of domain registrations into subsets for which the level of demand Pop is quantifiable and the number of registered domains is known. The fixed cost K is identically distributed for all domains in the cross-section and can therefore be omitted. In a future study, the price sensitivity of domain registrations could be estimated by analyzing values of K that vary over time.
Figure 2 visualizes the approach: For each group of domains, the intersection of the respective demand curve (D1 or D2) with the supply curve S1 can be observed as the number of registered domains, R1 and R2. Demand for domains from group 2 is higher than demand for domains from group 1, as indicated by an upward-shifted demand curve and higher values for R2 than for R1. This analysis is only feasible if data on fundamental demand levels and registration numbers per segment can be directly observed.
The relationship between the number of potential domain registrants Pop and actual Registrations can now be estimated empirically in the following log-log regression specification:

$$\ln(\mathit{Registrations}_i) = \alpha + \beta \ln(\mathit{Pop}_i) + \varepsilon_i \quad (4)$$

The regression coefficient β estimates the elasticity of Registrations with respect to Pop, α is a constant, and $\varepsilon_i$ is an independently and identically distributed error term.
Since the price for a domain registration is constant, the elasticity of registrations with respect to Pop should, in theory, equal 1 if the supply of domains were effectively unconstrained. The regression coefficient β would then have a value of 1. A coefficient estimate significantly below 1, however, would support the hypothesis of virtual space scarcity:

H1-The relative increase in the number of registered domains is smaller than the relative increase in the number of potential domain registrants.
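The test of H1 can be illustrated with a minimal sketch: an ordinary least squares fit of the log-log specification, followed by a t-statistic for the null hypothesis that the elasticity equals 1. The function names and the synthetic data (generated with a true elasticity of 0.8) are assumptions made for illustration only and are not the paper's data or code.

```python
import numpy as np

def loglog_elasticity(pop, registrations):
    """OLS of ln(registrations) on ln(pop); returns (alpha, beta, se_beta)."""
    x = np.log(np.asarray(pop, dtype=float))
    y = np.log(np.asarray(registrations, dtype=float))
    X = np.column_stack([np.ones_like(x), x])      # constant and ln(Pop)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares estimates
    resid = y - X @ coef
    dof = len(y) - X.shape[1]                      # degrees of freedom
    sigma2 = resid @ resid / dof                   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)          # classical OLS covariance
    return float(coef[0]), float(coef[1]), float(np.sqrt(cov[1, 1]))

def t_stat_against_one(beta, se_beta):
    """t-statistic for H0: beta = 1 (unconstrained supply)."""
    return (beta - 1.0) / se_beta

# Synthetic, purely illustrative data with a true elasticity of 0.8.
rng = np.random.default_rng(0)
pop = rng.uniform(1e5, 1e7, size=300)
registrations = np.exp(0.5 + 0.8 * np.log(pop) + rng.normal(0.0, 0.3, size=300))

alpha, beta, se = loglog_elasticity(pop, registrations)
t_one = t_stat_against_one(beta, se)
print(f"beta = {beta:.3f}, t-statistic against 1 = {t_one:.2f}")
```

A strongly negative t-statistic against 1 is the pattern that H1 predicts for a supply-constrained market.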
Defining E at the domain level also allows accounting for any differences in linguistic quality between domains and the resulting differences in commuting costs and registrations. Zipf (1936) shows that shorter words tend to appear more frequently in natural languages than long expressions. Similarly, the length of Twitter hashtags is inversely related to their usage frequencies (Cunha et al. 2011). If Zipf's principle of least effort also holds true in domain space, short domain names will be registered more often than long domain names. For instance, bearers of long surnames are less likely to register a domain containing their name than somebody with a relatively short name. To give a simplifying example, domains derived from the keywords "Pennington Associates Milwaukee" might be more tedious to type than those derived from "Carr Associates Miami", making the former less likely to appear in registrations.
If domain length is a valid proxy for the effort required by users to access a virtual location, then an inverse relation between the length of a string and the number of registrations containing this string can be expected.
H2-The likelihood of a character string being registered as a domain name decreases in the length of the string.
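To test H2, the variable domain Length is added to (4):

$$\ln(\mathit{Registrations}_i) = \alpha + \beta_1 \ln(\mathit{Pop}_i) + \beta_2 \ln(\mathit{Length}_i) + \varepsilon_i \quad (5)$$

A negative estimate for the regression coefficient $\beta_2$ can be interpreted as evidence for different levels of effort required by users, or for the equivalent to commuting costs required by the monocentric city model.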
In a similar fashion, the number of keywords within a domain name can be interpreted as an additional measure for commuting costs, as more keywords require more effort when memorizing and recalling. However, combining multiple keywords results in a trade-off between brevity and descriptiveness. In case the domains "pizza.com" or "pizzaboston.com" are already taken, "tastypizzaboston.com" might still be available as the electronic storefront of a local pizza place. Theoretically, each additional keyword increases the number of potential domain names by several orders of magnitude: If the total number of viable single keywords is W, then $W^2$ two-keyword combinations or $W^3$ three-keyword combinations are possible. Whoever is willing to accept the higher effort required to access a domain consisting of many keywords has plenty of choice. This trade-off between availability and domain quality reconciles the view of seemingly unconstrained domain supply with the observation that short, low-effort domains are not easy to come by: Just add a few more keywords and you can have any domain you want.
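The growth of the name space with each additional keyword can be made concrete with a short calculation. The vocabulary size W = 10,000 used below is a hypothetical value chosen only for illustration; the paper does not state a figure for W.

```python
def potential_domain_names(vocabulary_size: int, keywords: int) -> int:
    """Number of ordered keyword combinations of a given length (W ** k)."""
    return vocabulary_size ** keywords

# W = 10,000 is a hypothetical vocabulary size, not a value from the paper.
W = 10_000
name_space = {k: potential_domain_names(W, k) for k in (1, 2, 3)}

for k, count in name_space.items():
    print(f"{k} keyword(s): {count:,} potential names")
```

With this illustrative W, every extra keyword multiplies the number of potential names by a factor of 10,000, or four orders of magnitude.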
This notion motivates one last hypothesis:

H3-Domain space is less constrained for combinations of multiple keywords than for single-keyword domains.
To test H3 empirically, the domain registrations for each surname or MSA i are further subdivided into 4 subgroups k, where k denotes the number of keywords in each name. For instance, $\mathit{Registrations}_{Boston,2}$ counts the number of COM registrations containing BOSTON and one additional keyword, $\mathit{Registrations}_{Boston,3}$ is the number of domains with two additional keywords, and so forth. The dummy variables $D_n$ are defined as 1 if k = n and 0 otherwise. All β's are regression coefficients.
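One specification consistent with this description and with the way the results are later reported (level effects for the keyword dummies plus keyword-specific elasticities) is sketched below. The exact set of regressors and the index labels on the coefficients are assumptions; two-keyword domains serve as the base case.

```latex
\begin{equation*}
% Sketch only: coefficient labels beta_{D,n} and beta_{P,n} are hypothetical
\ln(\mathit{Registrations}_{i,k}) = \alpha + \beta_{1}\ln(\mathit{Pop}_{i})
  + \sum_{n=3}^{5} \beta_{D,n}\, D_{n}
  + \sum_{n=3}^{5} \beta_{P,n}\, D_{n}\,\ln(\mathit{Pop}_{i})
  + \varepsilon_{i,k}
\end{equation*}
```

In such a specification, the elasticity for domains with k keywords is $\beta_1 + \beta_{P,k}$. This matches how the three-keyword MSA elasticity is reported in the results as the sum 0.667 + 0.132 = 0.799.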

Data
The domain name system was designed in a distributed fashion⁶ with as little information as possible managed in centralized registries. Each Top Level Domain, for instance, administers separate databases for its domain registrations and delegates the actual task of managing the information linked to a domain to a large number of decentralized domain name servers. So-called TLD zone files keep track of the authoritative domain name servers for each domain under a specific TLD. The zone file for the COM domains can be downloaded from Verisign (2014b). Strictly speaking, this zone file does not contain all registered domains, but only active COM domain names with a DNS entry, which account for more than 99.5 % of all domains (Verisign 2014a).

Domain names often comprise multiple keywords linked together into one character string, complicating any analysis of the names' meaning. This concatenation is reverted and all domains are split into their base keywords, employing an application programming interface described in Huang et al. (2010) and hosted by Microsoft Research (Microsoft Research 2014).⁷

The next step identifies groups of domain names that contain popular surnames or city names as a keyword. This segmentation builds on the premise that the number of domains per city resident or bearer of a surname is (on average) the same across all cities or names: Why would there be a different number of domain registrations containing "Miller" versus "Smith", for example, after accounting for the total number of citizens named Miller or Smith and the length of the name? Similarly, the demand for domains is also expected to be equal across cities after controlling for the size and socio-economic composition of the city populations. The cities' average income per capita and the share of the population holding university degrees are added to the regression equation as control variables since Goldfarb and Prince (2008) found high-income and well-educated people to be overrepresented among early Internet users. Beyond education and income, why would there be higher or lower per capita demand for domains in, e.g., Houston versus Dallas?
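The two data steps described above, splitting concatenated names into keywords and counting the domains that contain a given surname or city name, can be sketched as follows. The dictionary-based dynamic program is a simple stand-in and not the Microsoft Research service used in the paper; the vocabulary, the function names, and the sample domains are hypothetical.

```python
from functools import lru_cache

def split_keywords(name, vocabulary):
    """Split a concatenated label into the fewest vocabulary words.

    Returns a list of words, or None if no complete segmentation exists.
    """
    @lru_cache(maxsize=None)
    def best(start):
        if start == len(name):
            return ()                          # reached the end: nothing left
        candidates = []
        for end in range(start + 1, len(name) + 1):
            word = name[start:end]
            if word in vocabulary:
                rest = best(end)
                if rest is not None:
                    candidates.append((word,) + rest)
        # Prefer the segmentation with the fewest keywords.
        return min(candidates, key=len) if candidates else None

    result = best(0)
    return list(result) if result is not None else None

def count_by_keyword(domains, vocabulary, targets):
    """Count domains whose segmentation contains each target keyword."""
    counts = {target: 0 for target in targets}
    for domain in domains:
        words = split_keywords(domain, vocabulary)
        if words is None:
            continue                           # skip names that cannot be split
        for target in targets:
            if target in words:
                counts[target] += 1
    return counts

# Hypothetical vocabulary and domain labels (without the ".com" suffix).
vocabulary = frozenset({"tasty", "pizza", "boston", "bost", "on", "miller"})
domains = ["pizzaboston", "tastypizzaboston", "millerpizza", "xyzzy"]

segments = split_keywords("tastypizzaboston", vocabulary)
counts = count_by_keyword(domains, vocabulary, ["boston", "miller"])
print(segments, counts)
```

Preferring the segmentation with the fewest keywords is one simple tie-breaking rule; a production-quality splitter would typically rank candidate segmentations by word frequencies instead.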
The US Census (US Census 2014b) provides an overview of the most popular surnames from the year 2000, including frequency counts and basic demographic information.In addition, this source also lists the population numbers of all US Metropolitan Statistical Areas (MSA) (US Census 2014a).Before linking population numbers and domain registrations, MSAs with the same name (like Portland Oregon and Portland Maine) are aggregated into one observation and the corresponding population numbers are added up.
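The aggregation of identically named MSAs can be sketched in a few lines. The record layout and the population figures below are placeholders chosen for illustration, not census values.

```python
from collections import defaultdict

def aggregate_by_city_name(msa_records):
    """Sum populations of MSAs that share the same city name.

    msa_records: iterable of (city_name, state, population) tuples.
    Returns a dict mapping the lower-cased city name to total population.
    """
    totals = defaultdict(int)
    for city, _state, population in msa_records:
        totals[city.lower()] += population
    return dict(totals)

# Placeholder populations for illustration only (not census figures).
msa_records = [
    ("Portland", "OR", 2_000_000),
    ("Portland", "ME", 500_000),
    ("Boston", "MA", 4_500_000),
]
city_population = aggregate_by_city_name(msa_records)
print(city_population)
```

Aggregating by name rather than by MSA reflects that a domain containing "portland" cannot be attributed to one of the two cities from the character string alone.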
Table 1 displays summary statistics for all domain names and for subsamples of domains containing MSA names or popular surnames. Out of 107.5 million domain names, 9.6 million domains contain a popular US surname and 2.98 million domains feature a city or MSA name.⁸ Overall, surname domains are on average 14.65 characters long (not counting the ".com"), which is 1.3 characters more than the average length of a COM domain. Surname domains also contain more keywords, with an average of 2.67 words versus 2.47 for all domains. City-related domains are even longer, with 18.78 characters and 3.09 keywords on average.
As demand for a certain group of domains increases, registration of domains in these groups also increases, including an increasing number of long names. For the 500 most frequent surnames in the US census, 60 % more domains have been registered than for the following 500 surnames in the frequency ranking (5.93 million vs. 3.7 million). The higher registration numbers come at a cost. The more sought-after domains are on average longer and contain more keywords than domains containing less frequent surnames. All differences in length and keyword counts are statistically significant with t-values above 2.6.
For cities, the differences become even more pronounced. When splitting the sample of MSAs at the population median, the most populous cities account for more than 4 times the number of domain registrations originating from the lower half of MSAs (2.49 million vs. 0.6 million), and the magnitude of the average length differences is also substantially bigger.
The variable Length captures the number of characters for each surname or MSA name in the sample. On average, a top-1000 surname is 6.08 characters long. The maximum is 11 and the minimum is only 2 characters (US Census 2014b). For MSA names, the mean of Length is 8.67, and the range is 4 to 16 characters (US Census 2014a).
For the MSAs, the share of the population having attained a bachelor's or graduate degree ranges from 7 % (Merced, CA) to 34 % (Boulder, CO), and the minimum income per capita is USD 14,126 (McAllen, TX) while the maximum value is USD 48,900 (Bridgeport, CT).

Results
Figure 3 presents the population numbers (a) and the frequencies of surnames (b) plotted against the number of corresponding domain registrations. The logarithms of the domain numbers are in both cases a linear function of the logarithms of the underlying demand metrics. However, the estimated trend lines (solid lines) have slopes below one (the unconstrained benchmark, shown as dashed lines), indicating constrained markets. Panel (a) in Table 2 quantifies the magnitude of the elasticities: If the population of an MSA is one percent bigger than the population of another MSA, the number of domain registrations is only about 0.80 % higher. For surnames, the elasticity is even lower. A one percent increase in the frequency of a surname pushes domain registrations up by only 0.74 %. The estimated coefficients for ln(Pop) are statistically different from both 0 and 1 at the 1 % significance level, confirming Hypothesis 1.⁹ Domain demand is linked to the knowledge economy: Higher levels of educational attainment in an MSA lead to higher levels of domain demand. After controlling for education, the coefficient for income becomes insignificant.
The estimate of 0.80 % for MSAs may be a conservative figure, as the market may in fact be even more constrained for two reasons. First, some MSA names undoubtedly have marketing appeal to users from other parts of the world. Global brand names like "New York" or "Los Angeles" are not exclusive to residents of these MSAs. Any non-native usage could inflate domain registration numbers for larger MSAs and bias the elasticity estimates upwards. Second, large MSAs like New York could have disproportionately more small businesses and retail shops given their consumption variety. On the other hand, smaller MSAs might depend more heavily on branches of larger chains that do not require their own web presences. Analyzing the link between the industrial composition of MSAs and domain registrations could be an interesting aspect for a future study.

9 The results are robust with regard to the population size of MSAs or the frequencies of the surnames: When splitting the sample at the median of Pop, all estimated elasticities remain significantly smaller than 1.

Longer domains are less desirable than shorter names. Panel (b) of Table 2 lists estimated regression coefficients from (5). The coefficients for ln(Length), $\beta_2$, are negative and significantly different from zero for surnames and MSA names alike. Increasing the length of a surname from the median (6 characters) to the 75th percentile (7 characters) reduces the number of domain registrations by a staggering 24 %.¹⁰ For MSA names, adding one more letter to the median of 9 results in a 14 % lower number of registrations. H2 is clearly confirmed.
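The 24 % figure for surnames follows from the log-log form of the length term and can be reproduced with a few lines. The coefficient of -1.787 is the surname estimate quoted in the paper's footnote 10; the function name is an assumption made for this sketch.

```python
import math

def registration_change(length_from, length_to, length_elasticity):
    """Relative change in registrations implied by a log-log length coefficient."""
    log_change = math.log(length_to) - math.log(length_from)
    return math.exp(length_elasticity * log_change) - 1.0

# Surname example: median length of 6 characters versus 7 characters,
# using the coefficient of -1.787 from footnote 10.
change = registration_change(6, 7, -1.787)
print(f"{change:.1%}")
```

Because the specification is in logarithms, the effect of one extra character depends on the starting length: the same coefficient implies a smaller percentage drop when a character is added to an already long name.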
H3 hypothesizes that registrants circumvent the problem of their desired domain already being registered by adding more keywords. The regression coefficients from (5) are displayed in Panel (c) of Table 3. The negative coefficients for $D_{3\,Keywords}$, $D_{4\,Keywords}$, and $D_{5\,Keywords}$ confirm that domains with fewer keywords are more popular than longer alternatives. The base case, $D_{2\,Keywords}$, comprises the shortest and most sought-after group of domains, consisting of only the MSA name or surname in conjunction with one additional keyword.

The estimated elasticity for two-keyword MSA domains is 0.667, which is below the overall elasticity of 0.795 estimated in model (a) for all keyword lengths. The elasticity for three-keyword MSA domains is already higher at 0.799 (0.667 + 0.132), while four-keyword and five-keyword domains are basically unconstrained. For surnames, the elasticities also increase each time a new keyword is added. These estimates support Hypothesis 3: adding more keywords to a domain name reduces supply constraints.

10 1 - exp((ln(7) - ln(6)) * (-1.787)) = 0.24

Conclusion and Discussion
This study investigates whether the primary market for Internet domain names can be analyzed using standard urban economic theories. It shows that core prerequisites are indeed met which allow application of the monocentric city model to virtual space: an equivalent to commuting costs can be constructed from differences in the linguistic properties of domain names, and the supply of "central" virtual locations that exhibit low commuting costs is constrained.
The empirical results show that the number of registered domain names has been increasing at a slower pace than fundamental demand for domain space. The magnitude of this effect is large in economic terms: If the elasticities found for domains containing surnames or city names are a valid estimate for the overall elasticity, then the total number of domain names could be up to a quarter higher if more domains were available for registration.
The introduction of new global top level domains has the potential to serve some of the demand not met by the current domain extensions (additional benefits of more competition in TLD space, like technical innovation, lower prices, joint marketing efforts and overall more choice for consumers, have been described by Mueller [1998] and others). It is too early to tell whether the new space will be accepted as a viable alternative to the established space. Replacing the ubiquitous COM or the country-specific TLDs like NL, DE, or CO.UK with a new TLD is comparable to adding one more keyword to a name: It increases choice massively, but additional keywords are only viable in very crowded segments of domain space (see Table 3, Panel [c]).
In land markets, external factors like the topography of a metropolitan area can exacerbate land scarcity (Saiz 2010): cities that are physically constrained by water or mountainous terrain exhibit steeper land price gradients than places in open landscapes. Analogous policy choices in the administration of the web's address system can partially remove existing constraints on domain supply. Launching additional top level domains will alleviate scarcity in web locations, but not fully overcome it. Figuratively speaking, the new TLDs flatten hills and fill in water, but once these obstacles are removed, the overarching constraints will kick in again: The set of catchy keywords that appeal to humans is still bound by the way we process language, even if we had unlimited choice in top level domains. Legend has it that Mark Twain advised to buy land, since "they have stopped making it".¹¹ Similarly, one can argue that investing in (top level) domains is a promising business venture, since we have stopped inventing new languages, at least at a large scale.
Additionally, the paper confirms old market wisdom: Longer names are indeed registered less frequently than shorter names. In follow-up studies, the virtual equivalent to commuting costs can be extended to analyze the effect of other linguistic characteristics like keyword types, special characters, numerals or hyphens. Additional keywords in a name might provide more choice of possible domains, but they come at the cost of longer overall names.
These findings can be generalized to other location systems based on natural language, like Twitter handles, identifiers in online communities or names of existing companies, individuals or even cities. Latino and Hispanic immigrants display a strong preference for places with Spanish names in the south-western USA, after accounting for county fixed effects and other locational variables (Saiz 2014). Since city names influence residential choices, it would not be surprising if a city with a short, tongue-friendly name attracted more new residents than less appealingly named alternatives: why settle for a linguistic challenge like Schenectady if Albany is so close by? Finally, by applying the monocentric city model to virtual space, one can transfer long-established findings from traditional land markets to domain markets: For instance, the intensity of space use is predicted to be higher for domain names with low commuting costs than for locations in the linguistic periphery, a prediction which could be tested in follow-up studies.

11 Multiple versions of this quote are widely distributed; a source has not been handed down, however.
More research is also needed to understand the determinants of domain registrations over time. The current study draws its conclusions from one cross-section only. Technological change could reduce the effort required to navigate to a website and might channel more demand to peripheral locations. Examples of relevant technological change include the auto-complete function in the browser's address bar, which automatically fills in long domain names if they have been visited in the past. Additionally, it would be interesting to investigate where unsuccessful registrants turn after not having found a suitable name. The substitution between "owner-occupied" domain space and "rented" locations on, e.g., social media platforms is not yet understood in the academic literature.
Fig. 1 Internet Usage, Domain Name Registrations, and Re-sale Prices. Notes: Domain registrations and the worldwide number of Internet users grew at similar rates up to 2007. Later, domains get registered at a slower pace. The prices paid for already existing domains increased by 60 % from 2006 through 2013. Data: Domain registrations from ICANN (2014) and Zook (2015), domain price index from Lindenthal (2014), number of global internet users from Worldbank (2005)

Fig. 2 Demand and Supply for Domain Names

Fig. 3 Domain registrations and fundamental demand. Notes: All axes in logarithmic scales. The solid lines represent the estimated elasticity between the population of US MSAs (a) or the frequency of surnames in the 2012 US Census (b) and the number of corresponding registrations of COM domains. For both panels, the actual elasticities are lower than the unconstrained elasticities shown as dashed lines. Domain registrations grow more slowly than fundamental demand for cyberspace

Table 1 Summary Statistics. Notes: Domains containing popular surnames or city names account for a large share of all registered COM domains. Overall, the number of characters and keywords per domain is larger for more frequent (Top 50 %) surnames and more populous cities. All but one difference (Δ) in length and number of keywords are statistically significant, with t-values above 2.6

Table 2 Regression coefficient estimates