EVERYBODY’S TALKIN’ AT ME: LEVELS OF MAJORITY LANGUAGE ACQUISITION BY MINORITY LANGUAGE SPEAKERS

Immigrants in economies with a dominant native language exhibit substantial heterogeneity in their acquisition of the majority language. We model partial language acquisition as an equilibrium phenomenon. We consider an environment where heterogeneous agents from various minority groups choose whether to acquire a majority language fully, partially, or not at all. Different acquisition decisions confer different communicative benefits and incur different costs. We offer an equilibrium characterization of language acquisition strategies and find that partial acquisition can arise as an equilibrium behavior. We also show that a language equilibrium may exhibit insufficient learning relative to the social optimum. In addition, we provide a local stability analysis of steady state language equilibria. Finally, we discuss econometric implementation of the language acquisition model and establish identification conditions.


Introduction
Commonality of language has long been understood to play an essential role in promoting national solidarity while language differences can be a source of division and conflict. The distribution of language use within a given population, therefore, can have important implications for social stability. Further, in societies with a dominant majority language and changes in population composition due to migration, knowledge of the majority language is an essential dimension of assimilation.
The process of immigrant language acquisition exhibits enormous heterogeneity across time and place. Examples of the slow convergence of language commonality abound in European contexts. Hobsbawm (1990) [41] describes how, in 1789, about half of the French population did not speak French at all, and only about 12-13% spoke French fairly well. It took more than 200 years to reach the current level of French language use in the country, about 88% of the population. Even now segments of the population speak various languages, as each of Breton, Corsican, German, Italian, Portuguese, Occitan, and, possibly, Picard, is used by hundreds of thousands of people. 1 Russian/Soviet history provides a second illustration of how the emergence of a common language can be extremely slow. Following the Russian-Persian war of 1826-1828, the Russian Empire took control of a wide range of territories including present-day Georgia, Armenia and Azerbaijan. Russian became the official and administrative language in the region, and this was combined with systematic efforts to spread the language across the newly acquired regions. These efforts had limited success. For example, in the early 20th century, it is estimated that only 3-4% of Armenians could read or speak Russian (Suny 1968 [55]). The numbers increased under Soviet rule, but even then, according to the 1970 USSR Census, only 30.1% of Armenians could read or speak Russian, whereas the corresponding numbers were even lower in Azerbaijan and Georgia, 16.6% and 21.3%, respectively (Zinchenko 1972 [57]).
In other cases, there appears to be a steady-state failure for a unique common language to emerge. In Belgium, the native language of about 60% of the population is Flemish, while 40% have French as their native language. According to Eurobarometer, only 40% of the Flemish-speaking population claim to know French, whereas a much lower share of French-speaking residents, 12% (!), speak Flemish (Ginsburgh and Weber 2011 [32]). A similar disparity prevails in Canada with 75% of Anglophones and 23% of Francophones. In the English-speaking part of Canada (outside of Quebec), less than 7% of Anglophones speak French, while in Quebec, only about 35% of Francophones speak English (Statistics Canada, 2016 Census).
There are also examples of convergence to a common language. In Israel, Hebrew was chosen to become the lingua franca for all the various linguistic groups from North Africa, Eastern Europe, and North America that emigrated to the country and is, among the Jewish population, universally known. Among the Arab population it is also very widely adopted. Similarly, it is well understood that in the American case, the children of immigrants universally learn English, while English, although now universal among Native Americans to the extent that Native American languages are threatened with extinction, had a much slower path to common use.
The American case gives a different perspective on language acquisition. While the noted Dutch linguist Abram de Swaan (2001) [21] has claimed that "the globalization proceeds in English," this statement is properly qualified by observing that globalization via the mixing of peoples proceeds via nonstandard English. This is evidenced by the emergence of various dialects of English such as Spanglish, which is widely spoken in American cities that have large Latino communities, such as Los Angeles, New York, and Miami. Since questions on the command of Spanglish are usually not included in surveys, it is difficult to determine the exact number of Spanglish speakers. Stavans (2003) estimates the number of US speakers to be around forty million. 2 Spanglish is spoken by Spanish-speaking people who have moved to the US from other countries, some of whom have limited command of English. As a result, Spanglish plays an important role in Latino communities and "there is little doubt that Spanglish is here to stay" (Rothman and Rell 2005[50], p. 533).
The importance of the degree of command of the majority language by non-native speakers is indicated by the fact that it is included in the official censuses in various countries, including the United States, the United Kingdom, Ireland and Australia. As Table 1 shows, there is a significant number of partial commanders of English in each country. In the US, the number of partial learners is about one third of all non-native speakers of English (13.5%+19.8%). The number is even higher in Ireland, where the fraction of non-native English speakers who exhibit a partial command of English reaches 42.4%. Together, these types of stylized facts lead us to study environments in which minority language speakers choose across three levels of majority language understanding: none, partial, and full. We call this intermediate knowledge stage "partial learning." The variegated patterns we have described have led sociolinguists to focus on the interplay of economic incentives and issues of personal identity as determinants of language acquisition (see, e.g., Joseph (2004) [43], Gumperz (2009) [38]). In this paper, we complement the sociolinguistic arguments by constructing a formal model to analyze patterns of language acquisition in a multilingual society.
The key decision underlying language acquisition involves basic comparisons of costs and benefits. Partial learning is easier than full learning, while full learning may offer more extensive channels of communication within the society and, potentially, higher rewards than partial acquisition. One novelty of our work is the focus on partial learning, while previous models have only treated language acquisition as full or none.
A second crucial element of our approach is attention to heterogeneity in language acquisition among minority language speakers at both the individual and community levels. We distinguish agents via individual and group characteristics, which could, for example, be related to the level of their individual skills, their native language, and the level of literacy of their group. This allows for a discussion of the distribution of language skills in different groups in ways that can be taken to data. We consider a setting with one (native and dominant) linguistic group in the host country and multiple immigrant or minority groups. Within each minority group, individuals differ with respect to language ability, which influences the decision on the level of the dominant language to acquire. 4
Following the traditional theoretical literature on language acquisition (e.g., Selten and Pool (1991) [51] and Lazear (1999) [46]), we examine equilibrium outcomes in a non-cooperative language game among minority groups where the utility of minority individuals is given by their communicative benefit net of language acquisition cost. The key microfoundation of this literature is the positive dependence of the utility of every agent in the economy on the number of others with whom she can communicate by using a common language. The incentives to acquire other languages may be driven by both pure market monetary rewards and non-market benefits of access and exposure to other cultures. We first address a benchmark case where all minority members face a dichotomous choice: either fully engage in the acquisition of the host language or completely refrain from learning it. This analysis extends the traditional binary approach to language acquisition in formal models.
We then extend the analysis to the case where minority agents have the option of partial learning. Naturally, both the cost and benefit of partial learning are lower than those of full learning. The introduction of three ordered alternatives is a novel feature of our paper relative to existing social interactions models of discrete choice (Brock and Durlauf (2002) [11]). Our results for the three-option model differ substantially from the two-alternative setting, in terms of equilibrium behavior, comparative statics, welfare issues, network externalities, and language policy implications. To be specific, partial learning can arise as an equilibrium choice, and the number of partial learners can even exceed that of full learners among minority agents, when partial learning is more valuable relative to full learning in terms of cost and benefits. Moreover, a higher cost of full learning, while naturally reducing the number of full learners, will, somewhat surprisingly, increase the number of partial learners in equilibrium. This phenomenon cannot be captured by the traditional binary language acquisition setting.
We then study the dynamics of language learning in the language economy. This allows us to explore the stability of the equilibria in the static version of our environment and thus speaks to likely limiting configurations of community language acquisition.
Finally, we examine how econometric analogs of the framework might be taken to data. Specifically, we discuss identification issues that arise in our language framework. Here we demonstrate some interesting differences from existing results on identification of social interactions.

Literature Review on Language Acquisition
Our analysis of language acquisition builds on a small body of prior work. This prior work has exclusively focused on binary language choices: each individual either learns the other language or not, and so it does not address partial language acquisition. Nevertheless, important aspects of our equilibrium analysis are based on the prior literature.
In our analysis of language equilibrium, we rely on the model of communicative benefits of Selten and Pool (1991) [51] in which the utility of every agent in the economy increases in the number of others who share a common language. 5 As we alluded to earlier, this assumption is driven by both market monetary rewards and non-market benefits from acquiring other languages. While the main objective of the Selten and Pool paper is the proof of existence of an equilibrium in a very general setting, Church and King (1993) [20] aim at characterization of linguistic equilibrium. To do so, they consider a simplified setting with two linguistic groups and homogeneous costs of language learning for all individuals in each population. Their cost homogeneity assumption produces pooling equilibria in which either the entire population acquires the other language or nobody does. To enrich the Church and King framework, Lazear (1999) [46] (see also Gabszewicz et al. (2011) [26], Ginsburgh and Weber (2011) [32]) introduces heterogeneous linguistic aptitudes, leading to the emergence of separating equilibria, at which a part of the population learns the other group's language, while the rest refrains from acquiring the other language. In addition to their existence and characterization results, Church and King (1993) [20], Lazear (1999) [46], and Gabszewicz et al. (2011) [26] also point out that, due to network externalities, some individuals free ride on communicative benefits generated by other members, which may lead to inefficiency of equilibrium where the equilibrium levels of learning fall below the socially optimal levels. 6
While we rely on the Selten-Pool communicative benefits model, our paper offers novel directions to the existing literature. We formally introduce a concept of partial learning, which is a widespread phenomenon where large segments of the population, especially immigrants, opt for partial rather than full command of the majority language.
While the issue of partial learning was recognized by policy makers and included in population censuses (since 1950 in the US), so far it has not been formally discussed in the theoretical literature. Moreover, while the papers mentioned above deal with two linguistic communities, our analysis allows for more than two immigrant communities, which is the case in many countries. Another major theme of our analysis involves the dynamics of language acquisition and understanding the stability of different steady state language configurations. The closest predecessor to our work is Marrone (2019) [49] who explores the joint evolution of knowledge of mother tongue and dominant language in which individuals make continuous investments in each that determine fluency in each. A key feature of the dynamics in that model involves the dynamic complementarities between the stock of past investments and the marginal product of current ones. Our model focuses on intergroup complementarities rather than the types of complementarities in Marrone (2019) [49].
There is also a prior empirical literature on language acquisition. Most of this literature has focused on estimating the returns to acquisition of the host-country language by immigrants, who have an incentive to learn it if they want to assimilate with locals and find a job. These studies suggest parameter heterogeneity across environments and so provide one route by which our model can explain differences in language acquisition across contexts. Chiswick and Miller (2014) [18] identify a wide range of return values between 5 and 35 percent, depending on data sets, source, destination countries, languages, and gender. 7 There is also a branch of literature, albeit smaller, that examines the number of natives who acquire foreign languages to use at the workplace. 8 It turns out that acquiring a new language adds between 5 and 20 percent to earnings depending on the country and the language considered. Ginsburgh et al. (2007) [30] is a rare example of a study that directly estimates language acquisition, following the Selten-Pool model. This paper derives demand functions for foreign languages, estimated for English, French, German and Spanish in 13 European countries. The authors base the variation on three variables: the number of speakers who share the individual's native language, the number of speakers of the language she considers acquiring, and the linguistic proximity between the two languages. More recently, Ginsburgh et al. (2017) [28] utilize the Selten-Pool model to estimate learning decisions by citizens in some 190 countries in the world by considering 13 of the most important world languages, 9 and identify various factors that influence individuals' learning of the language, including the world population of speakers of that language and the population of speakers of that language in the country of the individuals' residence.
While we do not directly contribute to this empirical literature, our indirect contribution is establishing identification conditions for determining how language acquisition levels may be ascribed to social as opposed to individual level mechanisms.

A Language Economy
Consider an economy with a constant population and $(n + 1)$ groups, a majority group $B$ and $n$ minority groups $S_i$, $i \in \{1, \ldots, n\}$. The population size of $B$ is $\lambda$, and the (identical) population size of each $S_i$ is normalized to be 1, with $\lambda > 1$. Individuals in each group are initially unilingual and speak their respective native languages, denoted as $b$ for group $B$ and $s_i$ for group $S_i$. Each language, $b$ or $s_i$, in the economy is linguistically distant from every other language, in that communication between agents from different groups can only take place if the agents at least partially speak the same language.
7 The research for single countries covers, e.g., Australia (Chiswick and Miller 1995 [17]); Canada (Aydemir and Skuterud 2005 [4]); Germany (Dustmann and Van Soest 2002 [24]); Israel (Beenstock et al. 2001 [5]); the United Kingdom (Leslie and Lindley 2001 [47]); and the US (Hellerstein and Neumark 2003 [40]).
8 For example, Canada: Shapiro and Stelcner (1997) [52]; countries of the EU: Ginsburgh and Prieto-Rodriguez (2011) [31]; Hungary: Galasi (2003) [27]; Switzerland: Cattaneo and Winkelman (2005) [16]; and the US: Fry and Lowell (2003) [25]. Interestingly, in the context of Canada, Christofides and Swidinsky (2010) [19] indicated a substantial, statistically significant reward to the command of English in the French-speaking part of Quebec and an insignificant effect of French in the rest of Canada.
9 Chinese, English, Spanish, Arabic, Russian, French, Portuguese, German, Malay, Japanese, Turkish, Italian and Dutch, in descending order of number of speakers.
To focus on language acquisition behavior of minority agents, we assume that majority agents do not learn any minority language, while minority agents can choose to partially or fully learn the majority language $b$ at some cost. 10 Specifically, each minority group consists of heterogeneous individuals distinguished on the basis of a linguistic cost parameter $\theta$, i.e., the private (monetary or effort) cost of learning $b$. Minority agents with higher $\theta$'s are hence less inclined to learn $b$ than their counterparts with lower $\theta$'s. In particular, a type-$\theta$ minority agent can fully learn language $b$ at cost $\ell_f \theta$, partially learn language $b$ at cost $\ell_p \theta$, or choose to not learn language $b$ at no cost, where $\ell_f > \ell_p > 0$. 11 Here, each language learning cost is modeled as the product of a personal factor ($\theta$) and a linguistic factor that depends on the level of acquisition ($F$, $P$, $N$), as in Selten and Pool (1991) [51]. The linguistic cost $\theta$ in each minority group is independently and identically distributed over $[0, 1]$ according to a continuously differentiable cumulative distribution function $H(\theta)$ with an everywhere positive density $h(\theta)$.
Fully or partially learning the majority language provides communicative benefits to minority agents. The communicative benefit for a minority agent is 1 if the agent meets someone and both of them fully know a common language. The communicative benefit is reduced to $\alpha$ if the minority agent partially learns language $b$ and meets someone who knows $b$ fully, and further reduced to $\alpha^2$ if the minority agent partially learns $b$ and meets someone who also knows $b$ partially ($0 < \alpha < 1$). To rationalize these communicative benefits, imagine that each minority agent randomly meets another in the economy to conduct a bilateral trade, which can only be carried out via at least some communication between the two agents. The communication benefits can then be interpreted as the probabilities of a successful bilateral trade, i.e., the bilateral trade takes place with probability 1 if the two parties communicate perfectly, with probability 0 if the two cannot communicate, and with probabilities $\alpha$ and $\alpha^2$ if there is only partial communication between the two.
A minority agent hence chooses to fully ($F$), partially ($P$), or not ($N$) learn language $b$. Denote a pure strategy of a type-$\theta$ minority agent in group $i$ as $\sigma_i(\theta) \in \{F, P, N\}$, which is a Borel measurable function, and $\sigma$ as a strategy profile for all minority agents. The (expected) payoff function of a type-$\theta$ agent $i \in S_i$ given $\sigma$ is:
\[ u_i(\sigma; \theta) = 1 + \lambda\, g(\sigma_i(\theta)) + \sum_{j \neq i} \int_0^1 g(\sigma_i(\theta))\, g(\sigma_j(t))\, dH(t) - c(\sigma_i(\theta)), \tag{1} \]
where $g(\cdot)$ denotes communication benefits, $c(\cdot)$ is learning costs, and
\[ g(F) = 1, \quad g(P) = \alpha, \quad g(N) = 0; \qquad c(\sigma_i(\theta)) = \ell_f \theta, \ \ell_p \theta, \ \text{or } 0 \ \text{ for } \sigma_i(\theta) = F, P, N, \text{ respectively.} \]
Agent $i$'s utility $u_i(\sigma; \theta)$ consists of the cost of choosing strategy $\sigma_i(\theta)$, $c(\sigma_i(\theta))$, and the total benefit of choosing $\sigma_i(\theta)$, which is the sum of the benefit from communicating with $i$'s own people in $S_i$ (payoff of 1), the benefit from communicating with majority agents (payoff of $\lambda g(\sigma_i(\theta))$), and that from communicating with minority agents in another group $S_j$ (payoff of $\int_0^1 g(\sigma_i(\theta))\, g(\sigma_j(t))\, dH(t)$). 12 To see (1) more clearly, consider $n = 2$, where the payoff of a type-$\theta$ agent in $S_1$ from $\sigma_1(\theta)$ is:
\[ u_1(\sigma; \theta) = 1 + g(\sigma_1(\theta)) \left[ \lambda + \int_0^1 g(\sigma_2(t))\, dH(t) \right] - c(\sigma_1(\theta)). \]
The second term in $u_1(\sigma; \theta)$ is the benefit from communicating with the majority group and the other minority group, where the integral $\int_0^1 g(\sigma_2(t))\, dH(t)$ aggregates the measures of agents in $S_2$ who choose $F$, $P$, and $N$, weighted by their communicative benefits.
10 This is a reasonable assumption as minority agents are more inclined to learn a majority language, which allows them access to the prevailing economic resources and opportunities. Laitin (2000) [45], however, argues that minority language survival is a coordination problem and multiple languages can coexist with various language movements.
11 For an empirical evaluation of language costs, see Carliner (2000) [15]. In addition, the importance of heterogeneity in language acquisition costs is emphasized by Bleakley and Chin (2010) [8], who use the arrival ages of immigrants to identify causal effects of English language acquisition on socioeconomic outcomes, as age of arrival captures differences in language learning ability due to brain development.
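As a minimal sketch of the payoff just described, the following Python snippet evaluates a type-$\theta$ agent's payoff from each choice. It assumes the benefit weights $g(F) = 1$, $g(P) = \alpha$, $g(N) = 0$ and the costs $\ell_f\theta$, $\ell_p\theta$, 0 implied by the verbal description; the function names and the numerical values are hypothetical and chosen for illustration only.

```python
# Hypothetical helper names; functional forms follow the verbal description
# of the payoff and are assumptions of this sketch.

def g(choice, alpha):
    """Communicative weight of a choice: 1 (full), alpha (partial), 0 (none)."""
    return {"F": 1.0, "P": alpha, "N": 0.0}[choice]


def payoff(choice, theta, other_groups, lam, alpha, l_f, l_p):
    """Payoff of a type-theta minority agent.

    other_groups holds one (share_full, share_partial) pair per other
    minority group: the measures of agents choosing F and P in that group.
    """
    own_group = 1.0  # communication within the own group is always possible
    # lam majority agents plus the g-weighted mass of each other minority group
    reach = lam + sum(g("F", alpha) * sf + g("P", alpha) * sp
                      for sf, sp in other_groups)
    learning_cost = {"F": l_f, "P": l_p, "N": 0.0}[choice] * theta
    return own_group + g(choice, alpha) * reach - learning_cost


# n = 2 illustration: the other group has 40% full and 40% partial learners.
params = dict(lam=1.2, alpha=0.5, l_f=9.0, l_p=3.0)
payoffs = {a: payoff(a, 0.2, [(0.4, 0.4)], **params) for a in "FPN"}
best = max(payoffs, key=payoffs.get)
```

In this illustration a type $\theta = 0.2$ agent finds partial learning optimal: full learning reaches more people but costs three times as much per unit of $\theta$.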

Static Language Equilibrium
How will minority agents make language acquisition decisions in such a language economy? The payoff function in (1) makes it clear that a minority agent's decision hinges on the tradeoff between the agent's idiosyncratic learning cost and communicative benefits. Importantly, notice that full or partial language learning from a minority agent generates positive spillover effects on majority agents, as well as agents in other minority groups, which makes the interaction one with strategic complementarities.
We now analyze static language equilibria where minority agents make simultaneous language acquisition decisions noncooperatively. Given that all minority groups are identical, we adopt the natural solution concept of symmetric (Bayesian) Nash equilibrium, called symmetric equilibrium hereafter, defined as follows:
Definition 1 A symmetric equilibrium in the language economy is a strategy profile $\sigma = (\sigma_1, \ldots, \sigma_n)$, where $\sigma_i : [0, 1] \to \{F, P, N\}$ for group $S_i$ and $\sigma_i(\theta) = \sigma_j(\theta)$ for all $i, j \in \{1, \ldots, n\}$ and $\theta \in [0, 1]$, such that given $\sigma_{-i} = (\sigma_1, \ldots, \sigma_{i-1}, \sigma_{i+1}, \ldots, \sigma_n)$, $\sigma_i(\theta)$ is a best response for a type-$\theta$ minority agent in $S_i$, i.e.,
\[ u_i(\sigma_i(\theta), \sigma_{-i}; \theta) \geq u_i(a, \sigma_{-i}; \theta) \quad \text{for all } a \in \{F, P, N\}. \]
Hence in a symmetric equilibrium, agents in different minority groups choose the same strategy if these agents have the same linguistic type $\theta$. We have also restricted attention to pure-strategy symmetric equilibrium.
Given the separability and linearity of communicative benefits and learning costs in our setting, it is intuitive that minority agents play cutoff strategies in a symmetric equilibrium. Lemma 1 summarizes some preliminary equilibrium properties: 13

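Lemma 1 Let $\sigma^*$ be a symmetric equilibrium. (Monotonicity) For any types $\theta$ and $\theta'$ with $\theta < \theta'$, type $\theta$ acquires at least as much of language $b$ as type $\theta'$, where $F$ ranks above $P$ and $P$ above $N$. In addition, there is always a positive measure of full learners in $\sigma^*$.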
Lemma 1 implies that any (pure) symmetric equilibrium is in cutoff strategies with at most two interior and monotonic cutoffs $\theta_f$ and $\theta_p$, $\theta_f < \theta_p$, where type $\theta_f$ is indifferent between full learning and partial learning, while type $\theta_p$ is indifferent between partial learning and no learning. In addition, Lemma 1 implies that any symmetric equilibrium can only take one of four formats: first, there is complete coverage and all minority agents fully learn $b$ (equilibrium F); second, there is complete coverage and all minority agents either fully or partially learn $b$ (equilibrium FP); third, there is incomplete coverage and minority agents either fully learn or do not learn $b$ (equilibrium FN); and fourth, there is incomplete coverage and minority agents fully learn, partially learn, or do not learn $b$ (equilibrium FPN).
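The cutoff structure can be sketched in a few lines of Python. The encoding of the corner formats through the cutoffs (setting $\theta_p = 1$ when nobody abstains and $\theta_p = \theta_f$ when nobody learns partially) is a convention adopted for this illustration, not part of the formal analysis, and the function names are hypothetical.

```python
def cutoff_strategy(theta, theta_f, theta_p):
    """Choice of a type-theta agent under cutoffs theta_f <= theta_p."""
    if theta <= theta_f:
        return "F"   # low-cost types learn fully
    if theta <= theta_p:
        return "P"   # intermediate types learn partially
    return "N"       # high-cost types do not learn


def equilibrium_format(theta_f, theta_p):
    """Label the configuration implied by cutoffs with 0 < theta_f <= theta_p <= 1.

    Assumed convention: theta_p == 1 means no type abstains, and
    theta_p == theta_f means no type learns partially.
    """
    label = "F"                 # a positive measure of full learners always exists
    if theta_p > theta_f:
        label += "P"            # types in (theta_f, theta_p] learn partially
    if theta_p < 1.0:
        label += "N"            # types in (theta_p, 1] do not learn
    return label


formats = [equilibrium_format(tf, tp)
           for tf, tp in [(1.0, 1.0), (0.4, 1.0), (0.4, 0.4), (0.4, 0.8)]]
```

The list `formats` enumerates the four configurations in the order F, FP, FN, FPN.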

Binary Language Acquisition
As a baseline, we first consider a setting with binary language acquisition, where minority agents choose whether or not to learn language $b$. One can view this setting as a special case of our language economy where there is no benefit from partial learning ($\alpha = 0$) and/or the cost of partial learning $\ell_p$ is sufficiently high, so that no minority agent chooses partial learning in equilibrium. Our analysis here enables us to connect with the previous literature, which has mainly focused on binary language acquisition.
For equilibrium construction in this setting, consider a (common) belief that in every minority group, all types less than $\theta_f$ choose full learning ($F$). The payoffs from $F$ and $N$ for a type-$\theta$ minority agent in $S_i$ are, respectively,
\[ u_i(F; \theta) = 1 + \lambda + (n - 1) H(\theta_f) - \ell_f \theta, \qquad u_i(N; \theta) = 1. \]
A symmetric equilibrium hence features an equilibrium cutoff $\theta_f$ implicitly defined as
\[ \ell_f \theta_f = \lambda + (n - 1) H(\theta_f), \tag{2} \]
with at least one interior solution whenever $\ell_f > \lambda + n - 1$, since the left-hand side lies below the right-hand side at $\theta_f = 0$ and above it at $\theta_f = 1$. When $\theta \sim U[0, 1]$ and $\ell_f > \lambda + n - 1$, there is a unique interior equilibrium with cutoff
\[ \theta_f = \frac{\lambda}{\ell_f - (n - 1)}. \]
For a general distribution $H(\theta)$, however, there can be multiple equilibria, which are Pareto ranked as we will see. Figure 1 provides such an illustration where the two solid curves correspond to the RHS and LHS of equation (2).
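The uniform case lends itself to a quick numerical check. The sketch below assumes that the cutoff condition takes the form $\ell_f \theta_f = \lambda + (n-1) H(\theta_f)$ with $H(\theta) = \theta$; the function names, the treatment of the corner case, and the parameter values are illustrative assumptions.

```python
def binary_cutoff_uniform(lam, n, l_f):
    """Cutoff solving l_f * theta = lam + (n - 1) * theta when H(theta) = theta."""
    if l_f <= lam + n - 1:
        # No interior solution: in this sketch we report the corner in which
        # every type learns fully (an assumption about how to treat the corner).
        return 1.0
    return lam / (l_f - (n - 1))


def residual(theta, lam, n, l_f):
    """Cost minus benefit of full learning for type theta at cutoff theta."""
    return l_f * theta - (lam + (n - 1) * theta)


theta_f = binary_cutoff_uniform(lam=1.2, n=7, l_f=9.0)
gap = residual(theta_f, lam=1.2, n=7, l_f=9.0)
```

With $\lambda = 1.2$, $n = 7$ and $\ell_f = 9$, the cutoff is $1.2 / (9 - 6) = 0.4$, so 40% of each minority group learns the majority language and `gap` is zero up to rounding.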
We next briefly discuss the welfare implications of an (interior) equilibrium. Given the parameters, the total welfare of an outcome where minority agents with language aptitude in $[0, \theta]$ fully acquire the majority language is
\[ W_B(\theta) = n \left[ 2 \lambda H(\theta) + (n - 1) (H(\theta))^2 - \ell_f \int_0^{\theta} t \, dH(t) \right], \tag{3} \]
which is the difference between total communicative benefits and learning costs. For each minority group, $2 \lambda H(\theta)$ is the communicative benefit with the majority agents, and $(n - 1) (H(\theta))^2$ is the communicative benefit with the other $(n - 1)$ minority groups. A benevolent social planner chooses the language acquisition decision, i.e., $\theta$, for each minority group to maximize $W_B(\theta)$.
Figure 1: Multiple Equilibria in Binary Language Acquisition.
Consider an interior equilibrium with cutoff $\theta_f \in (0, 1)$, and evaluate the derivative of $W_B(\theta)$ at $\theta_f$:
\[ W_B'(\theta_f) = n\, h(\theta_f) \left[ 2 \lambda + 2 (n - 1) H(\theta_f) - \ell_f \theta_f \right] = n\, h(\theta_f) \left[ \lambda + (n - 1) H(\theta_f) \right] > 0, \tag{4} \]
where the second equality uses the equilibrium condition (2). Hence, for any interior equilibrium, there is insufficient language learning in equilibrium compared to that maximizing the total social welfare $W_B(\theta)$. This is a familiar phenomenon common in economic interactions with spillover effects: in the language economy, a minority agent's language acquisition generates communicative benefits for the majority agents as well as minority agents in the other groups, but such additional benefits are absent in the minority agent's optimization problem, resulting in inefficient learning relative to the optimal or efficient learning outcome.
Proposition 1 summarizes our above analysis:
Proposition 1 (Language Equilibrium with Binary Acquisition) In the language economy with binary language acquisition, a language equilibrium with equilibrium cutoff $\theta_f$ is characterized by (2). In addition, there is insufficient learning in every interior language equilibrium relative to the efficient language learning outcome.
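The under-learning result can be illustrated numerically in the uniform case. The sketch below assumes that welfare takes the form $W_B(\theta) = n[2\lambda H(\theta) + (n-1)H(\theta)^2 - \ell_f \int_0^\theta t\,dH(t)]$, which is one form consistent with the verbal description of the benefit terms; constant terms that do not depend on $\theta$ are omitted, and all names and parameter values are hypothetical.

```python
def welfare_uniform(theta, lam, n, l_f):
    """Assumed welfare with H(theta) = theta, so the cost integral is theta**2 / 2."""
    benefits = 2 * lam * theta + (n - 1) * theta ** 2
    costs = 0.5 * l_f * theta ** 2
    return n * (benefits - costs)


def welfare_slope(theta, lam, n, l_f, eps=1e-6):
    """Central finite-difference approximation of the welfare derivative."""
    up = welfare_uniform(theta + eps, lam, n, l_f)
    down = welfare_uniform(theta - eps, lam, n, l_f)
    return (up - down) / (2 * eps)


lam, n, l_f = 1.2, 7, 9.0
theta_f = lam / (l_f - (n - 1))            # interior equilibrium cutoff
slope = welfare_slope(theta_f, lam, n, l_f)
closed_form = n * (lam + (n - 1) * theta_f)  # n * h * [lam + (n-1) * H] with h = 1
```

The numerical slope matches the closed-form expression and is strictly positive, so a marginal increase in the cutoff above its equilibrium value raises welfare under the stated assumptions.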

Partial Language Acquisition
We now depart from the traditional binary-acquisition analysis and allow for partial language acquisition. We will characterize all possible equilibrium configurations that can arise. Our main interest is to understand when and why minority agents voluntarily and optimally choose to partially learn the majority language.
Recall that by Lemma 1, there are four possible equilibrium configurations: equilibrium F and equilibrium FP where all minority agents fully or partially acquire language b, and equilibrium FPN and equilibrium FN where some minority agents choose to not learn language b at all. Hereafter, we characterize equilibrium conditions for each equilibrium configuration, which consists of identifying the associated equilibrium cutoffs and incentive constraints for all types in each minority group.
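Consider first equilibrium FPN ($\sigma^{FPN}$), which is characterized by two interior cutoffs $\theta_f$ and $\theta_p$ with $0 < \theta_f < \theta_p < 1$, so that in each minority group types in $[0, \theta_f]$ fully learn $b$, types in $(\theta_f, \theta_p]$ partially learn $b$, and types in $(\theta_p, 1]$ do not learn $b$. 14 A type-$\theta$ agent's payoffs from $\{F, P, N\}$ in an FPN equilibrium can be calculated as
\[ u_i(F; \theta) = 1 + \lambda + (n - 1) \left[ (1 - \alpha) H(\theta_f) + \alpha H(\theta_p) \right] - \ell_f \theta, \]
\[ u_i(P; \theta) = 1 + \alpha \left\{ \lambda + (n - 1) \left[ (1 - \alpha) H(\theta_f) + \alpha H(\theta_p) \right] \right\} - \ell_p \theta, \]
\[ u_i(N; \theta) = 1. \]
The conditions for an FPN equilibrium can then be identified as:
\[ (1 - \alpha) \left\{ \lambda + (n - 1) \left[ (1 - \alpha) H(\theta_f) + \alpha H(\theta_p) \right] \right\} = (\ell_f - \ell_p)\, \theta_f, \tag{5} \]
\[ \alpha \left\{ \lambda + (n - 1) \left[ (1 - \alpha) H(\theta_f) + \alpha H(\theta_p) \right] \right\} = \ell_p\, \theta_p, \tag{6} \]
\[ 0 < \theta_f < \theta_p < 1. \tag{7} \]
Expressions (5) and (6) characterize the equilibrium cutoffs $\theta_f$ and $\theta_p$ respectively. In addition, we can simplify (5) and (6) to obtain
\[ \theta_f = \frac{(1 - \alpha)\, \ell_p}{\alpha\, (\ell_f - \ell_p)}\, \theta_p, \tag{8} \]
implying that the interior cutoffs $\theta_f$ and $\theta_p$ in an FPN equilibrium maintain a linear relationship regardless of the distribution $H(\cdot)$.
14 We use the same notation for the cutoffs $\theta_f$, $\theta_p$ for all equilibrium formats to minimize notation.
Next consider equilibrium FP ($\sigma^{FP}$), which is pinned down by a single interior cutoff $\theta_f \in (0, 1)$ such that all types below $\theta_f$ fully acquire language $b$ and all types above $\theta_f$ partially acquire language $b$ in each minority group. This equilibrium arises when partial learning is sufficiently beneficial ($\alpha$ is large) and/or partial learning is not too costly ($\ell_p$ is small). We similarly write down a type-$\theta$ agent's payoffs from $\{F, P\}$ as
\[ u_i(F; \theta) = 1 + \lambda + (n - 1) \left[ \alpha + (1 - \alpha) H(\theta_f) \right] - \ell_f \theta, \]
\[ u_i(P; \theta) = 1 + \alpha \left\{ \lambda + (n - 1) \left[ \alpha + (1 - \alpha) H(\theta_f) \right] \right\} - \ell_p \theta. \]
The associated conditions for an FP equilibrium can then be written as
\[ (1 - \alpha) \left\{ \lambda + (n - 1) \left[ \alpha + (1 - \alpha) H(\theta_f) \right] \right\} = (\ell_f - \ell_p)\, \theta_f, \tag{9} \]
\[ 0 < \theta_f < 1, \tag{10} \]
\[ \alpha \left\{ \lambda + (n - 1) \left[ \alpha + (1 - \alpha) H(\theta_f) \right] \right\} \geq \ell_p. \tag{11} \]
Here, (9) pins down the cutoff $\theta_f$, (10) implies that the most inept type $\theta = 1$ prefers $P$ to $F$, and (11) says that type $\theta = 1$ prefers $P$ to $N$ as well.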
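For uniformly distributed types the FPN cutoffs can be computed in closed form. The sketch below assumes that the two indifference conditions read $(1-\alpha)\{\lambda + (n-1)[(1-\alpha)H(\theta_f) + \alpha H(\theta_p)]\} = (\ell_f - \ell_p)\theta_f$ and $\alpha\{\lambda + (n-1)[(1-\alpha)H(\theta_f) + \alpha H(\theta_p)]\} = \ell_p \theta_p$, with $H(\theta) = \theta$; the function name and parameter values are hypothetical.

```python
def fpn_cutoffs_uniform(lam, n, alpha, l_f, l_p):
    """Return (theta_f, theta_p) of an FPN configuration, or None if none exists."""
    # Ratio of the two indifference conditions: theta_f = k * theta_p.
    k = (1 - alpha) * l_p / (alpha * (l_f - l_p))
    # With H(theta) = theta, the weighted learner mass per group is m * theta_p.
    m = (1 - alpha) * k + alpha
    denom = l_p - alpha * (n - 1) * m
    if denom <= 0:
        return None  # the P-versus-N indifference has no positive solution
    theta_p = alpha * lam / denom
    theta_f = k * theta_p
    if 0 < theta_f < theta_p < 1:
        return theta_f, theta_p
    return None  # cutoffs are not interior and ordered, so the format is not FPN


cutoffs = fpn_cutoffs_uniform(lam=1.2, n=7, alpha=0.5, l_f=9.0, l_p=3.0)
```

For these illustrative parameters the ratio is $k = 0.5$, and the cutoffs are $\theta_f = 0.4$ and $\theta_p = 0.8$: 40% of each group learns fully, 40% partially, and 20% not at all.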
Equilibrium FN ($\sigma^{FN}$), where minority agents either fully learn or do not learn $b$, arises intuitively when partial learning is either of little value or costly. We again calculate
\[ u_i(F; \theta) = 1 + \lambda + (n - 1) H(\theta_f) - \ell_f \theta, \quad u_i(P; \theta) = 1 + \alpha \left[ \lambda + (n - 1) H(\theta_f) \right] - \ell_p \theta, \quad u_i(N; \theta) = 1. \]
Hence the equilibrium cutoff type $\theta_f$ satisfies
\[ \ell_f \theta_f = \lambda + (n - 1) H(\theta_f), \tag{12} \]
which coincides with (2) in the binary setting. The conditions for equilibrium FN are:
\[ 0 < \theta_f < 1, \qquad \alpha\, \ell_f \leq \ell_p, \tag{13} \]
where the second inequality states that the cutoff type $\theta_f$ does not prefer $P$ to $N$.
Finally, consider equilibrium F ($\sigma^{F}$) where all minority agents choose to fully acquire language $b$. Intuitively, this equilibrium arises whenever the cost of full learning $\ell_f$ is sufficiently small. For a type-$\theta$ agent in equilibrium F, we have
\[ u_i(F; \theta) = 1 + \lambda + (n - 1) - \ell_f \theta, \quad u_i(P; \theta) = 1 + \alpha (\lambda + n - 1) - \ell_p \theta, \quad u_i(N; \theta) = 1. \]
The incentive constraint for equilibrium F is hence
\[ \ell_f \leq \min \left\{ \lambda + n - 1, \ \ell_p + (1 - \alpha)(\lambda + n - 1) \right\}. \tag{14} \]
As shown above, while the equilibrium cutoffs (except that for equilibrium F) are only implicitly defined, the characterization for each equilibrium format is straightforward. In particular, the linear structure of the payoffs in our setting greatly simplifies our analysis, where the incentive constraints of all types in $[0, 1]$ can be entirely reduced to some critical types' incentive constraints. 15 Finally, we can conduct a similar welfare analysis as in the binary acquisition setting (see (4)), which, together with the above equilibrium characterization, leads to:
Proposition 2 (Language Equilibrium with Partial Acquisition) In the language economy with partial language acquisition, a symmetric equilibrium is characterized by conditions (5)-(14), depending on the equilibrium format. Moreover, except for the full-learning equilibrium F, there is insufficient learning in every symmetric equilibrium relative to the efficient language learning outcome.
For the welfare analysis in Proposition 2, while the efficient learning outcome cannot be explicitly identified, we can similarly employ a local analysis to show that social welfare strictly increases if we marginally increase the measure of full/partial learners. In particular, for the equilibrium format FPN, we find that there is insufficient full learning and insufficient partial learning, relative to the efficient learning outcome.

Equilibrium Multiplicity
We now construct an explicit numerical example to show that multiple equilibria, yielding different learning outcomes, are a real, not just conceptual, phenomenon in our language economy. And we do this for both the binary acquisition setting and the partial acquisition setting, with a focus on equilibrium FPN for the latter.
Consider the following piecewise linear distribution: , where x, y, z > 0, y > x, y > z, and 2y + x + z = 4.

15 Technically, the fact that only some critical types' incentive constraints matter is due to Lemma 1, in particular, the monotonicity property of equilibria in Lemma 1.

Hence, Ĥ (θ) has three connected linear segments, with a reverse Z shape.
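As a rough illustration, one piecewise linear CDF that is consistent with the stated constraint 2y + x + z = 4 has slopes x, y, and z on [0, 1/4], [1/4, 3/4], and [3/4, 1]. The breakpoints 1/4 and 3/4 are our assumption, not taken from the text; they are chosen because they give H(1) = x/4 + y/2 + z/4 = 1 exactly when the constraint holds. The sketch below checks this for the parameter values reported for Figure 2.

```python
def piecewise_cdf(theta, x, y, z):
    """Piecewise linear CDF on [0, 1] with slopes x, y, z.

    The breakpoints 1/4 and 3/4 are an assumption made for illustration.
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    if theta <= 0.25:
        return x * theta
    if theta <= 0.75:
        return x * 0.25 + y * (theta - 0.25)
    return x * 0.25 + y * 0.5 + z * (theta - 0.75)


# Parameter values reported for Figure 2.
X, Y, Z = 0.1, 1.9, 0.1

# Total mass should equal one when 2y + x + z = 4.
total_mass = piecewise_cdf(1.0, X, Y, Z)
```

With y larger than x and z, most of the mass sits on intermediate types, which is what produces the flat-steep-flat shape.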
Figure 2 illustrates the equilibrium characterizations for the binary acquisition setting (left panel) and the partial acquisition setting (right panel), using equilibrium conditions (2) and (5). The phenomenon of multiple equilibria results from the various ways in which the minority groups can coordinate, which is common in games with strategic complementarities. Next, we specialize to the case where H (θ) is uniform, for which the equilibrium is unique, allowing for an explicit calculation of the four equilibrium formats.

Uniformly Distributed Language Aptitudes
We now assume throughout this section that the language aptitude distribution in each minority group is uniform, θ ∼ U [0, 1]. In this case, we are able to provide an explicit, if technical, description of the language equilibrium. In particular, we can "trace out" the regions of parameters for all four equilibrium formats, which form a partition of the entire parameter space, implying that there is a unique equilibrium for each parameter constellation. 17 The explicit equilibrium characterization also enables us to provide definitive answers to issues such as measures of partial learners and language policies.
We only present the analysis for equilibrium FPN (σ FPN ), leaving the rest to the Online Appendix. Given two cutoffs θ f and θ p , the payoffs for a type-θ minority agent are The indifferent types θ f and θ p can be explicitly calculated to be The monotonicity property of Lemma 1 then implies that as long as θ f and θ p in (15) and (16) satisfy 0 < θ f < θ p < 1, the incentives for all types to choose their respective equilibrium strategies are satisfied. The condition 0 < θ f < θ p < 1 hence completely characterizes the set of parameter constellations for equilibrium FPN. Similar equilibrium conditions can be explicitly derived for equilibria F, FP, and FN. These explicit equilibrium conditions enable us to identify the set of parameter constellations (ℓ f , ℓ p , λ, α, n) for each equilibrium format, which is summarized in Proposition 3. Given the large set of parameters involved in the characterization, we introduce two variables to help delineate the equilibrium characterization: .

16 In each panel, the horizontal axis denotes θ, while the vertical axis denotes function where we have implicitly used the relationship in (8). The parameters used in Figure 2 are x = 0.1, y = 1.9, z = 0.1, ℓ f = 9, ℓ p = 3, α = 0.5, λ = 1.2 and n = 7.

17 As is standard, equilibrium uniqueness here is obtained by ignoring the equilibrium behavior of (indifferent) types with measure zero.
For interpretation, L f and L p are, respectively, the ratios of cost to (maximum) benefit of full learning and partial learning for the extreme type θ = 1. Alternatively, we can regard L f and L p as the relative costs of full and partial learning, respectively. As the incentives of type θ = 1 are crucial for several (extreme) equilibrium formats to arise, the parameters L f and L p greatly simplify our equilibrium presentation.

Proposition 3 (Language Equilibrium under Uniform Distribution)
In the language economy with uniformly distributed linguistic aptitude, there is a unique language equilibrium for each parameter constellation (ℓ f , ℓ p , λ, α, n). Specifically,
[I] L p ≥ 1: [a] equilibrium F arises for L f ≤ 1; [b] equilibrium FN arises for 1 < L f ≤ L p ; [c] equilibrium FPN arises for L p < L f ; [d] equilibrium FP does not exist.
[II] L p < 1: There exist parameter thresholds L̲ p , ᾱ, and G such that 18 [a] equilibrium F arises for L f ≤ 1 − α (1 − L p ); [b] equilibrium FN does not exist; [c] equilibrium FPN arises for L p ∈ (L̲ p , 1) and L f > G; [d] equilibrium FP arises for the remaining combinations of L f and L p .
To see the intuition, first consider the case of L p ≥ 1, where partial learning is relatively costly. All minority agents fully learn b if F is relatively inexpensive (L f ≤ 1). If 1 < L f ≤ L p , then the extreme type θ = 1 prefers N to F and prefers F to P (even when all the other minority agents choose F ). Hence, minority agents choose either F or N , resulting in equilibrium FN. For a similar reason, equilibrium FP does not exist when L p ≥ 1. Finally, if L f > L p , only agents with small types choose F , while intermediate types choose P , which leads to equilibrium FPN.
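The case L p ≥ 1 described above can be written as a small lookup, sketched below. The function name is ours, and the case L p < 1 is deliberately left out because it depends on thresholds that are not given in closed form here.

```python
def equilibrium_format_costly_partial(L_f, L_p):
    """Equilibrium format for the case L_p >= 1 of Proposition 3.

    L_f and L_p are the relative costs of full and partial learning.
    The case L_p < 1 is not handled: it needs additional thresholds.
    """
    if L_p < 1:
        raise ValueError("only the case L_p >= 1 is covered by this sketch")
    if L_f <= 1:
        return "F"      # full learning is cheap: everyone fully learns
    if L_f <= L_p:
        return "FN"     # type 1 prefers N to F and F to P
    return "FPN"        # L_f > L_p: intermediate types choose P


format_cheap_full = equilibrium_format_costly_partial(0.8, 1.5)
format_binary = equilibrium_format_costly_partial(1.2, 1.5)
format_three_way = equilibrium_format_costly_partial(2.0, 1.5)
```

Because the three conditions on L f are exhaustive and mutually exclusive, exactly one format is returned for every pair with L p ≥ 1, which mirrors the uniqueness claim for this case.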
The case of L p < 1, where partial learning is relatively inexpensive, is similar, though the equilibrium analysis is now more cumbersome since one has to explicitly account for the (more nuanced) trade-off between F and P . Given a small L p , there will always be some types choosing P whenever F is not chosen by every type. Hence, equilibrium FN does not exist when L p < 1. 19 Next, when L f is sufficiently small (L f ≤ 1 − α (1 − L p )), all types again fully learn b. For a larger full learning cost, i.e., L f > 1 − α (1 − L p ), not all types choose F , and we then have either equilibrium FPN when both L f and L p are large, or equilibrium FP when either L p or L f is small.

Importantly, Proposition 3 shows the existence, as well as uniqueness, of a symmetric equilibrium for each parameter constellation (ℓ f , ℓ p , λ, α, n) in the uniform setting. This is a direct implication of the fact that the characterization in Proposition 3 spans the entire space of (L p , L f ) and the four equilibrium regions of (L p , L f ) are mutually exclusive.
Proposition 3 enables us to graphically delineate the parameter constellations for all four equilibrium formats. Figure 3(a) shows a map of equilibria in the (α, ℓ f )-space with parameters λ = 2, n = 2, and ℓ p = 1, while Figure 3(b) shows a map of equilibria in the (ℓ p , ℓ f )-space with λ = 2, n = 2, α = 0.6. 20 In both Figure 3(a) and Figure 3(b), the entire space is completely divided into four disjoint regions. The dotted vertical lines in Figure 3 correspond to the threshold L p = 1 in Proposition 3.

First, all minority agents fully acquire the majority language if ℓ f is sufficiently small, regardless of α and ℓ p . In Figure 3(a), when α = 0, the equilibrium characterization coincides with that in (3) and Proposition 1. 21 When ℓ f is large (so that not all minority agents choose F ), equilibrium FPN arises if α is intermediate, i.e., the benefit from partial learning is only sufficient to induce minority agents with intermediate types to partially learn. As α increases further, even the most inept minority agents find it optimal to partially learn, resulting in equilibrium FP. Figure 3(b), which provides another perspective on the equilibrium characterization, can be interpreted similarly, except that a lower ℓ p corresponds to a larger α, so that Figure 3(b) is roughly a "flipped" version of Figure 3(a).
Proposition 3 allows us to explicitly compare the number of partial learners with that of full learners when partial learning arises in equilibrium. In particular, the number of partial learners can exceed that of full learners when the cost of full learning is sufficiently high: 22

Proposition 4 (Number of Partial Learners) In the language equilibrium, the number of partial learners in each minority group is strictly larger than that of full learners in each minority group if L f is sufficiently large.

20 Explicit algebraic calculations for Figure 3(a) can be found in the Online Appendix.

21 Indeed, if ℓ f > λ + n − 1 = 3, the equilibrium cutoff θ f in (3) is interior, i.e., we have equilibrium FN, consistent with Figure 3(a).

22 Indeed, one can explicitly show that this happens in equilibrium FP if L f > αL p +
Hence, partial learning will be more prevalent than full learning among minority agents whenever full learning is sufficiently costly. In more practical terms, partial learning (or no learning) will be more likely to arise among minority agents if they have limited access to fully learning the majority language, or alternatively when partially learning the majority language is sufficient for minority agents, perhaps due to the limited set of professions they can take up.
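Next, we employ Proposition 3 to conduct a comparative statics analysis for equilibrium measures of full and partial learners, focusing on the natural ("interior") equilibrium FPN: 23

Proposition 5 In equilibrium FPN, the measure of full learners θ f and the measure of partial learners θ p − θ f satisfy:
1. ∂θ f /∂λ > 0 and ∂(θ p − θ f )/∂λ > 0 for the size of the majority group (λ);
2. ∂θ f /∂n > 0 and ∂(θ p − θ f )/∂n > 0 for the number of minority groups (n);
3. ∂θ f /∂α < 0 and ∂(θ p − θ f )/∂α > 0 for the benefit of partial learning (α);
4. ∂θ f /∂ℓ f < 0, ∂θ p /∂ℓ f < 0, and ∂(θ p − θ f )/∂ℓ f > 0 for the cost of full learning (ℓ f ).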
Hence, a larger majority group (λ) and more minority groups (n), both of which strictly increase the communicative benefits from full and partial learning, give rise to more full learners and more partial learners. Somewhat similarly, a larger communicative benefit from partial learning (α) induces strictly fewer full learners and strictly more partial learners (drawn from previous full learners and previous non-learners), given that ceteris paribus partial learning is more attractive relative to full learning. Interestingly, as full learning becomes more costly (larger ℓ f ), both equilibrium cutoffs strictly decrease (hence fewer full learners), yet there are strictly more partial learners, because previous full learners switch to partial learning in response to the increase in ℓ f .

Finally, we discuss some policy implications of our analysis. Proposition 2 shows that decentralized language decisions lead to insufficient learning, which justifies policy interventions to facilitate minority agents' language learning. Partial learning, which arises whenever full learning is costly (large ℓ f ) or partial learning is accepted and beneficial enough (α large) by Proposition 3, induces policy implications that differ from those in the traditional binary acquisition settings. For example, a policy maker can implement a pull language policy that subsidizes language learning. By Proposition 2, the policy maker can subsidize either full learning or partial learning, the latter particularly if it is more cost effective. Such subsidies improve welfare for both majority and minority agents. On the other hand, another intuitive policy tool in the presence of partial learning is a push language policy, where a policy maker suppresses partial learning in the hope of pushing more minority agents towards full learning. Such a push policy, not surprisingly, imposes binding constraints (in an FPN/FP equilibrium) and results in lower welfare for minority agents. Somewhat surprisingly, however, such a push policy hurts the majority agents as well:

Proposition 6 (Partial Learning to Majority Welfare) Consider a language equilibrium where partial learning is present (i.e., equilibria FP and FPN). The majority agents are strictly worse off if partial learning is banned from the language economy.

23 Proposition 5 is straightforward and is based on the equilibrium cutoffs θ f and θ p characterized in (15) and (16) and the corresponding equilibrium conditions for σ FPN . We hence omit the proof.
Hence, while the push policy forces some additional minority agents to fully acquire the majority language, more previous minority partial learners switch to not learning the majority language, which results in lower communicative benefits and lower welfare for both majority agents and minority agents.
To summarize, this section consists of three parts. The first deals with the dichotomous setting where all individuals face two options: to acquire a full command of the majority language at a specific cost or to refrain from studying it at all. We provide a complete characterization of linguistic equilibria and demonstrate insufficient learning in every interior equilibrium, as compared to the welfare-optimizing level of language acquisition. The second part of the section extends the characterization of equilibria to the tripartite setting where each individual faces three choices: full learning, partial learning and no learning. While less expensive than full learning, a partial command of the majority language limits the communication reach of those who choose this option. We again characterize linguistic equilibria and point out the suboptimality of the equilibrium levels of language acquisition. The final part of this section deals with the special case of the uniform distribution of linguistic aptitudes across all linguistic groups. The uniformity assumption allows us to derive a unique linguistic equilibrium and exhibit plausible conditions under which the number of partial learners exceeds that of full learners. Interestingly, we also show that banning the partial learning option would have an adverse effect on both majority and minority population groups.

The theoretical results above can be linked to the linguistic landscape in various countries. For example, the revival of Hebrew as a language of common discourse is a remarkable story of a coordinated and focused effort to ensure easy access to Hebrew instruction. It highlights the link between the explicit or implicit cost of acquiring the majority language and the number of its speakers, which we investigate theoretically (lowering ℓ f and increasing α in Proposition 5), with the share of speakers rising from about 84% a decade ago to 94% in 2021 (Central Bureau of Statistics of Israel, Social Survey on Languages 2021). 24

Dynamics of Language Learning
Until now, we have analyzed a static language setting where various key parameters, especially partial and full learning costs, are fixed exogenously. However, important questions remain on how language acquisition behavior evolves over time. For example, what patterns of language acquisition behavior will prevail in the long run? Is there a tendency for all minority agents to at least partially acquire the majority language? If not, what are the factors that prevent language acquisition in the limit?

We turn now to modeling dynamics of language learning. Our main objective is to propose a dynamic framework and investigate language acquisition patterns in the long run. To that end, we consider a deterministic language learning dynamic process where the cost of language learning decreases over time as more minority agents choose to fully/partially learn the majority language. 25 To describe the dynamic process, first observe that Lemma 1 and the characterizations in Propositions 1-3 allow us to restrict analysis to the dynamics of the cutoff points. The dynamics of language learning is initialized at a point in one of the four equilibrium zones (e.g., an equilibrium zone in the box diagram in Figure 3(a), (b)). Specifically, at the initial period, call it t = 0, the minority agents make learning decisions at the baseline learning costs ℓ f and ℓ p , resulting in a static language equilibrium as in Propositions 2 and 3. In period t ≥ 1, each equilibrium cutoff point (θ f,t , θ p,t ) is then determined again as in Propositions 2-3 by the following "updated" cost parameters where ϕ > 0, and q f,t−1 and q p,t−1 are the (equilibrium) fractions of agents in [0, 1] choosing F and P in period t − 1, i.e., The cost parameters ℓ f,t and ℓ p,t are functions of the baseline costs, ℓ f , ℓ p , and the fractions of minority agents in [0, 1] that chose F and P in the previous period. In addition, we assume that ℓ f,t = l f (ℓ f , q f,t−1 , ϕ) , ℓ p,t = l p (ℓ p , q p,t−1 , ϕ) decrease as q f,t−1 , q p,t−1 increase and ℓ f,t → ℓ f and ℓ p,t → ℓ p as ϕ → 0. The parameter ϕ captures how fast the language learning costs are updated based on q f,t−1 and q p,t−1 in each period.

24 A similar effort with a somewhat more mitigated success can be observed in other countries, e.g., in Australia, with the development of its AMEP (Adult Migrant English Program).

25 Grin (1992) [36] is an early analysis of minority language dynamics using a first-order linear difference equation to explore whether minority languages survive and identifies stability of a minority language related to the sensitivity of individual choices to changes in the fraction of people speaking the minority language. Our analysis considers multiple levels of acquisition from explicit microfoundations. Also see Grin (1996) [37] for a literature survey.
Hence, we analyze the dynamics of a sequence of language economies where minority agents make myopic language acquisition decisions in each period, based on updated learning cost parameters ℓ f,t , ℓ p,t and rational expectations that all minority agents have full structural understanding of the economy and make decisions according to the static equilibrium in period t. The interpretation of the cost functions in (17) is that the acquisition outcomes in period t − 1 (i.e., q f,t−1 and q p,t−1 ) affect the language learning costs in period t (i.e., ℓ f,t and ℓ p,t ) in that the more minority agents fully or partially acquire language b in period t − 1, the more experienced these agents are so that they can fully or partially acquire language b at lower costs. 26 For technical and expositional convenience, we restrict our analysis to the uniform distribution setting and consider the following explicit cost parameter functions: 27

ℓ f,t = ℓ f exp(−ϕ q f,t−1 ), q f,t−1 ∈ [0, 1]; and ℓ p,t = ℓ p exp(−ϕ q p,t−1 ), q p,t−1 ∈ [0, 1].
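The exponential cost update is simple enough to state as code. The following is a minimal sketch; the function name and the sample values (taken from the parameters used for the figures, ℓ f = 8 and ϕ = 0.3) are chosen only for illustration.

```python
import math


def updated_cost(base_cost, q_prev, phi):
    """Period-t learning cost: base_cost * exp(-phi * q_prev).

    base_cost : baseline cost (l_f or l_p)
    q_prev    : fraction of minority agents choosing that option in t - 1
    phi       : learning speed; phi = 0 reproduces the baseline cost
    """
    if not 0.0 <= q_prev <= 1.0:
        raise ValueError("q_prev must lie in [0, 1]")
    if phi < 0:
        raise ValueError("phi must be nonnegative")
    return base_cost * math.exp(-phi * q_prev)


cost_no_learners = updated_cost(8.0, 0.0, 0.3)    # nobody learned last period
cost_half_learners = updated_cost(8.0, 0.5, 0.3)  # half of the group learned
```

The cost falls as the previous-period fraction of learners rises, and it stays at the baseline when that fraction is zero or when ϕ = 0.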
The dynamic model outlined above is specific, and the associated learning mechanism certainly does not exhaust all possibilities that can be considered here. However, such a dynamic analysis provides a useful angle from which to view how dynamic analogues of the static model evolve. More importantly, the dynamic model enables us to investigate (local) stability properties of the static language equilibrium in Section 3. In the remainder of this section, we study the trajectory and limiting behavior of the above dynamic process initialized at a point in one of the four equilibrium zones identified in Proposition 3. Since, by construction, the dynamic process starting in the interior of the equilibrium-F zone stops and remains at the initial point forever (in terms of language acquisition behavior), hereafter we consider cases where the initial point of the dynamics is either an FN equilibrium, an FP equilibrium, or an FPN equilibrium.

Language Learning Dynamics: F vs N
We start with the case where the initial point is in the interior of the FN-equilibrium zone. We proceed in the sequel as if we were in the baseline setting of binary language acquisition, where minority agents are restricted to choose either F or N , and hence our analysis here directly provides dynamic stability results for the equilibrium in the traditional binary acquisition literature. We will discuss, toward the end of this section, that such stability results also establish dynamic stability of a learning dynamics initiated at an equilibrium FN for our model with partial learning.
In a binary language learning dynamics, given the equilibrium cutoff θ f,t−1 in period t − 1 and the rational expectation that all minority agents with types less than θ f,t choose to fully acquire language b in period t, the payoff from F for a type-θ minority agent is where the learning cost of F is due to (18) and q f,t−1 = θ f,t−1 .
We impose the following assumption: It is immediate to verify that under Assumption 1, the function r (·) is positive, strictly increasing, and strictly convex on [0, 1].
We now analyze the steady states of the binary language learning dynamics. Here a steady state is defined as a language acquisition outcome θ * ∈ [0, 1] such that θ * = r (θ * ). (20) Graphically, θ * occurs where r (θ) intersects the 45-degree line in the (θ, r (θ)) space. Given Assumption 1 and that the dynamics driver function r (·) is strictly increasing and strictly convex, there is a unique (interior) steady state θ * with θ * = r (θ * ) and dr(θ)/dθ < 1 at θ = θ * . As a result, the unique steady state is globally stable from any initial condition in [0, 1].
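To make the steady-state logic concrete, the sketch below iterates θ t = r (θ t−1 ) until it settles. The driver used here is a hypothetical stand-in, not the paper's r (·): it is chosen only because it is positive, strictly increasing, and strictly convex on [0, 1] and maps [0, 1] into its interior.

```python
import math


def iterate_to_steady_state(driver, theta0, tol=1e-12, max_iter=10_000):
    """Iterate theta_t = driver(theta_{t-1}) until successive values agree."""
    theta = theta0
    for _ in range(max_iter):
        nxt = driver(theta)
        if abs(nxt - theta) < tol:
            return nxt
        theta = nxt
    raise RuntimeError("iteration did not converge")


def toy_driver(theta):
    """Hypothetical driver: positive, increasing, convex, with r(1) < 1."""
    return 0.2 * math.exp(theta)


# Start from both ends of [0, 1]; a globally stable steady state
# should be reached from either initial condition.
steady_from_low = iterate_to_steady_state(toy_driver, 0.0)
steady_from_high = iterate_to_steady_state(toy_driver, 1.0)
```

For this toy driver the slope at the fixed point equals the fixed point itself, which lies below one, so the crossing of the 45-degree line is from above and the iteration contracts toward it.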
Proposition 7 summarizes the above discussion on the binary learning dynamics:

Proposition 7 Suppose Assumption 1 holds. In the binary language learning dynamics between F and N , there is a unique steady state θ * = r (θ * ) with θ * ∈ (0, 1). In addition, the unique steady state is stable.
We use Figure 4 to illustrate Proposition 7, where each dotted line is the 45-degree line and each solid curve is r (θ). Figure 4 shows that if the learning speed ϕ is sufficiently large, Assumption 1 is violated and there can be two steady states, one stable (steady state 1) and the other unstable (steady state 2). Proposition 8 below presents some well-expected comparative statics for the stable steady state identified in Proposition 7. Specifically, the steady state θ * strictly increases if the size of the majority group increases, if the number of minority groups increases, if dynamic learning is faster, or if the cost of full learning is smaller.
Proposition 8 For the unique and stable steady state θ * in Proposition 7, we have ∂θ * /∂λ > 0, ∂θ * /∂n > 0, ∂θ * /∂ϕ > 0, and ∂θ * /∂ℓ f < 0.

Observe that Assumption 1 is closely related to the condition for an interior equilibrium in the static setting. Indeed, when ϕ = 0, Assumption 1 reduces to the interior equilibrium condition for the binary acquisition setting (see (3)). In particular, the globally stable steady state θ * is "close" to the interior equilibrium cutoff in the static binary acquisition setting whenever the learning speed ϕ is sufficiently close to 0. As such, our dynamic analysis here provides some dynamic justification for the interior equilibrium commonly studied in the traditional binary language acquisition literature.
Finally, since the traditional binary language acquisition setting can be regarded as a special case in our general language acquisition setting where either α is small or ℓ p is sufficiently large, our stability results in this section also imply that if the dynamic process starts at an (interior) FN equilibrium, the dynamics will remain in the FN-equilibrium zone as long as ϕ is sufficiently small, i.e., a local stability result for an interior equilibrium in the FN-equilibrium zone. 28

Language Learning Dynamics: F vs P
Now consider the case where the initial point is in the interior of the FP-equilibrium zone. As in Section 4.1, we can alternatively think of the dynamic setting here as one where minority agents can only choose from {F, P }, perhaps because a government imposes a penalty for not at least partially learning the majority language so that no minority agent chooses N . We will demonstrate that the dynamics with such an initial point remain in the (interior) FP-equilibrium zone and an FP equilibrium is locally stable, as long as ϕ is sufficiently small.
Given the period-(t − 1) cutoffs θ f,t−1 , θ p,t−1 (θ f,t−1 < 1 and θ p,t−1 = 1) and the rational expectation that all minority agents adopt the equilibrium cutoffs θ f,t and θ p,t in period t, the period-t payoffs from F , and P for a type-θ minority agent are respectively: Since θ p,t−1 = 1 and the binary choices {F, P }, the behavior of minority agents in period t is then captured by the cutoff type θ f,t who is indifferent between F and P : is the corresponding dynamic driver function and .
We next assume One can again verify that under Assumption 2, the dynamic driver function g (·) is positive, strictly increasing and strictly convex on [0, 1].
We similarly use Figure 5 to illustrate Proposition 9, where each dotted line is the 45-degree line and each solid curve is g (θ). The left panel of Figure 5 presents a scenario with a unique steady state as in Proposition 9 (λ = 2, n = 2, α = 0.6, ℓ f = 8, ℓ p = 4, ϕ = 0.3), while the right panel of Figure 5 presents a scenario with a stable steady state 1 and an unstable steady state 2 (λ = 2, n = 2, α = 0.6, ℓ f = 8, ℓ p = 4, ϕ = 0.7). It can be verified that Assumption 2 is violated for the right panel of Figure 5. 29

Finally, observe that if ϕ = 0, Assumption 2 coincides with the interior FP equilibrium condition (see (10)), which implies that minority agents play the (static) equilibrium FP in our setting with partial acquisition. Hence, if the dynamics starts inside the FP-equilibrium zone, then Assumption 2 holds, and Proposition 9 implies that the dynamics will remain in the FP-equilibrium zone and converge to a steady state that is close to the point where the dynamics is initiated, as long as ϕ is sufficiently small. In other words, an FP equilibrium in the interior of the FP-equilibrium zone is locally stable under our dynamics for sufficiently small ϕ.

Language Learning Dynamics: F vs P vs N
We now move to a dynamic analysis for the FPN-equilibrium zone, i.e., we start from an initial FPN equilibrium (θ f,0 , θ p,0 ) with 0 < θ f,0 < θ p,0 < 1 at t = 0 and we analyze (local) stability of the initial equilibrium point (θ f,0 , θ p,0 ), i.e., whether the stable steady state of our dynamics comes close to the initial equilibrium point when ϕ is sufficiently small. In the sequel, we only present key steps in our analysis, given our somewhat modest objective (local stability) and the smoothness of the dynamic system. A more detailed and precise analysis of the dynamics in the FPN-equilibrium zone can be found in the Appendix (Section 6.2) where we specialize to a specific setting of the dynamics in order to obtain a more precise understanding of the forces for and against local stability.
Given the equilibrium cutoffs (θ f,t−1 , θ p,t−1 ) with 0 < θ f,t−1 < θ p,t−1 < 1 in period t − 1 and the expectation of the equilibrium cutoffs θ f,t and θ p,t in period t, the payoffs from F and P in period t for a type-θ minority agent are respectively: The expressions (23) and (24) for a steady state can then be written as (θ * = (θ * f , θ * p )): To investigate local stability of the steady state defined in (25), let (θ ′ f,t , θ ′ p,t ) be small departures from the steady state (θ * f , θ * p ). We then have with initial point (θ ′ f,0 , θ ′ p,0 ) given and t ∈ N. While the partial derivatives in (26) are cumbersome to calculate, one can verify that where we emphasize the dependence of θ * on ϕ and write θ * = θ * (ϕ). With this notation, we have θ * (ϕ) → θ * (0) as ϕ → 0 and θ * (0) is the solution of (22) when ϕ = 0, which also coincides with (θ f , θ p ) calculated from (5) and (6). Assuming the 2 × 2 matrix A (θ * (ϕ)) to be diagonalizable, there then exists a nonsingular 2 × 2 matrix P (ϕ) such that where matrix Λ (ϕ) is a diagonal matrix that displays the eigenvalues of A (θ * (ϕ)) on its diagonal. By multiplying both sides of (28) by the scalar ϕ, we see that up to o (ϕ) the eigenvalues of the matrix in the linear dynamics in (27) are ϕ times the eigenvalues of A (θ * (ϕ)). We summarize the above analysis in the following proposition:

Proposition 10 (Local Stability of FPN Equilibrium) Up to the first order in ϕ, the eigenvalues of the matrix in (27) are ϕ times the eigenvalues of the matrix A (θ * (ϕ)). In other words, the linear system (22) is stable to the first order if ϕ is small enough.
Importantly, Proposition 10 implies that once we find a steady state solution of (27) when ϕ = 0, (θ f , θ p ) ∈ (0, 1) 2 with 0 < θ f < θ p < 1, i.e., if we start from an FPN equilibrium, then the linear dynamic system (27) will be stable as long as ϕ is sufficiently small. Therefore, if the language learning dynamics is initiated in the FPN-equilibrium zone, the steady state of the dynamics will stay in the FPN-equilibrium zone as long as ϕ is small enough. At first sight, it appears that the dependence on ϕ of the matrix A (θ * (ϕ)) might falsify Proposition 10. Notice, however, that under modest regularity conditions, we also have A (θ * (ϕ)) → A (θ * (0)) as ϕ → 0, where A (θ * (0)) solves (27) with ℓ f,t and ℓ p,t being replaced by constants ℓ f and ℓ p respectively.

We use a numerical example in Figure 6 to illustrate Proposition 10. Figure 6, which is based on the equilibrium map in the (ℓ p , ℓ f )-space in Figure 3(b), shows the trajectories of the learning dynamics of (22) from three initial points in the FPN-equilibrium zone. For the case of ϕ = 0.5 (Figure 6(a)), if the learning dynamics starts in the "deep" interior of the FPN-equilibrium zone (i.e., the point (4.2, 12)), the steady state and the entire trajectory of the dynamics remain in the FPN-equilibrium zone, while if the learning dynamics starts near the boundary of the FPN-equilibrium zone (i.e., the points (2.2, 12) and (4.2, 8)), the steady state wanders out of the FPN-equilibrium zone. However, all three trajectories stay entirely inside the FPN-equilibrium zone when ϕ = 0.1 (Figure 6(b)), which is consistent with Proposition 10.
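The first-order stability check behind Proposition 10 can be sketched numerically: scale a 2 × 2 matrix by ϕ and test whether its spectral radius is below one. The matrix below is purely hypothetical (the entries of A (θ * (ϕ)) are not reproduced here), and the o (ϕ) remainder in the linear dynamics is ignored.

```python
import cmath


def eigenvalues_2x2(m):
    """Eigenvalues of a 2 x 2 matrix via the characteristic polynomial."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    # cmath.sqrt handles a negative discriminant (complex eigenvalues).
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0


def is_locally_stable(A, phi):
    """True if the spectral radius of phi * A is strictly below one."""
    scaled = [[phi * entry for entry in row] for row in A]
    return max(abs(ev) for ev in eigenvalues_2x2(scaled)) < 1.0


# Hypothetical matrix used only to illustrate the scaling argument.
A_EXAMPLE = [[3.0, 1.0], [0.5, 2.0]]

stable_slow = is_locally_stable(A_EXAMPLE, 0.1)  # small learning speed
stable_fast = is_locally_stable(A_EXAMPLE, 0.5)  # larger learning speed
```

Since the eigenvalues of ϕA are ϕ times those of A, any fixed matrix passes the check once ϕ is below the reciprocal of its spectral radius, which is the sense in which a small enough learning speed delivers local stability.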
In short, this section examines the dynamics of language learning and addresses the dynamic stability of the language equilibrium characterized in the previous section. The deterministic language learning dynamic process relies on the assumption that the spread of the majority language across minority groups leads to adjusted (full and partial) language acquisition costs over time. We again begin with the binary setting of full and no learning of the majority language and show that with a small learning speed, there is a unique steady state, which is globally stable. Given the stability result, we also present some well-expected comparative statics associated with the stable state. We then turn to the tripartite setting with full, partial and no learning. In the case where the minority groups are partitioned into full and partial learners (i.e., the dynamics starts with an FP-equilibrium), we again demonstrate uniqueness and stability of the FP-equilibrium under a small learning speed. The most challenging and realistic case addresses the situation where all minority groups contain three levels of language acquisition: full, partial and no learners. Our main result here is that the FPN-equilibrium is again locally stable as long as the learning speed is small enough.
An important implication of our stability analysis is that while a sufficiently low learning speed ϕ guarantees (local) stability, a higher learning speed can shatter an undesirable steady state and lead the learning dynamics to an otherwise desirable one (see Figure 6(a)). Such an insight can point out useful directions for economies that are stuck in suboptimal language acquisition situations (such as Belgium) to break out of the current state and move to a qualitatively different one.

Econometrics
In this section, we consider the identification of the social effects that determine equilibrium language acquisition. Our model does not have a direct statistical generalization, so our objective here is to characterize how one can obtain evidence for the mechanisms that underlie our model.
Assume that agents are randomly drawn from a set of neighborhoods. We denote an agent as k and her neighborhood as n (k). Here, "neighborhoods" could be census blocks, census tracts, or even larger population units. Suppose that the data include not only individual agents' neighborhood locations and associated choices (F, P, or N) but also observable covariates that describe agent k, as well as observable covariates describing various aspects of n (k) and measures of language choices within the neighborhoods.
Our econometric model treats ability to learn the majority language and levels of language fluency in the majority language as functions of observable covariates. It is natural to work with a measure of skill in our econometric model, so we replace the (discretized) cutoffs in θ space, 0 < θ min < θ 2 < · · · < θ I < 1, with cutoffs in a language learning skill measure S = 1/θ − 1, with ∞ > S max > · · · > S 1 > 0. We maintain the following Assumption 3 for the equilibrium FPN, with analogous assumptions for the equilibria F, FN and FP:

Assumption 3 (Fixing Cutoffs) Set S f,n(i) = 1/θ f,n(i) − 1 and S p,n(i) = 1/θ p,n(i) − 1 where θ f,n(i) , θ p,n(i) solve (5) and (6) in Section 3.2 as the equilibrium cutoffs for neighborhood n (i) with λ = λ n(i) .

S f,n(i) and S p,n(i) are the respective learning-skill cutoffs for full learning and partial learning in neighborhood n (i). We assume that these cutoffs are observable to the econometrician and our data set is rich enough to include as many neighborhoods as needed to get enough variation for our identification analysis below. Indeed, if the data set at the census tract or census block level is rich enough to have measures of the fractions of non-learners, partial learners, and full learners, it is then possible to approximate the cutoffs from the data. To be explicit, one can construct the cutoffs using the observable fractions of full learners, partial learners, and non-learners in neighborhood n (i), denoted respectively as Z F,n(i) , Z P,n(i) , and Z N,n(i) , i.e., where µ {A} denotes the measure of the set A and F S,n(i) is the corresponding cumulative empirical distribution of learning skills from measure µ {·}.
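As a minimal sketch of how the cutoffs might be backed out from neighborhood fractions, suppose (as an assumption made here for illustration only) that aptitude θ is uniform on [0, 1] within the neighborhood, that types below θ f fully learn, and that types between θ f and θ p partially learn. Then θ f equals the fraction of full learners, θ p equals the combined fraction of full and partial learners, and the skill cutoffs follow from S = 1/θ − 1. The function name is hypothetical.

```python
def skill_cutoffs_from_fractions(z_full, z_partial):
    """Skill cutoffs (S_f, S_p) implied by observed learner fractions.

    Assumes theta ~ U[0, 1], so theta_f = z_full and
    theta_p = z_full + z_partial; this is an illustrative assumption,
    not the general empirical-distribution construction.
    """
    if z_full <= 0 or z_partial < 0 or z_full + z_partial > 1:
        raise ValueError("fractions must satisfy 0 < z_full, z_full + z_partial <= 1")
    theta_f = z_full
    theta_p = z_full + z_partial
    s_f = 1.0 / theta_f - 1.0   # skill cutoff for full learning
    s_p = 1.0 / theta_p - 1.0   # skill cutoff for partial learning
    return s_f, s_p


# Example neighborhood: 25% full learners, 25% partial learners.
s_full, s_partial = skill_cutoffs_from_fractions(0.25, 0.25)
```

Because S is decreasing in θ, the full-learning cutoff always lies above the partial-learning cutoff. With a non-uniform skill distribution one would instead invert the empirical distribution of skills at the same fractions.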
We consider the econometric model:

$$S_i = k + c'X_i + d'Y_{n(i)} + J_F Z_{F,n(i)} + J_P Z_{P,n(i)} + J_N Z_{N,n(i)} + \eta_i. \tag{29}$$

The terms $X_i$, $Y_{n(i)}$, and $\eta_i$ are, respectively, an $r$-dimensional vector of observed individual covariates, an $s$-dimensional vector of observed "contextual" covariates for neighborhood $n(i)$, and regression errors, while $Z_{F,n(i)}$, $Z_{P,n(i)}$, and $Z_{N,n(i)}$ are the observed fractions of F-, P-, and N-learners defined above.
Throughout, we maintain Assumption 4: the unobserved heterogeneity $\eta_i$ in the system is orthogonal to the observable determinants of skill. This assumption allows us to focus on the identification of social interaction models such as (29); we discuss relaxation of this assumption below.
Equation (29) is a variation of the standard model of social interactions (see Manski (1993) [48] for the original formulation and Section 3.2 of Brock and Durlauf (2001b) [10] for the general version). Relative to the original Manski model, this formulation allows neighborhood variables to differ from averages of the individual-level variables and allows for nonlinearities in feedback, as in equation (29), by incorporating the fractions $Z_{F,n(i)}$, $Z_{P,n(i)}$, and $Z_{N,n(i)}$ as additional regressors, with the restriction that the three fractions sum to one in each neighborhood $n(i)$.
The objective of this section is to ask whether the parameters mapping $Z_{F,n(i)}$, $Z_{P,n(i)}$, and $Z_{N,n(i)}$ to language proficiency are identified. Identification issues are raised by the reflection problem (Manski 1993 [48]), which is a variant of the identification problem in rational expectations econometrics, e.g., Wallis (1980) [56], in that it involves potential collinearity between the expected values that drive behavior and other variables present in the equation. To understand when identification holds or fails, we follow the same procedure as in Brock and Durlauf (2001a,b) [9] [10].
We start our identification analysis with the simplest binary language acquisition case: F vs N with P not possible. We assume that $S_i \le S_{\max} < \infty$, $\theta_i \ge \theta_{\min} > 0$. Following (29), for this case, we have:

$$S_i = k + c'X_i + d'Y_{n(i)} + J_F Z_{F,n(i)} + J_N Z_{N,n(i)} + \eta_i = (k + J_N) + c'X_i + d'Y_{n(i)} + (J_F - J_N) Z_{F,n(i)} + \eta_i, \tag{30}$$

where the second equality uses $Z_{N,n(i)} = 1 - Z_{F,n(i)}$. Following Brock and Durlauf (2001b) [10], suppose that the dimension of the linear space spanned by the elements of $(1, X_i, Y_{n(i)}, Z_{F,n(i)})$ is $r + s + 2$, where recall that $X_i$ has dimension $r$ and $Y_{n(i)}$ has dimension $s$. There are two composite constants (i.e., $k + J_N$ and $J_F - J_N$) and two vectors of dimensions $r$ and $s$ (i.e., $c'$ and $d'$), for a total of $r + s + 2$ objects to identify. Hence, we know that $(k + J_N)$, $c'$, $d'$, $(J_F - J_N)$ can be identified. A problem remains, however, in that we have three constants $k$, $J_F$, $J_N$ but only two equations (i.e., the identified "$k + J_N$" and "$J_F - J_N$") to solve for them. As a result, one of the constants in $(k, J_N, J_F)$ remains unidentified.

This limitation does not mean that the data are uninformative. First, the data remain informative as to whether $(J_N, J_F)$ are both zero, since a nonzero value of $J_F - J_N$ rules this out. Second, knowledge about the magnitude of language spillover effects has natural policy value because of the social multipliers they produce with respect to policy interventions to raise language skill levels.
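The partial identification in the binary case can be illustrated numerically. The sketch below takes two hypothetical parameter triples $(k, J_F, J_N)$ that share the same composites $k + J_N$ and $J_F - J_N$, and checks that they imply identical systematic parts of the skill equation once $Z_N = 1 - Z_F$ is imposed. All parameter values and data points are invented for illustration, and exact rational arithmetic is used so that equality can be tested directly.

```python
from fractions import Fraction as Fr

def predicted_skill(k, c, d, j_f, j_n, x, y, z_f):
    """Systematic part of the binary (F vs N) skill equation, with Z_N = 1 - Z_F."""
    z_n = 1 - z_f
    individual = sum(ci * xi for ci, xi in zip(c, x))   # c'X_i
    contextual = sum(di * yi for di, yi in zip(d, y))   # d'Y_n(i)
    return k + individual + contextual + j_f * z_f + j_n * z_n

# Hypothetical slope vectors with r = s = 1.
c, d = [Fr(1, 2)], [Fr(-1, 4)]

# Two hypothetical triples with the same composites: k + J_N = 2 and J_F - J_N = 2.
params_a = dict(k=Fr(1), j_f=Fr(3), j_n=Fr(1))
params_b = dict(k=Fr(2), j_f=Fr(2), j_n=Fr(0))

# Hypothetical observations (X_i, Y_n(i), Z_F,n(i)).
data = [([Fr(1)], [Fr(2)], Fr(1, 4)),
        ([Fr(3)], [Fr(0)], Fr(1, 2)),
        ([Fr(0)], [Fr(5)], Fr(9, 10))]

fits_a = [predicted_skill(c=c, d=d, x=x, y=y, z_f=z, **params_a) for x, y, z in data]
fits_b = [predicted_skill(c=c, d=d, x=x, y=y, z_f=z, **params_b) for x, y, z in data]

# The two parameterizations cannot be told apart from the regression function.
observationally_equivalent = fits_a == fits_b
```

Because the two triples generate the same fitted values at every data point, no amount of data on $(S_i, X_i, Y_{n(i)}, Z_{F,n(i)})$ can separate them; only the composites are recoverable.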
The non-identification of the constants $(k, J_F, J_P, J_N)$ in our setting is not, in our view, a serious drawback. After all, the composite parameters $(J_F - J_N)$ and $(J_P - J_N)$, i.e., the partial derivatives of skill $S_i$ with respect to $Z_{F,n(i)}$ and $Z_{P,n(i)}$, are identified. Intuitively, these composite parameters measure the externalities of the "aggregate" language acquisition behavior in neighborhood $n(i)$ on an individual's language skill and hence on her language acquisition behavior. Knowledge about the magnitude of such externalities offers useful information for policy makers, and hence is of primary policy interest.
What drives the identification result?
The key substantive requirement is that $Z_{F,n(i)}$ is linearly independent of $1$, $X_i$, $Y_{n(i)}$. There are many routes to such linear independence. For example, linear independence of $Z_{F,n(i)}$ from $1$, $X_i$, $Y_{n(i)}$ can be achieved if $\lambda_{n(i)}$, the relative population size of the majority in $n(i)$, varies independently of $1$, $X_i$, $Y_{n(i)}$, which is indeed plausible. More generally, $Z_{F,n(i)}$ is generically a nonlinear function of the joint density of $1$, $X_i$, $Y_{n(i)}$ in the sense that the set of densities $\eta_i$ that produce linear dependence is nongeneric in the space of densities that are absolutely continuous. See Brock and Durlauf (2007) [13] for discussion of this point.
For the general case of F vs P vs N, i.e., with partial language acquisition, an analogous argument holds. Recall the definitions of $Z_{F,n(i)}$ and $Z_{P,n(i)}$, equation (29), and equations (5) and (6) of Section 3.2 above for the formulas for the cutoffs. Using $Z_{N,n(i)} = 1 - Z_{F,n(i)} - Z_{P,n(i)}$, equation (29) becomes:

$$S_i = (k + J_N) + c'X_i + d'Y_{n(i)} + (J_F - J_N) Z_{F,n(i)} + (J_P - J_N) Z_{P,n(i)} + \eta_i. \tag{31}$$

The following Theorem 1 presents our identification results for equation (31):

Theorem 1 Assume that the dimension of the linear space spanned by the elements of $(1, X_i, Y_{n(i)}, Z_{F,n(i)}, Z_{P,n(i)})$ is $r + s + 3$. Then the coefficients of equation (31) are identified.

Theorem 1, a natural consequence of Theorem 6 of Brock and Durlauf (2001b) [10], shows that the previous positive identification results apply to our three-choice framework. Theorem 1 has an immediate corollary:

Corollary 1 The parameters $c'$ and $d'$ are identified. The composite parameters $k + J_N$, $J_F - J_N$, and $J_P - J_N$ are identified.

To understand Corollary 1, notice that $c'$ is identified by the dimension $r$ of the linear space spanned by $X_i$, while $d'$ is identified by the dimension $s$ of the linear space spanned by $Y_{n(i)}$. In addition, since the three composite parameters are used to pin down four constants $(k, J_F, J_P, J_N)$, one of them remains unknown. As before, the identification is partial.
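The dimension condition in the three-choice case can be checked mechanically on a given sample. The following sketch computes the rank of a small hypothetical design matrix with $r = s = 1$, so that the required dimension is $r + s + 3 = 5$. It also shows that adding $Z_N$ as a further regressor cannot raise the rank, because the three fractions sum to one. The data and the helper `matrix_rank` are illustrative assumptions, not part of the paper's procedure.

```python
from fractions import Fraction as Fr

def matrix_rank(rows):
    """Rank of a matrix via exact Gaussian elimination over the rationals."""
    m = [[Fr(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current pivot position with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the entries below the pivot in this column.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Hypothetical neighborhoods: (Y_n, Z_F, Z_P, individual covariates X_i of two agents).
neighborhoods = [
    (1, Fr(1, 5), Fr(1, 2), [0, 1]),
    (2, Fr(3, 10), Fr(3, 10), [1, 3]),
    (4, Fr(1, 2), Fr(1, 5), [2, 2]),
    (3, Fr(1, 10), Fr(3, 5), [0, 5]),
]

design, design_with_zn = [], []
for y, z_f, z_p, xs in neighborhoods:
    for x in xs:
        row = [1, x, y, z_f, z_p]                         # regressors of the reduced form
        design.append(row)
        design_with_zn.append(row + [1 - z_f - z_p])      # append Z_N as a sixth column

rank_reduced = matrix_rank(design)          # five columns
rank_full = matrix_rank(design_with_zn)     # six columns, one of them redundant
```

In this hypothetical sample the five-column design has full column rank, while the six-column design has the same rank, which is the collinearity that leaves one of $(k, J_F, J_P, J_N)$ unidentified.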
The major limitation of these findings is Assumption 4, for the obvious reason that it ignores endogeneity of neighborhood membership. However, there is a constructive route to identification: if one models self-selection via the construction of control function variables, cf. Heckman (1979) [39], identification is augmented. Note that semiparametric estimates will suffice for identification. To see this, suppose agent $i$ is observed in neighborhood $n(i)$ if and only if a latent variable $t_i$ satisfies $t_i > 0$, where $t_i$ measures agent $i$'s evaluation of $n(i)$ and can be written as a linear function of a vector of observables ($R_i$) and a normally distributed error $\tau_i$, i.e., $t_i = \gamma' R_i + \tau_i$. Assume that the error $\tau_i$ and the regression error $\eta_i$ in equation (30) and equation (31) are jointly normally distributed. Then, following Section 3.6 of Brock and Durlauf (2001b) [10], we obtain two new regressors at the price of one extra parameter. This approach can be useful if there is enough variation in the average over $n(i)$ of the control function variable across the set of neighborhoods in the data set. The analyst also needs to find a regressor to include in $R_i$ that is not already in the primary regression before correction for selection bias. 30 But this requirement is standard when addressing self-selection. The upshot of our discussion is that self-selection of neighborhoods does not raise any new issues in the context of our language model.
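To make the control function idea concrete, the sketch below computes the standard inverse Mills ratio term for the selection rule $t_i = \gamma' R_i + \tau_i > 0$, together with its neighborhood average. It assumes, beyond what is stated above, that $\tau_i$ is normalized to have unit variance, in which case $E[\tau_i \mid t_i > 0] = \phi(\gamma' R_i)/\Phi(\gamma' R_i)$. The coefficient vector `gamma` and the observables `r_obs` are hypothetical; in an application $\gamma$ would come from a first-stage selection model.

```python
import math

def std_normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills(index):
    """E[tau | tau > -index] for tau ~ N(0, 1), i.e. pdf(index) / cdf(index)."""
    return std_normal_pdf(index) / std_normal_cdf(index)

def control_terms(gamma, r_obs):
    """Individual control function values and their average within one neighborhood."""
    indices = [sum(g * r for g, r in zip(gamma, row)) for row in r_obs]
    individual = [inverse_mills(v) for v in indices]
    return individual, sum(individual) / len(individual)

# Hypothetical first-stage coefficients and observables for three agents in one neighborhood.
gamma = [0.5, -0.2]
r_obs = [[1.0, 0.0], [2.0, 1.0], [0.0, 0.0]]
individual_terms, neighborhood_mean = control_terms(gamma, r_obs)
```

The individual term and its neighborhood average correspond to the two additional regressors mentioned above; under the unit-variance assumption they share a single coefficient.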

Conclusions and Recommendations for Future Research
This paper presented a theoretical language acquisition framework where individuals from multiple minority groups can choose to learn the majority language at three different levels of fluency: fluent, partially fluent, and not fluent at all. An important feature of our framework is the existence of positive externalities for the whole economy in language learning. We showed that such externalities can generate multiple language equilibria in a general setting.
Our theoretical development of language acquisition was followed by a dynamic analysis, whose main purpose was to investigate the local stability of the equilibria found in the static language acquisition framework. In particular, we considered a deterministic learning dynamic process where the costs of language learning adjust over time in accordance with how many minority agents partially or fully learned the majority language in the previous period. We found that, depending on the adjustment rate of the learning costs, there could be locally stable or locally unstable equilibria. Our analysis here helps us understand which structural features are important for stability, as well as the limiting configurations of language acquisition behavior in our framework.
Finally, we showed how our model can be related to empirical work by exploring how language spillovers of the type we study may be uncovered empirically. Here we argue that our conceptual framework leads to positive identification results under empirically plausible conditions.
In terms of future research, we see value in integrating neighborhood choice and language choice into a common framework. One route to this would be via a sequential logit approach in which individuals first choose neighborhoods and then choose F vs P vs N following an empirical ordered logit framework. Recall that we have two thresholds $S_{p,n(i)} < S_{f,n(i)}$, where agents in $n(i)$ choose N for $S_i \le S_{p,n(i)}$, choose P for $S_{p,n(i)} < S_i \le S_{f,n(i)}$, and choose F for $S_{f,n(i)} < S_i$. This integration can lead to more complicated dynamics when one considers the coevolution of neighborhood memberships and language choice.
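The second stage of such an approach might look like the following sketch of ordered logit choice probabilities built on the two thresholds. It assumes, purely for illustration, that skill equals a deterministic index plus a standard logistic error, so that the probability of each choice is a difference of logistic distribution values at the cutoffs. The function names and numerical values are hypothetical.

```python
import math

def logistic_cdf(x):
    """Standard logistic distribution function (adequate for moderate |x|)."""
    return 1.0 / (1.0 + math.exp(-x))

def ordered_logit_probs(index, s_p, s_f):
    """Choice probabilities when S_i = index + logistic error and S_p < S_f.

    N is chosen for S_i <= S_p, P for S_p < S_i <= S_f, and F for S_i > S_f.
    """
    if not s_p < s_f:
        raise ValueError("thresholds must satisfy S_p < S_f")
    below_p = logistic_cdf(s_p - index)   # P(S_i <= S_p)
    below_f = logistic_cdf(s_f - index)   # P(S_i <= S_f)
    return {"N": below_p, "P": below_f - below_p, "F": 1.0 - below_f}

# Hypothetical agent whose skill index sits exactly at the partial-learning cutoff.
probs = ordered_logit_probs(index=0.6, s_p=0.6, s_f=2.1)
```

An agent whose index equals $S_p$ chooses N with probability one half under this error assumption, and the three probabilities sum to one by construction.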
A second research direction involves using our framework to systematically investigate the sources of heterogeneity in partial versus full language acquisition. For example, our model would explain the Belgium and US steady-state differences by focusing on the limited communicative benefits available to those who acquire Flemish and French in Belgium, as compared to the extensive communicative and market reach available to learners of English in the US. Moreover, partial learning, linguistic interaction between English and Spanish, and the emergence of Spanglish in the US differ from the relatively static co-existence of Flemish and French, highlighting different linguistic dynamics. Evaluating whether these differences in fact produce the language patterns we discuss requires moving toward structural empirical work.
Finally, recall that language equilibria typically exhibit inefficient learning compared to the socially optimal level of language acquisition. The suboptimality of equilibrium levels of language acquisition and the persistence of partial learning in various censuses call for a careful and systematic analysis of public policies in this regard, which are briefly mentioned in our paper. However, partial (and full) language learning and usage are heavily affected by individuals' social identity, as "language cannot be legislated; it is the freest, most democratic form of expression of the human spirit" (Stavans (2000) [54], p.557). Thus, it would be important to address linkages between language learning and identity in various settings. An important next step in developing these models is the introduction of identity considerations in the spirit of  [14] as well as in the spirit of Laitin (1993) [44]. Doing this requires a distinct formulation of the utility of identity and of the meaning of solidarity of co-ethnics as such, and should amount to more than simply adding percentages of co-ethnic learners to the utility function. Marrone (2019) [49] gives a variation of this type of approach in considering identity and language investment as joint processes. Our proposal is to treat economic benefits and identity benefits as distinct processes, an approach we pursue in a sequel paper.

examination of the degree of command of English for the group of partial learners across the United States, the United Kingdom, Ireland, and Australia.
To create statistics about language and the ability to speak English, all US censuses since 1890 (with the exception of the 1950 census) contained questions about whether a person speaks a language other than English at home, what language he/she speaks, and how well he/she speaks English. While in earlier censuses the ability to speak English was coded as yes or no, since the 1980 census the command of English for those who do not speak English at home has been categorized by four possible options: (i) speaking it very well (group E), (ii) speaking well (group F), (iii) speaking not well (group G), (iv) not speaking at all (group C). (See the US table.) That is, the recognition of an incomplete or partial command of English became prominent some 40 years ago. Groups F and G jointly contain about 34% of those who do not speak English at home. If we identify partial learners as members of group G only (those who speak English not well), the number is still substantial, about 14%. Using these census data, Carliner (2000) [15] points out quite different earning patterns across these groups. For example, among well-educated men, those who speak English very well earn 9.6% more than men who speak English well, 17.6% more than men who speak English poorly, and 33.6% more than men who speak no English. 31 Applying the same methodology to the 2016 census in Ireland, the same two groups F and G account for 45% of the total number of residents of Ireland who do not speak English at home, while group G alone represents about 13% of those respondents. 32 The data for the UK do not distinguish between those who speak English well and very well. In the US census terminology, the fraction of those who do not speak English well reaches 17.5%. 33 Similarly to the UK data, the Australian census lumps together those who speak English well and very well. Moreover, it does not distinguish between those who speak English not well and those who do not speak it at all. The fraction of the latter combined group among all those who do not speak English at home reaches 20%. 34

A Detailed Analysis of Dynamics in the FPN-Equilibrium Zone
To better understand the forces behind the local stability of a steady state of the dynamic system initiated in the FPN-equilibrium zone, i.e., the linear dynamic system (22), we consider here a special case where all $n$ minority groups are lumped into one "ethnic" group, i.e., all minority groups are homogeneous so that $n = 1$. As we will see, while this removes the interesting economics of externality, the dynamic analysis for the special setting is more transparent and intuitive.
Given the dynamic system (32), it can be verified that a sufficient condition for $(\theta_{f,t}, \theta_{p,t})$ to be in the FPN-equilibrium zone for all $t \in \mathbb{N}$ is $\alpha\lambda < \ell_p e^{-\phi}$ and $\ell_p < \alpha \ell_f e^{-\phi}$.
We conclude that if we start the dynamics at an initial point in the FPN-equilibrium zone (so that (33) holds) and $\phi$ is sufficiently small (so that (34) holds), the trajectory of the dynamic system (32) will always remain in the FPN-equilibrium zone. Finally, notice that the eigenvalues of $\phi M(0)$ and $\phi M(1)$ can always be made to be all less than one in absolute value as long as $\phi$ is sufficiently small. 37 We can hence define a cutoff $\bar{\phi} > 0$ so that (1) the eigenvalues of $\phi M(0)$ and $\phi M(1)$ are all less than one in absolute value, and (2) $\alpha\lambda < \ell_p e^{-\phi}$ and $\ell_p < \alpha \ell_f e^{-\phi}$, for all $\phi \in (0, \bar{\phi})$.
Our explicit dynamic analysis above demonstrates that as long as $\phi$ is sufficiently small, i.e., the language learning speed is slow enough, language learning dynamics initiated in the (interior) FPN-equilibrium zone will remain in the zone and will also converge to an FPN equilibrium in the zone.

36 We can alternatively think of $(\theta_{f,0}, \theta_{p,0})$ as calculated from expression (32) by setting $\phi = 0$.

37 The restriction on $\phi$ for the eigenvalues to be all less than one in absolute value is important. Consider the parametric setting where $\lambda = 2$, $\alpha = 0.6$, $\ell_p = 4.2$, $\ell_f = 8$. One can calculate that
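For the parametric setting $\lambda = 2$, $\alpha = 0.6$, $\ell_p = 4.2$, $\ell_f = 8$, the two zone inequalities $\alpha\lambda < \ell_p e^{-\phi}$ and $\ell_p < \alpha \ell_f e^{-\phi}$ can be solved for $\phi$ directly. The sketch below does only that: it reports the upper bound on $\phi$ implied by the zone condition and says nothing about the eigenvalue requirement, which depends on the matrices $M(0)$ and $M(1)$. The function names are hypothetical.

```python
import math

def zone_bound_on_phi(lam, alpha, l_p, l_f):
    """Supremum of phi satisfying both zone inequalities.

    alpha*lam < l_p*exp(-phi)   is equivalent to   phi < log(l_p / (alpha*lam))
    l_p < alpha*l_f*exp(-phi)   is equivalent to   phi < log(alpha*l_f / l_p)
    A non-positive result means that no phi > 0 satisfies the condition.
    """
    bound_from_first = math.log(l_p / (alpha * lam))
    bound_from_second = math.log(alpha * l_f / l_p)
    return min(bound_from_first, bound_from_second)

def in_zone(phi, lam, alpha, l_p, l_f):
    """Check both zone inequalities at a given phi."""
    return alpha * lam < l_p * math.exp(-phi) and l_p < alpha * l_f * math.exp(-phi)

params = dict(lam=2.0, alpha=0.6, l_p=4.2, l_f=8.0)
phi_sup = zone_bound_on_phi(**params)
holds_at_small_phi = in_zone(0.1, **params)
holds_at_larger_phi = in_zone(0.2, **params)
```

With these values the second inequality binds, giving a bound of $\ln(8/7) \approx 0.134$, so the zone condition holds at $\phi = 0.1$ but fails at $\phi = 0.2$.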