The Team Formation Model
We take into consideration a finite set N of n agents. Each agent i has an endowment \(w_i \in {\mathbb {N}}_+\) of a time resource. We denote by \(\mathbf {w} \in {\mathbb {N}}_+^n\) the vector of endowments of all agents.Footnote 3 A team is a vector \(\mathbf {t} \in {\mathbb {N}}^n\), \(\mathbf {t} \le \mathbf {w}\), with \(t_i\) indicating the amount of time employed by agent i in a joint task. We denote by T the set of teams.
Let A be a finite set of activities (or tasks). A project \(p = (a,\mathbf {t})\) is an activity \(a \in A\) carried out by a team \(\mathbf {t} \in T\). We use set \(P \subseteq A \times T\) to collect all projects \(p = (a,\mathbf {t})\) such that team \(\mathbf {t}\) is able to accomplish activity a. We can think of P as representing the technology, since it indicates, for every possible task, which combinations of inputs allow the task to be completed.Footnote 4 It will simplify the following exposition to introduce, with a slight abuse of notation, the auxiliary function \(n(p)=n(a,\mathbf {t}) \equiv \{ i \in N: t_i > 0 \}\), which gives us the set of agents that put some positive amount of time (possibly different among agents) into project p. Another notation we will use is \(h(p)=h(a,\mathbf {t}) \equiv \sum _{i \in N} t_i \), which indicates the total amount of time (e.g., hours) employed in aggregate by the agents in project p.
In the following discussion we will often use teams and projects as synonyms, but some clarification is necessary. A project \(p=(a,\mathbf {t})\) characterizes an activity a performed by a team \(\mathbf {t}\), where \(\mathbf {t}\) specifies not only the members of the team (who are in the set n(p)) but also how much time each of them devotes to the project. A collection of projects, i.e., of activities and teams, where each activity is performed at most once by any given team, is called a state. We note that while every project p can occur only once in a state, because every activity a can be executed only once by the same team, the same team \(\mathbf {t}\) can occur in different projects, if this is allowed by the technology P, i.e., if there are at least two projects \((a,\mathbf {t}),(b,\mathbf {t}) \in P\), with \(a \ne b\).
A state is denoted by \(x \subseteq P\). We use \(\mathbf {e}(x) = \sum _{ (a,\mathbf {t}) \in x} \mathbf {t}\) to indicate the vector collecting the overall amount of resources employed in state x, agent by agent. We say that x is feasible if \(\mathbf {e}(x) \le \mathbf {w}\), and we denote by \(X \subseteq {\mathcal {P}}(P)\) the collection of subsets of P containing all feasible states. We also introduce function \(\ell (x)=|x|\) that simply counts the number of projects that are completed in state x.
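The definitions of \(n(p)\), \(h(p)\), \(\mathbf {e}(x)\) and feasibility can be made concrete with a small sketch. The Python code below is only an illustration under stated assumptions: the function names, the representation of a project as a pair (activity label, tuple of times) and the three-agent toy data are all hypothetical and do not come from the model's text.

```python
# A project is a pair (activity, t), where t is a tuple of times, one per agent.
# A state is a frozenset of projects. All names here are illustrative assumptions.

def members(project):
    """n(p): indices of the agents who put positive time into the project."""
    _, t = project
    return {i for i, t_i in enumerate(t) if t_i > 0}

def hours(project):
    """h(p): total time employed in the project, summed over agents."""
    _, t = project
    return sum(t)

def employed(state, n_agents):
    """e(x): time employed in the state, agent by agent."""
    totals = [0] * n_agents
    for _, t in state:
        for i, t_i in enumerate(t):
            totals[i] += t_i
    return tuple(totals)

def is_feasible(state, w):
    """x is feasible if e(x) <= w componentwise."""
    return all(e_i <= w_i for e_i, w_i in zip(employed(state, len(w)), w))

# Hypothetical toy data: three agents with endowments (2, 1, 1).
w = (2, 1, 1)
p1 = ("a", (1, 1, 0))
p2 = ("b", (1, 0, 1))
p3 = ("a", (1, 1, 1))

x = frozenset({p1, p2})
print(members(p1), hours(p3))          # {0, 1} 3
print(employed(x, len(w)))             # (2, 1, 1)
print(is_feasible(x, w))               # True
print(is_feasible(x | {p3}, w))        # False: agent 0 would need 3 units
```

Representing a state as a frozenset mirrors the requirement \(x \subseteq P\): a given project can appear at most once, while the same team may appear under different activities.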
Finally, we introduce utilities that agents earn depending on the state they are in. For every \(i \in N\), and for every \(x \in X\), we denote by \(u_i(x)\) the utility gained by agent i in state x.
Given these elements, it is possible to define a team formation model as the quadruple \((N,\mathbf {w},P,\mathbf {u})\). The primitives are the set N of agents involved, their constraints \(\mathbf {w}\), the set P of projects allowed by technology, and agents’ utilities \(\mathbf {u}\). Given N, \(\mathbf {w}\) and P, it is possible to derive the set X of all feasible states, which is a partially ordered set with respect to set inclusion.
Assumptions
In deriving our results, we employ the following restrictions on the possible structure of teams (first three) and on utilities (second group of three). We explicitly refer to each of these assumptions whenever it is used. We note that some of them are refinements of one another, while others are incompatible.
Assumption t1
In every \((a,\mathbf {t}) \in P\), we have for every \(i \in N\) that either \(t_i= 0\) or \(t_i=1\).
Assumption t1 states that the time allocated to each feasible project by every agent is always 0 or 1, or simply (up to a normalization of time) that the time allocated to each feasible project by its participants is a constant of the model which is homogeneous across projects for every agent.
In contrast, the next two are assumptions that exogenously fix the number of members in each team. We will discuss them in more detail in Sect. 4.4 where we will see how our model is a generalization of other common theoretical setups.
Assumption s1
There is a \(k \in {\mathbb {N}}_+\), such that for every \(p \in P\), we have that \(|n(p)|=k\).
The following Assumption s2 is a refinement of Assumption s1, where k is fixed to be equal to 2.
Assumption s2
For every \(p \in P\), we have that \(|n(p)|=2\).
We now present some assumptions that specify how agents gain utilities by performing activities in teams.
Assumption v1
For every \(x, x' \in X\), with \(x'\ne x\) and \(x' = x \cup \{p\}\), and for every \(i \in N\) such that \(i \in n(p)\), we have that \(u_i (x') > u_i (x)\).
Assumption v1 is the only one that is needed for our main result. It states that the marginal utility of forming a team is always positive for each of its members, independently of all other teams in place. We note that this assumption allows for a large variety of externalities that a project may have on the utility of non-members of that team, and for the possibility that the same team brings different marginal effects to its members, depending on the state.
In particular, this is in line with the assumptions of the model in [1], where the benefit of an agent from participating in a project always increases if she gets involved in it. Moreover, this assumption is consistent with the behavior of researchers that we observe in the APS dataset discussed in Sect. 3.
An additional possible restriction is to impose that the aggregate utility of each project is constant across projects (normalized to 1).
Assumption v2
For each \(x \in X\), \(\sum _{i \in N} u_i (x) = |x|\).
Finally, we will also consider a more restrictive assumption that requires linearity in team membership, thus making the marginal value of each team, for each of its members, independent of the state.
Assumption v3
For each state \(x \in X\), and any agent \(i \in N\), we have that \(u_i (x) = v \cdot |\{p \in x: i \in n(p)\}|\), with \(v \in {\mathbb {R}}^+\).
The last two assumptions convey different ideas on the assignment of utilities: While Assumption v2 imposes that the aggregate marginal value of each team is 1, Assumption v3 says that the payoff earned by each agent i is proportional to the number of projects in which i participates. We note that the two assumptions are compatible only if Assumption s1 holds as well, in which case we have \(v=\frac{1}{k}\).
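To see where \(v=\frac{1}{k}\) comes from, one can evaluate both assumptions on a state made of a single project. The short derivation below is a sketch that uses only the definitions above, together with the observation that every single-project state \(\{p\}\), \(p \in P\), is feasible because \(\mathbf {t} \le \mathbf {w}\) holds for every team.
$$\begin{aligned} \sum _{i \in N} u_i (\{p\}) = v \cdot |n(p)| \ \text {(by v3)}, \qquad \sum _{i \in N} u_i (\{p\}) = 1 \ \text {(by v2)} \quad \Rightarrow \quad |n(p)| = \frac{1}{v} \ \text { for every } p \in P. \end{aligned}$$
Hence all teams must have the same size \(k = \frac{1}{v}\), which is Assumption s1. Conversely, under Assumption s1 with \(v=\frac{1}{k}\), Assumption v3 gives \(\sum _{i \in N} u_i (x) = v \sum _{p \in x} |n(p)| = v \cdot k \cdot |x| = |x|\) for every \(x \in X\), which is Assumption v2.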
Maximal States
We observe that X is a partially ordered set under inclusion. This is because, for any two states x and \(x'\) belonging to X, we can have that either x is included in \(x'\), or \(x'\) is included in x, or no set inclusion relationship can be established between them. However, as the empty state \(x_0\) is included in any other state, it is the only minimal state (or the least state) and, given two states x and \(x'\), the set of those states that are included in both is always nonempty. On the other hand, as there is a threshold \(\mathbf {w}\) on the overall available resources, there may not always be a common superset for any two states. In general there will be many maximal states, i.e., states above which it is not possible to include other teams, because otherwise the threshold would be exceeded.
We denote by \({\mathcal {M}}\) the set of maximal states, \({\mathcal {M}} = \{x \in X : x \subseteq x' \text { and } x \ne x' \Rightarrow x' \notin X \}\). We denote by \({\mathcal {L}}\) the set of states with maximum number of completed projects, \({\mathcal {L}} = \{x \in X : |x| \ge |x'|, \text { for all } x' \in X\}\). We observe that \({\mathcal {L}} \subseteq {\mathcal {M}}\). In fact, if \(x \in X\) and \(x \notin {\mathcal {M}}\), then there exists a feasible state that can be obtained from x by adding some project, and x cannot maximize the number of projects. In contrast, there exist in general maximal states that do not maximize the number of projects, as the following example shows.Footnote 5
Example 1
(Maximal states and maximum number of projects)
Consider the case in which \(N=\{i,j,k,m\}\), \(\mathbf {w}=(2,2,2,2)\), \(A = \{a,b\}\), and \(P = \{ (a,(1,1,0,0)), (b,(1,1,0,0)), (a,(0,1,1,0)), (b,(0,1,1,0)), (a,(0,0,1,1)), (b,(0,0,1,1)) \}\). This is a situation in which there are four agents with two units of time each, there are two activities to be performed, and each activity requires that either \(\{i,j\}\), or \(\{j,k\}\), or \(\{k,m\}\) must be involved, with one unit of time each. We note that Assumptions t1 and s2 hold. Figure 3 illustrates the partial order on set X resulting from the above assumptions: An arrow from a state x to another state y indicates that we can pass from x to y by adding a single project.Footnote 6 There are three maximal states, but only one of them maximizes the number of projects. \(\square \)
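The sets \({\mathcal {M}}\) and \({\mathcal {L}}\) of Example 1 can be checked by brute force. The Python sketch below is an illustration only: it enumerates all subsets of P, keeps the feasible ones, and tests maximality by trying to add one project at a time (which suffices because removing a project from a feasible state leaves it feasible). The variable names and the `profile` helper are assumptions introduced here. Note that the enumeration treats states as literal subsets of P, so it distinguishes states that differ only in the activity labels a and b; grouping maximal states by how many projects each pair runs yields three classes.

```python
from itertools import combinations

w = (2, 2, 2, 2)  # endowments of agents i, j, k, m
teams = {"ij": (1, 1, 0, 0), "jk": (0, 1, 1, 0), "km": (0, 0, 1, 1)}
P = [(a, t) for t in teams.values() for a in ("a", "b")]  # the six projects

def feasible(state):
    """e(x) <= w, checked agent by agent."""
    return all(sum(t[i] for _, t in state) <= w[i] for i in range(len(w)))

# X: all feasible states, as frozensets of projects.
X = [frozenset(c) for r in range(len(P) + 1)
     for c in combinations(P, r) if feasible(c)]
X_set = set(X)

# M: states to which no single project can be added without leaving X.
maximal = [x for x in X
           if not any(x | {p} in X_set for p in P if p not in x)]

# L: states with the maximum number of projects.
top = max(len(x) for x in X)
largest = [x for x in X if len(x) == top]

def profile(x):
    """Number of projects run by each pair (ij, jk, km), ignoring activity labels."""
    return tuple(sum(1 for _, t in x if t == team) for team in teams.values())

print(len(X))                                  # 35
print(len(maximal), len(largest), top)         # 10 1 4
print(sorted({profile(x) for x in maximal}))   # [(0, 2, 0), (1, 1, 1), (2, 0, 2)]
```

Only the class with profile (2, 0, 2), where the pairs \(\{i,j\}\) and \(\{k,m\}\) each carry out both activities, reaches four projects; the other two classes are maximal with two and three projects, respectively.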
Why this Generalization?
The theoretical setup we have introduced encompasses different matching models with non-transferable utility that have been developed in the literature. We acknowledge that some of the existing models might be stretched to deal with most of the cases analyzed within our setup. However, we claim that our model is a natural and simple container for all these models, and we find a value in its capability to adapt easily so as to take into consideration specific cases. In the following we will illustrate this capability.
Cooperative games with non-transferable utility are obtained in our setup if we specify that each agent can belong to one coalition only, and that no externalities are allowed. In order to deal with matching, as done in [3], we simply need to add Assumption s2, so that only teams of size two are allowed to be formed. Marriage—that is bipartite matching—can be obtained through adequate constraints on the technology; after dividing the set of agents between males and females, only heterosexual pairs are allowed in P, and additional constraints can also be considered. Figure 4 provides an example.
If, instead, we relax the upper bound on \(\mathbf {w}\), still following Assumption s2, then any state can be considered as a network that satisfies the constraints imposed by \(\mathbf {w}\) (concerning the maximum degree of nodes), and the connections made available by technology P (representing the exogenous network of opportunities). This is illustrated in the following example.
Example 2
(Co-authorships) A popular and seminal model in the economic literature on networks is the “Co-author model” of Jackson and Wolinsky [25] and later extended by [41]:Footnote 7 Agents are researchers and links are pair-wise collaborations on scientific projects, which are costly but provide payoffs that depend endogenously on the negative externalities given by each collaboration to the other co-authors of an author.
We can include in our setup a payoff function with costs and negative externalities of a project \(p=(a,\mathbf {t})\) on the members of other teams that are formed by the members of \(\mathbf {t}\), as in the original model, and with Assumption s2. However, our model allows for more generality and also for more realistic time constraintsFootnote 8 that can be imposed on the available (multi-)matchings. First of all, (i) some agents may work alone, but even three or more agents can set up a team together and produce a paper, as happens in the profession. Then, (ii) with regard to constraints, there could be an exogenous network G of acquaintances, so that a group of co-authors is possible only if they are mutually connected in G. Or, (iii) the researchers could have exogenous complementary skills, and only projects involving agents with enough diversity could be successful. Aspects such as the three listed above, and even others, could all be modeled by some technology P. \(\square \)
So, what is the added value of our setup with respect to existing ones, in terms of representation of real-world phenomena? To provide an answer through an example, let us stick to the co-authorship model of Example 2. Imagine that agents i, j and k set up a project p together, so that \(|n( p )|=3\). This could be represented in the original co-authorship model of Jackson and Wolinsky [25] that allows only for couples, by saying that i is linked to j and k, and j and k are also linked together. We observe that the link between i and j would have a negative externality on each neighbor of these agents, including k. However, in the general setup and in reality, the fact that i and j are in a three-agent collaboration has a positive externality on the third agent k, and a negative externality on the others. Formally, this could be done in the original network formation model by specifying, for each link between two agents, and for any other agent, the sign of the externality of that link on the third agent. It is clear that this would seriously complicate the notation and that only a more general framework such as the one we use can overcome such difficulties.
We conclude this section with a stylized but fairly general application that shows how the possibilities and the competing incentives of some environments cannot be dealt with using the standard models of matching and network formation.
This model provides a possible mechanism for explaining some of the inefficiencies that we observe in the APS dataset discussed in Sect. 3, where researchers seem to congest the overall activity because they do not internalize the negative externalities that they have on each other.
The Publishing Model
We present here an extended example providing the general idea of the model. We continue to adopt an intuition related to the insights presented in Sect. 3 and to the daily experience of everyone in the academic profession, but it is clear that it can easily be extended to R&D between firms that are competing in a market, as in the model of [16]. Consider a world where there are n homogeneous scientific authors, each trying to form teams of collaborators and each with a common time constraint w. They all have two goals: a good output in terms of publications (on which they compete with colleagues), but also the objective of doing good research that can provide advancements in the field. Each author maximizes in each project both the probability of being published and the probability of authoring a good idea. We assume no constraint on the multi-matching technology P, except for the fact that agents cannot work alone: \(p \in P\) if and only if \(|n(p)| \ge 2\).
Here we assume that a project p has a strictly positive divulgative fitness (i.e., an expected popularity) that we call \(\phi (p)\). The divulgative fitness of a paper may depend on the amount of work that is put into the paper by its members. This fitness can clearly also be related to heterogeneous exogenous factors.
Accordingly, a paper’s probability of being published has the multinomial form:Footnote 9
$$\begin{aligned} P_{pub} (p) = \frac{ \phi (p)}{ \sum _{q \in x} \phi (q) }. \end{aligned}$$
When a new team is formed there are clear negative externalities (increasing in the divulgative fitness of the new project) for all the agents that are not members of the new team, because their probabilities of being published decrease.
In a related but not necessarily collinear way, we assume that each project has a strictly positive probability of providing a good idea which is \(P_{good} (p) \). This probability is reasonably increasing in effort, but there may be communication and coordination costs which make it decrease in the size, in terms of members, of the team. Or, there could be positive externalities from the aggregate quality of all the scientific production as a whole. In general, the whole environment of x can provide both positive and negative externalities, with network effects like those described in the connection model and in the co-authorship model of [25].
To provide a simplified functional form, which maintains the general idea, we assume that each author i receives a payoff in a generic state x that is:
$$\begin{aligned} u_i (x) = \sum _{p \in x: i \in n(p)} \left( U P_{pub} (p) + \frac{V}{|n(p)|} \cdot P_{good} (p) \right) , \end{aligned}$$
where U and V are positive numbers, homogeneous for all agents,Footnote 10 and \(P_{good} (p)\) does not depend on other existing projects in x. We observe that, while the utility U coming from a publication is not affected by the number of authors (what matters is to have a publication in the curriculum vitae), the benefits V deriving from a good idea must be shared among the participants (consider, for instance, the earnings that come from a patented idea).
This utility is in line with what is observed in Sect. 3. On the one hand, authors benefit from taking part in many projects and from having multiple collaborators to accommodate and meet the time constraint while, on the other hand, they do not take into account the externalities which may cause a reduction in the effort spent and in the quality of the project.
For this simplified model it is not difficult to prove that Assumption v1 is satisfied (while the other assumptions are not).Footnote 11 That is because, for an agent i and a nonempty state x, if we call \(\Phi \equiv \sum _{q \in x} \phi (q)\) the total fitness of the projects in x, the marginal utility of being a member of a new project \(p'\) is:
$$\begin{aligned} u_i (x \cup \{p'\}) - u_i (x) = U \left( \phi (p') \frac{ \sum _{q \in x: i \not \in n(q)} \phi (q) }{\left( \Phi + \phi (p')\right) \Phi }\right) + \frac{V}{|n(p')|} \cdot P_{good} (p'). \end{aligned}$$
The first term is nonnegative, and it is null only if that agent was already a member of each existing team. The second term is strictly positive by definition.
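The marginal utility can be checked numerically against the payoff definition. The Python sketch below is an illustration under stated assumptions: it reads \(\Phi \) as the total fitness \(\sum _{q \in x} \phi (q)\) of the projects in x, it shares the good-idea benefit as \(V/|n(p')|\) exactly as in the payoff \(u_i\), and it uses hypothetical values for \(\phi \), \(P_{good}\), U and V on three made-up projects. The function names are likewise assumptions.

```python
def p_pub(p, state, phi):
    """Probability of publication: fitness of p over total fitness in the state."""
    return phi[p] / sum(phi[q] for q in state)

def utility(i, state, phi, p_good, members, U, V):
    """u_i(x): sum over the projects of agent i in the state."""
    return sum(U * p_pub(p, state, phi) + V / len(members[p]) * p_good[p]
               for p in state if i in members[p])

def marginal_closed_form(i, state, new, phi, p_good, members, U, V):
    """Closed-form marginal utility of joining `new`, for a nonempty state."""
    total = sum(phi[q] for q in state)                              # Phi
    outside = sum(phi[q] for q in state if i not in members[q])     # projects without i
    return (U * phi[new] * outside / ((total + phi[new]) * total)
            + V / len(members[new]) * p_good[new])

# Hypothetical data: three projects among agents 0, 1, 2.
members = {"p1": {0, 1}, "p2": {1, 2}, "p3": {0, 2}}
phi = {"p1": 1.0, "p2": 2.0, "p3": 3.0}
p_good = {"p1": 0.5, "p2": 0.4, "p3": 0.2}
U, V = 1.0, 2.0

x = {"p1", "p2"}
direct = (utility(0, x | {"p3"}, phi, p_good, members, U, V)
          - utility(0, x, phi, p_good, members, U, V))
closed = marginal_closed_form(0, x, "p3", phi, p_good, members, U, V)
print(direct, closed)  # the two values agree up to floating-point error
```

In this toy case the publication term is strictly positive because agent 0 is not a member of project p2, in line with the remark that the first term vanishes only when the agent already belongs to every existing team.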