Discovering a cohesive football team through players’ attributed collaboration networks

Yu, Shenbao; Zeng, Yifeng; Pan, Yinghui; Chen, Bilian

doi:10.1007/s10489-022-04199-4

Open access
Published: 12 October 2022

Volume 53, pages 13506–13526, (2023)

Applied Intelligence

Shenbao Yu^1,2,
Yifeng Zeng ORCID: orcid.org/0000-0002-5246-403X³,
Yinghui Pan⁴ &
…
Bilian Chen^1,2

Abstract

The process of team composition in multiplayer sports such as football has been a main area of interest within the field of the science of teamwork, which is important for improving competition results and game experience. Recent algorithms for the football team composition problem take into account the skill proficiency of players but not the interactions between players that contribute to winning the championship. To automate the composition of a cohesive team, we consider the internal collaborations among football players. Specifically, we propose a Team Composition based on the Football Players’ Attributed Collaboration Network (TC-FPACN) model, aiming to identify a cohesive football team by maximizing football players’ capabilities and their collaborations via three network metrics, namely, network ability, network density and network heterogeneity&homogeneity. Solving the optimization problem is NP-hard; we develop an approximation method based on greedy algorithms and then improve the method through pruning strategies given a budget limit. We conduct experiments on two popular football simulation platforms. The experimental results show that our proposed approach can form effective teams that dominate others in the majority of simulated competitions.

1 Introduction

The process of team composition, which aims to discover an appropriate set of individuals with relevant expertise to achieve common goals efficiently, has been a major area of interest in the field of the science of teamwork. As football (also called “soccer” in some countries) requires a high level of teamwork, it is one of the best options for studying the team composition problem since it is characterized by a large amount of communication, interaction, and collaboration between team members. In reality, it is difficult to assess the effectiveness of a football team composition result because it may require a considerable amount of money as well as being labor-intensive. Fortunately, the emergence of a wide variety of football video games, such as Pro Evolution Soccer (PES)^{Footnote 1}, Electronic Arts Sports FC (also known as FIFA)^{Footnote 2} and Football Manager^{Footnote 3}, offers an opportunity to compose a team based on human preferences and evaluate outcomes efficiently. This opportunity exists not only because gamers can completely redo club designs as well as edit any player in the game but also because the platforms can fully simulate on-pitch football matches. Subsequently, the football team composition task becomes interesting and important on the game platforms.

In a multiplayer game such as football, the process of player selection and team composition aims to select the most suitable player for each playing position and role [1], which is vital for clubs seeking high sporting and financial returns [2]. Such a process is crucial since a poor selection result can affect player loyalty as well as cost a football team millions of dollars [3]. However, the multicriteria complexity and decision-making difficulty make the selection of players a challenging task. Although team managers and coaches assess players on many aspects, including player productivity and limited wage budgets, a manual selection process is too time-consuming to be practical, and accurately evaluating a player’s suitability for a team remains difficult. A systematic approach, such as mathematical modeling, is therefore needed.

Many studies have attempted to address the football team composition problem, but most of them rely on attributes such as players’ skills and physical status. For instance, most researchers utilize anthropometric measurements (e.g., age, height, and weight), fitness-related indices (e.g., vertical jump ability and speed), and players’ techniques (e.g., short passing and shooting) for the football player selection problem [4]. In addition, the market value and salary of football players are taken into account [5, 6]. Specifically, Zeng et al. [5] considered the players’ total salary as a budget constraint and resorted to a submodular function to solve the team composition problem. However, such attributes are not sufficient to measure a football team’s competitiveness. Achieving good results depends on not only the high-level players who are involved but also how effectively they collaborate, communicate, and work together as a team.

Assume, for example, a team manager who wants to build a football team consisting of players with distinguished skills in the following areas: {attacking prowess, ball control, defensive prowess, physical contact, and speed}. We also assume that there is a network including five football players {P₁,P₂,P₃,P₄,P₅} in Fig. 1. Each player highlights the corresponding skills, and an edge between two football players indicates that they can collaborate effectively. Such a network is referred to as an attributed collaboration network (ACN)^{Footnote 4} [7]. Without considering the connection among players, the manager can select either C₁ = {P₁,P₂,P₃} or C₂ = {P₁,P₄,P₅} - both C₁ and C₂ have the required skill set. However, the candidate set C₁ is the better choice since the network indicates that P₁ cannot work with P₄ and P₅ effectively.

The existence of an ACN among football players is quite common. In a football league, an obvious type of player collaboration is developed upon whether they are from the same team or country, which is often used to organize players in a club. In this case, the network encodes the fact that football players from the same club or country can communicate more easily and cooperate more harmoniously with each other than those serving in different teams. In addition, it is known that defensive and offensive positions differ in player composition because they are conjunctive and disjunctive tasks respectively [8]. The success of driving off each attack is dependent on completing a joint action. Here, the weakest defender is detrimental to the team’s defensive performance because he or she limits the team’s defensive capabilities. In contrast, a team’s offensive capacity is determined by the output of the best-performing member. Moreover, the distribution of a team’s offensive (defensive) performance can be measured by the network heterogeneity (homogeneity) [9, 10]. Low heterogeneity (or high homogeneity) indicates that all players share a similar level of interaction through the match, and vice versa. Thus, attacking benefits from heterogeneous players, while homogeneity ensures that there are no weak links among defensive players. This insight facilitates our understanding of the underlying functional mechanism of collaboration and motivates us to develop players’ attributed collaboration networks for the football team composition problem.

In this paper, we consider the team composition problem in the context of the Football Players’ Attributed Collaboration Network (FPACN). Each node in the network is a football player with certain skills, such as attacking prowess, ball control, dribbling, while edges between nodes are constructed based on the clubs they played for and their nationalities, which reflect the affinity between players. After obtaining the attributed collaboration network, given a certain budget, we propose a TC-FPACN model, the acronym for Team Composition based on the Football Players’ Attributed Collaboration Network, to identify a set of highly qualified football players and form a remarkably cohesive team. We evaluate the cohesiveness of a football team on the basis of three predefined network metrics, namely, network ability, network density, and network heterogeneity&homogeneity, in the TC-FPACN, whose goal is to discover a football team that maximizes the combination of the three network metrics. As we present the team’s properties through the attributed collaboration network, the constrained optimization problem can be converted to finding a maximum density subgraph in a graph, which turns out to be NP-hard [11]. The problem becomes more complicated when players’ ability and heterogeneity (or homogeneity) are considered. We propose an approximation algorithm that finds the best team based on greedy algorithms and further improve the algorithm using pruning methods under a budget constraint. We summarize the main contributions of this paper below.

We propose a Team Composition based on the Football Players’ Attributed Collaboration Network (TC-FPACN) model, which incorporates three network metrics (i.e., network ability, network density, and network heterogeneity&homogeneity) to define players’ cooperation mechanism.
We formulate the team composition task as a constrained optimization problem for the TC-FPACN that finds the optimal subgraph based on the network metrics. Since the problem is NP-hard, we propose a greedy algorithm with a pruning technique to solve it.
We conduct an empirical study on two video game platforms, i.e., Pro Evolution Soccer 2018 (PES2018) and EA SPORTS FIFA 22 (FIFA2022) to evaluate the effectiveness of the proposed model. Simulation results show that our model achieves favorable performance in competition against other teams.

The remainder of the paper is organized as follows. We review related works in Section 2. In Section 3, we first formally introduce the team composition task, then describe the three network metrics of the TC-FPACN and finally, formulate the team composition problem. We propose the new algorithms in Section 4. Section 5 demonstrates the performance of the proposed method. Finally, Section 6 concludes our work and discusses future research directions.

2 Related work

Since this paper considers finding a cohesive football team based on football players’ capabilities and collaborations, we start with a review of football decisions, especially for player selection and team composition, and proceed with related research on the evaluation of personal ability and the retrieval of the team from collaboration networks in general.

2.1 Football player selection and team composition

The process of football player selection and football team composition is a complex problem with conflicting objectives. The traditional solution to this problem is to assess several quantitative factors that coaches and their technical committees rely on to identify the most elite players. These factors include the player’s anthropometric measurements [4], fitness-related indices [12], and skills [5, 13]. For example, Inan and Cavas [13] analyzed the offensive and defensive characteristics of Turkish Super League football players, such as long pass accuracy, and developed an artificial neural network model for talent selection. Zeng et al. [5] defined a submodular function that represents the team’s skill coverage and used improved greedy algorithms to solve the optimization problem. Given that football players have different duties in the field, many researchers have also argued that the relevant skill criteria must be assigned according to each player’s position [3, 14, 15]. Ozceylan [3], for example, used an analytic hierarchy process to prioritize the criteria for each player based on their position and developed a 0-1 integer linear programming model to determine the top players in a team.

Most approaches mentioned above emphasize the on-pitch sport success. In addition, there are other factors worth considering, such as financial aspects [16, 17] and the future potential of professional football players [18, 19]. For instance, Singh and Lamba [16] resorted to machine learning models including decision tree and gradient boost to identify the factors that affect the financial market values of football players and then used the selected factors to predict the player’s market value. In [18], the authors projected a target player’s potential by searching the corresponding historical attributes to identify other football players with a similar profile. Zhao et al. [19] defined three attributes, including the potential factor, to evaluate the performance of teams and football players.

Nevertheless, forming a winning football team involves more than having the required mix of skills under the budget limit. Player selection is a difficult decision-making problem that needs to take into account the collaboration mechanisms among football players, which are ignored in the literature.

2.2 Personal ability evaluation

Personal ability is always an important guideline for team composition. Player selection needs to consider quantitative attributes, and the most widely used rating systems for a player are based on performance data. Since there are multiple attributes to consider when assessing a player’s ability, algorithms based on multicriteria decision-making (MCDM) are regarded as simple and suitable for developing solutions [20]. As a key component of the MCDM method, the analytic hierarchy process (AHP) is widely used to determine the weights of the selected criteria [21]. Using AHP, each player’s attributes are ranked according to their importance in a given position. In parallel, the technique for order of preference by similarity to ideal solution (TOPSIS), a well-known MCDM method, is applied extensively to rank the alternatives, partly due to its mathematical clarity. Many variants have been developed from it, such as TOPSIS-IPA [22] and Fuzzy-TOPSIS [23]. More recently, Sałabun et al. [24] developed a multicriteria model based on the characteristic objects method to evaluate players in team sports.

In addition to MCDM-based models, Liu et al. [25] introduced the text information of postmatch reports written by professional soccer journalists or editors and proposed an affective computing model for the player’s performance rating. Furthermore, Pantzalis and Tjortjis [26] conducted an intensive study to define the main attributes that influence a defender’s match rating. They found that classic defensive actions such as interceptions and clearances, along with player attributes such as jumping reach and strength, are more suitable for evaluating defenders.

2.3 Collaboration networks for team formation

A successful team relies on not only individual ability but also communication and collaboration. The study of scientific collaboration aims to compute the fitness level of an expert for collaborating with other experts on a set of skills [27]. Lappas et al. [28] were the first to consider team formation in the presence of an expertise collaboration network, measuring effectiveness by communication cost. Subsequently, density-based measurements were proposed [29,30,31], whose authors generalized the approach of [28] by treating the team formation problem as a multiobjective optimization task. For example, Selvarajah et al. [31] aimed to build a more effective team by analyzing various scenarios, such as how frequently team members had worked together in the past. In parallel, Datta et al. [32] proposed a composite mechanism to exploit different elements of individuals and the community given by their expertise and connections. Furthermore, Awal and Bharadwaj [33] quantified and optimized a team’s collective ability based on a collective intelligence index, which encodes individuals’ knowledge competence and their collaboration competence.

Given that the major limitations of the class of solutions mentioned above are that they fail to capture complex interactions and are computationally intractable, more recent work adopted neural architectures to learn a mapping between the skills and experts’ space [34,35,36]. For instance, Hamidi et al. [36] focused on state-of-the-art neural network methods to learn the dense representations for nodes in the collaboration network and bootstrapped the training process through transfer learning. Similarly, in this paper, we focus on the team formation problem based on the collaboration network and explore an efficient way to find a team. Specifically, we consider a network structure of football players as an attributed collaboration network, where nodes representing players are associated with their skills and the weights attached to edges reflect their degree of affinity.

3 TC-FPACN model

In this section, we present the TC-FPACN model, which is formed by three network metrics that contribute to determining the cohesiveness of a football team, including network ability, network density, and network heterogeneity&homogeneity. We first formally introduce the team composition task and then detail the network metrics. Finally, we formulate the objective function of TC-FPACN, which is to discover a subnetwork by maximizing the three metrics simultaneously.

3.1 Task formulation

Let P = {P_n} (1≤n≤N) be a set of football players, and S = {S_m} (1≤m≤M) be a set of players’ skills, where N and M are the number of football players and skills, respectively. Assume that football players are organized in a weighted and undirected graph (i.e., FPACN), denoted as $\mathcal {G}(\mathcal {V}, \mathcal {E})$ with a set of nodes $\mathcal {V}$ and a set of edges $\mathcal {E}$. Each node $v_{n} \in \mathcal {V}$ is associated with a football player P_n equipped with a set of skills ^{Footnote 5}, while an edge $(i,j) \in \mathcal {E}$ models the relationship between the pair of the players (i.e., P_i and P_j). In addition, for readability, we present the main notations used throughout the paper in Appendix A, Table 10.

In football, it is intuitive that different positions on the pitch highlight different skills, which means that some skills are common (e.g., body control and jump) while others (e.g., goalkeeping) are unique to a particular position (e.g., goalkeeper). Thus, we divide football players into three groups - Forward/Midfielder, Backward, and Goalkeeper - according to a player’s position in the football field, with the corresponding collaboration network $\mathcal {G} = \mathcal {G}_{\mathrm {F}} \cup \mathcal {G}_{\mathrm {B}} \cup \mathcal {G}_{\mathrm {G}}$, where $\mathcal {G}_{\mathrm {F}}$, $\mathcal {G}_{\mathrm {B}}$, and $\mathcal {G}_{\mathrm {G}}$ are subgraphs for Forward/Midfielder, Backward, and Goalkeeper respectively. We define the task of football team composition as follows:

Definition 1

Given an attributed collaboration network of all football players and a limited budget, the goal of our team composition task is to form a cohesive subnetwork (i.e., football team) $\mathcal {G}^{\prime }(\mathcal {V}^{\prime }, \mathcal {E}^{\prime }) \subseteq \mathcal {G}(\mathcal {V}, \mathcal {E})$, where the node set $\mathcal {V}^{\prime }$ represents the selected football players.
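As an illustration of the setup above, the following minimal sketch shows one possible in-memory representation of the players and their partition into the three position groups. The names `Player` and `partition_by_group`, the group codes "F", "B" and "G", and the sample records are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Player:
    """A node of the FPACN: a player, a position group, and skill levels."""
    name: str
    group: str  # assumed codes: "F" (Forward/Midfielder), "B" (Backward), "G" (Goalkeeper)
    skills: Dict[str, float] = field(default_factory=dict)


def partition_by_group(players: List[Player]) -> Dict[str, List[str]]:
    """Split player names into the node sets of G_F, G_B and G_G."""
    groups: Dict[str, List[str]] = {"F": [], "B": [], "G": []}
    for p in players:
        groups[p.group].append(p.name)
    return groups


# Hypothetical records, for illustration only.
players = [
    Player("P1", "F", {"ball_control": 80.0}),
    Player("P2", "B", {"defensive_prowess": 75.0}),
    Player("P3", "G", {"goalkeeping": 85.0}),
    Player("P4", "F", {"speed": 90.0}),
]
groups = partition_by_group(players)
```

Edges and their weights are kept separately in this sketch, since they depend on the relationship measure introduced in Section 3.2.2.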

3.2 Three network metrics

The TC-FPACN model considers the cohesiveness of a football team from three aspects: a) network ability, b) network density, and c) network heterogeneity&homogeneity. We now describe the three network metrics in detail.

3.2.1 Network ability

Given a football player P_n ∈ P (1 ≤ n ≤ N) with a set of skills, each of which is labelled with a corresponding weight and personal level, we first build a model to calculate the personal ability of P_n, denoted $\phi _{\mathrm {P}_{n}}$, in (1).

$$ \phi_{\mathrm{P}_{n}} = \sum\limits_{m=1}^{M} W_{\mathrm{S}_{m}} L_{\mathrm{P}_{n},\mathrm{S}_{m}}, $$

(1)

where $W_{\mathrm {S}_{m}}$ is the weight of skill S_m, and $L_{\mathrm {P}_{n}, \mathrm {S}_{m}}$ is the personal level of S_m for player P_n. With the personal ability defined in (1), we calculate the network ability of $\mathcal {G}^{\prime }(\mathcal {V}^{\prime }, \mathcal {E}^{\prime })$ for a football team (i.e., the competency of the whole team), which gives

$$ {\Phi}(\mathcal{G}^{\prime}) = \sum\limits_{n=1}^{\lvert \mathcal{V}^{\prime} \rvert} \phi_{\mathrm{P}_{n}}, $$

(2)

where $\lvert \mathcal {V}^{\prime } \rvert $ is the number of selected football players in a team. We can see from (2) that it is the sum of the personal abilities of the selected players, which means that a higher network ability score contributes to forming a better football team.
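Equations (1) and (2) can be sketched in a few lines. The function names, skill names, weights and levels below are hypothetical values chosen for illustration; treating a skill that a player lacks as level 0 is also an assumption.

```python
from typing import Dict, List


def personal_ability(levels: Dict[str, float], weights: Dict[str, float]) -> float:
    """Eq. (1): weighted sum of a player's skill levels.

    Assumption: a skill missing from `levels` counts as level 0.
    """
    return sum(weights[s] * levels.get(s, 0.0) for s in weights)


def network_ability(team: List[Dict[str, float]], weights: Dict[str, float]) -> float:
    """Eq. (2): sum of the personal abilities of the selected players."""
    return sum(personal_ability(levels, weights) for levels in team)


# Hypothetical skill weights and player levels.
weights = {"attacking_prowess": 0.5, "ball_control": 0.3, "speed": 0.2}
p1 = {"attacking_prowess": 80, "ball_control": 70, "speed": 90}
p2 = {"attacking_prowess": 60, "ball_control": 90, "speed": 70}

phi_p1 = personal_ability(p1, weights)
phi_team = network_ability([p1, p2], weights)
```

With these numbers, P1's ability is 0.5·80 + 0.3·70 + 0.2·90 = 79 and P2's is 71, so the team's network ability is 150 (up to floating-point rounding).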

3.2.2 Network density

As shown in (2), a naive scheme for building a football team is to identify suitable players with good skills for each position and then put them together. However, the team’s victory depends on not only the number of football stars but also the collaboration of the players, enabling them to function as a cohesive team in the field. Intuitively, good collaboration is commonly built upon players’ relationships. To establish relationships among football players, in this paper, we consider whether they come from the same team or country, which is often used for organizing players in a club. Formally, let us consider the graph $\mathcal {G}(\mathcal {V}, \mathcal {E})$. Given any two nodes $v_{i}, v_{j} \in \mathcal {V}$ associated with two football players P_i and P_j, if they come from the same country, the same club, or both, we add the edge (i,j) to $\mathcal {E}$, and the relationship is weighted by calculating the Jaccard similarity, denoted as $\omega_{i,j}$, in (3).

$$ \omega_{i,j} = \frac{\lvert \mathbf{V}_{\mathrm{P}_{i}} \cap \mathbf{V}_{\mathrm{P}_{j}} \rvert}{\lvert \mathbf{V}_{\mathrm{P}_{i}} \cup \mathbf{V}_{\mathrm{P}_{j}} \rvert}, $$

(3)

where $\mathbf {V}_{\mathrm {P}_{i}}$ is the vector of player P_i with the elements team name and nationality.
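A minimal sketch of the edge weight in (3) follows, treating each player's vector as a set of tagged attributes. The function name `jaccard_weight`, the tuple tagging that keeps club and country names apart, and the sample clubs and countries are assumptions made for illustration.

```python
from typing import Set, Tuple

Attr = Tuple[str, str]  # (kind, value), e.g. ("club", "Club A")


def jaccard_weight(attrs_i: Set[Attr], attrs_j: Set[Attr]) -> float:
    """Eq. (3): |intersection| / |union| of two players' attribute sets."""
    union = attrs_i | attrs_j
    if not union:  # assumption: two empty vectors share nothing
        return 0.0
    return len(attrs_i & attrs_j) / len(union)


# Hypothetical players.
p_i = {("club", "Club A"), ("country", "X")}
p_j = {("club", "Club A"), ("country", "X")}  # same club and same country
p_k = {("club", "Club B"), ("country", "X")}  # same country only
p_l = {("club", "Club C"), ("country", "Y")}  # nothing in common

w_ij = jaccard_weight(p_i, p_j)
w_ik = jaccard_weight(p_i, p_k)
w_il = jaccard_weight(p_i, p_l)
```

Under this reading, two players who share both club and country get weight 1, players who share only one of the two get weight 1/3, and players who share neither get weight 0, which corresponds to having no edge.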

Based on the relationships among football players, we now turn to define the network density for measuring team cohesiveness. Although many methods have been used to define a team’s cohesion based on social networks, such as the diameter communication cost [28], density-based measurement [29], and local clustering coefficient [32], the definition of a team’s cohesiveness is still an open issue. Different from the existing works, we define the network density to measure the strength of inner-team interaction in the subnetwork $\mathcal {G}^{\prime }(\mathcal {V}^{\prime }, \mathcal {E}^{\prime })$ for a football team in (4).

$$ {\Psi}(\mathcal{G}^{\prime}) = \frac{{\sum}_{(i,j){\in}\mathcal{E}^{\prime}} \omega_{i,j}}{\lvert \mathcal{E}^{\prime} \rvert}, $$

(4)

where (i,j) is an edge in $\mathcal {E}^{\prime }$, $\omega_{i,j}$ is the corresponding weight defined in (3), and $\lvert \mathcal {E}^{\prime } \rvert $ is the number of edges. If there is no edge between two nodes, we set $\omega_{i,j} = 0$. A larger value of ${\Psi }(\mathcal {G}^{\prime })$ suggests that football players are better able to interact with each other, while a smaller value indicates the presence of more ambiguous relationships. To better understand the importance of the network density, we give a toy example below.

Example 1

Consider the two undirected, weighted graphs in Fig. 2, where each node denotes a football player and the edges reflect the relationship between any two players. The values of $\phi _{\text {P}_{i}}$ and $\omega_{i,j}$ are also shown in the figure. If we ignore the collaborative relationships between football players, it is intuitive that the two players {P₂,P₃} are highly scored and should be selected into a team (see the left-hand side of Fig. 2); however, their relationship (the right-hand side of Fig. 2) is rather weak. In contrast, the players {P₁,P₃} would be the better candidates, as they have the strongest connection, which suggests that the connection strength (network density) among players helps to build and reinforce a cohesive team.
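The density in (4) can be sketched as the mean weight of the edges present among the selected players. The edge weights below are hypothetical and are not the values shown in Fig. 2; returning 0 for a team without any edge is also an assumption, since (4) is undefined in that case.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable


def network_density(team: Iterable[str], weights: Dict[FrozenSet[str], float]) -> float:
    """Eq. (4): sum of edge weights inside the team divided by the edge count."""
    edges = [frozenset(pair) for pair in combinations(team, 2)
             if frozenset(pair) in weights]
    if not edges:  # assumption: no edges means density 0
        return 0.0
    return sum(weights[e] for e in edges) / len(edges)


# Hypothetical edge weights, chosen only to mirror the spirit of Example 1.
omega = {
    frozenset(("P1", "P3")): 0.9,
    frozenset(("P2", "P3")): 0.2,
    frozenset(("P1", "P2")): 0.5,
}

density_13 = network_density(["P1", "P3"], omega)
density_23 = network_density(["P2", "P3"], omega)
```

With these assumed weights, the pair {P₁,P₃} scores higher than {P₂,P₃}, which is the qualitative point the example makes.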

3.2.3 Network heterogeneity & homogeneity

In this section, we proceed to define the network heterogeneity&homogeneity, which is also an important factor for team cohesiveness in the TC-FPACN. It is well known that heterogeneity and homogeneity are opposites, which means that improving heterogeneity may compromise homogeneity and vice versa. Specifically, heterogeneity highlights the diversity of attributes and behaviors among group members; in contrast, homogeneity emphasizes the within-group similarities regarding these shared attributes.

We adopt the Gini coefficient [37] to measure heterogeneity (or homogeneity) for the set of players, denoted Gc. Since the Gini coefficient can be calculated in many forms [38,39,40], we use an approximate calculation method [38] as follows:

$$ Gc = \frac{1}{M}\sum\limits_{m=1}^{M} \frac{1}{2{N}^{2}u}\sum\limits_{i=1}^{N}\sum\limits_{j=1}^{N} W_{\mathrm{S}_{m}} \lvert L_{\mathrm{P}_{i}, \mathrm{S}_{m}} - L_{\mathrm{P}_{j}, \mathrm{S}_{m}} \rvert, $$

(5)

where $\lvert L_{\mathrm {P}_{i}, \mathrm {S}_{m}} - L_{\mathrm {P}_{j}, \mathrm {S}_{m}} \rvert $ measures the difference in the skill level related to S_m between two players P_i and P_j, and u is the average value of skill S_m. In (5), we see that Gc = 1 indicates the maximum heterogeneity, while Gc = 0 is the maximum homogeneity, which means that they are interdependent [8].
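The following sketch evaluates (5) directly, reading N as the number of players in the set being measured and u as the per-skill mean. The function name and the sample levels are hypothetical, and skipping a skill whose mean is zero is an assumption made to avoid division by zero.

```python
from typing import List


def gini_coefficient(levels: List[List[float]], weights: List[float]) -> float:
    """Eq. (5): weighted mean-absolute-difference Gini, averaged over skills.

    levels[i][m] is the level of skill m for player i; weights[m] is W_{S_m}.
    """
    n = len(levels)
    m = len(weights)
    total = 0.0
    for k in range(m):
        column = [row[k] for row in levels]
        mean = sum(column) / n
        if mean == 0:  # assumption: a skill nobody has contributes nothing
            continue
        pairwise = sum(abs(a - b) for a in column for b in column)
        total += weights[k] * pairwise / (2 * n * n * mean)
    return total / m


# Hypothetical single-skill examples with unit weight.
gc_equal = gini_coefficient([[70.0], [70.0], [70.0]], [1.0])
gc_split = gini_coefficient([[0.0], [10.0]], [1.0])
```

Identical players give Gc = 0. In the second call, the pairwise differences sum to 20 and the normalizer is 2·2²·5 = 40, so Gc = 0.5.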

In the context of football games, the two main tasks are attack and defense, and they require different mechanisms to select players to successfully complete the tasks. Attacks on a goal benefit from players who have different skills and require a set of heterogeneous forward players. However, defense requires homogeneous players since it is expected that most defense players can play in any position in the defense area. Considering that Forward and Midfield players are involved in the attack and Backward players are responsible for the defense, based on the Gini coefficient defined in (5), we measure the network heterogeneity&homogeneity for $\mathcal {G}^{\prime }(\mathcal {V}^{\prime }, \mathcal {E}^{\prime })$ for a football team as follows:

$$ {\Upsilon}(\mathcal{G}^{\prime}) = \left\{ \begin{array}{rl} Gc, & \quad \text{if}~ v_{n} \in \mathcal{G}^{\prime}{\cap} \mathcal{G}_{\mathrm{F}} \\ \frac{1}{Gc}, & \quad \text{if}~ v_{n} \in \mathcal{G}^{\prime}{\cap} \mathcal{G}_{\mathrm{B}} \end{array} \right., $$

(6)

where v_n ($1 {\leq } n {\leq } {\lvert {\mathcal {V}^{\prime }}\rvert }$) represents a football player selected from the two graphs (i.e., $\mathcal {G}_{\mathrm {F}}$ and $\mathcal {G}_{\mathrm {B}}$) simultaneously. Equation (6) shows that a cohesive team should maximize network heterogeneity for the Forward/Midfielder while minimizing it for the Backward in the team composition.
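The piecewise definition in (6) can be written as a small helper. The function name, the group codes "F" and "B", and the `eps` floor that guards against division by zero when Gc = 0 are assumptions of this sketch; the section does not say how that boundary case is handled.

```python
def heterogeneity_homogeneity(gc: float, group: str, eps: float = 1e-9) -> float:
    """Eq. (6): Gc for Forward/Midfielder players, 1/Gc for Backward players."""
    if group == "F":
        return gc
    if group == "B":
        return 1.0 / max(gc, eps)  # assumed guard for Gc = 0
    raise ValueError("group must be 'F' or 'B'")


upsilon_f = heterogeneity_homogeneity(0.25, "F")
upsilon_b = heterogeneity_homogeneity(0.25, "B")
```

For the same Gc = 0.25, the Forward/Midfielder group scores 0.25 while the Backward group scores 4, so a lower Gini coefficient is rewarded among defenders.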

3.3 Team composition via three network metrics

As described above, the TC-FPACN model relies on three network metrics that lay the foundation for building a cohesive football team. To combine them, we introduce the trade-off parameters α and β, where 0 ≤ α + β ≤ 1, which configure the relative contributions of network ability, network density, and network heterogeneity&homogeneity. Formally, given the attributed collaboration network of football players $\mathcal {G}(\mathcal {V}, \mathcal {E})$ and a fixed budget (Bu) for recruiting players, we use σ to denote the objective function of the TC-FPACN and formulate the team composition task as the following optimization problem.

$$ \begin{array}{ll} \mathop{\max}\limits_{\mathcal{G}^{\prime} \subseteq \mathcal{G}} & \sigma(\mathcal{G}^{\prime}) := {\alpha} {\Phi}(\mathcal{G}^{\prime}) + {\beta} {\Psi}(\mathcal{G}^{\prime}) + (1 - \alpha - \beta) {\Upsilon}(\mathcal{G}^{\prime}), \\ \text{s.t.} & {\sum}_{n=1}^{\lvert \mathcal{V}^{\prime} \rvert} Cost(\mathrm{P}_{n}) \leq \text{Bu}, \\ & \lvert \mathcal{V}^{\prime} \rvert = 11, \end{array} $$

(7)

where ${\sum }_{n=1}^{\lvert \mathcal {V}^{\prime } \rvert } Cost(\mathrm {P}_{n})$ denotes the total cost of the football team, in which the function Cost(P_n) measures the cost of player P_n based on his personal rating, which we will explain in Section 5.1.

As shown in problem (7), the goal of TC-FPACN is to find a subgraph $\mathcal {G}^{\prime }(\mathcal {V}^{\prime }, \mathcal {E}^{\prime })$ containing a set of football players that maximize the function considering the three metrics simultaneously. The subgraph $\mathcal {G}^{\prime }$ contains players for three types of positions in (8):

$$ \mathcal{G}^{\prime} = \mathcal{G}^{\prime}_{\mathrm{F}} \cup \mathcal{G}^{\prime}_{\mathrm{B}} \cup \mathcal{G}^{\prime}_{\mathrm{G}}, $$

(8)

where $\mathcal {G}^{\prime }_{\mathrm {F}} \subseteq \mathcal {G}_{\mathrm {F}}$, $\mathcal {G}^{\prime }_{\mathrm {B}} \subseteq \mathcal {G}_{\mathrm {B}}$, and $\mathcal {G}^{\prime }_{\mathrm {G}} \subseteq \mathcal {G}_{\mathrm {G}}$. Note that we focus on choosing suitable players in the field and neglect bench players, which means that the number of nodes in $\mathcal {G}^{\prime }$ is 11 (i.e., $\lvert {\mathcal {V}^{\prime }}\rvert = 11$), and $\mathcal {G}^{\prime }_{\mathrm {G}}$ contains one goalkeeper.
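Problem (7) combines the three metrics linearly and adds two constraints. A minimal sketch is given below; the function names and numbers are hypothetical, requiring α and β to be individually non-negative is an assumption beyond the stated 0 ≤ α + β ≤ 1, and the raw metric values are combined as given because this section does not state whether they are normalized first.

```python
from typing import Sequence


def objective(phi: float, psi: float, upsilon: float,
              alpha: float, beta: float) -> float:
    """sigma(G') = alpha * Phi + beta * Psi + (1 - alpha - beta) * Upsilon."""
    if alpha < 0 or beta < 0 or alpha + beta > 1:
        raise ValueError("need alpha >= 0, beta >= 0 and alpha + beta <= 1")
    return alpha * phi + beta * psi + (1 - alpha - beta) * upsilon


def is_feasible(costs: Sequence[float], budget: float, team_size: int = 11) -> bool:
    """Constraints of (7): exactly `team_size` players within the budget."""
    return len(costs) == team_size and sum(costs) <= budget


# Hypothetical metric values and costs.
sigma = objective(phi=10.0, psi=0.5, upsilon=2.0, alpha=0.5, beta=0.25)
ok = is_feasible([5.0] * 11, budget=60.0)
too_expensive = is_feasible([6.0] * 11, budget=60.0)
```

Here σ = 0.5·10 + 0.25·0.5 + 0.25·2 = 5.625; eleven players at cost 5 fit a budget of 60, while eleven at cost 6 do not.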

4 Optimization method based on greedy algorithm

Given that finding the optimal subgraph for the objective function of problem (7) is NP-hard [11], we develop a greedy algorithm to solve the team composition problem. We consider a team with a 4-3-3 formation, which is widely used in international competition. This formation consists of one goalkeeper, four defenders, three midfielders and three forwards. We first leave out the goalkeeper and develop two algorithms to find the best players from Forward/Midfielder (i.e., $\mathcal {G}_{\mathrm {F}}$) and Backward (i.e., $\mathcal {G}_{\mathrm {B}}$), respectively. Next, we propose a pruning technique to organize the final football team.

We show the process to find the best Forward/Midfielder players in Algorithm 1. For brevity, we omit the pseudocode for finding the best Backward players because the two algorithms differ only in the input: the former selects players from $\mathcal {G}_{\mathrm {F}}$, while the latter chooses players from $\mathcal {G}_{\mathrm {B}}$. As shown in Algorithm 1, we start with an empty graph (line 1), which poses a difficulty to the direct application of the three network metrics; therefore, we need to choose the starting football player. In this paper, we consider a key player with a good trade-off between personal ability and connections to other players. Specifically, for each player, we first extract the subnetwork that consists of the player and the player’s neighbors (lines 2-4), and then determine the key player (denoted v^c) that maximizes both personal ability and network density (lines 5-6). The algorithm then proceeds through multiple iterations (lines 7-13). In each loop, the algorithm adds the most suitable player v^∗ in $\mathcal {G}_{\mathrm {F}}$, who maximizes the value of the objective function of problem (7) (lines 8-10). Note that we remove the player who is selected from $\mathcal {V}_{\mathrm {F}}$ at the end of each iteration, which avoids the same players being selected into the team (line 11). Finally, once the total number of players reaches the size requirement, the algorithm returns the final subgraph $\mathcal {G}^{\prime }_{\mathrm {F}}$ (line 14).
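The greedy procedure described above can be sketched as follows. This is not the paper's Algorithm 1 verbatim: the function names, the key-player score (an equal-weight sum of ability and ego-network density), the toy objective, and all numbers are assumptions made for illustration.

```python
from itertools import combinations


def density(team, w):
    """Mean weight of the edges present among `team`, as in Eq. (4)."""
    edges = [frozenset(p) for p in combinations(team, 2) if frozenset(p) in w]
    return sum(w[e] for e in edges) / len(edges) if edges else 0.0


def pick_key_player(candidates, ability, w, alpha=0.5):
    """Assumed score: alpha * ability + (1 - alpha) * density of the ego network."""
    def score(v):
        ego = [v] + [u for u in candidates if u != v and frozenset((u, v)) in w]
        return alpha * ability[v] + (1 - alpha) * density(ego, w)
    return max(candidates, key=score)


def greedy_select(candidates, size, ability, w, objective):
    """Start from the key player, then add the candidate that maximizes
    `objective` on the enlarged team until `size` players are chosen."""
    remaining = sorted(candidates)  # sorted for deterministic tie-breaking
    key = pick_key_player(remaining, ability, w)
    team = [key]
    remaining.remove(key)
    while len(team) < size and remaining:
        best = max(remaining, key=lambda v: objective(team + [v]))
        team.append(best)
        remaining.remove(best)  # a selected player cannot be chosen again
    return team


# Hypothetical abilities (already scaled to [0, 1]) and edge weights.
ability = {"A": 0.9, "B": 0.8, "C": 0.7, "D": 0.6}
w = {frozenset(("A", "C")): 1.0, frozenset(("A", "B")): 0.2,
     frozenset(("C", "D")): 0.5}


def toy_objective(team):
    """Toy stand-in for sigma: equal mix of summed ability and density."""
    return 0.5 * sum(ability[v] for v in team) + 0.5 * density(team, w)


team = greedy_select(["A", "B", "C", "D"], 3, ability, w, toy_objective)
```

In this toy run, A is chosen as the key player, C is added first because of its strong tie to A, and B is added last because its higher ability outweighs D's tie to C.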

The results from the algorithms above are used as inputs for the final team composition. Since we need to ensure that the total cost of a team does not exceed the budget, we add a pruning strategy to the greedy algorithm. We propose the idea of cost performance, denoted Cp, as a measurement to decide which player must be cut if the total cost exceeds the given budget. Specifically, for a football player P_n, the corresponding cost performance Cp is computed in (9).

$$ Cp(\mathrm{P}_{n}) = \frac{\phi_{\mathrm{P}_{n}}}{Cost(\mathrm{P}_{n})}. $$

(9)

We frame the new approach for solving the objective function of the TC-FPACN in problem (7) as the FBTP (Finding the Best Team with Pruning) algorithm presented in Algorithm 2. We first find the best goalkeeper (line 1); the best team under no budget constraint then consists of $\mathcal {G}^{\prime }_{\mathrm {F}}$, $\mathcal {G}^{\prime }_{\mathrm {B}}$ and the selected goalkeeper (line 2). The pruning operations are embedded in the greedy algorithm (lines 3-8). Specifically, we use a loop to check whether the total cost of the football team exceeds the budget. If the cost does not satisfy the budget requirement, we perform a pruning step that determines the football player v^cut with the lowest cost performance (line 4) and removes v^cut from the football team $\mathcal {G}^{\prime }$ (line 5). Next, we choose another suitable candidate according to the position of v^cut (lines 6-7) based on the greedy algorithm. For example, if the position of v^cut belongs to Forward/Midfielder, we execute the procedures in lines 8-11 of Algorithm 1 to select v^∗.
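The pruning loop can be illustrated with the short Python sketch below. It is a simplified reading of the description above, not the published Algorithm 2: the function names, the `pools` and `position` dictionaries, and the opaque `objective` callable are all hypothetical. As a further assumption made only to guarantee termination, a player who has been cut is never returned to the candidate pool, and the sketch raises an error when no replacement is left.

```python
from typing import Callable, Dict, List, Set


def cost_performance(phi: float, cost: float) -> float:
    """Cost performance Cp of (9): ability score divided by cost."""
    return phi / cost


def prune_to_budget(
    team: List[str],
    pools: Dict[str, Set[str]],
    position: Dict[str, str],
    phi: Dict[str, float],
    cost: Dict[str, float],
    objective: Callable[[List[str]], float],
    budget: float,
) -> List[str]:
    """Swap out low cost-performance players until the team fits the budget."""
    team = list(team)
    # Remaining candidates per position group, excluding current team members.
    remaining = {pos: set(players) - set(team) for pos, players in pools.items()}

    while sum(cost[p] for p in team) > budget:
        # Player with the lowest cost performance is cut.
        v_cut = min(sorted(team), key=lambda p: cost_performance(phi[p], cost[p]))
        team.remove(v_cut)

        pool = remaining[position[v_cut]]
        if not pool:
            raise ValueError("no replacement candidates left for this position")

        # Greedy replacement from the same position group.
        v_star = max(sorted(pool), key=lambda p: objective(team + [p]))
        pool.remove(v_star)  # ASSUMPTION: cut or used players never come back
        team.append(v_star)
    return team
```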

To better illustrate the workflow for constructing a football team based on the algorithms mentioned above, we provide an example in Fig. 3, which illustrates the process of finding five football players from Forward/Midfielder. We first focus on choosing players without the budget constraint (see the left-hand side of the figure). We start with the key player S and proceed to find the most suitable forward (or midfielder) in each iteration through Algorithm 1. For instance, in step 1, we choose the football player A that maximizes the objective function of problem (7). We return the final selection result (i.e., {S,A,B,C,E}) in step 4, because the required number of players has been reached. Since this selection ignores the budget, on the right-hand side of Fig. 3, armed with Algorithm 2, we proceed to conduct the pruning operation by removing the player with the lowest cost performance and then finding another football player, i.e., we remove C and add D. We output the candidate set {S,A,B,E,D} if its total cost is no larger than the budget; otherwise, the pruning and selection processes are repeated until the budget requirement is satisfied.

5 Empirical study: data analysis and team evaluation

Given the discussions in Section 1, it is difficult to form a series of football teams in the real world to evaluate the performance of the proposed model. Fortunately, football video games provide a convenient and quick way to assess the effectiveness of our model. In this paper, we implement and test our method on the two most popular game platforms (i.e., PES2018 and FIFA2022). Figure 4 shows screenshots of the two platforms; both are classical and full-fledged platforms that not only feature well-simulated real-life football players but also provide hours of entertainment in multiplayer mode, including simulating a football match. We conduct a series of experiments with the quick games of PES2018 and FIFA2022 on a Windows PC. All the code is implemented in Python, and the numerical computations are conducted on a server with a 12-core Intel(R) Xeon(R) CPU E5-2620 v3 @2.40 GHz and 16 GB memory. The source code of our method is publicly available at https://github.com/misterbobo/TCFPACN.

5.1 Data analysis

Since the values of many attributes of the team composition are calculated from game data, we first analyze the original data from PES and FIFA and preprocess the data^{Footnote 6}. In PES2018, we retrieve the data that contain 9,563 football players; we also collect FIFA2022 data, which includes data on 18,278 players from the official website^{Footnote 7}. Table 1 provides a brief overview of the two datasets, both of which list player IDs, positions, and names, as well as descriptions of each player’s skills, such as a player’s attacking prowess in Table 1a.

Table 1 The structure of the original data in two game platforms

As seen from Table 1, a player serves in a particular position in a football team. It is also clear that each position has different skill requirements. For example, in Table 1a, the skill of attacking prowess is crucial for a Forward player, while it has no relevance for a goalkeeper. Table 2 shows the assessments of 23 skills for some well-known players in PES2018. The numerical values reflect each player’s performance on each skill. As seen from Table 2, it is necessary to link the skills to distinct positions.

Table 2 A sample of four players’ assessment by 23 criteria in PES2018

For each dataset, we first divide the raw data into three groups (i.e., Forward/Midfielder, Backward, and Goalkeeper) according to each player’s position on the pitch. For each group, we rank the skills based on the average values and select the top-10 skill values presented in Table 3. Notably, we ignore the criteria for goalkeepers in Table 3 because both datasets have only a few skills that are relevant to goalkeepers; hence, we include all of them. The weight of each skill is assigned following the principle mentioned in [3], as provided in the last column of Table 3.
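As a rough illustration of the skill-ranking step, the sketch below averages each skill over the players of one position group and keeps the k highest-ranked skills. The function name `top_k_skills` and the data layout (one dictionary of skill values per player, with every player carrying the same skills) are assumptions made for this example; the weighting principle of [3] is not reproduced here.

```python
from typing import Dict, List


def top_k_skills(players: List[Dict[str, float]], k: int = 10) -> List[str]:
    """Rank skills by their average value over a position group; keep the top k."""
    if not players:
        return []
    totals: Dict[str, float] = {}
    for skills in players:
        for name, value in skills.items():
            totals[name] = totals.get(name, 0.0) + value
    means = {name: total / len(players) for name, total in totals.items()}
    # Highest average first; ties are broken alphabetically for determinism.
    return sorted(means, key=lambda name: (-means[name], name))[:k]
```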

Table 3 The criteria for forward/midfielder and backward

The main goal of this paper is to form a cohesive team with a budget constraint (see problem (7)). Therefore, it is necessary to know the salary for each football player. However, there are many football players with missing salaries in both PES and FIFA datasets. It is known that a player’s cost is positively correlated with his rating, which is a good indicator. Here, given a football player P_n, we use the fitting function mentioned in [5] to evaluate his cost as follows, which can be used to formulate the total team cost.

$$ Cost(\mathrm{P}_{n}) = \eta e^{\theta r(\mathrm{P}_{n})}, $$

where η = 6.375 × 10^{−4} and θ = 0.1029. In addition, r(P_n) denotes the rating (or overall) of P_n (see Table 1).
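The fitted cost function is straightforward to evaluate. The following sketch uses the constants η and θ given above; the function names `player_cost` and `team_cost` are chosen here for illustration only.

```python
import math
from typing import Iterable

ETA = 6.375e-4   # η from the fitted cost function
THETA = 0.1029   # θ from the fitted cost function


def player_cost(rating: float) -> float:
    """Estimated cost of a player with the given rating: η · exp(θ · r)."""
    return ETA * math.exp(THETA * rating)


def team_cost(ratings: Iterable[float]) -> float:
    """Total estimated cost of a team, given the ratings of its players."""
    return sum(player_cost(r) for r in ratings)
```

Because the cost grows exponentially in the rating, a few highly rated players can consume most of a budget, which is why the cost performance Cp matters when pruning.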

5.2 Performance metrics and parameter settings

To understand game results intuitively, we use goal difference (GD) and team points (Tps), which are the general rules in international competitions, as the metrics to evaluate team performance. Specifically, given a competition set Δ = {Δ₁,Δ₂,⋯ ,Δ_Z}, where Δ_z (1≤z≤Z) represents a match and Z is the total number of matches, the value of GD for a football team is calculated as the number of goals scored in all matches minus the number of goals conceded, which gives

$$ \text{GD} = \sum\limits_{z=1}^{Z} \left( \delta_{s}({\Delta}_{z}) - \delta_{c}({\Delta}_{z}) \right), $$

where δ_s(Δ_z) and δ_c(Δ_z) are the number of goals scored and conceded in one match, respectively. Tps denotes the total match scores of a team, as shown in (10).

$$ \text{Tps} = \sum\limits_{z=1}^{Z} \text{Tp}({\Delta}_{z}), $$

(10)

where Tp(Δ_z) is a team point for one match, which gives

$$ \text{Tp}({\Delta}_z) = \left\{ \begin{array}{rl} 3, & \quad \text{if win} \\ 0, & \quad \text{if draw} \\ -1, & \quad \text{if lose} \end{array} \right.. $$

(11)
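Both metrics can be computed directly from a list of scorelines. In the sketch below, each match is a pair (goals scored, goals conceded); the per-match point values default to those printed in (11) and are exposed as parameters so that another scoring rule can be substituted. The function names are illustrative and not taken from the paper's code.

```python
from typing import Iterable, Tuple

Match = Tuple[int, int]  # (goals scored, goals conceded)


def goal_difference(matches: Iterable[Match]) -> int:
    """GD: goals scored minus goals conceded, summed over all matches."""
    return sum(scored - conceded for scored, conceded in matches)


def team_points(matches: Iterable[Match], win: int = 3, draw: int = 0, lose: int = -1) -> int:
    """Tps of (10), with per-match points Tp defaulting to the values in (11)."""
    total = 0
    for scored, conceded in matches:
        if scored > conceded:
            total += win
        elif scored == conceded:
            total += draw
        else:
            total += lose
    return total
```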

Unless stated otherwise, we set the number of matches Z = 30 and set Bu = 100 to simulate the unconstrained budget case. In addition, for PES2018, we use the FBTP algorithm with the settings α = 0.6 and β = 0.2 in the TC-FPACN to solve the optimization problem (denoted TC-FPACN+FBTP); similarly, we set α = 0.4 and β = 0.4 for FIFA2022. We further present a sensitivity analysis of parameters α and β based on our new evaluation strategy in Section 5.5.

5.3 Simulation results

As the team budget has a large impact on team composition, we investigate the capability of the TC-FPACN+FBTP to deal with different team composition scenarios (i.e., with or without the budget constraint).

5.3.1 Team performance without a budget constraint

In this subsection, we conduct experiments to show the effectiveness of the team generated by the TC-FPACN+FBTP when the budget constraint is ignored. We show our team formation results in Table 4. Based on the recommended players, we compose our DREAM TEAM in PES2018 and FIFA2022, denoted DT-PES and DT-FIFA, respectively (see the left-hand side of Fig. 5a and 5b). To conduct a performance comparison and ensure the fairness of competitions, we select a team in PES2018 with a cost approximately equal to DT-PES, namely, MD WHITE^{Footnote 8} (the right-hand side of Fig. 5a), which is one of the most competitive teams in the game. In FIFA2022, we choose MANCHESTER UNITED^{Footnote 9} as the competitor (pictured on the right in Fig. 5b), which not only has a similar cost to DT-FIFA but also has the leading record in its football league.

Table 4 Selected football players in PES2018 and FIFA2022 with Bu = 100

Table 5 shows the match results on the two game platforms, including the scoreline of each match, the total cost, Tps, and GD. A close inspection of the match results in the table shows that DT-PES wins more matches than MD WHITE in PES2018, and DT-FIFA achieves better performance than MANCHESTER UNITED in FIFA2022. Moreover, the cost of our team is slightly lower than that of MD WHITE (or MANCHESTER UNITED). On both platforms, our team dominates the 30-match series, which highlights the effectiveness of the proposed model.

Table 5 Match results for DT-PES vs. MD WHITE and DT-FIFA vs. MANCHESTER UNITED without a budget constraint

To demonstrate the strength of our team, we simulate matches in which random teams battle with MD WHITE and MANCHESTER UNITED, respectively. There are two ways to generate a random team. Given the total cost of MD WHITE (or MANCHESTER UNITED) as the budget constraint, one way is to pick a player for each position randomly based on the average budget, while the other way is first to pick a few players that consume most of the budget and then select other players based on the remaining budget. We name the resulting teams RAND 1 and RAND 2, respectively in PES2018, and RAND 3 and RAND 4, respectively in FIFA2022. The simulated results are shown in Table 6. From the perspective of Tps and GD, we find that our teams perform better than all the random teams when competing against MD WHITE in PES2018 and MANCHESTER UNITED in FIFA2022.

Table 6 Match results for random teams against MD WHITE in PES2018 and MANCHESTER UNITED in FIFA2022

5.3.2 Team performance considering different budget constraints

Football player recruitment is commonly constrained by a limited budget. In this subsection, we discuss the performance of the TC-FPACN+FBTP under varying budget constraints. In PES2018, since MD WHITE is one of the best teams with the highest cost burden, we use its cost as the budget limit (denoted as Bu_hi), and let the budget vary from Bu_lo to Bu_hi, where Bu_lo = 10 and Bu_hi = 60. Similarly, in FIFA2022, we set Bu_hi = 40, which is close to the cost of MANCHESTER UNITED, and Bu_lo = 0. We define the budget levels in Table 7.

Table 7 The budget levels and the corresponding range of values

Since Table 5 shows the outcomes of the simulated matches against MD WHITE and MANCHESTER UNITED, both of which have a cost of Level V, we select only four typical teams on each game platform whose costs fall within Level I to Level IV. Specifically, in PES2018, we choose AS RED WHITE, VALENCIA, LONDON FC, and PM BLACK WHITE; in FIFA2022, the four teams are CD TONDELA, FC NANTES, REAL SOCIEDAD, and AC MILAN. For each competitor, we use the corresponding budget level as the constraint to select football players who constitute the DREAM TEAM based on the TC-FPACN+FBTP. We show the match results in Table 8. As shown in the table, all eight teams generated by the TC-FPACN+FBTP outperform their competitors in terms of Tps. In addition, except for a two-goal deficit in the 30-game series against REAL SOCIEDAD in FIFA2022, the teams formed with our method also win their series with a superior goal difference. The match results suggest that the proposed method can assemble a team that wins nearly all the competitions given a budget level.

Table 8 Match results under different budget constraints

5.4 Method comparisons

In this subsection, we compare the TC-FPACN+FBTP with other approaches from two aspects. We first compare the TC-FPACN+FBTP with another football team composition method, namely, CEFG (Cost-Effective Forward selection Greedy) [5]. Next, we discuss the performance of a search strategy based on the random walk algorithm (RW) [41], which is widely used in many areas (e.g., recommender systems [42, 43], community detection [44, 45], and sampling algorithms [46]), for solving the constrained optimization problem (7); we denote this variant TC-FPACN+RW.

5.4.1 Comparison with the CEFG method

We first compare the team composition quality of our method with the CEFG. We again use the PES2018 and FIFA2022 game platforms and focus on the Tps and GD of the two methods for different budget levels. For a given budget constraint, we first generate two football teams on a platform with the TC-FPACN+FBTP and CEFG and then simulate 30 matches between the two teams. Figure 6 compares the simulation results, from which we can conclude that the team generated by the TC-FPACN+FBTP dominates the play on the football pitch. In addition, the data in all four subfigures show that the Tps (or GD) first increases and then decreases as the budget level increases, reaching a peak at Level II in PES2018 and Level III in FIFA2022. Interestingly, a closer observation reveals that the cost of our team at Level II in PES2018 is approximately equivalent to that at Level III in FIFA2022. A possible reason for the disappointing performance of CEFG is that the team recommended by the CEFG tends to contain a few superstars, and the remaining players may lack competitiveness, especially at a small budget level, thereby leading to poor match results. In contrast, the TC-FPACN+FBTP is more effective at building a cohesive team that balances the ability in each position and facilitates collaboration among players. Thus, the results suggest that the proposed method generates reliable and promising performance and is not constrained by the choice of game platform.

5.4.2 Comparison with the random walk strategy

In this subsection, we compare the performance of the TC-FPACN+FBTP with the TC-FPACN+RW. Let $\mathcal {G}(\mathcal {V}, \mathcal {E})$ be the attributed collaboration network of football players. The TC-FPACN+RW begins at a randomly chosen node v_i, and at each step, it moves to another node v_j with a probability proportional to the weight of edge (i,j). We define the probability (or weight) of moving from v_i to v_j based on the objective function value σ of problem (7) obtained when v_j is included, which means that a higher value of σ results in a greater probability of choosing node v_j. The searching process stops once the required number of football players is reached, and all the nodes selected in this way form the final football team.
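A weighted random walk of this kind can be sketched with the standard library alone. The code below is an illustrative reading of the description, not the authors' implementation: `random_walk_team` is a hypothetical name, the objective σ is an opaque callable assumed to return non-negative values, and two behaviors are assumptions of this sketch because the text does not specify them: already selected players are skipped, and the walk restarts from any unselected node when the current node has no unselected neighbor.

```python
import random
from typing import Callable, Dict, List, Set


def random_walk_team(
    neighbors: Dict[str, Set[str]],
    objective: Callable[[List[str]], float],
    team_size: int,
    rng: random.Random,
) -> List[str]:
    """Build a team by a random walk whose step probabilities follow σ."""
    nodes = sorted(neighbors)
    if not 1 <= team_size <= len(nodes):
        raise ValueError("team_size must be between 1 and the number of nodes")

    current = rng.choice(nodes)  # random starting node v_i
    team = [current]
    while len(team) < team_size:
        options = sorted(set(neighbors[current]) - set(team))
        if not options:
            # ASSUMPTION: dead end, so jump to any unselected node.
            options = [v for v in nodes if v not in team]
        # Weight of moving to v_j: objective value of the team including v_j.
        weights = [objective(team + [v]) for v in options]
        if sum(weights) > 0:
            current = rng.choices(options, weights=weights, k=1)[0]
        else:
            current = rng.choice(options)  # all weights zero: uniform fallback
        team.append(current)
    return team
```

Passing a seeded `random.Random` instance makes a run reproducible, which is convenient when comparing the walk against the greedy search.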

Similar to the process of the simulation match mentioned in Section 5.4.1, we use the team formed by the TC-FPACN+FBTP to compete against the team set up by the TC-FPACN+RW in PES2018 and FIFA2022. The simulation results are compared in Fig. 7. Figure 7a and b show that the team generated by the proposed method wins all matches against the team produced by the TC-FPACN+RW on the PES2018 platform. In addition, Fig. 7c and d show that the team formed by the TC-FPACN+FBTP also shows enough dominance to win matches under four budget constraints (i.e., from Level II to Level V). A possible explanation for the results might be that the RW strategy focuses only on neighbors of the current node in the players’ network in each searching step, so the search is easily trapped in a local optimum, thereby compromising the discovery of the most suitable players. Note that at Level I, the values of Tps and GD of the team built via our method are smaller than those of the team produced by the TC-FPACN+RW (see the rightmost bars in Fig. 7c and d), which means our team lost most of the matches. This result is likely related to the very low budget, which is insufficient to recruit even one competitive football player. Nevertheless, the overall results show the effectiveness of the proposed FBTP searching algorithm.

5.5 Sensitivity analysis of the parameters

In this subsection, we discuss the parameter sensitivity of the TC-FPACN model, which includes α and β, under no budget constraint. We again use the FBTP algorithm to choose football players. Since it is time-consuming to simulate all matches for different parameter settings, we seek another indicator to evaluate team performance efficiently instead of using Tps and GD. In PES2018, we observe that there is an eye-catching number, namely, Team Spirit^{Footnote 10} (TS), when we complete the configuration of a football team (e.g., the upper right corner of the left-hand side in Fig. 5a). In fact, TS indicates how good the relationship is on the pitch, and a high TS value could occur in a player who has an affinity for the manager’s team instructions, which naturally leads to better teamwork. In FIFA2022, due to the lack of a similar concept to TS, we use the overall rating (OR), which is calculated by first summing the ratings of all football players on a team and then computing the average (e.g., see the player’s rating on the left-hand side of Fig. 5b). We assume that a higher value of OR indicates better team performance.

Armed with TS and OR, as well as the total team cost, we set up the tests to loop through all values of α and β, with an increment of 0.1 for both α and β in each iteration. If we select a smaller increment, the evaluation becomes more labor-intensive, and the recommended players do not change much. Figure 8 exhibits the results when tuning α and β. If α = 0 and β = 0, the objective function of problem (7) maximizes only the heterogeneity&homogeneity of a team, which results in both poor TS and OR values. Similarly, if α = 1 and β = 0, the function considers only the network ability, which not only leads to a degradation in the TS or OR value but also increases the cost burden. Additionally, there is a slight increase in the values of TS and OR when increasing β. This observation suggests that the network density is an important factor that benefits the team spirit, and it also demonstrates that football is a team sport. Given the results in Fig. 8a and b, we choose the parameter settings α = 0.6 and β = 0.2 for PES2018 because they achieve the highest team spirit value while incurring a relatively low cost. For FIFA2022, Fig. 8c and d show that at the grid point (α,β) = (0.4,0.4), we obtain a good balance of a relatively high OR value and a low total cost; hence, we use this pair of parameters as the tuning result.
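The parameter sweep can be sketched as a simple grid enumeration. The code below assumes that the three metric weights are α, β and 1 − α − β, so that only grid points with α + β ≤ 1 are valid; this constraint is inferred from the boundary cases discussed above rather than stated explicitly. The names `parameter_grid` and `pick_best`, and the rule of preferring the highest indicator value and then the lowest cost, are likewise assumptions of this example; the paper selects the final settings by inspecting Fig. 8.

```python
from typing import Callable, List, Tuple


def parameter_grid(steps: int = 10) -> List[Tuple[float, float]]:
    """All (α, β) pairs on a grid with increment 1/steps and α + β ≤ 1."""
    return [
        (i / steps, j / steps)
        for i in range(steps + 1)
        for j in range(steps + 1 - i)
    ]


def pick_best(
    grid: List[Tuple[float, float]],
    evaluate: Callable[[float, float], Tuple[float, float]],
) -> Tuple[float, float]:
    """Return the (α, β) with the highest indicator; ties go to the lower cost.

    evaluate(α, β) is a caller-supplied function returning (indicator, cost),
    where the indicator plays the role of TS or OR.
    """
    results = {ab: evaluate(*ab) for ab in grid}
    return max(grid, key=lambda ab: (results[ab][0], -results[ab][1]))
```

With the default increment of 0.1, the grid contains 66 valid pairs, which is small enough to evaluate each setting once.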

6 Conclusions

In this paper, we study the problem of optimizing football team composition in the context of the attributed collaboration network of football players. Since the team’s success requires full cooperation between football players, we propose a team scoring function that considers three network metrics, namely, network ability, network density, and network heterogeneity&homogeneity. We then convert the constrained team composition task into the problem of finding an optimal subgraph in the attributed collaboration network. To tackle this problem, we present a novel approach that searches a subgraph by using a greedy algorithm with pruning techniques. We conduct an empirical study of the proposed techniques on two simulated game platforms (PES2018 and FIFA2022). The experimental results show that our method can build a competitive team.

Despite achieving good performance, we have barely scratched the surface of football players’ cooperation mechanisms. In particular, the search strategy tends to be trapped in a local optimum in our study. Further work needs to be conducted to investigate sophisticated social factors and delve into how they interact, as well as to explore other search optimization algorithms based on a given budget constraint. In addition, although our new approach aims to determine a football team composition, the investigation of the cooperation factors in this paper can be generalized to solve the team cohesion problem. We will investigate such a generalization and its applications in other problem domains.

Notes

1. https://www.konami.com/
2. https://www.ea.com/games/fifa
3. https://www.footballmanager.com/
4. We list the abbreviations of major terms throughout this paper in Table 9, Appendix A to ease reading.
5. In the context of the attributed collaboration network of football players, if not otherwise specified, we use v_n or P_n interchangeably to represent the same football player.
6. The two datasets we use are publicly available at https://github.com/misterbobo/TCFPACN/tree/main/Data.
7. https://sofifa.com/
8. https://www.realmadrid.com/
9. https://www.manutd.com/
10. https://www.konami.com/wepes/2018/manual/ps4/en-us/myclub.html

References

Rajesh P, Alam M, Tahernezhadi M et al (2020) A data science approach to football team player selection. In: 2020 IEEE international conference on electro information technology (EIT), pp 175–183. https://doi.org/10.1109/EIT48999.2020.9208331
Salles SAF, Hora HRMd, Erthal M, Santos ACdSGd, Shimoya A (2019) Operations research contributions for football teams formation: a systematic review. Pesqui Oper 39:277–293. https://doi.org/10.1590/0101-7438.2019.039.02.0277
Ozceylan E (2016) A mathematical model using ahp priorities for soccer player selection: a case study. South Afr J Ind Eng 27(2):190–205. https://doi.org/10.7166/27-2-1265
Abidin D (2021) A case study on player selection and team formation in football with machine learning. Turk J Electr Eng Comput Sci 29(3):1672–1691. https://doi.org/10.3906/elk-2005-27
Zeng Y, Shen G, Chen B, Tang J (2019) Team composition in pes2018 using submodular function optimization. IEEE Access. https://doi.org/10.1109/ACCESS.2019.2919447
Payyappalli VM, Zhuang J (2019) A data-driven integer programming model for soccer clubs’ decision making on player transfers. Environ Syst Decis 39(4):466–481. https://doi.org/10.1007/s10669-019-09721-7
Wang W, Liu J, Tang T, Tuarob S, Xia F, Gong Z, King I (2020) Attributed collaboration network embedding for academic relationship mining. ACM Trans Web (TWEB) 15 (1):1–20. https://doi.org/10.1145/3409736
Gelade GA (2018) The influence of team composition on attacking and defending in football. J Sport Econ 19(8):1174–1190. https://doi.org/10.1177/1527002517716974
Zepp C, Kleinert J (2015) Homogeneity of prototypical attributes in soccer teams. Sage Open 5(3):1–10. https://doi.org/10.1177/2158244015602517
Ingersoll K, Malesky E, Saiegh SM (2017) Heterogeneity and team performance: evaluating the effect of cultural diversity in the world’s top soccer league. J Sport Anal 3(2):67–92. https://doi.org/10.3233/JSA-170052
Khuller S, Saha B (2009) On finding dense subgraphs. In: International colloquium on automata, languages, and programming, pp 597–608. https://doi.org/10.1007/978-3-642-02927-1_50
Damian P, Cristian P, Dragoş FT (2021) Considerations regarding the selection in the football game. In: Ovidius University annals, series physical education & sport/science, movement & health, p 21
Inan T, Cavas L (2021) Estimation of market values of football players through artificial neural network: a model study from the turkish super league. Appl Artif Intell 35(13):1022–1042. https://doi.org/10.1080/08839514.2021.1966884
Anamisa D, Kustiyahningsih Y, Yusuf M, Rochman E, Putro S, Syakur M, Bakti A (2021) A selection system for the position ideal of football players based on the ahp and topsis methods. In: IOP conference series: materials science and engineering, vol 1125. p 012044. https://doi.org/10.1088/1757-899X/1125/1/012044
Nasiri MM, Ranjbar M, Tavana M, Santos Arteaga FJ, Yazdanparast R (2019) A novel hybrid method for selecting soccer players during the transfer season. Expert Syst 36(1):12342. https://doi.org/10.1111/exsy.12342
Singh P, Lamba PS (2019) Influence of crowdsourcing, popularity and previous year statistics in market value estimation of football players. J Discret Math Sci Cryptogr 22(2):113–126. https://doi.org/10.1080/09720529.2019.1576333
Arrul VS, Subramanian P, Mafas R (2022) Predicting the football players’ market value using neural network model: a data-driven approach. In: 2022 IEEE international conference on distributed computing and electrical circuits and electronics (ICDCECE), pp 1–7. https://doi.org/10.1109/ICDCECE53908.2022.9792681
Vroonen R, Decroos T, Van Haaren J, Davis J (2017) Predicting the potential of professional soccer players. In: Proceedings of the 4th workshop on machine learning and data mining for sports analytics, vol 1971. pp 1–10
Zhao H, Chen H, Yu S, Chen B (2021) Multi-objective optimization for football team member selection. IEEE Access. https://doi.org/10.1109/ACCESS.2021.3091185
Dadelo S, Turskis Z, Zavadskas EK, Dadeliene R (2014) Multi-criteria assessment and ranking system of sport team formation based on objective-measured values of criteria set. Expert Syst Appl 41(14):6106–6113. https://doi.org/10.1016/j.eswa.2014.03.036
Syaifudin YW, Puspitaningayu P (2021) Predicting winner of football match using analytical hierarchy process: an analysis based on previous matches data. In: 2021 international conference on data analytics for business and industry (ICDABI), pp 47–52. https://doi.org/10.1109/ICDABI53623.2021.9655836
Wang J (2021) A novel rugby team player selection method integrating the topsis and ipa methods. Int J Sport Psychol 52:137–158. https://doi.org/10.7352/IJSP.2021.52.137
Baharin NH, Rashidi NF, Mahad NF (2021) Manager selection using fuzzy topsis method. In: Journal of physics: conference series, vol 1988. p 012057. https://doi.org/10.1088/1742-6596/1988/1/012057
Sałabun W, Shekhovtsov A, Pamučar D, Wątróbski J, Kizielewicz B, Więckowski J, Bozanić D, Urbaniak K, Nyczaj B (2020) A fuzzy inference system for players evaluation in multi-player sports: the football study case. Symmetry 12(12):2029. https://doi.org/10.3390/sym12122029
Liu W, Xie X, Ma S, Wang Y (2020) An improved evaluation method for soccer player performance using affective computing. In: 2020 3rd international conference on artificial intelligence and big data (ICAIBD), pp 324–329. https://doi.org/10.1109/ICAIBD49809.2020.9137435
Pantzalis VC, Tjortjis C (2020) Sports analytics for football league table and player performance prediction. In: 2020 11th international conference on information, intelligence, systems and applications (IISA), pp 1–8. https://doi.org/10.1109/IISA50023.2020.9284352
Ghasemian F, Zamanifar K, Ghasem-Aghaee N (2018) An evolutionary non-linear ranking algorithm for ranking scientific collaborations. Appl Intell 48(2):465–481. https://doi.org/10.1007/s10489-017-0990-4
Lappas T, Liu K, Terzi E (2009) Finding a team of experts in social networks. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining, pp 467–476. https://doi.org/10.1145/1557019.1557074
Li C-T, Shan M-K (2010) Team formation for generalized tasks in expertise social networks. In: 2010 IEEE second international conference on social computing, pp 9–16. https://doi.org/10.1109/SocialCom.2010.12
Juárez J, Brizuela CA (2018) A multi-objective formulation of the team formation problem in social networks: preliminary results. In: Proceedings of the genetic and evolutionary computation conference, pp 261–268. https://doi.org/10.1145/3205455.3205634
Selvarajah K, Zadeh PM, Kobti Z, Palanichamy Y, Kargar M (2021) A unified framework for effective team formation in social networks. Expert Syst Appl 177:114886. https://doi.org/10.1016/j.eswa.2021.114886
Datta A, Tan Teck Yong J, Ventresque A (2011) T-recs: team recommendation system through expertise and cohesiveness. In: Proceedings of the 20th international conference companion on world wide web, pp 201–204. https://doi.org/10.1145/1963192.1963289
Awal GK, Bharadwaj KK (2014) Team formation in social networks based on collective intelligence–an evolutionary approach. Appl Intell 41(2):627–648. https://doi.org/10.1007/s10489-014-0528-y
Sapienza A, Goyal P, Ferrara E (2019) Deep neural networks for optimal team composition. Front Big Data 2:14. https://doi.org/10.3389/fdata.2019.00014
Hamidi Rad R, Fani H, Kargar M, Szlichta J, Bagheri E (2020) Learning to form skill-based teams of experts. In: Proceedings of the 29th ACM international conference on information & knowledge management, pp 2049–2052. https://doi.org/10.1145/3340531.3412140
Hamidi Rad R, Bagheri E, Kargar M, Srivastava D, Szlichta J (2021) Retrieving skill-based teams from collaboration networks. In: Proceedings of the 44th international ACM SIGIR conference on research and development in information retrieval, pp 2015–2019. https://doi.org/10.1145/3404835.3463105
Gini C (1921) Measurement of inequality of incomes. Econ J 31(121):124–126. https://doi.org/10.2307/2223319
Alvaredo F (2011) A note on the relationship between top income shares and the gini coefficient. Econ Lett 110(3):0–277. https://doi.org/10.1016/j.econlet.2010.10.008
Deltas G (2003) The small-sample bias of the gini coefficient: results and implications for empirical research. Rev Econ Stat 85(1):226–234. https://doi.org/10.1162/rest.2003.85.1.226
Sueyoshi T, Qu J, Li A, Liu X (2021) A new approach for evaluating technology inequality and diffusion barriers: the concept of efficiency gini coefficient and its application in chinese provinces. Energy 235:121256. https://doi.org/10.1016/j.energy.2021.121256
Lovász L (1993) Random walks on graphs. Combinatorics 2(1-46):4. https://doi.org/10.1007/BFb0077189
Feng S, Zhang H, Cao J, Yao Y (2019) Merging user social network into the random walk model for better group recommendation. Appl Intell 49(6):2046–2058. https://doi.org/10.1007/s10489-018-1375-z
Pradhan T, Pal S (2020) A multi-level fusion based decision support system for academic collaborator recommendation. Knowl-Based Syst 197:105784. https://doi.org/10.1016/j.knosys.2020.105784
Bahadori S, Moradi P, Zare H (2021) An improved limited random walk approach for identification of overlapping communities in complex networks. Appl Intell 51(6):3561–3580. https://doi.org/10.1007/s10489-020-01999-4
Guo K, Wang Q, Lin J, Wu L, Guo W, Chao K-M (2022) Network representation learning based on community-aware and adaptive random walk for overlapping community detection. Appl Intell 52:9919–9937. https://doi.org/10.1007/s10489-021-02999-8
Shao Y, Huang S, Miao X, Cui B, Chen L (2020) Memory-aware framework for efficient second-order random walk on large graphs. In: Proceedings of the 2020 ACM SIGMOD international conference on management of data, pp 1797–1812. https://doi.org/10.1145/3318464.3380562

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61836005 and 62176225 and the Youth Innovation Fund of Xiamen under Grant No. 3502Z20206049.

Author information

Authors and Affiliations

Department of Automation, Xiamen University, Xiamen, 361005, China
Shenbao Yu & Bilian Chen
Xiamen Key Laboratory of Big Data Intelligent Analysis and Decision-making, Xiamen, 361005, China
Shenbao Yu & Bilian Chen
Department of Computer and Information Sciences, Northumbria University, Newcastle, NE1 8ST, UK
Yifeng Zeng
College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China
Yinghui Pan

Corresponding authors

Correspondence to Yifeng Zeng or Yinghui Pan.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A

Table 9 Summary of abbreviations

Table 10 Notations adapted in the paper

Table 11 The most common positions used in association football

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Yu, S., Zeng, Y., Pan, Y. et al. Discovering a cohesive football team through players’ attributed collaboration networks. Appl Intell 53, 13506–13526 (2023). https://doi.org/10.1007/s10489-022-04199-4

Accepted: 21 September 2022
Published: 12 October 2022
Issue Date: June 2023
DOI: https://doi.org/10.1007/s10489-022-04199-4
