1 Introduction

Many complex problems in biology [1], politics [2], mathematics [3], physics [4], economics [5], computer science [6] and other fields can be represented as networks. Mathematically, a network is defined by the vertices and edges of a graph. The vertices represent the objects of the network, and an edge between two vertices indicates a relationship between them. Community structure is an important feature of complex networks. Using the community structure, a network can be divided according to the similar or different features of its vertices [7, 8]. Therefore, community detection (CD) plays an important role in obtaining meaningful outcomes in networks such as political election networks, biological networks, social networks and technological networks [7, 9]. Every network probably has at least a minimal community structure. For example, when protein–protein interaction networks are examined, proteins with the same specific structure within the cell are expected to be members of the same sub-community [10,11,12]. CD has also been applied to many other problems, such as product recommendation networks [13, 14], world wide web networks [6, 15], refactoring software packages [16], metabolic networks [17, 18], skill acquisition in robots [19], social networks [20,21,22,23], cancer detection networks [24], epidemic spreading on networks [25, 26], dimensionality reduction in pattern recognition [27] and the link prediction problem [28].

The interactions and activities of individuals in a society create a social network. In recent years, most people have come to use the internet and internet-based applications such as Facebook, Twitter and Instagram. These applications create many different real-life social networks, whose characteristics and patterns can be analyzed for numerous beneficial purposes. The use of social media on the internet enables people to expand their social networks in unpredictable ways. Thus, the interactions of people on social media are increasing, and a large network is easily formed [20]. As a result, the importance of community detection in social networks has increased tremendously [20, 29].

A social network can be said to exhibit a community structure if its nodes can be partitioned into disjoint or overlapping clusters such that the number of edges within a cluster exceeds the number of edges between any two clusters by a reasonable amount. Generally, if a network shows a community structure, it also shows a hierarchical community structure [20, 30]. The process of finding the sub-groups in a network according to similar or dissimilar patterns and properties is called community detection [31, 32]. CD provides considerable benefits, such as making it easier to analyze the whole network and to find malicious activity on it. If community detection is not used, each vertex of the network needs to be analyzed separately to understand the network.

With CD, there is no need to analyze each vertex of the network, and a general map of the network can be formed according to the similar or dissimilar characteristics of the vertices. Thus, the tendency of the network can be easily understood [32, 33]. Networks considered in CD have one of two topologies: static or dynamic. The structure of a static network is already known, so it is easier to find the community structure of a static network than that of a dynamic network, because a dynamic network can be updated during the process [33]. There are many CD methods for static networks [3, 34,35,36,37]. However, when real-life networks are examined, it can be seen that they are often dynamic [38, 39].

The community detection (CD) problem can generally be defined as a discrete optimization problem and solved using optimization algorithms [7]. Therefore, many metaheuristic methods have been proposed and applied to CD problems in recent decades, such as a swarm intelligence-based hybrid approach for identifying network modules [40], a new multi-objective evolutionary framework for community mining in dynamic social networks [41], a multi-objective optimization of CD using discrete teaching–learning-based optimization [42], improving the performance of evolutionary multi-objective co-clustering models for CD in complex social networks [43], and multi-objective local search for CD in networks [44]. Because optimization-based solution approaches (metaheuristic algorithms) do not depend on a particular problem, they can be easily applied to a wide range of optimization problems [45]. Metaheuristic methods try to find the optimal solution of an optimization problem, but they give no guarantee of finding it. Instead, their goal is to find an optimal or a feasible result in a reasonable time according to a pre-defined objective function [46, 47]. Networks or graphs are made up of vertices and the edges connecting them. Therefore, the complexity of a network or graph directly depends on the number of vertices and edges. The number of edges and vertices in a network is expressed as decision variables for optimization problems. As the number of decision variables increases, the number of potential solutions increases exponentially, and finding the optimal solution becomes very complex for multidimensional problems [48].

1.1 Main contribution and motivation of this study

The community detection (CD) problem is classified as an NP-hard problem [33, 49]. Therefore, classical methods do not perform acceptably for problems such as CD in networks [50]. Recently, many metaheuristic algorithms have been proposed and implemented for different kinds of optimization problems, such as continuous, discrete and binary optimization problems [45]. In this study, a modified discrete version of the Coot bird natural life model (COOT) optimization algorithm is proposed to solve the CD problem in social networks. COOT is a nature-inspired optimization algorithm recently proposed by Naruei and Keynia [51] for solving continuous optimization problems. It is based on two different behaviors of coot birds on the water surface. In the first phase, the movements of the birds are irregular; in the second phase, the movements become regular [51]. Because the CD problem is a discrete problem, a modified discrete version of the COOT algorithm (MCOOT for short) is proposed and applied to the CD problem in social networks. The main contributions and motivations of this study are as follows:

  • A new mechanism is integrated into the basic COOT method. This new mechanism is based on three different procedures:

    (1) some dimensions of any coot individual are randomly selected for the update process,

    (2) a new position update rule is used,

    (3) and a genetic mutation operator is integrated into the new mechanism to avoid local optimal solutions.

  • MCOOT method is applied to ten different small and large social networks. To demonstrate the solution quality and robustness of the proposed method, it is compared with state-of-the-art optimization methods.

  • A time analysis of compared algorithms is given in experiments.

The outline of the study is presented as follows: A literature review of CD problem, and some recent studies based on COOT method are presented in Sect. 2. The basic COOT optimization algorithm is detailed in Sect. 3. In Sect. 4, the proposed modified discrete Coot bird optimization algorithm (for short MCOOT) is explained. The CD problem and modularity problem are explained in Sect. 5. The experimental results are given in Sect. 6, and the conclusions and future works are presented in Sect. 7.

2 Literature review

Community detection (CD) refers to identifying vertices with communication in a complex network. CD is very important for finding the functional and structural features of a network [23]. With CD, a network can be easily evaluated and some significant outputs can be obtained. Therefore, many different single-objective or multi-objective metaheuristic methods have been proposed in the literature to solve CD problems in networks [7]. In this section, some optimization algorithms for solving the CD problem in networks are first briefly described. Then, some studies on the COOT algorithm are briefly reviewed.

Recently, optimization-based methods have been widely used by researchers for solving CD problems. Pizzuti [22] has proposed a genetic algorithm (GA)-based method named GA-net to identify the communities in social networks. Pizzuti has applied GA-net to synthetic and real-world networks to show its capability to successfully determine the communities in a network structure. He et al. [52] have proposed an approach based on GA with an ensemble learning technique, named GAEL, for detecting communities in complex networks. GAEL has used the basic crossover operator together with a multi-individual crossover operator based on ensemble learning. They have applied the GAEL method to computer-generated networks and real-life complex networks. In another study, Li and Song [53] have proposed an extended compact genetic algorithm (ECGA) to perform CD in complex networks. They have applied the proposed ECGA method to the benchmark networks of Lancichinetti et al. [54] (LFR) and Girvan and Newman [55] (GN) and to six real-life complex networks. Pizzuti [56] has proposed a multi-objective genetic algorithm (MOGA-Net) for detecting communities in complex networks. Pizzuti has applied the proposed MOGA-Net method to some synthetic benchmarks and four real-life networks: the Krebs’ books on American politics, Bottlenose Dolphins, the Zachary’s Karate Club and the American College Football.

Shi et al. [57] have proposed a novel method based on the particle swarm optimization (PSO) algorithm to explore community structures using network modularity. They have used an improved spectral method for converting the CD problem into a clustering problem. They have applied their proposed PSO to three different real-life networks. Gong et al. [58] have proposed an approach based on the PSO algorithm with a decomposition mechanism for clustering complex networks. In their study, a multi-objective discrete PSO (MODPSO) method has been applied to signed and unsigned networks. Cai et al. [59] have modified the continuous position update rule of basic PSO to a discrete version in order to determine the community structures in signed networks. They have applied the proposed discrete PSO to synthetic and real-world signed networks. In another study, Cai et al. [60] have proposed a discrete PSO algorithm with a greedy mechanism to perform CD in social networks. The greedy mechanism has been used to move the position of the particles to an optimal region. They have applied their proposed approach to large-scale social network clustering problems. Rahimi et al. [7] have proposed a novel multi-objective PSO algorithm called MOPSO-Net to perform CD in complex networks.

Atay et al. [1] have implemented six metaheuristic methods with a modularity technique for solving the CD problem in networks. Two methods, the bat algorithm (BA) and the gravitational search algorithm (GSA), have been used as in the original studies, while the other four have been proposed by the authors: the scatter search algorithm (SS) combined with GA, named SSGA; the modified big bang-big crunch method (BB-BC); an effective hyperheuristic differential search algorithm (HDSA); and an improved version of BA combined with the differential evolutionary (DE) method, named BADE. They have applied the six metaheuristic methods to five different social networks and four different biological networks. Atay et al. [40] have proposed a new hybrid method based on the shuffled frog leaping algorithm (SFLA) and GA to solve the CD problem in networks, named unified SFLA (uniSFLA). Community detection in networks has a discrete structure. Therefore, Koç [33] has proposed six different discrete versions of optimization methods, namely the Harris Hawks optimization (HHO) algorithm, the COOT bird algorithm, the arithmetic optimization algorithm (AROA), the atom search optimization method (ASO), the slime mould algorithm (SMA) and the Archimedes optimization algorithm (AOA), for determining the sub-communities in networks. Ghafori and Gharehchopogh [23] have proposed a multi-objective version of the cuckoo search algorithm (CSA), named MOCSA, to perform CD on social media. They have used a new strategy based on detecting close adjacent vertices in the cost function to increase the performance of the proposed method. They have applied the MOCSA method to eight different datasets named Football, Polbooks, Geom, Karate, Email, Power Grid, NetScience and Dolphins. Moradi and Rostami [27] have proposed a novel feature selection algorithm based on graph clustering methods and ant colony optimization (ACO) for solving classification problems. In their study, the features have been distributed to subgroups with a CD method.

Banati and Arora [61] have proposed a hybrid discrete algorithm named DTL-GSO, based on the teacher’s learners (I-TLBO) algorithm and the group search optimization (GSO) method, to carry out CD in complex networks. They have applied the DTL-GSO method to two artificially generated and four different real-life datasets. Imtiaz et al. [62] have proposed a multi-layer ACO method named MLACO for determining the communities in complex networks. They have used the ratio cut (RC) and kernel k-means (KKM) methods as the objective functions of their proposed method. They have applied the MLACO method to different synthetic and real-world complex networks to demonstrate its performance. In another study, Cai et al. [63] have proposed a new CD method for simplified networks by combining structure and attribute information. Song et al. [64] have used complex networks and multiple artificial intelligence algorithms for table tennis match action recognition and technical-tactical analysis. In another study, Mishra et al. [65] have proposed a multi-objective optimization algorithm for unbiased community identification in dynamic social networks. They have applied their proposed method to twelve different complex social networks. Shishavan and Gharehchopogh [66] have proposed an improved cuckoo search optimization algorithm (CSO) with a genetic algorithm (GA) for solving the CD problem in complex networks. In another study, Kumar et al. [67] have proposed a novel stacked autoencoder-based deep learning approach augmented by a crow search algorithm (CSA)-based k-means clustering method to obtain the community structure in complex networks. Arasteh et al. [68] have proposed a solution approach based on a gravity algorithm for detecting communities in large-scale complex networks. Reihanian et al. [69] have proposed an enhanced multi-objective biogeography-based optimization method (called MOBBO-OCD) for overlapping community detection in social networks with node attributes. They have applied the MOBBO-OCD method to fourteen different real-life network problems and compared it with fifteen different detection algorithms in the literature. In another study, Gharehchopogh [70] has proposed different methods based on the HHO method for solving the CD problem in social networks. He has proposed three methods: the improved HHO method with the opposition-based learning technique (IHHOOBL), the improved HHO method with a Lévy flight function (IHHOLF) and the improved HHO method with a chaotic map (IHHOCM). The proposed methods have been run on twelve different datasets and evaluated with the NMI and modularity criteria.

The COOT algorithm is a newly proposed nature-inspired method for solving continuous optimization problems. In recent years, many COOT-based solution approaches have been proposed in the literature to deal with continuous and discrete problems. Hussien et al. [71] have proposed a novel methodology for optimal control of islanded microgrids (MGs) using the COOT method. In their study, the optimum gains for the PI controller have been found using the COOT algorithm under a multi-objective optimization framework. In another study, Mostafa et al. [72] have proposed an enhanced COOT algorithm. They have integrated two other techniques, opposition-based learning (OBL) and orthogonal learning, into the COOT method to handle the limitations of the COOT algorithm. They have applied their proposed method to nine different dimensionality reduction problems taken from UCI datasets. Houssein et al. [73] have proposed a modified version of the COOT algorithm with seven different strategies: leading the group toward the optimal area, best agent guide, phasor operator, OBL method, transition factor (TF), control randomization and adjusting the position. They have tested their proposed COOT method on the standard and complex problems given in the IEEE CEC’2017 conference. In another study, Guo et al. [74] have proposed a hybrid method based on the HHO method and the COOT method to solve continuous numerical optimization problems.

3 Coot bird natural life model (COOT)

The COOT algorithm is based on the different collective behaviors of coots. These collective behaviors are regular and irregular movements on the surface of the water. All group members try to move toward the target of quality food. Therefore, they update their current positions in the light of the positions of the group leaders [51]. The mathematical model of the COOT method can be explained with four distinctive features: random movement to this side and that side, chain movement, adjusting the position based on the group leaders, and leading the group by the leaders toward the optimal area (leader movement). The initial population \(\left( \overrightarrow{x} = \left\{ \overrightarrow{x_{1}}, \overrightarrow{x_{2}}, \overrightarrow{x_{3}}, \ldots, \overrightarrow{x_{n}} \right\} \right)\) is randomly generated as in other continuous optimization algorithms. The first position of an individual in the COOT method is formed randomly with Eq. (1) as follows:

$$CootPos(i) = rand(1,d).*(ub - lb) + lb$$
(1)

In Eq. (1), \(i\) shows the index number of the current individual, \(CootPos(i)\) refers to the position of the \(i\)th coot, \(d\) refers to the number of decision variables, and \(lb\) and \(ub\) refer to the lower and upper bounds of the problem in the search space, respectively.
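The initialization in Eq. (1) can be illustrated with a minimal Python/NumPy sketch. The function name `init_population`, the use of scalar bounds and the example sizes are assumptions made for illustration only; they are not taken from the original study.

```python
import numpy as np

def init_population(n, d, lb, ub, rng):
    """Draw n coot positions uniformly in [lb, ub]; one row per coot (Eq. 1)."""
    # rng.random((n, d)) plays the role of rand(1, d) applied to each of the n coots.
    return rng.random((n, d)) * (ub - lb) + lb

rng = np.random.default_rng(0)
# Illustrative sizes: 20 coots, 8 decision variables, bounds 1..8 (assumed values).
coots = init_population(n=20, d=8, lb=1.0, ub=8.0, rng=rng)
```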

3.1 Random movement to this side and that side

A random position (\(RandPos\)) is generated using Eq. (2) in the range of the \(ub\) and \(lb\) to change the position of the current coot individual.

$$RandPos = rand(1,d).*(ub - lb) + lb$$
(2)

The process of updating the current position with a random walk allows the algorithm to move away from a local optimum position. In order to obtain the new position of the current individual, the position update rule is calculated as in Eq. (3):

$$CootPos(i) = CootPos(i) + A \times R2 \times (RandPos - CootPos(i))$$
(3)

In Eq. (3), \(R2\) represents a random value in range of the \([0, 1]\); \(A\) value is calculated with Eq. (4) as follows:

$$A = 1 - L \times \left( \frac{1}{Iter} \right)$$
(4)

In Eq. (4), \(L\) expresses the current iteration, and \(Iter\) refers to the maximum number of the iteration.
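As a hedged illustration, Eqs. (2) to (4) can be combined into a single Python/NumPy function. The name `random_movement` is hypothetical, and drawing \(R2\) as one scalar per call (rather than one value per dimension) is an assumption.

```python
import numpy as np

def random_movement(pos, lb, ub, L, max_iter, rng):
    """Move a coot toward a random point of the search space (Eqs. 2-4)."""
    d = pos.shape[0]
    rand_pos = rng.random(d) * (ub - lb) + lb   # Eq. (2): random target position
    A = 1.0 - L * (1.0 / max_iter)              # Eq. (4): shrinks from ~1 to 0
    R2 = rng.random()                           # assumed scalar in [0, 1)
    return pos + A * R2 * (rand_pos - pos)      # Eq. (3)

rng = np.random.default_rng(1)
start = np.full(8, 4.0)
moved = random_movement(start, lb=1.0, ub=8.0, L=10, max_iter=500, rng=rng)
frozen = random_movement(start, lb=1.0, ub=8.0, L=500, max_iter=500, rng=rng)
```

Because \(A \times R2\) lies in \([0, 1]\), the new position is a convex combination of the old position and the random target, so it stays inside the bounds; at the last iteration \(A = 0\) and the position no longer changes.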

3.2 Chain movement

When chain movement is applied, the distance vector between two coot individuals is calculated firstly; the first individual moves toward the other individual by about half of the distance vector. The new position of the current coot individual is generated with chain movement method by Eq. (5).

$$CootPos(i) = 0.5 \times \left( {CootPos(i - 1) + CootPos(i)} \right)$$
(5)
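Eq. (5) is simply the midpoint of two consecutive coots, as the following minimal Python sketch shows. The function name and the toy positions are assumptions for illustration.

```python
import numpy as np

def chain_movement(coots, i):
    """Return the midpoint between coot i-1 and coot i (Eq. 5)."""
    # Note: with Python indexing, i = 0 would pair the first coot with the last one.
    return 0.5 * (coots[i - 1] + coots[i])

coots = np.array([[0.0, 0.0], [2.0, 4.0]])
midpoint = chain_movement(coots, 1)
```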

3.3 Adjusting the position based on the group leaders

In general, a few coot individuals in a group change their current positions to move toward the quality food sources, and the other coot individuals adjust their current positions according to the positions of group's leaders. The adjusting process in COOT method is calculated with Eq. (6) given as follows:

$$K = 1 + \left( i \bmod NL \right)$$
(6)

where \(NL\) represents the number of leaders, and \(K\) shows the index number of the group leader (denoted \(k\) in Eq. (7)). In the COOT method, \(coot(i)\) updates its current position using the information of leader \(k\). The next position of the current coot individual is calculated using the selected leader as follows:

$$CootPos(i) = LeaderPos(k) + 2 \times R1 \times \cos (2R\pi ) \times (LeaderPos(k) - CootPos(i))$$
(7)

In Eq. (7), \(k\) shows the index number of the current leader, \(LeaderPos(k)\) refers to the selected leader position, \(R1\) refers to a random value in the range \([0, 1]\), \(R\) refers to a random value in the range \([-1, +1]\), and \(\pi\) is the constant with approximate value \(3.14\).
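The leader selection and the position adjustment of Eqs. (6) and (7) can be sketched in Python as follows. The helper names are hypothetical, and the code uses 0-based indices, so the 1-based rule \(K = 1 + (i \bmod NL)\) becomes `(i0 + 1) % NL`; treating \(R1\) and \(R\) as scalars is an assumption.

```python
import numpy as np

def leader_index(i0, NL):
    """0-based form of Eq. (6): coot i = i0 + 1 follows leader K = 1 + (i mod NL)."""
    return (i0 + 1) % NL

def follow_leader(coot_pos, leaders, i0, rng):
    """Adjust a coot's position around its selected leader (Eq. 7)."""
    leader = leaders[leader_index(i0, leaders.shape[0])]
    R1 = rng.random()             # random value in [0, 1)
    R = rng.uniform(-1.0, 1.0)    # random value in [-1, 1)
    return leader + 2.0 * R1 * np.cos(2.0 * R * np.pi) * (leader - coot_pos)

rng = np.random.default_rng(2)
leaders = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
# A coot that already sits on its leader stays there.
same = follow_leader(leaders[1].copy(), leaders, 0, rng)
```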

3.4 Leading the group by the leaders toward the optimal area (leader movement)

In optimization problems, the individuals in the population must stay within the search space. Therefore, if any dimension of an individual falls outside the search space, that dimension should be moved back into it. In the COOT method, the coot individuals maintain their positions using the positions of the group leaders. Therefore, the positions of the group leaders should be within the search space. In COOT, the group leaders update their positions by using Eq. (8).

$$LeaderPos(i) = \begin{cases} B \times R3 \times \cos (2R\pi ) \times (gBest - LeaderPos(i)) + gBest, & R4 < 0.5 \\ B \times R3 \times \cos (2R\pi ) \times (gBest - LeaderPos(i)) - gBest, & R4 \ge 0.5 \end{cases}$$
(8)

where \(gBest\) shows the position of the coot individual with best fitness value, \(R3\) and \(R4\) values refer to random values in range of the \([0, 1]\), and \(B\) value is calculated as in Eq. (9):

$$B = 2 - L \times \left( \frac{1}{Iter} \right)$$
(9)
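A minimal Python sketch of the leader movement in Eqs. (8) and (9) is given below. The function name `move_leader` is hypothetical and the random values are assumed to be scalars; any repair of positions that leave the search space is not shown.

```python
import numpy as np

def move_leader(leader, g_best, L, max_iter, rng):
    """Move a leader relative to the global best position (Eqs. 8-9)."""
    B = 2.0 - L * (1.0 / max_iter)          # Eq. (9): decreases from ~2 to 1
    R3, R4 = rng.random(), rng.random()     # random values in [0, 1)
    R = rng.uniform(-1.0, 1.0)              # random value in [-1, 1)
    step = B * R3 * np.cos(2.0 * R * np.pi) * (g_best - leader)
    # Eq. (8): the branch is chosen by R4.
    return step + g_best if R4 < 0.5 else step - g_best

rng = np.random.default_rng(3)
g_best = np.array([2.0, 5.0, 7.0])
# When the leader coincides with gBest the step is zero, so only +gBest or -gBest remains.
result = move_leader(g_best.copy(), g_best, L=1, max_iter=500, rng=rng)
```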

4 Modified Coot bird optimization algorithm (MCOOT)

Like other metaheuristic algorithms, due to the insufficient balance between exploitation and exploration, low diversity tendency and slow convergence speed, the basic COOT method can get trapped in the local optimal solution. To extend the local and global search tendencies of the basic COOT method, a new mechanism is developed based on three different procedures. The first procedure is to randomly select some dimensions of any coot individual for the update process. The second procedure is to propose and integrate a new location update rule, and the last procedure is to integrate a genetic mutation operator into COOT.

4.1 Selecting the dimension of the coot individual for update process

In this procedure, some dimensions of the coot individual are randomly selected for the update process. First, a random value (dimension number, DN) is generated between one and the dimension size (DS) of the coot individual. Then, DN dimensions of the current coot individual are randomly chosen and updated with Eq. (10). For example, assume that the problem has 8 decision variables and that the randomly determined DN value is 4. This means that 4 randomly selected positions of the coot individual take part in the update process.
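The selection step can be sketched in Python as follows. The function name `select_dimensions` is hypothetical, and choosing the DN dimensions without repetition is an assumption, since the text only states that they are randomly determined.

```python
import numpy as np

def select_dimensions(ds, rng):
    """Pick DN in {1, ..., DS}, then DN distinct dimension indices (0-based)."""
    dn = rng.integers(1, ds + 1)                   # upper bound is exclusive
    return rng.choice(ds, size=dn, replace=False)  # assumed: no repeated dimension

rng = np.random.default_rng(4)
dims = select_dimensions(8, rng)   # e.g. 4 of the 8 dimensions when DN = 4
```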

4.2 A new position update rule

A new position update rule which has been used in the studies of Aslan and Beşkirli [75] is used in the proposed algorithm. The new update rule uses the position of the agent with best fitness value and the position of the current agent for update process. Similarly, when updating the position of the leader in COOT method, the leader's current position and the global best position are also used. Therefore, the new update rule can be easily integrated to the basic COOT method. The new update rule is explained in Eq. (10).

$$\begin{gathered} Step(i) = gBest - CootPos(i) \\ CootPos(i) = gBest + R5 \times Step(i) \end{gathered}$$
(10)

where \(gBest\) represents the position of the coot individual with best fitness value; \(R5\) value refers to random value in range of \([0, 1]\).
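A minimal Python sketch of Eq. (10), applied only to a given set of dimensions, is shown below. The function name, the example index set and the use of a single scalar \(R5\) for all selected dimensions are assumptions for illustration.

```python
import numpy as np

def update_toward_best(pos, g_best, dims, rng):
    """Apply Eq. (10) on the selected dimensions only; others stay unchanged."""
    new_pos = pos.copy()
    R5 = rng.random()                         # assumed scalar in [0, 1)
    step = g_best[dims] - pos[dims]           # Step(i) = gBest - CootPos(i)
    new_pos[dims] = g_best[dims] + R5 * step  # CootPos(i) = gBest + R5 * Step(i)
    return new_pos

rng = np.random.default_rng(5)
pos = np.array([1.0, 2.0, 3.0, 4.0])
g_best = np.array([2.0, 2.0, 5.0, 1.0])
dims = np.array([0, 2])                       # hypothetical selected dimensions
updated = update_toward_best(pos, g_best, dims, rng)
```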

4.3 A genetic mutation procedure

A genetic mutation procedure, which has been used in the study of Zou et al. [76], is integrated into the new mechanism in order to avoid local optimal solutions. The mutation process is carried out according to a small mutation rate (MR), which is set to \(0.01\) in this study. If a randomly generated value is smaller than MR, a new position is generated for the current dimension of the coot individual. The mutation procedure is explained in Fig. 1.

Fig. 1 Genetic mutation procedure
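Since the figure itself is not reproduced here, the mutation step can be sketched in Python under stated assumptions: each dimension is tested independently against MR, and a mutated dimension is redrawn uniformly between \(lb\) and \(ub\). The uniform redraw and the function name `mutate` are assumptions, not details confirmed by the text.

```python
import numpy as np

def mutate(pos, lb, ub, rng, mr=0.01):
    """Redraw each dimension with probability mr (assumed uniform in [lb, ub])."""
    new_pos = pos.copy()
    mask = rng.random(pos.shape[0]) < mr       # one random test per dimension
    new_pos[mask] = rng.random(int(mask.sum())) * (ub - lb) + lb
    return new_pos

rng = np.random.default_rng(6)
pos = np.linspace(1.0, 8.0, 8)
unchanged = mutate(pos, 1.0, 8.0, rng, mr=0.0)   # mr = 0: nothing mutates
all_new = mutate(pos, 1.0, 8.0, rng, mr=1.0)     # mr = 1: every dimension is redrawn
```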

In the proposed hybrid method, firstly, the basic COOT method is applied to generate a new position for the current coot individual, and then for each coot individual the new mechanism given in Fig. 2 is executed.

Fig. 2 A new mechanism for proposed method

4.4 Discretization process of the continuous values of the coot

As described in the above sections, the CD problem is a discrete problem. Therefore, a continuous optimization algorithm cannot be directly applied to the CD problem in networks. In this study, the proposed method is first executed in continuous search space; then, before calculating the fitness function, the continuous coot positions are converted to discrete positions. For the discretization process, the continuous value of each dimension of a coot individual is rounded to the nearest discrete value. The pseudocode of the MCOOT method is given in Fig. 3.

Fig. 3 The pseudocode of the proposed method
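The rounding step of the discretization process can be sketched in Python as follows. Clipping the rounded values to the bounds is an added assumption, as is the function name; note also that `np.rint` rounds exact halves to the nearest even integer, which may differ from the rounding used in the original implementation.

```python
import numpy as np

def discretize(pos, lb, ub):
    """Round each dimension to the nearest integer and keep it inside [lb, ub]."""
    return np.clip(np.rint(pos), lb, ub).astype(int)

discrete = discretize(np.array([1.2, 3.7, 8.4, 0.6]), lb=1, ub=8)
```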

5 Community detection (CD) problem

Community detection (CD) represents the identification of adjacent vertices in a complex network. It is significant to determine the network’s functional and constructional characteristics [23]. Therefore, CD in networks has an important role for some network problems such as: political election networks, biological networks, social networks and technological networks [7, 9]. The CD problem in networks and the modularity are explained in Sects. 5.1 and 5.2, respectively.

5.1 Problem description

The locus-based adjacency representation (LAR) structure is used for graph-based representation to solve the community detection problem [77]. If a network is represented by a graph \(G(V,E)\), it is based on a set of vertices from \(1\) to \(n\), \(V(G) = \left\{ {v_{1} ,v_{2} , \ldots ,v_{n} } \right\}\), and a set of edges from \(1\) to \(m\), \(E(G) = \left\{ {e_{1} ,e_{2} , \ldots ,e_{m} } \right\}\). In network problems, nodes or vertices represent the different characteristics and edges or arcs represent the connection between two different nodes or vertices, and the vertices are distributed into \(t\) different communities according to their similar and dissimilar features [74]. Each dimension of the chromosome carries two pieces of information, named \(PopulationID\) and \(CommunityID\). \(PopulationID\) represents a vertex randomly selected from the vertices adjacent to the \(j\)th vertex. \(CommunityID\) refers to the community of the \(j\)th vertex, generated from the \(PopulationID\). In order to explain the \(PopulationID\) and \(CommunityID\) structures, an example network of eight vertices is presented in Fig. 4a. In addition, Fig. 4b shows an example chromosome structure for the network given in Fig. 4a. In this figure, the chromosome consists of three different sets of information called \(ID\), \(PopulationID\) and \(CommunityID\). The first set contains the node sequence index, the second set contains the selected adjacent node, and the third set shows the potential index of the node’s community. In addition, the subgraphs formed from the given chromosome are presented in Fig. 4c, and each subgraph is shown with a different color.

Fig. 4 A randomly generated network with \(ID\), \(PopulationID\), \(CommunityID\) and the obtained subgraphs [33]
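As a hedged illustration of how a locus-based chromosome is usually decoded, the Python sketch below links every vertex to its selected neighbour and labels the resulting connected components. The example chromosome is hypothetical (it is not the one in Fig. 4), vertices are 0-based, and the union-find decoding is a common choice rather than a detail confirmed by the text.

```python
def decode_lar(genes):
    """Map a locus-based chromosome to community labels.

    genes[j] is the neighbour selected for vertex j (the PopulationID role);
    the returned list gives one community index per vertex (the CommunityID role).
    """
    parent = list(range(len(genes)))

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for j, neighbour in enumerate(genes):
        parent[find(j)] = find(neighbour)   # vertex j and its pick share a community

    labels, result = {}, []
    for j in range(len(genes)):
        root = find(j)
        result.append(labels.setdefault(root, len(labels)))
    return result

# Hypothetical chromosome for an 8-vertex network.
communities = decode_lar([1, 2, 0, 4, 5, 3, 7, 6])
```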

5.2 Modularity

The CD problem is a maximization problem, and modularity maximization is widely used in the literature in order to obtain the community structures in networks. In this method, a network is first divided into several communities, and then the quality of the division is evaluated. Using this objective function, potential communities in networks are obtained, and the assignment of vertices is optimized to produce the community structure with the highest modularity value [33, 78]. The mathematical model of the modularity maximization is given in Eq. (11).

$$Q_{{{\text{Basic}}}} = \sum\limits_{j = 1}^{n} {\left( {e_{jk} - a_{j}^{2} } \right)}$$
(11)

where \(n\) refers to the total number of vertices in a network, \({e}_{jk}\) shows the number of connections with one end in group \(j\) and the other end in group \(k\), and \(\sum\nolimits_{j = 1}^{n} {\left( {e_{jk} } \right)}\) represents the number of connections with one end in the group. CD methods generally use metaheuristic methods in order to obtain the optimal value of \({Q}_{{\text{Basic}}}\) [37]. To obtain a general modularity function, when a network is represented by any graph pattern, the resulting community structures can be determined as subgraphs with attributes and quantities such as maximum common properties and the number of interactions. If a node is included in a subgroup, it should share the important or similar features of that community. On the other hand, this node should have as few relations as possible with the other subgroups. Thus, the interactions between the subgroups are kept to a minimum. In real life, there are many different kinds of communities, and these communities are classified according to sparse or dense relationships and similar or dissimilar features. For example, in social fields, every person is a member of the society whose characteristics are similar to his or her own, and in the informatics field, computers with maximum data transfer are included in the same topology [1, 33].

A network can be easily represented with a graph structure. As described above, the members of the communities can be defined as vertices, and the relations between the vertices can be defined as edges. In a graph, the connections between vertices are shown in the adjacency matrix (\(AdjM\) for short), so that the relations between vertices can be found and evaluated more easily. \(AdjM\) is an \(n\times n\) matrix, and it is generated according to the number of vertices. For example, if a network has 8 vertices, then the adjacency matrix has 8 columns and 8 rows. In this study, \(AdjM\) is created according to Eq. (12).

$$AdjM_{j,k} = \begin{cases} 1, & \text{if there is any connection between vertices } j \text{ and } k \\ 0, & \text{otherwise} \end{cases}$$
(12)

According to Eq. (12), if the value of the any cell of the \(AdjM\) is 1, this indicates that there is a connection between these related two vertices. The \(AdjM\) of the network example given in Fig. 4a is explained in Fig. 5.

Fig. 5 The adjacency matrix of the network example with 8 vertices

When Fig. 5 is examined, it is seen that the cell for two vertices is assigned the value ‘1’ if there is a connection between them; otherwise, it is assigned the value ‘0’. The modularity fitness function of graph \(G\) is given in Eq. (13).

$$Q = \frac{1}{2 \times m}\sum\limits_{jk} {\left( {AdjM_{j,k} - \frac{{z_{j} \times z_{k} }}{2 \times m}} \right)} \times \delta \left( {C_{j} ,C_{k} } \right)$$
(13)

where \(Q\) refers to the fitness function, \(G\) represents a graph structure, and \({z}_{j}\) and \({z}_{k}\) represent the degrees of the \(j{\text{th}}\) and \(k{\text{th}}\) nodes, respectively. The degree of a node is the total number of its adjacent nodes. \({C}_{j}\) and \({C}_{k}\) refer to the communities to which the \(j{\text{th}}\) and \(k{\text{th}}\) nodes belong, respectively, and \(\delta (C_{j}, C_{k})\) indicates whether the vertices \(j\) and \(k\) are in the same community. The total number of edges \(m\) of a graph is calculated with Eq. (14), the degree of a node is calculated with Eq. (15), and the output of \(\delta (C_{j}, C_{k})\) is calculated with Eq. (16).

$$m = \frac{1}{2}\sum\limits_{jk} {AdjM_{j,k} }$$
(14)
$$z_{j} = \sum\limits_{k} {AdjM_{j,k} }$$
(15)
$$\delta \left( {C_{j} ,C_{k} } \right) = \left\{ {\begin{array}{*{20}c} {1,} & {{\text{if}}\;C_{j} = C_{k} } \\ {0,} & {{\text{if}}\;C_{j} \ne C_{k} } \\ \end{array} } \right.$$
(16)
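The modularity function of Eqs. (13)–(16) can be sketched in a few lines. This is an illustrative Python version, not the MATLAB code used in the study; the function name `modularity` and the toy graph of two triangles joined by one edge are assumptions introduced for the example.

```python
import numpy as np

def modularity(adj, communities):
    """Modularity Q of a partition, following Eqs. (13)-(16)."""
    adj = np.asarray(adj, dtype=float)
    communities = np.asarray(communities)
    m = adj.sum() / 2.0                       # Eq. (14): total number of edges
    z = adj.sum(axis=1)                       # Eq. (15): degree of each vertex
    delta = communities[:, None] == communities[None, :]  # Eq. (16)
    expected = np.outer(z, z) / (2.0 * m)     # z_j * z_k / (2m)
    return float(((adj - expected) * delta).sum() / (2.0 * m))  # Eq. (13)

# Hypothetical toy graph: two triangles (0-1-2 and 3-4-5) joined by edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = np.zeros((6, 6), dtype=int)
for j, k in edges:
    adj[j, k] = adj[k, j] = 1

q_split = modularity(adj, [0, 0, 0, 1, 1, 1])   # one community per triangle
q_single = modularity(adj, [0, 0, 0, 0, 0, 0])  # all vertices in one community
print(round(q_split, 4), round(q_single, 4))
```

For this toy graph, placing each triangle in its own community gives \(Q = 5/14 \approx 0.357\), whereas placing all vertices in a single community gives \(Q = 0\), which illustrates why a higher \(Q\) corresponds to a better community structure.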

6 Experimental results

In order to analyze and validate the effectiveness of MCOOT, it is applied to 10 different small-scale and large-scale social networks. The MCOOT method is coded and run in MATLAB R2021a on a machine with the Windows 10 64-bit operating system, an Intel Core i7 2.80 GHz CPU and 16 GB RAM. In the previous studies in the literature on the CD problem in social networks, the population size, the stopping criterion (maximum number of iterations) and the maximum number of function evaluations (\(MaxFEs = {\text{population}}\;{\text{size}} \times {\text{iteration}}\;{\text{number}}\)) were set to \(20\), \(500\) and \(10000\), respectively. Therefore, to make a fair comparison, the population size (\(NP\)) is chosen as \(20\) and the stopping criterion (\(Iter\)) is chosen as \(500\) for all the algorithms. The experimental results are reported as the \({\text{mean}}\), \({\text{best}}\), standard deviation (Std.) and \({\text{worst}}\) results of 30 runs. The algorithmic parameters of MCOOT, \(MR\) and \(P\), are set to 0.01 and 0.5, respectively. In order to demonstrate the power of the MCOOT method, it is compared with the algorithms COOT, AOA, ASO, HHO, SMA, AROA, BADE, SSGA, BB-BC, BA, GSA, CNM, CS, DECD, FN, GACD, GATHB, GN, MA-Net, MENSGA, MOGA-Net and PSO.
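The reporting scheme above can be illustrated with a short sketch. The helper `summarize_runs` is hypothetical, and the sample standard deviation (`ddof=1`) is an assumption, chosen because it matches the default behaviour of MATLAB's `std`; the source does not state which normalization was used.

```python
import numpy as np

def summarize_runs(q_values):
    """Summarize the final modularity values of independent runs."""
    q = np.asarray(q_values, dtype=float)
    return {
        "mean": float(q.mean()),
        "best": float(q.max()),       # modularity is maximized
        "worst": float(q.min()),
        "std": float(q.std(ddof=1)),  # assumption: sample standard deviation
    }

# Evaluation budget used for all algorithms in the comparison.
NP, ITER = 20, 500
max_fes = NP * ITER  # 20 x 500 = 10000 function evaluations
```

In the experiments, `q_values` would hold the 30 final \(Q\) values of one algorithm on one network.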

6.1 Test suite

Networks differ in their numbers of vertices and edges, and the complexity of a network depends on these numbers: as the number of vertices and edges increases, the complexity of the network is likely to increase. For this reason, the literature contains many different small-scale and large-scale networks. In this study, five small-sized and five large-sized social networks are used in the experiments. Table 1 shows the information about the five small-sized networks. These social networks are the Grevy’s zebras [79], Zachary’s karate club [80, 81], Bottlenose dolphins [82], Books about US politics [81] and American college football [83] networks.

Table 1 The small-scale social networks for experiments

Table 2 presents the information about the five large-sized social networks. These networks are the Little Rock Lake [84], Jazz musicians [85], Physicians [86], Similarities [87] and FilmTrust [88] networks.

Table 2 The large-scale social networks for experiments

6.2 Comparison of the results of algorithms

In this section, the proposed method is compared with some state-of-the-art optimization algorithms proposed in the literature in terms of solution quality. MCOOT is separately compared with the algorithms in each of the studies in the literature, and the experimental results are evaluated in terms of the solution quality and robustness.

6.2.1 Comparison with COOT, AOA, ASO, HHO, SMA and AROA methods

The first comparison is made with the study of Koc [33], who proposed discrete versions of six optimization methods, namely the HHO, COOT, AROA, ASO, SMA and AOA algorithms, for determining the communities in social networks. These methods, as implemented by Koc, are used on ten different social networks in this study. The experimental results of MCOOT and the COOT, AOA, ASO, HHO, SMA and AROA methods on the small-sized social networks are given in Table 3, with the best results shown in bold. The results of COOT, AOA, ASO, HHO, SMA and AROA on the small-sized networks are taken directly from the study of Koc [33]. When Table 3 is analyzed, it can be seen that the proposed method finds the best results for all five small-sized social networks. The second best results are found by COOT, which also obtains the best results on the Zebras and Karate networks. The smallest social network is Zebras, with 27 nodes and 111 edges; accordingly, all the algorithms in the experiments obtain the optimal value for this dataset. According to the comparative results in Table 3, the performance of MCOOT is superior to those of COOT, AOA, ASO, HHO, SMA and AROA on the Dolphin, Books and Football social networks.

Table 3 Comparison of MCOOT method with COOT, AOA, ASO, HHO, SMA and AROA methods in terms of the solution quality for small-sized networks

Figure 6 shows the Mean fitness values of the compared algorithms in Table 3. According to Fig. 6, the Mean results of MCOOT on the small-sized social networks are equal to or higher than those of the compared algorithms. All of the algorithms achieve the same Mean result for the Zebras network. On the Football network in particular, it can easily be seen that MCOOT obtains the best Mean result, and the second best Mean result is found by the COOT method. Moreover, the Mean results of the SMA method on the small-sized social networks are lower than those of the other methods.

Fig. 6 Comparison of MCOOT method with COOT, AOA, ASO, HHO, SMA and AROA methods in terms of Mean results for the small networks

The convergence graphs of the algorithms for the small-sized social networks are given in Fig. 7. As seen from Fig. 7, the convergence of MCOOT is generally faster than that of the other methods. According to Fig. 7a, for the Zebras network, all the algorithms except COOT obtain the best fitness value (global best modularity value) in close to 10 iterations, whereas COOT reaches the best fitness value in close to 70 iterations. According to Fig. 7b, for the Karate network, the proposed MCOOT method achieves the best result in about 30 generations, and the other methods except SMA find the best fitness value between 150 and 250 generations. The SMA method cannot achieve the best fitness value for the Karate network within the total of 500 iterations. According to Fig. 7c, the best fitness value of the Dolphin network is obtained only by the proposed method, in around 400 iterations, and the lowest performance is shown by the SMA method. For the Books network, MCOOT achieves the best result in about 250 generations, and the second best result is found by the COOT method. According to Fig. 7e, for the Football network, the SMA method is trapped in a local optimum in about 100 generations, and the best fitness value is found by MCOOT in nearly 400 generations.

Fig. 7 The convergence curves of global best modularity values of compared algorithms for small-sized networks

Figure 8 shows the best sub-community structure of the Zebras network, which has 27 vertices and 111 edges. The sub-communities are generated according to the best Q value; the Q value found by the MCOOT method for the Zebras network is 0.2768. When Fig. 8 is analyzed, it can be seen that 4 sub-graphs are generated for the Zebras network. The vertices in the same sub-community are represented by the same color. For instance, the \(13{\text{th}}\), \(17{\text{th}}\), \(22{\text{nd}}\) and \(23{\text{rd}}\) vertices are in the same group. Figure 8 also shows that some vertices interact with each other even though they are not in the same sub-group. The gray lines show the relationships between vertices of different sub-communities.

Fig. 8 Best communities obtained from MCOOT method for Zebras network

The best sub-community structure of the Karate network is given in Fig. 9. The Karate network has 34 nodes and 78 edges. The best general modularity value found by MCOOT for the Karate network is 0.4198. When Fig. 9 is analyzed, it can be seen that 4 sub-communities are generated for the Karate network. Figure 9 also shows that some nodes are connected to each other even though they are not in the same sub-community. For example, the \(5{\text{th}}\) node and the \(20{\text{th}}\) node are linked, but they are not in the same group.

Fig. 9 Best communities obtained from MCOOT method for Karate network

The best sub-communities graph of the Dolphin network is given in Fig. 10. The Dolphin network has 62 nodes and 159 edges. The best general modularity value found by MCOOT is 0.5285 for Dolphin network. When Fig. 10 is analyzed, it can be seen that 5 sub-communities are generated for Dolphin network.

Fig. 10 Best communities obtained from MCOOT method for Dolphin network

The best sub-communities structure of the Books network is given in Fig. 11. The Books network has 105 nodes and 441 edges. The best Q value obtained by MCOOT method is 0.5272 for Books network. When Fig. 11 is analyzed, it can be seen that 5 sub-communities are generated for Books network. Two sub-communities of Books network contain many nodes, and the nodes in these sub-communities appear to have strong connections within the group.

Fig. 11 Best communities obtained from MCOOT method for Books network

The best sub-communities’ topology obtained from MCOOT is given in Fig. 12 for Football network. The Football network has 115 nodes and 615 edges. The best Q value produced by MCOOT is 0.6046 for Football network. According to Fig. 12, there are 10 different sub-communities generated by MCOOT for Football network. As can be seen from Fig. 12, there are many connections between the nodes both within and outside the sub-communities.

Fig. 12 Best communities obtained from MCOOT method for Football network

Another comparison is made on the large-scale social networks, namely the Rock, Jazz, Physicians, Similarities and Film datasets. The results of the MCOOT, COOT, AOA, ASO, HHO, SMA and AROA methods are given in Table 4, with the best results shown in bold. The experimental results of COOT, AOA, ASO, HHO, SMA and AROA on the large-sized networks are taken directly from the study of Koc [33]. According to Table 4, the MCOOT method obtains the best results for all five large-sized social networks. On the Rock, Jazz, Physicians, Similarities and Film networks, MCOOT obtains better Mean and Best values than the other algorithms in terms of solution quality. However, SMA obtains the worst modularity values for the Rock, Jazz and Physicians networks compared to the other algorithms. Table 4 also shows that MCOOT obtains better Worst modularity values for the Similarities and Film networks. According to Table 4, the second best results are obtained by COOT for all the large-sized networks, and the lowest performance is obtained by AROA. In addition, the mean results of the MCOOT, COOT, AOA, ASO, HHO, SMA and AROA methods are given in Fig. 13.

Fig. 13 Comparison of MCOOT method with COOT, AOA, ASO, HHO, SMA and AROA methods in terms of Mean results for large-scale networks

Table 4 Comparison of MCOOT method with COOT, AOA, ASO, HHO, SMA and AROA methods in terms of the solution quality for large-scale networks

The Mean fitness values of MCOOT and the other algorithms on the large-sized networks are presented in Fig. 13. When the Mean modularity values given in Fig. 13 are taken into account, the Mean results of the MCOOT method on the large-sized social networks are higher than those of the other algorithms. The second best Mean results are found by the COOT method on all the networks except the Rock network, for which the second best Mean modularity value is found by HHO. The Mean results of AROA are lower than those of the other methods in terms of solution quality on the large-sized social networks.

The convergence graphs of MCOOT and the other algorithms for the large-sized social networks are given in Fig. 14. As seen from Fig. 14, the convergence of MCOOT is generally better than that of the other methods. According to Fig. 14a, for the Rock network, MCOOT obtains the global best modularity value in about 450 iterations, and the second best value is obtained by COOT in around 450 generations. According to Fig. 14b, for the Jazz network, MCOOT achieves the global best modularity value in nearly 500 generations, and the convergence curves of the other algorithms remain far from that of MCOOT. In addition, the convergence curves of HHO and COOT are similar, and these methods reach their own best modularity values in around 400 generations. According to Fig. 14c, the best fitness value of the Physicians network is found by the proposed method in around 450 iterations, and the lowest performance is demonstrated by AROA, which is trapped in a local optimum in nearly 300 iterations. For the Similarities network, MCOOT obtains the best modularity value in around 75 generations, and the convergence curves of COOT, SMA and HHO become very similar after 100 generations. According to Fig. 14e, for the Film network, the AROA, SMA and ASO methods are easily trapped in local optima in around 10, 40 and 100 iterations, respectively. However, the convergence curve of MCOOT is slightly better than those of the other algorithms, and MCOOT finds the best modularity value in around 400 generations.

Fig. 14 The convergence curves of global best modularity values of compared algorithms for large-scale networks

6.2.2 Comparison with BADE, SSGA, BB-BC, BA and GSA methods

The second comparison is made with the study of Atay et al. [1], who applied the metaheuristic algorithms BADE, SSGA, BB-BC, BA and GSA to the CD problem in networks. They used these methods on the five small-sized social networks. The experimental results of MCOOT and the BADE, SSGA, BB-BC, BA and GSA methods on the small-sized social networks are given in Table 5, with the best results shown in bold. The results of BADE, SSGA, BB-BC, BA and GSA are taken directly from the study of Atay et al. [1].

Table 5 Comparison of the results of MCOOT method with BADE, SSGA, BB-BC, BA and GSA methods in terms of the solution quality for small-sized networks

When Table 5 is analyzed, it can be seen that MCOOT obtains the best results in terms of the Mean and Best modularity values for all five small-sized social networks. The other methods also find the best Q values for the Karate and Zebras networks, and all of them except BB-BC obtain the best Mean Q value for the Zebras network. According to Table 5, the second best results are found by the SSGA method, and it achieves the best results for the Zebras, Karate, Dolphin and Books networks. For the Football network, however, the second best results are found by the BADE method.

6.2.3 Comparison with some studies in the literature

In the last comparison, MCOOT is compared with the CNM, CS, DECD, FN, GACD, GATHB, GN, MA-Net, MENSGA, MOGA-Net and PSO algorithms. The experimental results of these algorithms are taken directly from the study of Koc [33]. When Table 6 is examined, it can be seen that these methods use four different small-sized social networks in their experiments. Because the literature reports no results for the CS and PSO methods on the Books network or for the DECD method on the Dolphin network, these cases are not compared in Table 6.

Table 6 Comparison of MCOOT with some algorithms in the literature according to the best fitness values

When Table 6 is examined, it can be seen that MCOOT obtains the best results in terms of the best modularity values for all of the networks. For the Karate network, the MCOOT, GACD, MA-Net, MENSGA and PSO algorithms achieve the best Q value, and the lowest performance is obtained by the CS method. For the Dolphin network, MCOOT and GACD find the best modularity value, and the lowest performance is obtained by MOGA-Net. According to Table 6, MCOOT, GACD and MA-Net obtain the best modularity value for the Books network, and the lowest results are obtained by the FN and CNM methods. For the Football network, MCOOT, DECD, MA-Net and PSO obtain the best Q value, and the lowest result is found by MOGA-Net.

6.3 A time analysis of the compared algorithms

In this study, the MCOOT method is first compared with some state-of-the-art optimization algorithms in terms of the Mean, Best, Worst and Std. values. In addition, a time analysis of MCOOT and the other methods is made in order to show the power and effectiveness of the proposed method. First, the time results of the proposed study are presented in comparison with Koç's study [33]. Table 7 shows the time analysis of the MCOOT, COOT, AOA, ASO, HHO, SMA and AROA methods on the small-sized social networks, in seconds. In all the methods, including MCOOT, the Q value is calculated according to only the Community ID information. Therefore, the algorithms terminate quickly when they reach the known optimum Q value, and their time performance is improved. According to Table 7, MCOOT obtains the best time cost in terms of Mean results on the Zebras and Karate networks. On the Dolphin network, the SMA algorithm achieves the best time cost in terms of Mean and Best results. For the Books and Football networks, the ASO algorithm obtains the best time cost in terms of Mean results. However, it should be noted that a fast algorithm does not necessarily obtain the best solution. Therefore, considering the results in Tables 3 and 7 together, it should be emphasized that MCOOT obtains better results than the others in terms of solution quality.

Table 7 Comparison of MCOOT method with COOT, AOA, ASO, HHO, SMA and AROA methods in terms of time cost for small-sized networks

Figure 15 presents the comparative results of MCOOT with COOT, AOA, ASO, HHO, SMA and AROA in terms of Mean time costs for the small-sized networks. Figure 15 shows that the HHO algorithm is the slowest method according to the total time costs in seconds, whereas the ASO method is the fastest. However, the time performances of the ASO, COOT and MCOOT methods are close to each other.

Fig. 15 Comparative results of MCOOT method with COOT, AOA, ASO, HHO, SMA and AROA methods in terms of mean time results for small-sized networks

Table 8 presents the time analysis of the MCOOT, COOT, AOA, ASO, HHO, SMA and AROA methods on the large-sized social networks. When Table 8 is examined, it is seen that the AROA method obtains the best time costs for all large-sized networks except the Film dataset. For the Film social network, the ASO method performs better than the other methods on all time cost metrics, namely Best, Worst, Mean and Std. According to Table 8, the results of the SMA and HHO methods are worse than those of the other compared algorithms in terms of the Mean, Best and Worst time cost metrics. The time cost performances of COOT, MCOOT, AOA and ASO are generally close to each other. The AOA method obtains the second best results for the Rock, Jazz and Physicians networks in terms of the Best and Mean time costs. For the Similarities network, the second best results in terms of the Best and Mean time costs are obtained by the ASO method. For the Film network, the second best Best, Worst and Mean time costs are obtained by the MCOOT method.

Table 8 Comparison of MCOOT method with COOT, AOA, ASO, HHO, SMA and AROA methods in terms of time cost for large-scale networks

Figure 16 shows the comparative results of the MCOOT method with the COOT, AOA, ASO, HHO, SMA and AROA methods in terms of Mean time costs for the large-sized networks. According to Fig. 16, the HHO and SMA algorithms are worse than the other compared algorithms in terms of time performance, and the ASO method is the fastest in terms of the total time costs. The second best total time performance is obtained by AROA, and the third fastest algorithm is MCOOT for the large-sized social networks. However, when the best modularity values in Table 4 are taken into account, the worst results are obtained by the AROA method. Although time analysis is an important criterion for evaluating the performance of algorithms, it is not sufficient on its own for algorithm comparison. Therefore, it is better to evaluate the performance of an algorithm with metrics such as success in obtaining the optimal solution and convergence speed rather than time analysis alone.

Fig. 16 Comparative results of MCOOT method with COOT, AOA, ASO, HHO, SMA and AROA methods in terms of mean time results for large-scale networks

The last time analysis is made against the study of Atay et al. [1]. In that study, the sub-communities detected by the optimization method for each candidate solution are compared one to one to check whether they belong to the same sub-communities. If an algorithm is executed according to this procedure, its time costs are likely to be high. In this study, in contrast, the modularity value is calculated according to only the Community ID information, as in the study of Koc. As a result, MCOOT completes all the iterations quickly and the time performance of the algorithm is improved. Table 9 shows the time analysis of the MCOOT, BADE, SSGA, BB-BC, BA and GSA methods on the small-sized social networks, in seconds. According to Table 9, MCOOT is the best approach in terms of time performance for all of the networks, and SSGA is the slowest. When Table 9 is analyzed, it is seen that the average running time of the MCOOT method is 14.54, which is a very good value compared with the other methods, because the BADE, SSGA, BB-BC, BA and GSA methods terminate with average running times of 204.11, 245.28, 205.97, 155.83 and 164.05, respectively.
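The source does not spell out how the modularity is computed from the Community ID information alone. One plausible reading, sketched below as an assumption, is the equivalent per-community form of modularity, \(Q = \sum_{c} \left[ L_{c}/m - \left( d_{c}/2m \right)^{2} \right]\), where \(L_{c}\) is the number of edges inside community \(c\) and \(d_{c}\) is the sum of the degrees of its members. The function name `modularity_by_labels` and the toy graph are hypothetical.

```python
import numpy as np

def modularity_by_labels(adj, labels):
    """Modularity from community IDs only, accumulated per community."""
    adj = np.asarray(adj, dtype=float)
    labels = np.asarray(labels)
    two_m = adj.sum()             # 2m: every edge appears twice in adj
    degrees = adj.sum(axis=1)
    q = 0.0
    for c in np.unique(labels):
        members = labels == c
        # Sum over the community's own block; each internal edge is counted twice.
        internal = adj[np.ix_(members, members)].sum()
        degree_sum = degrees[members].sum()
        q += internal / two_m - (degree_sum / two_m) ** 2
    return float(q)

# Hypothetical toy graph: two triangles (0-1-2 and 3-4-5) joined by edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = np.zeros((6, 6), dtype=int)
for j, k in edges:
    adj[j, k] = adj[k, j] = 1

q_labels = modularity_by_labels(adj, [0, 0, 0, 1, 1, 1])
print(round(q_labels, 4))
```

This form loops over communities instead of checking every pair of vertices for shared membership, and it yields the same value as Eq. (13); for the toy graph both give \(Q = 5/14\).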

Table 9 Comparison of MCOOT method with BADE, SSGA, BB-BC, BA and GSA methods in terms of time cost

Figure 17 presents the comparative results of MCOOT method with BADE, SSGA, BB-BC, BA and GSA methods in terms of time costs for small-sized networks. According to Fig. 17, the time performance of MCOOT algorithm is the best one. The worst time performance is obtained from SSGA.

Fig. 17 Comparative results of MCOOT method with BADE, SSGA, BB-BC, BA and GSA methods in terms of mean time results for small-sized networks

7 Conclusions and future works

This study proposes an improved discrete version of the COOT method to solve the community detection (CD) problem in social networks. CD has an important role in obtaining significant outputs from networks such as political election networks, biological networks, social networks and technological networks. If CD is not used to obtain results from a network, the entire network has to be taken into consideration; that is, all vertices and edges of the network must be considered to evaluate it. Therefore, as the number of vertices and edges of a network increases, the structural complexity and the time complexity of the network also increase. Consequently, CD in networks is classified as an NP-hard problem, and classical methods cannot show acceptable performance on such problems. Strong solution approaches are needed to solve CD in networks in a reasonable time. For this reason, in this study, a modified discrete version of the Coot bird natural life model (COOT) optimization algorithm is proposed for solving the CD problem in networks. Like other metaheuristic algorithms, the basic COOT method can be trapped in local optima because of an insufficient balance between exploitation and exploration, a tendency toward low diversity, and slow convergence speed. In order to strengthen the local and global search tendencies of the basic COOT, a new mechanism is proposed based on three different procedures. The first one is the random selection of some dimensions of an individual coot for the update process. The second one is the development of a new location update rule, and the last one is the integration of a genetic mutation operator into the basic COOT method. In addition, since COOT is a continuous optimization algorithm and the CD problem is a discrete problem, the continuous values have to be converted to discrete values. 
Therefore, in this study, the proposed COOT method (named MCOOT) is first run in the continuous search space; then, before the fitness values (modularity value, Q) are calculated, the continuous coot position values are converted to discrete values.
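The conversion rule itself is defined earlier in the paper and is not restated here. Purely as an illustration of the idea, and not as the rule used by MCOOT, the sketch below assumes a simple rounding-and-clipping scheme in which each dimension of a coot position is mapped to an integer community ID; the function name `to_community_ids` and the parameter `n_communities` are hypothetical.

```python
import numpy as np

def to_community_ids(position, n_communities):
    """Map a continuous position vector to community IDs in [0, n_communities - 1].

    Assumption for illustration: round to the nearest integer, then clip to the
    valid ID range. The actual MCOOT conversion rule may differ.
    """
    ids = np.rint(position).astype(int)
    return np.clip(ids, 0, n_communities - 1)

# Hypothetical continuous position for a network with 5 vertices.
position = np.array([0.2, 1.7, 2.4, -0.6, 3.9])
ids = to_community_ids(position, 3)
print(ids)  # [0 2 2 0 2]
```

The resulting integer vector can then be passed to the modularity function, so that the search runs in the continuous space while the fitness is evaluated on a discrete partition.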

Ten different data instances of small- and large-sized networks are used in the experiments. The five small-sized networks are the Grevy’s Zebras, American College Football, Zachary’s Karate Club, Bottlenose Dolphins and Books about US Politics networks. The five large-sized networks are the Jazz musicians, Similarities, Little Rock Lake, FilmTrust and Physicians networks. In order to show and validate the effectiveness of the MCOOT method, it is compared with the BADE, SSGA, BB-BC, BA, GSA, COOT, AOA, ASO, HHO, SMA, AROA, CNM, CS, DECD, FN, GACD, GATHB, GN, MA-Net, MENSGA, MOGA-Net and PSO algorithms. According to the experimental results, the MCOOT method shows superior performance on the small-sized and large-sized social networks in terms of the Mean and Best modularity values. Considering all the comparisons, the superiority of MCOOT can be attributed to the balanced use of its exploration and exploitation capabilities throughout the iterations. In addition, a time analysis of MCOOT and the other compared methods is given in the experiments. It should be noted that speed alone cannot be a measure of the success of an algorithm. However, by using the time analysis and the experimental results together, some valuable conclusions can be drawn about the algorithms. For example, when Table 8 is examined, it is seen that AROA obtains the best time costs for all large-sized networks except the Film network. However, when Table 4 is analyzed, it is seen that the Mean results of AROA on the large-sized social networks are lower than those of the other methods in terms of solution quality. Hence, it can be said that the performance of the AROA method on the large-sized networks is lower than that of the other compared algorithms. 
As a result, according to the experimental results and the time analysis, the proposed MCOOT method exhibits much better performance on the small-sized and large-sized social networks in terms of solution quality and robustness.

In future work, the proposed MCOOT method can be applied to biological or ecological networks beyond social networks, and the time costs of MCOOT can be reduced by using new update procedures. In addition, the proposed method can be applied to other discrete problems such as graph coloring, the traveling salesman problem (TSP) and knapsack problems.