1 Introduction

Massive services have emerged with the application and integration of technologies such as 5G, cloud computing, edge computing, big data, and artificial intelligence. The number of intelligent terminal devices and the volume of generated data are increasing rapidly, leading to explosive growth in computational demands. According to IDC’s Global DataSphere research [1], global data volume was predicted to reach 129 ZB in 2023 and to more than double by 2027. Given the limited computational resources of mobile terminal devices, traditional cloud computing architectures cannot meet users’ computational demands under this explosive growth in data services. Simultaneously, driven by user expectations for responsive and reliable services, there is a keen focus on enhancing the quality of services while processing massive quantities of data. Therefore, pushing cloud computing capabilities down to the edge and deploying services based on the demands of different geographical regions can better meet the diverse needs of users in various regions, promoting balanced social development.

Service deployment based on geographical region [2] refers to deploying different services on servers in various locations according to users’ requirements. This deployment approach can increase service quality, reduce network latency and congestion, and decrease the costs and risks associated with data transmission. Service deployment has attracted researchers’ attention in recent years. Many efforts have focused on formulating suitable deployment strategies with different objectives and constraints, such as improving end-users’ perceived service quality at minimum deployment cost [3], maximizing end-user coverage within a specific deployment budget [4], and maximizing service robustness under budget constraints [5]. However, due to the complex and dynamic nature of the environment, achieving rational resource allocation and carrying out service deployment based on geographical region is a challenging problem. Edge servers are heterogeneous, with each edge node possessing limited and distinct computational and storage resources. Deploying services closer to users at the network edge can achieve higher performance and ensure service quality for users. Many current approaches may lead to the following problems:

  1. Low User Coverage: Edge servers are deployed within specific geographical location limits, and their resource capacity is constrained [6, 7]. There is a limitation on the number of services deployed on each server. As the number of users increases and the range of service requests expands, servers cannot offer all types of services to users in every geographical region, reducing user coverage.

  2. Insufficient Service Reliability: Users can only access the service within the signal range of the edge server [8]. However, the connection between users and servers might be unstable due to user mobility. In cases where the connection is lost, users must reconnect to another edge server, which may lead to service interruptions during this process, consequently reducing service reliability. Moreover, edge servers are susceptible to operational failures caused by software anomalies, hardware malfunctions, and malicious attacks [5, 9]. These factors can result in service interruptions and impact service reliability.

Based on the analysis above, this paper defines service priorities based on user demands in different geographical regions and regional characteristics. Building upon this, a Multi-Service Geographic region Deployment based on Priority algorithm (MS-GD-P) is proposed. Specifically, this paper employs algorithms such as dispersion, density clustering, and attention mechanisms from the dimensions of service distribution, service aggregation, and service resource requirements to calculate service priorities. Based on these priorities, deployment costs are allocated to determine the number of service deployments. Subsequently, the K-medoids clustering algorithm is utilized to select edge servers and preliminarily generate an initial deployment plan. Finally, an improved genetic algorithm based on priority is employed, considering user coverage and service reliability. Through multiple iterations, it generates service deployment plans. The priority-improved genetic algorithm possesses solid global search capabilities, adaptability, and interpretability advantages. It is well suited for developing deployment plans for various services in different geographical regions within the cloud-edge-end scenario presented in this paper.

The main contributions are summarized as follows:

  1. The service priority metric is proposed for budget allocation, initial population generation, and improved selection strategies. By comprehensively considering user demands for different services in various geographical regions and the resources available, distinct service priorities are calculated within individual geographical regions and across the entire region.

  2. The Multi-Service Geographic region Deployment based on Priority algorithm (MS-GD-P) is designed to improve the process of server selection. It takes service reliability and user coverage into consideration, and yields deployment plans with higher user coverage and service reliability for users.

The rest of this paper is organized as follows. Section 2 summarizes related works. Section 3 provides a detailed overview of service geographic region deployment under budget constraints. Section 4 outlines the performance evaluation results in different scenarios. Finally, we summarize this paper and point out the future work in Sect. 5.

2 Related work

2.1 Service prioritization

Service prioritization refers to ranking and allocating resources to different services in situations with limited resources. Setting service priorities ensures that critical and urgent services receive sufficient resources and are treated with higher precedence to meet their needs and requirements. References [10, 11] utilize the Deadline Monotonic (DM) scheduling algorithm to determine task priorities and the priority for accessing shared resources. References [12, 13] introduce per-flow priority, associating each priority with an individual flow. A priority range is established for each flow type, and priorities for each flow within that range are randomly assigned. This allows for probabilistic priorities, where one flow may have a higher priority than another with a certain probability. References [14, 15] separately allocate different priorities to services based on their tolerance to delays and their impact on the service ecosystem, though this approach only considers user requirements. Reference [16] addresses the coupling effect between computation offloading and service caching by considering service priorities. It aims to maximize offloading efficiency while maintaining cache balance based on service priorities, and then controls service caching through a given offloading strategy that replaces the cache service list, thus enhancing the benefits of joint offloading and cache control. Reference [17] adjusts weights based on service priorities, and decisions can also be dynamically updated as service priorities change. Reference [18] prioritizes services based on runtime and adequate response time. Reference [19] introduces a time-aware service recommendation approach based on dynamic service priorities and Quality of Service (QoS). It rates each service for a recommendation based on service priorities and QoS scores. In Reference [20], traffic prioritization is performed on the basis of throughput, reliability, and latency requirements.

2.2 Service deployment

In the context of cloud-edge-device scenarios, service deployment refers to the simultaneous deployment of applications and services across cloud, edge nodes, and terminal devices, achieving collaborative operations among the cloud, edge, and terminal. This deployment approach combines the advantages of cloud, edge, and terminal computing, providing more flexible, efficient, and reliable services. Reference [6] formulates service deployment strategies based on the resource limitations of edge servers, the business logic of applications, and the average response time of services. Reference [7], primarily focusing on user demands, aims to maximize service quality and reliability while minimizing energy consumption and costs during service deployment at the edge. The deployment strategy in Reference [4], driven by service providers’ benefits, allows the deployment of application instances on edge servers within specific regions, providing services to most users in that region. Considering user mobility and service latency demands, Reference [21] deploys services in a mobile edge-cloud network to offer reliable services to users. Reference [22], considering factors like geographical location, bandwidth, and reliability, employs dynamic deployment strategies for service deployment. From a service provider’s perspective, Reference [3] considers service sharing and communication interference to minimize deployment costs and maximize overall service quality, effectively addressing multi-service deployment issues. Reference [23] introduces a hybrid optimization of Particle Swarm and Chemical Reaction to mitigate service delay and enhance Quality of Service. Reference [24] reduces service latency and increases resource utilization by considering the priority of services and the distribution of resource consumption. Reference [25] proposes an online heuristic algorithm. The algorithm recursively halves the service function chain (SFC) request into two sub-SFCs and finds the solution that satisfies the latency and resource constraints by slicing the latency constraints of the two sub-SFCs. Reference [26] proposes a user-end adaptive service deployment algorithm based on reinforcement learning, dynamically adjusting strategies according to task types and available node types to optimize user experience. The service deployment plan in Reference [27], by balancing access latency, communication latency, and service switching costs, enhances service quality economically to maximize end-user coverage. In Reference [28], a mobility model based on the probability density function of sojourn time is used to characterize user mobility intensity and optimize user experience by minimizing service deployment overhead. In Reference [29], the MSCD framework automatically customizes multi-domain network services based on users’ personalized service preferences and deploys them in multi-domain networks, providing customizable multi-domain services for different users. In Reference [30], an Ant Colony Optimization meta-heuristic algorithm for online SFC deployment is proposed to minimize the server operation cost and network latency for near real-time SFC deployment.

However, these service deployment methods often focus on a singular perspective. They may not comprehensively meet the needs of both users and service providers or maximize revenue based on distinct geographical characteristics.

3 Priority-based service deployment strategy

In addressing the service geographical deployment challenges in the cloud-edge-device scenario, this paper derives service priorities based on user demands for distinct service categories across various geographical regions and the available resources. Subsequently, servers are preliminarily selected according to service priorities using the K-medoids clustering algorithm. Following the generation of an initial deployment plan, an enhanced priority-based genetic algorithm is proposed to refine server selection. This algorithm aims to yield higher user benefits by comprehensively considering user coverage and service reliability.

Fig. 1 Priority-based service deployment strategy

As illustrated in Fig. 1, the multi-service geographical deployment approach based on service priorities primarily consists of Attention-based Service Prioritization and Priority-Enhanced Genetic Algorithm. The Attention-based Service Prioritization module calculates service priorities by considering the relationships among various factors influencing the services and the user demands for services in different geographical regions. This process resolves the issue of deploying different service categories in various geographical regions. The Priority-Enhanced Genetic Algorithm module generates an initial population based on service priorities. It incorporates these priorities into the individual selection phase, enabling faster and more accurate generation of service deployment plans. The main notations used in this paper are listed in Table 1.

Table 1 Main notations

3.1 Service priority

In the context of cloud-edge-device scenarios, deploying multiple services requires considering the optimal number of instances for each service in different geographical regions. Additionally, users within different geographical regions exhibit varying demands for services. To address this challenge, this paper introduces the concept of service priority. By analyzing user demands for services within different geographical regions, the paper calculates users’ preferences for different services, thereby determining the priority of each service within that region. The priorities of services across various regions are aggregated to establish the overall service priority in the entire region. This priority-based approach is then utilized to rank services and guide their deployment across different geographical regions within budget constraints.

In real-life scenarios, when users’ service requests are widely distributed geographically, the same service must be deployed on servers in different regions to ensure timely responses to user requests. For instance, users could request chat services in both business districts and tourist regions. In this paper, we adopt the degree of dispersion D(X, Y) to represent how widely service requests are distributed. The greater the degree of dispersion, the wider the distribution of requests, and vice versa. We use the standard deviation to calculate the degree of dispersion.

Definition 1

(Service request distribution) For the service request distribution D(X, Y), the standard deviation of the two-dimensional random vector is calculated as:

$$\begin{aligned} D(X, Y)=\sqrt{\frac{ { \sum _{i=1}^{n}}\left[ \left( x_{i} - {\overline{x}} \right) + \left( y_{i} - {\overline{y}} \right) \right] ^{2} }{n-1} } \end{aligned}$$
(1)
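As a minimal sketch, Eq. (1) can be evaluated directly from a list of request coordinates. The function name `dispersion` and the representation of requests as `(x, y)` tuples are assumptions made for illustration; the computation follows the equation as printed.

```python
import math

def dispersion(points):
    """Degree of dispersion D(X, Y) of request locations, following Eq. (1) as printed."""
    n = len(points)
    if n < 2:
        raise ValueError("at least two request locations are needed")
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    # Eq. (1) adds the x- and y-deviations of each request, then squares the sum.
    total = sum(((x - x_bar) + (y - y_bar)) ** 2 for x, y in points)
    return math.sqrt(total / (n - 1))
```

Because the two deviations are added before squaring, offsets of opposite sign on the two axes cancel under this literal reading; for example, the two points (0, 2) and (2, 0) give a dispersion of 0.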

Services with high request frequencies also require deploying multiple instances. A high request frequency indicates a substantial user demand and a higher reliability requirement for the service in that region. Consequently, deployment on servers within that region is prioritized to meet user needs. For instance, shopping services might receive many requests in commercial districts, leading to a prioritized deployment of shopping services in those regions compared to other services. A density-based clustering algorithm is employed to measure the aggregation level in various regions. The choice of a density-based clustering algorithm is due to its suitability for this problem, as it eliminates the need to set an initial number of cluster centers, making it more applicable. Additionally, density-based clustering is robust to noise, efficient in terms of speed, and suitable for handling larger datasets compared to centroid-based clustering algorithms [31].
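The aggregation measurement can be illustrated with a small density-based clustering routine. The paper does not name a specific algorithm, so the sketch below assumes DBSCAN as one representative; the function names and the parameters `eps` and `min_pts` are hypothetical. The number of clusters it finds plays the role of the cluster-center count \(C_i^r\) used later in Sect. 3.2.

```python
import math

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise (DBSCAN, assumed)."""
    UNSEEN, NOISE = None, -1
    labels = [UNSEEN] * len(points)

    def neighbors(i):
        # All points within distance eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNSEEN:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point reached from a core point
            if labels[j] is not UNSEEN:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep expanding
        cluster += 1
    return labels

def count_clusters(labels):
    """Number of dense request clusters, ignoring noise."""
    return len({lab for lab in labels if lab != -1})
```

No initial number of clusters is supplied, which is the property the text highlights; isolated requests are simply labeled as noise.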

Unlike cloud servers, edge servers do not possess ample resources. Therefore, extensive deployment on edge servers might be unsuitable if a service demand necessitates significant computational resources.

Definition 2

(Geographical service preference within a region) Suppose we use the sets \(S=\{s_{1}, s_{2},\ldots , s_{o}\}\) to represent the services and \(R=\{r_{1}, r_{2},\ldots , r_{p}\}\) to denote the regions. Geographical service \(s_{i}\) preference within region \(r_{j}\), denoted as \(Pr(s_{ij})\), is mainly defined by three factors: the overall request distribution of service \(s_{i}\), denoted as \(d_{i}\); the request frequency of service \(s_{i}\) in region \(r_{j}\), denoted as \(f_{ij}\); and the resource requirements of service \(s_{i}\) in region \(r_{j}\), denoted as \(re_{ij}\). \(Pr(s_{ij})\) can be calculated as follows.

$$\begin{aligned} \text{Pr}(s_{ij})=\alpha \cdot d_{i} + \beta \cdot f_{ij} +\gamma \cdot re_{ij} \end{aligned}$$
(2)

where the parameters \(\alpha\), \(\beta\), and \(\gamma\) are the weights for \(d_{i}\), \(f_{ij}\), and \(re_{ij}\), respectively, subject to \(\alpha + \beta + \gamma = 1\) and \(\alpha , \beta ,\gamma \in (-1,1)\).

The factors influencing geographical service preference mutually interact with one another. Typically, research initially involves randomly determining weight coefficients and then progressively adjusting them to ascertain the impact of these factors on service redundancy deployment. Although random methods are simple, they can be slow in determining weight coefficients and may yield results deviating from the optimal solution. Moreover, these methods are, to some extent, influenced by the subjectivity of experimenters, leading to deficiencies in weight coefficient determination through random approaches. This paper determines the weights of each factor by learning their interrelationships. The Attention Mechanism can selectively filter out a small amount of crucial information from a large dataset and focus on this subset of essential data. Therefore, the Attention Mechanism is employed to compute corresponding weight coefficients, determining the effects of the three factors on service redundancy deployment. This is achieved through three feature spaces designed to learn the latent feature similarity between the various factors. The feature spaces are represented as follows.

$$\begin{aligned}Q=XW^{q} \end{aligned}$$
(3)
$$\begin{aligned}K=XW^{k} \end{aligned}$$
(4)
$$\begin{aligned} V=XW^{v} \end{aligned}$$
(5)

\(X\in R^{3\times {S}}\) represents the feature matrix composed of latent feature vectors for S services. \(W^{q}\in R^{3\times {S}}\), \(W^{k}\in R^{3\times {S}}\), and \(W^{v}\in R^{3\times {S}}\) are parameter matrices used to map the feature vectors into the respective spaces. Q, K, and V correspond to Query, Key, and Value. Query is used to compare with other units to determine their importance. Key is used to measure the relevance of the Query to other elements in the sequence. Value contains information about the current position, which is used to generate the final output representation. Subsequently, attention weights \(\alpha _{s_{i}}\) for each service within the group relative to the specified service \(s_i\) can be calculated, indicating the relevance between each service and service \(s_i\).

$$\begin{aligned} \alpha _{s_{i}}=\text{softmax}\left(\frac{Q_{s_{i}}K^{T}}{\sqrt{d_{k}}}\right) \end{aligned}$$
(6)

\(Q_{s_{i}}\) represents the latent feature vector of service \(s_i\) within Q. \(\sqrt{d_{k}}\) denotes the square root of the dimension of the Key vector, where \(\sqrt{d_{k}}=3\). After normalization by the softmax function, the attention weights are multiplied by the corresponding Values, focusing on a small set of crucial information.

$$\begin{aligned} \text{Attention}_{s_{i}}=\alpha _{s_{i}}V \end{aligned}$$
(7)

Finally, the information obtained for attention is summed and normalized to derive the weight coefficients that influence the geographical service preference factors within the region.

$$\begin{aligned} \text{Coefficient}(\alpha , \beta , \gamma )= \text{softmax}\left( \sum _{i=1}^{|S|}\text{Attention}_{s_{i}}\right) \end{aligned}$$
(8)
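The computation in Eqs. (3) to (8) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the feature matrix `X` is taken to have one row per service and three columns (for \(d_i\), \(f_{ij}\), \(re_{ij}\)), the parameter matrices are taken to be 3 by 3, and the scaling factor defaults to the square root of the Key dimension under these shapes (the value stated in the text can be passed through the hypothetical `scale` argument). In practice the parameter matrices would be learned; here they are plain inputs.

```python
import math

def softmax(v):
    m = max(v)                                   # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def attention_coefficients(X, Wq, Wk, Wv, scale=None):
    """Return (alpha, beta, gamma) from Eqs. (3)-(8); shapes are assumptions (see lead-in)."""
    Q, K, V = matmul(X, Wq), matmul(X, Wk), matmul(X, Wv)   # Eqs. (3)-(5)
    if scale is None:
        scale = math.sqrt(len(K[0]))
    width = len(V[0])
    summed = [0.0] * width
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in K]
        weights = softmax(scores)                            # Eq. (6)
        attended = [sum(w * v[c] for w, v in zip(weights, V)) for c in range(width)]  # Eq. (7)
        summed = [s + a for s, a in zip(summed, attended)]
    return softmax(summed)                                   # Eq. (8)
```

The final softmax guarantees that the three coefficients are positive and sum to 1, which satisfies the constraints stated for Eq. (2).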

Definition 3

(Service priority) Users’ demands for different types of services vary based on their geographical regions. Hence, service priorities can be utilized to illustrate the distinct user requirements for services. Service \(s_i\) priority within region \(r_{j}\), denoted as \(P(s_{ij})\), can be calculated by the natural logarithm of the ratio between the proportion of service preference in that region and the proportion of service preference across all regions.

$$\begin{aligned} P(s_{ij})=\ln\left(\frac{\text{Pr}(s_{ij})}{\textstyle \sum _{o=1}^{|S|}\text{Pr}(s_{oj})}/ \frac{\textstyle \sum _{p=1}^{|R|}\text{Pr}(s_{ip})}{\textstyle \sum _{p=1}^{|R|}\textstyle \sum _{o=1}^{|S|}\text{Pr}(s_{op})}\right) \end{aligned}$$
(9)

Service \(s_i\) priority, denoted as \(P(s_{i})\), can be calculated as follows.

$$\begin{aligned} P(s_{i})=\frac{\textstyle \sum _{p=1}^{|R|}P(s_{ip})}{|R|} \end{aligned}$$
(10)
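Eqs. (9) and (10) translate into a few lines of code. The sketch below assumes that the preferences are stored as a nested list `pr` with `pr[i][j]` equal to \(\text{Pr}(s_{ij})\) and that all preferences are strictly positive, so that the logarithm is defined; the function name is hypothetical.

```python
import math

def service_priorities(pr):
    """Return (regional, overall) priorities from preferences pr[i][j] = Pr(s_ij) > 0."""
    n_services, n_regions = len(pr), len(pr[0])
    region_total = [sum(pr[i][j] for i in range(n_services)) for j in range(n_regions)]
    service_total = [sum(row) for row in pr]
    grand_total = sum(service_total)
    # Eq. (9): log-ratio of the in-region share to the share across all regions.
    regional = [
        [math.log((pr[i][j] / region_total[j]) / (service_total[i] / grand_total))
         for j in range(n_regions)]
        for i in range(n_services)
    ]
    # Eq. (10): average of the regional priorities of each service.
    overall = [sum(row) / n_regions for row in regional]
    return regional, overall
```

A positive \(P(s_{ij})\) means the service is preferred more strongly in region \(r_j\) than it is on average across all regions, and a negative value means the opposite.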

3.2 Budget allocation

Using the above service priorities as a reference, the number of instances to deploy for each service is determined based on the total deployment budget and the budget for deploying individual services.

Definition 4

(Budget allocation) N is the default number of service deployments, limited by the current budget b. Subsequently, budget allocation is carried out by service priorities. \(SSB_{i}^{0}\) represents the spare service budget when assigning to \(s_i\), while \(CSB_{i}^{1}\) represents the current service budget [32].

$$\begin{aligned} \text{SSB}_{i}^{0}=b-\textstyle \sum\limits_{k=1}^{i-1}(c_{k}\times n_{k})-\textstyle \sum\limits_{k=i}^{|S|}[c_{k}\times (N+C_{i}^{r})] \end{aligned}$$
(11)
$$\begin{aligned} \text{CSB}_{i}^{1}=c_{i}\times (N+C_{i}^{r})+\text{SSB}_{i}^{0}\times f_{i} \end{aligned}$$
(12)
$$\begin{aligned} n_{i}=\text{CSB}_{i}^{1} \end{aligned}$$
(13)

\(n_i\) represents the computed deployment number for service \(s_i\), \(C_i^r\) denotes the number of cluster centers obtained from the density clustering algorithm, \(c_i\) signifies the deployment cost for service \(s_i\), and \(f_i\) is defined as the ratio of the deployment cost of the current service to the total deployment cost of the services not yet deployed (including the current one), as follows:

$$\begin{aligned} f_{i}=\left\{ \begin{array}{ll}\frac{c_{i}}{\textstyle \sum _{k=i}^{|S|}c_{k}} &{} \text{SSB}_{i}^{0}\geqq 0\\ 0 &{} \text{otherwise} \end{array}\right. \end{aligned}$$
(14)

Following the above cost allocation, the required deployment number for each service \(s_i\) can be calculated.
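The allocation in Eqs. (11), (12), and (14) can be sketched as a single pass over the services. Several points are assumptions made for this illustration: the services are assumed to be already sorted by descending priority, the function and parameter names are hypothetical, and the instance count is taken as the current service budget divided by the unit cost and rounded down, which differs from Eq. (13) as printed (where \(n_i\) is set to \(\text{CSB}_i^1\) directly).

```python
import math

def allocate_budget(b, costs, cluster_counts, N):
    """Instance counts per service; inputs are assumed sorted by descending priority."""
    counts = []
    for i, c_i in enumerate(costs):
        base = N + cluster_counts[i]                        # N + C_i^r
        spent = sum(c * n for c, n in zip(costs, counts))   # first sum in Eq. (11)
        reserved = sum(c * base for c in costs[i:])         # second sum in Eq. (11), as printed
        ssb = b - spent - reserved                          # spare service budget SSB_i^0
        # SSB_i^0 * f_i with f_i from Eq. (14); f_i is 0 when the spare budget is negative.
        extra = ssb * c_i / sum(costs[i:]) if ssb >= 0 else 0.0
        csb = c_i * base + extra                            # current service budget, Eq. (12)
        # ASSUMPTION: convert the budget into a whole number of instances.
        counts.append(math.floor(csb / c_i))
    return counts
```

For a budget of 100, unit costs (2, 3, 5), cluster counts (1, 0, 2), and N = 2, this sketch assigns 10 instances to each service and spends the budget exactly.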

3.3 Initial deployment plan generation

After determining the number of deployments for each service, the services must be assigned to appropriate servers. Given the fixed number of services to be deployed, this paper adopts the K-medoids algorithm to generate the initial deployment plan. Common centroid-based clustering algorithms include K-means, K-medoids, K-medians, and K-means++. K-means, K-means++, and K-medians share a similar approach, with K-means++ emphasizing the selection of initial centroids that are as far apart as possible. K-medians relies on the median of the sample points, while K-means and K-means++ rely on their mean; in all three cases, the selected centroids might not be actual sample points. Hence, in this context, the K-medoids algorithm is more suitable because it requires the cluster centers to be actual sample points. The deployment count for each region is determined according to the calculations in Sect. 3.2, and K-medoids clustering is then performed region by region, resulting in the initial deployment plan.
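The property that motivates K-medoids, namely that every cluster center is an actual sample point, can be seen in a minimal sketch. The version below uses a simple alternating scheme (assign points to the nearest medoid, then pick the member with the smallest total distance as the new medoid); it is an illustration rather than the exact procedure used in the paper, and the function name, the `seed` parameter, and the treatment of candidate server locations as 2-D tuples are assumptions.

```python
import math
import random

def k_medoids(points, k, max_iter=100, seed=0):
    """Return k medoids, each of which is one of the input points."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)      # random initial medoids (indices)
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: math.dist(p, points[m]))
            clusters[nearest].append(i)
        # Update step: the new medoid minimizes the total distance to its cluster members.
        new_medoids = []
        for m, members in clusters.items():
            members = members if members else [m]    # guard against an empty cluster
            new_medoids.append(
                min(members, key=lambda c: sum(math.dist(points[c], points[j]) for j in members))
            )
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return [points[m] for m in medoids]
```

Since the initial medoids are drawn at random, different seeds can yield different plans, which is what allows repeated runs to form a population of distinct initial individuals.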

3.4 Generation of service deployment plan

3.4.1 User coverage and service reliability

Suppose we use the sets \(\text{ES}=\{\text{es}_{1}, \text{es}_{2},\ldots , \text{es}_{n}\}\) to represent the edge servers and \(U=\{u_{1}, u_{2},\ldots , u_{m}\}\) to denote the end-users. The objective of service deployment is to achieve high service reliability and extensive user coverage.

User coverage [33]


Each edge server has a certain coverage range and offers services to end-users within that range. In real scenarios, there might be overlapping regions between edge servers, where any of the overlapping edge servers can serve the end-users. If an end-user is covered by only one edge server, that user can only access service instances deployed on that specific edge server. Therefore, determining whether an end-user is covered by an edge server is crucial to ascertain their accessibility to service instances. To achieve this, we use the sets \(L(es)=\{L_{1}(es), L_{2}(es),\ldots , L_{n}(es)\}\) to represent the locations of edge servers and \(L(u)=\{L_{1}(u), L_{2}(u),\ldots , L_{m}(u)\}\) to denote the locations of end-users. The Euclidean metric function \(\text{dist}(L_{i}(es), L_{k}(u))\) can be employed to measure the distance between edge server \(\text{es}_{i}\) and user \(u_{k}\).

Definition 5

(Instance accessibility) The service instance accessibility \(\text{Acc}(\text{es}_{i},u_{k})\) is a function that returns 1 if the end-user \(u_{k}\) is located within the coverage of edge server \(\text{es}_{i}\), and returns 0 otherwise.

$$\begin{aligned} \text{Acc}(\text{es}_{i},u_{k})=\left\{ \begin{array}{ll}1 &{} \text{dist}(L_{i}(es), L_{k}(u))\leqq \text{cov}(\text{es}_{i})\\ 0 &{} \text{otherwise} \end{array}\right. \end{aligned}$$
(15)

where \(\text{cov}(\text{es}_{i})\) represents the coverage radius of edge server \(\text{es}_{i}\).

Definition 6

(User coverage benefit) User coverage benefit \(\text{CB}(\text{es}_i)\) is the set of end-users who can access the edge server \(\text{es}_i\); its cardinality \(|\text{CB}(\text{es}_i)|\) gives the number of end-users covered by \(\text{es}_i\). \(\text{CB}(\text{es}_i)\) is defined as follows:

$$\begin{aligned} \text{CB}(\text{es}_{i})=\{u_{k}\mid \text{Acc}(\text{es}_{i},u_{k})=1,\forall u_{k}\in U\} \end{aligned}$$
(16)

According to Definition 6, given a set of edge servers ES deploying service instances and a set of end-users U, the overall user coverage benefit of all edge servers ES for all end-users U, denoted as CB(ES), can be measured as follows.

$$\begin{aligned} \text{CB}(\text{ES})=|\displaystyle \bigcup _{\text{es}_{i}\in \text{ES}}\text{CB}(\text{es}_{i})| \end{aligned}.$$
(17)

Service Reliability [33]

Suppose the service required by user \(u_{k}\) is deployed on both edge servers \(\text{es}_i\) and \(\text{es}_o\). If \(\text{es}_i\) becomes unavailable due to hardware/software issues or network anomalies, the user \(u_{k}\) can continue their service on edge server \(\text{es}_o\) without experiencing a service interruption. In this scenario, the service reliability for user \(u_{k}\) is ensured, and edge server \(\text{es}_o\) can be seen as contributing to the service reliability benefit for user \(u_{k}\). The service reliability benefit for user \(u_{k}\) is defined as follows.

Definition 7

(Service reliability benefit) Given a set of edge servers \(\text{ES}=\{\text{es}_{1}, \text{es}_{2},\ldots , \text{es}_{n}\}\) deploying service instances and an end-user \(u_{k}\in U\), the service reliability benefit \(\text{RB}(u_{k})\) obtained by \(u_{k}\) is measured by the number of those edge servers that co-cover \(u_{k}\).

$$\begin{aligned} \text{RB}(u_{k})=\left\{ \begin{array}{ll}0 &{} \sum _{\text{es}_{i}\in \text{ES}} \text{Acc}(\text{es}_{i},u_{k})\leqq 1\\ \sum _{\text{es}_{i}\in \text{ES}} \text{Acc}(\text{es}_{i},u_{k})-1 &{} \sum _{\text{es}_{i}\in \text{ES}} \text{Acc}(\text{es}_{i},u_{k})> 1 \end{array}\right. \end{aligned}$$
(18)

It follows that \(\text{RB}(u_{k})=0\) when \(u_{k}\) can access at most one service instance, and \(\text{RB}(u_{k})=j-1\) when \(u_{k}\) can access j instances.

According to Definition 7, given a set of edge servers ES deploying service instances and a set of end-users U, the overall service reliability benefit of all edge servers ES for all end-users U, denoted as RB(ES), can be used to represent service reliability. RB(ES) is calculated as follows.

$$\begin{aligned} \begin{aligned} \text{RB}(\text{ES})&=\displaystyle \sum _{u_{k}\in U}\displaystyle \sum _{\text{es}_{i}\in \text{ES}} \text{Acc}(\text{es}_{i},u_{k})-|\displaystyle \bigcup _{\text{es}_{i}\in \text{ES}}\text{CB}(\text{es}_{i})| \\&=\displaystyle \sum _{\text{es}_{i}\in \text{ES}}|\text{CB}(\text{es}_{i})|-|\displaystyle \bigcup _{\text{es}_{i}\in \text{ES}}\text{CB}(\text{es}_{i})| \end{aligned} \end{aligned}$$
(19)
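The two benefits can be computed together, as the following sketch illustrates. It assumes that each edge server is given as a hypothetical `(x, y, radius)` tuple and each end-user as an `(x, y)` tuple, with users identified by their index; the function names are illustrative only.

```python
import math

def acc(server, user):
    """Eq. (15): 1 if the user lies within the server's coverage radius, else 0."""
    sx, sy, radius = server
    return 1 if math.dist((sx, sy), user) <= radius else 0

def coverage_set(server, users):
    """Eq. (16): indices of the users covered by one edge server."""
    return {k for k, u in enumerate(users) if acc(server, u) == 1}

def coverage_benefit(servers, users):
    """Eq. (17): number of users covered by at least one edge server."""
    return len(set().union(*(coverage_set(s, users) for s in servers)))

def reliability_benefit(servers, users):
    """Eq. (19): total coverage count minus the number of distinct covered users."""
    total = sum(len(coverage_set(s, users)) for s in servers)
    return total - coverage_benefit(servers, users)
```

With two servers of radius 5 at (0, 0) and (6, 0) and users at (1, 0), (3, 0), (8, 0), and (20, 0), three users are covered and two of them are co-covered by both servers, so CB(ES) = 3 and RB(ES) = 2, in agreement with summing Eq. (18) over the users.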

3.4.2 Priority-enhanced genetic algorithm

Compared to other optimization algorithms, the Priority-Enhanced Genetic Algorithm possesses advantages such as strong global search capability, high parallelism, and strong interpretability. It can also address discrete optimization problems, making it well suited for generating service deployment solutions in the cloud-edge-end scenario described in this paper. Therefore, this paper adopts the Priority-Enhanced Genetic Algorithm to generate various deployment plans for multiple services across different geographical regions.

Population initialization

The initialization of the population is crucial for the convergence of the genetic algorithm and can impact the speed at which the optimal population is generated. Many population initialization methods employ random methods. However, random strategies can often result in a significant deviation between the fitness of the initial population of individuals and the desired optimal fitness, leading to a decrease in solution quality. Therefore, an appropriate population initialization method is required to enhance the performance of the genetic algorithm. This paper generates the initial deployment plan by selecting servers using the K-medoids algorithm. Since the initial centroids chosen by the K-medoids algorithm are random and based on sample points, the generated individuals are distinct, constituting a population and achieving population initialization.

Chromosome representation

The encoding method employs the floating-point encoding approach.

Fitness function

In this paper, the total user coverage benefits and overall service reliability benefit mentioned earlier are utilized to calculate the fitness of each individual. Additionally, each individual must satisfy resource and cost constraints.

Crossover and mutation

Traditional genetic algorithms often employ fixed crossover and mutation probabilities, which can destroy good individuals and retain poor ones, resulting in decreased algorithm performance. In this paper, an adaptive crossover and mutation probability strategy is devised. The crossover and mutation probabilities are adjusted based on a combination of the current individual’s fitness and the difference between the maximum and average fitness of the current population. The crossover probability \(p_c\) and mutation probability \(p_m\) are defined as follows.

$$\begin{aligned} p_{c}=\left\{ \begin{array}{ll}k_{1}\sin \left(\frac{f_{max}-f'}{f_{max}-{\overline{f}}}\right) &{} f'\geqq {\overline{f}} \\ k_{3} &{} f'< {\overline{f}} \end{array}\right. \end{aligned}$$
(20)
$$\begin{aligned} p_{m}=\left\{ \begin{array}{ll}k_{2}\sin \left(\frac{f_{max}-f}{f_{max}-{\overline{f}}}\right) &{} f\geqq {\overline{f}} \\ k_{4} &{} f< {\overline{f}} \end{array}\right. \end{aligned}$$
(21)

Here, \(k_1, k_2, k_3, k_4\leqq 1\); \(f_{max}\) is the maximum fitness value of individuals in the current population, \({\overline{f}}\) is the average fitness value of individuals in the current population, \(f'\) is the higher fitness value of the two individuals undergoing crossover, and f is the fitness value of the individual undergoing mutation.
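Eqs. (20) and (21) share the same shape, so a single helper can serve both. In the sketch below, the function and parameter names are hypothetical, and the handling of a population in which all individuals have equal fitness (where the ratio is undefined) is an assumption that is not specified in the text.

```python
import math

def adaptive_probability(k_high, k_low, fit, f_max, f_avg):
    """Adaptive crossover or mutation probability following Eqs. (20) and (21).

    Use (k1, k3) with the higher parent fitness f' for crossover,
    and (k2, k4) with the individual's fitness f for mutation.
    """
    if fit < f_avg:
        return k_low                      # below-average individuals get the constant rate
    spread = f_max - f_avg
    if spread == 0:
        # ASSUMPTION: all individuals are equally fit, so fall back to the constant rate.
        return k_low
    return k_high * math.sin((f_max - fit) / spread)
```

For above-average individuals the argument of the sine lies between 0 and 1, so the probability falls to 0 as the fitness approaches the population maximum, which protects the best individuals from disruption.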

Selection strategy

The best-preserved selection function is employed to choose individuals for the next generation. First, roulette wheel selection is used to perform the selection operation of the genetic algorithm. Next, the individual with the highest fitness in the current population is copied directly into the next-generation population. The least fit individual in the new population is then identified, and the optimal individual undergoes genetic mutation, with the mutation probability determined by the service priority \(P(s_{ij})\). Finally, the fitness of the two individuals is compared: if the least fit individual has the lower fitness, it is replaced; otherwise, no replacement occurs.
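One possible reading of this selection strategy is sketched below. It is an interpretation under stated assumptions, not the authors' implementation: individuals are lists, fitness values are assumed to be strictly positive so that they can serve as roulette weights, the priority-driven mutation is passed in as a hypothetical `mutate` callback, and the least fit individual is assumed to be replaced by the mutated copy of the best individual.

```python
import random

def select_next_generation(population, fitness, mutate, rng=random):
    """Roulette wheel selection with best-individual preservation (illustrative sketch)."""
    scores = [fitness(ind) for ind in population]
    best = population[max(range(len(population)), key=scores.__getitem__)]
    # Roulette wheel: selection probability is proportional to fitness.
    chosen = rng.choices(population, weights=scores, k=len(population) - 1)
    # Copy the best individual directly into the next generation.
    new_pop = [list(ind) for ind in chosen] + [list(best)]
    new_scores = [fitness(ind) for ind in new_pop]
    worst_idx = min(range(len(new_pop)), key=new_scores.__getitem__)
    # Mutate a copy of the best individual (priority-based mutation is supplied by the caller).
    candidate = mutate(list(best))
    # Replace the least fit individual only if it is worse than the mutated candidate.
    if new_scores[worst_idx] < fitness(candidate):
        new_pop[worst_idx] = candidate
    return new_pop
```

Because the best individual is always carried over, the maximum fitness in the population never decreases from one generation to the next under this reading.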

Algorithm 1 Multi-service geographic region deployment based on priority algorithm (MS-GD-P)

Algorithm complexity analysis

The time complexity of the proposed Multi-Service Geographic region Deployment based on Priority algorithm (MS-GD-P) is obtained by evaluating each step of the algorithm. The iterative process runs for at most G iterations. In each iteration, assessing the population with the fitness function requires O(Mf), where M is the population size and f is the complexity of the fitness function. Cloning the population, identifying the best and worst individuals, and adding them to the population each have O(M) complexity. The special mutation operation has O(f) complexity. The crossover and mutation operations, which are applied conditionally on a random value, have a total complexity of O(Mf) over the individuals in the population. Selecting the new population and applying priority mutation to the best individuals also have O(M) complexity. Consequently, the complexity of one iteration is O(Mf), and the total complexity over G iterations is O(GMf). The advantages of the MS-GD-P algorithm can be summarized as follows:

  1. Initial Population: Using the K-medoids algorithm to generate the initial population improves the quality of the initial solutions, enabling the algorithm to converge more rapidly to high-quality solutions.

  2. Priority Mutation Strategy: Introducing priority mutation strategies helps the algorithm escape local optima, increasing the diversity of the search space and enhancing the global search capability.

  3. Dynamic Population Adjustment: The algorithm dynamically adjusts the population in each iteration by retaining the best individuals and performing special mutations while replacing the worst individuals. This improves the overall quality of the population and avoids premature convergence.

  4. Fitness-Prioritized Selection: Prioritizing the selection of individuals with higher fitness during the selection phase accelerates the convergence speed of the algorithm.

  5. Flexible Crossover and Mutation: The flexible settings for crossover and mutation rates allow the algorithm to adaptively balance exploration and exploitation at different stages, enhancing its robustness.
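As a compact restatement of the complexity analysis, the per-iteration costs listed there can be summed term by term. The grouping below follows the steps named in the text; the only added assumption is that \(M \ge 1\) and \(f \ge 1\), so that the \(O(Mf)\) terms dominate the \(O(M)\) and \(O(f)\) terms.

$$T(G,M,f)=G\cdot\Big(\underbrace{O(Mf)}_{\text{fitness evaluation}}+\underbrace{O(M)}_{\text{cloning, best/worst}}+\underbrace{O(f)}_{\text{special mutation}}+\underbrace{O(Mf)}_{\text{crossover, mutation}}+\underbrace{O(M)}_{\text{selection}}\Big)=O(GMf)$$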

4 Experimental evaluation

4.1 Experimental setup

The experiments in this study were run on a machine with a 12th Gen Intel(R) Core(TM) i7-12700H CPU at 2.30 GHz, 16 GB of RAM, and Windows 11 as the operating system. The experiments were implemented in Python.

To evaluate the proposed service deployment method, which considers service reliability and user coverage based on service priority, the widely used EUA dataset [33] is employed. It includes location information (i.e., latitude and longitude) for real-world base stations and end-users within metropolitan Melbourne, Australia, with a total of 1,464 base stations and 174,305 end-users. The city was divided into four regions: Commercial, Residential, Tourist, and General, with the request frequency of five applications set in the millions. Each user initiated 20, 15, and 5 requests to the applications at high, medium, and low frequencies, respectively. Table 2 shows the six types of services configured in this study and deployed in the four city regions, each with varying request frequencies depending on the region. Generally, services with more extensive functionality require more instances to be deployed, which increases costs. Table 2 also presents each service’s functionalities and deployment costs.

Table 2 Service information

4.2 Comparison approaches

We compared the proposed MS-GD-P method with six representative algorithms:

  • AR (Average Random algorithm): Allocates the deployment budget equally among services, resulting in an equal number of service instances.

  • BEAD-C [4]: Focuses on maximizing user coverage by selecting edge servers without considering service reliability benefits.

  • DCDS [22]: Dynamically deploys services based on factors such as geographical location, bandwidth, and reliability, accounting for user mobility.

  • ASD [26]: An adaptive user-side service deployment algorithm based on reinforcement learning, which dynamically adjusts strategies based on task and available node types.

  • BEAD-O [33]: An optimal method based on integer programming that selects edge servers to maximize both user coverage and service reliability benefits; it is suitable for solving small-scale optimization problems.

  • BEAD-G [33]: A greedy-based approximate algorithm for budgeted edge application deployment, considering user coverage and service reliability benefits to find solutions for larger-scale problems.

4.3 Experimental analysis

4.3.1 Overall experimental results analysis

As illustrated in Fig. 2, when the number of end-users increases from 300 to 500, all algorithms achieve higher overall user coverage and service reliability benefits. This is because, as the number of users increases, more users are located in the same region, and the likelihood that users are co-covered by edge servers increases. The figure also shows that the overall benefits of MS-GD-P are higher than those obtained by the other algorithms. Across all cases in Fig. 2, the average advantage of MS-GD-P is 55.43% over AR, 33.25% over BEAD-C, 12.28% over DCDS, 8.33% over ASD, 25.74% over BEAD-O, and 30.51% over BEAD-G.

Fig. 2 Comparison of overall benefits
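The paper does not state how the average advantage percentages are computed. One plausible reading, sketched below, is the mean relative gain of the proposed method over a baseline across the compared cases. The function name `average_advantage` and the sample values are hypothetical and are not taken from the experiments.

```python
def average_advantage(proposed, baseline):
    """Mean relative gain (in percent) of `proposed` over `baseline`.

    Assumption: each case contributes (proposed - baseline) / baseline,
    and the per-case gains are averaged with equal weight.
    """
    if not proposed or len(proposed) != len(baseline):
        raise ValueError("inputs must be non-empty and of equal length")
    gains = [(p - b) / b * 100.0 for p, b in zip(proposed, baseline)]
    return sum(gains) / len(gains)


# Made-up benefit values for two cases, for illustration only.
example = average_advantage([12.0, 15.0], [10.0, 10.0])
```

If the percentages were instead expressed relative to the proposed method's own benefit, the denominator would change and the figures would differ, so the formula should be read as one possible definition.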

Regarding user coverage benefits, as illustrated in Fig. 3, AR, BEAD-C, DCDS, ASD, BEAD-O, and BEAD-G achieve on average 78.04%, 33.76%, 12.16%, 8.60%, 44.03%, and 31.02% lower user coverage benefits than MS-GD-P, respectively. MS-GD-P thus outperforms the other algorithms in user coverage, covering more users and provisioning services for them. This is attributed to MS-GD-P considering the broad distribution of users and service request frequencies during deployment, which leads to higher user coverage benefits and meets user demands effectively.

Fig. 3 Comparison of user coverage benefits

Regarding service reliability benefits, as illustrated in Fig. 4, MS-GD-P surpasses AR, BEAD-C, DCDS, ASD, and BEAD-G by an average of 10.07%, 31.64%, 12.69%, 7.49%, and 28.86%, respectively, but falls short of BEAD-O by 12.29%. This discrepancy can be explained by the fact that BEAD-O aims to achieve theoretically maximal benefits by prioritizing regions with higher user requests, neglecting users in remote regions: it deploys services on edge servers with high request frequencies to achieve higher service reliability benefits. In contrast, MS-GD-P takes a holistic approach, considering service distribution, request frequencies, and resource consumption. It caters to users in remote regions with lower request frequencies, aiming to provide reliable services for all users. Therefore, while MS-GD-P’s service reliability benefits are lower than BEAD-O’s, its deployment strategy is more reasonable in practice, because it considers users in all regions and provides high-quality services for them.

Fig. 4 Comparison of service reliability benefits

4.3.2 Analysis of benefits for different service categories

As illustrated in Fig. 5, the benefits vary across service categories because of differences in their distribution breadth and request frequencies. MS-GD-P achieves high benefits across all services. BEAD-O achieves higher benefits than the other algorithms in Information Authentication, Music, and Complex Service. This is attributed to BEAD-O’s pursuit of theoretically maximal benefits: it deploys these services extensively because of their low deployment costs and high returns. However, BEAD-O’s benefits for the other services, as depicted in Fig. 5, are significantly lower than those of the other algorithms.

Fig. 5 Benefits of each service

Excluding BEAD-O, MS-GD-P yields higher benefits than the other algorithms in Information Authentication, Music, VR, and Complex Service. For the Image Rendering service, MS-GD-P is superior to AR by 63.81%, DCDS by 8.41%, and ASD by 8.96%, but inferior to BEAD-C by 6.09% and BEAD-G by 4.65%. For the Face Recognition service, MS-GD-P surpasses AR by 41.41%, DCDS by 16.16%, and ASD by 5.32%, but lags behind BEAD-C by 8.16% and BEAD-G by 10.30%. At the same time, MS-GD-P exhibits significant advantages over BEAD-C and BEAD-G in Information Authentication, Music, VR, and Complex Service: in terms of average benefits for these four services, it outperforms BEAD-C by 126.96%, 266.22%, 15.93%, and 13.07%, and BEAD-G by 78.00%, 231.62%, 15.46%, and 22.95%, respectively. The reason is that Face Recognition and Image Rendering services are primarily concentrated in densely populated tourist and commercial regions, where they yield high benefits, so BEAD-C and BEAD-G deploy these services extensively. Because these two services have higher deployment costs and resource consumption, however, MS-GD-P deploys only as many instances as are needed to meet overall user demands, resulting in slightly lower Face Recognition and Image Rendering benefits than BEAD-C and BEAD-G.

Although MS-GD-P’s user coverage and service reliability benefits are slightly lower than those of BEAD-C and BEAD-G for Face Recognition and Image Rendering, BEAD-C and BEAD-G exhibit lower benefits in Information Authentication, Music, VR, and Complex Service, indicating that they struggle to handle widely distributed and low-frequency services. In contrast, MS-GD-P, guided by service priority based on service distribution breadth, request frequency, and resource consumption, deploys services rationally. It effectively accounts for widely distributed and low-frequency services, achieving favorable outcomes across various service types, ensuring a positive user experience, and maximizing provider benefits.

In conclusion, MS-GD-P achieves rational deployment of various services across different geographical regions, considering both user coverage and service reliability comprehensively. Compared to other algorithms, MS-GD-P yields higher benefits.

5 Conclusion and future work

This paper introduces a service geographical deployment approach that prioritizes service reliability and user coverage based on service priorities. By considering user demands for services in different geographical regions and the characteristics of those regions, service priorities are calculated for diverse geographical regions. This prioritization forms the basis for enhancing the genetic algorithm to achieve the final deployment plan. A comparison with baseline methods using the real-world EUA dataset demonstrates that the deployment plans generated by the algorithm proposed in this paper achieve higher service reliability and user coverage.

Currently, we focus on optimizing service deployment plans by uncovering the varied service demands of users in different geographical regions. In the future, we will explore the impact of user movement on service deployment.