Abstract
In cloud-edge-end scenarios, achieving rational resource allocation, effective service deployment, and high service quality has become an active research topic. Service providers usually deploy services according to the characteristics of different geographical regions, which helps meet the diverse needs of users in those regions and optimize resource allocation and utilization. However, because users are widely distributed and server resources are limited, providing all types of services to users in every geographical region is not feasible. In addition, edge servers are prone to operational failures caused by software anomalies, hardware malfunctions, and malicious attacks, which decrease service reliability. To address these problems, this paper proposes a service priority metric based on user demands and regional characteristics in different geographical regions. Building upon this foundation, a Multi-Service Geographic region Deployment based on Priority (MS-GD-P) method is proposed. The method takes user coverage and service reliability into consideration to satisfy users’ needs for multiple services in different geographical regions. Experimental results on real datasets demonstrate that MS-GD-P outperforms baseline methods in user coverage and service reliability.
1 Introduction
Massive services have emerged with the application and integration of emerging technologies such as 5G, cloud computing, edge computing, big data, and artificial intelligence. The number of intelligent terminal devices and the volume of generated data are increasing rapidly, leading to explosive growth in computational demands. According to IDC’s Global DataSphere research [1], global data volume was predicted to reach 129 ZB in 2023 and to more than double by 2027. Because mobile terminal devices have limited computational resources, traditional cloud computing architectures cannot meet users' computational demands under this explosive growth in data services. At the same time, driven by user expectations for responsive and reliable services, there is a keen focus on enhancing service quality while processing massive quantities of data. Therefore, sinking cloud computing capabilities to the edge and deploying services based on the demands of different geographical regions can better meet the diverse needs of users in various regions, promoting balanced social development.
Service deployment based on geographical region [2] refers to deploying different services on servers in various locations according to users’ requirements. This deployment approach can increase service quality, reduce network latency and congestion, and decrease the costs and risks associated with data transmission. Service deployment has attracted researchers’ attention in recent years. Many efforts have focused on formulating suitable deployment strategies with different objectives and constraints, such as to improve end-users’ perceived service quality at minimum deployment cost [3], to maximize the end-user coverage with a specific deployment budget [4], and to maximize service robustness under budget constraints [5]. However, due to the complex and dynamic nature of the environment, achieving rational resource allocation and carrying out service deployment based on geographical region poses a challenging problem. Edge servers exhibit heterogeneity, with each edge node possessing limited and distinct computational and storage resources. Deploying services closer to users at the network edge can achieve higher performance and ensure service quality for users. Many of the current approaches may lead to some problems as follows:
1. Low User Coverage: Edge servers are deployed within specific geographical location limits, and their resource capacity is constrained [6, 7]. There is a limitation on the number of services deployed on each server. As the number of users increases and the range of service requests expands, servers cannot offer all types of services to users in every geographical region, reducing user coverage.
2. Insufficient Service Reliability: Users can only access the service within the signal range of the edge server [8]. However, the connection between users and servers might be unstable due to user mobility. In cases where the connection is lost, users must reconnect to another edge server, which may lead to service interruptions during this process, consequently reducing service reliability. Moreover, edge servers are susceptible to operational failures caused by software anomalies, hardware malfunctions, and malicious attacks [5, 9]. These factors can result in service interruptions and impact service reliability.
Based on the analysis above, this paper defines service priorities based on user demands in different geographical regions and regional characteristics. Building upon this, a Multi-Service Geographic region Deployment based on Priority algorithm (MS-GD-P) is proposed. Specifically, this paper employs algorithms such as dispersion, density clustering, and attention mechanisms from the dimensions of service distribution, service aggregation, and service resource requirements to calculate service priorities. Based on these priorities, deployment costs are allocated to determine the number of service deployments. Subsequently, the K-medoids clustering algorithm is utilized to select edge servers and preliminarily generate an initial deployment plan. Finally, an improved genetic algorithm based on priority is employed, considering user coverage and service reliability. Through multiple iterations, it generates service deployment plans. The priority-improved genetic algorithm possesses solid global search capabilities, adaptability, and interpretability advantages. It is well suited for developing deployment plans for various services in different geographical regions within the cloud-edge-end scenario presented in this paper.
The main contributions are summarized as follows:
1. The service priority metric is proposed for budget allocation, initial population generation, and improved selection strategies. By comprehensively considering user demands for different services in various geographical regions and the resources available, distinct service priorities are calculated within individual geographical regions and across the entire region.
2. The Multi-Service Geographic region Deployment based on Priority algorithm (MS-GD-P) is designed to improve the process of server selection. It takes service reliability and user coverage into consideration, yielding deployment plans with higher user coverage and service reliability.
The rest of this paper is organized as follows. Section 2 summarizes related works. Section 3 provides a detailed overview of service geographic region deployment under budget constraints. Section 4 outlines the performance evaluation results in different scenarios. Finally, we summarize this paper and point out the future work in Sect. 5.
2 Related work
2.1 Service prioritization
Service prioritization refers to ranking and allocating resources to different services in situations with limited resources. Setting service priorities ensures that critical and urgent services receive sufficient resources and are treated with higher precedence to meet their needs and requirements. References [10, 11] utilize the Deadline Monotonic (DM) scheduling algorithm to determine task priorities and the priority for accessing shared resources. References [12, 13] introduce per-flow priority, associating each priority with an individual flow. A priority range is established for each flow type, and priorities for each flow within that range are randomly assigned. This allows for probabilistic priorities, where one flow may have a higher priority than another with a certain probability. References [14, 15] allocate different priorities to services based on their tolerance to delays and their impact on the service ecosystem, respectively, though this approach only considers user requirements. Reference [16] addresses the coupling effect between computation offloading and service caching by considering service priorities. It aims to maximize offloading efficiency while maintaining cache balance based on service priorities, and then controls service caching through a given offloading strategy that replaces the cache service list, thus enhancing the benefits of joint offloading and cache control. Reference [17] adjusts weights based on service priorities, and decisions can also be dynamically updated as service priorities change. Reference [18] prioritizes services based on runtime and adequate response time. Reference [19] introduces a time-aware service recommendation approach based on dynamic service priorities and Quality of Service (QoS). It rates each service for recommendation based on service priorities and QoS scores. In Reference [20], traffic is prioritized on the basis of throughput, reliability, and latency requirements.
2.2 Service deployment
In the context of cloud-edge-device scenarios, service deployment refers to the simultaneous deployment of applications and services across cloud, edge nodes, and terminal devices, achieving collaborative operations among the cloud, edge, and terminal. This deployment approach combines the advantages of cloud, edge, and terminal computing, providing more flexible, efficient, and reliable services. Reference [6] formulates service deployment strategies based on the resource limitations of edge servers, the business logic of applications, and the average response time of services. Reference [7], primarily focusing on user demands, aims to maximize service quality and reliability while minimizing energy consumption and costs during service deployment at the edge. The deployment strategy in Reference [4], driven by service providers’ benefits, allows the deployment of application instances on edge servers within specific regions, providing services to most users in that region. Considering user mobility and service latency demands, Reference [21] deploys services in a mobile edge-cloud network to offer reliable services to users. Reference [22], considering factors like geographical location, bandwidth, and reliability, employs dynamic deployment strategies for service deployment. From a service provider’s perspective, Reference [3] considers service sharing and communication interference to minimize deployment costs and maximize overall service quality, effectively addressing multi-service deployment issues. Reference [23] introduces a hybrid optimization of Particle Swarm and Chemical Reaction to mitigate service delay and enhance Quality of Service. Reference [24] reduces service latency and increases resource utilization by considering the priority of services and the distribution of resource consumption. Reference [25] proposes an online heuristic algorithm. 
The algorithm recursively halves the SFC request into two sub-SFCs and finds the solution that satisfies the latency and resource constraints by slicing the latency constraints of the two sub-SFCs. Reference [26] proposes a user-end adaptive service deployment algorithm based on reinforcement learning, dynamically adjusting strategies according to task types and available node types to optimize user experience. Reference [27] balances access latency, communication latency, and service switching costs to enhance service quality economically and maximize end-user coverage. In Reference [28], a mobility model based on the probability density function of sojourn time is used to characterize user mobility intensity, and user experience is optimized by minimizing service deployment overhead. In Reference [29], the MSCD framework automatically customizes multi-domain network services based on users’ personalized service preferences and deploys them in multi-domain networks to provide customizable multi-domain services for different users. In Reference [30], an Ant Colony Optimization meta-heuristic algorithm for online SFC deployment is proposed to minimize the server operation cost and network latency for near real-time SFC deployment.
However, these service deployment methods often focus on a singular perspective. They may not comprehensively meet the needs of both users and service providers or maximize revenue based on distinct geographical characteristics.
3 Priority-based service deployment strategy
In addressing the service geographical deployment challenges in the cloud-edge-device scenario, this paper derives service priorities based on user demands for distinct service categories across various geographical regions and the available resources. Subsequently, servers are preliminarily selected according to service priorities using the K-medoids clustering algorithm. Following the generation of an initial deployment plan, an enhanced priority-based genetic algorithm is proposed to refine server selection. This algorithm aims to yield higher user benefits by comprehensively considering user coverage and service reliability.
As illustrated in Fig. 1, the multi-service geographical deployment approach based on service priorities primarily consists of Attention-based Service Prioritization and Priority-Enhanced Genetic Algorithm. The Attention-based Service Prioritization module calculates service priorities by considering the relationships among various factors influencing the services and the user demands for services in different geographical regions. This process resolves the issue of deploying different service categories in various geographical regions. The Priority-Enhanced Genetic Algorithm module generates an initial population based on service priorities. It incorporates these priorities into the individual selection phase, enabling faster and more accurate generation of service deployment plans. The main notations used in this paper are listed in Table 1.
3.1 Service priority
In the context of cloud-edge-device scenarios, deploying multiple services requires considering the optimal number of instances for each service in different geographical regions. Additionally, users within different geographical regions exhibit varying demands for services. To address this challenge, this paper introduces the concept of service priority. By analyzing user demands for services within different geographical regions, the paper calculates users’ preferences for different services, thereby determining the priority of each service within that region. The priorities of services across various regions are aggregated to establish the overall service priority in the entire region. This priority-based approach is then utilized to rank services and guide their deployment across different geographical regions within budget constraints.
In real-life scenarios, when users’ service requests are widely distributed geographically, the same service must be deployed on servers in different regions to ensure timely response to user requests. For instance, users could request chat services in both business districts and tourist regions. In this paper, we adopt the degree of dispersion to represent the wideness of service request distribution D(X, Y). The greater the degree of dispersion, the wider the distribution of requests, and vice versa. We use the standard deviation to calculate the degree of dispersion.
Definition 1
(Service request distribution) The service request distribution D(X, Y) is measured by the standard deviation of the two-dimensional random vector (X, Y), calculated as follows:
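As a minimal sketch of such a dispersion measure, assume it is taken as the standard distance \(\sqrt{\sigma_X^2+\sigma_Y^2}\) of the request coordinates, which is one common way to summarize the spread of a two-dimensional random vector. The function name `request_dispersion`, the use of population variances, and the sample coordinates are assumptions made for illustration; the exact formula used in this paper may differ.

```python
import math

def request_dispersion(points):
    """Standard distance of 2-D request locations: sqrt(var(X) + var(Y)).

    Assumption: population variances are used; this is an illustrative
    stand-in for D(X, Y), not necessarily the paper's exact formula.
    """
    n = len(points)
    if n == 0:
        raise ValueError("need at least one request location")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / n
    var_y = sum((y - mean_y) ** 2 for _, y in points) / n
    return math.sqrt(var_x + var_y)

# Hypothetical request locations: one tight group, one widely spread group.
clustered = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
spread = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
```

Under this measure, the widely spread requests yield the larger value, matching the statement that a greater degree of dispersion indicates a wider distribution of requests.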
Services with high request frequencies also require deploying multiple instances. A high request frequency indicates a substantial user demand and a higher reliability requirement for the service in that region. Consequently, deployment on servers within that region is prioritized to meet user needs. For instance, shopping services might receive many requests in commercial districts, leading to a prioritized deployment of shopping services in those regions compared to other services. A density-based clustering algorithm is employed to measure the aggregation level in various regions. The choice of a density-based clustering algorithm is due to its suitability for this problem, as it eliminates the need to set an initial number of cluster centers, making it more applicable. Additionally, density-based clustering is robust to noise, efficient in terms of speed, and suitable for handling larger datasets compared to centroid-based clustering algorithms [31].
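The density-based clustering step can be illustrated with a small DBSCAN-style routine. This is a sketch under stated assumptions: the paper does not name the specific density-based algorithm in this passage, and the parameters `eps` and `min_pts`, as well as the sample request locations, are hypothetical.

```python
import math

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # core point: keep growing the cluster
                queue.extend(nbrs)
        cluster += 1
    return labels

# Hypothetical request locations: two dense groups and one isolated request.
requests = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(requests, eps=1.5, min_pts=3)
num_clusters = len({c for c in labels if c != -1})
```

No cluster count has to be supplied in advance, and the isolated request is labeled as noise rather than forced into a cluster, which reflects the two properties cited above.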
Unlike cloud servers, edge servers do not possess ample resources. Therefore, extensive deployment on edge servers might be unsuitable if a service demand necessitates significant computational resources.
Definition 2
(Geographical service preference within a region) Suppose we use the sets \(S=\{s_{1}, s_{2},\ldots , s_{o}\}\) to represent the services and \(R=\{r_{1}, r_{2},\ldots , r_{p}\}\) to denote the regions. Geographical service \(s_{i}\) preference within region \(r_{j}\), denoted as \(Pr(s_{ij})\), is mainly defined by three factors: the overall request distribution of service \(s_{i}\), denoted as \(d_{i}\); the request frequency of service \(s_{i}\) in region \(r_{j}\), denoted as \(f_{ij}\); and the resource requirements of service \(s_{i}\) in region \(r_{j}\), denoted as \(re_{ij}\). \(Pr(s_{ij})\) can be calculated as follows.
where the parameters \(\alpha\), \(\beta\), and \(\gamma\) are the weights for \(d_{i}\), \(f_{ij}\), and \(re_{ij}\), respectively, subject to \(\alpha + \beta + \gamma = 1\) and \(\alpha , \beta ,\gamma \in (-1,1)\).
The factors influencing geographical service preference mutually interact with one another. Typically, research initially involves randomly determining weight coefficients and then progressively adjusting them to ascertain the impact of these factors on service redundancy deployment. Although random methods are simple, they can be slow in determining weight coefficients and may yield results deviating from the optimal solution. Moreover, these methods are, to some extent, influenced by the subjectivity of experimenters, leading to deficiencies in weight coefficient determination through random approaches. This paper determines the weights of each factor by learning their interrelationships. The Attention Mechanism can selectively filter out a small amount of crucial information from a large dataset and focus on this subset of essential data. Therefore, the Attention Mechanism is employed to compute corresponding weight coefficients, determining the effects of the three factors on service redundancy deployment. This is achieved through three feature spaces designed to learn the latent feature similarity between the various factors. The feature spaces are represented as follows.
\(X\in R^{3\times {S}}\) represents the feature matrix composed of latent feature vectors for S services. \(W^{q}\in R^{3\times {S}}\), \(W^{k}\in R^{3\times {S}}\), and \(W^{v}\in R^{3\times {S}}\) are parameter matrices used to map the feature vectors into the respective spaces. Q, K, and V correspond to Query, Key, and Value. Query is used to compare with other units to determine their importance. Key is used to measure the relevance of the Query to other elements in the sequence. Value contains information about the current position, which is used to generate the final output representation. Subsequently, attention weights \(\alpha _{s_{i}}\) for each service within the group relative to the specified service \(s_i\) can be calculated, indicating the relevance between each service and service \(s_i\).
\(Q_{s_{i}}\) represents the latent feature vector of service \(s_i\) within Q. \(\sqrt{d_{k}}\) denotes the square root of the dimension of the Key vectors, where \(\sqrt{d_{k}}=3\). Subsequently, the attention weights, normalized by the softmax (normalized exponential) function, are multiplied by the corresponding Value vectors, focusing on a small set of crucial information.
Finally, the information obtained for attention is summed and normalized to derive the weight coefficients that influence the geographical service preference factors within the region.
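The procedure described above follows the general pattern of scaled dot-product attention. The sketch below illustrates that pattern in plain Python under several assumptions: each service is one row of three features \([d_i, f_i, re_i]\), the projection matrices are 3x3, and identity matrices stand in for the learned parameters \(W^{q}\), \(W^{k}\), and \(W^{v}\). The function name `factor_weights` and the feature values are hypothetical.

```python
import math

def softmax(row):
    m = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(r, col)) for col in zip(*b)] for r in a]

def factor_weights(features, w_q, w_k, w_v):
    """features: one row [d_i, f_i, re_i] per service; w_*: 3x3 projections."""
    q, k, v = (matmul(features, w) for w in (w_q, w_k, w_v))
    d_k = len(k[0])
    # Scaled dot-product scores between every pair of services.
    scores = [[sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d_k) for kj in k]
              for qi in q]
    attn = [softmax(r) for r in scores]
    attended = matmul(attn, v)  # attention-weighted Value vectors
    # Sum over services, then normalize so the three weights add up to 1.
    sums = [sum(col) for col in zip(*attended)]
    total = sum(sums)
    return [s / total for s in sums]

# Placeholder projections; in the paper these are learned parameters.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
features = [[0.8, 0.6, 0.3], [0.2, 0.9, 0.5], [0.5, 0.1, 0.7]]
weights = factor_weights(features, identity, identity, identity)
```

With non-negative features and identity projections the three weights are positive; learned projections could also produce negative weights, which the range \((-1,1)\) permits.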
Definition 3
(Service priority) Users’ demands for different types of services vary based on their geographical regions. Hence, service priorities can be utilized to illustrate the distinct user requirements for services. Service \(s_i\) priority within region \(r_{j}\), denoted as \(P(s_{ij})\), can be calculated by the natural logarithm of the ratio between the proportion of service preference in that region and the proportion of service preference across all regions.
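A minimal sketch of this log-ratio follows. It assumes that "the proportion of service preference in that region" is the share of \(Pr(s_{ij})\) among all services in region \(r_j\), and that "the proportion across all regions" is the share of service \(s_i\) in the total preference over the entire area. The function name `regional_priority` and the preference values are hypothetical.

```python
import math

def regional_priority(pref):
    """pref[i][j]: preference Pr(s_ij) > 0 of service i in region j.

    Returns P[i][j] = ln(share of service i in region j
                         / share of service i over all regions).
    """
    region_totals = [sum(col) for col in zip(*pref)]
    service_totals = [sum(row) for row in pref]
    grand_total = sum(service_totals)
    return [[math.log((pref[i][j] / region_totals[j])
                      / (service_totals[i] / grand_total))
             for j in range(len(region_totals))]
            for i in range(len(pref))]

# Two services, two regions: service 0 is favored in region 0, service 1 in region 1.
pref = [[8.0, 2.0],
        [2.0, 8.0]]
priority = regional_priority(pref)
```

A positive value means the service is preferred more strongly in that region than in the area as a whole, and a negative value means the opposite.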
Service \(s_i\) priority, denoted as \(P(s_{i})\), can be calculated as follows.
3.2 Budget allocation
Using the service priorities above as a reference, the deployment quantities for each service instance are determined based on the total deployment budget and the budget for deploying individual services.
Definition 4
(Budget allocation) N is the default number of service deployments, limited by the current budget b. Subsequently, budget allocation is carried out by service priorities. \(SSB_{i}^{0}\) represents the spare service budget when assigning to \(s_i\), while \(CSB_{i}^{1}\) represents the current service budget [32].
\(n_i\) represents the computed deployment number for service \(s_i\), \(C_i^r\) denotes the number of cluster centers obtained from the density clustering algorithm, \(c_i\) signifies the deployment cost for service \(s_i\), and \(f_i\) is defined as the ratio of the deployment cost of the current service to the sum of the deployment costs of the undeployed services, which is as follows:
Following the cost allocation above, the required deployment number for each service \(s_i\) can be calculated.
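One way to read this allocation is sketched below. It is an illustration built on assumptions rather than the paper's exact procedure: services are processed in descending priority, the current service budget is taken as \(f_i\) times the remaining budget, the deployment number is capped by the number of cluster centers \(C_i^r\), and any unspent amount is carried over as spare budget. The function name `allocate_budget` and all input values are hypothetical.

```python
def allocate_budget(budget, priorities, costs, cluster_centers):
    """Return (deployment counts per service, leftover budget).

    Assumed rule: visit services by descending priority; each receives the
    share f_i = c_i / (sum of costs of not-yet-deployed services) of the
    remaining budget, capped by its number of cluster centers.
    """
    order = sorted(range(len(costs)), key=lambda i: priorities[i], reverse=True)
    remaining = float(budget)
    counts = [0] * len(costs)
    for pos, i in enumerate(order):
        undeployed_cost = sum(costs[k] for k in order[pos:])
        share = costs[i] / undeployed_cost        # f_i
        service_budget = remaining * share        # current service budget
        affordable = int(service_budget // costs[i])
        counts[i] = min(cluster_centers[i], affordable)
        remaining -= counts[i] * costs[i]         # spare budget rolls over
    return counts, remaining

counts, leftover = allocate_budget(
    budget=100,
    priorities=[0.9, 0.5, 0.1],
    costs=[10, 5, 5],
    cluster_centers=[3, 10, 10],
)
```

In this example the highest-priority service is capped at its three cluster centers, and the budget it leaves unused is spent on the remaining services.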
3.3 Initial deployment plan generation
After determining the number of deployments for each service, the services must be assigned to appropriate servers. Given the fixed number of services to be deployed, this paper opts for the K-medoids algorithm to generate the initial deployment plan. Common centroid-based clustering algorithms include K-means, K-medoids, K-medians, and K-means++. K-means, K-means++, and K-medians share a similar approach, with K-means++ emphasizing the selection of initial centroids as far apart as possible based on distances. K-medians relies on the median of sample points, while K-means and K-means++ focus on the mean value of sample points; however, their selected centroids might not be actual sample points. Hence, in this context, the K-medoids algorithm is more suitable because it requires centroids to be actual sample points, so every selected center corresponds to an existing edge server. The deployment count for each region is determined according to the calculations in Sect. 3.2, and then K-medoids clustering is performed based on regions, resulting in the initial deployment plan.
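The server selection step can be sketched with a simple alternating K-medoids routine. This is a minimal illustration rather than the paper's implementation: the function name `k_medoids`, the `seed` parameter, and the server coordinates are assumptions, and the alternating update shown here is one of several K-medoids variants.

```python
import math
import random

def k_medoids(points, k, seed=0, max_iter=100):
    """Return the indices of k medoids; every medoid is an actual sample point."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)  # random initial medoids
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: math.dist(p, points[m]))
            clusters[nearest].append(i)
        # Update step: in each cluster, pick the member with the smallest
        # total distance to the other members.
        new_medoids = []
        for m, members in clusters.items():
            if not members:  # possible only with duplicate coordinates
                new_medoids.append(m)
                continue
            best = min(members, key=lambda c: sum(
                math.dist(points[c], points[o]) for o in members))
            new_medoids.append(best)
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return sorted(medoids)

# Hypothetical edge server coordinates within one region.
servers = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
chosen = k_medoids(servers, k=2)
```

Because the result is a list of indices into `servers`, each selected center is guaranteed to be a real server location, which is the property that motivates K-medoids here. Different seeds can yield different medoids.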
3.4 Generation of service deployment plan
3.4.1 User coverage and service reliability
Suppose we use the sets \(\text{ES}=\{\text{es}_{1}, \text{es}_{2},\ldots , \text{es}_{n}\}\) to represent the edge servers and \(U=\{u_{1}, u_{2},\ldots , u_{m}\}\) to denote the end-users. The objective of service deployment is to achieve high service reliability and extensive user coverage.
User coverage [33]
Each edge server has a certain coverage range and offers services to end-users within that range. In real scenarios, there might be overlapping regions between edge servers, where any overlapping edge servers can serve end-users. If an end-user is solely covered by one edge server, he can only access service instances deployed on that specific edge server. Therefore, determining whether an end-user is covered by an edge server is crucial to ascertain their accessibility to service instances. To achieve this, we use the sets \(L(es)=\{L_{1}(es), L_{2}(es),\ldots , L_{n}(es)\}\) to represent the locations of edge servers and \(L(u)=\{L_{1}(u), L_{2}(u),\ldots , L_{m}(u)\}\) to denote the locations of end-users. The Euclidean metric function \(\text{dist}(L_{i}(es), L_{k}(u))\) can be employed to measure the distance between edge server \(\text{es}_{i}\) and user \(u_{k}\).
Definition 5
(Instance accessibility) The service instance accessibility \(\text{Acc}(\text{es}_{i},u_{k})\) is a function that returns 1 if the end-user \(u_{k}\) is located within the coverage of edge server \(\text{es}_{i}\), and returns 0 otherwise.
where \(\text{cov}(\text{es}_{i})\) represents the coverage radius of edge server \(\text{es}_{i}\).
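Read together with the Euclidean distance function introduced above, the definition suggests an indicator of the following form. This is a reconstruction from the surrounding text, and treating the coverage boundary as inclusive is an assumption:

\[
\text{Acc}(\text{es}_{i},u_{k}) =
\begin{cases}
1, & \text{dist}(L_{i}(es), L_{k}(u)) \le \text{cov}(\text{es}_{i}) \\
0, & \text{otherwise}
\end{cases}
\]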
Definition 6
(User coverage benefit) User coverage benefit \(\text{CB}(\text{es}_i)\) is measured by the total number of end-users who can access the edge server \(\text{es}_i\). \(\text{CB}(\text{es}_i)\) is defined as follows:
According to Definition 6, given a set of edge servers ES deploying service instances and a set of end-users U, the overall user coverage benefit of all edge servers ES for all end-users U, denoted as CB(ES), can be measured as follows.
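The coverage benefit can be sketched in a few lines. The per-server count follows Definition 6 directly. For the overall benefit, the sketch assumes that a user covered by several overlapping servers is counted once; the function names, the inclusive coverage boundary, and the sample coordinates are assumptions for illustration.

```python
import math

def acc(server_loc, radius, user_loc):
    """Instance accessibility: 1 if the user lies within the server's coverage."""
    return 1 if math.dist(server_loc, user_loc) <= radius else 0

def coverage_benefit(server, users):
    """CB(es_i): number of end-users who can access this edge server."""
    loc, radius = server
    return sum(acc(loc, radius, u) for u in users)

def overall_coverage(servers, users):
    """CB(ES) under the assumption that each covered user counts once."""
    return sum(1 for u in users
               if any(acc(loc, radius, u) for loc, radius in servers))

# Hypothetical layout: two overlapping servers, (location, coverage radius).
servers = [((0.0, 0.0), 2.0), ((3.0, 0.0), 2.0)]
users = [(1.0, 0.0), (2.0, 0.0), (4.0, 0.0), (9.0, 9.0)]
per_server = [coverage_benefit(s, users) for s in servers]
total = overall_coverage(servers, users)
```

Here the two users in the overlap count toward both per-server values but only once toward the overall value, and the distant user at (9, 9) is not covered at all.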
Service Reliability [33]
Suppose the service required by user \(u_{k}\) is deployed on both edge servers \(\text{es}_i\) and \(\text{es}_o\). If \(\text{es}_i\) becomes unavailable due to hardware/software issues or network anomalies, the user \(u_{k}\) can continue their service on edge server \(\text{es}_o\) without experiencing a service interruption. In this scenario, the service reliability for user \(u_{k}\) is ensured, and edge server \(\text{es}_o\) can be seen as contributing to the service reliability benefit for user \(u_{k}\). The service reliability benefit for user \(u_{k}\) is defined as follows.
Definition 7
(Service reliability benefit) Suppose a set of edge servers \(\text{ES}=\{\text{es}_{1}, \text{es}_{2},\ldots , \text{es}_{n}\}\) deploying service instances and an end-user \(u_{k}\in U\), the service reliability benefit \(RB(u_{k})\) obtained by \(u_{k}\) is measured by the number of those edge servers that co-cover \(u_{k}\).
It follows that \(\text{RB}(u_{k})=0\) when \(u_{k}\) can access only one service instance, and \(\text{RB}(u_{k})=j-1\) when \(u_{k}\) can access j instances.
According to Definition 7, suppose a set of edge servers ES deploying service instances and a set of end-users U, the overall service reliability benefit of all edge servers ES for all end-users U, denoted as RB(ES), can be used to represent service reliability. The calculation of RB(ES) is defined as follows.
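A minimal sketch consistent with the two stated properties (\(\text{RB}(u_{k})=0\) for a single accessible instance and \(j-1\) for j instances) is given below. It assumes that an uncovered user contributes 0 and that RB(ES) is the sum of the per-user values; the function names and sample coordinates are hypothetical.

```python
import math

def covering_servers(servers, user_loc):
    """Number of edge servers whose coverage contains the user."""
    return sum(1 for loc, radius in servers
               if math.dist(loc, user_loc) <= radius)

def reliability_benefit(servers, user_loc):
    """RB(u_k): backup servers beyond the first; 0 if covered by one or none."""
    return max(0, covering_servers(servers, user_loc) - 1)

def overall_reliability(servers, users):
    """RB(ES), assumed here to be the sum of RB(u_k) over all end-users."""
    return sum(reliability_benefit(servers, u) for u in users)

# Hypothetical layout: two overlapping servers, (location, coverage radius).
servers = [((0.0, 0.0), 2.0), ((3.0, 0.0), 2.0)]
users = [(1.0, 0.0), (2.0, 0.0), (4.0, 0.0), (9.0, 9.0)]
per_user = [reliability_benefit(servers, u) for u in users]
total = overall_reliability(servers, users)
```

Only the two users inside the overlap gain a reliability benefit, since each of them can fall back to the other server if one fails.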
3.4.2 Priority-enhanced genetic algorithm
Compared to other optimization algorithms, the Priority-Enhanced Genetic Algorithm possesses advantages such as strong global search capability, high parallelism, and strong interpretability. It can also address discrete optimization problems, making it well suited for generating service deployment solutions in the cloud-edge-end scenario described in this paper. Therefore, this paper adopts the Priority-Enhanced Genetic Algorithm to generate various deployment plans for multiple services across different geographical regions.
Population initialization
The initialization of the population is crucial for the convergence of the genetic algorithm and can impact the speed at which the optimal population is generated. Many population initialization methods employ random methods. However, random strategies can often result in a significant deviation between the fitness of the initial population of individuals and the desired optimal fitness, leading to a decrease in solution quality. Therefore, an appropriate population initialization method is required to enhance the performance of the genetic algorithm. This paper generates the initial deployment plan by selecting servers using the K-medoids algorithm. Since the initial centroids chosen by the K-medoids algorithm are random and based on sample points, the generated individuals are distinct, constituting a population and achieving population initialization.
Chromosome representation
The encoding method employs the floating-point encoding approach.
Fitness function
In this paper, the total user coverage benefits and overall service reliability benefit mentioned earlier are utilized to calculate the fitness of each individual. Additionally, each individual must satisfy resource and cost constraints.
Crossover and mutation
Traditional genetic algorithms often employ fixed crossover and mutation probabilities, which can destroy good individuals and retain poor ones, resulting in decreased algorithm performance. In this paper, an adaptive crossover and mutation probability strategy is devised. The crossover and mutation probabilities are adjusted based on a combination of the current individual’s fitness and the difference between the maximum and average fitness of the current population. The crossover probability \(p_c\) and mutation probability \(p_m\) are defined as follows.
where \(k_1\), \(k_2\), \(k_3\), \(k_4 \le 1\); \(f_{\max}\) is the maximum fitness value of individuals in the current population, \({\overline{f}}\) is the average fitness value of individuals in the current population, \(f'\) is the higher fitness value of the two individuals undergoing crossover, and f is the fitness value of the individual undergoing mutation.
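The symbols above match the classical adaptive scheme of Srinivas and Patnaik, in which above-average individuals receive probabilities that shrink linearly as their fitness approaches the maximum, while below-average individuals receive a constant. The sketch below assumes the paper follows that form; the default constants and the guard for a population with identical fitness values are illustrative choices.

```python
def crossover_prob(f_prime, f_max, f_avg, k1=1.0, k3=1.0):
    """Adaptive crossover probability p_c.

    f_prime is the higher fitness of the two parents. Below-average pairs
    (or a population with no fitness spread) get the constant k3.
    """
    if f_prime < f_avg or f_max == f_avg:
        return k3
    return k1 * (f_max - f_prime) / (f_max - f_avg)

def mutation_prob(f, f_max, f_avg, k2=0.5, k4=0.5):
    """Adaptive mutation probability p_m for an individual with fitness f."""
    if f < f_avg or f_max == f_avg:
        return k4
    return k2 * (f_max - f) / (f_max - f_avg)

# Toy population statistics: maximum fitness 10, average fitness 5.
p_c_best = crossover_prob(10.0, 10.0, 5.0)  # best individual is left intact
p_c_mid = crossover_prob(7.5, 10.0, 5.0)
p_c_poor = crossover_prob(3.0, 10.0, 5.0)   # below average: disrupted freely
p_m_mid = mutation_prob(7.5, 10.0, 5.0)
```

The best individual gets probability 0 and is therefore protected, whereas below-average individuals are recombined and mutated at the full constant rate, which addresses the problem of destroying good individuals and retaining poor ones.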
Selection strategy
The best-preserved selection function is employed to choose individuals for the next generation. First, roulette wheel selection performs the selection operation in the genetic algorithm. Next, the individual with the highest fitness in the current population is copied directly into the next-generation population. Subsequently, the least fit individual in the new population is identified, and the optimal individual undergoes genetic mutation, with the mutation probability determined by the service priority \(P(s_{ij})\). Finally, the fitness of the mutated individual is compared with that of the least fit individual: if the least fit individual has the lower fitness, it is replaced; otherwise, no replacement occurs.
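The selection steps can be sketched as follows. This is an illustration under assumptions: individuals are tuples of floats, fitness values are non-negative and not all zero, and the priority-driven mutation is a hypothetical operator in which each gene changes with a probability taken from a priority value scaled into [0, 1]. The names `next_generation` and `make_priority_mutation` are not from the paper.

```python
import random

def next_generation(population, fitness, priority_mutate, rng):
    """One selection step: roulette wheel, keep the best, try to replace the worst."""
    fits = [fitness(ind) for ind in population]
    best = population[fits.index(max(fits))]
    # Roulette wheel: draw with probability proportional to fitness.
    new_pop = rng.choices(population, weights=fits, k=len(population) - 1)
    new_pop.append(best)  # the best individual is copied over unchanged
    # Mutate a copy of the best individual using the priority-driven operator.
    candidate = priority_mutate(best, rng)
    worst_idx = min(range(len(new_pop)), key=lambda i: fitness(new_pop[i]))
    # Replace the least fit individual only if the mutant is fitter.
    if fitness(new_pop[worst_idx]) < fitness(candidate):
        new_pop[worst_idx] = candidate
    return new_pop

def make_priority_mutation(priorities, step=1.0):
    """Hypothetical operator: gene g changes with probability priorities[g]."""
    def mutate(individual, rng):
        return tuple(gene + step if rng.random() < p else gene
                     for gene, p in zip(individual, priorities))
    return mutate

rng = random.Random(42)
population = [(1.0, 2.0), (3.0, 1.0), (0.5, 0.5), (2.0, 2.0)]
fitness = sum  # toy fitness: sum of the genes
mutate = make_priority_mutation(priorities=[0.9, 0.1])
new_population = next_generation(population, fitness, mutate, rng)
```

Because the best individual is always carried over and the worst is replaced only by a fitter mutant, the maximum fitness of the population never decreases from one generation to the next.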
Algorithm complexity analysis
The proposed Multi-Service Geographic region Deployment based on Priority algorithm (MS-GD-P) exhibits a time complexity derived from evaluating each algorithm step. The iterative process runs for a maximum of G iterations. In each iteration, assessing the population with the fitness function requires O(Mf), where M is the population size and f is the complexity of the fitness function. Cloning the population, identifying the best and worst individuals, and adding them to the population each have a O(M) complexity. The special mutation operation has a O(f) complexity. The crossover and mutation operations, conditional on a random value, result in a complexity of O(Mf) for each individual in the population. Selecting the new population and applying priority mutation to the best individuals also have O(M) complexities. Consequently, the complexity for one iteration is O(Mf); for G iterations, the total complexity is O(GMf). The advantages of the MS-GD-P algorithm can be summarized as follows:
1. Initial Population: Using the K-medoids algorithm to generate the initial population improves the quality of the initial solutions, enabling the algorithm to converge more rapidly to high-quality solutions.
2. Priority Mutation Strategy: Introducing priority mutation strategies helps the algorithm escape local optima, increasing the diversity of the search space and enhancing the global search capability.
3. Dynamic Population Adjustment: The algorithm dynamically adjusts the population in each iteration by retaining the best individuals and performing special mutations while replacing the worst individuals. This improves the overall quality of the population and avoids premature convergence.
4. Fitness-Prioritized Selection: Prioritizing the selection of individuals with higher fitness during the selection phase accelerates the convergence speed of the algorithm.
5. Flexible Crossover and Mutation: The flexible settings for crossover and mutation rates allow the algorithm to adaptively balance exploration and exploitation at different stages, enhancing its robustness.
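Returning to the complexity analysis, the per-iteration cost can be tallied term by term. The following is a sketch of that sum under one added assumption that the text does not state explicitly: a single fitness evaluation costs at least constant time, i.e. \(f \ge 1\), so that the O(M) and O(f) terms are absorbed by O(Mf).

\[
T_{\text{iter}} = \underbrace{O(Mf)}_{\text{fitness evaluation}} + \underbrace{O(M)}_{\text{cloning, best/worst}} + \underbrace{O(f)}_{\text{special mutation}} + \underbrace{O(Mf)}_{\text{crossover and mutation}} + \underbrace{O(M)}_{\text{selection, priority mutation}} = O(Mf),
\qquad
T_{\text{total}} = G \cdot T_{\text{iter}} = O(GMf).
\]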
4 Experimental evaluation
4.1 Experimental setup
The experiments in this study were run on a 12th Gen Intel(R) Core(TM) i7-12700H CPU at 2.30 GHz with 16 GB of RAM under Windows 11. The experimental implementation was carried out in the Python programming language.
To evaluate the proposed service deployment method, which considers service reliability and user coverage based on service priority, the widely used EUA dataset [33] is employed. It includes the location information (i.e., latitude and longitude) of real-world base stations and end-users within metropolitan Melbourne, Australia, with a total of 1,464 base stations and 174,305 end-users. The city was divided into four regions: Commercial, Residential, Tourist, and General, with the request frequency of five applications set in the millions. Each user initiated 20, 15, and 5 requests to the applications at high, medium, and low frequencies, respectively. Table 2 shows the six types of services configured in this study and deployed in the four city regions, each with varying request frequencies depending on the region. Generally, services with more extensive functionality require more instances to be deployed, increasing costs. Table 2 also presents each service’s functionalities and deployment costs.
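As a small illustration of how the per-user request counts translate into regional demand for one service, consider the sketch below. Only the 20/15/5 mapping comes from the setup above; the user counts per region and the frequency level assigned to each region are hypothetical placeholders, since the actual assignments are given in Table 2 and are not reproduced here.

```python
# Requests issued per user at each frequency level (from the experimental setup).
REQUESTS_PER_LEVEL = {"high": 20, "medium": 15, "low": 5}


def regional_requests(users_per_region, level_per_region):
    """Total requests for one service in each region.

    users_per_region: region name -> number of users (hypothetical values)
    level_per_region: region name -> "high" | "medium" | "low" (hypothetical values)
    """
    return {
        region: users * REQUESTS_PER_LEVEL[level_per_region[region]]
        for region, users in users_per_region.items()
    }


# Hypothetical example: one service that is popular in commercial regions only.
demand = regional_requests(
    {"Commercial": 100, "Residential": 200, "Tourist": 10, "General": 40},
    {"Commercial": "high", "Residential": "medium", "Tourist": "low", "General": "low"},
)
```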
4.2 Comparison approaches
We compared the proposed MS-GD-P method with six representative algorithms:
- AR (Average Random algorithm): Allocates the deployment budget equally among services, resulting in an equal number of service instances.
- BEAD-C [4]: Focuses on maximizing user coverage by selecting edge servers without considering service reliability benefits.
- DCDS [22]: Dynamically deploys services based on factors such as geographical location, bandwidth, and reliability, accounting for user mobility.
- ASD [26]: An adaptive user-side service deployment algorithm based on reinforcement learning, which dynamically adjusts strategies based on task and available node types.
- BEAD-O [33]: An optimal method based on integer programming that selects edge servers to maximize both user coverage and service reliability benefits; it is suitable for small-scale optimization problems.
- BEAD-G [33]: A greedy-based approximate algorithm for budgeted edge application deployment that considers user coverage and service reliability benefits to find solutions for larger-scale problems.
4.3 Experimental analysis
4.3.1 Overall experimental results analysis
As illustrated in Fig. 2, when the number of end-users increases from 300 to 500, all algorithms achieve higher overall user coverage and service reliability benefits. This is because, as the number of users increases, more users exist in the same region and the possibility of users being co-covered by edge servers increases. Throughout, the overall benefits of MS-GD-P remain higher than those obtained by the other algorithms. Across all cases in Fig. 2, the average advantage of MS-GD-P is 55.43% over AR, 33.25% over BEAD-C, 12.28% over DCDS, 8.33% over ASD, 25.74% over BEAD-O, and 30.51% over BEAD-G.
Regarding user coverage benefits, as illustrated in Fig. 3, AR, BEAD-C, DCDS, ASD, BEAD-O, and BEAD-G achieve on average 78.04%, 33.76%, 12.16%, 8.60%, 44.03%, and 31.02% lower user coverage benefits than MS-GD-P, respectively. MS-GD-P thus outperforms the other algorithms in user coverage, covering more users and provisioning services for them. This is attributed to MS-GD-P considering the broad distribution of users and service request frequencies during deployment, which allows it to meet user demands effectively.
Regarding service reliability benefits, as illustrated in Fig. 4, MS-GD-P surpasses AR, BEAD-C, DCDS, ASD, and BEAD-G on average by 10.07%, 31.64%, 12.69%, 7.49%, and 28.86%, respectively. However, MS-GD-P falls short of BEAD-O by 12.29%. This discrepancy can be explained by the fact that BEAD-O aims to achieve theoretically maximal benefits by prioritizing regions with higher user requests, neglecting users in remote regions. It deploys services on edge servers with high request frequencies to achieve higher service reliability benefits. In contrast, MS-GD-P takes a holistic approach, considering service distribution, request frequencies, and resource consumption. It also caters to users in remote regions with lower request frequencies, aiming to provide reliable services for all users. Therefore, while MS-GD-P’s service reliability benefits are lower than BEAD-O’s, its deployment strategy is more reasonable in practice, since it considers users in all regions and provides high-quality services for them.
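The text does not spell out how the "average advantage" percentages are computed. One common definition, shown here purely as an assumption, is the mean of the per-case relative improvements of MS-GD-P over a baseline. The function name and the sample numbers below are hypothetical and are not taken from the paper's results.

```python
def average_advantage(ours, baseline):
    """Mean relative improvement of `ours` over `baseline`, in percent.

    ours, baseline: benefit values for the same experimental cases, in the
    same order. Baseline values are assumed to be non-zero.
    """
    if not ours or len(ours) != len(baseline):
        raise ValueError("need two non-empty sequences of equal length")
    gains = [(o - b) / b * 100.0 for o, b in zip(ours, baseline)]
    return sum(gains) / len(gains)


# Hypothetical benefits for two cases: +10% and +20% give an average of 15%.
example = average_advantage([110.0, 120.0], [100.0, 100.0])
```

Note that under this definition "A is x% over B" and "B is x% less than A" are not interchangeable, because the two statements normalize by different denominators.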
4.3.2 Analysis of benefits for different service categories
As illustrated in Fig. 5, the benefits vary across service categories because of differences in distribution breadth and request frequency. MS-GD-P achieves higher benefits across all services. BEAD-O achieves higher benefits than the other algorithms in Information Authentication, Music, and Complex Service. This is attributed to BEAD-O’s pursuit of theoretical maximal benefits: it deploys these services extensively because of their low deployment costs and high returns. However, BEAD-O’s benefits for the other services, as depicted in Fig. 5, are significantly lower than those of the other algorithms.
Excluding BEAD-O, MS-GD-P yields higher benefits in Information Authentication, Music, VR, and Complex Service than the other algorithms. For the Image Rendering service, MS-GD-P is superior to AR by 63.81%, DCDS by 8.41%, and ASD by 8.96%, but inferior to BEAD-C by 6.09% and BEAD-G by 4.65%. For the Face Recognition service, MS-GD-P surpasses AR by 41.41%, DCDS by 16.16%, and ASD by 5.32%, but lags behind BEAD-C by 8.16% and BEAD-G by 10.30%. At the same time, MS-GD-P exhibits significant advantages over BEAD-C and BEAD-G in Information Authentication, Music, VR, and Complex Services: in terms of average benefits for these four services, it outperforms BEAD-C by 126.96%, 266.22%, 15.93%, and 13.07%, and BEAD-G by 78.00%, 231.62%, 15.46%, and 22.95%, respectively. The reason is that Face Recognition and Image Rendering services are primarily concentrated in densely populated tourist and commercial regions, where they yield high benefits, so BEAD-C and BEAD-G deploy these services extensively. However, because these services have higher deployment costs and resource consumption, MS-GD-P deploys only a certain number of them in order to meet overall user demands, resulting in slightly lower Face Recognition and Image Rendering benefits than BEAD-C and BEAD-G.
Although MS-GD-P’s user coverage and service reliability benefits are slightly lower than those of BEAD-C and BEAD-G for Face Recognition and Image Rendering, BEAD-C and BEAD-G exhibit lower benefits in Information Authentication, Music, VR, and Complex Services, indicating that they struggle to balance widely distributed and low-frequency services. In contrast, MS-GD-P, guided by service priority based on service distribution breadth, request frequency, and resource consumption, deploys services rationally. It effectively accounts for widely distributed and low-frequency services, achieving favorable outcomes across various service types, ensuring a positive user experience, and maximizing provider benefits.
In conclusion, MS-GD-P achieves rational deployment of various services across different geographical regions, considering both user coverage and service reliability comprehensively. Compared to other algorithms, MS-GD-P yields higher benefits.
5 Conclusion and future work
This paper introduces a service geographical deployment approach that prioritizes service reliability and user coverage based on service priorities. By considering user demands for services in different geographical regions and the characteristics of those regions, service priorities are calculated for diverse geographical regions. This prioritization forms the basis for enhancing the genetic algorithm to achieve the final deployment plan. A comparison with baseline methods using the real-world EUA dataset demonstrates that the deployment plans generated by the algorithm proposed in this paper achieve higher service reliability and user coverage.
Currently, we focus on optimizing service deployment plans by uncovering the varied service demands of users in different geographical regions. In the future, we will explore the impact of user movement on service deployment.
Data availability
No datasets were generated or analysed during the current study.
References
Rydning J (2023) Worldwide idc global datasphere forecast, 2023–2027: It’s a distributed, diverse, and dynamic (3d) datasphere. Technical report
Lima D, Miranda H (2022) A geographical-aware state deployment service for fog computing. Comput Netw 216:109208. https://doi.org/10.1016/j.comnet.2022.109208
Zhao L, Tan W, Li B et al (2021) Joint shareability and interference for multiple edge application deployment in mobile-edge computing environment. IEEE Internet Things J 9(3):1762–1774. https://doi.org/10.1109/JIOT.2021.3088493
Chen F, Zhou J, Xia X et al (2020) Optimal application deployment in mobile edge computing environment. In: 2020 IEEE 13th International Conference on Cloud Computing (CLOUD), IEEE, pp 184–192. https://doi.org/10.1109/CLOUD49709.2020.00037
Li B, He Q, Cui G et al (2020) Read: Robustness-oriented edge application deployment in edge computing environment. IEEE Trans Serv Comput 15(3):1746–1759. https://doi.org/10.1109/TSC.2020.3015316
Deng S, Xiang Z, Taheri J et al (2020) Optimal application deployment in resource constrained distributed edges. IEEE Trans Mobile Comput 20(5):1907–1923. https://doi.org/10.1109/TMC.2020.2970698
Cao B, Wei Q, Lv Z et al (2020) Many-objective deployment optimization of edge devices for 5g networks. IEEE Trans Netw Sci Eng 7(4):2117–2125. https://doi.org/10.1109/TNSE.2020.3008381
Yin H, Zhang X, Liu HH et al (2017) Edge provisioning with flexible server placement. IEEE Trans Parallel Distrib Syst 28(4):1031–1045. https://doi.org/10.1109/TPDS.2016.2604803
Li B, He Q, Chen F et al (2021) Inspecting edge data integrity with aggregate signature in distributed edge computing environment. IEEE Trans Cloud Comput 10(4):2691–2703. https://doi.org/10.1109/TCC.2021.3059448
Zagalo K, Abdeddaïm Y, Bar-Hen A et al (2022) Response time stochastic analysis for fixed-priority stable real-time systems. IEEE Trans Comput 72(1):3–14. https://doi.org/10.1109/TC.2022.3211421
Jiang X, Chen Z, Yang M et al (2022) A unified blocking analysis for parallel tasks with spin locks under global fixed priority scheduling. IEEE Trans Comput 72(1):15–28. https://doi.org/10.1109/TC.2022.3198634
Akahoshi K, Oki E (2022) Service deployment with per-flow-priority-based virtual network function resizing. In: 2022 23rd Asia-Pacific Network Operations and Management Symposium (APNOMS), IEEE, pp 1–6. https://doi.org/10.23919/APNOMS56106.2022.9919986
Malandrino F, Chiasserini CF, Einziger G et al (2019) Reducing service deployment cost through VNF sharing. IEEE/ACM Trans Netw 27(6):2363–2376. https://doi.org/10.1109/TNET.2019.2945127
Wang Y, Hu X, Guo L et al (2020) Research on v2i/v2v hybrid multi-hop edge computing offloading algorithm in iov environment. In: 2020 IEEE 5th International Conference on Intelligent Transportation Engineering (ICITE), IEEE, pp 336–340. https://doi.org/10.1109/ICITE50838.2020.9231334
Wei C, Fan Y, Zhang J et al (2020) A-HSG: neural attentive service recommendation based on high-order social graph. In: 2020 IEEE International Conference on Web Services (ICWS), IEEE, pp 338–346. https://doi.org/10.1109/ICWS49710.2020.00051
Ko SW, Kim SJ, Jung H et al (2022) Computation offloading and service caching for mobile edge computing under personalized service preference. IEEE Trans Wirel Commun 21(8):6568–6583. https://doi.org/10.1109/TWC.2022.3151131
Li L, Liu M, Shen W et al (2019) Recommending mobile services with trustworthy qos and dynamic user preferences via fahp and ordinal utility function. IEEE Trans Mobile Comput 19(2):419–431. https://doi.org/10.1109/TMC.2019.2896239
Abbasi S, Rahmani AM, Balador A et al (2023) A fault-tolerant adaptive genetic algorithm for service scheduling in internet of vehicles. Appl Soft Comput 143:110413. https://doi.org/10.1016/j.asoc.2023.110413
Zhang Y, Li Z, Tang X et al (2020) Time-aware service recommendation based on dynamic preference and qos. In: 2020 IEEE International Conference on Web Services (ICWS), IEEE, pp 347–354. https://doi.org/10.1109/ICWS49710.2020.00052
Gupta A, Jaiswal S, Bohara VA et al (2023) Priority based v2v data offloading scheme for fiwi based vehicular network using reinforcement learning. In: Vehicular Communications. https://doi.org/10.1016/j.vehcom.2023.100629
Ma Y, Liang W, Li J et al (2020) Mobility-aware and delay-sensitive service provisioning in mobile edge-cloud networks. IEEE Trans Mobile Comput 21(1):196–210. https://doi.org/10.1109/TMC.2020.3006507
Zha Y, Sun Q (2022) Qos-based dynamic clustering deployment strategy in mec. In: 2022 International Conference on Informatics, Networking and Computing (ICINC), IEEE, pp 10–14. https://doi.org/10.1109/ICINC58035.2022.00010
Hashemifar S, Rajabzadeh A (2023) Optimal service provisioning in iot fog-based environment for qos-aware delay-sensitive application. Comput Electr Eng 111:108984. https://doi.org/10.1016/j.compeleceng.2023.108984
Zhang K, Zhou Y, Wang C et al (2023) Towards an automatic deployment model of iot services in fog computing using an adaptive differential evolution algorithm. Internet Things 24:100918. https://doi.org/10.1016/j.iot.2023.100918
Liu H, Long S, Li Z et al (2022) Revenue maximizing online service function chain deployment in multi-tier computing network. IEEE Trans Parallel Distrib Syst 34(3):781–796. https://doi.org/10.1109/TPDS.2022.3232205
Li G, Miao J, Wang Z et al (2022) An adaptive user service deployment strategy for mobile edge computing. China Commun 19(10):238–249. https://doi.org/10.23919/JCC.2022.00.032
Gao B, Zhou Z, Liu F et al (2021) An online framework for joint network selection and service placement in mobile edge computing. IEEE Trans Mobile Comput 21(11):3836–3851. https://doi.org/10.1109/TMC.2021.3064847
Zhang Q, Li C, Huang Y et al (2023) Effective multi-controller management and adaptive service deployment strategy in multi-access edge computing environment. Ad Hoc Netw 138:103020. https://doi.org/10.1016/j.adhoc.2022.103020
Zhang C, Liu Y, Zhang S et al (2023) Sfc-based multi-domain service customization and deployment. Comput Commun 211:59–72. https://doi.org/10.1016/j.comcom.2023.08.025
Mao Y, Shang X, Yang Y (2023) Ant colony based online learning algorithm for service function chain deployment. In: IEEE INFOCOM 2023-IEEE Conference on Computer Communications, IEEE, pp 1–10. https://doi.org/10.1109/INFOCOM53939.2023.10229012
Campello RJ, Kröger P, Sander J et al (2020) Density-based clustering. Wiley Interdiscip Rev: Data Min Knowl Discov 10(2):e1343. https://doi.org/10.1002/widm.1343
Shi T, Ma H, Chen G et al (2020) Location-aware and budget-constrained service deployment for composite applications in multi-cloud environment. IEEE Trans Parallel Distrib Syst 31(8):1954–1969. https://doi.org/10.1109/TPDS.2020.2981306
Zhao L, Li B, Tan W et al (2022) Joint coverage-reliability for budgeted edge application deployment in mobile edge computing environment. IEEE Trans Parallel Distrib Syst 33(12):3760–3771. https://doi.org/10.1109/TPDS.2022.3166163
Acknowledgement
This work was supported by the National Natural Science Foundation of China under Grant No. 62272243 and Jiangsu Key Laboratory of Big Data Security & Intelligent Processing.
Author information
Contributions
J, W and L carried out the conception and design of the research, J participated in the acquisition of data. J carried out the analysis and interpretation of data. J performed the statistical analysis. W participated in obtaining funding. J drafted the manuscript and J, W and L participated in revision of manuscript for important intellectual content. All authors reviewed and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Jin, H., Wang, H. & Luo, J. MS-GD-P: priority-based service deployment for cloud-edge-end scenarios. J Supercomput 80, 25713–25735 (2024). https://doi.org/10.1007/s11227-024-06423-z
DOI: https://doi.org/10.1007/s11227-024-06423-z