1 Introduction

The amount of data that is being created, gathered, processed, and analyzed through advanced analytics, artificial intelligence, and machine learning (AI/ML) techniques is ever-increasing, driven by the widespread adoption of information and communication technologies (ICT), such as social media applications, the Internet of Things (IoT), and sensors. Cloud computing offers the storage and computing elasticity needed to accommodate this growth while providing cost-efficiency and various other quality of service (QoS) properties, such as availability and security, through the storage as a service (StaaS) approach [73]. Indeed, cost is one of the important factors for organizations adopting cloud storage [95]; however, cloud storage providers offer complex pricing policies covering the actual storage cost as well as the related services (e.g., network) [63]. Given the increasing use of StaaS and its rapidly growing economic value [62], cost optimization for StaaS has become a challenging endeavor for industry and research. The goal is to minimize the cost of data storage under complex and diverse pricing policies coupled with varying storage and network resources and services offered by cloud service providers (CSPs) [70].

Cloud storage cost can vary widely depending on a company’s needs. A wide range of interrelated parameters affect the cost, from retrieval frequency and storage capacity to network bandwidth. Various cost trade-offs also emerge due to the varying pricing policies of cloud storage providers and the usage characteristics of application providers [73], which require a cost-benefit analysis based on a cost model. An example is the trade-off between storing the results of a computation and re-computing them, where the decision depends on the size and usage pattern of the data. Furthermore, organizations also use multi-cloud or hybrid solutions [117] by combining multiple public and/or private cloud storage providers to avoid vendor lock-in, achieve high availability and performance, and optimize cost [104]. An application deployed using multiple public and/or private cloud providers distributed over several regions can enhance performance while substantially reducing cost. According to a survey [52], user satisfaction decreases dramatically with an increase of as little as 100 ms in Web page presentation time, i.e., latency, which requires storing the requested data in data centers close to the users of the Web application to decrease data access latency. Data locality improves availability, but it is costly, since it usually requires storing multiple replicas with known cost trade-offs concerning bandwidth or computing [73]. The cost structure for using cloud storage services is complex and opaque, particularly in a multi-cloud or hybrid ecosystem. Comprehensive models and mechanisms are required to optimize the cost of using cloud storage services and to make informed storage service selection decisions for data placement, for which it is essential to understand this complex cost structure, its associated parameters, and the trade-offs between the different parameters affecting the cost.

Cloud storage providers tout ostensibly simple usage-based pricing plans; however, a practical cost analysis of cloud storage is not straightforward [53], and there are a limited number of studies that focus on cost optimization across multiple CSPs with varying pricing policies [60]. According to a survey among record-keeping professionals [78], 86% of the respondents opt for cloud storage to save costs, while only 19% use cost models, either because cost models are complicated to implement or because they do not meet their requirements. In this respect, this article provides a detailed taxonomy of cloud storage cost, including storage and network usage costs, and a taxonomy of QoS elements such as network performance, availability, and reliability [48]. We collected and analyzed data from the documentation of three major cloud service providers to find commonalities and differences and to provide a comprehensive taxonomy of cloud storage cost. The article fills this gap by providing a structured approach that can be used to develop tools for cost optimization, and it provides a basis for more meaningful cost comparisons between cloud storage providers, which can help organizations make more informed decisions about their cloud storage strategy. We also discuss various cost trade-offs, including storage and computation, storage and cache, and storage and network, and provide cost and latency comparison examples for major cloud storage providers, such as Google Cloud Storage, Microsoft Azure Storage, and Amazon Web Services, along with a set of user scenarios to demonstrate the complexity of the cost structure and the cloud ecosystem under varying contexts and parameters. Finally, we overview and discuss the existing literature on cloud storage provider selection and cost optimization. We aim for the work presented in this article to provide decision-makers interested in optimizing their storage cost, as well as researchers working on these and associated problems (e.g., cost modeling), with a better understanding of and insights into the elements contributing to storage cost and this complex problem domain. These insights apply to various contexts, ranging from single-cloud solutions to multi- and hybrid-cloud solutions.

The rest of the article is structured as follows. Section 2 provides an overview of the key concepts and challenges, while Section 3 introduces the research framework. Section 4 presents the taxonomy of cost structure for cloud storage. Section 5 discusses various QoS elements related to cloud storage. Section 6 presents the existing literature on cloud storage selection and various cost-optimization strategies, whereas trade-offs emerging while employing cloud services are discussed in Section 7. Section 8 puts different CSPs into context by comparing their cost and latency under an example setting and provides a set of user scenarios. Section 9 summarises the key findings and discusses several topics presented throughout the article and, finally, Section 10 concludes the article.

2 Overview

Data-intensive applications processing large amounts of data are ideal candidates for cloud deployment due to their need for large amounts of storage and computing resources. These include online analytical processing (OLAP) and online transaction processing (OLTP) applications, which manage enormous amounts of data, potentially growing at exponential rates, and are therefore well suited for cloud deployment [73]. An organization may opt for a single cloud storage provider with multiple regions or, as discussed earlier, for a geo-distributed approach through a multi-cloud or hybrid solution, driven by concerns about cost, scalability, availability, performance, and vendor lock-in. In this article, we focus on three of the major cloud service providers worldwide, namely Amazon Web Services (AWS), Microsoft Azure (Azure), and Google Cloud, although there are others, such as Alibaba Cloud and IBM Cloud. However, there is no guarantee that one of these multinational CSPs alone is optimal for an organization’s needs.

In cloud storage, data is stored in the form of objects (files, blobs, entities, items, records), which are pieces of data that form a data set (collection, set, grouping, repository). Every object in cloud storage resides in a bucket, a data container whose objects can be accessed through its own methods. The term bucket is used by AWS and Google Cloud, whereas Azure refers to it as a container. Data can be stored and accessed in various structures (e.g., structured and semi-structured), abstractions (such as file and block), and formats (e.g., key-value and relational) [73, 76], and users can choose one or more locations where the storage bucket will be placed. Data can be distributed over multiple data stores to exploit the advantages of a multi-cloud and multi-region environment; data distribution also plays an essential role in data compliance, where data is required to be stored in particular geographical locations, e.g., under GDPR [108]. Yet, realising distributed data-intensive applications on the cloud is not straightforward, since there are several concerns to be addressed. Sharding and data replication [29] are the key concepts for data distribution. Sharding refers to splitting and distributing data across different nodes in a system, where each node holds a single copy of the data; it provides scalability regarding load balancing, storage capacity, and high availability. Data replication refers to copying data or parts of it to multiple nodes in a system; it provides high availability and durability. However, data replication increases the cost and introduces the issue of data consistency due to synchronization issues between the nodes under network partitioning; therefore, a trade-off between availability and consistency emerges [103]. CSPs offer storage services from data centers located in different regions all around the world; therefore, communication and coordination among nodes could be hindered by network issues in both cases, causing increased latency [40]. Data replication and sharding with an adequate data distribution strategy can also be used to provide data locality, and hence low latency, by placing data closer to the computation early on rather than moving it as needed later [9].
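To make the bucket and object abstraction concrete, the following minimal sketch (assuming the boto3 SDK and a hypothetical bucket name) shows how an object is written to and read back from an AWS S3 bucket; the Azure and Google Cloud SDKs expose equivalent operations on containers and buckets.

    import boto3

    # Create an S3 client; credentials are taken from the environment or AWS config.
    s3 = boto3.client("s3")

    # Upload (ingress): the object "reports/2024/sales.csv" is stored in the bucket.
    s3.put_object(
        Bucket="example-analytics-bucket",   # hypothetical bucket name
        Key="reports/2024/sales.csv",
        Body=b"region,revenue\neu-west,1000\n",
    )

    # Download (egress): reading the object back is billed as a GET request
    # plus any applicable network egress.
    response = s3.get_object(Bucket="example-analytics-bucket", Key="reports/2024/sales.csv")
    data = response["Body"].read()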

The location of a cloud storage service is characterised by continent, region, and availability zones (AZ). A continent is a geographical area such as North America, South America, or the Middle East. Each continent can have one or more regions, and each region features data centers deployed within a latency-defined perimeter and connected through a dedicated regional low-latency network. Availability zones are physically separate locations within each region that tolerate local failures. Failures can range from software and hardware failures to events such as earthquakes, floods, and fires. The availability zones are connected by a high-performance network with extremely low round-trip latency. Each region often has three or more availability zones, which are designed so that if one zone is affected, regional services, capacity, and high availability are supported by the remaining zones. Network infrastructure constitutes a major and integral part of the cloud continuum. Users are charged for using network services, that is, for reading and writing data to and from cloud storage (although for most CSPs, inbound data transfer is free). These charges are linked to data egress and ingress, where the former refers to data leaving a data container and the latter refers to data entering a container. Reading information or metadata from a cloud storage bucket is an example of egress, whereas HTTP requests delivered to cloud storage, e.g., data or metadata entering a cloud storage bucket, are an example of ingress. Given that data is distributed over multiple geographical areas or regions over a distributed infrastructure managed by multiple third parties and transferred over the network, security and privacy concerns must also be addressed. This is particularly challenging in complex multi-cloud and hybrid settings, as approaches that work seamlessly over multiple providers are required, apart from the additional cost introduced. In multi-cloud and hybrid settings, therefore, several challenges need to be addressed [40], such as multi-cloud management, security (including regulations and policies), workload and workflow management (including data and computing), and cost optimization under different contexts and parameters.

Cost is the main focus of this article and needs to be considered in line with other QoS attributes and service level agreements (SLAs), which may also affect the cost directly or indirectly. Cloud storage services offer a simple pay-as-you-go pricing plan; however, they also offer various other pricing models [130]. In the block-rate pricing model, data ranges are defined, and each range has a different per-GB price for storing data. Some CSPs, such as Azure, also offer a reserved pricing plan that helps lower the data storage cost by committing to and reserving Azure storage for one or three years. In addition, with almost all CSPs, there is the option to contact the sales team directly and get a custom offer according to the requirements. A cloud service provider may offer several different services with more or less the same functionality but at different prices because of differences in performance. For example, Amazon S3 and Reduced Redundancy Storage (RRS) are both online storage services, but the latter compromises redundancy for a lower cost [73]. An even more relevant example of this scenario is the model of storage tiers or classes offered by AWS, Google Cloud, and Azure, i.e., the division or categorization of storage services with respect to the value and access frequency of the data. Another strategy that CSPs use is the bundling of services. This strategy is neither new nor unique to CSPs; it is used intensively across a wide variety of other economic sectors as well [28]. Although the ultimate purpose of bundling is cost-effectiveness and increased customer satisfaction [113], it is also a strategy that can discourage new competitors from entering a market [89]. Following this strategy, CSPs bundle storage services with other related services. For example, network services have a lower cost if data transfer between storage and other services stays within the cloud environment, which means computing resources must also come from the same CSP.
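As a simple illustration of how these pricing models differ, the sketch below compares pay-as-you-go and reserved-capacity costs for a fixed workload; the per-GB and reservation prices are hypothetical placeholders rather than actual CSP rates.

    def pay_as_you_go_cost(gb_stored: float, price_per_gb_month: float, months: int) -> float:
        """Cost of storing a fixed amount of data under a per-GB monthly price."""
        return gb_stored * price_per_gb_month * months

    def reserved_cost(reserved_units: int, price_per_unit_month: float, months: int) -> float:
        """Cost of a reserved-capacity commitment billed per reserved unit (e.g., 100 TB)."""
        return reserved_units * price_per_unit_month * months

    # Hypothetical example: 100 TB stored for 12 months.
    payg = pay_as_you_go_cost(gb_stored=100_000, price_per_gb_month=0.020, months=12)
    reserved = reserved_cost(reserved_units=1, price_per_unit_month=1_800.0, months=12)
    print(f"pay-as-you-go: ${payg:,.0f}, reserved: ${reserved:,.0f}")

Whether the reserved plan pays off depends on how stable the stored volume is over the commitment period, which is exactly the kind of trade-off a cost model should make explicit.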

3 Research framework and related work

Our goal in this study is to examine the cost structure, relevant QoS elements, trade-offs, notable cost optimization strategies, and challenges for cloud environments. In this respect, in what follows, we outline our key research questions, present our methodology, and refer to and discuss similar related work in comparison to the work presented in this article.

3.1 Research questions and methodology

The research questions below are formulated to investigate the multidimensional nature of cloud storage cost and performance aspects. Specifically, they are meant to clarify how various cost aspects interact, identify significant contributors to QoS, examine the trade-offs that come with using different cloud services, and analyse notable paths for optimising cloud storage cost.

  • RQ1: What are the different cost elements for cloud storage, and how are they related to each other?

  • RQ2: What are the key factors that contribute to the quality of service (QoS) in cloud storage, and how can they impact cost and the selection of a CSP?

  • RQ3: What kind of trade-offs are involved in utilizing different cloud services and their importance?

  • RQ4: What are the notable approaches for cloud storage cost optimization and storage selection?

Regarding the research methodology, we primarily analysed the documentation provided by three major cloud service providers and executed a targeted review of the relevant literature. Our study includes the following steps:

  1. Data collection: We collected data from the documentation of three major cloud service providers (i.e., AWS, Azure, and Google Cloud) to gather information on their storage offerings, pricing structures, and availability zones; although many other cloud services are available, the combined market share of Microsoft Azure, Google Cloud, and AWS still stood at 67% in Q4 2023 [37].

  2. Literature review: We conducted a targeted literature review (i.e., informative rather than all-encompassing) by presenting and discussing notable approaches in the literature and highlighting existing research directions to complement the conceptual overview of the cost domain obtained by reviewing the main vendors.

  3. Data analysis: We analysed the collected data to identify cost structures, quality of service elements, and trade-offs associated with cloud storage and compared the offerings of different cloud storage providers to highlight differences in geographical coverage, availability zones, and pricing models.

  4. Taxonomy development: Based on the analysis of the data, we developed a taxonomy of cost structure for cloud storage, categorizing different types of cost and quality of service elements relevant to cloud storage provider selection and cost optimization.

  5. Comparison and evaluation: We compared the cost structures of different cloud storage providers, evaluated their offerings based on the developed taxonomy, and provided insights into the complexity of cost structures under varying parameters and scenarios of use.

3.2 Related work

There are a number of other studies focusing on cloud cost at a high level. In this section, we review them and highlight the differences compared to our work.

In [66], the authors provide a taxonomy and survey of cost optimisation for cloud storage from the perspective of users, with a focus on opportunities, motivations, and challenges. Mansouri et al. [73] provide a comprehensive taxonomy that covers key aspects of cloud-based data storage: data model, data dispersion, data consistency, data transaction service, and data management cost. However, neither of these articles presents a taxonomy of the actual cost elements or discusses cost optimization strategies. In [130], a taxonomy is presented for cloud pricing models, whereas [44] provides a taxonomy for cloud computing in general. Hofer and Karagiannis [39] provide insights into cloud services, discussing different categories of cloud resources, including storage, and offering a comparative analysis. While broader in scope, [10] touches upon resource allocation in cloud environments; it may not directly address a storage taxonomy, but it contributes to the overall understanding of cloud infrastructure. In addition, there are other studies that do not present taxonomies for cloud storage cost but address different aspects of cloud computing, for example, cost-aware challenges for workflow scheduling approaches [4], future directions for sustainable cloud computing [31], cloud computing systems [105], critical factors related to migration to the cloud [120], and issues concerning cloud computing ecosystems [106].

Compared to the existing literature, this article presents a taxonomy of cloud storage cost elements to simplify the complex cost structure by outlining the various types of cloud storage cost elements and how they vary across providers and impact the total cost. Additionally, it contributes a taxonomy and a complete overview of QoS elements, how they impact cost and the storage selection process, and the related literature. QoS considerations result in various trade-offs; hence, this article explores different trade-offs while using cloud services, such as storage and computation, storage and cache, and storage and network, and possible ways to reduce the compromise. Existing reviews and studies cover only subsets of the cloud cost structure. To present the broader picture, this article presents a complete cost ecosystem in the context of real-world user scenarios as well as cost, latency, region, and availability zone comparisons. Moreover, some studies have carried out surveys to identify problems and challenges in the domain of cloud cost optimization. To address those challenges, this article presents notable storage selection strategies as well as cost optimization strategies (storage and network cost optimization).

4 A taxonomy of storage cost

Cloud computing cost can often be broken down into the following high-level and non-exhaustive groups: 1) storage cost, based on the amount of data stored in the cloud and its duration; 2) data transfer cost, based on the amount of data moved over the cloud network; and 3) compute cost, based on the use of computing resources from the cloud continuum (e.g., the VMs rented and their duration). In this article, we primarily focus on storage cost and data transfer cost. The proposed cloud cost taxonomy is shown in Figure 1. Storage cost comprises data storage, data replication, transaction, and network usage costs, whereas data transfer cost comprises data replication, transaction, and network usage costs. In addition, storage cost also incorporates optional data security and redundancy model costs; higher data redundancy results in higher storage cost. We discuss five elements of the cloud storage cost structure based on how cloud services charge their users: data storage, data replication, transaction, network usage, and data encryption costs. Storage and data transfer costs vary by storage tier (premium, hot, cold, and archive), as discussed further below. Moreover, uploading data to or downloading data from cloud storage incurs a transaction cost. Similarly, when data is transferred between cloud storage locations, it incurs not only a network usage cost but also a nominal transaction cost. In the remainder of this section, we look into each of these elements in detail. The taxonomy presented in this section is extracted by analysing the official pricing information provided publicly by AWS, Azure, and Google Cloud.

Fig. 1  Cloud computing cost taxonomy

Table 1 Comparison of the storage tiers: definitions and key characteristics of each storage tier offered by different CSPs

4.1 Storage tiers

Every chunk of data stored in cloud storage carries a piece of metadata known as its “storage tier”, specifying its availability level and pricing structure. Tiered object storage does not need to be attached to a virtual machine to store and read data: AWS Elastic Block Store, for example, can only be used with AWS EC2 instances, whereas data stored on AWS S3 (which is tiered object storage) can be accessed using standard data transfer protocols. A user can change the storage tier of an object by rewriting it or by using object lifecycle management (a lifecycle-rule example is sketched after the tier descriptions below). The user can choose a default storage tier when creating a bucket; items added to the bucket use this storage tier unless they are configured to use a different one. Changing the default storage tier of a bucket does not affect data objects that are already in the bucket. We group storage tiers under four categories: premium, hot, cold, and archive. A summarized comparison of the storage tiers offered by the three providers is shown in Table 1, and the definitions and key characteristics of each storage tier are explained as follows.

  • Premium tier is better suited for data that is frequently accessed and/or is only stored for short periods of time. This tier is called “Premium” in Azure, “Standard” in Google Cloud, and “S3 Standard” in AWS. The premium tier costs more than the other tiers to store data, but it costs less to access the data.

  • Hot tier is suggested for storing the data that is frequently accessed and modified. In Azure, it is known as “Hot”, in Google Cloud as “Nearline”, and in AWS as “S3 Standard – Infrequent Access”. This tier also has a higher cost as compared to the cold and archive storage tiers, but the associated network usage costs are comparatively lower.

  • Cold tier is designed for storing data that is accessed and modified occasionally. All cloud storage providers recommend that such data be stored for a minimum amount of time; Azure, for example, recommends a minimum of 30 days. Storage cost is lower than in the premium and hot tiers, but network usage cost is higher. This tier is referred to as “Cool” in Azure, “Coldline” in Google Cloud, and “S3 Glacier Instant Retrieval” in AWS.

  • Archive tier is designed for storing data that is rarely accessed and is stored for a longer period of time – essentially an offline tier. In most cases, the stored data cannot be accessed immediately, so this tier is recommended for data with flexible latency requirements, i.e., retrieval times on the order of hours. Unlike the cold tier, the minimum storage time is not just recommended but required; e.g., for Azure, it is 180 days. Azure and Google Cloud term this tier “Archive”, whereas AWS terms the similar tiers “S3 Glacier Flexible Retrieval” and “S3 Glacier Deep Archive”.
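Object lifecycle management, mentioned above as a way to change an object's storage tier, can automate tier transitions as data ages. The following sketch (using boto3 and a hypothetical bucket name) configures an AWS S3 lifecycle rule that moves objects to a colder class after 30 days and to an archive class after 180 days; Azure lifecycle management policies and Google Cloud Object Lifecycle Management rules express the same idea.

    import boto3

    s3 = boto3.client("s3")

    # Lifecycle rule: transition objects under the "logs/" prefix to colder,
    # cheaper storage classes as they age.
    s3.put_bucket_lifecycle_configuration(
        Bucket="example-analytics-bucket",  # hypothetical bucket name
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "tier-down-old-logs",
                    "Filter": {"Prefix": "logs/"},
                    "Status": "Enabled",
                    "Transitions": [
                        {"Days": 30, "StorageClass": "STANDARD_IA"},    # colder tier
                        {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},  # archive tier
                    ],
                }
            ]
        },
    )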

Storing a data object in only one tier at all times can be costly and inefficient [43, 49, 71]. Krumm and Hoffman [53] developed a tool designed specifically for cost and usage estimation for laboratories performing clinical testing. It provides a way to explore storage options from different CSPs, cost forecasts, data compression, and mechanisms for rapid transfer to the cold tier. Jin et al. [47] developed a framework for cost-effective video storage in the cloud, while Erradi and Mansouri [27] developed a cost optimization algorithm for data storage and migration between different storage tiers. Nguyen et al. [91] proposed a cost-optimal two-tier fog storage technique for streaming services. Liu et al. [62,63,64] developed multiple algorithms for cost optimization using multi-tier cloud storage.

Fig. 2  Cloud storage cost taxonomy

Storage tiers can be effectively used to achieve high data durability and availability, as shown by Liu et al. [61], who developed an algorithm (PMCR) and carried out extensive numerical analysis and real-world experiments on Amazon S3. Extending the work on high availability, the authors of [96] present a unified approach named Wiera that targets not only high availability but also data consistency. Wiera makes it possible to set data management policies for a single data center or for a group of data centers. These policies let the user choose between different options to obtain the desired performance, reliability, durability, and consistency. Based on these policies, Wiera finds the optimal place to store user data so that different priorities, such as cost, availability, durability, and consistency, can be balanced. Similarly, Zhang et al. [136] presented a bidding algorithm for tiered cloud storage to achieve low latency.

4.2 Cost structure

The taxonomy for the cost/pricing structure is shown in Figure 2. The cost structure for cloud storage can be broken down into four main groups: 1) data storage, 2) data replication, 3) transactions, and 4) network usage. These four elements, as shown in Figure 2 with solid lines, are mandatory cost elements that a user can optimize but cannot completely avoid. The other three elements, which are data management, data backup, and data security, are optional. These are not provided by CSPs for free, but they are not mandatory. A user might have to pay for third-party data management services in the context of a multi-cloud or hybrid environment. Table 2 explains different cloud storage cost elements and the terms used by different CSPs.

Table 2 Cloud storage cost elements

4.2.1 Data storage cost

Storage cost refers to the cost of storing data in the cloud. It is charged on a per-GB-per-month basis, with each storage tier priced differently, and it also depends on the amount of data being stored. Some CSPs offer block-rate pricing, i.e., the larger the amount of data, the lower the unit cost [88]. For example, there is a certain price for data between 0 and 50 TB, and for some tiers the price per GB may be lower beyond 50 TB. For Big Data, storage costs can be quite significant; however, data compression techniques can reduce the size of the data and thereby the storage cost. Hossain and Roy [41] developed a data compression framework for IoT sensor data in cloud storage. In their two-layered compression framework, they compressed the data by up to 90% while maintaining an error rate of less than 1.5% and no bandwidth wastage. On the other hand, distributed data storage comes with its own challenges. One challenge of storing data in a distributed environment is the efficient repair of a failed node, i.e., minimizing the amount of data required to recover the failed node. Coding theory has evolved to overcome these challenges, and erasure coding is used for reliable and efficient storage of Big Data in a distributed environment [58]. Several erasure coding techniques have been developed over time; Balaji et al. [6] present an overview of such methodologies. From the perspective of cloud users, these coding techniques are crucial for ensuring effective data backup and disaster recovery. While they may not directly contribute to cost reduction for users, they hold significance from the perspective of CSPs, as they reduce the actual storage needs. In addition, private cloud owners in particular can find them beneficial in terms of cost reduction.
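A minimal sketch of block-rate (tiered) pricing is shown below; the price brackets are hypothetical and only illustrate how the per-GB unit price drops as the stored volume grows.

    # Hypothetical block-rate price brackets: (bracket size in GB, price per GB-month).
    # The last bracket applies to all remaining data.
    BRACKETS = [
        (50_000, 0.023),          # first 50 TB
        (450_000, 0.022),         # next 450 TB
        (float("inf"), 0.021),    # everything above 500 TB
    ]

    def monthly_storage_cost(gb_stored: float) -> float:
        """Compute the monthly storage cost under block-rate pricing."""
        cost, remaining = 0.0, gb_stored
        for bracket_size, price in BRACKETS:
            billable = min(remaining, bracket_size)
            cost += billable * price
            remaining -= billable
            if remaining <= 0:
                break
        return cost

    print(monthly_storage_cost(120_000))  # e.g., 120 TB stored for one month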

4.2.2 Data replication cost

Cloud replication refers to the process of replicating data from on-premises storage to the cloud, or from one cloud instance to another. Data storage systems adopt a 3-replica data replication strategy by default, i.e., for each chunk of uploaded data, three copies are stored to achieve high data reliability and ensure better disaster recovery. This means that to store one gigabyte of data, users have to pay for the cost of three gigabytes plus the cost of making the data copies, known as “data replication”, which significantly affects the cost-effectiveness of cloud storage [59]. The cost of data replication is charged on a per-GB basis. Several data replication strategies are available to achieve various objectives. For example, Mansouri et al. [72], Liu et al. [61], and Edwin et al. [25] developed data replication strategies to achieve optimal cost. Mansouri and Javidi [68] and Nannai and Mirnalinee [90] focused on achieving low access latency by developing dynamic data replication strategies. Ali et al. [3] presented a framework (DROP) to achieve maximum performance and security, whereas Tos et al. [119] and Mokadem et al. [83] developed approaches to attain high performance and increase providers’ profit.

4.2.3 Transaction cost

Transaction cost refers to the cost of managing, monitoring, and controlling a transaction when reading or writing data to cloud storage [94]. Cloud storage providers charge not only for the amount of data transferred over the network but also for the number of operations it takes. The transaction cost thus consists of two parts: the number of operations and the amount of data retrieved. READ and WRITE operations have different costs and are charged based on the number of requests; DELETE and CANCEL requests are free. Types of requests include PUT, COPY, POST, LIST, GET, and SELECT. Data retrieval, on the other hand, is charged on a per-GB basis. Google Cloud uses a different term for transaction costs, namely “operation charges”, defined as the cost of all requests made to Google Cloud Storage.
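The two-part nature of transaction charges can be captured in a short sketch; the per-request and per-GB retrieval prices below are hypothetical placeholders, not actual CSP rates.

    def transaction_cost(write_requests: int, read_requests: int, gb_retrieved: float) -> float:
        """Transaction cost = operation charges (per request) + data retrieval (per GB)."""
        PRICE_PER_1000_WRITES = 0.005   # hypothetical PUT/COPY/POST/LIST price
        PRICE_PER_1000_READS = 0.0004   # hypothetical GET/SELECT price
        PRICE_PER_GB_RETRIEVED = 0.01   # hypothetical retrieval price (tier dependent)
        return (write_requests / 1000 * PRICE_PER_1000_WRITES
                + read_requests / 1000 * PRICE_PER_1000_READS
                + gb_retrieved * PRICE_PER_GB_RETRIEVED)

    # Example: 2 million uploads, 10 million reads, 500 GB retrieved from a cold tier.
    print(round(transaction_cost(2_000_000, 10_000_000, 500), 2))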

4.2.4 Network usage cost

The quantity of data read from or sent between buckets is known as network consumption or network usage; data transmitted by cloud storage as egress also includes the HTTP response headers. Hence, the network usage cost is defined as the cost of bandwidth into and out of the cloud storage server and is charged on a per-GB basis. Google Cloud has two tiers of network infrastructure, premium and standard, which differs from Azure and AWS, as they offer only a single network tier. Although network performance varies by storage tier, meaning that CSPs effectively operate multiple network tiers as well, users cannot explicitly choose between them. For Google Cloud, using the premium network tier costs more than the standard network tier but offers better performance. Besides data servers and high-power computing resources, CSPs invest heavily in building their network infrastructure. The network usage cost is a complex combination of several factors, such as the route of the data transfer, i.e., whether it stays within the same cloud or leaves it. Within the same cloud, the cost varies depending on whether the transfer is within the same region, between different regions, within the same continent, or between different continents. To add even more complexity, the price of moving data within the same continent varies by continent, e.g., it differs between the US or EU and Asia. In addition, CSPs charge different prices for moving data into cloud storage (Bandwidth-IN) and for retrieving/downloading data from cloud storage (Bandwidth-OUT). The taxonomy for the network usage cost is shown in Figure 3. The cost of network usage is lowest when data is transferred between cloud services deployed in the same region and highest when data is transferred outside the cloud network.
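The route-dependent structure of network usage cost can be sketched as a simple lookup; the routes and per-GB rates below are illustrative assumptions, not actual CSP prices.

    # Hypothetical per-GB egress rates keyed by transfer route (cheapest to most expensive).
    EGRESS_RATES = {
        "same-region": 0.00,                     # between services in the same region
        "cross-region-same-continent": 0.02,
        "cross-continent": 0.08,
        "internet-egress": 0.11,                 # data leaving the cloud network entirely
    }
    INGRESS_RATE = 0.00                          # inbound transfer is typically free

    def network_usage_cost(route: str, gb_out: float, gb_in: float = 0.0) -> float:
        """Estimate network usage cost for a given route and transfer volume."""
        return gb_out * EGRESS_RATES[route] + gb_in * INGRESS_RATE

    # Example: serving 2 TB per month to clients outside the cloud network.
    print(network_usage_cost("internet-egress", gb_out=2_000))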

Fig. 3  Network usage cost taxonomy

4.2.5 Data encryption cost

Data is encrypted before it is written to disk and decrypted in real time when accessed. Encryption and decryption are done using a key, which is managed either by the storage provider or by the client. There are no extra costs associated with server-managed keys, but CSPs do charge for customer-managed keys, as these are stored with the CSP itself. The cost of key management can be divided into three categories: 1) the cost of the key, which is billed monthly; 2) the number of operations (encryptions and decryptions) performed with a key; and 3) the cost of the HSM (hardware security module, a physical device that provides extra security for sensitive data), which is billed per hour. There is also an additional cost for key rotation. Data encryption is an optional factor that affects the total cost of cloud storage; however, if a user chooses to encrypt the data, they are charged monthly for storing the encryption key, per 10,000 encryption and decryption requests, and per hour for the HSM. The cost of encryption and of encryption/decryption keys is more or less the same for all CSPs, but HSM costs vary significantly. Encryption can also affect the application’s performance, as real-time data encryption and decryption are time-consuming operations.
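The three key-management cost components described above can be combined in a simple estimate; all rates below are hypothetical placeholders.

    def key_management_cost(keys: int, operations: int, hsm_hours: float) -> float:
        """Monthly cost of customer-managed encryption keys:
        key storage + cryptographic operations + optional HSM usage."""
        KEY_PRICE_PER_MONTH = 1.00        # hypothetical price per key
        PRICE_PER_10K_OPERATIONS = 0.03   # hypothetical encrypt/decrypt request price
        HSM_PRICE_PER_HOUR = 1.50         # hypothetical dedicated HSM price
        return (keys * KEY_PRICE_PER_MONTH
                + operations / 10_000 * PRICE_PER_10K_OPERATIONS
                + hsm_hours * HSM_PRICE_PER_HOUR)

    # Example: 5 keys, 20 million operations per month, no dedicated HSM.
    print(round(key_management_cost(keys=5, operations=20_000_000, hsm_hours=0), 2))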

4.3 Redundancy model

Redundancy implies that the service provider keeps valuable and important data in multiple locations. A client should ideally have numerous backups so that large server failures will not impair their ability to access information [127]. The taxonomy for the redundancy model is shown in Figure 4, and the various terms used by different CSPs for different redundancy models are listed in Table 3. Cloud storage providers offer three different redundancy options for storing data. In the single-region mode, data is stored in a single geographic location, such as eu-west. In the dual-region mode, a user can store data in two different geographic locations; this can be a suitable option if the data is frequently accessed in two different regions, such as Europe and the US. The multi-region mode can be selected if the data is frequently accessed from many different regions. The redundancy model not only improves the durability of the data but also its availability [79], which directly affects application performance; in addition, it plays an important role in data security and data recovery. The term data replication denotes the process in which data is copied or replicated from on-premises storage to the cloud or from one cloud storage location to a different location, and users are charged based on the amount of data transferred over the network for replication. The redundancy model, on the other hand, determines how many different regions the data is replicated in, and different redundancy models have different data storage costs.

Fig. 4  Cloud storage redundancy taxonomy

Moving from a single region to dual or even multiple regions can significantly reduce access latency, especially when the data is accessed from different geographic regions, but it also has a direct impact on the cost of data storage. The higher the data redundancy, the higher the cost of storing data, including storage and data replication costs. It is therefore advisable to weigh the trade-offs between reduced cost and higher availability when determining which redundancy solution is ideal for a given circumstance. For its dual- and multi-region models, Azure provides two further replication types: geo-replication and geo-replication with read access. Geo-replication is the replication model in which the data is replicated to a secondary region that is geographically remote from the primary region to protect the data against local disasters; the data in the secondary region can only be used for disaster recovery and does not provide any read access. Geo-replication with read access is a data replication model in which the secondary region also provides read access. Waibel et al. [123] formulated a system model incorporating multiple cloud services to determine redundant yet cost-efficient storage. Moreover, to increase the performance of an application, different parts of the data set can be stored in and loaded from different availability zones or even regions, because bandwidth is limited and all accesses have to share the actual throughput of the physical network infrastructure. This ensures that an application’s performance is not compromised by a bottleneck in network throughput.

Table 3 Cloud storage redundancy model

4.3.1 Single-region

A single geographic area, such as Sao Paulo, is called a single-region. For data consumers, such as analytics pipelines [56, 93, 107], that operate in the same region, a single-region is used to optimize latency and network capacity. Since no fees are levied for data replication within a regional location, single-regions are a particularly advantageous alternative for short-lived data. Single-region storage has the lowest cost compared to data kept in dual- and multi-region storage.

4.3.2 Dual-region

A particular pair of areas, such as Tokyo and Osaka, is called a dual-region. When single-region performance benefits are required but improved availability from geo-redundancy is also desired, a dual-region is employed. High-performance analytics workloads that can run active-active in both regions simultaneously are well suited for dual-regions. This indicates that users will enjoy good performance while reading and writing data in both regions to the same bucket or data container. Due to the high consistency of dual-regions, the view of the data remains constant regardless of where reads and writes are occurring. Dual-region data storage is more expensive than single-region, but less expensive than multi-region and provides better availability and low latency.

4.3.3 Multi-region

A vast geographic area, like the United States, that encompasses two or more geographic locations is referred to as a multi-region. A multi-region approach is employed when a user has to serve content to data consumers that are dispersed across a wide geographic area and are not connected to the cloud network. Most of the time, the data is kept near where most users are. For instance, one may employ cross-region replication and deploy an EU bucket for EU data and a US bucket for US data. Therefore, a multi-region redundancy model can be used if the users who need the data are in different places. The multi-region redundancy model is the most expensive data storage model; however, it also addresses a wide range of security, privacy, availability, and data durability concerns.
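To illustrate how the redundancy model is selected in practice, the sketch below (using the google-cloud-storage client and hypothetical bucket names) creates one bucket in a single region and one in a multi-region location; the per-GB storage price differs between the two.

    from google.cloud import storage

    client = storage.Client()

    # Single-region bucket: lowest storage price, data kept in one region.
    client.create_bucket("example-single-region-bucket", location="europe-west1")

    # Multi-region bucket: data redundantly stored across a continent-scale
    # location, improving availability at a higher storage price.
    client.create_bucket("example-multi-region-bucket", location="EU")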

5 QoS elements

In this section, we discuss other QoS elements that must be considered when making decisions regarding CSP selection, since cost is not the only factor to consider. Other parameters affect not only the cost but also the performance of the applications, such as network performance, data availability, consistency, and security. Figure 5 shows the taxonomy of the major QoS elements that must be considered when selecting a CSP.

Fig. 5  Quality of Service elements

5.1 Network performance

Network performance maintenance and management is a difficult and expensive task involving measuring, planning, modelling, and constantly optimizing networks to ensure that cloud continuum applications run smoothly. Network performance depends on several factors, but according to a study, performance monitoring is the key element [38]. Many elements together contribute to the performance of any network, such as latency, bandwidth, throughput, and jitter. Each of these is discussed in the following sections.

5.1.1 Latency

One-way latency, in computer networking, refers to the time it takes to transmit a data packet from one place to another; in most cases, however, the round-trip time (RTT) is measured, i.e., the total time for a data packet to travel from the source to the destination and back to the source [114]. Cloud service latency is the delay between a client request and a cloud service provider’s response. A low round-trip latency can indicate that a storage server is close; hence, low latency might indicate a shorter physical path, but this is not necessarily the case [128]. Latency does not account for data processing speed or software efficiency and is usually measured in milliseconds. Commonly, latency is measured by pinging the destination node.

5.1.2 Bandwidth

In computer networks, bandwidth ranks among the most crucial performance indicators. In data networks, bandwidth denotes the maximum rate of data transfer that can occur across a given network; increasing this capacity generally improves performance. Bandwidth and throughput are two common metrics for describing how much data can be transferred over time via a packet network. Bandwidth estimation is useful for users interested in improving end-to-end data transport performance, overlay network routing, and peer-to-peer file distribution [102].

5.1.3 Throughput

Throughput measures how quickly a given application transfers data; simply put, it is the amount of data delivered successfully per unit of time. It is affected by bandwidth and hardware limitations [99]. A latency matrix might not reflect the same differences as a throughput matrix, so if throughput is important, a throughput matrix has to be established separately (ping latency can be fast while throughput does not directly correlate with it). A throughput matrix is more costly to build than a ping matrix due to the cost of data transfer. The major factor is the size of the data. For small data sets, as used in REST-based APIs and transaction applications, latency is more relevant; one needs each individual service invocation to return as fast as possible, e.g., [17]. For large data sets (in the GB, TB, and PB range), latency is less relevant, but throughput is: the higher the throughput, the less time it takes to transfer the whole data set. Throughput and bandwidth are similar concepts, but they assess distinct features of a network. While throughput refers to the quantity of data that is effectively transferred via a network, bandwidth refers to the network’s maximum data capacity.
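A throughput matrix entry can be estimated by timing the download of a sufficiently large object, as in the sketch below; the object URL is a hypothetical placeholder for a publicly readable test object.

    import time
    import requests

    def measure_throughput_mbps(url: str) -> float:
        """Download an object and report the achieved throughput in megabits per second."""
        start = time.perf_counter()
        response = requests.get(url, timeout=300)
        elapsed = time.perf_counter() - start
        bits_transferred = len(response.content) * 8
        return bits_transferred / elapsed / 1_000_000

    # Hypothetical object URL used only for illustration.
    print(measure_throughput_mbps("https://storage.example.com/bucket/large-object.bin"))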

5.1.4 Jitter

The term “jitter” describes random fluctuations in the time it takes for packets to arrive at their destination. Jitter occurs when data sent through a packet network encounters varying delays, primarily caused by routing and switching delays. Packets from several networks must compete for a position at routers; incoming packets are statistically multiplexed by routers, causing the varying delay, or jitter. Another major cause of jitter is network congestion, which occurs when an adequate network path is unavailable for the data packets. Low jitter is very important for applications that support real-time communication, for example, voice over IP. If the jitter delays are substantial or are not taken into account by the receiving program, the outcome can be degraded performance, especially in real-time multimedia communication applications [21].
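Round-trip latency and jitter can both be estimated from repeated timed requests against a storage endpoint, as in the sketch below (the endpoint URL is a hypothetical placeholder); the mean approximates the RTT, and the standard deviation of the samples approximates the jitter.

    import statistics
    import time
    import requests

    def latency_and_jitter(url: str, samples: int = 10) -> tuple[float, float]:
        """Return (mean RTT, jitter) in milliseconds from repeated HTTP HEAD requests."""
        rtts = []
        for _ in range(samples):
            start = time.perf_counter()
            requests.head(url, timeout=10)
            rtts.append((time.perf_counter() - start) * 1000)
        return statistics.mean(rtts), statistics.stdev(rtts)

    mean_rtt, jitter = latency_and_jitter("https://storage.example.com/bucket/")
    print(f"mean RTT: {mean_rtt:.1f} ms, jitter: {jitter:.1f} ms")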

5.2 Availability and reliability

In a cloud environment, data availability refers to the ability to access data when needed. For cloud infrastructure solutions, availability refers to the fraction of the service’s paid duration during which the data center is reachable or provides the intended IT service [57]. On the other hand, data reliability refers to the integrity and consistency of the data. Another definition of data reliability is the likelihood that the system will achieve specified performance requirements and produce the intended results for the required time [57]. Data may be stored in multiple locations in a multi- or hybrid cloud environment, such as on different cloud platforms or regions. This can help to ensure that the data is always available, even if one of the cloud platforms experiences an outage or other issues. This can also help protect against data loss due to hardware failures or other issues.

CSPs measure data availability for users in terms of server or system uptime, i.e., the time a system operates correctly. A service-level agreement between a cloud service provider and a client specifies availability objectives. Large providers, including AWS, Azure, and Google Cloud, offer SLAs of at least 99.9% availability for each paid service; that is, the providers typically guarantee their clients less than nine hours of downtime per year, and the more nines in the number, the less downtime the consumer may anticipate annually. Multiple components are important for high availability and reliability, such as a) physical location, i.e., finding and removing single points of failure and deploying duplicate instances across multiple availability zones, b) a strong network connection to manage high workloads, c) compute instances that act as servers in public clouds, d) storage instances, e) load balancing, and f) IP cutover (rerouting traffic when an instance fails by remapping the failed instance’s IP address to the backup instance) [80].
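The relationship between the number of nines in an SLA and the tolerated downtime can be made explicit with a small calculation (a 365.25-day year is assumed):

    HOURS_PER_YEAR = 24 * 365.25

    def max_downtime_hours(availability_percent: float) -> float:
        """Maximum yearly downtime permitted by a given availability target."""
        return HOURS_PER_YEAR * (1 - availability_percent / 100)

    for sla in (99.9, 99.99, 99.999):
        print(f"{sla}% availability -> {max_downtime_hours(sla):.2f} hours of downtime per year")

For 99.9% availability this yields roughly 8.8 hours per year, which matches the "less than nine hours of downtime" guarantee mentioned above.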

In distributed systems, high availability and reliability have long been key concerns. For businesses using cloud computing, offering highly available and dependable services is crucial for preserving client confidence and satisfaction and reducing revenue losses. Mesbahi et al. [80] offered a reference roadmap for high availability and reliability in cloud computing settings. The roadmap identifies two main research gaps in the literature: one related to considering the requirements and viewpoints of all cloud actors and the other related to the impact of system performance overhead on high-availability solutions. The authors suggested manually evaluating the mutual impact of system performance overhead and high-availability solutions using the OpenStack platform, with software-defined networking technology proposed as a potential solution. Moreover, from the cloud service provider’s perspective, several obstacles still must be overcome to supply cloud services that adhere to all SLA terms. Endo et al. [26] carried out a systematic literature review and presented and discussed high-availability solutions for cloud computing, such as recovery, failure detection, and replication. They discuss that while virtualization is seen as a good option for providing high availability (HA), some studies have questioned its effectiveness. Additionally, difficulties in configuring and testing services and the lack of security solutions in existing highly available cloud systems are identified. The authors also note ongoing work on standardizing and improving high-availability strategies in cloud computing, leading to HA-on-demand.

5.3 Data durability

A durable storage system can carry out its duties even in unforeseen circumstances; for instance, a durable storage solution will not suffer data loss when storing data. Distributed storage employs numerous copies of data and has been widely used in cloud applications due to cost and performance considerations. This redundant storage configuration ensures that data loss only happens when all deployed data replicas on disk are damaged; in this case, the failure recovery methodology and the replica organization technique influence data durability. Jiang et al. [46] proposed a modelling and analysis technique for data durability in cloud storage services that uses a non-retrogressive Markov chain to represent data failure and recovery processes. To lessen the data loss brought on by correlated failures, they also suggested a replica organization technique based on routing tables. A study by Nachiappan et al. [87] examined the challenges in using replication and erasure coding, the most important data reliability techniques in cloud storage systems for Big Data applications. Both techniques have trade-offs in durability, availability, storage overhead, network bandwidth, energy consumption, and recovery performance. They introduced a novel hybrid technique to improve Big Data applications’ reliability, latency, bandwidth usage, and storage efficiency on cloud computing. The proposed technique uses dynamic replication in erasure-coded storage systems and effectively handles reconstruction issues with less storage overhead, improved reliability, and lower energy consumption. Cloud computing provides cost-effective on-demand services for Big Data applications, enabling storage and computing resources to be scaled quickly. In short, several strategies can be used together to ensure data durability in a multi- or hybrid cloud environment, such as a) data replication [69], b) erasure coding, which involves breaking data into small chunks and encoding them with redundant information, making it possible to reconstruct the data even when some chunks are lost [100], and c) cross-cloud backups [77], which can create an additional layer of protection in case of data loss due to hardware failure, human error, or other issues.
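The durability trade-off between full replication and erasure coding can be sketched with a simple, admittedly idealised, independent-failure model: with k replicas, data is lost only if all k copies fail, whereas with an (n, k) erasure code, data is lost only if more than n - k fragments fail. The failure probability below is a hypothetical illustration, not a measured value.

    from math import comb

    def replication_loss_probability(p: float, replicas: int) -> float:
        """Probability of losing data when every replica must fail (independent failures)."""
        return p ** replicas

    def erasure_coding_loss_probability(p: float, n: int, k: int) -> float:
        """Probability of losing data with an (n, k) code: more than n - k fragments fail."""
        return sum(comb(n, f) * p**f * (1 - p) ** (n - f) for f in range(n - k + 1, n + 1))

    p = 0.01  # hypothetical annual failure probability of a single disk/fragment
    print(replication_loss_probability(p, replicas=3))    # 3x replication, 3x storage overhead
    print(erasure_coding_loss_probability(p, n=9, k=6))   # (9, 6) code, 1.5x storage overhead

Under these assumptions, the (9, 6) code achieves durability comparable to 3-way replication with only half the storage overhead, which is why erasure coding matters to CSPs and private cloud owners.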

5.4 Data consistency

Data must be reliable, updated, readily available, consistent, and safely accessible for many applications and businesses for daily operation and decision-making. Data consistency ensures that the outcome of a database operation is immediately apparent to all stakeholders; all parties who have access to the data can view the outcome of a transaction once it has been completed (confirmed or reversed). Data integrity is guaranteed, and data corruption is avoided, through secure data consistency, which is a highly desirable characteristic of a database system [19]. Bokhari and Theel [15] claimed that while there are many advanced data replication strategies, new strategies must be developed to address a variety of scenarios, and they proposed a comprehensive hybrid approach for data replication that uses voting structures. Girault et al. [32] studied the issue of data consistency in distributed systems, focusing on determining the strongest consistency criterion that can be implemented in a convergent and available distributed system that tolerates partitions; they concluded that the strongest such criterion is Monotonic Prefix Consistency (MPC). Two main levels of data consistency are described in what follows.

5.4.1 Strong consistency

Strong consistency means data is consistent at all times in all nodes. Its drawback is lowering overall system performance, which may degrade the user experience of applications that involve end-users. Relational databases adopt a strong consistency model (or a slightly relaxed variant) because it guarantees data integrity and prevents data corruption, both of which are highly desirable quality attributes. Transactions are implemented using locks, which prevent updates to data (in a database) from concurrent update requests [111].

5.4.2 Eventual consistency

Eventual consistency states that, given enough time and no further revisions, the value of a particular data item will eventually become consistent across all nodes [14]. Eventual consistency compromises consistency in a distributed data storage environment to enhance availability and maximize network potential. The need for this trade-off was famously expressed as the CAP Theorem, or Brewer’s Theorem [16], which argues that a distributed system can provide at most two of the following three properties: consistency, availability, and partition tolerance.

5.5 Interoperability and portability

The capacity of various cloud providers to communicate with one another and reach agreements on data formats, SLAs, and other issues is referred to as “cloud interoperability”. In contrast, the ability to move software applications and data between cloud providers without regard to APIs, data types, or data models is known as “cloud portability”. According to Urquhart [121], application interoperability is divided into two sub-categories: portability (the ability to move an application from one host to another while it is shut down) and mobility (the ability to move an application in a live state from one host to another). In [118], the authors present portability and mobility as challenges. Parak and Sustr [98] note that, in terms of achieving cloud interoperability, the challenge is not to make everyone switch to a common solution but rather to allow everyone to keep their setup while still providing uniform access to essential services. Several challenges are related to cloud-to-cloud interoperability, including: 1) cloud-service integration [24, 118]; 2) security, privacy, and trust [22, 24, 118, 122]; 3) management, monitoring, and audit [24, 118]; 4) cost models [22, 118, 122]; and 5) service level agreements [22, 118]. To address these challenges, software containers have proven an effective method for deploying software applications in multi-cloud or hybrid environments [20, 35, 55, 92].

Barhate and Dhore [7] discussed the challenges of interoperability in the cloud computing environment and how it affects factors such as data migration, workload management, and load balancing. They also highlighted cloud federation [24] as a step towards efficient interoperability, focusing on the issues related to response time during peak hours of usage, where cloud computing technology tends to lag. Furthermore, they studied the hybrid architecture and found that it helps achieve better interoperability, improved QoS levels, peak-load handling, and scalability. Barhate and Dhore [8] also studied the effects and advantages of using a hybrid cloud for interoperability, combinations of broker policies and load balancing algorithms, and the effect of hybrid clouds on cloud cost optimization. They conducted two case studies to test different methods of achieving interoperability and the cost associated with each, and concluded that the closest data center policy and the optimum response time policy are the most efficient and cost-effective ways of achieving interoperability in a hybrid cloud environment, compared to re-configuring dynamically with load balancing.

To prove the effectiveness of the multi-cloud approach, Gómez et al. [33] proposed an architecture for integrating different electronic health record (EHR) systems by using cloud-based standardization and a data repository with high availability. The architecture is evaluated against five existing studies and found to reduce the problem of information exchange between different EHR systems to a single point of communication and comply with interoperability requisites. Bulent and Tarek [18] studied how cloud computing can be used in education to improve the delivery of learning resources and increase interoperability and claimed that by using cloud computing, educational institutions can provide cost-effective alternatives to expensive proprietary productivity tools. The study used semantic models to improve the delivery of the service. Results showed the system was 93.5% successful in reducing resource usage; however, they found some interoperability limitations in terms of cloud computing models and service models.

5.6 Security

To protect data in the cloud, by default, users only have access to the cloud storage resources they create. They can grant access to other users using the access management features provided by each cloud storage provider. Cloud storage providers also support audit logs that list the requests made against the cloud storage resources, giving complete visibility into who is accessing what data. In addition to extensive access management and audit logs, each cloud storage service mainly has the following security features: 1) block public access and 2) data encryption. A simple taxonomy of the major cloud storage security features is shown in Figure 6. Data encryption methods might impact performance. Prakash et al. [101] studied the impact on performance of different signing methods for data stored in cloud storage. They suggested using a third-party auditor with MD5 and ECDSA signatures to check the data; their results revealed that, for larger data sizes, ECDSA outperformed MD5 in terms of security performance.

Fig. 6 Cloud storage security taxonomy

5.6.1 Access management

Identity, Credential, and Access Management (ICAM) provides enterprises with concepts used to manage digital identities, credentials, and access to systems and applications. This is particularly useful when integrating with cloud computing service providers [110].

Internal access management

With cloud access management, users can control who or what can access cloud services and data. Access is managed centrally and provides a complete analysis of the permissions across the cloud. Permissions can be applied and scaled with attribute-based access control; for example, granular permissions can be created based on user attributes such as department, job role, and team name. This allows managing access per account or scaling access across multiple accounts and applications within the cloud. Users can also streamline permissions management and use cross-account findings to set, verify, and refine policies on the journey toward least privilege [1].
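As an illustration only, the snippet below sketches what such an attribute-based permission might look like when expressed as an IAM-style policy document built as a Python dictionary; the bucket name, the `department` tag, and its value are hypothetical placeholders, and real policies must follow the specific provider's policy schema.

```python
import json

# Illustrative IAM-style policy: principals may read/write objects only when their
# "department" attribute (tag) equals "analytics". Bucket ARN and tag are placeholders.
abac_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowDepartmentScopedAccess",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::example-team-bucket/*",
            "Condition": {
                "StringEquals": {"aws:PrincipalTag/department": "analytics"}
            },
        }
    ],
}

print(json.dumps(abac_policy, indent=2))
```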

Block public access

Users can ensure that no item is publicly accessible by applying the provider's block public access setting to every bucket in the account, covering both existing buckets and any created in the future. Because the cloud storage “block public access” settings override permissions that would otherwise enable public access, an account administrator can easily set up a centralized control and avoid variation in security configuration, regardless of how an item is uploaded or a bucket is created.
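As a minimal sketch of how such a setting can be applied programmatically, the snippet below uses AWS's boto3 SDK to enable all block-public-access flags at the account level; the account ID is a placeholder, and the other providers expose comparable settings through their own SDKs and consoles.

```python
import boto3

# Block public ACLs and public bucket policies for every bucket in the account.
# The account ID is a placeholder.
s3control = boto3.client("s3control")
s3control.put_public_access_block(
    AccountId="111111111111",
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)
```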

Fig. 7 High-level diagram illustrating a server-managed key for server-side encryption between customers’ managed network and cloud storage

5.6.2 Data encryption

Before data is sent to and stored in the cloud, it must be converted from its original plain text format into unreadable ciphertext. Encryption uses advanced techniques to encode the data, rendering it useless to anyone without the key. Authorized users decode the data with the key, restoring the secret information’s readable form. Only trustworthy people whose identities have been established and validated, for example through multi-factor authentication, are given access to keys [135]. A few cryptographic protocols are also available for data encryption on cloud storage. As implemented in HAIL [112] and DepSky [11], using cryptographic protocols with erasure coding and RAID methods on top of several data stores increases security in some areas. Scalability, the cost of computation and storage for encoding data, the choice of where the data is encoded, and where the keys used for data encryption are preserved are a few issues that need to be taken into consideration in such systems. Data protection against insider and outsider attackers is improved by combining private and public data stores and by using these strategies across public data stores provided by many vendors, especially against insiders, who would then need access to data in several data stores. Storing data across numerous data stores has a side effect, since the various replicas will likely be stored under various privacy laws; choosing data repositories with comparable privacy laws and regulations would be important to lessen this negative impact.

Server-side data encryption

The object storage service employs server-side encryption to protect data from access by unauthorised entities. All three CSPs offer server-side encryption, but the specifics of how it is implemented vary, notably in how keys are managed. When a cloud user transmits data to the CSP, the provider encrypts it at the object level before writing it to disk and must decrypt it each time the user needs to access it. Figure 7 shows the general scenario of data being encrypted using server-side encryption [34]. There is no difference in how one accesses encrypted or unencrypted objects as long as the request is authorized and the requester has the proper set of access rights. For instance, if a pre-signed URL is used to share an object, encrypted and unencrypted objects can be accessed using the same URL [2].

Depending on how a user chooses to manage the encryption keys, there are two mutually exclusive options. The first is server-managed keys: when a user uses server-side encryption with server-managed keys, each object is encrypted with a unique key. As an additional safeguard, the cloud storage service encrypts the key itself with a root key that it regularly rotates [5]. The encryption keys are managed using a hardware security module (HSM) [36], a physical device that carries out encryption and decryption operations for encryption keys, strong authentication, and other cryptographic operations. The second is customer-managed keys: with server-side encryption using customer-managed keys, the user manages the keys while the CSP handles the data encryption.
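To make the two options concrete, the boto3-based sketch below uploads one object with provider-managed encryption and another with a customer-managed key; the bucket name and KMS key ID are placeholders, and the equivalent settings differ in name across CSPs.

```python
import boto3

s3 = boto3.client("s3")

# Option 1: server-side encryption with keys fully managed by the provider.
s3.put_object(
    Bucket="example-bucket",
    Key="reports/q1.csv",
    Body=b"...",
    ServerSideEncryption="AES256",
)

# Option 2: server-side encryption with a customer-managed key held in the
# provider's key management service. The key ID below is a placeholder.
s3.put_object(
    Bucket="example-bucket",
    Key="reports/q2.csv",
    Body=b"...",
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="1234abcd-12ab-34cd-56ef-1234567890ab",
)
```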

Client-side data encryption

Client-side encryption ensures that data and files stored in the cloud can only be viewed on the client side of the exchange [74]. Surv et al. [115] proposed a framework for client-side AES encryption in cloud computing that relies on user authentication to secure the encrypted data. Moghaddam et al. [82] presented a comparative study of applying real-time encryption in cloud computing environments.
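A minimal client-side encryption sketch is shown below, using AES-256-GCM from the Python `cryptography` package: the key never leaves the client, only the nonce and ciphertext would be uploaded, and decryption happens locally after download. Key management and the authentication scheme of [115] are out of scope for the sketch.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# The key is generated and kept on the client side; the CSP never sees it.
key = AESGCM.generate_key(bit_length=256)
aesgcm = AESGCM(key)

plaintext = b"sensitive customer record"
nonce = os.urandom(12)                      # unique per encryption
ciphertext = aesgcm.encrypt(nonce, plaintext, None)

# Only nonce + ciphertext would be uploaded to cloud storage.
blob_to_upload = nonce + ciphertext

# After downloading, the client splits the blob and decrypts locally.
downloaded_nonce, downloaded_ct = blob_to_upload[:12], blob_to_upload[12:]
assert aesgcm.decrypt(downloaded_nonce, downloaded_ct, None) == plaintext
```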

5.7 Discussion

The quality of service relies on several elements, including network performance, data durability, availability, consistency, security, and interoperability. Each of these elements, in turn, has an impact on the cost. Data stored in higher and more expensive tiers, such as premium or hot, is associated with better network performance, consequently raising the overall cost. Similarly, ensuring the high availability and consistency of data necessitates redundant storage and backup systems, as well as frequent data synchronization and replication, which contribute to the cost of cloud storage services. Additionally, as data security demands an investment in security infrastructure like firewalls, intrusion detection systems, and encryption technologies, CSPs directly pass on these costs to users in the form of encryption key expenses and HSM modules. This highlights the importance of implementing a comprehensive cost optimization strategy to effectively manage and navigate the complexity of expenses within the dynamic and multifaceted cloud storage ecosystem. With careful consideration when employing these strategies, organizations can not only ensure efficient resource allocation but also establish an environment where the balance between cost and performance trade-offs is achieved. This, in turn, leads to a more streamlined and cost-effective cloud storage operation.

6 Cost optimization strategies

Cloud cost optimization reduces overall cloud spending by 1) identifying mismanaged resources; 2) eliminating un-provisioned and unused resources; 3) right-sizing computing services to scale; and 4) reserving capacity for higher discounts. To reduce the cost of cloud services, it is important to first determine the main cost drivers in cloud environments. They vary depending on the specific CSP and an organization's requirements; however, based on the content provided so far, we can conclude that the most common cost drivers include the following.

  • Resource utilization, scale and demand: The amount and type of resources used, such as computing resources, storage, and network usage, are the main drivers of cloud cost; the more resources an organization consumes, the higher the associated cost.

  • Region and availability: Cost can also be affected by geographic location and availability of resources; some cloud providers charge more for resources that are located in certain regions or have higher availability requirements.

  • Maintenance and management: Cost can also include the cost of maintaining and managing the cloud environment, such as software and security upgrades, backups, and disaster recovery.

  • Data egress: Additionally, data egress, or the data that is moving out of the cloud environment, can also be a cost driver if certain limits are exceeded.

  • SLAs, compliance and regulatory requirements, and third-party services: Cost can also depend on the SLAs that are in place with the cloud provider. Organizations may pay more for a higher level of availability and security. Organizations may also be subject to additional cost if they must comply with specific regulations and standards, such as the GDPR. Similarly, organizations may have to pay more if they use third-party services, such as data analytics or machine learning, offered by cloud providers or other vendors.

Cost optimization techniques

In terms of cloud storage cost optimization techniques, we consider two notable optimization categories: 1) the actual storage cost and 2) network usage cost. These can be optimized during the following phases.

  • Pre-deployment [13] cost optimization: This involves changing the infrastructure and applications before deployment to the cloud. It includes selecting the most cost-effective instance types, using reserved instances, and leveraging spot instances. In addition, the application design can be optimized to minimize the number of resources required for workloads, for example through auto-scaling and by optimizing data storage and retrieval to minimize data transfer cost.

  • Post-deployment cost optimization: This involves changing the infrastructure and applications after deployment to the cloud. It can include monitoring resources for under-utilization, right-sizing or shutting down resources that are no longer needed, scheduling non-production resources to be turned off during non-business hours, and using cost-management tools to identify and optimize costs. According to Wang and Ding [124], this leads to increased QoS requirements.

Both pre- and post-deployment cost optimization strategies are crucial for effectively controlling and reducing cloud cost. By combining these techniques, cloud infrastructure and applications can be optimized to minimize the cost, while still meeting performance and availability requirements.

6.1 Storage cost optimization

This refers to the cost of storing data in the literal sense and the related cost elements, as explained in Section 4.2. Storage cost optimization can be done before and after the deployment of applications in the cloud environment: pre-deployment, by choosing the most appropriate location for storing data, i.e., by adopting a storage selection method, or post-deployment, by moving data between different storage tiers. In this section, we discuss relevant scientific literature for both pre- and post-deployment cost optimization strategies.

6.1.1 Pre-deployment

Regarding the pre-deployment techniques, an architecture for multi-cloud storage is presented by Wang et al. [125], which is based on the non-dominated sorting genetic algorithm II (NSGA-II). It was used to solve the multi-objective optimization problem established in order to simultaneously reduce overall cost and maximize data availability, yielding a set of non-dominated solutions known as the Pareto-optimal set. Then, a technique based on the entropy approach is suggested to find the best option for users who cannot select one directly from the Pareto-optimal set. Moreover, since CSPs offer data storage services through data centers spread out globally with varying get/put latencies, cloud users have two issues when choosing a data center: 1) how to distribute data across global data centers in a way that satisfies the application service level objective (SLO) requirements for data availability and latency, and 2) how to distribute data and reserve resources in datacenters affiliated with various CSPs in a way that reduces the cloud cost. A method for distributing resources efficiently was suggested by Liu and Shen [60]. They also suggest the following three enhancement strategies to lessen cloud cost and service latency: 1) data reallocation based on coefficients, 2) data transmission based on multicast, and 3) congestion control based on request redirection. Additionally, data migration and replication are essential parts of any software application related to Big Data. The study by Wang et al. [125] presents a cost optimization method for dynamic data replication and migration in cloud storage facilities or data centers.

Table 4 Analysis of scientific literature related to storage cost optimization

6.1.2 Post-deployment

In terms of post-deployment techniques, Liu et al. [60] proposed a cost-minimization algorithm for multi-cloud storage using AWS S3, Azure, and Google Cloud. The algorithm, called DAR, uses integer programming and includes a data allocation algorithm and an optimal resource reservation algorithm to reduce costs and ensure SLO compliance. The system also includes enhancement methods for further cost and latency reduction, and experimental results on real CSPs showed the algorithm's effectiveness. Another post-deployment approach is proposed by Mansouri et al. [71]. They proposed an optimal offline dynamic programming algorithm and two practical online algorithms for determining the placement of objects in hot and cool tiers in a tiered cloud storage service, in order to minimize monetary cost and improve the quality of service. Their experimental evaluation using an actual Twitter workload showed that their algorithms can yield significant cost savings compared to storing data in the hot tier all the time. Evaluating the algorithm on Azure's 2-tier storage service, developing better online algorithms, and using machine learning to predict object access patterns are mentioned as future work. Erradi et al. [27] also proposed a cost-optimizing approach for tiered cloud storage services by taking into account storage, read, write, and transfer fees. The approach includes two online algorithms that adapt to an uncertain future workload, and a theoretical analysis and experimental evaluation were conducted to demonstrate the cost-effectiveness of the proposed algorithms. A summary of the approaches discussed above is shown in Table 4.
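The underlying cost model in these tiering approaches can be illustrated with a small sketch: given an object's size and its expected access volume for the coming month, compare the hot- and cool-tier bills and place the object in the cheaper tier. The per-GB and per-request prices below are illustrative placeholders, not the values used in the cited studies.

```python
# Illustrative prices (USD); real values come from the provider's price list.
HOT  = {"storage_per_gb": 0.020, "retrieval_per_gb": 0.000, "per_10k_reads": 0.004}
COOL = {"storage_per_gb": 0.010, "retrieval_per_gb": 0.010, "per_10k_reads": 0.010}

def monthly_cost(tier, size_gb, read_gb, read_ops):
    return (size_gb * tier["storage_per_gb"]
            + read_gb * tier["retrieval_per_gb"]
            + (read_ops / 10_000) * tier["per_10k_reads"])

def place_object(size_gb, read_gb, read_ops):
    """Return the cheaper tier for the expected access pattern of the next month."""
    hot = monthly_cost(HOT, size_gb, read_gb, read_ops)
    cool = monthly_cost(COOL, size_gb, read_gb, read_ops)
    return ("hot", hot) if hot <= cool else ("cool", cool)

# A 100 GB object read in full twice a month is cheaper to keep hot under these prices.
print(place_object(size_gb=100, read_gb=200, read_ops=2))
```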

6.1.3 Discussion

In short, various studies have proposed a variety of methods to reduce costs and improve service levels, including non-dominated sorting genetic algorithms, entropy approaches, data migration and replication, cost-minimization algorithms, and dynamic programming algorithms. These approaches use optimization techniques such as non-dominated sorting genetic algorithms and integer programming to reduce costs and ensure SLO compliance, and they consider various aspects of cloud storage, including data availability, transmission, congestion control, and SLA requirements. However, a lack of practical implementation and real-world evaluation makes it difficult to determine the effectiveness of the methods in practice. In addition, some of the methods are not suitable for dynamic and unpredictable workloads. Furthermore, there is a need for more comprehensive studies that consider the interplay between cost optimization and other aspects of cloud storage, such as security and privacy.

6.2 Network cost optimization

This refers to the cost of using the network to transfer data between different regions and the cost of transferring data from storage servers to compute resources. The network usage cost can be optimized by moving the compute resources closer to the storage servers without affecting the applications' performance. Similar to actual storage cost optimization, network usage cost can be optimized before or after the deployment of the application, so these approaches can also be categorised into pre- and post-deployment phases.

6.2.1 Pre-deployment

A pre-deployment phase approach is developed by Zeng et al. [134], who proposed a method for effectively and economically deploying edge servers in wireless metropolitan area networks. Shao et al. [109] proposed a data placement strategy for IoT services in wireless networks. The edge server deployment problem was analyzed and three critical issues were identified: edge location, user association, and capacity at edge locations. Additionally, in [54], a geo-distributed large-scale streaming analytics approach was proposed that utilizes aggregation networks for performing aggregation on a geo-distributed edge-cloud infrastructure consisting of edge servers, transit data centers, and destination data centers. The effectiveness of the proposed approach was evaluated using real-world traces from Twitter and Akamai; the results showed that it achieves a reduction of 47% to 83% in traffic costs over existing baselines without any compromise in timeliness. Ghoreishi et al. [30] proposed a cost-effective caching as a service (CaaS) framework for virtual video caching in 5G mobile networks. For evaluation, two virtual caching problems were formulated, and the results showed significant performance enhancement of the proposed system in terms of return on investment, quality, offloaded traffic, and storage efficiency.

Several approaches have been developed for cloudlet placement to optimize the network usage cost. For example, Mondal et al. [86] proposed a method for determining the optimal placement of hybrid cloudlets in a fibre-wireless network using an integer programming-based cost optimization framework. They also proposed a framework for low-latency applications [84], which is evaluated against urban, suburban, and rural scenarios, with results compared between field and hybrid cloudlet placement frameworks. The results show that the target latency requirement and the type of deployment scenario play a major role in determining the number of computational resources needed in the cloudlets, and that the incremental energy budget for the access networks due to active cloudlet installation is under 18%. Additionally, in [85] by Mondal et al., a mixed-integer non-linear programming (MINLP) model is proposed to determine optimal locations for cloudlet placement in a hybrid architecture. These are very diverse approaches, and they provide solutions to minimize total storage cost during the pre-deployment phase by optimizing network resource usage. The methods, ranging from optimizing edge server deployment in metropolitan networks to efficient cloudlet placement in hybrid architectures, highlight the importance of careful planning and resource utilization. As the studies show, the effective use of these approaches not only enhances cost-effectiveness but also addresses challenges related to latency, resource allocation, and deployment in various network scenarios.

The pre-deployment phase has some challenges, such as service function chaining (SFC): the issue of deploying different network service instances across geographically dispersed data centers and enabling interconnection between them. The objective is to make it possible for network traffic to move easily over the underlying network, giving end users the best possible experience. To tackle the problem of SFC, in [12], an approach is presented for deploying SFC over geographically distributed data centers using Network Function Virtualization (NFV). The goal is to minimize inter-cloud traffic and response time in a multi-cloud scenario by setting up an ILP optimization problem with constraints such as total deployment costs and SLAs. A new affinity-based approach is also proposed as a solution for larger networks, and its performance is compared to the traditional greedy approach using the first-fit decreasing method. Results show that the affinity-based approach produces better results regarding total delays and total resource cost and can solve larger problems more quickly with a small compromise in solution quality. A similar approach is used in [126], which addresses the problem of SFC orchestration across multiple data centers with the aim of minimizing deployment cost while meeting delay constraints. The impact of virtualized network functions (VNFs) on link cost and multiple ingresses/egresses of SFCs are considered. The problem is formulated as a Mixed Integer Linear Programming (MILP) model, and a heuristic algorithm called Cost-Aware and Delay-Constrained SFC Orchestration (CADCSO) is proposed to solve it. In large-scale network environments, CADCSO can reduce the average cost by at least 30% and increase the acceptance ratio by around 23%. Furthermore, CADCSO significantly outperforms CASO in the case of multiple ingresses/egresses.
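To convey the flavour of these ILP/MILP formulations without reproducing any specific model from the cited papers, the toy sketch below (using the PuLP solver) places services into data centers so that deployment plus traffic cost is minimized under per-data-center capacity limits; all names and numbers are made up for illustration.

```python
from pulp import LpProblem, LpMinimize, LpVariable, LpBinary, lpSum

services = ["svc1", "svc2", "svc3"]
dcs = ["dc_eu", "dc_us"]
deploy_cost = {("svc1", "dc_eu"): 4, ("svc1", "dc_us"): 6,
               ("svc2", "dc_eu"): 5, ("svc2", "dc_us"): 3,
               ("svc3", "dc_eu"): 2, ("svc3", "dc_us"): 7}
traffic_cost = {("svc1", "dc_eu"): 1, ("svc1", "dc_us"): 3,   # cost of traffic toward users
                ("svc2", "dc_eu"): 2, ("svc2", "dc_us"): 1,
                ("svc3", "dc_eu"): 1, ("svc3", "dc_us"): 4}
capacity = {"dc_eu": 2, "dc_us": 2}                            # max services per data center

x = LpVariable.dicts("place", [(s, d) for s in services for d in dcs], cat=LpBinary)

prob = LpProblem("toy_service_placement", LpMinimize)
prob += lpSum(x[s, d] * (deploy_cost[s, d] + traffic_cost[s, d])
              for s in services for d in dcs)
for s in services:                                  # each service is placed exactly once
    prob += lpSum(x[s, d] for d in dcs) == 1
for d in dcs:                                       # respect data-center capacity
    prob += lpSum(x[s, d] for s in services) <= capacity[d]

prob.solve()
print({(s, d): int(x[s, d].value()) for s in services for d in dcs})
```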

6.2.2 Post-deployment

For approaches suitable for the post-deployment phase, Mansouri et al. [72] suggest two techniques for dynamic replication and data migration in cloud data centers to reduce network cost. The first approach uses techniques from dynamic and linear programming under the presumption that accurate information about the workload on objects is readily available. The second method uses a randomized “Receding Horizon Control” (RHC) approach that exploits knowledge about potential future workloads. Bhamre et al. [12], in their research on deploying service function chains across network function virtualization architectures, suggested a heuristic strategy to address the issue at hand. Dong et al. [23] proposed an online, cost-efficient transmission scheme for cloud users to address the issue of selecting high service levels that result in unnecessary resource waste. The proposed method utilizes an information-agnostic approach, where long-term transmission requests are split into a series of short-term ones, allowing for more efficient utilization of cloud resources. The effectiveness of this approach is demonstrated through the results of experiments and evaluations. Another approach to reducing network infrastructure use is creating a content delivery network (CDN) and caching frequently accessed data.

Another caching approach is proposed by Kumar et al. [54]. They presented a framework for cost-efficient content placement in a Cloud-based CDN. The proposed approach uses a cost matrix and a crosslinking data structure to minimize replica dependencies and optimize content delivery. The system offers faster processing time and improved resource optimization and is suitable for parallel processing in a distributed storage environment. The proposed system enhances user experience in CDN and can be used as a support system. Furthermore, in [133], a study is conducted on the problem of minimizing operational costs in network function virtualization (NFV) in edge clouds. The study focuses on using a Lyapunov optimization framework to analyze the trade-off between queue backlog and overall cost for online packet scheduling, network function management, and resource allocation. A backpressure-based online scheduling algorithm is proposed and evaluated through simulation. The study’s results provide insight into the cost-efficient management of NFV in edge clouds. The summary of the approaches discussed above is shown in Table 5.

Techniques such as dynamic replication, data migration, and heuristic strategies address the dynamic nature of network costs. Additionally, content delivery networks, caching frequently accessed data, and cost-efficient content placement frameworks contribute to reducing network infrastructure use. Although these approaches are focused on network usage cost optimization, since network cost is a sub-element of total storage cost, they have an impact on reducing the overall storage cost in the cloud.

Table 5 Analysis of scientific literature related to network cost optimization

6.2.3 Discussion

As discussed, several methods have been put forth to reduce the cost of network usage in cloud computing. These include techniques like service function chaining, dynamic replication, data placement strategies, cloudlet placement, cost-efficient transmission, and caching. Researchers have used optimization algorithms, heuristics, and simulations to find solutions that minimize data transfer costs and improve application performance by reducing latency and network usage, as well as maximizing resource utilization. However, these proposed algorithms may not work effectively in real-world situations where workloads and prices are more dynamic and complex. The data placement strategies may not always reflect actual user behaviour and usage patterns, and the methods for large-scale deployment may be unfeasible. Since network usage cost is an integral and expensive cost element [49], in practice, one would avoid accessing data remotely and instead move the computation close to the data. An example could be having the same software application step available in different regions, having the data available in different regions, and accessing the data locally in the same region by the software application (assuming the data passed between steps is small compared to the data accessed within a step). The comparison presented in [49] indicates that it is more cost-effective to store multiple copies of data rather than repeatedly transferring it over the network for processing.
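The observation from [49] can be captured with a simple break-even calculation: once remote reads repeat often enough, keeping an extra replica in the consuming region becomes cheaper than paying egress on every access. The prices below are illustrative placeholders, not actual provider rates.

```python
# Illustrative prices (USD per GB); replace with the provider's actual rates.
EGRESS_PER_GB = 0.09        # cross-region / outbound transfer
STORAGE_PER_GB = 0.020      # hot-tier storage in the consuming region

def remote_access_cost(size_gb, reads_per_month, months):
    """Keep a single copy and transfer it over the network on every read."""
    return size_gb * reads_per_month * months * EGRESS_PER_GB

def local_replica_cost(size_gb, months):
    """Copy the data once into the consuming region and store it there."""
    return size_gb * EGRESS_PER_GB + size_gb * STORAGE_PER_GB * months

size, months = 1_000, 12    # a 1 TB data set over one year
for reads in (0.1, 0.5, 1.0):   # average full reads per month
    replicate = remote_access_cost(size, reads, months) > local_replica_cost(size, months)
    print(f"{reads} reads/month -> replicate locally: {replicate}")
```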

6.3 Storage selection with respect to QoS

Organizations increasingly use cloud storage services for their data due to their cost, efficiency, scalability, and availability benefits. While cost and performance, as well as cost and availability, are trade-offs to be considered (as described in Section 7), various approaches to storage selection have been developed for single-cloud and multi-cloud paradigms. However, it can be difficult to identify the storage service or data center that best meets the desired QoS elements. This section discusses some approaches in the literature for storage selection based on different QoS elements such as data consistency, scalability, performance, availability, and security. In these approaches, the cost may or may not be treated as a QoS element.

Storage selection focused on data consistency and scalability

In [131], Xiahou et al. presented a storage selection method by proposing an AHP-backward algorithm-based cloud service selection strategy for data centers, known as 2CMSSS. The strategy addresses architecture, selection, replica layout, and data consistency issues in multi-datacenter HDFS storage systems. It aims to enhance the performance and scalability of cloud storage systems.

Data replication focused on performance and scalability

Zhao et al. [137] studied replica selection and replication strategies for HDFS, considering bandwidth, peer performance, and replica request history. An algorithm was developed to reduce running time and balance the node load. This research provides new insights and approaches to improve the performance and scalability of distributed storage systems.

Storage selection focused on security and customer support

In [42], Ilieva et al. proposed a new approach for evaluating and ranking cloud services and a method for choosing cloud storage as a fuzzy multi-criteria problem. Indicators for evaluating cloud technologies are considered and converted into fuzzy triangular numbers. This new methodology combining multi-criteria and fuzzy approaches is used to rank cloud services. Factors such as product features, functionalities, customer support, and security options are also taken into account.

Storage selection focused on data availability

Another storage selection technique is proposed by Oki et al. [97] in which they presented selection models for cloud storage to satisfy data availability requirements. The models were demonstrated through a prototype implementation on Science Information Network 5.

Storage selection focused on performance and scalability

For a cloud-based video streaming service, Milani et al. [81] proposed a QoS-focused approach for storage service allocation using a hybrid multi-objective water cycle and grey wolf optimizer (MWG) that considers various QoS objectives such as energy, processing time, transmission time, and load balancing in both fog and cloud layers. This approach can improve the performance and scalability of cloud storage systems and provide cost savings for companies.

Storage selection focused on network performance and security

In [51], Khan et al. proposed a smart data placement method for storage options, considering parameters like cost, proximity, network performance, server-side encryption, and user preferences. The evaluation demonstrated the effectiveness of the proposed approach, including data transfer performance and the utility of the individual parameters, in addition to the feasibility of dynamic operations.

6.3.1 Discussion

Tahir et al. [116] conducted a systematic review of cloud storage mechanisms in e-healthcare systems, aiming to identify future research challenges for improving dependability and availability. It provides an overview of current research in the field and highlights the gaps that need further research. Each study targets one or more parameters to present the most suitable data placement strategy, including network performance, bandwidth, processing time, load balancing, data availability, and security. The flexibility to tune the trade-offs between different parameters is missing; for example, if a solution is developed to achieve maximum data availability, it does not consider the impact on cost or other parameters. Moreover, some approaches are highly platform-dependent and cannot be used outside their own ecosystem. Platform-independent storage tier optimization approaches have been presented in [49, 63, 65]; however, there is still a need for a generic approach that considers cloud resources and related QoS factors and can provide an optimized strategy for storage selection and data placement. One such framework is presented in [51] for storage selection, and a proposal for a platform-independent resource placement and cost optimization approach is presented in [50]. A summary of the approaches above is shown in Table 6.

Table 6 Scientific literature for selection of cloud storage based on QoS elements

7 Trade-offs

In a multi-cloud or hybrid setting, trade-offs refer to the integral and complex task of achieving balance between various factors that significantly influence decision-making concerning resource allocation and utilization. The deployment of applications and infrastructure in a cloud environment demands a judicious assessment of numerous trade-offs, including, but not limited to:

7.1 Storage-computation trade-off

There is a fundamental trade-off between storage and computation cost when it comes to distributed computing [132]. This trade-off exists because storage and computation each have their own unique set of costs and benefits. For instance, storing data is typically less expensive in the short term, but as the amount of data grows, the cost of storage can become significant. On the other hand, the computation can be more expensive upfront, but it can also provide greater flexibility and scalability in the long run. This trade-off is particularly relevant in applications that require video processing. Video data takes up a significant amount of storage space, but processing video is also a resource-intensive operation. Organizations must weigh the cost of storing large amounts of data against the cost of processing it and make a decision that best fits their specific needs and budget. A similar trade-off exists in the context of privacy preservation. Data encryption requires computational resources, but storing sensitive data without encryption leaves it vulnerable to attacks. Organizations must weigh the cost of encrypting data against the potential consequences of not doing so, and make a decision that balances cost and security.
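The store-versus-recompute decision mentioned above reduces to comparing two simple cost functions; the sketch below does this with made-up prices for storage and on-demand compute, which are assumptions for illustration only.

```python
def store_cost(result_gb, months, storage_per_gb=0.02):
    """Keep the computed result in a hot tier for the whole period."""
    return result_gb * storage_per_gb * months

def recompute_cost(cpu_hours_per_run, uses_per_month, months, vm_per_hour=0.10):
    """Discard the result and recompute it on demand each time it is needed."""
    return cpu_hours_per_run * uses_per_month * months * vm_per_hour

# A 500 GB derived data set reused twice a month for a year, taking 3 CPU-hours to rebuild:
print(store_cost(500, 12))            # 120.0 under the assumed prices
print(recompute_cost(3, 2, 12))       # 7.2 -> recomputing wins in this example
```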

7.2 Storage-cache trade-off

As discussed in Section 4.1, cloud providers offer different storage tiers with different prices and performance metrics. A tier like the cold tier provides low storage costs but charges more for network usage and for reading and writing data. In contrast, block storage offerings such as Elastic Block Store (EBS) by AWS or managed disks by Microsoft Azure, which can only be used with a compute instance, provide storage at a higher cost with a cheaper network usage cost. Hence, if an application carries out read-and-write operations frequently, it is cheaper to keep the object in EBS as a cache; if not, it can be stored in the hot or cold tier. Higher cost efficiency can be achieved by exploiting the nature of the application at hand: if there are frequent read operations, data can be stored in caches such as EBS or memory attached to compute instances.

7.3 Storage-network trade-off

The choice of the least expensive network and storage resources at the right moment in the object’s lifespan plays a crucial part in cost optimization due to the considerable disparities in storage and network prices across data stores and the time-varying burden on a data object during its lifetime. It can be inefficient to keep items for their entire lifespan in a data store using the least expensive network or storage. As a result, the storage-network trade-off necessitates a method for choosing a data object’s location when its status can switch from hot tier to cold tier and vice versa. Hence, when we know that data will not be accessed for the foreseeable future, it can be moved to the cold or archive tier. Storing data in a cold or archive tier will significantly reduce the storage cost. Krumm and Hoffman [53] showed the cost benefits of moving data between different tiers.
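In practice, this kind of tier migration is commonly expressed as a lifecycle rule. As a hedged example, the boto3 sketch below transitions objects under a given prefix to a cooler class after 30 days and to a deep-archive class after 180 days; the bucket name, prefix, and thresholds are placeholders, and the other CSPs offer equivalent lifecycle-management features.

```python
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="example-bucket",                     # placeholder bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "cool-then-archive",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},   # apply only to this prefix
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    },
)
```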

7.4 Availability-reliability trade-off

As described in Section 5.4, different consistency models have distinct advantages and disadvantages. The CAP theorem concerns the choice between a strong and an eventual data consistency paradigm: in the event of a network partition, the solution architect must decide between data availability and data reliability (consistency), which holds for distributed databases and cloud architectures alike. Each consistency model favours one of the two properties over the other. Eventual consistency does not eliminate data reliability; rather, it accepts temporarily inconsistent reads for a short period of time in favour of availability.

7.5 Cost-performance trade-off

Cloud providers offer different storage tiers, and CSPs like Google Cloud also offer two different network tiers, standard and premium. Data stored in the premium tier has high availability but is more costly. The further we move towards the archive tier, the lower the storage costs are, but SLAs are also reduced; Azure offers very low costs on the archive tier, but retrieval times can go up to 12 hours. In addition, data availability also depends upon the redundancy model. Usually, all CSPs store at least three copies of data, but distributing data between different geographic regions can significantly improve data availability and durability while simultaneously increasing the cost. A multi-region redundancy model should only be selected based on user requirements. Applications such as social media platforms with end-users have a greater need for the lowest possible latency; hence, data must be stored as close to the user as possible. In the case of software applications where low latency is a secondary requirement, a multi-region redundancy model could be a waste of financial resources. In short, the stricter the requirements on data availability, durability, and latency, the higher the cost.

7.6 Discussion

There is no straightforward way of approaching these trade-offs, as it is about achieving balance between various factors. Users have to select one or the other based on their requirements, resources, priorities, and optimization goals. For example, in big data analytics applications, users have to decide between storage-computation trade-offs; similarly, for an application deployed in different geographic regions, users have to face availability-reliability trade-offs. Moreover, trade-offs make the cost optimization process more difficult, as the users always have to deal with cost-performance trade-offs, i.e., cost needs to be reduced while maintaining a certain level of QoS. Navigating the trade-offs involved in a cloud environment requires giving careful consideration to several factors, including costs, QoS, and optimization goals. By clarifying the non-functional requirements, such as cost, performance, availability, reliability, and fault tolerance, users can minimize compromise.

8 Cost ecosystem in context

This section puts the cost ecosystem in context by comparing the cloud storage offerings of three leading cloud providers: AWS, Azure, and Google Cloud. The comparison presents storage cost across various tiers, latency offerings, geographical regions, and availability zones. We aim to present the complexity of the cost structure by showing how each CSP offers multiple storage tiers, each with its own unique characteristics and costs. It is also worth noting that block-rate pricing is not consistently applied: it is available for some storage tiers but not for others. Additionally, we demonstrate how latency (a QoS element) impacts the storage cost, hence a direct relationship between the two. The comparison provides insight into the current state of the market and the various options available to customers, and analyzing the cost of storage in different tiers, latency offerings, data center locations, and availability contributes to understanding how cloud service providers are adapting to the market's demands and advancing their offerings to remain competitive. The information was collected from the official website of each CSP, AWS (US East), Google Cloud (US Central-1), and Azure (US Central), in November 2023 for the single-region redundancy model. Moreover, it only covers the public cloud; the government cloud is not included in this comparison. Note that the examples presented in this section are not intended to evaluate different CSPs with respect to each other, given that in a multi-cloud or hybrid setting, the choice of providers would vary with respect to the highly dynamic needs and contexts.

8.1 Storage cost comparison

This study conducted a cost comparison across all storage tiers (premium, cold, hot, and archive) to provide a comprehensive and detailed analysis of the cost of storing data across different cloud service providers. The cost of cloud storage is highly dependent on the scenario of use and varies across different CSPs. The comparison includes different storage tiers or classes such as standard, infrequent access and archive storage from different CSPs, and shows how the cost varies based on the storage tier and usage patterns. This further demonstrates the complexity of cost structure under varying parameters. The study also aids in understanding the cost-benefit trade-off of storing data in different storage tiers based on usage patterns.

Table 7 Cost comparison for premium tier in US dollars per GB per month
Table 8 Cost comparison for hot tier in US dollars per GB per month

Table 7 shows the comparison of the cost of storing data between three of the most widely used cloud service providers, Google Cloud, Azure, and AWS, in the premium storage tier, which is a higher-cost, higher-performance storage option offered by the three providers. The cost comparison is based on the data storage cost only and does not consider other costs associated with using cloud services, such as compute costs or data transfer costs. It can be seen that Google Cloud offers the cheapest cost in this particular instance for storing data in the premium tier. AWS offers block pricing, i.e., a different cost for different data amounts: the larger the size of the data stored, the cheaper the per-unit cost. For applications with short-term or transient workloads, such as proofs of concept, pilots, application testing, product evaluations, labs, and training environments, the CSP with the lowest premium-tier cost can be employed.

Table 8 shows the comparison of the cost of storing data in Google, Azure, and AWS in the hot storage tier. It can be seen that when it comes to storing data in the hot tier, Google offers the cheapest cost in this particular instance, followed by Azure and AWS. In the hot tier, Azure has a different cost for different data amounts. The larger the size of the data stored, the cheaper the cost. Hot tier storage is a type of storage that is optimized for frequently accessed data. It is typically used in scenarios where data needs to be accessed quickly and frequently, such as online transactional processing workloads, high-performance computing (HPC) workloads, media streaming, gaming, and Big Data. It is important to note that using hot tier storage can come with a higher cost than other types of storage, as it requires more resources to support high performance and frequent access.

Table 9 shows the comparison of the cost of storing data in Google, Azure, and AWS in the cold storage tier. It can be seen that when it comes to storing data in the cold tier, Google and AWS offer the cheapest cost in this particular instance, followed by Azure. The amount of stored data in the cold tier is irrelevant to the storage cost. Since it is a type of storage that is optimized for infrequently accessed data, it is typically used in scenarios where data does not need to be accessed as frequently, such as archival and backup data, data warehousing, object storage, disaster recovery, and cold backup. Hence, it is suitable for storing data that does not need to be accessed frequently but needs to be kept for compliance, regulatory, or historical reasons.

Table 10 shows the comparison of the cost of storing data in Google, Azure, and AWS in the archive storage tier. It can be seen that AWS Deep Archive offers the cheapest cost in this particular instance, but for regular archive storage, Google offers the cheapest cost, followed by Azure and AWS when it comes to storing data in the archive tier. In the archive tier, like the cold tier, the amount of stored data is irrelevant to the storage cost. Applications with dormant workloads occupy no compute capacity and generate no network traffic, reducing the running costs to just storage. Re-animating such a workload through an API call provides unique opportunities to minimize costs. Example use cases: test/development, UAT, unit and system testing, QA environments, and cold disaster recovery sites. AWS Deep Archive provides the cheapest cost in this scenario, whereas Google guarantees instant retrieval at a slightly higher cost.

It can be seen that the number of tiers is not limited to four across all CSPs and that the cost differs for each storage tier. It is also important to note that the price varies based on the geographic location of the storage server; therefore, if one CSP offers the cheapest price in the above-mentioned region, it does not necessarily offer the cheapest cost in other regions as well. Other cost elements, such as network usage, data replication, and transaction costs, must also be considered. Additionally, CSPs offer block-rate pricing, but even that offering is not consistent: for some tiers it is available, while for others it is not.

Table 9 Cost comparison for cold tier in US dollars per GB per month
Table 10 Cost comparison for archive tier in US dollars per GB per month
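A small calculator along the lines below makes such tier-by-tier comparisons repeatable: multiply the stored volume by each provider/tier price and pick the minimum. The prices in the dictionary are placeholders; the real values are those in Tables 7-10 and the providers' price lists, and a complete comparison would also add network, request, and replication charges.

```python
# Placeholder per-GB-per-month prices; substitute the values from Tables 7-10.
PRICES = {
    "provider_a": {"premium": 0.035, "hot": 0.020, "cold": 0.0040, "archive": 0.0012},
    "provider_b": {"premium": 0.030, "hot": 0.018, "cold": 0.0100, "archive": 0.0020},
    "provider_c": {"premium": 0.040, "hot": 0.023, "cold": 0.0125, "archive": 0.0010},
}

def monthly_storage_cost(size_gb, provider, tier):
    return size_gb * PRICES[provider][tier]

def cheapest_option(size_gb, tier):
    """Return (provider, cost) with the lowest storage bill for a given tier."""
    return min(((p, monthly_storage_cost(size_gb, p, tier)) for p in PRICES),
               key=lambda pc: pc[1])

print(cheapest_option(5_000, "archive"))   # e.g., 5 TB of archival data
```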

8.2 Latency comparison in archive tier

Table 11 illustrates the results of the latency comparison in the archive tier for the three cloud service providers. Each provider has designed the archive tier for infrequently accessed data with flexible latency requirements. The data reveals that data retrieval in Azure storage can take up to 12 hours. In contrast, AWS offers additional tiers within the archive tier, namely Glacier Flexible Retrieval and Glacier Deep Archive. Data stored in the former can be retrieved within a range of 1 minute to 12 hours, while in the latter, retrieval times are similar to Azure, with a maximum of 12 hours. Notably, Google Cloud Storage offers instant retrieval for data stored in the archive tier, unlike the other providers. The latency of data retrieval in the archive tier can greatly impact the cost and feasibility of utilizing the archive tier for organizations. Longer retrieval times can increase the cost of accessing data and may not be suitable for use cases that require frequent or near-instant access to archived data. Therefore, retrieval latency should be taken into account when selecting a cloud provider and designing cloud infrastructure for archival storage.

Table 11 Cloud storage latency comparison in archive tier
Table 12 Cloud storage continents

8.3 Regions and availability zones

Table 12 illustrates the comparison of data center availability among different cloud service providers across various regions of the world. It can be observed that AWS has the most extensive coverage of data centers among the three providers, with availability in more regions than Google and Azure. Additionally, the data indicates that Google, as of the time of this study, does not provide data storage facilities in the Middle East region, although it is advertised to be available in the near future. Similarly, Azure currently does not offer storage facilities in the Australian region.

Table 13 presents the number of regions available in each geographical location offered by each cloud service provider. The data illustrates that despite having data centers in fewer continents than AWS, Google offers the highest number of regions at 34, followed by 27 for AWS and 26 for Azure. Table 13 also shows the number of availability zones offered in each region by each cloud service provider. The data indicate that similar to the number of regions, Google offers the highest number of availability zones at 104, followed by 87 for AWS and 81 for Azure.

The geographical coverage of data centers plays a crucial role in the availability, performance, and disaster recovery capabilities of services for organizations utilizing cloud computing. Organizations with a global presence or customers in multiple regions may require data centers in specific locations to minimize latency and comply with data sovereignty regulations. Additionally, the availability of data storage facilities in a particular region can also impact the cost and feasibility of deploying services in that region. Therefore, organizations should take into consideration the geographical coverage and availability of data centers when selecting a cloud provider and designing their cloud infrastructure.

Table 13 Cloud storage regions comparison

This shows the global presence, infrastructure complexity, geographical proximity, and regional extent of CSPs like Google Cloud, Azure, and AWS. It highlights the significance of ensuring data availability, durability, and low latency, all of which are crucial QoS elements, and these CSPs address them by offering multiple regions within each continent and subdividing them into availability zones. However, it is crucial to recognize that as data becomes more distributed, challenges related to data privacy, sovereignty, and compliance with regulations also increase, which in turn raises the cost associated with ensuring and maintaining data security.

8.4 User scenarios

In this article, we have discussed the complexity of cost structure, how different cost elements are related to each other, and how they contribute to the total cost of data storage. However, it is also important to perform a quantitative comparison of the impact on the total cost of data storage. In order to do that, and to make the complexity more visible, this section presents some real-life user scenarios with different application types, such as video streaming, Big Data analytics, and IoT data collection, with discussions about the different cost elements involved in such an application and their effect on the total cost.

8.4.1 Scenario 1: video streaming

This is a scenario for a video streaming service. Taking into consideration the type of application, the following requirements have been set. This application is likely to incur more network usage cost and less storage cost.

  • Storage requirement: 50 terabytes (TB) initially, expected to grow rapidly.

  • Suggested storage tiers:

    • Premium tier: 50% of total storage capacity.

    • Hot tier: 30% of total storage capacity.

    • Cold tier: 15% of total storage capacity.

    • Archive tier: 5% of total storage capacity.

  • Data encryption requirement: AES-256 encryption for all stored video content.

  • Estimated network usage: 500TB.

  • Redundancy model: Multi-region.

  • Trade-offs to consider:

    • Storage-network trade-off.

    • Availability-reliability trade-off.

    • Cost-performance trade-off.

  • Compliance and regulatory costs: Costs associated with ensuring compliance with data protection regulations. It may incur more data transfer cost.

  • Data backup cost: The given application also incurs the cost of data backup, and in case the user opts for multiple backups, an additional data replication cost will be added.

  • Data management cost: An optional cost for data management tools can be added to organize, store, and manage video content efficiently.

  • QoS elements consideration: Low latency and high availability.

Video streaming services, distributed across multiple regions for low latency and high availability, encounter trade-offs related to storage and network performance alongside availability and reliability considerations. Figure 8 shows how different combinations of the redundancy model and storage tier affect the total storage cost. Cost estimates are calculated using the official Google cost estimator with the above-mentioned parameters. The relation between network cost and storage cost can be seen when data is stored in different storage tiers: when data is stored in a higher storage tier, the storage cost is higher, but it has a lower network usage cost. In terms of cost optimization, for this type of application, network cost can be optimized by careful data placement in different regions, hence keeping the data close to consumers. Storage cost can be optimized by carefully migrating data between different storage tiers based on access patterns.

Fig. 8 The monthly cost estimate comparison in US dollars for 50TB of data with 500TB of data transfer while data is stored in different storage tiers with single and multi-region redundancy models
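A back-of-the-envelope estimator for this scenario might look like the sketch below: it combines the tier split, a flat egress price, and a rough multi-region surcharge. All unit prices and the surcharge factor are assumptions for illustration and do not reproduce the Google estimator figures behind Figure 8.

```python
# Assumed unit prices (USD per GB per month) and a rough multi-region surcharge.
TIER_PRICE = {"premium": 0.035, "hot": 0.020, "cold": 0.004, "archive": 0.0012}
TIER_SPLIT = {"premium": 0.50, "hot": 0.30, "cold": 0.15, "archive": 0.05}
EGRESS_PER_GB = 0.08
MULTI_REGION_FACTOR = 2.0      # crude stand-in for storing a second full replica

def scenario_monthly_cost(stored_tb, egress_tb, multi_region=True):
    stored_gb, egress_gb = stored_tb * 1024, egress_tb * 1024
    storage = sum(stored_gb * share * TIER_PRICE[tier]
                  for tier, share in TIER_SPLIT.items())
    if multi_region:
        storage *= MULTI_REGION_FACTOR
    return storage + egress_gb * EGRESS_PER_GB

# 50 TB stored with the tier split above and 500 TB of monthly transfer:
print(round(scenario_monthly_cost(50, 500), 2))
```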

8.4.2 Scenario 2: Big Data analytics

This is a scenario for a Big Data analytics platform. Taking into consideration the type of application, the following requirements have been set. In this scenario, a user is likely to pay more for storage than for network usage.

  • Storage requirement: 50 petabytes (PB) initially, with a projected exponential growth over time.

  • Suggested storage tiers:

    • Premium tier: 40% of total storage capacity

    • Hot tier: 30% of total storage capacity

    • Cold tier: 20% of total storage capacity

    • Archive tier: 10% of total storage capacity

  • Data encryption requirement: AES-256 encryption for all stored data to ensure data security and compliance.

  • Estimated network usage: 1000TB per month for data transfer between distributed computing nodes and storage systems.

  • Redundancy model: Single region.

  • Trade-offs to consider:

    • Storage-computation trade-off.

    • Storage-cache trade-off.

    • Storage-network trade-off.

  • Compliance and regulatory costs: This includes costs associated with ensuring compliance with data protection regulations, such as GDPR or HIPAA, which may involve additional security measures and auditing processes.

  • Data backup cost: The platform incurs costs for data backup to ensure data durability and disaster recovery preparedness. Multiple backup copies may result in additional replication costs.

  • QoS elements consideration: High performance and scalability.

Big Data analytics platforms, typically deployed in single regions for high performance and scalability, face trade-offs in storage, computation, and cache utilization to optimize processing efficiency. Backup and recovery mechanisms, often kept in a single region for data integrity and reliability, must deal with trade-offs between availability, reliability, and cost-effectiveness. Figure 9 shows how different combinations of the redundancy model and storage tier affect the total storage cost. Cost estimates are calculated using the official Google cost estimator with the above-mentioned parameters. The relation between network cost and storage cost can be seen when data is stored in different storage tiers, as can the effect of the redundancy model on storage cost. When data is stored in a higher storage tier, the storage cost increases substantially, but the network usage cost is lower. It is also interesting to see that the same data, when stored in the archive tier with a single-region redundancy model, incurs a higher network usage cost. In terms of cost optimization, storage cost can be optimized by carefully migrating data between different storage tiers based on access patterns.

Fig. 9 The monthly cost estimate comparison in US dollars for 50PB of data with 1000TB of data retrieval while data is stored in different storage tiers with single and multi-region redundancy model

8.4.3 Scenario 3: IoT data collection

This is a scenario for an IoT data collection platform. Considering the nature of the application, the following requirements have been established.

  • Storage requirement: Initially, 100 terabytes (TB) of storage capacity is required to store incoming IoT data streams, with expected continuous growth.

  • Suggested storage tiers:

    • Premium tier: 50% of total storage capacity.

    • Hot tier: 30% of total storage capacity.

    • Cold tier: 15% of total storage capacity.

    • Archive tier: 5% of total storage capacity.

  • Data encryption requirement: All incoming and stored IoT data must be encrypted using industry-standard encryption algorithms (e.g., AES-256).

  • Estimated network usage: Approximately 500TB per month.

  • Redundancy model: Multi-region.

  • Trade-offs to consider:

    • Storage-network trade-off.

    • Availability-reliability trade-off.

    • Cost-performance trade-off.

  • Compliance and regulatory costs: It may have costs associated with ensuring compliance with data protection regulations, industry standards, and privacy laws governing the collection, storage, and processing of IoT data.

  • Data Backup: The data collection platform also incurs costs for data backup to ensure data durability and disaster recovery. Multiple backup copies may result in additional storage and replication costs.

  • QoS elements consideration: Interoperability and data reliability.

IoT data collection platforms, deployed in multiple regions for interoperability and data reliability, have to deal with trade-offs in storage, network usage, and availability versus application performance. Figure 10 shows how a different combination of the redundancy model and storage tier affects the total storage cost for the IoT data collection scenario. The relation between network cost and storage cost can be seen when data is stored in different storage tiers. It also shows the effect of the redundancy model on storage cost. This type of application incurs more network cost than storage cost, so in terms of cost reduction, more focus should be on optimizing network cost. Moreover, storage cost can be optimized by carefully migrating data between different storage tiers based on access patterns.

Fig. 10 The monthly cost estimate comparison in US dollars for 100TB of data with 500TB of data transfer while data is stored in different storage tiers with single and multi-region redundancy model

Table 14 Different user-case scenarios presenting application types and associated cost elements, suitable redundancy model, recommended QoS elements and necessary trade-offs

8.4.4 Discussion

The taxonomy proposed in this context can be useful in understanding and reducing expenses by providing a structured framework to categorize and analyze cost elements associated with cloud storage. In Table 14, more scenarios are provided and summarised; each has its own unique requirements and trade-offs. In scenarios like video streaming and Big Data analytics applications with dynamic resource scaling, the taxonomy enables organizations to identify cost drivers such as compute services, storage solutions, network services, and monitoring/logging tools. By understanding the cost structures of each service and their interrelationships, users can make informed decisions to optimize resource utilization, minimize idle capacity costs, and manage additional expenses associated with multi-region deployments.

9 Discussions

The taxonomy and literature presented in this article provide decision-makers and researchers with valuable insights into the complex problem domain of selecting cloud storage providers for data placement and optimizing their costs. In this section, we discuss each of the various aspects presented in this article; before doing so, we summarise below some of the notable key findings from this study.

  • The cloud storage cost structure consists of multiple elements, both mandatory (such as the cost of data storage, network usage, data replication, etc.) and optional (such as security, data management, etc.). This means that a user who opts to store data in cloud storage not only pays for storing the data; each time the data is retrieved, processed, or updated, these operations incur separate costs. However, the total cost can be reduced by optimizing individual cost elements such as storage and network usage.

  • CSPs offer multiple storage tiers designed to accommodate a wide range of user needs. Each tier has specific cost and performance characteristics. A higher storage tier results in better QoS but also incurs a higher cost, whereas lower storage tiers are cheaper but offer lower performance. Hence, it is inefficient and expensive to keep data in the same tier all the time. By moving data between tiers over time based on the access pattern, latency requirement, and data size, the cost can be reduced.

  • CSPs offer different redundancy models, such as single, dual, and multi-region. Higher data redundancy enhances data availability and decreases latency, but it also adds the extra cost of data replication and backup. Additionally, it creates data consistency issues; hence, users have to face a trade-off between data availability and consistency.

  • The cost of cloud services is influenced by various QoS elements, including network performance, data durability, availability, consistency, security, and interoperability. Storing data in higher tiers, like premium or hot, improves network performance but increases cost. Similarly, high data availability and consistency require redundant storage, backup systems, and frequent data synchronization, hence increasing the cost. Moreover, data security incurs additional costs. Therefore, the users have to face a cost-performance trade-off.

  • To incorporate QoS elements into the total cost, a comprehensive cost optimization strategy is required, such as the cost-effectiveness ratio method [50]. With careful planning and a cost-effectiveness ratio method, users can balance cost and performance trade-offs, leading to efficient resource allocation and cost-effective cloud storage operations.

  • For optimizing the cloud cost, the first step is to understand the major cost drivers, such as storage and network, and then to optimize resource utilization, i.e., eliminate idle and extra resources. Optimization strategies can be implemented in two different phases: pre- and post-deployment of the application. Mostly, pre-deployment approaches are platform-dependent and cannot accommodate a wide range of user requirements. In general, existing cost optimization approaches lack practical implementation and evaluation in real-world scenarios.

  • Cloud storage selection is a crucial process that is performed before application deployment. The goal is to find the most suitable storage option based on user requirements such as storage, performance, QoS (data availability, consistency, scalability, security, customer support), compliance, and cost requirements. Hence, it is a complicated process, and existing approaches are either platform-dependent or focus on only a few of the required parameters. Therefore, a comprehensive framework is required with the flexibility to tune the trade-offs between different parameters (a minimal selection sketch is given after this list).

  • Trade-offs are a complicated topic when it comes to cloud environments, as there is no simple solution to this problem. It is about achieving a balance between various factors. Users have to select one or the other based on their requirements, resources, and priorities. For example, some common trade-offs are: storage-computation, storage-cache, storage-network, availability-reliability, and cost-performance. Trade-offs make the cost optimization process more difficult, as the cost needs to be reduced while maintaining a certain level of QoS, which not only requires an extensive cost optimization framework but also continuous monitoring and resource optimization.

  • Multi- and hybrid cloud environments provide certain benefits, while there are also some challenges related to portability and interoperability, such as cloud-service integration, security, privacy, trust, management, cost models, and service level agreements. In order to achieve interoperability, the challenge is not to make everyone switch to a common solution but rather to allow everyone to keep their setup while still providing uniform access to essential services. In the current multi-cloud setting, it is recommended to implement the required modifications at the application level (e.g., software containers).
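As a simple illustration of how a selection framework (referenced in the storage selection item above) might filter and rank options, the sketch below applies hard constraints, an availability floor and a compliance flag, before ranking the remaining candidates by price. The provider names, prices, and availability figures are purely illustrative.

```python
# Sketch of requirement-driven storage selection: hard constraints filter
# candidates, then the cheapest surviving option is picked. All provider
# data below is illustrative, not taken from real price lists.

providers = [
    {"name": "provider-A", "cost_per_gb": 0.023, "availability": 0.9999, "gdpr": True},
    {"name": "provider-B", "cost_per_gb": 0.018, "availability": 0.999,  "gdpr": True},
    {"name": "provider-C", "cost_per_gb": 0.015, "availability": 0.999,  "gdpr": False},
]

requirements = {"min_availability": 0.999, "gdpr_required": True}


def select(candidates, req):
    """Return the cheapest provider satisfying all hard requirements."""
    eligible = [
        p for p in candidates
        if p["availability"] >= req["min_availability"]
        and (p["gdpr"] or not req["gdpr_required"])
    ]
    return min(eligible, key=lambda p: p["cost_per_gb"]) if eligible else None


choice = select(providers, requirements)
print(choice["name"] if choice else "no provider meets the requirements")
# -> provider-B (cheapest among the compliant candidates)
```

A fuller framework would add tunable weights for soft criteria (performance, support, vendor lock-in risk) on top of such hard constraints.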

9.1 Types of cloud environment

The domain of cloud computing environments is complex, comprising a wide array of configurations. Broadly categorized, these environments fall into two primary domains: those relying on a single cloud service provider and those embracing a mix of multiple cloud service providers and private clouds. The choice between these approaches hinges on the characteristics and requirements of the specific application, reflecting the nuanced considerations surrounding both public and private cloud utilization. The placement of data in a cloud environment can have a significant effect on the application’s cost and performance. For applications that are directly accessed by end-users, such as social media, it is recommended to distribute the data among several cloud storage providers that are geographically close to the end-users, to reduce data retrieval latency and enhance the user experience. However, for applications that are not directly accessed by end-users, it may be more cost-effective to store the data within the same cloud, because moving data between different cloud providers adds extra cost and latency, both of which can hurt the performance of the application. In these cases, storing the data within the same cloud helps to keep costs under control and ensures that the application performs optimally. It is also worth noting that the choice of cloud service provider itself can have an impact on the cost and performance of an application. Different cloud service providers have different pricing models, and some may be more cost-effective for certain applications. Additionally, cloud service providers may differ in performance and reliability, and both cost and performance depend on the geographic location. For example, CSP A may have better network infrastructure or a lower consumer load in region X, and hence better performance in that region; therefore, it is important to choose a provider that not only meets the specific needs of the application but also satisfies the relevant QoS requirements.
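A minimal sketch of such a placement decision is given below: it picks the storage region that minimizes the sum of storage and cross-region transfer cost for a given monthly access pattern. The regions, prices, and read volumes are illustrative assumptions.

```python
# Sketch of a data-placement decision: choose the storage region that
# minimizes storage plus cross-region transfer cost for a given access
# pattern. All prices and volumes are illustrative assumptions.

STORAGE_GB = 10 * 1024                     # 10 TB object set

# Hypothetical per-GB-month storage price per candidate region.
storage_price = {"eu-west": 0.020, "us-east": 0.018, "ap-south": 0.022}

# Monthly read volume (GB) originating from each user region.
reads_from = {"eu-west": 4000, "us-east": 500, "ap-south": 200}


def transfer_price(src: str, dst: str) -> float:
    """Assumed per-GB transfer price: free within a region, flat otherwise."""
    return 0.0 if src == dst else 0.09


def monthly_cost(region: str) -> float:
    storage = STORAGE_GB * storage_price[region]
    egress = sum(vol * transfer_price(region, user_region)
                 for user_region, vol in reads_from.items())
    return storage + egress


best = min(storage_price, key=monthly_cost)
for region in storage_price:
    print(f"{region}: ${monthly_cost(region):,.2f}/month")
print("cheapest placement:", best)   # the region generating most reads wins
```

Under these assumptions, the cheapest placement is the region from which most reads originate, even though its raw storage price is not the lowest, which mirrors the discussion above.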

9.2 Cloud storage taxonomy

The cost structure of cloud computing, and cloud storage in particular, is a complex and intricate system that can be challenging to understand. Different CSPs offer various pricing models, making it difficult to compare costs and choose the best option for a specific application. This complexity arises from the multitude of factors and variables that influence pricing, including storage tiers, data transfer, access patterns, and provider-specific pricing models. As a result, navigating and optimizing these costs necessitates a comprehensive understanding of the complexities involved, making it a challenging yet crucial aspect of cloud service management. Cloud pricing is much more complex than advertised. Customers do not so much “pay as you use” as “pay for what they order”; actual usage is often irrelevant to the bill. If a compute resource is booked, the user is billed for it whether it is used or not. Moreover, when it comes to the total cost of data storage, network usage cost plays an integral part in the total cost. Cloud service providers have built a pervasive infrastructure with servers and a physical network between those servers and data centers. Therefore, users are charged not so much for the actual storage as for using the network infrastructure of these providers. Martens et al. [75] noted that many cloud cost calculations lack a systematic approach to the cost estimation behind various cloud pricing models.

9.3 QoS elements

When choosing a cloud service provider, cost is not the only factor to consider. Other quality of service elements can also have a significant impact on the cost and performance of an application. These elements include network performance, data availability, consistency, security, and others. The cost of these elements is not advertised directly, but it depends on the selection of the redundancy model, storage tiers, and network tiers. The higher the redundancy model or tier, the higher the cost. Therefore, it is necessary to find the cost-effectiveness ratio of the required QoS elements to achieve maximum cost reduction while maintaining application performance. It is also important to consider these factors when deciding which CSP to use, as they can have a major impact on the application’s success. For example, network performance can affect the speed at which data is transferred and the responsiveness of the application. Data availability is critical for ensuring that the application is always accessible to users, while consistency ensures that data is always up-to-date and accurate. Security is also an important factor, as sensitive data must be protected from unauthorized access. Choosing a CSP that offers the right combination of cost and QoS elements is crucial for ensuring the success of a cloud application.
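The sketch below illustrates one way such a cost-effectiveness ratio could be computed: each candidate configuration receives a weighted QoS score, which is then divided by its estimated monthly cost. The weights, scores, and costs are illustrative assumptions rather than measured values.

```python
# Sketch of a cost-effectiveness comparison: each candidate configuration
# gets a weighted QoS score in [0, 1], which is divided by its estimated
# monthly cost. Weights, scores, and costs below are illustrative.

weights = {"availability": 0.4, "performance": 0.35, "durability": 0.25}

candidates = {
    "hot + multi-region":   {"cost": 5800, "availability": 0.95,
                             "performance": 0.90, "durability": 0.95},
    "cold + single-region": {"cost": 1400, "availability": 0.70,
                             "performance": 0.50, "durability": 0.80},
}


def cost_effectiveness(profile: dict) -> float:
    """Weighted QoS score per dollar of monthly cost."""
    qos = sum(weights[k] * profile[k] for k in weights)
    return qos / profile["cost"]


best = max(candidates, key=lambda name: cost_effectiveness(candidates[name]))
for name, profile in candidates.items():
    print(f"{name}: {cost_effectiveness(profile):.6f} QoS/$")
print("most cost-effective:", best)
```

In this toy comparison, the cheaper configuration wins despite its lower QoS score, which underlines that the weights must reflect the application’s actual requirements before such a ratio can drive a decision.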

9.4 Cloud cost and optimization techniques

Many CSPs and cloud computing advocates claim that cloud computing is cheaper because of its Total Cost of Ownership (TCO), and this is also the biggest reason for organizations to move to the cloud, but Weinman [129] argued that “Cloud Computing is not cheap computing”. Hence, it is important to understand the main cost drivers in cloud computing in order to reduce costs. These drivers can vary depending on the CSP and the organization but typically include resource utilization, scale and demand, region and availability, maintenance and management, data egress, SLAs, compliance and regulatory requirements, and third-party services. Resource usage, particularly compute, storage, and network usage, is a key factor in cloud costs. Current cloud optimization methods often fail to address user requirements across computing, storage, and network, which makes it challenging to store data in cloud systems cost-effectively and efficiently. When it comes to cost optimization, the storage architecture can, and should, be optimized before application deployment; this concerns the geographical locations where the data should be stored, as well as when and which data should be moved to which particular storage tier. There are two main aspects here for cost optimization: one is the data and compute resource placement strategy, and the other is migrating data between different storage tiers. Both can be addressed by carefully analyzing the application requirements. Data can be stored in different regions based on access needs and patterns, and compute instances can be deployed closer to the stored data; in this way, substantial network costs can be saved. On the other hand, the actual storage costs can be reduced by moving data that is not accessed frequently to a lower tier, such as the archive tier, right after it is processed.
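As a minimal sketch of the second aspect, tier migration, the function below maps an object’s access pattern to a tier using simple recency and frequency thresholds. The tier names and thresholds are illustrative assumptions; in practice, such policies are usually expressed as provider-specific lifecycle rules.

```python
from datetime import datetime, timedelta
from typing import Optional

# Minimal rule-based tier-selection sketch: map an object's access pattern
# to a storage tier. Thresholds and tier names are illustrative assumptions.

def suggest_tier(last_access: datetime, monthly_accesses: int,
                 now: Optional[datetime] = None) -> str:
    now = now or datetime.now()
    idle = now - last_access
    if monthly_accesses > 100 or idle < timedelta(days=7):
        return "hot"        # frequently or recently accessed data
    if idle < timedelta(days=90):
        return "cold"       # occasional access, lower storage price
    return "archive"        # rarely accessed: cheapest storage, slow retrieval


# Example: an object processed 120 days ago and not read since
print(suggest_tier(datetime.now() - timedelta(days=120), monthly_accesses=0))
# -> archive
```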

Several studies have proposed methods to reduce costs and improve service levels in cloud computing by using techniques such as non-dominated sorting genetic algorithms, entropy approaches, data migration and replication, cost-minimization algorithms, dynamic programming algorithms, and more. These methods aim to minimize data transfer costs and improve application performance while considering aspects such as data availability, transmission, congestion control, and SLA requirements. However, there is a lack of practical implementation and real-world evaluation of the proposed methods, and some methods are unsuitable for dynamic and unpredictable workloads. There is also a need for comprehensive studies that consider the interplay between cost optimization and other aspects of cloud storage, such as security and privacy.

9.5 Trade-offs

In a multi-cloud or hybrid setting, various trade-offs must be considered when allocating and utilizing resources effectively. These trade-offs can impact an application’s cost, performance, and reliability. Trade-offs make the cost optimization process more difficult, as users always have to deal with cost-performance trade-offs, i.e., cost needs to be reduced while maintaining a certain level of QoS. For effective cloud deployment and resource utilization, it is important to consider these trade-offs based on the specific requirements of the applications and infrastructure. For example, if an application requires high levels of reliability, it may be necessary to accept a lower level of availability to achieve this, and vice versa. Similarly, in Big Data analytics applications, users might have to navigate the storage-computation trade-off. Trade-offs cannot be avoided, as suggested by the CAP theorem, which states that a distributed data store can provide at most two of the three guarantees of consistency, availability, and partition tolerance. Hence, navigating the trade-offs involved in a cloud environment requires careful consideration of several factors, including cost, QoS, and optimization goals. However, users can reduce the compromises arising from trade-offs by explicitly defining non-functional requirements, including cost, performance, availability, reliability, and fault tolerance.

10 Conclusions and future work

In this article, a taxonomy for cloud storage cost is presented, including multi- and hybrid clouds. The taxonomy includes the different types of cost elements associated with cloud storage and how they vary across providers. Primarily, the article provides insights into the complicated nature of the cloud cost structure and the complexity of cost models, noting that existing cost models often cannot meet industry-specific needs. Then, various QoS elements (network performance, availability, data durability and consistency, interoperability, and security), which indirectly impact the storage cost, are discussed. Next, the trade-offs involved in employing cloud services are explored, and various cost-optimization strategies are presented. Finally, a comparison of different cloud storage providers and scenarios is provided to put the storage cost into context. The analysis provides decision-makers and researchers with valuable insights into the complex problem domain of selecting cloud storage providers for data placement and optimizing their costs.

Regarding future research, one possible direction is to expand the comparison of cloud vendors with additional metrics beyond cost, for example, QoS factors such as performance, scalability, or support offerings; a more comprehensive view could prove useful. Another direction worth investigating is the influence of different cloud deployment models, such as public, private, and hybrid clouds, on cost and on other critical parameters, such as security and performance. It may also be beneficial to investigate the creation of cost optimization strategies that cater to specific industries and their respective needs. In order to provide a more realistic analysis of the current literature, a study could be conducted to provide not only qualitative but also quantitative comparisons of those approaches. Furthermore, a study on the optimization of cloud storage for various types of data, such as Big Data or critical data, and its impact on cost, performance, and security could provide valuable insights.

When it comes to cost optimization, the development of guidelines, novel algorithms, and approaches can play an important role across various dimensions of cloud computing, including compute, network, and storage cost optimization. The taxonomy proposed in this article provides a base on top of which cost optimization strategies and guidelines targeting one or more cost elements can be developed. For example, one future research area involves the creation of novel frameworks designed for the classification of storage objects into distinct storage tiers based on usage and access patterns. This can enhance the efficient utilization of cloud storage resources and strike a balance between cost and performance. Moreover, there is a pressing need for further research in the area of data management and cloud resource placement to achieve comprehensive optimization of storage and network costs. This includes exploring approaches that strategically place data and cloud resources in a manner that minimizes expenses and maximizes resource utilization. Similarly, the impact of QoS elements can be studied and incorporated into cost optimization frameworks to deal efficiently with the trade-offs. By focusing on these potential areas of research, efficient and cost-effective cloud computing solutions can be developed, ultimately benefiting cloud users.