1 Introduction

The rapid development of technology and Internet of Things (IoT)-based devices in organizations and enterprises gives rise to a progressive increase in various types of data. IoT has become a vital part of human life, and its presence can be felt in our day-to-day activities. Kumar et al. (2019) describe IoT as a revolutionary approach that has changed numerous aspects of human life. It makes our lives easier and more secure by supporting various smart city applications, including pollution control, smart transportation, smart industries, smart home security systems, smart water supply, and many more. Small amounts of data accumulate and give rise to Big Data, which is stored, processed, and analyzed by a set of technologies. Big Data is a large volume of data generated by IoT sensors, servers, social media, medical equipment, and similar sources. Cloud computing is Internet-based computing that enables inexpensive, reliable, simple, and convenient access to resources (Albugmi et al. 2016). Cloud computing provides services and reduces infrastructure maintenance overheads. In addition, it provides better performance to end users and flexibility for storing data in the cloud. However, storing highly confidential Big Data obtained from IoT devices, medical equipment, and servers in the cloud may expose it to attackers. Therefore, data security is a primary concern when a large volume of confidential data is to be stored in the cloud (Sumithra and Parameswari 2022).

Cyber attacks that target IoT devices affect stakeholders and may severely damage physical systems, m-health, and economic systems. Earlier incidents show that IoT devices contain numerous vulnerabilities, and many manufacturers struggle to protect their devices against them (Schiller et al. 2022a). Cloud computing integrates distributed computing, grid computing, and utility computing to establish a shared virtual resource pool (Sun 2019). Privacy and security issues arise in this setting because data owners have no control over the information stored and the tasks carried out on the platform. Various privacy protection methods have been introduced, such as encryption, access control, cryptography, and digital signatures, but they are not strong enough; as a result, attackers can break through the security wall and harm the data in the cloud.

Earlier studies reviewed various methods and suggested measures and directions to protect data in cloud computing and edge computing environments (Ravi Kumar et al. 2018; Zhang et al. 2018). These studies found that data privacy, data remoteness, data leakage, and data segregation are crucial problems. The survey by Hong-Yen and Jiankun (2019) addressed modern privacy-preserving models, focusing on numerous privacy-related frameworks to be implemented in practice.

As a contribution, the current paper aims to accomplish the following objectives:

  i. To examine existing security frameworks, standards, and techniques that span multiple areas of Cloud-IoT technologies.

  ii. To explore and discuss open-ended challenges concerning security and privacy in a Cloud-IoT-based environment.

  iii. To present and discuss the classification of challenges in Cloud-IoT environments after evaluating the existing literature, to provide solutions for the identified open-ended challenges, and to address future security concerns related to Cloud-IoT technologies.

The following are the Research Questions (RQs) that the researchers tried to investigate through the current research paper:

RQ1: How are IoT, Big Data, and Cloud computing technologies interconnected, and why is security a major concern when data is stored in a cloud environment?

RQ2: What are the security objectives for the data security and privacy domain?

RQ3: What are the privacy concerns for end-users in cloud-IoT-based environments?

RQ4: What is the role of edge computing in enhancing privacy in a cloud-IoT environment?

RQ5: What vulnerabilities exist in the cloud-IoT infrastructure?

RQ6: What are the current research trends and areas of focus?

RQ7: What are the advancements in security threat detection and avoidance?

RQ8: How can machine learning be a useful tool in detecting vulnerabilities within a cloud-IoT environment?

RQ9: How can blockchain technology be an effective measure for data security and privacy?

RQ10: What are the current issues in data security and privacy?

The remainder of the paper is organized as follows. Section 2 describes the methodology of this research work. Section 3 discusses the characteristics of the paper and explains how it differs from others. Section 4 covers security goals in the Cloud-IoT environment. Section 5 presents the taxonomy of the Cloud-IoT environment, which includes Big Data and IoT along with their applications in various domains. Section 6 is a comprehensive study of various attacks in the Cloud-IoT environment. Section 7(A) surveys research trends through Table 1. Section 7(B) describes attack vectors and mitigation strategies through Table 2. Section 8 presents an in-depth analysis of digital forensics. Section 9 covers the machine learning and blockchain-based approaches used for threat detection and recovery. Section 10 covers the current challenges in data security and privacy and briefly describes possible solutions, which are listed in Table 3; it also highlights the identified research gaps in Table 4. Conclusions and future research work are discussed in Sect. 11.

Table 1 Research work and focus of research
Table 2 In-depth analysis of various attacks and their counter measures
Table 3 Challenges and solutions for data security and privacy in Cloud-IoT environments
Table 4 Specific gaps the current research aims to address in data security and privacy preservation in cloud-IoT technologies

2 Methodology: a systematic approach

The methodology is the systematic approach used by the authors to conduct the research, analyze the data, and frame conclusions. This section covers the methods and approaches followed in writing the current paper. To examine IoT security challenges and threats, the authors searched a wide range of literature on IoT security. The keywords “IoT” and “Cloud-IoT security” were used to find standard survey papers published by reputed publishers such as IEEE, Elsevier, Springer, and many more. The authors then examined the techniques and methodologies presented in those survey papers, critically analyzed the facts and algorithms, and selected a set of relevant topics that are important from a security perspective, guided by the authors’ own experience in the field of security. In addition, the authors introduced various standard approaches that are recognized in the field for protection against threats. Finally, the authors used Internet-based search techniques to find the most appropriate security products. The methodology of the current paper is divided into three stages as follows.

Phase 1 Identification of the study area, formulation of the research questions, sampling, and establishing the primary search approach or standards.

Phase 2 Applying the search strategy to the existing literature: carrying out keyword searches, Boolean searches that combine keywords and phrases with the operators “AND” and “OR”, and database searches; assessing the results; and formulating selection criteria.

Phase 3 Finding and evaluating approved literature, articles, papers, websites, and web documents according to the chosen primary research topic.

Figure 1 shows the distribution of references by year of publication, that is, the years from which papers were selected to prepare the current paper. The authors selected research papers published from 2016 to 2024. The reviewed material includes published journal and conference papers. Every paper that was searched, examined, and analyzed is included in the references section of the manuscript.

Fig. 1 Reference timeline

2.1 Inclusion/exclusion criteria

The inclusion and exclusion criteria aim to identify the research studies that correspond to the questions under investigation. The primary studies were identified using the inclusion criteria presented below. The exclusion criteria largely represent the negated form of the inclusion criteria and are listed after them.

Inclusion Criteria

IC1: Publications released on and after the year 2016.

IC2: Publications that have been published in peer-reviewed journals, conferences, workshops, etc.

IC3: Articles published in English.

IC4: Articles that contain an abstract and a title.

IC5: Publications that specifically address the subject topics, such as data security, privacy, encryption and decryption, machine learning, blockchain, or related research problems.

IC6: Research that includes subject-specific keywords.

IC7: Systematic reviews, theoretical analyses, and empirical studies.

Exclusion Criteria

EX1: Articles that are loosely connected to the research question or do not answer it.

EX2: Articles whose complete text cannot be accessed.

EX3: Articles not available in English.

EX4: Articles released before 2016, i.e., almost ten years ago.

EX5: Studies with poor ratings or serious methodological errors.

EX6: Articles that do not explicitly address privacy and data security concerns.

EX7: Duplicate publications, which are removed to prevent bias from including the same study more than once.
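The criteria above can be read as a simple filter over bibliographic records. The following is a minimal sketch under stated assumptions, not the screening procedure actually used in this study: the `Paper` record, its field names, and the sample entries are hypothetical, and only IC1 to IC4 and EX7 are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    # Hypothetical record layout; the field names are assumptions.
    title: str
    abstract: str
    year: int
    language: str
    peer_reviewed: bool


def screen(papers, first_year=2016):
    """Keep papers that pass IC1-IC4 and drop duplicates (EX7)."""
    seen_titles = set()
    selected = []
    for p in papers:
        key = p.title.strip().lower()  # normalized title used for deduplication
        if p.year < first_year:                  # IC1 / EX4
            continue
        if not p.peer_reviewed:                  # IC2
            continue
        if p.language.lower() != "english":      # IC3 / EX3
            continue
        if not key or not p.abstract.strip():    # IC4
            continue
        if key in seen_titles:                   # EX7
            continue
        seen_titles.add(key)
        selected.append(p)
    return selected


# Made-up sample records, for illustration only.
sample = [
    Paper("Cloud-IoT security survey", "A survey.", 2019, "English", True),
    Paper("Cloud-IoT Security Survey ", "A survey.", 2019, "English", True),  # duplicate
    Paper("Early cloud privacy study", "Old work.", 2012, "English", True),   # before 2016
    Paper("Untitled preprint", "", 2021, "English", False),                   # fails IC2 and IC4
]
kept = screen(sample)
print([p.title for p in kept])
```

Criteria that require human judgment, such as EX1 and EX5, are deliberately left out of the sketch because they cannot be reduced to a field comparison.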

2.2 Algorithms, tools, and techniques implemented

Algorithms

Algorithm1: Encryption techniques used in the reviewed studies, for example RSA (Rivest-Shamir-Adleman), ECC (Elliptic Curve Cryptography), and AES (Advanced Encryption Standard).

Algorithm2: Machine learning techniques used in the reviewed studies to detect data breaches or to ensure data security, such as anomaly detection methods (e.g., k-means clustering, isolation forests).
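To make the idea of clustering-based anomaly detection concrete, the following is a minimal sketch of one common heuristic: run k-means and flag the members of sparsely populated clusters. It is an illustration only and is not taken from any of the reviewed studies; the function names, the `min_share` threshold, and the request counts are assumptions, and the implementation is restricted to one-dimensional data with k of at least 2.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Plain k-means on 1-D data with deterministic initial centroids (k >= 2)."""
    lo, hi = min(values), max(values)
    # Spread the initial centroids evenly between the smallest and largest value.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Recompute each centroid as the mean of its cluster; keep it if empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


def flag_anomalies(values, k=2, min_share=0.2):
    """Return values that fall into clusters holding less than min_share of the data."""
    _, clusters = kmeans_1d(values, k)
    rare = [c for c in clusters if 0 < len(c) < min_share * len(values)]
    return sorted(v for c in rare for v in c)


# Hypothetical requests-per-minute counts from one device.
requests_per_minute = [10, 11, 9, 10, 12, 11, 10, 95]
print(flag_anomalies(requests_per_minute))
```

Isolation forests follow a different principle (anomalies are isolated by fewer random splits) and are not shown here.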

Tools

Tool1: Scispace, Citation Gecko, and Open Knowledge Map were used to manage and organize references.

Tool2: MS Excel was used to create visual representations (graphs) of the data.

Tool3: Grammarly was used to check grammar and spelling.

Tool4: Paint, Smart Draw, and Origin-Lab were used to draw the figures.

Tool5: iThenticate is used for Plagiarism detection.

Techniques for Search

Technique1: Boolean operators and specified keywords were used to search IEEE Xplore, Scopus, and Google Scholar. An example query is: “data security” AND “privacies” AND (encryption OR data protection) AND “2016–2024”.
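A query of this form can be assembled programmatically, which helps keep the keyword combinations consistent across databases. The sketch below is illustrative only; the helper names `phrase`, `any_of`, and `all_of` are hypothetical, and straight quotes and a plain hyphen are used in place of the typographic characters in the example query.

```python
def phrase(text):
    """Wrap a multi-word term in double quotes so it is matched as a phrase."""
    return f'"{text}"'


def any_of(*terms):
    """Combine alternative terms with OR inside parentheses."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms):
    """Require every term by joining with AND."""
    return " AND ".join(terms)


query = all_of(
    phrase("data security"),
    phrase("privacies"),
    any_of("encryption", "data protection"),
    phrase("2016-2024"),
)
print(query)
```

Each database has its own query syntax for fields and date ranges, so a string built this way would still need to be adapted before use.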

3 Advancing IoT, Big Data, and cloud integration: novelty in current research

In the rapidly evolving landscape of technology, the convergence of the IoT, Big Data, and Cloud computing stands at the forefront of innovation. Each domain, when studied individually, offers significant advancements and benefits. However, the integration of these technologies opens up unprecedented possibilities, presenting both opportunities and challenges. This research work presents the novel aspects of combining IoT, Big Data, and Cloud computing. Further, the paper highlights the transformative impact on various industries and the emerging security concerns. By exploring how these technologies interact, this study aims to uncover new insights and propose solutions to ensure the safe and efficient deployment of integrated systems. The major contributions of the current research paper are as follows:

  i. Integration of IoT, Big Data, and Cloud Computing: The paper examines the combined effects and security threats of integrating IoT, Big Data, and Cloud computing.

  ii. Role Analysis: It offers an in-depth analysis of how IoT, Big Data, and Cloud storage work together.

  iii. Data Flow: The paper explores the process by which data generated from IoT devices becomes Big Data and is subsequently stored in the Cloud.

  iv. Security Threats: It highlights the potential security threats during the transmission and storage of data.

  v. Proposed Protections: The authors propose standard approaches to protect against potential attacks that could compromise the data.

  vi. Digital Forensics: The paper discusses digital forensics as a method to preserve and analyze digital data post-attack, aiding in tracing the attacker’s footprint and identifying patterns and trends.

  vii. Recent Data Security Technologies: The authors address new technologies that have the potential to significantly reduce threats in cloud-IoT environments.

  viii. Research Focus: The authors methodically determine each researcher’s area of focus.

4 Security goals in Cloud-IoT environments: a comprehensive overview

Security in Cloud-IoT environments is paramount due to the interconnected nature of devices and the vast amount of sensitive data they generate and process. Ensuring the confidentiality, integrity, and availability of data and services has become a major challenge as cloud computing and IoT devices become more integrated into everyday life and vital infrastructure. In an ever-changing digital ecosystem, this comprehensive overview seeks to explore the major security objectives, difficulties, and tactics that are crucial for protecting Cloud-IoT environments.

Figure 2 shows the security objectives in a cloud environment. These objectives are essential to guarantee the confidentiality, integrity, availability, and general security of data, applications, and resources hosted in the cloud. They help organizations define their security goals, provide a framework for putting security measures in place, and guide the selection of suitable security solutions. To respond to changing threats and maintain a solid security posture in the cloud, it is essential to regularly assess and update security goals and procedures.

Fig. 2 Security objectives in a cloud environment

Confidentiality: Confidentiality refers to protecting critical data from unauthorized access. The information is revealed or accessible only to authorized persons (Schiller et al. 2022a).

Identification and Recognition: Identification assigns unique attributes to a user or device to differentiate it from others. Recognition validates the claimed identity; for example, when a user supplies a password, it is compared with the saved password to confirm the individual’s identity (Schiller et al. 2022a).

Privacy: Security measures are implemented to safeguard the privacy of individual data and to ensure that data is handled responsibly. Privacy involves protecting personal information (Schiller et al. 2022a).

Authentication: Authentication involves confirming the identities of individuals and protecting against unauthorized access. It typically involves the user providing a username and password (Schiller et al. 2022a).

Availability: Availability refers to the accessibility and usability of data when required by an authenticated person. Maintaining availability includes protecting against denial of service, downtime, and disruptions that can hamper access to data (Schiller et al. 2022a).

Integrity: Integrity ensures that data remains consistent, accurate, and unaltered throughout its lifecycle. It also ensures the trustworthiness of the data (Schiller et al. 2022a).
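The password check described under recognition and authentication can be sketched as follows. This is a minimal illustration rather than a prescribed implementation: instead of comparing against a saved plaintext password, it stores a salted PBKDF2 hash, which is a substitution made here for the sake of good practice. The function names and the iteration count are assumptions.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=100_000):
    """Derive a salted PBKDF2-HMAC-SHA256 digest; only salt and digest are stored."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each enrolled user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_password(password, salt, expected_digest, iterations=100_000):
    """Recompute the digest for the supplied password and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected_digest)


# Enrollment, followed by one valid and one invalid login attempt.
salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

The constant-time comparison avoids leaking, through response timing, how many leading bytes of a guess were correct.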

Case studies that demonstrate how these security goals are implemented in practice are described below:

Estonia’s e-Residency Program: e-Residents receive a government-issued digital ID that is stored on a blockchain. This ID allows them to securely sign documents, access Estonian e-services, and run a business remotely.

MediLedger in Pharmaceutical Supply Chain: MediLedger uses blockchain, a decentralized ledger, to ensure data integrity and transparency.

Civic’s blockchain-based identity verification: It allows users to create and verify digital identities. Further, Enigma uses secure multi-party computation (sMPC) on the blockchain to ensure that data can be shared and analyzed without being exposed.

5 Taxonomy of Cloud-IoT environment

In the rapidly growing and diversifying Cloud-IoT ecosystem, understanding the taxonomy is important for navigating the complexities of connected devices and realizing their full potential.

5.1 The relationship between IoT, Big Data, and cloud computing

There is a strong synergistic relationship between Cloud Computing, Big Data, and the IoT, with each technology augmenting the others’ capabilities. IoT devices collect data, which is uploaded to the cloud for storage and processing. This data accumulates in the cloud and forms a large volume of data known as Big Data. Big Data tools and techniques are then applied to process and scrutinize the data in the cloud. Real-time monitoring and analysis are made possible by the convergence of cloud computing, Big Data, and IoT. This makes it possible to respond and act quickly, which optimizes processes, boosts productivity, and enhances user experiences.

Figure 3 illustrates the relationship between IoT devices placed at remote locations, Big Data, and the cloud. Data generated by the IoT devices is stored and analyzed in the cloud using Big Data tools. Finally, after the data has been processed in the cloud, a decision is made. IoT, Big Data, and cloud computing work together to create a potent trio that propels efficiency and innovation in a wide range of sectors, including manufacturing, agriculture, smart cities, and healthcare.

Fig. 3 Relationship between IoT, Big Data, and cloud computing

5.1.1 Understanding the dynamics of Big Data

In recent years, Big Data has emerged as a paradigm that provides enormous amounts of data and an opportunity to enhance and refine decision-making applications. Big Data offers great value and is considered a driving force behind economic growth and technological innovation (Dutkiewicz et al. 2022). Machines and humans both contribute data through online records, closed-circuit television streaming, and other means. Social media and smartphones create enormous amounts of data every minute (Ram Mohan et al. 2018). Big Data is a large amount of data that arrives fast and is complex, so it is not easy to process using conventional methods. Today, large companies derive a substantial portion of their value from the data they generate, which is continually examined to produce better and more advanced products. A prime example of Big Data is the New York Stock Exchange, which creates one terabyte of fresh trade data daily. Big Data is characterized by the 4 V’s, i.e., Volume, Variety, Velocity, and Veracity, as shown in Fig. 4. Big Data involves three main actions: integration, management, and analysis.

Figure 4A and B illustrate the essential 4 V’s of Big Data, i.e., Volume, Velocity, Variety, and Veracity, through four blocks. The Volume block represents the size of the data, that is, how much information is present; it grows exponentially, reaching petabytes, exabytes, and beyond. The Velocity block shows that data streams into the server for analysis and that the outcome is only useful if the delay is short. It portrays how fast information becomes available: data is generated quickly and must also be processed rapidly. For example, in a healthcare monitoring system, sensors record the activities that occur in the body, and an abnormal situation requires a quick reaction. The Variety block represents the various formats, types, and structures of data that exist, such as sensor data, PDFs, photos, videos, social media data, and time series. The Veracity block ensures that data is consistent, relevant, and complete. As a result, errors can be minimized, accurate results can be produced, and decisions can be taken through analysis of the results.

Fig. 4 A Big Data characteristics. B Four V’s portrayal of Big Data

Apart from its several advantages, Big Data also faces security challenges: attackers can damage or steal information wherever a large volume of data is stored, such as in the cloud or the fog. An attacker who steals data can study and analyze it and thereafter manipulate the outcome of the analysis. Therefore, special protection and privacy measures, such as cryptographic defense mechanisms, should be provided so that data is kept safe and secure (Kaaniche and Laurent 2017). The healthcare industry is one of the most promising areas where Big Data may be used to effect change. Large-scale medical data holds great promise for improving patient outcomes, anticipating epidemics, gaining insightful knowledge, preventing avoidable diseases, lowering healthcare costs, and enhancing overall quality of life. To address security and privacy threats in healthcare, Abouelmehdi et al. (2018) present strategies and approaches documented in the literature, while also outlining their drawbacks.

5.1.2 Connecting the world: the evolution and impact of the Internet of Things

The development of the Internet of Things has revolutionized the Internet market around the world. An IoT device is a device that, when connected to the Internet, transmits, receives, and stores data over the cloud. The Internet of Things comprises several elements, such as sensors, physical devices, and the software that controls them. An IoT device can be anything that carries a UID (Unique Identification Number) by which it can be uniquely identified over the Internet. IoT devices have several benefits, such as high efficiency, more business opportunities, high productivity, increased mobility, and many more. In addition, IoT devices can be deployed to monitor tool execution and to find and diagnose issues before any major breakdown occurs, which reduces maintenance costs and thereby increases throughput. IoT devices are able to gather volumes of data far beyond what any human could collect. As the world develops, data is considered the new oil for the development of any country, so to cope with new challenges, IoT devices should be made smarter than traditional devices so that they can guide and make decisions. To achieve these objectives, IoT devices should be accompanied by machine learning and artificial intelligence technology to enhance their performance and to make sense of the collected data.

Figure 5 illustrates the key components of IoT devices. These components, abbreviated “DGCAU”, are the building blocks of IoT and collectively facilitate the working of IoT devices. Each component is significant in terms of productivity, data collection, monitoring, and connectivity. In Fig. 5, ‘D’ stands for IoT Device. IoT devices, such as medical equipment, smart meters, home security systems, and smart lights, are used to collect data. ‘G’ stands for Gateway, which acts as a centralized hub that connects IoT devices and sensors to the cloud. An advanced gateway facilitates data flow in both directions between IoT devices and the cloud. ‘C’ stands for Cloud, which stores the data and simultaneously supports its analysis. Rapid processing and strong control mechanisms enable cloud-enabled IoT devices to minimize the risk of attack. User identities and data are protected by strict authentication methods, encryption tools, and biometric authentication in IoT devices. ‘A’ signifies the Analysis of the data stored in the cloud to determine the outcome. Analysis tools study large amounts of data and produce useful information, which is helpful in decision-making. The last component, ‘U’, represents the user interface (UI) module that enables users to administer the IoT device with which they are interacting. It is generally a graphical user interface that includes a display screen, mouse, keyboard, etc.
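The flow through the five DGCAU components can be sketched as a small pipeline. The following is a toy illustration under stated assumptions: the class and function names, the temperature readings, and the alert limit are all hypothetical, and networking, authentication, and encryption are omitted entirely.

```python
from statistics import mean


def device_read(sensor_id, value):
    """D: an IoT device produces a reading."""
    return {"sensor": sensor_id, "value": value}


class Cloud:
    """C: stores the readings it receives."""

    def __init__(self):
        self.readings = []

    def store(self, reading):
        self.readings.append(reading)


class Gateway:
    """G: forwards readings from devices to the cloud."""

    def __init__(self, cloud):
        self.cloud = cloud

    def forward(self, reading):
        self.cloud.store(reading)


def analyse(cloud, limit):
    """A: summarize stored data and collect readings above the limit."""
    values = [r["value"] for r in cloud.readings]
    alerts = [r for r in cloud.readings if r["value"] > limit]
    return {"average": mean(values), "alerts": alerts}


def render(summary):
    """U: format the analysis result for display to the user."""
    return f"avg={summary['average']:.1f}, alerts={len(summary['alerts'])}"


cloud = Cloud()
gateway = Gateway(cloud)
# Hypothetical body-temperature readings in degrees Celsius.
for temperature in (36.6, 36.8, 39.4):
    gateway.forward(device_read("thermo-1", temperature))
summary = analyse(cloud, limit=38.0)
print(render(summary))
```

In a real deployment the gateway would also carry commands from the cloud back to the devices, which corresponds to the bidirectional data flow mentioned above.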

Fig. 5 Key components of the Internet of Things

Figure 6 shows the various applications of the IoT, a technology paradigm used to interconnect devices with the Internet, collect data, share data, transmit data, and act upon data. IoT has enormous application in day-to-day life, enabling us to perform a wide range of work conveniently. In smart lighting, IoT can be used to operate lights remotely through a smartphone. In transportation, IoT is used to track vehicles and goods in real time. In health, IoT enables doctors to monitor patients remotely. In logistics, IoT helps to keep track of goods and vehicles. IoT is useful for smart farming because IoT sensors can monitor, measure, and track soil moisture, the nutrients needed for crop fertilization, and irrigation needs. IoT devices used in retail monitor the real-time inventory level and stock of a department and forward orders when a product is discovered to be out of stock. With features like motion sensors, doorbell cameras, and video surveillance, smart home security systems employ the Internet of Things to monitor and secure houses. IoT is used by smart grids to increase the effectiveness and dependability of electricity delivery. Water quality indicators like pH, turbidity, chlorine levels, and pollutants are continuously monitored by IoT sensors. Smart meters with IoT capabilities allow for real-time monitoring of utility consumption. IoT equipment on autonomous vehicles processes sensor data in real time. This entails reading and assessing the environment to make sound decisions regarding safety, navigation, and vehicle control. Wearable gadgets gather information on activities, health, and other topics before sending it to smartphones or the cloud for analysis.

Fig. 6 Internet of Things applications

Apart from their benefits in day-to-day life, IoT devices also suffer from security threats. The rapid growth of IoT devices has revolutionized how we interact with technology, and as the number of IoT devices increases, the security concerns increase as well. Yin et al. (2022) address the issue of securely sharing sensitive data with designated recipients in the context of the Blockchain Internet of Things (B-IoT). The security flaws in computer systems based on cloud, blockchain, IoT, and fog computing have been scrutinized in (Mishra et al. 2022; Yao 2022; Abdulkader 2022). Security challenges and threats in IoT and cloud environments are presented in (Pandey et al. 2023; Ray and Dutta 2020; Bedi et al. 2021). Attribute-based encryption approaches for cloud computing and IoT have been developed and found to be very effective in the security domain (Mihailescu et al. 2022; Henze et al. 2017). Simsek (2023) presents D-CAM, a solution for achieving distributed configuration, authorization, and management across borders between IoT networks. A novel handshake protocol for the broker-based publish/subscribe paradigm in the Internet of Things, offering key exchange-based authentication, authorization, and access control, is presented in (Shin and Kwon 2020; Stergiou et al. 2018). The systematic literature review (SLR) by Wang (2021) examines the body of research on cloud computing security, risks, and difficulties. The primary issue in the cloud environment has been confirmed to be data access, despite the security measures being deemed dependable (Javid et al. 2020; Gai et al. 2021; Shukla 2022). An effective data access control method that uses optimal homomorphic encryption (HE) has been suggested to get around this issue (Gnana Sophia et al. 2023).
Yahuza et al. (2020) highlight the security and privacy requirements of edge computing. Multiple encryption techniques that are significant in protecting privacy and data security are presented in (Sharma et al. 2019; Silva et al. 2018; Bertino 2016). Zhao and Jiang (2020) propose a distributed machine learning-oriented data integrity verification scheme (DML-DIV) to ensure the integrity of training data. An identity-based (ID-based) RDIC protocol that is secure against a malicious cloud server has been introduced (Yu et al. 2017; Sookhak et al. 2018). Various security challenges concerning IoT devices, the Big Data they generate, and the cloud have been studied in (Akmal et al. 2021; Awaysheh et al. 2022; Tang 2020; Shi 2018).

5.2 Navigating the cloud: exploring the world of cloud computing

Cloud computing refers to Internet-based computing in which shared resources, data, software, and information are provided to customers and devices on demand. The term “cloud” is used as a metaphor for the Internet. The cloud computing paradigm makes huge memory space and inexpensive, high-performance computing possible. By moving their local data management systems to cloud storage and using cloud-based services, users can obtain cost savings and productivity benefits for managing projects and developing collaborations. Information and knowledge extraction is greatly aided by computing infrastructure, particularly cloud computing. Cloud computing services are provided over a network, generally the Internet. The characteristics of cloud computing include broad network access, on-demand service, rapid elasticity, and many more. With the help of the cloud, numerous services are accessible to clients. Broadly, three types of services are offered, which enable the client to use software, platform, and infrastructure. Several types of cloud can be subscribed to according to the requirements of an individual or an organization. These include private cloud, public cloud, and hybrid cloud. A private cloud is owned solely by a single business. In this type of cloud, the infrastructure is maintained on a private network, and the hardware and software belong entirely to the organization. Public clouds are cloud services that are shared among various subscribers; the cloud resources are owned and operated by a third party.

The public cloud is mostly used for online office applications, testing, development, etc. A hybrid cloud is a combination of public and private clouds, which is implemented by a couple of interrelated organizations. Common types of cloud services are presented through the 3-layer architecture of cloud services in Fig. 7, and each one is discussed below.

Fig. 7 3-Layer architecture of cloud services

Figure 7 exhibits the different types of cloud services and conveys the three-layer architecture of the cloud. Infrastructure as a Service (IaaS) makes virtualized computing resources available via the Internet, enabling customers to access and manage the essential parts of the infrastructure on a pay-as-you-go basis. These resources often include storage, networking, virtual machines, and more. Platform as a Service (PaaS) is a cloud computing model that offers developers a platform and environment to create, deploy, and manage applications. PaaS provides a variety of tools and services that speed up the application development process and improve its efficiency. Software as a Service (SaaS) is a cloud computing approach that allows users to access software programs online. SaaS has many benefits, including affordability, scalability, and accessibility.

Because crucial data is processed and stored on the cloud, for instance in Internet of Things applications, cloud computing also poses security and privacy issues (Alouffi et al. 2021; Hamzah Amlak and Kraidi Al-Saedi 2023; Yu et al. 2022). Cloud security is an important research area: several authors have investigated its challenges and proposed possible solutions (Gupta et al. 2022; Chaowei et al. 2017; Wang et al. 2021).

To ensure the integrity of data kept in the cloud, several studies propose effective public auditing techniques that make use of a third-party auditor (TPA) (Reddy 2018; Hiremath and Kunte 2017; Yan and Gui 2021). Li and Zhang (2022) propose an efficient certificate-based data integrity auditing protocol for cloud-assisted wireless body area networks (WBANs). A secure architecture that combines DNA cryptography, HMAC, and a third-party auditor has been proposed to provide security and privacy (Kumar 2021; Duan et al. 2019). Adversaries are always coming up with new ways to gain access to users’ devices and data through developing technologies such as the cloud, edge, and IoT; Pawlicki et al. (2023) discuss various attacks along with security solutions. Other papers highlight the research challenges and directions in cyber security for building a comprehensive security model for electronic health records (Chenthara et al. 2019; Hou et al. 2020; Ishaq et al. 2021; Jusak et al. 2022). Binjubeir et al. (2020) survey and analyze privacy-preserving data mining (PPDM) and classify it according to the approaches used for data modification.

Even with all the benefits mentioned, there are security and privacy issues when using cloud computing (Nanda et al. 2020; Himeur et al. 2022). Using cloud computing for Big Data management, storage, and applications complicates the issue of data security and privacy for Big Data. Since cloud services are typically offered on a shared infrastructure, there is always potential for new attacks, both internal and external, such as password theft or the exploitation of application programming interface (API) flaws. One proposed software architecture model uses approaches such as hardware security extensions (Intel SGX) and homomorphic encryption. Another work suggests a virtualization design and related tactics to improve data security in Big Data cloud environments and defend against threats. The Token Identification (TID) model provides security for data: the user holds various access rights as a client, and after the user logs into the cloud network, the authentication access token establishes a connection with the user account. Researchers have developed the Remote Data Checking (RDC) technique, which uses sampling to evaluate the integrity of data outsourced to remote servers. Remote data auditing techniques of this kind are very beneficial in ensuring the integrity and dependability of outsourced data. The DAMO taxonomy comprises four elements: data, auditing, monitoring, and output. Ye et al. (2021) offer a novel security-by-design framework for deploying Big Data (BD) frameworks via cloud computing (Big Cloud). Other authors address various data security issues in the Big Data cloud computing environment. Jain et al. (2016) cover various methods for safeguarding privacy and data security in public clouds.
A multi-cloud architecture with privacy and data security enabled has also been suggested. Another approach aims to increase user security on social networking (SNg) by using techniques that give Big Data (BD) greater privacy; the paper describes the approach along with various metrics and usage-related outcomes. A further study examines financial risk analysis and related regulatory studies using blockchain and Big Data technologies. A secure cloud environment can be achieved with a hybrid cryptographic system (HCS), which combines the advantages of symmetric and asymmetric encryption.
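The sampling idea behind Remote Data Checking can be illustrated with a minimal sketch. It is not the scheme from any of the cited papers: the block size, the use of HMAC-SHA256 tags, and the names `split_blocks`, `tag_block`, and `audit` are all assumptions made for illustration. The data owner keeps a secret key and one tag per block; during an audit, only a random sample of blocks is requested and checked.

```python
import hashlib
import hmac
import random

BLOCK_SIZE = 4  # bytes per block; tiny on purpose so the demo stays readable


def split_blocks(data, size=BLOCK_SIZE):
    """Split the outsourced file into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def tag_block(key, index, block):
    """Bind a block to its position with a keyed MAC (hypothetical tag format)."""
    message = index.to_bytes(8, "big") + block
    return hmac.new(key, message, hashlib.sha256).digest()


def audit(key, tags, fetch_block, sample_size, rng):
    """Challenge a random sample of block indices and verify every response."""
    indices = rng.sample(range(len(tags)), sample_size)
    return all(
        hmac.compare_digest(tag_block(key, i, fetch_block(i)), tags[i])
        for i in indices
    )


if __name__ == "__main__":
    key = b"owner-secret-key"
    stored = split_blocks(b"patient-records-2024")   # held by the remote server
    tags = [tag_block(key, i, b) for i, b in enumerate(stored)]  # held by the owner

    print(audit(key, tags, lambda i: stored[i], 3, random.Random(1)))
    stored[2] = b"XXXX"                               # server-side corruption
    print(audit(key, tags, lambda i: stored[i], len(tags), random.Random(1)))
```

Sampling trades certainty for cost: if c of n blocks are corrupted and s blocks are sampled, a single audit misses the corruption with probability C(n - c, s) / C(n, s), so repeated audits or larger samples raise the detection probability.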

Figure 8 shows a hierarchical structure designed to handle and process data and applications efficiently according to how close they are to the user or the source of the data. “Hierarchical edge computing” refers to the interplay between three layers: cloud, fog, and edge. The cloud layer is a centralized data processing center that provides abundant computing and storage capacity for handling and storing enormous volumes of data as well as running sophisticated applications. The growth of the Internet and its associated ideas, such as edge computing, cloud computing, and the Internet of Things, has had a lasting impact. The cloud layer is highly scalable and well suited to managing large-scale applications and services because it can scale horizontally to handle increased workloads.

Fig. 8 Hierarchical edge computing

The fog layer is an intermediate layer between the cloud layer and the edge layer. It distributes processing responsibilities among several local servers or devices, which can be very useful for IoT applications with many data sources. Fog computing is ideal for latency-sensitive applications that demand quick responses. By extending the cloud to the network edge, fog computing has emerged as a promising paradigm for meeting the growing requirements (e.g., low latency, location awareness, and geographic distribution) of many real-world IoT applications. Virtual components called cloudlets are employed in fog computing; to facilitate data offloading and computation, these virtual computers offer a micro data centre close to mobile devices (Lu et al. 2020). Fog computing thus extends cloud computing systems by bringing services to the edge of the network, and it shortens the time data takes to travel to the cloud and back by processing it closer to the source. The edge layer, which is frequently located adjacent to the IoT medical devices themselves (Muzammal et al. 2018), is the layer nearest to the data source or end users. Edge computing is a promising paradigm that expands on cloud computing capabilities. It processes data close to where it is produced, allowing for extremely quick responses. Sensor devices, industrial machinery, and mobile terminals are examples of edge devices that can function autonomously and make decisions in real time without relying on a central cloud infrastructure (Ghaffar et al. 2020; Jiang et al. 2016). Big Data applications are exposed to cyber security attacks, and such attacks directly affect applications used across several sectors, such as Big Data analytics. A data encryption approach known as the Dynamic Data Encryption Strategy (D2ES) has been presented to protect data, and it appears promising for cloud computing.
Cryptographic methods produce encrypted data, enabling secure communication links within the networking system. Researchers have suggested a blockchain-based Shamir threshold cryptography solution for protecting Industrial Internet of Things (IIoT) data. To improve data security in mobile edge computing, a Fine-Grained Access Control (FGAC) mechanism has been suggested that guarantees data security during data access (Ahmed et al. 2021). Researchers have also attempted to create a model for analyzing and investigating data reduction at the fog level, successfully applying methods including artificial intelligence, principal component analysis (PCA), and the Naïve Bayesian classifier.
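The threshold idea underlying Shamir's scheme can be sketched as follows. This is only the textbook secret-sharing core, not the blockchain-based IIoT solution referred to above; the prime, the function names `make_shares` and `recover_secret`, and the parameters are assumptions chosen for illustration. A secret is hidden as the constant term of a random polynomial, and any `threshold` shares recover it by Lagrange interpolation.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the field must be larger than the secret


def make_shares(secret, threshold, num_shares, prime=PRIME):
    """Split `secret` into points on a random polynomial of degree threshold - 1."""
    if not 0 <= secret < prime:
        raise ValueError("secret must lie in the field")
    if not 1 <= threshold <= num_shares < prime:
        raise ValueError("need 1 <= threshold <= num_shares < prime")
    # The constant term is the secret; the other coefficients are random.
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(threshold - 1)]

    def evaluate(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule modulo the prime
            acc = (acc * x + c) % prime
        return acc

    return [(x, evaluate(x)) for x in range(1, num_shares + 1)]


def recover_secret(shares, prime=PRIME):
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = (num * -xm) % prime
                den = (den * (xj - xm)) % prime
        # pow(den, -1, prime) is the modular inverse (Python 3.8+).
        secret = (secret + yj * num * pow(den, -1, prime)) % prime
    return secret


if __name__ == "__main__":
    shares = make_shares(secret=123456789, threshold=3, num_shares=5)
    print(recover_secret(shares[:3]))   # any three shares are enough
    print(recover_secret(shares[1:4]))
```

Fewer than `threshold` shares reveal nothing about the secret, which is why the scheme suits settings where no single IIoT gateway or node should hold a complete key.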

6 Exploring the complex landscape of Cloud-IoT threats: an in-depth analysis

Security concerns are growing along with the integration of Cloud Computing and the IoT. Numerous dangers and vulnerabilities that might compromise the availability, confidentiality, and integrity of data and services are brought about by the junction of these two technologies. We examine the subtleties, possible effects, and vital necessity of strong security measures to protect against changing hazards in interconnected environments as we delve into the complex nature of Cloud-IoT security concerns in this analysis.

Figure 9 illustrates the numerous types of attacks that can take place in the cloud. These attacks can harm the cloud service provider as well as cloud customers. In cloud computing, an attacker is an individual who attempts to exploit the vulnerabilities or flaws of a cloud infrastructure, platform, or service for malicious purposes. Because cloud systems frequently house significant data and offer computational resources that can be used for a variety of purposes, they are very attractive targets: attackers target cloud environments for data theft, service interruption, resource exploitation, the launching of further cyber-attacks, and the theft of confidential information. To breach cloud systems, attackers use a range of methods and tactics, including insider threats, sniffer attacks, password change request interception, SQL injection, eavesdropping, malware, distributed denial-of-service (DDoS) attacks, phishing, and more (Basit et al. 2021; Ullah et al. 2019; Jahromi et al. 2021).

Fig. 9 Threats in cloud computing environment

DDoS Attack A distributed denial-of-service (DDoS) attack aims to disrupt regular network operations by flooding the network with traffic from many sources, thereby preventing end users from accessing the network.

Man-in-the-Middle Attack In a man-in-the-middle (MitM) attack, the attacker secretly intercepts the conversation between two parties, eavesdrops on sensitive information, and may alter the messages exchanged. MitM attacks seriously threaten the integrity and security of sensitive data.

Sniffer Attack A sniffer attack is one in which an unauthorized person intercepts network traffic. The goal is to capture and examine the data as it passes over the network.

DNS Attack A DNS attack targets the Domain Name System (DNS), which is responsible for converting human-readable names to IP addresses. DNS attacks have the potential to affect the DNS infrastructure’s availability, integrity, and confidentiality, which could cause interruptions to internet services.

DoS Attack A denial-of-service (DoS) attack uses a single source, typically a compromised device or computer, to overwhelm a target’s resources and cause a loss of service.

SQL Injection In an SQL (Structured Query Language) injection attack, attackers inject harmful code into the parameters of a web application. The attackers’ main goal is to manipulate SQL databases. The attack takes advantage of improperly validated input, which enables the attacker to execute SQL commands.

Phishing Attack In a phishing attack, attackers trick victims into revealing sensitive information, for example usernames, personal information, passwords, and credit card details. Phishing attacks often impersonate reliable companies, banks, or websites to trick people into doing things that could jeopardize their security.

Cryptographic Attacks Cryptography is important for ensuring confidentiality and integrity and for authenticating users. In a cryptographic attack, the attacker exploits a vulnerability or weakness in the existing cryptographic system to compromise its security.

XSS Attacks Cross-site scripting (XSS) is a serious attack that occurs when malicious code in the form of a script is injected into a web page viewed by the user. The attacker’s objective is to steal sensitive information about the user by running the script in the user’s browser.

Eavesdropping Attacks Eavesdropping is a kind of attack in which an unauthorized person tries to listen to or sniff the conversation between two parties and steal information. In some cases, the attacker also manipulates the information.

Password Change Request Interception Attack The attacker attempts to intercept legitimate users’ password change requests. Interception of this kind could happen during communication between the browser and the server.
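Two of the attacks listed above, the SQL-based attack and XSS, both arise when untrusted input is treated as code. The following sketch shows one common countermeasure for each, under stated assumptions: the table layout, the sample rows, and the function names are hypothetical, and only Python standard-library calls (`sqlite3` parameter binding and `html.escape`) are used.

```python
import html
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical user rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_unsafe(conn, name):
    # Vulnerable: user input is pasted into the SQL text itself.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def lookup_safe(conn, name):
    # Parameterized: the driver passes `name` as data, never as SQL.
    cursor = conn.execute("SELECT secret FROM users WHERE name = ?", (name,))
    return [row[0] for row in cursor]


def render_comment(text):
    # Escaping turns markup characters into inert HTML entities.
    return "<p>" + html.escape(text) + "</p>"


if __name__ == "__main__":
    conn = make_db()
    payload = "x' OR '1'='1"
    print(lookup_unsafe(conn, payload))  # the injected condition matches every row
    print(lookup_safe(conn, payload))    # no user has this literal name
    print(render_comment("<script>alert(1)</script>"))
```

In the unsafe variant the payload rewrites the WHERE clause so that every row is returned, whereas the parameterized query treats the same string as an ordinary, non-matching name.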

7 Exploring research trends and areas of focus

As technology continues to evolve at a rapid pace, researchers and academics are continually exploring new trends and areas of focus within their respective fields. This exploration is essential for staying ahead of new difficulties, seizing opportunities, and encouraging innovation. In this overview, we explore the current research trends and areas of attention in a variety of disciplines, offering insight into the cutting-edge subjects that are influencing the direction of technological and scientific advancement. After examining a number of published research papers, we identified various domains in which researchers have worked and proposed security frameworks.

Table 1 presents the research work and focus of various researchers in the field of security. From the table, it can be concluded that researchers have focused on cloud computing and that their findings concentrate on cloud security and the Internet of Things. Researchers have primarily focused on developing security algorithms to protect data from being damaged or corrupted by cyber attackers. The study found that researchers have developed innovative techniques that use machine learning and blockchain technology to safeguard Internet of Things data.

Cryptography is another prominent way to protect data. Researchers have created algorithms to encrypt and decrypt data so that it can be transmitted safely over the network. One such method, PSEBVC (Provably Secure ECC and Biometric Based Authentication Framework), was developed as a countermeasure against attacks.

In the digital landscape, the risk of cyber-attacks is growing enormously, which is becoming a challenge for both organizations and individuals. A comprehensive examination of attack vectors and mitigation strategies is essential for understanding and effectively countering these attacks (Wylde et al. 2022a, b). Through an analysis of numerous attack pathways and related mitigation techniques, including the artificial intelligence-based solutions discussed in (Al Hamid et al. 2017; Abed and Anupam 2022), this paper aims to offer important insights into how to enhance security and defend against cyber threats in a constantly changing security environment. The objective of this analysis is to provide individuals and organizations with the knowledge and tools needed to improve their digital security and minimize risks. To this end, each attack mechanism is examined thoroughly, and appropriate remedies are explored in Table 2.

Table 2 summarizes our investigation of several research papers on existing security threats, the categories of attacks that occur on the cloud, and the countermeasures that can be taken to prevent them. The table shows how attacks affect data and which standard approaches researchers have developed to protect it.

8 Unveiling the intricacies of digital forensics in Cloud-IoT environments

Digital forensics is a branch of forensic science that concentrates on the recovery, analysis, and presentation of digital evidence found on electronic devices. IoT forensics can be regarded as a part of digital forensics. The objective of IoT forensics is to examine digital information in an authorized manner. IoT forensic data can be collected from IoT devices, sensors, networks, and the cloud. IoT security and IoT forensics differ in focus. IoT security provides protection against physical and logical security threats and adopts multiple methods to guard against threats and minimize attacks (Unal et al. 2018). Forensics examines the data present in the devices and reconstructs events by using investigative methods to preserve and analyze digital data. Forensics focuses mainly on post-mortem examination, i.e., discovering the shortcomings that an event exposed. Forensic experts obtain digital evidence relating to the event using the standard approaches of forensic analysis, and they determine and reconstruct the events by storing and analyzing digital information with different methods of investigation. Some authors have presented detailed studies that investigate forensic issues in cloud computing and provide possible solutions and guidelines, including existing case studies (Morioka and Sharbaf 2016; Al-Dhaqm et al. 2021). Mahrous et al. (2021) offer an enhanced blockchain-based IoT digital forensics architecture that builds the blockchain’s Merkle tree using a fuzzy hash in addition to the traditional hash for authentication. Almutairi and Moulahi (2023) trained models locally with federated learning on data stored on IoT devices, using a dataset created to simulate attacks in the IoT environment.
In order to make the blockchain lightweight, the authors next carried out aggregation via blockchain by gathering the parameters from the IoT gateway (Almutairi and Moulahi 2023).
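The Merkle tree mentioned above can be illustrated with a short sketch that computes a Merkle root over a list of evidence records. It uses only the traditional hash (SHA-256); the fuzzy-hash extension of the cited architecture is not reproduced here. The record contents, the function names, and the rule of duplicating the last node on odd-sized levels are assumptions made for illustration.

```python
import hashlib


def sha256(data):
    """Return the raw SHA-256 digest of a byte string."""
    return hashlib.sha256(data).digest()


def merkle_root(records):
    """Return the Merkle root of a non-empty list of byte strings."""
    if not records:
        raise ValueError("at least one record is required")
    level = [sha256(r) for r in records]          # leaf hashes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])               # assumed rule: duplicate last node
        level = [
            sha256(level[i] + level[i + 1])       # parent = hash of the two children
            for i in range(0, len(level), 2)
        ]
    return level[0]


if __name__ == "__main__":
    evidence = [b"cam-01 motion 10:02", b"lock-07 opened 10:03", b"hub-02 reboot 10:05"]
    root = merkle_root(evidence)
    print(root.hex())
    # Changing any single record changes the root, exposing the modification.
    evidence[1] = b"lock-07 opened 11:03"
    print(merkle_root(evidence).hex() == root.hex())
```

Because the root depends on every leaf, an investigator who holds only the root stored on the blockchain can later check that a set of collected records has not been altered.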

The IoT has revolutionized various sectors through seamless device interactions, yet it has introduced significant security and privacy challenges. Traditional security measures often fall short due to IoT’s distinct characteristics, such as heterogeneity and resource limitations. Danish Javed et al. (2024a) explored the synergy of quantum computing, federated learning, and 6G networks to bolster IoT security. Quantum computing enhances encryption, while federated learning preserves data privacy by keeping training data on local devices. Leveraging 6G’s high-speed, low-latency capabilities allows for secure, real-time data processing among IoT devices. The study also reviewed recent advancements, proposed a framework for integrating these technologies, and discussed future directions for IoT security. Recent innovations in network communication have revolutionized the industrial sector with automatic communication through the Industrial Internet of Things (IIoT). Despite its benefits, the increased connectivity and use of low-power devices in IIoT heighten vulnerability to attacks, and its diverse nature complicates centralized threat detection. To tackle this, Javeed et al. (2023) proposed a fog-based Augmented Intelligence (IA) defense mechanism that uses GRU and BiLSTM deep learning classifiers for anomaly detection and secure communication. This framework (Cu-GRU-BiLSTM), which achieved up to 99.91% accuracy, surpassed existing threat detection methods, demonstrating its effectiveness for securing IIoT environments.

Further, the hybrid approach proposed by Danish Javed et al. (2024b) enhances intrusion detection in federated learning (FL) for IoT by addressing existing limitations. Here, CNNs identify local intrusion patterns by extracting spatial features, while BiLSTM captures sequential patterns and temporal dependencies. Using a zero-trust model, data stays on local devices, and only the learned weights are shared with the centralized FL server. The server then combines updates to improve the global model’s accuracy. Tests on CICIDS2017 and Edge-IIoTset datasets show this method outperforms centralized and federated deep learning-based IDS.
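The server-side step described above, in which only learned weights leave the devices and the server combines them, can be sketched as a weighted average. This is a generic FedAvg-style aggregation and not necessarily the rule used in the cited work; weighting by local sample count, the flat weight vectors, and the name `federated_average` are assumptions made for illustration.

```python
def federated_average(client_weights, client_sizes):
    """Combine client weight vectors into one global vector.

    Each client's contribution is weighted by its number of local
    training samples (an assumed, FedAvg-style rule).
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client weight vector")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(dim)
    ]


if __name__ == "__main__":
    # Two hypothetical IoT gateways report locally trained weights;
    # the raw traffic data never leaves the devices.
    updates = [[1.0, 2.0], [3.0, 4.0]]
    samples = [1, 3]
    print(federated_average(updates, samples))
```

In a real deployment the vectors would be the flattened parameters of the CNN and BiLSTM layers, and the averaged result would be sent back to the devices for the next training round.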

9 Advancements in security threat detection and avoidance

With the constant advancement in the sophistication of cyber attacks, enterprises and individuals alike are obliged to use innovative methods and technologies to detect, prevent, and mitigate potential security breaches. Threat detection is seeing tremendous breakthroughs, including machine learning algorithms and behavior analysis methodologies, that enable defenders to keep one step ahead of malicious actors. This ongoing change emphasizes how crucial it is to take preventative action to protect sensitive data and maintain digital trust in an environment where threats are becoming more complex.

9.1 Harnessing the power of machine learning

Machine learning is a subset of artificial intelligence (AI) that focuses on developing models and algorithms that enable computers to learn from data and make decisions or predictions without being explicitly programmed to do so. Machine learning algorithms are therefore useful when dealing with vast amounts of data: after being trained on the data (Ali et al. 2020), the model uses what it has learned to produce accurate outcomes on new data. Data generated by IoT devices may be exposed to threats (Safaei Yaraziz et al. 2023). Today, machine learning is one of the strongest tools for identifying threats and maintaining the integrity of data in transmission. The foundation of machine learning is the algorithms used to train the models. The first step in using machine learning to address a problem is gathering data. Next come tasks such as data preparation, data analysis, training, testing, and eventually deploying the model for real-world use. Supervised machine learning algorithms solve two types of problems: classification and regression. Classification is used when the target variable is categorical (for example, yes/no), while regression is used when the target variable is continuous. Phishing has become one of the most prominent attacks faced by internet users and governments. The attackers create web pages that mimic authentic websites and send the URLs to the intended victims via text messaging, social networking, or spam messages. Malware attacks on data in transit are another common type of attack that can manipulate and damage the data. ML models can be one of the tools for identifying such attacks and preventing them.
Machine learning algorithms have been used to build several intrusion detection systems, improving the systems’ ability to identify threats and enabling uninterrupted business operations (Pathak et al. 2023). Although Software-Defined Networking (SDN) offers many benefits, such as nimble and adaptable network growth, malicious attacks that can eventually disrupt network services are unavoidable (Unal et al. 2018). Machine learning has been used in several studies to detect distributed denial-of-service (DDoS) threats in SDN environments (Morioka and Sharbaf 2016). ML models are being trained on numerous datasets to detect cloud attacks with high accuracy. Various classifiers are used in ML models to identify attacks, such as support vector machines (SVM), decision trees, K-Nearest Neighbour (K-NN), Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), random forests, and many more. Malware detection methods based on random forest and K-NN classification have been reported to reach accuracies of 99.7% and 99.9% in several cases (Abed and Anupam 2022; Morioka and Sharbaf 2016). These classifiers can be combined with different feature engineering and feature selection strategies to create machine learning models that effectively handle specific security issues and enhance the overall cyber security posture.
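As a minimal illustration of one of the classifiers named above, the sketch below implements K-NN from scratch and applies it to a toy phishing-URL task. The three features (URL length, number of dots, presence of an "@" symbol), the six training rows, and the names `TRAIN_X`, `TRAIN_Y`, and `knn_predict` are invented for illustration and do not come from any cited dataset or study.

```python
import math
from collections import Counter

# Hypothetical features per URL: (length, number of dots, contains '@' as 0/1)
TRAIN_X = [(20, 1, 0), (25, 2, 0), (18, 1, 0), (90, 6, 1), (75, 5, 1), (110, 7, 0)]
TRAIN_Y = ["benign", "benign", "benign", "phishing", "phishing", "phishing"]


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(knn_predict(TRAIN_X, TRAIN_Y, (85, 5, 1)))  # long URL with many dots
    print(knn_predict(TRAIN_X, TRAIN_Y, (22, 1, 0)))  # short, plain URL
```

The features here are left unscaled for brevity; in practice, feature scaling and a held-out test set would be needed before any accuracy figure could be reported.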

Figure 10 represents security threat detection using machine learning algorithms and models. To detect potential security threats and abnormalities within a file system automatically, a predictive model identifies the threat and classifies each file on the system as malware or harmless. Because such models help enterprises detect and respond to threats more quickly and effectively, they can prevent data breaches and other security problems. The process involves two important stages. The first is the training stage, in which files are sent as input to train the model. Numerous machine learning algorithms, such as decision trees, random forests, support vector machines (SVM), neural networks, and others, can be used to train the model. After the model is trained comes the security stage, in which an unknown file is given to the model for analysis. Supervised learning techniques such as classification and regression are frequently used for the detection of security threats. When the machine learning model spots a potential security threat or abnormality, it generates notifications for security professionals to investigate. Depending on how serious the threat is, automated responses to mitigate or contain the incident may also be triggered. Deep learning approaches are used to detect pirated software and malware-infected files across the IoT network; a deep CNN applied to color image visualizations is used to identify malicious infections in Internet of Things networks. Secure video transmission over the cloud is discussed in (Hossain et al. 2018). Researchers have developed Holistic Big Data Integrated Artificial Intelligent Modelling (HBDIAIM) to provide and improve privacy and security in data management (Chen et al. 2021).
Because previously developed models fall short of providing adequate data privacy and security, Yazdinejad et al. (2024a) developed an Auditable Privacy-Preserving Federated Learning (AP2FL) model tailored for electronics in healthcare. The AP2FL model provides secure training and aggregation processes on both the server side and the client side, thereby minimizing the risk of data leakage. Researchers primarily focus on machine learning-based threat detection models to address the challenges within Consumer IoT, and Federated Learning (FL) techniques are used to maintain data privacy in Consumer IoT (Namakshenas et al. 2024). Sakhnini et al. (2023) suggest an approach to attack detection that uses deep learning (DL) algorithms to identify false data injection (FDI) attacks. Yazdinejad et al. (2022) use federated learning to automatically search for threats in blockchain-based Industrial Internet of Things (IIoT) networks with a threat-hunting framework called Block Hunter.

Fig. 10 Security threat detection using machine learning technique

Real-life applications of machine learning in malware detection

AT&T Uses machine learning to protect networks and find malware that targets telecom infrastructure.

Mayo Clinic A healthcare organization that implements machine learning techniques to safeguard patient data from malware attacks and unauthorized access.

Bank of America Employs AI and machine learning to improve cyber security safeguards, identifying malware and averting breaches in data.

Cylance A cyber security firm that heavily relies on machine learning to identify and eradicate malware. To identify threats instantly, its algorithm is trained on an extensive dataset of both malicious and benign files.

Amazon Web Services (AWS) AWS uses machine learning techniques to identify threats, examining the logs and network traffic.

Symantec An American consumer-based software company that employs machine learning techniques to identify and categorize malware.

National Security Agency (NSA) To improve national cyber security, the National Security Agency (NSA) uses cutting-edge machine learning algorithms to identify and analyze malware.

9.2 Unlocking the power of blockchain: a cutting-edge safeguard technique for enhanced security in the digital landscape

Blockchain is an emerging decentralized technology that securely stores and authenticates transactions across a network of computers. Its decentralized and open structure makes it a viable option for many companies looking to improve digital security, efficiency, and trust. Although cloud computing is becoming more and more popular for processing and storing data, security and privacy are still major issues because of the possibility of hostile attacks on wireless and mobile communication networks. Blockchain technology improves data transfer privacy and system security. In brief, a blockchain is auditable, can function as a distributed ledger with digitally signed data, and allows changes to be traced back to the original data to ensure security. This indicates that blockchain technology can help guarantee the security of data (Safaei Yaraziz et al. 2023). The suggested IAS protocol is built on top of blockchain technology to guarantee the security and authenticity of data transmission in cloud computing. Blockchain technology is a potential solution to the security and privacy problems in the Internet of Things (Williams et al. 2022; Waheed et al. 2020). Because blockchain does away with the idea of a central IoT server, data for every properly authenticated transaction can pass through the blockchain’s distributed ledger. Blockchain technology could therefore provide a more effective answer to the issues that IoT systems confront. The primary security features of blockchain technology are transactional privacy, decentralization, immutability of data, non-repudiation, transparency, pseudonymity, and traceability, as well as integrity, authorization, and fault tolerance. As a function of the blockchain, a smart contract is verified, deployed, and then shared as Distributed Ledger Technology (DLT) over a peer-to-peer (P2P) network (Wylde et al. 2022b).
Yazdinejad et al. (2023) designed and implemented a secure, intelligent fuzzy blockchain framework. The framework makes use of a novel fuzzy DL model, improved adaptive neuro-fuzzy inference system (ANFIS)-based attack detection, fuzzy matching (FM), and a fuzzy control system (FCS) for network attack detection.
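The auditability described above, where changes can be traced back to the original data, rests on each block storing the hash of its predecessor. The sketch below shows that linking in isolation, under stated assumptions: it has no consensus, networking, or digital signatures, and the block layout and the names `block_hash`, `append_block`, and `is_valid` are hypothetical.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder predecessor hash for the first block


def block_hash(index, prev_hash, data):
    """Hash the block's position, its predecessor's hash, and its payload."""
    payload = json.dumps(
        {"index": index, "prev": prev_hash, "data": data}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Append a block that commits to the hash of the current last block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append({
        "index": index,
        "prev": prev_hash,
        "data": data,
        "hash": block_hash(index, prev_hash, data),
    })


def is_valid(chain):
    """Recompute every hash and check that each block links to the previous one."""
    prev_hash = GENESIS_PREV
    for index, block in enumerate(chain):
        if block["index"] != index or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(index, prev_hash, block["data"]):
            return False
        prev_hash = block["hash"]
    return True


if __name__ == "__main__":
    ledger = []
    append_block(ledger, "sensor-1 reading 21.5")
    append_block(ledger, "sensor-2 reading 19.0")
    print(is_valid(ledger))
    ledger[0]["data"] = "sensor-1 reading 99.9"  # tamper with an early record
    print(is_valid(ledger))
```

Altering any stored record invalidates its own hash and, through the `prev` links, every later block, which is what makes unauthorized changes detectable.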

Figure 11 illustrates the specifications of blockchain technology in relation to the cloud environment. Blockchain is a distributed ledger maintained across a peer-to-peer network. Blockchain features can help cloud services reach their full potential and address many of the problems that arise. A detailed record of all transactions is kept in a collection of connected blocks that are linked and arranged in a linear sequence. As the figure highlights, decentralization, security, transparency, availability, and traceability are among the essential features of blockchain technology.

Fig. 11 Specifications of blockchain technology

Decentralization Decentralization in blockchain technologies refers to distributing control and decision-making across the network’s users instead of concentrating them in a centralized entity. It addresses the limitations of a centralized system, in which security is more easily compromised.

Security The network architecture of blockchain technology provides security by minimizing the risk of failure. The distributed nature of blockchain strengthens security: an attack on any single node is less likely to put the entire network at risk.

Automation An intelligent system automates the carrying out of the consensus and removes the requirement for human intervention. Smart contracts enhance transaction efficiency: a smart contract automatically implements the terms and conditions of the agreement whenever its conditions are fulfilled.

Transparency The transactions made on a blockchain are visible to every participant in the network. This not only provides trust and security for the data but also promotes accountability, which helps to gain users’ confidence.

Cost Reduction In conventional systems, the settlement of financial transactions might take several days, causing delays and capital lockups. Blockchain eliminates the need for drawn-out clearing and settlement procedures by enabling near-immediate transaction settlement.

Transaction in Real Time Transactions can be made over the network in real time. Real-time transaction processing relies on techniques such as Proof of Stake (PoS) to obtain quick confirmation of transactions. Such techniques permit fast agreement among nodes on the validity of transactions.
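As a rough illustration of how a stake-based scheme can let nodes agree quickly on who proposes the next block, the sketch below selects a proposer with probability proportional to stake. The validator names, the stake values, the `seed` string, and the function `select_proposer` are hypothetical and are not taken from any specific blockchain; production protocols add verifiable randomness, penalties, and finality rules that are omitted here.

```python
import hashlib


def select_proposer(stakes: dict[str, int], seed: str) -> str:
    """Pick a validator with probability proportional to its stake.

    `seed` stands in for the shared randomness a real protocol would derive
    (an assumption here); every node using the same seed and the same stake
    table picks the same proposer without further communication.
    """
    total = sum(stakes.values())
    if total <= 0:
        raise ValueError("total stake must be positive")
    # Map the seed deterministically onto a ticket in [0, total).
    ticket = int.from_bytes(hashlib.sha256(seed.encode()).digest(), "big") % total
    cumulative = 0
    for validator in sorted(stakes):  # fixed order so all nodes agree
        cumulative += stakes[validator]
        if ticket < cumulative:
            return validator
    raise AssertionError("unreachable: ticket is always below total stake")


stakes = {"node-a": 50, "node-b": 30, "node-c": 20}  # hypothetical stakes
proposer = select_proposer(stakes, seed="block-1024")
```

Because the selection is a pure function of shared inputs, agreement on the proposer requires no search over nonces, which is one reason stake-based schemes can confirm transactions quickly.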

Availability Transaction availability guarantees a user’s ability to communicate with the network and complete transactions dependably. Availability may still be impacted by network maintenance, upgrades, and sporadic problems.

Traceability The traceability features of blockchain make transactions transparent: any transaction can be traced by the user. This makes blockchain helpful in industries where the origin, transportation, and ownership of assets need to be accurately recorded and validated.
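To make the idea of tracing an asset concrete, the following minimal sketch reconstructs the chain of custody of one asset from a list of recorded transfers. The `Transfer` record, the `trace_asset` function, and the sample asset and party names are assumptions introduced for illustration only; they do not correspond to any particular platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    asset_id: str
    sender: str
    receiver: str
    timestamp: int  # time at which the transfer was recorded


def trace_asset(ledger: list[Transfer], asset_id: str) -> list[str]:
    """Return the chain of custody for one asset, oldest holder first."""
    history = sorted(
        (t for t in ledger if t.asset_id == asset_id),
        key=lambda t: t.timestamp,
    )
    if not history:
        return []
    custody = [history[0].sender]
    for transfer in history:
        # Each transfer must start from the current holder; otherwise the
        # recorded provenance is inconsistent.
        if transfer.sender != custody[-1]:
            raise ValueError(f"broken custody chain at {transfer}")
        custody.append(transfer.receiver)
    return custody


ledger = [  # hypothetical records
    Transfer("crate-17", "farm", "processor", 100),
    Transfer("crate-42", "farm", "exporter", 110),
    Transfer("crate-17", "processor", "retailer", 250),
]
custody = trace_asset(ledger, "crate-17")  # ['farm', 'processor', 'retailer']
```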

Auditable Real-time transaction auditing is made possible by the blockchain ledger’s transparency and immutability. At any time, participants can check the transaction history.

Unalterable One of the defining features of blockchain is its ability to keep a secure, tamper-resistant record of transactions. Once information is posted to the blockchain, it cannot be changed, which guarantees the information’s integrity and immutability.
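The immutability described above rests on each block storing the hash of its predecessor, so that a change to any stored record becomes detectable. The sketch below illustrates this with SHA-256 from the Python standard library; the block layout and the helper names `block_hash`, `build_chain`, and `is_valid` are assumptions made for this example rather than the format of any real blockchain.

```python
import hashlib
import json


def block_hash(index: int, prev_hash: str, data: str) -> str:
    """SHA-256 over the block's contents, including the previous block's hash."""
    payload = json.dumps(
        {"index": index, "prev": prev_hash, "data": data}, sort_keys=True
    )
    return hashlib.sha256(payload.encode()).hexdigest()


def build_chain(records: list[str]) -> list[dict]:
    chain, prev = [], "0" * 64  # all-zero hash marks the start of the chain
    for i, data in enumerate(records):
        digest = block_hash(i, prev, data)
        chain.append({"index": i, "prev": prev, "data": data, "hash": digest})
        prev = digest
    return chain


def is_valid(chain: list[dict]) -> bool:
    """Recompute every hash and check each link to its predecessor."""
    prev = "0" * 64
    for block in chain:
        if block["prev"] != prev:
            return False
        if block_hash(block["index"], block["prev"], block["data"]) != block["hash"]:
            return False
        prev = block["hash"]
    return True


chain = build_chain(["alice pays bob 5", "bob pays carol 2", "carol pays dan 1"])
intact = is_valid(chain)                  # True for the untouched chain
chain[1]["data"] = "bob pays carol 200"   # tamper with a stored record
tampered_ok = is_valid(chain)             # False: the stored hash no longer matches
```

Even if an attacker also recomputes the hash of the modified block, the next block still points to the old hash, so the whole remainder of the chain would have to be rewritten on a majority of nodes.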

Figure 12 portrays the basic components of blockchain. These components work together to form a secure and decentralized ledger system, and each plays an important role in the functioning of the blockchain. Major applications of the technology include supply chain management, decentralized applications, voting systems, healthcare, and property registration.

Fig. 12 Basic building blocks for blockchain

Ledgers Ledgers in blockchain technology maintain a transparent record of transactions. Every node holds a replica of the complete ledger, which protects it from alteration and fraud. The ledger records transactions as a chain of blocks; together, the blocks represent every transaction in the ledger.

Blockchain Network In a blockchain network, the users are referred to as nodes. All the nodes collectively validate each transaction and record it in a synchronized manner. Blocks are containers for a cluster of transactions. Each block contains a timestamp, a reference to the previously recorded transactions, and a cryptographic hash of the current block.

Wallet Blockchain wallets are tools that let users manage and store funds safely. A wallet gives users access to their public and private keys, which allows them to send and receive cryptocurrency on the blockchain. There are two types of wallets: hot wallets, which are internet-connected and convenient for frequent transactions, and cold wallets, which are offline and considered safer for storing funds over time.

Events Events are essential for improving the automation, transparency, and usability of blockchain systems. They give decentralized networks a way to communicate and update in real time. The execution of smart contracts or modifications to the ledger’s current state is frequently linked to blockchain events.

Smart Contracts Smart contracts carry out actions based on predetermined criteria. The blockchain records the complete history of smart contract execution, making it transparent and auditable. Smart contracts have numerous applications, for example in supply chains and finance.
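The behaviour described above, in which an action fires automatically once predetermined criteria are met and every execution step is recorded, can be sketched in plain Python. The `EscrowContract` class, its fields, and the party names are hypothetical; the sketch models the logic only and is not code for Ethereum, Hyperledger Fabric, or any other contract platform.

```python
from dataclasses import dataclass, field


@dataclass
class EscrowContract:
    """Toy escrow: releases payment to the seller once delivery is confirmed."""

    buyer: str
    seller: str
    amount: int
    delivered: bool = False
    released: bool = False
    history: list[str] = field(default_factory=list)  # auditable execution log

    def confirm_delivery(self, caller: str) -> None:
        if caller != self.buyer:
            raise PermissionError("only the buyer can confirm delivery")
        self.delivered = True
        self.history.append(f"delivery confirmed by {caller}")
        self._settle()

    def _settle(self) -> None:
        # The predetermined criterion: pay out exactly once, after delivery.
        if self.delivered and not self.released:
            self.released = True
            self.history.append(f"released {self.amount} to {self.seller}")


contract = EscrowContract(buyer="alice", seller="bob", amount=10)
contract.confirm_delivery("alice")
# contract.history now holds the two recorded execution steps.
```

On a real blockchain the `history` list would correspond to the on-chain record of contract execution, which is what makes the outcome auditable by every participant.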

System Management Blockchain technology’s system management characteristics include a variety of operations and procedures meant to guarantee the safety, effectiveness, and appropriate operation of the blockchain network. These characteristics are essential to preserving the dependability and integrity of decentralized systems.

Blockchain Consensus The blockchain’s consensus techniques, such as Proof of Work (PoW) and Proof of Stake (PoS), help make the system resistant to censorship. By requiring distributed agreement among the network’s users, these mechanisms make it more difficult for one party to control or restrict transactions.
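A minimal sketch of the Proof of Work idea is given below: a node must search for a nonce whose hash meets a difficulty target, while any other node can verify the result with a single hash. The functions `mine` and `verify`, the `|` separator, and the difficulty expressed as a count of leading zero hexadecimal digits are simplifying assumptions; real networks use a numeric target and a structured block header.

```python
import hashlib


def mine(block_data: str, difficulty: int) -> tuple[int, str]:
    """Search for a nonce whose SHA-256 digest starts with `difficulty` zero hex digits."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}|{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce, digest
        nonce += 1


def verify(block_data: str, nonce: int, difficulty: int) -> bool:
    """Verification needs one hash, whereas mining needs many attempts."""
    digest = hashlib.sha256(f"{block_data}|{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


data = "tx: alice pays bob 5"  # hypothetical block content
nonce, digest = mine(data, difficulty=3)
accepted = verify(data, nonce, difficulty=3)
```

The asymmetry between costly search and cheap verification is what makes it expensive for a single party to rewrite or withhold blocks while keeping validation affordable for everyone else.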

System Integration Establishing seamless connectivity between different blockchain networks and between blockchain technology and traditional systems is the aim of blockchain system integration. The successful communication and information sharing between diverse systems is greatly dependent upon standards, protocols, and APIs (Application Programming Interfaces).

Membership Services The membership services of blockchain technology provide the functions needed to manage members or participants in a blockchain network. They are used to manage access control, rights, and user identity on the network. These elements of the blockchain ecosystem enhance its overall security, governance, and usefulness.
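The access-control role of membership services can be illustrated with a small role-based check. The role names, the permission sets, and the `MembershipService` class are assumptions made for this sketch; permissioned platforms typically bind identities to certificates rather than to plain strings.

```python
ROLE_PERMISSIONS = {  # hypothetical roles and rights
    "admin": {"read", "write", "enroll"},
    "peer": {"read", "write"},
    "auditor": {"read"},
}


class MembershipService:
    """Tracks which identities belong to the network and what they may do."""

    def __init__(self) -> None:
        self._members: dict[str, str] = {}  # identity -> role

    def enroll(self, identity: str, role: str) -> None:
        if role not in ROLE_PERMISSIONS:
            raise ValueError(f"unknown role: {role}")
        self._members[identity] = role

    def revoke(self, identity: str) -> None:
        self._members.pop(identity, None)

    def is_allowed(self, identity: str, action: str) -> bool:
        role = self._members.get(identity)
        return role is not None and action in ROLE_PERMISSIONS[role]


service = MembershipService()
service.enroll("hospital-1", "peer")
service.enroll("regulator", "auditor")
can_write = service.is_allowed("hospital-1", "write")      # peer may write
auditor_write = service.is_allowed("regulator", "write")   # auditor may only read
```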

Figure 13 shows the internal workings of blockchain technology, which is used to perform transactions securely over the cloud. The figure gives a step-by-step view of how a transaction takes place. In step 1, a user generates a transaction, and the request is directed to the server for further processing. In step 2, the server, after receiving the transaction request, creates a block that represents the transaction. In step 3, the block is linked into a chain of interconnected blocks using algorithms that authenticate the user and ensure that the request comes from an authenticated user. In step 4, the block is distributed to other users or groups of users, who grant permission for the transaction to proceed. In step 5, once the group of users grants permission, the block is added to the existing chain. If any user disapproves or denies it, the block is not added to the existing chain. Once added, a block is permanent and cannot be modified further, which ensures data security in the cloud environment.

Fig. 13 Functioning of blockchain technology in Cloud IoT systems
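The five steps of Figure 13 can be approximated in a few lines of Python. This is a simplified sketch under stated assumptions: the `submit` and `make_block` functions and the two validator functions are hypothetical, user authentication in step 3 is reduced to linking the block to the previous hash, and the approval rule follows the text in requiring every participant to agree.

```python
import hashlib
from typing import Callable

Validator = Callable[[dict], bool]


def make_block(transaction: str, prev_hash: str) -> dict:
    """Steps 2 and 3: wrap the transaction in a block linked to the chain tip."""
    digest = hashlib.sha256(f"{prev_hash}|{transaction}".encode()).hexdigest()
    return {"tx": transaction, "prev": prev_hash, "hash": digest}


def submit(chain: list[dict], transaction: str, validators: list[Validator]) -> bool:
    """Step 1 onward: propose a block and append it only if every validator approves."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = make_block(transaction, prev_hash)
    if all(approve(block) for approve in validators):  # step 4: collect approvals
        chain.append(block)                             # step 5: extend the chain
        return True
    return False  # a single denial keeps the block off the chain


# Hypothetical approval rules standing in for the participating users.
def non_empty(block: dict) -> bool:
    return bool(block["tx"].strip())


def under_limit(block: dict) -> bool:
    return len(block["tx"]) <= 64


chain: list[dict] = []
accepted = submit(chain, "alice pays bob 5", [non_empty, under_limit])
rejected = submit(chain, "   ", [non_empty, under_limit])
```

After these two calls the chain holds only the first transaction, mirroring the rule that a block denied by any user is never added.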

Real-World Applications of Blockchain Technology in Enhancing Security and Data Protection are as follows:

Walmart Walmart, a retail company, collaborated with IBM to implement blockchain technology to track the movement of products, maintain food safety, and minimize the possibility of contamination.

MedRec MedRec is an MIT-developed blockchain-based electronic medical record system that gives individuals more control over their health information while maintaining confidentiality and privacy. Another application uses blockchain to increase the security and efficiency of energy distribution, allowing real-time transactions and decentralized energy management.

Ripple Ripple operates in the financial sector. It uses blockchain techniques to protect data and enable secure real-time payments.

Follow My Vote Follow My Vote creates a safe, open, and verifiable online voting system using blockchain technology.

uPort uPort is a blockchain-based self-sovereign identity platform that empowers people to take control of their online personas while improving security and privacy.

10 Unveiling the challenges: addressing current issues in data security and privacy within the Cloud IoT environment

10.1 Open ended problems

The open-ended problems and primary issues concerning data security and privacy in cloud IoT systems are summarized in Table 3. Table 3 also provides targeted solutions to address each challenge, with the aim of ensuring a robust and secure cloud-IoT ecosystem.

10.2 Research gaps

The research gaps in data security and privacy preservation for cloud-IoT technologies are described in Table 4.

11 Conclusions

The IoT is on the verge of substantial expansion, necessitating secure data transfer and robust cloud storage solutions. As IoT devices become more widespread, the need for enhanced cloud security is critical. Current methods, while helpful, do not fully address modern threats, thus requiring the development of more advanced protective systems. Manufacturers can improve security by creating products grounded in a detailed assessment of IoT security risks and objectives. Effective measures include the implementation of strong authentication methods, such as one-time password (OTP) features, and robust cryptographic systems. While Machine Learning (ML) is widely used for data protection in various sectors, it faces challenges such as scalability issues with small data sets. Integrating ML with homomorphic encryption shows promise but needs further development. The evolving sophistication of hackers compels reliance on ML and AI for defense strategies. Additionally, blockchain technology, supported by platforms like Ethereum and Hyperledger Fabric, offers considerable potential for enhancing security, though more research is necessary to standardize these techniques.

The authors recommend three key solutions:

(i) Develop new security standards and frameworks for cloud-based and IoT devices to tackle modern security challenges.

(ii) Create more efficient ML models for real-time attack prediction.

(iii) Design robust privacy protection protocols for blockchain technology to safeguard sensitive data.

The authors encountered several limitations during their research, including restricted access to relevant literature, challenges in avoiding plagiarism, difficulties in summarizing a large body of research, integrating information logically, and keeping up with the latest studies.