1 Introduction

Due to rapid digitalisation, cybersecurity has become vital to the technological world [1]. Healthcare services have been integrating electronic health records (EHRs) and the Internet of Medical Things (IoMT), which represents a shift towards a more connected and data-driven approach to healthcare. This transformation offers unprecedented opportunities to improve patient care and operational efficiency but, on the other hand, creates significant exposure to cybersecurity threats. The sensitive nature of personal health information makes it an attractive target for cybercriminals. In the healthcare sector, a breach could compromise patient privacy and security.

In the ENISA threat landscape of 2023, the health sector is one of the sectors most targeted in reported incidents [2]. The healthcare sector is also a preferred target of attackers, because of the data’s high commercial value [3]. Consumers’ lack of knowledge about data storage and the increase in internal and external hackers may reduce consumers’ trust in data storage and, furthermore, highlight the urgent need for reliable and scalable cybersecurity mechanisms capable of defending against an evolving threat landscape.

The landscape of cybersecurity within the healthcare sector is fraught with vulnerabilities, leading to a spectrum of potential data breaches; among the most common are hacking or malicious attacks and unintentional disclosure [3]. Human factors also have an important influence on data breaches in healthcare, often serving as a critical vector through which breaches can occur [4]. The most common threats in the healthcare sector identified by ENISA are ransomware, data-related threats, intrusion, Distributed Denial of Service, Denial of Service or Ransom Denial of Service (DDoS/DoS/RDoS), supply chain attacks and malware, among others [5]. There were also numerous instances of data leaks during the COVID-19 pandemic from different healthcare institutions globally, typically with the collaboration of malicious insiders, or due to inadequate system configurations. In Europe, as reported by ENISA, the most common attacks were on patient data or electronic health records, followed by attacks on non-medical IT systems, networks, health information systems, and services [5]. Any hacking of medical devices or services can present a threat, or lower the users’ trust in medical systems. Medical systems carry sensitive information, which is why it is important to develop solutions that can prevent or mitigate attacks that target them.

Several reviews have already been conducted on the topic of cybersecurity and healthcare. For instance, one study delved into cybersecurity risks faced by hospitals, with particular emphasis on the implications for non-cyber professionals [1]. Another notable review of 31 research articles performed in 2016 aggregated insights on modern threats and trends in cybersecurity in healthcare [6]. In a short review, the authors presented cybersecurity research topics focused partly on healthcare [7]. Moreover, the influence of human factors on cybersecurity within healthcare organisations has been explored, underscoring the complex interplay between technology and human behaviour [4]. A paper on state-of-the-art security and privacy issues in healthcare provided a comprehensive review [8]. The onset of the COVID-19 pandemic introduced new dimensions to these challenges and potential solutions applicable in the healthcare domain [9]. None of the reviews mentioned above has conducted a systematic review of data breach solutions developed for healthcare.

Although research on cybersecurity is abundant and growing, there is a notable research gap in studies that focus specifically on the unique challenges and requirements of the healthcare sector. Generic cybersecurity solutions often fail to take into account the critical needs of healthcare data protection. This study seeks to fill this gap by providing a comprehensive analysis of cybersecurity mechanisms tailored to the healthcare industry. We aim to assess the current state of data breaches in healthcare, identify the specific vulnerabilities and threats faced by the sector, and review the effectiveness of various security solutions in mitigating these risks.

The rest of the paper is organised as follows. In Sect. 2, we present the literature review. In Sect. 3, we describe the systematic review methodology, research questions and selection of studies. In Sect. 4, we present the results, first focusing on bibliometric data, and then on the systematic analysis of publications. We discuss the findings in Sect. 5 and conclude in Sect. 6.

2 Background

Healthcare information is very sensitive and important, and data breach incidents can compromise patient privacy and even pose a concrete threat to patients’ well-being. Recognising the gravity of these risks, legislators have been agile in updating regulations to strengthen data protection. In the EU, for example, the General Data Protection Regulation (GDPR) and the Directive on measures for a high common level of cybersecurity across the Union (NIS2) exemplify rigorous efforts to enforce pseudonymisation and encryption of patients’ personal data [10, 11]. In the USA, the Health Insurance Portability and Accountability Act (HIPAA) establishes data protection requirements for health information [12]. The COVID-19 pandemic has raised additional concerns about the need to protect modern healthcare systems against cyber attacks, which frequently target healthcare institutions [13, 14].

From 2011 to 2021, 3,822 personal health information breaches were identified in the United States, affecting more than 283 million people [15]. Most of the data breaches were hacking/IT-related. Furthermore, hospitals accounted for one-third of data breaches among different types of healthcare providers, according to research on data from 2009 to 2016 [16]. Several data breaches in healthcare in recent years have had severe impacts. Brno University Hospital was hit by a ransomware attack and had to postpone surgeries and appointments [17]. Similarly, a ransomware attack on the University of Vermont Health Network cost around 50 million US dollars [18]. Gilead Sciences, Inc. suffered a phishing attack, leading to impersonation and exfiltration [19]. High-profile ransomware attacks on institutions have caused immense financial losses and disruption to operations, which demonstrates the serious consequences of such breaches.

Many applications connected to Healthcare 4.0 are now moving to cloud computing, fog computing, the Internet of Things, and other mechanisms, to improve information sharing between different stakeholders [20]. Internet of Things devices, and especially the Internet of Medical Things (IoMT), present a significant risk in hospital networks, where unwanted persons could gain access to the system and attack it, which poses a major risk to patients [21].

In today’s healthcare systems, the amount of data collected is enormous, and data stored by the systems pose substantial privacy and security challenges. These concerns are a priority on the agenda for developers and managers of healthcare systems, who are trying to protect patient data from unauthorised access and cyber threats. To mitigate such risks, cryptography is used to reduce the vulnerability of data storage systems to potential breaches significantly. Security and privacy measures, in this context, include a range of protocols and practices designed to control access to patient data closely, to ensure that it is protected from unauthorised access. Healthcare systems strive to maintain the highest standards of data integrity and confidentiality, using advanced cryptographic techniques and robust security and privacy protocols to ensure patients’ trust and security.

Although the understanding of cybersecurity in the healthcare sector is well established, there is still a gap, in particular in the development of applications and robust solutions that address the vulnerabilities of healthcare systems, which are of a critical nature. This study will present specific threats and possible solutions to mitigate the attacks, and therefore aims to improve data privacy and security, as well as the resilience of healthcare systems. A wide range of cybersecurity solutions are presented, from encryption methods to the use of Machine Learning. The important concepts in the fields of data security and healthcare are introduced in the following subsections.

2.1 Healthcare and data

Data within the healthcare sector are both sensitive and invaluable, and that is why the healthcare field is among crucial industries where data have to be saved securely, and access to these data has to be adjusted to the needs. Secure data storage and well-implemented access control are both important for achieving the highest security and privacy of electronic or personal health records, especially with the records in Healthcare 4.0.

Healthcare data can be stored in two forms: electronic health records (EHRs) or personal health records (PHRs). EHRs usually collect and store clinical information data associated with patients’ healthcare, and are used by hospital staff and similar [22, 23]. PHRs typically allow individuals to access and control the use of their health information. This poses even bigger privacy and security issues, since these systems usually cannot be put behind a wall, and are normally accessible from anywhere. Both types of data storage can be connected to each other, or can be separate.

2.2 The development of technological solutions

Technological solutions for healthcare data must be updated continuously, considering the needs of the market and possible attacks. One of the technologies that is often involved in their development is Blockchain. Systems that have Blockchain technologies implemented can enhance the security and reliability of healthcare data, and offer more comprehensive access control for patients and other healthcare institutions [24]. Similarly, Machine Learning is utilised widely to detect malfunctions, or to confirm the authenticity of data, which is crucial in countering data manipulation attacks that make data unreliable and useless [25].

Furthermore, it is required by legislation that encryption is used in the development of healthcare technologies. However, current encryption protocols might be vulnerable due to the advancements in quantum computing [26]. Yet, healthcare data need to be secured in the best way possible. We distinguish between asymmetric and symmetric encryption. In symmetric encryption, the same key is used for encryption and decryption. The most used symmetric algorithm is AES (Advanced Encryption Standard).

For asymmetric encryption, data are encrypted with the public key, and can be decrypted only with the private key. The most widely recognised asymmetric encryption methods are the RSA algorithm, Elliptic-curve cryptography (ECC), homomorphic encryption and identity-based encryption. The RSA (Rivest-Shamir-Adleman) algorithm, introduced in 1977, remains the most widely used algorithm, applied frequently in the cloud, image, and wireless domains [27]. Homomorphic encryption is often used in healthcare because it allows data to be processed while maintaining confidentiality. In addition to the key generation, encryption and decryption algorithms needed by other encryption protocols, homomorphic encryption also requires a homomorphic property [28]. Identity-based encryption utilises a random character sequence as a public key, to reduce the key exchange overhead before communication, while elliptic-curve cryptography is based on elliptic curves and offers functionality similar to RSA, but with simpler calculations [29].
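The public/private key relationship described above can be illustrated with a minimal sketch of textbook RSA. This is an illustration only: the tiny primes, the exponent and the messages are assumptions chosen for readability, there is no padding, and nothing here is secure or taken from the reviewed papers. The last check shows the multiplicative property of unpadded RSA as a simple example of what a homomorphic property means; practical homomorphic encryption schemes work differently.

```python
# Toy textbook RSA (insecure: tiny primes, no padding). All values are assumed.
p, q = 61, 53                  # toy primes, far too small for real use
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)        # Euler's totient of n
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # private exponent (modular inverse, Python 3.8+)


def encrypt(message: int) -> int:
    """Encrypt an integer message smaller than n with the public key (e, n)."""
    return pow(message, e, n)


def decrypt(cipher: int) -> int:
    """Decrypt a ciphertext with the private key (d, n)."""
    return pow(cipher, d, n)


m1, m2 = 7, 12
c1, c2 = encrypt(m1), encrypt(m2)

# Only the private exponent recovers the plaintext.
assert decrypt(c1) == m1 and decrypt(c2) == m2

# Multiplying ciphertexts corresponds to multiplying plaintexts (mod n).
product_cipher = (c1 * c2) % n
assert decrypt(product_cipher) == (m1 * m2) % n
```

In this sketch the product of two ciphertexts decrypts to the product of the two plaintexts, so a party holding only ciphertexts can compute on the data without seeing it.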

Hash functions are used to ensure the integrity of content, i.e., to detect any content modification. A hash function takes input of any size and generates a fixed-size output. The output is the same whenever the same content is hashed again. Hash functions are also used for password verification and other applications. Among the most prevalent hash functions are SHA-1 (Secure Hash Algorithm) and SHA-2, developed by the U.S. National Security Agency [30]. SHA-1, though somewhat outdated, produces 160-bit message digests. SHA-2 consists of SHA-224, SHA-256, SHA-384 and SHA-512, each generating outputs of the size their names suggest (measured in bits).
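The properties described above (fixed-size output, identical output for identical input, and a different output after modification) can be checked with a short sketch using Python's standard `hashlib` module. The record contents are hypothetical and serve only as an illustration.

```python
import hashlib

# Hypothetical record contents, used only for illustration.
record = b"patient=0001;diagnosis=example"
tampered = b"patient=0001;diagnosis=changed"

# Digest size in bits for SHA-1 and the SHA-2 family members named in the text.
algorithms = ("sha1", "sha224", "sha256", "sha384", "sha512")
digest_bits = {
    name: hashlib.new(name, record).digest_size * 8 for name in algorithms
}

# Hashing the same content twice gives the same digest ...
first = hashlib.sha256(record).hexdigest()
second = hashlib.sha256(record).hexdigest()

# ... while modified content gives a different digest.
changed = hashlib.sha256(tampered).hexdigest()

print(digest_bits)
print("same input, same digest:", first == second)
print("modified input detected:", first != changed)
```

Comparing a stored digest with a freshly computed one is the basic integrity check: any mismatch indicates that the content has been modified.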

2.3 Types of attacks

Cyberattacks fall into different classifications and threat levels, including hacking or malicious attacks, where ransomware is among the top threats. Kruse, et al. [6] identified cybersecurity trends, including ransomware, showing that healthcare professionals lack awareness of cybersecurity threats. Ransomware frequently disrupts hospital systems, leading to the shutdown of operating rooms and the cancellation of appointments, thereby endangering patient health, as well as causing financial losses. Ransomware typically encrypts all the data and, in this way, denies a system and all of its legitimate users access to the data. To obtain the decryption key, an organisation is generally asked to pay a very large sum of money.

Phishing attacks represent another severe risk for data security. These, along with other, less common social engineering attacks, typically give attackers access to the system, from where they can escalate their attack (e.g. by installing ransomware or malware). Typically, attackers install malware on the victim’s device, allowing them to access data, or obtain sensitive information that can be sold on the dark web. An impersonation attack can also present a threat, where someone pretends to be a coworker or the head of an organisation and can gain access to data, credentials, or funds. This can lead to unauthorised access, where authentication credentials have been stolen and the hackers can gain access to data in the system. Other notable hacking attacks include injection attacks, where attackers can gain unauthorised access to data, which can also be modified or deleted. Man-in-the-middle attacks are also important to mention: the attacker positions themselves between the client and the server and, by impersonating the server to the client and vice versa, can obtain information that was not intended for them.

Insider threats are a recurring issue in healthcare, where staff, whether intentionally or unintentionally, may expose credentials or data. A stolen or lost device can also lead to losing sensitive information that can be used for extortion, unauthorised access, or selling it on the dark web. Human behaviour remains a fundamental risk to the organisation, and often companies do not educate their employees adequately about the potential threats [31]. Additionally, the type of personnel involved at the healthcare facilities can have a significant impact on the likelihood of data breaches [32].

3 Methodology

The objective of our study was to carry out a systematic review of the existing solutions for data breach mitigation in healthcare. We followed the guidelines for reporting systematic reviews – the PRISMA 2020 statement [33].

3.1 Research questions

In this study, we proposed the following research questions:

  • RQ1: How have the solutions for data breach mitigation in healthcare evolved over time?

  • RQ2: How have the proposed solutions addressed data breach mitigation in healthcare?

  • RQ3: What are the trends in technologies used for data breach mitigation solutions in healthcare?

3.2 Data sources and study criteria

We included the following electronic databases:

  • PubMed,

  • Scopus,

  • Clarivate Analytics—Web of Science (WoS),

  • Medline EBSCO,

  • ACM Digital Library (ACM).

Three reviewers performed the review, and the primary search returned 1,128 results. The search was performed in online libraries in July 2023. The query string was defined to answer the proposed research questions: (“data breach*” OR “data hack*”) AND (“*health*” OR “medic*”). The search string was intentionally broad, so as to include all possible papers on this topic.
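To make the logic of the query string concrete, the following sketch approximates it with regular expressions applied to a title or abstract. It is an approximation under stated assumptions: each database implements wildcards and phrase matching in its own way, and the function name `matches_query` and the sample titles are hypothetical.

```python
import re

# ("data breach*" OR "data hack*"): phrase with right-hand truncation.
BREACH_TERMS = re.compile(r"\bdata (?:breach|hack)\w*", re.IGNORECASE)

# ("*health*" OR "medic*"): "health" anywhere in a word, or words starting with "medic".
HEALTH_TERMS = re.compile(r"\w*health\w*|\bmedic\w*", re.IGNORECASE)


def matches_query(text: str) -> bool:
    """Return True if the text satisfies both halves of the AND condition."""
    return bool(BREACH_TERMS.search(text)) and bool(HEALTH_TERMS.search(text))


# Hypothetical titles, used only to exercise the pattern.
samples = [
    "A blockchain framework against data breaches in eHealth systems",
    "Data hacking of medical devices",
    "Data breaches in the banking sector",
    "Healthcare analytics with machine learning",
]
for title in samples:
    print(matches_query(title), "-", title)
```

The first two sample titles satisfy both conditions, while the last two each satisfy only one, which shows why the broad string still restricts results to the intersection of the two topics.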

We established the following inclusion criteria:

  • original research studies;

  • only journal, conference papers, or book chapters;

  • published within the last ten years (2014–2023);

  • contains in abstract or title: (“data breach*” OR “data hack*”) AND (“*health*” OR “medic*”);

  • papers proposing solutions in the form of frameworks, protocols, algorithms, architecture, or models to mitigate data breaches in healthcare.

We established the following exclusion criteria:

  • papers not written in English;

  • review papers;

  • reports;

  • position papers.

The flow diagram is presented in Fig. 1.

Fig. 1 Flow diagram of search

In the first stage, we identified 1,128 results from five electronic databases. In this stage, we had already limited the search results to those published in the last ten years (2014–2023) and included only English publications. We then excluded the 546 duplicates found across the five electronic databases. This left us with 582 records for screening. During this phase, we screened the abstracts, keywords and titles, and verified whether the papers were published as journal papers, conference papers or book chapters. We also verified whether the paper output was a solution in the form of a framework, protocol, algorithm, architecture or model for the mitigation of data breaches in healthcare. We excluded review papers and reports. We ended up removing 475 records, and were left with 107 papers for which we tried to access the full-text articles. Eight articles were excluded due to the unavailability of the full text. Ultimately, we were left with 99 full-text papers to be included in the review [25, 34, 35, 40–135].
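The counts reported for the selection stages can be verified with a few lines of arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative.

```python
# Counts taken from the selection stages described in the text.
identified = 1128            # records returned by the five databases
duplicates = 546             # duplicates removed
excluded_in_screening = 475  # records removed during screening
full_text_unavailable = 8    # articles without accessible full text

screened = identified - duplicates
sought_for_retrieval = screened - excluded_in_screening
included = sought_for_retrieval - full_text_unavailable

# Reference numbers cited for the included papers: 25, 34, 35 and 40 to 135.
cited_references = [25, 34, 35] + list(range(40, 136))

print(screened, sought_for_retrieval, included, len(cited_references))
```

The three intermediate totals reproduce the 582, 107 and 99 reported in the text, and the number of distinct reference numbers also equals 99.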

3.3 Data collection

During data collection, we gathered different information from the selected papers. First, we collected bibliographic data such as authors, title, year of publication, publication type, and name of the journal or conference. Subsequently, we retrieved the number of citations from Google Scholar for each paper. Then, we read the papers in full, and extracted information on the field of healthcare, what the paper addressed, what the contribution of the paper was, what technologies, encryption and hash functions were used, and what attacks the solutions were trying to mitigate.

For the healthcare field, we categorised the papers into the following groups:

  • Electronic health records (EHRs);

  • Personal health records (PHRs);

  • Data storage;

  • Access control.

Next, we analysed and collected information on whether the paper addressed privacy issues, security issues, or both. We also noted the platform for which the solution was developed, categorising this into Cloud, IoT, or not defined.

Regarding the contributions of the paper, we classified them into four groups:

  • Framework;

  • Architecture;

  • Algorithm / protocol;

  • Model.

We collected data from the papers, to determine if they had developed a framework for a solution, created architecture, developed an algorithm/protocol, or presented a model. Framework and architecture are often structural designs of technologies or sequences used, and can also be implemented in part. Algorithms or protocols are typically solutions that are implemented to some extent. Models, on the other hand, often present a conceptual design, but are not implemented. A single paper could include more than one of these categories.

Next, we analysed the technologies that the paper used. In the final analysis, we separated them into five groups:

  • Blockchain;

  • Artificial Intelligence (AI);

  • Encryption;

  • Methods;

  • Other.

When assigning the papers, we initially extracted a comprehensive description of the technologies and then sorted them into different fields. We identified whether the authors used Blockchain or AI methods. When the authors developed a theoretical model, they mostly used existing methods to illustrate it, and we assigned these papers to the methods category. Encryption was used in many papers, but, for the encryption category, we selected specifically the solutions that focused solely on encryption. Additionally, we conducted a separate, more thorough analysis of the encryption methods used in the proposals.

For the encryption methods used in the proposals, we organised them into four groups and seven subgroups:

  • Asymmetric encryption:

    • RSA;

    • Homomorphic encryption;

    • Elliptic-curve cryptography (ECC) encryption;

    • Identity-based encryption;

    • Other.

  • Symmetric encryption:

    • AES;

    • Other.

  • Unclassified;

  • Not Mentioned.

As for the hash functions used, we categorised them into these subgroups:

  • SHA1;

  • SHA256;

  • SHA512;

  • Used, but unclassified;

  • Not used.

We also documented how the solutions were evaluated or verified, usually through comparison with existing methods or other types of evaluation and verification.

For the possible attacks, we used existing literature classifications [3, 34], and adapted them to the information found in the selected publications:

  • hacking / malicious attack;

  • unauthorised access;

  • man-in-the-middle attack;

  • impersonation attack;

  • insider threat;

  • loss or theft of a device;

  • undefined.

Although man-in-the-middle attacks could be grouped under hacking attacks, and impersonation attacks and insider threats could be classified under unauthorised access, we maintained these three groups separately, because they appeared in more than three solutions. For the remainder of the papers, we tried to classify them into outlined groups based on the attack name or description.

4 Results

4.1 Bibliometric overview

In this Section, we provide a bibliometric overview of the selected publications. A complete Table of the selected papers with bibliometric data is presented in Appendix A. The 99 selected publications comprise 1 book section, 50 conference papers and 48 journal papers.

Figure 2 illustrates the total number of publications per year, including the breakdown by journals, conferences and the book section. The year 2023 is not included fully, because the search was done in July 2023.

Fig. 2 Distribution of publication types by year of publication

The journals and conferences that published more than one of the 99 selected publications are presented in Table 1. This information offers valuable insights for researchers aiming to publish future papers on similar topics.

Table 1 Frequency of publications at journals or conferences

An analysis of the citation count per paper was performed in September 2023. Remarkably, one paper from 2018 had 726 citations [35], highlighting the dynamic nature of the research area. On average, each paper received 19.42 citations, which shows a high level of interest and impact.

4.2 Analysis of publications

In this Section we will present the analysis of the content of publications that have developed solutions for data breach mitigation in healthcare.

4.2.1 Data breach mitigation solutions in healthcare

As presented in Fig. 3, various solutions were developed across different healthcare fields for data protection. The area of electronic health records protection has been rising in the last few years, being incorporated into 51 solutions altogether. Moreover, data storage for various data types (e.g. health insurance, medical image sharing) has been researched more in the last few years, and was presented in 44 solutions. Access control was implemented in 17 solutions, while personal health records were addressed in 7 solutions.

Fig. 3 Distribution of solutions developed by year and field

Regarding privacy or security, our findings indicate that 50 solutions addressed the users’ data privacy, and 82 focused on data security. Interestingly, one-third of all papers addressed both aspects. Out of all publications, 21 were developed specifically for the cloud, 11 for IoT, and the remainder did not specify any platform.
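The one-third overlap follows from the two counts by inclusion and exclusion, as the sketch below shows. It assumes, which the text does not state explicitly, that each of the 99 papers addressed at least one of the two aspects; the variable names are illustrative.

```python
# Counts reported in the text.
total_papers = 99
privacy = 50    # solutions addressing data privacy
security = 82   # solutions addressing data security

# Assumption: every paper addressed at least one of privacy or security,
# so |privacy AND security| = |privacy| + |security| - |total|.
both = privacy + security - total_papers
share_both = both / total_papers

print(both, round(share_both, 3))
```

Under that assumption, 33 of the 99 papers addressed both privacy and security, which matches the stated one-third.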

In terms of the contribution of the publication, Fig. 4 reveals that the majority of contributions were frameworks (36), followed by architecture (32) and algorithms or protocols (28), while models were developed in 8 publications. Six publications fell into two of the four contribution categories. More frameworks, architecture and algorithms have been developed in recent years.

Fig. 4 Overview of publication contributions

As shown in Fig. 5, the technologies used most frequently in the solutions were Blockchain (45 solutions), AI and encryption (15 solutions each), and methods (7 solutions). An additional 28 solutions used technologies that did not fit under the previously mentioned categories. Notably, Blockchain has been used in solutions since 2018, and AI methods began appearing in 2019, with both seeing increased use in recent years. Methods were typically theoretical solutions proposing models without technical implementation.

Fig. 5 Technologies used in publications over time

Figure 6 outlines how the contributions were verified or evaluated. Comparisons to other models, algorithms, or similar approaches were utilised in 50 publications. Meanwhile, 12 used other methods, such as testing the results on various datasets, or validating the solution using developed validation tools. A notable gap in verification was observed in model contributions, because they were typically not technically implemented, and in framework contributions, which, in some cases, also lacked implementation.

Fig. 6 Contributions and verification methods

4.2.2 Encryption and hash functions in the proposed solutions

Regarding data encryption, 81% of papers mentioned its use, whether symmetric, asymmetric, or not classifiable under either category, as presented in Table 2. There were 21 solutions with asymmetric encryption; the most common method was Elliptic-curve cryptography, used in 7 papers, followed by homomorphic encryption (6 papers), the RSA algorithm (4 papers), and identity-based encryption (2 papers). Symmetric encryption was used in 20 solutions, with AES being the encryption method in 13 papers. In 44 publications, encryption was mentioned but could not be categorised as either asymmetric or symmetric. In 21 publications, the authors did not mention encryption.

Table 2 Data encryption in the analysed papers

Hash functions were implemented in 56 solutions, of which the most common algorithms were SHA256 (8 cases), SHA1 (4 cases) and SHA512 (1 case). SHA1 was used in 3 solutions from 2016 and one from 2020, while SHA256 algorithms were used from 2020 on. In 43 solutions, hash functions were used, but remained unclassified. In the remaining 43 publications, hash functions were not mentioned.

4.2.3 Mitigation of attacks

Table 3 presents the types of attacks addressed by the reviewed solutions, based on classifications from the literature on trends in data breaches [3, 34]. The most common mitigation was for hacking or malicious attacks, addressed in 15 publications. Unauthorised access was mitigated in 12 publications, and man-in-the-middle attacks in 9 publications. Impersonation attacks were mitigated in 5 publications, and insider threats in 4 publications. Loss or theft of a device was dealt with in one publication. It is noteworthy that a significant portion, two-thirds of the reviewed papers, did not specify the attacks targeted by their proposed solutions.

Table 3 Distribution of solutions by type of mitigated attacks

In the context of technological solutions employed against these attack types, Fig. 7 outlines the technologies adopted in the mitigative approaches. For this analysis, we excluded 66 publications with undefined attack mitigation strategies. Blockchain technology stood out as a versatile defence, proposed across all categories of cyberattacks. Secondly, technological solutions focusing only on encryption were used for countering hacking / malicious attacks, impersonation attacks and man-in-the-middle attacks. Furthermore, AI technologies were implemented to mitigate hacking / malicious attacks and impersonation attacks.

Fig. 7 Technological responses to various attack types

5 Discussion

In this section, we will present the trends in the intersection of healthcare and data breaches by answering the research questions.

5.1 RQ1: How have the solutions for data breach mitigation in healthcare evolved over time?

Our comprehensive review of the last decade shows a consistent and marked growth in the research devoted to the development of solutions for data breach mitigation within the healthcare sector. In the first years there were up to 5 publications on this topic; however, the landscape began to shift in 2019, as shown in Fig. 2. This interest has focused mainly on the protection of EHRs and of data storage, as well as on improving access control systems and the security of PHRs, as outlined in Fig. 3.

The proposals often included technological interventions with developed frameworks, architecture, or algorithms/protocols. The development of models has been relatively infrequent, since these papers typically involve mainly theoretical methods that could be used for future practical implementation. Frameworks were the most frequent type of contribution. Fig. 5 presents the technologies used over time in the selected proposals, and shows a significant rise in Blockchain and AI technologies in recent years. The selected solutions use these technologies, which are novel in the healthcare sector, to mitigate attacks [36].

These advancements reflect a proactive response to the evolution of cyberthreats in healthcare information systems, but nonetheless this field needs an even clearer direction and more technologically advanced solutions to cope with the increasing frequency of attacks.

5.2 RQ2: How have the proposed solutions addressed data breach mitigation in healthcare?

The scope of the proposed solutions reveals a multifaceted approach to confronting data breaches, integrating frameworks, architecture, algorithms / protocols, and theoretical models, as shown in Fig. 4. The technologies used encompass Blockchain, AI, encryption methods, other non-technical methods and other technologies, as presented in Fig. 5. Most importantly, these solutions aim to mitigate different attacks, with a comprehensive breakdown provided in Table 3.

Emerging trends point to a shift towards addressing hacking / malicious attacks and unauthorised access incidents, which was similarly found in prior research [34]. Improper disposal of hard drives, or loss and theft of physical devices, is not as common any more, since more attacks happen from distant locations or from insiders. These attacks were mitigated by using different technologies, as presented in Fig. 7. Blockchain technologies, in particular, were proposed for each of the mentioned attacks, offering novel defence mechanisms, especially for secure data storage and sharing, which is very important in the healthcare field. Further, the proposed solutions used encryption and hashing when storing data, as presented in Table 2. The reviewed solutions apply various technologies, in particular Blockchain and AI, to address data breach mitigation in healthcare.

5.3 RQ3: What are the trends in technologies used for data breach mitigation solutions in healthcare?

Notably, Blockchain technology has become a major player because of its reliable data security capabilities. Blockchain has been used in proposals since 2018, and the number of proposals using it is increasing annually, as shown in Fig. 5. Similarly, AI was first used in a proposal in 2019, and its use has risen annually since. Some proposals included only encryption methods, and their number has remained roughly constant through the years.

In recent years, most proposals have presented frameworks, architecture or algorithms/protocols, as presented in Fig. 4. As presented in Fig. 6, the technical solutions were evaluated predominantly by comparison with existing methods or frameworks, although some papers also used other types of verification. Furthermore, as presented in Table 2, most proposals included some form of encryption; in 21 cases it was asymmetric encryption, and in some it was symmetric encryption, with AES as the leading algorithm. Hash functions were used in 56% of the solutions. Where the hash function could be identified, SHA-256 was used in most cases and has been the preferred algorithm since 2020, while the older SHA-1 algorithm has not been used in the last three years.

6 Conclusion

Digital transformation in healthcare has improved patient care but also increased the vulnerability of sensitive data to cyberattacks. The sensitive nature of healthcare data therefore requires stringent protection measures, as well as cybersecurity strategies to counter the sophisticated tactics used by malicious actors. Research highlights the urgent need to train hospital employees and to collaborate with experts in order to reduce breaches [1, 37]. Our comprehensive analysis of 99 research papers revealed the evolution of mechanisms to mitigate data breach attacks over the past decade, focusing on the protection of electronic health records, data storage, access control and personal health records. The solutions examined ranged from frameworks and architecture to algorithms/protocols and models, and reflected the efforts of the research community to develop innovative strategies to protect healthcare data. Particularly noteworthy are the roles of Blockchain technologies and AI, which are key to addressing healthcare data security. In addition, cryptography is essential for securing the data, and is also needed because of legislative requirements. The predominant use of encryption methods and hash functions in the reviewed solutions is in line with the recommendations from previous studies [38].

The threats addressed by the solutions are mainly hacking or malicious attacks, followed by unauthorised access and insider threats. This is also in line with the findings of Lee, who used a text mining approach to find insider threats in the healthcare industry, and found that theft of devices and employee negligence were among the top reasons for threats [39]. To mitigate data breach attacks, careful planning, timely implementation of solutions and tracking of attack trends are crucial, and systems should be updated regularly.

In conclusion, this paper provides an overview of the current state of solutions to mitigate data breach attacks and also serves as a valuable resource for stakeholders in Healthcare 4.0. In addition, the research identifies several gaps that point to future research directions, ranging from the full adaptation and integration of Blockchain and AI technologies, to the development and implementation of specific cybersecurity training programmes for healthcare employees. There is also a noticeable gap in aligning cybersecurity solutions with regulatory requirements. Addressing these gaps could significantly increase the resilience of healthcare systems to cyberattacks, and ensure the protection of sensitive healthcare information. The widespread use of IoMT devices and the increasing shift towards cloud storage bring additional threats to system integrity and patient privacy. Hence, future research should also emphasise IoT and cloud security. In addition, patient access to their personal health data introduces complex security considerations, and highlights the need for robust access control measures. Finally, advances in quantum computing could make many encryption methods vulnerable, so further research on reliable encryption is also crucial.