1 Introduction

Companies’ databases (DBs) are repositories of their most significant and high-value data. As DB utilization has surged, so has the frequency of attacks on these databases. A DB attack is characterized as an event that jeopardizes a resource by altering or destroying vital data [1, 2]. The common goal of DB attacks is to access critical information. Illicitly acquiring sensitive data such as credit card details, banking data, and personal identifiers is another prevalent motive behind DB hacks. In our interconnected global society, several technologies provide avenues for DB attacks to exploit vulnerabilities in DB architecture, as per common understanding [1, 3, 4].

Many enterprises confront challenges like data piracy, data replication, and denial of service attacks. To infiltrate a company’s DBs, cybercriminals scout for system vulnerabilities and exploit them using specialized tools [5, 6].

The aspect of security should be prioritized during the development of information systems, particularly DBs. In terms of software development, security concerns must be addressed at every stage of the development cycle [7]. As illustrated in Fig. 1, security breaches, including the loss of critical data, have become commonplace in recent years. Given the importance of data security to numerous businesses, a range of measures and methodologies are required to safeguard the DB [8,9,10]. A secure DB is designed to react appropriately in the event of a potential DB attack [11].

Fig. 1
figure 1

Total data breaches cost in different countries [5]

In the current world, the impact of cyber-attacks on the commercial landscape must be addressed. To succeed in the globalized environment, businesses must ensure the protection of their vital data. DBs can be safeguarded from unauthorized access [12,13,14]. When a DB is outsourced to the cloud, cloud platforms introduce security challenges such as unreliable service providers, malicious cloud employees, data protection, consistency, and scalability. With cloud DBs becoming increasingly susceptible to both external and internal threats, traditional and conventional security measures are insufficient for their protection [15, 16].

While extensive work has been done in this field, much of it focuses on a few specific DB platforms or problems, typically explored through standard literature reviews. We aim to provide a more holistic view by conducting a systematic mapping study (SMS) to identify security concerns in DB architecture, development, and maintenance from a software engineering perspective. This SMS will help us identify the ongoing research challenges and priorities.

The following research questions (RQs) will guide our SMS to achieve our study objectives:

  • RQ1 What is the current state of the art in the development and implementation of secure DBs?

  • RQ2 What are the security issues in building, implementing, and maintaining secure DBs, as reported in the literature?

1.1 Paper contribution

The contributions of the intended work are as follows:

  • The proposed research undertakes a systematic mapping study (SMS) to identify and emphasize the challenges associated with developing and maintaining secure databases.

  • In addition to showing the difficulties experienced by users using various databases from a software engineering standpoint, our SMS survey sheds light on some of the most current database security studies.

  • It also highlights the importance of maintaining careful attention to database security and suggests a direction for future research in this field.

1.2 Motivation for the paper

Several research in the literature seeks to give a solution for database security. However, before moving forward with new solutions, it is necessary to synthesize current knowledge to offer security practitioners the most up-to-date information. We must identify the cutting-edge in constructing, implementing, and maintaining dependable databases, as well as security challenges, so BD’s design, development, and maintenance may be secure. The motivation behind this research is to provide in-depth solutions to these problems.

1.3 Paper organization

The remainder of the article is arranged in the following manner.

In Sect. 2, we discussed the background of DB security, and Sect. 3, illustrated the research methodology in detail. The results of our conducted SMS are given in Sect. 4. In Sect. 5, the Implication of our findings is discussed. Finally, the conclusion and future work are discussed in Sect.  6. Other supportive information is provided in the rest of the sections at the bottom of this paper.

2 Background

There are a number of studies that look at database security from different angles. In their study [17], Mai et al. suggest using cloud-based security measures to safeguard power system databases. Using an RSA encryption method, public and private keys are generated for database encryption; a huge prime integer is chosen randomly from the cloud platform’s Simple Storage Service and used as the client key. When the database receives a verification key, it compares it to the public key and private key established by the RSA encryption method. If the database determines that the access is legitimate, it provides feedback on the access. According to the findings of the tests, the database can be protected against threats as the threat situation value is always less than 0.50 once the design technique has been implemented.

A data encryption algorithm was developed by Ibrahim et al., which provides an encryption-based solution for DB security. In this system, information is encrypted using standard ASCII characters. They encrypted all of the data in the database and used three keys to access the primary formula. Numbers and text both work for the data. The suggested formula may restore the data’s original format by combining another coordinator with the aforementioned three keys. In order to achieve a comparable data size to when the data is encrypted at a decent pace, the algorithm prioritizes data size and recording speed [18].

The article offers a lightweight cryptosystem based on the Rivest Cypher 4 (RC4) algorithm [19] as a solution to the widespread problem of insecure database transfer between sender and recipient. This cryptosystem safeguards sensitive information by encrypting it before sending it through a network and then decrypting it upon its safe return. Database tables have an encapsulating system that ciphers symbolize hens.

The continual improvements in digitizing have enhanced the prominence of online services. Enterprises must store essential data in corporate DB systems, including bank records, activities, the history of patient paperwork, personal data, agreements, etc. The institutions also must maintain the data’s authenticity, privacy, and availability. Any intrusion in security procedures or data may cause severe economic loss and damage the company’s reputation [20]. The remarkable development in the deployment of DB’s is the required architecture to cope with information that can be attributed to the rising big data. Every 1.2 years, according to research, the entire quantity of institutional information doubles [21].

Most of the latest studies provide encryption-based solutions for DB security. However, before proceeding towards these solutions, there is a need to find out the flaws that lead to security breaches.

One or more of the following sources can lead to a security flaw:

Interior Internal origins of attack originate from inside the corporation. Human resources—organization supervisors, admins, workers, and interns—all fall within this category of insiders. Almost all insiders are recognized in a particular way, and just a few IT professionals have significant access levels.

Exterior Exterior attacks originate from entities outside the organization instance, cybercriminals, illegal parties of established ways, and government agencies. Usually, no confidence and trust, or benefit is offered for external sources.

Collaborator Any third party involved in a business connection with the organization, firm, or group is considered a partner in many companies. This significant collection of partners, distributors, vendors, contractual labor, and customers is known as the entire enterprise. There must be some level of confidence and privilege of accessibility or record among colleagues in the entire enterprise; therefore, this is often advised.

2.1 Secure databases

With incredibly high secure data and an expanded online presence, the worries concerning DB security are high at all-time. As more systems are connected and brought online to improve access, the sensitivity towards attacks is also increased, estimated to be about $1.3 million in massive financial losses; these mischievous attacks are also liable for public reputation and client relations with the association [21, 22]. All users can boundlessly get information from the DB server in an un-secure DB system. All hosts are allowed to associate with the server from any IP address and link with the DB server, making everyone’s information accessible in the storage engine [23, 24].

Hence, the DB system is retained with numerous security mechanisms which contain anticipation of unauthorized access to data from an insider or outsider of an organization. Proper encryption techniques should be applied to secure the DB’s [25]. The most comprehensive secure DB model is the multilevel model, which allows the arrangement of information according to its privacy and deals with mandatory access control MAC [7]. DB services are intended to ensure that client DB’s are secure by implementing backup and recovery techniques [26].

The DB can be protected from the third party, which is not authorized by the procedure called cryptography and utilizing other related techniques. The primary motivation behind DB security is ensuring data privacy from unauthorized outsiders. The essential techniques in DB security are authentication, confidentiality, and integrity, which are utilized to secure the DB’s [27]. DB construction, in particular, must consider security as the main goal while developing a data system. In this respect, security should be addressed at all stages of the software development process [7, 28,29,30].

2.2 Related work

Various articles examine the importance of security controls from the perspective of software engineering [31]. For instance, MÁRQUEZ et al. [32] conducted a systematic survey concentrating on the telemedicine platform’s safety from the software engineering viewpoint. The key focus of this article is investigating how Software development assists in designing a reliable telehealth platform. However, the proposed work is just restricted to, particularly telehealth systems.

Al-Sayid et al. [1] notably studied the challenges of data stores and proposed DB security issues. To prevent unauthorized access to or alteration of the DB’s critical material, they observed a wide variety of DB security issues. Another research by Zeb focuses on identifying potential attacks on the DB system using a standard research study. Mousa et al. [33] discover the various risks to DB safety in their analysis through the unstructured research study. Moghadam et al. [15] did an investigation on cloud servers to figure out all conceivable threats.

Nevertheless, this analysis is solely restricted to the cloud DB environment. The researchers Segundo Toapanta et al. [5] uncovered real-world examples of cybercrime. Apart from that, their research is restricted to cyberattacks.

The authors in [21] have suggested an innovative technique for spotting distinct threats to DB systems by assessing the risk for incoming new activities. Their research discovered various harmful attacks that could harm the DB system. The emphasis of their research is only confined to security assessment involving DB’s. Experts in [32] present a comprehensive mapping analysis, and their observations are only limited to the Telehealth system’s privacy from the software engineering point of view. They did not define the security problems in creating, implementing, and managing safe DB’s. Furthermore, with the rapid development of ICTs, it is essential to be up to date on the most recent developments in this field.

The primary goal of this research is to gain a greater understanding of this topic by conducting a Systematic Mapping Survey to identify the problems in building, managing, and sustaining reliable DB’s.

3 Research methodology

The goal of this study was accomplished by evaluating the current state of DB privacy and suggesting areas that needed further research work. With the SMS, researchers may better connect the data from literary research to a series of questions [34, 35]. SMS is a descriptive investigation that involves picking and putting combine all published research articles associated with a particular challenge and gives a broad summery of existing materials relating to the particular questions. In the near future, software engineers will benefit significantly from SMS because it provides a comprehensive overview of the research in the field. Figure 2 outlines the process that was followed to conduct the mapping study.

Fig. 2
figure 2

SMS process

3.1 Research questions

Our primary objective is to find the obstacles in planning, creating, and managing data protection. To achieve this objective, relevant study questions have been devised.

  • RQ 1What is the current state of the art in the development and implementation of secure DBs?

    To address RQ1, we have studied the material depending on the sub-questions mentioned above:

    • RQ 1.1n terms of reliable data modeling, development, and maintenance, which stage has received the most attention in the research?

    • RQ 1.2What are the primary sites for robust DB design?

    • RQ 1.3What are the ongoing research organizations working in robust data modeling?

    • RQ 1.4What kinds of DB attacks have been described in the research?

    • RQ 1.5According to the research, what are the various categories of DB's?

    • RQ 1.6 What kinds of DBMS platforms are often employed, as stated in the literature.

  • RQ 2What are the security issues in building, implementing, and maintaining secure DBs, as reported in the literature?

3.2 Search strategy

The scholars in [36,37,38] employed the PICO (Population, Intervention, Comparison, and Outcomes) framework to develop a list of terms and then drew search terms from research questions.

Population DB’s and software development in general.

Intervention Security Strategies.

Comparison No assessments proceed for the ongoing investigation.

Outcomes Reliable DB’s.

3.3 Search strings

After several tries, the following two search terms were selected to link the PICO aspects by utilizing Boolean connector (AND):

((“Database security” OR “Secure Databases” OR “Database protection” OR “Guarding Database” OR “Database intrusion” OR “Database prevention”) AND (“Security Mechanisms” OR “Security Models” OR “Security methods” OR “Security policies” OR “Security techniques” OR “Security Guidelines”)).

For Science Direct online repository, we compressed the above search term due to space limits. As a result, the accompanying keywords were entered into the ScienceDirect database:

((“Database security” OR “Secure Databases” OR “Database protection” OR “Guarding Database” OR “Database prevention”) AND (“Security Mechanisms” OR “Security methods” OR “Security techniques” OR “Security guidelines”)).

3.4 Literature resources

We choose below digital repositories (A to F) to do our SMS and execute the search stings for acquiring publications.

  • ACM–A

  • IEEE xplore–B

  • Springer link–C

  • AIS electronic library (AiSel)–D

  • Science direct–E

  • Wiley online library–F

3.5 Research evaluation criteria

Titles, abstracts, entire readings, and quality assessments were all factors in our selection of research publications. The primary goal of the selection process is to compile an appropriate collection of papers by imposing inclusion and exclusion standards on submissions. We have set the accompanying inclusion and exclusion criteria to perform our SMS effectively. The same inclusion and exclusion criteria have been used in other studies [39,40,41]

3.5.1 Inclusion criteria

Only articles that meet one or more of the below criteria were considered for inclusion in our collection.

I1 Research involving the design and implementation of database security measures.

I2 Research that explains how to protect DB’s.

I3 Research the difficulties and dangers of creating, implementing, and maintaining safe DB’s.

I4 Research on the planning, development, and management of reliable DB's included in this category.

3.5.2 Exclusion criteria

The preceding exclusion criteria were considered to find relevant articles.

E1 Publications that are not published in the English language.

E2 No consideration will be given to materials that haven’t been published in any journal, magazine, or conference proceedings, such as unpublished books and grey material.

E3 Books as well as non-peer-reviewed articles, including briefs, proposals, keynotes, evaluations, tutorials, and forum discussions.

E4 Articles that aren’t published in their whole digital.

E5 Publications that don’t meet the inclusion requirements.

E6 Research is only provided as abstracts or PowerPoint slides.

We used the snowballing approach [42,43,44] in addition to the previous inclusion/exclusion criteria for our concluding decision. The snowball method was used to choose seven articles from various research repositories. Appendix 1 contains the papers selected using the snowballing approach, from 94 to 100. In the latest research, scholars have employed the same method [45, 46].

3.6 Quality evaluation

All articles chosen in the selection have been evaluated for quality. Criteria for quality evaluation include:

To evaluate the papers, we used a three-point Likert scale (yes, partially, no) for every element of the quality evaluation criteria. We awarded each element of quality assessment criteria a score of 2 (yes), 1 (partially), or 0 (no) to achieve notable findings. Including an article in the SMS is permitted if it gained an average standard score of > or = 0.5. Many other scholars [45, 47,48,49] have employed a similar approach. A list of all of the questions from Table 1 is included in the quality ranking.

Table 1 Quality evaluation criteria

3.7 Article selection

Employing Afzal et al. tollgate’s technique, we adjusted the key publication selection in our SMS analysis upon executing the search terms (Sect. 3.3) and online DB’s (Sect. 3.4) [50]. The five stages of this method are as follows: (Table 2).

Table 2 Total number of articles per repository

Stage1 (St-1) Conducting literature searches in digital repositories/DB’s for most relevant articles.

Stage 2 (St-2) A article’s inclusion or removal is based on its title and abstract readings.

Stage 3 (St-3) To determine if an item should be included or not, the introduction and findings must be reviewed.

Stage 4 (St-4) the inclusion and exclusion of data analysis research are based on a full-text review of the research's findings.

In Stage 5 (St-5) most of the original studies that will be included in the SMS study have been vetted and selected for inclusion.

There were 4827 documents collected from the chosen web-based libraries/DB’s by imposing inclusion and exclusion criteria following the initial search string iteration (see Sect. 3.3). (Sects. 3.5.1 and 3.5.2, respectively). The tollgate strategy led to a shortlist of 100 publications that were eventually selected for the research. Quality evaluation criteria were used to evaluate the selected articles (Sect. 3.6). Appendix 1 includes a collection of the publications that were ultimately chosen.

3.8 Extracting and synthesizing content

A survey of the articles reviewed is used to obtain the data. In order to address the questions stated in Sect. 3.1, the entire content of every article has been reviewed, and pertinent data extracted. You can find a precise technique for extracting data in the SMS Protocol.

4 Description of key findings

A comprehensive mapping analysis was used throughout this study to determine current state-of-the-art and privacy issues in data modeling, development, and maintenance. Sections 4.1, 4.2, 4.3, 4.4, 4.5 and 4.6 contain the facts of our observations.

4.1 The current state of the art

RQ1 has been addressed using the below sub-questions as a reference (Sects.  4.1.1, 4.1.2, 4.1.3, 4.1.4, 4.1.5 and 4.1.6).

4.1.1 Stages in the building of a protected database

RQ 1.1 focuses on a reliable DB’s most frequently studied stages (design, development, and maintenance). As seen in Table 3, the “design” step was mentioned in most publications at a rate of 27%. There is a 25 percent chance that you’ll hear about the “developing” stage. The “maintenance” stage was only mentioned in 5 of our SMS research findings.

Table 3 Distinctive stages of a reliable DB design

4.1.2 Well-known sources for the building of reliable DB’s

RQ 1.2 is addressed in the second part of this SMS, which concentrates on the location of the papers chosen for this SMS. For venue and provider type analyses, we looked at five repositories, including A, B, C, D, and E. Tables 4 and 5 exhibit the snowballing method, which we refer to as “others.” Several of the papers from these collections were presented at conferences, journals, and workshops/symposia, among other venues. As shown in Table 4, 45 out of 100 articles were published through the conference venue. Secondly, we found that, with a rate of 37 out of 100, a large percentage of the publications came from the journal channel. Workshops and symposiums accounted for 18% of the articles presented.

Table 4 A venue-based sampling of articles for the final collection
Table 5 Publishing venues containing more than one selected article

Table 4 lists a total of 100 articles spanning a wide range of topics related to DB privacy. This indicates that scholars have devoted a great deal of attention to this topic. “International Journal of Information Security(IJIS)”, “The International Journal on Very Large Data Bases (VLDB)”, “Computers and Security (C&S)”, “Digital Investigation (DI)”, “Journal of Natural Sciences (JNS)” and “Journal of Zhejiang University SCIENCE A (JZUS-A)” were found to be the most popular publications for privacy mechanisms in secure DB designing, as mentioned in Table 5. We also discovered that the “Annual Computer Security Applications Conference(ACSAC)” and the “International Workshop on Digital Watermarking(IWDW)” are the most often referred articles on the issue of our research. Software engineering and other related domains can benefit greatly from DB privacy studies.

4.1.3 Research institutions participating in the construction of a reliable DB

The institution of the first researcher was utilized to determine and evaluate the highly ongoing researching institutes in the field of protected DB’s. Table 6 shows the findings for RQ 1.3, which reveal that “University of Florida, USA (UOF)” and “CISUC, University of Coimbra, Portugal (UOC)” produced the most research publications on protected DB’s (3 percent, each, out of 100). Ben-Gurion University of the Negev (BGU); RMIT University in Melbourne, Australia; YONSEI University in Seoul; TELECOM Bretagne in Brest, France(ENST); Anna University in Chennai, India (AUC); Huazhong University of Science and Technology in Wuhan (HUST); and George Mason University in Fairfax, Virgin Islands(GMU). BGU has presented two publications for each of the selected research.

Table 6 Security DB building research organizations with more than one chosen article

4.1.4 The most common kind of DB attacks, according to academic research

RQ 1.4 is concerned with identifying the many kinds of DB attacks that have been recorded. Table 7 shows the three types of incidents: internal, external, and both (internal and external). To effectively understand intrusions, we must combine cyber-attacks with breaches by collaborators. Because both internal and external attacks are mentioned in one article, we refer to this as both (internal and external). Our SMS study’s “Both (Internal & External)” attacks had a rate of 52, according to the assessment in Table 7. The bulk of the articles in our SMS survey highlighted “External” attacks with a frequency of 35%. In total, 13 papers in our SMS addressed the topic of “internal” attacks.

Table 7 DB threats that have been described in the literature

4.1.5 Database types that have been identified in the literature

To answer RQ 1.5, we must recognize the various DB’s discussed in the literature. Seventeen different DB’s have been documented in the research based on the data we gathered from the articles we included in our SMS. Table 8 shows that of the 100 articles in our SMS survey, 24 papers mentioned the term “Web DB.” Secondly, we found that “Commercial DB” appeared in 11 of the 100 articles in our SMS analysis. According to SMS, “multilevel DB and distributed DB” was mentioned in ten publications.

Table 8 DB’s that have been mentioned in the research

4.1.6 Kinds of database management systems (DBMS) presented in the research

Data management systems (DBMS) are examined in RQ 1.6. In this research, 11 distinct DBMS types have been documented based on our SMS data, which was gathered from a selection of studies. Most of the articles in our SMS survey mentioned an “Oracle DB system” with a 31 out of 100 rate, as shown in Table 9. Secondly, “MySQL DB system” was mentioned in most of the publications in our SMS analysis (23 out of 100). Our SMS research found 21 publications that mentioned the term “SQL Server DB system.”

Table 9 Kinds of DBMS as mentioned in the literature

4.2 Issues in databases

As demonstrated in Table 10 and Fig. 3 our existing research into DB privacy has uncovered 20 issues from a pool of 100 studies (see Appendix 1).

Table 10 Challenges in DB security
Fig. 3
figure 3

Issues in DB security

CC #1 Poor authentication system An unauthorized individual gains access to a DB, harvests vital information, and allows the hostile attacker to violate the safety of certified DB’s [1, 51].

CC #2 Database intruders We are talking about when we say “threat database attacks” Anonymous queries (anomalous query attack), Harmful queries (query flood attack), and Inferential Attacks (polyinstantiation issue, aggregate problem).

CC #3: Inadequate database protection Best Strategies Specifications Engineering, Architectural, Planning, and Development all suffer from the absence of proper security procedures.

CC #4 Authorized/Malicious User Threats An authorized individual, employee, or administrator may collect or disclose critical data [52].

CC #5 Inadequate access contro Whenever many persons need access to the information, the risk of data fraud and leakage increases. The access should be restricted and regulated [1]

CC #6 Inadequate NOP protection Inadequate NOP Protection is a shortage of network privacy, operating system privacy, and physical safety.

CC #7 Data leakage/privacy challenges Clients of database systems are increasingly concerned about information security. Attacks on disclosed confidential information, including passwords, emails, and private photographs, triggered this issue. Individuals and database systems cannot stop the propagation of data exploitation and destruction once the content has been leaked [53].

CC #8 Inappropriate database implementation/configuration/maintenance Numerous DB’s are improperly setup, formatted, and maintained, among the main reasons for database privacy issues [54].

CC #9 Absence of resources When we talk about a shortage of resources, we are talking about a need of trained employees, a lack of time and budget, a shortage of reliable resources, and an insufficient storage capacity, to name a few things.

CC #10 Database management challenges There are aspects of effectively handling database systems, connectivity, and information at different levels [53].

CC #11 Inadequate connectivity platforms Presently, the majority of customer, user, and third-party conversations are conducted online. The inclusion of an insecure transmission medium was driven by the Internet’s opportunity to link DB’s [1].

CC #12 Loss of information usage monitoring Several users are unconcerned regarding their communications but may inadvertently send important information to an unauthorized person or untrustworthy servers. Because of a shortage of supervision of data consumption, they are also lost and destroyed [1].

CC #13 Web-based accessibility of tools for database attacks Several tools being used for intrusions are accessible in this globally networked domain, allowing intruders to expose weak spots with minimal expertise of the victim DB architecture [1].

CC #14 Inadequate database monitoring strategy Regulatory risk, discovery, mitigation, and restoration risk are just a few of the dangers posed by a lack of DB auditing [1].

CC #15 Poor cryptography and anonymization No DB privacy plan, regulation, or technology would be sufficient without cryptography, whether the information is traveling over a network or being kept in the DB system [1].

CC #16 Unauthorized data alteration/deletion Any type of unauthorized information alteration or deletion can result in substantial economic losses for an organization or corporation [55].

CC #17 Semantic ambiguities DB issues, including semantic uncertainty, which arises from an absence of semantics or inadequate semantic descriptions, dissemination issues, updating scope constraints, and tuple mistrust, are addressed [56, 57].

CC #18: DB outsourcing problems: Because so many DB’s are now being outsourced, there are serious concerns about the data’s accuracy and safety. Clients will have to relinquish management of the information they have outsourced [58, 59].

CC #19 Regulatory and licensing challenges DB’s have many security issues, including policy and licensing concerns. Would the corporation have a consistent and approved policy and licensing from the authorities or organization [1, 60]?

#20 Poor verification system A poor verification system allows an attacker to assume the credentials of a legitimate DB and access its data. The invader has a wide range of options for determining the identification of data. Assuming passwords are easy to remember [1] or using a preset username and password.

4.3 An assessment of database protection issues based on continents

There is much research on various continents in our SMS findings. A comparative analysis of only three continents, i.e., Europe, North America, and Asia, is discussed in this work (See Appendix 2 for more details). We want to find out if these issues are different across continents. We believe that by examining the similarities and distinctions among these problems, we may better prepare ourselves to deal with them on the continent in question. We employed the sequential correlation chi-square test to determine whether there were notable variations among the issues in the three continents listed previously (Martin, 2000). There are many more similarities than distinctions among the issues in the three continents. Poor authentication systems, DB intruders, inadequate DB protection best strategies, and authorized/ malicious user threats are the only major differences found in Table 11. According to our findings, the most prevalent risks in the three continents are “Inadequate Access Control” (65%, 57%, and 64%), “Inadequate NOP Protection” (59%, 57%, and 60%), “Data Leakage/Privacy Challenges” (49%, 60%, 64%), and “Authorized/Malicious Individuals Threats” (40%, 20%, and 52%). It is not uncommon to see “Authorized/Malicious User Threats,” “Inadequate Access Control,” and “Inadequate NOP Security” across Europe and Asia. Inadequate Connectivity Platforms, Poor Verification Systems, Data Leakage/Privacy Challenges, and Regulatory and Licensing Challenges are some of the problems North American and European clients/users face while creating safe DB’s, as shown in Table 11. According to our research, the “Poor Verification System” problem affects the most significant number of customers and users in Asia (78 percent). “Data Leakage/Privacy Challenges” is the most common issue faced by European customers and individuals (60 percent). Many customers in North America face “Inadequate Access Control” and “Data Leakage/Privacy Challenges” concerns, respectively (i.e., 64 percent) (Fig. 4).

Table 11 Assessment based on the continent
Fig. 4
figure 4

Distribution depending on continents

4.4 Methodological assessment of database privacy issues

Table 12 shows how we divided the different types of difficulties into three distinct approaches. Table 12 shows the three approaches used: tests, Ordinary literature review OLR, and Other/Mixed Approaches as shown in Fig. 5. Other techniques include writing an experience report, conducting a case study, conducting a survey, and utilizing fuzzy methodologies. When we talk about “many methodologies,” we mean that more than one is employed in a single work. Testing is commonly utilized (39 out of 100 times, according to Table 12). The second notable finding in our SMS research is that 31 of the 100 participants used a standard literature review approach. Appendix 2 has further information. Many issues have been revealed by studying the distribution of publications among the three methodologies. Seventeen issues have been detected in relation to OLR, as shown in Table 12. Two of the Seventeen issues have been mentioned in over 50% of the publications. Inadequate Access Control (74%), and Data Leakage/Privacy Challenges (52%), are two of the most often stated problems. Tests face a total of 18 difficulties. Four of these 18 issues have been quoted more than 50% of the time in at least one of the publications. “Data Leakage/Privacy Challenges—64 percent”, “Inadequate NOP Protection—62 percent”, “Poor Authentication System—56 percent”, and “Inadequate Access Control—56 percent” are among the most often stated difficulties. Other/Mixed Approaches publications have highlighted twenty difficulties. Moreover, half of the publications cited 4 of the 20 issues listed. “Poor Authentication System—73%”, “Inadequate NOP Protection—63%”, “Inadequate Access Control—60%”, and “Data Leakage/Privacy Challenges—60%” are among the most frequently stated problems.

Table 12 Methodological based assessment
Fig. 5
figure 5

Methodological-based distribution of papers

Table 12 shows that no SMS approach was employed in any studies (n = 0). These findings prove that our study methodology is innovative in this particular field. We performed the Linear-by-Linear Chi-Square test for the earlier research-mentioned techniques and methodologies to establish whether there was a substantial difference between the challenges. “Poor Authentication System” and “Inappropriate DB implementation/configuration/maintenance” are the only notable variances.

4.5 Years-based study of database privacy issues

A comparison of issues over two time periods, 1990–2010 and 2011–2021, is shown in Table 13 and presented in Fig. 6. More information can be found in Appendix 2. Within the first phase; we found that 18 issues had been highlighted in the research. Four of the 18 issues have been quoted more than 50% in the publications. Inadequate Access Control (70 percent), Poor Authentication System (65 percent), Inadequate NOP Protection (62 percent), and Data Leakage/Privacy Challenges (52 percent) are the most commonly stated vulnerabilities. Between 1990 and 2010, 70 percent of DB’s had Inadequate Access Control, indicating that designers failed to effectively control access permission throughout implementationcontrol access permission throughout implementation.

Table 13 Years based evaluation
Fig. 6
figure 6

Year-based distribution of publications

Furthermore, admins in an organization are liable for ensuring that data is adequately protected via access permissions. The “Inadequate Access Control” difficulty has dropped to 58 percent in the second period. The literature has revealed 19 problems for the second time period. Four of the 19 obstacles have been referenced in at least half of the publications. “Data Leakage/Privacy Challenges” accounts for 63% of the faults, “Inadequate Access Control” for 58%, “Poor Authentication System” for 55%, and “Inadequate NOP Protection” for 55% of the issues, respectively. We used the Linear-by-Linear Chi-Square analysis and only identified a substantial variation for one problem, “DB Management Challenges, “with a p-value of less than.05.

4.6 Evaluation of articles based on their venue

Table 14 displays a breakdown of the various distribution methods. In addition to Journals, Symposiums, Conferences, and Workshops, we have presented our final articles on extracting data via SMS in various other publications venues as well. Journals, Workshops/Symposiums, and conferences have been classified into three categories for easy study. We found that 45 percent of our comprehensive study of articles was presented at conferences, according to Table 14 and Fig. 7. Additionally, 37% of the publications in Table 14 were presented in new journals. For further information, please see Appendix 2 at the ending of the study. Many issues have been discovered as a result of distributing papers via these three channels. According to our findings, 18 issues with journals need to be addressed. Four of the 18 challenges have been referenced in at least half of the publications. “Privacy Issues/Data Leakage—84 percent,” “Inadequate Access Control”—59 percent, “Inadequate NOP Protection”—59 percent,” and “Poor Authentication System—54 percent” are the most often stated difficulties. Conferences face a total of 20 obstacles. Three of these 20 difficulties have been quoted more than 50% of the time in at least one publication. “Poor Authentication System—71 percent,” “Inadequate Access Control—69 percent,” and “Inadequate NOP Protection—62 percent” are the most often stated issues. Workshops/Symposiums face a total of 16 difficulties. Two issues have been mentioned in over half of the publications out of the 16 total. “Data Leakage/Privacy Challenges—61 percent” and “Inadequate Access Control—56 percent” are the most commonly reported hurdles. Linear-by-Linear Chi-Squared test has been used to find substantial differences throughout the difficulties. We have found just one big variation between the hurdles “Data Leakage/Privacy Challenges”.

Table 14 Articles venue-based assessment
Fig. 7
figure 7

Venue-based distribution of articles

4.7 Comparison with existing studies

A wealth of studies have delved into various aspects of database security. Some of these have centered their attention on securing data transmission from server to client, while others have prioritized the construction of secure databases through secure coding practices. The increasing dependence on geographically dispersed information systems for daily operations might augment productivity and efficiency but simultaneously heightens the risk of security violations. Current security measures ensure data transmission protection, yet a comprehensive security strategy must also encompass mechanisms to enforce diverse access control policies. These policies should consider the content sensitivity, data attributes and traits, and other contextual data such as timing.

The consensus in the field is that effective access control systems should integrate data semantics. Moreover, strategies ensuring data integrity and availability must be customized for databases. Consequently, the database security community has developed an array of strategies and procedures over time to safeguard the privacy, integrity, and accessibility of stored data.

Nonetheless, despite these advancements, fresh challenges persist in the database security landscape. Evolving threats, data access “disintermediation,” and emerging computing paradigms and applications like grid-based computing and on-demand business have all introduced new security demands and innovative contexts where existing methodologies can be employed or extended. Despite a multitude of available solutions, raising awareness about existing security breaches is critical for bolstering database security.

In response, we decided to conduct a Systematic Mapping Study (SMS) on secure databases to offer an up-to-date perspective for both database users and developers. We did not find any comprehensive systematic literature review (SLR) or mapping study on this topic to draw comparisons with. However, we believe this research will offer a strategic roadmap for all database stakeholders.

5 Practical implications of research

The practical implications of this research are manifold and impactful. Initially, the results of this SMS will serve as an invaluable resource for DB privacy professionals and users. By leveraging the insights from this study, experts gain an enhanced understanding of DB privacy issues that need addressing. Consequently, they can prioritize their focus on the most significant security challenges. This, in turn, equips DB users with an awareness of their potential privacy risks. Thus, this study benefits consumers by assisting organizations in developing secure DB systems, mindful of the challenges they face (Table 10).

Furthermore, professionals such as DB designers, project managers, and scholars specializing in secure DB design are keen to keep abreast of the latest developments. This research provides DB developers with insights into novel strategies for DB security and the latest advancements in DB technology. Journals such as “VLDB,” “Computers & Security,” “DI,” and “JNS” should be of particular interest to them. Consequently, they would find it beneficial to scrutinize papers available from the “ACSAC” and “IWDW” Conferences and Workshops. The aforementioned venues present optimal resources for studying reliable DB development.

These venues, recognized for their focus on secure DB design, encourage scholars to contribute high-quality academic articles. The outcomes of this study will inform experts’ decision-making processes, providing guidance on where to invest when developing tools and methodologies for safeguarding DB systems. Lastly, it underscores the need for organizations to provide appropriate training for their customers to tackle critical challenges.

6 Conclusions and future work

Databases (DBs) serve as repositories for a company’s most vital and valuable data, and the increasing popularity of DB systems has been paralleled by a surge in data breach incidents. Therefore, safeguarding DBs necessitates a heightened level of vigilance relative to other systems in the company. As such, a repertoire of comprehensive DB protection measures is needed to uphold the DB system’s integrity [1].

Our research employs an SMS approach to explore the issues that various DBs encounter. The derived conclusions are elaborated in the context of the research questions within subsequent subsections. Research Question 1 (RQ1) aims to identify recent DB studies from a software engineering perspective, with the findings detailed in Sect. 4.1 anticipated to inform future research in this realm.

Additionally, we probed the research to answer underlying sub-questions. RQ 1.1 aims to discern the most frequently addressed stages in the literature: planning, developing, and maintaining. In our comprehensive collection of articles, the “design” phase was mentioned with a 27% frequency, as per Table 3. The “development” phase was the second most frequently mentioned, at 25%. The “maintenance” stage was included in 5% of the studies, as shown in Table 3.

RQ 1.2 prompted our selection of five digital repositories as primary locations for article searches. The final pool of articles from these repositories fell into three main publishing categories: Journals, Conferences, and Workshops/Symposiums. Table 4 reveals that 45% of our articles were presented at conferences, while 37% were published in journals. Workshops and symposia contributed 18% of the articles, as per Table 4. Table 5 showcases over 100 articles on DB privacy, indicating a significant commitment of resources to this area. There is a discernible link between the most common venues for secure DB design measures and the venues for secure and robust DB design strategies, as illustrated in Table 5. We also found that “ACSAC” and “IWDW” are the most popular sources of pertinent articles on the topic.

Various institutions continue to contribute significantly to research on secure DBs, which could benefit other areas such as software engineering. The first researcher’s affiliations were used to identify the most active research institutions in this area. Table 6 presents the findings for RQ 1.3, showing “UOF” and “UOC” as the most prolific authors of secure DB publications (3% of reviewed publications). “BGU”, “YONSEI”, “RMIT”, “IIT Kharagpur”, “ENST”, “HUST”, “AUC”, and “GMU” have each contributed two publications to the selected studies.

RQ 1.4 aimed to identify various types of DB threats, as evidenced in the literature. Table 7 outlines three types of attacks: internal, external, and a combination of both. The category “Both (Internal & External)” accounts for 52% of the discussed studies, according to Table 7. The literature reviewed included 35% of studies examining “External” attacks and 13% discussing “Internal” attacks.

RQ 1.5 sought to identify different DB types addressed in the literature. Seventeen unique DB types were found in the literature, based on data from our SMS-selected publications. As per Table 8, “Web DB” was mentioned in 24% of publications, with “Commercial DB” mentioned in 11%. “Multilevel DB and Distributed DB” were the focus of ten papers in our review.

RQ 1.6 aimed to highlight various types of DB management systems (DBMS) featured in the literature. The findings of our study reveal 11 unique DBMS. Table 9 indicates that “Oracle DB system” was discussed in 31% of publications, while “MySQL DB system” appeared in 23% of articles. “SQL Server” was mentioned in 21 articles, according to our findings.

RQ2 examined DB privacy issues documented in the literature. Applying our predefined inclusion and exclusion criteria, we compiled a finalized set of 100 papers. The data extraction and synthesis process led to a list of 22 challenges, which were narrowed down to 20 upon external reviewer’s recommendations, as shown in Table 10. The most common challenges were Data Leakage/Privacy Challenges (63), Poor Authentication System (59), DB Intruders (58), and Authorized/Malicious User Threats (36). These highlighted issues allow companies to evaluate their current security strengths and weaknesses and develop solutions promptly.

If left unaddressed, these identified challenges can potentially inflict harm on the DB system. As such, they must be addressed to enhance DB security. In future work, we aim to identify best practices to alleviate the secure DB challenges outlined in this study.