Introduction

Cyberspace provides users with an interactive platform to share information, engage in discussions or social forums, and conduct business, among many other activities. Cybersecurity provides the preventive methods required to protect data, networks, electronic devices, and servers from malicious attacks and unauthorized access. Elements of cybersecurity encompass application security, identity management, network security, data security, end-user education, disaster recovery, and business continuity. Common types of cyber threats include ransomware, phishing, malware, and social engineering. To combat such threats, various cybersecurity tools are available, including anti-virus/anti-malware software, firewalls, encryption methods, two-factor authentication techniques, and software updates. Such measures alone are not sufficient for monitoring and securing cyberspace against various cybercrimes. To identify a wide variety of warnings and make intelligent real-time decisions, cyber defense systems should be adaptable, flexible, and robust [28, 31, 33]. This can be facilitated by the use of Artificial Intelligence.

The digital realm has inspired human beings to extend their thinking abilities and to carry out research aimed at inventing an artificial human brain. This continuous research led to the creation of Artificial Intelligence [49]. Artificial Intelligence (AI) is defined as the ability of machines to perform tasks that are associated with human intelligence. The main aim of AI research is to train machines to simulate human skills, such as learning, rationalizing, thinking, and managing [93]. AI techniques include Natural Language Generation, Expert Systems, Intelligent Agents, Deep Learning, Machine Learning, Speech Recognition, Text Analytics, and Natural Language Processing (NLP). These techniques, combined with various other technological methods, can be utilized to improve current cybersecurity methods.

Artificial Intelligence serves to develop applications that adjust to their context of use; they self-direct, harmonize, diagnose, and, importantly, learn by themselves, producing understandable knowledge from discrete data. Consequently, the futures of cyber warfare and AI have already merged [67, 91]. AI can quickly identify and analyze new exploits and weaknesses in a system of interest and can therefore be utilized to augment the field of cybersecurity. These two fields became closely integrated when cyberattacks were intended to affect authentic execution at the individual user level and at intermediate system levels [20]. CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is the most basic illustration of the amalgamation of AI and cybersecurity. Other than CAPTCHAs, a significant number of AI methods are employed in cybersecurity, which can be classified conditionally as "distributed" methods and conveniently as "compact" methods [89]. Distributed methods include (i) Multi-Agent Systems of Intelligent Agents: autonomous systems composed of multiple interacting intelligent agents that distribute data and collaborate to execute relevant responses in case of unpredicted events, (ii) Artificial Neural Networks: networks of artificial neurons that learn and solve problems when combined with each other, (iii) Artificial Immune Systems: an immune-based cyberattack management technique comprising the development of immunocytes (variation, self-tolerance, clone) and the concurrent detection of antigens, and (iv) Genetic Algorithms: an implementation of biological evolutionary processes. Compact methods consist of (i) Machine Learning Systems: systems with the ability to automatically learn and update from experience without being explicitly programmed, (ii) Expert Systems: a method that includes a knowledge base and an inference engine, and (iii) Fuzzy Logic: a system that consists of a related rule set repository and a tool for obtaining and managing the rules. These AI methods are designed to learn and adapt to even the most detailed modifications in the trained model of the system and have the potential to act much more efficiently than existing cybersecurity techniques.

Cyber infrastructures are largely exposed to different interruptions and warnings. Electrical devices, such as sensors and detectors, are not sufficient for ensuring the security of these infrastructures. Cyber intrusion occurs on a global scale. Owing to the growth of the internet, cyber attackers have access to the knowledge and instruments required to carry out cybercrimes. Conventional cybersecurity measures are not sufficient to fight the rapidly increasing cyber threats. They follow a fixed algorithm with hard-wired logic at the decision-making level and are thus inefficient in managing dynamically evolving cyberattacks [33]. The existing cybersecurity methods are also slow in terms of execution. Firewalls, a common cybersecurity method, have limitations in the security process: a firewall is a perimeter defense technique and thus does not fight the enemy within a system [1]. Moreover, the firewall is not considered an efficient approach to fight against viruses and Trojan horses [1]. In addition, the immense spread of connected devices in the IoT has raised the requirement for intelligent security measures in response to the increasing demand of millions, if not billions, of connected devices and services globally [2, 50, 54, 80]. Furthermore, efficient security measures are required to fight against the numerous network-centric cyber interventions that are carried out by intelligent agents, such as computer worms and viruses. The existing cybersecurity measures are insufficient to combat such attacks because they cannot manage the complete attack-response process promptly. Thus, intelligent semi-autonomous agents are required that can identify, assess, and react to network-centric cyberattacks in a timely manner [81].

This paper presents a review of the application of various Artificial Intelligence techniques in cybersecurity for analyzing, detecting, and combating different types of cyberattacks. It demonstrates how AI methods can be an efficient tool for enhancing cyber defense abilities by augmenting the intelligence of the defense systems. Lastly, the future scope and challenges of applying AI in cybersecurity are discussed and the necessary conclusions are drawn.

Application of various technologies in cybersecurity

Various emerging technologies have helped greatly in overcoming the limitations of conventional cybersecurity techniques. Big Data, Blockchain, and Behavioral Analytics are a few examples of such technologies.

Big Data technology can be effectively applied in the field of cybersecurity for detecting anomaly-based intrusion and fraud. Software architecture with cognitive algorithms is the basic requirement of anomaly-based intrusion detection techniques. In a standard behavior-based solution, deviations from the learned model are detected by monitoring user activity, network traffic, or native system activity. The models are usually divided into two classes: (i) Legitimate and (ii) Abnormal. Intrusion is considered to occur whenever behavior deviates from the legitimate profile toward the abnormal one [74]. In the case of fraud detection using Big Data, the two principal methods used are (i) Statistical and (ii) Artificial Intelligence [16]. One of the major areas where fraudulent practices are prevalent is the health insurance system. Electronic health cards with embedded smart chips have been introduced to combat fraud in health insurance. Such e-Health cards generate an immense volume of data that needs to be processed. Frequently occurring faults that are concealed inside enormous storehouses of data can be recognized and corrected by implementing big data analysis. Big data analytics technologies, such as business rules, social network analysis, database searches, anomaly detection, and text mining, should be utilized to fight against health insurance fraud [22].
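The behavior-based idea described above, learning a model of legitimate activity and flagging deviations from it, can be illustrated with a minimal sketch. The z-score rule, the threshold of 3, and the sample traffic values are assumptions chosen for illustration; they are not taken from the cited works.

```python
from statistics import mean, stdev

def fit_baseline(legitimate_values):
    """Learn a simple model of legitimate behavior: its mean and standard deviation."""
    return mean(legitimate_values), stdev(legitimate_values)

def classify(value, baseline, threshold=3.0):
    """Label an observation 'abnormal' if it lies more than `threshold`
    standard deviations from the learned mean, otherwise 'legitimate'."""
    mu, sigma = baseline
    if sigma == 0:
        # Degenerate baseline: any difference from the mean counts as a deviation.
        return "legitimate" if value == mu else "abnormal"
    z_score = abs(value - mu) / sigma
    return "abnormal" if z_score > threshold else "legitimate"

# Hypothetical training data: requests per minute observed during normal operation.
normal_traffic = [98, 102, 101, 97, 100, 103, 99, 100]
baseline = fit_baseline(normal_traffic)

print(classify(101, baseline))
print(classify(450, baseline))
```

Real anomaly-based detectors model many features at once; a single-feature threshold is used here only to make the Legitimate/Abnormal split concrete.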

Lately, Blockchain technology has been the topic of enhanced scientific research and growth [88]. Owing to its distinctive trust and security properties, blockchain technology has attracted significant attention among industry practitioners, researchers, and developers. The most security-focused blockchain applications are in (I) IoT, for: (i) authentication of devices to the network and of users to the devices [34, 45, 70, 88], (ii) protected deployment of firmware by means of peer-to-peer distribution of updates [17, 29, 53, 88], (iii) threat detection and malware prevention [41, 42, 88]; (II) Data Storage and Sharing, for: (i) ensuring that the data cached in the cloud remain immune to unauthorized modification, (ii) securely storing and maintaining the hash lists that allow the searching of data, (iii) verifying that the data exchanged remain the same from dispatch to receipt [7, 25, 88, 97]; (III) Network Protection: owing to the increased use of virtual machines, software-defined networks, and containers for application deployment, blockchain allows crucial verification data to be stored in a decentralized and robust manner [11, 18, 23, 88]; (IV) Private User Data: which includes the protection of personally identifiable data exchanged with different functions and end-user settings for wearable Bluetooth devices [27, 37, 88]; and (V) Navigation and use of the World Wide Web, for: (i) ensuring the correctness of the wireless internet access point being connected to [66, 88], (ii) assisting navigation to the correct web page through accurate DNS records [19, 88, 94], (iii) reliably using web applications [88, 95], (iv) interacting with others through safe and encrypted arrangements [10, 71, 88].

Behavioral Analytics utilizes User and Entity Behavior Analytics (UEBA) security solutions to recognize patterns of data transmission in a network that deviate from the standard criteria. It narrows the task of managing huge quantities of information in order to detect and counteract threats within the network, and to predict, discover, and resolve errors by combining technology with individual data points.

Among these technologies, Artificial Intelligence has shown particular potential in cybersecurity, with its various techniques applied to problems such as intrusion detection and prevention, denial-of-service attacks, spam detection, computer worm detection, and botnets.

Discussing the use of conditionally classified “distributed” AI methods in cybersecurity

Farzadnia et al. [36] proposed a novel hybrid method for an Intrusion Detection System (IDS) using an Artificial Immune System (AIS). The system consisted of two defensive lines. The Dendritic Cell Algorithm (DCA), which is based on the Danger Theory (DT), was used to build the first defensive line. The association of these dendritic cells with the detectors bolstered the efficiency of the detectors and helped them retain memory for an extended duration. The simulation of this hybrid system was carried out in MATLAB. The dataset containing 9 sub-categories of attacks was given as input to the proposed model. The criteria for the evaluation of the model were Detection Rate, False Positive, False Negative, and Accuracy. The proposed model outperformed other methods in terms of Detection Rate. The performance of the proposed model on the three datasets was 98.7%, 99.1%, and 99.3%, respectively. Moreover, the proposed method also displayed a lower false-positive rate than other systems.

Dutt et al. [35] proposed a two-layered immune system to monitor network traffic and identify intrusions within the network. The first layer of the proposed system was based on Statistical Modeling-based Anomaly Detection (SMAD), which worked as an Innate Immune System capable of detecting first-hand vulnerabilities inside the network. Adaptive Immune-based Anomaly Detection (AIAD) formed the second layer of the system. This layer collected information from the header portion and considered the activation of the T-cells and B-cells to provide efficient intrusion detection. The proposed model was tested using a dataset and real-time network traffic analysis. The system displayed a 96.04% true-positive rate and a 7.8% false-positive rate during the real-time network analysis, while on the dataset it displayed a 97.1% true-positive rate and a 2.79% false-positive rate.

Suliman et al. [83] presented an AIS-based IDS evaluated on the KDD Cup 99 dataset. It targeted DoS and probing attacks, including land, smurf, Neptune, IP sweep, satan, and port sweep attack connections. In the training phase, 24 features that distinguished normal and attack connections were investigated over 256,454 connections. Additionally, connection encoding was performed and the initial antibodies were generated using a random number generator function. Following the fitness value calculation of the generated antibodies, the antibodies with the highest fitness value were cloned and then mutated based on a predetermined probability. For selecting the number of testing connections, probabilities of 0.2, 0.3, 0.4, and 0.5 were used, which produced true-positive rates of 96.9608%, 97.0204%, 98.4839%, and 99.8631%, respectively. The result analysis showed that a selection probability of 0.2 produced the best-quality antibodies, with a fitness value of 0.46, compared to the other selection probabilities.
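The antibody life cycle described above (random generation, fitness evaluation, cloning of the fittest, and probabilistic mutation) follows the general clonal selection pattern, which can be sketched as follows. This is a generic illustration, not the implementation of [83]: the distance-based matching rule, the fitness definition, the population sizes, and the two-feature toy connections are all assumptions.

```python
import random

def fitness(antibody, attacks, normals, radius=0.3):
    """Assumed fitness: share of attack connections matched minus share of normal ones matched."""
    def matches(connection):
        distance = sum((a - c) ** 2 for a, c in zip(antibody, connection)) ** 0.5
        return distance <= radius
    hits = sum(matches(c) for c in attacks) / len(attacks)
    false_alarms = sum(matches(c) for c in normals) / len(normals)
    return hits - false_alarms

def mutate(antibody, rate, rng):
    """Perturb each gene with probability `rate`, keeping it inside [0, 1]."""
    return [
        min(1.0, max(0.0, g + rng.gauss(0, 0.1))) if rng.random() < rate else g
        for g in antibody
    ]

def clonal_selection(attacks, normals, n_features, pop_size=20, n_clones=5,
                     rate=0.2, generations=30, seed=0):
    rng = random.Random(seed)
    score = lambda ab: fitness(ab, attacks, normals)
    # Initial antibodies come from a random number generator.
    population = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fittest antibodies, clone them, and mutate the clones.
        elite = sorted(population, key=score, reverse=True)[: pop_size // 4]
        clones = [mutate(ab, rate, rng) for ab in elite for _ in range(n_clones)]
        population = sorted(elite + clones, key=score, reverse=True)[:pop_size]
    return population[0]

# Hypothetical, already-normalized connection records with two features each.
ATTACKS = [[0.90, 0.90], [0.80, 0.85], [0.95, 0.80]]
NORMALS = [[0.10, 0.20], [0.20, 0.10], [0.15, 0.15]]

best = clonal_selection(ATTACKS, NORMALS, n_features=2)
print(best, fitness(best, ATTACKS, NORMALS))
```

Because the fittest antibodies are carried over unchanged into each new generation, the best fitness can never decrease from one generation to the next.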

Louati and Ktata [57] proposed a deep learning-based multi-agent system for intrusion detection. The KDD 9 dataset was used for training the model. In the data pre-processing phase, all the symbolic features were converted into numeric values, the data were normalized, and records with null attributes were removed. In the next phase, feature selection, auto-encoders were used to reduce the dimension of the dataset. In the classification phase, two classifiers were used to ensure the efficiency of the system. A Multilayer Perceptron with three hidden layers having 20, 15, and nodes was used along with a KNN classifier. The proposed method proved to be beneficial in detecting intrusions. The Multilayer Perceptron achieved an accuracy of 99.73%, while the KNN classifier achieved 99.95% accuracy.

Liang et al. [56] proposed a multi-agent intrusion detection system to predict and prevent attacks in the IoT environment. The system was based on the Smart Efficient Secure and Scalable System (SESS). The system used a web portal for discovering attacks from the network traffic. The SESS enabled the network administrator to monitor the IoT devices based on the traffic data. Furthermore, these data were collected and sent to the data process module, where the first attack detection, based on feature classification, was executed. The dataset was divided into two parts, namely the unidentified dataset and the training dataset. The training dataset was used for training the detection agent, while the unidentified dataset was used for analyzing the performance of the model. By including certain parameters, the proposed model achieved an accuracy of 98.85%, outperforming various other methods.

Al-Yaseen et al. [4] proposed a multi-agent system to optimize the efficiency of the intrusion detection system by reducing the time taken to detect attacks. The conventional intrusion detection system analyzed data that were collected from various sources with the help of sniffers. The role of the sniffer was to store the collected data and to convert the raw data into a readable format. These data were then sent for analysis to determine whether they contained any malicious activity. A new method was proposed to reduce the processing time. This method divided the data into small subsets, evaluated them separately, and then merged the results into one. The subsets of the data were processed in parallel. The system had several agents, namely a Coordinator Agent, Communication Agents, and an Analysis Agent, that were used to analyze the data. The proposed method performed better than Pure K-Means in terms of accuracy and reduced the processing time by up to 81% compared to Pure K-Means.
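The divide, analyze in parallel, and merge pattern described above can be sketched with Python's standard thread pool. The function names, the record layout, and the packet-count rule standing in for the analysis agent are hypothetical placeholders; the actual agents in [4] use their own analysis logic.

```python
from concurrent.futures import ThreadPoolExecutor

def split(records, n_parts):
    """Divide the records into at most `n_parts` contiguous subsets."""
    size = max(1, -(-len(records) // n_parts))  # ceiling division, at least 1
    return [records[i:i + size] for i in range(0, len(records), size)]

def analyze_subset(subset):
    """Placeholder analysis agent: flag records with an unusually high packet count."""
    return [record for record in subset if record["packets"] > 1000]

def coordinate(records, n_agents=4):
    """Coordinator: hand each subset to a worker, then merge the partial results."""
    subsets = split(records, n_agents)
    with ThreadPoolExecutor(max_workers=n_agents) as pool:
        partial_results = list(pool.map(analyze_subset, subsets))
    return [alert for part in partial_results for alert in part]

# Hypothetical traffic records produced by a sniffer.
records = [{"id": i, "packets": i * 300} for i in range(10)]
alerts = coordinate(records)
print([alert["id"] for alert in alerts])
```

`pool.map` returns results in the order of its inputs, so the merged alerts keep the original record order. For CPU-bound analysis, a process pool would be the more usual choice in Python; threads are used here only to keep the sketch short.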

Shenfield et al. [78] proposed a new artificial neural network for detecting malicious network traffic. The byte-level data of the network traffic were converted into integers and then input into the artificial neural network (ANN). A contiguous window of 1000 bytes of data was taken as input to the ANN. The network was a Multi-Layer Perceptron having 1000 nodes in the input layer, followed by two hidden layers having 30 nodes each and two nodes in the output layer. Tenfold cross-validation was used for evaluating the classifier. For training, the maximum number of epochs was 1000, and a learning rate of 0.01 was used. The ANN achieved an accuracy of 98% with a precision of 97%. Moreover, the model showed a 1.8% false-positive rate.
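To make the input encoding and network size concrete, the sketch below converts a payload into a fixed window of 1000 integers and counts the trainable parameters of a fully connected 1000-30-30-2 network. The zero padding of short payloads and the assumption of one bias per node are illustrative choices not stated in the summary above.

```python
def bytes_to_input(payload: bytes, window: int = 1000):
    """Turn the first `window` bytes into integers in the range 0..255.
    Zero padding for short payloads is an assumption."""
    values = list(payload[:window])
    values += [0] * (window - len(values))
    return values

def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected network with the given layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

LAYERS = [1000, 30, 30, 2]  # input layer, two hidden layers, output layer

sample = bytes_to_input(b"GET /index.html HTTP/1.1\r\n")
print(len(sample), sample[:3])
print(count_parameters(LAYERS))
```

Under these assumptions the network has 31,022 trainable parameters, almost all of them in the connections between the input layer and the first hidden layer.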

Al-Zewairi et al. [5] proposed a deep learning approach for network intrusion detection systems. A multi-layer feed-forward artificial neural network was used for predicting network intrusions. The dataset used for training the model contained 45 features. The network had 5 hidden layers with a total of 50 neurons evenly distributed. The best activation function was identified by testing 3 different activation functions with 2 different configurations. The Rectified Linear Unit was then used as the activation function for the neural network. With this activation function, the proposed feed-forward multi-layer neural network achieved 98.99% accuracy along with a low false alarm rate of 0.56%.

Zhang et al. [100] proposed an intrusion detection system based on a genetic algorithm and a deep belief network (DBN). Binary coding was used to encode the nodes of the three hidden layers in a binary chromosome. The chromosome was 18 bits long: bits 1–6 were reserved for the first hidden layer, bits 7–12 for the second hidden layer, and bits 13–18 for the third hidden layer. Moreover, a selection operation was used to select the best chromosomes for crossover and mutation. Internal crossover was adopted for the proposed method. The fitness function was chosen to optimize the model. The model achieved an accuracy of 99.45%, 97.78%, 99.37%, and 98.68% for DoS, R2L, Probe, and U2R, respectively.
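The 18-bit encoding described above can be illustrated by decoding a chromosome into three 6-bit fields, one per hidden layer. Interpreting each field directly as a node count, and the single-point crossover shown here, are assumptions for illustration; the exact mapping and the "internal crossover" operator of [100] may differ.

```python
def decode(chromosome: str):
    """Split an 18-bit chromosome into three 6-bit fields and read each as an integer."""
    if len(chromosome) != 18 or set(chromosome) - {"0", "1"}:
        raise ValueError("expected a string of exactly 18 binary digits")
    return [int(chromosome[start:start + 6], 2) for start in (0, 6, 12)]

def single_point_crossover(parent_a: str, parent_b: str, point: int):
    """Generic single-point crossover: swap the tails of two chromosomes."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

example = "000101" + "010000" + "111111"
print(decode(example))
print(single_point_crossover("0" * 18, "1" * 18, 6))
```

With 6 bits per field, each hidden layer can take a value between 0 and 63 under this reading, which bounds the search space to 64 x 64 x 64 candidate architectures.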

Azad and Jha [14] proposed an Intrusion Detection System based on a decision tree and a genetic algorithm. The crossover operation was used to generate new individuals from the parents. Furthermore, the mutation operation was used to maintain genetic diversity between different generations. The proposed model achieved an accuracy of 99.99% with the lowest error rate of 0.01%, and it outperformed the C4.5 decision tree and Naive Bayes (Tables 1, 2, 3).

Examining the use of conveniently classified “compact” AI methods in cybersecurity

Zamir et al. [99] proposed a stacking model to detect phishing websites. The phishing dataset was selected and then fed into various feature selection algorithms, such as information gain, gain ratio, Relief-F, and recursive feature elimination, to identify the top features of the dataset. Next, the strongest features and the weakest features were combined into new features N1 and N2, respectively. Various Machine Learning classifiers were trained on these features with Principal Component Analysis. The stacking of the model was based on combining the highest-performing classifiers. The stacking 1 model (Neural Network + Random Forest + Bagging) outperformed all other classifiers with 97.4% accuracy, followed by stacking 2 (KNN + Random Forest + Bagging) at 97.2% accuracy. The results showed that stacking the highest-performing classifiers improved classification accuracy.

Dada et al. [30] examined the implementation of various machine learning methods for email spam filtering. The study reviewed the advantages and drawbacks of various ML methods, namely clustering techniques, Naive Bayes classifier, Neural Network, Firefly Algorithm, Rough Set classifier, SVM, Decision Tree, C4.5 Algorithm, Logistic Model Tree Induction, Ensemble classifier, and deep learning algorithms for spam filtering. The study outlined the problems in the existing machine learning techniques, such as the inefficiency of classifiers in reducing the false-positive rate, the inability to classify real-time data streams, inefficiency in updating features dynamically, and the inability to classify spam emails that are in the form of images. Moreover, the study recommended deep learning and deep adversarial learning as techniques to overcome the existing difficulties faced by various machine learning classifiers.

Ubing et al. [92] presented an improvement in the accuracy of detecting phishing websites through a feature selection algorithm and ensemble learning. A dataset having 30 features was used in the study, and a random forest regressor was used as a feature selection algorithm that eliminated the 21 least important features. The remaining 9 features were then used to train and test the ensemble, which consisted of SVM, Gaussian Naive Bayes, KNN, Logistic Regression, Gradient Boosting, Multilayer Perceptron, and Random Forest classifiers. The proposed model achieved an accuracy of 95.4% with the fewest false negatives. The model outperformed a majority of the individual classifiers in terms of accuracy. The use of multiple models proved beneficial, as the ensemble was not biased towards one particular model and each model influenced the final prediction.
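The two steps described above, keeping only the most important features and letting several classifiers jointly decide, can be sketched as follows. The importance scores and the seven classifier outputs are made-up values, and majority voting is assumed as the combination rule since the summary does not state which scheme was used.

```python
from collections import Counter

def select_top_features(importances, keep):
    """Rank features by importance score and keep the `keep` highest."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return ranked[:keep]

def majority_vote(predictions):
    """Combine the labels predicted by several classifiers into one ensemble label."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical importance scores for 30 features, e.g. as produced by a random forest.
importances = {f"f{i}": i / 100 for i in range(30)}
selected = select_top_features(importances, keep=9)
print(selected)

# Hypothetical outputs of seven classifiers for a single website.
votes = ["phishing", "phishing", "legitimate", "phishing",
         "legitimate", "phishing", "phishing"]
print(majority_vote(votes))
```

Using an odd number of voters, as with the seven classifiers here, avoids ties in a two-class problem.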

Çavuşoğlu [26] proposed a novel combination of different machine learning techniques along with feature selection methods that yielded high accuracy in intrusion detection. The NSL-KDD dataset was pre-processed, and two different datasets were then obtained using two different feature selection approaches to extract the most important features. These datasets were then divided into sub-parts according to the type of attack, and evaluation was performed with a cross-fold validation technique. Accuracy, Detection Rate, True Positive Rate, False Positive Rate, F-Measure, and the Matthews Correlation Coefficient were considered as the criteria for the evaluation process. The proposed hybrid layered model outperformed the earlier methods it was compared against.
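The evaluation criteria listed above are all derived from the four entries of a binary confusion matrix. The following sketch computes them using the standard textbook definitions; the confusion-matrix counts are invented for illustration and are not results from [26]. It assumes every denominator is non-zero.

```python
from math import sqrt

def evaluate(tp, fp, tn, fn):
    """Standard metrics from true/false positives and negatives."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    tpr = tp / (tp + fn)            # true-positive rate, also called detection rate
    fpr = fp / (fp + tn)            # false-positive rate
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tpr / (precision + tpr)
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"accuracy": accuracy, "tpr": tpr, "fpr": fpr,
            "f_measure": f_measure, "mcc": mcc}

# Hypothetical counts: 100 attack and 100 normal connections.
metrics = evaluate(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The Matthews Correlation Coefficient is useful alongside accuracy because it stays informative when attack and normal classes are heavily imbalanced, which is common in intrusion datasets.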

Alkasassbeh and Almseidin [8] demonstrated the importance of Knowledge Discovery in Databases for training and testing different machine learning classifiers. The database was pre-processed, and 21 different types of attacks, with different numbers of occurrences and a total of 41 features, were categorized into four groups (DOS, PROBE, R2L, U2R). In the training phase, J48 Tree, Multilayer Perceptron, and Bayes Network were used as classifiers. The J48 Tree outperformed the other classifiers, achieving an accuracy of 93.1% with the lowest root mean squared error, whereas the Multilayer Perceptron achieved an accuracy of 91.9% and the Bayes Network 90.73%.

Rani and Goel [72] presented an Expert System design that could identify the kinds of attacks that can occur in a system and the symptoms they show, and propose appropriate countermeasures. The system was implemented with the Visual Studio 10.0 framework, using ASP.NET for handling interfaces and SQL Server 2008 for handling databases. Rules were handled within the .NET framework at the backend. The user entered the observed symptoms and attack types in a prompt given by the attack identifier. The system then suggested countermeasures to resolve the attack present in the system. This model served as a means of raising cyberattack security awareness among internet users.
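The symptom-to-countermeasure flow described above is the classic expert-system pairing of a knowledge base with an inference step. The sketch below uses Python rather than the ASP.NET stack of [72], and the rules, symptoms, and countermeasures are invented examples, not the contents of the original system.

```python
# Hypothetical knowledge base: each rule links a set of symptoms to an attack
# and to the countermeasures recommended for it.
KNOWLEDGE_BASE = [
    {
        "attack": "phishing",
        "symptoms": {"suspicious email link", "credential prompt"},
        "countermeasures": ["do not enter credentials", "report the email"],
    },
    {
        "attack": "denial of service",
        "symptoms": {"slow network", "service unreachable"},
        "countermeasures": ["rate-limit traffic", "contact the network provider"],
    },
]

def infer(observed_symptoms):
    """Inference step: return every rule whose symptoms are all observed."""
    observed = set(observed_symptoms)
    return [rule for rule in KNOWLEDGE_BASE if rule["symptoms"] <= observed]

diagnosis = infer(["suspicious email link", "credential prompt", "slow network"])
for rule in diagnosis:
    print(rule["attack"], "->", ", ".join(rule["countermeasures"]))
```

Keeping the rules as data, separate from the inference function, mirrors the knowledge base and inference engine split and lets new attack types be added without changing the code.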

Atymtayeva et al. [12] discussed an Expert System approach in which a method of formalizing Information Security (IS) knowledge was developed to create a knowledge base for expert systems, so as to facilitate the automation of some security implementation and evaluation tasks in the Information Security audit process. A high-level composition of the knowledge base for IS was built by formalizing a method of IS assessment and decision-making, which included examining IS standards and inferring key concepts from them. Next, the system workflow was constructed so that the key concepts recognized in the previous step could properly function together. Finally, a scheme for populating the knowledge base was developed, in which the lower-level concepts and sub-concepts were derived.

Naik et al. [63] proposed a dynamic fuzzy rule interpolation-based honeypot for detecting and predicting fingerprinting attacks on honeypots. For the prediction of such attacks, Principal Component Analysis was used to remove the least important features. The fuzzy inputs of the model, including abnormal TCP packets, ICMP requests, ICMP packet size, and UDP requests, produced five fuzzy output sets, classified as Very Low, Low, Medium, High, and Very High, that represented five security levels of the fingerprinting attack. The proposed model was then compared with five different methods, namely SinFP3, NetScanTools, Nmap, Xprobe2, and Nessus. The proposed method was able to improve the accuracy, detection, and sensitivity by dynamically enriching the system's own knowledge base.
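The five output levels described above can be pictured as overlapping fuzzy sets over a normalized attack score. The sketch below shows only that fuzzification step with triangular membership functions; the evenly spaced set centres and the half-width of 0.25 are assumptions, and the dynamic rule interpolation of [63] is not reproduced.

```python
# Assumed centres of the five fuzzy output sets on a score normalized to [0, 1].
LEVELS = {"Very Low": 0.0, "Low": 0.25, "Medium": 0.5, "High": 0.75, "Very High": 1.0}

def membership(score, centre, half_width=0.25):
    """Triangular membership: 1 at the centre, falling linearly to 0 at +/- half_width."""
    return max(0.0, 1.0 - abs(score - centre) / half_width)

def fuzzify(score):
    """Degree to which a score belongs to each of the five security levels."""
    return {name: membership(score, centre) for name, centre in LEVELS.items()}

def dominant_level(score):
    """The level with the highest membership degree."""
    degrees = fuzzify(score)
    return max(degrees, key=degrees.get)

print(fuzzify(0.6))
print(dominant_level(0.6), dominant_level(0.95))
```

A score of 0.6 belongs partly to Medium and partly to High at the same time, which is the property that distinguishes fuzzy levels from hard thresholds.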

Naik et al. [64] proposed fuzzy hashing-based and fuzzy rule-based methods to augment the efficiency of YARA rules for detecting malware. The first proposed method used fuzzy hashing to enhance the YARA rules in cases where the existing YARA rules failed to detect a file as malware. The hashing methods used for the rules were SSDEEP, SDHASH, and mvHASH-B. These methods yielded greater accuracy for all the different types of ransomware. Of the three hashing methods, SSDEEP fuzzy hashing proved the most beneficial in terms of improving the overall accuracy. The second proposed method focused on improving the effectiveness of YARA rules during the execution phase. It augmented the rule-triggering condition with Fuzzy Hash Matching, which was combined with the string matching condition of the YARA rules to improve overall accuracy.
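The combined triggering condition described above, exact string matching backed up by similarity matching, can be sketched as follows. This is a stand-in only: it does not use YARA, SSDEEP, SDHASH, or mvHASH-B. Python's `difflib.SequenceMatcher` compares raw contents directly as a substitute for comparing fuzzy hash digests, and the signature, sample bytes, and threshold of 80 are hypothetical.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 80  # assumed cut-off on a 0..100 similarity scale

def string_condition(data: bytes, signatures) -> bool:
    """Signature-style condition: any known byte string occurs in the data."""
    return any(signature in data for signature in signatures)

def similarity(a: bytes, b: bytes) -> int:
    """Stand-in for a fuzzy-hash comparison score between two samples (0..100)."""
    return round(100 * SequenceMatcher(None, a, b, autojunk=False).ratio())

def fuzzy_condition(data: bytes, known_samples, threshold=SIMILARITY_THRESHOLD) -> bool:
    """Similarity condition: the data closely resembles a known malicious sample."""
    return any(similarity(data, sample) >= threshold for sample in known_samples)

def is_flagged(data: bytes, signatures, known_samples) -> bool:
    """Trigger when the string condition or the similarity condition holds."""
    return string_condition(data, signatures) or fuzzy_condition(data, known_samples)

# Hypothetical signature and known sample.
SIGNATURES = [b"EVIL_MARKER"]
KNOWN = [b"encrypt_all_files(); demand_ransom('1BTC');"]

variant = b"encrypt_all_files(); demand_ransom('2BTC');"
benign = b"print('hello world')"
print(is_flagged(variant, SIGNATURES, KNOWN), is_flagged(benign, SIGNATURES, KNOWN))
```

The variant contains none of the signatures, so a string-only rule would miss it; the similarity condition catches it because it differs from the known sample by a single byte.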

Naik et al. [65] proposed a computational intelligence honeypot system that was capable of predicting and discovering attempted fingerprinting attacks. The proposed intelligent system used two approaches: Principal Component Analysis, which was used to select the most important features for the prediction, and a Fuzzy Inference System (FIS), which was used to correctly correlate the features selected by the Principal Component Analysis. In the FIS, the three most important features were given as fuzzy inputs, which yielded effective and optimized rules. The accuracy of the proposed computational intelligence system was then compared with five other fingerprinting attack detection techniques. The system classified attacks into High, Medium, and Low attack levels. The system showed 0% failure in detecting attempted fingerprinting attacks, outperforming the other techniques.

Challenges and future scope

The emergence of AI in cybersecurity can be beneficial but challenging too [84, 85, 86].

The major challenges in the application of AI in cybersecurity include: (i) designing an Artificial Intelligence system that does not have any negative effects while executing the task of cybersecurity, (ii) ensuring that the given AI system has scalable oversight, and (iii) handling the situation where, as research into new technologies progresses, AI grows smarter and self-developing, thereby replacing humans [73]. Although computational intelligence methods have been extensively applied in the area of computer security and forensics, privacy and power are among the ethical and legal issues that arise as the technology expands [79].

Nevertheless, AI offers a wide future scope for implementation in cybersecurity. Several research works and experiments are under way with the aim of tracing the ill-effects of the utilization of AI. Moreover, several attempts are being made to find solutions to such ill-effects before the techniques of Artificial Intelligence are implemented in the real world [79]. Several use cases are being tested to ensure the proper application of AI in cybersecurity, with attacks on networks being the leading kind of attack. The application of fuzzy rule-based expert systems has been a topic of prime interest for researchers. They are keen to compare the performance of this system with other meta-heuristics, such as ANN, Fuzzy Neural Networks, and Genetic Algorithms, or with general statistical techniques, such as linear and non-linear regression. The fuzzy rule-based approach is examined particularly to investigate whether it has potential benefits in managing cybersecurity threats [40]. In the current scenario of ever-increasing cybercrime, the viruses and worms that infect and cause harm to cyberspace are also intelligent. This creates scope for developing intelligent sensors that can track the harmful actions of such intelligent viruses and worms and can ultimately help to curb their growth [79]. Apart from this, data mining techniques also have great scope in identifying some attack connections, which would provide a more scientific basis for the search space of a genetic algorithm [39]. AI applications for cyber threat detection have started to outpace those for prediction and response by a wide margin.

Conclusion

Through this study, it can be observed that AI is redefining every aspect of cybersecurity. The introduction of AI techniques in login security has started making CAPTCHA technology inefficient and obsolete. The practical implementations of AI techniques in analyzing and detecting cyberattacks in a computer system have shown promising potential for the betterment of the field of cybersecurity. The cost of detecting and responding to breaches in cyberspace is seen to be reduced significantly. Moreover, the average time taken to detect threats and anomalies is observed to decrease when AI methods are introduced into the conventional detection process. Additionally, the accuracy and responsiveness of the detection process are improved with the aid of AI methods, which help in improving the input, providing an improved procedure for cybersecurity, and so on. Apart from contributing to the detection process, intelligent systems can also be designed to warn users and make them aware of the possible cyberattacks and threats to which their computer systems are vulnerable.

Table 1 Comparative study on cyber threat using distributed AI methods
Table 2 Comparative study on cyber threat using compact AI methods
Table 3 Summary of possible challenges in applying AI techniques in cybersecurity and their potential solutions