1 Introduction

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), first reported in December 2019 in Wuhan, China [1, 2], is a novel coronavirus that has caused the ongoing COVID-19 pandemic. It first appeared as cases of pneumonia with an unidentified cause, and the virus then spread rapidly, causing a large disease outbreak. Due to the severity of the outbreak, the World Health Organization (WHO) declared it a “Public Health Emergency of International Concern (PHEIC)” on January 30, 2020. After the number of cases outside China increased almost 13-fold and the number of affected countries tripled, reaching 118,000 cases in 114 countries and 4291 deaths, WHO declared COVID-19 a “Global Pandemic” on March 11, 2020 [3]. On August 02, 2021, the number of cases was 199,558,118, with 4,247,970 deaths and 180,027,978 people recovered [4]. The number of active cases was 15,282,170, of which 99.4% were in mild condition and 0.6% were in critical condition. Of the 184,275,948 closed cases (i.e., cases with an outcome), 4,247,970 (2%) ended in death, whereas 180,027,978 (98%) patients were discharged after recovery. The number of detected cases grew from 987 on January 22, 2020, to 199,085,178 (almost 200 million) by August 01, 2021. More recent statistics from November 06, 2022, listed 6,605,691 deaths and 617,384,409 recovered cases among 637,767,096 COVID-19 cases worldwide.

SARS-CoV-2 is a positive-sense single-stranded RNA virus that can cause respiratory, neurological, hepatic, and other diseases [5]. It is the seventh known human coronavirus and, like SARS-CoV and the Middle East respiratory syndrome coronavirus (MERS-CoV), it can cause severe pneumonia [6]. The other four human coronaviruses, OC43, NL63, HKU1, and 229E, cause mild symptoms. Through multiple mutations, new variants of SARS-CoV-2 have evolved that are more transmissible, produce new symptoms, and are more likely to have fatal consequences. The five current variants of concern (VOC) are Alpha, Beta, Gamma, Delta, and Omicron, which first appeared in the UK, South Africa, Brazil, India, and South Africa, respectively [7]. The Alpha and Beta variants were named on December 18, 2020, the Gamma variant on January 11, 2021, Delta on May 11, 2021, and Omicron on November 24, 2021. The subvariants are descendant lineages of these variants.

Due to the severity of the disease, early and accurate detection is essential for managing a COVID-19 infection, which requires rapid diagnostic tests. Nucleic acid-based tests are among the most reliable approaches for virus detection, and the polymerase chain reaction (PCR) method is one of the most popular of them because it achieves high sensitivity and specificity in relatively little time. The reverse transcriptase-PCR (RT-PCR) is the “gold standard” for SARS-CoV-2 detection [9]. RT-PCR tests the nucleic acids in upper and lower respiratory specimens such as nasopharyngeal or oropharyngeal swabs and sputum. However, not enough RT-PCR testing kits are available worldwide and the test itself is time-consuming. As an alternative, chest imaging techniques can be helpful for rapid COVID-19 detection, severity assessment, and patient management since SARS-CoV-2 normally infects the lungs. Chest radiography (i.e., chest X-ray, CXR), chest computed tomography (CT), and lung ultrasound can therefore be used for imaging and diagnosis of COVID-19 patients [10]. Figure 1 shows CXR, CT, and lung ultrasound sample images of COVID-19 patients.

Fig. 1 Sample CXR, CT, and lung ultrasound of COVID-19 patients [8]

CXRs have been widely used for medical imaging of COVID-19 since X-ray scanners are available in almost every healthcare facility in every country. CXR generally costs less than RT-PCR and other testing kits and requires less processing time [12]. Figure 2 shows sample CXRs of a healthy person, a bacterial pneumonia patient, a viral pneumonia patient, and a COVID-19 patient. Chest CT produces images with a slightly higher sensitivity than CXR [13]. However, most experiments using both techniques showed no significant difference between them for COVID-19 diagnosis. CXR and CT also often complement each other in cases with intermediate findings [14]. Lung ultrasound is another low-cost and readily available option for imaging COVID-19 patients. Although it produces lower-quality images, it involves no X-ray radiation [15].

Fig. 2 Sample CXRs of healthy, pneumonia, and COVID-19 patients [11]

Medical images may portray a normal (i.e., healthy) representation of an organ or body part, or they may contain irregularities or deformities (also known as “abnormalities”) that indicate a medical condition. As these abnormalities are generally not present in the images of healthy people, they can be used to differentiate between healthy people and people with medical conditions or diseases. In COVID-19 patients, medical imaging may show various types of abnormalities, depending on the disease stage, severity, and pre-existing lung conditions. Different types of lung opacity are one major indication of the disease. These include subpleural curvilinear opacity, reticulonodular opacity, and ground-glass opacity (i.e., bilateral, multifocal, peripheral, posterior, medial, and basal). Other common abnormalities include septal thickening, consolidation (i.e., pulmonary and air space), cavitating lesions, and crazy-paving appearances with various irregular shapes like tree-in-bud or halo signs [16]. Figure 3 shows sample chest CTs with infection variations in COVID-19 patients.

Fig. 3 Sample chest CTs with infected lung regions of COVID-19 patients [16]

Medical image analysis (i.e., of CXR, CT, and lung ultrasound images) provides a convenient and non-invasive way of diagnosing COVID-19 and hence, medical professionals have been using it along with RT-PCR test results (Fig. 4). To further help medical professionals, analyzing the medical images with artificial intelligence (AI) can be valuable, as AI requires significantly less time than the manual process. Hence, AI-based tools and techniques have become popular for COVID-19 analysis tasks and have achieved high accuracy for (i) detecting whether the patient is COVID-19 positive or negative, (ii) classifying COVID-19 cases from healthy, viral pneumonia, and bacterial pneumonia cases, (iii) identifying the severity of the infection, (iv) segmenting the infected regions, (v) tracking the progress of the disease over specific time intervals, and (vi) indicating the infected region. Various basic machine learning (ML), deep learning (DL), and hybrid models have been implemented and tested on COVID-19 images for these tasks. Some basic COVID-19 analysis applications are shown in Fig. 5. To accelerate image processing, some existing efficient DL models trained on various image databases were directly applied through transfer learning (TL). Some researchers also proposed novel ideas and new deep neural network (DNN) models for these tasks. Experiments on public or collected COVID-19 image datasets showed impressive performance for most of these approaches and provided some guidelines for new relevant research.

Fig. 4 Sample heatmap visualization of the CXR of a COVID-19 patient at different days using Grad-CAM [17]

Fig. 5 COVID-19 medical image analysis

Within the short period of time since the COVID-19 outbreak, some researchers have developed and published AI-based tools and applications such as LungPrint [18], InferReadTM Solutions [19], Clara COVID-19 [20], CAD4COVID [21], AI4COVID-19 [22], CoViDiag [23], and COVID-19 Assistant Discrimination [24] for COVID-19 data analysis, assessment, and screening. These tools are either works in progress or ready to use. Some visualization-based AI applications have also been used for distinguishing the COVID-19-infected regions of the lungs from the remainder of the lung. Figure 4 shows sample visualizations of heatmaps of infected regions of CXRs of a COVID-19 patient on days 3, 7, and 9 respectively using Grad-CAM [25].

The overall framework of medical image processing for COVID-19 detection and segmentation is shown in Fig. 6. Medical image datasets are used as inputs: CXRs, CTs, and ultrasounds are normally used for COVID-19 detection and infected lung region segmentation in combination with various types of AI models/algorithms (i.e., ML, DL, TL, and hybrid). The input images are pre-processed with various algorithms, mostly to remove noise and unnecessary parts from the images and to enhance the required pixels. Feature extraction algorithms are then used to generate significant feature sets from the images. Depending on the image and output requirements, these pre-processing and feature extraction methods are chosen from existing image-based algorithms or are newly designed. The feature sets are then passed to classification and segmentation algorithms. The classification process detects whether the image shows a COVID-infected lung. If it does, the next step is segmenting the infected region of the lungs, which the segmentation algorithms extract as output. Both detection and segmentation performance are evaluated with standard image-based performance measures. The image detection methods use various classification models to classify the image into healthy, COVID-19, and other disease classes. In most cases, the task is to evaluate whether the output class generated by the model matches the class label (i.e., ground truth) of the original data. Accuracy, precision, sensitivity, specificity, and similar scores are computed to evaluate the correctness and quality of the classifier. The image segmentation task, on the other hand, generates images with the segmented infection region, so the segmentation evaluation checks whether the pixels of the ground truth image (i.e., infection region) match the pixels of the segmented output image. Similarity or dissimilarity metrics like the Dice score and the Jaccard index are generally calculated for image segmentation performance evaluation. The definitions and equations of these performance metrics are given in Section 3.

Fig. 6 COVID-19 medical image detection and segmentation workflow

In this paper, a review of recent research efforts on AI-based COVID-19 analysis from medical images is provided. A few survey papers are also reviewed and summarized along with ML, DL, TL, and hybrid model-based research works on COVID-19 medical image analysis. The mentioned approaches are summarized by including their contributions, novelties, tasks, features, methods, datasets, and performance scores. Some COVID-19 image datasets used in recent relevant approaches are also mentioned with their image characteristics, ground truths, sources, etc. The performance metrics commonly used for COVID-19 image analysis in recent works are defined and explained. Finally, some challenges and future research scopes based on the review of COVID-19 approaches are mentioned to provide indications for research directions.

Publications from January 2020 to August 2022 were considered for this literature review. They were retrieved from “PubMed” [26] and “Google Scholar” [27] using the following search terms and various combinations of them: “covid,” “covid-19,” “medical images,” “x-ray,” “ct,” “machine learning,” “deep learning,” “covid detection,” “covid segmentation,” “covid image datasets,” “covid medical image analysis,” “covid medical image review,” “covid medical image survey.” The publications listed in phase 1 were selected purely on the basis of the search terms and a more recent publication date. In phase 2, the listed papers were read and analyzed to select the most relevant ones and to remove duplicates. After removing duplicates and selecting the papers that were based on AI models for COVID-19 detection, segmentation, classification, etc., the final list of papers was created in phase 3, based on the most recent publication dates and the novelty of the research. This process was carried out separately for the survey papers and the AI model-based papers. Among the 50 COVID-19 literature review publications collected until August 2022, 6 were removed from the list at phase 1 as they were not completely based on AI-based medical image analysis. The remaining 44 survey papers were shortlisted and read in phase 2. Finally, in phase 3, 18 of them were selected for this paper since they were most relevant to the topic and more recently published, meaning they included more up-to-date summaries of the research on COVID-19 medical image analysis. Similarly, for the AI model-based papers, 224 papers were collected in phase 1. After removing the duplicates and irrelevant ones, 50 papers were shortlisted in phase 2 and finally 32 were summarized in phase 3 based on the novelty of the works. The medical image datasets available for COVID-19 were identified through dataset searches and from the datasets mentioned in the papers read. Twenty-one datasets that include COVID-19 CXRs, CTs, or ultrasounds (mostly images in different formats) are described in this paper. Figure 7 shows the selection process for the survey papers, AI model-based papers, and datasets.

Fig. 7 The paper selection process for this review

The major contributions of this paper are as follows:

  • providing a summary and analysis of recent AI-based research on COVID-19 medical images,

  • discussing the basics of ML, DL, and TL models and common evaluation metrics for the topic,

  • summarizing existing recent literature reviews on the topic to present the overall research directions on the topic,

  • providing a literature review of recent AI model-based research on COVID-19 medical image analysis,

  • listing the methods, datasets, types of data, performances, and major contributions of recent research on the topic,

  • providing necessary resources on available COVID-19 medical image databases,

  • listing the challenges related to the research topic to indicate possible future research ideas.

The rest of this paper is outlined as follows. Section 2 shows the basics of AI models and algorithms. Section 3 includes the performance metrics used in COVID-19 detection, classification, and segmentation. Sections 4 and 5 include summaries of some existing works on COVID-19 analysis surveys and various AI models (ML, DL, TL, and hybrid methods) respectively. Section 6 lists some available image datasets for COVID-19 detection methods. Section 7 discusses the scopes and challenges with respect to COVID-19 analysis and Section 8 concludes the paper.

2 AI models

AI uses computers to mimic human intelligence and thereby minimize human interaction with systems [28]. Medical data analysis is one of the major applications of AI, and different ML algorithms have been used in automated systems that process medical data (i.e., image, text, and audio). AI models have been used in disease detection and diagnosis with ML, DL, and TL models. The ability to analyze large amounts of medical data with speed, accuracy, and precision has made AI tools and AI-based automated systems popular among healthcare workers and patients. Many healthcare professionals now use some type of AI tool in various phases of patient diagnosis and management. Figure 8 shows the generic structure of AI algorithms.

Fig. 8 Relationships between AI, ML, DL, and TL

2.1 Machine learning (ML)

ML is a subset of AI that enables a computer to learn from given data. ML can be categorized into supervised, unsupervised, semi-supervised, and reinforcement learning [29], as shown in Fig. 9. Supervised learning is the process in which the machine learns to generate outputs from inputs based on training input-output samples. It uses the training output labels to learn to generate labels for test data in classification and regression tasks. Support vector machine, naïve Bayes, linear regression, and decision tree are some efficient supervised learning algorithms [30]. The machine may also learn by extracting features from unlabeled data, which is known as unsupervised ML. Clustering and association algorithms such as k-means and DBSCAN are data-driven learning algorithms of this type. Semi-supervised learning combines supervised and unsupervised learning: it uses both labeled and unlabeled data to achieve better classification than unsupervised models by utilizing the available data labels in the training process. Reinforcement learning is an environment-driven learning process that learns through rewards and penalties for correct and incorrect predictions, respectively.

Fig. 9 ML models
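The difference between supervised and unsupervised learning can be illustrated with a minimal sketch. The example is not taken from any of the reviewed works: the one-dimensional feature values, the label encoding, and the function names are hypothetical and chosen only for illustration. A nearest-centroid classifier uses the labels during training, whereas a two-cluster k-means procedure groups the same values without them.

```python
# Hypothetical 1-D features (e.g., one opacity score per image) and labels.
features = [0.10, 0.15, 0.20, 0.80, 0.85, 0.90]
labels = [0, 0, 0, 1, 1, 1]  # 0 = healthy, 1 = COVID-19 (assumed encoding)


def fit_nearest_centroid(xs, ys):
    """Supervised: use the labels to compute one centroid per class."""
    centroids = {}
    for label in set(ys):
        members = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = sum(members) / len(members)
    return centroids


def predict_nearest_centroid(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def kmeans_two_clusters(xs, iterations=10):
    """Unsupervised: split the values into two clusters without labels."""
    centers = [min(xs), max(xs)]  # simple deterministic initialization
    assignment = [0] * len(xs)
    for _ in range(iterations):
        assignment = [
            0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1 for x in xs
        ]
        for c in (0, 1):
            members = [x for x, a in zip(xs, assignment) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = sum(members) / len(members)
    return centers, assignment


centroids = fit_nearest_centroid(features, labels)
centers, clusters = kmeans_two_clusters(features)
print(predict_nearest_centroid(centroids, 0.7))  # class predicted for a new value
print(clusters)  # cluster index per sample, found without labels
```

In this toy setting both procedures separate the two groups, but only the supervised model can name the classes; the cluster indices returned by k-means carry no class meaning until they are matched to labels.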

2.2 Deep learning (DL)

DL is a subset of ML that is implemented with deep neural networks (DNNs). A DNN is a special type of artificial neural network (ANN) that has multiple hidden layers between the input and output layers, as shown in Fig. 10 [31]. ANNs were inspired by the brains of living things and were created to mimic the learning process of the biological brain. As a DNN is also an ANN, it learns complex information by using layers of neurons connected to each other. Each layer of a DNN transforms the input data or the output of the previous layer into a more complex representation with different types of feature values. By encoding and decoding the information between layers of the network, the DL models learn and produce outputs. Convolutional neural networks (CNNs), U-Net, GoogleNet, ResNet, DenseNet, and AlexNet are some examples of DNNs.

Fig. 10 Artificial neural network (ANN) vs. deep neural network (DNN) [31]

CNNs are multi-layer perceptron networks, inspired by the biological brain, that are able to analyze and detect patterns in image pixels [32]. Due to the efficiency of CNNs in image feature extraction, most DL models for image analysis are built from different combinations of CNNs and other supporting components. As a CNN is a DNN, it is constructed with an input layer, an output layer, and multiple hidden layers between them. The hidden layers include convolution layers that generate feature maps from the input images and pass them to the next layer. The convolution layers are normally followed by pooling layers that reduce the dimensions of the feature maps, and fully connected layers connect each neuron of one layer to every neuron of the next layer. The feature maps are passed along the layers and updated with increasingly complex information until the network finally produces the classification output for the input image (Fig. 11).

Fig. 11 A sample CNN structure for medical image analysis [32]
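The convolution and pooling operations described above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the architecture of any reviewed model: the toy image, the hand-set edge kernel, and the function names are hypothetical, and in a real CNN the kernel values are learned during training.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    As in most DL libraries, this is a cross-correlation: the kernel is
    not flipped before it is applied.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(kh)
                for b in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Set negative responses to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 block."""
    return [
        [
            max(
                feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1],
            )
            for j in range(0, len(feature_map[0]) - 1, 2)
        ]
        for i in range(0, len(feature_map) - 1, 2)
    ]


# Hypothetical 5x5 image with a dark-to-bright vertical edge.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]  # hand-set vertical edge detector (assumption)

feature_map = relu(conv2d_valid(image, kernel))  # 4x4 feature map
pooled = max_pool_2x2(feature_map)               # 2x2 after pooling
print(feature_map)
print(pooled)
```

The feature map responds only where the edge lies, and pooling halves each spatial dimension while keeping that response, which is the dimension reduction the pooling layers provide.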

DL models for image analysis follow a general format that includes (i) data pre-processing, (ii) data augmentation, (iii) a CNN/DNN model, (iv) post-processing, and (v) final prediction/segmentation/classification. Most research on object detection and segmentation from various types of images (i.e., medical images and non-medical images like facial images and tongue images) uses similar frameworks with CNN/DNN structures [33, 34]. The pre-processing and post-processing can be as simple as applying a threshold or as complex as applying DNNs. They may also include data normalization, model regularization [35], and noise removal. Data augmentation [36] is a popular step in medical image analysis with DNNs, as the amount of available medical image data is limited and DNNs perform better with large datasets. The data augmentation process uses different methods to modify the existing data to create new data that extend the dataset. Various CNN/DNN models are then applied to the augmented data to generate outputs.
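As an illustration of the data augmentation step, the following minimal sketch creates modified copies of an image with a horizontal flip and a 90-degree rotation. The function names and the tiny example image are hypothetical, and the choice of transformations is an assumption; which transformations are appropriate for a given medical imaging task has to be decided case by case.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


def augment(image):
    """Return the original image together with its modified copies."""
    return [image, flip_horizontal(image), rotate_90_clockwise(image)]


# Hypothetical 2x2 "image"; real inputs would be CXR or CT pixel arrays.
image = [[1, 2],
         [3, 4]]
augmented = augment(image)
print(len(augmented))  # number of images after augmentation
```

Each original image yields three training samples here, which is how augmentation extends a limited dataset without collecting new data.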

2.3 Transfer learning (TL)

TL uses the idea of knowledge transfer from one domain to other related domains [37]. Knowledge gained from a known domain (i.e., source domain) can be utilized to reduce the training time and the need for a large amount of labeled data while improving the performance of the model on an unknown but related domain (i.e., target domain). An example is shown in Fig. 12. Tasks A, B, C, and D are different types of tasks from related domains, where A and B are source domain tasks and C and D are target domain tasks. The knowledge gained from tasks A and B is transferred to solve tasks C and D; thus, TL is applied. Deep transfer learning (DTL) provides scope for disease detection not only from medical images, but also from other non-invasive images like facial images for genetic disorder (i.e., thalassemia and Down syndrome) detection [38]. Although TL does not always ensure better performance, it provides a comparable alternative approach [39].

Fig. 12 Transfer learning
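One common way of applying TL, reusing layers trained on a source task as a fixed feature extractor and training only a new output layer on the target task, can be sketched as follows. This is a toy illustration under stated assumptions and not the method of any reviewed paper: the "pre-trained" weights, the target samples, and all function names are hypothetical, and a logistic-regression head stands in for the new classification layer.

```python
import math

# Hypothetical weights standing in for layers pre-trained on a source task.
PRETRAINED_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def frozen_features(x):
    """Source-domain feature extractor (linear layer + ReLU); never updated."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)))
        for row in PRETRAINED_WEIGHTS
    ]


def head_score(weights, bias, x):
    """Score of the new output layer applied to the frozen features."""
    features = frozen_features(x)
    return sum(w * f for w, f in zip(weights, features)) + bias


def train_head(samples, labels, epochs=200, lr=0.5):
    """Train only a new logistic-regression head on target-domain data."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            features = frozen_features(x)
            p = 1.0 / (1.0 + math.exp(-head_score(weights, bias, x)))
            error = p - y  # gradient of the log-loss w.r.t. the score
            weights = [w - lr * error * f for w, f in zip(weights, features)]
            bias -= lr * error
    return weights, bias


# Hypothetical target-domain samples and labels.
target_x = [[2.0, 0.0], [0.0, 2.0], [3.0, 1.0], [1.0, 3.0]]
target_y = [1, 0, 1, 0]

weights, bias = train_head(target_x, target_y)
predictions = [1 if head_score(weights, bias, x) > 0 else 0 for x in target_x]
print(predictions)
```

Only the three parameters of the head are trained, which reflects why TL can reduce both the training time and the amount of labeled target data that is needed.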

The various ML, DL, TL, and hybrid models have their own advantages and limitations, but their general workflow is similar. They take medical images as inputs, apply some pre-processing to enhance the image features, and then train the model with (i.e., supervised) or without (i.e., unsupervised) data labels and ground truth images. The outputs generated by the models give the class of the image in the detection task and the segmented infected region in the segmentation task. What distinguishes the individual ML and DL models is their ANN/DNN structure. By varying the DNN structure, the number of layers of neurons, the connections between the neurons, the parameters, and the parameter weights, these models can extract different types of implicit and explicit features from the images based on their pixels, textures, intensities, and semantic measurements. Hence, the outputs of the models vary, as they depend on the features used to train the system. The TL model outputs may vary based on the similarity and dissimilarity between the source and target domains. Drawing a general conclusion about the best model is difficult, as the performance of the models varies depending on the data, extracted features, model hyperparameters, and the overall structure of the model.

3 Performance metrics

Most approaches for COVID-19 detection from chest images have used popular performance metrics for classification and segmentation problems. The proposed or existing systems were mainly evaluated by classification accuracy, loss, true negative rate (TNR) or specificity, positive predictive value (PPV) or precision, true positive rate (TPR) or recall or sensitivity, false positive rate (FPR), F1-score, receiver operating characteristic (ROC), area under the ROC curve (AUC), Matthews correlation coefficient (MCC), Dice coefficient/score, and Jaccard coefficient/index [40, 41].

The values for true positive (TP) and true negative (TN) represent the number of correct predictions of the “Positive” class and “Negative” class respectively. False positive (FP) and false negative (FN) represent the number of incorrect predictions of “Positive” class and “Negative” class respectively. The accuracy value represents the percentage or amount of correctly classified data and loss shows the difference between the prediction and ground truth. The equation for accuracy calculation using the confusion matrix is mentioned in Eq. 1.

$$ \text{Accuracy} = \frac{\mathrm{TP + TN}}{\mathrm{TP + FP + TN + FN}} $$
(1)

The PPV score shows the ratio of correctly predicted positive values to the total predicted positive values. TPR represents the ratio of correctly predicted positive values to all actual positive values and TNR represents the ratio of correctly predicted negative values to all actual negative values. FPR shows the ratio of incorrectly predicted positive values (i.e., false positives) to all actual negative values. F1-score represents the harmonic mean of the PPV and TPR values. The calculations for these metrics are given in Eqs. 2, 3, 4, 5, and 6.

$$ \mathrm{PPV\ or\ Precision} = \frac{\text{TP}}{\mathrm{TP + FP}} $$
(2)
$$ \mathrm{TPR\ or\ Sensitivity ~ or ~ Recall} = \frac{\text{TP}}{\mathrm{TP + FN}} $$
(3)
$$ \text{FPR} = \frac{\text{FP}}{\mathrm{FP + TN}} $$
(4)
$$ \mathrm{TNR\ or\ Specificity} = \frac{\text{TN}}{\mathrm{TN + FP}} $$
(5)
$$ \mathrm{F_{1}\ score} = \frac{2 \cdot \text{PPV} \cdot \text{TPR}}{\mathrm{PPV + TPR}} $$
(6)
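Equations 1 to 6 can be computed directly from the four confusion matrix counts. The sketch below is a minimal illustration; the function name and the example counts are hypothetical, and it assumes that no denominator is zero (i.e., both classes occur and at least one positive prediction is made).

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute Eqs. 1-6 from confusion matrix counts (nonzero denominators assumed)."""
    ppv = tp / (tp + fp)  # precision, Eq. 2
    tpr = tp / (tp + fn)  # sensitivity or recall, Eq. 3
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),  # Eq. 1
        "ppv": ppv,
        "tpr": tpr,
        "fpr": fp / (fp + tn),                        # Eq. 4
        "tnr": tn / (tn + fp),                        # specificity, Eq. 5
        "f1": 2 * ppv * tpr / (ppv + tpr),            # Eq. 6
    }


# Hypothetical counts for 100 test images.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Since FPR and TNR share the denominator FP + TN, they always sum to 1, so reporting one of them determines the other.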

The ROC curve is a probability curve that plots TPR against false positive rate (FPR) at different decision thresholds. AUC is a summary of the ROC curve and it represents the capability or separability of the classifier to differentiate between classes. The higher the AUC, the better the classifier is at differentiating the elements of different classes. Figure 13 shows a sample of a typical ROC and AUC.

Fig. 13 ROC and AUC [42].
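The construction of the ROC curve and the AUC can be sketched as follows. This is a minimal illustration with hypothetical scores and labels: each distinct score is used as a decision threshold to obtain one (FPR, TPR) point, and the area is approximated with the trapezoidal rule. It assumes that both classes are present in the labels.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points, one per distinct decision threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc_trapezoid(points):
    """Approximate the area under the ROC curve with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical classifier scores (higher = more likely positive) and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auc = auc_trapezoid(points)
print(points)
print(round(auc, 4))
```

For these values, 8 of the 9 positive-negative pairs are ranked correctly by the scores, which matches the resulting AUC of 8/9 and illustrates the interpretation of AUC as the separability of the two classes.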

The MCC score represents the quality of classification for a classifier. It ranges from −1 to 1. If the classifier generates 100% correct classifications, then the values of FP and FN are 0. In that case, the MCC score is 1, showing the perfect quality of the classifier. It can be calculated by Eq. 7 as follows.

$$ \text{MCC} = \frac{(\mathrm{TP} \cdot \mathrm{TN}) - (\mathrm{FP} \cdot \mathrm{FN})}{\sqrt{(\mathrm{TP + FP})(\mathrm{TP + FN})(\mathrm{TN + FP})(\mathrm{TN + FN})}} $$
(7)
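Equation 7 can be evaluated as in the following minimal sketch. The function name and the example counts are hypothetical, and returning 0 when the denominator is zero is a common convention that is assumed here rather than taken from the reviewed works.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion matrix counts (Eq. 7)."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # assumed convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


print(mcc(50, 0, 50, 0))             # all predictions correct
print(mcc(0, 50, 0, 50))             # all predictions wrong
print(round(mcc(40, 10, 45, 5), 4))  # hypothetical mixed case
```

The first two calls show the extreme values of the score, and the mixed case falls between them.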

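For the Dice score and Jaccard index calculations, let A and B be the sets of pixels in the segmented output image and the ground truth, respectively. The Dice score measures the similarity between A and B with Eqs. 8 and 9. The Jaccard index measures the similarity between A and B as the ratio of their intersection to their union with Eqs. 10 and 11; unlike the Dice-based dissimilarity, the corresponding Jaccard distance (i.e., 1 − Jaccard index) satisfies the triangle inequality.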

$$ \mathrm{Dice ~ score} (A,B) = \frac{2|A \cap B|}{|A|+|B|} $$
(8)
$$ \mathrm{Dice ~ score} (A,B) = \frac{2 \text{TP}}{\mathrm{2 TP + FP + FN}} = \mathrm{F_{1} ~score} $$
(9)
$$ \mathrm{Jaccard ~ index} (A,B) = \frac{|A \cap B|}{|A \cup B|} $$
(10)
$$ \mathrm{Jaccard ~ index} (A,B) = \frac{\text{TP}}{\mathrm{TP + FP + FN}} = \text{IoU} $$
(11)
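Equations 8 and 10 can be applied to binary segmentation masks as in the following minimal sketch. The function name and the two small masks are hypothetical, and the sketch assumes that at least one of the masks contains foreground pixels so that the denominators are nonzero.

```python
def dice_and_jaccard(output_mask, truth_mask):
    """Dice score (Eq. 8) and Jaccard index (Eq. 10) of two binary masks."""
    a = {(i, j) for i, row in enumerate(output_mask) for j, v in enumerate(row) if v}
    b = {(i, j) for i, row in enumerate(truth_mask) for j, v in enumerate(row) if v}
    intersection = len(a & b)
    dice = 2 * intersection / (len(a) + len(b))
    jaccard = intersection / len(a | b)
    return dice, jaccard


# Hypothetical 3x3 masks: 1 marks a pixel of the infected region.
output_mask = [[0, 1, 1],
               [0, 1, 1],
               [0, 0, 0]]
truth_mask = [[0, 0, 1],
              [0, 1, 1],
              [0, 0, 1]]

dice, jaccard = dice_and_jaccard(output_mask, truth_mask)
print(dice, jaccard)
```

The two scores are related by Jaccard = Dice / (2 − Dice), so they rank segmentation results in the same order, although the Jaccard index is never larger than the Dice score.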

4 Survey/review papers

Researchers are trying to find good solutions for detecting, diagnosing, and monitoring the disease as well as developing drugs and vaccines for COVID-19. Although the available knowledge, resources, and systems for curing the disease completely are very limited, researchers from various fields are working to find solutions as soon as possible. The first step in solving any problem is to gather as much information as possible about the problem and about existing approaches or solutions to it. Therefore, it is essential to review existing works in order to understand the problem, provide possible solutions, and review the challenges and scope of research. Some researchers have reviewed or summarized the origin of the problem, applied methods, their advantages and limitations, available datasets, available tools, and applications for COVID-19 and presented them in survey/review papers. In this section, a few existing survey papers on COVID-19 approaches using image data are reviewed.

Alafif et al. [43] recently summarized the existing approaches used for COVID-19 detection, segmentation, and diagnosis, together with challenges and future research directions, in a comprehensive survey paper. They discussed COVID-19 from various perspectives, covering invasive and non-invasive systems, available tools, drugs, and vaccines, and included a review of the origin of COVID-19, its severity, and the current situation. A comparative discussion of manual vs. AI-based COVID-19 analysis was included, which presented the advantages of AI tools due to their accuracy and instant outputs. Chest CT and X-rays were discussed together with the current ML, DL, and hybrid methods used to process these images. Some available AI tools and models like CovidAID [44], Clara COVID-19 [45], COVNet [46], and FluSense [47] were also mentioned with their applications for COVID-19 detection, severity classification, region of interest (ROI) segmentation, and patient monitoring. They listed some recent works with their methods, datasets, and performance metrics; most of them used DNNs such as Truncated Inception Net, Darknet-19, U-Net, V-Net, VNet-IR-RPN, VB-Net, U-Net++, SqueezeNet, ResNet, and DenseNet, or TL, and some used popular ML models like support vector machine (SVM) and random forest (RF). COVID-19 analysis with audio inputs (respiratory sounds like cough, breathing, and voice) was also covered by the authors with a similar listing. The drug and vaccine development procedure for COVID-19 was explained, and AI-based drug development systems that researchers have been working on were included. Finally, the review concluded with the challenges and scope of COVID-19 detection, diagnosis, and analysis. The review showed the advantages of using DL models for COVID-19 analysis, which were able to achieve 90 to 100% accuracy, sensitivity, specificity, and AUC and performed better than radiologists in some cases. It therefore indicated that AI-based tools and devices can provide rapid outcomes that significantly assist patient diagnosis and thus help healthcare professionals in their fight against COVID-19.

Another recent survey on the application of AI for COVID-19 [48] included the basics of COVID-19 and a general overview of ML models such as RF, SVM, LR, and XGBoost and DL models such as ANN, CNN, RNN, and LSTM. Ten CNN-based DL models used in recent COVID-19 classification from X-rays and CT scans were summarized, and their performances were compared using F1-score and AUROC, with accuracy values of more than 90%. The survey included summaries of 7 approaches to COVID-19 severity detection (i.e., moderate, severe, and critical) with both ML and DL methods, whose performances were compared for various combinations of CT, blood, and lab test data. Similar information was included on 10 COVID-19 mortality risk assessment methods based on demographic, lab test, medication, and other data. The limitations of these models included a lack of datasets, datasets with a limited amount of data, missing features in datasets, limited applicability to real-time systems, a lack of proper performance metrics like FPR/TPR, and a lack of discussion. A total of 13 drug repurposing-based methods were also discussed with their AI applications, features, and limitations. Finally, the general challenges of AI-based models like generalization, dataset issues, possibilities of variations, and their relevance to the COVID-19 issues were analyzed to provide some guidance for future research efforts.

Nayak et al. [49] also reviewed recent works on ML- and DL-based models for COVID-19 detection and analysis. They collected 795 papers on COVID-19 by applying relevant keywords and then filtered out 672 that were deemed not relevant. Finally, 123 papers (36 on ML, 64 on DL, 23 on others) were reviewed for their study. They started with a detailed explanation of viruses that had caused severe disease over the last century and then discussed the coronaviruses, including COVID-19, and their effects on humans and other animals. They described the statistics on death, recovery, and active cases in different countries; the reason for the spreading of the disease; the advantages of ML models for disease detection, data illustration, analysis, and prediction; and the limitations of available ML methods. Then they discussed the applications of DL models to overcome the challenges of ML models and to provide an easier and more effective way for disease detection, medical image analysis, disease diagnosis, drug and vaccine design, etc. The papers were summarized and compared with their datasets, methods, inputs (i.e., text data, image data, and time-series data), and outcomes. Different supervised and unsupervised ML models (i.e., SVM, RF, KNN, k-means, LR, SVR, DT, and ANN) and DL models (i.e., CNN, VGG, ResNet, Inception, LSTM, RNN, and DenseNet) were mentioned with their performances. The data used for COVID-19 research efforts, including clinical data, biomedical data, and online data, were discussed with their contents, applications, advantages, disadvantages, the impact of ML and DL on the data, and challenges, and some popular datasets were mentioned. The authors included an extensive critical analysis of existing surveys, popular ML and DL models, publications, data, location, and impacts. Their analysis showed that, although 52% of the reviewed works were based on DL, 29% on ML, and 19% on other models, only 11% of the COVID-19 publications from December 2019 to September 2020 were on ML and DL (7% DL and 4% ML). LR was the most popular ML model used for COVID-19 analysis and CNN the most popular DL model, with ResNet, VGG, and Inception being the most popular CNN variations. Among the various types of data, 60% of COVID-19 research works were on image data (i.e., CT, X-ray, others), 31% were on text data, and the remaining publications were on other data. They also ranked existing works based on different strategies with respect to method, data, and performance. More than 50% of those publications were from Asia, whereas Europe and North America ranked second and third. The authors also included a detailed analysis of the authors, their countries, and contributions. They compared five other existing surveys with their review, discussed the impacts of COVID-19 on different sectors, and concluded with a summary of their complete analysis.

Lalmuanawma et al. [50] reviewed the ML and AI applications for COVID-19 detection, prediction, contact tracing, and medicine approaches, and in a similar manner listed the challenges and future directions for researchers of various fields involved in COVID-19 research. They listed 36 contact tracing applications developed by different countries and compared 4 papers on screening and 4 on prediction and forecasting based on methods, datasets, validations, and performance metrics for clinical, mammographic, and demographic datasets.

Mohammad-Rahimi et al. [51] reviewed 105 approaches to COVID-19 selected from 1827 papers based on some inclusion-exclusion criteria. The papers used various ML and DL models for analyzing CT or X-rays as inputs. ML algorithms like SVM, RF, DT, KNN, and NB and ensemble classifiers were able to achieve more than 90% accuracy in most cases with a limited amount of data. ResNet, Inception, NASNetLarge, GoogleNet, CNN, AlexNet, VGG, SqueezeNet, Xception, and MobileNet were applied as DL methods for COVID-19 analysis and the performance metric scores were close to 100% for many of those methods. Some hybrid models with combinations of ML, DL, and heuristics outperformed individual methods in many cases. They summarized all papers in a tabular comparison containing the data sources, structures, and sizes of the datasets, pre-processing, the best model achieved in each study, and the accuracy, sensitivity, specificity, and AUC of their models for comparison. Although CT images provide more detail than X-ray images, the review showed that X-ray images were also very efficient for COVID-19 detection and severity classification. One of the major issues of COVID-19 research efforts is limited and inconsistent data with missing annotations. Preparing complete and reliable COVID-19 image datasets was mentioned as necessary future work. Due to the limited data, ML models performed very similarly to DL models and also had a higher learning speed than DL methods. Thresholding, morphological operations, histogram equalization, and data augmentation were the most popular pre-processing steps. They also discussed possible future directions with TL as it showed promising performance for COVID-19 approaches.

A recent brief review of AI models for COVID-19 detection, classification, and segmentation from medical images listed 12 infection segmentation studies and 23 AI model studies [52]. Medical imaging and details of the CT scan procedure with respiratory disease indications were discussed with respect to COVID patients. The image segmentation framework was explained with 12 recent lung and infection segmentation applications that mostly used U-Net, U-Net++, and Inf-Net. The U-Net and Inf-Net architectures were described step by step to explain the segmentation process. Afterwards, 23 papers on COVID classification and segmentation from CXRs and CTs with YOLO, VGG, U-Net, U-Net++, basic DNN, ResNet, GoogleNet, etc. were discussed and their results were compared. Finally, the contributions of image analysis for COVID screening and severity assessment were summarized to conclude the survey.

Another recent survey by Subramanian et al. [53] included a brief summary of recent DL models used in COVID classification and detection from CXR and CT images. They added an elaborate background on CNN and other basic DNN models like VGG, ResNet, MobileNet, Xception, and DenseNet. A brief summary of the existing TL-based studies and novel architectures was given with their architectures and performances. Seven public COVID image datasets used for classification were discussed after a detailed analysis of COVID medical image datasets and their characteristics. The accuracy, precision, sensitivity, specificity, NPV, and F1-score were defined, and the performance of 11 studies was compared on these metrics to identify the best-performing models.
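
The metrics named above recur throughout the reviewed papers, so a compact reference may help. The following is a minimal sketch of their standard definitions from binary confusion-matrix counts; the function name and the example counts are illustrative assumptions and are not taken from [53].

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives, and false negatives. Undefined ratios return 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)      # positive predictive value (PPV)
    sensitivity = ratio(tp, tp + fn)    # recall / true positive rate
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "npv": ratio(tn, tn + fn),           # negative predictive value
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical counts for a COVID / non-COVID classifier (not from any paper).
example = classification_metrics(tp=90, fp=10, tn=80, fn=20)
```

With these hypothetical counts, accuracy is 170/200 and F1 equals 2TP/(2TP + FP + FN) = 180/210, which illustrates why accuracy alone can hide a weaker sensitivity on imbalanced COVID datasets.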

A review on 30 research papers until August 2020 on COVID-19 detection and severity classification with performance analysis was discussed in [54]. The authors discussed the dataset details for each paper, the models used, and their performance metrics scores. ML, DL, and hybrid models were used for feature extraction, classification for COVID/normal/non-COVID cases, classification of different level of severity of COVID, and COVID area segmentation. ML models like LightGBM, SVM, RF, DT, XGBoost, AdaBoost, Bagging, and DL models like Inception, InceptionResNet, IRRCNN, ShuffleNet, ResNet, MobileNet, DenseNet, Xception, attention, U-Net, U-Net++, CNN, and its variations were discussed with image data. These methods showed high performance scores (i.e., higher than 80% in most cases) and up to more than 99% accuracy and AUC for COVID-19 detection and classification tasks. They also showed the comparisons of the datasets, methods, sensitivity, specificity, precision, accuracy, AUC, F1-score of 5 papers on COVID/normal image classification, 13 on COVID/non-COVID classification, 11 on COVID/non-COVID pneumonia classification, and 3 on COVID severity classification. They concluded with the observations that DenseNet, ResNet, and ShuffleNet performed well for classifications, U-Net++ performed better for segmentation tasks, and TL performed well for all tasks.

Alghamdi et al. [55] reviewed 34 DL-based COVID-19 approaches on CXRs from March 2020 to May 2020. As X-rays are more widely available, less time consuming, and provide precise imaging of organs, using CXRs for COVID-19 diagnosis is very popular. They described the COVID-19 classification task as a multi-class problem that was addressed differently across existing works in that the classes are called COVID-19, healthy, bacterial pneumonia, viral pneumonia, no finding, etc. They explored 13 existing datasets containing X-rays of COVID-19 and other lung diseases with detailed descriptions, image specifications, and links. As 71% of the research papers in the review were on TL of CNN variations, they described TL and its relevance to COVID-19 approaches. Some popular CNN structures used in those approaches like AlexNet, GoogleNet, VGGNet, ResNet, Xception, SENet, DenseNet, MobileNet, ShuffleNet, CapsNet, and autoencoder were discussed with the papers, their contributions, methods, and performances. Twenty-five papers that used these CNN-based DL models and 14 that used novel architectures were summarized and compared according to their datasets, contributions, and performance scores (i.e., accuracy, precision, specificity, sensitivity, AUC, and F1-score). They also discussed the methods used for visualizing the classification decisions in the reviewed papers such as Grad-CAM, CAM, gradients, attribution maps, LRP, GSInquire, and guided backpropagation. They explained challenges such as the class imbalance problem, classification uncertainty, COVID severity classification, and dataset quality. Some future directions with proper dataset generation, exhaustive feature extraction, exploring semi-supervised models, GANs and TL, etc. were discussed to guide further methods. Another similar survey on DL-based image analysis for COVID-19 was published recently [56]. Unlike the previous paper, it reviewed works on both chest CT scans and X-rays.

Bhattacharya et al. [57] focused on ML and DL methods for both COVID-19 analysis and other abnormality analysis from medical image data. They discussed COVID-19 in a detailed overview consisting of the origin, current situation, necessity of DL approaches, basics of ML (supervised, unsupervised, semi-supervised) methods, DL models (i.e., RNN, GAN, and CNN), and their applications. They explained medical image analysis with some recent works for abnormality detection, classification, localization, segmentation, and registration on various medical images. Outbreak prediction, tracking spread, diagnosis, treatment, and limitations of COVID-19 with ML and DL models were also discussed. They also listed the datasets, methods, performance metrics, and research challenges of 27 medical image analysis approaches and 16 COVID-based works. They included 3 use cases of DL-based COVID-19 detection and monitoring from China, Canada, and South Korea. Finally, they discussed the challenges and future research directions for the pandemic with AI tools. A similar but more general review was done in [58] for analyzing the applications of AI regarding COVID-19. They provided an elaborate overview of AI applications in COVID-19 research papers regarding tasks like COVID-19 detection, severity classification, prediction, patient data management, patient management, resource management, CT image analysis with ML, DL, TL, and other AI models, analysis of biology-based data, drug discovery, and social behavior monitoring and management. They also discussed and compared the tasks and performances of 7 research works that used AI-based techniques (mainly DL models, some ML, and other models) for COVID-19 analysis from chest CT images and showed that AI was able to achieve 97% accuracy.

A detailed survey on imaging techniques for COVID-19 (i.e., CT and X-ray), segmentation of regions of interest (ROIs) from those images, detection of COVID-19, classification, severity detection, diagnosis, monitoring, and COVID-19 image datasets was provided in [59]. The authors reviewed the image-based COVID methods on various ML, DL, and other AI techniques published until 31 March 2020. The workflow for conventional imaging, AI-based imaging, and COVID-19-specific AI applications for imaging with proper examples and explanations was included. They discussed basic image segmentation techniques and COVID-19 image segmentation techniques focused on U-Net and U-Net++ models for extracting lung regions and lesions of lungs. Summaries of 11 COVID-19 segmentation approaches with their data modality, subject details, methods, applications, ROI (i.e., lungs or lesions), and highlights of their contributions were provided. They also discussed AI-based techniques for detection, classification, severity measurements, and follow-ups of COVID-19 from images. They summarized 14 approaches on these with their data types, dataset details, tasks, models (i.e., U-Nets, CNNs, and RFs), and performances. They also listed details on 4 COVID-19 image datasets and discussed the challenges and future research scopes for COVID-19 detection, segmentation, and diagnosis.

Another recent survey [60] covered COVID-19 approaches that used ML and DL techniques on image and text datasets for diagnosis, prediction, and forecasting, and provided a detailed overview of pandemics, COVID-19, existing systems, datasets, challenges, and possible future directions. The authors listed and discussed all pandemics from plague to COVID-19 with timelines and death statistics. Details on COVID-19 origin, transmission, cases, deaths, and their statistics in various countries and timelines were provided. They summarized 5 works on RF, 5 on SVM, and 9 on other ML models like linear regression, k-means, XGBoost, and Gabor with problem statements, type of data, model used, publication time, and performance scores. Similar summaries were included for 14 papers on CNN, 8 on LSTM, and 7 on other DL models like GAN and autoencoders. Most of them achieved more than 95% accuracy in their tasks. They also included an overview of the mathematical and statistical models used for pandemic loss and prediction analysis with 9 papers. A performance analysis using the evaluation scores of 5 ML-based approaches and 13 DL-based approaches was provided. They also included detailed statistics on COVID-19 publications, types of publications, countries, journals, types of data, and the prediction-classification-forecasting distribution. They concluded with a detailed discussion of the challenges of COVID-19 like data limitation, prediction accuracy, lack of advanced tools and applications, and absence of customized systems appropriate for developing countries.

A systematic review of 11 COVID-19 research works from the 36 that were identified as published until the middle of May 2020 was included in [61] after an extensive searching and screening process. The publications were based on various combinations of classification tasks as binary/multi-class/hierarchical classes. The authors also mentioned statistics of the databases and countries of publications for the reviewed papers. Binary, multi-class and hierarchical classes were described in detail with relevance to the existing works and a critical analysis was done on those papers based on dataset types, methods, and case studies of X-rays and CT scans. The challenges of evaluation, trade-offs, and importance of criteria were discussed in detail. An important contribution of this survey was a complete and step by step research proposal using multi-criteria decision analysis based on the reviewed works. A complete methodology starting from identification, data collection, pre-processing, evaluations, methods, and validations was elaborately discussed.

Sufian et al. [62] followed a similar idea in their systematic review on COVID-19 approaches that used DL, DTL, and edge computing (EC) for lung image COVID-19 detection and classification. They discussed each of these methods in detail and with relevance to COVID-19 approaches. Tasks and contributions of 15 papers on DL models, 9 on DTL models, and 7 on EC models for COVID-19 approaches were summarized. They also included the image and textual datasets used in COVID-19 papers and listed 9 datasets and 7 data sources (images and textual) with brief descriptions for each. The current and future challenges that were not only relevant to COVID-19 detection or classification, but also the human resources and other logistic issues were analyzed. They finally proposed a complete framework combining DL, DTL, and EC for COVID-19 detection, diagnosis, hospital management, and social distancing management as a possible future direction for COVID-19 systems.

A survey on COVID-19 based on mathematical models, AI models, and datasets was provided in [63]. The authors reviewed 61 components (19 on mathematical models, 18 on AI, 24 on datasets) published from December 2019 to April 2020 on COVID including journal articles, websites, reports, and fact sheets. They discussed mathematical models used for COVID-19 spread and dynamics analysis, such as susceptible-exposed-infected-removed (SEIR) and susceptible-infected-recovered (SIR), and listed the tasks and models used in the reviewed mathematical model-based approaches. For the AI models, mostly used for image data, they provided detailed discussions on detection and classification from CXR and CT data. ML and DL models like CNN, NB, LDA, SVM, RF, DT, and LR were explained according to the corresponding research items. Methods, datasets, data details, accuracy, sensitivity, and PPV were compared for the AI-based works, with reported accuracy of more than 99%. They included descriptions and sources of 24 COVID-19 datasets combining image, text, and value data. Finally, they discussed the advantages and limitations of both mathematical and AI models and proposed some possible future directions.
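
To make the compartmental models mentioned above concrete, the following is a minimal sketch of the basic SIR model integrated with a forward-Euler step. It is a textbook formulation, not the specific model of any paper reviewed in [63]; the function name, the parameter values, and the step size are illustrative assumptions.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Forward-Euler integration of the basic SIR model.

    dS/dt = -beta*S*I/N,  dI/dt = beta*S*I/N - gamma*I,  dR/dt = gamma*I
    beta: transmission rate per day, gamma: recovery rate per day.
    Returns a list of (day, S, I, R) tuples, one entry per day.
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    steps = max(1, round(1 / dt))   # Euler sub-steps per simulated day
    h = 1.0 / steps
    history = [(0, s, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps):
            new_infections = beta * s * i / n * h
            new_recoveries = gamma * i * h
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((day, s, i, r))
    return history


# Hypothetical parameters: basic reproduction number beta/gamma = 3.
trajectory = simulate_sir(beta=0.3, gamma=0.1, s0=990, i0=10, r0=0, days=100)
peak_day, _, peak_infected, _ = max(trajectory, key=lambda row: row[2])
```

An SEIR variant adds an exposed compartment between S and I with its own transition rate; the update loop keeps the same structure.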

Several researchers have worked on reviews of datasets used in various COVID-19 analyses. A comprehensive study of the publicly available COVID-19 datasets, their applications, and possible future research directions was summarized in [64]. The image datasets included chest CT scans and X-rays, whereas the textual data included case reports on various factors of COVID-19, social network data relevant to COVID-19, and scholarly article information. The authors discussed dataset details like data collection process, location, number of data, type of data, data annotation, and data availability. They also included descriptions of methods used on those data for COVID-19 detection, segmentation, classification, etc. with their performances. They discussed the textual data containing demographic, economic, transmission, mobility, social media emotion, conversation, and scholarly article analysis. The application, method, data type, and links for 19 resources for images and 26 resources for texts including some Kaggle and Github sources were compared. The challenges regarding medical image and social media data, privacy issues, and authentication with some future research directions were suggested for COVID-19 analysis.

A similar review of clustering algorithms for COVID-19 datasets was presented in [65]. Various textual datasets containing social network data, demographic data, and case report data that had been analyzed with clustering algorithms such as k-means, FCM, DBSCAN, and hierarchical clustering were discussed with the objectives, datasets, methods, and results of their analysis. Latif et al. [66] prepared a survey paper based on different types of datasets, applications of data science, reviews of current approaches, and challenges of COVID-19. They explored image, textual, biomedical, voice datasets, competition datasets, datasets from developing countries, community-based datasets, other research works, and other statistical datasets on COVID-19. The data science aspects of COVID data such as risk assessments, classifying patients according to different priority criteria, proper screening of each patient, and diagnosis based on all variables were reviewed. They also evaluated various simulations and models used for COVID-19 diagnosis, contact tracing, imposing social interventions by applying social distancing, managing rumor propagation, planning resources and logistics accordingly, improving patient care, managing innovations of vaccines and other new medicines and treatments, controlling economic interventions, and applying all of those in different ways for developed and developing countries. The methods, datasets, and techniques (i.e., ML, DL, and hybrid) of 26 recent approaches on image data, 6 on text data, and 7 on pharmaceutical data were summarized. They included some statistics based on COVID-based publications, research topics, and other pandemics and listed the challenges of COVID-19 approaches such as data limitations, time constraints, authentication and security, the necessity for multidimensional approaches and data, and the need for more concrete approaches for developing countries.

Various researchers reviewed COVID image analysis from different perspectives. Some survey papers focused on the general framework of the image analysis tasks and mentioned existing works for each step of those tasks. Some reviews focused on the papers on COVID image analysis and tried to provide brief summaries of recent works in the field. Some researchers organized their literature review to combine all known information on AI-based COVID analysis. In this section, a brief idea of the existing literature is provided with the major tasks they performed, the general approaches they used, and their summarized contributions. The goal of this section is to provide an overall idea of each discussed review paper to relevant researchers so that they have some indication of the combined information resources currently available on these tasks. Tables 1, 2, and 3 list summaries of the contributions mentioned in the review/survey papers on COVID-19 analysis.

Table 1 Survey/review papers
Table 2 Survey/review papers
Table 3 Survey/review papers

5 AI model-based papers

COVID-19 approaches on medical image datasets mostly used DL and TL models. Some methods included popular ML techniques or hybrid models. As CNN models perform better in medical image analysis than other techniques, various popular CNN models were used in most of the recent works. Although the necessity and urgency of a complete system for COVID-19 screening encouraged the researchers to use pre-trained CNNs on COVID-19 datasets for rapid output generation, some researchers focused on proposing novel frameworks with ML, DL algorithms and achieved promising outputs. Most recent works chose the detection and classification task, but few papers also provided methods for infected region segmentation and visualization. The detection and classification tasks were used on medical images like CXR, CT, and ultrasounds as binary (i.e., COVID-19 or normal), 3 classes (i.e., COVID-19, normal, pneumonia), and 4 classes (i.e., COVID-19, normal, viral pneumonia, bacterial pneumonia) tasks. The visualization task-based works mostly highlighted infected lung regions. The segmentation tasks segmented the lung regions and the infected parts of the lungs. Some recent AI-based approaches for COVID-19 detection, classification, segmentation, and visualization are summarized below.

Chandra et al. [67] recently proposed a two phase COVID-19 screening system for classifying X-ray images into COVID-19-infected, COVID-19-suspected, and normal images. After pre-processing (i.e., resizing, denoising, and normalizing) the images, they used 4 data transformation techniques (i.e., sharpening, Gaussian blur, brightness modification, and contrast modification) for data augmentation. Then a binary grey wolf optimization (BGWO) algorithm was used to select the best features from 8196 features (i.e., 8 first-order statistical features (FOSF), 88 grey level co-occurrence matrix (GLCM), and 8100 histogram of oriented gradients (HOG) features). Decision tree (DT), support vector machine (SVM) with 3 different kernel functions in 3 separate models, k-nearest neighbor (KNN), naïve bayes (NB), and ANN were trained in two phases (phase I — normal/abnormal, phase II — normal/abnormal COVID-19/abnormal pneumonia). Finally, majority voting was used to generate the final output. The proposed method achieved 98.06% accuracy in phase I and 93.41% in phase II and the results were comparable to 5 and 6 similar COVID-19 approaches for 2 and 3 class problems respectively.
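
The final majority-voting step can be sketched as follows. This is an illustration of plain (unweighted) majority voting over the label predictions of several classifiers, not the implementation of [67]; the tie-breaking rule (the earliest classifier in the list wins) and the example labels are assumptions made here.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label predictions by unweighted majority vote.

    predictions: list with one inner list of labels per classifier; all
    inner lists must cover the same samples in the same order.
    Ties are broken in favour of the classifier listed first (an assumption).
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every classifier must predict every sample")
    voted = []
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        top = max(counts.values())
        # First label, in classifier order, that reaches the top count.
        voted.append(next(v for v in sample_votes if counts[v] == top))
    return voted


# Hypothetical outputs of three classifiers on three X-ray images.
ensemble = majority_vote([
    ["covid", "normal", "pneumonia"],
    ["covid", "covid", "pneumonia"],
    ["normal", "normal", "covid"],
])
```

Using an odd number of base classifiers, as in the seven-model setup described above, avoids ties in the binary phase.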

Another similar ML-based COVID-19 detection and classification approach was proposed recently in [68]. The authors explained the origin, transmission, statistics, and other basics of COVID in detail. Then they proposed an ML approach for classifying COVID-19 X-rays into COVID-19 and non-COVID-19 classes that outperformed the accuracy of DL models. After some basic pre-processing was done on the input image, the luminance value for each pixel was determined by the luma transform to extract 10,000 features from each image. Then the hybrid social group optimization (HSGO) algorithm was used to select the best set of features (i.e., 116 features) for each image, as it achieved the highest accuracy with the lowest number of features compared to SGO, principal component analysis (PCA), and kernel PCA (KPCA). The feature sets were applied as inputs to 5 ML classifiers: KNN, DT, random forest (RF), support vector classifier (SVC), and linear SVC (L-SVC). The SVC classifier achieved the highest accuracy of 99.65%, outperforming 12 other DL and bio-inspired algorithms for COVID-19 classification.

A novel idea called one-shot learning was proposed and implemented in [69]. To use all necessary information from a limited amount of data of a multi-class problem, the model extracted the best samples from each class by ranking them based on their discrimination ability among classes. Then samples were chosen randomly from the best sample sets to create clusters for the one-shot learning of ensemble classifier. They proposed an ensemble model combining generalized regression neural network (GRNN) and probabilistic neural network (PNN) classifiers. After extracting raw features by the one-shot learning model, GRNN and PNN were individually applied for best performing samples that were used for clustering. The enhanced features were then used for the ensemble model of GRNN and PNN to produce final classification outputs. They used 2, 3, and 4 classes for COVID, normal, viral pneumonia, and bacterial pneumonia and achieved 100%, 80%, and 65% accuracies on average respectively. Their experiments on COVID-19 X-ray dataset showed performances comparable to the popular DNNs AlexNet, GoogleNet, and ResNet18.

A 23-layer CNN was proposed in [70] for COVID-19 CT image classification with a comparative analysis of ML, DL, and texture analysis methods for the task. They included summaries of 7 studies from 2020 on COVID-19/non-COVID-19 classification from CT images with their datasets, methods, test models, and results. At the pre-processing stage of the proposed approach, they applied GLCM, local binary pattern (LBP), and local entropy (LE) on each original image and then used basic data augmentation on these and the original images to increase the dataset 5, 10, and 20 times. The augmented datasets were then fed to their proposed 23-layer CNN containing 1 input layer, 5 blocks of convolution, batch normalization, ReLU, and max-pooling, 1 fully connected layer, 1 softmax layer, and 1 output layer. They experimented on their proposed model with various combinations of data augmentations (i.e., 5, 10, and 20 times) and data combinations (i.e., original, GLCM, LBP, LE) with 2- and 10-fold cross-validation. They applied 2 DL models (AlexNet and MobileNetV2) and 2 ML models (SVM and KNN) with similar data augmentations and combinations to compare with their model. They also compared the model performance with 7 existing COVID-19 CT image classification approaches and achieved comparable performance.
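
As an illustration of one of the texture transforms named above, the following is a minimal sketch of the basic 3×3 local binary pattern (LBP). The neighbour ordering, the `>=` comparison, and the dropping of border pixels are common conventions assumed here; the exact LBP variant used in [70] is not stated in this summary.

```python
def lbp_image(img):
    """Basic 3x3 local binary pattern of a grayscale image.

    img: 2D list of integer intensities. Each interior pixel is replaced
    by an 8-bit code whose bits mark neighbours that are >= the centre.
    Border pixels are dropped, so the output is (h-2) x (w-2).
    """
    h, w = len(img), len(img[0])
    # Neighbours in clockwise order starting at the top-left corner;
    # the k-th neighbour contributes bit k of the code.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    out = [[0] * (w - 2) for _ in range(h - 2)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            centre = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                if img[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            out[r - 1][c - 1] = code
    return out


# Hypothetical 3x3 patch: only the four neighbours larger than the
# centre value 5 (9, 6, 7, 8) set their bits.
patch_code = lbp_image([[1, 2, 3],
                        [8, 5, 4],
                        [7, 6, 9]])[0][0]
```

The resulting code image (or its histogram) can then be used as an additional input channel or feature vector alongside the original CT slice.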

Ismael et al. [71] proposed a novel COVID-19 detection model for CXR images using a hybrid model containing ML, DL, and TL methods. After resizing the input images, pre-trained ResNet18, ResNet50, ResNet101, VGG16, and VGG19 models were fine-tuned and used for deep feature extraction from the images. An SVM model with linear, quadratic, cubic, and Gaussian kernel functions was trained on the extracted deep features. They also experimented with eight local texture descriptor features to compare their effects on the classification. ResNet50 features with a linear kernel SVM achieved the highest accuracy of 94.7% for classifying the images into COVID-19 and normal categories in minimum time (i.e., 48.9 s).

Another similar approach combining ML, DL, and Sobel filtering for COVID-19 detection from X-ray images was proposed in [72]. They reviewed 25 existing approaches on COVID-19 detection from X-ray and CT images with DL models and proposed the scopes of their work. They then generated their own dataset containing verified annotations of 333 X-rays collected at Omid Hospital in Tehran from February 2020 to April 2020. A total of 256 of those images were normal and 77 were COVID positive. The Sobel filtering for detecting the edges from the images was then applied and then they were pre-processed for two CNN models — one with sigmoid output layer and another with SVM output layer that used 10-fold cross-validation. They showed that using Sobel filtering improved the performance of the network and achieved 99.02% accuracy. Their model was also tested with 6 public datasets and generated high accuracy. The proposed method outperformed most of the results of 11 recent DL-based COVID-19 detection approaches.
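
The Sobel edge-detection step can be sketched as below. This is the standard 3×3 Sobel operator with gradient magnitude, written without external libraries for illustration; how [72] handled image borders or combined the edge map with the CNN input is not specified here, so dropping the border pixels is an assumption.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img):
    """Gradient magnitude of a grayscale image using the Sobel operator.

    img: 2D list of numbers. Border pixels are dropped (an assumption),
    so the output is (h-2) x (w-2).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * (w - 2) for _ in range(h - 2)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = gy = 0
            for kr in range(3):
                for kc in range(3):
                    pixel = img[r + kr - 1][c + kc - 1]
                    gx += SOBEL_X[kr][kc] * pixel
                    gy += SOBEL_Y[kr][kc] * pixel
            out[r - 1][c - 1] = math.hypot(gx, gy)
    return out


# Hypothetical 4x4 image with a vertical step edge between columns 1 and 2.
edges = sobel_magnitude([[0, 0, 10, 10] for _ in range(4)])
```

Flat regions produce zero response and the step edge produces a strong one, which is the property that makes the filtered image a useful auxiliary input for lung boundary and opacity cues.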

A novel pre-processing method with fuzzy and stacking techniques was used in a hybrid model in [73] combining ML and DL methods for classifying images into COVID-19, normal, or pneumonia classes. Each input image was reconstructed with fuzzy color algorithm for RGB values and the reconstructed images were added to the original data using stacking algorithm to reduce noise and improve image quality. The complete image set was then fed separately into 2 pre-trained DL models named SqueezeNet and MobileNetV2 for feature extraction. Then a social mimic optimization (SMO) algorithm was used on each feature set to select the best set of features that were then combined to create the feature set of the image. Finally, a multi-class SVM used these features to classify the images into COVID-19, normal, or pneumonia classes. Various combinations of the images and networks showed that the proposed model with fuzzy color and stacking algorithms achieved the highest accuracy of 99.27% compared to the model with features from the individual networks. The system was also computationally cost effective as it used fewer parameters and features.

Elkorany et al. [74] proposed a very similar hybrid of ML, DL, and TL models named COVIDetection-Net, combining two popular pre-trained CNN models for deep feature extraction and a multi-class SVM (MSVM) that used those features to classify images into COVID, normal, viral pneumonia, and bacterial pneumonia classes in 2-, 3-, and 4-class settings. Pre-trained ShuffleNet and SqueezeNet were applied on the datasets separately to generate 544 and 1000 features respectively. All 1544 deep features were then fed to an MSVM for 2-, 3-, and 4-class classification. The comparisons between the results of COVIDetection-Net and individual ShuffleNet and SqueezeNet showed that combining features from both improved the performances for all classes. A comparison between the proposed model and two popular COVID-19 detection models, CoroNet [75] and CovXNet [76], showed 2 to 7% improvements in COVIDetection-Net. They also compared their model with 18 other approaches on COVID-19 detection including 2-, 3-, or 4-class classifications, showing that COVIDetection-Net achieved the highest accuracy of 100%, 99.72%, and 94.44% for 2, 3, and 4 classes respectively.

Bhattacharyya et al. [77] proposed a segmentation and classification system for COVID-19 X-ray images to classify them into normal, COVID, and pneumonia classes. A conditional GAN model was used for lung segmentation to minimize the ROI for classification. After segmenting the lungs from the X-rays, different keypoint extraction methods like SIFT, BRISK, and k-means were used for keypoint extraction and DL models were used for deep feature extraction. The feature sets were then applied for the final classification. The proposed model was compared to 5 similar recent DL-based classification approaches and outperformed all of them with 96.6% accuracy. A Bayesian DL model was used in [78] for classifying CXRs into the same three classes (normal, COVID, and pneumonia) with 96% classification accuracy on similar datasets. They applied the proposed model on a combination of a COVID X-ray dataset and a pneumonia X-ray dataset. The combined dataset was divided into training, testing, and validation sets. A basic 5-layer CNN model was used to extract the feature set from the X-rays and then the Bayesian optimizer was used to tune the hyperparameters to achieve the highest performance of the CNN as a 3-class classifier.

A novel feature fusion model for COVID-19 detection, classification, and segmentation was proposed in [79] that combined 3780 histogram of oriented gradients (HOG) features and 4096 features extracted automatically by a pre-trained CNN (VGG19). The best 1186 features selected by the maximum entropy method were used for COVID-19 classification with another pre-trained VGG19 model. A modified anisotropic diffusion filtering (MADF) algorithm was used for noise removal from COVID-19 positive images. Then the affected lung regions were segmented with the watershed algorithm. The classification achieved 99.49% accuracy, outperforming 19 other DL-based COVID-19 detection models.
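
The feature counts above can be checked with a short worked calculation. The sketch below computes the length of a HOG descriptor from its window and cell layout. The HOG parameters used in [79] are not given in this summary; the 64×128 window with 8×8-pixel cells, 2×2-cell blocks, one-cell stride, and 9 orientation bins is the widely used default configuration and is assumed here only because it yields exactly 3780 values.

```python
def hog_feature_length(width, height, cell=8, block=2, stride_cells=1, bins=9):
    """Length of a HOG descriptor for one detection window.

    width/height: window size in pixels; cell: cell side in pixels;
    block: block side in cells; stride_cells: block stride in cells;
    bins: orientation bins per cell histogram.
    """
    cells_x = width // cell
    cells_y = height // cell
    blocks_x = (cells_x - block) // stride_cells + 1
    blocks_y = (cells_y - block) // stride_cells + 1
    # Each block concatenates the histograms of its block*block cells.
    return blocks_x * blocks_y * block * block * bins


# Assumed default layout: 7 x 15 blocks of 36 values each.
hog_len = hog_feature_length(64, 128)

# Fused vector before selection: HOG features plus the 4096 CNN features.
fused_len = hog_len + 4096
```

Under this assumption the fused vector has 3780 + 4096 = 7876 entries, of which the selection step keeps 1186. The same formula applied to a 128×128 window gives 8100, matching the HOG feature count reported for [67], although that configuration is likewise an inference rather than something stated in the source.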

A TL-based model was proposed in [80] to overcome the limitations of existing COVID-19 detection models by applying Haralick features for texture analysis. The input images were pre-processed with histogram equalization, a Wiener filter, and ROI cropping. Then the Haralick texture features were extracted from the pre-processed images for classification. Pre-trained DL models ResNet50, VGG16, and InceptionV3 were applied to a combination of X-ray and CT images collected from various sources to classify them into 4 classes: normal, COVID-19, viral pneumonia, and bacterial pneumonia. They also provided visualization of the infected lung region using Grad-CAM. The proposed model was tested with individual and combined Haralick features to analyze their effects on the classification, and the results were discussed in detail. The model outperformed 9 existing COVID-19 classification approaches using ML, DL, and TL models with 93% accuracy.

Wang et al. [81] proposed one of the pioneering state-of-the-art DL models for COVID-19 detection, called COVID-Net, based on a projection-expansion-projection-extension (PEPX) design with a human-machine collaboration strategy. They also generated one of the largest COVID-19 X-ray benchmark datasets, called COVIDx, with 13,975 CXRs from 13,870 patients. They designed a two-stage network that included one stage with 1 convolution layer, 16 PEPX blocks, 1 flatten layer, 1 fully connected layer, and a softmax output layer, and a second stage with 4 convolution layers for long range connectivity. Each PEPX block contained 5 layers for projection, expansion, depth-wise convolution, second stage projection, and extension to incorporate all features accurately. The proposed model was compared to VGG-19 and ResNet-50 and outperformed them both by 3 to 10% while achieving 93.3% accuracy.

Another novel 22-layer CNN-based COVID detection model named CoroDet was proposed in [82]. CoroDet contained 9 2D convolutional layers, 9 max pooling layers, 1 flatten layer, 2 dense layers, and 1 layer with a Leaky ReLU activation function. CoroDet was able to classify X-ray and CT images into 2, 3, and 4 classes covering COVID-19, normal, and non-COVID pneumonia (non-COVID viral pneumonia and non-COVID bacterial pneumonia). The model was tested on one of the largest COVID image datasets, COVID-R, and achieved 99.1%, 94.2%, and 91.2% accuracy for the 2, 3, and 4 class problems respectively. COVID-R was generated by collecting, combining, and modifying X-ray and CT images from eight public datasets and was one of the major contributions of their research. The proposed method was compared to 10 similar COVID-19 classification methods and outperformed all of them by margins ranging from 1% to 17%.

Narin et al. [83] proposed a TL CNN-based COVID-19 detection model for classifying CXRs into 4 classes (COVID-19, normal, non-COVID viral pneumonia, and non-COVID bacterial pneumonia) with 3 binary classification models. They used 5 pre-trained CNNs (ResNet50, ResNet101, ResNet152, InceptionV3, and Inception-ResNetV2) with 5-fold cross-validation on 3 datasets. The highest accuracy of 96.1 to 99.7% was achieved with ResNet50. The comparison between their proposed method and 10 similar approaches on 2, 3, and 4 class COVID-19 detection showed that their model outperformed most similar ML and DL approaches for COVID-19 detection from X-ray images.
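Several of the reviewed approaches report results under 5-fold cross-validation. The following is a minimal sketch of how such a split can be produced, not the procedure of any particular paper; `k_fold_indices` is a hypothetical helper, and the sketch does not stratify by class or group images by patient, both of which matter for medical image data.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(n_samples=23, k=5))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

A model is trained once per fold on `train_idx` and scored on `test_idx`, and the fold scores are then averaged.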

Another TL-based model was proposed in [84] using an EfficientNet pre-trained on the ImageNet dataset to optimize the computational cost (i.e., time and memory) of the COVID-19 screening system. They used the EfficientNet with two structures: flat (classification without considering the relationship between classes) and hierarchical (considering the taxonomy of classes). Two different dataset divisions that combined 3 COVID-19 X-ray datasets were used to train and test in normal and cross-dataset environments. Both flat and hierarchical models achieved 93% accuracy with more than 30% higher computational efficiency and a 100% COVID-19 positive prediction rate. These characteristics made the approach efficient across multiple platforms and devices and applicable in developing countries.

Das et al. [85] proposed an ensemble DL-based approach for COVID-19 X-ray classification with DenseNet201, ResNet50V2, and InceptionV3. They used 7 COVID X-ray datasets from different sources to train and test their model. A pre-processed dataset was used to train and test the models separately and the test images were used for loss minimization. The weights for each model were calculated by using 5-fold cross-validation on the test data. Then the ensembler used the average of the weights for the 3 models to generate the final output label (i.e., COVID positive or COVID negative). They compared the outputs of each individual network, their concatenation, and the proposed approach to show that the proposed model achieved the highest accuracy of 91.62% and outperformed the others by at least 1%. They also included a detailed comparative analysis between their method and 8 other recent works according to dataset, evaluation method, accuracy, sensitivity, specificity, classifier, and summary of approaches.

Another DL-based ensemble model named EDL-COVID was proposed in [86]. They designed the model by combining snapshots of a pioneering open source deep CNN for COVID-19 called COVID-Net [81]. The authors used multiple snapshots of the same COVID-Net taken during a single training run to avoid the computation cost of training and testing multiple different networks. To overcome the lack of diversity among snapshots from the same network, they applied a cosine annealing learning rate schedule with aggressive learning rates. After generating 6 models from COVID-Net, the ensembler averaged the class probabilities to generate the final classification results. They tested all individual models and the ensemble on COVID-19 X-ray images and compared the results to show that the ensemble performed better than the individual models in most cases.
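The snapshot idea can be illustrated with a cyclic cosine annealing schedule: the learning rate restarts at a high value at the start of each cycle and decays towards zero, and one snapshot is saved at the end of each cycle. The sketch below is a generic illustration under stated assumptions, not the implementation of [86]; the function names, the step counts, the maximum learning rate, and the unweighted averaging are all assumptions.

```python
import math


def cyclic_cosine_lr(step, total_steps, n_cycles, lr_max):
    """Learning rate at a 0-based training step under cyclic cosine annealing."""
    steps_per_cycle = math.ceil(total_steps / n_cycles)
    position = step % steps_per_cycle  # progress inside the current cycle
    return 0.5 * lr_max * (math.cos(math.pi * position / steps_per_cycle) + 1.0)


def average_probabilities(snapshot_probs):
    """Average per-class probabilities over snapshots; return (mean, argmax)."""
    n = len(snapshot_probs)
    n_classes = len(snapshot_probs[0])
    mean = [sum(p[c] for p in snapshot_probs) / n for c in range(n_classes)]
    return mean, max(range(n_classes), key=mean.__getitem__)


# Six cycles over 600 steps: the rate restarts at steps 0, 100, 200, ...
schedule = [cyclic_cosine_lr(s, 600, 6, 0.1) for s in range(600)]

# Hypothetical outputs of three snapshots for one image and three classes.
mean_probs, predicted = average_probabilities(
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.3, 0.4, 0.3]]
)
print(predicted, mean_probs)
```

The restarts push the network out of the minimum it settled in during the previous cycle, which is what gives the snapshots enough diversity to be worth averaging.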

A similar DL ensemble model of 3 popular CNNs (AlexNet, GoogleNet, ResNet18) with the same name (i.e., EDL-COVID) was proposed in [87], combining CNNs, TL, and majority voting. They discussed the clinical value of lung CT images for COVID-19 detection and summarized 6 relevant approaches. Detailed definitions and frameworks of TL, ensemble models, AlexNet, GoogleNet, and ResNet were provided. A dataset containing normal, lung tumor, and COVID-19 CT images was pre-processed and fed into pre-trained AlexNet, GoogleNet, and ResNet18 separately with 5-fold cross-validation. An ensembler took the outputs from the 3 classifiers and applied relative majority voting to produce the final classification output. The experimental results showed more than 99% classification accuracy with an optimized detection time, indicating that the ensemble model performed better than the individual DL models in terms of accuracy and detection time.
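Relative majority (plurality) voting selects the class predicted by the largest number of classifiers, even when that class does not reach an absolute majority. A minimal sketch follows; the tie-breaking rule (the earliest classifier in the list wins) is an assumption, as is the helper name `plurality_vote`.

```python
from collections import Counter


def plurality_vote(labels):
    """Return the most frequent label; ties go to the earliest voter."""
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label


# Hypothetical predictions of three classifiers for two CT images.
print(plurality_vote(["COVID-19", "normal", "COVID-19"]))
print(plurality_vote(["normal", "lung tumor", "COVID-19"]))
```

With three classifiers and three classes a three-way tie is possible, so the tie-breaking rule is part of the design and should be fixed in advance.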

A comparative analysis of 3 CNN models for classifying X-rays into COVID-19, normal, or pneumonia classes was presented in [88]. They stated that the novelty of the work was using a Leaky ReLU activation function instead of ReLU for all 3 models. They implemented Inception Net V3, Xception, and ResNeXt models with Leaky ReLU and tested them with Kaggle COVID-19 X-ray repository data. A detailed comparison of the models was included, and they concluded that Xception performed best for COVID-19 X-ray classification. Nayak et al. [12] conducted similar research by comparing 8 popular CNN models on X-rays for COVID-19 detection. They applied pre-trained AlexNet, VGG-16, GoogleNet, MobileNet-V2, SqueezeNet, ResNet-34, ResNet-50, and Inception-V3 with various batch sizes (i.e., 8, 16, 32), optimizers (i.e., Adam, SGD, RMSProp, Adadelta), learning rates, and epochs to compare and choose the best network with the best set of parameters. They concluded that ResNet-34 achieved the highest accuracy of 98.33% with a sensitivity of 100%, outperforming 8 recent similar research works.
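The difference between the two activation functions is small but easy to state: ReLU outputs zero for every negative input, while Leaky ReLU keeps a small negative slope so that the gradient does not vanish there. A minimal sketch, where the slope of 0.01 is a common default and an assumption rather than the value used in [88]:

```python
def relu(x):
    """Standard ReLU: negative inputs are clamped to zero."""
    return max(0.0, x)


def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: negative inputs are scaled instead of zeroed."""
    return x if x >= 0 else negative_slope * x


for value in (-2.0, 0.0, 3.0):
    print(value, relu(value), leaky_relu(value))
```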

Degerli et al. [89] not only proposed a DL-based COVID-19 detection and segmentation model, but also provided the largest COVID-19 X-ray dataset at the time, named “QaTa-COV19”, which was also the first dataset with COVID-19 segmentation ground truth mask images generated by a human-machine collaborative method. They used a 2-stage architecture in which a randomly chosen subset was segmented manually and DL models were trained on it to produce segmentations. MDs then chose the best segmentations, which were used at stage 2 for training and cross-validation. Three different DNNs, named U-Net, U-Net++, and Deep layer aggregation (DLA), were used with four pre-trained encoders (i.e., DenseNet-121, CheXNet, Inception-v3, and ResNet-50) in two settings, with a frozen or non-frozen encoder. All 24 combinations of these networks were then used with two different dataset divisions and the results were compared to find the best model. U-Net and U-Net++ with DenseNet-121 achieved the best performance with more than 99% accuracy for both detection and segmentation.
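The count of 24 follows from 3 architectures, 4 encoders, and 2 encoder settings. A short sketch that enumerates the grid (the variable names are illustrative and the strings are labels only, not model constructors):

```python
from itertools import product

architectures = ["U-Net", "U-Net++", "DLA"]
encoders = ["DenseNet-121", "CheXNet", "Inception-v3", "ResNet-50"]
encoder_modes = ["frozen", "non-frozen"]

# Every (architecture, encoder, mode) triple is one experimental configuration.
configurations = list(product(architectures, encoders, encoder_modes))
print(len(configurations))
for architecture, encoder, mode in configurations[:3]:
    print(architecture, encoder, mode)
```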

A novel 14 layer CNN model was proposed in [90] for classifying chest CT and X-ray images into normal, pneumonia, or COVID-19. They built the CNN model with 9 convolution layers, 1 max pooling layer, and 4 fully connected layers with suitable hyper-parameters. Their model was tested on 3 public datasets (i.e., Cohen CXR, RSNA, Radiopaedia) and 1 local dataset from the radiology department of BVHB, Pakistan, to compare the similarities and differences of COVID-19 traits between local and public datasets. The proposed model achieved a score of 96.68%, which either exceeded or was similar to the scores of 7 existing COVID-19 classification approaches based on ML, DL, and hybrid models. The model was also trained on a large dataset and is currently being used at the radiology department of BVHB, Pakistan.

Ozturk et al. [91] proposed a novel DL model called DarkCovidNet for COVID-19 X-ray detection and classification into 2 and 3 classes (i.e., COVID-19, no findings, pneumonia). The proposed model is based on the DarkNet-19 model used in the YOLO (you only look once) [92] object detection system. They built a 19-layer model with DarkNet and convolution blocks including batch normalization, max pooling, and Leaky ReLU activation, with the number of filters gradually increasing as 8, 16, 32, etc. They applied 5-fold cross-validation and provided heat map visualizations with Grad-CAM. An expert radiologist reviewed the performance and results of DarkCovidNet and described it as an outstanding model for COVID-19 detection, a sensitive model for pneumonia detection, and a helpful model because of its heat map visualization, with the drawback of incorrect detections for unclear lung images or X-rays of acute respiratory distress syndrome (ARDS) patients, which produce diffuse images. The network was compared to 9 other COVID-19 detection models; it outperformed all of them for binary classification with 98.08% accuracy and achieved comparable performance for the 3 class problem with 87.02% accuracy.

TL-based COVID-19 detection using multimodal images was discussed in [8] to identify pneumonia in the lungs and detect the pneumonia type (i.e., COVID-19 or non-COVID). Their research was also one of the very few approaches to COVID-19 detection that used lung ultrasound images alongside X-rays and CT scans. A detailed comparative analysis of popular CNN models for COVID-19 classification was provided and 8 CNN models were evaluated. They evaluated pre-trained VGG16, VGG19, ResNet50, Inception V3, Xception, InceptionResNet, DenseNet, and NASNetLarge and compared their performances to choose the best model for COVID classification with the minimum tuning required for the TL task. The F1-scores of the models varied from 0.63 to 0.99 on test data; VGG19 performed better than the others across all data types, and ultrasound images provided more accurate results than the other image types. They also proposed a pre-processing framework to minimize noise and quality imbalance and maximize lung area visibility. After the color images were read, they were converted to grayscale, normalized with N-CLAHE, converted back to color images, and resized for augmentation and training. They discussed the different experiments done on the models with varying parameters and their effects on classification results. The comparative analysis provided guidelines for future COVID-19 detection approaches on the performance of popular CNNs on various image types.

Zebin et al. [17] proposed a framework for COVID-19 detection, synthetic data generation, and progress monitoring using DL-based models, generative adversarial networks (GANs), and activation mappings. They discussed the COVID-19 datasets, their sources, and their label distributions. To address the data limitation issue, they ran a cycle GAN algorithm for 5000 iterations on the normal images from the datasets to generate realistic synthetic COVID-19 images and added 100 synthetic images to the training set. Then pre-trained VGG16, ResNet50, and EfficientNetB0 were used for feature extraction and classification of the images into normal, COVID-19, and non-COVID pneumonia classes. They also used Grad-CAM to generate heatmap visualizations of the COVID-infected lung region of the same patient at intervals to monitor the progress of the disease. The proposed framework was compared to 7 COVID-19 approaches and achieved comparable performance with accuracy varying between 0.88 and 0.96.

Xu et al. [93] proposed a mask-attention based DNN called MANet for COVID X-ray classification. The input dataset was created by combining 3 different datasets to obtain 5 class labels: normal, COVID, TB, viral pneumonia, and bacterial pneumonia. A basic ResUNet was used to segment the lungs from the images and then four different DNNs (i.e., ResNet34, ResNet50, VGG16, InceptionV3) were used with and without the mask attention layers. The performance evaluation clearly showed that the mask-attention models achieved around 2% better performance, and ResNet50 with MA achieved the highest classification accuracy of 97%. An ensemble model with a very similar idea was proposed in [94] for similar COVID-19 X-ray segmentation and classification. A ResUNet model was used to segment the lungs from the X-rays to optimize the ROI for the classifier. The segmented lungs were then passed as inputs to an ensemble model containing ResNet, VGG, and DenseNet. A majority voting method was applied to their outputs to produce the final classifier output and detect COVID-19-infected lungs from the X-rays. The Grad-CAM tool was used for visualization of the lung infections. The proposed model achieved around 77% classification accuracy with a 95% dice score.

A soft attention-based U-Net model was used for COVID infection lesion segmentation from chest CT images in [95]. As U-Net [96] is one of the most popular DNNs for medical image processing, the basic U-Net and its many variations have been used in medical image segmentation research for the last few years with high performance and accuracy. Recent COVID image segmentation approaches have been using U-Net variants for segmenting different infection regions. This research applied soft attention at every layer of U-Net to extract implicit features and enhance the segmentation performance. They used 3 COVID segmentation datasets and extended them by applying spatial, color, and noise augmentation. The proposed model achieved 98.2% accuracy and a 76.3% dice score while outperforming the basic U-Net, U-Net++, SD-UNet, and the basic attention U-Net.
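A gap such as 98.2% accuracy against a 76.3% dice score is typical for lesion segmentation, because background pixels dominate the image and inflate pixel accuracy, whereas the dice score only rewards overlap on the lesion. The toy example below uses made-up masks to show the effect; `dice_and_accuracy` is a hypothetical helper operating on flat lists of 0/1 values.

```python
def dice_and_accuracy(pred, truth):
    """Dice score and pixel accuracy for two flat binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    denominator = 2 * tp + fp + fn
    # Two empty masks agree perfectly, so the score is defined as 1 in that case.
    dice = 2 * tp / denominator if denominator else 1.0
    accuracy = (tp + tn) / len(truth)
    return dice, accuracy


# 100-pixel toy image with a 5-pixel lesion: the prediction recovers 3 lesion
# pixels, misses 2, and marks 1 background pixel as lesion.
truth = [1] * 5 + [0] * 95
pred = [1] * 3 + [0] * 2 + [1] * 1 + [0] * 94
dice, accuracy = dice_and_accuracy(pred, truth)
print(round(dice, 3), accuracy)
```

Here the pixel accuracy is 0.97 while the dice score is only about 0.667, which is why segmentation results are better compared on dice or a similar overlap metric.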

Punn et al. [97] proposed CHS-Net, a hierarchical segmentation model for COVID CT datasets to extract the infection region. They used two cascaded residual attention inception U-Nets (RAIU-Net) to incorporate semantic information. The U-Net used a residual inception model with a hybrid pooling function combining max pooling and spectral pooling, along with skip connections based on spectral, spatial, and depth attention. The proposed model segmented the lung mask and then the infection region with 95% accuracy. A few-shot U-Net model was proposed in [98] for similar infection segmentation from CT images. The model was trained with a small amount of data and then a medical professional provided feedback on the segmented outputs. The feedback was then included in the training process to refine the model. The proposed model was compared to CNN, FCN, and U-Net models and achieved higher accuracy and performance. Another infection segmentation model was proposed in [99] as BS Net (i.e., boundary guided semantic learning network). A dual branch multi level feature extraction and aggregation framework was defined and implemented to incorporate high level features for enhancing the segmentation performance. They achieved an 85% dice score with 84.9% sensitivity and 86.7% precision.

There is other existing research on COVID image analysis for COVID detection, classification, and segmentation. Although almost all types of ML and DL algorithms have already been used for these tasks, the DL models with complex CNN structures, U-Nets, ResNets, InceptionV3, DenseNets, attention-based networks, and their variations achieved better performance for COVID X-ray/CT classification and segmentation. The hybrid, ensemble, and TL models also generated comparable performance scores. The performance metrics differed between studies and their datasets came from various sources. Hence, comparing the performance of these models across different datasets and different metrics would not be a fair comparison. Tables 4, 5, 6, 7, 8, 9, 10, and 11 summarize these approaches with their research task, methods, features, datasets, data type, performance, and contributions.

Table 4 Recent approaches on COVID-19
Table 5 Recent approaches on COVID-19
Table 6 Recent approaches on COVID-19
Table 7 Recent approaches on COVID-19
Table 8 Recent approaches on COVID-19
Table 9 Recent approaches on COVID-19
Table 10 Recent approaches on COVID-19
Table 11 Recent approaches on COVID-19

6 Datasets

As COVID-19 was discovered at the end of 2019 and is still an ongoing pandemic, datasets containing COVID-19 patient information (i.e., images, text, and gene data) are very limited. The available datasets also have limitations such as unavailable labels or ground truth values, improper annotations, and unvalidated evaluation. Among the few available COVID-19 image datasets, the majority include CXRs, some include CT scans, and very few contain lung ultrasounds. Tables 12 and 13 list a few available COVID-19 image datasets with their image types, tasks, ground truth availability and labels, source URLs, etc. Most of these datasets are still growing and a few of them have overlapping images. As most of these datasets are still being updated, their specifications may change in the future. The X-rays, CTs, MRIs, and ultrasounds are mostly 2D or 3D images in various formats such as .png, .jpg, and .dcm, and only a few datasets have recently started including dynamic files. The text data is mostly metadata, case descriptions, and sometimes annotations of the data.

Table 12 Image datasets used in COVID-19 approaches
Table 13 Image datasets used in COVID-19 approaches

7 Challenges and future research scopes

Researchers have been trying to find effective solutions for COVID-19 detection, diagnosis, and analysis using various AI-based tools and applications. They have encountered a few issues while working on automated systems for COVID-19 and have mentioned these limitations and possible solutions. A few additional challenges were identified while reviewing the existing approaches for this survey. Some major challenges and suggestions for future research are mentioned below.

7.1 Challenges

7.1.1 Authenticity of the systems

Due to the emergency pandemic situation, many researchers have been proposing various AI-based systems for COVID-19 detection and analysis. As the medical professionals and patients need accurate automated systems as soon as possible, proper screening for the proposed models is needed to ensure the authenticity of the approaches. Researchers should provide clear justifications and information that can be used to reproduce the results and check the accuracy of the claims of the researchers.

7.1.2 Time constraint

As the COVID-19 pandemic is ongoing, the necessity of accurate automated systems that are able to detect, classify, diagnose, and monitor COVID-19 patients is acute. The urgency of working COVID-19 analysis systems that can be immediately implemented for COVID-19 patients is a very real issue. Another time constraint issue of the automated systems is the execution or analysis time. To stop the rapid progress of the pandemic, the automated systems are required to process patient data as quickly as possible but with correct results.

7.1.3 Lack of a complete automated system

Most approaches have focused on detection of the disease or segmentation of the ROIs. To properly manage the severity of the pandemic, complete automated systems are needed that can separate COVID-19 patients from healthy people and patients with other respiratory diseases, then assess the severity of the disease, detect the exact infected regions, provide a diagnosis, monitor patient progress, and provide a complete and comprehensive report to the medical professionals.

7.1.4 Amount of FPs and FNs

Detecting COVID-19 accurately is necessary to treat the patients; however, the accuracy scores of proposed or existing systems are not good enough to address the severity of the disease. Each automated system needs to check and minimize the number of false positives and false negatives detected by their systems. COVID-19 death rates and contamination rates are very clear indications that the errors in detection of COVID-19 can have serious impact on the ongoing pandemic. So, AI systems with zero false negative and false positive classifications should be one of the main objectives of the automated systems.
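The reason accuracy alone is not sufficient can be shown with a small worked example. The counts below are invented for illustration (a screening of 1000 people of whom 50 are infected) and `detection_rates` is a hypothetical helper; the point is only that a high accuracy can coexist with a substantial false negative rate.

```python
def detection_rates(tp, fp, tn, fn):
    """Common screening metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),          # share of infected people detected
        "specificity": tn / (tn + fp),          # share of healthy people cleared
        "false_negative_rate": fn / (tp + fn),  # share of infected people missed
    }


# Hypothetical screening: 50 infected (40 found, 10 missed), 950 healthy
# (931 cleared, 19 flagged by mistake).
rates = detection_rates(tp=40, fp=19, tn=931, fn=10)
for name, value in rates.items():
    print(name, round(value, 3))
```

In this example the system is 97.1% accurate yet misses one in five infected people, which is why false negatives should be reported and minimized explicitly.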

7.1.5 Applicability to real-time data

Researchers from different countries have been working on COVID-19-based systems using ML, DL, TL, and other methods. But most approaches were tested on a very limited amount of data. These AI-based systems need to be implemented and tested with real-time patient data in hospitals and other medical care facilities treating COVID-19 patients so that they can be fine tuned according to real-time data and can be used in large-scale practical scenarios.

7.1.6 Lack of customized systems

The statistics and news of COVID-19 from all over the world have already shown that the same systems, resources, and measures are not applicable for every country. Developed and developing countries need different types of setups and patients of different variants of COVID-19 require different care. These variations regarding countries, their resources, available healthcare systems, accessibility to treatments, variants of COVID-19, and their severity should be addressed while designing automated systems for various countries.

7.1.7 Lack of available datasets

One of the major limitations of COVID-19 analysis is the limited amount of available datasets. Some of the datasets mentioned in Section 6 are combinations of some other datasets. Lack of sufficient unique datasets of CXRs, CTs, and lung ultrasounds of COVID-19 patients limits the complete and accurate applications of DL models.

7.1.8 Limitations of available datasets

The available COVID-19 datasets have some limitations. Most datasets have very few images from a limited number of patients. The image quality, dimensions, modalities, and consistency often vary among images of the same dataset. The dataset images lack the variation needed to cover all possible abnormal scenarios of COVID-19-infected lungs. Another severe issue with the existing datasets is the lack of ground truths or labels for images. The datasets that do include labels also need thorough verification of the authenticity and accuracy of their labeling.

Most of the limitations of the current research are related to the time constraint. The urgency of the pandemic led to automated systems using TL models for faster disease detection and segmentation with a limited amount of verified data. Researchers did not have sufficient time to experiment with unique ideas and provide more concrete novel approaches. To provide a non-invasive automated COVID-19 detection and segmentation system, most researchers applied a few popular ML, DL, and TL models that had previously achieved high performance scores on other medical or non-medical images. This process neglected the analysis of features unique to COVID-19 medical images and their effect on the classification and segmentation tasks. The time constraint also affected dataset verification. The shortage of correctly annotated and verified datasets from patients of different age, gender, ethnicity, location, etc. made it difficult to provide general and customized automated systems for COVID detection. The time constraint raised some concerns about the validity of the research. Currently, there are hundreds of proposed automated systems tested on small amounts of verified and unverified data that have not been tested on real-time data to establish their authenticity.

7.2 Future research scopes

The challenges mentioned in the previous subsection also suggest some future research scopes in COVID-19 medical image analysis. As the efficiency of the detection, segmentation, and classification of COVID-19 infections is directly related to the availability, quality, completeness, and amount of medical image data, one of the major future research goals can be generating new benchmark datasets by combining, refining, and labeling the available datasets and adding more recent and precise data to them. Medical professionals or radiologists can provide correct annotation labels (i.e., healthy, COVID, and other diseases) for classification datasets and ground truth images with segmented infection regions for segmentation datasets. More advanced datasets may contain the specific types of infections, their severity, etc. Feedback from medical professionals to verify these datasets is also needed for future research. Another possible research idea is to develop more novel algorithms or systems for COVID-19 detection and segmentation. Due to the time constraint, most of the current research is based on known ensemble methods, hybrid methods, TL models, or variations of popular DL/ML models. Although they work well for the current limited datasets, more original research is needed in this field to provide unique perspectives on how to extract and apply the unique features of COVID images. More original research incorporating geographical, environmental, and other specific constraints for different countries can also lead to more practical outcomes or automated systems for non-invasive COVID-19 detection.

8 Conclusion

Researchers from all fields have been trying to contribute in different ways to fight and win the war against the SARS-CoV-2 virus. Researchers have been trying to develop tools and applications to aid medical professionals. In this paper, a review of the recent research on COVID-19 medical image analysis is presented. Basic tasks like COVID detection, COVID-infected lung region segmentation, and severity assessment are solved using CXRs, CTs, etc., and the AI-based systems recently applied to these tasks have been analyzed and summarized here to provide a complete picture of the current research. Although some researchers worked with basic ML models like SVM, RF, DT, and KNN, most recent methods applied DL-based methods like ResNet, Inception, Xception, MobileNet, U-Net, and U-Net++ for COVID-19 image analysis, achieving high accuracy. TL models are also very popular in recent approaches and have been applied to incorporate the advantages of popular DL networks for COVID-19 analysis without spending time implementing and training existing DL models from scratch. Some researchers have provided novel ideas to analyze the tasks from different perspectives and showed comparable performance. A large number of papers showed more than 99% accuracy for COVID-19 detection and classification on the available datasets and most of the performance scores were higher than 80% or 90%. Despite hundreds of publications on COVID-19 medical image analysis reporting almost 100% accuracy, more extensive, customized, and novel experiments in this field are still required due to the challenges mentioned in Section 7. This is an interesting research area with societal and scientific impact. It is important to work on approaches that will reduce the false negatives to the lowest possible level, though zero is the preferred target to avoid further spread of the virus.