DISCOVID: discovering patterns of COVID-19 infection from recovered patients: a case study in Saudi Arabia

Alafif, Tarik; Etaiwi, Alaa; Hawsawi, Yousef; Alrefaei, Abdulmajeed; Albassam, Ayman; Althobaiti, Hassan

doi:10.1007/s41870-022-00973-2

Original Research
Published: 04 July 2022

Volume 14, pages 2825–2838, (2022)
International Journal of Information Technology

Tarik Alafif¹ (ORCID: orcid.org/0000-0001-5991-8826), Alaa Etaiwi², Yousef Hawsawi³, Abdulmajeed Alrefaei⁴, Ayman Albassam¹ & Hassan Althobaiti¹

Abstract

The COVID-19 respiratory syndrome pandemic has become a serious global concern, and a large number of people are still being infected worldwide every day. Discovering COVID-19 infection patterns is important for helping health providers understand the infection factors. Current COVID-19 research works have not yet attempted to discover these infection patterns. In this paper, we employ an Association Rules Apriori (ARA) algorithm to discover the infection patterns from COVID-19 recovered patients’ data. A non-clinical COVID-19 dataset is introduced and analyzed. A sample of recovered patients’ data is manually collected in Saudi Arabia. Our manual computation and experimental results show strong associative rules with high confidence scores among the male, weight above 70 kilograms, height above 160 centimeters, and fever patterns. These patterns are the strongest infection patterns discovered from COVID-19 recovered patients’ data.

1 Introduction

In December 2019, a new virus known as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) emerged in Wuhan city, Hubei province, China [1]. A few weeks later, in January 2020, the whole genome sequence of the SARS-CoV-2 virus was reported in China [1]. The lethal virus belongs to the coronavirus family, and the disease it causes is called COVID-19 [2, 3]. One particular property of the SARS-CoV-2 virus is that it spreads very rapidly; therefore, the World Health Organization (WHO) declared the outbreak a Public Health Emergency of International Concern (PHEIC) [2, 4]. According to Johns Hopkins statistics for the SARS-CoV-2 virus, as of March 20, 2022, more than 468 million cases had been reported with over 6 million deaths worldwide.

In March 2020, the first confirmed case was reported in Saudi Arabia. On June 24, 2021, Saudi Arabia was ranked as the fortieth most affected country worldwide, with over 749,471 confirmed cases and 8,485 SARS-CoV-2-related deaths.

Globally, SARS-CoV-2 infection ranges from no symptoms to mild or severe symptoms as the virus replicates. Patients usually suffer from dry cough, fever, dyspnea, pneumonia, anosmia, fatigue, gastrointestinal manifestations, and ageusia [5, 6].

Data Mining (DM) techniques play a major role in many research fields. These techniques can solve complex tasks and extract useful knowledge from small and big data for many diseases, including infectious diseases [7]. In addition, DM techniques can assist medical practitioners in healthcare systems in the fight against the COVID-19 pandemic.

Our work is motivated by the success of the Association Rules Apriori (ARA) algorithm [8] in finding patterns in data. The ARA algorithm is a popular DM technique rooted in the association rule mining introduced by Agrawal et al. in 1993 [9]. It scans a dataset to identify frequent itemsets and generate associative rules. It was used commercially on business transactions to identify interesting associations and patterns between customers and their most frequently purchased items, using minimum support and minimum confidence thresholds. The patterns are expressed as association rules, each of which consists of an antecedent and a consequent. A famous example of association rule mining is market basket analysis, which helps businesses expand their marketing strategies by revealing which items customers most frequently purchase together. In this research, we employ the ARA algorithm to find the most frequent infection patterns from COVID-19 recovered patients’ data.

Fortunately, a large number of infected people have recovered. This raises several questions: How were these people infected? What are their health patterns? What are the most frequent infection patterns among these recovered patients? Answering these questions scientifically is important for finding and understanding the different infection patterns of COVID-19 patients. Understanding these infection patterns can help medical practitioners characterize the behavior of the infection and can spread public health awareness of the COVID-19 pandemic. Discovering the infection patterns, however, requires infection and health information from recovered COVID-19 patients, and existing public COVID-19 datasets lack this information. Peiffer-Smadja et al. [10] and Alafif et al. [11] support this statement by asserting that machine learning and DM work on COVID-19 requires global collaboration and data sharing. In addition to the lack of datasets with COVID-19 infection information, existing predictive COVID-19 methods such as [12, 13], and [14] focus only on predicting the states of infected patients and the spread of COVID-19, respectively, rather than attempting to find infection patterns. To address these limitations in the existing COVID-19 datasets and methods, we make the following contributions in this research work:

We introduce our COVID-19 infection patterns (IP) dataset which includes a sample of recovered patients’ data collected in Saudi Arabia.
We apply the ARA algorithm to discover associative infection patterns from recovered COVID-19 patients’ data using the IP dataset.
We show manual computation results and verify them with experimental results.
A discussion on the results and their relations with the results of existing works is provided.

The remainder of this paper is organized as follows. In Sect. 2, we describe and analyze the IP dataset. Then, we present our proposed method to discover infection patterns from COVID-19 patients in Sect. 3. Experimental implementation and results are provided in Sect. 4. Then, discussions on the results, limitations, and challenges are provided in Sect. 5. Finally, we conclude our work in Sect. 6.

2 IP dataset

This study is conducted on 131 Saudi and non-Saudi subjects using a survey method. The study includes COVID-19 positive cases who spent a quarantine period in an isolation facility. Negative cases, critical positive cases that were admitted to hospital, non-critical cases that were subject to home isolation, and cases with missing information were excluded from the survey. The survey questionnaire includes biographic data such as age, gender, nationality, smoking status, weight, height, blood type, and sport activity. It also includes medical history such as multivitamin use, medicine usage before infection, medicine usage after infection, symptoms after infection, chronic disease, previous infection with MERS-CoV, previous infection with COVID-19, and period of isolation. The IP dataset is publicly available on Kaggle (see Notes).

The data collected from the questionnaire are tabulated as shown in Table 1. The data are manually processed and converted to a binary format.
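As a rough illustration of the conversion to binary format, the sketch below maps one questionnaire record to 0/1 indicators. The field names, the record values, and the choice of indicators are assumptions made for illustration (the weight and height cut-offs simply mirror the patterns reported later in the paper); the actual encoding used for the IP dataset may differ.

```python
# Hypothetical questionnaire record; field names and values are illustrative only.
record = {
    "gender": "male",
    "weight_kg": 82,
    "height_cm": 175,
    "smoker": False,
    "symptoms": ["fever", "cough"],
}

def to_binary(rec):
    """Map a raw record to binary pattern indicators (1 = pattern present)."""
    return {
        "Male": int(rec["gender"] == "male"),
        "Weight>70": int(rec["weight_kg"] > 70),
        "Height>160": int(rec["height_cm"] > 160),
        "Smoker": int(rec["smoker"]),
        "Fever": int("fever" in rec["symptoms"]),
    }

binary = to_binary(record)
print(binary)
```

Each key of the resulting dictionary plays the role of one pattern, and each patient becomes one row of 0/1 values.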

Table 1 Questionnaire data

Our preliminary analysis of the collected data shows that 21.4% of the study population is under 20 years old while the remaining 83.6% is above 20 years old (33.7% are between 30 and 40 years old and 25.5% are between 20 and 30 years old). Although this indicates that the incidence rate is higher in adults, the obesity factor possibly increases the incidence in younger populations [15]. Figure 1 shows the age distribution in our study population.

Gender is also studied in the dataset. We find that 64.1% of the total subjects are male (84 subjects) while 37.4% are female (49 subjects). A similar predominance of male cases was reported in China in February 2020 [16]. Figure 2 shows the gender distribution in the study population.

In the IP dataset, 43% (56 subjects) have the O+ blood group while 24% (32 subjects) have A+. Our data suggest that subjects with the O+ blood group are more likely to have COVID-19 than subjects with other blood types. However, this finding contradicts a similar study in Germany that excludes such a link [17]. Figure 3 shows the blood group distribution in the study population.

Studies have highlighted an insignificant relation between smoking and infection with COVID-19 [18]. In our study, 23% of the 131 subjects are smokers whereas the remaining 77% are non-smokers. Figure 4 shows the percentage of smoker and non-smoker subjects in the study population.

Moreover, most subjects develop symptoms such as cough, fever, diarrhea, headache, fatigue, and loss of taste and smell, while a few of them do not develop any symptoms, as shown in Fig. 5. We notice that neither the subjects’ history of chronic diseases nor the administration of multivitamins is related to the infection pattern, as supported in [19]. Figures 6 and 7 show the distribution of chronic disease and multivitamin administration in the study population, respectively.

Due to COVID-19 precautionary measures, both surveys and consents are conducted online using smartphones, tablets, and other electronic devices. A soft copy is sent to the subjects as a link. The initial pool consists of 330 subjects. However, difficulties in obtaining consent and conducting the survey, including difficulties in accessing the electronic link and in reaching elderly subjects, limited the sample to 131 subjects. The contributing factors include, but are not limited to, wrong or unresponsive contact numbers, refusal to participate, and subjects’ unfamiliarity with their blood groups, medications, height, and weight. Furthermore, most of the isolation facilities were recently closed and COVID-19 positive cases were directed to home isolation, which prevented us from reaching a larger sample size.

3 Methodology

In this section, we manually employ the ARA algorithm to extract and generate the strong associative rules and correlated patterns using the IP dataset. Figure 8 shows a pipeline of our proposed approach. The goal of using the ARA algorithm is to find the most frequent patterns relating to COVID-19 infection patterns.

Let $P = \lbrace p_{1}, p_{2}, \dots , p_{m} \rbrace$ be the set of binary COVID-19 infection patterns, and let D be the IP dataset. Each record R of an infected patient in D is represented as a binary vector, with $R[n] = 1$ if R has the pattern $p_{n}$ and $R[n] = 0$ otherwise, where n is the index of the pattern in P. Let Q be a set of some patterns in P. We say that a record R satisfies Q if $R[n] = 1$ for every pattern $p_{n}$ in Q.
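The definition of a record satisfying a patternset can be sketched in a few lines of Python. The pattern names and the record below are illustrative placeholders, not values taken from the IP dataset.

```python
# Pattern universe P; the names are illustrative placeholders.
P = ["Male", "Weight>70", "Height>160", "Fever"]

def satisfies(R, Q):
    """Return True if binary record R has R[n] == 1 for every pattern p_n in Q."""
    return all(R[P.index(p)] == 1 for p in Q)

# Hypothetical record with the patterns Male, Height>160, and Fever.
R = [1, 0, 1, 1]

print(satisfies(R, {"Male", "Fever"}))
print(satisfies(R, {"Male", "Weight>70"}))
```

The first call checks a patternset that the record fully contains, and the second checks one that includes a pattern the record lacks.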

By applying the ARA algorithm, we obtain implications of the form $Q \longrightarrow p_{n}$, where Q is a set of some patterns in P and $p_{n}$ is a single pattern in P that is not in Q. The rule $Q \longrightarrow p_{n}$ is satisfied in the dataset D with a confidence threshold C, where $0 \le C \le 1$, if and only if at least a fraction C of the records in D that satisfy Q also satisfy $p_{n}$.

Given the set of infection patterns P, we are interested in generating all the rules that satisfy the minimum support (MinSup) and minimum confidence (MinConf) thresholds. The MinSup threshold helps to find the frequent patternsets in D, where the support of a rule is taken over the union of the patternsets in its antecedent and consequent. Using the MinSup threshold, we generate all combinations of patternsets and select only the patternsets whose support meets the threshold. The support that is compared against the MinSup threshold is computed as follows:

$$\text{Support}(Q) = \frac{\text{count of records } R \text{ in } D \text{ that contain the patternset } Q}{\text{total number of records } R \text{ in } D}$$

(1)
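A minimal sketch of Equation 1 follows. The five records are hypothetical toy data built from the pattern IDs used later in this section; they are not the IP dataset.

```python
# Hypothetical toy records (sets of pattern IDs); NOT the IP dataset.
D = [
    {"I", "L", "M"},
    {"I", "M", "DA"},
    {"I", "L", "M", "DA"},
    {"L", "M"},
    {"I", "DA"},
]

def support(patternset, records):
    """Fraction of records that contain every pattern in patternset."""
    hits = sum(1 for r in records if patternset <= r)
    return hits / len(records)

print(support({"I", "M"}, D))       # 3 of the 5 toy records contain both I and M
print(support({"I", "L", "M"}, D))  # 2 of the 5 toy records contain I, L, and M
```

With a MinSup threshold of 40%, both toy patternsets above would be kept as frequent.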

Then, we apply the MinConf threshold to find the strong rules among all the frequent patternsets in D. The confidence that is compared against the MinConf threshold is computed as follows:

$$\text{Confidence}(Q \longrightarrow p_{n}) = \frac{\text{support count of records containing both } Q \text{ and } p_{n}}{\text{support count of records containing } Q}$$

(2)
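Equation 2 can be sketched in the same way. The records are again hypothetical toy data, and the helper names `count` and `confidence` are ours.

```python
# Hypothetical toy records (sets of pattern IDs); NOT the IP dataset.
D = [
    {"I", "L", "M"},
    {"I", "M", "DA"},
    {"I", "L", "M", "DA"},
    {"L", "M"},
    {"I", "DA"},
]

def count(patternset, records):
    """Support count: number of records containing every pattern in patternset."""
    return sum(1 for r in records if patternset <= r)

def confidence(Q, p, records):
    """Confidence of the rule Q -> p: count(Q and p) / count(Q)."""
    return count(Q | {p}, records) / count(Q, records)

# Rule {I, M} -> DA on the toy data: 2 records contain I, M, DA; 3 contain I, M.
print(round(confidence({"I", "M"}, "DA", D), 3))
```

On this toy data the rule has a confidence of about 0.667, so it would pass a MinConf threshold of 60%.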

To generate strong associative rules, we first need to find all frequent patternsets. The first process is a join step that builds $C_{k}$, the set of candidate k-patternsets, from the frequent patternsets found in the previous pass, where k is the number of elements in the patternset. Then, a pruning process uses the MinSup threshold to reduce $C_{k}$ to $L_{k}$, the set of frequent k-patternsets. The value of the MinSup threshold is selected so that progressively larger frequent patternsets can be obtained. Our approach follows Algorithm 1.
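The join step can be sketched as below. The input `L2` is a hypothetical set of frequent 2-patternsets, and the subset check is the standard Apriori pruning heuristic (every subset of a frequent patternset must itself be frequent), applied before supports are counted; it is an assumption that the manual procedure prunes in the same order.

```python
from itertools import combinations

def join_and_prune(L_prev, k):
    """Build candidate k-patternsets from the frequent (k-1)-patternsets."""
    L_prev = set(L_prev)
    # Join: unite pairs of (k-1)-patternsets whose union has exactly k patterns.
    candidates = {a | b for a in L_prev for b in L_prev if len(a | b) == k}
    # Prune: keep a candidate only if all of its (k-1)-subsets are frequent.
    return {
        c for c in candidates
        if all(frozenset(s) in L_prev for s in combinations(c, k - 1))
    }

# Hypothetical frequent 2-patternsets, for illustration only.
L2 = {frozenset(p) for p in [("I", "L"), ("I", "M"), ("L", "M"), ("I", "DA"), ("M", "DA")]}
C3 = join_and_prune(L2, 3)
print(sorted(sorted(c) for c in C3))
```

The surviving candidates would then be scanned against the dataset and compared with the MinSup threshold to obtain the frequent 3-patternsets.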

The IP dataset in Table 2 contains a patient identification (PID) and a list of Pattern_IDs for each record. Each patient may have a different list of Pattern_IDs. The Pattern_IDs consist of letters in alphabetical order that correspond to the infection patterns p in the Excel sheet. The alphabetical letters are used to ease the manual computation and improve readability (Tables 2 and 3).

Table 2 List of pattern IDs for each patient’s record in the dataset

Table 3 Scanning the dataset for counting each 2-frequent patternset (C2)

First, we scan the whole dataset D in Table 2 to count the support of each candidate in the Pattern_IDs in the first pass. We call the first candidates $C_{1}$, or 1-frequent patternsets, as shown in Table 4. After finding the support count for the $C_{1}$ candidates, the support counts are compared with the MinSup threshold. We set the MinSup threshold to 40% to filter out weak candidates that do not satisfy the threshold. We call the remaining candidates $L_{1}$, as shown in Table 5. The same procedure is followed to compute the 2-frequent patternsets ($C_{2}$, $L_{2}$), 3-frequent patternsets ($C_{3}$, $L_{3}$), and 4-frequent patternsets ($C_{4}$, $L_{4}$), as shown in Tables 3, 6, 7, 8, 9, 10, and 11. The ARA algorithm stops joining and generating further candidates at the fifth pass, since none of the 4-frequent patternsets satisfies the MinSup threshold.
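The pass-by-pass procedure can be sketched as a simple loop. The records below are hypothetical toy data, constructed so that the loop also stops after the 3-patternsets; the function name `frequent_patternsets` is ours, and the sketch counts supports by rescanning the records rather than reproducing the tables.

```python
# Hypothetical toy records (sets of pattern IDs); NOT the IP dataset.
D = [
    {"I", "L", "M"},
    {"I", "M", "DA"},
    {"I", "L", "M", "DA"},
    {"L", "M"},
    {"I", "DA"},
]

def frequent_patternsets(records, min_sup):
    """Return {k: L_k}, the frequent k-patternsets found pass by pass."""
    n = len(records)

    def sup(c):
        return sum(1 for r in records if c <= r) / n

    # Pass 1: every single pattern is a candidate (C1); keep the frequent ones (L1).
    patterns = {p for r in records for p in r}
    level = {frozenset([p]) for p in patterns if sup(frozenset([p])) >= min_sup}
    frequent = {}
    k = 1
    while level:
        frequent[k] = level
        k += 1
        # Join L_{k-1} with itself to get C_k, then filter by MinSup to get L_k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        level = {c for c in candidates if sup(c) >= min_sup}
    return frequent

frequent = frequent_patternsets(D, min_sup=0.4)
for k, L_k in frequent.items():
    print(k, sorted(sorted(c) for c in L_k))
```

On the toy data the only 4-patternset candidate falls below the 40% threshold, so the loop ends with the 3-patternsets as the largest frequent patternsets.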

Table 4 Scanning the dataset for counting each 1-frequent patternset (C1)

Table 5 Comparing the support count of 1-frequent patternset (L1) with the MinSup count

Table 6 Comparing the support count of 2-frequent patternset (L2) with the MinSup count

Table 7 Generating 3-frequent patternset C3 from L2

Table 8 Scanning the dataset for counting each 3-frequent patternset C3

Table 9 Comparing the support count of 3-frequent patternset (L3) with the MinSup count

Table 10 Generating 4-frequent patternset C4 from L3

Table 11 Scanning the dataset for counting each 4-frequent patternset C4

After pruning the infrequent patternsets, we reach the 3-frequent patternsets $L_{3}$ that satisfy the MinSup threshold. Then, we generate the association rules: for each frequent patternset l in $L_{3}$, we generate all non-empty proper subsets s of l and output the rules $s \longrightarrow (l-s)$. From Table 9, we see that $L_{3}$ consists of $l_{1} = \lbrace I, L, M \rbrace$ and $l_{2} = \lbrace I, M, DA \rbrace$. Then, we generate all possible rules and use Equation 2 to compute the confidence of each rule, as shown in Table 12.

Table 12 Generated rules from 3-frequent patternsets $L_{3}$

We set the MinConf threshold to 60% to exclude weak generated rules from Table 12. Then, the strong generated rules that satisfy the MinConf threshold are as follows:

$\{I\} \longrightarrow \{L, M\}$
$\{L\} \longrightarrow \{I, M\}$
$\{M\} \longrightarrow \{I, L\}$
$\{I, L\} \longrightarrow \{M\}$
$\{I, M\} \longrightarrow \{L\}$
$\{L, M\} \longrightarrow \{I\}$
$\{I, M\} \longrightarrow \{DA\}$
$\{I, DA\} \longrightarrow \{M\}$
$\{M, DA\} \longrightarrow \{I\}$
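The rule generation and MinConf filtering described above can be sketched as follows. The records are hypothetical toy data, so the confidences and the set of surviving rules are illustrative and do not reproduce Table 12; the function name `strong_rules` is ours.

```python
from itertools import combinations

# Hypothetical toy records (sets of pattern IDs); NOT the IP dataset.
D = [
    {"I", "L", "M"},
    {"I", "M", "DA"},
    {"I", "L", "M", "DA"},
    {"L", "M"},
    {"I", "DA"},
]

def count(patternset, records):
    """Support count: number of records containing every pattern in patternset."""
    return sum(1 for r in records if patternset <= r)

def strong_rules(l, records, min_conf):
    """Rules s -> (l - s) over all non-empty proper subsets s of l with conf >= min_conf."""
    l = frozenset(l)
    rules = []
    for size in range(1, len(l)):
        for s in combinations(sorted(l), size):
            s = frozenset(s)
            conf = count(l, records) / count(s, records)  # Equation 2
            if conf >= min_conf:
                rules.append((s, l - s, conf))
    return rules

rules = strong_rules({"I", "M", "DA"}, D, min_conf=0.6)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "->", sorted(consequent), round(conf, 3))
```

Each 3-patternset yields six candidate rules (three with one antecedent pattern and three with two), and only those whose confidence reaches the MinConf threshold are kept.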

We substitute the alphabetical letters with the infection patterns. Then, we obtain the following rules:

$\{Male\} \longrightarrow \{(Weight > 70), (Height > 160)\}$
$\{(Weight > 70)\} \longrightarrow \{Male, (Height > 160)\}$
$\{(Height > 160)\} \longrightarrow \{Male, (Weight > 70)\}$
$\{Male, (Weight > 70)\} \longrightarrow \{(Height > 160)\}$
$\{Male, (Height > 160)\} \longrightarrow \{(Weight > 70)\}$
$\{(Weight > 70), (Height > 160)\} \longrightarrow \{Male\}$
$\{Male, (Height > 160)\} \longrightarrow \{Fever\}$
$\{Male, Fever\} \longrightarrow \{(Height > 160)\}$
$\{(Height > 160), Fever\} \longrightarrow \{Male\}$

As shown by the generated associative rules above, we obtain nine strong rules. We discover four strong COVID-19 infection patterns from COVID-19 recovered patients. The infection patterns are Male, $(Weight > 70)$, $(Height > 160)$, and Fever. These are the most frequent and strongest patterns obtained by the ARA algorithm.

From the strong associative rules generated above, one can notice that the first six rules share the same patterns: $Male$, $(Weight > 70)$, and $(Height > 160)$. The last three rules contain the Fever pattern in addition to the Male and $(Height > 160)$ patterns.

From the rule $\{Male\} \longrightarrow \{(Weight > 70), (Height > 160)\}$, if a recovered patient is male, he tends to have a weight of more than 70 kilograms and a height of more than 160 centimeters. Similarly, the rules $\{(Weight > 70)\} \longrightarrow \{Male, (Height > 160)\}$, $\{(Height > 160)\} \longrightarrow \{Male, (Weight > 70)\}$, $\{Male, (Weight > 70)\} \longrightarrow \{(Height > 160)\}$, $\{Male, (Height > 160)\} \longrightarrow \{(Weight > 70)\}$, and $\{(Weight > 70), (Height > 160)\} \longrightarrow \{Male\}$ involve the same patterns, which are strongly associated with each other.

From the rule $\{Male, (Height > 160)\} \longrightarrow \{Fever\}$, if a recovered patient is male and his height is more than 160 centimeters, he tends to have the fever pattern. Similarly, the rules $\{Male, Fever\} \longrightarrow \{(Height > 160)\}$ and $\{(Height > 160), Fever\} \longrightarrow \{Male\}$ involve the same patterns, which are strongly associated with each other.

4 Experiment

The ARA algorithm experiment is conducted using Google Colaboratory with an Apriori Python library, on a 64-bit Windows 10 Pro machine with an 8-core Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz and 32.0 GB of RAM.

The goal of the experiment is to verify the results obtained from our manual computation. The MinSup and the MinConf thresholds are used in the experiment. The same results of the generated strong associative rules are obtained in addition to the non-smoking pattern. The experiment runs in 1.1 seconds.

5 Discussion

The ARA algorithm is robust in finding and extracting patterns from data with the use of minimum support and minimum confidence thresholds. The ARA algorithm has shown strong associative COVID-19 infection patterns in both the manual and the experimental computations using the IP dataset. As a result, male, weight above 70 kilograms, height above 160 centimeters, and fever are the strongest COVID-19 infection patterns among the recovered patients.

To relate our findings to the medical literature, we note that several medical research works have associated certain patterns with COVID-19 infection. In China, the severity of symptoms in active COVID-19 cases and the mortality tendency are about two and a half times higher in males than in females [20]. In Italy, 70% of COVID-19 mortality is seen in men, while the death rate in Wuhan is 1.1% higher in males than in females [21, 22]. Although the ministry of health reported on the 20th of April that 55% of the reported active cases are females, we believe that this is related to public compliance with precautionary measures rather than a significant link to COVID-19. A similar finding was recorded in Oman in May 2020, where 70% of active cases are men as well [23]. Our generated associative rules have detected a pattern linking height and weight to COVID-19 infection. In 2019, the CDC reported that obesity increases the risk of severe illness from COVID-19 and that being overweight triples the risk of hospitalization [24]. Kompaniyets et al. [25] reported that the risks of hospitalization, intensive care unit admission, invasive mechanical ventilation, and death increase with increasing Body Mass Index (BMI). Our finding in the current study is consistent with these previous studies: we find that overweight individuals have an increased risk of COVID-19 infection. Other studies identified a BMI above 25 kg/m² as a strong predisposing factor for COVID-19 infection [26].
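For reference, BMI combines the weight and height attributes that appear in the generated rules. The worked example below is only an illustration: it takes the two rule thresholds, 70 kg and 1.60 m, as a hypothetical individual rather than a patient from the IP dataset, and the rules themselves state lower bounds on weight and height, not specific BMI values.

$$\text{BMI} = \frac{\text{weight (kg)}}{\text{height (m)}^{2}}, \qquad \frac{70}{1.60^{2}} = \frac{70}{2.56} \approx 27.3 \;\text{kg/m}^{2}$$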

Our results demonstrate a link between COVID-19 infection and non-smokers in our population. Although the same was observed in the UK, that study found that the mortality rate is higher in infected smokers than in infected non-smokers [27]. Furthermore, smoking is considered a risk factor for the progression of COVID-19 symptoms [28].

In terms of the height pattern, our study shows that tall individuals are more likely to have COVID-19. This finding is in line with the study by Desmet et al., which reports that individuals taller than 182 cm (6 ft) are about twice as likely to be diagnosed with COVID-19 [29].

Knowing the first symptom of COVID-19 infection plays an important role in preventing the spread of the virus and in planning the patient’s treatment. Several clinical studies have shown that almost all infected patients reported fever as the first symptom of the COVID-19 disease [30, 31]. For example, in China, 98% of infected individuals suffered from fever, and in Italy, more than 72% reported the same symptom [32]. Furthermore, the Centers for Disease Control and Prevention in the US lists fever or chills as the first symptoms, appearing 2-14 days after infection by the virus (https://www.cdc.gov/coronavirus/2019-ncov/symptoms-testing/symptoms.html). Also, fever lasts longer in infected individuals who are admitted to intensive care units after the start of the COVID-19 symptoms [33]. Clearly, fever should be considered the first sign or indicator of the onset of COVID-19 infection.

6 Conclusion

Discovering infection patterns is important for helping health providers understand the infection factors. In this research work, we propose to use the ARA algorithm to discover infection patterns from recovered COVID-19 patients’ data. A non-clinical COVID-19 dataset, which consists of COVID-19 infection patterns, is introduced and analyzed. A sample of recovered patients’ data is manually collected in Saudi Arabia. Our manual computation and experimental results show strong associative rules with high confidence scores among the male, weight above 70 kilograms, height above 160 centimeters, and fever patterns. These patterns are the strongest infection patterns discovered from COVID-19 recovered patients’ data. In the future, we will work on the discovery of COVID-19 recovery and death patterns.

Availability of data and materials

Applicable.

Code availability

Not applicable.

Notes

https://www.kaggle.com/tarikalafif/covid19-infection-patterns-dataset

References

1. World Health Organization (2020) Coronavirus disease 2019 (COVID-19): situation report, 72
2. Phelan AL, Katz R, Gostin LO (2020) The novel coronavirus originating in Wuhan, China: challenges for global health governance. JAMA 323(8):709–710
3. Zhu N, Zhang D, Wang W, Li X, Yang B, Song J, Zhao X, Huang B, Shi W, Lu R et al (2020) A novel coronavirus from patients with pneumonia in China, 2019. N Engl J Med 382:727–733
4. Ebrahim SH, Memish ZA (2020) Covid-19: preparing for superspreader potential among Umrah pilgrims to Saudi Arabia. Lancet (London, England) 395(10227):48
5. Hoffman T, Nissen K, Krambrich J, Rönnberg B, Akaberi D, Esmaeilzadeh M, Salaneck E, Lindahl J, Lundkvist Å (2020) Evaluation of a COVID-19 IGM and IGG rapid test; an efficient tool for assessment of past exposure to SARS-COV-2. Infect Ecol Epidemiol 10(1):1754538
6. Long Q-X, Tang X-J, Shi Q-L, Li Q, Deng H-J, Yuan J, Hu J-L, Xu W, Zhang Y, Lv F-J et al (2020) Clinical and immunological assessment of asymptomatic SARS-COV-2 infections. Nat Med 26(8):1200–1204
7. Peiffer-Smadja N, Rawson TM, Ahmad R, Buchard A, Pantelis G, Lescure F-X, Birgand G, Holmes AH (2019) Machine learning for clinical decision support in infectious diseases: a narrative review of current applications. Clin Microbiol Infect 26:584–595
8. Agrawal R, Srikant R et al (1994) Fast algorithms for mining association rules. Proc Int Conf Very Large Data Bases 1215:487–499
9. Agrawal R, Imieliński T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, pp 207–216
10. Peiffer-Smadja N, Maatoug R, Lescure F-X, D’Ortenzio E, Pineau J, King J-R (2020) Machine learning for COVID-19 needs global collaboration and data-sharing. Nat Mach Intell 2:293–294
11. Alafif T, Tehame AM, Bajaba S, Barnawi A, Zia S (2021) Machine and deep learning towards COVID-19 diagnosis and treatment: survey, challenges, and future directions. Int J Environ Res Public Health 18(3):1117
12. Alafif T, Alotaibi R, Albassam A, Almudhayyani A (2021) On the prediction of isolation, release, and decease states for COVID-19 patients: a case study in South Korea. ISA Trans. https://doi.org/10.1016/j.isatra.2020.12.053
13. Iwendi C, Bashir AK, Peshkar A, Sujatha R, Chatterjee JM, Pasupuleti S, Mishra R, Pillai S, Jo O (2020) COVID-19 patient health prediction using boosted random forest algorithm. Front Public Health 8:357
14. Sujath R, Chatterjee JM, Hassanien AE (2020) A machine learning forecasting model for COVID-19 pandemic in India. Stoch Environ Res Risk Assess 34:959–972
15. Kass DA, Duggal P, Cingolani O (2020) Obesity could shift severe COVID-19 disease to younger ages. Lancet (London, England) 395:1544–1545
16. Zhang J-J, Dong X, Cao Y-Y, Yuan Y-D, Yang Y-B, Yan Y-Q, Akdis CA, Gao Y-D (2020) Clinical characteristics of 140 patients infected with SARS-COV-2 in Wuhan, China. Allergy 75:1730–1741
17. Latz CA, DeCarlo C, Boitano L, Png CM, Patell R, Conrad MF, Eagleton M, Dua A (2020) Blood type and outcomes in patients with COVID-19. Ann Hematol 99(9):2113–2118
18. Cai H (2020) Sex difference and smoking predisposition in patients with COVID-19. Lancet Respir Med 8(4):20
19. Singhal T (2020) A review of coronavirus disease-2019 (COVID-19). Indian J Pediatr 87:281–286
20. Jin J-M, Bai P, He W, Wu F, Liu X-F, Han D-M, Liu S, Yang J-K (2020) Gender differences in patients with COVID-19: focus on severity and mortality. Front Public Health 8:152
21. Remuzzi A, Remuzzi G (2020) COVID-19 and Italy: what next? Lancet 395(10231):1225–1228
22. Groban L, Wang H, Sun X, Ahmad S, Ferrario CM (2020) Is sex a determinant of COVID-19 infection? Truth or myth? Curr Hypertens Rep 22(9):1–12
23. Khamis F, Al Rashidi B, Al-Zakwani I, Al Wahaibi AH, Al Awaidy ST (2020) Epidemiology of COVID19 infection in Oman: analysis of the first 1304 cases. Oman Med J 35(3):141
24. Simonnet A, Chetboun M, Poissy J, Raverdy V, Noulette J, Duhamel A, Labreuche J, Mathieu D, Pattou F, Jourdain M et al (2020) High prevalence of obesity in severe acute respiratory syndrome coronavirus-2 (SARS-COV-2) requiring invasive mechanical ventilation. Obesity 28(7):1195–1199
25. Kompaniyets L, Goodman AB, Belay B, Freedman DS, Sucosky MS, Lange SJ, Gundlapalli AV, Boehmer TK, Blanck HM (2021) Body mass index and risk for COVID-19-related hospitalization, intensive care unit admission, invasive mechanical ventilation, and death - United States, March-December 2020. Morb Mortal Wkly Rep 70(10):355
26. Ranjan P, Kumar A, Chowdhury S, Pandey S, Choudhary A, Bhattacharya A, Singh A, Pandey RM, Wig N, Vikram NK (2020) Is excess weight a risk factor for the development of COVID 19 infection? A preliminary report from India. Diabetes Metab Syndr 14(6):1805–1807
27. Prats-Uribe A, Xie J, Prieto-Alhambra D, Petersen I (2021) Smoking and COVID-19 infection and related mortality: a prospective cohort analysis of UK biobank data. Clin Epidemiol 13:357
28. Patanavanich R, Glantz SA (2020) Smoking is associated with COVID-19 progression: a meta-analysis. Nicotine Tob Res 22(9):1653–1656
29. Desmet K, Wacziarg R (2020) Understanding spatial variation in COVID-19 across the United States. NBER Working Paper 27329
30. Chhikara BS, Rathi B, Singh J, Poonam F (2020) Corona virus SARS-COV-2 disease COVID-19: infection, prevention and clinical advances of the prospective chemical drug therapeutics. Chem Biol Lett 7(1):63–72
31. Huang C, Wang Y, Li X, Ren L, Zhao J, Hu Y, Zhang L, Fan G, Xu J, Gu X et al (2020) Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 395(10223):497–506
32. Chen J, Lu H, Melino G, Boccia S, Piacentini M, Ricciardi W, Wang Y, Shi Y, Zhu T (2020) COVID-19 infection: the China and Italy perspectives. Cell Death Disease 11(6):1–17
33. Chen J, Qi T, Liu L, Ling Y, Qian Z, Li T, Li F, Xu Q, Zhang Y, Xu S et al (2020) Clinical progression of patients with COVID-19 in Shanghai, China. J Infect 80(5):1–6

Funding

This work is funded by Research and Development Grants Program for National Research Institutions and Centers (GRANTS), Target Research Program, Infectious Disease Research Grant Program, King Abdulaziz City for Science and Technology (KACST), Kingdom of Saudi Arabia, grant number (5-20-01-007-0001).

Author information

Tarik Alafif and Alaa Etaiwi contributed equally to this work.

Authors and Affiliations

Computer Science Department, Jamoum University College, Umm Al-Qura University, Jamoum, 25375, Makkah, Saudi Arabia
Tarik Alafif, Ayman Albassam & Hassan Althobaiti
Pathology and laboratory medicine Department, King Faisal Specialist Hospital and Research Center, Jeddah, 21499, Makkah, Saudi Arabia
Alaa Etaiwi
Saudi Human Genome program-Jeddah Satellite Laboratory, Research Center, King Faisal Specialist Hospital and Research Center, Jeddah, 21499, Makkah, Saudi Arabia
Yousef Hawsawi
Biology Department, Umm Al-Qura University, Jamoum, 25375, Makkah, Saudi Arabia
Abdulmajeed Alrefaei

Contributions

Conceptualization, Tarik Alafif; Data curation, Alaa Etaiwi; Formal analysis, Tarik Alafif; Funding acquisition, Tarik Alafif; Investigation, Yousef Hawsawi, Abdulmajeed Alrefaei, and Ayman Albassam; Methodology, Tarik Alafif; Project administration, Tarik Alafif; Resources, Alaa Etaiwi and Yousef Hawsawi; Software, Hassan Althobaiti; Supervision, Tarik Alafif; Validation, Tarik Alafif; Visualization, Alaa Etaiwi; Writing - original draft, Tarik Alafif, Alaa Etaiwi, Abdulmajeed Alrefaei, and Yousef Hawsawi; Writing-review & editing, Ayman Albassam.

Corresponding author

Correspondence to Tarik Alafif.

Ethics declarations

Conflict of interest

Not applicable.

Ethical approval

Applicable.

Consent to participate

Applicable.

Consent for publication

Applicable.

About this article

Cite this article

Alafif, T., Etaiwi, A., Hawsawi, Y. et al. DISCOVID: discovering patterns of COVID-19 infection from recovered patients: a case study in Saudi Arabia. Int. j. inf. tecnol. 14, 2825–2838 (2022). https://doi.org/10.1007/s41870-022-00973-2

Received: 05 February 2022
Accepted: 25 April 2022
Published: 04 July 2022
Issue Date: October 2022
DOI: https://doi.org/10.1007/s41870-022-00973-2
