1 Introduction

Coronavirus disease 2019 (COVID-19) is a communicable disease caused by severe acute respiratory syndrome coronavirus 2. As of 15 June 2021, the disease had infected over 175.84 million people and caused 3.80 million deaths globally [1]. First identified in December 2019 in Wuhan, the disease spreads mainly through fomites and through small droplets produced when an infected person coughs, sneezes, or talks [2]. The coronavirus travels to the deeper regions of the lungs and causes inflammation, leading to difficulty in breathing (dyspnea) and possibly pneumonia. The World Health Organization declared the 2019–2020 coronavirus outbreak a public health emergency of international concern on 30 January 2020 [3] and a pandemic on 11 March 2020 [4]. Governments across the world have adopted various disease control protocols to stop the spread of the disease. However, most of these protocols are prone to breaches, since tracing suspected patients and monitoring infected patients are challenging.

Since the disease spreads rapidly, breaking the chain of transmission is of prime importance. A few countries have adopted the “T3” strategy: “Test”, “Trace”, and “Treat”. In this method, people with symptoms and those suspected of exposure to the virus are tested and, if found positive, their activities are backtracked to trace the individuals with whom they have come in contact. The strategy was initially used for surveillance of diseases such as malaria and has been adapted by various countries for COVID-19 [5]. It enables targeted testing rather than testing people randomly across a large country. Infected patients are then continuously monitored in quarantine wards until they test negative. This method requires a system to visualize the progress made by the infected patients and the people who have met them. To meet the need for technology to trace individuals, the Government of India released a mobile application named “Aarogya Setu” that leverages Wi-Fi, GPS, and Bluetooth to track the movement of a possibly infected person and their interactions with other people [6].

In this regard, several governments across the globe implemented a complete lockdown in their countries, thereby banning free movement of people in public places except for essential purposes. During such lockdowns, three levels of disease transmission control protocols are generally followed [7].

  • The first level (L1) is self-quarantine or self-isolation at home, where people with symptoms resembling those of COVID-19 isolate themselves from the public on moral grounds.

  • The second level (L2) is quarantine in special wards, such as a mobile quarantine ward located outside the hospital, where the movement of people who were exposed to the disease is strictly restricted. Front-line healthcare professionals such as physicians and nursing staff visit them periodically to monitor their health condition.

  • The third level (L3) is isolation, where the COVID-19-infected patient is completely isolated and offered treatment in quarantine wards (in hospital), with or without the support of life-assistive devices.

In L2 and L3, the primary points of contact for patients are mostly the healthcare workers (front-line healthcare professionals such as physicians and nursing staff, and other supporting workers such as sanitation workers), who are easily exposed to the disease [8]. The disease can spread from an infected person to a healthcare worker even if the infected person takes all the essential precautions [9]. If a front-line healthcare professional contracts an infectious disease, it is likely to spread to other healthcare workers or vulnerable patients within the healthcare facility [10]. These can include patients with a weakened immune system, who may be at risk of grave complications. Hence, healthcare workers often resort to self-isolation and social distancing from their family members and the public in general, which can lead to depression over time [11]. Deaths of front-line healthcare professionals due to exposure to the disease have also been reported in several countries [12]. In the USA in particular, over 510,242 healthcare workers have been infected by COVID-19 during this pandemic, and 1653 have died [13]. These numbers keep rising even though governments across the world have provided physicians and other front-line professionals with personal protective equipment and standard operating procedures for patient diagnosis, treatment, and management. These concerns create panic within the healthcare community during this pandemic, which raises the demand for a safe and reliable system for testing and treating COVID-19-infected patients.

To isolate infected people from the public, countries like India have introduced mobile quarantine wards, in which railway coaches are converted into isolation units equipped with ventilators and the necessary support systems [14]. Healthcare workers visit these mobile quarantine wards periodically to monitor the condition of the infected patients, to obtain a second opinion in critical situations, and to update the senior physicians and officials at the main specialty hospitals. The practice of mobile quarantine wards faces several internal challenges. The major hindrance is the difficulty of properly maintaining and monitoring the patient’s health records during the quarantine period [15]. Although mobile quarantine wards help to isolate people from the rest of the public, they do not eliminate contact between the patient and the front-line healthcare professionals. Moreover, the data obtained from patients are usually comprehensive and narrative, which makes it difficult to interpret and visualize the various parameters of a patient for comparison and faster treatment.

To address these issues, we propose a novel framework that integrates a near-field communication (NFC) chip and a smart cloud-based analytical tool to achieve a contactless or minimum-contact patient monitoring and clinical decision support system. The NFC chip enables the medical officer to easily retrieve the medical history of the corresponding patient from the cloud database. There are several advantages to using NFC chips, which can be a viable alternative to the wristbands presently in use.

  • Firstly, every patient is given a separate NFC chip that acts as a linking medium to the patient’s database which minimizes the interaction of front-line healthcare professional with the patient [16,17,18].

  • Secondly, since every patient has a unique NFC chip, the chance for misdiagnosis is reduced. Throughout the history of healthcare, the failure to properly identify the patients has led to severe medication errors, transfusion errors, testing errors, and wrong person procedures. Cases of patient misidentification and misdiagnosis have been reported in countries such as the UK and the USA [19, 20].

  • Finally, the usage of NFC chips solves the issue of protecting the physician from unwanted exposure to the infected patient.

The above reasons make NFC tags a viable solution in the previously discussed “Test” and “Treat” methods in “T3”. NFC tags can be extensively used for people in L2 and L3.

The cloud-based smart structuring tool assists healthcare professionals in organizing the data effortlessly and integrating it with the existing patient database. This tool leverages “Keyword Extraction” techniques and natural language processing (NLP) to convert narrative clinical texts into a structured format. It also has an inbuilt optical character recognition (OCR) application that converts text within images (medical reports) into digital format [21]. Traditional medical records are available as hard copies, which makes their data difficult to integrate with the database. Thus, the proposed system, which integrates NFC, NLP, and OCR technologies, can help physicians monitor the progress of the people quarantined in L2 and the treatment progress of the patients in L3 with minimum interaction, and prescribe an appropriate treatment plan.

2 Methodology

The proposed system for monitoring quarantined and isolated patients in L2 and L3, respectively, processes the input data, which can be either structured or unstructured, in a stepwise manner. These steps lead to better visualization of the data and easy monitoring of the progress of the patient under quarantine. The proposed system mainly focuses on the testing and treating stages of the “T3” strategy. People who have tested positive must be monitored continuously for the development of symptoms, whereas infected patients undergoing treatment must be monitored for signs of recovery. The clinical workflow and detailed process flow of the proposed pandemic tele-health system are depicted in Figs. 1 and 2, respectively.

Fig. 1

The clinical workflow of the proposed pandemic tele-health system with comparison of conventional workflow

Fig. 2

The process flow of the proposed cloud-based pandemic tele-health system in which the patients are to be monitored at a mobile and a remote quarantine ward

2.1 Architectural flow

2.1.1 Contactless connection of patient record using the NFC chip

The NFC chip or NFC tag used in this system acts as a connecting tool to the patient’s cloud database. Every patient in the mobile quarantine ward is given an NFC chip that links to that patient’s database. In contrast to the current practice, in which patients are given wristbands for identification, the front-line healthcare professional need not come in contact with the infected patient, since the NFC chip can be placed near the patient’s bedside. The proposed passively powered NFC tag (NTAG, NXP, Netherlands) works seamlessly with NXP’s NFC reader ICs, which are used in more than 90% of all NFC-equipped mobile handset models [22]. NTAG21x NFC solutions can store NFC data exchange format (NDEF) formatted data, making them fully compatible with every NFC-enabled phone for contactless smart cards.

The major reason for using an NFC tag rather than similar technologies such as long-distance RFID chips is that most commercially available smartphones have inbuilt NFC reading capabilities. The inbuilt NFC reader eliminates the need to purchase an RFID reading module, making the proposed model more feasible and economical. Another reason is that long-distance RFID tags are very likely to suffer interference from other nearby RFID tags, which might result in incorrect patient identification.

The front-line healthcare professional can thus connect to the database of the corresponding patient by bringing their NFC-enabled mobile device close to the NFC tag. Figure 3 depicts the process of uploading medical records to the cloud-based server through an NFC-enabled mobile device. From this point, the physician can add records to the system in the form of both structured data and unstructured data, such as narrative text, images of scanned reports, and radiological images and videos. Every time the physician connects with the database through the NFC tag, a new record is created in the cloud along with the corresponding date and time.

Fig. 3

The process of uploading of medical records to the database by the healthcare worker at the mobile quarantine ward by reading the patient’s NFC tag
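The record-creation step described above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory dictionary in place of the cloud database and a hypothetical tag identifier; the names `DATABASE` and `add_record` are illustrative and are not part of the proposed system.

```python
from datetime import datetime, timezone

# Hypothetical in-memory stand-in for the cloud database:
# NFC tag identifier -> the patient entry that the tag links to.
DATABASE = {
    "04A224B1C35E80": {"patient_id": "P-0001", "records": []},
}


def add_record(tag_uid, payload, now=None):
    """Append a timestamped record to the patient linked to the scanned tag."""
    if tag_uid not in DATABASE:
        # An unknown tag must never silently create or overwrite a patient.
        raise KeyError(f"No patient is linked to tag {tag_uid!r}")
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    record = {"timestamp": timestamp, "payload": payload}
    DATABASE[tag_uid]["records"].append(record)
    return record


# Each scan of the bedside tag produces one new dated record.
add_record("04A224B1C35E80", {"note": "SBP 120 mmHg, afebrile"})
```

Rejecting unknown tags, rather than creating a new entry, reflects the aim of avoiding wrong-patient records; a deployed system would additionally need authentication and access control.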

2.1.2 Cloud-based smart structuring tool

The smart cloud-based structuring tool is used to extract text contents present in images of medical reports and structure the general medical data, using NLP to fit them into the database and obtain large volumes of rich medical data.

Conversion of images of medical reports into digital text using OCR

Traditionally, medical reports (such as laboratory reports, radiological findings, and other reports) are predominantly either printed on paper or stored as digital narrative summaries. Most reports printed on paper need to be converted into images before they can be stored in digital health records, and the records must preserve security, privacy, and confidentiality so that the data cannot be intercepted in transmission and are protected against hacker attacks. Converting the text inside an image into a text file is essential for subsequent structuring and seamless storage in the database. In the proposed system, we leverage Microsoft’s OCR engine, which is available as an application programming interface (API) under its Computer Vision suite [23].

Novel structuring of narrative text by extracting keywords using NLP

The unstructured text obtained from the image through OCR and the narrative text entered by the front-line healthcare professional are processed to extract the required keywords so that they can be integrated with the database. The entire paragraph is broken down into sentences using a natural language toolkit (NLTK) function. The words or phrases in each sentence are then marked based on their part of speech (POS) using the rapid automatic keyword extraction (RAKE) algorithm. This is called POS tagging. The identifiers for every field value are then defined, and the tagged words are structured and sorted into their respective fields. A total of 57 medical parameters that are most often noted in patients suffering from COVID-19 are included in this system for automatic detection and extraction.

The proposed novel keyword extraction algorithm involves three steps. It combines POS tagging through the RAKE algorithm with extraction of medical parameter values by identifying and locating keywords through bag of words (BoW). The reason for selecting this algorithm is explained in the Comparison of Keyword Extraction Algorithm subsection of the “Discussion” section. The first step is the selection of clinically relevant data by scissoring paragraphs into lines. This keeps memory occupancy low and execution speed high. In the second step, a rule-based POS tagger (Brill tagger) is applied to the clinical data that were captured and uploaded. In addition, the medical parameters are located using BoW, and their values are searched for in the neighboring text elements. The list of medical parameters along with the keywords is given in Supplemental Table S1. In the third step, the extracted keywords are placed in their respective fields. Identifiers are applied extensively to place each word in the appropriate field, using all the possible combinations of words, which we call keywords. For example, to place the value 120 mmHg in the systolic blood pressure field, we instruct the program to use {Systolic-Blood-Pressure, SBP, BP, etc.} as the identifiers. Similarly, we use identifiers such as {'yo', 'years', 'year', 'old', 'aged', 'age', 'yr', 'yrs'} to find the age of the patient and {'male', 'man', 'boy', 'woman', 'female', 'girl', 'transgender', 'unclassified', 'third-gender'} for gender detection. Doing so improves the performance of the system by increasing the efficiency of tagging. The flowchart of the NLP model is shown in Fig. 4. The keywords can also be adapted to the current COVID-19 pandemic situation, and healthcare workers are encouraged to use medical parameters and serological test findings commonly found in COVID-19.

Fig. 4

Steps involved in “keywords” extraction from the medical report, which was converted into text files from image format using OCR
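The identifier-based location of fields can be illustrated with a minimal sketch. It uses only the identifier sets quoted in the text; the field names, the regular-expression sentence splitter (a simple stand-in for the NLTK function), and the helper names are assumptions made for illustration, and the POS tagging stage is omitted.

```python
import re

# Identifier sets quoted in the text; the field names are hypothetical.
IDENTIFIERS = {
    "systolic_blood_pressure": {"systolic-blood-pressure", "sbp", "bp"},
    "age": {"yo", "years", "year", "old", "aged", "age", "yr", "yrs"},
    "gender": {"male", "man", "boy", "woman", "female", "girl",
               "transgender", "unclassified", "third-gender"},
}


def split_sentences(paragraph):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph)
    return [part.strip() for part in parts if part.strip()]


def tokenize(sentence):
    """Lowercase tokens; hyphens are kept so 'third-gender' stays whole."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", sentence.lower())


def locate_fields(sentence):
    """Map each field to the index of its first identifier in the sentence."""
    found = {}
    for index, token in enumerate(tokenize(sentence)):
        for field, names in IDENTIFIERS.items():
            if token in names and field not in found:
                found[field] = index
    return found


report = "The patient is a 45 yo male. BP was 120 mmHg on arrival."
for sentence in split_sentences(report):
    print(sentence, locate_fields(sentence))
```

Processing one sentence at a time mirrors the first step described above, in which paragraphs are cut into lines before any keyword is searched for.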

To locate the field value, we set a search threshold value. This means that the program searches for the value of the identifier within a range of ± n words. For example, if n = 2, to find the value of systolic blood pressure, the program searches for a number within 2 words on either side of the identifier. The search stops when a delimiter, such as a special character or a conjunction, is encountered. This is illustrated in Fig. 5.

Fig. 5

Extraction of “parameter values” for each medical data through keyword detection and neighbor search algorithm
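The ± n neighbor search can be sketched as follows. This is a minimal illustration under stated assumptions: the set of stop tokens, the choice to search the right-hand side before the left-hand side, and the helper names `tokenize` and `find_value` are hypothetical and are not taken from the implemented system.

```python
import re

# Assumed stop tokens: a few conjunctions and special characters.
STOP_TOKENS = {"and", "or", "but", ",", ";"}

NUMBER = r"\d+(?:\.\d+)?"


def tokenize(text):
    """Split into numbers, words, and single special characters."""
    pattern = NUMBER + r"|[A-Za-z]+(?:-[A-Za-z]+)*|[^\sA-Za-z0-9]"
    return re.findall(pattern, text)


def find_value(tokens, identifier_index, n=2):
    """Return the first number within n tokens of the identifier, or None.

    The right-hand side is searched first, then the left-hand side.
    The search in a direction ends at a stop token or at the text boundary.
    """
    for step in (1, -1):
        for offset in range(1, n + 1):
            position = identifier_index + step * offset
            if position < 0 or position >= len(tokens):
                break
            token = tokens[position]
            if token.lower() in STOP_TOKENS:
                break
            if re.fullmatch(NUMBER, token):
                return float(token)
    return None


tokens = tokenize("SBP is 120 mmHg and pulse 80")
print(find_value(tokens, tokens.index("SBP"), n=2))
```

With n = 2 the value 120 is found two tokens to the right of the identifier SBP; with n = 1 the same call returns None, which shows how the threshold bounds the search.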

After being placed in its respective field, the value is also stored in the database. Defining the locating parameters of every keyword to be extracted is of vital importance. Keeping this in mind, an exhaustive list of identifiers to locate the keywords is deployed, and physicians are encouraged to follow the report structure depicted in Fig. 6. The dataset used throughout this work follows this structure, which was prepared after consultation with several physicians.

Fig. 6

Identification of various keywords from the narrative text of the uploaded medical reports using the POS tagging algorithm, and their representation as tabular data for ease of reading and for adding them to the EMR database

It is important to note that the elements such as date of birth, sex, vital signs, and medications have a limited number of options that are easy to select from a list or enter quickly in an agreed-upon format. But in reality, none of the unstructured data is easily corralled into a format that can be fed into algorithms and fed back to clinicians in an intuitive manner, yet providers desperately need better ways to understand and leverage these critical sources of information to care for their patients.

2.1.3 Integration

The last step before deployment of the system is the integration of the OCR and NLP components into the cloud and the creation of a web application for uploading records on the mobile quarantine ward side and visualizing the data on the hospital side. As the end users are physicians, the user interface (UI) must be as user friendly as possible. The graphical UI of the proposed system is depicted in Figs. 7 and 8.

Fig. 7

User interface design of the proposed application on the quarantine ward end to capture and upload clinical parameters and send to a specialty hospital

Fig. 8

User interface design of the proposed application on the specialty hospital end where medical data are structured and displayed in tabular form, whereas the images can be either viewed or downloaded

On the quarantine ward end, the application shows the date and time at which the previous record was updated, so that front-line healthcare professionals know when it is appropriate to upload the next record and can maintain a regular update frequency. Since the content they upload can be in the form of images or a saved document, they can upload it with the click of a button. If the content to be uploaded is on the mobile device, the user can add more upload fields by clicking on the “add an image/document from device” button. The “capture image from device camera” button enables them to add images directly from their mobile camera. This feature is useful for uploading images of the readings from the support system for patients who are in L3 isolation, or pictures of the patient’s condition.

It is also important to verify if the record is being added to that corresponding patient’s database. This is done by capturing and uploading the facial image of the patient every time a report is taken. The facial picture of the patient that was captured on the day of admission of the patient to the ward is used for comparison with the facial pictures that are captured in the subsequent days of the visit by the healthcare professional. This is done using Microsoft Azure’s Facial Analysis Cognitive Services API [24]. A similarity score is generated between the two images, thereby adding an additional layer of precaution so that there is no wrong patient identification.

At the hospital end, the stack of data is processed, and the previously mentioned OCR and NLP algorithms are applied to obtain structured data that can be visualized as shown in Fig. 8. A hospital with multiple wards, even in multiple locations, can monitor the progress of the patients faster and in a visually appealing manner, since the data are structured and graphical.

Physicians can select the patient to be evaluated and retrieve the patient’s entire medical records from the cloud-based database in a form that is clear and easy to understand, as shown in Fig. 9. To monitor the variation of any medical parameter, such as readings from serological tests or blood pressure, the physician at the specialty hospital can use the smart decision support function to obtain a comparative graph that shows the variation of the selected parameter with time, together with basic statistical parameters such as mean, median, high, low, and standard deviation. This can indicate how well the medication has been working.

Fig. 9

Smart decision support feature in the application plots key health parameters. The plotting feature enables the doctor to track the variations in a particular medical parameter reading taken at different time intervals
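The basic statistical parameters reported alongside the plot can be computed as in the following minimal sketch. The readings are hypothetical, and the function name `summarize` and the (date, value) layout are assumptions made for illustration.

```python
from statistics import mean, median, stdev


def summarize(readings):
    """Summarize (timestamp, value) readings of one medical parameter."""
    values = [value for _, value in readings]
    return {
        "mean": mean(values),
        "median": median(values),
        "high": max(values),
        "low": min(values),
        # The sample standard deviation needs at least two readings.
        "std_dev": stdev(values) if len(values) > 1 else 0.0,
    }


# Hypothetical systolic blood pressure readings (mmHg) on successive days.
sbp = [("2021-06-01", 130), ("2021-06-02", 126),
       ("2021-06-03", 122), ("2021-06-04", 118)]
print(summarize(sbp))
```

A steadily falling series such as this one, read together with its summary, is the kind of trend the plot is meant to make visible.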

The smart decision support option within the proposed web-based application enables easy exchange of instructions between the specialty hospital and the mobile quarantine ward. The front-line healthcare worker in the ward can thus communicate seamlessly with the professional in the hospital, who can advise them on suitable methods and medications. For example, instructions such as a drug or dosage change can be recommended through the real-time feedback feature of the application.

2.2 Implementation and validation approach

2.2.1 Data description

The dataset used in the proposed study consists of 1200 medical case records that were synthetically created by our empanelled clinicians. These medical records were printed on paper, scanned at a resolution of 300 dots per inch, and stored as .bmp images in 8-bit grayscale format. The dataset contains a total of 29,994 medical parameters, an average of about 25 parameters per record. The detailed list of parameters considered for the proposed work is given in Supplemental Table S1.

2.2.2 Validation

Algorithm validation

The next logical step is to evaluate the performance and reliability of the developed system before actual deployment. Validation of the system is done in two stages: once after the medical report image is converted into text and once after the narrative text is structured. This helps detect errors at each step in the cloud-based smart structuring tool and optimize it. The accuracies are computed using Eqs. 1 and 2:

$${\eta }_{\mathrm{OCR}}=\frac{{\mathrm{Data}}_{correct}}{{\mathrm{Data}}_{total}}$$
(1)
$${\eta }_{\mathrm{system}}=\frac{{\mathrm{Data}}_{keywords\_correct}}{{\mathrm{Data}}_{total}}$$
(2)

where $\eta_{\mathrm{OCR}}$ and $\eta_{\mathrm{system}}$ are the accuracies of the OCR engine and the developed system, respectively; $\mathrm{Data}_{total}$ is the total number of input data records; $\mathrm{Data}_{correct}$ is the number of input records in which the data have been recognized correctly by the OCR engine; and $\mathrm{Data}_{keywords\_correct}$ is the number of input records whose values for the given keywords have been extracted accurately.
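As a worked illustration of Eqs. 1 and 2, the following minimal sketch evaluates both ratios for the average record counts reported in Section 3.1 (1195.3 and 1129.3 out of 1200). The function name `accuracy` is an assumption made for illustration.

```python
def accuracy(correct, total):
    """Fraction of input records handled correctly (Eqs. 1 and 2)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


# Average counts over three trials, as reported in Section 3.1.
eta_ocr = accuracy(1195.3, 1200)
eta_system = accuracy(1129.3, 1200)
print(f"OCR: {eta_ocr:.1%}, system: {eta_system:.1%}")
# prints "OCR: 99.6%, system: 94.1%"
```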

The variance of the system is also calculated to measure the reliability, when the system is subjected to repeated measurements. The variance of the system is calculated using Eq. 3:

$$\mathrm{Variance}\ ({\sigma }^{2})=\frac{\sum_{i=1}^{n}{({m}_{i}-\overline{m })}^{2}}{n-1}$$
(3)

where $m_i$ is the number of keywords extracted correctly in each record in attempt $i$, $\overline{m}$ is the mean of these values over all attempts, and $n$ is the number of attempts. Here, $n = 3$.

The other performance measures such as precision, recall, and F1 score are calculated based on Eq. 4 to Eq. 6.

$$\mathrm{Precision}=\frac{\mathrm{True\ Positive}}{\mathrm{True\ Positive}+\mathrm{False\ Positive}}$$
(4)
$$\mathrm{Recall}=\frac{\mathrm{True\ Positive}}{\mathrm{True\ Positive}+\mathrm{False\ Negative}}$$
(5)
$$\mathrm{F}1\ \mathrm{Score}=\frac{2\times \mathrm{Precision}\times \mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}}$$
(6)
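Eqs. 4 to 6 can be evaluated as in the following minimal sketch. The counts are hypothetical, and the interpretation of the three outcomes (a true positive as a keyword value extracted correctly, a false positive as a wrong value extracted, and a false negative as a value that was missed) is an assumption made for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from raw counts, per Eqs. 4 to 6.

    A ratio with an empty denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 90 correct values, 10 wrong values, 30 missed values.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

For these counts the precision is 0.9 and the recall is 0.75, so the F1 score, being their harmonic mean, lies between the two and closer to the lower value.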

Clinical validation

The developed system was validated clinically by physicians who have experience of working in COVID-19 hospitals and have treated infected patients. Four such physicians were identified and agreed to use this system in a mock setup imitating a remote quarantine ward and a hospital. This mock evaluation was based on performance parameters such as system repeatability, user friendliness, accuracy, latency, usefulness, and adaptability and scalability. The physicians rated the system on a scale of 1 to 5, with 1 being the lowest and 5 being the highest. The repeatability was measured using Eqs. 7 to 9:

$$\mathrm{Mean}\ (\overline{m })=\frac{\sum_{i=1}^{n}{m}_{i}}{n}$$
(7)
$$\mathrm{Standard\ Deviation}\ \left(\sigma \right)=\sqrt{\frac{\sum_{i=1}^{n}{({m}_{i}-\overline{m })}^{2}}{n-1}}$$
(8)
$$\mathrm{Standard\ Deviation\ of\ the\ Mean}\ (\mathrm{SDM})=\frac{\sigma }{\sqrt{n}}$$
(9)

where $m_i$ is the number of keywords extracted correctly in a given record in attempt $i$ and $n$ is the number of attempts. Here, $n = 5$. The standard deviation of the mean (SDM) is then calculated and taken as the measure of repeatability of the system. Ideally, the SDM should be zero, which means that the results have zero variance and that the system is highly repeatable.
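The repeatability measures can be computed as in the following minimal sketch, which takes the standard deviation to be the sample standard deviation (the square root of the sample variance). The keyword counts are hypothetical, and the function name `repeatability` is an assumption made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def repeatability(counts):
    """Return (mean, sample standard deviation, SDM) for repeated attempts."""
    n = len(counts)
    if n < 2:
        raise ValueError("at least two attempts are required")
    sigma = stdev(counts)  # sample standard deviation (n - 1 in the denominator)
    return mean(counts), sigma, sigma / sqrt(n)


# Hypothetical counts of correctly extracted keywords over five attempts.
print(repeatability([25, 25, 25, 25, 25]))  # identical attempts give an SDM of zero
print(repeatability([24, 25, 25, 26, 25]))  # any spread gives a non-zero SDM
```

An SDM of zero arises only when every attempt yields the same count, which is the ideal case described above.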

2.2.3 Implementation technology

The proposed web-based application was developed on a 64-bit Windows 10 operating system with an Intel Core i5 processor (2.40 GHz) and 8 GB of RAM. Web pages were created using HTML and integrated with back-end Python code (Python V3.7) through middleware programming. A Firebase database was used for storing and retrieving the data. The application was deployed onto the cloud-based server for testing. The Microsoft Azure OCR API was used for conversion of images into text. As a part of the smart cloud-based structuring tool, NLTK (Version 3) along with our proposed structuring algorithm was used for identification and structuring of records. Graphs and tables were created using CSS and JavaScript.

Separate web pages with login access were designed for front-line healthcare workers to scan and upload patients’ records at the quarantine ward end and for specialist physicians to retrieve and analyze them at the specialty hospital end. The previously mentioned decision support feature with its plot function enables the specialist physician to study the variation of a parameter with time and thereby assess the response of the patient to a particular drug or medication. Any abnormal value in a medical parameter alerts the physician. This is done using the isolation forest algorithm for anomaly detection.

3 Results

A basic layout with the required placeholders was created with the help of the framing option, as depicted in Fig. 10. The functions and positions of every element are clearly specified for illustrative purposes. Relative positioning is used rather than global positioning so that the layout of the website window remains consistent for any screen size. Scroll bars are included for better readability in the case of large text report images.

Fig. 10

Visualization of structured data obtained by proposed smart structuring tool on the specialty hospital end

In Fig. 11, we can notice the use of the chat option built into the application. Through this feature, effective and easy communication can be established between the specialty hospital and the quarantine ward. On the hospital end, the chat option has smart reply features integrated with it. All these enable effortless transfer of messages between the two entities.

Fig. 11

Messaging applet built within the web application to empower communication between the quarantine ward and the specialty hospital. The applet has an inbuilt smart answer suggestion system enabling faster and easier communication on both the ends

3.1 Algorithm validation

A set of 1200 medical case history reports was created under the guidance of a physician to test the efficiency of the OCR engine. Three trials were conducted, in which 1195.3 records on average were correctly recognized, as shown in Table 1. This brings the efficiency of the OCR engine to 99.6%. To test the efficiency of the NLP algorithm, the same 1200 medical case histories were fed directly into the system as text files. The results were then manually compared with the original data. The algorithm was considered successful for a record if and only if all the medical data obtained were accurate. Three trials were conducted, and on average, the keywords of 1129.3 records were extracted and classified correctly. The calculated root mean square coefficient of variation (RMS CV) was 1.3%. The parameters extracted using the smart structuring algorithm were found to be highly reproducible, with an efficiency of about 94%. The memory size of the entire developed system was found to be 1.8 megabytes, which represents a trade-off between accuracy and size. In addition, the number of keywords extracted correctly in each record along with its value was evaluated, and the mean and variance of the system were calculated using the previously mentioned Eq. 3. The system was found to have zero variance. Hence, the developed system can be deemed reliable.

Table 1 Efficiency of the proposed smart structuring algorithm: comparison of parameter extraction in multiple attempts

3.2 Clinical validation

A mock clinical setup was established, and the system was used by four physicians, who then filled out a questionnaire regarding the usage of the proposed system. The physicians uploaded various medical records and tested the system by repeating the upload with the same medical record. They gave their ratings on a scale of 1 to 5, with 1 being the lowest and 5 being the highest. Their ratings were collated, and a mean score was calculated (shown in Supplemental Table S2). On average, the physicians rated the user friendliness of the system at 4.8. The accuracy of the algorithm was rated at 4.6. Although there were no misidentifications of keyword values, in some cases one or two keywords were not identified. The physicians found the computation speed of the system to be quite fast. The inbuilt chat option was found to have a latency of 2.7 s. The processing of the uploaded records (character identification and structuring) took about 4.8 s. All the time metrics were measured by an assistant using a stopwatch. In addition, all 4 physicians felt the need for such a system and marked it as a must-have system for patient monitoring during pandemic-like situations. On average, the physicians rated system adaptability and scalability at 4.2. Since the developed system addresses only those medical parameters that are relevant to the novel coronavirus disease, the physicians felt the need to scale the system to other domains of medicine and surgery.

The repeatability of the system was calculated for the records that were uploaded to the system by the physicians. A single report was uploaded 5 times to verify if all the attempts produced the same result. It was calculated in terms of SDM as seen in Eq. 9. The SDM produced a value of zero. This means that the system is repeatable and would produce the same result when tried any number of times. Hence, the developed system was clinically validated by physicians who felt the need to have it in the hospitals treating COVID-19 patients.

4 Discussion

Front-line healthcare professionals and other healthcare workers at the quarantine ward often visit many patients in a short period of time, which may lead to increased levels of human error. In current clinical practice, a member of the nursing staff notes down the medical parameters of the patient as the physician dictates, and the values are later fed into a healthcare system or database. This introduces a possibility of human error both while noting down the values and while uploading them to the system. The proposed system reduces the proximity of interaction between the healthcare worker and the patients, while increasing the number of patients a single healthcare worker can monitor. Since the developed system makes communication between these professionals faster and more organized, one front-line worker will be able to monitor several patients. This reduces the number of front-line health providers who actually need to work from the mobile quarantine facility, while reducing the possibility of error. Perez et al. have developed a similar NFC-enabled system built specifically for patient care in the ICUs of a hospital [25]. Their system is expected to work well for a smaller set of patients but to be ineffectual in pandemic-like situations, since it requires several front-line professionals (nurse, auxiliary nurse, primary care physician) for data validation, which slows down monitoring. Our developed solution, by contrast, is coupled to a smart structuring and communication tool, which makes it a reliable application to use during pandemic situations owing to its speed and ease of monitoring.

The NLP, along with the novel smart structuring algorithm that forms part of the cloud-based smart structuring tool, proved to be useful for extracting key medical information from the medical texts. In our proposed system, we have taken into consideration about 57 medical parameters that are most commonly noted in patients suffering from COVID-19, including basic information such as name, age, and blood group. Other details that were extracted include diseases, symptoms, and medications. When the physician next visits a particular patient, the same set of tests may be done for the patient. These medical parameters are stored every time a test is taken, and a report is drafted. The physician can then view the history of a particular parameter of the patient with a single click and can better understand the reaction of the body to the previously prescribed drugs. The graphical representation of key health parameters helps the physician to get deeper insights into the present condition of the patient. Vilic et al. leveraged NLP to create a timeline of vital signs for critically ill patients [26]. Their system works well only in situations where all the data are readily available in digital format. Our proposed system, however, also takes in pre-existing hard copies of records, making it a ready-to-deploy system in a wider range of situations.

The proposed system aims to act as an open-ended framework into which several modules can be incorporated in the future. One such module would be the addition of sensors, such as vital-sign monitoring sensors, which can be supported by adjusting and extending the software and hardware architectures. In addition, the proposed system can be extended to several other parameters depending on the need and the field of study and examination. This makes the proposed system need-oriented and scalable depending on the application.

4.1 Comparison of keyword extraction algorithms

It is important to compare the various keyword extraction algorithms to justify the selection of the one proposed in this system. For this, parameters such as accuracy, precision, and recall must be studied. Comparing the performance of the proposed novel combination keyword extraction algorithm with other similar approaches is not an easy task, owing to the lack of medical patient history and data classification systems that use a general-purpose semantic annotator like the one we used. Hence, a comparison with relatively similar systems is made by citing the review works of several researchers. Martin Dostal and Karel Jezek compared graphical methods such as TextRank and statistical methods such as TF*IDF and POS using the RAKE algorithm [27]. They have shown that the RAKE algorithm works better than the others when large datasets are involved. Garcia et al., in their work on classifying biomedical literature using bag of words (BoW), have obtained high accuracy, precision, and recall [28]. They have compared it with other similar systems and have reported better performance. A comparison of text classification systems, with the pros and cons of each, was done by Aliwy and Ameer [29]. Taken together, these studies indicate that the proposed novel combinatorial RAKE and BoW algorithm has relatively higher performance in terms of speed, accuracy, and simplicity in comparison with other algorithms and approaches. In the future, an extensive analysis of the performance of different algorithms is to be made for patient clinical data.
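To make the idea of combining RAKE with BoW concrete, the following is a minimal sketch of one way such a combination could work. It is an illustrative assumption, not the algorithm used in the proposed system. Candidate phrases are scored with the standard RAKE word score (degree divided by frequency) and then filtered against a bag-of-words vocabulary of medical terms. The stop-word list, the vocabulary, and the sample record are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stop-word list; a real deployment would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "is", "was", "with",
              "for", "to", "in", "on", "has", "had"}

def candidate_phrases(text):
    """Split text into candidate phrases at stop words and punctuation."""
    phrases = []
    # Note: splitting on "." also splits decimal numbers; acceptable for a sketch.
    for fragment in re.split(r"[.,;:!?()\n]", text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9]+", fragment):
            if word in STOP_WORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases

def rake_scores(text):
    """Score each phrase as the sum of degree(w) / frequency(w) over its words."""
    phrases = candidate_phrases(text)
    freq = Counter()
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrence degree, counting the word itself
    return {" ".join(p): sum(degree[w] / freq[w] for w in p) for p in phrases}

def bag_of_words_filter(scores, vocabulary):
    """Keep only phrases sharing at least one token with the domain vocabulary."""
    return {p: s for p, s in scores.items() if set(p.split()) & vocabulary}

# Hypothetical vocabulary and record text, for illustration only.
MEDICAL_VOCABULARY = {"fever", "cough", "oxygen", "saturation", "temperature"}
record = "Patient has dry cough and high fever. Oxygen saturation was low."
keywords = bag_of_words_filter(rake_scores(record), MEDICAL_VOCABULARY)
```

In this sketch, RAKE supplies the phrase boundaries and ranking without any training data, while the vocabulary filter restricts the output to domain-relevant terms; both steps are single passes over the text, which is consistent with the emphasis on speed and simplicity.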

4.2 Focusing on “T3” Strategy (“Test”, “Trace”, and “Treat”)

The proposed system will aid in proper implementation of the “T3” strategy, specifically in the treating stage. Tracing of infected individuals can be done using mobile applications such as “Aarogya Setu”, developed by the Government of India. The application creates a user database to form a network of information that can alert citizens and officials to people potentially infected with COVID-19. It uses Bluetooth range as a proximity measure, within which a person is likely to be infected with coronavirus on coming into contact with an infected person. Data is shared between devices when they come into contact with each other, thus enabling the government to be notified for further testing.

Other front-line professionals in quarantine wards can now update the condition of an infected patient while maintaining minimum contact using the NFC tags. The records can be sent to the specialist physicians present in specialty hospitals who can then assess the direction in which the treatment must be taken forward. The records can also be appended to the database so that the progress of a patient can be monitored easily and remotely by the specialist physician. Structuring of medical records is swiftly done by the NLP and novel structuring algorithm. Thus, the proposed system provides a viable solution to the present pandemic situation.

4.3 Use case scenarios

The proposed system can be applied to a wide range of situations. In the case of people in L2, where the exposed person is kept in special quarantine wards, the proposed system can be used to monitor the status of the person to check for symptoms of COVID-19. In the current scenario, in states of India such as West Bengal, monitoring devices such as thermometers are handed over to quarantined people for self-monitoring and reporting; here, the proposed system can be used to visualize and analyze various parameters of the person remotely. Brian and Malcolm reported an NFC-integrated system in which NFC tags are given to infected people. Their system is particularly useful for monitoring whether or not the patient has developed symptoms of COVID-19 [30]. Similarly, the proposed system reduces the need for healthcare workers to visit the houses of quarantined people, thereby minimizing contact and exposure.

For patients who are in L3, the proposed system can be used extensively to monitor the progress of the patient and thus the response of the patient to the given medications and diet. It is of utmost importance to adjust the drug dosage and formulation from time to time depending on how the person responds to a given medication. Since the system is scalable, wearable sensors can in the future be attached to the patients to enable continuous, real-time monitoring of vital signs. Specifically, in the case of patients who are on life support systems, healthcare professionals can be alerted to abnormal variations in patient vital signs and receive instant updates that enable them to quickly identify and prevent potentially harmful situations. A short review of the various remote patient monitoring and clinical decision support systems developed so far is given in Table 2.
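The alerting on abnormal vital signs envisaged above could, in its simplest form, be a range check on each incoming sensor reading. The sketch below is a hypothetical illustration of that idea only: the parameter names and alert ranges are assumptions made for the example, are not clinical guidance, and are not part of the proposed system.

```python
from dataclasses import dataclass

# Hypothetical alert ranges for illustration only; not clinical guidance.
ALERT_RANGES = {
    "spo2_percent": (92.0, 100.0),
    "heart_rate_bpm": (50.0, 120.0),
    "temperature_c": (35.5, 38.0),
}

@dataclass
class Alert:
    parameter: str
    value: float
    low: float
    high: float

def check_vitals(reading):
    """Return an Alert for every known parameter that falls outside its range."""
    alerts = []
    for parameter, value in reading.items():
        limits = ALERT_RANGES.get(parameter)
        if limits is None:
            continue  # parameters without a configured range are ignored
        low, high = limits
        if not low <= value <= high:
            alerts.append(Alert(parameter, value, low, high))
    return alerts

# Hypothetical reading from a wearable sensor.
reading = {"spo2_percent": 88.0, "heart_rate_bpm": 96.0, "temperature_c": 38.6}
alerts = check_vitals(reading)
```

A deployed module would also need to consider trends over time and sensor noise rather than single readings, but the range check shows where such a module would attach to the stored parameter history.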

Table 2 Review of various pre-existing patient monitoring and clinical decision support systems

As depicted in Table 2, several systems exist for monitoring patients and their progress. As in the work of Andreea et al. [31], the physicians found the system very useful and easy to use owing to its simple and minimalistic user interface and experience. Unlike the system developed by Lai et al. [32], which focuses only on fatigue and related symptoms, the system developed here focuses on a wide range of symptoms primarily noticed in COVID-19 patients. In addition to the work of Al-Naggar et al. [34], the proposed system has very low latency. Like the work of Novo et al. [36], the system can be scaled to any domain of medicine and surgery, but without much change to the framework.

Data extraction through NLP is a relatively new technology in the healthcare domain. Anil et al. [37] have leveraged the Kaiser Permanente Southern California (KPSC) clinical information extraction system. This system uses computationally hungry algorithms such as Word2Vec and transformer-based models. They have demonstrated an efficiency of about 99% while extracting data from electronic medical records of patients with prostate-related diseases, specifically prostate cancer. Similarly, Thomas et al. [38], in their work on the use of NLP to annotate drug product labeling, have achieved an accuracy ranging from 64 to 77%. Again, the technology used there is memory-intensive, leading to lower computational speeds. A detailed comparison of the performance metrics of the proposed system against various similar systems is given in Table 3.

Table 3 Comparison of evaluation and performance metrics of similar NLP algorithms that use patient-authored textual data

In the future, the proposed system can be integrated with existing electronic health records (EHRs) or vendor neutral archives (VNAs) for seamless data storage and retrieval. At present, the system stands alone and has not been tested in an actual clinical environment. It was developed solely to address the difficulties in monitoring patients in the COVID-19 pandemic situation. In addition, it should be Health Insurance Portability and Accountability Act (HIPAA) compliant and ensure the privacy, security, and portability of health information. The system is also built keeping regulatory compliance in mind, to avoid any potential breach of sanitary rules in critical care units. Usage of Hadoop and Oracle has ensured the same. The system must also be scaled to other domains of medicine and surgery. Currently, the proposed smart structuring algorithm addresses and extracts values only for those medical parameters that are related to the novel coronavirus disease. In the future, it can be extended according to the need.

5 Conclusion

In the present scenario, there is a need to create a centralized dashboard that can help various health and government officials to visualize the spread of the virus and the progress of patients under a given treatment. With this, outliers can be easily detected, and medications can be altered according to the patient. It must also be ensured that healthcare workers maintain minimum contact with and exposure to the virus. The proposed web-based system that deploys NFC tags proves to be a viable solution to this problem. In this way, maximum utilization of the workforce can be enabled with minimum exposure to the virus, thus reducing its spread. Ease of visualization of the medical reports is of utmost importance since it enhances the decision-making capabilities of the physician. This is achieved using the smart cloud-based structuring tool, which uses OCR and NLP to convert the input into structured digital data. Several capabilities of the proposed web-based system are found to have an efficiency of over 94%. The system was tested in a mock clinical setup and evaluated by physicians, who found it user-friendly and reliable. The inbuilt chat option showed a low latency of 2.7 s. In addition, since it has an open-ended architecture, the system can be scaled depending on future needs. As a future development of this system, real-time sensors can be deployed to monitor the patients continuously, thus alerting the physician instantly to abnormal and abrupt variations of vital signs in people on life-support systems. Since the range of NFC transfers is no more than a few centimeters, the risk of over-the-air data interception is low, making the proposed system more secure.

Since the study focuses mainly on the measures employed by the various governing bodies in India, the proposed system in its present state is more suitable for the current Indian situation. In the future, the system can be modified to suit the needs of other countries depending upon their requirements.