1 Introduction

Pneumonia is a serious infection of the lungs with a range of possible causes. Bacteria, viruses or fungi can cause pneumonia Wale and Sonawani (2018), Varadharajan et al. (2018), Muthazhagan et al. (2020). Pneumonia is also a serious complication of the new coronavirus disease (COVID-19). It was ranked as the eighth leading cause of death in the US in 2017 Jalil and Fraig (2019), and it causes death in children younger than five years of age worldwide. There are three types of pneumonia: bacterial, viral, and fungal. Bacterial pneumonia is the most common type and is mainly caused by the bacterium Streptococcus pneumoniae Krishna et al. (2017). Viral pneumonia is the most dangerous type and is caused by the respiratory syncytial virus (RSV) and influenza types A and B. Fungal pneumonia is caused by the Coccidioides fungus, which produces valley fever Beuy and Viroj (2019).

Chest X-rays are a commonly used method to detect pneumonia and locate the infected area in the lungs Jaeger et al. (2014). The chest X-ray is also a widely used radiological examination technique for the diagnosis of several lung diseases Demner-Fushman et al. (2016). Finding radiological examiners in remote places to analyze a large number of chest X-rays is an extremely challenging task Dou et al. (2016). In recent times, artificial intelligence approaches have been used to address the challenges in several medical diagnosis processes (Hwang and Kim 2016; Leaman et al. 2015). In particular, deep learning and computer vision techniques support the diagnosis of various cancers and genomic diseases Moeskops et al. (2016).

Deep learning is a subset of machine learning in artificial intelligence (AI) that learns from large volumes of unstructured data Jamaludin et al. (2017). However, designing and developing a deep learning model from scratch consumes considerable time and computational resources Roth Holger et al. (2015). Transfer learning techniques were introduced to avoid these model development challenges Ronneberger et al. (2015). Transfer learning takes the knowledge gained by a deep learning network while solving one problem and applies it to a similar problem Azimi et al. (2018). The dataset for the new problem is used to fine-tune the weights of the existing deep learning model Setio et al. (2016). A Convolutional Neural Network is a multi-layer neural network that recognizes visual patterns directly from pixel images with minimal preprocessing (Roth et al. 2014; Yao et al. 2016). The most common Convolutional Neural Network architectures used in transfer learning for image classification problems are AlexNet, VGGNet, ResNet, and InceptionNet Shin et al. (2016).

VGGNets are classified into two types based on the number of weight layers: VGG16Net and VGG19Net. In general, the VGG19Net gives a better performance than the VGG16Net. The VGG19Net consists of nineteen weight layers (sixteen convolutional and three fully connected) and is very appealing because of its uniform architecture Obinikpo and Kantarci (2017). It is currently among the most preferred choices in the community for extracting features from images. The weight configuration of the VGG19Net is publicly available and has been used in many other applications and challenges as a baseline feature extractor. However, the VGG19Net consists of 46 layers and 143 million parameters, which can be challenging to handle Roth Holger et al. (2015).

In this research, a VGG19Net-based pneumonia diagnosis model using chest X-ray images is proposed and trained on a GPU system. The remaining sections of the article are organized as follows: Sect. 2 surveys the literature related to pneumonia disease classification. Section 3 describes the materials and methods of the VGG19Net-based pneumonia diagnosis model. Section 4 contains the experimental results and related discussions. Section 5 gives the conclusions and future directions of the research.

2 Related works

Scans and X-ray images of human organs contain valuable medical information that can help diagnose and treat diseases in hospitals. Manual examination is not suitable for handling large amounts of medical imaging data. The purpose of this review is to show the capability of image processing techniques to efficiently handle these different but closely related human disease diagnosis tasks. It also compares various image classification and image retrieval techniques that have been applied to human lung disease identification systems. Ma et al. (2017) proposed a content-based medical image retrieval technique to retrieve CT imaging signs of lung diseases using fused, context-sensitive similarity measures. The fused pairwise similarity was used to minimize the semantic gap and obtain a more accurate pairwise similarity measure. Ning et al. (2018) proposed a retrieval technique for brain MRI and lung CT image databases that combines the Hausdorff distance with Tamura texture features and a wavelet transform algorithm. The technique performs better than the single-feature texture technique.

A three-dimensional Convolutional Neural Network framework was proposed by Jamaludin et al. (2017) to produce radiological gradings of spinal lumbar MRIs and to localize the predicted pathologies using intervertebral disc volumes. Azimi et al. (2018) proposed an ECG data classification model using a Convolutional Neural Network and deployed the model on Internet of Things devices. This approach uses an existing ECG dataset to classify the patient's health status. The edge computing devices reduced the cost of implementation and increased the portability of the device. Obinikpo and Kantarci (2017) reviewed the applications and challenges of Internet of Things and deep learning techniques in smart connected health systems that rely on wearable or invasive devices. The authors also discussed the open challenges in sensed and captured medical data. Kumar and Gandhi (2018) proposed a scalable Internet of Things device for the diagnosis of heart diseases. Logistic regression was used to process the sensed data from the Internet of Things device, and cloud services were used to store and retrieve the large volume of data collected from patients. The performance of the logistic regression in predicting heart disease was examined using Receiver Operating Characteristic (ROC) analysis.

Likewise, Abdelaziz et al. (2018) proposed a diagnosis model for healthcare services for chronic kidney diseases using linear regression (LR) and neural network (NN) approaches. The model was deployed on a cloud platform to enhance execution performance. The authors compared the performance of various machine learning techniques, and the model predicted chronic kidney diseases with an accuracy of 97.8%. Wale and Sonawani (2018) reviewed various machine learning and deep learning techniques for diagnosing human health conditions using imaging and sensing data. The authors discussed the advantages and challenges of numerous machine learning based healthcare applications. The next section presents the fundamentals of the implemented pneumonia diagnosis model and the training and testing dataset.

3 Materials and methods

The complete process of the proposed model for pneumonia identification is described in detail below. The method is separated into a number of stages in the following subsections, starting with collecting the images for the classification process. The proposed pneumonia detection model consists of five phases. The complete flow of the proposed system is shown in Fig. 1.

Fig. 1 Flow of proposed pneumonia detection model

The complete development process of the pneumonia detection system is given in the subsequent subsections. It begins with the dataset preparation and preprocessing phase.

3.1 Dataset preparation and preprocessing

The Chest X-ray 8 dataset was downloaded from an open data repository for diagnosing pneumonia from chest images. A content-based visual information retrieval technique was used to retrieve the input images from the large database with minimum time consumption. The content-based image retrieval technique also helped avoid image duplication. Data augmentation techniques were then used to increase the size of the dataset. Basic data manipulation and a Deep Convolutional Generative Adversarial Network (DCGAN) were used to create the augmented images. Data augmentation increased the number of images in each class to 6000. In total, 11,900 images were used to train the proposed neural network. Table 1 gives the details of the training data for the pneumonia identification model.

Table 1 Number of images in proposed dataset

The data augmentation process was implemented in the Python programming language with the TensorFlow and Augmentor libraries. The dataset was split into training, validation and testing sets. A total of 11,900 images were used to train and validate the proposed VGG19Net for pneumonia detection, and 300 unseen images were used to test the performance of the proposed and existing state-of-the-art models. Figure 2 shows random samples from the dataset of the proposed pneumonia identification model.
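The three-way split described above can be sketched in a few lines of standard-library Python. This is only an illustration: the file names are hypothetical placeholders, and the 20% validation fraction is an assumption, since the text gives only the combined training and validation count (11,900) and the test count (300).

```python
import random


def split_dataset(items, n_test, val_fraction, seed=0):
    """Shuffle items, hold out n_test for testing, then split the rest."""
    if not 0 <= n_test <= len(items):
        raise ValueError("n_test must be between 0 and len(items)")
    if not 0.0 <= val_fraction < 1.0:
        raise ValueError("val_fraction must be in [0, 1)")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    test = shuffled[:n_test]
    remainder = shuffled[n_test:]
    n_val = int(len(remainder) * val_fraction)
    return remainder[n_val:], remainder[:n_val], test


# Hypothetical file names standing in for 11,900 + 300 images.
image_ids = [f"img_{i:05d}.png" for i in range(12_200)]
# val_fraction=0.2 is an assumed value, not taken from the paper.
train_ids, val_ids, test_ids = split_dataset(image_ids, n_test=300, val_fraction=0.2)
print(len(train_ids), len(val_ids), len(test_ids))
```

Fixing the seed keeps the held-out test images the same across runs, which matters when several networks are compared on the same 300 images.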

Fig. 2 Random sample images of Chest X-ray dataset

3.2 VGG19Net design and training

The proposed pneumonia identification model uses the standard VGG19Net with a few modifications. The Keras applications library was used to import the pre-trained VGG19Net. The VGG19Net uses sixteen convolutional layers. The convolutional layers are grouped into sets, and each set is followed by a max pooling layer that reduces the dimensions of the features. The dimensions of the input image are given by Eq. 1,

$$ \dim \left( {image} \right) = \left( {nH,nW,nC} \right) $$
(1)

where nH and nW represent the height and width of the image and nC represents the number of channels.
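To make the effect of the pooling layers on (nH, nW, nC) concrete, the following sketch walks an input through the five convolutional blocks of the standard VGG19 layout. The 224 x 224 x 3 input size is an assumption (the usual VGG19 input, not stated in the text), as are the 'same' padding and the 2 x 2 pooling with stride 2 of the standard architecture.

```python
# Standard VGG19 convolutional base:
# (number of 3x3 conv layers, output channels) for each block.
VGG19_BLOCKS = [(2, 64), (2, 128), (4, 256), (4, 512), (4, 512)]


def feature_map_shapes(n_h, n_w, n_c, blocks=VGG19_BLOCKS):
    """Return the input shape followed by the shape after each block."""
    shapes = [(n_h, n_w, n_c)]
    for _num_convs, out_channels in blocks:
        # 'same'-padded convolutions keep nH and nW and set nC to
        # out_channels; the 2x2 max pooling then halves nH and nW.
        n_h, n_w, n_c = n_h // 2, n_w // 2, out_channels
        shapes.append((n_h, n_w, n_c))
    return shapes


shapes = feature_map_shapes(224, 224, 3)  # assumed input size
for shape in shapes:
    print(shape)

total_conv_layers = sum(num_convs for num_convs, _ in VGG19_BLOCKS)
print("convolutional layers:", total_conv_layers)
```

Under these assumptions the spatial size halves five times, and the per-block layer counts add up to the sixteen convolutional layers mentioned in the text.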

Each convolutional layer uses its own filters to extract features from the input data. The dimensions of a filter in VGG19Net are given by Eq. 2.

$$ dim\left( {filter} \right) = \left( {f,f,nC} \right) $$
(2)

where f represents the height and width of the filter and nC represents the number of channels.
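The filter dimensions in Eq. 2 determine how many trainable parameters a convolutional layer has: each filter holds f x f x nC weights plus one bias. The sketch below applies this to the standard VGG19 convolutional stack with 3 x 3 filters; the channel list describes the standard architecture and is an assumption about the base network, not a description of the modified model in this paper.

```python
def conv_layer_params(f, n_c_in, n_filters):
    """Weights plus one bias per filter, for filters of shape (f, f, n_c_in)."""
    return (f * f * n_c_in + 1) * n_filters


# Output channels of the sixteen conv layers in standard VGG19.
VGG19_CONV_CHANNELS = [64, 64, 128, 128,
                       256, 256, 256, 256,
                       512, 512, 512, 512,
                       512, 512, 512, 512]


def conv_base_params(n_c_input=3, f=3, channels=VGG19_CONV_CHANNELS):
    """Total parameters of the conv stack (pooling layers have none)."""
    total = 0
    n_c_in = n_c_input
    for n_filters in channels:
        total += conv_layer_params(f, n_c_in, n_filters)
        n_c_in = n_filters  # next layer sees this layer's output channels
    return total


first_layer = conv_layer_params(f=3, n_c_in=3, n_filters=64)
base_total = conv_base_params()
print(first_layer, base_total)
```

The convolutional base accounts for only part of a VGG-style model's parameters; the dense layers placed on top usually contribute a large share of the total.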

The convolutional layers extract feature information from the input data using the filter function and produce the feature data as output. The convolution of a given input with a filter is given by Eq. 3,

$$ conv\left( {I,K} \right)_{xy} = \mathop \sum \limits_{i = 1}^{f} \mathop \sum \limits_{j = 1}^{f} \mathop \sum \limits_{k = 1}^{nC} K_{i,j,k} I_{x + i - 1,y + j - 1,k} $$
(3)

where I represents the input image of the convolutional function and K represents the kernel filter. To reduce the training time, the first five layers of the standard VGG19Net were frozen during training. Training the model without freezing these layers took very long, and including them in training produced no change in training performance, so they were kept frozen. Two dense layers were also added to the network to change the number of output classes from 1000 to 2. The two output classes are "No Finding" and "Pneumonia". The softmax activation function was used to map the model output to a class.
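The triple sum in Eq. 3 can be written out directly. The sketch below is a plain-Python illustration for a single filter, with the sums over i and j running across the filter extent f from Eq. 2 and indices shifted to start at 0. It uses no padding and a stride of 1, which is an assumption made for brevity; it is not the Keras implementation used by the authors.

```python
def conv2d_single_filter(image, kernel):
    """Apply one (f, f, nC) filter to an (nH, nW, nC) image, no padding."""
    n_h, n_w, n_c = len(image), len(image[0]), len(image[0][0])
    f = len(kernel)
    if len(kernel[0]) != f or len(kernel[0][0]) != n_c:
        raise ValueError("kernel must have shape (f, f, nC)")
    out_h, out_w = n_h - f + 1, n_w - f + 1
    output = [[0.0] * out_w for _ in range(out_h)]
    for x in range(out_h):
        for y in range(out_w):
            total = 0.0
            for i in range(f):
                for j in range(f):
                    for k in range(n_c):
                        # K[i][j][k] * I[x+i][y+j][k], as in Eq. 3 (0-based)
                        total += kernel[i][j][k] * image[x + i][y + j][k]
            output[x][y] = total
    return output


# Tiny illustrative input: a 3x3 image with one channel and a 2x2 filter.
image = [[[1.0], [2.0], [3.0]],
         [[4.0], [5.0], [6.0]],
         [[7.0], [8.0], [9.0]]]
kernel = [[[1.0], [0.0]],
          [[0.0], [-1.0]]]
result = conv2d_single_filter(image, kernel)
print(result)
```

Each output value here is the top-left pixel of a 2 x 2 window minus its bottom-right pixel, which shows how a filter responds to a local intensity pattern.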

Additionally, a dropout layer with a dropout rate of 0.2 was added between the first two dense layers to avoid overfitting. Different dropout values from 0.2 to 0.8 were compared in this network, and 0.5 gave the best training performance. Mini-batch gradient descent was used with a batch size of 16; various other batch sizes were also applied and their performance was compared with that of batch size 16. The proposed VGG19Net was implemented in the Python programming language with the TensorFlow library. The number of layers and the input and output dimensions of each layer are given in the model summary. The initial design and customization of the proposed VGG19Net were carried out on an HP Z800 workstation. After customization of the VGG19Net for pneumonia detection, the model was trained on the NVIDIA DGX1 deep learning server, which accelerates the training process. The proposed VGG19Net has 54,631,490 trained parameters. The summary of the proposed VGG19Net for pneumonia diagnosis is shown in Fig. 3.
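As a rough illustration of what a dropout layer does during training, the sketch below implements inverted dropout in plain Python: each activation is zeroed with probability `rate`, and the survivors are scaled by 1 / (1 - rate) so the expected sum is unchanged. This mirrors the behavior of common framework dropout layers but is not the authors' code; the function name and the example values are hypothetical.

```python
import random


def dropout(values, rate, training=True, rng=None):
    """Inverted dropout over a flat list of activations."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(values)  # dropout is a no-op at inference time
    rng = rng or random.Random()
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


activations = [1.0] * 10_000  # hypothetical dense-layer outputs
dropped = dropout(activations, rate=0.2, rng=random.Random(0))
zero_fraction = dropped.count(0.0) / len(dropped)
print(round(zero_fraction, 3))
```

With a rate of 0.2, roughly one activation in five is zeroed on each training step, while at inference the layer passes its input through unchanged.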

Fig. 3 Summary of proposed VGG19Net

Figure 4 illustrates the training result of the proposed model over 500 epochs with a batch size of 16.

Fig. 4 Training result of proposed VGG19Net

Figure 5 shows the validation accuracy and loss of the proposed VGG19Net on the proposed pneumonia chest X-ray dataset.

Fig. 5 Validation accuracy and loss of proposed VGG19Net

After completion of the training process, the layer architecture and weight values of the proposed VGG19Net were stored in H5 format for future use. The validation performance of the model on various datasets is shown in Fig. 6. The datasets are the original dataset, the dataset augmented with basic manipulation techniques, the DCGAN-augmented dataset, and the combined dataset using the original, basic manipulation and DCGAN techniques.

Fig. 6 Dataset performance comparison

The dataset performance comparison shows the significance of combining data augmentation techniques in image classification datasets.

4 Results and discussions

The proposed VGG19Net and the state-of-the-art networks were tested with 300 test images for classifying pneumonia from chest X-ray images. The most common state-of-the-art classification networks are AlexNet, VGG16Net and InceptionV3Net Geetharamani and Arun Pandian (2019). A confusion matrix represents the performance of a classification algorithm: it shows the True Positive (TP), True Negative (TN), False Positive (FP) and False Negative (FN) counts. Figure 7 illustrates the confusion matrix of the proposed pneumonia detection model.

Fig. 7 Confusion matrix of proposed VGG19Net

In the confusion matrix of the proposed VGG19Net model, label 0 represents “Pneumonia” and label 1 represents “No Finding”. The confusion matrix represents the classification performance of the proposed model on the testing data.
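For readers who want to see how the four confusion-matrix counts arise from predictions, here is a minimal sketch. It assumes that label 0 ("Pneumonia") is treated as the positive class, which the text does not state explicitly, and the label lists are made-up examples rather than the paper's test results.

```python
def confusion_counts(y_true, y_pred, positive_label=0):
    """Count TP, TN, FP and FN for a binary classifier."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive_label:
            if truth == positive_label:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if truth == positive_label:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Hypothetical labels: 0 = "Pneumonia", 1 = "No Finding".
y_true = [0, 0, 0, 1, 1, 1, 0, 1]
y_pred = [0, 0, 1, 1, 1, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
print(counts)
```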

The Receiver Operating Characteristic (ROC) curve is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied. Figure 8 illustrates the ROC curves of the proposed binary classifier.

Fig. 8 AUC–ROC curves of proposed VGG19Net

The ROC curves show the classification superiority of the proposed VGG19Net on pneumonia diagnosis.
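An ROC curve is traced by sweeping the decision threshold over the classifier's scores and recording the false positive rate and true positive rate at each step. The sketch below shows one way to compute the curve and its area (AUC) with the trapezoidal rule. It assumes the labels have been recoded so that 1 marks the positive class and that higher scores mean "more positive"; the scores are invented for illustration.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points, from the strictest threshold to the loosest."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    idx = 0
    while idx < len(ranked):
        threshold = ranked[idx][0]
        # Consume every sample sharing this score so ties yield one point.
        while idx < len(ranked) and ranked[idx][0] == threshold:
            if ranked[idx][1] == 1:
                tp += 1
            else:
                fp += 1
            idx += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical labels (1 = positive) and classifier scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(y_true, scores)
area = auc(curve)
print(curve, area)
```

An AUC of 1.0 corresponds to a classifier that ranks every positive above every negative, while 0.5 corresponds to random ranking.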

Classification accuracy is one of the most important performance metrics of a classification algorithm. Eq. 4 was used to calculate the classification accuracy from the True Positive (TP), True Negative (TN), False Positive (FP) and False Negative (FN) counts of the confusion matrix.

$$ {\text{Accuracy}} = { }\frac{{{\text{TP}} + {\text{TN}}}}{{{\text{TP}} + {\text{TN}} + {\text{FP}} + {\text{FN}}}} $$
(4)
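As a quick worked example of Eq. 4, the sketch below computes accuracy from confusion-matrix counts. The counts are hypothetical values chosen to sum to a 300-image test set; they are not the results reported in this paper.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct (Eq. 4)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    return (tp + tn) / total


# Hypothetical counts for illustration only.
example_accuracy = accuracy(tp=145, tn=149, fp=1, fn=5)
print(example_accuracy)
```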

The classification accuracy of the proposed VGG19Net is higher than that of the other approaches. Figure 9 compares the classification accuracy of the proposed VGG19Net and the state-of-the-art classification techniques.

Fig. 9 Comparison of accuracy

Another important performance metric is precision. Precision is the fraction of relevant instances among the retrieved instances, that is, the fraction of positive predictions that are correct. The precision of the proposed VGG19Net is superior to that of the existing techniques. Eq. 5 gives the calculation of precision,

$$ {\text{Precision}} = { }\frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FP}}}} $$
(5)

Figure 10 illustrates the precision of the proposed VGG19Net and the other state-of-the-art approaches.

Fig. 10 Comparison of precision

Recall is also one of the most common metrics for estimating the performance of a classification model. Recall is the fraction of all relevant instances that were actually retrieved by the classification algorithm. Eq. 6 gives the calculation of recall,

$$ {\text{Recall}} = { }\frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FN}}}} $$
(6)
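Precision and recall share the same numerator but differ in the denominator: precision divides by all positive predictions (TP + FP), while recall, in its standard definition, divides by all actual positives (TP + FN). The sketch below makes the contrast explicit with hypothetical counts that are not taken from the paper.

```python
def precision(tp, fp):
    """Fraction of positive predictions that are correct (Eq. 5)."""
    if tp + fp == 0:
        raise ValueError("no positive predictions were made")
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of actual positives that were found: TP / (TP + FN)."""
    if tp + fn == 0:
        raise ValueError("no actual positives are present")
    return tp / (tp + fn)


# Hypothetical counts for illustration only.
tp, fp, fn = 145, 1, 5
example_precision = precision(tp, fp)
example_recall = recall(tp, fn)
print(example_precision, example_recall)
```

In a screening setting, a missed pneumonia case (FN) lowers recall, while a healthy patient flagged as pneumonia (FP) lowers precision.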

Figure 11 shows the recall of the proposed VGG19Net and the existing classification techniques such as AlexNet, VGG16Net and InceptionV3Net.

Fig. 11 Comparison of recall

The recall of the proposed VGG19Net was found to be better than that of the state-of-the-art classification techniques.

In statistical analysis of binary classification, the F1 score is a measure of a test's accuracy. It considers both the precision and the recall of the classification algorithm. Eq. 7 gives the calculation of the F1 score,

$$ {\text{F}}1{\text{ Score}} = 2 \times \frac{{{\text{Precision}} \times {\text{Recall}}}}{{{\text{Precision}} + {\text{Recall}}}} $$
(7)
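The F1 score in its standard form is the harmonic mean of precision and recall, 2PR / (P + R), which can equivalently be written in terms of counts as 2TP / (2TP + FP + FN). The sketch below computes both forms and checks that they agree; the counts are hypothetical and not taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f1_from_counts(tp, fp, fn):
    """Equivalent form written directly in confusion-matrix counts."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical counts for illustration only.
tp, fp, fn = 145, 1, 5
f1_a = f1_score(tp / (tp + fp), tp / (tp + fn))
f1_b = f1_from_counts(tp, fp, fn)
print(f1_a, f1_b)
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a model cannot reach a high F1 score by excelling at only one of precision or recall.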

The F1 score of the proposed VGG19Net is superior to that of the existing approaches. Figure 12 shows the F1 score of the proposed VGG19Net and of the existing AlexNet, VGG16Net and InceptionV3Net.

Fig. 12 Comparison of F1 score

The outcomes of the most common performance metrics, such as classification accuracy, precision, recall and F1 score, show that the proposed VGG19Net performs better than AlexNet, VGG16Net and InceptionV3Net. The proposed VGG19Net model was also deployed on an embedded device and in a mobile application for real-time use with the TensorFlow Lite package.

5 Conclusion and future works

Reliable recognition of infections in the lung is a key step in the diagnosis of pneumonia. X-ray examination of the chest is usually performed by trained human examiners or doctors, making the process time-consuming and hard to standardize. This research proposed and developed a pneumonia detection model using a Deep Convolutional Neural Network and a pneumonia chest X-ray dataset. The data was collected from various patients and clinically examined and categorized by human examiners. The proposed Deep Convolutional Neural Network was trained for 1000 epochs on an NVIDIA Tesla V100 GPU with the TensorFlow framework. The training process used 7000 chest X-ray images and the testing process used 200 images. The performance of the proposed model was evaluated using different metrics such as classification accuracy, sensitivity, specificity and the F1 score. The proposed model achieved an average classification accuracy of 99.53% on unseen chest X-ray images. This accuracy was greater than that of existing transfer learning approaches such as AlexNet, ResNet, and InceptionNet. The proposed Deep Convolutional Neural Network was found to be well suited to detecting pneumonia from chest X-ray images.

In the future, further lung disease classes will be included in this model to detect various lung diseases from chest X-ray images. The performance of the proposed Deep CNN model may also be improved with more layers and parameters. This will allow clinicians to recognize lung diseases with lower prevalence from chest X-ray images at an earlier stage of the disease.