
DiagnoseNET: Automatic Framework to Scale Neural Networks on Heterogeneous Systems Applied to Medical Diagnosis

  • Conference paper
  • First Online:
IT Convergence and Security

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 712)

Abstract

Determining an optimal generalization model with deep neural networks for a medical task is an expensive process that generally requires large amounts of data and computing power. Furthermore, the programming expressiveness needed to scale deep learning workflows over new heterogeneous system architectures, to train each model and to configure the computing resources efficiently, grows in complexity. We introduce DiagnoseNET, an automatic framework designed to scale deep learning models over heterogeneous systems, applied to medical diagnosis. DiagnoseNET is designed as a modular framework that enables deep learning workflow management and preserves the expressiveness of neural networks written in TensorFlow, while the DiagnoseNET runtime abstracts data locality, micro-batching and distributed orchestration to scale the neural network model from a GPU workstation to multiple nodes. The main approach is a set of gradient computation modes that adapt the neural network training to the memory capacity, the number of workers, the coordination method and the communication protocol (gRPC or MPI), in order to balance accuracy against energy consumption. The experiments carried out evaluate computational performance in terms of accuracy, convergence time and worker scalability to determine an optimal neural architecture over a mini-cluster of Jetson TX2 nodes. These experiments were performed on two medical case studies: the first dataset is composed of clinical descriptors collected during the first week of hospitalization of patients in the Provence-Alpes-Côte d'Azur region; the second dataset uses short ECG records, between 30 and 60 s, obtained as part of the PhysioNet 2017 Challenge.



Acknowledgements

We thank DU Ziqing, Mohamed Younes, Arno Gobbin and the IADB team for their help. This work is partly funded by the French government labelled PIA program under its IDEX UCAJEDI project (ANR-15-IDEX-0001). The PhD thesis of John Anderson Garcia Henao is funded by the French government labelled PIA program under its LABEX UCN@Sophia project (ANR-11-LABX-0031-01).

Author information

Corresponding authors

Correspondence to John Anderson Garcia Henao or Michel Riveill.


Appendices

Appendix 1: DiagnoseNET MPI Synchronous Algorithm

Algorithm 1 describes the MPI synchronous coordination training with a parameter server. It uses the node ranks to assign roles: rank 0 is the parameter server (PS) and the remaining ranks are workers. When the program is launched, the PS performs the necessary pre-processing tasks, such as loading the dataset and compiling the model, and then sends the model to the workers, which wait to receive it. At each training step, the PS sends a different subset of the data to every worker for loss optimization. At the end of an epoch, the PS gathers the new weights from every worker; the workers receive the collection of weights and compute the average weights for the global update. The remaining computation works as in the desktop version.

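A minimal sketch of this synchronous coordination scheme is given below, written with mpi4py rather than the actual DiagnoseNET runtime; the toy least-squares model, the synthetic data and the epoch count are illustrative assumptions, not DiagnoseNET API calls.

# Sketch of Algorithm 1: MPI synchronous coordination with a parameter server.
# Run e.g. with: mpirun -n 3 python sync_ps.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
n_workers = size - 1          # rank 0 acts as the parameter server (PS)
EPOCHS, LR = 5, 0.1           # assumed values for the sketch

def train_one_epoch(w, X, y, lr=LR):
    """One pass of gradient descent on a toy least-squares loss."""
    grad = 2.0 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

if rank == 0:
    # PS pre-processing: load the dataset and build the initial model weights.
    rng = np.random.default_rng(0)
    X, y = rng.normal(size=(1000, 8)), rng.normal(size=1000)
    weights = np.zeros(8)
    # Split the data so that each worker receives a different subset.
    X_shards = np.array_split(X, n_workers)
    y_shards = np.array_split(y, n_workers)
else:
    weights = None

# The PS sends the model (weights) to the workers, which wait to receive it.
weights = comm.bcast(weights, root=0)

for epoch in range(EPOCHS):
    # PS scatters a different data subset to every worker (None for itself).
    payload = [None] + list(zip(X_shards, y_shards)) if rank == 0 else None
    shard = comm.scatter(payload, root=0)

    # Workers optimize the loss on their shard; the PS contributes no update.
    local_w = train_one_epoch(weights, *shard) if rank != 0 else None

    # End of epoch: every rank receives the collection of worker weights and
    # computes the average for the global update (synchronous barrier).
    gathered = comm.allgather(local_w)
    weights = np.mean([w for w in gathered if w is not None], axis=0)

if rank == 0:
    print("final weights:", weights)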

Appendix 2: DiagnoseNET MPI Asynchronous Algorithm

Algorithm 2 trains multiple model replicas in parallel on different nodes with different subsets of the data. Each model replica processes a mini-batch to compute gradients and sends them to the parameter server, which applies a function (mean or weighted average) between the previous and the received weights, updates the global weights accordingly and sends them back to the worker. Every worker thus computes its gradients individually until it converges; convergence is detected when overfitting starts, that is, when the training loss keeps decreasing while the validation loss increases. The master, which is responsible for computing the weighted average of the received weights and its own weights, stops when all workers have converged. To track the workers' convergence status, the master keeps a queue of converged workers; when its length equals the number of workers, the master knows that all workers have converged and stops training. Since each node computes its gradients independently and does not require interaction with the others, the workers can run at their own pace, which gives greater robustness to machine failures.

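The sketch below illustrates this asynchronous scheme with mpi4py: the master answers each worker at its own pace, mixes the incoming weights with the global ones through a weighted average, and stops once every worker has signalled convergence. The toy model, the mixing factor and the overfitting test are illustrative assumptions, not the DiagnoseNET implementation.

# Sketch of Algorithm 2: MPI asynchronous coordination with a master queue
# of converged workers. Run e.g. with: mpirun -n 3 python async_ps.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
n_workers = size - 1              # rank 0 is the master / parameter server
MIX, LR, MAX_STEPS = 0.5, 0.1, 200  # assumed values for the sketch

rng = np.random.default_rng(rank)   # each worker holds a different data subset
X_tr, y_tr = rng.normal(size=(200, 8)), rng.normal(size=200)
X_va, y_va = rng.normal(size=(50, 8)), rng.normal(size=50)
loss = lambda w, X, y: float(np.mean((X @ w - y) ** 2))

if rank == 0:
    global_w = np.zeros(8)
    converged = set()                     # queue of converged workers
    while len(converged) < n_workers:     # stop when all workers converged
        status = MPI.Status()
        msg = comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=status)
        src = status.Get_source()
        # Weighted average between previous global and received worker weights.
        global_w = MIX * global_w + (1.0 - MIX) * msg["w"]
        if msg["done"]:
            converged.add(src)
        # Send the updated global weights back to that worker only.
        comm.send(global_w, dest=src)
else:
    w, prev_val = np.zeros(8), np.inf
    for step in range(MAX_STEPS):
        # Mini-batch gradient step on the worker's local subset of the data.
        idx = rng.choice(len(y_tr), 32, replace=False)
        grad = 2.0 * X_tr[idx].T @ (X_tr[idx] @ w - y_tr[idx]) / 32
        w -= LR * grad
        # Convergence heuristic: validation loss starts increasing (overfitting).
        val = loss(w, X_va, y_va)
        done = (val > prev_val) or (step == MAX_STEPS - 1)
        prev_val = val
        comm.send({"w": w, "done": done}, dest=0)
        w = comm.recv(source=0)           # receive the updated global weights
        if done:
            break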

Appendix 3: Hyperparameter Search to Classify the Medical Task 1

The model space contains (d) hyperparameters and (n) hyperparameter configurations defined in Table 3, and Table 4 shows the models by number of parameters. We fixed some hyperparameters and tuned the number of units per layer, the number of layers and the batch size, which are the hyperparameters that most directly affect the computational cost. Each model was trained with the Adam optimizer for a maximum of 40 epochs, using cross-entropy as the loss function.

Table 3 Search space model descriptors
Table 4 Model dimension space in number of parameters (millions)

According to the model dimensions shown in Table 4, the models can be divided into fine, middle and coarse grain. Figure 4 shows that the middle-grain models, from 1.99 to 8.29 million parameters, converge quickly in validation loss and reach high accuracy levels for the majority of the 14 care-purpose labels, whereas the other models show a large variation in accuracy and need more epochs to converge. The sketch after this paragraph illustrates how such a grid search can be expressed.
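As an illustration of this search, the sketch below enumerates a grid over units per layer, number of layers and batch size and trains each configuration with Adam and a cross-entropy loss for at most 40 epochs, as stated above. The grid values, the input dimension and the synthetic multi-label data are placeholders, not the actual descriptors of Table 3.

# Sketch of the hyperparameter grid search; values are illustrative only.
import itertools
import numpy as np
import tensorflow as tf

N_FEATURES, N_LABELS = 1024, 14          # 14 care-purpose labels; input dim assumed
UNITS      = [256, 512, 1024]            # placeholder grid values
N_LAYERS   = [2, 4, 8]
BATCH_SIZE = [32, 64, 128]

# Synthetic multi-label data standing in for the clinical descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, N_FEATURES)).astype("float32")
Y = (rng.random(size=(2000, N_LABELS)) > 0.5).astype("float32")

def build_model(units, layers):
    """Feed-forward network; sigmoid outputs with per-label cross-entropy,
    assuming a multi-label task (swap to softmax if single-label)."""
    model = tf.keras.Sequential([tf.keras.Input(shape=(N_FEATURES,))])
    for _ in range(layers):
        model.add(tf.keras.layers.Dense(units, activation="relu"))
    model.add(tf.keras.layers.Dense(N_LABELS, activation="sigmoid"))
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

results = {}
for units, layers, batch in itertools.product(UNITS, N_LAYERS, BATCH_SIZE):
    model = build_model(units, layers)
    hist = model.fit(X, Y, validation_split=0.2, epochs=40,
                     batch_size=batch, verbose=0)
    results[(units, layers, batch)] = (model.count_params(),
                                       min(hist.history["val_loss"]))

for cfg, (n_params, val_loss) in sorted(results.items()):
    print(cfg, f"{n_params / 1e6:.2f}M params", f"val_loss={val_loss:.3f}")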

Fig. 4 Experiment results for training a feed-forward neural network, using the hyperparameter model-dimension space

Appendix 4: ECG Neural Architecture to Classify the Medical Task 2

A pure CNN model has the drawback that the last layer may not exploit the original features or those extracted by the first layers. Figure 5 shows the ECG neural architecture implemented with the DiagnoseNET framework, whose key architectural feature is the set of residual connections that counter information loss in the deep layers. To implement this, a second information stream is added to the model, so that deeper layers have access to the original features in addition to the information processed by the previous layers. Moreover, two different types of residual block are included to access the different states of the information: the normal residual block preserves the size of the input, while the sub-sampling residual block halves it. By using max pooling, the network keeps only the highest values of its input, so that the size of the output is halved.
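A minimal Keras sketch of the two residual block types is given below; the filter counts, kernel sizes, record length, sampling rate and number of output classes are assumptions made for illustration and do not reproduce the exact architecture of Fig. 5.

# Sketch of the normal and sub-sampling residual blocks for 1-D ECG signals.
import tensorflow as tf
from tensorflow.keras import layers

def normal_residual_block(x, filters, kernel_size=16):
    """Residual block that preserves the temporal size of its input.
    Assumes `filters` equals the channel count of x so the add is valid."""
    shortcut = x
    y = layers.Conv1D(filters, kernel_size, padding="same", activation="relu")(x)
    y = layers.Conv1D(filters, kernel_size, padding="same")(y)
    return layers.ReLU()(layers.Add()([shortcut, y]))

def subsampling_residual_block(x, filters, kernel_size=16):
    """Residual block that halves the temporal size; the shortcut stream is
    max-pooled so only the highest values are kept."""
    shortcut = layers.MaxPooling1D(pool_size=2)(x)
    y = layers.Conv1D(filters, kernel_size, strides=2, padding="same",
                      activation="relu")(x)
    y = layers.Conv1D(filters, kernel_size, padding="same")(y)
    return layers.ReLU()(layers.Add()([shortcut, y]))

# 30-60 s records; 18000 samples assumes a 300 Hz sampling rate.
inputs = tf.keras.Input(shape=(18000, 1))
x = layers.Conv1D(32, 16, padding="same", activation="relu")(inputs)
x = normal_residual_block(x, 32)
x = subsampling_residual_block(x, 32)
x = normal_residual_block(x, 32)
x = layers.GlobalAveragePooling1D()(x)
outputs = layers.Dense(4, activation="softmax")(x)   # 4 output classes assumed
model = tf.keras.Model(inputs, outputs)
model.summary()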

Fig. 5 ECG convolutional neural architecture


Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Garcia Henao, J.A., Precioso, F., Staccini, P., Riveill, M. (2021). DiagnoseNET: Automatic Framework to Scale Neural Networks on Heterogeneous Systems Applied to Medical Diagnosis. In: Kim, H., Kim, K.J. (eds) IT Convergence and Security. Lecture Notes in Electrical Engineering, vol 712. Springer, Singapore. https://doi.org/10.1007/978-981-15-9354-3_1

