High Efficiency Video Coding (HEVC)–Based Surgical Telementoring System Using Shallow Convolutional Neural Network


Surgical telementoring systems have attracted considerable interest, especially for remote locations. However, limited bandwidth has been the primary bottleneck for efficient telementoring. This study aims to establish an efficient surgical telementoring system in which a qualified surgeon (mentor) provides real-time guidance and technical assistance for surgical procedures to the on-site physician (surgeon). High Efficiency Video Coding (HEVC/H.265)–based video compression has shown promising results for telementoring applications. However, there is a trade-off between the bandwidth required for video transmission and the quality of the video received by the remote surgeon. To compress and transmit real-time surgical videos efficiently, a hybrid lossless-lossy approach is proposed in which the surgical incision region is coded in high quality, whereas the background region is coded in lower quality according to its distance from the surgical incision region. State-of-the-art deep learning (DL) architectures for semantic segmentation can be used to extract the surgical incision region, but their high computational complexity results in long training and inference times. Because encoding time is crucial for telementoring systems, very deep architectures are not suitable for surgical incision extraction. In this study, we propose a shallow convolutional neural network (S-CNN)–based segmentation approach that consists of an encoder network only for surgical region extraction. The segmentation performance of S-CNN is compared with that of a state-of-the-art image segmentation network (SegNet), and the results demonstrate the effectiveness of the proposed network. The proposed telementoring system is efficient and explicitly considers the physiological nature of the human visual system when encoding the video, providing good overall visual quality at the location of surgery.
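The distance-based quality allocation described above can be illustrated with a minimal sketch. It assumes a binary segmentation mask is already available and maps it to one HEVC quantization parameter (QP) per 64×64 block, where a lower QP means higher quality. The function name `qp_map_from_mask`, the QP values (22, 45, step 3), and the use of chessboard distance are illustrative assumptions, not the allocation scheme of the paper; the sketch also shows only graded lossy quality, not the lossless coding of the incision region.

```python
import numpy as np

def qp_map_from_mask(mask, block=64, qp_roi=22, qp_max=45, qp_step=3):
    """Return one QP per block: lowest QP on blocks touching the ROI,
    rising with block distance from the ROI (all values hypothetical)."""
    h, w = mask.shape
    rows = -(-h // block)  # ceiling division
    cols = -(-w // block)

    # A block counts as ROI if any of its pixels is set in the mask.
    roi = np.zeros((rows, cols), dtype=bool)
    for r in range(rows):
        for c in range(cols):
            tile = mask[r * block:(r + 1) * block, c * block:(c + 1) * block]
            roi[r, c] = tile.any()

    if not roi.any():
        # No ROI detected: fall back to uniform background quality.
        return np.full((rows, cols), qp_max, dtype=int)

    # Chessboard distance from every block to its nearest ROI block.
    roi_r, roi_c = np.nonzero(roi)
    grid_r, grid_c = np.indices((rows, cols))
    dr = np.abs(grid_r[..., None] - roi_r)
    dc = np.abs(grid_c[..., None] - roi_c)
    dist = np.maximum(dr, dc).min(axis=-1)

    # QP grows linearly with distance and is clipped at qp_max.
    return np.minimum(qp_roi + qp_step * dist, qp_max).astype(int)

if __name__ == "__main__":
    # Hypothetical 1080p frame with a rectangular incision region.
    frame_mask = np.zeros((1080, 1920), dtype=bool)
    frame_mask[400:600, 800:1100] = True
    print(qp_map_from_mask(frame_mask))
```

A per-block map of this kind would then have to be passed to an encoder that supports block-level QP control; how that is done depends on the encoder and is not covered here.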
The proposed S-CNN–based segmentation achieved a pixel accuracy of 97% and a mean intersection over union (IoU) of 79%. The HEVC experiments showed that the proposed surgical region–based encoding scheme achieved an average bitrate reduction of 88.8% at high-quality settings in comparison with default full-frame HEVC encoding. The average gain in encoding performance (signal-to-noise) of the proposed algorithm is 11.5 dB in the surgical region. For a fair comparison, the bitrate saving and visual quality of the proposed optimal bit allocation scheme are compared with those of a mean shift segmentation–based coding scheme. The results show that the proposed scheme maintains high visual quality in the surgical incision region while achieving good bitrate saving. Based on these comparisons, the proposed encoding algorithm can be considered an efficient and effective solution for surgical telementoring over low-bandwidth networks.
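For readers unfamiliar with the two segmentation metrics reported above, the following minimal sketch shows how pixel accuracy and mean IoU are commonly computed from a predicted label map and a ground-truth label map. The function names and the two-class setup (background = 0, surgical region = 1) are assumptions for illustration; the evaluation protocol of the paper may differ in detail.

```python
import numpy as np

def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label equals the ground truth."""
    return float((pred == truth).mean())

def mean_iou(pred, truth, num_classes=2):
    """Average of per-class intersection over union.
    Classes absent from both maps are skipped to avoid dividing by zero."""
    ious = []
    for k in range(num_classes):
        inter = np.logical_and(pred == k, truth == k).sum()
        union = np.logical_or(pred == k, truth == k).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

if __name__ == "__main__":
    # Tiny hypothetical label maps: 0 = background, 1 = surgical region.
    truth = np.array([[0, 0], [1, 1]])
    pred = np.array([[0, 1], [1, 1]])
    print(pixel_accuracy(pred, truth), mean_iou(pred, truth))
```

Pixel accuracy can look high when the background dominates the frame, which is one reason mean IoU is usually reported alongside it.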



  1. Ereso AQ, Garcia P, Tseng E, Gauger G, Kim H, Dua MM, Guy TS: Live transference of surgical subspecialty skills using telerobotic proctoring to remote general surgeons. J Am Coll Surg 211(3):400–411, 2010
  2. Poropatich R, Lappan C, Lam D: Operational Use of US Army Telemedicine Information Systems in Iraq and Afghanistan - Considerations for NATO Operations, Telemedicine for Trauma, Emergencies, and Disaster Management, Artech House, 173–182, 2010
  3. Rosser JC, Young SM, Klonsky J: Telementoring: an application whose time has come. Surg Endosc 21(8):1458–1463, 2007
  4. Rayman R, Croome K, Galbraith N, McClure R, Morady R, Peterson S, Primak S: Robotic telesurgery: a real-world comparison of ground- and satellite-based internet performance. Int J Med Robot Comput Assisted Surg 3(2):111–116, 2007
  5. Sullivan GJ, Ohm J, Han WJ, Wiegand T: Overview of the high efficiency video coding (HEVC) standard. IEEE Trans Circuits Syst Video Technol 22(12):1649–1668, 2012
  6. Ostermann J, Bormans J, List P, Marpe D, Narroschke M, Pereira F, Wedi T: Video coding with H.264/AVC: tools, performance, and complexity. IEEE Circuits Syst Mag 4(1):7–28, 2004
  7. Nemcic O, Vranjes M, Rimac-Drlje S: Comparison of H.264/AVC and MPEG-4 Part 2 coded video. IEEE, In ELMAR, 41–44, 2007
  8. Marpe D, Wiegand T, Sullivan GJ: The H.264/MPEG4 advanced video coding standard and its applications. IEEE Commun Mag 44(8):134–143, 2006
  9. Kim H, Rhee CE, Lee HJ: A low-power video recording system with multiple operation modes for H.264 and light-weight compression. IEEE Trans Multimedia 18(4):603–613, 2016
  10. Ogawa K, Ohtake G: Watermarking for HEVC/H.265 stream. In Consumer Electronics (ICCE), IEEE International Conference, 102–103, 2015
  11. Antoniou ZC, Panayides AS, Pantziaris M, Constantinides AG, Pattichis CS, Pattichis MS: Real-time adaptation to time-varying constraints for medical video communications. IEEE J Biomed Health Informat, 2017
  12. Doyle TE, Musson DM, Schwering T: Cognitive priority model for advanced telemedical support in Limited Bandwidth Applications. In Electrical and Computer Engineering (CCECE), IEEE 30th Canadian Conference, 1–6, 2017
  13. Panayides AS, Pattichis MS, Constantinides AG, Pattichis CS: M-health medical video communication systems: an overview of design approaches and recent advances. In Engineering in Medicine and Biology Society (EMBC), 35th Annual International Conference of the IEEE, 7253–7256, 2013
  14. Sanchez V, Bartrina-Rapesta J: Lossless compression of medical images based on HEVC intra coding. In Acoustics, Speech and Signal Processing (ICASSP), IEEE International Conference, 6622–6626, 2014
  15. Rad RM, Saeedi P, Bajic I: Automatic cleavage detection in H.264 sequence of human embryo development. In Electrical and Computer Engineering (CCECE), IEEE Canadian Conference, 1–4, 2016
  16. Neri A, Battisti F, Carli M, Salatino M, Goffredo M, D’Alessio T: Perceptually lossless ultrasound video coding for telemedicine applications. In Proc. Int. workshop video process. Quality Metrics, 2007
  17. Tahir M, Ul-Abdin Z, Qadir MA: Enhancing the HEVC video analyzer for medical diagnostic videos. In High-Capacity Optical Networks and Enabling/Emerging Technologies (HONET), 12th International Conference, 1–5, 2015
  18. Jangbari P, Patel D: Review on region of interest coding techniques for medical image compression. Int J Comput Appl (0975–8887) 134(10):1–5, 2016
  19. Ul-Abdin Z, Shafique M, Qadir MA: Telemedicine Aware Video Coding Under Very-Low Bitrates. International Conference on Health Informatics and Medical Systems, HIMS'16, 130–136, 2016
  20. He J, Yang F: Efficient frame-level bit allocation algorithm for H.265/HEVC. IET Image Process 11(4):245–257, 2017
  21. Chien WD, Liao KY, Yang JF: H.264-based hierarchical two-layer lossless video coding method. IET Signal Process 8(1):21–29, 2014
  22. Diaz R, Blinstein S, Qu S: Integrating HEVC Video Compression with a High Dynamic Range Video Pipeline. SMPTE Motion Imaging J 125(1):14–21, 2016
  23. Pham DL, Xu C, Prince JL: Current methods in medical image segmentation. Annu Rev Biomed Eng 2(1):315–337, 2000
  24. Dhanachandra N, Chanu YJ: A survey on image segmentation methods using clustering techniques. Eur J Eng Res Sci 2(1):15–20, 2017
  25. Comaniciu D, Meer P: Mean shift: a robust approach toward feature space analysis. IEEE Trans Pattern Anal Mach Intell 24(5):603–619, May 2002
  26. Ramya R, Jenitta A: Foot injury detection using K-means clustering and mean shift segmentation algorithm. Int J Adv Res Basic Eng Sci Technol (IJARBEST) 3(24):323–329, 2017
  27. Wang L, Pedersen PC, Strong DM, Tulu B, Agu E, Ignotz R: Smartphone-based wound assessment system for patients with diabetes. IEEE Trans Biomed Eng 62(2):477–488, 2015
  28. Chang MC, Yu T, Luo J, Duan K, Tu P, Zhao Y, Nagraj N, Rajiv V, Priebe M, Wood EA, Stachura M: Multimodal sensor system for pressure ulcer wound assessment and care. IEEE Trans Ind Informat 14(3):1186–1196, 2018
  29. Wannous H, Treuillet S, Lucas Y: Robust tissue classification for reproducible wound assessment in telemedicine environments. J Electron Imaging 19(2):023002, 2010
  30. Nilakant R, Menon HP, Vikram K: A survey on advanced segmentation techniques for brain MRI image segmentation. Int J Adv Sci Eng Inf Technol 7(4):1448–1456, 2017
  31. Krizhevsky A, Sutskever I, Hinton GE: Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, 1097–1105, 2012
  32. Long J, Shelhamer E, Darrell T: Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition, 3431–3440, 2015
  33. Goyal M, Reeves ND, Davison AK, Rajbhandari S, Spragg J, Yap MH: DFUNet: Convolutional Neural Networks for Diabetic Foot Ulcer Classification. arXiv preprint arXiv:1711.10448, 2017
  34. Badea MS, Felea II, Florea LM, Vertan C: The use of deep learning in image segmentation, classification and detection. arXiv preprint arXiv:1605.09612, 2016
  35. Rajchl M, Lee MC, Oktay O, Kamnitsas K, Passerat-Palmbach J, Bai W, Rueckert D: DeepCut: object segmentation from bounding box annotations using convolutional neural networks. IEEE Trans Med Imaging 36(2):674–683, 2017
  36. Chaichulee S, Villarroel M, Jorge J, Arteta C, Green G, McCormick K, Tarassenko L: Multi-task Convolutional Neural Network for Patient Detection and Skin Segmentation in Continuous Non-contact Vital Sign Monitoring. In Automatic Face and Gesture Recognition (FG 2017), 12th IEEE International Conference, 266–272, 2017
  37. Simonyan K, Zisserman A: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014
  38. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A: Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, 1–9, 2015
  39. He K, Zhang X, Ren S, Sun J: Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, 770–778, 2016
  40. Ronneberger O, Fischer P, Brox T: U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical image computing and computer-assisted intervention. Cham: Springer, 2015, pp. 234–241
  41. Badrinarayanan V, Kendall A, Cipolla R: Segnet: A deep convolutional encoder-decoder architecture for image segmentation. arXiv preprint arXiv:1511.00561, 2015
  42. Xu M, Deng X, Li S, Wang Z: Region-of-interest based conversational HEVC coding with hierarchical perception model of face. IEEE J Sel Top Sign Process 8(3):475–489, 2014
  43. Chai D, Ngan KN, Bouzerdoum A: Foreground/background bit allocation for region-of-interest coding. In Image Processing, Proceedings. IEEE International Conference on September 2000, 2:923–926, 2000
  44. Gokturk SB, Tomasi C, Girod B, Beaulieu C: Medical image compression based on region of interest, with application to colon CT images. In Engineering in Medicine and Biology Society, 2001. Proceedings of the 23rd Annual International Conference of the IEEE, 3:2453–2456, 2001
  45. Yu H, Lin Z, Pan F: Applications and improvement of H.264 in medical video compression. IEEE Trans Circuits Syst I Reg Pap 52(12):2707–2716, 2005
  46. Wu Y, Liu P, Gao Y, Jia K: Medical ultrasound video coding with H.265/HEVC based on ROI extraction. PLoS One 11(11):e0165698, 2016
  47. Khire S, Robertson S, Jayant N, Wood EA, Stachura ME, Goksel T: Region-of-interest video coding for enabling surgical telementoring in low-bandwidth scenarios. In Military Communications Conference, 2012-Milcom, 1–6, 2012
  48. Grois D, Kaminsky E, Hadar O: ROI adaptive scalable video coding for limited bandwidth wireless networks. In Wireless Days (WD), October 2010 IFIP, 1–5. IEEE, 2010
  49. Barsakar T, Mankar V: A novel approach for medical video compression using kernel based meanshift ROI coding techniques. In Advances in Signal Processing (CASP), Conference, IEEE, 212–216, 2016
  50. Wei H, Zhou X, Zhou W, Yan C, Duan Z, Shan N: Visual saliency based perceptual video coding in HEVC. In Circuits and Systems (ISCAS), IEEE International Symposium, 2547–2550, 2016
  51. Wong ACW, Kwok YK: On a region-of-interest based approach to robust wireless video transmission. In Parallel Architectures, Algorithms and Networks, May 2004. Proceedings. 7th International Symposium, IEEE, 385–390, 2004
  52. LeCun Y, Bengio Y, Hinton G: Deep learning. Nature 521(7553):436, 2015
  53. Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, Van Der Laak JA, Van Ginneken B, Sánchez CI: A survey on deep learning in medical image analysis. Med Image Anal 42:60–88, 2017
  54. Wang C, Yan X, Smith M, Kochhar K, Rubin M, Warren SM, Lee H: A unified framework for automatic wound segmentation and analysis with deep convolutional neural networks. In Engineering in Medicine and Biology Society (EMBC), 37th Annual International Conference of the IEEE, 2415–2418, 2015
  55. Shenoy VN, Foster E, Aalami L, Majeed B, Aalami O: Deepwound: Automated Postoperative Wound Assessment and Surgical Site Surveillance through Convolutional Neural Networks. In IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 1017–1021, 2018
  56. Zeiler MD, Fergus R: Visualizing and understanding convolutional networks. In: European conference on computer vision. Cham: Springer, 2014, pp. 818–833
  57. Anthimopoulos M, Christodoulidis S, Ebner L, Christe A, Mougiakakou S: Lung pattern classification for interstitial lung diseases using a deep convolutional neural network. IEEE Trans Med Imaging 35(5):1207–1216, 2016
  58. Shin HC, Roth HR, Gao M, Lu L, Xu Z, Nogues I, Summers RM: Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging 35(5):1285–1298, 2016
  59. Oliveira GL, Bollen C, Burgard W, Brox T: Efficient and robust deep networks for semantic segmentation. Int J Robot Res 0278364917710542, 2017
  60. Nair V, Hinton GE: Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), 807–814, 2010
  61. Bottou L: Stochastic gradient descent tricks. In: Neural networks: tricks of the trade. Berlin: Springer, 2012, pp. 421–436
  62. Sze V, Budagavi M: High throughput CABAC entropy coding in HEVC. IEEE Trans Circuits Syst Video Technol 22(12):1778–1791, 2012
  63. Borgefors G: Distance transformations in digital images. Comput Vis Graph Image Process 34(3):344–371, 1986
  64. Zeiler MD: ADADELTA: an adaptive learning rate method. arXiv preprint arXiv:1212.5701, 2012
  65. Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M, Kudlur M: Tensorflow: a system for large-scale machine learning. In OSDI, 16:265–283, 2016
  66. McClellan T: Live Surgery: Small Finger Extensor Tendon Saw Injury Cut Repair. YouTube, Available from: https://youtu.be/3o7cgZsd3bs. Accessed 30.10.17
  67. McClellan T: Live Surgery: Ganglion Cyst Volar Wrist. YouTube, Available from: https://youtu.be/ZgNJ8YDA7dY. Accessed 25.10.17
  68. McClellan T: The digital nerve was cut! The wrinkle test worked and helped child. YouTube, Available from: https://youtu.be/CY1HYIBrAwQ. Accessed 05.11.17
  69. McClellan T: Live Surgery: Running Subcuticular Suture, What is an Intracuticular or Subcuticular Suture?, YouTube, Available from: https://youtu.be/CiW93U-3XcQ. Accessed 10.11.17
  70. McClellan T: Live Surgery: Foreign Body (BB) Removal from Finger. YouTube, Available from: https://youtu.be/DWQ6WX3ImBU. Accessed 25.10.17
  71. McClellan T: Live Surgery: Z-Plasty of Scar Contracture (Finger). YouTube, Available from: https://youtu.be/wdseg3UvXrI. Accessed 25.10.17
  72. McClellan T: Live Surgery: Ganglion Cyst: Flexor Tendon Sheath (Finger). YouTube, Available from: https://youtu.be/hDZBE8tcctE. Accessed 25.10.17
  73. McClellan T: Live Surgery: Flexor Digitorum Profundus (FDP) Finger Tendon Repair). YouTube, Available from: https://youtu.be/boMlEa3P43g. Accessed 25.10.17
  74. Vangelisti MD: Hand Surgery Procedure - NuGrip Arthroplasty (Thumb Arthritis Joint Replacement Surgery). YouTube, Available from: https://youtu.be/YZgDQl5kWFs. Accessed 25.10.17
  75. Asvadi A: K-means, Mean-shift and Normalized-cut segmentation. MathWorks, Available from: https://ww2.mathworks.cn/matlabcentral/fileexchange/52698-k-means-mean-shift-and-normalized-cut-segmentation. Accessed 05.01.19
  76. Carreira-Perpiñán MA: A review of mean-shift algorithms for clustering. arXiv preprint arXiv:1503.00687, 2015
  77. Bjøntegaard G: Calculation of average PSNR differences between RD-curves (VCEG-M33). In VCEG meeting (ITU-T SG16 Q.6), 2001
  78. Hanhart P, Ebrahimi T: Calculation of average coding efficiency based on subjective quality scores. J Vis Commun Image Represent 25(3):555–564, 2014


The authors acknowledge the use of the surgical procedure videos of Dr. Thomas McClellan [66–73] and Dr. Vangelisti MD [74] for the experiments in this research.

Author information



Corresponding author

Correspondence to Syed Ali Tariq.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Hassan, A., Ghafoor, M., Tariq, S.A. et al. High Efficiency Video Coding (HEVC)–Based Surgical Telementoring System Using Shallow Convolutional Neural Network. J Digit Imaging 32, 1027–1043 (2019). https://doi.org/10.1007/s10278-019-00206-2



  • Convolutional neural network (CNN)
  • Deep learning (DL)
  • HEVC
  • Medical imaging
  • Segmentation
  • Telementoring