1 Introduction

Traditional surgical training takes a long time to turn trainees into experts. It is also expensive and resource-intensive, because trainee surgeons must be physically present at every training session. In addition, the COVID-19 pandemic has significantly disrupted such medical training [6]. Moreover, real-time training can be a beneficial learning tool in rural clinics [22]. Tele-training is therefore an emerging concept in surgical education that could address these problems [5].

Tele-training in surgical education integrates teleconferencing with information technology to deliver videoscopic courses. With this remote technique, an experienced surgeon gives live instructions to trainee surgeons, helping them specialize in a specific field of surgery [34]. The main challenge is internet speed at remote locations, which limits the real-time transmission of high-definition (HD) video [2]. As a result, high-quality video frames may not arrive undistorted in a real-time network environment. There is therefore a stringent need for a high transmission rate, reliability, and throughput, together with distortion minimization, while transmitting these video frames [4, 13, 16].

Several systems have been designed for surgical tele-training, such as RingCentral [9], Chiron Health [10], and TrueCare [3]. However, tele-training systems still need to become more effective in transmission quality and time [2]. Priority-Aware and Transmission Control Protocol (TCP) Oriented codiNg (PATON) is one of the state-of-the-art solutions; it combines priority-aware video frame selection with Forward Error Correction (FEC) coding [16]. However, it considers neither video source rate allocation [14] nor the reliability of the video frames [18] while minimizing the total distortion of the video frames. This may lead to low video quality, since video source rate allocation improves the frames reconstructed at the receiver side.

Moreover, bandwidth fluctuations and channel errors occur frequently in unreliable wireless environments; they degrade user-perceived video quality and lead to unreliable frames [16]. The current state-of-the-art solution lacks video quality measures that guarantee the reliability of the video frames while transcoding large video data under the strict delay requirements of real-time applications [19]. It is also limited in that it does not consider video source rate allocation to reduce the total video distortion.

This research aims to improve the quality and goodput of surgical video transmitted in real time and to minimize video distortion during the training session, so as to support better learning interaction. This study proposes an Enhanced Video Quality, Distortion Minimization, Bandwidth efficient, and Reliability Maximization (EVQDMBRM) algorithm to reduce the video frames' distortion and obtain the desired quality of the transmitted video in real time. The proposed algorithm reduces the packet loss ratio during video source rate allocation, reduces transmission loss, and maximizes the transmission reliability of the video frames.

The work in this article runs in parallel with our recently published paper [2]. In [2], we proposed a system named EPQVQaLM (Enhanced Path Quality, Video Quality, and Latency Minimization) that focused on improving the quality of surgical video transmission and reducing the end-to-end transmission delay. Our results in [2] show that EPQVQaLM could improve PSNR and provide a mechanism to distribute data chunks intelligently over multiple paths. However, that system did not use WebRTC-based video conferencing technology for low latency and did not report exact processing times [2]. Compared to [2], the work here concentrates on reducing the total distortion of the video frames in addition to enhancing the quality of the video in a real-time network. We also aim to provide a mechanism that allocates the video rate dynamically at the source. Finally, the proposed work aims to minimize the packet loss ratio and the probing status, which estimates the available bandwidth, as we show in the following sections.

2 Related work

The solutions reviewed in this study are classified into five categories: energy and bandwidth efficiency, video frame scheduling, data protection, FEC coding, and Quality of Service (QoS) and Quality of Experience (QoE) metrics.

2.1 Energy and bandwidth-efficient metric

Three state-of-the-art solutions related to energy and bandwidth efficiency were found. In [17], the Energy Quality-aware Bandwidth Aggregation (ELBA) algorithm was proposed to address high energy consumption, low video quality, and quality constraints in the transmission of real-time video over mobile and heterogeneous wireless systems. The ELBA scheme reduced energy consumption by 35.3%, 52.2%, and 43.3% in both stationary and mobile evaluation scenarios compared with the other available algorithms in terms of PSNR [17].

Wu et al. (2016) proposed Bandwidth Efficient Multipath Streaming (BEMA) for real-time video transmission over multi-path heterogeneous networks, to address the challenge of resource-limited and error-prone wireless networks [13]. BEMA is motivated by two observations: (1) conventional multi-path protocols are throughput-oriented and schedule video data in a content-agnostic fashion; and (2) high-quality real-time video is bandwidth-intensive and delay-sensitive. To address them, the authors used FEC coding, Raptor coding, Multipath TCP (MPTCP), and Stream Control Transmission Protocol (SCTP). They also improved the path quality for data distribution while using bandwidth efficiently. BEMA improved the PSNR by 18.2%, reduced the end-to-end delay by 22.7%, reduced the percentage of overdue videos to 6.3%, increased the goodput by 7.9%, and decreased the bandwidth consumption to 18.9%, which was not achieved by any other multi-path protocol [13]. The same authors [15] then investigated the conflict between energy consumption and streaming quality in delivering high-quality live video between two sites, where battery-constrained mobile devices are the challenging issue. They therefore proposed an Efficient Probabilistic Public-Key Encryption (EPOC) scheme to transmit real-time video over a heterogeneous wireless network to multi-homed terminals. This solution conserves energy by 32.6%, increases goodput by 19.3%, improves PSNR by 22.5%, and reduces end-to-end delay by 31.5% [15].

2.2 Video frame scheduling metric

Regarding video frame scheduling, Wu et al. (2016) investigated the issues that high-quality video data face when transmitted over limited network resources [15]. They proposed the Content-Aware Concurrent Multi-path Transfer (CMT-CA) evaluation framework with the Stream Control Transmission Protocol (SCTP) for quality video streaming. The CMT-CA scheme mitigated total video distortion and improved PSNR by 5.8 [15]. However, it does not consider the amount of energy used by wireless devices, which is very important in multi-path data transfer for real-time video streaming.

Huang et al. (2018) investigated priority-aware interference-avoidance scheduling for multi-user coexisting wireless networks with heterogeneous traffic demands [8]. They proposed a sequential solution framework for priority-aware admission control and throughput maximization, a column-generation-based method for solving the resulting large-scale linear program, and a greedy initialization method for improving computational efficiency [8]. However, if all the priority is given to a user's high-quality videos, the demand of that user's low-priority videos will not be satisfied on time.

Finally, in [19], the PERES (Partial Reliability-based Real-time Streaming) scheme was proposed to strike an effective balance between reliability and delay in real-time video transmission. It improved PSNR, delay, and buffering to achieve the desired high-quality real-time video. As future work, the authors plan to include an energy consumption model to improve power efficiency when transmitting real-time videos on mobile devices, which leaves this area open for research [19].

2.3 Data protection metric

Concerning data protection, Yan et al. [20] found that scheduling coding patterns and their transmission is challenging when maximizing the mean PSNR. They therefore proposed a coding-aided collaborative real-time scalable video transmission algorithm that improves video quality by utilizing the limited wireless resources and increasing the mean PSNR [20]. However, this algorithm is limited to device-to-device networks, and substantial improvements are needed before it can be applied to other networks. Wu et al. (2018) argued that an effective SCTP is critical to improving the quality of the streamed video and the performance of the network [18]. They developed a framework for Raptor coding adaptation, data distribution, and retransmission control by analysing and modelling throughput mathematically in order to formulate the utility maximization problem of multi-path real-time video delivery over parallel wireless networks. Their transmission framework included online packet scheduling, Raptor coding adaptation, and retransmission control algorithms [18].

2.4 FEC coding metric

For FEC coding, He et al. emphasized the huge bandwidth requirements of transmitting Virtual Reality (VR) videos in a real-time network [7]. The authors proposed a scalable video coding method that encodes the Region of Interest (RoI) in high quality and non-RoI regions in low quality. This can help avoid a blank screen, as the videos are encoded at the Basic Layer (BL) and the Enhancement Layer (EL). In their experiments, the scalable coding method reduced the bitrate by approximately 87% without a significant decrease in video quality. However, this method may cause problems with multi-rate adaptation, which is an essential aspect of VR video transmission.

In comparison, the authors in [16] were motivated by the growing popularity of HD video streaming over TCP networks. They developed an FEC coding scheme, dubbed PATON, to maximize the benefits attained while streaming HD video. It selects frames in a priority-aware manner in a multimedia application and makes significant changes to the FEC redundancy adaptation, packet interleaving, and packet size to optimize video quality in a real-time network. After a successful signalling procedure, User Datagram Protocol (UDP)/Internet Protocol (IP) encapsulation with Real-time Transport Protocol (RTP) encapsulation could be used for faster transmission of real-time video.

2.5 Quality of service (QoS), and quality-of-experience (QoE) metric

Finally, work on QoS and QoE has also been considered in our literature review. Sedrati et al. (2017) evaluated two routing protocols in a Mobile Ad hoc Network (MANET), the Multipath Destination Sequenced Distance Vector (MDSDV) and the Modified Ad hoc On-Demand Distance Vector (M-AODV), to improve QoS for real-time multimedia applications [11]. The results showed that M-AODV performs better in throughput and network load under high mobility, whereas MDSDV performs better in network load and reliability for large-scale networks. Moreover, both protocols provide acceptable quality and small jitter regardless of the number of nodes under medium mobility. In addition, the authors in [12] investigated how to guarantee QoS for large video data and proposed a Cloud-Based Online Video Transcoding (COVT) system. This solution is economical and QoS-aware while transcoding online video in a cloud environment. It guaranteed the QoS for transcoding small chunk sizes through resource provisioning rather than physical resources [12]. However, guaranteeing QoS for larger video chunks remains uncertain, as the number of Central Processing Unit (CPU) cores used by this method is up to 47% less than the peak load. This leaves a gap for improving the solution in future work.

Bernardo et al. (2016) found that various factors, such as video characteristics, link quality, and device capabilities, affect video quality [1]. Based on these factors, they developed the Evaluation Framework to Assess Video Transmission Energy Consumption and Quality (EViTEQ) to establish an appropriate relationship between energy consumption and the video quality received by end-users. The authors in [1] explained how the quality of the received videos could affect battery life and the perceived quality of experience. However, there was still a need to develop an enhanced energy-saving mechanism and to create and improve simulation models for QoE and energy consumption.

Moreover, the authors in [2] proposed a system named EPQVQaLM (Enhanced Path Quality, Video Quality, and Latency Minimization) that focused on improving the quality of surgical video transmission and reducing the end-to-end transmission delay. Their results show that EPQVQaLM could improve PSNR and provide a mechanism to distribute data chunks intelligently over multiple paths. However, the proposed system did not use WebRTC-based video conferencing technology for low latency, and it needs to report processing time in more detail [2].

Furthermore, Wu et al. discussed that high-quality mobile videos are challenging to deliver over limited radio resources because of stringent QoS requirements and time-varying channel status [15]. They therefore proposed a video distortion estimation model for online analysis, based on the input video data and the status of the feedback channel, together with a quality-controlled differentiated protection scheme that adjusts the FEC coding redundancy adaptively. This work studied protection schemes for Intra (I) and Predicted (P) frames, but real-time implementation with QoE remains challenging.

Finally, Zheng et al. [21] argued that end-users are concerned only with the transmission quality and not with the quality of the link used by an individual task. The Remaining Time-Based Maximal (RTBM) scheduling policy was therefore proposed to move link scheduling from a throughput-optimal, QoS-oriented problem to a QoE-based, application-aware one. The application-layer QoE is derived from the QoS, which originates from the network layer. However, there is still a gap concerning QoE and scheduling policies.

3 State of art

Wu et al. in [16] presented a Priority-Aware and TCP-Oriented Coding (PATON) scheduling framework to minimize the total distortion. In this framework, HD data for mobile videos are coded and protected against channel losses at the frame level using Reed Solomon Coding [16]. The packets are allocated over multiple real-time wireless access networks (Fig. 1). Table 1 presents the pseudo-code of the state of the art solution. Accordingly, the research of [16] is selected as a baseline for the proposed method in this paper.

Fig. 1 Block diagram of the state of the art solution (Wu et al., [16]). The blue borders refer to the good features of this framework, while the red border refers to its limitation

Table 1 Priority-Aware and TCP-Oriented Coding (PATON) Algorithm

PATON guarantees low-delay HD mobile video delivery by effectively leveraging the frame priority and the TCP connection state to achieve the desired video quality over a heterogeneous wireless network. It first performs a priority-aware frame selection to enhance the video quality and then models the end-to-end delay of the TCP connection. It then adapts the FEC redundancy level and packet size to minimize the effective data loss rate [16]. In this way, it improves the average video PSNR by 23.4% and reduces the end-to-end delay by 35.5%. The state of the art solution consists of two components, i.e., sender and receiver [16].

On the sender side, videos are captured by a video application and then encoded with the H.264 codec standard. The resulting video frames are scheduled by the frame selection step, which adapts the video traffic load based on frame priority and on the network status information provided by the network status monitor at the receiver [16]. Following these decision processes, the selected video frames are converted into FEC packets and transmitted to the destination through a TCP socket. The FEC redundancy and the packet size are updated in the FEC coder at each decision epoch [16].
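The sender-side frame selection can be illustrated with a minimal sketch. It is not the PATON algorithm of [16]: the `Frame` record, the integer priority weights, and the byte budget are hypothetical, and the greedy rule simply keeps the most important frames that fit into the budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    index: int      # position of the frame inside the GoP
    kind: str       # "I", "P" or "B" (illustrative labels)
    priority: int   # hypothetical weight: larger means more important
    size: int       # encoded size in bytes


def select_frames(frames, budget_bytes):
    """Greedily keep the highest-priority frames that fit in the budget."""
    chosen = []
    remaining = budget_bytes
    # Visit frames from most to least important; ties keep display order.
    for frame in sorted(frames, key=lambda f: (-f.priority, f.index)):
        if frame.size <= remaining:
            chosen.append(frame)
            remaining -= frame.size
    # Hand the surviving frames on in display order.
    return sorted(chosen, key=lambda f: f.index)


gop = [
    Frame(0, "I", 3, 9000),
    Frame(1, "B", 1, 1500),
    Frame(2, "P", 2, 4000),
    Frame(3, "B", 1, 1500),
    Frame(4, "P", 2, 4000),
]
selected = select_frames(gop, budget_bytes=18000)
print([f.index for f in selected])  # [0, 2, 4]
```

With an 18 000-byte budget the I frame and both P frames are kept, and the two B frames are dropped because only 1000 bytes remain.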

On the receiver side, once the packets are received, TCP trace runs an estimation algorithm that informs the network status monitor of the channel loss rate and the round-trip time for network status estimation. This feedback information is then used during frame scheduling at the sender side. The TCP listener passes the packets to the packet filter, where packets that have missed their deadline are dropped and the remaining packets are decoded by the FEC decoder. Then, the video frames are reassembled, the errors are concealed, and the video decoder displays the video for the receiver at the remote site [16].
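The role of the packet filter can be sketched as follows. This is only an illustration: the `Packet` fields, the 150 ms deadline, and the assumption that sender and receiver clocks are synchronized are not taken from [16].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    seq: int           # sequence number
    sent_at: float     # sender timestamp in seconds (assumed synchronized clock)
    arrived_at: float  # receiver timestamp in seconds


def filter_overdue(packets, deadline_s):
    """Split packets into (on_time, overdue) by their one-way delay."""
    on_time, overdue = [], []
    for pkt in packets:
        delay = pkt.arrived_at - pkt.sent_at
        if delay <= deadline_s:
            on_time.append(pkt)   # passed on to the FEC decoder
        else:
            overdue.append(pkt)   # dropped: too late to be displayed
    return on_time, overdue


packets = [
    Packet(1, 0.00, 0.08),
    Packet(2, 0.04, 0.30),
    Packet(3, 0.08, 0.20),
]
on_time, overdue = filter_overdue(packets, deadline_s=0.15)
print([p.seq for p in on_time], [p.seq for p in overdue])  # [1, 3] [2]
```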

Based on our analysis, we found some limitations in the state of the art [16]. It uses a TCP socket and models the end-to-end connection delay with two main components: network-level delay and TCP-level delay. The current solution in [16] improves the video quality, measured in PSNR, by minimizing the total distortion of the video frames. It prioritizes the high-priority frames and drops the low-priority frames. The authors in [16] considered the available bandwidth, packet loss rate, and transmission loss of those frames. However, they did not consider the reliability of the frames when distributing the packets to the receiver over a TCP socket in order to reduce the distortion of the videos. TCP is characterized by a high transmission rate, but its connection orientation results in frequent throughput fluctuations and deadline violations. This limitation is overcome in [19] by the Partial Reliability-based Real-time Streaming (PERES) scheme, which performs partially reliable transfer.

This scheme employs a UDP socket, which is not connection-oriented: it sends the frames directly without establishing an end-to-end connection. However, it retains partial reliability as a TCP-like feature, and it measures the reliability of the video frames before transmitting the packets to the receiver [19]. Moreover, instead of only prioritizing the video frames, the video distortion can be further reduced if the flow rate of the video frames is allocated dynamically at the video source [15]. This limitation is overcome by delAy Stringent COded Transmission (ASCOT), where the rate allocation reduces the packet loss. If only reliable video frames are chosen for transmission, the PSNR of the streaming video, and hence the video quality, could be improved. The current solution [16] enhanced the video quality by improving average PSNR by 23.6%, reducing end-to-end delay by 35.5%, and improving goodput by 13.5%, achieving these performance gains at a time cost of 87.2 ms.

The PATON framework is implemented to reduce the total distortion of the video frames, as shown in Eq. 1 [16]. However, the video quality can be improved by improving PSNR and goodput and reducing the end-to-end delay, as we will present in our proposed solution (Section 4).

$$ d_m=\sum_{m=2}^{M}t_m\left[R(m)\right]+f_m\left[R(m)\right] \tag{1} $$

Where,

\( d \): total distortion of the video frames.

\( M \): number of frames that each GoP consists of.

\( m \): frame index (\( 1 \le m \le M \)).

\( t_m \): truncation distortion.

\( f_m \): drifting distortion.

\( R \): number of parity packets for all the frames.
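As a worked illustration of Eq. 1, the sketch below sums the truncation and drifting distortion over frames 2 to M. It assumes that both terms belong inside the sum and that the per-frame values \( t_m[R(m)] \) and \( f_m[R(m)] \) have already been evaluated; the numbers are made up for the example.

```python
def total_distortion(truncation, drifting):
    """Sum t_m + f_m over frames m = 2..M, following Eq. 1.

    Element 0 of each list belongs to frame m = 1, which Eq. 1 skips.
    """
    if len(truncation) != len(drifting):
        raise ValueError("need one truncation and one drifting value per frame")
    return sum(t + f for t, f in zip(truncation[1:], drifting[1:]))


# Illustrative per-frame values for a GoP with M = 4 frames.
t = [0.0, 2.0, 1.5, 3.0]
f = [0.0, 0.5, 0.25, 1.0]
print(total_distortion(t, f))  # 8.25
```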

The number of parity packets for all the frames (R) can be calculated using Eq. 2 [17]:

$$ R=\left[\frac{\mu \cdot \left(1-\pi_B\right)\cdot \left(M-1\right)\cdot T}{S}\right]-n_1-\sum_{m=2}^{M}\left[\frac{S_m}{S}\right] \tag{2} $$

Where,

\( R \): number of parity packets for all the frames.

\( \mu \): available bandwidth.

\( \pi_B \): packet loss ratio.

\( M \): number of frames that each GoP consists of.

\( T \): the delay constraint for each video frame.

\( n \): number of data packets in an FEC block.

\( S_m \), \( S \): the size of the m-th frame, FEC packet size.
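Eq. 2 can be evaluated numerically as in the sketch below. Two points are assumptions of this example rather than statements from [16] or [17]: the first square bracket is read as a floor and the per-frame bracket as a ceiling, and the bandwidth is expressed in bytes per second so that it matches a packet size in bytes. The input values are illustrative.

```python
import math


def parity_packet_budget(mu, pi_b, M, T, S, n1, frame_sizes):
    """Number of parity packets R according to Eq. 2.

    mu          available bandwidth in bytes per second (assumed unit)
    pi_b        packet loss ratio
    M           number of frames per GoP
    T           delay constraint per frame in seconds
    S           FEC packet size in bytes
    n1          the term n_1 as it appears in Eq. 2
    frame_sizes sizes S_m in bytes for frames m = 2..M
    """
    # Packets that fit into the delay budget after discounting loss (floor assumed).
    capacity = math.floor(mu * (1.0 - pi_b) * (M - 1) * T / S)
    # Source packets needed per frame (ceiling assumed).
    data_packets = sum(math.ceil(s_m / S) for s_m in frame_sizes)
    return capacity - n1 - data_packets


R = parity_packet_budget(
    mu=256_000, pi_b=0.0625, M=5, T=0.125, S=1000,
    n1=9, frame_sizes=[1500, 4000, 1500, 4000],
)
print(R)  # 99
```

Here the delay budget admits 120 packets, of which 9 and 12 are taken by source data, leaving 99 for parity.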

4 The proposed system

The PSNR, end-to-end delay, and processing time are the major performance indicators considered in our work. End-to-end delay is the time required to complete the encoding of the video data, i.e., the transcoding delay. Parameters such as bandwidth, round trip time, flow rate allocation, transmission time, and reliability are calculated while encoding the video data and are presented in Table 4.
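For reference, PSNR follows the standard definition \( 10\log_{10}(\mathrm{MAX}^2/\mathrm{MSE}) \). The sketch below computes it for two short lists of 8-bit pixel values; the pixel values are illustrative and the helper is not part of the simulation code used in this work.

```python
import math


def psnr(reference, received, max_value=255):
    """PSNR in dB between two equally sized sequences of pixel values."""
    if len(reference) != len(received) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, received)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


ref = [52, 55, 61, 59]
rx = [54, 55, 60, 57]
print(round(psnr(ref, rx), 2))  # 44.61
```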

This paper aims to propose a solution that reduces the total distortion of the video. The method proposed in [16] is therefore selected from among the methods covered in Section 2. The total distortion of a video frame can be understood as quality degradation. It depends on the truncation distortion, which is caused by the transmission loss and the packet loss ratio, and on the drifting distortion, which is caused by imperfectly reconstructed video frames. The drifting distortion arises when data missing in one frame (the parent frame) is propagated to subsequent frames by inter-frame prediction. As shown in Table 1, which presents the PATON algorithm, the distortion of each frame is calculated once transmission losses are experienced.

The proposed solution considers a single video transmission flow from the local site to the remote site over a modified UDP socket. The encoded video is divided into several chunks, which are dispatched using a flow rate that is allocated dynamically at the video source. This feature is entirely new compared with the state of the art solution in [16], and introducing it can further reduce the total distortion of the video frames. The proposed system also considers the reliability of the video frames: it maximizes their transmission reliability, alongside the dynamic flow rate allocation at the source, to guarantee enhanced video quality. This, too, is a new feature that was not available in [16].
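The idea of choosing a source rate from network feedback and then cutting the encoded stream into chunks can be sketched as follows. The allocation rule (loss-discounted bandwidth, clamped to an encoder range) and all parameter names are hypothetical; they are not the rate allocation vectors of [15] or the EVQDMBRM algorithm itself.

```python
def allocate_source_rate(available_bw, loss_ratio, min_rate, max_rate):
    """Pick a source rate in bit/s from the latest network feedback.

    Hypothetical rule: use the loss-discounted bandwidth, clamped to the
    range the encoder is assumed to support.
    """
    usable = available_bw * (1.0 - loss_ratio)
    return max(min_rate, min(max_rate, usable))


def split_into_chunks(payload: bytes, chunk_size: int):
    """Cut an encoded byte stream into fixed-size chunks (the last may be shorter)."""
    return [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]


rate = allocate_source_rate(
    available_bw=4_000_000, loss_ratio=0.25,
    min_rate=500_000, max_rate=2_500_000,
)
chunks = split_into_chunks(b"x" * 2500, chunk_size=1000)
print(rate, [len(c) for c in chunks])  # 2500000 [1000, 1000, 500]
```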

The proposed system has two major components: the local site, which represents the surgery and expert surgeon site, and the remote site, which represents the medical student or trainee surgeon site (Fig. 2).

Fig. 2 Block diagram of the proposed system to improve video quality using EVQDMBRM Algorithm. The green borders refer to the new parts in our proposed system

Local site (surgery and expert surgeon site)

Augmented videos are created at the local site. They are encoded with the H.264 codec standard before they are sent to the remote site. The encoded video frames are scheduled by the frame selection step, which dynamically adapts the video traffic load based on the network status information provided by the network status monitor at the remote site (receiver). The selected video frames are then converted into FEC packets and transmitted to the remote site through a modified UDP socket. At the same time, the FEC redundancy and the packet size are updated in the FEC coder at each decision epoch. Before the packets are distributed over the modified UDP sockets, the proposed Enhanced Video Quality Distortion Minimization Bandwidth efficient and Reliability maximization (EVQDMBRM) algorithm is used to minimize the distortion of the video frames. EVQDMBRM also improves video quality in real-time transmission by reducing the packet loss ratio during video source rate allocation, reducing the transmission loss, and maximizing the transmission reliability of the video frames (Fig. 2).
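To make the FEC step concrete, the sketch below protects a frame with a single XOR parity packet. This is deliberately simpler than the Reed-Solomon coding used by the state of the art [16]: one parity packet can repair only one lost packet per block, and the block size `k` and the payload are illustrative.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def fec_encode(data: bytes, k: int):
    """Split data into k equal-size source packets plus one XOR parity packet."""
    size = -(-len(data) // k)             # ceiling division
    padded = data.ljust(size * k, b"\0")  # zero-pad the tail
    packets = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, packets)
    return packets, parity


def fec_recover(packets, parity):
    """Rebuild at most one missing packet (marked None) from the parity."""
    missing = [i for i, p in enumerate(packets) if p is None]
    if len(missing) > 1:
        raise ValueError("a single parity packet can repair only one loss")
    repaired = list(packets)
    if missing:
        present = [p for p in packets if p is not None]
        repaired[missing[0]] = reduce(xor_bytes, present, parity)
    return repaired


frame = b"surgical-frame-payload"
source, parity = fec_encode(frame, k=4)
received = [source[0], None, source[2], source[3]]  # packet 1 lost in transit
repaired = fec_recover(received, parity)
restored = b"".join(repaired)[:len(frame)]
print(restored == frame)  # True
```

A real deployment would raise or lower the number of parity packets per block at each decision epoch, which is what adapting the FEC redundancy refers to.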

Remote site (medical student or trainee surgeon site)

At the remote site, once the packets are received, UDP trace runs an estimation algorithm that informs the network status monitor of the channel loss rate and the round-trip time for network status estimation. This feedback information is then used during frame scheduling at the sender side. The UDP listener passes the packets to the packet filter, where packets that have missed their deadline are dropped and the remaining packets are decoded by the FEC decoder. The video frames are then reassembled, and errors are concealed. Finally, the video decoder displays the video for the receiver at the remote site [16] (Fig. 2).
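One possible shape for the network status monitor is sketched below. It is an assumption-laden illustration, not the estimation algorithm of [16]: the round-trip time is smoothed with an exponentially weighted moving average using the gain 1/8 known from TCP's standard RTT estimator, the loss rate is inferred from gaps in the sequence numbers, and duplicates or reordering are ignored.

```python
class NetworkStatusMonitor:
    """Tracks a smoothed RTT and a loss rate from receiver-side observations."""

    def __init__(self, alpha=0.125):
        self.alpha = alpha      # smoothing gain (assumed value)
        self.srtt = None        # smoothed round-trip time in seconds
        self.received = 0       # packets seen so far
        self.highest_seq = -1   # largest sequence number seen so far

    def on_packet(self, seq, rtt_sample):
        """Record one arriving packet and its RTT sample."""
        self.received += 1
        self.highest_seq = max(self.highest_seq, seq)
        if self.srtt is None:
            self.srtt = rtt_sample
        else:
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * rtt_sample

    def loss_rate(self):
        """Fraction of expected packets that never arrived."""
        expected = self.highest_seq + 1  # sequence numbers start at 0
        if expected == 0:
            return 0.0
        return 1.0 - self.received / expected


monitor = NetworkStatusMonitor()
for seq, rtt in [(0, 0.125), (1, 0.125), (3, 0.25)]:  # packet 2 never arrives
    monitor.on_packet(seq, rtt)
print(monitor.loss_rate(), monitor.srtt)  # 0.25 0.140625
```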

4.1 Proposed equations

The state of the art solution does not calculate the rate allocation at the video source, although it is an essential parameter that plays a vital role in minimizing the total distortion of the video frames while streaming. This limitation is overcome using ASCOT, where the rate allocation is considered together with the packet loss ratio when calculating the truncation distortion [15]. The flow rate allocation in the proposed equation is adapted from [15] and uses the rate allocation vectors to reduce the end-to-end distortion of the video frames. The total distortion of the video can be expressed as in Eq. 3:

$$ d_m=\sum_{m=r+1}^{r-1}\frac{\delta_m+\pi_m}{r}+y_m \tag{3} $$

Where,

\( d_m \): drifting distortion of the m frame.

\( r \): flow rate allocation.

\( m \): frame index (\( 1 \le m \le M \)).

\( M \): number of frames that each GoP consists of.

\( \delta_m \): additional distortion of the m frame.

\( \pi_m \): packet loss ratio of the m frame.

\( y_m \): drifting distortion of the m frame.

This paper employed Eqs. 1, 2 and 3 to present Eq. 4 to calculate the flow rate allocation. Here, the Modified Total Distortion (Mdm) is shown as:

$$ Md_m=\sum_{m=2}^{M}t_m\left[\sum_{m=r+1}^{r-1}\frac{\delta_m+\pi_m}{r}\right]+f_m\left[R(m)\right] \tag{4} $$

Where,

\( M \): number of frames that each GoP consists of.

\( d_m \): drifting distortion of the m frame.

\( m \): frame index (\( 1 \le m \le M \)).

\( t_m \): truncation distortion.

\( f_m \): drifting distortion.

\( R \): total number of introduced FEC parity packets for (M-1) frames.

\( R(m) \): sum of redundant packets for the m-th frame.

\( r \): flow rate allocation.

\( \pi_m \): packet loss ratio of the m frame.

\( \delta_m \): additional distortion of the m frame.

The reliability performance of the video frame can be presented in Eq. 5 [18].

$$ R_f^{\prime}(m)=\sum_{f=0}^{n_f}\frac{n_f-\mathbbm{E}\left[L\left(n_f+N\right)\right]}{n_f} \tag{5} $$

Where,

\( R_f^{\prime} \): Reliability of the f-th frame.

\( \mathbbm{E} \): the probability, expectation value.

\( L \) (\( 0 \le L \le n_f \)): number of lost packets.

\( N \) (\( N \in [0,\infty] \)): number of retransmissions.

\( n_f \): number of video packets for the f-th frame.
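The reliability ratio inside Eq. 5 can be illustrated for a single frame. The sketch evaluates only the summand \( (n_f-\mathbbm{E}[L])/n_f \) and needs a loss model to obtain \( \mathbbm{E}[L] \); here it assumes, purely for illustration, that every transmission attempt is lost independently with probability `p_loss` and that each packet may be retransmitted up to `retransmissions` times. This loss model is not taken from [18].

```python
def expected_lost_packets(n_f, p_loss, retransmissions):
    """Expected number of packets still missing after all attempts.

    Assumed model: a packet is lost only if the first transmission and
    every retransmission fail independently with probability p_loss.
    """
    return n_f * p_loss ** (retransmissions + 1)


def frame_reliability(n_f, p_loss, retransmissions):
    """Share of the frame's packets expected to arrive: (n_f - E[L]) / n_f."""
    if n_f <= 0:
        raise ValueError("a frame needs at least one packet")
    lost = expected_lost_packets(n_f, p_loss, retransmissions)
    return (n_f - lost) / n_f


print(frame_reliability(8, 0.25, retransmissions=0))  # 0.75
print(frame_reliability(8, 0.25, retransmissions=1))  # 0.9375
```

Under this model a single retransmission raises the reliability of an eight-packet frame from 0.75 to 0.9375 at a 25% loss probability.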

We propose the final enhanced total distortion (Edm), as shown in Eq. 6. The proposed equation is designed to measure and reduce the total distortion of the video frames by minimizing the packet loss ratio, improving the flow rate allocation, and increasing the reliability of the video frames. The goal is to improve the quality of the video so that fewer distorted video frames reach the receiver side. Eq. 4 reduces the total distortion by reducing the packet loss ratio during video source rate allocation. Eq. 6 additionally accounts for the transmission reliability of the video frames: the modified total distortion is divided by the reliability, because the total distortion is inversely proportional to the reliability of the video frames.

$$ Ed_m=\frac{Md_m}{R_f^{\prime}(m)} \tag{6} $$

Where,

\( Ed_m \): Enhanced Total distortion.

\( Md_m \): Modified Total distortion.

\( R_f^{\prime}(m) \): Reliability of all the f-th frame.
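The inverse relationship in Eq. 6 can be checked numerically. The sketch assumes that the reliability is a ratio in the interval (0, 1] and uses made-up values for the modified total distortion.

```python
def enhanced_total_distortion(md_m, reliability):
    """Ed_m = Md_m / R'_f(m), following Eq. 6."""
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability is assumed to lie in (0, 1]")
    return md_m / reliability


md = 6.0  # illustrative modified total distortion
print(enhanced_total_distortion(md, 0.75))    # 8.0
print(enhanced_total_distortion(md, 0.9375))  # 6.4
```

For the same modified distortion, the more reliable transmission yields the lower enhanced total distortion.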

4.2 Area of improvement

The enhanced Eq. (6) extends the baseline with the flow rate allocation and the reliability of the video frames. Firstly, the flow rate is allocated dynamically at the video source, together with the transmission loss and packet loss ratio, before the video chunks are transmitted from the local site to the remote site. This reduces the distortion of the video, as shown in Eq. 4. Secondly, the reliability of the video frames is improved to minimise video distortion, as shown in Eq. 5. These improvements help attain better video quality, with better PSNR and processing time, low end-to-end delay and latency, and fewer unnecessary retransmissions of video packets that would consume the available bandwidth. Along with these two new parameters (Mdm and Rf(m)), the proposed solution also considers the round trip time, available bandwidth, channel loss rate, transmission loss rate, and overdue loss rate to improve the video's quality. Therefore, the surgical remote tele-training site receives less distorted and low-delay frames.

The proposed system concentrates on minimizing video distortion and improving surgical video transmission quality using our proposed EVQDMBRM algorithm. It provides a mechanism to allocate the video rate dynamically at the source and to minimize the packet loss ratio and the probing status, which estimates the available bandwidth. Additionally, we consider the reliability of the video frames to guarantee improved video quality, which would ensure a better communication network. This is an entirely new feature that is not considered by the state of the art [16]. Table 2 presents the pseudo-code of the EVQDMBRM algorithm.

Table 2 Proposed Distortion minimization based on an EVQDMBRM algorithm.

5 Results and discussion

5.1 Simulation environment settings

Network Simulator Version 2 (NS2) is used to implement the proposed algorithm (EVQDMBRM). A cluster of thirteen nodes (n0, n1, …, n13) was considered. NS2 was installed on an Ubuntu 14.04 Linux server running as a virtual image under the software VMW, on a physical server with 16 GB DRAM and a six-core Xeon E5-1650 CPU. Table 3 shows the configuration parameters of the wireless networks (both cellular and WiFi) that we used to test our proposed solution. Since the purpose of our work is to enhance the method proposed by the state-of-the-art [16], we used the same configuration parameters for both the cellular and the WiFi networks.

Table 3 Wireless Networks Configuration Parameters (Same as the state-of-art [16])

Furthermore, we implemented the same wireless network system architecture as the state-of-the-art [16] for the performance evaluation in this work. This includes the emulation topology (see Fig. 9 in [16] for more information).

The EVQDMBRM algorithm was applied to ten different sample surgical training video files with lengths ranging from 7 to 18 min. These videos are freely available on YouTube for educational purposes. The videos were not sent at full length; only small chunks of them were sent in the simulation environment. The frame rate of the YouTube videos is 25 to 30 frames per second, and some portion of the video frames was dropped whenever the video quality was poor.

The image frames were extracted using NS2; their number depended on the length of the video. Then, the total distortion of each video frame was calculated for the state of the art and for the proposed system and compared across two wireless network platforms, namely cellular and Wi-Fi, as shown in Table 4. The performance of the proposed solution was tested based on processing time, PSNR, and end-to-end delay. The results for cellular and Wi-Fi networks are shown in Tables 5 and 6, respectively.

Table 4 Total distortion for EVQDMBRM and the state of the art [16] under different network characteristics; twenty different paths from standard wireless communication (cellular and Wi-Fi) were considered for video transmission; a lower total distortion indicates enhanced video quality

Table 5 Cellular communication network: video quality results for the state of the art [16] and EVQDMBRM based on processing time, PSNR, and end-to-end delay; the higher the PSNR, the better the quality of the video at the receiver side of the surgical tele-training over a cellular communication network

Table 6 Wi-Fi communication network: video quality results for the state of the art [16] and EVQDMBRM based on processing time, PSNR, and end-to-end delay; the higher the PSNR, the better the quality of the video at the receiver side of the surgical tele-training over a Wi-Fi communication network

5.2 Experimental results

A random sample was selected from Table 4, representing the outcome of running the simulation, and compared with the surgical training video frame from the encoding stage. This work considers parameters such as bandwidth, round-trip time, flow rate allocation, transmission time, and reliability while encoding the video (Table 4). In addition, Table 4 shows the calculation and comparison of the total distortion in the cellular and Wi-Fi networks for both the state of the art and EVQDMBRM. The proposed solution (EVQDMBRM) reduces the total distortion of the video frames by dynamically allocating the flow rate once the videos are encoded and by considering the reliability of the video frames to ensure a better communication network, which was not considered by the current solution. Because the video frames are less distorted, the video quality is higher, and the obtained PSNR is enhanced by 4 dB to 5 dB.
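The link between distortion and PSNR in this argument can be illustrated with the standard definition PSNR = 10 log10(MAX^2 / MSE). The following Python sketch is not the NS2/TCL code used in this work; the function names and the 8-bit peak value of 255 are assumptions made for illustration only.

```python
import math


def psnr_db(mse, peak=255.0):
    """Standard PSNR in dB for a given mean squared error.

    `peak` is the maximum sample value (255 assumes 8-bit samples).
    """
    if mse <= 0:
        return math.inf  # identical frames: no distortion at all
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_reduction_factor(gain_db):
    """Factor by which the MSE must shrink to gain `gain_db` dB of PSNR."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    # PSNR gains quoted in the text: 4 dB to 5 dB.
    for gain in (4.0, 5.0):
        factor = mse_reduction_factor(gain)
        print(f"{gain} dB gain <-> MSE divided by {factor:.2f}")
```

Under this definition, a PSNR gain of 4 dB to 5 dB corresponds to a mean squared error that is roughly 2.5 to 3.2 times smaller.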

Tables 5 and 6 show the video quality results for the current solution [16] and the proposed solution (EVQDMBRM) based on the processing time, PSNR, and end-to-end delay, together with the total distortion calculation. Twenty different paths were considered for video transmission over the cellular (Table 5) and Wi-Fi (Table 6) network standards, with channel loss rates of 0.02 and 0.06, respectively.
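The effect of the two channel loss rates can be illustrated with a simple independent (Bernoulli) loss model. This is a sketch only: the independence assumption, the function name `simulate_path_loss`, and the packet count are illustrative and are not taken from the NS2 configuration in [16].

```python
import random


def simulate_path_loss(num_packets, loss_rate, seed=0):
    """Return the fraction of packets dropped on one path.

    Assumes each packet is lost independently with probability `loss_rate`
    (a simplification; real wireless losses are often bursty).
    """
    rng = random.Random(seed)
    lost = sum(1 for _ in range(num_packets) if rng.random() < loss_rate)
    return lost / num_packets


if __name__ == "__main__":
    # Loss rates quoted in the text: 0.02 (cellular) and 0.06 (Wi-Fi).
    for name, rate in (("cellular", 0.02), ("Wi-Fi", 0.06)):
        observed = simulate_path_loss(100_000, rate, seed=1)
        print(f"{name}: configured {rate:.2f}, observed {observed:.4f}")
```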

The selected sample includes the PSNR values for both the state of the art [16] and EVQDMBRM for the result analysis in Table 5. In the cellular network, EVQDMBRM improved the PSNR by 5 dB and reduced the end-to-end delay by 15 ms, while in the Wi-Fi network standard the PSNR was improved by 5 dB and the end-to-end delay was decreased by 14 ms (Tables 5 and 6, and Fig. 3). The higher the PSNR, the better the video quality at the receiver side of the surgical tele-training over the cellular and Wi-Fi networks. EVQDMBRM achieved a processing time of 58.2 ms against 76.1 ms in the cellular network and 75.9 ms against 88.1 ms in the Wi-Fi network, and an end-to-end delay of 114.57 ms against 133.58 ms in the cellular network and 132.46 ms against 144.07 ms in the Wi-Fi network. The proposed algorithm also increased the PSNR to 51.13 dB against 47.28 dB in the cellular network and to 43.9 dB against 38.9 dB in the Wi-Fi network.
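To make the figures quoted above easier to compare, the following Python sketch derives the relative reductions and the PSNR gains directly from them. The dictionary layout and the helper names are hypothetical; only the numeric values come from the text.

```python
# Values quoted in the text, as (EVQDMBRM, state of the art [16]).
REPORTED = {
    "cellular": {
        "processing_ms": (58.2, 76.1),
        "delay_ms": (114.57, 133.58),
        "psnr_db": (51.13, 47.28),
    },
    "wifi": {
        "processing_ms": (75.9, 88.1),
        "delay_ms": (132.46, 144.07),
        "psnr_db": (43.9, 38.9),
    },
}


def relative_reduction(new, old):
    """Percentage by which `new` is lower than `old`."""
    return (old - new) / old * 100.0


def summarize(network):
    """Return relative time reductions (%) and the PSNR gain (dB)."""
    values = REPORTED[network]
    return {
        "processing_reduction_pct": relative_reduction(*values["processing_ms"]),
        "delay_reduction_pct": relative_reduction(*values["delay_ms"]),
        "psnr_gain_db": values["psnr_db"][0] - values["psnr_db"][1],
    }


if __name__ == "__main__":
    for network in REPORTED:
        summary = summarize(network)
        print(network, {key: round(val, 2) for key, val in summary.items()})
```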

Fig. 3

(a) Average results of processing time, PSNR, and end-to-end delay in the cellular network standard for the state of the art [16] and EVQDMBRM; (b) average results of processing time, PSNR, and end-to-end delay in the Wi-Fi network standard for the state of the art [16] and EVQDMBRM; (c) processing time of the state of the art [16] and EVQDMBRM; (d) peak signal-to-noise ratio (PSNR) of the state of the art [16] and EVQDMBRM; and (e) end-to-end delay of the state of the art [16] and EVQDMBRM

Similarly, the end-to-end delay and processing time were reduced by 14 ms to 15 ms in EVQDMBRM. Therefore, EVQDMBRM outperforms the state-of-the-art solution [16], as it achieves a higher PSNR together with a lower end-to-end delay and processing time, which makes it more efficient. The PSNR and end-to-end delay were calculated by running TCL scripts in the NS2 environment.
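As an illustration of how an end-to-end delay can be derived from per-packet timestamps, the following Python sketch averages the receive-minus-send time over the delivered packets. It is not the TCL script used in this work; the record layout `(packet_id, send_time_s, recv_time_s)` and the function names are assumptions, and the sample records are invented for demonstration.

```python
from statistics import mean


def end_to_end_delays_ms(records):
    """Per-packet delay in milliseconds for every delivered packet.

    `records` is an iterable of (packet_id, send_time_s, recv_time_s);
    recv_time_s is None for packets that were lost.
    """
    return [
        (recv - send) * 1000.0
        for _packet_id, send, recv in records
        if recv is not None
    ]


def mean_delay_ms(records):
    """Average end-to-end delay over the delivered packets."""
    delays = end_to_end_delays_ms(records)
    if not delays:
        raise ValueError("no delivered packets in the trace")
    return mean(delays)


if __name__ == "__main__":
    # Invented sample trace: packet 1 is lost and therefore ignored.
    trace = [(0, 0.00, 0.10), (1, 0.02, None), (2, 0.04, 0.16)]
    print(f"mean end-to-end delay: {mean_delay_ms(trace):.1f} ms")
```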

A major feature of EVQDMBRM is that it uses a single flow for the video transmission over a UDP socket from the local site to the remote site. In contrast, the state of the art [16] used a TCP connection, which is connection-oriented. The EVQDMBRM algorithm allocates the flow rate dynamically, which reduces the processing time and the end-to-end delay; a shorter processing time in turn improves the video quality. The reduction in the end-to-end delay, shown in Fig. 3e, is an additional important feature of the proposed solution. The end-to-end delay is the time required by the video to travel from source to destination, so a lower value is better.
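The idea of sending video chunks as a single connectionless UDP flow can be sketched as follows. This is a minimal illustration, not the EVQDMBRM implementation: the six-byte header (sequence number and payload length), the function names, and the loopback demonstration are all assumptions, and no rate allocation is performed here.

```python
import socket
import struct

# Hypothetical header: 4-byte sequence number, 2-byte payload length.
HEADER = struct.Struct("!IH")


def packetize(seq, payload):
    """Prefix a video chunk with its sequence number and length."""
    return HEADER.pack(seq, len(payload)) + payload


def depacketize(datagram):
    """Recover (sequence number, payload) from a received datagram."""
    seq, length = HEADER.unpack_from(datagram)
    return seq, datagram[HEADER.size:HEADER.size + length]


def send_flow(sock, addr, chunks):
    """Send all chunks as one UDP flow; return the number of datagrams sent."""
    count = 0
    for seq, chunk in enumerate(chunks):
        sock.sendto(packetize(seq, chunk), addr)
        count += 1
    return count


if __name__ == "__main__":
    # Loopback demonstration: no handshake is needed before sending.
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    receiver.settimeout(1.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sent = send_flow(sender, receiver.getsockname(), [b"frame-0", b"frame-1"])
    for _ in range(sent):
        data, _peer = receiver.recvfrom(2048)
        print(depacketize(data))
    sender.close()
    receiver.close()
```

Unlike TCP, UDP does not retransmit lost datagrams, so the sequence number is what lets a receiver detect missing or reordered chunks.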

We were also able to increase the PSNR compared to the state-of-the-art solution (Fig. 3d). Additionally, we considered the reliability of the video frames to guarantee the improvement in video quality, which would ensure a better communication network. This is an entirely new feature that is not considered by the state of the art.

A wide range of techniques and algorithms have been implemented to improve video quality while streaming real-time video, and they are continuously refined so that video quality can be further enhanced. This work has overcome the limitations of the state-of-the-art solution, with a reduced processing time of 58.2 ms against 76.1 ms in the cellular network and 75.9 ms against 88.1 ms in the Wi-Fi network (Fig. 3c). These outcomes demonstrate that the proposed solution performs better in PSNR, end-to-end delay, and processing time. This work has improved the quality of the surgical video during the tele-training procedure by increasing the average PSNR. Moreover, the proposed algorithm has reduced both the average processing time and the average end-to-end delay (Table 7).

Table 7 Comparison between the proposed solution (EVQDMBRM) and the state-of-the-art solution [16]

6 Conclusion and future work

The proposed solution (EVQDMBRM) minimizes the total distortion of video frames by considering the flow rate allocation at the video source together with the transmission loss and the packet loss ratio. EVQDMBRM is applied before any chunks are sent in the video transmission from the local site to the remote site. Additionally, it improves the reliability of the video frames to reduce video distortion. The EVQDMBRM algorithm was implemented in NS2 to obtain a realistic network simulation environment. The video was transmitted as a single flow over a UDP socket, and the desired video quality was achieved. The flow rate is allocated dynamically over the UDP socket, resulting in reduced processing time and end-to-end delay.

Furthermore, the reliability of the video frames is also considered to guarantee the improvement in video quality, which would ensure a better communication network.

In our future work, we would like to implement this algorithm on mobile devices with a focus on efficient energy use, since the energy consumption of mobile devices is vital for providing the desired video quality of the tele-training videos to users. In addition, the proposed solution (EVQDMBRM) needs to be validated using multiple datasets, which we could not offer in this work. Such validation would confirm the results obtained in this work and establish our solution as a generic method. Moreover, we will implement EVQDMBRM in a real system such as WebRTC. Furthermore, we need to find the optimal settings for the proposed system that affect the end-to-end latency and the distortion. This covers adjusting different parameters to trade video quality against bandwidth in H.264, including video resolution, frame rate, and compression ratio. Besides, different network topologies need to be tested to show their influence on the proposed system.
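As a rough illustration of how these parameters trade video quality against bandwidth, the following Python sketch estimates a stream bitrate from resolution, frame rate, and compression ratio. It is back-of-the-envelope arithmetic, not an H.264 rate model: the function name, the default of 12 bits per pixel (8-bit YUV 4:2:0), and the example settings are assumptions.

```python
def estimated_bitrate_mbps(width, height, fps, compression_ratio,
                           bits_per_pixel=12):
    """Estimate the compressed bitrate in Mbit/s.

    bits_per_pixel=12 assumes 8-bit YUV 4:2:0 raw video; the
    compression ratio is raw size divided by encoded size.
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    raw_bits_per_second = width * height * bits_per_pixel * fps
    return raw_bits_per_second / compression_ratio / 1e6


if __name__ == "__main__":
    # Hypothetical settings to compare, not values from this work.
    for width, height, fps in ((1280, 720, 25), (1920, 1080, 30)):
        rate = estimated_bitrate_mbps(width, height, fps, compression_ratio=100)
        print(f"{width}x{height} @ {fps} fps: about {rate:.2f} Mbit/s")
```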