1 Introduction

In the past few years, mobile video streaming (e.g., YouTube [1] and Hulu [2]) has become one of the most popular applications, and video traffic destined for handheld devices (e.g., smartphones and iPads) has experienced explosive growth. According to the Cisco Visual Networking Index report [3], video streaming accounted for 57% of mobile data usage in 2012 and will reach 69% by 2017. Global mobile data traffic is expected to increase 13-fold between 2012 and 2017. Furthermore, high-definition video surpassed standard-definition video by the end of 2012 and will comprise 79% of video traffic by 2016.

Although the proliferation of wireless infrastructures has offered users many access options (e.g., cellular networks, wireless local area networks (WLANs), and Worldwide Interoperability for Microwave Access (WiMAX)), efficiently providing mobile video streaming services remains challenging because of the performance limitations of any single wireless network. Current WLAN systems cannot provide satisfactory video streaming quality because of their small coverage and relatively limited bandwidth as the number of mobile users increases [4, 5]. Even worse, WLAN systems are not robust enough to sustain user mobility [6, 7]. On the other hand, cellular networks, e.g., Universal Mobile Telecommunications System (UMTS) and HSDPA, can provide more robust wireless connections to mobile users. However, their bandwidth is not adequate to support high-quality video streaming with stringent bandwidth requirements [6]. Although 4G LTE and WiMAX can provide a much higher peak data rate and extended coverage, they are not widely deployed yet, and bandwidth will remain a problem because the wireless spectrum is shared by many users [7]. These limitations naturally motivate aggregating the bandwidth of heterogeneous wireless networks, a topic that has already attracted considerable research attention [8-10]. Conventionally, these bandwidth aggregation algorithms are designed to allocate video flows dynamically, with little consideration of inherent channel errors and fluctuations, which can significantly impact the streaming video quality [6, 11].

To address these challenges, joint source-channel coding (JSCC) has proven to be an effective approach to designing error-resilient wireless video transmission systems [12, 13]. However, one major problem with the existing JSCC approaches (e.g., [14, 15]) is that they consider the network between the server and the client as a single transport link [16]. The problem becomes more complicated in the context of integrated heterogeneous wireless networks, in which multiple access networks may be simultaneously available. In [17], Jurca et al. studied physical path selection and source rate allocation for video streaming over multi-path networks; their experimental results show that streaming through only certain reliable wireless networks gives better video quality than streaming through all available wireless networks. The problem is illustrated in Figure 1: involving an unreliable wireless access network in the transmission while the client is moving only degrades the user-perceived video quality.

Figure 1

Illustration of a mobile video streaming service in a heterogeneous wireless network. In location 1, the user experiences video glitches as the cellular link is unable to support the video streaming well. Then, the user requests to the video server and simultaneously connects to the WLAN access point in location 2. However, the video quality further degrades in the dual mode as the WLAN link is unstable. Then, in location 3, the user switches the WLAN to WiMAX, and the perceived quality is better than that in locations 1 and 2.

Motivated by optimizing the JSCC for mobile video delivery in heterogeneous wireless networks, we propose a flow rate allocation-based JSCC (FRA-JSCC) approach in this work. By the term ‘flow rate allocation’, we mean dynamically picking the appropriate wireless access networks and assigning the transmission rates to each of them. First, the video source rate adaption scheme is designed to satisfy the delay requirements of real-time video applications. Second, forward error correction (FEC) redundancy estimation is performed to meet the tolerable loss rate. Third, a simple but effective search algorithm for flow rate allocation is proposed to minimize end-to-end video distortion. Specifically, the contributions of this paper can be summarized in the following:

  •  An efficient end-to-end video delivery scheme in integrated heterogeneous wireless networks that uses JSCC in conjunction with flow rate allocation in order to improve the perceived video quality.

  •  A mathematical model of JSCC to minimize the end-to-end video distortion over multiple wireless channels. The channel distortion is comprehensively analyzed with both transmission and overdue loss.

  •  Extensive semi-physical emulations in Exata with the real-time H.264 video streaming. Experimental results show that (1) FRA-JSCC improves the average video peak signal-to-noise ratio (PSNR) by up to 3.5, 8.45, and 11 dB compared to the fountain code-based virtual path (FCVP) [6], joint multimedia-FEC rate (JMFR) [16], and dynamic multi-path (DMP) [18]; (2) FRA-JSCC reduces the average end-to-end delay by up to 20.8, 11.5, and 40.3 ms compared to the FCVP, JMFR, and DMP; (3) FRA-JSCC mitigates the effective loss rate by up to 6.05%, 10.5%, and 15.5% compared to the FCVP, JMFR, and DMP.

The remainder of this paper is organized as follows. In Section 2, we briefly discuss the related work. Section 3 presents the system model and problem formulation. In Section 4, we describe the design of the proposed FRA-JSCC in detail. The performance evaluation is provided in Section 5. Concluding remarks are given in Section 6. The basic notations used throughout this paper are listed in Table 1.

Table 1 Basic notations used in this paper

2 Related work

The work related to this paper can be generally categorized into two branches: joint source-channel coding and video delivery in heterogeneous wireless networks. We discuss each topic in turn in this section.

2.1 Joint source-channel coding

In summary, the JSCC problem includes joint coding and optimal rate calculation for video coding and channel coding, which provides various levels of protection to the video data according to its importance and the channel conditions. Most of the related work in video transmission focuses on (1) finding an optimal bit rate for video coding and channel coding, e.g., [19, 20]; (2) designing the video coding mechanism to achieve the target source rate under given channel conditions, e.g., [21]; (3) designing the channel coding to achieve the required reliability, e.g., low-density parity check [22], turbo [23], Reed-Solomon (RS) [24], and fountain [25] codes; (4) designing a joint optimization framework, including all available error control components together with error concealment and transmission control, to improve global system performance, e.g., [26]. The authors of [14] deal with the optimal allocation of MPEG-2 encoding and media-independent forward error correction rates under a given total bandwidth. They define optimality in terms of minimum perceptual distortion given a set of video and network parameters. They compute the network error parameters after FEC decoding and derive the global set of equations that lead to optimal dynamic rate allocation. In a more recent work [13], Ji et al. studied the optimization of JSCC for layered video broadcasting to heterogeneous devices. The objective is to maximize the overall receiving quality of the heterogeneous quality of service (QoS) receivers.

All these works consider the network as a single transport link between the server and the client. They do not address multi-path streaming scenarios, where more than one network path is allocated to the application. Different from previous JSCC approaches, Jurca et al. [16] investigated the optimal FEC scheme and layer selection in a multi-path scenario. This approach uses a multi-layer coded video stream, and the base-layer stream is protected by duplicated transmission over multiple physical paths during the handoff. However, its major flaw is the assumption that all the wireless networks are reliable enough to improve the overall video quality; it thus lacks an effective network selection algorithm.

2.2 Video delivery in heterogeneous wireless networks

Video delivery in heterogeneous wireless networks has recently attracted much attention; general reviews can be found in [27, 28]. The Earliest Delivery Path First algorithm [8] takes into account the available bandwidth, propagation delay, and video frame size to estimate the arrival time and aims to find the earliest path to deliver each video packet. The load balancing algorithm (LBA) [10] performs stream adaption in response to varying network status by only transmitting those packets which are estimated to arrive at the client within the decoding deadline and conserves bandwidth by dropping packets that cannot be decoded because they rely on previous packets that have been dropped. A packet prioritization scheme in LBA gives a higher weight to I frames over B and P frames and also to base layer packets over enhancement layer packets. The LBA scheduler sorts packets according to priority weighting and sacrifices lower priority packets to ensure the delivery of those with a higher priority. Song et al. [9] propose a probabilistic multi-path transmission (PMT) scheme, which sends video traffic bursts over multiple available channels based on a probability generation function of packet delay. PMT is not robust to client mobility as it does not dynamically adjust the flow splitting probability according to time-varying channel status. Han et al. [6] proposed an end-to-end virtual path construction system over heterogeneous wireless networks based on fountain code. The goal of this system is to maximize the encoding bit rate on the basis of the aggregate bandwidth as well as to overcome the channel loss. However, the large block size of fountain code leads to a long delay, which is not appropriate for real-time video streaming over bandwidth-limited and time-varying wireless networks.

Besides, encoded multi-path streaming (EMS) [29] and multi-path loss tolerant (MPLOT) [30] are typical protocols exploiting path diversity in wired/wireless multi-path networks based on erasure code. EMS scheme splits traffic loads over multiple paths according to the path loss rate and dynamically adjusts FEC redundancy. However, EMS was generally under the assumption that all the available paths could be beneficial for the transmission as in [16]. MPLOT is a transport protocol that aims at maximizing the throughput of the upper layer application. However, MPLOT cannot guarantee real-time video delivery as it does not address tight delay constraints.

3 System model and problem formulation

The system model for the proposed FRA-JSCC is depicted in Figure 2. We consider a scenario of a heterogeneous wireless network integrating wireless connections from a single video server to a single destination node. This system involves the models for network path, end-to-end video distortion, and forward error correction. The following subsections describe each of them.

Figure 2

Abstract system model for JSCC in conjunction with flow rate allocation over multiple wireless access networks.

3.1 Network model

The end-to-end connection from the video server to a wireless interface of the mobile client is considered an independent physical path that includes the wired and wireless domains. It is well known that the wireless access is most likely to be the bottleneck link for the end-to-end transmission due to the limited bandwidth and time-varying channel status. The transmitted data packets may be lost because of buffer overflow in intermediate routers or erasures caused by channel fading in the error-prone wireless channels. In order to simplify the discussion, we consider a packet to be lost due to a link fault in either the wired or the wireless packet-switching network. Each physical path $P_r$ is associated with the following metrics:

  •  Available bandwidth $\mu_r$ (expressed in Kbps). $\mu_r$ captures the variation of background traffic and bandwidth fluctuation.

  •  Propagation delay $t_r$, which includes the link delays of the wired and wireless networks.

  •  Average loss probability $\pi_B^r \in [0,1]$, assumed to be an i.i.d. process and independent of the video streaming rate.

We model the burst loss behavior on each physical path by the continuous-time Gilbert model. It is a two-state stationary continuous-time Markov chain. The state $X_r(t)$ assumes one of two values: $G$ (good) or $B$ (bad). If a packet is sent at time $t$ with $X_r(t) = G$, then the packet is successfully delivered. Otherwise, when $X_r(t) = B$, the packet is lost.

We denote by $\pi_G^r$ and $\pi_B^r$ the stationary probabilities that $P_r$ is good or bad. Let $\xi_B^r$ and $\xi_G^r$ represent the transition rates from $B$ to $G$ and from $G$ to $B$, respectively. In this work, we adopt two system-dependent parameters to specify the continuous-time Markov chain packet loss model: (1) the average loss rate $\pi_B^r$ and (2) the average loss burst length $1/\xi_B^r$. Then, we have

$$\pi_G^r = \frac{\xi_B^r}{\xi_B^r + \xi_G^r} \quad \text{and} \quad \pi_B^r = \frac{\xi_G^r}{\xi_B^r + \xi_G^r}. \tag{1}$$

The available bandwidth and propagation delay of each wireless network can be estimated by packet probing mechanisms (e.g., the pathChirp [31] algorithm employed in this work) over each interface of the mobile client. The loss parameters $\pi_B^r$ and $\xi_B^r$ can be sensed through control protocols or delay measurements [32].
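As a minimal sketch of how the two measured parameters determine the Gilbert model, the snippet below derives both transition rates from an average loss rate and an average burst length and then evaluates Equation 1. The function name `gilbert_parameters` and the sample values are illustrative assumptions, and the burst length is assumed to be expressed in the same time unit as the packet spacing.

```python
def gilbert_parameters(loss_rate, mean_burst_length):
    """Map (average loss rate, average burst length) to Gilbert model values.

    Returns (xi_b, xi_g, pi_g, pi_b), where xi_b is the rate of leaving the
    bad state and xi_g is the rate of leaving the good state.
    """
    if not 0.0 < loss_rate < 1.0:
        raise ValueError("loss_rate must lie strictly between 0 and 1")
    if mean_burst_length <= 0.0:
        raise ValueError("mean_burst_length must be positive")
    xi_b = 1.0 / mean_burst_length               # burst length = 1 / xi_B
    xi_g = xi_b * loss_rate / (1.0 - loss_rate)  # chosen so that pi_B == loss_rate
    pi_g = xi_b / (xi_b + xi_g)                  # Equation 1
    pi_b = xi_g / (xi_b + xi_g)                  # Equation 1
    return xi_b, xi_g, pi_g, pi_b


# Hypothetical path: 2% average loss, bursts lasting 10 time units on average.
xi_b, xi_g, pi_g, pi_b = gilbert_parameters(0.02, 10.0)
print(xi_b, xi_g, pi_g, pi_b)
```

The stationary probabilities returned by the function always sum to one, which gives a quick sanity check on measured inputs.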

3.2 Video distortion model

In this subsection, we introduce a generic video distortion model [33]. The end-to-end distortion ($D_{\mathrm{total}}$) perceived by the end user can generally be computed as the sum of the source distortion ($D_{\mathrm{src}}$) and the channel distortion ($D_{\mathrm{chl}}$):

$$D_{\mathrm{total}} = D_{\mathrm{src}} + D_{\mathrm{chl}}. \tag{2}$$

The video quality depends on both the distortion due to a lossy encoding of the media information and the distortion due to losses experienced in the network. $D_{\mathrm{src}}$ is mostly determined by the video source rate and the video sequence parameters (e.g., for the same encoding bit rate, the more complex the sequence, the higher the source distortion). The source distortion decays with increasing encoding rate; the decay is quite steep for low bit rate values, but it becomes very slow at high bit rates. The channel distortion depends on the effective loss rate $\pi_B$, which is caused by the transmission loss and expired arrivals of video packets. It is roughly proportional to the number of video frames that cannot be decoded correctly. Hence, we can explicitly formulate $D_{\mathrm{total}}$ (in units of mean square error) as

$$D_{\mathrm{total}} = \underbrace{D_0 + \frac{\alpha}{V - V_0}}_{D_{\mathrm{src}}} + \underbrace{\beta \times \pi_B}_{D_{\mathrm{chl}}}, \tag{3}$$

in which $\alpha$, $V_0$, $D_0$, and $\beta$ are constants for a specific video codec and video sequence. These parameters can be estimated from three or more trial encodings using nonlinear regression techniques. To allow fast adaptation of the flow rate allocation to abrupt changes in the video content, these parameters can be updated for each group of pictures (GOP) in the encoded video sequence [34]. Since this model takes into account the effects of intra-coding and spatial loop filtering, it provides accurate estimates for end-to-end distortion [32].
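The distortion model of Equation 3 is simple enough to evaluate directly. The sketch below computes the total distortion for a given source rate and effective loss rate and converts it to PSNR with the usual 8-bit definition. The function names and all numeric parameter values are hypothetical; in practice $D_0$, $\alpha$, $V_0$, and $\beta$ would come from the trial encodings mentioned above.

```python
from math import log10


def total_distortion(source_rate, loss_rate, d0, alpha, v0, beta):
    """Equation 3: source distortion plus channel distortion (as MSE)."""
    if source_rate <= v0:
        raise ValueError("the model is only defined for source_rate > v0")
    d_src = d0 + alpha / (source_rate - v0)  # decays as the rate grows
    d_chl = beta * loss_rate                 # linear in the effective loss
    return d_src + d_chl


def psnr_from_mse(mse, peak=255.0):
    """Standard PSNR definition for 8-bit video (assumed here)."""
    return 10.0 * log10(peak * peak / mse)


# Hypothetical codec/sequence constants, for illustration only.
params = dict(d0=1.0, alpha=2000.0, v0=50.0, beta=500.0)
for rate in (200.0, 400.0, 800.0):
    mse = total_distortion(rate, 0.01, **params)
    print(rate, round(mse, 3), round(psnr_from_mse(mse), 2))
```

With these made-up constants, doubling the rate from 400 to 800 Kbps removes far less distortion than doubling it from 200 to 400 Kbps, which mirrors the steep-then-flat decay described in the text.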

3.3 Forward error correction

In this work, we use the systematic RS block erasure code to protect video data against channel losses. In an FEC(n, k) code, (n - k) redundant packets are added to every k source packets to make up a codeword of n packets. As long as the client receives at least k of the n packets, it can recover all the source packets. If fewer than k packets are received, the source packets that did arrive can still contribute to the video decoding process because the systematic RS encoding keeps them intact. In general, for the same code rate k/n, increasing the value of n enhances the performance of the RS code. The FEC code rate k/n needs to be chosen dynamically based on the loss requirement and channel status.

In practice, frame-level [35], GOP-level [36], or sub-GOP-level [37] FEC coding is often applied for video data protection. In this work, we implement GOP-level FEC coding (see Figure 3) in order to integrate seamlessly with the source rate adaption mechanism.

Figure 3

Illustration of GOP-level FEC coding used in this paper.

3.4 Effective loss rate

The effective loss rate $\pi_B$ represents the combined rate of packets lost to channel losses and to expired arrivals, i.e.,

$$\pi_B = \pi_{\mathrm{tran}} + (1 - \pi_{\mathrm{tran}}) \times \pi_{\mathrm{over}}. \tag{4}$$

For real-time video applications, each video frame is associated with a decoding deadline. This deadline sets a maximum delay bound for a frame to be successfully delivered to the client in order to contribute to the decoding process. Next, we will provide a comprehensive analysis for the transmission and overdue loss, respectively.

3.4.1 Transmission loss rate

Let $c$ denote an $n$-tuple that represents a particular failure configuration. If the $i$th FEC data packet is lost during the transmission, then $c_i = B$. By taking into account all the possible configurations, we can compute the transmission loss rate as

$$\pi_{\mathrm{tran}} = \frac{1}{k} \sum_{\text{all } c} \mathcal{L}(c) \times P(c), \tag{5}$$

in which $0 \le \mathcal{L}(c) \le k$ is the number of lost source packets for a given $c$. For the systematic FEC($n$, $k$) code, we have

$$\mathcal{L}(c) = \begin{cases} 0 & \text{if } \sum_{i=1}^{n} 1_{\{c_i = B\}} \le n - k, \\ \sum_{i=1}^{k} 1_{\{c_i = B\}} & \text{otherwise}. \end{cases} \tag{6}$$

As the physical paths to the multi-homed client are independent of each other, we can compute $P(c)$ as follows:

$$P(c) = \prod_{r=1}^{R} \phi_r \times P(c^r), \tag{7}$$

where $P(c^r)$ is the probability of a failure configuration $c^r$ on $P_r$ and $\phi_r$ is an element of the selection vector $\Phi = (\phi_1, \ldots, \phi_R)$ for wireless access networks. $\phi_r$ is defined by

$$\phi_r = \begin{cases} 1 & \text{if the } r\text{th wireless access network is picked}, \\ 0 & \text{otherwise}. \end{cases}$$

Let $p_{i,j}^r(\theta)$ denote the probability of the transition from state $i$ to state $j$ on $P_r$ in time $\theta$; then, we have

$$p_{i,j}^r(\theta) = P[X_r(\theta) = j \mid X_r(0) = i]. \tag{8}$$

From classic Markov chain analysis, we have

$$\begin{aligned} p_{G,G}^r(\theta) &= \pi_G^r + \pi_B^r \times \kappa, & p_{G,B}^r(\theta) &= \pi_B^r - \pi_B^r \times \kappa, \\ p_{B,G}^r(\theta) &= \pi_G^r - \pi_G^r \times \kappa, & p_{B,B}^r(\theta) &= \pi_B^r + \pi_G^r \times \kappa, \end{aligned} \tag{9}$$

in which $\kappa = \exp\left[-(\xi_B^r + \xi_G^r) \times \theta\right]$. We assume that each element of the vector $N = (n_1, n_2, \ldots, n_R)$, with $\sum_r n_r = n$, represents the number of packets dispatched onto the corresponding physical path. Now, the value of $P(c^r)$ can be computed as follows:

$$P(c^r) = \phi_r \times \pi_{c_1^r}^r \prod_{i=1}^{n_r - 1} p_{c_i^r, c_{i+1}^r}^r(\theta_r). \tag{10}$$

After a sequence of algebraic computations, we can obtain

$$\pi_{\mathrm{tran}} = \frac{1}{k} \sum_{\text{all } c} \mathcal{L}(c) \prod_{r=1}^{R} \pi_{c_1^r}^r \prod_{i=1}^{n_r - 1} p_{c_i^r, c_{i+1}^r}^r(\theta_r). \tag{11}$$

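The above equation allows us to compute the transmission loss rate of a specific scheduling approach. Based on Equation 11 in [38], we can obtain the expected value of $\pi_{\mathrm{tran}}$ as

$$\begin{aligned} \pi_{\mathrm{tran}} = \frac{1}{k} \sum_{j=n-k+1}^{n} \; \sum_{\substack{0 \le j_1, \ldots, j_R \le j \\ j_1 + \cdots + j_R = j}} & \left\{ \prod_{r=1}^{R} \phi_r \left[ \pi_G^r \times P\bigl(\bigl[\begin{smallmatrix} n_r - 1 \\ j_r \end{smallmatrix}\bigr] \,\big|\, G\bigr) + \pi_B^r \times P\bigl(\bigl[\begin{smallmatrix} n_r - 1 \\ j_r - 1 \end{smallmatrix}\bigr] \,\big|\, B\bigr) \right] \right\} \\ & \times \sum_{r=1}^{R} \phi_r \times \frac{\displaystyle \sum_{i=0}^{k_r} i \times \left[ \pi_G^r \times P\bigl(\bigl[\begin{smallmatrix} k_r - 1 \\ i \end{smallmatrix}\bigr] \,\big|\, G\bigr) \times P\bigl(\bigl[\begin{smallmatrix} n_r - k_r \\ j_r - i \end{smallmatrix}\bigr] \,\big|\, G\bigr) + \pi_B^r \times P\bigl(\bigl[\begin{smallmatrix} k_r - 1 \\ i - 1 \end{smallmatrix}\bigr] \,\big|\, B\bigr) \times P\bigl(\bigl[\begin{smallmatrix} n_r - k_r \\ j_r - i \end{smallmatrix}\bigr] \,\big|\, B\bigr) \right]}{\pi_G^r \times P\bigl(\bigl[\begin{smallmatrix} n_r - 1 \\ j_r \end{smallmatrix}\bigr] \,\big|\, G\bigr) + \pi_B^r \times P\bigl(\bigl[\begin{smallmatrix} n_r - 1 \\ j_r - 1 \end{smallmatrix}\bigr] \,\big|\, B\bigr)}, \end{aligned} \tag{12}$$

in which $P\bigl(\bigl[\begin{smallmatrix} a \\ b \end{smallmatrix}\bigr] \,\big|\, q\bigr)$, $q \in \{G, B\}$, denotes the probability that any $b$ out of $a$ consecutive packets are lost, given that this block is preceded by a packet in state $q$. The detailed computation of these probabilities can be found in [14].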
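For small blocks, Equation 11 can be checked numerically by enumerating every failure configuration. The sketch below does this for a single path, using the transition probabilities of Equation 9 and the loss count of Equation 6. The function names, the single-path restriction, and the sample parameters are assumptions made for illustration; the enumeration costs $2^n$ evaluations and is not meant to replace Equation 12.

```python
from itertools import product
from math import exp


def transition_probabilities(pi_g, pi_b, rate_sum, theta):
    """Equation 9: Gilbert transition probabilities over a spacing theta."""
    kappa = exp(-rate_sum * theta)
    return {
        ("G", "G"): pi_g + pi_b * kappa,
        ("G", "B"): pi_b - pi_b * kappa,
        ("B", "G"): pi_g - pi_g * kappa,
        ("B", "B"): pi_b + pi_g * kappa,
    }


def lost_source_packets(config, k):
    """Equation 6 for a systematic FEC(n, k) block; config holds 'G'/'B'."""
    n = len(config)
    if config.count("B") <= n - k:
        return 0                      # block is fully recoverable
    return config[:k].count("B")      # only the source packets matter


def transmission_loss_single_path(n, k, pi_b, xi_b, theta):
    """Brute-force Equation 11 when all n packets use one path."""
    pi_g = 1.0 - pi_b
    xi_g = xi_b * pi_b / pi_g         # from Equation 1
    p = transition_probabilities(pi_g, pi_b, xi_b + xi_g, theta)
    stationary = {"G": pi_g, "B": pi_b}
    expected_lost = 0.0
    for config in product("GB", repeat=n):
        prob = stationary[config[0]]
        for prev, nxt in zip(config, config[1:]):
            prob *= p[(prev, nxt)]
        expected_lost += lost_source_packets(config, k) * prob
    return expected_lost / k


# Hypothetical channel: 10% loss, mean burst length 2, unit packet spacing.
print(transmission_loss_single_path(6, 4, 0.1, 0.5, 1.0))
```

Without redundancy ($n = k$) the result reduces to the raw loss rate $\pi_B$, which is a convenient correctness check before moving to larger blocks.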

3.4.2 Overdue loss rate

The end-to-end packet delay over a single wireless network ($d_r$) is dominated by the queueing delay at the bottleneck link, and it can be approximated by an exponential distribution [39], i.e.,

$$P\{d_r > T\} \approx \frac{1}{2\pi} \exp\left(-\frac{T}{d_r}\right), \tag{13}$$

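in which $T$ denotes the maximum delay constraint that prevents playback buffer starvation. The value of $d_r$ can be obtained with the following equation:

$$d_r = t_r + \frac{n_r \times S}{\mu_r (1 - \pi_r)}, \tag{14}$$

where $\mu_r (1 - \pi_r)$ represents the ‘loss-free’ bandwidth of $P_r$ and $S$ represents the packet payload size, so that the second term is the time needed to deliver $n_r$ packets over $P_r$. Then, the probability of the expired arrival of packets can be obtained by substituting Equation 14 into Equation 13:

$$P\{d_r > T\} \approx \frac{1}{2\pi} \exp\left(-\frac{T \times \mu_r (1 - \pi_r)}{t_r \times \mu_r (1 - \pi_r) + n_r \times S}\right). \tag{15}$$

The overdue loss rate is the packet-weighted average of these probabilities over the selected networks:

$$\pi_{\mathrm{over}} = \frac{\sum_{r=1}^{R} n_r \times \phi_r \times P\{d_r > T\}}{n} = \frac{1}{2\pi \times n} \sum_{r=1}^{R} n_r \times \phi_r \times \exp\left(-\frac{T \times \mu_r (1 - \pi_r)}{t_r \times \mu_r (1 - \pi_r) + n_r \times S}\right). \tag{16}$$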
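The overdue loss computation can be sketched in a few lines. The code below assumes that the mean delay of a path is its propagation delay plus the time needed to push $n_r$ packets through the loss-free bandwidth, keeps the $1/(2\pi)$ factor of Equation 13 as printed, and combines the result with a transmission loss rate through Equation 4. The function names, the dictionary keys, and the unit conventions (seconds, Kbps, bytes) are assumptions for illustration.

```python
from math import exp, pi


def overdue_probability(deadline, prop_delay, bandwidth_kbps, loss,
                        n_packets, payload_bytes):
    """Probability that a packet on one path misses the deadline."""
    loss_free_bps = bandwidth_kbps * 1000.0 * (1.0 - loss)
    mean_delay = prop_delay + n_packets * payload_bytes * 8.0 / loss_free_bps
    return exp(-deadline / mean_delay) / (2.0 * pi)


def overdue_loss_rate(deadline, paths, payload_bytes):
    """Packet-weighted average over the selected paths.

    Each path is a dict with keys 't' (s), 'mu' (Kbps), 'loss', 'n', 'phi'.
    """
    n_total = sum(p["n"] for p in paths)
    if n_total == 0:
        raise ValueError("no packets were dispatched")
    weighted = sum(
        p["n"] * overdue_probability(deadline, p["t"], p["mu"], p["loss"],
                                     p["n"], payload_bytes)
        for p in paths
        if p["phi"] and p["n"] > 0
    )
    return weighted / n_total


def effective_loss_rate(pi_tran, pi_over):
    """Equation 4: transmission loss plus overdue loss of the survivors."""
    return pi_tran + (1.0 - pi_tran) * pi_over


# Hypothetical two-path setup; only the first path is selected.
paths = [
    {"t": 0.05, "mu": 1000.0, "loss": 0.02, "n": 20, "phi": 1},
    {"t": 0.02, "mu": 2000.0, "loss": 0.05, "n": 0, "phi": 0},
]
pi_over = overdue_loss_rate(0.25, paths, 512)
print(pi_over, effective_loss_rate(0.01, pi_over))
```

Loading more packets onto a path lengthens its mean delay and therefore raises its overdue probability, which is the coupling between packet dispatching and delay that the flow rate allocation has to balance.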

3.5 Problem formulation

We are now ready to formulate the problem of flow rate allocation combined with JSCC for video delivery in heterogeneous wireless networks. Note that it is not practical for the video encoder to track frequent variations in the source rate. Therefore, we adapt the source rate in units of GOP, based on the channel status, FEC code rate, and delay requirements. To allow fast adaptation of the source rate to abrupt changes in the video content, this parameter is updated for each GOP in the encoded video sequence, i.e., once every J/F ≈ 0.27 s (with J = 8 frames, F = 30 fps). The objective is to minimize the total distortion Dtotal subject to loss, delay, and bandwidth constraints:

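$$\begin{aligned} & \text{For each GOP, determine the values of } \Phi, \Omega, V, n \text{ to minimize} \\ & \quad D_{\mathrm{total}} = \underbrace{D_0 + \frac{\alpha}{V - V_0}}_{D_{\mathrm{src}}} + \underbrace{\underbrace{\beta \times \pi_{\mathrm{tran}}}_{D_{\mathrm{tran}}} + \underbrace{\beta \times (1 - \pi_{\mathrm{tran}}) \times \pi_{\mathrm{over}}}_{D_{\mathrm{over}}}}_{D_{\mathrm{chl}}}, \\ & \text{subject to:} \\ & \quad V \times \frac{n}{k} \times \frac{\omega_r}{\sum_{i=1}^{R} \omega_i} < \mu_r, \quad \text{for } 1 \le r \le R, \\ & \quad V \times \frac{n}{k} \le \sum_{r=1}^{R} \mu_r, \\ & \quad \pi_{\mathrm{tran}} + (1 - \pi_{\mathrm{tran}}) \times \pi_{\mathrm{over}} < \Delta, \\ & \quad \pi_{\mathrm{tran}} = \text{Equation 12}, \quad \pi_{\mathrm{over}} = \text{Equation 16}. \end{aligned} \tag{17}$$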

This is a nonlinear optimization problem with linear constraints. With regard to the computational cost and convergence, it is impractical to derive the exact solution for the minimal video distortion. In the next section, we show how this optimization problem is solved in the design of the proposed FRA-JSCC.
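To make the constraints of problem (17) concrete, the following sketch checks whether a candidate operating point is feasible. It is only a constraint checker, not a solver; the function name `is_feasible`, the argument layout, and the sample numbers are hypothetical, and the loss rates are assumed to have been computed beforehand.

```python
def is_feasible(source_rate, n, k, weights, bandwidths,
                pi_tran, pi_over, loss_bound):
    """Check the bandwidth and loss constraints of problem (17).

    source_rate and bandwidths share one unit (e.g., Kbps); weights are the
    per-path dispatching weights omega_r.
    """
    load = source_rate * n / k                 # source rate inflated by FEC
    total_weight = sum(weights)
    if total_weight == 0:
        return False                           # nothing can be dispatched
    per_path_ok = all(
        load * w / total_weight < mu
        for w, mu in zip(weights, bandwidths)
    )
    aggregate_ok = load <= sum(bandwidths)
    loss_ok = pi_tran + (1.0 - pi_tran) * pi_over < loss_bound
    return per_path_ok and aggregate_ok and loss_ok


# Hypothetical candidate: 1500 Kbps video, FEC(10, 8), two paths.
print(is_feasible(1500.0, 10, 8, [47, 106], [1000.0, 2000.0],
                  0.004, 0.003, 0.01))
```

A checker of this kind is useful mainly for rejecting candidates early, before the more expensive distortion evaluation is carried out.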

4 Design of flow rate allocation-based joint source-channel coding

In this section, we describe the overall design of the proposed FRA-JSCC and outline the functionality of its major components. The system design is presented in Figure 4; it includes components implemented at both the server side and the client side. In order to solve the optimization problem (17), the proposed FRA-JSCC performs the following steps at the server side: (1) FEC redundancy estimation, (2) source rate adaption, and (3) flow rate allocation. Specifically, the values of the FEC redundancy ((n - k)/k) and the video source rate (V) depend on the rate allocation vector (Φ). The input and feedback information (e.g., the loss and delay constraints and the channel status) is necessary for these computation steps. The loss and delay requirements are imposed by the video application in order to achieve the required QoS. The encoded video stream is split among multiple available wireless networks at the weighted round robin distributor, and the packet transmitter is responsible for dispatching the FEC data packets onto different channels.

Figure 4

Overall design of the proposed FRA-JSCC consisting of working components at the server and client sides.

At the client side, the video frames will be stored in the playback buffer after the FEC decoding process. The inter-frame resequence step aims at reordering the video frames in case they arrive at the client out-of-order. As each video frame is associated with a decoding deadline, the overdue frames will be discarded and concealed by copying from the last received ones. Next, we will describe the key components in the system design and their working steps.

4.1 FEC redundancy estimation

To estimate the FEC redundancy, we model the multiple wireless networks as a single virtual link with effective loss rate $\pi_B$. Consider the transmission of k source packets (each of size S) over the virtual link from the source to the destination. Let (n - k)/k denote the redundancy (i.e., the ratio of redundant FEC packets to source packets in the FEC block). There is an inherent tradeoff between FEC redundancy and its error correction power [29]. With more redundant packets, the receiver can recover from more severe losses, at the cost of larger end-to-end delays and higher loads imposed on the networks. Therefore, in the design of FRA-JSCC, the goal is to use ‘just enough’ FEC redundancy to meet the video application’s loss requirement (Δ). With this objective, the FEC adaption policy can be derived under fairly general assumptions by simply bounding the loss tail probability.

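Therefore, the FEC redundancy estimation problem can be stated as

$$n = \arg\min \, \mathrm{diff}(\Delta - \pi_B), \tag{18}$$

in which

$$\mathrm{diff}(\Delta - \pi_B) = \begin{cases} \Delta - \pi_B & \text{if } \pi_B < \Delta, \\ \infty & \text{otherwise}, \end{cases} \tag{19}$$

and $\pi_B$ can be estimated using Equation 12. Therefore, the FEC redundancy can be obtained if and only if $\Phi$ is determined.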
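The search implied by Equations 18 and 19 can be sketched as a scan over candidate block lengths. In the code below, the loss model is passed in as a function so that any estimator of $\pi_B$ can be plugged in; for the demonstration, an i.i.d. loss model is used as a stand-in for Equation 12, which is an assumption of this sketch and ignores burstiness. The names `estimate_block_length`, `iid_residual_loss`, and the bound `n_max` are hypothetical.

```python
from math import comb, inf


def iid_residual_loss(n, k, p):
    """Residual source loss of systematic FEC(n, k) under i.i.d. loss p.

    A lost source packet stays lost only when at least n - k of the other
    n - 1 packets are lost too, i.e., the block has more than n - k losses.
    """
    tail = sum(
        comb(n - 1, j) * p ** j * (1.0 - p) ** (n - 1 - j)
        for j in range(n - k, n)
    )
    return p * tail


def estimate_block_length(k, loss_bound, loss_fn, n_max):
    """Equations 18-19: pick n whose loss is closest to, but below, the bound."""
    best_n, best_diff = None, inf
    for n in range(k, n_max + 1):
        loss = loss_fn(n, k)
        diff = loss_bound - loss if loss < loss_bound else inf
        if diff < best_diff:
            best_n, best_diff = n, diff
    return best_n  # None when no candidate meets the bound


# Hypothetical setting: k = 8 source packets, 5% raw loss, 1% requirement.
n = estimate_block_length(8, 0.01, lambda n, k: iid_residual_loss(n, k, 0.05), 16)
print(n, (n - 8) / 8)
```

Because the residual loss shrinks as redundancy is added, the minimizer is the first block length that meets the requirement, which matches the ‘just enough’ redundancy goal stated above.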

4.2 Source rate adaption

According to information theory [40], video source distortion can be reduced by increasing the effective encoding rate. On the other hand, a higher encoding rate leads to a higher transmission rate, which imposes a heavier load on the channels. If the imposed load exceeds the network capacity, it will in turn result in longer delay and packet loss due to network congestion. There is thus an inherent conflict between the source and channel distortion. Therefore, the critical point in source rate adaption is to find the upper bound of the source rate under application and channel constraints. The constraint imposed by video applications is the delay requirement. In real-time video applications, delay plays a vital role in the streaming video quality: if a video frame arrives at the destination past its decoding deadline, it is considered lost. In this paper, we propose a source rate adaption algorithm under delay requirements, taking into account the FEC redundancy estimated in the previous subsection.

The maximum number of packets that can be transmitted through the $r$th wireless network within the maximum tolerable delay is calculated by

$$\omega_r = \left\lfloor \frac{\mu_r \times (1 - \pi_r) \times (T - t_r)}{S} \right\rfloor, \quad \text{for } 1 \le r \le R, \tag{20}$$

where $\lfloor x \rfloor$ denotes the largest integer not greater than $x$. Now, we set the weighting factor for the $r$th wireless channel of the weighted round robin distributor to $\omega_r \times \phi_r$ for $1 \le r \le R$. The proposed joint source and FEC control scheme calculates the FEC decoding failure rate based on the effective loss rate to determine the code rate. First, we define the maximum number of packets which can be transmitted using the constructed virtual link by

$$\Theta = \sum_{r=1}^{R} \phi_r \times \omega_r. \tag{21}$$

Then, the duration of a GOP to be displayed at the client side is $J/F$, in which $J$ is the number of frames in a GOP and $F$ is the video frame rate (in frames per second). The number of encoded bytes within a GOP for this duration is $\Theta \times S \times k/n$, where $k/n$ denotes the FEC code rate. Consequently, the resulting maximum video source rate $V$ for an FEC block is determined by the equation

$$V = \frac{\Theta \times S \times k/n}{J/F}. \tag{22}$$
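Equations 20 to 22 can be traced with a short worked example. The sketch below computes the per-path weights, the virtual-link capacity, and the resulting maximum source rate. The unit conversions (Kbps to bits per second, bytes to bits), the function names, and the sample path parameters are assumptions introduced for illustration.

```python
from math import floor


def path_weights(bandwidths_kbps, losses, prop_delays, deadline, payload_bytes):
    """Equation 20: packets each path can deliver before the deadline."""
    weights = []
    for mu, loss, t in zip(bandwidths_kbps, losses, prop_delays):
        bits = mu * 1000.0 * (1.0 - loss) * (deadline - t)
        weights.append(max(0, floor(bits / (8 * payload_bytes))))
    return weights


def max_source_rate_kbps(weights, selection, payload_bytes, n, k,
                         gop_frames, frame_rate):
    """Equations 21-22: source rate supported by the selected paths."""
    theta = sum(phi * w for phi, w in zip(selection, weights))  # Equation 21
    gop_bytes = theta * payload_bytes * k / n   # source share of the block
    gop_duration = gop_frames / frame_rate      # J / F, in seconds
    return gop_bytes * 8 / 1000.0 / gop_duration


# Hypothetical paths: 1000 and 2000 Kbps, T = 250 ms, 512-byte payload.
weights = path_weights([1000.0, 2000.0], [0.02, 0.05], [0.05, 0.02], 0.25, 512)
rate = max_source_rate_kbps(weights, [1, 1], 512, n=10, k=8,
                            gop_frames=8, frame_rate=30)
print(weights, rate)
```

Setting an entry of the selection vector to zero removes that path's weight from $\Theta$ and lowers the admissible source rate accordingly, which is how the flow rate allocation feeds back into the source rate.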

4.3 Flow rate allocation

The source packets together with the redundancy packets constitute the ‘flow’ mentioned throughout this paper. The goal of the flow rate allocation is to select appropriate wireless access networks out of all the candidates so as to minimize the end-to-end video distortion:

$$\text{Minimize: } D_{\mathrm{total}}(\Phi) = D_0 + \frac{\alpha}{V(\Phi) - V_0} + \beta \times \pi_B(\Phi). \tag{23}$$

So far, we have obtained the expressions of $n$ and $V$. According to Theorem 1 in [17], the optimal flow rate allocation solution takes the form of a consecutive series of 1’s, followed by a consecutive series of 0’s, i.e., $\Phi = [1, 1, \ldots, 1, 0, 0, \ldots, 0]$. Indeed, the inclusion of a wireless access network with a high loss rate, long propagation delay, or low bandwidth can increase $D_{\mathrm{total}}$ because more FEC redundancy may be required to compensate for the increased uncertainty. In order to find the optimal solution, we first rank all the available wireless networks according to their ‘loss-free’ bandwidth ($\mu_r (1 - \pi_r)$), which has proven to be a good indicator of the network path quality [30]. Then, the optimal flow rate allocation vector can be obtained with a simple but effective search algorithm that evaluates $D_{\mathrm{total}}$ for each candidate vector of this form and keeps the one with the smallest distortion.

Practically, a mobile device has a small number of network interfaces due to the limited battery life, mobility, cost, etc. Thus, the computational complexity required for the proposed flow rate allocation algorithm is negligible although the full search method is used.
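One way to picture the search is as a scan over the prefixes of the ranked network list. The sketch below ranks the paths by loss-free bandwidth, evaluates a caller-supplied distortion function for each prefix, and returns the best selection vector. The function `allocate_flows`, the dictionary keys, and in particular the toy distortion function are hypothetical; a faithful implementation would evaluate Equation 23 with the FEC and source rate steps described earlier.

```python
from math import inf


def allocate_flows(paths, distortion_of):
    """Search selection vectors of the form [1, ..., 1, 0, ..., 0].

    paths: list of dicts with 'bandwidth' and 'loss'.
    distortion_of: callable mapping a list of chosen path indices to D_total.
    """
    ranked = sorted(
        range(len(paths)),
        key=lambda r: paths[r]["bandwidth"] * (1.0 - paths[r]["loss"]),
        reverse=True,
    )
    best_phi, best_d = None, inf
    for m in range(1, len(ranked) + 1):
        chosen = ranked[:m]                    # the m best-ranked networks
        d = distortion_of(chosen)
        if d < best_d:
            best_d = d
            best_phi = [1 if r in chosen else 0 for r in range(len(paths))]
    return best_phi, best_d


# Hypothetical networks: cellular, WLAN (lossy), WiMAX.
paths = [
    {"bandwidth": 400.0, "loss": 0.01},
    {"bandwidth": 1500.0, "loss": 0.15},
    {"bandwidth": 2000.0, "loss": 0.02},
]


def toy_distortion(chosen):
    """Toy stand-in for Equation 23 with made-up constants."""
    caps = [paths[r]["bandwidth"] * (1.0 - paths[r]["loss"]) for r in chosen]
    rate = sum(caps)
    loss = sum(c * paths[r]["loss"] for c, r in zip(caps, chosen)) / rate
    return 1.0 + 2000.0 / (rate - 50.0) + 500.0 * loss


phi, d_total = allocate_flows(paths, toy_distortion)
print(phi, d_total)
```

With R networks the scan needs only R distortion evaluations, which is consistent with the remark above that the complexity is negligible for the small number of interfaces on a mobile device.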

4.4 Channel status monitoring

Estimating channel status information based on end-to-end monitoring has been attracting research attention for years. Over heterogeneous wireless networks, it is very important to identify the physical characteristics of each wireless channel in order to utilize channel resources efficiently. The available bandwidth, propagation delay, and channel loss rate are especially important properties for a high-quality video streaming service. So far, numerous algorithms have been proposed to estimate the available bandwidth over wired/wireless networks in the literature [31, 41, 42]. In this paper, the pathChirp algorithm [31] is employed to estimate the available bandwidth through each wireless network with high accuracy and efficiency. During the transient state, a server sends some probe packets with exponentially distributed intervals through each wireless network interface when a video request arrives from a client. Based on the probe packet arrival intervals, the client estimates the available bandwidth using the pathChirp algorithm (for detailed descriptions, please refer to [31]). In the steady state, video data packets are transmitted at a fixed interval, and the client continuously monitors the packet arrival intervals in a sliding window and estimates the available bandwidth based on these intervals. We can easily calculate the propagation delay by the time stamp in each packet header. Now, we can obtain the following information

$$\mu = \{\mu_1, \mu_2, \ldots, \mu_R\}, \quad \pi = \{\pi_B^1, \pi_B^2, \ldots, \pi_B^R\}, \quad \text{and} \quad t = \{t_1, t_2, \ldots, t_R\}.$$

The client periodically reports information on each physical path to the parameter control unit of the server through the most reliable uplink channel. This information is used to determine the results of FEC parameter tuning, source rate adaption, and flow rate allocation in the system design. The procedures of the proposed FRA-JSCC are presented in Algorithm 1.

Algorithm 1 Flow Rate Allocation based Joint Source Channel Coding.

5 Performance evaluation

In this section, we evaluate the efficacy of the proposed FRA-JSCC by comparing it with the existing schemes for video delivery over heterogeneous wireless networks. We first describe the emulation methodology that includes the emulation setup, reference schemes, performance metrics, and emulation scenario.

5.1 Emulation methodology

5.1.1 Emulation setup

We adopt Exata and the Joint Scalable Video Model (JSVM) as the network emulator and video codec, respectively. The architecture of the evaluation system is presented in Figure 5, and the main configurations are set as follows:

Figure 5

System architecture for performance evaluation.

  •  Exata 2.1 [43] is used as the network emulator. Exata is an advanced edition of QualNet [44] in which we can perform semi-physical emulations. In order to implement emulations based on real video streaming, we integrate the source code of JSVM (as an object file library (.LIB)) with Exata and develop an application layer protocol named ‘Video Transmission’. Detailed descriptions of the development steps can be found in the Exata Programmer’s Guide [43]. In the emulation topology, the video server has one wired network interface and the mobile client has three wireless network interfaces, i.e., cellular, WLAN, and WiMAX. We can construct an end-to-end connection to a specific wireless network interface by binding a pair of IP addresses from the server and the client. The configurations of the emulated background traffic in the wired networks are listed in Table 2. The server and client are mapped to real computers, which are connected to the emulation server through the Exata Connection Manager. IEEE 802.11b is adopted as the WLAN protocol. The configurations of the heterogeneous wireless networks are summarized in Table 3 [4, 5, 45].

Table 2 Parameters of background traffic
Table 3 Parameter configuration of wireless networks
  •  H.264/SVC reference software JSVM 9.18 [46] is adopted as the video encoder. The generated video stream is encoded at 30 frames per second, and a GOP consists of 8 frames. The test video sequences are Foreman, Mother & Daughter, Hall, and Container in QCIF (quarter common interchange format) with 300 frames. Each sequence features a different pattern of temporal motion and spatial characteristics, which is reflected in its video quality versus encoding rate dependency. We concatenate each sequence 10 times, giving 3,000 frames, in order to obtain statistically meaningful results. The loss requirement (Δ) and delay constraint () are set to 1% and 250 ms, respectively.

5.1.2 Reference schemes

We compare the performance of FRA-JSCC with the following schemes for video delivery in heterogeneous wireless networks:

  • FCVP [6]. As the system proposed in [6] aims at exploiting the path diversity in heterogeneous wireless networks based on the fountain code, we name it the fountain code-based virtual path construction system. In the implementation of FCVP, the control parameters are updated every 0.5 s. The symbol and packet sizes are set to 8 and 512 bytes, respectively.

  • JMFR [16]. The joint multimedia-FEC rate allocation scheme computes the optimal source and FEC rate for scalable video over multi-path networks based on the utility algorithm. The number of video layers is set to 1 in all the emulations.

  • DMP [18]. The dynamic multi-path streaming scheme utilizes multiple paths by maintaining a transmission control protocol (TCP) connection on each path. The sender puts the data packets in a single sender queue. At any time, only one TCP connection can access the sender queue. The winning TCP connection keeps sending data until it is blocked. Another available TCP connection then gains access to the sender queue and continues sending data. In order to compare the performance fairly with the other competing models, we dynamically adjust the video encoding rate based on the aggregate bandwidth of the available links.
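The queue hand-off described for DMP can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the function name `dmp_schedule` and the per-path `window` (the number of packets a connection sends before it blocks) are hypothetical stand-ins for real TCP blocking behavior, and connections take turns in a fixed order rather than competing for the queue.

```python
from collections import deque

def dmp_schedule(packets, window):
    """Toy model of a single sender queue shared by several connections.

    window[path] is an assumed, fixed number of packets that the connection
    on `path` can send before it blocks; real TCP blocking is dynamic.
    Returns the (packet, path) pairs in sending order.
    """
    if not window or any(w <= 0 for w in window.values()):
        raise ValueError("every path needs a positive window")
    queue = deque(packets)
    paths = list(window)
    assignment = []
    turn = 0
    while queue:
        path = paths[turn % len(paths)]
        # The winning connection drains the shared queue until it blocks.
        for _ in range(window[path]):
            if not queue:
                break
            assignment.append((queue.popleft(), path))
        # Hand the queue to the next available connection.
        turn += 1
    return assignment
```

For example, `dmp_schedule(range(5), {"wlan": 2, "wimax": 1})` sends packets 0 and 1 on the first path, packet 2 on the second, and the remaining two on the first path again.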

5.1.3 Performance metrics

We adopt the following performance metrics to evaluate the proposed approach against the above competing approaches:

  • PSNR. Peak signal-to-noise ratio is a standard metric of video quality and is a function of the mean square error between the original and the received video frames. If a video frame is lost or past the deadline, it is considered lost but may be concealed by copying from the last received frame before it.

  • Average end-to-end delay. The end-to-end delay of a video frame consists of delay in the network and the resequencing time at the client. It is counted from the generation time of a video frame to the time when it can be decoded.

  • Effective loss rate. As introduced in Section 3.4, the effective loss rate π_B includes the transmission loss and the overdue loss. Because PSNR measures video quality only after error concealment of the lost video frames, we also measure the effective loss rate to assess how well the competing models mitigate packet loss.
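As a rough illustration of the PSNR metric with frame-copy concealment, the sketch below computes per-frame PSNR from the mean square error. It is not the evaluation code used in the emulations: frames are assumed to be flat lists of 8-bit luma samples, a lost or overdue frame is represented by `None`, and the names `psnr` and `sequence_psnr` are hypothetical.

```python
import math

def mse(ref, rec):
    """Mean square error between two equally sized frames."""
    return sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)

def psnr(ref, rec, peak=255.0):
    """PSNR in dB; infinite when the frames are identical."""
    err = mse(ref, rec)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)

def sequence_psnr(original, received):
    """Per-frame PSNR with frame-copy concealment.

    received[i] is None when frame i was lost or missed its deadline;
    it is then replaced by the last frame that was actually received.
    Frames lost before any frame has arrived are skipped in this sketch.
    """
    last = None
    values = []
    for ref, rec in zip(original, received):
        if rec is not None:
            last = rec
        if last is None:
            continue
        values.append(psnr(ref, last))
    return values
```

Averaging the returned list gives a sequence-level PSNR; how frames that are lost before the first arrival should be scored is a design choice that this sketch sidesteps.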

5.1.4 Emulation scenario

We conduct all the emulations in the mobile scenario with trajectories indexed from 1 to 4, as shown in Figure 5. The four mobile trajectories represent the different access options for the mobile user in the integrated heterogeneous wireless networks, e.g., the user can simultaneously access the UMTS and WiMAX networks while moving along the first trajectory. The mobile client sends requests to the server through a wireless interface and establishes the connection whenever it moves into the coverage area. The moving speed of the client is set to 2 m/s in all the emulations. In all the emulations, the components of FRA-JSCC work at the GOP level, i.e., every 0.25 s. It is necessary to update the JSCC parameters for each GOP due to the time-varying wireless channel status. However, with regard to the coding efficiency, it is impractical to track the rate variation at the video frame level.

To obtain statistically confident results, we repeat each set of emulations with different video sequences more than five times and report the average results with a 95% confidence interval. The microscopic and mobility results are presented with measurements of finer granularity.
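One common way to obtain such an interval from a handful of repeated runs is the Student's t interval, sketched below. The text does not state which estimator was used, so this is an assumption; the function name `mean_ci95` and the sample values in the usage note are illustrative only.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t for 1 to 10 degrees of freedom.
T_975 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
         6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}

def mean_ci95(samples):
    """Return (mean, half_width) of a 95% confidence interval.

    For more than 11 samples the normal value 1.96 is used, which slightly
    underestimates the interval for moderate sample sizes.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs")
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / math.sqrt(n)  # standard error of the mean
    t = T_975.get(n - 1, 1.96)
    return mean, t * sem
```

For instance, `mean_ci95([34.0, 35.0, 36.0, 35.0, 35.0, 35.0])` returns the mean of six hypothetical PSNR runs together with the half-width that would be drawn as an error bar.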

5.2 Evaluation results

Before showing the experimental results of the performance metrics in detail, we first present the channel status information, which is fed back from the client with a 0.25-s period. Figure 6 plots the available bandwidth of different wireless access networks while the client moves along mobile trajectory 3. It can be observed that the available bandwidth of both WLAN and WiMAX fluctuates due to the injected background traffic and client mobility. The instantaneous loss rates are shown in Figure 7. Due to the lack of space, we do not present all the channel status information collected during the experiments in this section.

Figure 6

Available bandwidth of different wireless networks while moving along mobile trajectory 3. (a) WLAN and (b) WiMAX.

Figure 7

Instantaneous channel loss rate of different wireless networks while moving along mobile trajectory 3. (a) WLAN and (b) WiMAX.

5.2.1 PSNR

As shown in Figure 8, FRA-JSCC achieves higher PSNR values and lower variations than the competing models. The average video PSNR in trajectory 2 is lower than that in trajectory 1 because the WLAN is less stable than the WiMAX. The results verify the instance in Figure 1 and the conclusions in related work [6, 7]. Moreover, the advantage of FRA-JSCC and FCVP over the other two schemes is larger in trajectories 3 and 4, where more wireless access networks are available. The substantial improvements in video quality confirm the importance of JSCC in conjunction with flow rate allocation in heterogeneous wireless networks. FRA-JSCC outperforms FCVP because the Reed-Solomon code is more appropriate than the fountain code for real-time video and thus reduces the erasure-coding-induced delays. To give a microscopic view of the results, we also list the mean values and standard deviations (Stddev) for mobile trajectory 4 in Table 4. The per-frame video PSNR during the interval [0, 20] s is presented in Figure 9. It can be observed that FRA-JSCC keeps the PSNR values in a relatively higher range. In mobile trajectory 1, the superiority of FRA-JSCC over JMFR becomes more obvious, which is due to the increased number of access options.

Figure 8

Average PSNR values and variances under different evaluation scenarios.

Figure 9

PSNR values of the received 600 video frames in the Foreman sequence. (a) FRA-JSCC, (b) FCVP, (c) JMFR, and (d) DMP.

Table 4 Average PSNR values for different compared models

5.2.2 Average end-to-end delay

Figure 10 plots the average end-to-end delays together with their confidence intervals. FRA-JSCC achieves the lowest delay among all the competing models. The delay performance of FCVP is inferior to that of FRA-JSCC and JMFR due to the large block size of the fountain code and its coding inefficiency. The results indicate that the Reed-Solomon code is more suitable for real-time video applications than the fountain code. Figure 11a depicts the cumulative distribution function of the end-to-end video frame delay from a single experiment. We can see that the per-frame delay of FRA-JSCC is significantly lower than that of the three reference schemes. Although FEC encoding is not employed in DMP, lost video frames need to be retransmitted, which increases the end-to-end delay. As each video frame is associated with a decoding deadline in real-time applications, we plot the ratio of video frames past the decoding deadline of 200 ms in Figure 11b.
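The two quantities plotted in Figure 11 can be derived from a list of per-frame delays as in the following sketch. The function names and the sample delays in the usage note are hypothetical; only the 200-ms deadline is taken from the text.

```python
def empirical_cdf(delays_ms):
    """Return (delay, cumulative fraction) points sorted by delay."""
    ordered = sorted(delays_ms)
    n = len(ordered)
    return [(d, (i + 1) / n) for i, d in enumerate(ordered)]

def deadline_miss_ratio(delays_ms, deadline_ms=200.0):
    """Fraction of frames whose end-to-end delay exceeds the deadline."""
    if not delays_ms:
        raise ValueError("no delay samples")
    return sum(d > deadline_ms for d in delays_ms) / len(delays_ms)
```

With the illustrative delays `[120, 180, 210, 90]`, one of the four frames exceeds 200 ms, so `deadline_miss_ratio` yields 0.25, and `empirical_cdf` gives the step points from which a curve like Figure 11a could be drawn.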

Figure 10

The average end-to-end delays of all the compared schemes.

Figure 11

Delay performance of the competing models. (a) Cumulative distribution function and (b) ratio of video frames past the decoding deadline of 200 ms.

5.2.3 Effective loss rate

Figure 12 depicts the effective loss rates of all the competing schemes under different mobile trajectories. The pattern closely mirrors the results presented in Figure 8, as the PSNR generally degrades as the ratio of lost video frames grows. FRA-JSCC significantly outperforms the reference schemes as it takes into account both the loss and delay requirements. However, in contrast to the end-to-end delay results, FCVP outperforms JMFR and DMP in minimizing the effective loss rate because it includes a physical path selection algorithm in the system design, which substantially decreases the transmission loss.
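The split of the effective loss into transmission loss and overdue loss can be measured by simple counting, as sketched below. This is an empirical counterpart only and not the analytical expression for π_B from Section 3.4; the function name is hypothetical, a `None` entry is assumed to mark a unit that never arrived, and the default deadline reuses the 250-ms delay constraint from the emulation setup.

```python
def effective_loss_rate(arrival_delays_ms, deadline_ms=250.0):
    """Return (transmission, overdue, effective) loss fractions.

    arrival_delays_ms holds one entry per transmitted unit: None if it was
    lost in transmission, otherwise its end-to-end delay in milliseconds.
    """
    total = len(arrival_delays_ms)
    if total == 0:
        raise ValueError("no samples")
    transmission = sum(d is None for d in arrival_delays_ms)
    overdue = sum(d is not None and d > deadline_ms for d in arrival_delays_ms)
    return transmission / total, overdue / total, (transmission + overdue) / total
```

For the illustrative input `[100, None, 300, 200]`, one unit is lost in transmission and one arrives late, so the effective loss rate is 0.5.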

Figure 12

The effective loss rates of all the competing models.

6 Conclusions

In this paper, we have presented a flow rate allocation-based JSCC approach for mobile video delivery in heterogeneous wireless networks. Through modeling and analysis, we have developed solutions for FEC redundancy adaption, video source rate adaption, and flow rate allocation. Experimental results show that the proposed FRA-JSCC is able to dynamically select the appropriate wireless access networks out of all candidates and significantly improve the video PSNR. As future work, we will consider (1) designing a seamless vertical handoff algorithm for optimal-quality video in the integrated WLAN, WiMAX, and cellular networks. The work in [5] formulates the heterogeneous wireless networks as restless bandit systems. However, it does not provide an in-depth analysis of the physical characteristics (e.g., the coverage and received signal strength) of each wireless network. We will also consider (2) integrating an optimal path interleaving mechanism into FRA-JSCC to overcome burst loss.

Endnotes

a We choose JSVM for the convenience of source code integration, as both Exata and JSVM are developed in C++, while the H.264/AVC JM software (http://iphome.hhi.de/suehring/tml/) is developed in C.