A social-aware video sharing solution using demand prediction of epidemic-based propagation in wireless networks

The video services that account for the majority of global network traffic consume significant amounts of electricity and network resources to meet the large-scale demand of users. Variations in user interest and social influence lead to high maintenance costs for achieving a dynamic balance between supply and demand, which negatively impacts the sustainable development of video services. In this paper, we propose a social-aware video-sharing solution using demand prediction of epidemic-based propagation in wireless networks (SDPEP). SDPEP constructs a video propagation model based on user “pull” and “push” sharing behaviors and designs an estimation method for calculating the probability of video fetching by investigating user interests and social relationships. SDPEP uses the probability of video fetching to calculate the basic reproduction number during epidemic-based video propagation, predicting user demand during the propagation process. To ensure efficient caching with low-cost adjustments to video distribution, SDPEP employs a caching-based adjustment strategy for distributing videos while maintaining dynamic balance between supply and demand. Extensive testing shows that SDPEP outperforms other state-of-the-art solutions.


Introduction
Rapidly evolving wireless communication technologies, such as 5G, not only enable ubiquitous user access but also enhance network bandwidth to deliver content-rich and high-definition videos [1][2][3][4][5]. Internet video services rely on captivating content and convenient accessibility to attract a substantial user base. Additionally, they leverage social networks to propagate videos through interpersonal connections, thereby further amplifying the scale and velocity of video dissemination [6][7][8][9]. For example, some popular videos on certain video applications (such as TikTok) can receive over 30 million views within 24 h. To meet the high user demand for videos, video systems must provide sufficient bandwidth resources in advance through caching-based distribution of video copies. However, dynamic changes in user demand due to variations in interest and social influence can result in a shortage or redundancy of bandwidth supply. Insufficient bandwidth supply causes long startup delays and low quality of experience (QoE) for users when popular videos attract a large number of viewers during a short time period. Video systems then need to quickly regulate the distribution of resources to bridge the gap between supply and demand, which consumes significant amounts of energy. Conversely, insufficient demand results in supply redundancy that wastes the electric energy and network resources used for pre-adjustment of video distribution. Scaling video systems with low operating costs and high-quality services is crucial for sustainable development [10][11][12][13]. High-efficiency video sharing based on accurate demand prediction can dynamically balance supply and demand with slight real-time regulation during propagation, supporting energy-efficient expansion while ensuring high-quality user experiences [14][15][16]. However, integrating video services with social networks introduces uncertainty and complexity that challenge the accuracy of predicting user demands.
Numerous researchers have focused on resource allocation and supply based on demand prediction [17][18][19]. For example, Liu et al. proposed a reactive content caching method that predicts content popularity to make timely caching decisions for newly requested content, thereby improving cache hit rates and reducing average download times [20]. Panayiotou et al. developed a combinational fair allocation method of network resources based on traffic demand predictions to ensure quality-of-service guarantees for requesters in under-congested networks [21]. Zhang et al.'s hierarchical proactive caching approach considers both future user demands and mobility, effectively addressing the problems of traditional reactive caching to reduce network load and improve user QoE [22]. However, these methods do not comprehensively consider user interest or social influence based on sharing behaviors during video propagation, which makes it difficult to ensure accurate demand prediction or to support dynamic regulation as user demand varies. Therefore, an efficient social-based video sharing method is needed that accurately predicts user demand by modeling the process of video propagation while jointly investigating user interest and social influence, so as to support economical regulation of video resources while maintaining high QoE.
In this paper, we propose a social-aware video sharing solution using demand prediction of epidemic-based propagation in wireless networks (SDPEP). SDPEP constructs a video-sharing model based on the "pull" and "push" modes and evaluates users' probability of fetching videos according to their interest preferences and social influence. SDPEP utilizes the basic reproduction number to predict video demand and designs a caching-based video-sharing strategy to achieve a balance between supply and user demand. The notable contributions of SDPEP are as follows:
1. SDPEP investigates user behaviors in both the "pull" and "push" modes of social-based video sharing during the process of video propagation and evaluates the probability of fetching videos for each mode based on measurements of user interest and social influence, respectively. SDPEP further estimates the overall probability of video fetching by integrating the "pull" and "push" probabilities.
2. SDPEP constructs an SIR-based model of video propagation and formulates a calculation method for the basic reproduction number based on the probability of fetching videos, which estimates overall user demand at a macro level during the process of video propagation. SDPEP further designs a caching-based adjustment strategy for distributing videos according to predicted demand, which implements efficient, low-cost video caching and ensures user QoE.
The rest of the paper is organized as follows. Section 2 describes related work on social-based video propagation methods. Section 3 presents the detailed design of SDPEP, including estimation methods for fetching probabilities in social networks, prediction techniques for video demand based on the basic reproduction number, and strategies for adjusting cached videos. Section 4 evaluates the performance of SDPEP through a comparative experiment; the simulation results show that SDPEP achieves much better performance than other state-of-the-art solutions.

Related work
Some researchers also focus on video prefetching methods based on predicting user video demand. Zhang et al. propose a hierarchical proactive caching method that considers users' demands and mobility by investigating the historical popularity of videos to estimate their preferences and predict future video demands [22]. The current location and planned route of automatic vehicular users are used to predict their future locations, allowing base stations or roadside units along the predicted driving path to proactively cache videos that may be requested according to the predicted demand and locations. Xu et al. propose an optimal content caching method in 5G ICN by building a fluid-based model that combines Information-Centric Networking (ICN) and 5G D2D communications [23]. They formulate an optimal content replication problem that accounts for caching overhead and load; nodes must exchange state lists to predict system-wide demand variation, and caching decisions require all nodes to implement video caching at the same time via message broadcasting [24]. The above methods investigate the historical popularity information of videos to predict user demand and regulate video distribution through video caching. However, they neglect important factors related to user sociality, so the prediction accuracy of user demand is difficult to ensure, which results in frequent jitter of video caching and replacement and increases the energy and resource consumption of video caching.
Some researchers also focus on video caching methods based on the prediction of video popularity. Tan et al. propose an estimation method for predicting video popularity by analyzing the average percentage of videos watched [25]. By investigating the relationship between the average watched percentage and future views of videos using large-scale data from online video services, they construct an age-sensitive model to predict video popularity. This prediction is designed within a finite field of average watched percentage to ensure accuracy. Fan et al. introduce a content caching policy based on evolving learning in edge networks [26]. They adaptively learn time-varying content popularity and train content features using a multi-layer recurrent neural network, reducing computational complexity and improving prediction accuracy. Content cached in the local buffer can be dynamically removed based on predictions of content popularity, effectively reducing jitter during replacement. Zhao et al. propose a version-aware caching scheme at edge servers for multi-version video-on-demand systems [27]. Video cache placement is formulated as a knapsack problem considering cache storage constraints and transcoding computation at edge servers to enhance the cache hit ratio. A version-aware caching profit is evaluated based on transcoding relations among versions, enabling appropriate decisions regarding cached and replaced videos through algorithms utilizing Lagrangian relaxation. Han et al.
propose proactive edge caching policies for the cached fraction and encoding bit rate of videos to maximize the weighted average QoE of VoD service [28]. The helper nodes cache videos and are deployed in the coverage area of base stations. The weighted popularity-to-duration ratio of videos determines the performance of the caching policy according to the formulated caching optimization problems with convexity properties. A low-complexity video caching algorithm is designed to improve the caching performance of VoD service. The above methods learn from historical video popularity to estimate the replaceability of cached videos. The scale and duration of video popularity depend on the propagation channels, such as social networks, and on user interest levels. However, these methods neglect the social channel and user preference, which makes it difficult to ensure high prediction accuracy and to reduce the jitter of video replacement.
Some researchers focus on social-based video caching and traffic scheduling. Wang et al. proposed DPDL-SVP, a mobile social video prefetching method based on distributed online learning with differential privacy [29]. DPDL-SVP identifies the main influencing factors for user video requests, including video preference, content popularity, and social interactions, through analysis of historical data from popular online social network sites such as WeiBo.cn. By formulating and dividing the online convex optimization problem for video prefetching, DPDL-SVP designs a distributed algorithm that ensures differential privacy and proves the performance bound of the algorithm. Accurate video prefetching based on user demand prediction can effectively optimize video distribution and reduce delivery delays. Roy et al. designed a transfer learning framework that analyzes social streams to estimate sudden popularity bursts in online content [30]. They developed a transfer learning algorithm that models the propagation levels of videos in social networks to enhance the prediction accuracy of video popularity by capturing topics discussed in social streams. The sudden rise in video popularity is mainly influenced by the social prominence of its context, which is determined using large datasets comprising tweets and videos. Jia et al. propose an interest-aware edge caching strategy for 5G ultra-dense networks [31]. They utilize Spectral Clustering to generate initial clusters of videos and refine them using the Fuzzy C-Means clustering technique to construct users' interest domains based on similarity levels within these domains. Users with common or similar interests are then grouped together within these interest domains. The designed performance-aware video caching strategy enables intelligent caching and removal of local video resources according to users' interests. Xie et al.
propose a prediction method of micro-video popularity based on a hierarchical multimodal variational encoder-decoder using user information and micro-video content [32]. The popularity of micro-videos is decoded as a lower-dimensional stochastic embedding according to the multimodal variational encoder-decoder. A user encoder-decoder is designed to learn the Gaussian embedding of micro-video and user information with social influence, which generates the coarse-grained video popularity. The posterior distribution of the micro-video embedding is refined by learning a prior distribution encoded from the video content features, which supports decoding the fine-grained popularity of the micro-video. The popularity trend of micro-videos can be expressed by the learned latent embeddings of micro-videos based on the multimodal extension of variational information bottleneck theory. The above methods consider user interest and social relationships, but they neglect the social influence and user-sharing behaviors in the process of video propagation. This severely degrades the prediction accuracy of video demand and does not support high-efficiency video sharing with low caching cost.

SDPEP detailed design
Users have an interest range determined by the videos they have watched. Let $IR_i = (c_1, c_2, \ldots, c_n)$ denote the interest range of $u_i$. Any item $c_i$ in $IR_i$ is a video cluster and is defined as $c_i = (v_a, v_b, \ldots, v_k)$. Moreover, each video cluster $c_i$ has a cluster head item $v_k$, chosen as the item with the shortest distance to the other items in $c_i$, which represents $c_i$. When a video $v_j$ is propagated and is stored and watched by a user $u_i$, $v_j$ belongs to the interest range of $u_i$ and joins the video cluster whose head video has the highest similarity value with $v_j$. If $v_j$ is a propagated video, the interest level of $u_i$ for $v_j$ can be defined as:

$$I_{ij} = s_{hj} \times \frac{N_m}{N_t}, \qquad s_{hj} = \mathrm{MAX}[s_{aj}, s_{bj}, \ldots, s_{nj}] \tag{1}$$

where $s_{hj}$ is the content-based similarity value between $v_h$ and $v_j$ ($v_h$ is the head node of an item $c_h$ in $IR_i$ of $u_i$), and the content similarity of two videos is defined as the cosine of the angle between the two video vectors. $\mathrm{MAX}[s_{aj}, s_{bj}, \ldots, s_{nj}]$ is the maximum value among the similarities between all cluster head nodes and $v_j$. The result $s_{hj}$ of $\mathrm{MAX}[s_{aj}, s_{bj}, \ldots, s_{nj}]$ denotes that $v_j$ belongs to the video cluster $c_h$ corresponding to the cluster head node $v_h$. $N_m$ is the number of items in $c_h$ and $N_t$ is the total number of videos in all clusters of $IR_i$; $N_m / N_t$ can also be considered as the probability that $u_i$ obtains a video which belongs to cluster $c_h$. $s_{hj}$ can be considered as a weight value for, or a membership degree between, $v_j$ and $c_h$.
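The interest-level computation described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes that each cluster is summarized by a head feature vector and a size, and that the interest level is the product of the best head similarity and the cluster-size share $N_m / N_t$. The function names and the tuple layout are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat as dissimilar
    return dot / (norm_a * norm_b)


def interest_level(video_vec, clusters):
    """Return (index of the best-matching cluster, interest level).

    clusters: list of (head_vector, cluster_size) pairs describing one
    user's interest range (hypothetical layout chosen for this sketch).
    Assumption: interest = max head similarity * (N_m / N_t).
    """
    if not clusters:
        return None, 0.0
    total = sum(size for _, size in clusters)  # N_t
    sims = [cosine_similarity(head, video_vec) for head, _ in clusters]
    h = max(range(len(clusters)), key=lambda k: sims[k])  # best cluster
    return h, sims[h] * clusters[h][1] / total


# Two clusters: one with 3 videos, one with 1 video.
best, level = interest_level([1.0, 0.0], [([1.0, 0.0], 3), ([0.0, 1.0], 1)])
```

In this toy call the video matches the head of the first cluster exactly, so the result is the cluster share 3/4 weighted by a similarity of 1.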

Ways of social-based video propagation
The dissemination of videos in social networks relies on the layer-by-layer generation of video copies along the social links between users, which can be represented by a graph $G = (V, E)$, where $V$ denotes the set of vertices representing users and $E$ represents the set of edges representing social links. If two vertices are connected by an edge in $G$, they are social neighbors and share videos with each other via this connection; if they do not have an edge, direct sharing of videos between them is not possible. In this process, seed users play a crucial role, as they possess the initial videos and act as the driving force for video propagation. They provide initial video data to requesting nodes and push videos to their social neighbors, who then use both the "pull" and "push" modes to propagate these videos. The "pull" mode occurs when a user $u_i$ expresses interest in a video $v_j$ by sending a request message, whereas the "push" mode happens when $u_i$ has close social ties with another user $u_k$ who pushes the data of $v_j$ to $u_i$, leading to acceptance by $u_i$. Users construct two indexes that record information about their social network and about the stored videos shared among their connections. Local video resources are periodically exchanged with these connections to keep both indexes accurate. These indexes serve as repositories for object and content information related to video sharing, allowing videos to be easily labeled as interesting or uninteresting while providing target objects for efficient searching and pushing. Interest preference and social influence are the critical determinants in both the pull and push modes.
In the "pull" mode, if a user $u_i$ is interested in a video $v_j$ and finds that its social neighbors have stored $v_j$, $u_i$ requests $v_j$ with high probability according to its own preference. If $u_i$'s social neighbors are uninterested in $v_j$, $u_i$ may also lose interest in $v_j$ and not actively request $v_j$ from its social neighbors. In the "push" mode, the outcome depends on both $u_i$'s interest in $v_j$ and the strength of the social relationship with the pushing neighbors. If $u_i$ has a high interest in $v_j$ and a close social relationship with its social neighbors, $u_i$ accepts the pushed $v_j$ with high probability; if $u_i$ has a low interest in $v_j$ and weak social relationships with its social neighbors, $u_i$ rejects $v_j$. If $u_i$ has a high interest in $v_j$ but a weak social relationship with its social neighbors, $u_i$ may accept $v_j$ due to the high interest; if $u_i$ has a low interest in $v_j$ but close social relationships with its social neighbors, $u_i$ may also accept $v_j$ due to the strong social influence from its social neighbors. Table 1 shows the conditions under which $u_i$ fetches $v_j$.
After $u_i$ obtains $v_j$, $u_i$ pushes $v_j$ to its social neighbors or responds to the video requests of other users. In other words, $u_i$ can be considered an infected user that relies on social links to propagate copies of $v_j$ via the "pull" and "push" modes. Common interests and strong social influence are the basis of video propagation in social networks. Following the epidemic model, users are in one of three states during video propagation: "susceptible", "infected" and "immune" [33]: (1) "susceptible": users who have a certain interest in $v_j$ (e.g. they have watched videos similar to $v_j$) are considered susceptible users; (2) "infected": users who have obtained $v_j$ via the "pull" or "push" mode are considered infected users; (3) "immune": users who have watched $v_j$ become immune users. The probability that $u_i$ fetches $v_j$ via "pull" and "push" can be defined as:

$$P_i^j = 1 - (1 - P^r_{ij})(1 - PF_i^j) \tag{2}$$

where $P^r_{ij}$ is the probability that $u_i$ fetches $v_j$ via the pull mode and $1 - P^r_{ij}$ is the probability that $u_i$ does not request $v_j$; $PF_i^j$ is the probability that $u_i$ fetches $v_j$ via the push mode and $1 - PF_i^j$ is the probability that $u_i$ does not obtain $v_j$ through a push from any of its social neighbors; $(1 - P^r_{ij})(1 - PF_i^j)$ is the probability that $u_i$ does not obtain $v_j$ using either the "pull" or the "push" mode.
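The way the two modes combine can be checked with a short sketch. It assumes only what the paragraph states: the user fails to fetch the video when both the pull and the push attempts fail, and the two events are treated as independent. The function name is hypothetical.

```python
def fetch_probability(p_pull, p_push):
    """Probability of fetching a video through pull or push.

    Implements 1 - (1 - p_pull) * (1 - p_push): the complement of the
    event that neither mode delivers the video (independence assumed).
    """
    for p in (p_pull, p_push):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return 1.0 - (1.0 - p_pull) * (1.0 - p_push)


# A user with a 50% chance in each mode fetches the video 75% of the time.
combined = fetch_probability(0.5, 0.5)
```

A useful property of this form is that the combined probability is never lower than either input, so adding a push channel can only increase the predicted demand.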

Probability of pull-based video fetching
Interest preference and social influence are the crucial determinants in the infection stage of the "pull" mode. Let $v_j$ be a video propagated in social networks. The interest level $I_{ij}$ of a user $u_i$ for $v_j$ can be calculated according to Eq. (1). The social influence on $u_i$ comes from $u_i$'s social neighbors. If most of $u_i$'s social neighbors have watched $v_j$, $u_i$ receives strong social influence from them; if most of them have not watched $v_j$, $u_i$ may not be influenced by their watching behaviors. The social influence of $u_i$'s social neighbors for $v_j$ can be defined as:

$$SI_{ij} = \frac{|SN_i^j|}{|SN_i|}$$

where $SN_i^j$ is the set of $u_i$'s social neighbors which have watched $v_j$, $|SN_i^j|$ is the number of its members, and $SN_i$ is the set of all of $u_i$'s social neighbors. A further weight parameter denotes the importance levels and interaction relation of $EI_{ij}$ and $RI_{ij}$. The probability that $u_i$ requests $v_j$ can be defined as:

$$P^r_{ij} = \alpha \times SI_{ij} + (1 - \alpha) \times I_{ij}$$

where $\alpha$ is a weight parameter that denotes the importance levels and interaction relation of $SI_{ij}$ and $I_{ij}$. $SI_{ij} \in [0, 1]$ and $I_{ij} \in [0, 1]$, so $P^r_{ij} \in [0, 1]$. $P^r_{ij}$ is the probability of pull-based video fetching based on the joint optimization of social influence $SI_{ij}$ and content interest $I_{ij}$.
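A minimal sketch of the pull-side estimate follows. It assumes that social influence is the share of a user's social neighbors that have already watched the video, and that the pull probability is a weighted sum of social influence and interest. The weight name `alpha`, its default of 0.5, and the function names are choices made for this sketch only.

```python
def social_influence(neighbors, watched_by):
    """Share of a user's social neighbors that have watched the video.

    neighbors: set of neighbor ids; watched_by: set of user ids that
    have watched the video (may include non-neighbors).
    """
    if not neighbors:
        return 0.0  # no neighbors, hence no social influence
    return len(neighbors & watched_by) / len(neighbors)


def pull_probability(interest, neighbors, watched_by, alpha=0.5):
    """Weighted sum of social influence and content interest.

    alpha is an assumed name for the weight parameter; 0.5 is an
    arbitrary default, not a value taken from the paper.
    """
    si = social_influence(neighbors, watched_by)
    return alpha * si + (1.0 - alpha) * interest


# Two of four neighbors have watched the video; interest level is 0.8.
p_pull = pull_probability(0.8, {"a", "b", "c", "d"}, {"a", "b", "x"})
```

When no neighbor has watched the video yet, the social term vanishes and the estimate reduces to the interest level scaled by `1 - alpha`, which matches the initial-propagation case discussed later in the text.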

Probability of push-based video fetching
After a user $u_k$ obtains a popular video $v_j$, $u_k$ can push $v_j$ to its social neighbors. A video push between two users relies on the precondition that they share an edge in $G$. The motivation for a video push includes the social relationship level and the content interest level. If there is high-frequency sharing of videos between $u_k$ and its social neighbors, the probability that $u_k$ pushes videos to them stays high. In other words, when $u_k$ considers that the received videos may be accepted by its social neighbors, given the consistency between the pushed content and their preferences, $u_k$ pushes the videos to them. For instance, when $u_i$ and $u_k$ have a close social relationship, $u_i$ receives $v_j$ pushed by $u_k$: even if the content of $v_j$ is not compatible with the preference of $u_i$, $u_k$ may still push $v_j$ to $u_i$ regardless of whether $u_i$ accepts it. The social relationship between users plays the leading role in this case.
If there is a strong common interest between $u_k$ and $u_i$, $u_k$ still pushes $v_j$ to $u_i$ regardless of the level of the social relationship between them. The degree of coincidence between the content interests of $u_k$ and $u_i$ plays the leading role in this push event. If $u_i$ and $u_k$ have a close social relationship and strong common interests, $u_k$ pushes $v_j$ to $u_i$ with high probability. The probability that $u_i$ obtains the push of $v_j$ from its social neighbor $u_k$ can be defined as:

$$P_{ki}^j = \theta \times SD_{ki} + (1 - \theta) \times CI_{ki}$$

where $\theta$ is a weight parameter used to regulate the proportion of social strength and preference in $P_{ki}^j$; $CI_{ki}$ denotes the common interest level between $u_i$ and $u_k$ and is defined as:

$$CI_{ki} = \frac{|VS_i \cap VS_k|}{|VS_i \cup VS_k|}$$

where $VS_i$ and $VS_k$ are the sets of videos watched by $u_i$ and $u_k$, respectively; $|VS_i \cap VS_k|$ is the size of the intersection of $VS_i$ and $VS_k$; $|VS_i \cup VS_k|$ is the size of the union of $VS_i$ and $VS_k$. $SD_{ki}$ is the strength level of the social relationship from the perspective of pushing videos from $u_k$ to $u_i$; its definition uses $F_{ki}$, the number of times $u_k$ pushes videos to $u_i$, and $F_k$, the total number of times $u_k$ pushes videos to its social neighbors.
The definition of $SD_{ki}$ also involves the video-sharing level of $u_i$ relative to $u_k$'s social neighbors and the total video-sharing level between $u_i$ and $u_k$'s social neighbors. The event that $u_i$ receives the push of $v_j$ from $u_k$ has the precondition that $u_k$ has obtained $v_j$. If $u_k$ has not obtained $v_j$, $u_k$ does not push $v_j$ and $P_{ki}^j = 0$. If $u_k$ has obtained $v_j$, the probability that $u_i$ receives the push of $v_j$ from $u_k$ is $P_{ki}^j = \theta \times SD_{ki} + (1 - \theta) \times CI_{ki}$ according to Eq. (7), where $\theta$ is the weight parameter of $P_{ki}^j$. $SD_{ki} \in [0, 1]$ and $CI_{ki} \in [0, 1]$, so $P_{ki}^j \in [0, 1]$. If $u_k$ is the seed user of $v_j$, and we assume that video seed users must push the propagated videos to their social neighbors, $u_i$ inevitably receives the push of $v_j$ from $u_k$ and $P_{ki}^j = 1$. Therefore, $P_{ki}^j$ can be redefined as:

$$P_{ki}^j = \begin{cases} 1, & u_k \text{ is a seed user of } v_j \\ \theta \times SD_{ki} + (1 - \theta) \times CI_{ki}, & u_k \text{ has obtained } v_j \\ 0, & u_k \text{ has not obtained } v_j \end{cases}$$

Further, when $u_i$ receives the push of $v_j$ from its social neighbors, $u_i$ can decide whether or not to accept it. The social relationship between $u_i$ and the pusher, and the matching level between the interest of $u_i$ and the content of $v_j$, are the important factors for the acceptance of pushed videos. The probability that $u_i$ accepts $v_j$ pushed from its social neighbor $u_k$ can be defined as:

$$PA_{ki}^j = \omega \times SA_{ki} + (1 - \omega) \times I_{ij}$$

where $\omega$ is a weight parameter used to regulate the proportion of social strength and interest in $PA_{ki}^j$. $I_{ij}$ is the interest level of $u_i$ for $v_j$ according to Eq. (1). $SA_{ki}$ is the acceptance rate of pushed videos from $u_k$ to $u_i$ and can be defined as:

$$SA_{ki} = \frac{AF_{ki}}{F_{ki}}$$

where $F_{ki}$ is the number of times $u_k$ pushes videos to $u_i$ and $AF_{ki}$ is the number of times $u_i$ accepts the videos pushed by $u_k$. Because $AF_{ki} \le F_{ki}$, $SA_{ki} \in [0, 1]$. The probability that $u_i$ receives and accepts the push of $v_j$ from its social neighbor $u_k$ can be defined as:

$$PF_{ki}^j = P_{ki}^j \times PA_{ki}^j$$

Because $u_i$ has many social neighbors, $u_i$ may receive and accept the push of $v_j$ from any one of them. The probability that $u_i$ fetches $v_j$ via the push mode can be defined as:

$$PF_i^j = 1 - \prod_{u_c \in SN_i} (1 - PF_{ci}^j)$$

where $1 - PF_{ci}^j$ denotes the probability that $u_i$ does not receive and accept the push of $v_j$ from its social neighbor $u_c$; $\prod_{u_c \in SN_i} (1 - PF_{ci}^j)$ is the probability that $u_i$ does not receive and accept the push of $v_j$ from any of its social neighbors.
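The push-side chain (a neighbor pushes, the user accepts, aggregated over all neighbors) can be sketched as follows. Several points are assumptions of this sketch rather than statements from the text: the social strength `sd` is passed in as a precomputed value because its exact formula is not reproduced here, the weight names `theta` and `omega` and their defaults are hypothetical, the acceptance probability is assumed to be a weighted sum of acceptance rate and interest, and pushes from different neighbors are treated as independent.

```python
def common_interest(watched_i, watched_k):
    """Jaccard overlap of the two users' watched-video sets."""
    union = watched_i | watched_k
    return len(watched_i & watched_k) / len(union) if union else 0.0


def push_probability(sd, ci, has_video, is_seed, theta=0.5):
    """Probability that neighbor k pushes the video to user i."""
    if is_seed:
        return 1.0  # seed users are assumed to always push
    if not has_video:
        return 0.0  # nothing to push yet
    return theta * sd + (1.0 - theta) * ci


def accept_probability(accepted, pushed, interest, omega=0.5):
    """Probability that user i accepts a push from neighbor k.

    accepted / pushed is the historical acceptance rate; the weighted-sum
    form is an assumption made by analogy with push_probability.
    """
    sa = accepted / pushed if pushed else 0.0
    return omega * sa + (1.0 - omega) * interest


def push_fetch_probability(per_neighbor):
    """Probability of obtaining the video from at least one neighbor.

    per_neighbor: iterable of (p_receive, p_accept) pairs, one per neighbor.
    """
    miss = 1.0
    for p_receive, p_accept in per_neighbor:
        miss *= 1.0 - p_receive * p_accept
    return 1.0 - miss


# One seed neighbor and one ordinary neighbor, each accepted half the time.
pf = push_fetch_probability([(1.0, 0.5), (0.5, 0.5)])
```

Keeping the receive and accept steps separate makes it easy to see which factor limits the push channel for a given neighbor.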

Model of video propagation based on SIR model
As previously mentioned, the video propagation process can be described by the SIR epidemic model [33]. Users who have not obtained a video are in the susceptible state; users who have obtained the video and stored it in the local buffer are in the infected state; users who have watched the video content and removed the video from the local buffer are in the immune state, under the assumption that users who have watched video content do not re-watch the same content. Users who have removed a video from the local buffer also no longer disseminate it. The interaction between users (e.g. "pull" and "push") can be considered the route of infection via edges in $G$. The SIR model for video propagation can be defined as:

$$\begin{cases} \dfrac{dS(t)}{dt} = \mu - \beta S(t) I(t) - \mu S(t) \\[4pt] \dfrac{dI(t)}{dt} = \beta S(t) I(t) - \gamma I(t) - \mu I(t) \\[4pt] \dfrac{dR(t)}{dt} = \gamma I(t) - \mu R(t) \end{cases}$$

where $\mu$ is the birth and death rate; $\beta$ is the infection rate from susceptible to infected; $\gamma$ is the recovery rate; $N$ is the total number of users and $N = N(S) + N(I) + N(R)$; $N(S)$ is the number of susceptible users; $N(I)$ is the number of infected users; $N(R)$ is the number of immune users. $S(t) = N(S)/N$ is the ratio between the susceptible users and all users; $I(t) = N(I)/N$ is the ratio between the infected users and all users; $R(t) = N(R)/N$ is the ratio between the immune users and all users; $S(t) + I(t) + R(t) = 1$. The basic reproduction number $R_0$ of the above SIR model can be defined as [34]:

$$R_0 = \frac{\beta}{\gamma + \mu}$$

When any user $u_i$ obtains a video $v_j$, $u_i$ is an infector and the infection period of $u_i$ is $L$. $L$ is the period from infection to recovery and is also the length of $v_j$. In other words, when $u_i$ has obtained $v_j$ and has not removed $v_j$ from the local buffer, $u_i$ can continue to push $v_j$ to its social neighbors or provide $v_j$ to other users requesting $v_j$ during the period $L$.
The number of new infectors generated by the "pull" and "push" of an infector during the period $L$ can be defined as $R_0$. $R_0 \ge 1$ denotes that the scale of video propagation tends to grow; $R_0 < 1$ denotes that the scale of video propagation tends to shrink.
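To make the role of the threshold at $R_0 = 1$ concrete, the sketch below integrates an SIR model with a simple forward-Euler step. It assumes the standard SIR form with equal birth and death rates and the usual expression for the basic reproduction number; the parameter names `beta`, `gamma` and `mu`, the step size, and the initial values are choices made for this illustration.

```python
def basic_reproduction_number(beta, gamma, mu):
    """R0 of the standard SIR model with equal birth and death rate mu."""
    return beta / (gamma + mu)


def simulate_sir(beta, gamma, mu, s0, i0, steps, dt=0.01):
    """Forward-Euler integration of the SIR fractions S, I, R.

    Returns a list of (S, I, R) tuples, one per step, including the
    initial state. Fractions are assumed to sum to 1 at the start.
    """
    s, i, r = s0, i0, 1.0 - s0 - i0
    history = [(s, i, r)]
    for _ in range(steps):
        ds = mu - beta * s * i - mu * s
        di = beta * s * i - (gamma + mu) * i
        dr = gamma * i - mu * r
        s, i, r = s + dt * ds, i + dt * di, r + dt * dr
        history.append((s, i, r))
    return history


# A video that spreads (R0 > 1) versus one that fades (R0 < 1).
growing = simulate_sir(beta=0.5, gamma=0.1, mu=0.0, s0=0.99, i0=0.01, steps=100)
fading = simulate_sir(beta=0.05, gamma=0.1, mu=0.0, s0=0.99, i0=0.01, steps=100)
```

With these inputs the infected fraction rises from its initial value in the first run and falls in the second, which is the growth or shrink behavior the text associates with the two sides of the threshold.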

Prediction of basic reproduction number
The value of $R_0$ relies on the three parameters $\beta$, $\gamma$ and $\mu$, whose values can be calculated in the process of propagating $v_j$. $\gamma$ is the recovery rate from infected to immune; its value can be calculated as the quotient of the number of users moving from the infected state to the immune state and the period. $\beta$ is the transformation rate of users from the susceptible state to the infected state; its value can be calculated as the quotient of the number of users moving from the susceptible state to the infected state and the period. $\mu$ denotes the variation rate of users who newly join the social networks and users who quit the social networks; its value can be calculated as the quotient of the difference between the number of users who newly join and the number of users who quit, and the period. If the propagation period of $v_j$ is $T_s^j$ and is uniformly divided into multiple time plots $T_s^j = t_e - t_0 = (tp_a, tp_b, \ldots, tp_m)$, the numbers of users needed to calculate $\beta$, $\gamma$ and $\mu$ during each time plot can be collected, and the basic reproduction number $R_0(tp_i)$ during any time plot $tp_i$ can be calculated. However, these values are only known after a time plot has elapsed, whereas predicting the basic reproduction number supports the balance between video supply and demand by adjusting the scale of video copies in advance. Therefore, we calculate the predicted values of $\beta$, $\gamma$ and $\mu$ according to the historical statistics.
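The per-time-plot bookkeeping described above can be sketched as follows. The sketch takes the three rates as raw quotients of user counts and the period, as the paragraph describes, and then applies the usual SIR expression for the basic reproduction number; whether any further normalization is applied is not specified here, so none is assumed. The function names and the tuple layout of `counts` are hypothetical.

```python
def estimate_rates(newly_infected, newly_recovered, joined, left, period):
    """Rates for one time plot as count / period quotients.

    Returns (beta, gamma, mu): infection rate, recovery rate, and the
    net join/quit rate of users.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    beta = newly_infected / period
    gamma = newly_recovered / period
    mu = (joined - left) / period
    return beta, gamma, mu


def reproduction_number(beta, gamma, mu):
    """beta / (gamma + mu), or None when the denominator is not positive."""
    denom = gamma + mu
    if denom <= 0:
        return None  # ratio is undefined or not meaningful for this plot
    return beta / denom


def r0_per_time_plot(counts, period):
    """counts: list of (newly_infected, newly_recovered, joined, left)."""
    return [reproduction_number(*estimate_rates(*c, period)) for c in counts]


# Two consecutive time plots of equal length.
series = r0_per_time_plot([(20, 8, 5, 3), (4, 8, 2, 0)], period=10.0)
```

In this toy series the first time plot yields a value above 1 and the second a value below 1, so the first would call for more cached copies and the second for fewer.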
Let $SS_j$ be the set of users in the susceptible state in the process of propagating $v_j$, let $IS_j$ be the set of users in the infected state, and let $MS_j$ be the set of users who have watched $v_j$ and removed it from the local buffer. After members of $IS_j$ have watched $v_j$ and removed it from the local buffer, they leave $IS_j$ and join $MS_j$. The length of time that an infected user stays in $IS_j$ is in the range $(0, L_j]$, where $L_j$ is the length of $v_j$. The recovery rate of any user $u_k$ is the ratio between $l_k$ and $L_j$, where $l_k$ is the watching time of $u_k$ for $v_j$. In the process of propagating $v_j$, the time for which the members of $IS_j$ play $v_j$ can be collected through the exchange of state messages among users, and the number $N^j_{IM}$ of users moving from $IS_j$ to $MS_j$ can also be collected. The real recovery rate during a time plot $tp_h$ can be defined as $\gamma_r = N^j_{IM} / tp_h$. However, the predicted value of $\gamma$ relies on the historical statistics related to $v_j$. The predicted value of $\gamma$ during the next time plot $tp_v$ is defined in terms of $IMS_j(tp_v) \subseteq IS_j(tp_{v-1})$, the set of users who may change state from infected to immune in the next time plot $tp_v$, where $IS_j(tp_{v-1})$ is the set of users who have obtained $v_j$ and store it in the local buffer during the current time plot $tp_{v-1}$. If $PT_h^j$ is the time for which a user $u_h$ in $IS_j(tp_v)$ has watched $v_j$, $L_j - PT_h^j$ is the theoretical remaining time of $u_h$. $VS_h^q$ is the set of videos that have been watched by $u_h$, with $VS_h^q \subseteq c_x$ and $v_j \in c_x$, which means that $VS_h^q$ and $v_j$ belong to the same video cluster $c_x$. $l_h^e$ is the length of the watching time of $u_h$ for $v_e$, and $L_e$ is the length of a video $v_e$ in $VS_h^q$.
The remaining time ratio of $u_h$ for $v_e$ is obtained from $l_h^e$ and $L_e$, and the average remaining time ratio of $u_h$ is taken over all videos in $VS_h^q$. Let $rtp_{v-1}$ be the remaining time of the current time plot $tp_{v-1}$. If the estimated remaining watching time of $u_h$ for $v_j$ exceeds $rtp_{v-1}$, $u_h$ may stay in $IS_j(tp_v)$ and continue to propagate $v_j$; otherwise, $u_h$ may leave $IS_j(tp_v)$ and become a member of $IMS_j(tp_v)$. When all users in $IS_j(tp_{v-1})$ have been evaluated according to the above process, the value of $|IMS_j(tp_v)|$ and the predicted value of $\gamma$ for $tp_v$ can be calculated.
$\beta$ is the infection rate of users, i.e., the number of users who change state from "susceptible" to "infected" during a time plot. In the process of propagating $v_j$, the state of users in $SS_j$ can be collected through the exchange of messages between users, and the number of users in $SS_j$ that change to the infected state during a time plot $tp_{v-1}$ is also counted. However, the predicted value $\beta_p(tp_v)$ of $\beta$ during the next time plot $tp_v$ relies on the probability that users fetch $v_j$ according to Eq. (2). $SS_j(tp_{v-1})$ is the set of users who may want to fetch $v_j$ during a time plot $tp_{v-1}$. Let $u_k$ be a member of $SS_j(tp_{v-1})$ and let $SN_k$ be the set of $u_k$'s social neighbors. The change of $u_k$'s state is determined by the value of $P_k^j$. We calculate the predicted value of $P_k^j$ during the next time plot $tp_v$ to estimate the expected state of $u_k$.
The value of $P_k^j$ relies on $P^r_{kj}$ and $PF_k^j$. For instance, $SI$ is a dynamic parameter, so the values of $SI$ and $SD$ change with the state variation of $u_k$'s social neighbors. The interest-related parameters such as $CI$ and $I$ and the historical interaction parameters such as $SD$ and $SA$ are relatively static and vary only slightly with the state variation of $u_k$'s social neighbors. Because $SI$ is an average value and is adjusted by its weight parameter, the influence of variations in $SI$ on $P_k^j$ can be reduced to a negligible level. On the other hand, an increase in the number of members in $SN_k$ reduces the volatility level. The prediction of $P_k^j$ relies on the state of $u_k$'s social neighbors. Initially, if the members of $SN_k$ include no seed users, the probability $PF_k^j$ that the members of $SN_k$ push $v_j$ to $u_k$ is 0, and the value of $P^r_{kj}$ determines $P_k^j$. If the members of $SN_k$ include one or more seed users, the probability $PF_k^j$ that the members of $SN_k$ push $v_j$ to $u_k$ is not 0.
As the number of members in $SN_k$ that obtain $v_j$ increases, the probability $PF_k^j$ can be calculated according to Eq. (14). Similarly, if no member of $SN_k$ is a seed user at the start of the propagation of $v_j$, $SI_{kj}$ is 0 and $P_{kj}^r$ depends only on the weighted interest term $I_{kj}$. As the number of members in $SN_k$ that obtain $v_j$ increases, the probability $P_{kj}^r$ can be calculated according to Eq. (6). The value of $P_k^j$ can then be calculated from $P_{kj}^r$ and $PF_k^j$.

To predict the value of $P_k^j(tp_v)$, $u_k$ needs to predict the state, during the next time plot $tp_v$, of those social neighbors that have not yet obtained $v_j$. Let $SSN_k^j(tp_{v-1}) \subseteq SN_k$ be the set of users that have not obtained $v_j$ during the time plot $tp_{v-1}$, and let $ISN_k^j(tp_v) \subseteq SN_k$ be the set of users in $SSN_k^j(tp_{v-1})$ that may obtain $v_j$ during the time plot $tp_v$. Because the predicted value of $P_k^j(tp_v)$ relies on the predicted state of the members of $SN_k$ during the next time plot $tp_v$, we use the estimated probability and the historical information of video fetching to predict the state of the members of $SSN_k^j(tp_{v-1})$. $P_z^j(tp_v)$ is the estimated probability that a user $u_z$ in $SSN_k^j(tp_{v-1})$ fetches $v_j$ during the time plot $tp_v$. Let $T_p$ be the threshold of the probability that users fetch videos. The value of $T_p$ can be defined as the average popularity of all videos in the cluster. For instance, $T_p^x$ is the average popularity of all videos in $c_x$, $v_j \in c_x$. The estimate for $u_z$ is compared against this threshold. $VS_z^q$ is the set of videos that are similar to $v_j$ and have been watched by $u_z$, with $VS_z^q \subseteq c_x$ and $v_j \in c_x$. $VS_z$ is the set of all videos that have been watched by $u_z$.
denotes the interest level of $u_z$ for the videos that are similar to $v_j$. According to the above method, $u_k$ can obtain a predicted user set $ISN_k^j(tp_v)$ for the next time plot $tp_v$. The value of $P_k^j(tp_v)$ can be calculated according to $ISN_k^j(tp_v)$. The state of $u_k$ during the next time plot $tp_v$ can be predicted by comparing $P_k^j(tp_v)$ with $T_p^x$: if $P_k^j(tp_v) \geq T_p^x$, $u_k$ may change from the susceptible state to the infected state and is added to the set $PIS_j(tp_v)$; if $P_k^j(tp_v) < T_p^x$, $u_k$ may remain in the susceptible state. Similarly, the predicted probabilities of fetching $v_j$ are calculated for the other members of $SS_j(tp_{v-1})$, and their state transitions are evaluated with the same method.
where the members of $PIS_j(tp_v)$ are the predicted infectors during the next time plot $tp_v$ and $|PIS_j(tp_v)|$ returns the number of members in $PIS_j(tp_v)$. The rate of users who join and leave social networks during a time plot is calculated according to the historical statistics of the propagation of videos in $c_x$, $v_j \in c_x$. Its value is defined in terms of $N_c^J$ and $N_c^L$, the numbers of users joining and leaving social networks during the propagation of a video $v_c$ in $c_x$, $v_c \in c_x$, respectively, and $TS_c$, the propagation time of $v_c$.
denotes the average variation level of the number of social network users in the process of propagating $v_c$. Because the content of $v_j$ is similar to that of the other items in $c_x$, the value predicted from these historical statistics is close to the true value, so it has only a slight influence on the predicted basic regeneration number.
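The threshold-based state prediction described above can be illustrated with a minimal Python sketch. It assumes that the predicted fetching probabilities $P_k^j(tp_v)$ have already been computed; the `Candidate` type, the function name, and the numeric values are hypothetical and are not taken from SDPEP. Treating a probability equal to the threshold as "infected" follows the $\geq$ condition stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A susceptible user with an already-predicted fetching probability."""
    user_id: str
    fetch_probability: float  # stands for P_k^j(tp_v); assumed given


def predict_infected(candidates, threshold):
    """Return the ids of users predicted to become infected (the set PIS_j(tp_v)).

    A user is predicted to move from susceptible to infected when the
    predicted fetching probability reaches the cluster threshold T_p^x.
    """
    return {c.user_id for c in candidates if c.fetch_probability >= threshold}


# Hypothetical values for illustration only.
susceptible = [Candidate("u1", 0.62), Candidate("u2", 0.18), Candidate("u3", 0.40)]
T_xp = 0.40  # hypothetical average popularity of the cluster c_x
predicted = predict_infected(susceptible, T_xp)
```

The size of the returned set plays the role of $|PIS_j(tp_v)|$ in the prediction of the infected rate.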

Caching-based adjustment of propagating videos based on predicted basic regeneration number
Using the above methods for calculating the three parameters in each propagation round, we can obtain the predicted values of the basic regeneration number during the propagation time plots. The predicted value of the basic regeneration number $R_0(tp_v)$ at the time plot $tp_v$ can then be calculated from these values. The predicted value of the basic regeneration number denotes the number of susceptible users infected by one infector during the time plot and is also regarded as the ratio between the number of future infectors and the number of current infectors. Let $N_s$ be the average number of susceptible users requesting videos that one infector can supply with video data. The following process shows the caching-based adjustment of video distribution based on the predicted basic regeneration number.

1. If $R_0(tp_v) < 1$, the video system does not need additional users to store copies of $v_j$.
2. If $R_0(tp_v) \geq 1$ and $N_s \geq R_0(tp_v)$, the current video suppliers in the social network can provide enough upload bandwidth for the video requesters. In this case, if $N_s < 2R_0(tp_v)$, where $2R_0(tp_v)$ is defined as the upper bound of video requesters at $tp_v$, the number of nodes that store $v_j$ does not need to be decreased; if $N_s \geq 2R_0(tp_v)$, the number of nodes that can remove $v_j$ from the local buffer is given by Eq. (20).
3. If $R_0(tp_v) \geq 1$ and $N_s < R_0(tp_v)$, the video system needs $(R_0(tp_v) - N_s)/N_s$ additional users that store copies of $v_j$ to meet the demand for upload bandwidth.

Increasing the number of video copies in the network is an important way of raising supply. For instance, let $t_j(tp_v)$ and $B_j(tp_v)$ be the average time and bandwidth a node needs to cache $v_j$. If the product of $t_j(tp_v)$ and a factor that depends on $N_j(tp_v)$ exceeds $T_r$, where $N_j(tp_v)$ is the number of nodes that have stored $v_j$ and $T_r$ is the time length of $tp_v$, the video system needs $(R_0(tp_v) - N_s)/N_s$ additional users to store copies of $v_j$ already at the time plot $tp_{v-1}$, which effectively balances the supply and demand of bandwidth in the next time plot $tp_v$.
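The three cases above can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name and the action labels are hypothetical, the values of $R_0(tp_v)$ and $N_s$ are assumed to be given, and the sketch only flags that copies may be removed without computing how many.

```python
def caching_adjustment(r0, n_s):
    """Classify the caching action for one video at the next time plot.

    r0  : predicted basic regeneration number R_0(tp_v)
    n_s : average number of requesters one infector can supply (N_s)

    Returns (action, extra_copies). extra_copies is (r0 - n_s) / n_s when
    more suppliers are needed and 0.0 in every other case.
    """
    if n_s <= 0:
        raise ValueError("n_s must be positive")
    if r0 < 1:
        # Case 1: demand is shrinking, no additional copies are needed.
        return ("no_additional_copies", 0.0)
    if n_s >= r0:
        # Case 2: current suppliers already cover the predicted demand.
        if n_s < 2 * r0:
            return ("keep_copies", 0.0)
        # Supply exceeds the upper bound 2 * R_0, so copies may be removed.
        return ("may_remove_copies", 0.0)
    # Case 3: supply is short, more nodes must cache the video.
    return ("add_copies", (r0 - n_s) / n_s)
```

For example, with the hypothetical values `r0 = 3.0` and `n_s = 2.0`, the function returns `("add_copies", 0.5)`, which corresponds to $(R_0(tp_v) - N_s)/N_s$.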

Testing topology and scenarios
We compare the performance of the proposed solution SDPEP with that of two state-of-the-art solutions, OCP [23] and SECS [31], which are deployed in a mobile network environment using the Network Simulator 3 (NS-3) [35]. The simulation time is set to 500 s. 500 mobile nodes are uniformly deployed in a square scenario with a 2000 × 2000 area and move randomly during the whole simulation time. Initially, every mobile node is assigned a beginning position and an ending position and moves along the path between them at a randomly selected constant speed. When a mobile node arrives at its ending position, it stays for 0 s and is randomly assigned a new destination position and a new movement speed. The velocity of mobile nodes is in the range [1, 30] m/s. There are 10 video clusters and each video cluster contains 200 videos. To simulate the video preference of users, every node has a primary video cluster of interest (every 50 nodes correspond to one video cluster). The watched-video record of every node includes 60 videos, of which 30 belong to the primary cluster of interest and the others are randomly assigned. The popularity of all videos follows the Zipf distribution [36]. The number of propagated videos is 40, and every 4 propagated videos correspond to one video cluster. The length and size of every propagated video are 100 s and 25 MB, respectively. The playback bitrate of all videos is 2000 kbps. The number of seed nodes that store initial video data for every propagated video is set to 10. The number of videos cached by every mobile node is set to 10 or 20. $T_r$ in SDPEP is set to 20 s; $tp_v$ is also set to 20 s. All nodes have played logs, each of which includes 5 videos from different video clusters according to the node's interest. The playback time of nodes is randomly allocated. The nodes request videos according to the played logs.

The number of nodes that play videos during the time period $tp_v$ can be collected, so the corresponding rate for the played videos can be calculated. Similarly, the number of nodes that have watched videos during the time period $tp_v$ can be collected, so the corresponding rate can be calculated under the setting that nodes neither request nor accept videos they have already watched. If a video in the played log has already been watched by the node, the node skips it and requests the next video. The nodes also keep pushed video lists, which record the videos pushed by other nodes, and push videos to other nodes according to the push probability. When a node finishes viewing the current video according to the allocated playback time, it first checks its pushed video list. If the pushed video list is empty or the push acceptance probabilities of the node for the pushed videos are less than $T_p$, the node continues to request a new video from its played log. If the pushed video list is not empty and the push acceptance probability of the node for a pushed video is larger than $T_p$, the node accepts the pushed video, and its playback time for the pushed video is randomly allocated. If a node finishes the playback of all videos in its played log and does not receive a pushed video during $T_r$, it no longer takes part in the video sharing (it quits the video system).
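The behavior of a node after it finishes a video, as described above, can be sketched as follows. This is a simplified illustration rather than the simulation code: the function name, the data layout, and the action labels are hypothetical, and accepting a push whose probability equals $T_p$ is an assumption because the text only covers the strictly larger and strictly smaller cases.

```python
def next_action(pushed, played_log, watched, t_p):
    """Decide what a node does after finishing the current video.

    pushed     : list of (video_id, acceptance_probability) pushed by other nodes
    played_log : ordered list of video ids the node plans to request
    watched    : set of video ids the node has already watched
    t_p        : push acceptance threshold T_p
    """
    # Check the pushed video list first; watched videos are never accepted.
    for video, probability in pushed:
        if video not in watched and probability >= t_p:
            return ("accept_push", video)
    # Otherwise request the next unwatched video from the played log.
    for video in played_log:
        if video not in watched:
            return ("request", video)
    # Nothing left to play: the node waits for a push during T_r, then quits.
    return ("wait_or_quit", None)
```

A node with an empty pushed list and a fully watched played log reaches the last branch, which corresponds to the quit condition used to measure the death rate.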
The number of nodes that quit the video system during $tp_v$ can be collected, so the death rate during $tp_v$ can be calculated. The number of nodes that join the video system is 0, so the birth rate is 0. Therefore, the corresponding parameter is defined as the ratio between the number of nodes that quit the video system and $tp_v$ (the death rate). 25 base stations are uniformly distributed in the simulation scenario and are used as access points (APs) to transmit and forward video data. The physical layer, MAC layer, and modulation schemes of network units are set according to the 5G industrial standardization. 802.11p is used as the MAC protocol and the upper bound of the data rate is set to 20 Mbps. The maximum communication range is 250 m and the MAC channel delay is 250 ms. The propagation loss model is the Friis Propagation Loss Model (FPLM) in NS-3 [35], which is designed for an unobstructed clear path between receivers and transmitters and therefore removes the random effects of shadowing from the simulation results. The D2D settings of the 5G network follow the settings in popular studies [37].
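The node movement used in this setup (constant speed toward a destination, zero pause time, then a new random destination and speed) can be sketched as a single update step. This is an illustrative approximation, not the NS-3 mobility model itself: the function name and parameters are hypothetical, the side length is assumed to be in metres, and any distance left over after arrival within a step is discarded for simplicity.

```python
import math
import random


def mobility_step(pos, dest, speed, dt, rng, side=2000.0, v_min=1.0, v_max=30.0):
    """Advance one node by dt seconds and return (position, destination, speed).

    On arrival the node pauses for 0 s and immediately draws a new uniform
    destination inside the square and a new speed in [v_min, v_max] m/s.
    """
    dx, dy = dest[0] - pos[0], dest[1] - pos[1]
    distance = math.hypot(dx, dy)
    travel = speed * dt
    if travel >= distance:
        # Arrived at the destination: pick the next leg of the path.
        new_dest = (rng.uniform(0.0, side), rng.uniform(0.0, side))
        new_speed = rng.uniform(v_min, v_max)
        return dest, new_dest, new_speed
    fraction = travel / distance
    return (pos[0] + dx * fraction, pos[1] + dy * fraction), dest, speed


rng = random.Random(0)  # fixed seed so the example is repeatable
moving = mobility_step((0.0, 0.0), (100.0, 0.0), 10.0, 1.0, rng)
arrived = mobility_step((95.0, 0.0), (100.0, 0.0), 10.0, 1.0, rng)
```

In the first call the node is still travelling, so only its position changes; in the second call it reaches the destination and receives a new destination and speed.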

Performance evaluation
We compare the performance of SDPEP with that of OCP and SECS in terms of caching hit ratio, response delay, caching cost, and control overhead.
Caching hit ratio (CHR): If a node $n_i$ stores a video $v_j$ and receives a request message for $v_j$ from another node $n_k$, the event in which $n_i$ handles the request message and transmits $v_j$ to $n_k$ is counted as a cache hit. The ratio between the number of successful cache hits and the total number of video requests is defined as the caching hit ratio. To show the CHR simulation results clearly, the average values of CHR over every 5 s period are shown in Figs. 1 and 2.
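The CHR definition can be made concrete with a short sketch that groups request events into 5 s windows and computes the hit ratio of each window. The event format, the function name, and the sample data are hypothetical; interpreting "average values of CHR over every 5 s period" as the per-window hit ratio is also an assumption.

```python
from collections import defaultdict


def windowed_chr(events, window=5.0):
    """Return {window_start_s: hit_ratio} from (timestamp_s, is_hit) events.

    Each event is one video request; is_hit is True when a caching node
    handled the request and transmitted the video.
    """
    hits = defaultdict(int)
    total = defaultdict(int)
    for timestamp, is_hit in events:
        start = int(timestamp // window) * window
        total[start] += 1
        if is_hit:
            hits[start] += 1
    return {start: hits[start] / total[start] for start in sorted(total)}


# Hypothetical request log for illustration only.
events = [(0.5, True), (1.2, False), (4.9, True),
          (5.0, False), (7.3, False), (9.9, True)]
ratios = windowed_chr(events)
```

Windows without any request are omitted from the result, which avoids a division by zero.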
As Fig. 1 shows, the three curves corresponding to SDPEP, OCP, and SECS fall after an initial rise over the simulation time. The blue curve of SDPEP rises fast from t = 0 to t = 155 s, falls from t = 160 to t = 245 s, rises briefly from t = 250 to t = 270 s, and falls slowly from t = 275 to t = 500 s. The orange curve of OCP increases evidently from t = 0 to t = 160 s, jitters violently from t = 165 to t = 345 s, and falls slowly from t = 350 to t = 500 s. The red curve of SECS also rises fast from t = 0 to t = 150 s, rises slowly from t = 160 to t = 325 s after a short fall from t = 150 to t = 160 s, and falls slowly from t = 330 to t = 500 s. Although the CHR values of SDPEP are lower than those of OCP and SECS during the initial simulation time, the blue curve of SDPEP has higher CHR levels than the curves of OCP and SECS during most of the simulation time.

As Fig. 2 shows, the blue curve of SDPEP rises fast from t = 0 to t = 135 s, jitters violently from t = 140 to t = 340 s, and falls slightly from t = 345 to t = 500 s. The orange curve of OCP rises fast from t = 0 to t = 170 s and falls slightly from t = 175 to t = 500 s. The red curve of SECS also increases from t = 0 to t = 185 s, fluctuates from t = 190 to t = 260 s, and falls slowly from t = 265 to t = 500 s. Although the blue curve of SDPEP has lower CHR levels than the curves of OCP and SECS during the initial simulation time, the CHR values of SDPEP are larger than those of OCP and SECS during most of the simulation time.
By investigating the influence of user interests and social influence in the process of social-based video propagation, SDPEP accurately estimates the probabilities of users fetching videos in both "pull" and "push" modes. By modeling the social-based video propagation process with the SIR model, SDPEP calculates the basic regeneration number from the users' fetching probabilities to predict video demand during propagation. It dynamically adjusts video caching according to the predicted demand, so that the caching scale matches user demand. A sufficient supply of video upload bandwidth supports fast responses to "push" requests, reducing unsuccessful responses caused by request handling delays. Additionally, supply nodes with cached videos push videos accurately based on the calculated pushing probabilities, which further reduces unsuccessful responses and increases successful responses regardless of response timeout. Therefore, SDPEP achieves higher CHR values than OCP and SECS.

SECS constructs interest domains using the Fuzzy C-Means (FCM) algorithm and groups users with common interests together to define the range of videos shared between them. SECS requires intragroup users to cache videos based on their sharing capacities and to dynamically replace locally cached videos according to variations in video demand within user groups. However, because the preferential sharing strategy within groups limits the range of video sharing, SECS increases the risk of response failure. The encounter-based video push between intragroup users also leads to delayed video sharing, so that user demand is not effectively released via the push mode, which further increases the risk of a cache miss. Moreover, the performance-aware caching strategy means that video delivery quality is the precondition of caching and delivery. A surge of video requests leads to delayed responses from video suppliers due to their limited handling capacities, which increases the number of cache misses. Therefore, the CHR values of SECS are lower than those of SDPEP.

OCP builds a fluid-based model to combine Information-Centric Networking (ICN) and 5G D2D communications. OCP formulates an optimal content replication problem that accounts for caching overhead and load. OCP requires the nodes to exchange state lists to predict the demand variation of the whole system, and it makes caching decisions that require all nodes to implement video caching with the same caching time via message broadcasting. However, the uniform caching time restricts how quickly the system responds to variations in video demand, and the single request-response sharing mode also increases the risk of a run on video upload bandwidth, which raises the risk of a cache miss. Moreover, OCP overlooks the influence of sociability between users on their intention to request videos, which also results in low prediction accuracy of video demand. Therefore, the CHR values of OCP are lower than those of SDPEP and SECS.
Response delay (RD): The difference between the time at which a requesting node $n_i$ receives the first video data sent by a video supplier $n_j$ and the time at which $n_i$ sends the video request message to $n_j$ is considered the response delay. Except for the retransmission delay caused by packet loss in network congestion, the wait delay caused by the lack …

As Fig. 3 shows, the blue curve of SDPEP rises fast from t = 0 to t = 320 s and falls slowly from t = 325 to t = 500 s. The orange curve of OCP increases fast from t = 0 to t = 255 s and jitters from t = 260 to t = 500 s. The red curve of SECS also increases from t = 0 to t = 305 s and fluctuates stably from t = 310 to t = 500 s. The blue curve of SDPEP has lower RD levels than the curves of OCP and SECS during most of the simulation time.
As Fig. 4 shows, the blue curve of SDPEP rises fast from t = 0 to t = 300 s and fluctuates stably from t = 305 to t = 500 s. The orange curve of OCP rises fast from t = 0 to t = 330 s and falls slowly from t = 335 to t = 500 s. The red curve of SECS also increases fast from t = 0 to t = 305 s and falls slowly from t = 310 to t = 500 s. The blue curve of SDPEP has lower RD levels than the curves of OCP and SECS.
The video fetching process involves the discovery of supply nodes, request handling, and data delivery. Unsuccessful supplier lookup, crowded request messages, and scarce upload bandwidth are the main reasons for a rise in RD. SDPEP accurately estimates the probabilities of users fetching videos in "pull" and "push" modes. On the one hand, SDPEP constructs a social-based video propagation SIR model to predict the demand for videos accurately and to provide enough upload bandwidth to meet video requests. On the other hand, high-probability video push avoids the delays of supplier lookup, request handling, and waiting for bandwidth allocation, which effectively reduces the RD values compared with OCP and SECS.

SECS places users with common and similar interests in the same groups, which raises the success probability of supplier lookup. SECS also requires the nodes to use encounter-based video push to propagate videos, which effectively reduces the delay in message sending, request handling, and data delivery. However, the performance-aware caching strategy overlooks the scale of user demand, so a surge of video requests leads to a long wait for request handling and bandwidth allocation at video suppliers in the "pull" mode. Moreover, the encounter-based video push between intragroup users also leads to delayed video sharing, so that a large amount of user demand relies on the "pull" mode, which further increases the load on video suppliers and results in a rise of RD. Therefore, the RD values of SECS are higher than those of SDPEP.

OCP requires node state exchange to predict network-wide demand variation but overlooks the social influence between users, which makes it difficult to ensure high prediction accuracy of video demand. If the difference between the predicted demand and the real demand is large, the imbalance between supply and demand leads to a lack or a redundancy of upload bandwidth. A lack of upload bandwidth results in a long delay in request handling and bandwidth allocation. Moreover, OCP formulates a uniform caching time according to the caching decision, so that the caching regulation does not keep pace with the variation of video demand in real time. When the video demand increases fast and the caching decision lags, the lack of upload bandwidth also results in a fast increase of RD. Therefore, the RD values of OCP are higher than those of SDPEP and SECS.
Caching Cost (CC): Abundant video copies can provide enough upload bandwidth to support a large number of video requests. However, an increase in the number of video copies results in fast consumption of bandwidth, storage, and computation resources in networks. The more nodes take part in storing video copies, the higher the cost of video dissemination is. Therefore, the ratio between the number of nodes that take part in storing video copies and the total number of nodes is defined as the caching cost. To show the CC simulation results clearly, the values of CC over every 10 s period are shown in Figs. 5 and 6.

As Fig. 5 shows, the blue curve of SDPEP rises fast from t = 0 to t = 200 s and jitters stably from t = 210 to t = 500 s. The orange curve of OCP rises fast from t = 0 to t = 250 s and fluctuates from t = 260 to t = 500 s. The red curve of SECS also rises fast from t = 0 to t = 190 s and fluctuates stably from t = 200 to t = 500 s. The blue curve of SDPEP has lower CC levels than the curves of OCP and SECS.
As Fig. 6 shows, the blue curve of SDPEP rises fast from t = 0 to t = 210 s and jitters slightly from t = 220 to t = 500 s. The orange curve of OCP increases fast from t = 0 to t = 300 s and falls slightly from t = 310 to t = 500 s. The red curve of SECS also increases from t = 0 to t = 240 s and falls slightly from t = 250 to t = 500 s. The blue curve of SDPEP has lower CC levels than the curves of OCP and SECS during the whole simulation time.
A large number of nodes storing video copies can provide enough upload bandwidth, which effectively reduces the delay of request handling and bandwidth allocation. However, video caching consumes the bandwidth and storage resources of nodes. A dynamic balance between supply and demand reduces unnecessary redundancy, saves the bandwidth and storage of nodes, and adapts efficiently to the variation of demand. The prediction accuracy of video demand and the real-time performance of video caching are the key factors for this dynamic balance. By estimating the probabilities of users fetching videos in "pull" and "push" modes and constructing the social-based video propagation SIR model, SDPEP uses the basic regeneration number to accurately predict how video demand varies over time. The nodes actively cache videos according to the predicted video demand. Moreover, successful video push also increases the number of video copies in networks, which releases potential video demand early and reduces the load of video requests in "pull" mode. Therefore, the number of nodes that participate in storing video copies in SDPEP is slightly smaller than in OCP and SECS, so the CC values of SDPEP are lower than those of OCP and SECS.

In SECS, the performance-aware caching strategy is unrelated to the video demand of users. In "pull" mode, the nodes that store video data passively respond to and handle video requests of users. Because the requesting nodes preferentially search for intragroup video supply nodes, this efficient way of sharing increases the scale of video copies. When demand increases faster than supply, the long RD results in a decrease of CHR, and the growth rate of video copies is restrained. Therefore, the CC values of SECS remain relatively stable in the late simulation time. On the other hand, because SECS employs encounter-based video push, the number of successful video pushes is determined by the probabilities of encounters between nodes and the video acceptance probabilities of nodes. A high success rate of video push is difficult to ensure, so SECS has more nodes storing video copies than SDPEP.

By collecting and analyzing node state information, OCP employs a periodic cache policy based on the uniform caching time to meet the predicted video demand in the whole network. However, OCP does not investigate the influence of the social relationships between users on their intention to request videos, which may result in low prediction accuracy of video demand. If the predicted demand is greater than the real demand, there is a large amount of video caching redundancy; if the predicted demand is less than the real demand, the long wait for request handling and bandwidth allocation results in a long RD. Moreover, OCP does not employ video push to release potential video demand early, so the video demand is met by the "pull" mode, which also brings in more participant nodes. Therefore, the CC values of OCP are higher than those of SDPEP and SECS.
Control Overhead (CO): The allocation and regulation of video cache copies require collecting and sharing a large amount of information (e.g., the available bandwidth and storage space of mobile nodes). This message interaction also consumes the available bandwidth of mobile nodes. Therefore, the average bandwidth used for the message interaction that supports the allocation and regulation of video cache copies is defined as the control overhead.
As Fig. 7 shows, the three curves corresponding to SDPEP, OCP, and SECS jitter severely as the simulation time increases. The blue curve of SDPEP rises fast from t = 0 to t = 270 s, falls fast from t = 280 to t = 340 s, and jitters violently from t = 350 to t = 500 s. The orange curve of OCP falls fast from t = 120 to t = 210 s after a fast rise from t = 0 to t = 110 s, rises and then falls from t = 220 to t = 370 s, and rises from t = 380 to t = 500 s. The red curve of SECS also rises and then falls slightly from t = 0 to t = 180 s, rises and then falls sharply from t = 190 to t = 360 s, and rises fast from t = 370 to t = 500 s. The CO values of SDPEP are larger than those of OCP and SECS.
As Fig. 8 shows, the blue curve of SDPEP also jitters severely as the simulation time increases. It rises and then falls from t = 0 to t = 170 s, remains stable from t = 180 to t = 240 s, rises and then falls again from t = 250 to t = 410 s, and rises fast from t = 420 to t = 500 s. The orange curve of OCP rises and then falls from t = 0 to t = 240 s, rises and then falls again from t = 250 to t = 420 s, and rises from t = 430 to t = 500 s. The red curve of SECS jitters severely from t = 0 to t = 310 s and falls slightly from t = 320 to t = 500 s. The CO values of SDPEP are larger than those of OCP and SECS.
SDPEP requires the collection and analysis of user interests and social influence to estimate the probabilities of users fetching videos in both "pull" and "push" modes. Additionally, SDPEP collects video propagation information, including video requests and pushes, through information exchange between nodes to calculate the basic regeneration number and predict the demand for videos during the propagation process. To ensure accurate real-time prediction of video demand, SDPEP consumes a significant amount of node bandwidth for message exchange in order to maintain a dynamic balance between supply and demand with low caching redundancy as video demand varies. Consequently, the CO values of SDPEP are higher than those of OCP and SECS.

SECS groups users with common or similar interests and requires requesters to search preferentially for supply nodes among intragroup nodes, exchanging state information to estimate the probabilities of push acceptance and of movement encounters among nodes. Moreover, SECS exchanges state information to maintain the node group structure when nodes log out or join. Although SECS also requires a large amount of node bandwidth for message interaction between nodes, the relatively low frequency of this interaction results in lower control overhead than in OCP or SDPEP; therefore, the CO values of SECS are slightly lower than those of OCP and SDPEP.

In OCP, the nodes must periodically exchange state lists to predict variation in demand, send caching instruction messages after making caching decisions, and send replacement instruction messages when replacing cached data. Because caching and replacement follow the same period as these message interactions, this period is the key factor that determines the control overhead of OCP. Therefore, the CO values of OCP are slightly lower than those of SDPEP and slightly higher than those of SECS.

Conclusions
In this paper, we propose a social-aware video sharing solution using demand prediction of epidemic-based propagation in wireless networks (SDPEP). SDPEP constructs a video propagation model based on the "pull" and "push" behaviors of users and estimates the probability of video fetching in social networks from interest preference and social influence. Using the probability of video fetching, SDPEP calculates the basic regeneration number in epidemic-based propagation and predicts the demand scale of videos. Furthermore, SDPEP designs a caching-based adjustment strategy for video copies to balance demand and supply with low redundancy. Simulation results demonstrate that, compared with SECS and OCP, SDPEP achieves a higher caching hit ratio, lower response delay, and lower caching cost, at the expense of higher control overhead.

$|SN_i^j|$ returns the number of items in $SN_i^j$. $SN_i$ is the set of $u_i$'s social neighbors and $|SN_i|$ returns the number of items in $SN_i$. $|SN_i^j| / |SN_i|$ is the infected ratio of $u_i$'s social neighbors. $EI_{ij}$ is defined over $SN_{iu}^j \subseteq SN_i^j$, the set of social neighbors in $SN_i^j$ that have not watched $v_j$; $EI_{ij}$ denotes the average expected interest of the social neighbors in $SN_{iu}^j$. The larger the value of $EI_{ij}$ is, the higher the expected interest levels of the social neighbors in $SN_{iu}^j$. Similarly, $RI_{ij}$ is defined over $SN_{iw}^j \subseteq SN_i^j$, the set of social neighbors in $SN_i^j$ that have watched $v_j$. $l_c$ is the playback time of $u_c$ for $v_j$; $L_j$ is the time length of $v_j$; $l_c / L_j$ is the playback ratio of $u_c$ and reflects the real interest level of $u_c$ for $v_j$. $RI_{ij}$ denotes the average real interest of the social neighbors in $SN_{iw}^j$. The larger the value of $RI_{ij}$ is, the higher the real interest levels of these social neighbors.

Fig. 1 Caching hit ratio against simulation time when the number of cached videos is 10

Fig. 3 Response delay against simulation time when the number of cached videos is 10

Fig. 4 Response delay against simulation time when the number of cached videos is 20

Fig. 6 Caching cost against simulation time when the number of cached videos is 20

Fig. 7 Control overhead against simulation time when the number of cached videos is 10

Table 1 Conditions of $u_i$ fetching $v_j$