1 Introduction

Many cloud rendering systems now exist in the fields of graphics, virtual reality, and computer vision, but these systems lack a unified method for analyzing system performance [1,2,3,4,5]. 3D rendering places demanding requirements on hardware configuration and command response. A cloud rendering system faces a large number of simultaneous rendering requests from users, which puts great pressure on the backend server system. The user sends commands from a terminal system. The servers receive the instructions and complete the rendering task immediately, and the rendered image is then transmitted back to the user terminal. If the average response time is too long, the user experience degrades significantly.

Measuring the performance of a cloud rendering system requires examining two main indices: (1) The number of concurrent users. The design target of a cloud rendering system is to withstand concurrent access by a certain scale of users, so the number of concurrent users is an important index. (2) The average response delay. A cloud rendering system needs a certain amount of time to respond to any operation issued by the user. In general, the user perceives 3D rendering as smooth when it runs at more than 25 FPS, so the average response delay is also an important index.

This paper first analyzes the time consumption of task scheduling and rendering distribution, and expresses the quantities that determine efficiency and performance as mathematical symbols. It then analyzes the 3D rendering process, in particular the multi-task rendering process, and expresses rendering performance with mathematical formulas. Through mathematical analysis of the key parameters of the system, the paper calculates the number of concurrent users and the average response delay. We present a detailed study of how these key parameters influence the performance of the system, and put forward a performance parameter adjustment strategy to enhance system performance.

2 Related work

Rendering technology has become quite mature and has been successfully applied in the movie industry, for example in CG rendering and 3ds Max scene rendering [6,7,8]. This type of rendering generally does not need to be real-time: once rendering begins, the user only needs to wait for the results to be returned [9,10,11,12]. For real-time demands, however, the efficiency of the interaction process must be taken into account. At present, there is no unified evaluation standard or efficiency analysis method for cloud rendering, either at home or abroad [13,14,15,16]. Although cloud rendering technology is still rarely used in mobile terminal applications, its characteristics make migration to mobile terminals very convenient. The cloud rendering mode is similar to the cloud computing model [17,18,19,20]; its main idea is to transfer the user’s local 3D rendering work completely to a cloud rendering server with powerful rendering capabilities. The client sends commands to the cloud rendering servers [21,22,23,24,25,26]. The server renders tasks according to the user’s instructions, and the results are sent back to the user for display [27,28,29,30,31]. The benefit of cloud rendering is that users do not need to worry about the hardware configuration and software compatibility of local equipment [32,33,34,35], because all rendering tasks are completed on a cloud rendering server. Although the development of cloud rendering [36,37,38,39] helps users solve many problems, unified evaluation criteria for its performance and efficiency are still lacking. In the parallel task layer of the IoT cloud rendering computing system, each computing node device has an independent parallel task scheduling module. Relying on this module, the node device no longer needs to deal with the communication details of the rendering application server [40,41,42,43,44]. All scheduling management and operations are encapsulated in the parallel task scheduling module. The purpose is to introduce middleware that reduces the coupling between the node device and the rendering system on the server side. The node device can distribute users’ instruction requests, improve the operating efficiency of the IoT cloud rendering computing system, and enhance the portability of rendering instructions through the interaction between the system and the parallel task scheduling modules [45,46,47].

3 Methods

The time costs are defined as follows (we treat \( {t}_0 \) and \( {t}_1 \) as constants in this section): (1) \( {t}_0 \) is the time from the user starting an operation until load balancing passes the request to the resource management server. (2) \( {t}_1 \) is \( {t}_0 \) plus the time to query, fetch the pictures, and return the results. (3) \( {t}_{\mathrm{dispatch}} \) is the time to send instructions. (4) \( {t}_{\mathrm{render}} \) is the time to execute the render instruction on the render machine. (5) \( {t}_{\mathrm{upload}} \) is the time to upload the render results to the file server and the database server.

3.1 Instructions distribution time consumption

The time the cloud rendering system needs to distribute commands can be expressed as \( {t}_{\mathrm{dispatch}}={t}_{\mathrm{dis}\_\mathrm{wait}}+{t}_{\mathrm{ask}}+{t}_{\mathrm{send}} \). Here \( {t}_{\mathrm{dis}\_\mathrm{wait}} \) is the time a command waits in the distribution queue, which can be ignored when the queue is empty; \( {t}_{\mathrm{ask}} \) is the time the scheduling system needs to query the performance status of all rendering machines; and \( {t}_{\mathrm{send}} \) is the time the scheduling system needs to send out the commands.

3.2 Rendering command processing time consumption

The rendering instruction processing time can be expressed as \( {t}_{\mathrm{render}}={t}_{\mathrm{rend}\_\mathrm{wait}}+{t}_{\mathrm{scene}\_\mathrm{create}}+{t}_{\mathrm{take}\_\mathrm{photo}} \). Here \( {t}_{\mathrm{rend}\_\mathrm{wait}} \) is the waiting time in the render queue, which can be ignored when nothing is queued; \( {t}_{\mathrm{scene}\_\mathrm{create}} \) is the time required to execute the instructions that create a scene; and \( {t}_{\mathrm{take}\_\mathrm{photo}} \) is the time required to shoot all the pictures.

4 The theory analysis of cloud rendering system business process

In this section, we describe the performance of the cloud rendering system with mathematical expressions and identify the main contributors to its time consumption. The business processes of the cloud rendering system mainly include (1) the single-task non-rendering process, (2) the single-task rendering process, and (3) the multi-task rendering process.

4.1 Single task rendering process

The single-task rendering process is one of the typical business processes, and the time consumption of each part is as follows:

(1) In the \( {t}_{\mathrm{dispatch}} \) part, \( {t}_{\mathrm{dis}\_\mathrm{wait}} \) is omitted if there is no wait in the queue; \( {t}_{\mathrm{ask}} \) is the time of the parallel TCP requests. Each request uses a long connection, so \( {t}_{\mathrm{ask}}=2\left(\frac{\Delta _d}{W}+{\Delta }_w\right) \). Sending likewise takes two transfers to determine whether the instruction communication succeeded, so \( {t}_{\mathrm{dispatch}}=4\left(\frac{\Delta _d}{W}+{\Delta }_w\right) \).

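(2) In the \( {t}_{\mathrm{render}} \) part, \( {t}_{\mathrm{rend}\_\mathrm{wait}} \) is omitted if there is no wait in the queue; scene creation takes \( {t}_{\mathrm{scene}\_\mathrm{create}}=P{\Delta }_p \); shooting the pictures takes \( {t}_{\mathrm{take}\_\mathrm{photo}}=C{\Delta }_c \); finally, \( {t}_{\mathrm{render}}=P{\Delta }_p+C{\Delta }_c \).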

(3) \( {t}_{\mathrm{upload}} \) is the time to upload all pictures and results. It consists of the picture upload time over the TCP long connection, \( {t}_{\mathrm{up}\_\mathrm{pic}}=\mathrm{C}\left(\frac{\mathrm{D}}{W}+{\Delta }_w\right)+{\Delta }_{\mathrm{tcp}} \), and the time to upload the results to the database, \( {t}_{\mathrm{up}\_\mathrm{database}}=\frac{\Delta _d}{W}+{\Delta }_w+{\Delta }_{\mathrm{tcp}} \).

So the final time requirement of the single-task rendering process can be written as \( {\mathrm{T}}_{\mathrm{SR}}\approx \mathrm{P}{\Delta }_p+\mathrm{C}\left({\Delta }_c+\frac{D}{W}\right)+{\mathrm{Q}}_1 \). From this formula, we can see that \( {\mathrm{T}}_{\mathrm{SR}} \) is mainly determined by the scene creation time \( \mathrm{P}{\Delta }_p \) and by the picture shooting and transmission time \( \mathrm{C}\left({\Delta }_c+\frac{D}{W}\right) \).
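As a minimal sketch of the approximation above, the following Python function evaluates \( {\mathrm{T}}_{\mathrm{SR}} \) for given parameter values. The function name, the argument names, and the sample values are illustrative assumptions, not part of the original system. Reading C as the number of pictures, D as the data size per picture, and W as the bandwidth is likewise an assumption inferred from the upload terms.

```python
def single_task_time(P, delta_p, C, delta_c, D, W, Q1=0.0):
    """Approximate T_SR = P*delta_p + C*(delta_c + D/W) + Q1.

    All argument names are illustrative; Q1 stands for the lumped
    remaining term that appears in the approximation.
    """
    if W <= 0:
        raise ValueError("W must be positive")
    scene_creation = P * delta_p                # P * Delta_p
    shoot_and_transfer = C * (delta_c + D / W)  # C * (Delta_c + D/W)
    return scene_creation + shoot_and_transfer + Q1


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the formula.
    print(single_task_time(P=2, delta_p=0.5, C=4, delta_c=0.1, D=10, W=100, Q1=0.05))
```

With these hypothetical values the scene creation term contributes 1.0 and the shooting and transmission term 0.8, which illustrates how the two dominant terms can be compared for a given configuration.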

4.2 Multi-task rendering process

The multi-task rendering process is more complex than the single-task rendering process, and most of its time is spent waiting in queues. This paper assumes that S tasks are to be executed simultaneously, and the total process time \( {\mathrm{T}}_{\mathrm{MR}} \) can be expressed as

$$ {\mathrm{T}}_{\mathrm{MR}}={\mathrm{t}}_0+{t}_{\mathrm{dispatch}}+{t}_{\mathrm{render}}+{t}_{\mathrm{upload}}+{\mathrm{t}}_1 $$
(1)

The time required to schedule all tasks is \( {t}_{\mathrm{dispatch}}={t}_{\mathrm{dis}\_\mathrm{wait}}+{t}_{\mathrm{ask}}+{t}_{\mathrm{send}}=\left\lceil \frac{\mathrm{S}}{N}\right\rceil \left({t}_{\mathrm{ask}}+{t}_{\mathrm{send}}\right)=4\left\lceil \frac{\mathrm{S}}{N}\right\rceil \left(\frac{\Delta _d}{W}+{\Delta }_w\right). \)

For the \( {t}_{\mathrm{render}} \) part, multiple render machines render in parallel, so tasks are distributed evenly across the machines according to the task scheduling strategy, and each rendering machine receives at most \( \left\lceil \frac{\mathrm{S}}{M}\right\rceil \) tasks. Because there are M renderers, the task distribution cycle \( \overline{t_d} \) of each renderer is \( \overline{t_d}=\frac{M}{N}\left({t}_{\mathrm{ask}}+{t}_{\mathrm{send}}\right)=\frac{4M}{N}\left(\frac{\Delta _d}{W}+{\Delta }_w\right) \). The average time that a task occupies a rendering site, \( \overline{t_o} \), is defined as \( \overline{t_o}=P{\Delta }_{\mathrm{p}}+\mathrm{C}{\Delta }_{\mathrm{c}} \). Based on these definitions, the paper gives an important conclusion, where K is the number of rendering sites on each renderer. For each renderer, if \( K\overline{t_d}\ge \overline{t_o} \) or S ≤ M · K, no task will ever wait in the render queue: any arriving task is assigned immediately to a site for rendering, without congestion. Conversely, if neither condition holds, that is, if \( K\overline{t_d}<\overline{t_o} \) and S > M ∙ K, the rendering system becomes congested.
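The congestion condition above can be sketched in a few lines of Python. This is only an illustration under assumed names: `distribution_cycle`, `occupancy_time`, and `is_blocking` are hypothetical helpers, and the numeric values in the usage note are made up.

```python
def distribution_cycle(M, N, delta_d, W, delta_w=0.0):
    """Task distribution cycle t_d = (4M/N) * (delta_d/W + delta_w)."""
    return 4 * M / N * (delta_d / W + delta_w)


def occupancy_time(P, delta_p, C, delta_c):
    """Average time a task occupies a site, t_o = P*delta_p + C*delta_c."""
    return P * delta_p + C * delta_c


def is_blocking(S, M, K, t_d, t_o):
    """Congestion occurs only if K*t_d < t_o and S > M*K both hold."""
    return K * t_d < t_o and S > M * K


if __name__ == "__main__":
    # Hypothetical values: 4 renderers, 2 web servers, 3 sites per renderer.
    t_d = distribution_cycle(M=4, N=2, delta_d=1.0, W=100.0)
    t_o = occupancy_time(P=2, delta_p=0.5, C=4, delta_c=0.1)
    print(is_blocking(S=20, M=4, K=3, t_d=t_d, t_o=t_o))
    print(is_blocking(S=10, M=4, K=3, t_d=t_d, t_o=t_o))
```

In this made-up configuration the sites are occupied much longer than one distribution cycle, so the status depends only on whether S exceeds M · K.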

4.2.1 Multi-task rendering process in blocking status

In the blocking status, both conditions \( K\overline{t_d}<\overline{t_o} \) and S > M · K hold. In this status, the render time \( {t}_{r2} \) is

$$ {t}_{r2}\left(S,M,K\right)=\left\{\begin{array}{c}\left(\left\lceil \frac{S}{M}\right\rceil \operatorname {mod}\ K-1\right)\overline{t_d}+\left(\left\lceil \frac{S}{M}\right\rceil \operatorname{div}\ K+1\right)\overline{t_o},\left\lceil \frac{S}{M}\right\rceil \operatorname {mod}\ K\ne 0\\ {}\left(K-1\right)\overline{t_d}+\left(\left\lceil \frac{S}{M}\right\rceil \operatorname{div}\ K\right)\overline{t_o},\left\lceil \frac{S}{M}\right\rceil \operatorname {mod}\ K=0\end{array}\right. $$
(2)

In this paper, \( {t}_s(i) \) denotes the time required before rendering site i reaches full-load working status, where i is the index of the rendering site. \( {t}_s(i) \) is given by the following formula:

$$ {t}_s(i)=\left(i-1\right)\overline{t_d}\kern0.5em \left(K\ge i\ge 1\right) $$
(3)

Because different sites receive their tasks at different times, the time at which each site reaches full-load working status is determined by the task distribution cycle \( \overline{t_d} \). The \( {t}_s(i) \) of the rendering sites are not equal and increase with i, so the number of tasks assigned to site i can be defined as the function \( {\mathrm{dis}}_c(i) \), given by the following formula:

$$ {\mathrm{dis}}_c(i)=\left\lceil \frac{S}{M}\right\rceil \operatorname{div}\ K+\varepsilon \left(\left\lceil \frac{S}{M}\right\rceil \operatorname {mod}\ K-i\right) $$
(4)

Here ε(x) is the unit step function, which is defined as \( \varepsilon \left(\mathrm{x}\right)=\left\{\begin{array}{c}0,x<0\\ {}1,x\ge 0.\end{array}\right. \)

Therefore, the time each site needs to complete its rendering tasks can be defined as follows:

$$ {t}_e(i)={t}_s(i)+\overline{t_o}{\mathrm{dis}}_c(i)=\left(i-1\right)\overline{t_d}+\left(\left\lceil \frac{S}{M}\right\rceil \operatorname{div}\ K+\varepsilon \left(\left\lceil \frac{S}{M}\right\rceil \operatorname {mod}\ K-i\right)\right)\overline{t_o} $$
(5)

The rendering process is over once all sites have completed their rendering tasks, so the time needed to complete the rendering tasks is given by the following formula:

$$ {t}_{r2}=\max \left({\cup}_{i=1}^K{t}_e(i)\right) $$
(6)

In particular, when \( \left\lceil \frac{S}{M}\right\rceil \) is a multiple of K, we have \( \left\lceil \frac{S}{M}\right\rceil \operatorname {mod}\ K=0 \). Because 1 ≤ i ≤ K and i ∈ N+ in the paper, the step term is then zero for every site, and \( {t}_e(i) \) becomes an increasing function of i whose maximum is reached at i = K:

$$ {t}_{r2}=\max \left({\cup}_{i=1}^K{t}_e(i)\right)={t}_e(K)=\left(K-1\right)\overline{t_d}+\left(\left\lceil \frac{S}{M}\right\rceil \operatorname{div}\ K\right)\cdot \overline{t_o} $$
(7)
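The relation between the per-site completion times and the closed-form render time can be checked numerically. The sketch below enumerates \( {t}_e(i) \) over all sites, takes the maximum as in formula (6), and compares it with the two-case closed form of formula (2). The function names and test values are illustrative assumptions, and the closed form is only expected to match in the blocking status, where \( K\overline{t_d}<\overline{t_o} \).

```python
def step(x):
    """Unit step function: 0 for x < 0, 1 for x >= 0."""
    return 1 if x >= 0 else 0


def site_completion_time(i, S, M, K, t_d, t_o):
    """t_e(i) = (i - 1)*t_d + dis_c(i)*t_o for site i in 1..K."""
    per_machine = -(-S // M)            # ceil(S / M) for positive integers
    q, r = divmod(per_machine, K)       # (div K, mod K)
    tasks_at_site = q + step(r - i)     # dis_c(i)
    return (i - 1) * t_d + tasks_at_site * t_o


def render_time_enumerated(S, M, K, t_d, t_o):
    """t_r2 as the maximum of t_e(i) over all K sites."""
    return max(site_completion_time(i, S, M, K, t_d, t_o) for i in range(1, K + 1))


def render_time_closed_form(S, M, K, t_d, t_o):
    """Two-case closed form of t_r2, valid when K*t_d < t_o."""
    q, r = divmod(-(-S // M), K)
    if r != 0:
        return (r - 1) * t_d + (q + 1) * t_o
    return (K - 1) * t_d + q * t_o


if __name__ == "__main__":
    # Hypothetical blocking configuration: K*t_d = 0.24 < t_o = 1.4.
    for S in (20, 24):
        print(S, render_time_enumerated(S, 4, 3, 0.08, 1.4),
              render_time_closed_form(S, 4, 3, 0.08, 1.4))
```

The agreement follows because, in the blocking status, one extra task at site r costs more than the distribution delay saved relative to site K.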

4.2.2 Multi-task rendering results upload time consumption

For the \( {t}_{\mathrm{upload}} \) part, since all upload tasks are carried out in parallel, the time is the same as in the single-task system, as shown in the following formula:

$$ {t}_{\mathrm{upload}}=\mathrm{C}\left(\frac{\mathrm{D}}{W}+{\Delta }_w\right)+\frac{\Delta _d}{W}+{\Delta }_w+2{\Delta }_{tcp} $$
(8)

In this paper, \( {\mathrm{T}}_{\mathrm{MR}(A)} \) is defined as the time consumption in the non-blocking status, and \( {\mathrm{T}}_{\mathrm{MR}(B)} \) as the time consumption in the blocking status. In non-blocking task scheduling, there is a single master processor and a set of worker (client) processors. Each task has all the data it needs to compute, but obtains the index to work on from the master. After the computation, the worker returns some data to the master. The bottom line is that if a task takes too long to compute, it becomes the limiting factor, and the master cannot move on to assign an index to the next worker even if non-blocking techniques are used. This raises the question of whether the master can skip assigning to such a worker and move on to the next one. \( {\mathrm{T}}_{\mathrm{MR}(A)} \) can be expressed as

$$ {\mathrm{T}}_{\mathrm{MR}(A)}=4\left\lceil \frac{\mathrm{S}}{N}\right\rceil \left(\frac{\Delta _d}{W}+{\Delta }_w\right)+\mathrm{P}{\Delta }_p+\mathrm{C}{\Delta }_c+\mathrm{C}\left(\frac{\mathrm{D}}{W}+{\Delta }_w\right)+{\Delta }_w+{\mathrm{Q}}_1 $$
(9)

When the network is in good condition, \( {\Delta }_w\approx 0 \), so the formula can be further simplified to \( {\mathrm{T}}_{MR(A)}=4\left\lceil \frac{\mathrm{S}}{N}\right\rceil \frac{\Delta _d}{W}+\left(\mathrm{P}{\Delta }_p+\mathrm{C}{\Delta }_c\right)+\frac{\mathrm{CD}}{W}+{\mathrm{Q}}_1 \). \( {\mathrm{T}}_{\mathrm{MR}(B)} \) can be written as \( {\mathrm{T}}_{\mathrm{MR}(B)}={t}_0+{t}_{r2}+{t}_{\mathrm{upload}}+{t}_1 \). Again taking \( {\Delta }_w\approx 0 \) for good communication, expansion and simplification give

$$ {\mathrm{T}}_{\mathrm{MR}(B)}\approx \left\{\begin{array}{c}\frac{4M\left(\mathrm{K}-2\right)}{N}\bullet \frac{\Delta _d}{W}+\left(\frac{S}{M\bullet K}+1\right)\left(P{\Delta }_p+C{\Delta }_c\right)+\frac{\mathrm{CD}}{W}+{\mathrm{Q}}_1,\left\lceil \frac{S}{M}\right\rceil \operatorname {mod}\ K\ne 0\\ {}\frac{4M\left(\mathrm{K}-1\right)}{N}\bullet \frac{\Delta _d}{W}+\frac{S}{M\bullet K}\left(P{\Delta }_p+C{\Delta }_c\right)+\frac{\mathrm{CD}}{W}+{\mathrm{Q}}_1,\left\lceil \frac{S}{M}\right\rceil \operatorname {mod}\ K=0\end{array}\right. $$
(10)

In conclusion, the multi-task rendering process time consumption TMR can be summarized as

$$ {\mathrm{T}}_{\mathrm{MR}}\left(S,N,M,K\right)=\left\{\begin{array}{c}{\mathrm{T}}_{\mathrm{MR}(A)},K\overline{t_d}\ge \overline{t_o}\cup \mathrm{S}\le \mathrm{M}\bullet \mathrm{K}\\ {}{\mathrm{T}}_{\mathrm{MR}(B)},K\overline{t_d}<\overline{t_o}\cap \mathrm{S}>\mathrm{M}\bullet \mathrm{K}\end{array}\right. $$
(11)

Among them,

$$ \overline{t_d}=\frac{4\mathrm{M}}{N}\left(\frac{\varDelta_d}{W}+{\varDelta}_w\right) $$
(12)
$$ \overline{t_o}=P{\varDelta}_{\mathrm{p}}+\mathrm{C}{\varDelta}_{\mathrm{c}} $$
(13)
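The piecewise definition in formula (11) can be written as a single function. The sketch below takes \( {\Delta }_w=0 \) and uses the simplified non-blocking expression together with formula (10) for the blocking status. The function name, the argument names, and the numeric values are illustrative assumptions; `t_o` stands for \( P{\Delta }_p+C{\Delta }_c \) and `Q1` for the lumped remaining term.

```python
def multi_task_time(S, N, M, K, t_o, C, D, W, delta_d, Q1=0.0):
    """Piecewise T_MR(S, N, M, K) with delta_w taken as 0."""
    unit = delta_d / W                 # Delta_d / W
    t_d = 4 * M / N * unit             # distribution cycle, formula (12)
    common = C * D / W + Q1            # terms shared by both statuses
    if K * t_d >= t_o or S <= M * K:   # non-blocking status: T_MR(A)
        batches = -(-S // N)           # ceil(S / N) for positive integers
        return 4 * batches * unit + t_o + common
    per_machine = -(-S // M)           # ceil(S / M)
    if per_machine % K != 0:           # blocking status, first case of (10)
        return 4 * M * (K - 2) / N * unit + (S / (M * K) + 1) * t_o + common
    # blocking status, second case of (10)
    return 4 * M * (K - 1) / N * unit + S / (M * K) * t_o + common


if __name__ == "__main__":
    # Hypothetical configuration; only S changes between the calls.
    for S in (10, 20, 24):
        print(S, multi_task_time(S, N=2, M=4, K=3, t_o=1.4, C=4, D=10, W=100.0, delta_d=1.0))
```

Sweeping S in such a function is one way to see where the system switches from the non-blocking to the blocking expression.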

By analyzing the main factors that affect the number of concurrent users and the average response delay, this paper draws the following conclusions. (1) The average response delay is affected by many factors: (a) the multi-task rendering time \( {\mathrm{T}}_{\mathrm{MR}} \) is linear in the number of concurrent tasks S; (b) \( {\mathrm{T}}_{\mathrm{MR}} \) is inversely proportional to the number of web servers N; (c) the rendering time P∆p + C∆c determines the coefficient of the parameter S in the blocking status; (d) the number of render machines M and the number of sites K contribute directly to \( {\mathrm{T}}_{\mathrm{MR}} \) in the blocking status. (2) The number of concurrent users is mainly restricted by the average response delay, because an increase in the number of concurrent users directly increases the average response delay.

5 System performance optimization results and discussion

The performance pressure on the system is concentrated in the multi-task rendering process. This paper mainly analyzes the optimal choice of K, M, and N for given S and T. The analysis is divided into the following parts: (1) We analyze the performance degradation trends of the different statuses in this process. (2) For a specified level of S, we observe not only the trends of N-T, M-T, and K-T but also those of N-S/T, M-S/T, and K-S/T, and ultimately choose a better N, M, K ratio by means of the directional derivative. (3) The discussion also covers scene rendering optimization, scheduling algorithm optimization, and expansion support.

5.1 The first test experiments

We conduct experiments on our own systems and models. In the test process, four sets of test input parameters were used to test the performance of the six cloud rendering systems on the two machines. The above figure shows some of the rendering programs running on rendering machine A. Each rendering program has a corresponding command window that displays the log information of the running program.

5.2 The second test experiments

To test the efficiency of the algorithm, we tested it on publicly available large-scale simulation scenes (http://pointclouds.org/ [23]) provided by Middlebury, Canada. The data source is shown in Fig. 1.

Fig. 1

The 3D scene test data source for the performance bottleneck analysis and resource optimized distribution method. (1) Virtual park scene of the data source. (2) Power plant scene of the data source. (3) Virtual city scene of the data source. This group of figures describes the test data source for the performance bottleneck analysis and resource optimized distribution method for the IoT cloud rendering computing system in cyber-enabled applications. The data source is provided by Middlebury, Canada (http://pointclouds.org/ [23]). We select the models with the largest numbers of patches and vertexes and the largest amount of data as the experimental test models. These representative models are chosen so that they cover most IoT cloud rendering computing system scenarios and carry the largest amount of data. Moreover, these models have typical loading times and operational characteristics. If our algorithm works well on these characteristic models, it will work well on other scene models as well. The first sub-figure is a virtual park scene and related models, the second sub-figure is a power plant scene and related models, and the third sub-figure is a larger virtual city scene and related models. These scenes and models have the following advantages: (1) They show different rendering effects under different degrees of network congestion. They are strongly influenced by network speed, network congestion, algorithm, and parameters, so they can distinguish different network congestion situations and the efficiency of different algorithms and methods. (2) They contain many vertexes, edges, and triangular patches, which makes the effects of complex graphics methods easy to distinguish. (3) They include the brightness, hue, and saturation of scene color, so they show different rendering effects under different rendering methods and procedures and have high discrimination and expressiveness for different algorithms.

The properties of these data sources are shown in Table 1.

Table 1 Experimental data model attributes

5.3 Status transition function

Let the function G(N, M, K) express the relationship between \( K\overline{t_d} \) and \( \overline{t_o} \). With \( {\Delta }_w\approx 0 \), it is defined as \( G\left(N,M,K\right)=K\overline{t_d}-\overline{t_o}=\frac{\mathrm{M}\cdot \mathrm{K}}{N}\cdot \frac{4{\varDelta}_d}{W}-\left(P{\varDelta}_p+C{\varDelta}_c\right). \) It is easy to see that when G(N, M, K) ≥ 0 the system is in the non-blocking status, and when G(N, M, K) < 0 the system is in the blocking status.
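A small Python sketch of the status transition function is given here for illustration. The names `transition_g` and `status` are hypothetical, `t_o` stands for \( P{\varDelta}_p+C{\varDelta}_c \), and the sample values are chosen so that the arithmetic is exact in floating point.

```python
def transition_g(N, M, K, t_o, delta_d, W):
    """G(N, M, K) = (M*K/N) * (4*delta_d/W) - t_o, with delta_w taken as 0."""
    return M * K / N * 4 * delta_d / W - t_o


def status(N, M, K, t_o, delta_d, W):
    """Classify the system status from the sign of G(N, M, K)."""
    return "non-blocking" if transition_g(N, M, K, t_o, delta_d, W) >= 0 else "blocking"


if __name__ == "__main__":
    # Hypothetical values: with M*K = 12, the critical point lies at N = 16.
    for N in (8, 16, 32):
        print(N, transition_g(N, 4, 3, 1.5, 5.0, 10.0), status(N, 4, 3, 1.5, 5.0, 10.0))
```

In this example, adding web servers beyond the critical point makes G negative, because tasks are then dispatched faster than the rendering sites can absorb them.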

The performance bottleneck analysis is shown in the following figure. The diagram shows that the main time-consuming parts of the system are two-dimensional image rendering, file upload, and file transfer.

5.4 Average response time analysis

The average response time is an important index in system performance analysis and directly determines the user experience. In the multi-task rendering process, the more common situation is \( \left\lceil \frac{S}{M}\right\rceil \operatorname {mod}\ K\ne 0 \), so this paper uses \( {T}_{\mathrm{MR}(B)}=\frac{4M\left(\mathrm{K}-2\right)}{N}\bullet \frac{\Delta _d}{W}+\left(\frac{S}{M\bullet K}+1\right)\left(P{\Delta }_p+C{\Delta }_c\right)+\frac{\mathrm{CD}}{W}+{\mathrm{Q}}_1 \). Subtracting \( {T}_{MR(A)} \) from \( {T}_{MR(B)} \) gives the time difference \( {T}_{\Delta }={T}_{MR(B)}-{T}_{MR(A)}=\frac{4{\Delta }_d\left(M\left(\mathrm{K}-2\right)-\mathrm{S}\right)}{N\bullet W}+\frac{S}{M\bullet K}\left(P{\Delta }_p+C{\Delta }_c\right) \). The average response time difference is obtained by dividing \( {T}_{\Delta } \) by S, namely, \( \overline{T_{\Delta }}=\frac{T_{\Delta }}{S}=\frac{4M{\Delta }_d\left(\mathrm{K}-2\right)}{N\bullet W\bullet S}+\frac{1}{M\bullet K}\left(P{\Delta }_p+C{\Delta }_c\right)-\frac{4{\Delta }_d}{N\bullet W} \). According to the status transition function, the system is at the critical point when G(N, M, K) = 0. In this status, \( G\left(N,M,K\right)=0\to \frac{\mathrm{M}\bullet \mathrm{K}}{N}\bullet \frac{4{\Delta }_d}{W}-\left(P{\Delta }_p+C{\Delta }_c\right)=0\to \frac{1}{M\bullet K}\left(P{\Delta }_p+C{\Delta }_c\right)-\frac{4{\Delta }_d}{N\bullet W}=0 \), so only the first term of \( \overline{T_{\Delta }} \) remains. Because K ≥ 2, \( \frac{4M{\Delta }_d\left(K-2\right)}{N\bullet W\bullet S}\ge 0 \) holds. We can therefore conclude that at the critical point between the blocking and non-blocking statuses, \( \overline{T_{\Delta }}\ge 0 \), that is, the average response time \( {T}_{MR(A)} \) is no worse than \( {T}_{MR(B)} \).
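The argument above can be checked numerically with a short sketch. It assumes \( {\Delta }_w=0 \), picks the occupancy time that makes G(N, M, K) = 0, and evaluates the average response time difference. The helper names and the parameter values are illustrative assumptions.

```python
def critical_occupancy(N, M, K, delta_d, W):
    """Value of t_o = P*delta_p + C*delta_c for which G(N, M, K) = 0."""
    return M * K / N * 4 * delta_d / W


def avg_response_gap(S, N, M, K, t_o, delta_d, W):
    """Average difference (T_MR(B) - T_MR(A)) / S with delta_w taken as 0."""
    scheduling_term = 4 * M * delta_d * (K - 2) / (N * W * S)
    return scheduling_term + t_o / (M * K) - 4 * delta_d / (N * W)


if __name__ == "__main__":
    # Hypothetical configuration placed exactly on the critical point.
    t_o = critical_occupancy(N=2, M=4, K=3, delta_d=1.0, W=100.0)
    print(avg_response_gap(S=20, N=2, M=4, K=3, t_o=t_o, delta_d=1.0, W=100.0))
```

At the critical point the last two terms cancel up to rounding, so the gap reduces to the non-negative scheduling term.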

5.5 System performance decline rate analysis

\( {T}_{\mathrm{MR}}\left(S,N,M,K\right) \) represents the time consumed by the multi-task rendering process. S can be regarded as the main variable, because its value changes at any time with the number of users. The remaining variables can be regarded as secondary variables, which are set to default values in the cloud rendering system. We therefore take the partial derivative of \( {T}_{\mathrm{MR}} \) with respect to S. To facilitate the discussion, the following quantities are further defined:

$$ {T}_{s1}^{\prime }=\frac{4{\Delta }_d}{N\bullet W},\quad {T}_{s2}^{\prime }=\frac{P{\Delta }_p+C{\Delta }_c}{M\bullet K} $$
(14)

\( {T}_{s1}^{\prime } \) is the growth rate of \( {T}_{\mathrm{MR}} \) when the performance bottleneck lies in the task scheduling part of the system, and \( {T}_{s2}^{\prime } \) is the growth rate when the bottleneck lies in the rendering part. We now discuss the growth of \( {T}_{\mathrm{MR}} \) on the interval (M ∙ K, +∞). When the conditions \( {\Delta }_w=0 \) and G(N, M, K) < 0 are satisfied, \( {T}_{s2}^{\prime } \) satisfies

$$ {T}_{s2}^{\prime }=\frac{P{\Delta }_p+C{\Delta }_c}{M\bullet K}=\frac{\overline{t_o}}{M\bullet K}>\frac{\overline{t_d}}{M}=\frac{4{\Delta }_d}{N\bullet W}={T}_{s1}^{\prime } $$
(15)

This result shows that, no matter how the other parameters are adjusted, \( {T}_{\mathrm{MR}} \) on the interval (M ∙ K, +∞) always grows at the faster of the two rates. However, once the growth rate of \( {T}_{\mathrm{MR}} \) is determined, the parameters can still be adjusted to improve the performance of the system. To test the algorithm of this paper, we choose currently popular algorithms [21,22,23] and test them under system pressure. The average response time of each algorithm after blocking is shown in Table 2.

Table 2 The average response time of each test algorithm after blocking

5.6 System performance tuning strategy

When G(N, M, K) ≥ 0, the growth rate of \( {T}_{\mathrm{MR}} \) is \( {T}_{s1}^{\prime } \), and N has an upper limit,

$$ \max (N)=\left\lfloor \frac{4 MK{\Delta }_d}{W\left(P{\Delta }_p+C{\Delta }_c\right)}\right\rfloor $$
(16)

By raising the parameter N to this upper limit, we can reduce the growth rate \( {T}_{s1}^{\prime } \) to its minimum while keeping the system in its current status. When G(N, M, K) < 0, the growth rate of \( {T}_{\mathrm{MR}} \) is \( {T}_{s2}^{\prime } \), and we can reduce it in two ways: increase the product M ∙ K, or reduce P∆p + C∆c. So we can choose to raise M ∙ K to the upper limit

$$ \max \left(M\bullet K\right)=\left\lfloor \frac{N\bullet W}{4{\Delta }_d}\left(P{\Delta }_p+C{\Delta }_c\right)\right\rfloor $$
(17)

The purpose is to make G(N, M, K) → 0. When G(N, M, K) → 0, the task scheduling part and the rendering part have matching performance. When N →  + ∞, \( \underset{N\to +\infty }{\lim }G\left(N,M,K\right)=-\left(P{\varDelta}_p+C{\varDelta}_c\right) \), which means that the rendering part lags seriously behind the task scheduling part and becomes the performance bottleneck of the system. When M ∙ K →  + ∞, \( \underset{M\bullet K\to +\infty }{\lim }G\left(N,M,K\right)=+\infty \), which means that the task scheduling part lags seriously behind the rendering part and becomes the performance bottleneck of the system. The average response time of each algorithm after system optimization is shown in Table 3.

Table 3 Average response time of each algorithm after system optimization
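The two upper limits used in the tuning strategy, formulas (16) and (17), can be sketched as follows. The function names and parameter values are illustrative assumptions, `t_o` stands for P∆p + C∆c, and the values are chosen so that the divisions are exact.

```python
from math import floor


def max_web_servers(M, K, t_o, delta_d, W):
    """Formula (16): largest N that keeps G(N, M, K) >= 0."""
    return floor(4 * M * K * delta_d / (W * t_o))


def max_site_product(N, t_o, delta_d, W):
    """Formula (17): upper limit of M*K, where G(N, M, K) approaches 0."""
    return floor(N * W * t_o / (4 * delta_d))


if __name__ == "__main__":
    # Hypothetical values: 4 renderers with 3 sites each.
    n_max = max_web_servers(M=4, K=3, t_o=1.5, delta_d=5.0, W=10.0)
    mk_max = max_site_product(N=n_max, t_o=1.5, delta_d=5.0, W=10.0)
    print(n_max, mk_max)
```

In this example the two limits are mutually consistent: raising N to its limit for M ∙ K = 12 gives a configuration whose own limit on M ∙ K is again 12, which is the balanced point G(N, M, K) = 0. With arbitrary floating-point inputs, `floor` can be off by one when the quotient is extremely close to an integer, so exact rational arithmetic may be preferable in practice.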

6 Conclusion

At present, cloud computing systems and the cloud rendering industry are growing rapidly. To address the performance of current cloud rendering systems, we propose a method for diagnosing performance bottlenecks and optimizing resource allocation, and use it to analyze the core performance of a cloud rendering system, especially the multi-task rendering process. We obtain the following conclusions: (1) The average response delay is influenced by many factors. (2) An increase in the number of concurrent users directly increases the average response delay. (3) The number of concurrent users is in turn restricted by the parameters that determine the average response delay. We propose a parameter adjustment strategy, derived mathematically, to improve system performance: when G(N, M, K) ≥ 0, we optimize the system by raising the parameter N to its upper limit; when G(N, M, K) < 0, we optimize the system by increasing the product M ∙ K to reduce the growth rate of \( {T}_{\mathrm{MR}} \). This is a new performance optimization scheme for cloud rendering systems.