Association Visualization Analysis for the Application Service Layer and Network Control Layer

Most research on complex networks relies on single-layer representations. However, real-world systems are rarely isolated; in most cases they are interconnected. In this paper, departing from the traditional Open Systems Interconnection (OSI) model, we focus on the application service layer and the network control layer from the application's point of view. The two layers are connected through an IP mapping relationship. First, to avoid unnecessary loss of computational efficiency, we modify the Louvain algorithm to partition the nodes of the network control layer into communities. Second, we add a community attractive force and introduce the Barnes-Hut force-calculation model into the Fruchterman-Reingold algorithm so that nodes in the network control layer are laid out efficiently in a more structured and well-distributed way. Finally, we merge the application service layer and the network control layer into a two-layer visualization model. With this two-layer model, the overall network trend, topology and incidence relations can be grasped conveniently.


Introduction
Visual analytics combines automatic analysis with interactive visualization to support effective exploration and decision making [1]. Converting complex datasets into graphics or images makes it easier for users to understand large-scale data and discover hidden information.
In much prior research, a network is treated simply as a complex graph, that is, a data structure of nodes and edges. These studies generally fall into two categories: visualization of a single layer, and visualization of multiple layers of a complex network.
There is some work on displaying network status in a single layer. For the network control layer, VizFlowConnect [2] and PortVis [3] use parallel coordinates and scatter plots to visualize NetFlow data. IPMatrix [4] and Netvis [5] use Node-Link and

Preliminary Knowledge
In the network control layer, the inter-domain Internet topology can be abstracted as an Autonomous System (AS) connection graph, in which the nodes are ASes and the links are BGP routing sessions. As in social networks, real networks contain communities: AS nodes within a community interact closely, while links between communities are relatively sparse. To lay out this community structure well, this section introduces the two algorithms this paper builds on: the Louvain algorithm for community detection and the Fruchterman-Reingold algorithm for node layout.

Louvain Algorithm
Existing community detection algorithms for complex networks fall mainly into two categories. The first is based on graph theory, such as the k-clique algorithm [17] and the label propagation algorithm (LPA) [18]. The second is hierarchical clustering, such as the Fast Newman algorithm (FN) [19] and the Louvain algorithm [20]. The Louvain algorithm is currently widely regarded as one of the best non-overlapping community detection algorithms. Girvan and Newman [21] first proposed the concept of modularity Q in 2002, and the modularity Q in formula (1) is now commonly used to measure the strength of a network's community structure. The larger the value of Q, the more robust and compact the community structure. The value of Q is at most 1.
Here, $\Sigma_{in}$ is the sum of the weights of all edges inside community c, and $\Sigma_{tot}$ is the sum of the weights of the edges incident to the nodes in community c.
The Louvain algorithm is based on optimizing the modularity Q. The modularity increment ΔQ can be derived from formula (1), where $k_{i,in}$ is the sum of the weights of the edges that connect node $n_i$ to nodes in community c. Because the optimization is greedy, the algorithm remains computationally efficient on large complex networks.
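For reference, the standard forms of the modularity and of the gain used by the Louvain algorithm [20] can be written as below. This is a reconstruction from the definitions in the text rather than a transcription of the paper's own formulas. The symbols $m$ (total edge weight of the network) and $k_i$ (weighted degree of node $n_i$) are assumptions introduced here, and $\Sigma_{in}$ is assumed to count each internal edge once per endpoint. The second line is the gain of moving an isolated node $n_i$ into community c.

```latex
Q = \sum_{c} \left[ \frac{\Sigma_{in}}{2m} - \left( \frac{\Sigma_{tot}}{2m} \right)^{2} \right]

\Delta Q = \frac{k_{i,in}}{m} - \frac{\Sigma_{tot}\, k_i}{2m^{2}}
```

The gain follows by subtracting the contributions of c and of the singleton community of $n_i$ before the move from the contribution of the merged community after the move; the squared terms cancel except for the cross term $2\,\Sigma_{tot}\,k_i$.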
The Louvain algorithm proceeds in two steps. In the first step, every node in the network is regarded as its own community. We then try to move each node into the community of each of its neighbors and calculate the modularity increment ΔQ. If the largest ΔQ is positive, we move the node into the community that yields it; otherwise the node stays where it is.
In the second step, the nodes of each community are merged into a single new node, and the weights of the edges between communities become the weights of the edges between the new nodes. Step 1 is then repeated until the modularity Q no longer changes.
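The first step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `modularity` and `local_moving`, the edge-list input format `(u, v, weight)`, the absence of self-loops, and the small tolerance used to break ties are all choices made here. The second (aggregation) step is omitted.

```python
from collections import defaultdict

def modularity(edges, community):
    """Modularity Q of a partition; edges is a list of (u, v, weight), no self-loops."""
    m = sum(w for _, _, w in edges)        # total edge weight of the network
    inside = defaultdict(float)            # internal weight, counted once per endpoint
    total = defaultdict(float)             # sum of weighted degrees per community
    for u, v, w in edges:
        total[community[u]] += w
        total[community[v]] += w
        if community[u] == community[v]:
            inside[community[u]] += 2 * w
    return sum(inside[c] / (2 * m) - (total[c] / (2 * m)) ** 2 for c in total)

def local_moving(edges):
    """Step 1 of Louvain: move nodes greedily while some move increases Q."""
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = adj[u].get(v, 0.0) + w
        adj[v][u] = adj[v].get(u, 0.0) + w
    m = sum(w for _, _, w in edges)
    degree = {n: sum(nbrs.values()) for n, nbrs in adj.items()}
    community = {n: n for n in adj}        # every node starts as its own community
    total = dict(degree)                   # weighted degree sum per community
    improved = True
    while improved:
        improved = False
        for node in sorted(adj):
            old = community[node]
            total[old] -= degree[node]     # temporarily take the node out
            links = defaultdict(float)     # weight from node to each neighbouring community
            for nbr, w in adj[node].items():
                links[community[nbr]] += w

            def gain(c):
                # Modularity gain of inserting the isolated node into community c.
                return links[c] / m - total[c] * degree[node] / (2 * m * m)

            best = max(list(links), key=gain)
            if gain(best) <= gain(old) + 1e-12:
                best = old                 # no strictly better community: stay
            total[best] += degree[node]
            if best != old:
                community[node] = best
                improved = True
    return community
```

On two triangles joined by a single bridge edge, for example, this pass groups each triangle into one community.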

Fruchterman-Reingold Algorithm
The most common graph drawing methods rely on physical simulation, such as the force-directed algorithm (FDA) [22], the Kamada-Kawai algorithm (KK) [23] and the Fruchterman-Reingold algorithm (FR) [24]. In such a system, the edges between nodes act like springs or other physical connections, and the nodes are brought into balance by the interaction of elastic forces. Our research is based on the FR algorithm. This method treats nodes as atoms in a physical system, with attractive and repulsive forces acting between them. By driving the system toward a low-energy state, it produces an aesthetically pleasing, balanced layout with a simple cooling schedule.
To distribute the nodes of a graph well, the FR algorithm requires that nodes connected by an edge be as close as possible and that nodes not connected by an edge be as far apart as possible. The method defines an attractive force ($f_a$) and a repulsive force ($f_r$): attractive forces act between nodes connected by an edge, and repulsive forces act between all pairs of nodes. The forces are calculated by formula (3), in which $k = C\sqrt{S/N}$ is the balance coefficient that controls the edge length. Here C is a constant, S is the layout area and N is the number of nodes. With the FR algorithm, nodes are distributed evenly.
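As an illustration, the classic FR forces as published by Fruchterman and Reingold have magnitudes $f_a(d) = d^2/k$ and $f_r(d) = k^2/d$, where d is the distance between two nodes. The sketch below applies them with a linear cooling schedule. The function names, the default frame size, the starting temperature and the iteration count are assumptions made for this example, not values from the paper.

```python
import math
import random

def balance_coefficient(c, area, n):
    """k = C * sqrt(S / N)."""
    return c * math.sqrt(area / n)

def attractive_force(d, k):
    """Magnitude of the classic FR attraction between adjacent nodes."""
    return d * d / k

def repulsive_force(d, k):
    """Magnitude of the classic FR repulsion between any two nodes."""
    return k * k / d

def fr_layout(nodes, edges, width=1.0, height=1.0, c=1.0, iterations=50, seed=0):
    """Return {node: [x, y]} after a fixed number of FR iterations."""
    rng = random.Random(seed)
    k = balance_coefficient(c, width * height, len(nodes))
    pos = {n: [rng.uniform(0, width), rng.uniform(0, height)] for n in nodes}
    t0 = width / 10                                  # assumed starting temperature
    for step in range(iterations):
        temperature = t0 * (1 - step / iterations)   # linear cooling
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
                d = max(math.hypot(dx, dy), 1e-9)
                f = repulsive_force(d, k)
                disp[u][0] += dx / d * f
                disp[u][1] += dy / d * f
                disp[v][0] -= dx / d * f
                disp[v][1] -= dy / d * f
        # Attraction along every edge.
        for u, v in edges:
            dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
            d = max(math.hypot(dx, dy), 1e-9)
            f = attractive_force(d, k)
            disp[u][0] -= dx / d * f
            disp[u][1] -= dy / d * f
            disp[v][0] += dx / d * f
            disp[v][1] += dy / d * f
        # Move each node, limited by the temperature and clamped to the frame.
        for n in nodes:
            length = max(math.hypot(disp[n][0], disp[n][1]), 1e-9)
            move = min(length, temperature)
            pos[n][0] = min(width, max(0.0, pos[n][0] + disp[n][0] / length * move))
            pos[n][1] = min(height, max(0.0, pos[n][1] + disp[n][1] / length * move))
    return pos
```

At distance d = k the two magnitudes are equal, which is why k acts as the preferred edge length.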

The Method of Two-Layer Network Topology Visualization
In our research, we abstract the network and its applications as a network control layer and an application service layer. We focus on the topological structure of the network control layer and on the relationships between the layers, so we use a 2.5D visualization method to design our two-layer model. This section introduces our work on the topological visualization of the network control layer and the display algorithm of the two-layer network model.

Visualization of Network Control Layer
As in social networks, real networks contain communities: AS nodes within a community interact closely, while links between communities are relatively sparse. The traditional Louvain and FR algorithms have shortcomings in display quality and efficiency. To make the view structured and well-distributed, we modify the Louvain algorithm by pruning leaf nodes and add a community attractive force to our modified FR algorithm.

The Modified Louvain Algorithm
Real networks contain a large number of leaf nodes, that is, nodes connected to exactly one other node, so calculating the modularity increment ΔQ for every node wastes computation. As shown in Fig. 1, the black node has five leaf nodes. Because community detection avoids leaving an individual node as a community of its own, each leaf node must end up in the same community as the black node. Therefore, in the first step of the Louvain algorithm, we assign each leaf node directly to its adjacent non-leaf node. The higher the ratio of leaf nodes in the network, the greater the gain in efficiency.
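A minimal sketch of this pruning idea is shown below. It assumes an unweighted edge list of `(u, v)` pairs; the helper names `split_leaves` and `attach_leaves` are hypothetical and not taken from the paper. Leaves are set aside before community detection runs on the remaining core, and afterwards each leaf inherits the community of its only neighbor.

```python
from collections import defaultdict

def split_leaves(edges):
    """Return (core_edges, leaf_parent); leaf_parent maps each leaf to its only neighbour."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    leaf_parent = {}
    for node, nbrs in neighbours.items():
        if len(nbrs) == 1:
            (parent,) = nbrs
            if len(neighbours[parent]) > 1:   # only attach to a non-leaf node
                leaf_parent[node] = parent
    core_edges = [(u, v) for u, v in edges
                  if u not in leaf_parent and v not in leaf_parent]
    return core_edges, leaf_parent

def attach_leaves(core_community, leaf_parent):
    """Give every pruned leaf the community of its adjacent non-leaf node."""
    community = dict(core_community)
    for parent in leaf_parent.values():
        # A hub whose neighbours are all leaves has no core edge; it keeps its own label.
        community.setdefault(parent, parent)
    for leaf, parent in leaf_parent.items():
        community[leaf] = community[parent]
    return community
```

Community detection then only has to evaluate ΔQ for the nodes that appear in `core_edges`.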
We obtained the routing data of the rrc00.ripe.net route collector at 16:00 on May 7, 2018 from the RIPE Routing Information Service (RIS) and extracted the relationships between global AS nodes with Python. Because some edges appear in many AS paths and others in only a few, we use the number of AS paths containing an edge as its weight in the Louvain algorithm. Compared with an unweighted graph, nodes joined by high-weight links are then more likely to be placed in the same community.
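The weighting can be sketched as follows. This is an illustration only: it assumes the AS paths have already been parsed into lists of AS numbers, and the function name `as_edge_weights`, the collapsing of repeated (prepended) AS numbers, and the choice to count an edge at most once per path are assumptions, not details given in the paper.

```python
from collections import Counter

def as_edge_weights(as_paths):
    """Count, for each undirected AS link, the number of AS paths that contain it."""
    weights = Counter()
    for path in as_paths:
        # Collapse consecutive repeats such as [A, B, B, C] (AS path prepending).
        hops = [asn for i, asn in enumerate(path) if i == 0 or asn != path[i - 1]]
        # Use a set so that one path contributes at most 1 to each link.
        links = {tuple(sorted(pair)) for pair in zip(hops, hops[1:])}
        for link in links:
            weights[link] += 1
    return weights
```

The resulting counter can be turned directly into the weighted edge list consumed by a community detection routine.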
We classify the nodes and edges by country and select the China subset, with 338 nodes and 544 edges, for the experiment. We run the original and the modified Louvain algorithms on this dataset and compare them with the Fast Newman algorithm (FN) and the label propagation algorithm (LPA). Table 1 shows the results. FN performs worst: its modularity Q is acceptable, but its iterations take too much time. LPA is the fastest of the algorithms, but the Louvain algorithm achieves a much better modularity Q. Compared with the original Louvain algorithm, the modified one takes 4.12% less time. At the same time, the number of communities drops from 15 to 12 and the modularity Q increases from 0.6137 to 0.6231. This experiment therefore indicates that, among the algorithms compared, the modified Louvain algorithm is the most suitable for community detection when visualizing the network control layer.

The Modified FR Algorithm
Using the FR algorithm to lay out large-scale data raises two issues: (1) When there are too many communities, it is difficult to observe the community structure and the connections between communities. (2) The time complexity of FR is $O(|E| + |V|^2)$, so the computation takes a long time when there are many nodes. In our research, building on the FR algorithm, we therefore redefine the attractive force ($f_a$) and the repulsive force ($f_r$) and add a community force ($f_{com}$) that draws the nodes of the same community closer together while keeping the layout even. To avoid local optima, we introduce an energy function and use simulated annealing to approximate the optimal solution. In addition, the Barnes-Hut force-calculation model [25] is introduced to reduce the time complexity so that the modified algorithm can be applied to large-scale network layouts.
Nodes joined by high-weight links should generally stay closer together than other nodes, so we introduce the weight of each edge into the calculation of the attractive force ($f_a$). In addition, high-degree nodes usually belong to different communities, so we want them to lie farther away from each other to display the communities more clearly. We therefore also introduce the degrees of the two nodes into the calculation of the repulsive force ($f_r$) between all nodes. The forces on every node in the network are calculated by formula (4). For nodes in the same community, we want high-weight edges to hold the nodes tightly together. Following the attractive force in formula (3) defined in FR, we introduce the sum of the edge weights in the community ($w_{com}$) and define the community force in formula (5). In these formulas, $d(n_1, n_2)$ is the distance between nodes $n_1$ and $n_2$, $w(n_1, n_2)$ is the weight of the edge $(n_1, n_2)$, $\deg(n_1)$ is the degree of node $n_1$, $w_{com}$ is the sum of the edge weights in a community, and $w_{all}$ is the sum of the weights of all edges.
To reduce the time complexity of the repulsive force calculation, we introduce the Barnes-Hut force-calculation model. In our modified FR algorithm, the time complexity of the repulsion calculation is reduced from $O(|V|^2)$ to $O(|V| \log |V|)$.
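The idea behind the Barnes-Hut model can be sketched with a quadtree: distant groups of nodes are replaced by a single pseudo-node at their centre of mass. The code below is a simplified illustration that uses the classic FR repulsion $k^2/d$ with unit node mass, not the paper's degree-weighted formula (4). The class name `QuadNode`, the opening parameter `theta`, and the depth cap of 32 are assumptions of this sketch.

```python
import math

class QuadNode:
    """Square cell that stores the mass and centre of mass of the points inside it."""

    def __init__(self, cx, cy, half):
        self.cx, self.cy, self.half = cx, cy, half
        self.mass = 0.0
        self.mx = self.my = 0.0        # coordinate sums, used for the centre of mass
        self.point = None              # the point held by a leaf cell
        self.children = None

    def _child(self, x, y):
        return self.children[(2 if x >= self.cx else 0) + (1 if y >= self.cy else 0)]

    def insert(self, x, y, depth=0):
        if self.mass == 0:                                # empty leaf: store the point
            self.point = (x, y)
        elif self.children is None and depth < 32:        # occupied leaf: subdivide
            h = self.half / 2
            self.children = [QuadNode(self.cx + sx * h, self.cy + sy * h, h)
                             for sx in (-1, 1) for sy in (-1, 1)]
            px, py = self.point
            self.point = None
            self._child(px, py).insert(px, py, depth + 1)
            self._child(x, y).insert(x, y, depth + 1)
        elif self.children is not None:                   # internal cell: descend
            self._child(x, y).insert(x, y, depth + 1)
        # A leaf at the depth cap simply accumulates mass.
        self.mass += 1.0
        self.mx += x
        self.my += y

    def repulsion(self, x, y, k, theta=0.5):
        """Approximate repulsive force on (x, y) from all points in this cell."""
        if self.mass == 0:
            return 0.0, 0.0
        dx, dy = x - self.mx / self.mass, y - self.my / self.mass
        d = math.hypot(dx, dy)
        # Treat the cell as one body if it is a leaf or looks small from (x, y).
        if self.children is None or (d > 0 and 2 * self.half / d < theta):
            if d == 0:
                return 0.0, 0.0                            # the query point itself
            f = self.mass * k * k / d
            return dx / d * f, dy / d * f
        fx = fy = 0.0
        for child in self.children:
            cfx, cfy = child.repulsion(x, y, k, theta)
            fx += cfx
            fy += cfy
        return fx, fy

def build_tree(points):
    """Build a quadtree whose root cell covers the bounding box of the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    half = max(max(xs) - min(xs), max(ys) - min(ys)) / 2 + 1e-9
    root = QuadNode((max(xs) + min(xs)) / 2, (max(ys) + min(ys)) / 2, half)
    for x, y in points:
        root.insert(x, y)
    return root
```

With `theta=0.0` no cell is ever approximated and the result matches the pairwise sum; larger values trade accuracy for speed.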
We compare the modified FR algorithm with other classic layout algorithms on the public Dolphins dataset (62 nodes, 159 edges) and the Football Club dataset (115 nodes, 613 edges). The results are shown in Figs. 2 and 3. The visualizations of the two datasets show that the modified FR algorithm presents the community relationships in the network more clearly and in a more structured way than the Yifan Hu and ForceAtlas2 algorithms. Communities with stronger mutual connectivity are easy to distinguish in the current network state.

Establishment of Two-Layer Network Model
In our research, the real network is modeled as two layers: the application service layer and the network control layer. The locations of the nodes in the application service layer are closely related to the locations of the nodes in the network control layer. To reduce the visual clutter caused by inter-layer crossings, we calculate and adjust the locations of the application service layer's nodes. In general, one application service corresponds to multiple nodes in the network control layer. In our 2.5D model, the coordinates of a node in the network control layer are taken to be $(x_i, y_i, z_i)$ with $z_i = 0$, and the coordinates of the corresponding node in the application service layer are given by formula (6). In formula (6), $(x_i, y_i, z_i)$ are the coordinates of a node in the network control layer, $(x', y', z')$ are the coordinates of the node in the application service layer, and n is the number of nodes in the network control layer that correspond to the application service.
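One plausible reading of this placement, given that n counts the control-layer nodes mapped to a service, is that each service node sits above the centroid of its AS nodes. The sketch below implements that reading; the centroid rule, the function name `place_service_nodes` and the input dictionaries are assumptions of this example rather than the paper's formula (6).

```python
def place_service_nodes(control_pos, service_to_as, z_service=1.0):
    """Place each service at the mean (x, y) of its mapped AS nodes, on plane z_service.

    control_pos:   {as_node: (x, y)} layout of the network control layer (z = 0)
    service_to_as: {service: [as_node, ...]} inter-layer IP mapping
    """
    placed = {}
    for service, as_nodes in service_to_as.items():
        points = [control_pos[a] for a in as_nodes if a in control_pos]
        if not points:
            continue                       # no mapped AS node has a position
        n = len(points)
        placed[service] = (sum(p[0] for p in points) / n,
                           sum(p[1] for p in points) / n,
                           z_service)
    return placed
```

Placing a service above the centroid keeps its inter-layer links short, which is one way to limit crossings between the layers.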
The layout algorithm based on the two-layer model can be described as follows:
Step1 Merge all leaf nodes with their adjacent non-leaf nodes in network G.
Step2 Convert the network G into an undirected weighted graph and execute the modified Louvain algorithm. The result is the community partition C.
Step3 Place the nodes of network G at random and initialize all node coordinates $P_1, P_2, P_3, \ldots$
Step4 Initialize the temperature coefficient and build the Barnes-Hut hierarchical space model from the positions obtained in Step3.
Step5 Calculate the attractive force between adjacent nodes and the repulsive force between all nodes according to formula (4), calculate the additional attractive force between adjacent nodes in the same community according to formula (5), then update the node positions.
Step6 When the temperature coefficient has been gradually reduced to zero and the node positions have stabilized, output the node coordinates.
Step7 From the positions of the nodes in network G, calculate the node coordinates in the application service layer by formula (6). Taking the z value of the network control layer as 0 and that of the application service layer as 1, output the positions of the application service nodes.
We select the 338 nodes and 544 edges of the China subset in the network control layer and 13 website nodes in the application service layer for the experiment. Using IP mapping to connect the two layers, we obtain the result shown in Fig. 4. The black nodes represent the 13 websites and the remaining colored nodes are the AS nodes in the network control layer. Different colors represent different communities.
As Fig. 4 shows, when the two-layer network model built with the FR algorithm is compared with the one built with the modified algorithm, the community structure produced by our modified algorithm is more apparent and better structured.
A user interested in a website can select its node in the application service layer. All of its business relationships in the network control layer are then highlighted, and the AS node information can be read from the node labels. The result is shown in Fig. 5.
As shown in Fig. 5, taking the Baidu Fanyi node as an example, the AS nodes connected to it are AS4808, AS4847, AS9808, AS23724 and AS55967. The proper operation of Baidu Fanyi therefore depends on these nodes remaining safe. Figure 6 shows the result of selecting an AS node in the network control layer, with AS23724 as the example. All nodes connected to AS23724 in both the network control layer and the application service layer are easy to distinguish. If AS23724 has a problem, the nodes connected to it in the network control layer and the websites that rely on it in the application service layer may also be affected.
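The two selection interactions amount to lookups in the inter-layer mapping and in the AS adjacency. A minimal sketch, assuming the mapping is held as a dictionary from website to AS nodes; the function names `build_index`, `select_service` and `select_as` are hypothetical, and the AS links in the usage example are made up for illustration.

```python
from collections import defaultdict

def build_index(service_to_as):
    """Invert the inter-layer mapping: AS node -> set of services that rely on it."""
    as_to_services = defaultdict(set)
    for service, as_nodes in service_to_as.items():
        for asn in as_nodes:
            as_to_services[asn].add(service)
    return as_to_services

def select_service(service, service_to_as):
    """AS nodes to highlight when a website is selected."""
    return set(service_to_as.get(service, ()))

def select_as(asn, as_edges, as_to_services):
    """Neighbouring AS nodes and dependent websites to highlight for one AS node."""
    neighbours = {v for u, v in as_edges if u == asn} | {u for u, v in as_edges if v == asn}
    return neighbours, set(as_to_services.get(asn, ()))
```

A usage example with the Baidu Fanyi mapping reported above (the two AS links are placeholders, not data from the paper):

```python
service_to_as = {"Baidu Fanyi": ["AS4808", "AS4847", "AS9808", "AS23724", "AS55967"]}
as_edges = [("AS23724", "AS4808"), ("AS4808", "AS9808")]   # illustrative links only
index = build_index(service_to_as)
neighbours, websites = select_as("AS23724", as_edges, index)
```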

Conclusion
Currently, the visualization of real networks focuses mainly on single-layer representations. Our research considered the application service layer and the network control layer together with the IP mapping relationship between them. In the network control layer, to avoid unnecessary loss of computational efficiency, we modified the Louvain algorithm for community detection by pruning leaf nodes. We then added a community attractive force to the FR algorithm so that the nodes in the network control layer are laid out in a structured and well-distributed way. Finally, we merged the application service layer and the network control layer into a 2.5D visual model that helps users analyze the network trend, topology and incidence relations.
In future work, we plan to add a geographic location layer to our multi-layer model to show the IP location of each node in the application service layer. With such a multi-layer visual model, the operating status and topology of the network can be observed in multiple dimensions.