Our data includes the same set of graphs that was used by the authors of PRISM to compare it with other algorithms [6]. The set is available in the Graphviz open source package (footnote 3). We also used a small collection of random graphs and a collection of about 10,000 files (footnote 4). For the experiments we use a modified version of Dot, where we can invoke either GTree or PRISM for the overlap removal step, and we also used MSAGL, where we implemented PRISM and GTree. MSAGL was used only to obtain the quality measures. We ran the experiments on a PC running 64-bit Linux with an Intel Core i7-2600K CPU @ 3.40 GHz and 16 GB RAM.
Some of the resulting layouts can be seen in Figs. 3, 5, 6.
One can try to resolve overlap by scaling the node centers of the original layout. If no two node centers coincide, this works, but the resulting layout may require a huge area if some centers are close to each other. We consider the area of the final layout as one of the quality measures, and usually PRISM produces a smaller area than GTree, see Table 1.
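To illustrate why pure scaling can blow up the area, the following minimal sketch computes the smallest uniform scale factor that separates a set of axis-aligned boxes. The function name `min_uniform_scale` and the representation of nodes as centers plus (width, height) pairs are assumptions made for this example; they are not taken from Dot or MSAGL.

```python
import math
from itertools import combinations


def min_uniform_scale(centers, sizes):
    """Smallest s >= 1 such that scaling every center by s separates all boxes.

    centers: list of (x, y) node centers.
    sizes:   list of (width, height) of the axis-aligned node boxes.
    Returns math.inf if two centers coincide (scaling cannot separate them).
    """
    s = 1.0
    for (i, (xi, yi)), (j, (xj, yj)) in combinations(enumerate(centers), 2):
        dx, dy = abs(xi - xj), abs(yi - yj)
        # Center distance needed along each axis for the boxes to just touch.
        need_x = (sizes[i][0] + sizes[j][0]) / 2
        need_y = (sizes[i][1] + sizes[j][1]) / 2
        sx = need_x / dx if dx > 0 else math.inf
        sy = need_y / dy if dy > 0 else math.inf
        # Separation along one axis is enough, so take the cheaper one.
        s = max(s, min(sx, sy))
    return s


if __name__ == "__main__":
    boxes = [(2, 2)] * 3
    print(min_uniform_scale([(0, 0), (1, 0), (10, 0)], boxes))
    print(min_uniform_scale([(0, 0), (0.01, 0), (10, 0)], boxes))
```

Since the area of the bounding box grows with the square of the scale factor, a single pair of nearly coincident centers dominates the size of the whole layout.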
Table 1. Similarity to the initial layout (left) and number of iterations for different graph sizes and different initialization methods (right). PR stands for PRISM.
Table 2. k closest neighbors error, the Multi Dimensional Scaling algorithm of MSAGL was used for the initial layout. PR stands for PRISM.
In addition to comparing the areas, we compare some other layout properties. Following Gansner and Hu [6], we look at edge length dissimilarity, denoted as \(\sigma _{\text {edge}}\). This measure reflects the relative change of the edge lengths of a Delaunay Triangulation on the node centers of the original layout.
The other measure, denoted by \(\sigma _{disp}\), is the Procrustean similarity [1]. It shows how close the transformation of the original graph is to a combination of a scale, a rotation, and a shift transformation. PRISM and GTree perform similarly on the last two measures, as Table 1 shows.
To distinguish the methods further, we measure the change in the set of k closest neighbors of the nodes. Namely, let \(p_1,\dots ,p_n\) be the positions of the node centers, and let k be an integer such that \(0 < k < n\). Let \(I = \{1,\dots ,n\}\) be the set of node indices. For each \(i \in I \) we define \(N_k(p, i) \subset I\setminus \{i\}\) such that \(|N_k(p, i)|=k\) and, for every \(j \in I \setminus (N_k(p, i) \cup \{i\})\) and every \(j' \in N_k(p, i)\), we have \(\Vert p_j-p_i\Vert \ge \Vert p_{j'}-p_i\Vert \). In other words, \(N_k(p, i)\) represents a set of k closest neighbors of i, excluding i. Let \(p'_1,\dots ,p'_n\) be the transformed node centers. To see how much the layout is distorted near node i, we intersect \(N_k(p,i)\) and \(N_k(p',i)\). We measure the distortion as \((k-m)^2\), where m is the number of elements in the intersection. If the node preserves its k closest neighbors, then the distortion is zero.
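The definition above can be sketched in a few lines of Python. This is a brute-force illustration, not the code used in MSAGL. Two points are assumptions of the sketch: ties between equally distant neighbors are broken by node index, and the per-node distortions are aggregated by a plain sum (the text defines only the per-node value). The function names are hypothetical.

```python
import math


def k_nearest(points, i, k):
    """Indices of the k nodes closest to node i, excluding i itself.

    Ties are broken by node index (an assumption of this sketch).
    """
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[j], points[i]))
    return set(others[:k])


def kcn_distortion(p, p_new, i, k):
    """Distortion (k - m)^2 at node i, where m = |N_k(p, i) & N_k(p', i)|."""
    m = len(k_nearest(p, i, k) & k_nearest(p_new, i, k))
    return (k - m) ** 2


def kcn_error(p, p_new, k):
    """Sum of per-node distortions (the aggregation is an assumption)."""
    if not 0 < k < len(p):
        raise ValueError("k must satisfy 0 < k < n")
    return sum(kcn_distortion(p, p_new, i, k) for i in range(len(p)))


if __name__ == "__main__":
    before = [(0, 0), (1, 0), (3, 0), (7, 0)]
    scaled = [(2 * x, 2 * y) for x, y in before]       # uniform scaling
    moved = [(0, 0), (1, 0), (3, 0), (-0.4, 0)]        # last node relocated
    print(kcn_error(before, scaled, k=1))
    print(kcn_error(before, moved, k=1))
```

Uniform scaling keeps every neighbor set intact, so it yields zero error, whereas relocating a single node changes the neighbor sets of that node and of the nodes it lands next to.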
Our experiments for k from 8 to 12 show that under this measure GTree produced a smaller error, showing less distortion, on 8 of the 14 graphs, and on the rest PRISM produced a better result, see Table 2. GTree produced a smaller error on all small random graphs from other collections (footnote 5).
Table 3. Statistics on collection A. Here k-cn stands for k-closest neighbors, and “iters” stands for the number of iterations. Each cell contains the number of graphs for the measure on which the method performed better. We can see that PRISM produced a layout of smaller area than that of GTree on 8498 graphs, against 1579 graphs where GTree required less area. On the other hand, GTree gives better results on all other measures. The columns of k-cn and “iters” do not sum to 10077, the number of graphs in A, because some of the results were equal for PRISM and GTree.
We ran tests on the graphs from a subdirectory of the same site called “dot_files”; we call this set of graphs collection A. Each graph from A represents the control flow of a method from a version of the .NET framework. A contains 10077 graphs. The graph sizes do not exceed several thousand. We used the Multi Dimensional Scaling algorithm of MSAGL for the initial layout in this test. The results of the run are summarized in Table 3.
Runtime Comparison. Both methods remove the overlap iteratively using the proximity graph. However, while PRISM needs \(\mathcal {O}(|V|\cdot \sqrt{|V|})\) time to solve the stress model, GTree needs only \(\mathcal {O}(|V|)\) time per iteration with the growing tree procedure. Therefore, GTree is asymptotically faster in a single iteration. In addition, as Table 1 (right) shows, GTree usually needs fewer iterations than PRISM, especially on larger graphs. The overall runtime can be seen in Fig. 4. It shows that GTree outperforms PRISM on larger graphs.
In Fig. 5 we experiment with the way we expand the edges. Instead of the formula \(p'_j = p'_i + t_{ij}(p_j-p_i)\), which resolves the overlap between the nodes i and j immediately, we use the update \(p'_j = p'_i + \min (t_{ij},1.5)(p_j-p_i)\). As a result, the algorithm runs a little bit slower but produces layouts with smaller area.
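The two update rules can be compared with a small sketch that walks a rooted tree and places each child relative to its already placed parent. The data layout is hypothetical: `p` maps node ids to original centers, `children` maps a node to its tree children, and `t` maps a tree edge (i, j) to its expansion factor \(t_{ij}\), which is taken as given here. Keeping the root at its original position is also an assumption of this example.

```python
def grow_tree(p, children, root, t, cap=None):
    """Apply p'_j = p'_i + f * (p_j - p_i) along the tree edges.

    f is t[(i, j)] when cap is None, and min(t[(i, j)], cap) otherwise.
    The root keeps its original position (assumption of this sketch).
    """
    new_p = {root: p[root]}
    stack = [root]
    while stack:
        i = stack.pop()
        for j in children.get(i, ()):
            f = t[(i, j)] if cap is None else min(t[(i, j)], cap)
            new_p[j] = (
                new_p[i][0] + f * (p[j][0] - p[i][0]),
                new_p[i][1] + f * (p[j][1] - p[i][1]),
            )
            stack.append(j)
    return new_p


if __name__ == "__main__":
    p = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (1.0, 1.0)}
    children = {0: [1], 1: [2]}
    t = {(0, 1): 2.0, (1, 2): 1.0}  # edge (0, 1) overlaps, edge (1, 2) does not

    print(grow_tree(p, children, 0, t))           # overlap resolved at once
    print(grow_tree(p, children, 0, t, cap=1.5))  # capped, partial expansion
```

With the cap, an edge whose factor exceeds 1.5 is only partially expanded in one pass, so the remaining overlap has to be removed in later iterations. This is consistent with the slower run and the more compact layouts reported above.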