
1 Introduction

We consider the problem of finding a large planar subgraph in a given non-planar graph \(G=(V,E)\); \(n:=|V|\), \(m:=|E|\). We distinguish between algorithms that find a large, maximal, or maximum such subgraph: the latter—the maximum planar subgraph (MPS)—is one of largest edge cardinality and is NP-hard to find [18], whereas a subgraph is inclusionwise maximal if it cannot be enlarged by adding any further edge of G. Sometimes the inverse question—the skewness of G—is asked: find the smallest number \(\mathrm{skew}(G)\) of edges to remove such that the remaining graph is planar; clearly, \(\mathrm{skew}(G)\) equals m minus the edge count of an MPS.

The problem constitutes a natural, non-trivial graph measure, and is probably the best-known non-planarity measure next to the crossing number. This alone may be reason enough to investigate its computation. Moreover, MPS/skewness arises at the core of several other applications: e.g., the practically strongest heuristic to draw G with few crossings—the planarization method [2, 7]—starts with a large planar subgraph and then adds the remaining edges into it.

Recognizing graphs of small skewness also plays a crucial role in parameterized complexity: many problems become easier on planar graphs; e.g., a maximum flow can be computed in \(\mathcal {O}(n\log n)\) time, the Steiner tree problem allows a PTAS, a maximum cut can be found in polynomial time, etc. It can hence be a good idea to (in a nutshell) remove a couple of edges to obtain a planar graph, solve the problem on this subgraph, and then modify the solution suitably to accommodate the previously ignored edges. E.g., we can compute a maximum flow in time \(\mathcal {O}(\mathrm{skew}(G)^3 \cdot n\log n)\) [13].

While solving MPS is NP-hard, there are diverse polynomial-time approaches to compute a large or maximal planar subgraph, ranging from trivial to sophisticated. By Euler’s formula we know that already a spanning tree gives a 1/3-approximation for MPS; hence all reasonable algorithms achieve this ratio. Only the cactus algorithms (see below) are known to exhibit better ratios. We will also consider an exact MPS algorithm based on integer linear programs (ILPs).
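This baseline bound is easy to verify: by Euler’s formula, a planar graph on \(n\ge 3\) vertices has at most \(3n-6\) edges, while a spanning tree of a connected graph already contributes \(n-1\) (trivially planar) edges, so

$$\frac{n-1}{3n-6} = \frac{n-1}{3(n-2)} \ge \frac{1}{3}.$$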

All algorithms considered in this paper are known (for quite some time, in fact), and are theory-wise well understood both in terms of worst case solution quality and running time. To our knowledge, however, they have never been practically compared. In this paper we are in particular interested in the following quality measures, and their interplay:

  • What is the practical difference in terms of running time?

  • What is the practical difference in solution quality (i.e., subgraph density)?

  • What is the implementation effort of the various approaches?

Overall, understanding these different quality measures as a multi-criteria setting, we can argue for each of the considered algorithms that it is Pareto-optimal. We are in particular interested in studying a somewhat “blurred” notion of Pareto-optimality: we want to investigate, e.g., in which situations the additional sophistication of an algorithm yields “significant enough” improvements.

The measure of “implementation complexity”, too, is surprisingly hard to define concisely; even among software engineers there is no prevailing notion. “Lines of code”, for example, are largely unrelated to the intricacies of implementing an algorithm. We will hence only argue in terms of direct comparisons between pairs of algorithms, based on our experience when implementing them.

As we will see in the next section, there are certain building blocks all algorithms require, e.g., a graph data structure and (except for C, see below) an efficient planarity test. When discussing implementation complexity, it seems safe to assume that a programmer will already start off with some kind of graph library for her basic data structure needs. In the context of the ILP-based approach, we assume that the programmer uses one of the various freely available (or commercial) frameworks. Writing a competitive branch-and-cut framework from the ground up would require a staggering amount of knowledge, experience, time, and finesse; the ILP method is simply not an option if no preexisting framework may be used.

In the following section, we discuss our considered algorithms and their implementation complexity. In Sect. 3, we present our experimental study. We first consider the pure problem of obtaining a planar subgraph. Thereafter, we investigate the algorithm choices when solving MPS as a subproblem in a typical graph drawing setting—the planarization heuristic.

2 Algorithms

Naïve Approach. The algorithmically simplest way to find a maximal planar subgraph is to start with the empty graph and to insert each edge (in random order) unless the planarity test fails. Given an \(\mathcal {O}(n)\)-time planarity test (we use the algorithm by Boyer and Myrvold [3], which is reputedly among the fastest in practice), this approach requires \(\mathcal {O}(nm)\) overall time.

In our study, we consider a trivial multi-start variant that picks the best solution after several runs of the algorithm, each with a different random edge order. The obvious benefit of this approach is that it is trivial to understand and implement—once any planarity test is available as a black box.
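To make this concrete, the following Python sketch implements the naïve algorithm and its multi-start variant, using networkx’s check_planarity as the black-box planarity test (the study itself uses a C++/OGDF implementation of Boyer–Myrvold; all function names here are ours). The same grow routine also serves as the maximality postprocessing of the “+” variants discussed below.

```python
import random
import networkx as nx

def grow_planar(G, initial_edges=()):
    """Greedily insert edges of G into a planar subgraph until it is maximal."""
    P = nx.Graph()
    P.add_nodes_from(G.nodes())
    P.add_edges_from(initial_edges)
    rest = [e for e in G.edges() if not P.has_edge(*e)]
    random.shuffle(rest)
    for u, v in rest:
        P.add_edge(u, v)
        if not nx.check_planarity(P)[0]:
            P.remove_edge(u, v)  # P only grows, so a rejected edge stays rejected
    return P

def naive_multistart(G, runs=10):
    """Multi-start naïve heuristic: keep the densest maximal planar subgraph."""
    return max((grow_planar(G) for _ in range(runs)),
               key=lambda P: P.number_of_edges())
```

Since the subgraph only grows, an edge whose insertion once destroyed planarity can never become insertable again; a single pass over the edges therefore already yields a maximal planar subgraph.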

Augmented Planarity Test. Planarity tests can be modified to allow the construction of large planar subgraphs. We briefly sketch these modifications in the context of the above-mentioned \(\mathcal {O}(n)\) planarity test by Boyer and Myrvold [3]: in the original test, we start with a DFS tree and rebuild the graph bottom-up; we incrementally add a vertex together with the DFS edges to its children and the backedges from its descendants. The test fails if at least one backedge cannot be embedded.

We can obtain a large (though in general not maximal) planar subgraph by ignoring any backedges that could not be embedded and simply continuing with the algorithm (BM). If we require maximality, we can use the naïve algorithm as a postprocessing to grow the obtained subgraph further (BM+). While this voids the linear runtime, it is still faster than the pure naïve approach. Given an implementation of the planarity testing algorithm, the required modifications are relatively simple per se—however, they are potentially hard to get right, as the implementer needs to understand the side effects within the complex planarity testing implementation.

Alternatively, Hsu [14] showed how to overcome the lack of maximality directly within the planarity testing algorithm [19] (which is essentially equivalent to [3]), retaining linear runtime. While this approach is the most promising in terms of running time, it would require the most demanding implementation of all approaches discussed in this paper (including the next subsection): one has to implement a full planarity testing algorithm plus intricate additional procedures. We know of no implementation of this algorithm.

Cactus Algorithm. The only non-trivial approximation ratios are achieved by two cactus-based algorithms [4]. One starts with the vertices of G and no edges. To obtain a ratio of 7/18 (C), we iteratively add triangles that connect previously disconnected components. This process leaves a forest F of tree-like structures made out of triangles—cacti. Finally, we make F connected by adding arbitrary edges of E between disconnected components. Since this subgraph will in general not be maximal, we can use the naïve algorithm to grow it further (C+).

From the implementation point of view, this algorithm is trivial and—unless one requires maximality—does not even involve a planarity test. While a bit more complex than the naïve approach, it does not require modifications to complex and potentially hard-to-understand planarity testing code as BM does.
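For concreteness, a Python sketch of C under the same assumptions as before (networkx; all names ours). A single pass over all triangles suffices: components only merge, so a triangle that once fails the distinct-components test can never qualify later.

```python
import networkx as nx

class DSU:
    """Union-find over the vertices, tracking components of the growing forest."""
    def __init__(self, nodes):
        self.parent = {v: v for v in nodes}
    def find(self, v):
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]  # path halving
            v = self.parent[v]
        return v
    def union(self, u, v):
        self.parent[self.find(u)] = self.find(v)

def cactus_subgraph(G):
    """C: greedily add triangles that merge distinct components, then
    connect the remaining components by arbitrary edges of G."""
    comp = DSU(G.nodes())
    P = nx.Graph()
    P.add_nodes_from(G.nodes())
    for u, v in G.edges():
        for w in set(G[u]) & set(G[v]):  # common neighbor w -> triangle u,v,w
            if len({comp.find(u), comp.find(v), comp.find(w)}) == 3:
                P.add_edges_from([(u, v), (v, w), (u, w)])
                comp.union(u, v)
                comp.union(v, w)
    for u, v in G.edges():  # finally, connect what remains by single edges
        if comp.find(u) != comp.find(v):
            P.add_edge(u, v)
            comp.union(u, v)
    return P
```

Feeding the resulting edge set into grow_planar from the naïve sketch yields C+.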

For the best approximation ratio of 4/9, one seeks not a maximal but a maximum cactus forest. This approach, however, is of mostly theoretical interest, as it requires non-trivial polynomial-time matroid algorithms.

ILP Approach (ILP). Finally, we use an integer linear program (ILP) to solve MPS exactly in reasonable (but formally non-polynomial) time, see [15]. With a binary variable \(x_e\) for each edge \(e\), specifying whether the edge is in the solution, we have

$$\max \Big \{ \sum \nolimits _{e\in E} x_e \ \Big |\ \sum \nolimits _{e\in K} x_e \le |K|-1 \text { for all Kuratowski subdivisions } K\subseteq G\Big \}.$$

Kuratowski’s theorem [17] states that a graph is planar if and only if it contains neither a \(K_5\) nor a \(K_{3,3}\) as a subdivision—so-called Kuratowski subdivisions. Hence we guarantee a planar solution by requiring that at least one edge is removed from each such subgraph K. While the set of these constraints is exponential in size, we can separate them heuristically within a branch-and-cut framework, see [15]: after each LP relaxation, we round the fractional solution and try to identify a Kuratowski subdivision that leads to a violated constraint.

This separation in fact constitutes the central implementation effort. Typical planarity testing algorithms initially only answer yes or no. In the latter case, however, all known linear-time algorithms can be extended to extract a witness of non-planarity in the form of a Kuratowski subdivision in \(\mathcal {O}(n)\) time. If the available implementation does not support this additional query, it can be simulated using \(\mathcal {O}(n)\) calls to the planarity testing algorithm, by incrementally removing edges whenever the graph stays non-planar after the removal. Both methods result in a straightforward implementation (assuming some familiarity with ILP frameworks), but an additional tuning step is necessary to decide, e.g., on rounding thresholds. The overall complexity is probably somewhere in between C and BM. In our study, we use the effective extraction scheme described in [10], which yields several Kuratowski subdivisions via a single call. We propose, however, to use this feature only if it is already available in the library: its implementation effort would otherwise be comparable to a full planarity test, and in particular for harder instances its benefit is not significant.
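A sketch of the separation routine under the same assumptions as before (Python/networkx; all names ours). Unlike the multi-extraction scheme of [10], this simplified version extracts only a single Kuratowski subdivision per call, via check_planarity’s counterexample option.

```python
import networkx as nx

def _key(e):
    u, v = e
    return (u, v) if u <= v else (v, u)  # assumes sortable vertex labels

def separate_kuratowski(G, x, threshold=0.5):
    """Round the fractional LP values x (dict: sorted edge tuple -> [0,1]) and
    search the rounded subgraph for a Kuratowski subdivision whose constraint
    is violated. Returns the edge set of a violated constraint, or None."""
    H = nx.Graph()
    H.add_nodes_from(G.nodes())
    H.add_edges_from(e for e in G.edges() if x.get(_key(e), 0.0) >= threshold)
    planar, cert = nx.check_planarity(H, counterexample=True)
    if planar:
        return None
    K = [_key(e) for e in cert.edges()]    # edges of one Kuratowski subdivision
    if sum(x[e] for e in K) > len(K) - 1:  # violated iff sum_{e in K} x_e > |K|-1
        return K
    return None
```

Within branch-and-cut, this routine is invoked after each LP relaxation; any returned edge set K gives the new cut \(\sum_{e\in K} x_e \le |K|-1\).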

3 Experiments

For an exploratory study we conducted experiments on several benchmark sets. We summarize the results as follows—observe the inversion between F1 and F2.

  1. F1.

C+ yields the best solutions. Choosing a “well-growable” initial subgraph—in our case a good cactus—is practically important. The better solution of BM is a weak starting point for BM+; even the naïve multi-start approach gives clearly better solutions.

  2. F2.

BM gives better solutions than C; both are by far the fastest approaches. Thus, if runtime is more crucial than maximality, we suggest using BM.

  3. F3.

ILP only works for small graphs. Expander graphs (sparse but well-connected) seem to be among the hardest instances for this approach.

  4. F4.

    Larger planar subgraphs lead to fewer crossings for the planarization method. However, this is much less pronounced with modern insertion methods.

Setup and Instances. All considered algorithms are implemented in C++ (g++ 5.3.1, 64 bit, -O3) as part of OGDF [5]; the ILP is solved via CPLEX 12.6. We use an Intel Xeon E5-2430 v2 at 2.50 GHz running Debian 8; each algorithm runs on a single core (out of twelve), with a memory limit of 4 GB per process.

We use the non-planar graphs of the established benchmark sets North [12] (423 instances), Rome [11] (8249), and SteinLib [16] (586), all of which include real-world instances. In our plots, we group instances according to |V| rounded to the nearest multiple of 10; for Rome we only consider graphs with \(\ge 25\) vertices.

Fig. 1. (a)–(f): solution quality, running times, ILP solvability, and planarization quality over the benchmark sets. We may omit algorithms whose values are unsuitable for a plot; instead we give their average [min, max] in the caption.

Additionally, we consider two artificial sets: BaAl [1] are scale-free graphs, and Regular [20] (implemented as part of OGDF) are random regular graphs; the latter are expander graphs w.h.p. [folklore]. Both sets contain 20 instances for each combination of \(|V|\in \{10^2,10^3,10^4\}\) and \(|E|/|V|\in \{2,3,5,10,20\}\).
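These sets can be generated along the following lines (a sketch using networkx’s standard generators; the study uses OGDF’s implementations, so the exact instances differ):

```python
import networkx as nx

# The study uses 20 random instances per combination; one of each shown here.
instances = []
for n in (10**2, 10**3, 10**4):
    for d in (2, 3, 5, 10, 20):
        instances.append(nx.barabasi_albert_graph(n, d))     # scale-free, |E|/|V| ~ d
        instances.append(nx.random_regular_graph(2 * d, n))  # 2d-regular, |E|/|V| = d
```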

Evaluation. Our results confirm the need for heuristic approaches, as ILP solves less than 25% of the larger graphs of the (comparably simple) Rome set within 10 min. Even deploying strong preprocessing [6] (+PP) and doubling the computation time does not help significantly, cf. Fig. 1(d). Already 30-vertex graphs with density 3, generated like Regular, cannot be solved within 48 hours (\(\rightarrow \) F3).

We measure solution quality by the density (edges per vertex) of the computed planar subgraph. Independently of the benchmark set, C+ always achieves the best solutions, cf. Fig. 1(a), (b) (table) (\(\rightarrow \) F1). We know instances where C+ attains only a poor approximation ratio, whereas a worst-case ratio for BM+ is known as well [8]. Surprisingly, the naïve multi-start approach yields distinctly better solutions than BM+ in practice (\(\rightarrow \) F1).

On SteinLib, BaAl, and Regular, both \(\texttt {C} \) and \(\texttt {BM} \) behave similarly w.r.t. solution quality. For Rome and North, however, BM yields solutions that are 20–30% better, respectively (\(\rightarrow \) F2). This discrepancy seems to be due to the fact that, on BaAl and Regular, the found subgraphs are generally very sparse for both algorithms (average density of 1.1 and 1.2, respectively, for the largest graphs).

Both \(\texttt {C} \) and \(\texttt {BM} \) are extremely (and similarly) fast; Fig. 1(c) (table) (\(\rightarrow \) F2). For BM+ and C+, the naïve-style postprocessing dominates the running time: the pure naïve multi-start approach is worst, followed by \(\texttt {C+} \) and \(\texttt {BM+} \)—a larger initial solution leads to fewer growing attempts. Nonetheless, we observe that the (weaker) solution of C allows for significantly more successful growing steps than that of BM (\(\rightarrow \) F1).

Finally, we investigate the importance of the subgraph selection for the planarization method, cf. Fig. 1(e), (f). For the simplest insertion algorithms (iterative edge insertion, fixed embedding, no postprocessing [2]), a strong subgraph method (C+) is important; C leads to very bad solutions. For state-of-the-art insertion routines (simultaneous edge insertions, variable embedding, strong postprocessing [7, 9]), the subgraph selection is less important; even C is a feasible choice.