SUSPECT: MINLP special structure detector for Pyomo

We present SUSPECT, an open source toolkit that symbolically analyzes mixed-integer nonlinear optimization problems formulated using the Python algebraic modeling library Pyomo. We present the data structures and algorithms used to implement SUSPECT. SUSPECT works on a directed acyclic graph representation of the optimization problem to perform: bounds tightening, bound propagation, monotonicity detection, and convexity detection. We show how the tree-walking rules in SUSPECT balance the need for lightweight computation with effective special structure detection. SUSPECT can be used as a standalone tool or as a Python library to be integrated in other tools or solvers. We highlight the easy extensibility of SUSPECT with several recent convexity detection tricks from the literature. We also report experimental results on the MINLPLib 2 dataset.

Contributions This paper presents data structures and algorithms used to implement SUSPECT. The primary contribution of SUSPECT is its easy extensibility to detect the convexity of complex expressions. Custom MINLP solvers can leverage SUSPECT during preprocessing to make informed decisions about how to best customize the solution process. For example, a solver could use SUSPECT to identify which problem variables appear in nonconvex expressions, thereby reducing the space over which it must apply spatial branch and bound. SUSPECT's extensibility allows domain experts to add new convexity detectors, thereby expanding the range of expressions that SUSPECT can correctly classify. Finally, SUSPECT may also be used to provide special structure detection and bounds tightening when developing new global optimization tools. This paper is published alongside the source code [10], licensed under the Apache License, version 2. All global solvers have some convexity detection facilities, but SUSPECT is the first open source code to explicitly detect convexity as a stand-alone utility.
The rest of the paper is organized as follows. Section 2 introduces convexity detection, the Python library Pyomo and data structures to represent nonlinear optimization problems. Bound tightening and propagation algorithms are described in Sect. 3, while Sect. 4 gives an overview of the monotonicity and convexity propagation rules used in SUSPECT. In Sect. 5 we describe the data structures and algorithms implemented in SUSPECT and in Sect. 6 we introduce the plugin-based architecture used to extend SUSPECT. The test suite and results are presented in Sect. 7. We conclude in Sect. 8.

Background
Different methods exist for proving or disproving convexity of an expression (or an optimization problem). Tree-walking approaches [17,18] are suited for preprocessing given their low computational complexity. More computationally expensive methods (i) use automatic differentiation techniques to compute the interval Hessian [42,44], (ii) reduce convexity questions to real quantifier elimination problems [45], or (iii) sample to verify convexity [12]. Some solvers, e.g. CVX [22,23], CVXPY [14] and Convex.jl [51], impose a set of conventions when constructing problems and thereby obtain a convex optimization problem [24].
The bound tightening techniques described in this paper are similar to those used in interval arithmetic and constraint programming [37,47]. Other bound tightening techniques use pair of inequalities [3], tightening with a linear optimization problem (LP) [4], or solving two LPs for each variable [20].

Pyomo expression trees
SUSPECT works with optimization models formulated using Pyomo [27,28], a Python-based Algebraic Modeling Language (AML). Pyomo supports most features common to AMLs, e.g. (i) separating the model definition from the instance data and solution method, (ii) supporting linear and nonlinear expressions, and (iii) structuring modeling using indexing sets and multi-dimensional variables, parameters, and constraints. Pyomo focuses on an open, extensible object model. As with most AMLs, Fig. 1 shows that Pyomo represents objectives and constraints using expression trees, 1 with internal nodes representing operators and leaf nodes representing operands. Operands may be variable objects, literal constants (integer or floating point numbers, or strings), or placeholders for literal constants ("mutable parameters"). Internal nodes come in three fundamental categories: Unary operators (negation, absolute value, and trigonometric functions), binary operators (subtraction, multiplication, division, exponentiation, and comparison), and n-ary operators (addition, linear expressions).

Directed acyclic graphs
SUSPECT represents problems using a Directed Acyclic Graph (DAG) to group common subexpressions in a single node [47,55].
The SUSPECT instance graph G is similar to combining all Pyomo expression trees for that instance except that Pyomo explicitly avoids shared subexpressions and restricts each operator vertex to have at most one parent. Therefore, any two Pyomo expressions share no common internal nodes. In contrast, SUSPECT actively identifies and combines common vertices across expressions [54]. Figure 2 represents Optimization Problem (1) and shows the DAG vertices grouped by depth d (v).

Bound tightening and propagation
Feasibility-based bounds tightening propagates bounds from sources to sinks and from sinks to sources. To propagate bounds from sources to sinks, SUSPECT computes the new bounds B v of each node v with existing bound B v and bounds on its children nodes B u as: For each operator ⊗ v , SUSPECT computes the bound using the interval arithmetic rules in Appendix A [32,43].
dom g for some DAG vertex, SUSPECT throws an exception. In normal operation, SUSPECT has X ⊆ dom g, but this feature helps researchers extend SUSPECT.

Example 1 Consider the function
The reverse step of feasibility-based bound tightening propagates bounds from sinks to sources. A DAG vertex v with unary function ⊗ v and bound B v implies a child u with bound B u that is equal to or tighter than The absolute value and square functions are the only exceptions in the current implementation: they are not invertible, but we update the bound of B u using (2) gives the bound tightening step SUSPECT performs for each vertex v that represents an unary function: Example 2 Consider the function k( . SUSPECT does not support disjoint intervals because their support is uncommon in most solvers, so the bound is not Appendix B shows that SUSPECT also tightens the bounds of the children of linear expressions, but not of other non-unary functions, e.g. product and division.

Special structure detection
SUSPECT detects monotonicity and convexity recursively. The rules assume an expression has only unary or binary vertices, but SUSPECT deduces properties of n-ary vertices k = g 0 • g 1 • · · · • g l by applying the rules as (( This section introduces rules computing the monotonicity and convexity properties of the composition Applying the chain rule and computing the gradient of k, the monotonicity of k depends on the signs of ∇g(h(x)) and ∇h(x): If any of g or h are not differentiable, then the monotonicity and convexity of k are considered unknown. SUSPECT has special routines for the absolute value function: if the domain of |x| specifies x ≥ 0 (x ≤ 0) then we compute first and second derivatives as if |x| = x (|x| = −x). We assume nothing if x = 0 is in the interior of the domain. For a more detailed list of monotonicity and convexity rules refer to Appendix C of the supplementary material.

Implementation details
SUSPECT is implemented in Python and is compatible with Python version 3.5 and later. The source code is freely available on GitHub. 2 SUSPECT can be used as a standalone command line application or included in other applications as a library.

Data structures
SUSPECT implements a problem DAG by storing all vertices in a list, sorted by ascending depth. This sorting ensures that if SUSPECT traverses the list forward (backward), it visits a vertex after having visited all its children (parents). Each vertex stores a reference to its children and parents. Other approaches to representing the DAG are possible, e.g. using adjacency lists [13]. The SUSPECT class hierarchy for vertices, which is diagrammed in Fig. 3, is similar to Pyomo, but SUSPECT has a class for each available unary function because we often need to dispatch based on a vertex type. Having a type for each unary function enables leveraging the same functions used by the other vertex types. Figure 4 represents Optimization Problem (1): the DAG also contains dictionaries to map variables, constraints, and objectives names to their respective vertex.   Fig. 1, each unary function has its own class, and constraints and objectives are a special type of expression "x0" "x1" "x2" objectives "f0" constraints "f1" "f2"

Converting Pyomo models to SUSPECT
SUSPECT represents an optimization problem as a DAG, but Pyomo represents problems using separate expression trees for each constraint and objective. SUSPECT uses a hashmap-like data structure to correctly map a Pyomo expression to an existing SUSPECT expression if present, or otherwise create a new expression. The heuristic approach of hashing Pyomo expressions may not converge to the DAG with the fewest vertices, but it should obtain an acceptable DAG. The DAG structure depends on the model building style since no simplifications are performed. The hash function needs: (i) to compute the same value for equivalent expressions but different values for different expressions, and (ii) to limit collisions with a uniform distribution. Computing the hash is recursive: SUSPECT starts from the expression tree leaves and moves to the root. The hash of a variable is its ID. SUSPECT implements two possible strategies for the hash of a constant coefficient:

-Each unique number gets an increasing ID [SUSPECT default] This is best
when there are few distinct floating point numbers, e.g. many coefficients are 1. SUSPECT stores floating point numbers in a binary tree with their associated ID. -Rounded number is the hash value This is best when there are many unique floating point numbers.
Hashes of non-leaf nodes combine the hashes of their children expressions, together with the node. Equation (3) shows how the hash function of a vertex v is computed [30], where h is the hash function and c 0 , . . . , c n are the children of the vertex v.
LinearExpression( · ) Fig. 5 ExpressionDict implementation. Floating point coefficients make building a dictionary of expressions and values (in this case Bound objects) challenging. SUSPECT uses a hash function that ignores coefficients to insert similar expressions in the same bucket, then compares each expression for equality Figure 5 represents the expression hash map. SUSPECT inserts an expression by hashing the expression into a bucket index and then iterating through the bucket to check for the expression, otherwise SUSPECT appends it to the list. The Python built-in hash map manages the buckets. For lookups, SUSPECT hashes the expression for the bucket and then searches linearly through the bucket. This method matches some equivalent expressions, e.g. reordering a summation x 0 +x 1 to x 1 +x 0 . Since SUSPECT represents linear expressions as a summation of variables with the associated coefficients, the expressions x 0 + x 1 and x 0 + x 2 + x 3 would be represented by different linear expression vertices. SUSPECT could be improved by matching polynomials, e.g.
SUSPECT builds the DAG by: (i) associating Pyomo expressions to SUSPECT expressions, (ii) adding all problem variables, and (iii) visiting all Pyomo constraint and objective trees. For each Pyomo tree, SUSPECT walks from the leaves (variables and constants) to the root (functions) and modifies the DAG when it finds new subexpressions. This process terminates with a DAG equivalent to the Pyomo model.

Tree-walking rules: forward and backward visitors
The rules detecting monotonicity and convexity are naturally expressed recursively. The SUSPECT implementation avoids recursion because it can lead to stack overflows. To overcome this challenge, we devise a strategy to compute recursive rules without recursion. A node's vertex depth is greater than the depth of its children, so visiting nodes in ascending order: (i) visits a node after visiting its children and (ii) visits a node at most once. The problem DAG methods that visit vertices take two parameters: a visitor object (a subclass of ForwardVisitor or BackwardVisitor) and a context. The visitor is called on a node while the context shares information between visitors. SUSPECT uses this context to store information, e.g. bounds, polynomiality, monotonicity, and convexity.
The visitor classes implement the register_handlers and handle_result methods. The first method returns a dictionary associating expression types to a callback. SUSPECT calls the second after each vertex visit to update special structure information. The callbacks registered in the visitor are called with two parameters: the current expression and the context object. Callbacks can use the context to lookup information about other vertices, e.g. children in forward mode and parents in backward mode. To reduce computation time in forward (backward) mode, SUSPECT only visits nodes if any of its children (parents) changed in the previous iteration. To use this feature, implementations return a Boolean from the handle_result method to indicate if the vertex value was updated or not. If the vertex value changes, then SUSPECT adds the node parents (children) to the list of nodes to visit.

Bound tightening and propagation
DAGs are commonly used for bound propagation and tightening [47,55]. SUSPECT implements the bound propagation rules defined in Sect. 3 using the Sect. 5.3 treewalking rules. Since convergence time of the bound tightening step can be infinite [31], SUSPECT incorporates a maximum iteration parameter (default 10).
Floating point operations are subject to rounding errors [21] that are propagated in bounds tightening. To mitigate errors, SUSPECT algorithms work with the generic Bound interface. SUSPECT provides an implementation of this interface using arbitrary precision arithmetic [36], but users can supply their own implementation, e.g. using interval arithmetic [7]. Appendix D discusses the Bound interface.

Extending SUSPECT
SUSPECT may be extended to include other convexity structures detectors. SUSPECT already incorporates convexity detection for four important convex expressions: and concave if Q is negative semidefinite.
SUSPECT adopts a plugin-based architecture to enable third parties to extend its convexity detection capabilities without having to change SUSPECT itself. SUSPECT is distributed with the aforementioned convexity extensions. SUSPECT uses the numpy library, which provides Python wrappers around LAPACK dsyevd routines, to compute the Q matrix eigenvalues. Since dsyevd uses double precision floating point arithmetic, it may obtain incorrect results because of numerical issues. An alternative approach to determine whether Q is positive (negative) semidefinite is to perform two Cholesky decompositions with pivoting, this could also result in a more stable algorithm. As numpy does not currently implement Choelesky decomposition with pivoting, this method has not yet been implemented in SUSPECT.
Writing a new convexity detector requires the user to implement the detector as a class extending ConvexityDetector, and register it with SUSPECT. A convexity detector visitor is similar to the Sect. 5.3 visitors, but each callback returns a Convexity object. When SUSPECT visits a vertex and the built-in convexity detector is inconclusive, it will iterate over the registered detectors that can handle the expression and assign the convexity as the first conclusive result encountered.
SUSPECT uses Python's setuptools entry points to handle extensions registration. Entry points are a mechanism for the SUSPECT package to advertise extension points in its source. SUSPECT looks for extensions registered under the suspect.convexity_detection and suspect.monotonicity _detection groups for convexity and monotonicity detectors. Users extending SUSPECT need to package their extensions using setuptools and provide a setup.py file.

Example 3
Optimizing heat exchanger network design uses the reciprocal of log mean temperature difference (RecLMTD β ) with the limits defined: with variables x, y ∈ R + , and constant parameter β ≥ −1. RecLMTD β is concave if β = −1, strictly concave if −1 < β < 0, linear if β = 0, and convex if β > 0 [41]. Listings 1-3 in Appendix E contain the detector source code. The register_handlers method returns a callback for the expression with PowerExpression as root (case x = y) and one for the expression with DivisionExpression as root (case x = y). Both callbacks check whether the expression they are given matches the subgraph representing the RecLMTD β expression.
Listing 4 contains the setup.py source code necessary to register the new convexity detector with SUSPECT.

Numerical results
We consider the MINLPLib 2 dataset [53] which, when we accessed it on 13th December 2017, contained 1527 problems. We conducted the tests as a batch job on a cluster of Amazon AWS C5 instances. Each problem was assigned a single instance to run and each instance is equivalent to 2 cores of a 3.0 GHz Intel Xeon processor with 4 GB memory. To increase reproducibility, the SUSPECT command line tool was packaged as a Docker image running Python 3.5. The results shown in this manuscript were generated using the version of SUSPECT with Git tag v7, dated 18th of May 2018. Together with SUSPECT, we used Pyomo version 5.2. The problems are expected to be in the OSiL [16] format and we use the binary tree strategy for hashing floating point constants. Figure 6 shows the time (in CPU seconds) needed to run the special feature detection algorithms. The horizontal line marks the percentage of instances SUSPECT can read and start processing. Recall that SUSPECT can manage the Fig. 4 functions. SUSPECT fails at reading 135 (9%) of the 1527 instances because of unsupported nonlinear functions, e.g. the error function, or because the problem size is too big and SUSPECT time outs after more than 5 min converting from Pyomo to SUSPECT. SUSPECT processes 1003 (72%) of the remaining 1392 instances instances in under 2 s. This low computational overhead is due to the strength of tree-walking algorithms. Using the MINLPLib structure information as an oracle and considering the 1392 instances SUSPECT processes, our results matched the expected results in 1275 cases (91.6%). In 35 cases (2%), SUSPECT times out or encounters an exception while performing special structure detection. Table 15 of Appendix F lists the instances where SUSPECT did not detect the convexity properties.
SUSPECT does not have false positives on the test set, this gives confidence in the implementation and makes it possible to use it for solver selection. The johnall problem is labeled as having nonlinear objective function by MINLPLib but SUSPECT detects it as having polynomial objective type. The lip objective function is detected as concave because it is a convex maximization problem. Of the remaining 82 problems not recognized as convex (or concave), 53 fail because of numerical errors when computing the eigenvalues of the quadratic expressions. These problems are marked with a dagger † in Table 15. The trim loss minimization problems (tls{2, 4, 5, 6, 7, 12}) are convex but SUSPECT fails at detecting it because they contain the square root of the product of two nonnegative variables. The problem st_e17 contains the constraint x 1 − (0.2458x 2 1 )/x 2 ≥ 6 that SUSPECT fails to recognize as convex.

Conclusions and future work
This manuscript presents SUSPECT, a tool detecting and reporting special structure in optimization problems formulated using the Pyomo algebraic modeling library. The project can be developed further in different directions. On the convexity detection side, we can add an additional step of convexity disproving if the convexity detection is inconclusive. We can add the ability to detect nonconvex objective functions, e.g. the difference of two convex expressions [2,50] to take advantage of new algorithms [52] specialized in solving this class of problems or to recognize pooling problem structure [11]. We can improve the bound tightening step using Optimization Based Bound Tightening, for example using one of the recent implementations [20] that started to emerge. It would be easy to extend SUSPECT to work with other algebraic modeling languages since the special structure detection algorithms work on SUSPECT's own DAG or to extend the family of expression forms known [39]. SUSPECT could be integrated with existing nonlinear solvers to detect the problem type and choose the best optimization algorithm for it.