Human-centred feasibility restoration in practice

Decision systems for solving real-world combinatorial problems must be able to report infeasibility in such a way that users can understand the reasons behind it, and determine how to modify the problem to restore feasibility. Current methods mainly focus on reporting one or more subsets of the problem constraints that cause infeasibility. Methods that also show users how to restore feasibility tend to be less flexible and/or problem-dependent. We describe a problem-independent approach to feasibility restoration that combines existing techniques from the literature in novel ways to yield meaningful, useful, practical, and flexible user support. We evaluated the resulting framework on three real-world applications and conducted a qualitative expert user study with participants from different application domains.


Introduction
Finding (high quality) solutions to combinatorial problems is important for our society. This has fuelled research into technologies to model and solve these problems, many of which are now used in decision systems deployed by businesses such as Amazon, Google, and HP.
An important but less researched aspect of these systems is their interaction with human users, particularly when reporting infeasibility created by errors or "what-if" scenarios. While users do need information to restore feasibility, it is not obvious what information is best. Current research has mainly focused on finding subsets of the problem constraints responsible for the infeasibility. This has yielded interesting subsets, such as Minimal Unsatisfiable Sets (MUS) and Minimal Correction Subsets (MCS) [1], and enumeration methods to compute them efficiently (e.g., [2-6]). See [7] for many applications of these subsets, ranging from signal processing and machine learning to improving Benders decomposition of mixed-integer linear programs, using exact/heuristic and autonomous/human-aided computations.
✉ Ilankaikone Senthooran (ilankaikone.senthooran@monash.edu). Extended author information is available on the last page of the article.
While enumeration methods are a great starting point for explaining infeasibility to users, a straightforward use of these methods is not suitable for real-world systems [8]. We have experienced this repeatedly, most recently in a system that finds high-quality 3D layouts for an industrial plant, where better quality solutions can save millions of dollars. We soon realised it is easy for users to create infeasible plants due to incorrect data (e.g., making the plant footprint too small for its equipment) and/or inconsistent constraints (e.g., setting object A to be on the ground and also on top of another object B). Since plants contain hundreds of pieces of equipment, multiple inconsistencies are easily introduced. Our attempts to use some of the constraint-independent enumeration methods available [4,5] to find and resolve these inconsistencies resulted in impractical waiting times and hundreds of MUSes, which overwhelmed the users both because of their number and because many of the constraints had been introduced automatically by the system and were thus unfamiliar to users. Further, these methods did not show how to restore feasibility. Our attempts to use methods that sacrificed generality for speed, and which reported the minimum changes needed to restore feasibility [7,9], resulted in users being given a very restricted set of choices.
The above experiences illustrate the need for user support that is meaningful, useful, practical, and flexible. We say support is meaningful if it expresses the selected constraint subsets in a way that is understandable to users. It is useful if it helps users determine not only what prevents the system from finding a solution, but also how to actually modify the data or constraints to eliminate the inconsistency. It is practical if it is fast enough for users, and it is flexible if it gives them choices regarding how to find and resolve infeasibility.
The few methods that address some of these four needs [10][11][12] tend to focus only on one of them and/or are problem-dependent. Two of our main contributions in this paper are (1) a problem-independent approach to feasibility restoration that combines existing qualitative and quantitative techniques in novel ways, yielding meaningful, useful, practical, and flexible user support, and (2) an evaluation of the trade-offs between practicality and flexibility for the conflict resolution alternatives offered by our approach, as well as their meaningfulness and usefulness, in the context of three real-world applications. In doing this, we also contribute (3) a method to quantify the violation of logical combinations of constraints, and (4) three problem-independent ways to visualise MUSes and interactively explore them, each providing a different context and a different focus for communicating conflicts to users. Importantly, we also (5) compare the meaningfulness and usefulness of these three visualisations by means of a qualitative expert user study with 10 participants from different application domains.
The current journal paper includes significant parts of our previously published conference paper [13] and extends it by adding Hydrogen Production to the other two real-world applications in contribution (2) above, and by adding contribution (5) which is based on a user study performed after the conference paper. Further, we now provide supplementary material detailing how to annotate a problem model and use our approach. Finally, as the three real-world applications used in this paper cannot be made publicly available, we have now annotated models of the well-known Sudoku and Knapsack problems, which are available at https://ialab.it.monash.edu/Human-Centred-Feasibility-Restoration-Examples/ for readers to explore our approach.
The rest of the paper is organised as follows. Section 2 provides the necessary background, including the concept of constraint problems and methods to find and enumerate interesting constraint subsets. Section 3 provides a motivating example which is used in Section 4 to explain the decisions we took when designing our approach. Section 5 provides an overview of the entire approach, Section 6 discusses its Soft Generator component, Section 7 its MUS Enumerator component, and Section 8 its Visualiser. Section 9 experimentally evaluates the trade-offs between practicality and flexibility of the conflict resolution alternatives offered by our approach in the context of three real-world applications. Section 10 describes and evaluates the results of our qualitative expert user study. Section 11 discusses related work. Lastly, Section 12 concludes the paper.

Background
Constraint basics Let vars(O) denote the set of variables of object O. A constraint satisfaction problem P = (X, D, C) consists of a set of variables X, a domain function D mapping each x ∈ X to the (possibly infinite) set of values Dx that forms its domain, and a set of constraints C such that vars(c) ⊆ X for each c ∈ C. For optimisation problems, we add to X the special variable o, called the objective, to be minimised (without loss of generality). Consider the constraint problem P = (X, D, C). An assignment of P over V ⊆ X is a set of equations of the form x = d, where x ∈ V and d ∈ Dx, with exactly one equation per variable in V. A constraint c ∈ C is a set of assignments over vars(c) ⊆ X, usually denoted by a formula. Assignment A of P over V ⊆ X satisfies c if and only if vars(c) ⊆ V and the projection of A over vars(c) (defined as {(x = d) | (x = d) ∈ A ∧ x ∈ vars(c)}) is a member of c. Constraint c ∈ C of P is satisfiable if and only if there is at least one assignment A of P over X that satisfies c, and is unsatisfiable otherwise. A set M ⊆ C of P is satisfiable if and only if there is at least one assignment A of P over X that satisfies every constraint c ∈ M, and is unsatisfiable otherwise. A solution of P is any assignment of P over X that makes C satisfiable. If P is an optimisation problem, an optimal solution of P is one that minimises the special variable o. Where the identity of P is clear, we will omit the "of P" part.
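As a concrete illustration of these definitions, the following Python sketch enumerates, by brute force, the assignments of a tiny P that satisfy every constraint in C. The problem, variable names, and constraints are invented for illustration only.

```python
from itertools import product

# A toy CSP in the paper's notation; all names here are illustrative.
X = ["x", "y"]
D = {"x": [1, 2, 3], "y": [1, 2, 3]}
C = [
    ("x_lt_y", lambda a: a["x"] < a["y"]),        # x < y
    ("sum_ge_4", lambda a: a["x"] + a["y"] >= 4)  # x + y >= 4
]

def solutions(X, D, C):
    # An assignment over X is a dict with exactly one value per variable;
    # a solution is an assignment satisfying every constraint in C.
    for values in product(*(D[x] for x in X)):
        a = dict(zip(X, values))
        if all(check(a) for _, check in C):
            yield a

print(list(solutions(X, D, C)))
```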
Explaining infeasibility Consider a constraint problem P = (X, D, C) whose set of constraints C is unsatisfiable. Subset M ⊆ C is a Minimal Unsatisfiable Set (MUS) of C iff M is unsatisfiable and removing any constraint from M makes it satisfiable; and is a Minimal Correction Subset (MCS) of C iff C \ M is satisfiable and adding any c ∈ M to C \ M makes it unsatisfiable. Every MCS is a hitting set of all MUSes (i.e., has a non-empty intersection with each MUS) and vice-versa. While removing a MUS from C might not make it satisfiable (C may have disjoint MUSes), removing one MCS from C does. Applications often have a subset B of C, called the set of background constraints, that should not appear in the computed subsets. Its complement C \ B is called the set of foreground constraints. MUSes and MCSes are redefined using B as follows. A minimal conflict of C given B is a subset M of the foreground constraints such that M ∪ B is unsatisfiable and, for any M′ ⊂ M, M′ ∪ B is satisfiable. A minimal relaxation of C given B is a subset M of the foreground constraints such that (C \ M) ∪ B is satisfiable and, for any M′ ⊂ M, (C \ M′) ∪ B is unsatisfiable. Herein we treat MUS and MCS as synonyms of minimal conflict and minimal relaxation, respectively.
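These definitions and the hitting-set duality can be checked by brute force on a toy instance. The sketch below is illustrative only (the constraint set is made up, and the enumeration would not scale beyond a handful of constraints): here c1 and c2 conflict, so {c1, c2} is the only MUS, and each MCS removes one of them.

```python
from itertools import product, combinations

# Toy unsatisfiable constraint set: c1 and c2 conflict.
D = {"x": [0, 1], "y": [0, 1]}
C = {
    "c1": lambda a: a["x"] == 0,
    "c2": lambda a: a["x"] == 1,
    "c3": lambda a: a["x"] <= a["y"],
}

def satisfiable(names):
    return any(all(C[n](dict(zip(D, v))) for n in names)
               for v in product(*D.values()))

def muses():
    # Smallest-first enumeration; pruning supersets guarantees minimality.
    found = []
    for k in range(1, len(C) + 1):
        for sub in combinations(sorted(C), k):
            if not satisfiable(sub) and not any(set(m) <= set(sub) for m in found):
                found.append(sub)
    return found

def mcses():
    # M is an MCS iff C \ M is satisfiable and no smaller removal works.
    found = []
    for k in range(len(C) + 1):
        for sub in combinations(sorted(C), k):
            rest = [n for n in sorted(C) if n not in sub]
            if satisfiable(rest) and not any(set(m) <= set(sub) for m in found):
                found.append(sub)
    return found

print(muses(), mcses())  # each MCS hits the single MUS {c1, c2}
```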

Connecting models to instances
We distinguish between a problem model, where the input data is described in terms of parameters, and a particular model instance, where the values of all parameters are added to the model. In terms of our notation, a model instance corresponds to a constraint problem P = (X, D, C), while the model does not include the data provided by D. Models are usually defined in a high-level language, such as JuMP [14] or MiniZinc [15], and their instances are compiled into a flat format, where loops are unrolled and constraints are transformed into formats suitable for the selected solver. Both the compiler and the user-selected solver may introduce new variables and constraints during the flattening/solving process. This makes it difficult to report any constraint subset (be it MUS or MCS) to the users in a meaningful way, since the flattened instance P′ = (X′, D′, C′) will have additional variables, associated values, and constraints. Our approach uses the MiniZinc toolchain, which assigns a unique identifier [5] to each variable x′ ∈ X′ and constraint c′ ∈ C′ in the flattened instance that links them to the part of the model's source code that generated them, that is, to either an x ∈ X or c ∈ C, or to the part of the code that generated the new variable x′ or new constraint c′.
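The link between flattened constraints and the model constructs that generated them can be pictured with a toy "flattening" step. The identifier format below is invented for illustration and is not MiniZinc's actual path scheme:

```python
# Toy "flattening": a looped model constraint is unrolled into flat
# constraints, each tagged with an identifier linking it back to the model
# construct that generated it. The id format is invented for illustration.
n = 3
model_constraint = "constraint forall(i in 1..n-1)(x[i] < x[i+1]);"

flat = [
    {"id": f"model.mzn|increasing|i={i}", "expr": f"x[{i}] < x[{i + 1}]"}
    for i in range(1, n)
]
for c in flat:
    print(c["id"], "->", c["expr"])
```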
Finding one or all MUSes Early techniques to find a MUS are based on linear deletion methods [2], where each constraint in unsatisfiable set C is tentatively removed from it, and is added back to C if its removal yields satisfiability. Once all constraints in the initial C are tested, those in the final C form a MUS. One of the most popular techniques is QuickXplain [3], which reduces the number of satisfiability checks needed by recursively splitting and reducing C to a MUS. These approaches use constraint solvers as satisfiability checkers, without taking into account any properties of C. Other approaches sacrifice such generality for speed by focusing on particular kinds of constraints, like those in linear programs [16,17], numerical constraint satisfaction problems [18], and Mixed Integer Programs (MIP) [2], where a MUS is called an Irreducible Inconsistent Subsystem (IIS). A detailed survey of MUS and MCS enumeration techniques can be found in [19,20]. Most MUS enumeration techniques gain speed by keeping track of the explored subsets to prune superseded ones. One of the most popular is MARCO [4], which avoids redundant checks of supersets/subsets of known unsatisfiable/satisfiable sets. When enumerating MUSes, our approach uses FindMUS [5], a MUS enumeration tool available for MiniZinc that extends MARCO by taking advantage of the hierarchical constraint structure present in MiniZinc models to speed up computation. When computing a single MUS, our approach makes use of the IIS computed by the state-of-the-art MIP solver Gurobi [21], and also of the ability of FindMUS to find a single MUS.
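The linear deletion method can be sketched in a few lines. The constraint representation and satisfiability checker below are toy stand-ins for a real solver (a single integer variable checked by enumeration):

```python
# Linear deletion method: tentatively remove each constraint; if the rest
# is still unsatisfiable the constraint is dropped for good, otherwise it
# is kept. The toy checker enumerates x in 0..5 instead of calling a solver.
def deletion_mus(constraints, satisfiable):
    core = list(constraints)
    for c in list(core):
        trial = [d for d in core if d is not c]
        if not satisfiable(trial):  # c is not needed for infeasibility
            core = trial
    return core

cons = [("x<=1", lambda x: x <= 1), ("x>=3", lambda x: x >= 3),
        ("x<=4", lambda x: x <= 4), ("x>=0", lambda x: x >= 0)]
sat = lambda cs: any(all(p(x) for _, p in cs) for x in range(6))

print([name for name, _ in deletion_mus(cons, sat)])
```

Here the conflict between x ≤ 1 and x ≥ 3 survives, while the redundant bounds are discarded.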
Finding one or all MCSes The aim when enumerating MUSes is often to compute one or more MCSes. Efficient techniques exist to compute MCSes directly (e.g., [1,6,22]), rather than as hitting sets of MUSes. Interestingly, MCSes discovered during MUS enumeration can also be used to speed up the enumeration process [23]. Our approach uses a method based on [9] to quickly compute a single MCS. This is because all the problems we modelled in our real-world applications were, at their core, MIP models, and thus easy to adapt for [9] with the approach described in Section 6. To increase flexibility, we also allow users to select their preferred MCS based on the MUSes enumerated.

Motivating example: plant layout
The equipment allocation phase of the Plant Layout system of [24] finds the 3D position coordinates and orientations of the specified equipment within a given container space that (1) satisfy distance, maintenance, and alignment constraints, and (2) minimise the costs of the plant's footprint, of the supporting equipment, and of a Manhattan approximation of the connecting pipes. Figure 1 shows the main phases supported by our approach: once the creation of an initial Plant Layout instance (Fig. 1(a)) is completed in the Input phase, users can either view a solution (rendered in 3D in the Output phase of Fig. 1(d)) if the instance is satisfiable, or can otherwise restore the instance's feasibility by following the Review and Repair phases.
Data: During the Input phase, the user interface (UI, Fig. 1(a)) provides a predefined palette of equipment templates, where each template belongs to a class (e.g., heat exchangers or pumps) and each class has a set of associated constraints. Users can drag and drop equipment from the palette onto the canvas to describe the plant. They can modify the equipment dimensions, alter the positions of the nozzles where the pipes attach, and connect equipment via pipes. All input data constraints are added to the set of foreground constraints, except for the equipment dimensions and nozzles which are assumed to be correct and thus added to the set of background constraints.
Underlying constraints: Every piece of equipment is automatically constrained to be within the container space; not overlap with any other equipment; be positioned in one of four possible orientations; satisfy the min/max distances associated to its class; and satisfy any maintenance access constraints associated to its class (e.g., needs truck access or cannot have equipment below). In addition, some combinations of equipment have extra constraints (e.g., heat exchangers must be symmetrically positioned w.r.t. their connecting vessel). Finally, the model has redundant constraints to speed up solving.
User constraints: The UI also allows users to add, remove and modify some of the underlying constraints. In particular, users can modify the min/max distances between equipment classes; add constraints on the relative position/distance of any two objects; add/delete/modify maintenance access for any equipment; ensure objects are positioned within/at a given area/point/boundary; provide upper bounds to the container space dimensions; and add group size constraints forcing selected objects to be within a given sized box.
Internal representation: The optimisation model internally treats equipment as boxes, thus ignoring their exact form. It also models maintenance access zones as boxes attached to equipment in rigid/rotatable form, thus providing access to one or any side of the equipment (e.g., the blue, green, and yellow boxes in Fig. 1(d) are all maintenance access zones). The position of each box is modelled primarily by its front-left-bottom corner coordinates and its orientation. However, many other auxiliary decision variables are used to make it easier to express certain constraints.
Restoring feasibility: It is easy for user constraints to cause infeasibility. Thus, all user constraints are added to the set of foreground constraints. For example, the min/max distances or the absolute/relative positioning of equipment can conflict with the dimensions of the container space or make it impossible for equipment to be symmetrically positioned. Figure 1(b) shows the Review phase with the three (problem-independent) ways our system helps users understand and explore the infeasible instance. For example, Fig. 1(b.2) shows the format used by our system to display the constraints in all MUSes and allow users to select an MCS (see Section 8 for details). This MCS is then used by the system to find restoration values for its constraints. Users can then modify these values in the UI during the Repair phase (Fig. 1(c)) and restart the process to get a solution in the Output phase (Fig. 1(d)).

Design decisions
As mentioned before, our aim is to support users not only in understanding the reasons for infeasibility, but also in recovering from it, and to do so in a meaningful, useful, practical, and flexible way. As with most multi-objective optimisation problems, these objectives often conflict. For example, in order to be flexible we would like to find all possible MCSes (or minimal relaxation subsets). However, as well as being potentially impractical, this may also overwhelm users, thus affecting meaningfulness. The following is a brief summary of our key design decisions.
To be meaningful we use MiniZinc's capability to (a) annotate the model constraints with meaningful names, and (b) connect the variables and constraints seen by the underlying solver (i.e., those in the flattened instance) to those appearing in the problem model. This allows us to present information to the users at the same level of abstraction used in the UI for constraint specification, independently of the underlying technology (see, for example, the constraint names "group size" and "maximum distance" in the infeasibility set reported by all three images in Fig. 1(b)).
To be useful we combine the detection of infeasible sets with a conflict resolution technique where slack variables are added to user constraints, transforming them into soft constraints [9]. This allows us to tell users not only which constraint subsets are infeasible, but also how the value of their variables can be modified to make them feasible.
To be practical we support both time-intensive methods that aim to enumerate all minimal/maximal infeasible/feasible sets, and fast methods that aim to report a single set, possibly not minimal/maximal. Further, by iteratively combining these methods we are able to provide feedback about infeasibility quickly, while also generating sets that are useful to users.
To be flexible we show users the different ways in which the methods supported by our approach can be executed, and allow them to decide what they want reported and how long they are willing to wait for it.


Approach overview
Figure 2 shows our approach as a sequence diagram, where colours correspond to the different restoration methods users can select, and the solid/dashed outline of the arrows indicates the right/left direction of the sequence. To help put this diagram into context, the four main phases introduced in Fig. 1 are shown in the bottom right. The first 4 steps (in green) are always followed: the user describes the problem during the Input phase via a UI (step 1), which sends this information to the instance generator (step 2) to generate a MiniZinc instance (step 3) that is then flattened by the MiniZinc compiler and sent to the selected solver (step 4). If a (possibly optimal) solution is found, it is sent back to the UI and presented to the user in the Output phase (steps 5 and 6). Otherwise, two paths (blue/red) can be selected to enter the Review phase.
In the blue path the solver engages our soft generator (step 7). This generates a soft instance where the user constraints are relaxed by means of slack variables aimed at quantifying infeasibility (step 9). This soft instance is then flattened and sent to the solver (step 10) for a soft solution. Users who want a fast, though potentially incomplete, explanation can request this soft solution (steps 11 and 12), use the values of the slack variables to modify the problem, and then restart the process at step 1. In the red path the solver engages the MUS enumerator either directly after detecting inconsistency (step 8), or after users request that the soft solution obtained in the blue path be sent to the MUS enumerator to speed up its enumeration (step 13). In both cases, the enumeration often requires several calls to a solver (step 14). The enumerated MUSes are then sent to the UI (step 15) and presented to the users for their review (step 16). Users can then select their preferred MCS (step 17) and trigger a Repair phase (step 18), where slack variables are added only to the selected MCS constraints (step 19). The resulting soft MCS instance is then flattened (step 20) and solved (step 21). The MCS solution is presented to the users (step 22), who can go back to the Input phase (step 1) to make any required changes to the original problem and restart the process.

Soft generator
Our soft generator has two main goals. The first one is to quickly identify either one MCS or the constraints in the MUSes of one MCS. The former greatly reduces the number of constraints users need to modify, but locks them into the constraints of that MCS. The latter allows users to select their preferred MCS from the constraints in the MUSes, but may overwhelm users due to large numbers of constraints. Thus, the former is more meaningful, while the latter is more flexible. The second goal is to quantify the minimum changes required to restore feasibility [25].
To achieve these two goals, we modify the infeasible instance based on [9] in two ways. First, all user constraints are relaxed by introducing slack variables, whose values provide the required quantification for our two goals (and must be ≥ 0). Second, the original objective function is replaced by one that always minimises the total value of the slack variables (second goal) but can either minimise the number of slack variables with a positive value (yielding one MCS), or not (yielding the constraints in the MUSes of at least one MCS). For our motivating example (Plant Layout), the violation measured by the slack variables corresponds to length units, and the new objective function uses them without any scaling (i.e., the unit of violation is constant, e.g., 1 mm) and without any weights. Note that weights can be useful for users to express a priori preferences between types of constraints by quantifying the relative importance of their slack variables. However, none of the users in our three real-world applications have provided such a priori preferences. Note also that, as discussed later in Section 8, users can always control (though not quantify) preferences a posteriori (step 17) by switching constraint types on and off.

Relaxing constraints via slack variables
Let a, x ∈ Rⁿ be vectors of coefficients and variables, respectively, and d ∈ R a constant. Linear inequality aᵀx ≤ d is relaxed in [9] as aᵀx ≤ d + sω, where ω is the index of the constraint and, thus, of the slack variable sω. Since all slack variables must be ≥ 0, linear equality aᵀx = d is relaxed in [9] using two inequalities, aᵀx ≤ d + s⁺ω and aᵀx ≥ d − s⁻ω, which increases the number of constraints w.r.t. the original model.
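For a fixed candidate assignment, the smallest slack each relaxed constraint requires can be computed directly. This is a minimal sketch of the relaxation scheme; in the actual formulation the slacks are decision variables minimised by the solver:

```python
# Smallest slack making a relaxed linear constraint hold for given values.
def slack_needed(a, x, d):
    """Smallest s >= 0 such that sum(ai*xi) <= d + s."""
    lhs = sum(ai * xi for ai, xi in zip(a, x))
    return max(0.0, lhs - d)

def equality_slacks(a, x, d):
    """(s_plus, s_minus) for the two inequalities relaxing a'x = d."""
    lhs = sum(ai * xi for ai, xi in zip(a, x))
    return max(0.0, lhs - d), max(0.0, d - lhs)

print(slack_needed([1, 1], [4, 3], 5))  # 4 + 3 exceeds 5 by 2
print(equality_slacks([1], [2], 5))     # 2 falls short of 5 by 3
```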
Combinatorial constraints are not tackled in [9]. To help fill this gap, we define here a general method by modifying MiniZinc's linearisation mechanism for MIP solvers [26]. Let us show how we do this for logic constraints to illustrate the general method. Consider a logic constraint, such as a disjunction of inequalities. MiniZinc linearises logic constraints in three steps [26]. First, each linear component (say aᵀx ≤ d) is turned into an indicator constraint b = 1 → aᵀx ≤ d, where b is an auxiliary 0/1 variable and → denotes logical implication. Second, each indicator constraint is turned into the big-M constraint aᵀx ≤ d + M(1 − b), where M is an upper bound for aᵀx − d. Finally, the original logic constraint is translated into Boolean arithmetic. Our method modifies step two to relax the indicator constraints by introducing a slack variable into the right-hand side: b = 1 → aᵀx ≤ d + sω. This quantifies the infeasibility.
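The relaxed big-M form can be checked for concrete values as follows. This is a toy evaluation, not MiniZinc's actual linearisation code:

```python
# Evaluating the relaxed big-M form a'x <= d + M(1 - b) + s for concrete
# values; illustrative only.
def bigm_relaxed_holds(a, x, d, M, b, s):
    lhs = sum(ai * xi for ai, xi in zip(a, x))
    return lhs <= d + M * (1 - b) + s

# With the indicator active (b = 1), x = 7 violates x <= 5 unless s >= 2;
# with b = 0, the big-M term switches the constraint off entirely.
print(bigm_relaxed_holds([1], [7], 5, 100, 1, 2))
print(bigm_relaxed_holds([1], [7], 5, 100, 1, 1))
print(bigm_relaxed_holds([1], [7], 5, 100, 0, 0))
```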
Similar techniques can be applied to other constraints in a MiniZinc model. However, automatically relaxing arbitrary constraints is complex, particularly when the aim of the relaxation is to provide users with useful insight into how to modify the input data to restore feasibility. For example, one could transform ("soften") a given global constraint to measure the number of its variables that have to take different values to satisfy the constraint, or the number of violated constraints in its "flattening" [27,28]. While such transformations are useful for driving the search and for fast solver propagation, they might not be the most useful for users trying to recover from infeasibility. Thus, the choice of transformation depends not only on the constraint itself, but also on what the transformation is aiming to achieve. It may even depend on the particular constraint problem being modelled. Exploring the automatic relaxation of arbitrary constraints is one of our future goals, possibly by extending the MiniBrass [29] system, which can be used with MiniZinc and provides soft implementations for some global constraints.

Replacing the objective function
Quick MCS: In order to identify one MCS quickly, we generate the following lexicographic two-objective function, which first minimises the total number of slack variables with a positive value (yielding a minimum-cardinality MCS), and then the total magnitude of slack violation [7] (ensuring the changes needed to restore feasibility via that MCS are minimal):

lexmin (f1, f2), where f1 = Σω bω and f2 = Σω sω, (1)

where sω is the slack variable introduced for constraint ω, and each bω is a new 0/1 variable defined by the indicator constraint bω = 0 → sω ≤ 0 (recall sω ≥ 0). Thanks to f1, the solution returned minimises the number of constraints that must be violated to get a soft solution to the original problem (those for which the slack variables are positive). This identifies one MCS (in fact, a minimum-cardinality MCS), which was our first goal. Thanks to f2, the solution also quantifies the minimum total change required for the values of the slack variables in the violated constraints, which was our second goal. The image on the left in Fig. 3 provides a visual representation of this approach, where each of the pale columns represents a relaxed constraint, and a dark coloured column within a pale one indicates the positive value of the associated slack variable (the shorter the dark column, the lower the value). Thus, the image on the left represents a soft solution where f1 = 3 and f2 is the sum of the values represented by the 3 darker columns.
Note that modifying objective function (1) to minimise only f1 is not ideal, as this can yield excessively large slack values. Similarly, minimising f2 alone is not ideal, since any reduction in the value of f2 will likely be achieved by "spreading" the load across more slack variables. This may not only fail to yield an MCS, but also generate so many positive slack variables as to overwhelm users. We could alternatively modify the objective function to first minimise f2 and then minimise f1. While this might reduce the number of slack variables, it is still unlikely to yield an MCS. This was indeed the case in our (limited) experiments, where the swapped objective function significantly increased the number of positive slacks.
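The effect of the quick-MCS objective can be simulated by brute force on a one-variable toy system: for each candidate value of x we compute the slack each relaxed constraint needs, then lexicographically minimise (number of positive slacks, total slack). The constraint names and candidate range are invented for illustration:

```python
# Brute-force simulation of the quick-MCS objective on a toy system.
cons = {                      # each maps a value of x to the slack it needs
    "x>=4": lambda x: max(0, 4 - x),
    "x<=1": lambda x: max(0, x - 1),
    "x<=6": lambda x: max(0, x - 6),
}

def quick_mcs(cons, candidates):
    best = None
    for x in candidates:
        slacks = {n: f(x) for n, f in cons.items()}
        f1 = sum(1 for s in slacks.values() if s > 0)  # positive slacks
        f2 = sum(slacks.values())                      # total violation
        if best is None or (f1, f2) < best[0]:
            best = ((f1, f2), x, [n for n, s in slacks.items() if s > 0])
    return best

(f1f2, x, mcs) = quick_mcs(cons, range(8))
print(x, mcs)  # the violated constraints form a minimum-cardinality MCS
```

At the optimum only one constraint is violated, and the single positive slack quantifies the minimal change needed to satisfy it.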
Quick MUS constraints: In order to quickly identify the constraints in the MUSes of at least one MCS, we generate the following alternative lexicographic two-objective function [9]:

lexmin (f1, f2), where f1 = maxω sω and f2 = Σω sω, (2)

where ω, sω, and f2 are as above, and f1 represents the maximum value of any slack variable. Due to the lexicographic order, which minimises f1 first, all the values summed up by f2 will be equal to or smaller than that of f1. The image on the right in Fig. 3 provides a visual representation of this approach, showing a soft solution where f1 = 8 is the value of the 3 leftmost dark columns (two blue, one orange) and f2 is the sum of the values represented by all 8 darker columns. Note that this method neither groups the constraints into MUSes nor identifies any MCS, thus speeding up the computation. Both MUSes and MCS(es) are implicit and, if grouping and/or identification is required, post-processing needs to be performed either by the user (step 12) or by the system (step 13).
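The quick-MUS-constraints objective (minimise the maximum slack, then the total) can be simulated the same way on a toy one-variable system. With the invented constraints below, whose only MUS is {x ≥ 4, x ≤ 1}, the minimax solution leaves both MUS members with positive slack, as intended:

```python
# Brute-force simulation of the quick-MUS-constraints objective.
cons = {
    "x>=4": lambda x: max(0, 4 - x),
    "x<=1": lambda x: max(0, x - 1),
    "x<=6": lambda x: max(0, x - 6),
}

def quick_mus_constraints(cons, candidates):
    best = None
    for x in candidates:
        slacks = {n: f(x) for n, f in cons.items()}
        key = (max(slacks.values()), sum(slacks.values()))  # (f1, f2)
        if best is None or key < best[0]:
            best = (key, x, [n for n, s in slacks.items() if s > 0])
    return best

(key, x, names) = quick_mus_constraints(cons, range(8))
print(x, names)  # both MUS members end up with positive slack
```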
Again, neither f1 nor f2 alone performs well: f1 may leave all slack variables positive (including those of non-MUS-members), while f2 can lead to arbitrary MUS-member slack variables being set to zero. Even joining both objectives in a linear combination can fail to produce a single MUS, thus reducing flexibility by reducing the amount of choice. Consider, for example, the relaxed constraints, where a MUS must have one of the last two constraints (which means s3 and s4 cannot both be 0).
Note that even the full objective (2) may miss some constraints of a MUS if they overlap with those of another MUS. This is illustrated in a different example shown in Fig. 4, where the lines denote five linear constraints 1-5 and the arrows denote their feasible directions, yielding three MUSes: {1, 2, 3}, {1, 4, 5}, and {1, 2, 5}, with constraint 1 appearing in all three. When minimising objective (2), the optimal soft solution is in the centre of triangle {1, 2, 5}. There, the slack variables of constraints 3 and 4 become zero and are thus disregarded, leaving positive slack variables only for constraints 1, 2, and 5 (those in the last MUS).
Note also that when using either of the two objectives above, the original objective (i.e., the one in the original problem model) is not used. This allows us to generate a satisfaction version of the model that can sometimes be much simpler, since some variables and constraints in the original model are defined only for their role in the objective (see, for example, Section 3.1 of [30]). This is also the case when enumerating MUSes, as detailed in the next section.

MUS enumerator
MUS enumeration often aims at computing one MCS. Computing it automatically can, however, prevent users from selecting the best MCS for their problem. To increase flexibility, our approach graphically displays the MUSes enumerated (see next section) in such a way that it is easy for users to identify different MCSes for those MUSes and select one. As mentioned before, our approach uses FindMUS [5], a MUS enumeration tool available for MiniZinc, that extends MARCO to take advantage of the hierarchical structure present in MiniZinc models. In addition, FindMUS keeps track of annotations provided for the variables, expressions, and constraints in the instance, and can add them to its JSON output. This makes it easier to integrate FindMUS with other tools such as UIs: if users can provide meaningful constraint and variable names in the application's UI, and these names can be added to the generated instance, then they can be used by FindMUS in its JSON output. This allows UIs to display the constraints in the MUSes with their assigned names, making the output more meaningful for users.
Fig. 4 An example of 5 constraints yielding 3 overlapping MUSes
The downside of such flexibility is practicality: MUS enumeration tools are often impractical for real-world systems due to the number of possible foreground constraint combinations. In practice, the number of MUSes can be reduced by removing any redundant constraints added by the system to speed up solving, and by moving to the set of background constraints (a) data constraints assumed to be correct, such as the dimensions of the equipment in Plant Layout, and (b) underlying constraints that cannot be modified by the user, such as the non-overlap constraints in the same problem. This is easily achieved in our approach by using MiniZinc annotations as follows. First, all constraints that should be in the set of foreground constraints are appropriately annotated in the MiniZinc model. Note that, thanks to the MiniZinc compiler, all annotations are carried onto the associated "flattened" constraints in the instance. Then, all constraints in the instance that were annotated as needing to be in the set of foreground constraints are automatically placed there by FindMUS, and all remaining constraints are placed in the set of background constraints. This also increases meaningfulness, as it focuses users on the constraints they can change (e.g., the size of a group), and reduces the number of MUSes pointing to the same problem (e.g., multiple individual objects in the group not fitting into the allocated space). See the supplementary material for a basic example of how to add such annotations to a MiniZinc model, how these annotations link flattened constraints back to user-specified constraints and the problem entities to which they apply (e.g., equipment in Plant Layout), and how to run FindMUS with this model.
To further increase the practicality of MUS enumeration tools we have explored three alternative avenues. The first avenue uses the solution to the soft instance generated by the soft generator to try to speed up the search for MUSes (step 13). To achieve this, the MUS enumerator collects in set Vars all non-slack variables that occur in at least one constraint with a positive slack value. It then partitions the original set of constraints into those constraints with at least one variable in Vars (which become the set of foreground constraints) and the remaining constraints (which become the set of background constraints). Intuitively, this focuses the MUSes on the variables that are directly involved in the infeasibility. For Plant Layout, Vars corresponds to the objects in the plant, and our system uses the annotations in the MiniZinc model to speed up the detection of constraints that involve at least one object in Vars.
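The slack-based partition described above can be sketched as follows; the data structures and function name are illustrative, as the actual implementation works on flattened MiniZinc constraints and their annotations.

```python
def partition_by_slack(constraints, slack_values):
    """Collect in Vars all non-slack variables occurring in at least one
    constraint whose slack variable took a positive value, then split the
    constraints into foreground (touching a variable in Vars) and
    background (the rest)."""
    vars_of_interest = set()
    for c in constraints:
        if slack_values.get(c["slack"], 0) > 0:
            vars_of_interest.update(c["vars"])
    foreground = [c for c in constraints if vars_of_interest & set(c["vars"])]
    background = [c for c in constraints if not vars_of_interest & set(c["vars"])]
    return foreground, background

# Toy instance: only constraint c2 was violated (its slack is positive),
# so every constraint mentioning x or y becomes foreground.
constraints = [
    {"name": "c1", "vars": ["x"], "slack": "s1"},
    {"name": "c2", "vars": ["x", "y"], "slack": "s2"},
    {"name": "c3", "vars": ["z"], "slack": "s3"},
]
foreground, background = partition_by_slack(constraints, {"s2": 1.0})
```

Here Vars plays the role described in the text: it restricts the subsequent MUS search to constraints touching variables directly involved in the infeasibility.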
The second avenue explores the use of the Irreducible Inconsistent Subsystem (IIS) efficiently computed by some MIP solvers (we use Gurobi [21]). To achieve this, the MUS enumerator is modified to start each search for a MUS by asking the solver for an IIS (which can be seen as part of step 14), with the aim of improving performance. As Gurobi can report whether the IIS is minimal or not, we have modified FindMUS to further shrink non-minimal subsets to a MUS before reporting them. The third avenue combines the two previous ones by allowing the MUS enumerator to take advantage of both soft generation and IIS.
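The shrinking step added to FindMUS can be sketched with the standard deletion-based algorithm. The interval "oracle" below stands in for the MIP solver and is purely illustrative:

```python
def satisfiable(cons):
    """Toy oracle: each constraint is an interval (lo, hi) on a single
    integer variable; the set is satisfiable iff the intervals intersect."""
    return max(c[0] for c in cons) <= min(c[1] for c in cons)

def shrink_to_mus(core):
    """Deletion-based shrinking: permanently drop any constraint whose
    removal keeps the set unsatisfiable; what remains is a MUS."""
    mus = list(core)
    for c in list(core):
        rest = [d for d in mus if d != c]
        if rest and not satisfiable(rest):
            mus = rest
    return mus

# A non-minimal unsatisfiable subset, as a solver-reported IIS might be:
core = [(5, 10), (0, 3), (0, 7), (6, 9)]
mus = shrink_to_mus(core)   # -> [(0, 3), (6, 9)]
```

Every proper subset of the returned set is satisfiable, so it is indeed minimal; the real implementation performs the same loop but with solver calls in place of the interval check.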

MUSes and MCS visualiser
As shown in Fig. 2, step 16 enables users to review the MUSes found by the MUS enumerator, and select a correction set (minimal or not, step 17) before executing steps 18-22 to obtain a solution that includes the slack variables in the constraints of the MCS. To help users during this selection, we have developed three problem-independent ways to visualise MUSes and interactively explore them, each providing a different context and a different focus for communicating conflicts to users. These visualisations can be seen in Fig. 1(b) applied to our motivating Plant Layout problem. They can also be seen in Fig. 5 applied to our Water Management problem (see Section 9.2 for details), which finds a monthly plan for moving water between reservoirs and releasing it to rivers that satisfies the capacity, flow and demand constraints of the water network. Finally, they can be seen in Fig. 6 applied to our Hydrogen Production problem (see Section 9.3 for details), which finds the locations and composition of a network of hydrogen production facilities that satisfy constraints such as demand and storage, while minimising total production cost.
The three visualisations make use of the JSON output of FindMUS which, as mentioned before, is based on the user-defined annotations for constraints defined in the MiniZinc model. In general, annotations are just strings. To retrieve useful information for the visualisation, we created a basic semicolon-separated string format that is problem-dependent and is assumed to contain all the required information for that constraint in a specific order. The string below shows a (somewhat) generalised example to communicate conflicts to the visualisation component.
As shown above, each string contains the type of the constraint, a comma-separated list of IDs for the entities in the model that are relevant to the constraint, the input value (if any), and a textual description of the constraint. The constraint type corresponds to one of the predefined constraint types agreed upon with the users. For example, Plant Layout has many constraint types such as "group size" or "maximum distance". The entity IDs are unique entity identifiers assigned by the system, which are mapped to user-defined labels (if given) to improve readability. In Plant Layout the entities correspond to equipment, groups, and pipes, with IDs mapped to labels such as "1V-1103" and "1C-1101". The input value (if any) is the value that has been set during data entry for that constraint, such as the value of the maximum distance between two entities. The information given by each string is then used to create the following three visualisations.
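As an illustration, the following shows a hypothetical annotation string in this format (constraint type; entity IDs; input value; description) together with a minimal parser; neither the string nor the field names are actual FindMUS output.

```python
def parse_annotation(s):
    """Split a semicolon-separated annotation string into its four fields.
    The entity-ID field is itself comma-separated; the input value may be
    empty for constraints without one."""
    ctype, ids, value, description = s.split(";")
    return {
        "type": ctype.strip(),
        "entities": [e.strip() for e in ids.split(",")],
        "value": value.strip() or None,
        "description": description.strip(),
    }

ann = parse_annotation(
    "maximum distance; 1V-1103, 1C-1101; 20; max distance between vessel and column"
)
# ann["type"] == "maximum distance"; ann["entities"] == ["1V-1103", "1C-1101"]
```

The fixed field order is what makes the format problem-dependent: each application must agree with its users on the constraint types and on what the value field means.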
The MUS list: The simplest interactive visualisation provides a list of all MUSes found by FindMUS (2 in Fig. 1(b), 5 in Fig. 5(a), and 8 in Fig. 6(a)) and is a beautified rendering of its direct output. Users can select a MUS in the list to see details of its conflicting constraints including a description, constraint type, IDs (or labels if provided) of its entities, and their input value. The user can select one (or more) constraints from the MUSes in the list to form a (possibly not minimal) correction set. The selected constraints can then be used by step 17 in the sequence diagram of Fig. 2.
The MUS graph: The MUS list visualisation is only meaningful when the number of MUSes and/or constraints in each MUS is small. To increase meaningfulness for large lists, we created a MUS graph representation (Fig. 1(b.2) for Plant Layout, Fig. 5(b) for Water Management and Fig. 6(b) for Hydrogen Production) that connects the constraints (coloured dots on the left-hand side) with their MUSes (green dots in the middle) and their entity IDs or labels (green dots on the right-hand side). The colour of each constraint dot identifies its type, as described by the legend appearing in the top-right of the image. Constraints that occur in the same MUSes are grouped in a rectangle, which is then linked to those MUSes via a green arc, significantly reducing the number of connections and visual clutter.
Users can interact with the MUS graph in two ways. First, if they select a constraint in the graph, the system highlights with a red frame the constraint and all MUSes linked to it (e.g., MUS 2 in Fig. 1(b) and MUSes 3 and 5 in Fig. 5(b)), indicating that all those MUSes will be resolved if the highlighted constraint, or any constraint in that rectangle, is relaxed. The system also highlights with a green frame any rectangle linked to the highlighted MUSes, indicating that selecting constraints in those rectangles is no longer required to resolve the MUSes. This helps users find a suitable MCS (achieved when all MUSes are highlighted), which can then be used to relax its constraints and find a solution (steps 18-22 in Fig. 2). The second way of interacting with the MUS graph is by deselecting one or more types of constraints in the legend, indicating the user would prefer not to modify them (e.g., the "minimum distance" constraint in Fig. 1(b) and "volume min" constraint in Fig. 5(b) are greyed out). This causes the system to remove those constraints from the left-hand side. Note that if the user deselects too many constraint types, the constraints of the remaining types might not be enough to resolve all existing MUSes. If so, those MUSes will have no connection to any constraint on the left, leaving them dangling. Such MUSes and their entities are coloured in light green to alert the user.
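The checks behind these interactions (whether the current selection resolves every MUS, and which MUSes become dangling after constraint types are deselected) amount to simple set operations; the sketch below uses illustrative names and toy data, not our system's actual code.

```python
def covers_all_muses(selected, muses):
    """A selection of constraints restores feasibility iff it hits every
    MUS, i.e., relaxing the selection breaks each conflict."""
    return all(selected & mus for mus in muses)

def dangling_muses(muses, disabled_types, ctype):
    """MUSes whose constraints all belong to deselected types can no
    longer be resolved; the UI colours them light green."""
    return [m for m in muses if all(ctype[c] in disabled_types for c in m)]

muses = [{"c1", "c2"}, {"c2", "c3"}, {"c4"}]
ctype = {"c1": "MnD", "c2": "MxD", "c3": "GrSz", "c4": "Sym"}

# {"c2", "c4"} hits all three MUSes, so it is a (not necessarily minimal)
# correction set; {"c1"} misses two of them.
ok = covers_all_muses({"c2", "c4"}, muses)
# Greying out "Sym" leaves MUS {"c4"} with no selectable constraint.
dangling = dangling_muses(muses, {"Sym"}, ctype)
```

Selecting a correction set is thus a hitting-set problem over the enumerated MUSes, which the visualisation lets users solve interactively rather than automatically.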
The conflict network: For a large range of constraint problems, the entities in the user constraints are part of some form of network that is meaningful to the user. For example, for Plant Layout this network is the connection graph between equipment (the nodes) via pipes (the edges) (Fig. 1(b.3)). The conflict network displays conflicts as visual annotations on this meaningful representation. This visualisation is two-layered. The first (or base) layer focuses on the topology of the network, that is, on the position of the nodes and edges of the network. If no topology is known for the problem, the visualiser simply applies a force-directed layout. Otherwise, the system assigns fixed positions to the nodes and can even render the nodes with a fixed embedding (positions on maps) if they are part of a geospatial problem (Fig. 5(c)).
The second layer renders each constraint that is part of a conflict as an edge between its entities, according to the constraint's type and its description in the MiniZinc annotation. Note that when we say conflict in the context of this network, we mean a MUS. We use the word conflict to emphasise the user (rather than the mathematical) point of view, since the connection with the MUSes is designed to be much less apparent in the network than in the MUS graph. For conflicts that reference only one entity, we draw a frame around the node in the same colour used for their constraint type(s) in the graph (see, for example, the volume max in the Tarago Reservoir of Fig. 5(c), and the maximum allowed processing units and off-peak period duration in Fig. 6(c)). If the conflict is between two entities, we use a colour-coded edge (see the minimum/maximum distance conflict between two equipment units in Fig. 1(b.3); the minimum transfer between Tarago Reservoir and main Demand/Supply collection in Fig. 5(c); and the disabled transportation link between supply and demand locations in Fig. 6(c)). Some conflicts involve multiple constraints between two entities, e.g., a conflict between the minimum and maximum allowed distance between two pieces of equipment. For these we draw a constraint edge for each conflicting constraint between the two nodes, as can be seen in Fig. 1(b.3) by the blue and green edges between two nodes.
The MUS graph and the conflict network are visually linked by a common colour-coding for constraint types. Further, the conflict network helps users understand which conflicts are resolved, and when a (possibly minimal) MCS has been selected in the MUS graph. This is achieved by ensuring that each MUS selected in the MUS graph is removed from the conflict network (recall that conflict and MUS are synonyms in this context), indicating to users that this conflict is "gone". If no more conflicting constraints are shown on the conflict network, then all conflicts have been selected for correction.

Experimental evaluation using real-world examples
This section evaluates the trade-offs between practicality and flexibility for the conflict resolution alternatives offered by our approach in the context of three different real-world applications.

Plant Layout
As introduced in Section 3, the Plant Layout problem is about finding the optimal spatial placement of equipment (boxes) and the best routes for connecting pipes. For this evaluation we will focus on the equipment placement part of the problem.
Benchmarks: We first used the Plant Layout UI to create six small infeasible plants designed to confirm our approach can resolve the infeasibility created by two or more constraints of the following six types: minimum distances (MnD), maximum distances (MxD), the symmetric placement of equipment and connecting pipes (Sym), the relative attachment position of a box (AttPos), insufficient group size for contained boxes (GrSz), and insufficient size for the container space (CtSp). These six small infeasible plants became our first six benchmarks (denoted S1 to S6) and have 2 to 5 boxes and no pipes (except for S2, which needs them for Sym). Once the effectiveness of our approach was successfully confirmed for these small benchmarks, we used the UI to create three new feasible plants: a medium plant (denoted M) with 19 boxes and 20 pipes; a large plant (L) with 78 boxes and 66 pipes; and an extra-large (XL) plant with 217 boxes and 207 pipes. We then created 8 variations of each of these three feasible plants by adding constraints of the above six types that made them infeasible. This resulted in the 30 infeasible benchmarks shown in Table 1 where, for example, M1 is a medium size benchmark built by making the medium plant infeasible due to a conflict between a minimum and a maximum distance constraint; and L5 is a large size benchmark built by making the large plant infeasible due to (multiple) conflicts that include symmetry, minimum and maximum distance constraints.
The first two columns in Table 1 show the benchmark size and name. The next four columns show the total number of constraints in the instances generated in step 4 when the benchmark is initially solved (denoted Initial); in the instances generated in step 10 when it is relaxed by the soft generator searching for the set of constraints either in an MCS (using equation (1); denoted SGC) or in its MUSes (using equation (2); denoted SGU), respectively; and in the instances generated in step 8 when calling FindMUS directly (denoted FM). Note that column Initial shows symbol "-" if the MiniZinc compiler detects infeasibility and aborts without generating anything. The remaining columns show the number of distinct conflicts we added to the benchmark, that is, the number of MUSes we expected to create when we modified the problem taking the role of users; the number of objects (boxes or pipes) involved in these conflicts; and the type of constraints in the conflicts. Note that the last variation in each of the medium, large, and extra-large plants (M8, L8 and XL8) contains all the conflicts from the first 4 variations in that plant.
Setup: All benchmarks were run in 8 different configurations: the three already introduced (SGC, SGU and FM) and the five combinations FM+IIS, SGC+FM, SGC+FM+IIS, SGU+FM and SGU+FM+IIS (where IIS denotes Gurobi's IIS). In addition, configurations involving FM were run in two modes: finding one or all MUSes. This is because FindMUS supports a fast single-MUS extraction mode that might also be useful for users. All runs were performed on an Intel Core i7-8700K (3.70 GHz, 12 cores, 12MB cache) with 32GB memory using MiniZinc 2.5.5, FindMUS, and Gurobi 9.0.1. Each instance was run on 2 cores, with 2 instances being run in parallel at a time. Timeouts for the soft generator (step 13) and for FindMUS (step 14) were both set at 30 minutes for small sizes, and at 2 hours and 4 hours, respectively, for the other sizes. If these timeouts are reached, SGC and SGU might return unnecessary constraints, while FM might return non-minimal or not all unsatisfiable subsets.
Results: Tables 2 and 3 show the results for each of the 8 configurations, with underlined values indicating a timeout was reached. In particular, Table 2 shows the number (#sl) and total value (slack) of positive slack variables returned by SGC and SGU, as well as the number of MUSes (#mus) returned by the other configurations (FM, FM+IIS, SGC+FM, SGC+FM+IIS, SGU+FM, and SGU+FM+IIS), with FM enumerating all MUSes. Table 3 shows the run times (minutes:seconds) taken by the 8 configurations with FM computing only one MUS (greyed) or all.
Discussion: There is no clear winner between SGC and SGU in terms of speed. Both time out only for the XL benchmarks that have symmetry conflicts, which is due to the symmetry implementation containing an absolute-value constraint that slows down solving. When there is no timeout, SGC always returns a minimum MCS, while SGU returns the constraints in the associated MUSes, though without partitioning them into MUSes. If SGU returns many constraints (e.g., XL8), they can become overwhelming and thus not meaningful. If it does not, its results can be more meaningful (as they may give more context to the failure) and more flexible (as they give more choice) than SGC's. However, if the MCS returned by SGC is meaningful enough, SGC is more useful, as it always provides the minimum number of constraints that must be changed. This already shows the value of having different approaches available.
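To make the distinction concrete, the sketch below computes a minimum-cardinality MCS by brute force over a toy interval-satisfiability oracle. This only illustrates what SGC guarantees; the actual soft generator instead optimises over slack variables in the relaxed MiniZinc instance (equation (1)).

```python
from itertools import combinations

def satisfiable(cons):
    """Toy oracle: constraints are intervals on one integer variable."""
    if not cons:
        return True
    return max(c[0] for c in cons) <= min(c[1] for c in cons)

def minimum_mcs(cons):
    """Smallest set of constraints whose removal restores feasibility,
    found by trying subsets in order of increasing cardinality."""
    for k in range(len(cons) + 1):
        for subset in combinations(cons, k):
            if satisfiable([c for c in cons if c not in subset]):
                return list(subset)

# (0, 3) conflicts with both (5, 9) and (8, 12), i.e. it occurs in two
# MUSes, so relaxing that single constraint resolves both conflicts.
cons = [(0, 3), (5, 9), (8, 12)]
mcs = minimum_mcs(cons)   # -> [(0, 3)]
```

In contrast, a relaxation driven by the constraints of the associated MUSes (as with SGU) may surface all three constraints rather than this single-constraint correction set, trading minimality for context.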
As expected, using FM to enumerate all MUSes is significantly slower than using SGU or SGC. However, if FM does not time out, it is more flexible than SGC (which gives a single MCS) and SGU (which might miss constraints), and more meaningful, as it provides the full context while partitioning constraints into MUSes (as opposed to SGU). Also, using FM to find a single MUS is usually fast. Users could repeatedly do this to restore instances with several MUSes.
The combination of FM with SGU is not as promising as expected: the possible reduction in flexibility due to only looking at the variables in the constraints returned by SGU only pays off with speed-ups for M3, M8 and XL1. The rest are slower. The combination with SGC is better in terms of speed-ups (M2, M3, M8, XL1 to XL4), but might lose MUSes (M2 and M8). However, SGU and SGC improve FM's usefulness by providing users with slack values.
These results, and our own experience as users, suggest the following strategies for Plant Layout. Less experienced users, i.e., those modifying an unfamiliar model, and those with enough time would benefit from using FM on its own and in combination with SGC to get results that are useful (via SGC's slacks), flexible (FM's MUSes) and meaningful (FM+SGC's MUS reduction). Experienced users are likely to find SGU's results useful, fast and (together with their knowledge and experience) meaningful enough to restore feasibility, and prefer that to getting more flexibility at the cost of longer run-times and possible timeouts for very large instances.

Water Management
Managing a city's water supply requires a complex set of decisions regarding the city's storage/service reservoirs, tunnels, and water-transfer pipelines. The work of [31] describes an optimisation system designed to do this by creating a plan that outlines the anticipated operations in the water supply system for years ahead, and identifies for each month the expected water to be sourced, stored, moved from one reservoir to another, or released to the rivers. The resulting operating plan is built to satisfy the water demand, environmental and network capacity constraints, while minimising the risk of uncontrolled releases from the water harvesting sites, and the cost of transferring water between reservoirs.

Data:
The UI provides a predefined water supply network and historical stream-flows, where users can set (a) the reservoir capacities, (b) the planning horizon length and, for each month, (c) the water demands, (d) min/max water levels per reservoir, and (e) min/max water flow per pipeline. In addition, for the selected planning horizon, users must specify the water levels of each reservoir at the beginning of the period and the anticipated stream-flow derived from past data. This allows users to generate operating plans for different rain inflow scenarios, storage distributions (both spatial and seasonal) and planning horizons.
Underlying constraints: Each reservoir is constrained to maintain its water level within the specified range for each month, and to release water to waterways to meet any specified environmental requirements. Each pipeline (i.e., the water-transfer link between reservoirs, from a reservoir to the city, or from a reservoir to the river) cannot transfer more water than its maximum capacity. No redundant constraints were added to the model.

User constraints: When generating an operating plan, users can modify the network capacity constraints (a, d and e mentioned under "Data" above) to consider events such as the closure of a pipeline or a reduction in the water level at a reservoir during a given period due to maintenance.
Restoring feasibility: The number of network capacity-related parameters exposed to users is high and increases with the planning horizon length. Thus, users can easily cause infeasibility by setting conflicting values. For example, setting a low maximum water level at a reservoir can result in excess water that needs to be transferred out, which then conflicts with the limit set on the relevant pipelines.

Experimental evaluation
Benchmarks: We built two classes of benchmarks: one with short-term operating plans (12-month) and one with long-term plans (60-month). All are built on the same distribution network, which has 11 reservoirs, 9 transfer nodes, and 46 connections. We then created seven benchmarks for each class by modifying operational parameters in ways that made the instances infeasible. To do this we selected conflicts caused by the following constraint types: minimum reservoir volume (MnV), maximum reservoir volume (MxV), minimum transfer volume (MnT), maximum transfer volume (MxT), minimum release volume (MnR), and maximum release volume (MxR). Table 4 shows the characteristics of the resulting 14 infeasible benchmarks. Columns 1-7 follow those of Table 1, while the last column shows the type of constraints in the conflicts. Tables 5 and 6 follow the same structure as Tables 2 and 3.
Discussion: There is no timeout in any of the instances and configurations tried. Interestingly, short-term instances have more slacks/MUSes than long-term ones. This is because the chosen conflicts have more impact in the short-term benchmarks and can be resolved in more ways than in long-term ones.
SGC is slightly faster than SGU (although this is only noticeable in the long-term benchmarks) and always returns a minimum-cardinality MCS. While SGU does not return a minimum-cardinality MCS in short-term instances, it does so in long-term ones, but only because the total amount of violation required by an MCS happens to be the minimum. Note that the number of constraints in the minimum-cardinality MCSes coincides with the number of conflicts we introduced in the benchmarks, confirming we introduced distinct (as opposed to redundant) conflicts. Both FM and FM+IIS enumerate all MUSes, but FM+IIS is faster. Out of the four combined approaches (SGC+FM, SGC+FM+IIS, SGU+FM, SGU+FM+IIS), SGC+FM+IIS is always fastest, and is even faster than FM+IIS for many short-term benchmarks. All four combinations produce the same number of MUSes for the same benchmark. While they often take longer to enumerate all MUSes than FM and FM+IIS, they produce fewer MUSes in some benchmarks (S2, S4, S6, S7).
Similarly to Plant Layout, using FM/FM+IIS to enumerate all MUSes is slower than using SGU and SGC. However, it is more flexible and provides the full context, especially with the visualiser, where users can easily see the connections between the conflicting constraints.
Based on the above results and our experience as users, we recommend using FM+IIS or SGC+FM+IIS for this problem, which we have already used to find conflicts for our industry partner very quickly. This might be surprising, as SGC is faster and always points to the correct reservoirs and/or connections. However, all configurations are fast and there are often multiple ways to resolve conflicts in this problem, one of which could be the desired way to fix them, and this could only be found by enumerating all the MUSes.

Hydrogen Production
The Hydrogen Production problem finds the number, location, and size of a set of hydrogen production facilities that (1) satisfy the demand, flow, storage, and turn-down constraints during a specified sequence of production periods, and (2) minimise their total production cost. A production period represents a given number of years where the demand is constant. The run-time of a production facility is the total number of years in the periods since the facility started its production.
A (simplified) hydrogen production facility is built from four types of components required to produce liquefied hydrogen: electrolysers, gas compressors, liquefaction units, and storage units. Each of these components has an associated number of configurations, which determine their specific flow per day or storage capacity. The size of the facilities is represented in a solution by the number of units of each configuration and component type.
Many factors need to be considered when selecting the locations and sizes of the production facilities, such as the distance to the customers (which influences transport costs), the available transport modes (pipeline, truck, train), the power grid availability and distance, the electricity costs, the capital expenditure (CAPEX) cost of purchasing the equipment, the operational expenditure (OPEX) cost of running the equipment, and so on. The model used in our experiments makes the following simplifications to strike a balance between complexity and speed: there is only one production facility per fixed region (state or territory), a production facility can supply many demand locations but a demand location can only be supplied by a single production facility, only a single transport mode with predetermined transport costs between locations is considered, and each supply location has a single electricity provider with an associated cost per production period, which is the price during the off-peak time of the day.

Underlying constraints: Each production facility must produce enough hydrogen to meet the total demand of the locations it supplies. The amount of hydrogen produced is only limited by the maximum allowed number of installed components (due to the flow/storage constraints of its configurations), and by the number of off-peak hours. To obtain the smallest configuration for each supply location, a constraint enforces the turn-down capacity of each component to be smaller than or equal to the required total production for all chosen demand locations. This ensures that the choice of the sizes of the available components is optimal, that is, not bigger than necessary.
Note that the objective function includes the sum of transport and electricity costs, which ensures the best number of facilities and their locations with respect to distance and production price, and the sum of CAPEX and OPEX costs, which enforces the minimal possible number of supply locations for this given problem. As a result, the objective function can decide it is cost-effective to build a bigger configuration sooner than is strictly needed, due to better CAPEX and OPEX costs for one larger component than for several smaller ones.

Explaining the data and model
User constraints: Users can change all aforementioned data values, including CAPEX and OPEX costs, transport costs, electricity costs, duration of the off-peak period, and turn-down capacities. Users can also disable specific supply-demand transportation links, thus stopping any production facility in a supply location from delivering to a specific demand location. Disabling all supply-demand transportation links for a given supply location is equivalent to forbidding the construction of production facilities in that location. This can be used to explore alternative solutions by, for example, forcing the system not to use supply locations that appeared in an already known optimal solution.
Restoring feasibility: Users can create infeasible instances via three different types of data/constraints that can make it impossible for the system to meet the total demand: data stating the maximum number of allowed components per production facility, data stating the number of off-peak hours, and constraints disabling supply-demand transportation links. Figure 6 shows the conflicts of an infeasible instance as a MUS list, as a MUS graph, and as a geospatial representation of the conflict network that uses maps and location markers for the base layer, overlaid with the conflict layer, which shows location-dependent conflicts (maximum number of allowed processing units and maximum duration of the off-peak period) as conflict nodes, and transportation link conflicts as edges between supply and demand locations (i.e., a state or territory and a capital city).

Experimental evaluation
Benchmarks: We created 4 feasible hydrogen production problems of increasing sizes: small (S) with 10 supply locations and 8 demand locations; medium (M) with 20 supply locations and 10 demand locations; large (L) with 30 supply locations and 14 demand locations; and extra-large (XL) with 60 supply locations and 30 demand locations. All problems have 2 periods of 10 years each, with the second period having increased demand.
We then created four variations of these four feasible hydrogen production problems that make them infeasible, as follows. As mentioned above, each demand location can only be supplied by a single supply location at any time. We used this limitation to set the required daily demand to a higher value than can be produced by any supply location (recall that supply is limited by the maximum allowed units and the off-peak hours). In particular, we created two variations of each size containing such conflicting constraints, with the only difference being that the first variation (used in S1-XL1) has all possible supply-demand transportation links allowed, while the second (used in S2-XL2) only has the minimum necessary transport links active, thus creating a one-to-one relationship. The third variation (used in S3-XL3) disables enough supply-demand transportation links to ensure not every demand location can receive its supply. The fourth variation (used in S4-XL4) combines the last two kinds of demand and transport-link conflicting constraints by ensuring one demand location requires more than can be supplied and another demand location cannot receive any supply. Table 7 shows the characteristics of the resulting 16 infeasible benchmarks. Columns 1-7 follow those of Table 1, while the last column shows the type of constraints in conflict.
Setup and Results: We used the same setup as for Plant Layout. The results, shown in Tables 8 and 9, follow the same structure as that of Tables 2 and 3 (where 0:01 represents a value in [0:01..0:015)).
Discussion: There is no timeout in any of the small, medium, and large benchmarks evaluated. However, XL4 leads to timeout when enumerating all the MUSes, because the introduced conflict and instance size allow resolving the underlying problem in many different (though similar) ways.
SGC is faster than SGU, but this is only noticeable in the extra-large benchmarks. SGC and SGU return the same number of constraints on all occasions, since the binary variables in the conflicting constraints have the same effect when minimising the number of conflicts as when minimising the quantity of the violation. Enumerating all MUSes using FM is faster than using FM+IIS. SGC+FM is the fastest of the combined approaches, although still slower than FM. The performance variation is only noticeable in the extra-large benchmarks.
There are a few anomalous results in Table 9, where finding a single MUS is shown to take longer than enumerating multiple MUSes. As mentioned above, when instructed to find a single MUS, FindMUS tries to improve performance by using a slightly modified algorithm that focuses the search on reducing the first unsatisfiable subset discovered to a MUS. While this approach typically results in faster extraction, it can lock the algorithm into exploring subsets that are expensive to test. As a result, there can be cases where the first confirmed MUS is discovered at different times by the two approaches.
For the Hydrogen Production problem, the results show that the performance of all approaches is quite similar up to large instances. However, FM seems to dominate over IIS when the problem size increases further.

Expert user study

We conducted a qualitative expert user study aimed at gathering feedback from optimisation and/or domain experts with which to evaluate the meaningfulness of the three conflict visualisations described above and their usefulness in selecting an MCS.
Participants: We invited 10 participants with diverse backgrounds: 5 from academia and 5 from industry. Of the 10 participants, there were 3 with optimisation expertise (developing optimisation methods and optimisation models), 2 with a background in Plant Layout, 2 with a background in Water Management, and 3 with a background in Hydrogen Production. All participants were experienced in creating optimisation models (optimisation experts) and/or populating models with data to create instances (domain experts).
Data: We created 18 infeasible instances across the three real-world domains (5 for Plant Layout, 5 for Water Management, and 8 for Hydrogen Production) based on the problem instances introduced in Sections 3, 9.2, and 9.3, respectively. The instance sizes are comparable to large for Plant Layout, short-term for Water Management, and small for Hydrogen Production. Note that the 18 instances do not represent toy examples but rather small real-world examples, e.g., we used part of a real-world plant for Plant Layout, and a real-world set of reservoirs for the Water Management problem (albeit with a shorter than usual planning horizon). The conflicts in these infeasible instances were created using the same constraint types used for the benchmarks in Section 9, except for pipe symmetry (Sym) and container size (CtSp) in Plant Layout. Sym was eliminated because it is an unusual constraint for Plant Layout (it has only been applied to one particular real-world plant) and was likely to confuse real-world process engineers. CtSp was eliminated because it is conceptually similar to the group size constraint.
Once the infeasible instances were created, we ran FindMUS beforehand to generate all MUSes for each instance (i.e., we ran FM in mode "all" for each instance). This was done to avoid idle time and keep user study sessions reasonably short (less than 1h). Since the aim of our study is to compare the three problem-independent ways of interactively visualising MUSes, obtaining the MUSes beforehand does not affect the study results. FindMUS obtained between 1 and 10 MUSes per instance (1-4 MUSes for Plant Layout, 1-10 MUSes for Water Management, 3-8 MUSes for Hydrogen Production), with varying numbers of constraints per MUS (8-16 for Plant Layout, 3-10 for Water Management, and 12-32 for Hydrogen Production). Note that the number of constraints per MUS is a result of the underlying model and the number of interacting constraints. The conflict results were then split into two groups per problem domain: small and large, depending on the number of MUSes found for that problem. Instances in the two groups were used at different points of the study, as detailed below.
Procedure: We conducted the user study sessions with individual participants online, using a video conferencing platform, due to the ongoing COVID-19 pandemic and the restrictions on face-to-face user studies. The real-world problem each participant worked with was the same for the entire session and was chosen as follows: domain experts were assigned the problem that matched their expertise, while optimisation experts were randomly assigned a problem from the subset of problems that were new to them. Each study session lasted between 45 and 55 minutes and was split into three parts:
1. Introduction: conflict detection/visualisation via slides (15 min)
2. Training: conflict visualisation and MCS selection via the UI (10 to 15 min)
3. Data collection: conflict exploration and MCS selection by participants (20 to 25 min)
The aim of the introduction was to expose the participant to the necessary basic concepts such as model, instance, conflict, MUS, MCS, visualisation, and resolution. This was achieved by the facilitator sharing their screen with the participant and going over a set of prepared slides that used the selected problem as an example. In particular, the slides started by showing the problem from the user's point of view (data, constraints and objective), from the modeller's point of view (model, data, instance, solver, solution) and highlighting the difficulties when infeasibility occurs. The slides then summarised the basics of our conflict resolution system, focusing on MUS enumeration and visualisation to build an MCS (steps 8, 14-17 in Section 5 and Fig. 2), and provided the basics of the different conflict visualisations. Finally, the details of the selected problem were presented to the participants (which were subsequently used for the training and data collection).
The aim of the training part of a session was to show participants how to use the conflict visualisations both to understand the actual conflict and to select an MCS. To achieve this, the facilitator again shared their screen with the participant to perform walkthroughs [32] across the different conflict visualisations and to explain how to interact with them. The length of this part varied depending on questions from participants and emerging discussions.
In the final (data collection) part of a session participants were tasked to explore the conflict visualisations on their own using two infeasible instances they had not yet seen and to select an MCS. They were presented with a conflict visualisation with a smaller number of MUSes first, followed by one with a larger number of MUSes. Some participants were also provided with a third instance if they had specific questions about the feasibility recovery process or a discussion evolved. Participants were asked to explore the conflict visualisations in a browser on their local machine while sharing their screen with the study facilitator. They were also asked to follow a Think Aloud protocol [33] during the exploration of the visualisations to allow the facilitator to follow their thought process and to understand how they interacted with the conflict visualisations. Both the screen sharing and the audio were recorded during this part, with participants being asked to turn off their camera. All participants consented to this.
After each study session participants were asked to complete a questionnaire by rating the following statements on a five-point Likert scale [34]:
S1: I understood the tasks set out for me.
S2: I had trouble completing the tasks assigned to me.
S3: I could understand how to use the application.
S4: It was easy to learn how to use the application.
S5: I could use the application well without making many mistakes.
S6: The way of interacting with the application is intuitive.
S7: This application is useful in finding the cause of infeasibility.
S8: This application is easy to use.
S9: I was satisfied in using the application.
Many of the statements were added to determine whether the user's performance was impaired by our explanation of the task (S1), its difficulty (S2), or by the UI itself (S4, S5, S9). In addition, S3, S6 and S8 were added to measure the ease of use and intuitiveness of the UI in more detail. While S7 is the only statement focused on usefulness, this topic was also frequently mentioned during Think Aloud and in the open-ended questions (see below).
Participants were also asked to indicate which of the three visualisations they would prefer, and which they would not prefer, if they could choose what was available to them. Note that our aim here was to establish preference, rather than the ranking established by questions such as best/worst. This is because, while there is always a last place in a ranking, it is possible for users to like all visualisations and thus answer "None" to "not prefer" (as was the case; see below).

Results: During the sessions all participants were able to understand the concepts and successfully completed their tasks (exploring the MUSes and selecting an MCS). Figure 7 provides the results of the questionnaire. As shown by Fig. 7(a), six statements received only positive ratings (i.e., "Strongly Agree" and "Agree", except for S2 where positive is "Strongly Disagree" and "Disagree"); another two (S5 and S8) included a "Neutral" rating; and only one (S3) included a "Disagree". Clearly, user performance was not impaired by the task or the UI, and most users found the UI intuitive and easy to use. All users agreed it was useful, with 7 out of 10 indicating strong agreement.
As shown by Fig. 7(b), two participants prefer none of the conflict visualisations, one participant prefers the MUSes only as a list, and most participants prefer a graphical representation of the MUSes either as the combination of graph and network (5 participants) or as a graph only (2 participants). Using the MUSes only as network is never the preferred visualisation and is the "not prefer" choice of two participants. One participant does not prefer the MUSes only as a graph and three participants do not prefer the MUSes only as a list. Four participants responded with "None" to the question which conflict visualisation they would not prefer, and no participant responded with "MUSes as graph and network".
Considering the preferences w.r.t. participants' expertise level provides additional interesting insights. Most of the domain experts prefer MUSes as graph and network (5 out of 7 domain experts) and answered "None" when asked which visualisation they would not prefer (4 out of the 7 domain experts). Optimisation experts, in contrast, tended to prefer the MUS graph by itself, and all of them selected the list as the visualisation they would not prefer.

Regarding the open-ended questions, most participants did not have any issues with the application while using it (P4, P5, P6, P7, P9: "It worked very well!", P10), P1 and P2 suggested improvements regarding user interaction and user feedback ("I want to see instantaneous feedback …"), and P3 and P8 experienced small issues with the user interaction ("I found some bugs …", "Small bugs with toggling on and off the graph nodes").
Almost every participant liked/enjoyed some feature while using the application with P2 even comparing it to a game (P1: "I like the graphic user interface, interactivity and ease of operation", P2: "It's like solving a puzzle", P3: "It was easy to use, and very intuitive", P4: "The colours for different constraint categories were good. I liked the different visualisations for the MUSes, [having] multiple options was good to reframe the feasibility restoration", P6: "Yes, this kind of debugging the error is interesting and useful for modellers", P7: "The graphical representation of the feasibility issues was very useful", P8: "Application gave an intuitive methodology to understand how the MUS works", P9: "The MUS graph seems very useful. In particular the grouping by constraints that appear in several MUSs-I don't think that's been done before …", P10: "The schematic depiction was very user friendly and made the program very intuitive").
Several participants also made suggestions for improvements such as changing the order of the conflict visualisations (P8: "I think presenting the graphical components (graph and network map) before the list would better assist the user's ability to first visualise constraints before going into further detail (list view)"); reviewing the overall colour scheme (P1: "Try to use 'red' as a warning, 'green' as problem solved", P4: "More a random thought, however I once had a colleague that was colour blind-a user such as this with colour-blindness may not benefit much from the MUSes as graph and network"); and extending the user interaction (P3: "… missing features (removing edges when a constraint type is deselected [in the network])", P9: "I would have liked to be able to go from the graph view to a particular constraint in the list view with a click, or the other way around").
Two participants provided additional comments. P3 suggested better user guidance for the MUS graph: "I think it should be made clear what the grouping of constraints meant in the graph view, I eventually worked it out in the session, all constraints which act the same with respect to the MUSes, but this should be made clearer …". P8, explaining the motivation for their answer to the question 'Which of the three approaches for feasibility restoration would you prefer?', said: "I answered this 'None of the above' as I think each provides a different view crucial [to] understanding the entire solution. Without all three I wouldn't be able to truly comprehend the solution".
Discussion: In general, participants were satisfied with the UI and the conflict visualisations, while providing good suggestions for improvements. The only negative rating ("Disagree") was given for S3, which also received three neutral ratings. This is not unexpected since the feasibility restoration with its different visualisations is an expert system requiring introductory training to be used by non-experts and, thus, it is difficult to use at first glance. This difficulty is also reflected by the single neutral rating for S5 (use without many mistakes) and for S8 (easy to use). However, all participants found the UI and conflict visualisations useful and intuitive.
The results also show a strong preference for a graphical representation of the conflicts, either as the MUS graph for optimisation experts or as the combination of MUS graph and network for domain experts. This difference might be due to the network visualisation being able to add geographical/space-related information that is more meaningful for domain experts (who have such information deeply ingrained in their problem thinking) than for optimisation experts. Also interesting is the fact that most domain experts said there is no visualisation they would prefer not to have, while all optimisation experts selected the list as the visualisation they did "not prefer". This might be due to the detailed descriptions and values provided by the list visualisation being more meaningful to domain experts than to optimisation experts, as the former are accustomed to thinking in the concrete terms of the problem, such as objects and values, while the latter are more accustomed to thinking in terms of problem abstractions, such as variables and constraints. While two participants indicated they have no preference for a visualisation, one of those two said this is because they consider all three conflict visualisations crucial for understanding. In summary, the participants could clearly see the value of the MUS graph, either by itself or in combination with the network, and participants other than optimisation experts felt that the list view might also have merit, depending on the experience of the user and their background in the particular application domain.
This summary is also supported by the verbal comments participants made during the sessions regarding the meaningfulness of the visualisations, i.e., that they present information in terms of problem concepts (P1: "I really like this concept actually, especially for someone who is not familiar with optimisation, for the user to understand", P7: [asked why they didn't use the network] "I know what's in the real system, so I was just using the graph … This one [the network], yeah, it wasn't really essential, this may be useful for someone who is less familiar with the system, someone new", P8: "I do like this detailed nature [of the list] because I have the context of the problem, if I didn't have the context of the problem I think this could be quite intimidating", P10: "For me to be honest it's easier to look at the graph, as a first time user, it gives me a better representation of the relationship between the equipment and what the issue may be", "So, I do think this [the list] is useful but maybe when I'm more familiar with the equipment and more familiar with what it's telling me").
Further comments made by participants support the usefulness of the visualisations in helping them understand the conflicts and select an MCS (P2: "Graph is definitely better … I just try to minimise the number of constraints I need to select … the list doesn't give me an overview", P3: "I guess as an expert on the problem set I think obviously I would want to be in this view [the network] … just looking at this would be enough [graph and network]", "And I'm not sure, this [the list] takes me far away from understanding of the problem", P4: "I guess, in my mind the nice thing about looking at it in this particular view [the list] is you can kind of see the descriptions all sitting there together …", P5: "I like the fact that you can jump between a table format and a graph format, it's quite useful", P6: "I think that [the overall concept] is very, very interesting", P8: "I really like that interactivity, I can see what is happening to this map", P9: "Basically, I find the visualisation very appealing." Then referring to the value of the visualisation showing constraints in a box covering all MUSes, "So intuitively, I think, I would say that there are a lot of constraints in it and it covers all of them [the MUSes] … And no matter, if I fix one of them, I've fixed the entire problem, so that seems like a good starting point to me" and "This whole structure is just completely lost when you look at the list").
A few participants also suggested improvements for the user interaction and the overall colour scheme when responding to Q1-Q4. Some participants also made similar comments during individual study sessions, in particular the colour scheme received some critical comments (P1: "I think only the colour coding thing that is my only concern", P8: "I think there is just a bit of colour scheme work to do between the links [edges in the MUS graph] and the dots [nodes in the MUS graph] just to make sure that they don't collide"). This is very valuable feedback that will help improve the conflict visualisations.

MCS Computation
As mentioned in Section 2 (Background), while there are many efficient approaches for enumerating MCSes in Boolean formulas (e.g., [1,6]), we decided to use the approach described in Section 6 to automatically extract one MCS when time is critical, and to let users select their own MCS from the enumerated MUSes when more time is available. A complementary approach, which we would like to explore in the future, is to present users with MCSes computed directly by MCS enumeration methods, as it is highly likely that different problems benefit more from different approaches. Further, as discussed in Section 6.1, automatically relaxing arbitrary constraints using slack variables is non-trivial for complex (e.g., global) constraints. Thus, it might be necessary to use different methods depending on whether the problem is modelled using mainly Boolean or linear constraints.
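To make the trade-off concrete, the linear-search style of MCS extraction found in the literature can be sketched in a few lines of Python. The brute-force satisfiability oracle, the toy domain, and the three constraints below are purely illustrative assumptions of ours, not part of our framework:

```python
from itertools import product

def satisfiable(constraints, domains):
    """Brute-force check over small finite domains (illustrative only)."""
    names = list(domains)
    for values in product(*(domains[n] for n in names)):
        assignment = dict(zip(names, values))
        if all(c(assignment) for c in constraints):
            return True
    return False

def linear_search_mcs(constraints, domains):
    """Greedily grow a satisfiable subset; the rejected constraints
    form a correction set whose complement is a maximal satisfiable
    subset, hence an MCS (minimal for inclusion)."""
    sat_subset, mcs = [], []
    for c in constraints:
        if satisfiable(sat_subset + [c], domains):
            sat_subset.append(c)
        else:
            mcs.append(c)
    return mcs

# Toy conflict: x must be >= 3, <= 1, and equal to 0 at once.
domains = {"x": range(6)}
constraints = [
    lambda a: a["x"] >= 3,   # c1
    lambda a: a["x"] <= 1,   # c2
    lambda a: a["x"] == 0,   # c3
]
mcs = linear_search_mcs(constraints, domains)
print(len(mcs))  # → 2 (c2 and c3 are rejected once c1 is kept)
```

Note that the result is minimal for inclusion but not necessarily of minimum cardinality: here removing c1 alone would also restore feasibility.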

Preferred subsets
The exponential number of MUSes and MCSes, together with the large number of constraints that appear in real-world constraint problems, has led to methods that allow users to express their preferences [3,35,36]. These methods obtain MUSes and MCSes built from lower-priority constraints, which can be violated to satisfy the more important ones. As mentioned in Section 6, a priori preferences can be represented as weights on the cost of different violations in the model. In addition, accounting for preferences can help speed up computation, as shown by MiniBrass [29], which generalises several constraint preference schemes by using partially ordered valuation structures, and implements them as a soft constraint modelling language. None of the users of our three applications have yet expressed any interest in defining hierarchies of preferences. However, we suspect this is because they have only needed to resolve conflicts caused by mistakes in the input data, rather than by "what-if" explorations that push particular scenarios to their limits. As we expect this will soon occur, we would like to assess the pros/cons of these methods on real-world problems and integrate the most suitable ones into our approach.
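As a concrete (toy) reading of weight-based a priori preferences, the following sketch brute-forces the assignment that minimises the total weight of violated soft constraints. The encoding of constraints as Python predicates, the domain, and the weights are our own illustrative choices, unrelated to MiniBrass or our transformed objective function:

```python
from itertools import product

def min_weighted_violation(soft, domains):
    """soft: list of (constraint, weight) pairs; return the assignment
    with the smallest total weight of violated constraints."""
    names = list(domains)
    best, best_cost = None, float("inf")
    for values in product(*(domains[n] for n in names)):
        a = dict(zip(names, values))
        cost = sum(w for c, w in soft if not c(a))
        if cost < best_cost:
            best, best_cost = a, cost
    return best, best_cost

# One high-priority constraint (weight 5) conflicts with two
# low-priority ones (weight 1 each).
domains = {"x": range(6)}
soft = [
    (lambda a: a["x"] >= 3, 5),
    (lambda a: a["x"] <= 1, 1),
    (lambda a: a["x"] == 0, 1),
]
best, cost = min_weighted_violation(soft, domains)
print(cost)  # → 2: violating the two cheap constraints wins
```

The high-priority constraint survives because violating the two low-priority ones is cheaper, which is exactly the behaviour a preference scheme is meant to induce.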

Human-understandable subsets
The automatic generation of human-understandable explanations of minimal unsatisfiable subsets in constraint programming models was first explored in [37]. The approach allowed users to define nested groupings of constraints and attach user-friendly descriptions to these groups. The descriptions could then be used in the explanations produced by a MUS enumeration system. The system would use the hierarchy of groups to simplify explanations (i.e., when all the children of a group are included in a MUS, the explanation includes just the description of the group itself instead of listing those of all the children). As discussed in Sections 7 and 8, we use a similar mechanism, MiniZinc annotations, to attach information to the constraints in a model, which can then be used to construct meaningful feedback for the users of a system.
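Our reading of this group-based simplification can be sketched as follows; the group name and constraint identifiers are hypothetical, and the real systems work on model annotations rather than Python dictionaries:

```python
def simplify_mus(mus, groups):
    """Replace a whole group of constraints by its description when
    every member of the group appears in the MUS."""
    mus = set(mus)
    explanation, covered = [], set()
    for description, members in groups.items():
        if set(members) <= mus:
            explanation.append(description)
            covered |= set(members)
    # Constraints not covered by any complete group stay verbatim.
    explanation += sorted(mus - covered)
    return explanation

# Hypothetical group and constraint names for illustration.
groups = {"equipment fits inside the footprint": {"c1", "c2", "c3"}}
mus = {"c1", "c2", "c3", "c9"}
print(simplify_mus(mus, groups))
# → ['equipment fits inside the footprint', 'c9']
```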
A more recent approach, which is also closer to ours, is that of [10], which uses a version of MARCO to iteratively construct minimal conflicts responsible for causing infeasibility, coupled with minimal relaxations that can be used to restore feasibility. Their minimal sets can be expressed in human-understandable language and at different levels of abstraction by using a powerset lattice of Boolean variables to represent each constraint in the foreground, which contains only those constraints that can be altered by parameters the users control. It is also the only work we know of that tries to eliminate redundancies from the subsets to produce more compact descriptions. In contrast, our approach is problem-independent, and its novel combination of enumeration and IIS based methods makes it more practical and flexible.

Finding many MUSes
As mentioned in Section 1, methods exist for finding the minimum set of changes needed to restore feasibility [7,9]. These methods produce MCSes or a small set of MUSes. A similar approach, which attempts to present a meaningful subset of MUSes to a user, is presented in [38]. Instead of enumerating all MUSes, the approach constructs (a) a subset of the MUSes of a problem that covers the constraints of an MCS, referred to as a representative explanation; and (b) a subset of the MUSes that ceases to be representative with the removal of any one MUS, referred to as a minimal representative explanation. The subset computed in (b) is thus a minimal set of MUSes that must be corrected to make the problem satisfiable. In our framework we provide several methods to compute MUS subsets, ranging from our soft-generator methods (SGU and SGC), to focused MUS enumeration based on these softened constraints, to full MUS enumeration, which is the most time intensive but gives the user more flexibility to choose which constraints they feel are most important. We have not implemented a method to compute the MUS subsets that form a minimal representative explanation in the current framework, but this could easily be done. The resulting subsets would complement well those obtained by the focused and full enumeration approaches.
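Under our reading of [38], shrinking an enumerated set of MUSes towards a minimal representative explanation could be sketched as a greedy pass. The MUSes and MCS below are toy data of our own; this is a sketch of the definition, not the algorithm of [38] itself:

```python
def minimal_representative(muses, mcs):
    """Drop MUSes one by one while every MCS constraint is still
    covered by at least one remaining MUS."""
    mcs = set(mcs)
    chosen = [set(m) for m in muses]
    for m in list(chosen):
        trial = [x for x in chosen if x is not m]
        if trial and mcs <= set().union(*trial):
            chosen = trial
    return chosen

# Toy data: three MUSes, and an MCS whose constraints are 'a' and 'c'.
muses = [{"a", "b"}, {"b", "c"}, {"a", "c"}]
print(minimal_representative(muses, {"a", "c"}))
```

Because coverage can only shrink when a MUS is dropped, a single pass yields an irredundant subset (here, the single MUS {'a', 'c'}), though not necessarily one of minimum cardinality.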

Conclusion
This paper addresses the need for decision systems to integrate conflict resolution approaches that are meaningful, useful, practical, and flexible for their target users. We propose a problem-independent conflict resolution approach that is meaningful: when infeasibility is detected, it shows which user constraints conflict with each other, describes the conflicts using problem concepts (Figs. 1, 5, and 6), and avoids overwhelming users by supporting diagnosis at different levels of detail and by letting users determine the amount of information shown (MCS/MUSes).
To be useful conflict resolution approaches must also show users how to resolve conflicts. We propose a combination of conflict resolution methods that not only identifies the key conflicting decisions, but also reveals the amount of change needed to resolve them. We do this by both reducing the set of infeasible constraints to a minimal set, and finding the smallest total relaxation of the constraints that can restore feasibility.
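The reduction to a minimal conflicting set mentioned above is, in spirit, the classic deletion filter for MUS extraction, sketched below. The brute-force oracle and toy constraints are illustrative assumptions of ours, not the solvers used in our framework:

```python
from itertools import product

def satisfiable(constraints, domains):
    """Brute-force check over small finite domains (illustrative only)."""
    names = list(domains)
    return any(
        all(c(dict(zip(names, values))) for c in constraints)
        for values in product(*(domains[n] for n in names))
    )

def deletion_mus(constraints, domains):
    """Remove each constraint in turn; keep it out whenever the rest
    is still unsatisfiable. The survivors form a MUS."""
    mus = list(constraints)
    for c in list(mus):
        trial = [x for x in mus if x is not c]
        if not satisfiable(trial, domains):
            mus = trial
    return mus

domains = {"x": range(6)}
c1 = lambda a: a["x"] >= 3
c2 = lambda a: a["x"] <= 1
c3 = lambda a: a["x"] >= 0   # redundant: never part of the conflict
mus = deletion_mus([c1, c2, c3], domains)
print(len(mus))  # → 2 (only c1 and c2 remain)
```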
Theoretical approaches to detecting infeasibility do not scale to real-world applications, as finding all minimal unsatisfiable sets (MUSes) is often computationally impractical at that scale. A key contribution of this paper to practicality and flexibility is the ability for users to control infeasibility detection, as shown in Fig. 2, so that useful explanations are returned in the right amount of time (from a few seconds to overnight, depending on the need).
Usability is a slippery and often underestimated concept. The novel features we propose result from a collaborative process with industry users, who were initially sceptical and/or confused about the software, and now want to see the solutions it produces, understand the best trade-offs between the different objectives and learn why some seemingly obvious decisions turn out to be far from optimal. The three presented conflict visualisations integrated in our conflict resolution approach have been evaluated in a qualitative expert user study. The results have shown that the visualisations are both useful and meaningful, even if they also revealed some weaknesses, such as the colour scheme.
Our work identifies at least three interesting future research directions, and gives some initial indication of what can be achieved. The first one is the automatic relaxation of complex constraints with slack variables. While this paper provides a method to relax basic logical constraints, automatically relaxing arbitrary constraints is non-trivial for complex (e.g., global) constraints. The second research direction is the exploration of "a priori" user preferences in practical applications in the context of hierarchical preferences expressed via constraint hierarchies, and also in the context of relative preferences expressed via weights in our transformed objective function (see Section 6). Finally, we are also very keen to explore the identification of MUS/MCS patterns, as the pattern represented by different constraint subsets (e.g., transport link between supply location X and any demand location) will often be more meaningful to users than the list of all constraint sets (e.g., one per demand location). Further, detection of such patterns during MUS/MCS enumeration can significantly speed up computation.