1 Introduction

Configuration systems [15] are used to find solutions for problems with many variables and constraints. A configuration problem can be defined as a constraint satisfaction problem (CSP) [21]. If the constraints of a CSP are inconsistent, no solution can be found. In this context, diagnosis is required in order to be able to find at least one solution for an inconsistent CSP [1].

There are several diagnostic approaches [14]. One approach is direct diagnosis, which employs queries to check the consistency of the constraint set without the need to identify the corresponding conflict sets [5]. When diagnoses have to be provided in real time, response times should be less than a few seconds [2]. For example, in communication networks, efficient diagnosis is crucial to maintain the quality of service. However, in direct diagnosis approaches there is a clear trade-off between the runtime performance of diagnosis calculation and diagnosis quality [6].

To address this challenge, we propose Learned Constraint Ordering (LCO) for direct diagnosis. Our approach learns constraint ordering heuristics from historical transactions that include inconsistent user requirements. For example, our method learns that “display quality” is a more important criterion than “price” based on past purchases and user requirements regarding a set of digital cameras. Consequently, “price” becomes a candidate for a change proposal: a diagnosis could, for instance, recommend increasing the price limit so that an item satisfying all remaining requirements can be found. Using historical inconsistent transactions, we build a sparse matrix and then employ matrix factorization techniques to estimate future diagnoses. After an offline learning phase, the historical transaction most similar to the new inconsistent requirement set is identified and the corresponding constraint ordering heuristic (calculated in the offline phase) is applied to reorder the inconsistent constraints before direct diagnosis is performed. Thanks to the learned ordering, direct diagnosis algorithms can solve the diagnosis task with a high-quality result in a shorter runtime compared to direct diagnosis without constraint ordering. We provide a working example to demonstrate the effects of our approach. Finally, based on experimental evaluations, we show that our constraint ordering approach with direct diagnosis is superior to the baseline (direct diagnosis algorithms without constraint ordering) on popular benchmark constraint satisfaction problems.

2 Preliminaries

In this section, we give an overview of the basic definitions in consistency-based configuration and diagnosis, and introduce a running example. Finally, we explain our evaluation criteria for the determined diagnoses.

2.1 Configuration task

The following (simplified) assortment of digital cameras (see Table 1) and a set of inconsistent user requirements (see Table 2) for selecting a digital camera from the camera product table will serve as a working example to demonstrate how our approach works.

Table 1 Digital camera product table
Table 2 An example of an inconsistent configuration task: CSPLisa

Our working example is represented as a configuration task in Table 2 (on the basis of Definition 1). As shown in Table 2, our example configuration task consists of a variable set (V) with 10 variables (which are also listed in the first column of Table 1) and only one knowledge base constraint (c1), which represents the set of available cameras shown in Table 1. User-related preferences are defined in the set of user requirements (REQLisa).

Definition 1 (Configuration task)

A configuration task can be defined as a CSP (V, D, C). V = {v1, v2, ..., vn} represents a set of finite domain variables. D = {dom(v1), dom(v2), ..., dom(vn)} represents a set of variable domains, where dom(vi) represents the domain of variable vi. \(C = C_{\text{KB}} \cup REQ\), where CKB = {c1, c2, ..., cq} is a set of domain-specific constraints (the configuration knowledge base) that restrict the possible combinations of variable values. \(REQ = \{c_{q+1}, c_{q+2}, ..., c_{t}\}\) is a set of user requirements, which are also represented as constraints. A configuration (S) for a configuration task is a set of assignments S = {v1 = a1, v2 = a2, ..., vn = an} with ai ∈ dom(vi) which is consistent with the constraints in C.
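To illustrate Definition 1, the following sketch models a small excerpt of the camera configuration task as a CSP. It assumes the Choco 4.x API (the implementation in Section 5 uses Choco 3, whose API differs slightly); the integer encodings of the domains and all catalog rows except Camera1 are illustrative assumptions, not taken from Table 1.

import org.chocosolver.solver.Model;
import org.chocosolver.solver.variables.IntVar;
import org.chocosolver.solver.constraints.extension.Tuples;

public class CameraConfigTask {
    public static void main(String[] args) {
        Model model = new Model("camera configuration");
        // V and D: finite-domain variables (values encoded as integers,
        // e.g., resolution in tenths of megapixels, display in tenths of inches, zoom in tenths)
        IntVar resolution = model.intVar("effectiveResolution", new int[]{160, 209, 242});
        IntVar display    = model.intVar("display", new int[]{25, 30, 35});
        IntVar zoom       = model.intVar("zoom", new int[]{30, 58});
        // c1 (C_KB): the product table as a table constraint (hypothetical excerpt)
        Tuples catalog = new Tuples(true);
        catalog.add(209, 35, 30);  // Camera1
        catalog.add(242, 30, 58);  // hypothetical Camera2
        catalog.add(160, 25, 30);  // hypothetical Camera3
        model.table(new IntVar[]{resolution, display, zoom}, catalog).post();
        // REQ: user requirements posted as unary constraints (c2 and c3 of the example)
        model.arithm(resolution, "=", 209).post();
        model.arithm(display, "=", 25).post();
        // A configuration S exists iff the solver finds a solution
        System.out.println(model.getSolver().solve() ? "consistent" : "inconsistent");
    }
}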

2.2 Diagnosis task

In an interactive configuration scenario (see Table 1), the configurator may fail to find a solution because some constraints are inconsistent (see Table 2). Such a “no solution could be found” dilemma is caused by at least one conflict between the constraints in the knowledge base CKB and the user requirements REQ, assuming the knowledge base CKB to be consistent. Note that it could also be the case that CKB itself is inconsistent. Definition 2 introduces a formal representation of a conflict.

Definition 2 (Conflict)

A conflict is a set of constraints \(CS \subseteq {C_{\text {KB}} \cup REQ}\) which is inconsistent.

If we have an inconsistency in our knowledge base, we can say that \(C_{\text{KB}} \cup REQ\) is a conflict set. Definition 3 introduces the concept of minimal conflicts.

Definition 3 (Minimal conflict)

A minimal conflict CS is a conflict (see Definition 2) that contains only constraints which are responsible for the conflict, i.e., \(\nexists c_{i} \in CS\) such that \(CS - \{c_{i}\}\) is inconsistent.

Our example contains three minimal conflict sets: CS1 = {c2, c3}, because there is no product with c2: effectiveResolution = 20.9 Megapixel and c3: display = 2.5 inches in the product table CKB; CS2 = {c2, c9}, since there is no product with c2: effectiveResolution = 20.9 Megapixel and c9: zoom = 5.8x; and CS3 = {c3, c9}, since there is no product with c3: display = 2.5 inches and c9: zoom = 5.8x.

In such cases, we can help users to resolve the conflicts with a diagnosis Δ (a set of constraints). Assuming that CKB is consistent, removing all of REQ always restores consistency; removing only a diagnosis Δ from REQ already leads to a consistent constraint set (see Definition 4).

Definition 4 (REQ diagnosis task)

A user requirements diagnosis task is defined as a tuple (CKB, REQ) where REQ is a set of given user requirements and CKB represents the constraints of the configuration knowledge base. A diagnosis for a REQ diagnosis task (CKB, REQ) is a set \({\varDelta } \subseteq REQ\), s.t. CKB ∪ (REQ − Δ) is consistent (which means that there is at least one solution). Δ is minimal if there does not exist a diagnosis \({\varDelta }^{\prime } \subset {\varDelta }\), s.t. CKB ∪ (REQ − \({\varDelta }^{\prime }\)) is consistent.

The feasibility (consistency) of the user requirements can be checked using a CSP solver [4, 9]. If a CSP is consistent, there exists a solution, i.e., \(solve(CSP) \neq \varnothing\), where solve(CSP: (V, D, C)) returns a set of assignments S (a configuration/solution).
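Such a check might be wrapped as in the following sketch, again assuming the Choco 4.x API; findSolution returning null corresponds to solve(CSP) = ∅.

import org.chocosolver.solver.Model;
import org.chocosolver.solver.Solution;

final class ConsistencyCheck {
    // solve(CSP): returns one configuration (a set of assignments) or null if the CSP is inconsistent
    static Solution solve(Model csp) {
        csp.getSolver().reset();               // allow repeated checks on the same model
        return csp.getSolver().findSolution(); // null means no solution, i.e., inconsistent
    }

    static boolean isConsistent(Model csp) {
        return solve(csp) != null;             // solve(CSP) != empty set in the paper's notation
    }
}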

In Definition 5, we introduce the term minimal diagnosis which helps to reduce the number of constraints within a diagnosis.

Definition 5 (Minimal diagnosis)

A minimal diagnosis Δ is a diagnosis (see Definition 4) for which there does not exist a subset \({\varDelta }^{\prime } \subset {\varDelta }\) which also has the diagnosis property.

The REQ diagnosis task triggered by the requirements of the user Lisa (REQLisa) has two corresponding minimal diagnoses. The removal of the set \({\varDelta }_{\text {Lisa}_{1}}= \{c_{2}, c_{9}\}\) or \({\varDelta }_{\text {Lisa}_{2}}= \{c_{3}, c_{9}\}\) leads to a consistent constraint set, i.e., \(C_{\text {KB}} \cup (REQ_{\text {Lisa}} - {\varDelta }_{\text {Lisa}_{1}})\) is consistent and \(C_{\text {KB}} \cup (REQ_{\text {Lisa}} - {\varDelta }_{\text {Lisa}_{2}})\) is consistent.

2.3 Direct diagnosis

Algorithmic approaches for providing efficient solutions to diagnosis problems are manifold. Basically, there are two types of approaches: conflict-directed diagnosis [19] and direct diagnosis [7].

Conflict-directed diagnosis algorithms calculate conflicts which are then used to find diagnoses. Their runtime performance is often not sufficient, especially in real-time scenarios. Direct diagnosis algorithms determine diagnoses by executing a series of queries. These queries check the consistency of the constraint set without the need for pre-calculated conflict sets.

FlexDiag [7] is a direct diagnosis algorithm which determines one diagnosis at a time; the diagnosis indicates which variable assignments of the original configuration have to be changed such that a reconfiguration conforming to the new requirements becomes possible. FlexDiag is constraint-ordering sensitive.
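The following is a simplified sketch of the FlexDiag scheme as published in [7] (FastDiag with the termination condition |C| ≤ m). The consistency check is abstracted behind a predicate, and all helper names are ours.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

final class FlexDiag<T> {
    private final Predicate<List<T>> consistent; // consistency check on a constraint set
    private final int m;                         // quality/performance trade-off parameter

    FlexDiag(Predicate<List<T>> consistent, int m) {
        this.consistent = consistent;
        this.m = m;
    }

    // c: ordered candidate constraints that may be removed (here: REQ); ac: all constraints (C_KB plus REQ)
    List<T> diagnose(List<T> c, List<T> ac) {
        if (c.isEmpty() || !consistent.test(minus(ac, c))) return Collections.emptyList();
        return fd(Collections.<T>emptyList(), c, ac);
    }

    private List<T> fd(List<T> d, List<T> c, List<T> ac) {
        if (!d.isEmpty() && consistent.test(ac)) return Collections.emptyList();
        if (c.size() <= m) return c;                  // FastDiag corresponds to m = 1
        int k = c.size() / 2;
        List<T> c1 = c.subList(0, k), c2 = c.subList(k, c.size());
        List<T> d1 = fd(c1, c2, minus(ac, c1));
        List<T> d2 = fd(d1, c1, minus(ac, d1));
        return union(d1, d2);
    }

    private static <T> List<T> minus(List<T> a, List<T> b) {
        List<T> r = new ArrayList<>(a); r.removeAll(b); return r;
    }
    private static <T> List<T> union(List<T> a, List<T> b) {
        List<T> r = new ArrayList<>(a);
        for (T t : b) if (!r.contains(t)) r.add(t);
        return r;
    }
}

With m = 1 this sketch behaves like FastDiag and returns a minimal diagnosis; larger values of m return blocks of up to m constraints without further splitting, which saves consistency checks at the cost of minimality (cf. Section 2.5).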

2.4 Constraint ordering

The quality of diagnoses and the runtime performance of direct diagnosis algorithms depend on the ordering of the constraints in the set of user requirements: the lower the importance of a constraint, the lower its index. The lower the position of a conflicting constraint in this ordering, the higher the probability that the constraint will be part of the diagnosis [5].

Users typically prefer to keep their important requirements and to change or delete (if needed) the less important ones [8]. The major goal of (model-based) diagnosis tasks is to identify the preferred (leading) diagnoses [3]. For the characterization of a preferred diagnosis we rely on a total ordering of the given set of constraints in REQ. Such a total ordering can be obtained, for example, by directly asking the customer about his or her preferences, by applying multi-attribute utility theory where the determined interest dimensions correspond to the attributes of REQ, or by applying the orderings determined by conjoint analysis [16].

2.5 Evaluation criteria

We can evaluate the performance of a direct diagnosis algorithm based on runtime performance, diagnosis quality (in terms of minimality), and combined performance (runtime and minimality).

Runtime

runtime(Δ) represents the time needed by the diagnostic search to find Δ. This time can be measured in milliseconds or in terms of the number of consistency checks (#CC) applied until a diagnosis is found. For a more accurate runtime measurement (excluding the operating system’s effects on runtime, etc.), the number of consistency checks can be used.

Minimality

Diagnosis quality can be measured in terms of the degree of minimality of a diagnosis, i.e., the cardinality of Δ compared to the cardinality of \({\varDelta }_{\min \limits }\), where \(|{\varDelta }_{\min \limits }|\) represents the cardinality of a minimal diagnosis. The highest (best) minimality is 1 according to Formula (1).

$$ \small minimality({\varDelta}) = \frac{|{\varDelta}_{\min}|}{|{\varDelta}|} $$
(1)

Combined

Since it is important to satisfy both evaluation criteria, runtime performance and minimality, at the same time, we also evaluate the combined performance based on Formula (2). combined(Δ) increases when minimality(Δ) increases and/or runtime(Δ) decreases. This means that a high combined performance indicates that the direct diagnosis algorithm provides a diagnosis with high minimality and low runtime.

$$ \small combined({\varDelta}) = \frac{minimality({\varDelta})}{runtime({\varDelta})} $$
(2)

In order to improve the runtime performance of diagnostic search, FlexDiag [6] uses a parameter (m) that helps to systematically reduce the number of consistency checks but at the same time deteriorates the minimality of diagnoses. In FlexDiag, the parameter m is used to control diagnosis quality in terms of minimality and accuracy as well as the performance of diagnostic search. The higher the value of m, the better the runtime performance of FlexDiag and the lower the degree of diagnosis quality.

3 Related work

The most widely known algorithm for the identification of minimal diagnoses is the hitting set directed acyclic graph (HSDAG) [14]. HSDAG is based on conflict-directed hitting set determination and determines diagnoses based on breadth-first search. It computes minimal diagnoses (also minimal cardinality diagnoses) using minimal conflict sets, which can be calculated by QuickXplain [8]. The major disadvantage of this approach is the need to predetermine minimal conflicts, which can deteriorate diagnostic search performance [5]. Many different approaches to provide efficient solutions for diagnosis problems have been proposed. One approach [24] focuses on improvements of HSDAG. Another approach [23] uses pre-determined sets of conflicts based on binary decision diagrams. In diagnosis scenarios where the number of minimal diagnoses and their cardinality are high, the determination of diagnoses with standard conflict-based approaches becomes infeasible.

The direct diagnosis algorithm FlexDiag [7] utilizes an inverse version of QuickXplain [8] which directly finds a diagnosis in an inconsistent constraint set. FlexDiag is an extension of FastDiag which assures diagnosis determination within certain time limits by systematically reducing the number of solver calls needed. The authors claim that this specific interpretation of anytime diagnosis leads to a trade-off between diagnosis quality (evaluated, e.g., in terms of minimality) and the time needed for diagnosis determination. Our proposed constraint ordering approach LCO improves their direct diagnosis approach in terms of diagnosis quality and, at the same time, in terms of runtime performance.

The approach of [18] also determines diagnoses directly. The authors reduce the number of consistency checks by avoiding the computation of minimized conflict sets. Their approach is similar to [5], but introduces a new pruning rule. They compared their approach only with the standard technique (based on QuickXplain [8] and HSDAG [14]). In their experiments, they show that their direct diagnosis approach outperforms the standard diagnosis approach in terms of runtime. In their approach, the constraint ordering is collected directly (interactively) from the users. In contrast to [18], our approach learns the constraint ordering from historical inconsistent user requirements and their preferred diagnoses.

The importance of constraint ordering in direct diagnosis scenarios has already been pointed out in related work [7, 18]. In our approach, we predict the most important constraints for the users based on matrix factorization [10], a widely used collaborative filtering technique, employing historical transactions. We learn a constraint ordering where the predicted most important constraints have the highest positions. This is due to the fact that the implemented direct diagnosis algorithm FlexDiag starts searching for a diagnosis among the constraints ranked lowest. Therefore, the constraints with the highest positions have a low probability of being part of the diagnosis set.

4 Learned Constraint Ordering (LCO) for direct diagnosis

In this paper, our motivation is to solve the quality-runtime performance trade-off problem of direct diagnosis. For this purpose, our proposed method learns constraint ordering heuristics from historical transactions in an offline phase and then, in an online phase, employs a direct diagnosis algorithm on the re-ordered constraints of diagnosis tasks (active transactions). In this paper, we demonstrate and evaluate our approach based on the direct diagnosis algorithm FlexDiag [7].

Our contributions in this context are the following. We utilize the constraint-ordering sensitivity of direct diagnosis and propose a novel learning approach for constraint ordering.

In order to increase the diagnosis quality and the runtime performance of FlexDiag at the same time, we propose matrix factorization based constraint ordering (LCO) for the diagnosis of inconsistent constraints (see Algorithm 1). In our approach, we build a sparse matrix (R) by exploiting historical purchase transactions with inconsistent user requirements and preferred minimal diagnoses, together with the new inconsistent set of user requirements (constraints). The estimated dense matrix (the result of matrix factorization of the sparse matrix) provides the input, in terms of importance estimates, for determining a new diagnosis. According to the importance estimates determined by matrix factorization, the constraints are reordered before employing direct diagnosis.

Algorithm 1 (listing not reproduced here)
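The following sketch outlines the two phases of the approach as we read it; the concrete steps (factorization, similarity search, reordering, and direct diagnosis) are injected as functions, so the outline stays independent of any particular implementation. All names are ours, not the original pseudocode of Algorithm 1.

import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

final class Lco<C> {
    Function<double[][], double[][]> factorize;              // offline: sparse R -> dense PQ^T
    BiFunction<double[][], double[], Integer> mostSimilar;   // online: dense matrix + active REQ -> row index
    BiFunction<List<C>, double[], List<C>> reorder;          // online: REQ + importance estimates -> ordered REQ
    BiFunction<List<C>, List<C>, List<C>> directDiagnosis;   // online: ordered REQ + all constraints -> diagnosis

    List<C> diagnose(double[][] sparseR, double[] activeReq, List<C> req, List<C> allConstraints) {
        double[][] denseR = factorize.apply(sparseR);             // offline phase (done once, reused online)
        int ht = mostSimilar.apply(denseR, activeReq);            // most similar historical transaction
        List<C> orderedReq = reorder.apply(req, denseR[ht]);      // learned constraint ordering (LCO)
        return directDiagnosis.apply(orderedReq, allConstraints); // e.g., FlexDiag on the reordered REQ
    }
}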

In the following, we explain our approach by demonstrating how it works on the example diagnosis task (CKB, REQLisa). In addition, we show experimental evaluations of our approach on the basis of real-world configuration knowledge bases. In our camera configuration example, Lisa provides her requirements, which are inconsistent with the camera product table (no solution found). Therefore, her requirements need to be diagnosed. Using our approach, we first calculate a constraint ordering (an LCO) and then solve the diagnosis task (CKB, REQLisa) using the defined constraint ordering.

4.1 Offline phase: learning from historical transactions

Our proposed method needs an offline phase in which various constraint ordering heuristics are learned from historical transactions. For the offline learning, matrix factorization techniques and historical transactions with inconsistent user requirements are employed (see Table 3). An example of using a learned heuristic is a solution search that starts with the variable “resolution” and its value “20.9” and then continues with the other values of “resolution” in the learned order. After the variable “resolution” is satisfied, the search continues with the next variable and its ordered values until all variable-value assignments are consistent with the given set of constraints.

Table 3 Historical transactions with inconsistent user requirements

In Table 3, for each user, we have an inconsistent set of user requirements (which leads to “no solution”). After a no-solution situation, some of the users (in our case, Alice, Tom, and Joe in Table 3) decided to buy a product (Purchase) which does not completely satisfy their requirements. Consequently, they had to change their initial requirements, which is represented as \({\varDelta }_{\min \limits }\). Historical transactions which are completed with a purchase are called complete historical transactions. The remaining historical transactions, in which users did not complete their transactions with a purchase (e.g., Bob and Ray in Table 3), are called incomplete historical transactions. In this context, we estimate the diagnoses of incomplete historical transactions using matrix factorization and take into account only historical transactions with minimal diagnoses.

Based on the historical transactions in Table 3, we know that the product Camera1 was purchased by Alice. This means that Alice renounced her last two requirements c10: weight = 560g and c11: price = 469 and purchased the product Camera1, which has weight = 475g and price = 659. Therefore, ΔAlice = {c10, c11} is a diagnosis for the requirements diagnosis task (CKB, REQAlice). When we eliminate the diagnosis constraints from the inconsistent requirement set, the diagnosed requirement set becomes \(REQ^{\prime }_{\text {Alice}} =\) {effectiveResolution = 20.9 Megapixel, display = 3.5 inches, touch = yes, wifi = yes, nfc = no, gps = yes, videoResolution = UHD 4K/3840x2160, zoom = 3.0x}. Based on \(REQ^{\prime }_{\text {Alice}}\), the found solution set is {Camera1}, which includes the product purchased by Alice (Camera1).

4.1.1 The sparse matrix

Matrix factorization based collaborative filtering algorithms [11] use a rating matrix R (a.k.a. user-item matrix) which describes the preferences of users for the individual items they have rated. Thereby, R represents an m × n matrix, where m denotes the number of users and n the number of items. The element ru,i of the matrix R describes the rating of item i given by user u. Given the known user ratings, the recommendation task is to predict how users would rate the items they have not yet rated.

In our approach, we build a sparse matrix R (user-constraint matrix) using inconsistent historical transactions, as shown in Table 4a, where columns represent constraints. Each row of the sparse matrix R represents a set of user requirements (the left half) and the corresponding diagnosis (the right half), if available. User requirements are represented by their normalized values in the range 0-1, and diagnoses are represented by the presence (1) or absence (0) of the corresponding user requirements in the diagnosis.

Table 4 Matrix factorization estimates a dense matrix PQT (b) which closely approximates the sparse matrix R (a)

If there are non-numeric domains in the problem, they are enumerated. For example, the domain v7: videoResolution: {No, UHD, 4K} is enumerated as v7: {0, 1, 2}. In addition, the domain ranges of all constraints in REQ are mapped to [0,1], since matrix factorization requires the same value range for all entries of the matrix. For this purpose, we employ min-max normalization [22] according to Formula (3).

$$ v_{\text{i\_norm}} = \frac{v_{\mathrm{i}}-dom(v_{\mathrm{i}})_{\min}} {dom(v_{\mathrm{i}})_{\max} - dom(v_{\mathrm{i}})_{\min}} $$
(3)

In our camera configuration example, we set the user requirements of Alice in the left half of the first row of R. As shown in Table 3, she prefers a 3.5-inch display, i.e., c3: display = 3.5 inches. In Table 4, the assigned value of display is normalized using Formula (3) and represented as 1. In the right half of the first row of R, we set the diagnosis probabilities. If a constraint ci exists in Δuser, its corresponding diagnosis probability δi is set to 1, otherwise to 0. We know that ΔAlice = {c10, c11} (see Table 3); consequently, we set δ10 and δ11 to 1 and the rest to 0.
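The following is a minimal sketch of how one row of R might be assembled, assuming requirement values, domain bounds, and diagnosis membership are already available as arrays; in the real matrix, requirements not stated by a user are simply left empty (sparse) and later estimated by matrix factorization.

import java.util.Arrays;

final class SparseRow {
    // Formula (3): map a value to [0,1] relative to its variable domain
    static double minMaxNormalize(double v, double domMin, double domMax) {
        return (v - domMin) / (domMax - domMin);
    }

    // Left half: normalized requirement values; right half: 1/0 diagnosis indicators
    static double[] buildRow(double[] reqValues, double[] domMin, double[] domMax, boolean[] inDiagnosis) {
        int n = reqValues.length;
        double[] row = new double[2 * n];
        for (int i = 0; i < n; i++) {
            row[i] = minMaxNormalize(reqValues[i], domMin[i], domMax[i]); // requirement part
            row[n + i] = inDiagnosis[i] ? 1.0 : 0.0;                      // diagnosis part
        }
        return row;
    }

    public static void main(String[] args) {
        // Hypothetical excerpt for Alice: display = 3.5 in a domain [2.5, 3.5] normalizes to 1.0
        double[] row = buildRow(new double[]{3.5}, new double[]{2.5}, new double[]{3.5}, new boolean[]{false});
        System.out.println(Arrays.toString(row)); // [1.0, 0.0]
    }
}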

4.1.2 Matrix factorization

In terms of matrix factorization [12, 13], the sparse matrix R is decomposed into an m × k user-feature matrix P and a k × n constraint-feature matrix QT, which are both used to compute the estimated dense matrix PQT. In this context, k is a free parameter which needs to be adapted to optimize the prediction quality on the test data.
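As an illustration of this decomposition, the following is a generic stochastic-gradient-descent sketch that factorizes a sparse matrix (missing entries marked as NaN) and returns the dense estimate PQT. The paper's implementation uses Apache Mahout's SVDRecommender instead (see Section 5.1); the learning rate, regularization, and iteration count here are arbitrary assumptions.

import java.util.Random;

final class MatrixFactorization {
    static double[][] factorize(double[][] r, int k) {
        int m = r.length, n = r[0].length;
        double[][] p = randomMatrix(m, k), q = randomMatrix(n, k);
        double lr = 0.01, reg = 0.02;
        for (int epoch = 0; epoch < 200; epoch++) {
            for (int u = 0; u < m; u++)
                for (int i = 0; i < n; i++) {
                    if (Double.isNaN(r[u][i])) continue;       // skip missing entries of the sparse matrix
                    double err = r[u][i] - dot(p[u], q[i]);
                    for (int f = 0; f < k; f++) {              // regularized gradient step on an observed entry
                        double pu = p[u][f], qi = q[i][f];
                        p[u][f] += lr * (err * qi - reg * pu);
                        q[i][f] += lr * (err * pu - reg * qi);
                    }
                }
        }
        double[][] dense = new double[m][n];                   // PQ^T: estimates for all entries
        for (int u = 0; u < m; u++)
            for (int i = 0; i < n; i++) dense[u][i] = dot(p[u], q[i]);
        return dense;
    }

    private static double dot(double[] a, double[] b) {
        double s = 0; for (int f = 0; f < a.length; f++) s += a[f] * b[f]; return s;
    }
    private static double[][] randomMatrix(int rows, int cols) {
        Random rnd = new Random(0);
        double[][] x = new double[rows][cols];
        for (double[] row : x) for (int j = 0; j < cols; j++) row[j] = 0.1 * rnd.nextDouble();
        return x;
    }
}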

In our example, we apply matrix factorization to the sparse matrix in Table 4a. Then, the estimated matrix is obtained as shown in Table 4b which includes the estimated diagnoses for Bob and Ray.

4.2 Online phase: diagnosing active transactions

After calculating the estimated matrix in the offline phase, we diagnose, in the online phase, active transactions which include inconsistencies, as in our working example. In active transactions, users have not yet left the configuration session and need real-time help to remove the inconsistencies in their requirements in order to decide on a product to purchase. Therefore, the configuration system should provide a relevant diagnosis in a reasonable time (before users leave the system without a purchase).

4.2.1 The most similar historical transaction

We find the historical transaction most similar to the new set of inconsistent requirements using Formula (4), where HT represents a historical transaction, AT represents the active transaction, HT.ci represents the value of requirement ci in the estimated dense matrix PQT, AT.ci represents the value of requirement ci in the active transaction, and i represents a constraint index in the REQ of AT.

$$ min \left( \sqrt{\underset{i \in AT.REQ}{\sum}\|HT.c_{\mathrm{i}}-AT.c_{\mathrm{i}}\|^{2}}\right) $$
(4)

In our camera configuration example, the most similar historical transaction to the active transaction of Lisa is the transaction of Ray. Therefore, to diagnose REQLisa, we use LCORay : {c2,c9,c6,c7,c11,c10,c3,c4,c5,c8}. When we only consider user requirements of Lisa: {c2,c3,c5,c7,c9}, we obtain the constraint ordering for Lisa LCOLisa : {c2,c9,c7,c3,c5}.
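A sketch of Formula (4): the active transaction's normalized requirement values are compared against each row of the estimated dense matrix, and the row with the smallest Euclidean distance is returned. The NaN convention for requirements absent from AT.REQ is our own.

final class NearestTransaction {
    static int mostSimilarRow(double[][] denseR, double[] activeReq) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int ht = 0; ht < denseR.length; ht++) {
            double sum = 0;
            for (int i = 0; i < activeReq.length; i++) {
                if (Double.isNaN(activeReq[i])) continue;   // only requirements present in AT.REQ
                double diff = denseR[ht][i] - activeReq[i];
                sum += diff * diff;
            }
            double dist = Math.sqrt(sum);
            if (dist < bestDist) { bestDist = dist; best = ht; }
        }
        return best;                                        // index of the most similar historical transaction
    }
}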

4.2.2 Direct diagnosis with LCO

After the most similar historical transaction has been found and its constraint ordering has been applied to the active transaction’s user requirements, the direct diagnosis algorithm is employed. We employ FlexDiag as the direct diagnosis algorithm with its parameter m which is used to control diagnosis quality.
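The reordering step itself can be as simple as sorting the active requirements by their position in the learned ordering of the most similar historical transaction, as in the following sketch (constraints are represented by their indices here); the constraints predicted to be least important come first and are therefore the most likely candidates for the diagnosis.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ConstraintOrdering {
    // Sort the active user's requirements according to their position in the learned ordering
    static List<Integer> reorder(List<Integer> activeReq, List<Integer> learnedOrdering) {
        List<Integer> ordered = new ArrayList<>(activeReq);
        ordered.sort(Comparator.comparingInt(learnedOrdering::indexOf));
        return ordered;
    }

    public static void main(String[] args) {
        // Working example: LCO_Ray restricted to Lisa's requirements yields LCO_Lisa
        List<Integer> lcoRay = Arrays.asList(2, 9, 6, 7, 11, 10, 3, 4, 5, 8);
        List<Integer> reqLisa = Arrays.asList(2, 3, 5, 7, 9);
        System.out.println(reorder(reqLisa, lcoRay)); // [2, 9, 7, 3, 5]
    }
}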

In our working example, the user constraints of the diagnosis task are reordered using LCOLisa as {c2, c9, c7, c3, c5}, and a minimal diagnosis Δ = {c2, c9} is found by FlexDiag(m = 1) with the performance results (on the basis of the evaluation criteria given in Section 2.5): #CC = 4, minimality = 1, and combined = 0.250. However, when we employ FlexDiag(m = 1) on the default order of the user constraints, {c2, c3, c5, c7, c9}, the same diagnosis Δ = {c2, c9} is found with the performance results: #CC = 8, minimality = 1, and combined = 0.125. Therefore, using LCO with FlexDiag(m = 1), we improve the diagnosis performance for the working example.

In the following, we discuss the effects of our approach on the direct diagnosis algorithm FlexDiag, based on the evaluation criteria runtime (measured in the number of consistency checks), minimality, and combined performance. The search trees of FlexDiag with LCO (Tree-2 and Tree-4) have a better combined performance than the search trees of FlexDiag without LCO (Tree-1 and Tree-3). When m = 1, LCO improves the combined performance (see Formula 2) by 32% (0.166 instead of 0.125). When m = 2, LCO improves the combined performance by 100% (0.250 instead of 0.125) and the minimality by 50%, whereas the runtime is not improved. Consequently, as also observed in the example search trees, LCO improves the runtime performance and the minimality of diagnosis.

5 Experimental evaluation

5.1 Settings

We have developed our approach in Java and tested it on a computer with an Intel Core i5-5200U 2.20 GHz processor, 8 GB RAM, a 64-bit Windows 7 operating system, and Java Runtime Environment 1.8.0. Constraint satisfaction problems have been solved with Choco 3, a Java library for constraint satisfaction problems with a FlatZinc (the target language of MiniZinc) parser. For matrix factorization, we have used the SVDRecommender of Apache Mahout [17].
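For illustration, the SVDRecommender can be driven roughly as follows, assuming Mahout's Taste API (version 0.9); the input file name, factorizer parameters, and IDs are hypothetical.

import java.io.File;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.svd.ALSWRFactorizer;
import org.apache.mahout.cf.taste.impl.recommender.svd.SVDRecommender;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.Recommender;

public class MahoutFactorizationExample {
    public static void main(String[] args) throws Exception {
        // Each line: userID,constraintID,value (normalized requirement values and diagnosis indicators)
        DataModel data = new FileDataModel(new File("user-constraint-matrix.csv"));
        ALSWRFactorizer factorizer = new ALSWRFactorizer(data, 10, 0.05, 20); // k = 10 features
        Recommender recommender = new SVDRecommender(data, factorizer);
        // Estimated value for a cell that is empty in R (user and item IDs are hypothetical)
        float estimate = recommender.estimatePreference(4L, 15L);
        System.out.println("estimated entry of PQ^T: " + estimate);
    }
}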

5.2 Datasets

We have used the MiniZinc 2016 benchmark problems [20], where each problem includes five data files with the file extension “.dzn”.


In order to obtain historical and active transactions based on these benchmark problems, we randomly generated 5000 sets of inconsistent user requirements (each with N constraints) based on integer variables. We employed a Java code snippet which inserts 10 additional constraints into each benchmark problem and always leads to an inconsistency; the snippet is not reproduced here, but a sketch of the idea follows.

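The following is our own hypothetical reconstruction of such a generator, not the authors' original code, assuming the Choco 4.x API: it posts 10 random equality requirements on integer variables of the benchmark model and guarantees inconsistency by contradicting the first posted requirement.

import java.util.List;
import java.util.Random;
import org.chocosolver.solver.Model;
import org.chocosolver.solver.variables.IntVar;

final class InconsistentReqGenerator {
    static void addInconsistentRequirements(Model model, List<IntVar> intVars, long seed) {
        Random rnd = new Random(seed);
        IntVar first = intVars.get(rnd.nextInt(intVars.size()));
        int firstValue = first.getLB();
        model.arithm(first, "=", firstValue).post();   // requirement c_1
        model.arithm(first, "!=", firstValue).post();  // contradicts c_1, so the CSP is always inconsistent
        for (int i = 2; i < 10; i++) {                 // eight further random requirements (10 in total)
            IntVar v = intVars.get(rnd.nextInt(intVars.size()));
            int value = v.getLB() + rnd.nextInt(Math.max(1, v.getUB() - v.getLB() + 1));
            model.arithm(v, "=", value).post();
        }
    }
}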

5.3 Comparative methods

We compare our approach directly with FlexDiag [7]. In this article, we do not compare LCO with more traditional diagnosis approaches; for related evaluations we refer the reader to [7], where detailed analyses of FlexDiag can be found. We used FlexDiag to show the minimality improvements achieved by LCO when m ≥ 2. As a baseline, we evaluated FlexDiag without constraint ordering. #REQ represents the number of constraints in a set of inconsistent requirements.

To evaluate LCO, we analyzed three aspects: (1) runtime performance, (2) diagnosis quality (in terms of minimality, see Formula (1)), and (3) combined performance (see Formula (2)). We observed that our approach LCO outperforms the baseline versions (see Table 5).

Table 5 Experimental results based on Minizinc-2016 Benchmark problems

5.4 Results

As discussed throughout this paper, our main objective is to improve the combined performance (runtime performance and diagnosis quality at the same time). We have compared our approach LCO with the baseline (no constraint ordering). In both cases, FlexDiag is used for diagnostic search with three different m values: 1, 2, and 4.

As shown in Table 5, based on the average values (in the last row), LCO outperforms the baseline in terms of runtime and minimality: for each m value (1, 2, and 4), LCO has a lower runtime than the baseline while its minimality is higher than or equal to that of the baseline.

Based on the results in Table 5, we present comparisons in Fig. 1, where the relations between the performance indicators and the number of constraints in the set of user requirements are shown.

Fig. 1 Comparison graphs based on the experimental results in Table 5: (a) runtime (#CC), (b) combined vs. m, (c) combined vs. #REQ

In Fig. 1a, we observe that minimality improves (or the parameter m decreases) when the runtime increases. The number of consistency checks (#CC) is lower for each m (m = 1, 2, and 4) when LCO is used. Moreover, for each m (m = 1, 2, and 4), LCO also provides better or equal minimality results.

The combined performance results are shown in Fig. 1b and c. Deviations in the results of LCO are more visible than those of the baseline because LCO has greater performance values. When we zoom into the results of the baseline, we also observe similar deviations due to the variations in the problems. As observed, our approach improves the combined performance significantly.

6 Conclusions

In this paper, we proposed a novel learning approach for constraint ordering heuristics to solve the quality-runtime performance trade-off problem of direct diagnosis. We employed matrix factorization to learn from historical transactions. Taking advantage of this learning, we calculate constraint ordering heuristics in an offline phase and use them to solve new diagnosis tasks in an online phase.

In particular, we applied our constraint ordering heuristics to reorder the user constraints of diagnosis tasks and employed the direct diagnosis algorithm FlexDiag [7] to diagnose the reordered constraints. The reason for choosing this diagnosis algorithm is that the quality-runtime performance trade-off problem becomes much more obvious in FlexDiag when its m parameter is increased. However, our approach is also applicable to other direct diagnosis approaches (e.g., [18]). We compared our approach with a baseline, FlexDiag without heuristics. According to our experimental results, our approach (LCO) addresses the quality-runtime performance trade-off problem by improving the diagnosis quality (in terms of minimality) and the runtime performance at the same time.