MaxSAT-based temporal logic inference from noisy data


We address the problem of inferring descriptions of system behavior using temporal logic from a finite set of positive and negative examples. In this paper, we consider two formalisms of temporal logic that describe linear time properties: Linear Temporal Logic over finite horizon (LTL\(_{\mathrm{f}}\)) and Signal Temporal Logic (STL). For inferring formulas in either formalism, most existing approaches rely on predefined templates that guide the structure of the inferred formula. On the other hand, the approaches that can infer arbitrary formulas are not robust to noise in the data. To alleviate these limitations, we devise two algorithms for inferring concise formulas even in the presence of noise. Our first approach to inferring minimal formulas reduces the inference problem to a maximum satisfiability problem and then uses off-the-shelf solvers to find a solution. To the best of our knowledge, we are the first to use MaxSAT/MaxSMT solvers for inferring formulas in LTL\(_{\mathrm{f}}\) and STL. Our second approach relies on the first to derive a decision tree over temporal formulas, exploiting a standard decision tree learning algorithm. We have implemented our approaches and verified their efficacy in learning concise descriptions in the presence of noise.


  1. Based on the conference version of this paper [18].

  2. LTL, when interpreted over finite traces, is sometimes referred to as LTL\(_{\mathrm{f}}\).

  3. We adapted SAT-DT to learn decision trees with a stopping criterion similar to ours.



  1. Aréchiga N (2019) Specifying safety of autonomous vehicles in signal temporal logic. In: IV, pp 58–63. IEEE

  2. Arif MF, Larraz D, Echeverria M, Reynolds A, Chowdhury O, Tinelli C (2020) SYSLITE: syntax-guided synthesis of PLTL formulas from finite traces. In: FMCAD, IEEE, pp 93–103

  3. Asarin E, Donzé A, Maler O, Nickovic D (2012) Parametric identification of temporal properties. In: Lecture notes in computer science, vol. 7186, Springer, pp 147–160

  4. Bacchus F, Kabanza F (2000) Using temporal logics to express search control knowledge for planning. Artif Intell 116(1–2):123–191

  5. Bombara G, Vasile CI, Penedo F, Yasuoka H, Belta C (2016) A decision tree approach to data classification using signal temporal logic. In: Proceedings of the 19th international conference on hybrid systems: computation and control, ACM, pp 1–10

  6. Brunello A, Sciavicco G, Stan IE (2019) Interval temporal logic decision tree learning. In: JELIA, Lecture notes in computer science, vol. 11468, Springer, pp 778–793

  7. Budde CE, Argenio PRD, Sedwards S (2018) Qualitative and quantitative trace analysis with extended signal temporal logic. Int J Softw Tools Technol Transf 1:340–358.

  8. Camacho A, Baier JA, Muise CJ, McIlraith SA (2018) Finite LTL synthesis as planning. In: ICAPS, AAAI Press, pp 29–38

  9. Camacho A, Icarte RT, Klassen TQ, Valenzano RA, McIlraith SA (2019) LTL and beyond: formal languages for reward function specification in reinforcement learning. In: IJCAI, pp 6065–6073.

  10. Camacho A, McIlraith SA (2019) Learning interpretable models expressed in linear temporal logic. In: ICAPS, AAAI Press, pp 621–630

  11. Dwyer MB, Avrunin GS, Corbett JC (1998) Property specification patterns for finite-state verification. In: Proceedings of the second workshop on formal methods in software practice, FMSP, Association for Computing Machinery, pp 7–15

  12. Fainekos GE, Kress-Gazit H, Pappas GJ (2005) Temporal logic motion planning for mobile robots. In: ICRA, IEEE, pp 2020–2025

  13. Gabel M, Su Z (2010) Online inference and enforcement of temporal properties. In: ICSE (1), ACM, pp 15–24

  14. Giacomo GD, Vardi MY (2013) Linear temporal logic and linear dynamic logic on finite traces. In: IJCAI, IJCAI/AAAI, pp 854–860

  15. Halaby ME (2016) On the computational complexity of MaxSAT. Electron Colloq Comput Complex 23:34

  16. Hoxha B, Dokhanchi A, Fainekos G (2018) Mining parametric temporal logic properties in model-based design for cyber-physical systems. Int J Softw Tools Technol Transf 20(1):79–93

  17. Jin X, Donzé A, Deshmukh JV, Seshia SA (2013) Mining requirements from closed-loop control models. In: HSCC, ACM, pp 43–52

  18. Gaglione JR, Neider D, Roy R, Topcu U, Xu Z (2021) Learning linear temporal properties from noisy data: a MaxSAT-based approach. In: Automated technology for verification and analysis, Springer International Publishing, pp 74–90

  19. Kim J, Muise C, Shah A, Agarwal S, Shah J (2019) Bayesian inference of linear temporal logic specifications for contrastive explanations. In: IJCAI, pp 5591–5598.

  20. Kong Z, Jones A, Belta C (2017) Temporal logics for learning and detection of anomalous behavior. IEEE Trans Autom Control 62(3):1210–1222

  21. Lemieux C, Park D, Beschastnikh I (2015) General LTL specification mining (T). In: ASE, IEEE Computer Society. pp 81–92

  22. Maler O, Nickovic D (2004) Monitoring temporal properties of continuous signals. In: Proceedings of FORMATS-FTRTFT. Vol. 3253 of LNCS, Springer, pp 152–166

  23. Maler O, Nickovic D (2004) Monitoring temporal properties of continuous signals. Lect Notes Comput Sci 3253:152–166

  24. Mohammadinejad S, Deshmukh JV, Puranic AG, Vazquez-Chanlatte M, Donzé A (2020) Interpretable classification of time-series data using efficient enumerative techniques. In: HSCC, ACM, pp 9:1–9:10

  25. de Moura LM, Bjørner N (2008) Z3: an efficient SMT solver. In: TACAS, Lecture notes in computer science, vol. 4963, Springer, pp 337–340

  26. Nagabandi A, Konoglie K, Levine S, Kumar V (2019) Deep dynamics models for learning dexterous manipulation, pp 1–12

  27. Neider D, Gavran I (2018) Learning linear temporal properties. In: Bjørner N, Gurfinkel A (eds) 2018 Formal methods in computer aided design, FMCAD 2018, IEEE, pp 1–10

  28. Pnueli A (1977) The temporal logic of programs. In: Proceedings of 18th annual symposium on foundations of computer science, pp 46–57

  29. Pradel M, Gross TR (2012) Leveraging test generation and specification mining for automated bug detection without false positives. In: ICSE, IEEE Computer Society, pp 288–298

  30. Pradel M, Jaspan C, Aldrich J, Gross TR (2012) Statically checking API protocol conformance with mined multi-object specifications. In: ICSE, IEEE Computer Society, pp 925–935

  31. Quinlan JR (1986) Induction of decision trees. Mach Learn 1(1):81–106

  32. Raman V, Donzé A, Sadigh D, Murray RM, Seshia SA (2015) Reactive synthesis from signal temporal logic specifications. In: HSCC, ACM, pp 239–248

  33. Roy R, Fisman D, Neider D (2020) Learning interpretable models in the property specification language. In: IJCAI, pp 2213–2219.

  34. Sebastiani R, Trentin P (2017) On optimization modulo theories, MaxSMT and sorting networks. CoRR arxiv:1702.02385

  35. Shah A, Kamath P, Shah JA, Li S (2018) Bayesian inference of temporal task specifications from demonstrations. In: NeurIPS, pp 3808–3817

  36. Tseitin GS (1983) On the Complexity of Derivation in Propositional Calculus, Springer, Berlin Heidelberg, pp 466–483

  37. Walkinshaw N, Derrick J, Guo Q (2009) Iterative refinement of reverse-engineered models by model-based testing. In: FM, Lecture notes in computer science, vol. 5850, Springer, pp 305–320

  38. Weimer W, Necula GC (2005) Mining temporal specifications for error detection. In: TACAS, Lecture notes in computer science, vol. 3440, Springer, pp 461–476

  39. Xu Z, Belta C, Julius A (2015) Temporal logic inference with prior information: an application to robot arm movements. In: IFAC conference on analysis and design of hybrid systems (ADHS), pp 141–146

  40. Xu Z, Birtwistle M, Belta C, Julius A (2016) A temporal logic inference approach for model discrimination. IEEE Life Sci Lett 2(3):19–22

  41. Xu Z, Julius AA (2019) Robust temporal logic inference for provably correct fault detection and privacy preservation of switched systems. IEEE Syst J 13(3):3010–3021

  42. Xu Z, Nettekoven AJ, Agung Julius A, Topcu U (2019) Graph temporal logic inference for classification and identification. In: 2019 IEEE 58th conference on decision and control (CDC), pp 4761–4768

  43. Xu Z, Ornik M, Julius AA, Topcu U (2019) Information-guided temporal logic inference with prior knowledge. In: 2019 American control conference (ACC), pp 1891–1897

  44. Yang J, Evans D, Bhardwaj D, Bhat T, Das M (2006) Perracotta: mining temporal API rules from imperfect traces. In: ICSE, ACM, pp 282–291


This work has been supported by the Defense Advanced Research Projects Agency (DARPA) (Contract number HR001120C0032), Army Research Laboratory (ARL) (Contract number W911NF2020132 and ACC-APG-RTP W911NF), National Science Foundation (NSF) (Contract Number 1646522), and Deutsche Forschungsgemeinschaft (DFG) (Grant Number 434592664).

Author information

Corresponding author

Correspondence to Jean-Raphaël Gaglione.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


Appendix 1 Construction of temporal formulas described in Remarks 1 and 2

LTL\(_{\mathrm{f}}\) formula from Remark 1 To construct a trivial LTL\(_{\mathrm{f}}\) formula \(\varphi \) with \({\textit{l}}({\textit{S}}, \varphi )=0\), one needs to perform the following steps: construct formulas \({\varphi }_{u, v}\) for all \((u,1)\in {\textit{S}}\) and \((v,0)\in {\textit{S}}\), such that \({V({{\varphi }_{u, v}},{u})}=1\) and \({V({{\varphi }_{u, v}},{v})}=0\), using a sequence of \({{\,\mathrm{\mathbf {X}}\,}}\)-operators and an appropriate propositional formula to describe the first symbol where u and v differ; then \(\varphi =\bigvee _{(u, 1) \in {\textit{S}}} \bigwedge _{(v, 0) \in {\textit{S}}} \varphi _{u, v}\) is the desired formula.
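The construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: traces are lists of sets of propositions, formulas are nested tuples, every positive/negative pair differs at a position present in both traces, and the names `holds`, `separator`, and `trivial_formula` are hypothetical helpers that do not come from the paper.

```python
def holds(phi, trace, t=0):
    """Evaluate a formula given as nested tuples on a finite trace at position t."""
    op = phi[0]
    if op == "ap":
        return phi[1] in trace[t]
    if op == "not":
        return not holds(phi[1], trace, t)
    if op == "and":
        return all(holds(sub, trace, t) for sub in phi[1:])
    if op == "or":
        return any(holds(sub, trace, t) for sub in phi[1:])
    if op == "X":  # assumed strong next: false at the last position
        return t + 1 < len(trace) and holds(phi[1], trace, t + 1)
    raise ValueError(f"unknown operator {op!r}")


def separator(u, v):
    """Build phi_{u,v}: a literal under k X-operators, true on u and false on v."""
    for k in range(min(len(u), len(v))):
        if u[k] != v[k]:
            p = min(u[k] ^ v[k])  # some proposition on which the symbols disagree
            lit = ("ap", p) if p in u[k] else ("not", ("ap", p))
            for _ in range(k):
                lit = ("X", lit)
            return lit
    raise ValueError("traces do not differ at a common position")


def trivial_formula(sample):
    """Disjunction over positive traces of conjunctions over negative traces."""
    pos = [u for u, b in sample if b == 1]
    neg = [v for v, b in sample if b == 0]
    return ("or",) + tuple(
        ("and",) + tuple(separator(u, v) for v in neg) for u in pos
    )


sample = [([{"p"}, {"q"}], 1), ([{"p"}, set()], 0), ([set(), {"q"}], 0)]
phi = trivial_formula(sample)
print(all(holds(phi, u) == bool(b) for u, b in sample))
```

The resulting formula grows with the product of the numbers of positive and negative examples, which is why it serves only as an existence argument and not as a useful description.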

STL formula from Remark 2 With the predicates \({\varPi }= \{ { {\textit{u} } }_{j}\ge {\theta _{}} \mid 1 \le {j}\le {m}\}\), we construct \({\varphi }_{{ {\textit{u} } }^{+}, { {\textit{u} } }^{-}} {:}{=}{{\,\mathrm{\mathbf {F}}\,}}_{[i,i+1)} { {\textit{u} } }_j \ge \frac{{ {\textit{u} } }^{+}_j[i] + { {\textit{u} } }^{-}_j[i]}{2}\) for all \(({ {\textit{u} } }^{+},1)\in {\textit{S}}\) and \(({ {\textit{u} } }^{-},0)\in {\textit{S}}\), assuming \({ {\textit{u} } }^{+}\) and \({ {\textit{u} } }^{-}\) differ at time i and coordinate j, ensuring that \(V({ {\textit{u} } }^{+}, {\varphi }_{{ {\textit{u} } }^{+}, { {\textit{u} } }^{-}}) \ne V({ {\textit{u} } }^{-}, {\varphi }_{{ {\textit{u} } }^{+}, { {\textit{u} } }^{-}})\). Without loss of generality, we can ensure that \(V({ {\textit{u} } }^{+}, {\varphi }_{{ {\textit{u} } }^{+}, { {\textit{u} } }^{-}})=1\) by negating the preceding formula when necessary. Now \({\varphi }{:}{=}{{\,\mathrm{\bigvee }\,}}_{({ {\textit{u} } }^{+}, 1) \in {\textit{S}}} {{\,\mathrm{\bigwedge }\,}}_{({ {\textit{u} } }^{-}, 0) \in {\textit{S}}} {\varphi }_{{ {\textit{u} } }^{+}, { {\textit{u} } }^{-}}\) is the desired formula.
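The midpoint threshold and the optional negation can be illustrated as follows. The sketch assumes signals sampled at integer time points and stored as lists of coordinate vectors, so that the window \([i,i+1)\) contains exactly sample i; `stl_separator` and `evaluate` are hypothetical names introduced here.

```python
def stl_separator(u_pos, u_neg):
    """Return (i, j, theta, negated) describing F_[i,i+1) (u_j >= theta),
    negated when the positive signal lies below the midpoint."""
    for i, (xp, xn) in enumerate(zip(u_pos, u_neg)):
        for j, (a, b) in enumerate(zip(xp, xn)):
            if a != b:
                theta = (a + b) / 2  # midpoint between the two differing values
                return i, j, theta, a < b
    raise ValueError("signals do not differ at a common sample")


def evaluate(sep, signal):
    """Evaluate the separating predicate on a discretely sampled signal."""
    i, j, theta, negated = sep
    return (signal[i][j] >= theta) != negated


u_plus = [[0.0, 1.0], [2.0, 3.0]]
u_minus = [[0.0, 1.0], [4.0, 3.0]]
sep = stl_separator(u_plus, u_minus)
print(sep, evaluate(sep, u_plus), evaluate(sep, u_minus))
```

Combining these separators by disjunction over positive signals and conjunction over negative signals then proceeds exactly as in the LTL\(_{\mathrm{f}}\) case.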

Appendix 2 List of all the constraints used

Appendix 2.1 Constraints for learning a minimal LTL\(_{\mathrm{f}}\) formula

Structural constraints

$$\begin{aligned}&\Big [ \bigwedge _{1\le i \le n} \bigvee _{\lambda \in \varLambda } {x_{i,\lambda }} \Big ] \wedge \Big [ \bigwedge _{1 \le i \le n} \bigwedge _{\lambda \ne \lambda ' \in \varLambda } \lnot {x_{i,\lambda }} \vee \lnot {x_{i,\lambda '}} \Big ] \end{aligned} \tag{9}$$
$$\begin{aligned}&\Big [\bigwedge \limits _{2\le i\le n} \bigvee \limits _{1\le j< i}{l_{i,j}}\Big ]\wedge \Big [\bigwedge \limits _{2\le i\le n}\bigwedge \limits _{1\le j< j'< i}\lnot {l_{i,j}}\vee \lnot {l_{i,j'}}\Big ] \end{aligned} \tag{10}$$
$$\begin{aligned}&\Big [\bigwedge \limits _{2\le i\le n} \bigvee \limits _{1\le j< i}{r_{i,j}}\Big ]\wedge \Big [\bigwedge \limits _{2\le i\le n}\bigwedge \limits _{1\le j< j'< i}\lnot {r_{i,j}}\vee \lnot {r_{i,j'}}\Big ] \end{aligned} \tag{11}$$
$$\begin{aligned}&\bigwedge _{\begin{array}{c} 2 \le i \le n, 1 \le j, j'< i\\ \lambda \in \{{{\,\mathrm{\mathbf {X}}\,}}, {{\,\mathrm{\mathbf {U}}\,}}, \lnot , \vee \} \end{array}} [ {x_{i,\lambda }} \wedge {l_{i,j}} \wedge {r_{i,j'}} ] \rightarrow \Big [ \bigvee _{\lambda ' \in \varLambda } {x_{j,\lambda '}} \wedge \bigvee _{\lambda '\in \varLambda } {x_{j',\lambda '}} \Big ] \end{aligned} \tag{12}$$
$$\begin{aligned}&\bigvee \limits _{p\in {\mathcal {P}}} {x_{1,p}} \end{aligned} \tag{13}$$

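Formula 9 ensures that each node of the syntax DAG has a unique label. Similarly, Formulas 10 and 11 ensure that each node of a syntax DAG has a unique left and right child, respectively. Formula 12 requires that, whenever Node i carries an operator label and has children Node j and Node \(j'\), both children are themselves labeled. Finally, Formula 13 ensures that Node 1 is labeled by a propositional variable.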
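As a rough illustration of how the uniqueness constraints (Formulas 9 to 11) unfold into clauses, the following sketch generates an "at least one" clause plus pairwise "at most one" clauses. It is an assumption-laden toy encoding, not the authors' implementation: variables are plain tuples, a literal is a pair of a variable and a polarity, and children are assumed to have a smaller index than their parent (\(1\le j<i\)).

```python
from itertools import combinations


def unique_label_clauses(n, labels):
    """Clauses stating that each node 1..n has exactly one label."""
    clauses = []
    for i in range(1, n + 1):
        clauses.append([(("x", i, lam), True) for lam in labels])
        for lam, lam2 in combinations(labels, 2):
            clauses.append([(("x", i, lam), False), (("x", i, lam2), False)])
    return clauses


def unique_child_clauses(n, name):
    """Clauses stating that each node 2..n has exactly one `name` child
    (name is "l" or "r") among the nodes with a smaller index."""
    clauses = []
    for i in range(2, n + 1):
        children = range(1, i)
        clauses.append([((name, i, j), True) for j in children])
        for j, j2 in combinations(children, 2):
            clauses.append([((name, i, j), False), ((name, i, j2), False)])
    return clauses


def satisfied(clauses, true_vars):
    """Check a clause set against the set of variables assigned true."""
    return all(
        any((var in true_vars) == polarity for var, polarity in clause)
        for clause in clauses
    )


clauses = unique_label_clauses(2, ["p", "X"])
print(satisfied(clauses, {("x", 1, "p"), ("x", 2, "X")}))
```

The pairwise encoding is quadratic in the number of labels per node; it is chosen here only because it mirrors the shape of the formulas most directly.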

Constraints for semantics

$$\begin{aligned} \bigwedge \limits _{1\le i \le n}\bigwedge \limits _{p\in {\mathcal {P}}}{x_{i,p}}\rightarrow \Big [\bigwedge \limits _{0\le \tau < {|u|}} {\left\{ \begin{array}{ll} {y_{i,\tau }^{{ {\textit{u} } }}}\text { if }p\in u[\tau ] \\ \lnot {y_{i,\tau }^{{ {\textit{u} } }}}\text { if }p\not \in u[\tau ] \end{array}\right. }\Big ] \end{aligned} \tag{14}$$
$$\begin{aligned} \bigwedge \limits _{\begin{array}{c} 1\le i \le n \\ 1\le j< i \end{array}}{x_{i,\lnot }}\wedge {l_{i,j}} \rightarrow \Big [\bigwedge \limits _{\begin{array}{c} 0\le \tau < {|u|} \end{array}}\Big [{y_{i,\tau }^{{ {\textit{u} } }}}\leftrightarrow \lnot {y_{j,\tau }^{{ {\textit{u} } }}}\Big ]\Big ] \end{aligned} \tag{15}$$
$$\begin{aligned} \bigwedge \limits _{\begin{array}{c} 1\le i \le n \\ 1\le j,j'< i \end{array}} {x_{i,\vee }}\wedge {l_{i,j}}\wedge {r_{i,j'}}\rightarrow \Big [\bigwedge \limits _{\begin{array}{c} 0\le \tau < {|u|} \end{array}}\Big [{y_{i,\tau }^{{ {\textit{u} } }}}\leftrightarrow {y_{j,\tau }^{{ {\textit{u} } }}}\vee {y_{j', \tau }^{{ {\textit{u} } }}}\Big ]\Big ] \end{aligned} \tag{16}$$
$$\begin{aligned} \bigwedge \limits _{\begin{array}{c} 1\le i \le n \\ 1\le j< i \end{array}}{x_{i,{{\,\mathrm{\mathbf {X}}\,}}}}\wedge {l_{i,j}} \rightarrow \Big [\bigwedge \limits _{\begin{array}{c} 0\le \tau < {|u|}-1 \end{array}}\Big [{y_{i,\tau }^{{ {\textit{u} } }}}\leftrightarrow {y_{j,\tau +1}^{{ {\textit{u} } }}}\Big ]\Big ] \end{aligned} \tag{17}$$
$$\begin{aligned}&\bigwedge \limits _{\begin{array}{c} 1\le i \le n \\ 1\le j, j'< i \end{array}}{x_{i,{{\,\mathrm{\mathbf {U}}\,}}}}\wedge {l_{i,j}}\wedge {r_{i,j'}}\rightarrow \\&\quad \Big [\bigwedge \limits _{\begin{array}{c} 0\le \tau< {|u|} \end{array}}\Big [{y_{i,\tau }^{{ {\textit{u} } }}}\leftrightarrow \bigvee \limits _{\tau \le \tau '< {|u|}}\Big [{y_{j',\tau '}^{{ {\textit{u} } }}}\wedge \bigwedge \limits _{\tau \le t< \tau '} {y_{j,t}^{{ {\textit{u} } }}}\Big ]\Big ]\Big ] \end{aligned} \tag{18}$$

The constraints are similar to the ones proposed by Neider and Gavran [27], except that they have been adapted to comply with the semantics of LTL\(_{\mathrm{f}}\). Formula 14 implements the semantics of propositions and states that if Node i is labeled with \(p\in {\mathcal {P}}\), then \({y_{i,\tau }^{{ {\textit{u} } }}}\) is set to 1 if and only if \(p\in { {\textit{u} } }[\tau ]\). Formulas 15 and 16 implement the semantics of negation and disjunction, respectively: if Node i is labeled with \(\lnot \) and Node j is its left child, then \({y_{i,\tau }^{{ {\textit{u} } }}}\) equals the negation of \({y_{j,\tau }^{{ {\textit{u} } }}}\); on the other hand, if Node i is labeled with \(\vee \), Node j is its left child, and Node \(j'\) is its right child, then \({y_{i,\tau }^{{ {\textit{u} } }}}\) equals the disjunction of \({y_{j,\tau }^{{ {\textit{u} } }}}\) and \({y_{j',\tau }^{{ {\textit{u} } }}}\). Formula 17 implements the semantics of the \({{\,\mathrm{\mathbf {X}}\,}}\)-operator and states that if Node i is labeled with \({{\,\mathrm{\mathbf {X}}\,}}\) and its left child is Node j, then \({y_{i,\tau }^{{ {\textit{u} } }}}\) equals \({y_{j,\tau +1}^{{ {\textit{u} } }}}\). Finally, Formula 18 implements the semantics of the \({{\,\mathrm{\mathbf {U}}\,}}\)-operator; it states that if Node i is labeled with \({{\,\mathrm{\mathbf {U}}\,}}\), its left child is Node j, and its right child is Node \(j'\), then \({y_{i,\tau }^{{ {\textit{u} } }}}\) is set to 1 if and only if there exists a position \(\tau '\ge \tau \) for which \({y_{j',\tau '}^{{ {\textit{u} } }}}\) is set to 1 and for all positions t with \(\tau \le t<\tau '\), \({y_{j,t}^{{ {\textit{u} } }}}\) is set to 1.
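To see what values the variables \(y_{i,\tau }^{u}\) are forced to take once a syntax DAG is fixed, the following sketch fills in the table bottom-up for a single trace. It is a direct evaluation, not a SAT encoding. The dictionary layout of the DAG and the function name `valuation_table` are assumptions made for this example, and the value of an \(\mathbf {X}\)-node at the last position is assumed to be false (strong next), which the constraints above do not spell out.

```python
def valuation_table(dag, trace):
    """Compute y[i][tau] for every node i of a syntax DAG on one finite trace.

    dag maps a node index to (label, left, right); children have smaller
    indices than their parent, and any label other than "not", "or", "X",
    "U" is treated as a proposition.
    """
    m = len(trace)
    y = {}
    for i in sorted(dag):
        label, left, right = dag[i]
        if label == "not":
            y[i] = [not y[left][t] for t in range(m)]
        elif label == "or":
            y[i] = [y[left][t] or y[right][t] for t in range(m)]
        elif label == "X":
            # Assumed strong next: false at the last position of the trace.
            y[i] = [t + 1 < m and y[left][t + 1] for t in range(m)]
        elif label == "U":
            # Some t2 >= t1 satisfies the right child, and the left child
            # holds at every position in [t1, t2).
            y[i] = [
                any(
                    y[right][t2] and all(y[left][t] for t in range(t1, t2))
                    for t2 in range(t1, m)
                )
                for t1 in range(m)
            ]
        else:
            y[i] = [label in trace[t] for t in range(m)]
    return y


# p U q as a three-node DAG: nodes 1 and 2 are propositions, node 3 is U.
dag = {1: ("p", None, None), 2: ("q", None, None), 3: ("U", 1, 2)}
print(valuation_table(dag, [{"p"}, set(), {"q"}])[3])
```

The entry for the root node at position 0 is the value that the soft constraints compare against the label of the trace.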

Appendix 3 Proofs of the theoretical results

Appendix 3.1 Proofs from Sect. 3

Proof of Lemma 1

The hard constraints of \(\varPhi ^{{\textit{S}}}_{{n}}\) are \({\varPhi _{{n}}^{\textit{str}}}\) and \({\varPhi ^{{n}}_{{ {\textit{u} } }}}\). Now, \({\varPhi _{{n}}^{\textit{str}}}\) is satisfiable since there always exists a valid LTL\(_{\mathrm{f}}\) formula of size \({n}\): using the syntax DAG of such a formula, we can find an assignment to the variables of \({\varPhi _{{n}}^{\textit{str}}}\) that satisfies it. The constraint \({\varPhi ^{{n}}_{{ {\textit{u} } }}}\), on the other hand, simply tracks the valuation of the prospective formula on traces \({ {\textit{u} } }\). One can easily find a satisfying assignment of the variables of \({\varPhi ^{{n}}_{{ {\textit{u} } }}}\) using the semantics of LTL\(_{\mathrm{f}}\).

For the second part, let v be an assignment that satisfies the hard constraints. We claim that the sum of the weights of the satisfied soft constraints is equal to \(1-{\textit{wl}}({\textit{S}}, {\varphi }_v, {\varOmega })\). Given this claim, an assignment v that maximizes the weight of the satisfied soft constraints yields a formula \({\varphi }_v\) that minimizes the \({\textit{wl}}\) function. To prove the claim, observe the following:

$$\begin{aligned} {\textit{wl}}({{\textit{S}}},{{\varphi }_v}, {\varOmega })&= \sum \limits _{{V({{\varphi }_v},{{ {\textit{u} } }})}\ne {b}} {\varOmega }({ {\textit{u} } })\\&= \sum {\varOmega }({ {\textit{u} } }) - \sum \limits _{{V({{\varphi }_v},{{ {\textit{u} } }})}= {b}} {\varOmega }({ {\textit{u} } })\\&= 1 - \sum \limits _{{V({{\varphi }_v},{{ {\textit{u} } }})}={b}} {\varOmega }({ {\textit{u} } })\\&= 1 - \sum \limits _{v({y_{{n},0}^{{ {\textit{u} } }}})={b}} {\varOmega }({ {\textit{u} } }) \end{aligned}$$

All the summations appearing in the above equation are over \(({ {\textit{u} } }, {b})\in {\textit{S}}\), and the third equality uses the fact that the weights \({\varOmega }({ {\textit{u} } })\) sum to 1 over the sample. Moreover, the quantity \(\sum _{v({y_{{n},0}^{{ {\textit{u} } }}})={b}} {\varOmega }({ {\textit{u} } })\), appearing in the final line, is the sum of the weights of the satisfied soft constraints, since the constraints in which \(v({y_{{n},0}^{{ {\textit{u} } }}})={b}\) are exactly the ones that are satisfied. \(\square \)
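The identity at the heart of this proof is easy to check numerically. The sketch below assumes weights that sum to 1 (exact fractions are used to avoid rounding) and a hypothetical classifier `predict` standing in for the valuation of \(\varphi _v\) on each trace; none of the names come from the paper.

```python
from fractions import Fraction


def weighted_loss(sample, weights, predict):
    """Sum of the weights of the misclassified examples."""
    return sum(w for (u, b), w in zip(sample, weights) if predict(u) != b)


def satisfied_soft_weight(sample, weights, predict):
    """Sum of the weights of the examples whose soft constraint is satisfied."""
    return sum(w for (u, b), w in zip(sample, weights) if predict(u) == b)


sample = [("a", 1), ("b", 0), ("c", 1), ("d", 0)]
weights = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]


def predict(u):
    # Toy stand-in for V(phi_v, u): accepts traces "a" and "b".
    return 1 if u in ("a", "b") else 0


wl = weighted_loss(sample, weights, predict)
print(wl, satisfied_soft_weight(sample, weights, predict) == 1 - wl)
```

Maximizing the second quantity is therefore the same as minimizing the first, which is what a MaxSAT solver does when the soft constraints carry the weights \(\varOmega (u)\).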

Proof of Theorem 1

The termination of Algorithm 1 is guaranteed by the fact that there always exists an LTL\(_{\mathrm{f}}\) formula \(\varphi \) for which \({\textit{wl}}({\textit{S}}, {\varphi }, {\varOmega })=0\), as indicated by Remark 1. Second, the fact that the returned formula \({\varphi }\) has \({\textit{wl}}({\textit{S}}, {\varphi }, {\varOmega })\le \kappa \) is a consequence of Lemma 1. Finally, the minimality of the formula is a consequence of the fact that Algorithm 1 searches for an LTL\(_{\mathrm{f}}\) formula in increasing order of size. \(\square \)
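The three ingredients of this argument (termination, the loss bound, and minimality) are all visible in the shape of the search loop. The following is a schematic sketch only: `best_of_size` is a hypothetical callback standing in for one MaxSAT call that returns a formula of size n with minimal weighted loss, and `max_size` is a safety cap added for the example.

```python
from typing import Callable, Tuple, TypeVar

F = TypeVar("F")


def minimal_formula(
    best_of_size: Callable[[int], Tuple[F, float]],
    kappa: float,
    max_size: int = 50,
) -> Tuple[F, int]:
    """Return the first formula, in increasing order of size, with loss <= kappa."""
    for n in range(1, max_size + 1):
        formula, loss = best_of_size(n)  # stand-in for solving the size-n instance
        if loss <= kappa:
            return formula, n
    raise RuntimeError("no formula within the size cap meets the loss bound")


# Toy stand-in: pretend the best loss achievable at sizes 1, 2, 3 is known.
losses = {1: 0.4, 2: 0.2, 3: 0.0}
print(minimal_formula(lambda n: (f"phi{n}", losses[n]), kappa=0.25))
```

Because sizes are tried in increasing order and each call is assumed to return the best loss for its size, the first size that meets the bound is the minimal one.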

Appendix 3.2 Proofs from Sect. 4

Proof of Lemma 1 in the case of STL

The only new constraint in \({\varPhi _{{n}}^{\textit{str}}}\) compared to the LTL\(_{\mathrm{f}}\) case is \(0 \le {a_{i}} < {b_{i}}\) for \(i\in \{1,\ldots ,{n}\}\). The constraint \({\varPhi ^{{n}}_{{ {\textit{u} } }}}\) still simply tracks the valuation of the prospective formula on traces \({ {\textit{u} } }\). Thus, all these hard constraints are satisfiable, as explained in Proof 1.

For proving the second part, the weights of the satisfied soft constraints are still equal to \(1-{\textit{wl}}({\textit{S}}, {\varphi }_v, {\varOmega })\). Once again, the proof is similar to Proof 1. \(\square \)

Proof of Theorem 2

The termination of the MaxSMT-based STL learning algorithm under the conditions of Remark 2 is guaranteed by the fact that there always exists an STL formula \({\varphi }\) for which \({\textit{wl}}({\textit{S}}, {\varphi }, {\varOmega })=0\), as discussed at the beginning of Sect. 4. Second, the fact that the returned formula \({\varphi }\) has \({\textit{wl}}({\textit{S}}, {\varphi }, {\varOmega })\le \kappa \) is a consequence of Proof 2. Finally, the minimality of the formula is guaranteed as explained in Proof 1. \(\square \)

Appendix 3.3 Proofs from Sect. 5

Proof of Lemma 2

Toward contradiction, without loss of generality, let us assume that for all \({ {\textit{u} } }\) in \({\textit{S}}\) and formula \({\varphi }\) with \({\textit{s}_{\textit{r}}}({\textit{S}}, {\varphi })>0.5\), we have \({V({{ {\textit{u} } }},{{\varphi }})}=1\). In such a case, \({|{V({{ {\textit{u} } }},{\varphi })}-{b}|}=0\) for \(({ {\textit{u} } }, 1)\in {\textit{S}}\) and \({|{V({{ {\textit{u} } }},{\varphi })}-{b}|}=1\) for \(({ {\textit{u} } }, 0)\in {\textit{S}}\). We can, thus, calculate that \(\sum _{(u,1)\in {\textit{S}}}{|{V({{ {\textit{u} } }},{\varphi })}-{b}|}=0\), \(\sum _{(u,0)\in {\textit{S}}}{|{V({{ {\textit{u} } }},{\varphi })}-{b}|}={|\{({ {\textit{u} } }, 0)\in {\textit{S}}|{b}=0\}|}\), and consequently \({\textit{s}_{\textit{r}}}({\textit{S}}, {\varphi })=0.5\), violating our assumption. \(\square \)

Proof of Theorem 3

First, observe that at each decision node, we can always infer an LTL\(_{\mathrm{f}}\) formula \({\varphi }\) for which \({\textit{s}_{\textit{r}}}({\textit{S}}, {\varphi })\ge \mu \), for any value of \(\mu \). This is because there always exists an LTL\(_{\mathrm{f}}\) formula \({\varphi }\) that produces perfect classification, and for this formula, \({\textit{s}_{\textit{r}}}({\textit{S}},{\varphi })=1\). Second, observe that whenever a split is made during the learning algorithm, sub-samples \({\textit{S}}_1\) and \({\textit{S}}_2\) are both non-empty due to Lemma 2. This implies that the algorithm terminates, since a sample can be split only finitely many times. Now, to ensure that the decision tree \({t}\) achieves \({\textit{l}}({\textit{S}}, {t})\le \kappa \), we use induction over the structure of the decision tree. If \({t}\) is a leaf node \(\textit{true}\) or \({{\,\mathrm{\textit{false}}\,}}\), then \({\textit{l}}({\textit{S}}, {t})\le \kappa \) by the stopping criterion. Now, say that \({t}\) is a decision tree with root \({\varphi }\) and subtrees \({t}_1\) and \({t}_2\), meaning \({\varphi }_{t}= ({\varphi }\wedge {\varphi }_{{t}_1})\vee (\lnot {\varphi }\wedge {\varphi }_{{t}_2})\). Also, say that the sub-samples produced by \({\varphi }\) are \({\textit{S}}_1\) and \({\textit{S}}_2\). By the induction hypothesis, we can say that \({\textit{l}}({\textit{S}}_1, {t}_1)\le \kappa \) and \({\textit{l}}({\textit{S}}_2, {t}_2)\le \kappa \). Now, it is easy to observe that \({\textit{l}}({\textit{S}}_1, ({\varphi }\wedge {\varphi }_{{t}_1}))\le \kappa \) and \({\textit{l}}({\textit{S}}_2, (\lnot {\varphi }\wedge {\varphi }_{{t}_2}))\le \kappa \), since every trace in \({\textit{S}}_1\) satisfies \({\varphi }\) and every trace in \({\textit{S}}_2\) satisfies \(\lnot {\varphi }\). We, thus, have \({\textit{l}}({\textit{S}},{t})={\textit{l}}({\textit{S}}_1\uplus {\textit{S}}_2,({\varphi }\wedge {\varphi }_{{t}_1})\vee (\lnot {\varphi }\wedge {\varphi }_{{t}_2}))\le \kappa \). \(\square \)
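The correspondence between a decision tree \(t\) and the formula \({\varphi }_{t}\) used in the induction can be made concrete with a small sketch. Several things here are assumptions for illustration: a tree is either a Boolean leaf or a tuple of a formula name, a Python predicate standing in for the evaluation of that formula on a trace, and two subtrees; the loss is taken to be the fraction of misclassified examples; and the function names are hypothetical.

```python
def classify(tree, trace):
    """Follow the decision nodes down to a Boolean leaf."""
    if isinstance(tree, bool):
        return tree
    _, predicate, t1, t2 = tree
    return classify(t1 if predicate(trace) else t2, trace)


def to_formula(tree):
    """Render phi_t = (phi & phi_t1) | (!phi & phi_t2) as a string."""
    if isinstance(tree, bool):
        return "true" if tree else "false"
    name, _, t1, t2 = tree
    return f"(({name} & {to_formula(t1)}) | (!{name} & {to_formula(t2)}))"


def loss(tree, sample):
    """Fraction of examples misclassified by the tree (assumed loss function)."""
    return sum(classify(tree, u) != bool(b) for u, b in sample) / len(sample)


# One decision node testing "eventually p", with leaves true and false.
tree = ("Fp", lambda u: any("p" in s for s in u), True, False)
sample = [([{"p"}], 1), ([set()], 0), ([set(), {"p"}], 1)]
print(to_formula(tree), loss(tree, sample))
```

Each trace is routed to exactly one leaf, which is why the loss on \({\textit{S}}\) can be analysed separately on the two sub-samples produced by the root formula.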

Fig. 8 Running time comparison of SAT-flie and MaxSAT-flie

Appendix 4 Additional theoretical observations

We explain here why Problems 1 and 2 are well suited to temporal logic inference from noisy data. Suppose a sample \({\textit{S}}\) is constructed from a reference LTL\(_{\mathrm{f}}\) (or, analogously, STL) formula \(\psi \) (small in size, in principle), i.e., such that \({\textit{l}}({\textit{S}},\psi ) = 0\). Then any minimal LTL\(_{\mathrm{f}}\) formula \({\varphi }\) such that \({\textit{l}}({\textit{S}},{\varphi }) = 0\) satisfies \({|{\varphi }|} \le {|\psi |}\). However, after introducing noise in the sample such that the loss of \(\psi \) is small but nonzero, \({\textit{l}}({\textit{S}}',\psi ) \gtrapprox 0\), and given a minimal formula \({\varphi }'\) such that \({\textit{l}}({\textit{S}}',{\varphi }') = 0\), we have no such guarantee on the size of \({\varphi }'\). Intuitively, the size of \({\varphi }'\) grows as the classification labels of \({\textit{S}}'\) become more random. However, if we have a bound on the noise, i.e., if \({\textit{l}}({\textit{S}}',\psi ) \le \kappa \), then any minimal LTL\(_{\mathrm{f}}\) formula \({\varphi }'_\kappa \) such that \({\textit{l}}({\textit{S}}',{\varphi }'_\kappa ) \le \kappa \) satisfies \({|{\varphi }'_\kappa |} \le {|\psi |}\), since \(\psi \) itself meets the bound. Hence, Problem 1 is well suited to LTL\(_{\mathrm{f}}\) inference from noisy data.

Appendix 5 Experimental results

Figure 8 presents a comparison of the running time of MaxSAT-flie (proposed in this paper) and SAT-flie (proposed in [27]), on each sample of the LTL\(_{\mathrm{f}}\) sample sets.

About this article

Cite this article

Gaglione, JR., Neider, D., Roy, R. et al. MaxSAT-based temporal logic inference from noisy data. Innovations Syst Softw Eng 18, 427–442 (2022).


  • Linear Temporal Logic
  • Signal Temporal Logic
  • Decision tree
  • Specification mining
  • Explainable AI