Machine Learning, Volume 86, Issue 3, pp 295–333

Efficiently identifying deterministic real-time automata from labeled data

Open Access
Article

Abstract

We develop a novel learning algorithm RTI for identifying a deterministic real-time automaton (DRTA) from labeled time-stamped event sequences. The RTI algorithm is based on the current state of the art in deterministic finite-state automaton (DFA) identification, called evidence-driven state-merging (EDSM). In addition to having a DFA structure, a DRTA contains time constraints between occurrences of consecutive events. Although this seems a small difference, we show that the problem of identifying a DRTA is much more difficult than the problem of identifying a DFA: identifying only the time constraints of a DRTA given its DFA structure is already NP-complete. In spite of this additional complexity, we show that RTI is a correct and complete algorithm that converges efficiently (from polynomial time and data) to the correct DRTA in the limit. To the best of our knowledge, this is the first algorithm that can identify a timed automaton model from time-stamped event sequences.
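To make the model concrete, the following is a minimal sketch of a DRTA as a data structure, not the RTI algorithm itself. It assumes that a timed string is a list of (symbol, delay) pairs with integer delays, and that each transition carries a closed delay interval [lo, hi]. The class name `DRTA` and the methods `add` and `accepts` are illustrative choices and do not come from the article.

```python
from dataclasses import dataclass, field


@dataclass
class DRTA:
    """Illustrative deterministic real-time automaton (hypothetical API)."""
    start: int
    accepting: set
    # transitions[state][symbol] -> list of (lo, hi, target) delay guards
    transitions: dict = field(default_factory=dict)

    def add(self, src, symbol, lo, hi, dst):
        """Add a transition that fires on `symbol` when lo <= delay <= hi."""
        if lo > hi:
            raise ValueError("empty delay interval")
        guards = self.transitions.setdefault(src, {}).setdefault(symbol, [])
        # Determinism: guards for the same state and symbol must not overlap.
        for g_lo, g_hi, _ in guards:
            if lo <= g_hi and g_lo <= hi:
                raise ValueError("overlapping delay intervals")
        guards.append((lo, hi, dst))

    def accepts(self, timed_string):
        """Run the automaton on a list of (symbol, delay) pairs."""
        state = self.start
        for symbol, delay in timed_string:
            for lo, hi, dst in self.transitions.get(state, {}).get(symbol, []):
                if lo <= delay <= hi:
                    state = dst
                    break
            else:
                # No guard matched: the timed string is rejected.
                return False
        return state in self.accepting


# Example: an "a" within 5 time units reaches the accepting state 1,
# while a slower "a" (6 to 10 time units) stays in state 0.
drta = DRTA(start=0, accepting={1})
drta.add(0, "a", 0, 5, 1)
drta.add(0, "a", 6, 10, 0)
drta.add(1, "b", 0, 3, 1)
```

With this automaton, `drta.accepts([("a", 3)])` holds, whereas `drta.accepts([("a", 7)])` does not: the same event leads to different states depending only on its delay, which is the behavior that an untimed DFA cannot express without extra states.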

A straightforward alternative to identifying DRTAs is to identify a DFA that models time implicitly, i.e., a DFA that uses different states for different points in time. Such a DFA can be identified by first sampling the timed sequences using a fixed frequency, and subsequently applying EDSM to the resulting non-timed event sequences. We evaluate the performance of both RTI and this sampling approach experimentally on artificially generated data. In these experiments RTI outperforms the sampling approach significantly. Thus, we show that if we obtain data from a real-time system, it is easier to identify a DRTA from this data than to identify an equivalent DFA.
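The sampling step can be sketched as follows. This is one plausible reading of "sampling using a fixed frequency", not necessarily the exact procedure used in the experiments: each delay is replaced by a run of a special tick symbol, one per elapsed sampling period, so that the passage of time becomes visible to an untimed learner. The function name `sample_timed_string`, the `period` parameter, and the tick symbol `"t"` are assumptions made for illustration.

```python
def sample_timed_string(timed_string, period, tick="t"):
    """Convert (symbol, delay) pairs into an untimed symbol sequence.

    Assumption: each full `period` that elapses before an event is encoded
    as one `tick` symbol; any remainder shorter than a period is dropped.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    untimed = []
    for symbol, delay in timed_string:
        untimed.extend([tick] * int(delay // period))
        untimed.append(symbol)
    return untimed


# Example: with period 2, a delay of 5 becomes two ticks and a delay of 3
# becomes one tick.
sampled = sample_timed_string([("a", 5), ("b", 3)], period=2)
# sampled == ["t", "t", "a", "t", "b"]
```

The resulting sequences can be passed to any untimed DFA learner such as EDSM. A smaller period preserves more timing detail but lengthens every sequence, which illustrates why a DFA that models time implicitly needs many more states than the equivalent DRTA.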

Keywords

Timed automata; Real-time automata; Identification in the limit; Supervised learning

Copyright information

© The Author(s) 2011

Authors and Affiliations

  • Sicco Verwer (1)
  • Mathijs de Weerdt (2)
  • Cees Witteveen (2)

  1. Katholieke Universiteit Leuven, Heverlee, Belgium
  2. Delft University of Technology, Delft, The Netherlands
