Efficiently identifying deterministic real-time automata from labeled data
 Sicco Verwer,
 Mathijs de Weerdt,
 Cees Witteveen
Abstract
We develop a novel learning algorithm, RTI, for identifying a deterministic real-time automaton (DRTA) from labeled timestamped event sequences. The RTI algorithm is based on the current state of the art in deterministic finite-state automaton (DFA) identification, called evidence-driven state-merging (EDSM). In addition to having a DFA structure, a DRTA contains time constraints between occurrences of consecutive events. Although this seems a small difference, we show that the problem of identifying a DRTA is much more difficult than the problem of identifying a DFA: identifying only the time constraints of a DRTA given its DFA structure is already NP-complete. In spite of this additional complexity, we show that RTI is a correct and complete algorithm that converges efficiently (from polynomial time and data) to the correct DRTA in the limit. To the best of our knowledge, this is the first algorithm that can identify a timed automaton model from timestamped event sequences.
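To make the notion of a DRTA concrete, the following minimal sketch shows a DFA whose transitions additionally carry a guard on the delay since the previous event. It is an illustration only, not the RTI algorithm: the representation (a dictionary of guarded transitions with inclusive integer bounds), the function name `drta_accepts`, and the toy automaton are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation: (source state, symbol) maps to a list of
# (lo, hi, target state). A guard [lo, hi] bounds the delay since the
# previous event, inclusive on both ends. Determinism requires that the
# guards listed for one (state, symbol) pair do not overlap.
Transitions = Dict[Tuple[int, str], List[Tuple[int, int, int]]]


def drta_accepts(
    transitions: Transitions,
    accepting: Set[int],
    timed_string: List[Tuple[str, int]],
    start: int = 0,
) -> bool:
    """Run a timed string of (symbol, delay) pairs through the automaton."""
    state = start
    for symbol, delay in timed_string:
        for lo, hi, target in transitions.get((state, symbol), []):
            if lo <= delay <= hi:
                state = target
                break
        else:
            # No guard matches this delay, so the string is rejected.
            return False
    return state in accepting


# Toy automaton (an assumption for illustration): an "a" that arrives after
# a delay of 5 to 10 time units leads to the accepting state 1, while a
# faster "a" stays in state 0.
toy_transitions: Transitions = {
    (0, "a"): [(0, 4, 0), (5, 10, 1)],
    (1, "b"): [(0, 10, 0)],
}
toy_accepting = {1}
```

The same symbol sequence can be accepted or rejected depending only on its delays, which is exactly the information a plain DFA over the event symbols cannot express.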
A straightforward alternative to identifying DRTAs is to identify a DFA that models time implicitly, i.e., a DFA that uses different states for different points in time. Such a DFA can be identified by first sampling the timed sequences using a fixed frequency, and subsequently applying EDSM to the resulting non-timed event sequences. We evaluate the performance of both RTI and this sampling approach experimentally on artificially generated data. In these experiments RTI outperforms the sampling approach significantly. Thus, we show that if we obtain data from a real-time system, it is easier to identify a DRTA from this data than to identify an equivalent DFA.
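The sampling step of this alternative can be sketched as follows. The abstract does not specify the encoding, so this is one plausible choice under stated assumptions: delays are non-negative integers, and each elapsed sampling period is encoded as a hypothetical tick symbol `"t"` emitted before the event. The function name `sample_timed_string` is likewise invented for this example.

```python
from typing import List, Tuple


def sample_timed_string(
    timed_string: List[Tuple[str, int]],
    period: int,
    tick: str = "t",
) -> List[str]:
    """Convert (symbol, delay) pairs into an untimed symbol sequence.

    Each full sampling period that elapses before an event becomes one
    tick symbol. Delays shorter than the period are rounded down, so some
    timing information is lost when the period is coarse.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    untimed: List[str] = []
    for symbol, delay in timed_string:
        untimed.extend([tick] * (delay // period))
        untimed.append(symbol)
    return untimed
```

Under this encoding the length of the untimed sequence grows with the delays, so a DFA learned from it needs a separate state for each elapsed tick, which illustrates why the implicit model of time can become large.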
 Journal

Machine Learning
Volume 86, Issue 3, pp 295–333
 Cover Date
 2012-03-01
 DOI
 10.1007/s10994-011-5265-4
 Print ISSN
 0885-6125
 Online ISSN
 1573-0565
 Publisher
 Springer US
 Keywords

 Timed automata
 Real-time automata
 Identification in the limit
 Supervised learning
 Authors

 Sicco Verwer (1)
 Mathijs de Weerdt (2)
 Cees Witteveen (2)
 Author Affiliations

 1. Katholieke Universiteit Leuven, Celestijnenlaan 200a, Box 2402, 3001, Heverlee, Belgium
 2. Delft University of Technology, Mekelweg 4, 2826 CD, Delft, The Netherlands