Abstract
We develop a novel learning algorithm, RTI, for identifying a deterministic real-time automaton (DRTA) from labeled time-stamped event sequences. RTI builds on evidence-driven state-merging (EDSM), the current state of the art in deterministic finite-state automaton (DFA) identification. In addition to having a DFA structure, a DRTA contains time constraints between occurrences of consecutive events. Although this seems a small difference, we show that the problem of identifying a DRTA is much more difficult than the problem of identifying a DFA: identifying only the time constraints of a DRTA given its DFA structure is already NP-complete. In spite of this additional complexity, we show that RTI is a correct and complete algorithm that converges efficiently (using polynomial time and data) to the correct DRTA in the limit. To the best of our knowledge, this is the first algorithm that can identify a timed automaton model from time-stamped event sequences.
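To make the object being identified concrete, the following is a minimal sketch of a DRTA as an acceptor of timed strings. It is an illustration only, not the RTI algorithm. The class name `DRTA`, the closed integer delay intervals, and the choice to reject when no guard matches are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding (an assumption, not taken from the article):
# (state, symbol) -> list of (lo, hi, target), where [lo, hi] is a closed
# interval constraining the delay since the previous event.
Transitions = Dict[Tuple[int, str], List[Tuple[int, int, int]]]


class DRTA:
    """Toy deterministic real-time automaton over (symbol, delay) pairs."""

    def __init__(self, transitions: Transitions, accepting: Set[int], start: int = 0):
        self.transitions = transitions
        self.accepting = accepting
        self.start = start

    def accepts(self, timed_string: List[Tuple[str, int]]) -> bool:
        state = self.start
        for symbol, delay in timed_string:
            for lo, hi, target in self.transitions.get((state, symbol), []):
                # Determinism: guards for one (state, symbol) pair are assumed
                # disjoint, so at most one of them can match the delay.
                if lo <= delay <= hi:
                    state = target
                    break
            else:
                # Assumption: a missing transition means rejection.
                return False
        return state in self.accepting


# Same DFA structure, but the delay decides where an "a" leads.
toy = DRTA(
    transitions={
        (0, "a"): [(0, 5, 1), (6, 10, 0)],
        (1, "b"): [(0, 3, 0)],
    },
    accepting={1},
)
```

In this toy automaton, an `a` arriving within 5 time units moves to the accepting state, while a slower `a` stays in the start state. The guards are the part that a plain DFA does not have.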
A straightforward alternative to identifying DRTAs is to identify a DFA that models time implicitly, i.e., a DFA that uses different states for different points in time. Such a DFA can be identified by first sampling the timed sequences using a fixed frequency, and subsequently applying EDSM to the resulting non-timed event sequences. We evaluate the performance of both RTI and this sampling approach experimentally on artificially generated data. In these experiments RTI outperforms the sampling approach significantly. Thus, we show that if we obtain data from a real-time system, it is easier to identify a DRTA from this data than to identify an equivalent DFA.
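The sampling step can be sketched as follows. The abstract does not specify the encoding, so this example assumes one plausible scheme: each elapsed sampling period is emitted as a hypothetical `tick` symbol before the event itself. The function name `sample_timed_sequence` and its parameters are likewise illustrative assumptions.

```python
from typing import List, Tuple


def sample_timed_sequence(
    timed_string: List[Tuple[str, float]], period: float, tick: str = "tick"
) -> List[str]:
    """Turn (symbol, delay) pairs into an untimed symbol sequence.

    Assumed encoding: one `tick` per full sampling period that elapsed
    before the event, followed by the event symbol.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    untimed: List[str] = []
    for symbol, delay in timed_string:
        # Number of complete sampling periods contained in the delay.
        untimed.extend([tick] * int(delay // period))
        untimed.append(symbol)
    return untimed


sampled = sample_timed_sequence([("a", 3), ("b", 7)], period=2)
```

Under this encoding the length of the untimed sequence grows with the delays and shrinks with the period, so a DFA identified from such data has to spend states on counting ticks. A coarse period loses timing information, since delays shorter than one period produce no tick at all.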
Additional information
Editor: Nicolo Cesa-Bianchi.
The main part of this research was performed while the first author was a PhD student at Delft University of Technology. It was supported and funded by the Dutch Ministry of Economic Affairs under the SENTER program.
Rights and permissions
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Verwer, S., de Weerdt, M. & Witteveen, C. Efficiently identifying deterministic real-time automata from labeled data. Mach Learn 86, 295–333 (2012). https://doi.org/10.1007/s10994-011-5265-4
DOI: https://doi.org/10.1007/s10994-011-5265-4