Simplifying Regular Expressions
We consider the efficient simplification of regular expressions and suggest a quantitative comparison of simplification heuristics. To this end, we propose a new normal form for regular expressions, which outperforms previous heuristics while still being computable in linear time. This allows us to determine an exact bound for the relation between the two prevalent measures of regular expression size: alphabetic width and reverse polish notation length. In addition, we show that every regular expression of alphabetic width \(n\) can be converted into a nondeterministic finite automaton with ε-transitions of size at most \(4\frac{2}{5}n+1\), and prove this bound to be optimal. This answers a question posed by Ilie and Yu, who had obtained lower and upper bounds of \(4n-1\) and \(9n-\frac{1}{2}\), respectively. For reverse polish notation length as input size measure, an optimal bound was recently determined by Gulan and Fernau. We prove that, under mild restrictions, their construction is also optimal when taking alphabetic width as input size measure.
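The two size measures named in the abstract can be illustrated with a small sketch. It assumes the definitions commonly used in the literature: alphabetic width counts the occurrences of alphabet symbols, and reverse polish notation length counts every node of the syntax tree (symbols and operators alike). The class names `Sym`, `Alt`, `Cat`, and `Star` are hypothetical and chosen only for this illustration; ε and ∅ are omitted for brevity.

```python
from dataclasses import dataclass


# Hypothetical syntax-tree node types, introduced only for this illustration.
@dataclass(frozen=True)
class Sym:
    char: str  # one occurrence of an alphabet symbol


@dataclass(frozen=True)
class Alt:
    left: object  # union: left + right
    right: object


@dataclass(frozen=True)
class Cat:
    left: object  # concatenation: left . right
    right: object


@dataclass(frozen=True)
class Star:
    inner: object  # Kleene star: inner*


def alphabetic_width(r):
    """Count occurrences of alphabet symbols (operators contribute nothing)."""
    if isinstance(r, Sym):
        return 1
    if isinstance(r, Star):
        return alphabetic_width(r.inner)
    return alphabetic_width(r.left) + alphabetic_width(r.right)


def rpn_length(r):
    """Count all syntax-tree nodes: symbols plus unary and binary operators."""
    if isinstance(r, Sym):
        return 1
    if isinstance(r, Star):
        return 1 + rpn_length(r.inner)
    return 1 + rpn_length(r.left) + rpn_length(r.right)


# (a + b)* a
e1 = Cat(Star(Alt(Sym("a"), Sym("b"))), Sym("a"))
# (a*)* b, which contains a redundant nested star
e2 = Cat(Star(Star(Sym("a"))), Sym("b"))

print(alphabetic_width(e1), rpn_length(e1))  # 3 6
print(alphabetic_width(e2), rpn_length(e2))  # 2 5
```

The second expression shows why the two measures can diverge: the redundant star adds to the reverse polish notation length but leaves the alphabetic width unchanged, which is the kind of redundancy a simplification heuristic would aim to remove.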
- 2. Baader, F., Nipkow, T.: Term Rewriting and All That. Cambridge University Press, Cambridge (1998)
- 3. Bille, P., Thorup, M.: Faster regular expression matching. In: ICALP 2009. LNCS, vol. 5555, pp. 171–182. Springer, Heidelberg (2009)
- 11. Gelade, W., Neven, F.: Succinctness of the complement and intersection of regular expressions. In: Symposium on Theoretical Aspects of Computer Science. Number 08001 in Dagstuhl Seminar Proceedings, pp. 325–336 (2008)
- 14. Gulan, S., Fernau, H.: An optimal construction of finite automata from regular expressions. In: FSTTCS 2008. Number 08004 in Dagstuhl Seminar Proceedings, pp. 211–222 (2008)
- 17. Meyer, A.R., Stockmeyer, L.J.: The equivalence problem for regular expressions with squaring requires exponential space. In: FOCS 1972, pp. 125–129. IEEE Computer Society, Los Alamitos (1972)