La Reconnaissance des Facteurs d'un Langage Fini dans un Texte en Temps Linéaire - Résumé - [Recognizing the factors of a finite language in a text in linear time - Abstract]

  • Jean-Claude Spehner
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 317)

Abstract

We first give an on-line construction of a transducer F(L) which recognizes all the factors of a finite language L and locates each factor as a factor of a word of L. F(L) can be half the size of the partial automaton of A. Blumer, J. Blumer, D. Haussler, R. McConnell and A. Ehrenfeucht, which recognizes the same words. However, the construction of F(L) takes time O(∥L∥·(|A| + min(|L|, lgmax))) rather than O(∥L∥), where |L| and |A| are respectively the cardinalities of L and of its alphabet A, ∥L∥ is the sum of the lengths of the words of L, and lgmax is the maximal length of these words.
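To make concrete what "recognizing a factor and locating it in a word of L" means, here is a minimal Python sketch. It is not the construction of F(L): it builds a naive suffix trie, which is much larger and slower to build than the bounds quoted above. The names `build_factor_trie` and `locate` are hypothetical and chosen only for this illustration.

```python
def build_factor_trie(words):
    """Naive suffix trie of a finite language (NOT the paper's F(L)).

    Every path from the root spells a factor of some word. Each node
    stores one position (word index, start offset) where that factor occurs.
    """
    root = {"next": {}, "pos": None}
    for w_idx, word in enumerate(words):
        for start in range(len(word)):
            node = root
            # Insert the suffix word[start:] letter by letter.
            for letter in word[start:]:
                child = node["next"].get(letter)
                if child is None:
                    # First time this factor is seen: remember where.
                    child = {"next": {}, "pos": (w_idx, start)}
                    node["next"][letter] = child
                node = child
    return root


def locate(root, u):
    """Return (word index, start) of one occurrence of the non-empty
    word u as a factor, or None if u is not a factor of the language."""
    node = root
    for letter in u:
        node = node["next"].get(letter)
        if node is None:
            return None
    return node["pos"]


if __name__ == "__main__":
    trie = build_factor_trie(["abcab", "bcd"])
    print(locate(trie, "cab"))  # a factor of the first word
    print(locate(trie, "da"))   # not a factor
```

The trie answers the same question as F(L) (is u a factor, and where?), but its size can grow quadratically in the word lengths, which is exactly the cost a compact factor transducer avoids.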

We then build a second transducer F'(L), which has the same states as F(L) and which finds, for each factor u of L and each letter a of A such that ua is not a factor of L, the longest right factor (suffix) of ua that is a factor of L. F'(L) generalizes the transducer we introduced in [Spe86] for a single word. Computing F'(L) takes time O(∥L∥·|A|).
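The quantity computed by F'(L) can be illustrated by brute force. The sketch below is an assumption-laden illustration, not the paper's method: it enumerates all factors into a set and then tries the suffixes of ua from longest to shortest. The function names `factors_of` and `longest_suffix_factor` are hypothetical.

```python
def factors_of(words):
    """Set of all factors (substrings) of the words, including the empty word."""
    factors = {""}
    for word in words:
        for i in range(len(word)):
            for j in range(i + 1, len(word) + 1):
                factors.add(word[i:j])
    return factors


def longest_suffix_factor(factors, u, a):
    """Longest right factor (suffix) of u + a that belongs to `factors`.

    Brute force: suffixes are tried from longest to shortest. The empty
    word is always in `factors`, so the loop always returns.
    """
    ua = u + a
    for start in range(len(ua) + 1):
        if ua[start:] in factors:
            return ua[start:]


if __name__ == "__main__":
    facs = factors_of(["abcab", "bcd"])
    # "abc" is a factor, "abcd" is not; the fallback target is "bcd".
    print(longest_suffix_factor(facs, "abc", "d"))
```

A transducer such as F'(L) would return this answer by following a single precomputed transition, whereas the brute-force version re-examines up to |ua| + 1 suffixes on every call.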

Using the transducers F(L) and F'(L), we obtain an algorithm which finds all the occurrences of the factors of L in a text in time linear in the length of the text and independently of the cardinality of the alphabet of this text.
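The shape of such a text scan can be sketched in Python as follows. This is a naive stand-in, assuming a substring test in place of F(L) and a left-trimming loop in place of F'(L), so it is not linear-time as written. The function name `longest_factor_at_each_position` is hypothetical.

```python
def longest_factor_at_each_position(words, text):
    """For each position of `text`, return the longest factor of the
    language `words` that ends at that position.

    Naive stand-in for the two transducers: `is_factor` plays the role of
    F(L), and the trimming loop plays the role of F'(L).
    """
    def is_factor(v):
        return any(v in word for word in words)

    current = ""   # longest factor ending at the previous position
    result = []
    for letter in text:
        current += letter
        # Extending failed: drop letters on the left until a factor remains.
        while current and not is_factor(current):
            current = current[1:]
        result.append(current)
    return result


if __name__ == "__main__":
    print(longest_factor_at_each_position(["ab", "bc"], "xabcb"))
    # prints ['', 'a', 'ab', 'bc', 'b']
```

Trimming on the left is sufficient because every factor of a factor is again a factor, so the longest factor ending at one position is always a suffix of the previous one extended by the new letter. Every other factor occurrence ending at a position is a suffix of the one reported there.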

This algorithm can be used in computing to find and modify a family of identifiers in a program. Linguists can also use it to determine all the words of the same family or related to the same concept; paronyms can be eliminated.


References

  1. [AC75]
    A.V. Aho and M.J. Corasick, Efficient string matching: an aid to bibliographic search, Comm. ACM 18(6) (1975) 333–340.
  2. [Aho80]
    A.V. Aho, Pattern matching in strings, in: R.V. Book, ed., Formal Language Theory (Academic Press, New York, 1980) 325–347.
  3. [Ber79]
    J. Berstel, Transductions and Context-free Languages (Teubner, Stuttgart, 1979).
  4. [BBEHC83]
    A. Blumer, J. Blumer, A. Ehrenfeucht, D. Haussler and R. McConnell, Linear size finite automata for the set of all subwords of a word: an outline of results, Bull. EATCS 21 (1983) 12–20.
  5. [BBEHC84a]
    A. Blumer, J. Blumer, A. Ehrenfeucht, D. Haussler and R. McConnell, Building the minimal DFA for the set of all subwords of a word on-line in linear time, Proc. ICALP 1984, Lecture Notes in Computer Science 172 (Springer, Berlin, 1984) 109–118.
  6. [BBEHC84b]
    A. Blumer, J. Blumer, A. Ehrenfeucht, D. Haussler and R. McConnell, Building a complete inverted file for a set of text files in linear time, Proc. 16th ACM Symposium on the Theory of Computing (ACM, New York, 1984) 349–358.
  7. [BBHECS85]
    A. Blumer, J. Blumer, D. Haussler, A. Ehrenfeucht, M.T. Chen and J. Seiferas, The smallest automaton recognizing the subwords of a text, Theoret. Comput. Sci. 40 (1985) 31–55.
  8. [BBHCE87]
    A. Blumer, J. Blumer, D. Haussler, R. McConnell and A. Ehrenfeucht, Complete inverted files for efficient text retrieval and analysis, J. ACM 34(3) (1987) 578–595.
  9. [BM77]
    R.S. Boyer and J.S. Moore, A fast string searching algorithm, Comm. ACM 20(10) (1977) 762–772.
  10. [Cre76]
    E.M. McCreight, A space-economical suffix-tree construction algorithm, J. ACM 23(2) (1976) 262–272.
  11. [CS74]
    M.T. Chen and J. Seiferas, Efficient and elegant subword-tree construction, Proc. NATO Advanced Research Workshop on Combinatorial Algorithms on Words, Maratea, Italy (1984) 97–107.
  12. [Cro84]
    M. Crochemore, Optimal factor transducers, Proc. NATO Advanced Research Workshop on Combinatorial Algorithms on Words, Maratea, Italy (1984) 31–43.
  13. [Cro86a]
    M. Crochemore, Transducers and repetitions, Theoret. Comput. Sci. 45(1) (1986) 63–86.
  14. [Cro86b]
    M. Crochemore, Computing LCF in linear time, Bull. EATCS 30 (1986) 57–61.
  15. [Eil74]
    S. Eilenberg, Automata, Languages and Machines (Academic Press, New York, 1974).
  16. [KMP77]
    D.E. Knuth, J.H. Morris and V.R. Pratt, Fast pattern-matching in strings, SIAM J. Comput. 6(2) (1977) 323–350.
  17. [Lot83]
    M. Lothaire, Combinatorics on Words (Addison-Wesley, Reading, MA, 1983).
  18. [MP70]
    J.H. Morris and V.R. Pratt, A linear pattern-matching algorithm, Tech. Rept. 40, Computing Center, University of California, Berkeley, CA (1970).
  19. [Ner58]
    A. Nerode, Linear automaton transformations, Proc. Amer. Math. Soc. 9 (1958) 541–544.
  20. [Spe86]
    J.C. Spehner, La reconnaissance des facteurs d'un mot dans un texte, Theoret. Comput. Sci. 48 (1986) 35–52.
  21. [Spe87]
    J.C. Spehner, Sur les automates qui reconnaissent une famille de langages, Publication du Labo. Math-Info no 45, Université de Haute Alsace (Mulhouse).
  22. [Wei83]
    P. Weiner, Linear pattern-matching algorithms, Proc. 14th IEEE Ann. Symp. on Switching and Automata Theory, Iowa, U.S.A. (1983) 1–11.

Copyright information

© Springer-Verlag Berlin Heidelberg 1988

Authors and Affiliations

  • Jean-Claude Spehner (1)
  1. F.S.T., Université de Haute Alsace, Mulhouse Cédex, France
