Generating Non-plagiaristic Markov Sequences with Max Order Sampling

Part of the Lecture Notes in Morphogenesis book series (LECTMORPH)


Plagiarism is usually studied from an analysis viewpoint: how to detect that a text contains copies of another. In this chapter we study plagiarism from the generation viewpoint: how to generate a text with a guarantee of non-plagiarism. More precisely, we address the problem of Markov sequence generation under forbidden k-gram constraints, in two steps. In the first step, we show that, given a Markov transition matrix and a set of forbidden k-grams, we can efficiently build an automaton that represents exactly the language of all sequences that can be generated by the Markov model and that contain none of the forbidden k-grams. Both the size of the automaton and the time needed to build it are bounded by the size of the set of forbidden k-grams. A simple walk through this automaton solves the algebraic problem, i.e., the problem in which all non-zero probabilities are treated as uniform. In the second step, we show that the automaton can be extended so that a belief propagation scheme can exploit it to produce a perfect sampling of all solutions.
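
The two steps can be illustrated with a minimal Python sketch. It is not the construction given in the chapter: it assumes a first-order Markov model stored as nested dictionaries, a fixed sequence length n, and a textbook Aho-Corasick automaton over the forbidden k-grams, followed by a backward sum-product pass to sample exactly among the allowed sequences. The names build_automaton, step and sample, and the dictionary layout of init and trans, are hypothetical and chosen only for this illustration.

```python
import random
from collections import deque

def build_automaton(forbidden):
    """Aho-Corasick automaton over the forbidden k-grams (illustrative)."""
    goto = [{}]    # goto[s][symbol] -> child state in the trie
    bad = [False]  # bad[s]: a forbidden k-gram ends in state s
    for gram in forbidden:
        s = 0
        for sym in gram:
            if sym not in goto[s]:
                goto.append({})
                bad.append(False)
                goto[s][sym] = len(goto) - 1
            s = goto[s][sym]
        bad[s] = True
    fail = [0] * len(goto)
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        s = queue.popleft()
        for sym, t in goto[s].items():
            f = fail[s]
            while f and sym not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(sym, 0)
            bad[t] = bad[t] or bad[fail[t]]
            queue.append(t)
    return goto, fail, bad

def step(goto, fail, s, sym):
    """Follow failure links until sym can be read, then advance."""
    while s and sym not in goto[s]:
        s = fail[s]
    return goto[s].get(sym, 0)

def sample(init, trans, forbidden, n, rng=random):
    """Draw one length-n sequence with probability proportional to its
    Markov probability, among sequences containing no forbidden k-gram."""
    goto, fail, bad = build_automaton(forbidden)
    memo = {}

    def beta(i, a, s):
        # Backward message: probability mass of all allowed completions
        # once i symbols are emitted, ending in symbol a and state s.
        if i == n:
            return 1.0
        key = (i, a, s)
        if key not in memo:
            total = 0.0
            for b, p in trans.get(a, {}).items():
                t = step(goto, fail, s, b)
                if p > 0 and not bad[t]:
                    total += p * beta(i + 1, b, t)
            memo[key] = total
        return memo[key]

    seq, a, s = [], None, 0
    for i in range(n):
        dist = init if i == 0 else trans.get(a, {})
        options = []
        for b, p in dist.items():
            t = step(goto, fail, s, b)
            if p > 0 and not bad[t]:
                w = p * beta(i + 1, b, t)
                if w > 0:
                    options.append((b, t, w))
        total = sum(w for _, _, w in options)
        if total <= 0:
            raise ValueError("no sequence satisfies the constraints")
        r = rng.random() * total
        a, s = options[-1][0], options[-1][1]  # fallback for rounding
        for b, t, w in options:
            r -= w
            if r < 0:
                a, s = b, t
                break
        seq.append(a)
    return seq
```

For instance, with init = {"a": 0.5, "b": 0.5} and trans = {"a": {"a": 0.7, "b": 0.3}, "b": {"a": 0.6, "b": 0.4}}, the call sample(init, trans, ["aab", "bbb"], 8) returns a list of eight symbols in which neither "aab" nor "bbb" occurs. The recursive backward pass is written for readability and suits short sequences only; a longer n would call for an iterative table.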


Keywords: Markov Chain, Belief Propagation, Maximum Order, Factor Graph, Markov State



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. UPMC Paris 6, UMR 7606, LIP6, Paris, France
  2. Sony CSL, 6 rue Amyot, Paris, France
