Generating Non-plagiaristic Markov Sequences with Max Order Sampling

  • Chapter
Creativity and Universality in Language

Part of the book series: Lecture Notes in Morphogenesis (LECTMORPH)

Abstract

Plagiarism is usually studied from an analysis viewpoint: how to detect that a text contains copies of another one. In this chapter we study plagiarism from the generation viewpoint: how to generate a text with a guarantee of non-plagiarism. More precisely, we address the problem of Markov sequence generation with forbidden k-gram constraints, in two steps. In the first step, we show that, given a Markov transition matrix and a set of forbidden k-grams, we can efficiently build an automaton that represents exactly the language of all sequences that can be generated from the Markov model and that do not contain any of the k-grams. The size of the automaton is bounded by the size of the forbidden k-grams, and so is the time needed to build it. A simple walk through this automaton solves the algebraic problem (i.e. the problem in which all non-zero probabilities are considered uniform). In the second step, we show that the automaton can be extended so that a belief propagation scheme can exploit it to produce a perfect sampling of all the solutions.
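
The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' construction: it assumes a first-order Markov chain over single-character symbols, uses a generic Aho-Corasick-style automaton to track forbidden strings, and replaces the chapter's belief propagation scheme with plain backward message passing along the sequence. The function names `build_forbidden_automaton` and `sample_avoiding`, and the toy `init` and `trans` tables, are hypothetical and chosen only for illustration.

```python
import random
from collections import deque
from functools import lru_cache


def build_forbidden_automaton(forbidden, alphabet):
    """Aho-Corasick-style DFA over `alphabet` (illustrative, not the chapter's automaton).

    Returns (delta, bad): delta[state][symbol] is the next state, and
    bad[state] is True when the text read so far ends with a forbidden string.
    """
    goto, bad = [{}], [False]
    for word in forbidden:  # build the trie of forbidden strings
        s = 0
        for a in word:
            if a not in goto[s]:
                goto.append({})
                bad.append(False)
                goto[s][a] = len(goto) - 1
            s = goto[s][a]
        bad[s] = True
    fail = [0] * len(goto)
    delta = [{} for _ in goto]
    queue = deque()
    for a in alphabet:  # depth-1 states fall back to the root
        t = goto[0].get(a, 0)
        delta[0][a] = t
        if t:
            queue.append(t)
    while queue:  # breadth-first, so shallower states are already complete
        s = queue.popleft()
        bad[s] = bad[s] or bad[fail[s]]
        for a in alphabet:
            if a in goto[s]:
                t = goto[s][a]
                fail[t] = delta[fail[s]][a]
                delta[s][a] = t
                queue.append(t)
            else:
                delta[s][a] = delta[fail[s]][a]
    return delta, bad


def sample_avoiding(init, trans, forbidden, n, rng=random):
    """Draw one length-n sequence (n >= 1) from the chain (init, trans),
    conditioned on containing none of the strings in `forbidden`."""
    alphabet = sorted(init)
    delta, bad = build_forbidden_automaton(forbidden, alphabet)

    @lru_cache(maxsize=None)
    def beta(i, a, q):
        # Backward message: probability mass of all allowed completions of
        # positions i+1..n-1, given symbol a and automaton state q at position i.
        if i == n - 1:
            return 1.0
        return sum(w for _, _, w in successors(i, a, q))

    def successors(i, a, q):
        # Allowed (symbol, state, weight) choices for position i+1.
        out = []
        for b in alphabet:
            p, q2 = trans.get(a, {}).get(b, 0.0), delta[q][b]
            if p > 0 and not bad[q2]:
                out.append((b, q2, p * beta(i + 1, b, q2)))
        return out

    def draw(cands):
        cands = [c for c in cands if c[2] > 0]
        if not cands:
            raise ValueError("no allowed sequence of this length")
        b, q2, _ = rng.choices(cands, weights=[c[2] for c in cands])[0]
        return b, q2

    first = [(a, delta[0][a], init[a] * beta(0, a, delta[0][a]))
             for a in alphabet if init[a] > 0 and not bad[delta[0][a]]]
    a, q = draw(first)
    seq = [a]
    for i in range(n - 1):
        a, q = draw(successors(i, a, q))
        seq.append(a)
    return "".join(seq)


if __name__ == "__main__":
    init = {"a": 0.5, "b": 0.5}  # toy initial distribution (assumption)
    trans = {"a": {"a": 0.9, "b": 0.1}, "b": {"a": 0.5, "b": 0.5}}
    print(sample_avoiding(init, trans, ["aaa", "bb"], 10, random.Random(0)))
```

Weighting each candidate by its transition probability times the backward message means every allowed sequence is drawn with its Markov probability renormalised over the allowed set, so no rejection step is needed. The memoised recursion is only suitable for short sequences; an iterative backward pass would be the natural replacement for longer ones.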

Corresponding author

Correspondence to Alexandre Papadopoulos.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Papadopoulos, A., Pachet, F., Roy, P. (2016). Generating Non-plagiaristic Markov Sequences with Max Order Sampling. In: Degli Esposti, M., Altmann, E., Pachet, F. (eds) Creativity and Universality in Language. Lecture Notes in Morphogenesis. Springer, Cham. https://doi.org/10.1007/978-3-319-24403-7_6

  • DOI: https://doi.org/10.1007/978-3-319-24403-7_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24401-3

  • Online ISBN: 978-3-319-24403-7

  • eBook Packages: Social Sciences (R0)
