Abstract
Plagiarism is usually studied from an analysis viewpoint: how to detect that a text contains copies of another one. In this chapter we study plagiarism from the generation viewpoint: how to generate a text with a guarantee of non-plagiarism. More precisely, we address the problem of Markov sequence generation with forbidden k-gram constraints. We address this problem in two steps. In the first step, we show that, given a Markov transition matrix and a set of forbidden k-grams, we can efficiently build an automaton that represents exactly the language of all sequences that can be generated from the Markov model and that contain none of the forbidden k-grams. The size of the automaton is bounded by the size of the set of forbidden k-grams, and so is the time needed to build it. This automaton can be used to solve the algebraic problem (i.e. treating all non-zero probabilities as uniform) by a simple walk. In the second step, we show that the automaton can be extended so that it can be exploited by a belief propagation scheme, in order to obtain perfect sampling of all the solutions.
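The two steps can be illustrated with a minimal sketch. It is not the chapter's construction: it assumes an Aho-Corasick-style automaton for the forbidden k-grams, a first-order Markov model given as plain dictionaries, and a fixed sequence length `n`. The names `build_avoidance_automaton`, `sample_avoiding`, `init`, and `trans` are hypothetical. Backward messages over pairs (last symbol, automaton state) play the role of the belief propagation pass, and a forward pass then samples from the Markov distribution conditioned on avoiding every forbidden k-gram.

```python
import random
from collections import deque


def build_avoidance_automaton(alphabet, forbidden):
    """Aho-Corasick-style DFA over `alphabet` (an assumption, not the chapter's automaton).

    Returns (delta, bad): delta[q][a] is the next state, bad[q] is True when
    the text read so far ends with a forbidden k-gram.
    """
    goto, bad = [{}], [False]
    for gram in forbidden:                      # insert each k-gram into a trie
        s = 0
        for a in gram:
            if a not in goto[s]:
                goto.append({})
                bad.append(False)
                goto[s][a] = len(goto) - 1
            s = goto[s][a]
        bad[s] = True
    fail = [0] * len(goto)
    delta = [dict() for _ in goto]
    queue = deque()
    for a in alphabet:                          # transitions out of the root
        t = goto[0].get(a, 0)
        delta[0][a] = t
        if t:
            queue.append(t)
    while queue:                                # breadth-first completion
        s = queue.popleft()
        bad[s] = bad[s] or bad[fail[s]]         # a suffix may be forbidden too
        for a in alphabet:
            if a in goto[s]:
                t = goto[s][a]
                fail[t] = delta[fail[s]][a]
                delta[s][a] = t
                queue.append(t)
            else:
                delta[s][a] = delta[fail[s]][a]
    return delta, bad


def sample_avoiding(alphabet, init, trans, forbidden, n, rng=random):
    """Sample a length-n Markov sequence that contains no forbidden k-gram."""
    delta, bad = build_avoidance_automaton(alphabet, forbidden)
    states = range(len(delta))
    # beta[i][(a, q)]: probability mass of all valid completions after
    # position i, given symbol a at position i and automaton state q.
    beta = [dict() for _ in range(n)]
    for a in alphabet:
        for q in states:
            beta[n - 1][(a, q)] = 0.0 if bad[q] else 1.0
    for i in range(n - 2, -1, -1):              # backward message pass
        for a in alphabet:
            for q in states:
                beta[i][(a, q)] = 0.0 if bad[q] else sum(
                    trans[a].get(b, 0.0) * beta[i + 1][(b, delta[q][b])]
                    for b in alphabet
                )
    weights = [init.get(a, 0.0) * beta[0][(a, delta[0][a])] for a in alphabet]
    if sum(weights) == 0:
        raise ValueError("no sequence of this length satisfies the constraints")
    a = rng.choices(alphabet, weights)[0]       # forward sampling pass
    q = delta[0][a]
    seq = [a]
    for i in range(1, n):
        weights = [trans[a].get(b, 0.0) * beta[i][(b, delta[q][b])] for b in alphabet]
        a = rng.choices(alphabet, weights)[0]
        q = delta[q][a]
        seq.append(a)
    return seq
```

For instance, with `alphabet = ["a", "b"]`, uniform transitions, and forbidden 2-grams `["aa", "bb"]`, every sampled sequence alternates between the two symbols. The table of messages has one entry per position, symbol, and automaton state, which is convenient for a sketch but makes no attempt to match the size bound stated in the abstract.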
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Papadopoulos, A., Pachet, F., Roy, P. (2016). Generating Non-plagiaristic Markov Sequences with Max Order Sampling. In: Degli Esposti, M., Altmann, E., Pachet, F. (eds) Creativity and Universality in Language. Lecture Notes in Morphogenesis. Springer, Cham. https://doi.org/10.1007/978-3-319-24403-7_6
DOI: https://doi.org/10.1007/978-3-319-24403-7_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24401-3
Online ISBN: 978-3-319-24403-7
eBook Packages: Social Sciences, Social Sciences (R0)