Music Corpus Analysis Using Unwords

  • Darrell Conklin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11502)


Discovering patterns reoccurring within a collection of pieces is a fundamental type of music corpus analysis. The inverted task is to discover patterns that are surprisingly infrequent in a corpus, including completely absent patterns or unwords. The key issue in mining unwords is evaluating whether a specific pattern is merely statistically absent from the corpus, or is prohibited in the style exemplified by the corpus. This paper describes a statistical method for evaluating unwords and applies it to reveal interesting unwords for counterpoint and chord sequences.
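As an illustration only, the idea of weighing an absent pattern against chance can be sketched as follows. This is not the statistical method of the paper, which the abstract does not specify. The sketch assumes a corpus given as sequences of discrete symbols (for example interval or chord labels), a hypothetical helper `score_unwords`, an expected count estimated from the counts of the pattern's longest prefix, longest suffix and shared infix (a Markov-style estimate), and a Poisson approximation for the probability of observing zero occurrences.

```python
from collections import Counter
import math


def ngram_counts(sequences, max_n):
    """Count every contiguous n-gram (1 <= n <= max_n) over all sequences."""
    counts = Counter()
    for seq in sequences:
        for n in range(1, max_n + 1):
            for i in range(len(seq) - n + 1):
                counts[tuple(seq[i:i + n])] += 1
    return counts


def score_unwords(sequences, n):
    """Score absent n-grams (n >= 2) whose longest prefix and suffix both occur.

    Returns (word, expected_count, p_zero) tuples, most surprising first.
    ASSUMPTION: expected count = c(prefix) * c(suffix) / c(infix), and the
    chance of zero occurrences is approximated by exp(-expected_count).
    """
    counts = ngram_counts(sequences, n)
    alphabet = sorted({sym for seq in sequences for sym in seq})
    total_symbols = sum(len(seq) for seq in sequences)
    scored = []
    for prefix in [w for w in counts if len(w) == n - 1]:
        for sym in alphabet:
            word = prefix + (sym,)
            suffix = word[1:]
            # Keep only absent words whose longest suffix is present.
            if counts[word] > 0 or counts[suffix] == 0:
                continue
            infix = word[1:-1]
            # For n == 2 the infix is empty, so normalise by corpus size.
            denom = counts[infix] if infix else total_symbols
            expected = counts[prefix] * counts[suffix] / denom
            scored.append((word, expected, math.exp(-expected)))
    # A small p_zero means the absence is unlikely to be due to chance.
    return sorted(scored, key=lambda item: item[2])
```

Calling `score_unwords(corpus, 3)` on a list of symbol sequences returns the absent three-symbol patterns, ordered so that those least likely to be absent by chance come first. Under these assumptions, a pattern with a high expected count but no occurrences is a candidate for being prohibited by the style rather than merely unobserved.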


Keywords: Unwords, Music analysis, Statistics, Pattern discovery



Special thanks to Kerstin Neubarth for valuable comments on the manuscript.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Computer Science and Artificial Intelligence, University of the Basque Country UPV/EHU, San Sebastian, Spain
  2. IKERBASQUE, Basque Foundation for Science, Bilbao, Spain
