Abstract
We analyse the statistical properties of a database of musical notes for the purpose of designing an information retrieval system as part of the MusiFind project. To reduce the amount of musical information, we convert the database to the intervals between notes, which makes the database easier to search. We also investigate a further simplification, creating equivalence classes of musical intervals, which additionally increases the resilience of searches to errors in the query. The Zipf, Zipf-Mandelbrot, Generalized Waring (GW) and Generalized Inverse Gaussian-Poisson (GIGP) distributions are tested against these various representations, with the GIGP distribution providing the best overall fit for the data. There are many similarities with text databases, especially those with short bibliographic records. There are also some differences, particularly in the highest frequency intervals, which occur with a much lower frequency than the highest frequency “stopwords” in a text database. This provides evidence to support the hypothesis that traditional text retrieval methods will work for a music database.
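As a rough illustration of the representations described in the abstract, the following Python sketch converts a melody to intervals, maps the intervals to coarse equivalence classes, and tabulates a rank-frequency list of the kind to which a Zipf-type distribution could be fitted. The MIDI-style pitch numbers, the five-class scheme (repeat, step up/down, leap up/down) and all function names are assumptions made for this example only; they are not the classification or code used in the study.

```python
from collections import Counter


def to_intervals(pitches):
    """Signed semitone differences between consecutive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def interval_class(interval):
    """Map an interval to a hypothetical coarse class.

    R = repeat, Us/Ds = step up/down (1-2 semitones),
    Ul/Dl = leap up/down (3 or more semitones).
    """
    if interval == 0:
        return "R"
    size = "s" if abs(interval) <= 2 else "l"
    direction = "U" if interval > 0 else "D"
    return direction + size


def rank_frequency(tokens):
    """Return (rank, token, count) tuples, most frequent first.

    Ties are broken alphabetically so the output is deterministic.
    """
    counts = Counter(tokens)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, tok, n) for rank, (tok, n) in enumerate(ordered, start=1)]


# Illustrative melody as MIDI pitch numbers (assumed input format).
melody = [60, 62, 64, 60, 60, 67, 65, 64]
intervals = to_intervals(melody)
classes = [interval_class(i) for i in intervals]
table = rank_frequency(classes)
```

Because only differences between notes are kept, a transposed copy of the melody yields the same interval sequence. The coarser classes also tolerate small query errors: singing a semitone where a whole tone was intended still produces the same step class.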
Cite this article
Nelson, M., Downie, J.S. Informetric analysis of a music database. Scientometrics 54, 243–255 (2002). https://doi.org/10.1023/A:1016013912188