Heuristic and supervised approaches to handwritten annotation extraction for musical score images
Performers’ copies of musical scores are typically rich in handwritten annotations, which capture historical and institutional performance practices. The automatic extraction of handwritten score annotations will facilitate both the development of interactive interfaces for exploring digital archives of these scores and the systematic investigation of the annotations' meaning and function. We present several approaches to extracting handwritten annotations of arbitrary content from digitized images of musical scores. First, we show that simple unsupervised clustering techniques give promising results in certain contexts when used to identify handwritten annotations in conductors’ scores. Next, we compare annotated scores to unannotated copies and use a printed sheet music comparison tool, Aruspix, to recover handwritten annotations as additions to the clean copy. Combining both techniques in a single annotation pipeline qualitatively improves the recovery of handwritten annotations. Recent work has shown the effectiveness of reframing classical optical music recognition tasks as supervised machine learning classification tasks. In the same spirit, we pose handwritten annotation extraction as a supervised pixel classification task, where the feature space for the learning task is derived from the intensities of neighboring pixels. After an initial investment of time to develop dependable training data, this approach can reliably extract annotations for entire volumes of score images without further supervision. These techniques are demonstrated using a sample of orchestral scores annotated by professional conductors of the New York Philharmonic Orchestra. Handwritten annotation extraction in musical scores has applications to the systematic investigation of score annotation practices by performers, to annotator attribution, and to the interactive presentation of annotated scores, which we briefly discuss.
Keywords: Annotation extraction, Image processing, Color clustering, Supervised pixel classification, Orchestral scores, Conducting, Image superimposition
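The supervised formulation described in the abstract, in which each pixel is classified from the intensities of its neighbors, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the window radius, the three class names, the toy training image, and the nearest-centroid classifier, which stands in for whatever supervised learner is actually used.

```python
from typing import Dict, List, Tuple

Image = List[List[float]]  # grayscale intensities in [0, 1], row-major


def neighborhood_features(img: Image, row: int, col: int, radius: int = 1) -> List[float]:
    """Intensities of the (2*radius+1)^2 window centred on (row, col).

    Coordinates outside the image are clamped to the nearest edge pixel.
    """
    height, width = len(img), len(img[0])
    feats = []
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            r = min(max(row + dr, 0), height - 1)
            c = min(max(col + dc, 0), width - 1)
            feats.append(img[r][c])
    return feats


def build_training_set(img: Image, labels: List[List[str]],
                       radius: int = 1) -> Tuple[List[List[float]], List[str]]:
    """One feature vector and one label per pixel of a hand-labelled image."""
    X, y = [], []
    for r in range(len(img)):
        for c in range(len(img[0])):
            X.append(neighborhood_features(img, r, c, radius))
            y.append(labels[r][c])
    return X, y


def fit_centroids(X: List[List[float]], y: List[str]) -> Dict[str, List[float]]:
    """Hypothetical stand-in classifier: mean feature vector of each class."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for vec, lab in zip(X, y):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
            counts[lab] = 0
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def classify_pixel(vec: List[float], centroids: Dict[str, List[float]]) -> str:
    """Label of the centroid nearest to vec in squared Euclidean distance."""
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2 for a, b in zip(vec, centroids[lab])))


def extract_mask(img: Image, centroids: Dict[str, List[float]],
                 radius: int = 1) -> List[List[str]]:
    """Classify every pixel of an unseen image."""
    return [[classify_pixel(neighborhood_features(img, r, c, radius), centroids)
             for c in range(len(img[0]))]
            for r in range(len(img))]


# Toy training data (assumed): white paper, black print, grey pencil marks.
TRAIN_IMAGE = [[1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.5] for _ in range(3)]
TRAIN_LABELS = [["paper"] * 3 + ["print"] * 3 + ["annotation"] * 3 for _ in range(3)]

if __name__ == "__main__":
    X, y = build_training_set(TRAIN_IMAGE, TRAIN_LABELS)
    model = fit_centroids(X, y)
    pencil_patch = [[0.5] * 3 for _ in range(3)]
    print(extract_mask(pencil_patch, model))
```

The nearest-centroid step is deliberately crude: it separates uniform regions in this toy data, but it would not be expected to handle thin strokes on real scans. The feature vectors and labels returned by `build_training_set` can instead be passed to any standard supervised classifier.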
The authors wish to thank Barbara Haws at the New York Philharmonic Archives and Mitchell Brodsky for their technical support and encouragement, and the anonymous reviewers for their feedback and suggestions. Original score images are courtesy of the Leon Levy Digital Archive, New York Philharmonic.