Abstract
Performers’ copies of musical scores are typically rich in handwritten annotations, which capture historical and institutional performance practices. The development of interactive interfaces to explore digital archives of these scores and the systematic investigation of their meaning and function will be facilitated by the automatic extraction of handwritten score annotations. We present several approaches to the extraction of handwritten annotations of arbitrary content from digitized images of musical scores. First, we show promising results in certain contexts when using simple unsupervised clustering techniques to identify handwritten annotations in conductors’ scores. Next, we compare annotated scores to unannotated copies and use a printed sheet music comparison tool, Aruspix, to recover handwritten annotations as additions to the clean copy. Using both of these techniques in a combined annotation pipeline qualitatively improves the recovery of handwritten annotations. Recent work has shown the effectiveness of reframing classical optical music recognition tasks as supervised machine learning classification tasks. In the same spirit, we pose the problem of handwritten annotation extraction as a supervised pixel classification task, where the feature space for the learning task is derived from the intensities of neighboring pixels. After an initial investment of time required to develop dependable training data, this approach can reliably extract annotations for entire volumes of score images without further supervision. These techniques are demonstrated using a sample of orchestral scores annotated by professional conductors of the New York Philharmonic Orchestra. We briefly discuss applications of handwritten annotation extraction in musical scores: the systematic investigation of performers’ score annotation practices, annotator attribution, and the interactive presentation of annotated scores.
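To make the feature space described above concrete, the following is a minimal sketch of how per-pixel feature vectors might be built from neighboring pixel intensities. It is not the authors' code: the function name `neighborhood_features`, the square window, the default radius, and the edge-replication padding are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def neighborhood_features(
    image: Sequence[Sequence[float]], radius: int = 1
) -> List[List[float]]:
    """Build one feature vector per pixel from its neighbors' intensities.

    Hypothetical helper: each vector holds the intensities of the
    (2 * radius + 1) x (2 * radius + 1) window centred on the pixel, in
    row-major order. Positions outside the image reuse the nearest edge
    pixel (an assumed padding choice).
    """
    height = len(image)
    width = len(image[0])
    features: List[List[float]] = []
    for y in range(height):
        for x in range(width):
            window: List[float] = []
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Clamp coordinates so border pixels get full windows.
                    ny = min(max(y + dy, 0), height - 1)
                    nx = min(max(x + dx, 0), width - 1)
                    window.append(float(image[ny][nx]))
            features.append(window)
    return features


# Toy 3 x 3 grayscale image; real score scans would be far larger.
toy_image = [
    [0, 10, 20],
    [30, 40, 50],
    [60, 70, 80],
]
feats = neighborhood_features(toy_image, radius=1)
```

Each row of `feats` could then be paired with a per-pixel label (annotation or not) from hand-prepared training data and passed to any standard classifier.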
Notes
The Python code used to implement each of these pipelines is available from the corresponding author (Bell), on request.
References
Calvo-Zaragoza, J., Micó, L., Oncina, J.: Music staff removal with supervised pixel classification. Int. J. Doc. Anal. Recognit. (IJDAR) 19(3), 211–219 (2016). https://doi.org/10.1007/s10032-016-0266-2
Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
Dalitz, C., Droettboom, M., Pranzas, B., Fujinaga, I.: A comparative study of staff removal algorithms. IEEE Trans. Pattern Anal. Mach. Intell. 30(5), 735–766 (2008)
Fan, K.C., Wang, L.S., Tu, Y.T.: Classification of machine-printed and handwritten texts using character block layout variance. Pattern Recognit. 31(9), 1275–1284 (1998)
Farooq, F., Sridharan, K., Govindaraju, V.: Identifying handwritten text in mixed documents. In: 18th International Conference on Pattern Recognition (ICPR’06), vol. 2, pp. 1142–1145 (2006). https://doi.org/10.1109/ICPR.2006.676
Guo, J.K., Ma, M.Y.: Separating handwritten material from machine printed text using hidden Markov models. In: Proceedings of the Sixth International Conference on Document Analysis and Recognition, pp. 439–443 (2001). https://doi.org/10.1109/ICDAR.2001.953828
Hankinson, A., Burgoyne, J.A., Vigliensoni, G., Porter, A., Thompson, J., Liu, W., Chiu, R., Fujinaga, I.: Digital document image retrieval using optical music recognition. In: Proceedings of the 13th ISMIR Conference, Porto, Portugal, pp. 577–582, 8–12 Oct 2012
IIIF Consortium: IIIF Presentation API v. 2.1.1 (2017). http://iiif.io/api/presentation/2.1/. Accessed 25 June 2018
Limpaecher, A., Feltman, N., Treuille, A., Cohen, M.: Real-time drawing assistance through crowdsourcing. ACM Trans. Graphics 32(4), 1 (2013). https://doi.org/10.1145/2461912.2462016
McLaren, K.: The development of the CIE 1976 (L*a*b*) uniform colour space and colour-difference formula. J. Soc. Dye. Colour. 92(9), 338–341 (1976). https://doi.org/10.1111/j.1478-4408.1976.tb03301.x
Nakai, T., Kise, K., Iwamura, M.: A method of annotation extraction from paper documents using alignment based on local arrangements of feature points. In: Proceedings of the Ninth International Conference on Document Analysis and Recognition, vol. 1, pp. 23–27. IEEE (2007)
Nogueira, F., Aridas, C.K.: Imbalanced-learn: a Python toolbox to tackle the curse of imbalanced datasets in machine learning. J. Mach. Learn. Res. 18(17), 1–5 (2017)
Papandreou, G., Chen, L.C., Murphy, K., Yuille, A.L.: Weakly-and semi-supervised learning of a DCNN for semantic image segmentation (2015). arXiv preprint arXiv:1502.02734
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Peng, X., Setlur, S., Govindaraju, V., Sitaram, R.: Handwritten text separation from annotated machine printed documents using Markov random fields. Int. J. Doc. Anal. Recognit. (IJDAR) 16(1), 1–16 (2011). https://doi.org/10.1007/s10032-011-0179-z
Pugin, L.: Aruspix: An automatic source-comparison system. In: Hewlett, W.B., Selfridge-Field, E. (eds.) Music Analysis East and West, Computing in Musicology, vol. 14, pp. 49–60. MIT Press, Cambridge (2006)
Roland, P., Kepper, J.: Music encoding initiative guidelines (v. 3.0.0) (2016). http://www.music-encoding.org/docs/MEI_Guidelines_v3.0.0.pdf. Accessed 25 June 2018
Violante, S., Smith, R., Reiss, M.: A computationally efficient technique for discriminating between hand-written and printed text. In: IEEE Colloquium on Document Image Processing and Multimedia Environments, pp. 17–1. IET (1995)
Weigl, D.M., Page, K.R.: A framework for distributed semantic annotation of musical score: take it to the bridge! In: Proceedings of the 18th ISMIR Conference, Suzhou, China, pp. 221–228. The International Society of Music Information Retrieval (ISMIR), 23–27 Oct 2017
Zagoris, K., Pratikakis, I., Antonacopoulos, A., Gatos, B., Papamarkos, N.: Distinction between handwritten and machine-printed text based on the bag of visual words model. Pattern Recognit. 47(3), 1051–1062 (2014). https://doi.org/10.1016/j.patcog.2013.09.005
Zagoris, K., Pratikakis, I., Gatos, B.: Segmentation-based historical handwritten word spotting using document-specific local features. In: 2014 14th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 9–14. IEEE (2014)
Acknowledgements
The authors wish to thank Barbara Haws at the New York Philharmonic Archives and Mitchell Brodsky for their technical support and encouragement, and the anonymous reviewers for their feedback and suggestions. Original score images are courtesy of the Leon Levy Digital Archive, New York Philharmonic.
Cite this article
Bell, E., Pugin, L. Heuristic and supervised approaches to handwritten annotation extraction for musical score images. Int J Digit Libr 20, 49–59 (2019). https://doi.org/10.1007/s00799-018-0249-7
DOI: https://doi.org/10.1007/s00799-018-0249-7