Skip to main content
Log in

Heuristic and supervised approaches to handwritten annotation extraction for musical score images

  • Published:
International Journal on Digital Libraries Aims and scope Submit manuscript

Abstract

Performers’ copies of musical scores are typically rich in handwritten annotations, which capture historical and institutional performance practices. The development of interactive interfaces to explore digital archives of these scores and the systematic investigation of their meaning and function will be facilitated by the automatic extraction of handwritten score annotations. We present several approaches to the extraction of handwritten annotations of arbitrary content from digitized images of musical scores. First, we show promising results in certain contexts when using simple unsupervised clustering techniques to identify handwritten annotations in conductors’ scores. Next, we compare annotated scores to unannotated copies and use a printed sheet music comparison tool, Aruspix, to recover handwritten annotations as additions to the clean copy. Using both of these techniques in a combined annotation pipeline qualitatively improves the recovery of handwritten annotations. Recent work has shown the effectiveness of reframing classical optical musical recognition tasks as supervised machine learning classification tasks. In the same spirit, we pose the problem of handwritten annotation extraction as a supervised pixel classification task, where the feature space for the learning task is derived from the intensities of neighboring pixels. After an initial investment of time required to develop dependable training data, this approach can reliably extract annotations for entire volumes of score images without further supervision. These techniques are demonstrated using a sample of orchestral scores annotated by professional conductors of the New York Philharmonic Orchestra. Handwritten annotation extraction in musical scores has applications to the systematic investigation of score annotation practices by performers, annotator attribution, and to the interactive presentation of annotated scores, which we briefly discuss.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7

Similar content being viewed by others

Notes

  1. The Python code used to implement each of these pipelines is available from the corresponding author (Bell), on request.

References

  1. Calvo-Zaragoza, J., Mic, L., Oncina, J.: Music staff removal with supervised pixel classification. Int. J. Doc. Anal. Recognit. (IJDAR) 19(3), 211–219 (2016). https://doi.org/10.1007/s10032-016-0266-2

    Article  Google Scholar 

  2. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: Smote: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)

    Article  MATH  Google Scholar 

  3. Dalitz, C., Droettboom, M., Pranzas, B., Fujinaga, I.: A comparative study of staff removal algorithms. IEEE Trans. Pattern Anal. Mach. Intell. 30(5), 735–766 (2008)

    Article  Google Scholar 

  4. Fan, K.C., Wang, L.S., Tu, Y.T.: Classification of machine-printed and handwritten texts using character block layout variance. Pattern Recognit. 31(9), 1275–1284 (1998)

    Article  Google Scholar 

  5. Farooq, F., Sridharan, K., Govindaraju, V.: Identifying handwritten text in mixed documents. In: 18th International Conference on Pattern Recognition (ICPR’06), vol. 2, pp. 1142–1145 (2006). https://doi.org/10.1109/ICPR.2006.676

  6. Guo, J.K., Ma, M.Y.: Separating handwritten material from machine printed text using hidden Markov models. In: Proceedings of the Sixth International Conference on Document Analysis and Recognition, pp. 439–443 (2001). https://doi.org/10.1109/ICDAR.2001.953828

  7. Hankinson, A., Burgoyne, J.A., Vigliensoni, G., Porter, A., Thompson, J., Liu, W., Chiu, R., Fujinaga, I.: Digital document image retrieval using optical music recognition. In: Proceedings of the 13th ISMIR Conference, Porto, Portugal, pp. 577–582, 8–12 Oct 2012

  8. IIIF Consortium (2017) IIIF Presentation API v. 2.1.1. Online. http://iiif.io/api/presentation/2.1/. Accessed 25 June 2018

  9. Limpaecher, A., Feltman, N., Treuille, A., Cohen, M.: Real-time drawing assistance through crowdsourcing. ACM Trans. Graphics 32(4), 1 (2013). https://doi.org/10.1145/2461912.2462016

    Article  Google Scholar 

  10. McLaren, K.: The development of the cie 1976 (l * a * b *) uniform colour space and colour-difference formula. J. Soc. Dye. Colour. 92(9), 338–341 (1976). https://doi.org/10.1111/j.1478-4408.1976.tb03301.x

    Article  Google Scholar 

  11. Nakai, T., Kise, K., Iwamura, M.: A method of annotation extraction from paper documents using alignment based on local arrangements of feature points. In: Proceedings of the Ninth International Conference on Document Analysis and Recognition, vol. 1, pp. 23–27. IEEE (2007)

  12. Nogueira, F., Aridas, C.K.: Imbalanced-learn: a python toolbox to tackle the curse of imbalanced datasets in machine learning. J. Mach. Learn. Res. 18(17), 1–5 (2017)

    MATH  Google Scholar 

  13. Papandreou, G., Chen, L.C., Murphy, K., Yuille, A.L.: Weakly-and semi-supervised learning of a DCNN for semantic image segmentation (2015). arXiv preprint arXiv:1502.02734

  14. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: machine learning in python. J. Mach. Learn. Res. 12, 2825–2830 (2011)

    MathSciNet  MATH  Google Scholar 

  15. Peng, X., Setlur, S., Govindaraju, V., Sitaram, R.: Handwritten text separation from annotated machine printed documents using Markov random fields. Int. J. Doc. Anal. Recognit. (IJDAR) 16(1), 1–16 (2011). https://doi.org/10.1007/s10032-011-0179-z

    Google Scholar 

  16. Pugin, L.: Aruspix: An automatic source-comparison system. In: Hewlett, W.B., Selfridge-Field, E. (eds.) Music Analysis East and West, Computing in Musicology, vol. 14, pp. 49–60. MIT Press, Cambridge (2006)

    Google Scholar 

  17. Roland, P., Kepper, J.: Music encoding initiative guidelines (v. 3.0.0) (2016). http://www.music-encoding.org/docs/MEI_Guidelines_v3.0.0.pdf. Accessed 25 June 2018.

  18. Violante, S., Smith, R., Reiss, M.: A computationally efficient technique for discriminating between hand-written and printed text. In: IEEE Colloquium on Document Image Processing and Multimedia Environments, pp. 17–1. IET (1995)

  19. Weigl, D.M., Page, K.R.: A framework for distributed semantic annotation of musical score: take it to the bridge!. In: Proceedings of the 18th ISMIR Conference, Suzhou, China, pp. 221–228. The International Society of Music Information Retrieval (ISMIR), 23–27 Oct 2017

  20. Zagoris, K., Pratikakis, I., Antonacopoulos, A., Gatos, B., Papamarkos, N.: Distinction between handwritten and machine-printed text based on the bag of visual words model. Pattern Recognit. 47(3), 1051–1062 (2014). https://doi.org/10.1016/j.patcog.2013.09.005

    Article  Google Scholar 

  21. Zagoris, K., Pratikakis, I., Gatos, B.: Segmentation-based historical handwritten word spotting using document-specific local features. In: 2014 14th International Conference on Frontiers in Handwriting Recognition (ICFHR), pp 9–14. IEEE (2014)

Download references

Acknowledgements

The authors wish to thank Barbara Haws at the New York Philharmonic Archives and Mitchell Brodsky for their technical support and encouragement and the anonymous reviewers for their feedback and suggestions. Leon Levy Digital Archive, New York Philharmonic, contributed to original score image courtesy.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Eamonn Bell.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Bell, E., Pugin, L. Heuristic and supervised approaches to handwritten annotation extraction for musical score images. Int J Digit Libr 20, 49–59 (2019). https://doi.org/10.1007/s00799-018-0249-7

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00799-018-0249-7

Keywords

Navigation