Abstract
Staff-line removal is an important preprocessing stage for most optical music recognition systems. Common procedures for this task rely on image processing techniques. In contrast to these traditional methods based on hand-engineered transformations, the problem can also be approached as a classification task in which each pixel is labeled as either staff or symbol, so that only the pixels that belong to symbols are kept in the image. To perform this classification, we propose the use of convolutional neural networks, which have demonstrated outstanding performance in image retrieval tasks. The initial features of each pixel consist of a square patch from the input image centered at that pixel. The proposed network is trained on a dataset containing pairs of scores with and without staff lines. Our results on both binary and grayscale images show that the proposed technique is very accurate, outperforming both the other classifiers and the state-of-the-art strategies considered. In addition, we discuss several advantages of the presented methodology over the traditional procedures proposed so far.
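The pixel-wise formulation described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names, the binary image convention (1 = ink, 0 = background), the zero padding at the borders, the choice to classify only ink pixels, and above all the hand-written toy_classifier standing in for the trained convolutional network are assumptions made for illustration only.

```python
from typing import Callable, List

# Row-major grid; assumed convention: 0 = background, 1 = ink.
Image = List[List[int]]


def extract_patch(image: Image, row: int, col: int, half: int) -> Image:
    """Return the square patch of side 2*half+1 centered at (row, col).

    Pixels that fall outside the image are treated as background (0).
    """
    height, width = len(image), len(image[0])
    patch = []
    for r in range(row - half, row + half + 1):
        patch_row = []
        for c in range(col - half, col + half + 1):
            inside = 0 <= r < height and 0 <= c < width
            patch_row.append(image[r][c] if inside else 0)
        patch.append(patch_row)
    return patch


def remove_staff(image: Image,
                 is_symbol: Callable[[Image], bool],
                 half: int = 1) -> Image:
    """Keep only the ink pixels whose patch is labeled as symbol."""
    result = [[0] * len(row) for row in image]
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if value and is_symbol(extract_patch(image, r, c, half)):
                result[r][c] = value
    return result


def toy_classifier(patch: Image) -> bool:
    """Hypothetical stand-in for the trained network: the center pixel is
    called symbol if there is ink directly above or below it."""
    half = len(patch) // 2
    return bool(patch[half - 1][half] or patch[half + 1][half])


# A one-pixel-thick horizontal "staff line" crossed by a vertical "stem".
score = [
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
]
cleaned = remove_staff(score, toy_classifier)
# cleaned == [[0, 0, 1, 0, 0], [0, 0, 1, 0, 0], [0, 0, 1, 0, 0]]
```

In a setting closer to the one described, is_symbol would wrap a network trained on patches drawn from paired scores with and without staff lines, and the patch size would be considerably larger than the 3 by 3 window used here.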
Additional information
This work was supported by the Spanish Ministerio de Educación, Cultura y Deporte through an FPU Fellowship (Ref. AP2012–0939), the Spanish Ministerio de Economía y Competitividad through Project TIMuL (No. TIN2013-48152-C2-1-R, supported by EU FEDER funds), and the Instituto Universitario de Investigación Informática (IUII) of the University of Alicante. The authors would like to thank the anonymous reviewers, whose constructive comments helped improve the quality of the paper.
About this article
Cite this article
Calvo-Zaragoza, J., Pertusa, A. & Oncina, J. Staff-line detection and removal using a convolutional neural network. Machine Vision and Applications 28, 665–674 (2017). https://doi.org/10.1007/s00138-017-0844-4
DOI: https://doi.org/10.1007/s00138-017-0844-4