GarmNet: Improving Global with Local Perception for Robotic Laundry Folding

  • Daniel Fernandes Gomes
  • Shan Luo
  • Luis F. Teixeira
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11650)


Developing autonomous assistants to help with domestic tasks is a vital topic in robotics research. Among these tasks, garment folding is still far from being solved, mainly because of the large number of configurations that a crumpled piece of clothing may exhibit. Prior research has either estimated the pose of the garment as a whole or detected landmarks for grasping, treating the two separately. Such approaches constrain the robot's ability to perceive the state of the garment, because the learned representation is limited to a single task. In this paper, we propose a novel end-to-end deep learning model named GarmNet that simultaneously localizes the garment and detects landmarks for grasping. The localization of the garment provides the global information for recognising its category, whereas the detection of landmarks facilitates subsequent grasping actions. We train and evaluate the proposed GarmNet model on the CloPeMa Garment dataset, which contains 3,330 images of different garment types in different poses. The experiments show that including landmark detection (GarmNet-B) substantially improves garment localization, with an error rate that is 24.7% lower. Solutions such as ours are important for robotics applications because they scale to many classes and are efficient in both memory and processing.


Garment localization, Landmark detection, Robot laundry folding



This work was supported by the EPSRC project “Robotics and Artificial Intelligence for Nuclear (RAIN)” (EP/R026084/1).



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Computer Science, University of Liverpool, Liverpool, UK
  2. Faculdade de Engenharia, Universidade do Porto, Porto, Portugal
  3. INESC TEC, Porto, Portugal
