Machine Learning, Volume 94, Issue 1, pp 3–23

Plane-based object categorisation using relational learning



Abstract

We use Inductive Logic Programming (ILP) to learn classifiers for generic object recognition from point clouds, as generated by 3D cameras such as the Kinect. Each point cloud is segmented into planar surfaces. Each subset of planes that represents an object is labelled, and predicates describing those planes and their relationships are used for learning. Our claim is that a relational description of classes of 3D objects can be built for robust object categorisation in real robotic applications. To test this hypothesis, labelled sets of planes from 3D point clouds gathered during the RoboCup Rescue Robot competition are used as positive and negative examples for an ILP system. The robustness of the results is evaluated by 10-fold cross-validation. In addition, common household objects that have curved surfaces are used for evaluation and comparison against a well-known non-relational classifier. The results show that ILP can be successfully applied to recognise objects encountered by a robot, especially in an urban search and rescue environment.
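The first stage of the pipeline described above, segmenting a point cloud into planar surfaces, can be sketched as a greedy RANSAC plane extractor. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the function names (`fit_plane`, `ransac_planes`) and all thresholds are hypothetical, and the relational predicates would be derived downstream from the extracted plane models.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through points: returns (unit normal n, offset d)
    for the plane n . x = d."""
    centroid = points.mean(axis=0)
    # The singular vector with the smallest singular value of the centred
    # points is the direction of least variance, i.e. the plane normal.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    return normal, normal @ centroid

def ransac_planes(points, n_planes=2, iters=200, tol=0.02,
                  min_inliers=50, seed=0):
    """Greedily extract up to n_planes planar segments from an (N, 3) cloud.

    Returns a list of ((normal, d), inlier_points) pairs; points assigned
    to a plane are removed before searching for the next one.
    """
    rng = np.random.default_rng(seed)
    remaining = points.copy()
    planes = []
    for _ in range(n_planes):
        best_mask, best_model = None, None
        for _ in range(iters):
            # Fit a candidate plane to a random 3-point sample.
            sample = remaining[rng.choice(len(remaining), 3, replace=False)]
            normal, d = fit_plane(sample)
            # Count points within tol of the candidate plane.
            mask = np.abs(remaining @ normal - d) < tol
            if best_mask is None or mask.sum() > best_mask.sum():
                best_mask, best_model = mask, (normal, d)
        if best_mask.sum() < min_inliers:
            break  # no sufficiently supported plane remains
        planes.append((best_model, remaining[best_mask]))
        remaining = remaining[~best_mask]
    return planes
```

Each returned plane model (normal and offset) plus its inlier set is the kind of primitive from which pairwise relations such as relative orientation between planes could then be computed as background predicates for the ILP learner.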


Keywords: Object classification · Inductive logic programming · Machine learning · Urban search and rescue · 3D point cloud



Acknowledgements

The authors thank Stephen Muggleton and Dianhuan Lin for their assistance in running Metagol on the staircase data.



Copyright information

© The Author(s) 2013

Authors and Affiliations

  1. School of Computer Science and Engineering, The University of New South Wales, Sydney, Australia
