
A Large-Scale Annotated Mechanical Components Benchmark for Classification and Retrieval Tasks with Deep Neural Networks

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12363)

Abstract

We introduce the Mechanical Components Benchmark (MCB), a large-scale annotated dataset of 3D mechanical components for classification and retrieval tasks. The dataset enables data-driven feature learning for mechanical components. Exploring shape descriptors for mechanical components is essential to computer vision and manufacturing applications. However, little attention has been given to creating annotated mechanical component datasets at a large scale, because acquiring 3D models is challenging and annotating mechanical components requires engineering knowledge. Our main contributions are the creation of a large-scale annotated mechanical component benchmark, the definition of a hierarchical taxonomy of mechanical components, and an evaluation of the effectiveness of deep learning shape classifiers on mechanical components. We created the annotated dataset and benchmarked seven state-of-the-art deep learning classification methods in three categories: (1) point clouds, (2) volumetric representations in voxel grids, and (3) view-based representations.
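To make the three representation categories concrete, the following is a minimal Python sketch of deriving each representation from a single mesh. This is not the authors' preprocessing pipeline: it assumes the open-source trimesh library, and the file name "bearing.obj" is a hypothetical placeholder.

```python
# Minimal sketch (not the paper's pipeline) of the three 3D representations
# the benchmark evaluates. Assumes the `trimesh` library is installed;
# "bearing.obj" is a hypothetical placeholder for a CAD part exported as a mesh.
import numpy as np
import trimesh

mesh = trimesh.load("bearing.obj", force="mesh")

# (1) Point cloud: sample points uniformly from the mesh surface,
# the input format consumed by point-based classifiers.
points, _face_ids = trimesh.sample.sample_surface(mesh, count=2048)

# (2) Volumetric representation: a dense binary occupancy grid,
# the input format consumed by voxel CNNs.
pitch = mesh.extents.max() / 32.0  # yields roughly a 32^3 grid
occupancy = mesh.voxelized(pitch).matrix.astype(np.float32)

# (3) View-based representation: rotate the part through a ring of
# viewpoints. Multi-view CNNs render one image per viewpoint; rendering
# needs a camera/scene setup, so only the viewpoint loop is sketched here.
views = []
for angle in np.linspace(0.0, 2.0 * np.pi, num=12, endpoint=False):
    rotated = mesh.copy()
    rotated.apply_transform(
        trimesh.transformations.rotation_matrix(angle, [0.0, 0.0, 1.0]))
    views.append(rotated)
```

In broad strokes, point-based methods consume (1), volumetric methods consume (2), and view-based methods render and classify images from the viewpoints in (3).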

Keywords

Deep learning · Mechanical components · Benchmark · 3D objects · Classification · Retrieval

Notes

Acknowledgment

We wish to give special thanks to the reviewers for their invaluable feedback. Additionally, we thank TraceParts for providing CAD models of mechanical components. This work is partially supported by NSF under the grants FW-HTF 1839971, OIA 1937036, and CRI 1729486. We also acknowledge the Feddersen Chair Funds. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the funding agency.

Supplementary material

504473_1_En_11_MOESM1_ESM.pdf (2.4 MB)
Supplementary material 1 (PDF 2492 KB)


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Purdue University, West Lafayette, USA
  2. The University of Texas at Austin, Austin, USA
