Abstract
We address the problem of automatically detecting a sparse set of 3D mesh vertices, likely to be good candidates for determining correspondences, even on soft organic objects. We focus on 3D face scans, on which single local shape descriptor responses are known to be weak, sparse or noisy. Our machine-learning approach consists of computing feature vectors containing \(D\) different local surface descriptors. These vectors are normalized with respect to the learned distribution of those descriptors for some given target shape (landmark) of interest. Then, an optimal function of this vector is extracted that best separates this particular target shape from its surrounding region within the set of training data. We investigate two alternatives for this optimal function: a linear method, namely Linear Discriminant Analysis, and a non-linear method, namely AdaBoost. We evaluate our approach by landmarking 3D face scans in the FRGC v2 and Bosphorus 3D face datasets. Our system achieves state-of-the-art performance while being highly generic.
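The normalization and linear separation steps described above can be illustrated with a minimal sketch. It assumes that the "learned distribution" is summarized by a per-descriptor mean and standard deviation of the landmark samples, and that the linear alternative is a two-class Fisher discriminant; the function names `fit_landmark_scorer` and `landmark_score` are hypothetical and do not come from the paper.

```python
import numpy as np

def fit_landmark_scorer(pos, neg, eps=1e-9):
    """Fit a linear scorer for one landmark (illustrative sketch only).

    pos: (n_pos, D) descriptor vectors sampled at the target landmark.
    neg: (n_neg, D) descriptor vectors sampled in the surrounding region.
    Assumption: normalization uses the mean/std of the landmark samples.
    """
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)

    # Normalize every descriptor by the landmark's own statistics.
    mu = pos.mean(axis=0)
    sigma = pos.std(axis=0) + eps
    zp = (pos - mu) / sigma
    zn = (neg - mu) / sigma

    # Two-class Fisher LDA: w is proportional to Sw^{-1} (m_pos - m_neg).
    m_pos, m_neg = zp.mean(axis=0), zn.mean(axis=0)
    cp, cn = zp - m_pos, zn - m_neg
    s_w = cp.T @ cp + cn.T @ cn            # within-class scatter
    s_w += eps * np.eye(s_w.shape[0])      # keep the system well-conditioned
    w = np.linalg.solve(s_w, m_pos - m_neg)
    w /= np.linalg.norm(w)
    return mu, sigma, w

def landmark_score(x, mu, sigma, w):
    """Project normalized descriptor vectors onto the discriminant direction."""
    return ((np.asarray(x, dtype=float) - mu) / sigma) @ w
```

Higher scores indicate vertices that look more like the target landmark than like its surroundings. A non-linear scorer such as AdaBoost would replace only the projection step in this sketch.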
Notes
Also called interest points or feature detections in other literature.
It is essential that the reader distinguishes carefully between unlabeled keypoints and labeled landmarks throughout this paper.
To be more precise, a query scan point close to an extracted keypoint is sometimes designated to be the landmark, in order to minimize the least-squares error when fitting model \(\mathcal{L}\). This is discussed in Sect. 6.
We define a scalar local shape descriptor as a real number that describes the shape of the local neighborhood surrounding some mesh vertex. In some literature, this is termed a feature or feature descriptor. Here, the local neighborhood is Euclidean and enclosed by a sphere of predefined radius.
Subsets \(R\_90\), \(L\_90\) and \(IGN\) of the Bosphorus dataset are not used in this paper.
Note, however, that our framework allows us to ‘plug in’ and use any classifier as an L-score generator. Switching to another technique is straightforward, if there is some advantage to this, in terms of the class of meshes that are being processed.
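The definition of a scalar local shape descriptor given in the notes above (one real number computed from a Euclidean neighborhood enclosed by a sphere of predefined radius) can be sketched as follows. The descriptor used here, the smallest eigenvalue of the neighborhood scatter matrix divided by the eigenvalue sum, is a generic stand-in chosen for illustration; it is an assumption and not one of the descriptors used in the paper. The function names are likewise hypothetical.

```python
import numpy as np

def spherical_neighborhood(vertices, index, radius):
    """Indices of vertices within a Euclidean sphere around vertices[index]."""
    dist = np.linalg.norm(vertices - vertices[index], axis=1)
    return np.flatnonzero(dist <= radius)

def surface_variation(vertices, index, radius):
    """Illustrative scalar descriptor: near 0 on flat patches, larger on curved ones."""
    nbrs = vertices[spherical_neighborhood(vertices, index, radius)]
    if len(nbrs) < 3:
        return 0.0                       # too few points to describe a shape
    centered = nbrs - nbrs.mean(axis=0)
    eigvals = np.linalg.eigvalsh(centered.T @ centered)  # ascending order
    total = eigvals.sum()
    if total <= 0.0:
        return 0.0
    return float(max(eigvals[0], 0.0) / total)
```

Evaluating several such functions at one vertex and stacking the results would give a descriptor vector of the kind the abstract describes.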
Cite this article
Creusot, C., Pears, N. & Austin, J. A Machine-Learning Approach to Keypoint Detection and Landmarking on 3D Meshes. Int J Comput Vis 102, 146–179 (2013). https://doi.org/10.1007/s11263-012-0605-9
DOI: https://doi.org/10.1007/s11263-012-0605-9