Can We Teach Functions to an Artificial Intelligence by Just Showing It Enough “Ground Truth”?

Mathematics Going Forward

Part of the book series: Lecture Notes in Mathematics ((LNM,volume 2313))

Abstract

The term “artificial intelligence”, which has received very different interpretations in the past, is nowadays identified with deep learning. Deep neural networks bring the promise that, instead of hand-crafting data processing algorithms by mathematical reasoning based on formalized principles, we can simply feed enough data to a neural network, which will learn the right operator from it. In supervised learning, the data are either annotated by humans or obtained from a large set of observed pairs (x_n, f(x_n)). It is this association of an output f(x_n) with an input x_n in a learning dataset that is called a “ground truth”. The use of a “ground truth” annotated by humans raises a serious methodological problem, as humans are fallible. Even worse, the performance of these methods is evaluated and compared on subsets of the same annotations. Objective natural ground truths raise similar issues: raw data can be ambiguous or contradictory. In this paper, we examine two examples where machine learning methods were used to replicate aspects of human perception and logic: depth perception and the detection of straight lines or segments. We show that strict control of the geometry in the learning dataset, or a rigorous mathematical definition of the geometric task, leads to results widely different from those learned blindly from annotated datasets or from ground truths acquired in the wild. We conclude that a mathematical and principled analysis of learning datasets should precede their use.
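The evaluation issue raised in the abstract can be illustrated with a toy sketch. It is not taken from the chapter: the target function `true_f`, the biased annotator `annotate`, and the least-squares model are all hypothetical choices made only for illustration. A model fitted to fallible annotations can score well on a held-out subset of the same annotations while remaining far from the function it was supposed to learn.

```python
import random

def true_f(x):
    # Hypothetical target function, chosen only for illustration.
    return 2.0 * x

def annotate(x, rng):
    # Hypothetical fallible annotator: a systematic offset plus small noise.
    return true_f(x) + 1.0 + rng.gauss(0.0, 0.05)

def fit_line(xs, ys):
    # Ordinary least squares for the model y = a * x + b.
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def mean_abs_error(model, xs, targets):
    # Mean absolute difference between model predictions and targets.
    a, b = model
    return sum(abs(a * x + b - t) for x, t in zip(xs, targets)) / len(xs)

rng = random.Random(0)
xs = [rng.uniform(0.0, 1.0) for _ in range(200)]
labels = [annotate(x, rng) for x in xs]

# Train and test splits are both drawn from the same annotations.
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = labels[:150], labels[150:]

model = fit_line(train_x, train_y)
err_vs_annotations = mean_abs_error(model, test_x, test_y)
err_vs_true_f = mean_abs_error(model, test_x, [true_f(x) for x in test_x])
print(err_vs_annotations, err_vs_true_f)
```

In this sketch the error against the held-out annotations is small, while the error against `true_f` stays close to the annotator's offset, so the benchmark alone would not reveal the problem.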

In gratitude to Catriona Byrne, the memorable, unique, and irreplaceable conductor of mathematical publishing.

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this chapter

Courtois, A., Ehret, T., Arias, P., Morel, JM. (2023). Can We Teach Functions to an Artificial Intelligence by Just Showing It Enough “Ground Truth”?. In: Morel, JM., Teissier, B. (eds) Mathematics Going Forward. Lecture Notes in Mathematics, vol 2313. Springer, Cham. https://doi.org/10.1007/978-3-031-12244-6_31
