Approximate Computing for Scientific Applications

Chapter in Approximate Computing Techniques

Abstract

This chapter reviews the performance benefits of applying (software) approximate computing to scientific applications. For this purpose, we target two particular areas, linear algebra and deep learning: the former because it is ubiquitous in scientific problems, the latter because of its considerable and growing number of important applications in both industry and science.

The review of linear algebra in scientific computing focuses on the iterative solution of sparse linear systems, exposing the dominant cost of memory accesses in these methods and demonstrating how approximate computing can reduce this overhead, for example in stationary solvers themselves or in the application of preconditioners when sparse linear systems are solved via Krylov subspace methods.
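
To make the idea concrete, the following minimal sketch (not taken from the chapter; the solver, test matrix, and tolerance are illustrative assumptions) shows a Jacobi stationary iteration in which the off-diagonal matrix entries are stored in float32 while the iteration vectors and arithmetic remain in float64, i.e., storage precision is decoupled from arithmetic precision to reduce memory traffic.

```python
# Minimal sketch, assuming NumPy/SciPy; not the chapter's implementation.
import numpy as np
import scipy.sparse as sp

def jacobi_reduced_storage(A, b, tol=1e-8, max_iter=1000):
    """Jacobi iteration with the off-diagonal part stored in float32."""
    D = A.diagonal().astype(np.float64)                # diagonal in working precision
    R = (A - sp.diags(D)).tocsr().astype(np.float32)   # off-diagonal part stored compactly
    x = np.zeros_like(b, dtype=np.float64)
    for k in range(max_iter):
        # SciPy upcasts the mixed-precision product; keep the update in float64
        x_new = (b - (R @ x).astype(np.float64)) / D
        if np.linalg.norm(x_new - x) <= tol * np.linalg.norm(b):
            return x_new, k + 1
        x = x_new
    return x, max_iter

# toy diagonally dominant system on which Jacobi converges
n = 1000
A = sp.diags([-1.0, 4.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)
x, iters = jacobi_reduced_storage(A, b)
print(iters, np.linalg.norm(A @ x - b))
```

A preconditioner used inside a Krylov solver can be stored at reduced precision in the same spirit, with its application promoted back to the working precision of the outer iteration.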

The discussion of deep learning focuses on the use of approximate data transfer to cut the cost of host-to-device operations, as well as the use of adaptive precision to accelerate the training of classical CNN architectures. Additionally, we discuss model optimization and architecture search under constraints for edge-device applications.
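
The sketch below is a hedged illustration of these two ideas using PyTorch's automatic mixed precision: the input batch is cast to float16 before the host-to-device copy to roughly halve the transferred volume, and the forward/backward pass runs under autocast with loss scaling. The model, data, and hyperparameters are placeholders, not the chapter's experimental setup.

```python
# Minimal sketch, assuming PyTorch is installed; configuration is illustrative.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10),
).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))
loss_fn = nn.CrossEntropyLoss()

for step in range(5):
    # synthetic batch, cast to float16 on the host so the host-to-device copy
    # moves half the bytes; restored to float32 once it is on the device
    x = torch.randn(32, 3, 32, 32).half().to(device, non_blocking=True).float()
    y = torch.randint(0, 10, (32,), device=device)

    opt.zero_grad()
    # eligible layers of the forward pass run in reduced precision under autocast
    with torch.autocast(device_type=device, enabled=(device == "cuda")):
        loss = loss_fn(model(x), y)
    scaler.scale(loss).backward()   # loss scaling guards against fp16 underflow
    scaler.step(opt)
    scaler.update()
    print(step, loss.item())
```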


Notes

  1. Available at http://www.mpfr.org/ (version 4.0.1, February 2018).


Author information

Corresponding author

Correspondence to Hartwig Anzt.

Copyright information

© 2022 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Anzt, H., Casas, M., Malossi, A.C.I., Quintana-Ortí, E.S., Scheidegger, F., Zhuang, S. (2022). Approximate Computing for Scientific Applications. In: Bosio, A., Ménard, D., Sentieys, O. (eds) Approximate Computing Techniques. Springer, Cham. https://doi.org/10.1007/978-3-030-94705-7_14

  • DOI: https://doi.org/10.1007/978-3-030-94705-7_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-94704-0

  • Online ISBN: 978-3-030-94705-7

  • eBook Packages: Engineering, Engineering (R0)
