
An Apache Giraph Implementation of Distributed ADMM for Solving LASSO Problems

  • Conference paper
Soft Computing for Problem Solving

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1393)

Abstract

Convex formulations of optimization problems are gaining importance in many engineering areas, such as signal and image processing, machine learning, the theory of structured sparsity, and rank minimization. The Alternating Direction Method of Multipliers (ADMM) is commonly used to solve convex optimization problems. As data volumes keep growing, distributed, high-performance algorithms for such problems are increasingly needed. In the current literature, distributed ADMM is implemented using the Message Passing Interface (MPI), which does not scale well as the size of the data increases. Our main goal here is to propose an Apache Giraph-based implementation (on Hadoop) of distributed ADMM to solve the LASSO (Least Absolute Shrinkage and Selection Operator) formulation of convex optimization problems. Our most novel contribution is in exploiting the distributed nature of our algorithm to obtain the inverse of a matrix cheaply. Experimental results on randomly generated datasets show that our implementation converges in three iterations and about 30 s for a problem of size \(1.2 \times 10^9\). This is much more efficient than an MPI-based implementation, which takes four times as many iterations and ten times as long as our algorithm.
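For orientation, the kind of iteration the abstract refers to can be sketched on a single machine. The block below is a minimal NumPy sketch of the standard consensus form of ADMM for LASSO described in [1], with the data split by rows across workers; it is not the Giraph implementation of this paper. The names `consensus_admm_lasso`, `soft_threshold`, `blocks`, `lam`, and `rho` are illustrative assumptions. The abstract does not say how the matrix inverse is obtained cheaply, so the sketch simply caches one Cholesky factor per block.

```python
import numpy as np


def soft_threshold(v, kappa):
    """Elementwise soft-thresholding, the proximal operator of kappa * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)


def consensus_admm_lasso(blocks, lam, rho=1.0, max_iter=500, tol=1e-6):
    """Minimise sum_i 0.5*||A_i x - b_i||^2 + lam*||x||_1 by consensus ADMM.

    blocks: list of (A_i, b_i) pairs, one per (simulated) worker.
    Returns the consensus variable z and the number of iterations used.
    """
    n_workers = len(blocks)
    n = blocks[0][0].shape[1]

    # Each worker factors (A_i^T A_i + rho*I) once and reuses the factor.
    # This is an assumption of the sketch, not the paper's inversion scheme.
    factors = [np.linalg.cholesky(A.T @ A + rho * np.eye(n)) for A, _ in blocks]
    Atb = [A.T @ b for A, b in blocks]

    x = np.zeros((n_workers, n))  # local estimates x_i
    u = np.zeros((n_workers, n))  # scaled dual variables u_i
    z = np.zeros(n)               # global consensus variable

    iters = 0
    for iters in range(1, max_iter + 1):
        # Local x-updates: independent per worker, so they can run in parallel.
        for i in range(n_workers):
            rhs = Atb[i] + rho * (z - u[i])
            L = factors[i]
            # Two solves with the cached factor; a triangular solver
            # (e.g. scipy.linalg.cho_solve) would be cheaper in practice.
            x[i] = np.linalg.solve(L.T, np.linalg.solve(L, rhs))

        # Global z-update: average, then soft-threshold.
        z_old = z
        z = soft_threshold(x.mean(axis=0) + u.mean(axis=0), lam / (rho * n_workers))

        # Dual updates.
        u += x - z

        # Norms of the primal and dual residuals over all workers.
        r_norm = np.linalg.norm(x - z)
        s_norm = rho * np.sqrt(n_workers) * np.linalg.norm(z - z_old)
        if r_norm < tol and s_norm < tol:
            break

    return z, iters
```

The local x-updates need only the worker's own block and the current consensus value, which is why the method maps naturally onto a message-passing framework: each iteration exchanges one vector per worker.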

Notes

  1. Map-Reduce takes tens of iterations, and the time runs into minutes.

  2. Please refer to [1, 11] for step-by-step evaluation of r and s.

  3. The dataset could not be exactly matched because of different formats.
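
For readers without [1, 11] to hand: the symbols r and s in note 2 are presumably the primal and dual residuals of ADMM. This is an assumption based on the usual notation of those references, since the preview does not define them. For LASSO written with the splitting constraint \(x - z = 0\) and penalty parameter \(\rho\), the standard definitions in [1] are

\[
r^{k+1} = x^{k+1} - z^{k+1}, \qquad s^{k+1} = -\rho \left( z^{k+1} - z^{k} \right).
\]

The residual-balancing rule discussed in [1, 11, 14] then adapts the penalty parameter as

\[
\rho^{k+1} =
\begin{cases}
\tau \rho^{k} & \text{if } \|r^{k}\|_2 > \mu \|s^{k}\|_2, \\
\rho^{k} / \tau & \text{if } \|s^{k}\|_2 > \mu \|r^{k}\|_2, \\
\rho^{k} & \text{otherwise,}
\end{cases}
\]

with \(\mu = 10\) and \(\tau = 2\) given as typical choices in [1]. Whether the paper uses this exact rule is not stated in the preview.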

References

  1. Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2010) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends® Mach Learn 3(1):1–122


  2. Wei E, Ozdaglar A (2012) Distributed alternating direction method of multipliers. Proceedings 51st IEEE conference on decision and control, IEEE, pp 1–6


  3. Dean J, Ghemawat S (2008) MapReduce: simplified data processing on large clusters. Commun ACM 51(1):107–113


  4. Nemade V, Shastri A, Ahuja K, Tiwari A (2018) Scaled and projected spectral clustering with vector quantization for handling big data. Proceedings of the 9th IEEE symposium series on computational intelligence (SSCI), IEEE, pp 2174–2179


  5. Message Passing Interface Forum: MPI: A message-passing interface standard version 3.0. Chapter author for collective communication, process topologies, and one sided communications (2012)


  6. Afonso M, Bioucas-Dias J, Figueiredo M (2010) Fast image recovery using variable splitting and constrained optimization. IEEE Trans Image Process 19(9):2345–2356


  7. Friedman J, Hastie T, Tibshirani R (2008) Sparse inverse covariance estimation with the graphical LASSO. Biostatistics 9(3):432–441


  8. Apache Giraph: http://giraph.apache.org/. Accessed August 2020

  9. Agrawal R, Ahuja K, Hoo CH, Nguyen TDA, Kumar A (2019) ParaLarPD: parallel FPGA router using primal-dual sub-gradient method. Electronics 8(12):1439–1454


  10. Agrawal R, Ahuja K, Maheshwari D, Kumar A (2020) ParaLarH: parallel FPGA router based upon Lagrange heuristics. arXiv:2010.11893

  11. Wohlberg B (2017) ADMM penalty parameter selection by residual balancing. arXiv:1704.06209

  12. Tibshirani R (1996) Regression shrinkage and selection via the LASSO. J Royal Stat Soc Ser B (Methodological) 58(1):267–288


  13. Mateos G, Bazerque J, Giannakis G (2010) Distributed sparse linear regression. IEEE Trans Signal Process 58(10):5262–5276


  14. He B, Yang H, Wang S (2000) Alternating direction method with self-adaptive penalty parameters for monotone variational inequalities. J Optim Theory Appl 106(2):337–356


  15. Microsoft Azure: Install Giraph on HDInsight hadoop clusters, and use Giraph to process large-scale graphs. https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-giraph-install-linux. Accessed August 2018

  16. Microsoft Azure: Node configuration. https://docs.microsoft.com/en-us/rest/api/automation/dscnodeconfiguration/createorupdate. Accessed August 2018

Author information

Corresponding author

Correspondence to Kapil Ahuja.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Agrawal, R., Shastri, A.A., Ahuja, K., Perreard, A., Gujral, J. (2021). An Apache Giraph Implementation of Distributed ADMM for Solving LASSO Problems. In: Tiwari, A., Ahuja, K., Yadav, A., Bansal, J.C., Deep, K., Nagar, A.K. (eds) Soft Computing for Problem Solving. Advances in Intelligent Systems and Computing, vol 1393. Springer, Singapore. https://doi.org/10.1007/978-981-16-2712-5_44
