Large-Scale and Big Optimization Based on Hadoop

Chapter in: Big Data Optimization: Recent Developments and Challenges

Part of the book series: Studies in Big Data ((SBD,volume 18))

Abstract

Integer Linear Programming (ILP) is among the most popular optimization techniques in practical applications; however, it often runs into computational difficulties when modeling real-world problems. As problem size grows, the computation can easily exceed the capacity of a standalone computer. Modern distributed computing relaxes this constraint by providing scalable computing resources that match application needs, which makes large-scale optimization practical. This chapter presents a paradigm that leverages Hadoop, an open-source distributed computing framework, to solve a large-scale ILP problem abstracted from real-world air traffic flow management. The ILP involves millions of decision variables and is intractable even for existing state-of-the-art optimization software packages. The dual decomposition method is used to separate the variables into a set of dual subproblems, which are smaller ILPs of lower dimension and therefore lower computational complexity. As a result, the subproblems can be solved with existing optimization tools. It is shown that the iterative update of the Lagrangian multipliers in the dual decomposition method fits into Hadoop's MapReduce programming model, which is designed to allocate computations to a cluster for parallel processing and to collect the results from each node into an aggregate result. Thanks to the scalability of distributed computing, parallelism can be improved by adding more worker nodes to the Hadoop cluster. As a result, the computational efficiency of solving the whole ILP problem is not impacted by the input size.
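The way dual decomposition maps onto a map/reduce pattern can be illustrated with a minimal sketch. This is not the chapter's implementation and does not use Hadoop: it runs in a single Python process, the toy data (`SUBPROBLEMS`, `CAPACITY`) and all function names are hypothetical, and each "subproblem" is reduced to picking one option from a short list rather than solving a real ILP. The map step solves each subproblem independently for fixed multipliers, the reduce step aggregates resource usage, and a projected subgradient step (an assumed update rule with a diminishing step size) adjusts the multipliers.

```python
from functools import reduce

# Hypothetical toy data: each subproblem (e.g. one flight) picks exactly one
# option; an option is (cost, usage of each shared resource).
SUBPROBLEMS = [
    [(1.0, (1, 0)), (3.0, (0, 1)), (4.0, (0, 0))],
    [(1.0, (1, 0)), (2.0, (0, 1)), (5.0, (0, 0))],
    [(2.0, (1, 0)), (2.5, (0, 1)), (6.0, (0, 0))],
]
CAPACITY = (1, 1)  # coupling constraint: total usage <= capacity


def map_solve(options, lam):
    """Map step: solve one small subproblem for fixed multipliers lam."""
    def priced(option):
        cost, usage = option
        # Lagrangian-priced cost: original cost plus multiplier-weighted usage.
        return cost + sum(l * u for l, u in zip(lam, usage))
    return min(options, key=priced)


def reduce_usage(acc, result):
    """Reduce step: accumulate resource usage over subproblem results."""
    _, usage = result
    return tuple(a + u for a, u in zip(acc, usage))


def dual_decomposition(subproblems, capacity, iters=50, step0=1.0):
    """Iterate map, reduce, and a projected subgradient multiplier update."""
    lam = [0.0] * len(capacity)
    results, usage = [], (0,) * len(capacity)
    for k in range(1, iters + 1):
        # Subproblems are independent given lam, so this map could run in parallel.
        results = [map_solve(options, lam) for options in subproblems]
        usage = reduce(reduce_usage, results, (0,) * len(capacity))
        step = step0 / k  # diminishing step size (an assumption of this sketch)
        # Raise the price of over-used resources; keep multipliers non-negative.
        lam = [max(0.0, l + step * (u - c))
               for l, u, c in zip(lam, usage, capacity)]
    return lam, results, usage


if __name__ == "__main__":
    lam, results, usage = dual_decomposition(SUBPROBLEMS, CAPACITY)
    print("multipliers:", lam)
    print("choices:", results)
    print("usage:", usage)
```

In a Hadoop setting the list comprehension would correspond to mapper tasks spread across worker nodes and `reduce_usage` to the reducer, with the multiplier update performed between successive jobs. The choices returned by the last iteration are not guaranteed to satisfy the coupling constraint; recovering a feasible solution typically needs an additional step that this sketch omits.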

Notes

  1. http://hadoop.apache.org

  2. http://www.coin-or.org

  3. http://www-01.ibm.com/software/commerce/optimization/cplex-optimizer/

Author information

Correspondence to Yi Cao.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Cao, Y., Sun, D. (2016). Large-Scale and Big Optimization Based on Hadoop. In: Emrouznejad, A. (eds) Big Data Optimization: Recent Developments and Challenges. Studies in Big Data, vol 18. Springer, Cham. https://doi.org/10.1007/978-3-319-30265-2_16

  • DOI: https://doi.org/10.1007/978-3-319-30265-2_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-30263-8

  • Online ISBN: 978-3-319-30265-2

  • eBook Packages: Engineering, Engineering (R0)
