Ensemble Model Compression for Fast and Energy-Efficient Ranking on FPGAs

  • Conference paper
  • In: Advances in Information Retrieval (ECIR 2022)

Abstract

We investigate novel SoC-FPGA solutions for fast and energy-efficient ranking based on machine-learned ensembles of decision trees. Since the memory footprint of ranking ensembles limits the effective exploitation of programmable logic for large-scale inference tasks, we investigate binning and quantization techniques to reduce the memory occupation of the learned model, and we optimize the state-of-the-art ensemble-traversal algorithm for deployment on low-cost, energy-efficient FPGA devices. The results of experiments conducted on publicly available Learning-to-Rank datasets show that our model compression techniques do not significantly impact accuracy. Moreover, the reduced space requirements allow the models and the logic to be replicated on the FPGA device so as to execute several inference tasks in parallel. We discuss in detail the experimental settings and the feasibility of deploying the proposed solution in a real setting. The experiments show that our FPGA solution achieves state-of-the-art performance and consumes from 9× up to 19.8× less energy than an equivalent multi-threaded CPU implementation.
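
To make the compression idea concrete, the sketch below shows one plausible way to bin split thresholds and quantize leaf values of a tree ensemble. It is a minimal NumPy illustration under our own assumptions (the bin count, bit widths, and function names are hypothetical), not the authors' implementation, which is linked in note 4 below.

```python
import numpy as np

def bin_thresholds(thresholds, n_bins=256):
    """Map the split thresholds of one feature onto at most n_bins
    representative values (here: quantiles), so each threshold can be
    stored as a one-byte bin index instead of a 32-bit float."""
    bins = np.quantile(thresholds, np.linspace(0.0, 1.0, n_bins))
    # Right-neighbour mapping: index of the first representative value
    # that is >= the original threshold.
    idx = np.searchsorted(bins, thresholds).clip(0, n_bins - 1)
    return idx.astype(np.uint8), bins.astype(np.float32)

def quantize_leaves(leaf_values, n_bits=16):
    """Quantize leaf outputs to signed fixed-point integers sharing one
    scale factor; scores are recovered as leaf_q * scale at inference."""
    scale = max(np.abs(leaf_values).max(), 1e-12) / (2 ** (n_bits - 1) - 1)
    leaf_q = np.round(leaf_values / scale).astype(np.int16)
    return leaf_q, scale

# Hypothetical usage on a handful of thresholds and leaf values.
thr = np.array([0.12, 0.37, 0.38, 0.90, 1.75], dtype=np.float32)
thr_idx, thr_bins = bin_thresholds(thr, n_bins=16)
leaves_q, scale = quantize_leaves(np.array([-0.031, 0.004, 0.027]))
print(thr_idx, leaves_q, scale)
```

In a scheme of this kind, thresholds shrink from 32-bit floats to 1-byte bin indices and leaf values to 16-bit integers: this is the sort of footprint reduction that, as the abstract notes, makes it possible to replicate the model and its scoring logic on the device and run several inference tasks in parallel.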


Notes

  1. https://dataaspirant.com/xgboost-algorithm/.

  2. http://research.microsoft.com/en-us/projects/mslr/.

  3. http://quickrank.isti.cnr.it/istella-dataset/.

  4. Code available at https://github.com/hpclab/model_compression_for_ranking_on_fpga.

  5. https://github.com/DanieleDeSensi/mammut.

  6. https://www.maximintegrated.com/en/products/power/switching-regulators/maxpowertool002.html.


Acknowledgements

This work was partially supported by the project HAMLET: Hardware Acceleration of Machine LEarning Tasks, funded under the 2017-2018 collaboration program between CONICET (Argentina) and CNR (Italy); by the TEACHING project, funded by the EU Horizon 2020 Research and Innovation program (Grant agreement ID: 871385); and by the OK-INSAID project, funded by the Italian Ministry of Education and Research (GA no. ARS01_00917).

Author information

Corresponding author

Correspondence to Salvatore Trani.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Gil-Costa, V., Loor, F., Molina, R., Nardini, F., Perego, R., Trani, S. (2022). Ensemble Model Compression for Fast and Energy-Efficient Ranking on FPGAs. In: Hagen, M., et al. Advances in Information Retrieval. ECIR 2022. Lecture Notes in Computer Science, vol 13185. Springer, Cham. https://doi.org/10.1007/978-3-030-99736-6_18

  • DOI: https://doi.org/10.1007/978-3-030-99736-6_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-99735-9

  • Online ISBN: 978-3-030-99736-6

  • eBook Packages: Computer Science (R0)
