A real-time recommendation engine using lambda architecture

  • Original Article
Artificial Life and Robotics

Abstract

In data science, recommendation is one of the most popular methodologies and has been deployed in many industries. However, one of the most challenging problems today is how to recommend items from massive streaming data. This paper therefore presents a real-time recommendation engine built on the Lambda architecture. Apache Hadoop and Apache Spark were used to process the MovieLens datasets from GroupLens Research, comprising 100 K and 20 M ratings. Using the alternating least squares (ALS) and k-means algorithms, the engine produced the top K recommended movies and the top K trending movies for each user. The mean squared error (MSE) and the within-cluster sum of squared errors (WCSS) were computed to evaluate the ALS and k-means algorithms, respectively. The results were judged acceptable because the MSE and WCSS values are low relative to the size of the data, although they could still be improved by tuning parameters.
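
The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It uses their standard definitions in plain Python rather than the Spark implementation used in the paper, and the function names and toy data are hypothetical, chosen only for illustration.

```python
from typing import Sequence


def mean_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of squared differences between observed and predicted ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def wcss(points: Sequence[Sequence[float]],
         centroids: Sequence[Sequence[float]],
         assignments: Sequence[int]) -> float:
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    total = 0.0
    for point, cluster in zip(points, assignments):
        centroid = centroids[cluster]
        total += sum((x - c) ** 2 for x, c in zip(point, centroid))
    return total


# Hypothetical toy data, not taken from the paper.
ratings_actual = [4.0, 3.0, 5.0, 2.0]
ratings_predicted = [3.5, 3.0, 4.0, 2.5]
print(mean_squared_error(ratings_actual, ratings_predicted))  # 0.375

points = [(1.0, 1.0), (1.0, 3.0), (5.0, 5.0), (7.0, 5.0)]
centroids = [(1.0, 2.0), (6.0, 5.0)]
assignments = [0, 0, 1, 1]
print(wcss(points, centroids, assignments))  # 4.0
```

MSE is an average, whereas WCSS is a sum over all points and so grows with the number of points. This may be why the abstract assesses the values relative to the size of the data.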

Author information

Corresponding author

Correspondence to Thanisa Numnonda.

About this article

Cite this article

Numnonda, T. A real-time recommendation engine using lambda architecture. Artif Life Robotics 23, 249–254 (2018). https://doi.org/10.1007/s10015-017-0424-8

  • DOI: https://doi.org/10.1007/s10015-017-0424-8
