Dataset Morphing to Analyze the Performance of Collaborative Filtering
Machine Learning algorithms are often too complex to be studied from a purely analytical point of view. Alternatively, given a reasonably large number of datasets, one can empirically observe the behavior of an algorithm under different conditions and hypothesize some of its general characteristics. This knowledge about algorithms can then be used to choose the most appropriate one for a new dataset, a very hard problem that can be approached using metalearning. Unfortunately, the number of available datasets may not be sufficient to obtain reliable meta-knowledge. Additionally, datasets may change over time, growing, shrinking and being edited as a result of natural actions such as people buying on an e-commerce site. In this paper we propose dataset morphing as the basis of a novel methodology that can help overcome these drawbacks and can be used to better understand ML algorithms. It consists of manipulating real datasets through the iterative application of gradual transformations (morphing), observing the changes in the behavior of learning algorithms, and relating these changes to changes in the metafeatures of the morphed datasets. Although dataset morphing can be envisaged in a much wider framework, we focus on one very specific instance: the study of collaborative filtering algorithms on binary data. Results show that the proposed approach is feasible and that it can be used to identify useful metafeatures to predict the best collaborative filtering algorithm for a given dataset.
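The idea of gradually transforming one binary dataset into another while tracking a metafeature can be illustrated with a minimal sketch. The abstract does not specify the transformations used, so everything here is an assumption made for illustration: the morphing rule (copy a random batch of differing cells from the target at each step), the function names `density`, `morph_step` and `morph`, and the choice of density as the metafeature. Evaluating collaborative filtering algorithms on each intermediate dataset is omitted.

```python
import numpy as np


def density(matrix):
    """Example metafeature: fraction of user-item cells equal to 1."""
    return float(matrix.mean())


def morph_step(current, target, n_flips, rng):
    """Return a copy of `current` with up to `n_flips` differing cells
    overwritten by the corresponding values of `target`."""
    diff = np.argwhere(current != target)
    if len(diff) == 0:
        return current.copy()
    k = min(n_flips, len(diff))
    chosen = diff[rng.choice(len(diff), size=k, replace=False)]
    result = current.copy()
    result[chosen[:, 0], chosen[:, 1]] = target[chosen[:, 0], chosen[:, 1]]
    return result


def morph(source, target, n_steps, seed=0):
    """Hypothetical morphing path: a list of n_steps + 1 datasets that
    starts at `source` and ends at `target` (assumes n_steps >= 1 and
    that both matrices have the same shape)."""
    rng = np.random.default_rng(seed)
    n_diff = int((source != target).sum())
    n_flips = -(-n_diff // n_steps)  # ceiling division
    path = [source.copy()]
    for _ in range(n_steps):
        path.append(morph_step(path[-1], target, n_flips, rng))
    return path


# Toy binary user-item matrices (synthetic, for illustration only).
demo_rng = np.random.default_rng(42)
source = (demo_rng.random((20, 15)) < 0.1).astype(int)
target = (demo_rng.random((20, 15)) < 0.4).astype(int)

path = morph(source, target, n_steps=5)
for step, data in enumerate(path):
    print(step, round(density(data), 3))
```

In a fuller experiment, each dataset in `path` would also be passed to the collaborative filtering algorithms under study, so that changes in their performance can be set against changes in the metafeature values.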
Keywords: Recommender Systems, Metalearning
This work is funded by ERDF through the Operational Programme of Competitiveness and Internationalization—COMPETE 2020—of Portugal 2020 within project PushNews—POCI-01-0247-FEDER-0024257.