Research Data Reusability with Content-Based Recommender System

  • Conference paper
Deep Learning Theory and Applications (DeLTA 2023)

Abstract

Content-based recommender systems have gained significant attention in recent years as a means of enabling the reuse of research data artifacts. This study evaluates how effectively such systems improve the accessibility and reusability of research data artifacts. It combines a literature review with an empirical study to identify the strengths and limitations of content-based recommender systems for recommending research data collections (repositories). The empirical study involves developing and evaluating a prototype content-based recommender system for research data artifacts. The literature review reveals that content-based recommender systems have several strengths, including providing personalized recommendations, reducing information overload, and enhancing the quality of retrieved artifacts, especially when dealing with cold-start problems. The results of the empirical study indicate that the prototype provides relevant recommendations for research data repositories: evaluated with standard evaluation metrics, it achieves an accuracy of 79% in recommending relevant items. Additionally, a user evaluation confirms the relevance of the recommendations and indicates that the system enhances the accessibility and reusability of research data artifacts. In conclusion, the study provides evidence that content-based recommender systems can effectively enable the reusability of research data artifacts.
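
The abstract does not describe how the prototype is built, so the following is only a minimal sketch of the general content-based idea: represent each repository by a weighted term vector of its textual description and recommend the repositories whose vectors are most similar. The TF-IDF weighting, the cosine similarity, the function names (`tfidf_vectors`, `cosine`, `recommend`) and the sample descriptions are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep purely alphabetic whitespace-separated tokens."""
    return [tok for tok in text.lower().split() if tok.isalpha()]


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed inverse document frequency (an assumed, commonly used variant).
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [
        {term: count * idf[term] for term, count in Counter(toks).items()}
        for toks in tokenized
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(query_index, docs, top_k=2):
    """Return indices of the top_k documents most similar to docs[query_index]."""
    vectors = tfidf_vectors(docs)
    scored = [
        (cosine(vectors[query_index], vec), idx)
        for idx, vec in enumerate(vectors)
        if idx != query_index
    ]
    # Highest similarity first; ties broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:top_k]]


# Hypothetical repository descriptions, invented for this example.
REPOSITORIES = [
    "tensile strength measurements of steel alloys",
    "steel alloys fatigue measurements under load",
    "survey responses on student reading habits",
]

if __name__ == "__main__":
    print(recommend(0, REPOSITORIES, top_k=1))
```

Because the similarity is computed from item descriptions alone, a newly added repository can be recommended before any usage data exists, which is the cold-start property the abstract refers to.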

Notes

  1. www.coscine.de

  2. https://www.w3.org/TR/shacl/

  3. https://www.w3.org/TR/rdf11-concepts/

  4. www.scikit-learn.org

  5. https://pypi.org/project/DA4RDM-RecSys-ContentBased/

  6. https://nlpaug.readthedocs.io

Author information

Corresponding author

Correspondence to M. Amin Yazdi.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Yazdi, M.A., Politze, M., Heinrichs, B. (2023). Research Data Reusability with Content-Based Recommender System. In: Conte, D., Fred, A., Gusikhin, O., Sansone, C. (eds) Deep Learning Theory and Applications. DeLTA 2023. Communications in Computer and Information Science, vol 1875. Springer, Cham. https://doi.org/10.1007/978-3-031-39059-3_10

  • DOI: https://doi.org/10.1007/978-3-031-39059-3_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-39058-6

  • Online ISBN: 978-3-031-39059-3

  • eBook Packages: Computer Science; Computer Science (R0)
