DRAL: a tool for discovering relevant e-activities for learners

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Web-based educational systems routinely collect vast quantities of data on students’ e-activities, generating log files that give researchers unique opportunities to apply data mining techniques and discover information that can improve the learning process. This paper proposes DRAL, a friendly and intuitive tool that detects the e-activities most relevant for a student to pass a course, based on features extracted from the logged data of a web-based educational system. The method represents the available information more flexibly through multiple instance learning, which prevents the appearance of a large number of missing values. It relies on a multi-objective grammar-guided genetic programming algorithm that obtains simple and clear classification rules; these rules identify the number, type and time of the e-activities that give a student a high probability of passing a course. To validate the approach, the proposal is compared with the most traditional multiple instance learning proposals developed over the years. Experimental results show that the proposed approach improves on the accuracy of previous models by finding a balance between specificity and sensitivity values.
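As a rough illustration of the ideas in the abstract, the sketch below shows how a multiple instance representation might avoid missing values and how a single classification rule could be scored by sensitivity and specificity. It is not DRAL's implementation: the field names, the toy data, the rule, and the use of the standard multiple-instance assumption (a bag is positive if at least one instance matches) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Instance:
    """One kind of e-activity a student actually performed (hypothetical fields)."""
    activity: str    # e.g. "quiz", "forum", "assignment"
    count: int       # number of times the activity was carried out
    minutes: float   # total time spent on it


# A bag holds only the activities a student performed. A student who never
# used the forum simply has no "forum" instance, so no missing value arises.
Bag = List[Instance]
Rule = Callable[[Instance], bool]


def classify(bag: Bag, rule: Rule) -> bool:
    """Assumed standard MI hypothesis: positive if any instance satisfies the rule."""
    return any(rule(inst) for inst in bag)


def sensitivity_specificity(
    bags: Sequence[Bag], labels: Sequence[bool], rule: Rule
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of a rule over labelled bags."""
    tp = fn = tn = fp = 0
    for bag, passed in zip(bags, labels):
        predicted = classify(bag, rule)
        if passed and predicted:
            tp += 1
        elif passed:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


# Toy data, invented for illustration only.
BAGS: List[Bag] = [
    [Instance("quiz", 5, 60.0), Instance("forum", 2, 10.0)],  # passed
    [Instance("quiz", 1, 5.0)],                               # failed
    [Instance("assignment", 3, 120.0)],                       # passed
    [Instance("forum", 8, 40.0)],                             # failed
]
LABELS = [True, False, True, False]


def toy_rule(inst: Instance) -> bool:
    """Hypothetical rule: the student completed at least three quizzes."""
    return inst.activity == "quiz" and inst.count >= 3


if __name__ == "__main__":
    print(sensitivity_specificity(BAGS, LABELS, toy_rule))
```

On this toy data the rule never flags a failing student but misses one passing student, which is the kind of trade-off a multi-objective search would weigh when balancing the two measures.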



Acknowledgments

The authors gratefully acknowledge the financial subsidy provided by the Spanish Department of Research under projects TIN2008-06681-C06-03 and P08-TIC-3720, and by FEDER funds.

Author information


Correspondence to Amelia Zafra.


About this article

Cite this article

Zafra, A., Romero, C. & Ventura, S. DRAL: a tool for discovering relevant e-activities for learners. Knowl Inf Syst 36, 211–250 (2013). https://doi.org/10.1007/s10115-012-0531-8


  • DOI: https://doi.org/10.1007/s10115-012-0531-8
