A computational model for subjective evaluation of novelty in descriptive aptitude

Abstract

Evaluating novelty in design education is subjective and generally depends on each expert’s own reference metrics. At present, practitioners in the field subjectively evaluate the answers of prospective students, but humans are prone to error when performing repetitive tasks at large scale. This paper therefore attempts to automate the evaluation of novelty with a proposed computational model. The study examines design aptitude in order to evaluate novelty in the solutions that students provide in an examination. Mixed-methods research, based on a structured questionnaire and its analysis, is conducted to investigate the features that evaluators in design education rely on when judging novelty subjectively. The survey yielded features that closely resemble human strategies for evaluating novelty in descriptive solutions. A computational model that evaluates novelty is then proposed, designed, and implemented. Unsupervised learning techniques generate a score for each feature, and a scoring function combines these scores into a novelty score. The model assigns unambiguous scores to solutions, which might help in the consistent selection of students seeking admission to design schools. The study attempts to reduce the pain points of educational practitioners by offering a voluntary automated technique for subjective evaluation, and to strengthen students’ trust in the examination process. In future, the model could be extended to evaluate other domains of interest.
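
As a rough illustration of the final step described above, the sketch below combines per-feature scores into a single novelty score using a weighted average. The weighted-average form, the feature names (borrowed from the questionnaire factors listed in the Appendix), and the weight values are all assumptions made for illustration; the abstract does not specify the scoring function.

```python
from typing import Dict

# Hypothetical feature names, taken from the questionnaire factors.
# The weights are illustrative placeholders, not values reported by the study.
ASSUMED_WEIGHTS: Dict[str, float] = {
    "grammar": 0.10,
    "spelling": 0.10,
    "relevance": 0.25,
    "narration": 0.20,
    "uniqueness": 0.35,
}


def novelty_score(feature_scores: Dict[str, float],
                  weights: Dict[str, float] = ASSUMED_WEIGHTS) -> float:
    """Combine per-feature scores in [0, 1] into one score in [0, 1]."""
    missing = set(weights) - set(feature_scores)
    if missing:
        raise ValueError(f"missing feature scores: {sorted(missing)}")
    total_weight = sum(weights.values())
    weighted = sum(weights[name] * feature_scores[name] for name in weights)
    # Dividing by the total weight keeps the result in [0, 1]
    # even if the weights do not sum exactly to one.
    return weighted / total_weight


if __name__ == "__main__":
    example = {"grammar": 0.9, "spelling": 1.0, "relevance": 0.8,
               "narration": 0.7, "uniqueness": 0.4}
    print(round(novelty_score(example), 3))
```

Normalizing by the total weight means the weights can be rescaled freely, for example when they are derived from expert rankings, without changing the range of the final score.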

Acknowledgements

We express our gratitude to all the experts who participated in this study. We also thank Ankit for help with some phases of the software implementation.

Author information

Corresponding author

Correspondence to Debayan Dhar.

Appendix

The following concepts are provided to help in filling out the form:

Novelty

Dictionary meaning: The quality of being new, original, or unusual.

Definition: Novelty encompasses both the new and the original. A novel idea does not resemble anything formerly known. Novelty may also be defined with reference to the previous ideas of the individual concerned.

Grammatical mistake

Meaning: Mistakes associated with sentence construction.

Misspellings

Meaning: Mistakes associated with spelling of a word.

Relevance between question and solution

Meaning: The answer must fit within the task constraints.

Narration link between sentences

Meaning: The ability to convey a narrative without breaks in the flow or divergence from the concept.

Unique concept of solution

Meaning: A new and original concept that has the least similarity to the other solutions.

Objective questions

Meaning: Objective answers are restricted to factual knowledge.

E.g., multiple-choice questions, fill in the blanks, match the following, etc.

Subjective questions

Meaning: Subjective answers reflect personal opinions, advice, preferences, and experiences.

E.g., descriptive short/long questions, essays, etc.

Descriptive answers

Meaning: This type of answer comprises sentences and paragraphs.

E.g., The camel is a useful domestic animal that has been tamed by human beings for thousands of years. It is popularly known as the “Ship of the Desert”. It is mainly found in the desert areas of Africa, Middle East Asia, etc. In India, it is found in the states of Rajasthan and Gujarat. It is able to walk long distances in desert areas where neither water nor vegetation exists. Camels are mostly used by the nomadic people of Africa and Asia. The camel has a long neck and long legs. People sit on the back of a camel to ride in desert areas.
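
To make the “unique concept of solution” factor defined above more concrete, the sketch below scores each solution by how dissimilar it is, on average, from every other solution in the same batch. It uses plain term-frequency vectors and cosine similarity from the Python standard library; this choice of representation and the function names are assumptions made for illustration and are not taken from the study's implementation.

```python
import math
import re
from collections import Counter
from typing import List


def term_vector(text: str) -> Counter:
    """Lower-case the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def uniqueness_scores(solutions: List[str]) -> List[float]:
    """Return 1 minus the mean similarity of each solution to all others."""
    vectors = [term_vector(s) for s in solutions]
    scores = []
    for i, v in enumerate(vectors):
        others = [cosine(v, w) for j, w in enumerate(vectors) if j != i]
        mean_sim = sum(others) / len(others) if others else 0.0
        scores.append(1.0 - mean_sim)
    return scores


if __name__ == "__main__":
    batch = [
        "a bottle that folds flat",
        "a bottle that folds flat",
        "solar kite charges phones",
    ]
    for text, score in zip(batch, uniqueness_scores(batch)):
        print(f"{score:.2f}  {text}")
```

A solution that shares no vocabulary with the rest of the batch receives the highest score, while duplicated answers pull each other's scores down. Word overlap is only a crude proxy for conceptual similarity, so a semantic representation would likely be needed in practice.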

  1. Which types of answers are more significant for evaluating creativity in students? (subjective/objective/both)

  2. Which of the following factors/stimuli are important to you when evaluating novelty in descriptive answers? Please rate each factor from ‘Very important’ to ‘Not at all important’. (grammatical mistake, misspellings, relevance between question and solution, narration link between sentences, unique concept of solution, any additional factor(s))

  3. Rank the following factors by the priority you give them when evaluating novelty in descriptive answers. Ranks range from 1 to 5, and two or more factors may receive the same rank. Rank 1 is the highest priority and rank 5 the lowest. (grammatical mistake, misspellings, relevance between question and solution, narration link between sentences, unique concept of solution, any additional factor(s))

  4. Personal details were collected, such as current profession, specialization, highest qualification, years of association with the design field, experience in selecting candidates, parameters affecting the evaluation, and experience in selecting candidates based on creativity.
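
The rank responses gathered in question 3 could be turned into feature weights in several ways. The sketch below shows one assumed scheme: average each factor's rank across respondents, invert it so that rank 1 counts most, and normalize so that the weights sum to one. The scheme, the factor keys, and the sample responses are hypothetical and are not taken from the study.

```python
from typing import Dict, List

MAX_RANK = 5  # question 3 allows ranks from 1 (highest priority) to 5 (lowest)


def weights_from_ranks(responses: List[Dict[str, int]]) -> Dict[str, float]:
    """Convert expert rank responses into normalized factor weights."""
    if not responses:
        raise ValueError("at least one response is required")
    factors = list(responses[0])
    # Mean rank per factor; tied ranks are allowed by the questionnaire.
    mean_rank = {f: sum(r[f] for r in responses) / len(responses)
                 for f in factors}
    # Invert so that a lower (better) rank yields a larger raw weight.
    raw = {f: (MAX_RANK + 1) - m for f, m in mean_rank.items()}
    total = sum(raw.values())
    return {f: v / total for f, v in raw.items()}


if __name__ == "__main__":
    # Made-up responses from two hypothetical experts.
    sample = [
        {"uniqueness": 1, "relevance": 2, "spelling": 5},
        {"uniqueness": 1, "relevance": 3, "spelling": 5},
    ]
    for factor, weight in weights_from_ranks(sample).items():
        print(f"{factor}: {weight:.3f}")
```

Because every rank lies between 1 and 5, each raw weight is at least 1, so no factor is dropped entirely even when all respondents rank it last.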

About this article

Cite this article

Chaudhuri, N.B., Dhar, D. & Yammiyavar, P.G. A computational model for subjective evaluation of novelty in descriptive aptitude. Int J Technol Des Educ (2020). https://doi.org/10.1007/s10798-020-09638-2

Keywords

  • Design education
  • Examination
  • Assessment
  • Unsupervised learning
  • Machine learning techniques