What is behind a summary-evaluation decision?

Zipitria, Iraide; Larrañaga, Pedro; Armañanzas, Ruben; Arruarte, Ana; Elorriaga, Jon A.

doi:10.3758/BRM.40.2.597

Published: May 2008

Volume 40, pages 597–612, (2008)

Behavior Research Methods

Abstract

Research in psychology has reported that, among the variety of possibilities for assessment methodologies, summary evaluation offers a particularly adequate context for inferring text comprehension and topic understanding. However, grades obtained in this methodology are hard to quantify objectively. Therefore, we carried out an empirical study to analyze the decisions underlying human summary-grading behavior. The task consisted of expert evaluation of summaries produced in critically relevant contexts of summarization development, and the resulting data were modeled by means of Bayesian networks using an application called Elvira, which allows for graphically observing the predictive power (if any) of the resultant variables. Thus, in this article, we analyzed summary-evaluation decision making in a computational framework.

Author information

Pedro Larrañaga
Present address: Technical University of Madrid, Spain

Authors and Affiliations

Department of Social Psychology and Behavioral Science Methodology, University of the Basque Country, Tolosa etorbidea, 70, 20018, Donostia, Basque Country, Spain
Iraide Zipitria, Pedro Larrañaga, Ruben Armañanzas, Ana Arruarte & Jon A. Elorriaga

Corresponding author

Correspondence to Iraide Zipitria.

Additional information

This work was partially supported by the University of the Basque Country (Grant UE06/19) and the Spanish Ministry of Education and Science (Grant TIN2006-14968-C02-01), as well as by the Gipuzkoa Council in collaboration with the European Union and by the Etortek, Saiotek, and Research Groups 2007-2012 (IT-242-07) programs (Basque Government), TIN2005-03824 and Consolider Ingenio 2010-CSD2007-00018 projects (Spanish Ministry of Education and Science), and COMBIOMED network in computational biomedicine (Carlos III Health Institute). R.A. is supported by Basque Government Grant AE-BFI-05/430.

About this article

Cite this article

Zipitria, I., Larrañaga, P., Armañanzas, R. et al. What is behind a summary-evaluation decision? Behavior Research Methods 40, 597–612 (2008). https://doi.org/10.3758/BRM.40.2.597

Received: 14 July 2007
Accepted: 24 November 2007
Issue Date: May 2008
DOI: https://doi.org/10.3758/BRM.40.2.597
