A Semi-Automated Approach to Building Text Summarisation Classifiers

Garcia-Constantino, Matias; Coenen, Frans; Noble, P. -J.; Radford, Alan; Setzkorn, Christian

doi:10.1007/978-3-642-31537-4_39

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7376)

Included in the following conference series:

International Workshop on Machine Learning and Data Mining in Pattern Recognition

Abstract

An investigation into the extraction of useful information from the free text element of questionnaires, using a semi-automated summarisation extraction technique to generate text summarisation classifiers, is described. A realisation of the proposed technique, SARSET (Semi-Automated Rule Summarisation Extraction Tool), is presented and evaluated using real questionnaire data. The results of this approach are compared against the results obtained using two alternative techniques to build text summarisation classifiers. The first of these uses standard rule-based classifier generators, and the second is founded on the concept of building classifiers using secondary data. The results demonstrate that the proposed semi-automated approach outperforms the other two approaches considered.
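To make the idea of a rule-based text summarisation classifier concrete, the following is a minimal sketch, not SARSET itself and not the authors' method. It assumes that a "summary" is a class label assigned to a free-text response and that each rule is a set of keywords that must all appear in the response. The rules, labels, and function names below are hypothetical and invented purely for illustration.

```python
from typing import FrozenSet, List, Set, Tuple

# A rule pairs a set of required keywords with a class label.
# These rules are hypothetical examples, not taken from the paper.
Rule = Tuple[FrozenSet[str], str]

RULES: List[Rule] = [
    (frozenset({"delivery", "late"}), "delivery complaint"),
    (frozenset({"refund"}), "refund request"),
]


def tokenise(text: str) -> Set[str]:
    """Lower-case the text and split on whitespace, trimming punctuation."""
    return {word.strip(".,;:!?").lower() for word in text.split()}


def classify(text: str, rules: List[Rule] = RULES,
             default: str = "unclassified") -> str:
    """Return the label of the first rule whose keywords all occur in text."""
    tokens = tokenise(text)
    for keywords, label in rules:
        if keywords <= tokens:  # every keyword of the rule is present
            return label
    return default


if __name__ == "__main__":
    for response in ("The delivery was late again.",
                     "I would like a refund, please.",
                     "No further comments."):
        print(response, "->", classify(response))
```

Rules are tried in order and the first match wins, so under this assumption more specific rules would be listed before more general ones. In a semi-automated setting, a rule list of this kind could be proposed by a tool and then refined by a domain expert.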

Author information

Authors and Affiliations

Department of Computer Science, The University of Liverpool, Liverpool, L69 3BX, UK
Matias Garcia-Constantino & Frans Coenen
School of Veterinary Science, University of Liverpool, Leahurst, Neston, CH64 7TE, UK
P. -J. Noble, Alan Radford & Christian Setzkorn

Editor information

Editors and Affiliations

Institute of Computer Vision and Applied Computer Sciences, IBaI, Kohlenstraße 2, 04107, Leipzig, Germany
Petra Perner

About this paper

Cite this paper

Garcia-Constantino, M., Coenen, F., Noble, P.J., Radford, A., Setzkorn, C. (2012). A Semi-Automated Approach to Building Text Summarisation Classifiers. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2012. Lecture Notes in Computer Science, vol 7376. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31537-4_39

DOI: https://doi.org/10.1007/978-3-642-31537-4_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31536-7
Online ISBN: 978-3-642-31537-4
eBook Packages: Computer Science, Computer Science (R0)
