Abstract
The processing performance of vector-based document clustering methods is improved if automatic summarisation is used in addition to established forms of text pre-processing.
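To illustrate the claim, here is a minimal sketch of one way a summarisation step might sit in front of term-frequency vectorisation. It is not the authors' method: the frequency-based extractive summariser, the tiny stop-word list, and the names `summarise`, `tokenize`, `vocabulary` and `vectorise` are all assumptions made for this example. It only shows that summarised documents yield a smaller vocabulary, and therefore shorter vectors, which is one plausible reason clustering could run faster.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stop-word list (illustrative only).
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to", "was", "were"}


def tokenize(text):
    """Lower-case the text, keep alphabetic tokens, drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def summarise(text, max_sentences=1):
    """Hypothetical extractive summariser: keep the sentences whose tokens
    have the highest average frequency within the document itself."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(tokenize(text))

    def score(sentence):
        toks = tokenize(sentence)
        return sum(freq[t] for t in toks) / len(toks) if toks else 0.0

    keep = set(sorted(sentences, key=score, reverse=True)[:max_sentences])
    # Emit the kept sentences in their original order.
    return " ".join(s for s in sentences if s in keep)


def vocabulary(docs):
    """Sorted list of distinct terms across all documents."""
    return sorted({t for d in docs for t in tokenize(d)})


def vectorise(docs, vocab):
    """One raw term-frequency vector per document, one dimension per term."""
    vectors = []
    for d in docs:
        counts = Counter(tokenize(d))
        vectors.append([counts[t] for t in vocab])
    return vectors


# Made-up example documents.
docs = [
    "Clustering groups similar documents. Clustering uses document vectors. "
    "The weather was pleasant yesterday.",
    "Summaries keep the central sentences. Summaries drop marginal sentences. "
    "Lunch was served at noon.",
]

full_vocab = vocabulary(docs)
summaries = [summarise(d) for d in docs]
reduced_vocab = vocabulary(summaries)
reduced_vectors = vectorise(summaries, reduced_vocab)

print(len(full_vocab), len(reduced_vocab))
```

The reduced vectors can then be passed to any vector-based clustering routine. Whether cluster quality is preserved depends on the summariser and the data, which this sketch does not measure.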
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Latif, S., Wood, M.M. (2008). Text Pre-processing for Document Clustering. In: Kapetanios, E., Sugumaran, V., Spiliopoulou, M. (eds) Natural Language and Information Systems. NLDB 2008. Lecture Notes in Computer Science, vol 5039. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69858-6_44
DOI: https://doi.org/10.1007/978-3-540-69858-6_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69857-9
Online ISBN: 978-3-540-69858-6
eBook Packages: Computer Science, Computer Science (R0)