User Modelling for Interactive User-Adaptive Collection Structuring
Automatic structuring is one means of easing access to document collections, whether for organization or for exploration. Of even greater help would be a presentation that adapts to the user’s way of structuring and is thus intuitively understandable. We extend an existing user-adaptive prototype system that is based on a growing self-organizing map and that learns a feature weighting scheme from a user’s interaction with the system, resulting in a personalized similarity measure. The proposed approach for adapting the feature weights addresses certain problems of the previously used heuristics. The revised adaptation method is based on quadratic optimization, which allows us to impose constraints on the derived weighting scheme and guarantees that an optimal weighting scheme is found if one exists. The proposed approach is evaluated by simulating user interaction with the system on two text datasets: an artificial dataset that is used to analyze the performance for different user types, and a real-world dataset (a subset of the Banksearch dataset) containing additional class information.
Keywords: Feature Weight, Vector Space Model, Adaptation Algorithm, Quadratic Optimization, Greedy Heuristic
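To illustrate why quadratic optimization is a natural fit for adapting feature weights, the following minimal sketch assumes a feature-weighted squared Euclidean distance as the personalized similarity measure; the abstract does not specify the measure, so this is an assumption. Under it, the requirement "the document the user moved must be closer to its target cell than to another cell" becomes a linear inequality in the weights, which is the kind of constraint a quadratic program can handle. All names (`weighted_sq_distance`, `constraint_coefficients`, `assignment_satisfied`) and the example vectors are hypothetical and not taken from the paper.

```python
def weighted_sq_distance(x, y, w):
    """Squared Euclidean distance with one weight per feature (assumed measure)."""
    return sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, y, w))


def constraint_coefficients(x, target, other):
    """Coefficients a_i such that sum(w_i * a_i) < 0 exactly when x is
    closer to `target` than to `other` under the weighted distance.
    The expression is linear in w."""
    return [(xi - ti) ** 2 - (xi - oi) ** 2 for xi, ti, oi in zip(x, target, other)]


def assignment_satisfied(x, target, other, w):
    """Check whether weights w respect the user's assignment of x to `target`."""
    coeffs = constraint_coefficients(x, target, other)
    return sum(wi * ai for wi, ai in zip(w, coeffs)) < 0


# Hypothetical example: a document and two map-cell prototypes.
doc = [1.0, 0.0, 2.0]
target_cell = [1.0, 3.0, 2.0]   # cell the user dragged the document to
other_cell = [2.0, 0.0, 3.0]    # cell the system had chosen

uniform = [1.0, 1.0, 1.0]       # initial, unadapted weights
adapted = [1.5, 0.1, 1.4]       # illustrative weights that down-weight feature 2

print(assignment_satisfied(doc, target_cell, other_cell, uniform))
print(assignment_satisfied(doc, target_cell, other_cell, adapted))
```

A full adaptation step would search for the weight vector closest to the current one that satisfies all such inequalities, plus any further constraints such as non-negativity; that search is left to a quadratic programming solver and is not shown here.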