
Multipass Algorithms for Mining Association Rules in Text Databases

  • Regular Paper
Knowledge and Information Systems

Abstract.

In this paper, we propose two new algorithms for mining association rules between words in text databases. The characteristics of text databases differ considerably from those of retail transaction databases, and existing mining algorithms cannot handle text databases efficiently because of the large number of itemsets (i.e., sets of words) that need to be counted. Two well-known mining algorithms, the Apriori algorithm and the Direct Hashing and Pruning (DHP) algorithm, are evaluated in the context of mining text databases and compared with the proposed algorithms, Multipass-Apriori (M-Apriori) and Multipass-DHP (M-DHP). The proposed algorithms are shown to perform better on large text databases.
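For orientation, the baseline Apriori approach named in the abstract can be sketched roughly as follows, treating each document as a transaction whose items are its distinct words. This is a minimal illustration of standard Apriori only, not the authors' M-Apriori or M-DHP algorithms, whose details are not given here. The function names `apriori` and `association_rules`, the whitespace tokenization, the thresholds, and the sample documents are all assumptions made for the example.

```python
from itertools import combinations


def apriori(documents, min_support):
    """Return {frozenset_of_words: document_count} for all frequent itemsets.

    Assumption: each document is a transaction of its distinct,
    lowercased, whitespace-separated words.
    """
    transactions = [frozenset(doc.lower().split()) for doc in documents]

    # Pass 1: count individual words.
    counts = {}
    for t in transactions:
        for word in t:
            key = frozenset([word])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join frequent (k-1)-itemsets into k-itemsets, keeping only those
        # whose every (k-1)-subset is itself frequent (the Apriori pruning step).
        candidates = set()
        for a in prev:
            for b in prev:
                union = a | b
                if len(union) == k and all(
                    frozenset(sub) in prev for sub in combinations(union, k - 1)
                ):
                    candidates.add(union)
        # Pass k: count how many documents contain each candidate.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


def association_rules(frequent, min_confidence):
    """Derive (lhs, rhs, confidence) rules from the frequent itemsets."""
    rules = []
    for itemset, support in frequent.items():
        if len(itemset) < 2:
            continue
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is frequent,
                # so frequent[lhs] is always present.
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules


# Hypothetical toy corpus, for illustration only.
docs = [
    "data mining text",
    "text mining rules",
    "data mining rules",
    "text data mining",
]
frequent = apriori(docs, min_support=3)
rules = association_rules(frequent, min_confidence=0.9)
```

Even in this toy setting, the number of candidate itemsets grows with the vocabulary, which hints at why counting becomes the bottleneck on real text collections.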


Additional information

Received 12 November 1999 / Revised 27 September 2000 / Accepted in revised form 25 October 2000

About this article

Cite this article

Holt, J., Chung, S. Multipass Algorithms for Mining Association Rules in Text Databases. Knowledge and Information Systems 3, 168–183 (2001). https://doi.org/10.1007/PL00011664

  • DOI: https://doi.org/10.1007/PL00011664
