Message Retrieval and Classification from Chat Room Servers Using Bayesian Networks

  • Debbie Zhang
  • Simeon Simoff
  • John Debenham
Part of the IFIP International Federation for Information Processing book series (IFIPAICT, volume 228)


Chat rooms and newsgroups on the internet are a valuable, and often free, source of information. This paper proposes a design for smart chat room bots that automatically retrieve and filter online messages. The design is based on internet technology and Bayesian networks. Technical details of connecting to and retrieving data from web-based chat room servers are presented. A naive Bayesian network classifier is implemented, using as input features the frequencies of the keywords that appear most often in the selected messages. A prototype of the message classification system has been implemented and trialed on detecting investment-related messages from four Australian chat room sites.
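
The keyword-frequency classification idea can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword list, the toy training messages, the multinomial form of the likelihood, and the Laplace smoothing constant `alpha` are all assumptions made for illustration only.

```python
import math
from collections import Counter

# Hypothetical keyword list (assumption); the paper derives its keywords
# from its own chat room data.
KEYWORDS = ["shares", "dividend", "market", "stock", "football", "movie"]


def keyword_counts(message):
    """Count how often each keyword occurs in a message (case-insensitive)."""
    tokens = Counter(message.lower().split())
    return [tokens[k] for k in KEYWORDS]


def train(messages, labels, alpha=1.0):
    """Estimate log-priors and Laplace-smoothed keyword log-likelihoods per class."""
    model = {}
    for c in sorted(set(labels)):
        rows = [keyword_counts(m) for m, y in zip(messages, labels) if y == c]
        totals = [sum(col) for col in zip(*rows)]
        denom = sum(totals) + alpha * len(KEYWORDS)
        model[c] = {
            "log_prior": math.log(len(rows) / len(messages)),
            "log_like": [math.log((t + alpha) / denom) for t in totals],
        }
    return model


def classify(model, message):
    """Return the class with the highest posterior log-score for the message."""
    counts = keyword_counts(message)

    def score(c):
        params = model[c]
        return params["log_prior"] + sum(
            n * ll for n, ll in zip(counts, params["log_like"])
        )

    return max(model, key=score)


# Toy training data (assumption), labelled by hand.
TRAIN = [
    ("buy shares before the dividend", "investment"),
    ("the stock market fell today", "investment"),
    ("great football match last night", "other"),
    ("that movie was fun", "other"),
]
model = train([m for m, _ in TRAIN], [y for _, y in TRAIN])

print(classify(model, "which stock pays a dividend"))
```

The naive independence assumption is what lets the score be a simple sum over keywords: each keyword contributes its count times its per-class log-likelihood, regardless of which other keywords appear.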

Key words

Information retrieval, Bayesian network, web mining



Copyright information

© International Federation for Information Processing 2006

Authors and Affiliations

  • Debbie Zhang (1)
  • Simeon Simoff (1)
  • John Debenham (1)
  1. Faculty of Information Technology, University of Technology, Sydney
