Interactive Refinement of Filtering Queries on Streaming Intelligence Data

  • Yiming Ma
  • Dawit Yimam Seid
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3975)


Intelligence analysis involves routinely monitoring and correlating large amounts of data streaming from multiple sources. To detect important patterns, the analyst normally needs to look at data gathered over a certain time window. Given the size of the data and the rate at which it arrives, it is usually impossible to manually process every record or case. Instead, automated filtering (classification) mechanisms are employed to identify information relevant to the analyst’s task. In this paper, we present a novel system framework called FREESIA (Filter REfinement Engine for Streaming InformAtion) to effectively generate, utilize, and update filtering queries on streaming data.


Keywords (added by machine, not by the authors): Relevance Feedback, Streaming Data, Initial Query, Similarity Query, Relevant Record





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Yiming Ma (1)
  • Dawit Yimam Seid (1)
  1. School of Information and Computer Science, University of California, Irvine, USA
