Detecting Complex Dependencies in Categorical Data
Locating and evaluating relationships among values in multiple streams of data is a difficult and important task. Consider the data flowing from monitors in an intensive care unit: readings from various subsets of the monitors are indicative and predictive of certain aspects of the patient’s state. We present an algorithm, Multi-stream Dependency Detection (MSDD), that facilitates the discovery of such predictive relationships and the assessment of their strength. MSDD uses heuristic search to guide its exploration of the space of potentially interesting dependencies and to uncover those that are significant. We begin by reviewing a previously described dependency detection technique, then extend it to the multiple-stream case, describing in detail our heuristic search over the space of possible dependencies. A series of experiments with artificially generated data provides quantitative evidence for the utility of our approach. In addition, we present results from applying the algorithm to two real problem domains: feature-based classification and prediction of pathologies in a simulated shipping network.
Keywords: Stream Length, Structure Rule, Multiple Stream, Predictive Rule, Dependency Rule
- Howe, Adele E. and Cohen, Paul R. Understanding Planner Behavior. To appear in AI Journal, Winter 1995.
- Murphy, P. M. and Aha, D. W. UCI Repository of machine learning databases [Machine-readable data repository]. Irvine, CA: University of California, Department of Information and Computer Science, 1994.
- Oates, Tim. MSDD as a Tool for Classification. Memo 94-29, Experimental Knowledge Systems Laboratory, Department of Computer Science, University of Massachusetts, Amherst, 1994.
- Oates, Tim and Cohen, Paul R. Toward a plan steering agent: Experiments with schedule maintenance. In Proceedings of the Second International Conference on Artificial Intelligence Planning Systems, pp. 134–139, 1994.
- Thrun, S. B. The MONK’s problems: A performance comparison of different learning algorithms. Carnegie Mellon University, CMU-CS-91-197, 1991.
- Wirth, J. and Catlett, J. Experiments on the costs and benefits of windowing in ID3. In Proceedings of the Fifth International Conference on Machine Learning, pp. 87–99, 1988.
- Zheng, Zijian. A benchmark for classifier learning. Basser Department of Computer Science, University of Sydney, NSW.