Taxonomy-Driven Lumping for Sequence Mining
In many application domains, events are naturally organized in a hierarchy. Whether events describe human activities, system failures, coordinates in a trajectory, or biomedical phenomena, there is often a taxonomy that should be taken into consideration. A taxonomy allow us to represent the information at a more general description level, if we choose carefully the most suitable level of granularity.
Given a taxonomy of events and a dataset of sequences of these events, we study the problem of finding efficient and effective ways to produce a compact representation of the sequences. This can be valuable by itself, or can be used to help solving other problems, such as clustering.