Generalisation Operators for Lists Embedded in a Metric Space
In some application areas, similarities and distances quantify how alike two objects are; these measurements are then used to find related objects, to cluster a set of objects, to make classifications or to perform an approximate search guided by the distance. In many other application areas, we require patterns to describe similarities in the data. These patterns are usually constructed through generalisation (or specialisation) operators. Distances can be defined for every data structure; in fact, different distances exist for sets, lists, atoms, numbers, ontologies, web pages, etc. We can also define pattern languages and use generalisation operators over them. However, for many data structures, distances and generalisation operators are not consistent. For instance, for lists (or sequences), edit distances are not consistent with regular languages: for a regular pattern such as *a, the lists it covers may be far apart in terms of the edit distance (e.g. bbbbbba and aa). In this paper we investigate how, given a pattern language, we can define a generalisation operator and a distance that are consistent with each other. We define the notion of (minimal) distance-based generalisation operators for lists. We illustrate positive results with two different pattern languages.
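The inconsistency in the *a example can be checked with a small sketch. It rests on two assumptions that the abstract does not fix: the edit distance is taken to be the unit-cost Levenshtein distance (insertions, deletions and substitutions each cost 1), and the pattern *a is read as "any list ending in a". The function names `edit_distance` and `covered_by_star_a` are illustrative only and do not come from the paper.

```python
def edit_distance(s, t):
    """Unit-cost Levenshtein distance between two sequences (assumed variant)."""
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]  # distance from s[:i] to the empty prefix of t
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # delete cs
                            curr[j - 1] + 1,      # insert ct
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def covered_by_star_a(s):
    """Assumed reading of the pattern *a: any list whose last element is 'a'."""
    return s.endswith("a")


if __name__ == "__main__":
    for x, y in [("bbbbbba", "aa"), ("aa", "ab")]:
        print(x, y,
              "distance:", edit_distance(x, y),
              "both covered:", covered_by_star_a(x) and covered_by_star_a(y))
```

Under these assumptions, bbbbbba and aa are both covered by *a yet lie at distance 6, whereas ab is at distance 1 from aa and is not covered. The pattern therefore groups distant lists while excluding a near neighbour, which is the kind of mismatch between distance and generalisation that the abstract describes.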
Keywords: Distance-based methods, inductive operators, induction with distances, list-based representations