Abstract
Motivated by various binary classification problems in structured data (e.g., graphs or other relational and algebraic structures), we investigate some algorithmic properties of closed set and half-space separation in abstract closure systems. Assuming that the underlying closure system is finite and given by the corresponding closure operator, we formulate some negative and positive complexity results for these two separation problems. In particular, we prove that deciding half-space separability in abstract closure systems is NP-complete in general. On the other hand, for the relaxed problem of maximal closed set separation we propose a simple greedy algorithm and show that it is efficient and has the best possible lower bound on the number of closure operator calls. As a second direction to overcome the negative result above, we consider Kakutani closure systems and show first that our greedy algorithm provides an algorithmic characterization of this kind of set systems. As one of the major potential application fields, we then focus on Kakutani closure systems over graphs and generalize a fundamental characterization result based on the Pasch axiom to graph structure partitioning of finite sets. Though the primary focus of this work is on the generality of the results obtained, we experimentally demonstrate the practical usefulness of our approach on vertex classification in different graph datasets.
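The greedy procedure for maximal closed set separation mentioned above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the closure operator is assumed to be given as a Python callable on frozensets, the names `greedy_separation` and `interval_closure` are hypothetical, and the toy closure operator (integer intervals) is chosen only to make the example runnable.

```python
from typing import Callable, FrozenSet, Iterable, Optional, Tuple

# Assumed interface: a closure operator maps a frozenset to its closure.
Closure = Callable[[FrozenSet[int]], FrozenSet[int]]


def greedy_separation(
    ground: Iterable[int],
    closure: Closure,
    a: Iterable[int],
    b: Iterable[int],
) -> Optional[Tuple[FrozenSet[int], FrozenSet[int]]]:
    """Greedily extend the closures of a and b to disjoint closed sets.

    Returns None if the closures of a and b already intersect.
    """
    h1 = closure(frozenset(a))
    h2 = closure(frozenset(b))
    if h1 & h2:
        return None  # a and b are not separable by closed sets
    for e in sorted(ground):
        if e in h1 or e in h2:
            continue
        # Try to add e to the first set, keeping it disjoint from the second.
        c1 = closure(h1 | {e})
        if not (c1 & h2):
            h1 = c1
            continue
        # Otherwise try the second set.
        c2 = closure(h2 | {e})
        if not (c2 & h1):
            h2 = c2
    return h1, h2


def interval_closure(s: FrozenSet[int]) -> FrozenSet[int]:
    """Toy closure operator (hypothetical): smallest integer interval containing s."""
    if not s:
        return frozenset()
    return frozenset(range(min(s), max(s) + 1))


separated = greedy_separation(range(10), interval_closure, {1, 2}, {7})
not_separable = greedy_separation(range(10), interval_closure, {1, 5}, {3})
```

A single pass over the ground set suffices in this sketch because both sets only grow: an element that cannot be added to either set at its turn cannot be added later. In the toy example the two returned sets happen to partition the ground set, i.e. they form a half-space separation; in general the result is only a pair of disjoint closed sets that cannot be extended further.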
Notes
- 1. Throughout this work we consistently use the term “closed sets”, noting that “convex” and “closed” are synonyms in the standard terminology of this field.
- 2. An entirely different application to binary classification in distributive lattices, with applications to inductive logic programming and formal concept analysis, is discussed in the long version of this paper.
- 3. Notice that the function mapping any subset of \(\mathbb{R}^d\) to its convex hull is a closure operator.
- 4. A similar property was considered by the Japanese mathematician Shizuo Kakutani for Euclidean spaces (cf. [9]).
- 5. For a good reference on convexity structures satisfying the \(S_4\) separation property, the reader is referred, e.g., to [4].
- 6. The claim holds for outerplanar graphs as well. For the sake of simplicity, we formulate it in this short version for trees only, as this suffices for our purpose.
- 7. While in ordinary graph mining pattern matching is typically defined by subgraph isomorphism, in logic-based graph mining it is defined by graph homomorphism, as subsumption between first-order clauses reduces to homomorphism between graphs (see [8] for a discussion).
- 8. In the case of trees, such a non-redundant set could be obtained by considering only leaves as training examples.
- 9. We formulate this heuristic for trees for simplicity. In the long version we show that this idea can be generalized to any graph satisfying the Pasch axiom.
References
Artigas, D., Dantas, S., Dourado, M., Szwarcfiter, J.: Partitioning a graph into convex sets. Discret. Math. 311(17), 1968–1977 (2011)
Boser, B., Guyon, I., Vapnik, V.: A training algorithm for optimal margin classifiers. In: Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, pp. 144–152. ACM Press (1992)
Buluç, A., Meyerhenke, H., Safro, I., Sanders, P., Schulz, C.: Recent advances in graph partitioning. In: Kliemann, L., Sanders, P. (eds.) Algorithm Engineering. LNCS, vol. 9220, pp. 117–158. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-49487-6_4
Chepoi, V.: Separation of two convex sets in convexity structures. J. Geom. 50(1), 30–51 (1994)
Ellis, J.W.: A general set-separation theorem. Duke Math. J. 19(3), 417–421 (1952)
Farber, M., Jamison, R.: Convexity in graphs and hypergraphs. SIAM J. Algebraic Discret. Methods 7(3), 433–444 (1986)
Gottlob, G.: Subsumption and implication. Inform. Process. Lett. 24(2), 109–111 (1987)
Horváth, T., Turán, G.: Learning logic programs with structured background knowledge. Artif. Intell. 128(1–2), 31–97 (2001)
Kakutani, S.: Ein Beweis des Satzes von Edelheit über konvexe Mengen. Proc. Imp. Acad. Tokyo 13, 93–94 (1937)
Kubiś, W.: Separation properties of convexity spaces. J. Geom. 74(1), 110–119 (2002)
Plotkin, G.: A note on inductive generalization. Mach. Intell. 5, 153–163 (1970)
Rosenblatt, F.: The perceptron: a probabilistic model for information storage and organization in the brain. Psychol. Rev. 65, 386–408 (1958)
Schaeffer, S.E.: Survey: graph clustering. Comput. Sci. Rev. 1(1), 27–64 (2007)
Schulz, T.H., Horváth, T., Welke, P., Wrobel, S.: Mining tree patterns with partially injective homomorphisms. In: Berlingerio, M., Bonchi, F., Gärtner, T., Hurley, N., Ifrim, G. (eds.) ECML PKDD 2018, Part II. LNCS (LNAI), vol. 11052, pp. 585–601. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-10928-8_35
van de Vel, M.: Binary convexities and distributive lattices. Proc. London Math. Soc. 48(1), 1–33 (1984)
van de Vel, M.: Theory of Convex Structures. North-Holland Mathematical Library, vol. 50. North Holland, Amsterdam (1993)
Acknowledgements
Part of this work has been funded by the Ministry of Education and Research of Germany (BMBF) under project ML2R (grant number 01/S18038C) and by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC 2070 – 390732324.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Seiffarth, F., Horváth, T., Wrobel, S. (2020). Maximal Closed Set and Half-Space Separations in Finite Closure Systems. In: Brefeld, U., Fromont, E., Hotho, A., Knobbe, A., Maathuis, M., Robardet, C. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Lecture Notes in Computer Science(), vol 11906. Springer, Cham. https://doi.org/10.1007/978-3-030-46150-8_2
DOI: https://doi.org/10.1007/978-3-030-46150-8_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-46149-2
Online ISBN: 978-3-030-46150-8
eBook Packages: Computer Science, Computer Science (R0)