Machine Learning, Volume 108, Issue 7, pp 1085–1110

Semi-supervised online structure learning for composite event recognition

  • Evangelos Michelioudakis
  • Alexander Artikis
  • Georgios Paliouras
Article
Part of the following topical collections:
  1. Special Issue of the Inductive Logic Programming (ILP) 2017-2018

Abstract

Online structure learning approaches, such as those stemming from statistical relational learning, enable the discovery of complex relations in noisy data streams. However, these methods assume the existence of fully-labelled training data, which is unrealistic for most real-world applications. We present a novel approach for completing the supervision of a semi-supervised structure learning task. We incorporate graph-cut minimisation, a technique that derives labels for unlabelled data based on their distance to their labelled counterparts. To adapt graph-cut minimisation to first-order logic, we employ a suitable structural distance for measuring the distance between sets of logical atoms. The labelling process is performed online (single-pass) by means of a caching mechanism and the Hoeffding bound, a statistical tool for approximating globally-optimal decisions from locally-optimal ones. We evaluate our approach on the task of composite event recognition using a benchmark dataset for human activity recognition, as well as a real dataset for maritime monitoring. The evaluation suggests that our approach can effectively complete the missing labels and ultimately improve the accuracy of the underlying structure learning system.
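
As a rough illustration of how the Hoeffding bound mentioned above can support single-pass decisions, the sketch below computes the standard bound for a bounded random variable and uses it to check whether the observed gap between two candidates is large enough to commit to the better one. This is the general textbook form of the inequality, not the labelling procedure of the paper; the function names `hoeffding_epsilon` and `can_decide` and all parameter values are hypothetical.

```python
import math


def hoeffding_epsilon(value_range: float, delta: float, n: int) -> float:
    """Hoeffding margin: with probability at least 1 - delta, the true mean
    of a variable with the given range lies within this margin of the
    mean observed over n independent samples."""
    if n <= 0:
        raise ValueError("n must be a positive number of observations")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_decide(best_mean: float, second_mean: float,
               value_range: float, delta: float, n: int) -> bool:
    """Hypothetical decision rule: commit to the best candidate once its
    observed advantage over the runner-up exceeds the Hoeffding margin."""
    return (best_mean - second_mean) > hoeffding_epsilon(value_range, delta, n)


if __name__ == "__main__":
    # Illustrative values only: scores in [0, 1], 95% confidence.
    print(hoeffding_epsilon(1.0, 0.05, 1000))
    print(can_decide(0.80, 0.70, 1.0, 0.05, 1000))
    print(can_decide(0.80, 0.70, 1.0, 0.05, 50))
```

The margin shrinks as more examples are seen, so a decision that cannot be made safely on a small sample may become safe later in the stream without revisiting earlier data.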

Keywords

  • Semi-supervised learning
  • Online structure learning
  • Graph-cut minimisation
  • First-order logic distance
  • Event Calculus
  • Event recognition

Notes

Acknowledgements

This work was funded by the EU H2020 project datAcron (687591). We would also like to thank Nikos Katzouris for his assistance with the distance functions for first-order logic and for helping us run OLED.

Copyright information

© The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Athens, Greece
  2. Department of Maritime Studies, University of Piraeus, Piraeus, Greece
  3. Institute of Informatics and Telecommunications, National Center for Scientific Research “Demokritos”, Athens, Greece
