Advertisement

Communication-Free Widened Learning of Bayesian Network Classifiers Using Hashed Fiedler Vectors

  • Oliver R. Sampson
  • Christian Borgelt
  • Michael R. Berthold
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11191)

Abstract

Widening is a method where parallel resources are used to find better solutions from greedy algorithms instead of merely trying to find the same solutions more quickly. To date, every example of Widening has used some form of communication between the parallel workers to maintain their distances from one another in the model space. For the first time, we present a communication-free, widened extension to a standard machine learning algorithm. By using Locality Sensitive Hashing on the Bayesian networks’ Fiedler vectors, we demonstrate the ability to learn classifiers superior to those of standard implementations and to those generated with a greedy heuristic alone.

References

  1. 1.
    Akbar, Zaenal, Ivanova, Violeta N., Berthold, Michael R.: Parallel data mining revisited. Better, not faster. In: Hollmén, Jaakko, Klawonn, Frank, Tucker, Allan (eds.) IDA 2012. LNCS, vol. 7619, pp. 23–34. Springer, Heidelberg (2012).  https://doi.org/10.1007/978-3-642-34156-4_4CrossRefGoogle Scholar
  2. 2.
    Akl, S.G.: Parallel real-time computation: sometimes quantity means quality. In: Proceedings of International Symposium on Parallel Architectures, Algorithms and Networks, 2000. I-SPAN 2000, pp. 2–11. IEEE (2000)Google Scholar
  3. 3.
    Bielza, Concha, Larrañaga, Pedro: Discrete Bayesian network classifiers: a survey. ACM Comput. Surv. (CSUR) 47(1), 5 (2014)CrossRefGoogle Scholar
  4. 4.
    Andrei Z. Broder. On the resemblance and containment of documents. In: Proceedings of Compression and Complexity of Sequences 1997, pp. 21–29. IEEE (1997)Google Scholar
  5. 5.
    Buhler, Jeremy: Efficient large-scale sequence comparison by locality-sensitive hashing. Bioinformatics 17(5), 419–428 (2001)CrossRefGoogle Scholar
  6. 6.
    Charikar, M.S.: Similarity estimation techniques from rounding algorithms. In: Proceedings of the Thiry-Fourth Annual ACM Symposium on Theory of Computing, pp. 380–388. ACM (2002)Google Scholar
  7. 7.
    Fan-Roon Kim Chung. Spectral Graph Theory. Number 92 in Regional Conference Series in Mathematics. American Mathematical Society, 1997Google Scholar
  8. 8.
    Coenen, F.: LUCS-KDD DN software (2003)Google Scholar
  9. 9.
    Datar, M., Immorlica, N., Indyk, P., Mirrokni, V.S.: Locality-sensitive hashing scheme based on p-stable distributions. In: Proceedings of the Twentieth Annual Symposium on Computational Geometry, pp. 253–262. ACM (2004)Google Scholar
  10. 10.
    Doyle, P.G., Laurie Snell, J.: Random Walks and Electric Networks. Mathematical Association of America (1984)Google Scholar
  11. 11.
    Fillbrunn, Alexander, Berthold, Michael R.: Diversity-driven widening of hierarchical agglomerative clustering. In: Fromont, Elisa, De Bie, Tijl, van Leeuwen, Matthijs (eds.) IDA 2015. LNCS, vol. 9385, pp. 84–94. Springer, Cham (2015).  https://doi.org/10.1007/978-3-319-24465-5_8CrossRefGoogle Scholar
  12. 12.
    Fillbrunn, Alexander, Wörteler, Leonard, Grossniklaus, Michael, Berthold, Michael R.: Bucket selection: a model-independent diverse selection strategy for widening. In: Adams, Niall, Tucker, Allan, Weston, David (eds.) IDA 2017. LNCS, vol. 10584, pp. 87–98. Springer, Cham (2017).  https://doi.org/10.1007/978-3-319-68765-0_8CrossRefGoogle Scholar
  13. 13.
    Gionis, Aristides, Indyk, Piotr, Motwani, Rajeev: Similarity search in high dimensions via hashing. VLDB 99, 518–529 (1999)Google Scholar
  14. 14.
    Guo, Krystal, Mohar, Bojan: Hermitian adjacency matrix of digraphs and mixed graphs. J. Graph Theory 85(1), 217–248 (2017)MathSciNetCrossRefGoogle Scholar
  15. 15.
    Koski, T.J.T., Noble, J.M.: A review of Bayesian networks and structure learning. Mathematica Applicanda 40(1), 53–103 (2012)MathSciNetzbMATHGoogle Scholar
  16. 16.
    Kulis, B., Grauman, K.: Kernelized locality-sensitive hashing for scalable image search. In: 12th International Conference on Computer Vision, pp. 2130–7. IEEE (2009)Google Scholar
  17. 17.
    Larrañaga, Pedro, Karshenas, Hossein, Bielza, Concha, Santana, Roberto: A review on evolutionary algorithms in Bayesian network learning and inference tasks. Inf. Sci. 233, 109–125 (2013)MathSciNetCrossRefGoogle Scholar
  18. 18.
    Lichman, M.: UCI Machine Learning Repository (2013)Google Scholar
  19. 19.
    Luo, Bin, Wilson, Richard C., Hancock, Edwin R.: Spectral feature vectors for graph clustering. In: Caelli, Terry, Amin, Adnan, Duin, Robert P.W., de Ridder, Dick, Kamel, Mohamed (eds.) SSPR /SPR 2002. LNCS, vol. 2396, pp. 83–93. Springer, Heidelberg (2002).  https://doi.org/10.1007/3-540-70659-3_8CrossRefGoogle Scholar
  20. 20.
    Marolt, Matija: A mid-level representation for melody-based retrieval in audio collections. IEEE Trans. Multimed. 10(8), 1617–1625 (2008)CrossRefGoogle Scholar
  21. 21.
    Meinl, T.: Maximum-Score Diversity Selection. Ph.D. thesis, University of Konstanz, Konstanz, Germany (2010)Google Scholar
  22. 22.
    Pearl, J.: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers Inc., Burlington (1988)zbMATHGoogle Scholar
  23. 23.
    Qiu, Huaijun, Hancock, Edwin R.: Graph matching and clustering using spectral partitions. Pattern Recognit. 39(1), 22–34 (2006)CrossRefGoogle Scholar
  24. 24.
    Sampson, Oliver, Berthold, Michael R.: Widened KRIMP: better performance through diverse parallelism. In: Blockeel, Hendrik, van Leeuwen, Matthijs, Vinciotti, Veronica (eds.) IDA 2014. LNCS, vol. 8819, pp. 276–285. Springer, Cham (2014).  https://doi.org/10.1007/978-3-319-12571-8_24CrossRefGoogle Scholar
  25. 25.
    Sampson, Oliver R., Berthold, Michael R.: Widened learning of Bayesian network classifiers. In: Boström, Henrik, Knobbe, Arno, Soares, Carlos, Papapetrou, Panagiotis (eds.) IDA 2016. LNCS, vol. 9897, pp. 215–225. Springer, Cham (2016).  https://doi.org/10.1007/978-3-319-46349-0_19CrossRefGoogle Scholar
  26. 26.
    Satu Elisa Schaeffer: Graph clustering. Comput. Sci. Rev. 1(1), 27–64 (2007)CrossRefGoogle Scholar
  27. 27.
    Scutari, Marco: Learning Bayesian networks with the bnlearn R package. J. Stat. Softw. 35(3), 1–22 (2010)MathSciNetCrossRefGoogle Scholar
  28. 28.
    Terasawa, Kengo, Tanaka, Yuzuru: Spherical LSH for approximate nearest neighbor search on unit hypersphere. In: Dehne, Frank, Sack, Jörg-Rüdiger, Zeh, Norbert (eds.) WADS 2007. LNCS, vol. 4619, pp. 27–38. Springer, Heidelberg (2007).  https://doi.org/10.1007/978-3-540-73951-7_4CrossRefGoogle Scholar
  29. 29.
    Van Dam, E.R., Haemers, W.H.: Which graphs are determined by their spectrum? Linear Algebra Appl. 373, 241–272 (2003)Google Scholar
  30. 30.
    Vishveshwara, S., Brinda, K.V., Kannan, N.: Protein structure: insights from graph theory. J. Theor. Comput. Chem. 1(01), 187–211 (2002)CrossRefGoogle Scholar
  31. 31.
    Zhang, Boyu, Liu, Xianglong, Lang, Bo: Fast graph similarity search via locality sensitive hashing. In: Ho, Yo-Sung, Sang, Jitao, Ro, Yong Man, Kim, Junmo, Wu, Fei (eds.) PCM 2015. LNCS, vol. 9314, pp. 623–633. Springer, Cham (2015).  https://doi.org/10.1007/978-3-319-24075-6_60CrossRefGoogle Scholar

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Oliver R. Sampson
    • 1
  • Christian Borgelt
    • 1
  • Michael R. Berthold
    • 1
  1. 1.Chair for Bioinformatics and Information Mining, Department of Computer and Information ScienceUniversity of KonstanzKonstanzGermany

Personalised recommendations