
Cybernetics and Systems Analysis, Volume 51, Issue 2, pp 313–323

Formation of Similarity-Reflecting Binary Vectors with Random Binary Projections

  • D. A. Rachkovskij
NEW MEANS OF CYBERNETICS, INFORMATICS, COMPUTER ENGINEERING, AND SYSTEMS ANALYSIS

Abstract

We propose a transformation of real input vectors into binary output vectors by projection with a random binary matrix with elements {0,1}, followed by thresholding. We investigate the rate at which the distribution of the vector components before binarization converges to the Gaussian distribution, and how this convergence relates to the error in estimating the angle between input vectors from the binarized output vectors. We show that, for projection parameters that yield a nearly Gaussian distribution, the experimental and analytical errors are close.
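To make the transformation concrete, the following is a minimal sketch in Python/NumPy of the kind of procedure described above: each output component is the inner product of the input with a random {0,1} column (i.e., the sum of a random subset of the input's components, hence near-Gaussian by the central limit theorem), thresholded at its expected value; the angle between two inputs is then estimated from the normalized Hamming distance of their binary codes via the standard sign-random-projection relation angle ≈ π·(Hamming distance / n). The function names, the choice of threshold, and the parameter values are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def binarize(X, n_out=4096, p_one=0.5, seed=0):
    """Project real row-vectors with a random {0,1} matrix and threshold.

    Each output component <x, r_j> is a sum of a random subset of the
    input's components, hence approximately Gaussian; thresholding it at
    its expectation p_one * sum(x) turns it into a sign bit.
    (A sketch; the threshold choice is an assumption, not the paper's.)
    """
    rng = np.random.default_rng(seed)
    # Random binary projection matrix with i.i.d. {0,1} entries, P(1) = p_one
    R = (rng.random((X.shape[1], n_out)) < p_one).astype(float)
    Z = X @ R                                 # pre-binarization components
    t = p_one * X.sum(axis=1, keepdims=True)  # expected value of each component
    return (Z > t).astype(np.uint8)

def angle_from_bits(b1, b2):
    """Estimate the angle between the original vectors from the normalized
    Hamming distance of their binary codes: angle ~ pi * (Hamming / n)."""
    return np.pi * np.count_nonzero(b1 != b2) / b1.size

# Usage: compare the estimate with the true angle of two random vectors.
rng = np.random.default_rng(1)
x, y = rng.standard_normal((2, 1000))
B = binarize(np.vstack([x, y]))
true_angle = np.arccos(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))
print(f"true angle: {true_angle:.3f}, estimate: {angle_from_bits(B[0], B[1]):.3f}")
```

The accuracy of the Hamming-based angle estimate depends on how close the joint distribution of the pre-binarization components is to Gaussian, which is precisely the convergence question the paper analyzes.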

Keywords

binary random projections, convergence to the Gaussian distribution, estimate of the similarity of vectors


Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. International Scientific and Training Center of Information Technologies and Systems, National Academy of Sciences of Ukraine and Ministry of Education and Science of Ukraine, Kyiv, Ukraine
