The Transition from A Priori to A Posteriori Information: Bayesian Procedures in Distributed Large-Scale Data Processing Systems

  • P. V. Golubtsov
Information Systems


This paper considers the transition from a priori to a posteriori information in a linear experiment in the context of Big Data systems. At first glance, this process is fundamentally sequential: each observation transforms a priori information into a posteriori information, which then serves as a priori information for the next observation, and so on. It is shown that the procedure can be parallelized and unified by transforming both the measurement results and the original a priori information into a special canonical form. The properties of various forms of information representation are studied and compared. This approach makes it possible to scale the Bayesian estimation procedure effectively and thus adapt it to the problem of processing large amounts of distributed data.


Big Data · a priori and a posteriori information · linear estimation · canonical information · distributed data collection and processing systems · information algebra · information space
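The parallel, order-independent updating described in the abstract can be sketched as follows under a standard Gaussian linear-model assumption. All function names and the specific (Lambda, nu) representation below are illustrative, not the paper's actual notation: each linear measurement (and the prior itself) is converted into a canonical "information" pair that combines by simple addition, so the pieces can be merged in any order across a distributed system.

```python
import numpy as np

def to_canonical(H, R, y):
    # Convert one linear measurement y = H x + noise, noise ~ N(0, R),
    # into canonical (information) form: the pair (H' R^-1 H, H' R^-1 y).
    Ri = np.linalg.inv(R)
    return H.T @ Ri @ H, H.T @ Ri @ y

def prior_to_canonical(m0, P0):
    # A Gaussian prior N(m0, P0) contributes the pair (P0^-1, P0^-1 m0),
    # i.e., it enters the combination on the same footing as a measurement.
    P0i = np.linalg.inv(P0)
    return P0i, P0i @ m0

def combine(a, b):
    # Canonical pieces combine by componentwise addition. The operation
    # is associative and commutative, so partial results can be merged
    # in any order, e.g., in a parallel map-reduce over distributed data.
    return a[0] + b[0], a[1] + b[1]

def estimate(info):
    # Bayesian (posterior-mean) linear estimate from the accumulated pair.
    Lam, nu = info
    return np.linalg.solve(Lam, nu)
```

For example, converting a prior and two measurements with `to_canonical`, reducing them with `combine` in any order, and calling `estimate` on the result reproduces the same posterior mean that sequential a priori-to-a posteriori updating would give.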





Copyright information

© Allerton Press, Inc. 2018

Authors and Affiliations

  1. Moscow State University, Moscow, Russia
  2. All-Russian Institute for Scientific and Technical Information, Russian Academy of Sciences, Moscow, Russia
  3. National Research University Higher School of Economics, Moscow, Russia
