Bayesian Survey Analysis: Introduction

Paczkowski, Walter R.

doi:10.1007/978-3-030-76267-4_8

Abstract

I previously discussed and illustrated deep analysis methods for survey data when the target variable of a Core Question is measured on a continuous or discrete scale. A prominent method is OLS regression for a continuous target. The target is the dependent or left-hand-side variable, and the independent variables, or features (perhaps from Surround Questions such as demographics), are the right-hand-side variables in a linear model. A logit model is used rather than an OLS model for a discrete target because of statistical issues, the most important being that OLS can predict outside the range of the target. For example, if the target is customer satisfaction measured on a 5-point Likert scale, but the five points are recoded as 0 and 1 (bottom-three box, B3B, and top-two box, T2B, respectively), then OLS could predict a value of −2 for the binary target. What is −2? A logit model is used to avoid this nonsensical result. I illustrated how this is handled in Chap. 5.
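The range problem described in the abstract can be seen in a small simulation. The following is a minimal sketch, not the chapter's own code: the data are simulated, the single feature `x` and the coefficient values are hypothetical, and the logit model is fit with a plain Newton-Raphson loop in NumPy rather than with any particular statistics package.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical survey data: one feature x and a binary target y
# (1 = T2B, 0 = B3B). The true coefficients (0, 1) are assumptions.
n = 500
x = rng.uniform(-3, 3, size=n)
p_true = 1 / (1 + np.exp(-x))
y = rng.binomial(1, p_true)

# Design matrix with an intercept column.
X = np.column_stack([np.ones(n), x])

# OLS fit of the binary target (a linear probability model).
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Logit fit by Newton-Raphson on the log-likelihood.
beta_logit = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta_logit))
    grad = X.T @ (y - p)                          # score vector
    hess = X.T @ (X * (p * (1 - p))[:, None])     # observed information
    beta_logit = beta_logit + np.linalg.solve(hess, grad)

# Predict at feature values beyond the observed range.
x_new = np.array([-5.0, 0.0, 5.0])
X_new = np.column_stack([np.ones(len(x_new)), x_new])
ols_pred = X_new @ beta_ols
logit_pred = 1 / (1 + np.exp(-X_new @ beta_logit))

print("OLS predictions:  ", ols_pred)
print("Logit predictions:", logit_pred)
```

In this sketch the OLS predictions at the extreme feature values fall below 0 and above 1, while the logit predictions stay strictly between 0 and 1 because the inverse-logit function maps any linear predictor into that interval.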

Notes

1. Normality does not have to be assumed. It is just convenient for this example.
2. This is based on Haigh (2012, p. 3), although he doesn’t explain how he got his numbers.
3. This is actually an estimate of the probability.
4. I’m assuming, of course, that the deck is thoroughly reshuffled.
5. There is a distinction between data and information. See Paczkowski (2022) for a discussion.
6. See the Wikipedia article on Thomas Bayes at https://en.wikipedia.org/wiki/Thomas_Bayes, last accessed December 27, 2021.
7. Cited by Hajek (2019).
8. If σ is the standard deviation, then the precision is τ = 1/σ².
9. See https://stats.stackexchange.com/questions/20520/what-is-an-uninformative-prior-can-we-ever-have-one-with-truly-no-information?noredirect=1&lq=1. Last accessed January 4, 2022.
10. See https://en.wikipedia.org/wiki/Markov_chain. Last accessed January 3, 2022.
11. See https://en.wikipedia.org/wiki/Monte_Carlo_method#History. Last accessed January 4, 2022.
12. See https://en.wikipedia.org/wiki/Random_walk for a good discussion of random walks as a Markov chain. Also see https://en.wikipedia.org/wiki/Markov_chain. Both articles last accessed January 3, 2022.
13. As of January 17, 2022.
14. See, for example, the description of SmartRevenue, Inc. at https://www.linkedin.com/company/smartrevenue/about/, last accessed December 7, 2021. SmartRevenue is now defunct.
15. See https://global.nielsen.com/global/en/. Last accessed December 7, 2021.
16. See https://www.sisinternational.com/ as an example market research company using this method. Last accessed December 7, 2021.
17. See the Wikipedia article “Half-normal distribution” at https://en.wikipedia.org/wiki/Half-normal_distribution. Last accessed January 9, 2022.
18. See Rob Hicks’ course notes, which are the basis for this discussion, at https://rlhick.people.wm.edu/stories/bayesian_7.html, last accessed January 16, 2022.
19. Note: the “draw” keyword is not required because it is the first argument to the function.
20. See the PyMC3 and ArviZ documentation.
21. See Hogg and Craig (1970, Chapter 6).
22. For a good explanation of the HDI, see https://stats.stackexchange.com/questions/148439/what-is-a-highest-density-region-hdr. Last accessed January 7, 2022. Also see Hyndman (1996).
23. Be careful how you average. I exponentiated the estimate for each value in the chains first and then averaged these values. You could instead average the unexponentiated estimates and then exponentiate the average, but this produces a smaller value. You need to exponentiate first and then average because each exponentiation is for a separate model.
24. See https://en.wikipedia.org/wiki/Beta_distribution. Last accessed January 22, 2022.
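
The averaging caution in Note 23 can be checked numerically. The following is a minimal sketch under stated assumptions: the array `draws` is a hypothetical stand-in for the chain values of a log-scale estimate (simulated here from a normal distribution, not taken from the chapter's model), and only NumPy is used.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for posterior draws of a log-scale estimate.
# The location and scale values are assumptions for illustration only.
draws = rng.normal(loc=0.5, scale=0.3, size=4000)

# Exponentiate each draw first, then average (the order used in Note 23).
mean_of_exp = np.exp(draws).mean()

# Average the draws first, then exponentiate the average.
exp_of_mean = np.exp(draws.mean())

print("exponentiate, then average:", mean_of_exp)
print("average, then exponentiate:", exp_of_mean)
```

Because the exponential function is convex, Jensen's inequality guarantees that exponentiating the average can never exceed the average of the exponentiated values, which is why the second ordering gives the smaller number.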

References

Andel, J. 2001. Mathematics of Chance. In Wiley Series in Probability and Statistics. New York: Wiley.
Christensen, R., W. Johnson, A. Branscum, and T.E. Hanson. 2011. Bayesian Ideas and Data Analysis: An Introduction for Scientists and Statisticians. New York: CRC Press.
DeVany, A. 1976. Uncertainty, waiting time, and capacity utilization: A stochastic theory of product quality. Journal of Political Economy 84 (3): 523–542.
Eagle, A. 2021. Chance versus randomness. In Stanford Encyclopedia of Philosophy.
Feller, W. 1950. An Introduction to Probability Theory and Its Applications. Vol. I. New York: Wiley.
Feller, W. 1971. An Introduction to Probability Theory and Its Applications. Vol. II. New York: Wiley.
Gelman, A. 2002. Prior distribution. In Encyclopedia of Environmetrics, ed. Abdel H. El-Shaarawi, and Walter W. Piegorsch. Vol. 3, 1634–1637. New York: Wiley.
Gelman, A. 2006. Prior distributions for variance parameters in hierarchical models. Bayesian Analysis 1 (3): 515–533.
Gelman, A. and J. Hill. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge: Cambridge University Press.
Gelman, A., J. Hill, and A. Vehtari. 2021. Regression and Other Stories. Cambridge: Cambridge University Press.
Gill, J. 2008. Bayesian Methods: A Social and Behavioral Sciences Approach, 2nd ed. Statistics in the Social and Behavioral Sciences. New York: Chapman & Hall/CRC.
Haans, H. and E. Gijsbrechts. 2011. “One-deal-fits-all?” On category sales promotion effectiveness in smaller versus larger supermarkets. Journal of Retailing 87 (4): 427–443.
Haigh, J. 2012. Probability: A Very Short Introduction. Oxford: Oxford University Press.
Hajek, A. 2019. Interpretations of probability. In Stanford Encyclopedia of Philosophy.
Hogg, R.V. and A.T. Craig. 1970. Introduction to Mathematical Statistics, 3rd ed. New York: Macmillan Publishing Co., Inc.
Hyndman, R.J. 1996. Computing and graphing highest density regions. The American Statistician 50 (2): 120–126.
Kastellec, J.P., J.R. Lax, and J. Phillips. 2019. Estimating state public opinion with multi-level regression and poststratification using R. https://scholar.princeton.edu/sites/default/files/jkastellec/files/mrp_primer.pdf.
Malkiel, B.G. 1999. A Random Walk Down Wall Street, Revised ed. New York: W.W. Norton & Company.
Martin, N., B. Depaire, and A. Caris. 2018. A synthesized method for conducting a business process simulation study. In 2018 Winter Simulation Conference (WSC), 276–290.
Martin, O.A., R. Kumar, and J. Lao. 2022. Bayesian Modeling and Computation in Python. In Texts in Statistical Science. New York: CRC Press.
Mlodinow, L. 2008. The Drunkard’s Walk: How Randomness Rules Our Lives. Vintage Books.
Paczkowski, W.R. 2018. Pricing Analytics: Models and Advanced Quantitative Techniques for Product Pricing. Milton Park: Routledge.
Paczkowski, W.R. 2022. Business Analytics: Data Science for Business Problems. Berlin: Springer.
Pinker, S. 2021. Rationality: What it is, Why it Seems Scarce, Why it Matters. New York: Viking Press.
Santos-d’Amorim, K., and M. Miranda. 2021. Misinformation, disinformation, and malinformation: clarifying the definitions and examples in disinfodemic times. Encontros Bibli Revista Eletronica de Biblioteconomia e Ciencia da Informacao 26: 1–23.
SAS. 2018. SAS/STAT 15.1 User’s Guide. In Chapter 7: Introduction to Bayesian Analysis Procedures, 129–166. North Carolina: SAS Institute Inc.
Scheaffer, R.L. 1990. Introduction to Probability and Its Applications. In Advanced Series in Statistics and Decision Sciences. California: Duxbury Press.
Shonkwiler, R.W. and F. Mendivil. 2009. Explorations in Monte Carlo Methods. In Undergraduate Texts in Mathematics. Berlin: Springer.
Todd, M.J., K.M. Kelley, and H. Hopfer. 2021. USA Mid-Atlantic consumer preferences for front label attributes for local wine. Beverages 7 (22): 1–16.
Weiss, N.A. 2005. Introductory Statistics, 7th ed. Boston: Pearson Education, Inc.

Author information

Authors and Affiliations

Data Analytics Corp., Plainsboro, NJ, USA
Walter R. Paczkowski

About this chapter

Cite this chapter

Paczkowski, W.R. (2022). Bayesian Survey Analysis: Introduction. In: Modern Survey Analysis. Springer, Cham. https://doi.org/10.1007/978-3-030-76267-4_8

DOI: https://doi.org/10.1007/978-3-030-76267-4_8
Published: 12 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-76266-7
Online ISBN: 978-3-030-76267-4
eBook Packages: Mathematics and Statistics (R0)
