Abstract
We propose a two-way Bayesian vector spatial procedure incorporating dimension reparameterization with a variable selection option to determine the dimensionality and simultaneously identify the significant covariates that help interpret the derived dimensions in the joint space map. We discuss how we solve identifiability problems in a Bayesian context that are associated with the two-way vector spatial model, and demonstrate through a simulation study how our proposed model outperforms a popular benchmark model. In addition, an empirical application dealing with consumers’ ratings of large sport utility vehicles is presented to illustrate the proposed methodology. We are able to obtain interpretable and managerially insightful results from our proposed model with variable selection in comparison with the benchmark model.
Acknowledgments
The authors wish to thank the editor, associate editor, and three anonymous referees for their constructive comments. This research was funded in part by the Smeal College of Business.
Additional information
Zhe Chen is currently working at Google Inc.
Appendices
Appendix 1: Full Conditional Distributions
(i) Let \({\varvec{Z}}\) be the data matrix. Since
where etr denotes the exponential of the trace of a matrix, the full conditional distribution of \(\sigma ^{-2}\) is \(\hbox {Ga}\left( {m_1^*,m_2^*} \right) \), where \(m_1^*=\frac{NJ}{2}+m_1\) and \(m_2^*=m_2 +\frac{1}{2}\hbox {tr}[({\varvec{Z}}-{\varvec{A}}^{{\prime }}{\varvec{B}}) ({\varvec{Z}}-{\varvec{A}}^{{\prime }}{\varvec{B}})^{{\prime }}]\).
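As a hedged sketch of the step behind (i): assuming the likelihood implied by \({\varvec{Z}}={\varvec{A}}^{\prime }{\varvec{B}}+{\varvec{E}}\) with i.i.d. \(N(0,\sigma ^2)\) errors and a \(\hbox {Ga}(m_1 ,m_2 )\) prior on \(\sigma ^{-2}\) in shape-rate form (both inferred from the stated result rather than restated in this appendix), the kernel can be written as

\[
\begin{aligned}
\pi \left( \sigma ^{-2}\mid \hbox {all others}\right)&\propto (\sigma ^{-2})^{NJ/2}\,\hbox {etr}\left\{ -\frac{\sigma ^{-2}}{2}({\varvec{Z}}-{\varvec{A}}^{\prime }{\varvec{B}})({\varvec{Z}}-{\varvec{A}}^{\prime }{\varvec{B}})^{\prime }\right\} \times (\sigma ^{-2})^{m_1 -1}\exp \left( -m_2 \sigma ^{-2}\right) \\
&=(\sigma ^{-2})^{NJ/2+m_1 -1}\exp \left\{ -\sigma ^{-2}\left( m_2 +\frac{1}{2}\hbox {tr}\left[ ({\varvec{Z}}-{\varvec{A}}^{\prime }{\varvec{B}})({\varvec{Z}}-{\varvec{A}}^{\prime }{\varvec{B}})^{\prime }\right] \right) \right\} ,
\end{aligned}
\]

which is the kernel of a gamma density with shape \(NJ/2+m_1\) and rate \(m_2 +\frac{1}{2}\hbox {tr}[({\varvec{Z}}-{\varvec{A}}^{\prime }{\varvec{B}})({\varvec{Z}}-{\varvec{A}}^{\prime }{\varvec{B}})^{\prime }]\).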
(ii) Let \({\varvec{A}}_0 =\mathbf{1}^{{{\prime }}}{\otimes } {\varvec{a}}_0 \). Since
the full conditional distribution of \({\varvec{A}}\) is \(\hbox {MN}\left( {\bar{{\varvec{A}}} ,{\varvec{I}}_N ,{\varvec{A}}_l } \right) \), where \({\varvec{A}}_l =(\sigma ^{-2}{\varvec{B}}{\varvec{B}}^{{\prime }}+{\varvec{I}}_T /c)^{-1}\) and \(\bar{{\varvec{A}}} ={\varvec{A}}_l (\sigma ^{-2}{\varvec{B}}{\varvec{Z}}^{{\prime }}+{\varvec{A}}_0 /c)\).
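A sketch of the completing-the-square step behind (ii), assuming the columns of \({\varvec{A}}\) have independent \(N({\varvec{a}}_0 ,c{\varvec{I}}_T )\) priors (an assumption inferred from the terms \({\varvec{I}}_T /c\) and \({\varvec{A}}_0 /c\) in the stated result):

\[
\begin{aligned}
\pi \left( {\varvec{A}}\mid \hbox {all others}\right)&\propto \hbox {etr}\left\{ -\frac{\sigma ^{-2}}{2}({\varvec{Z}}-{\varvec{A}}^{\prime }{\varvec{B}})({\varvec{Z}}-{\varvec{A}}^{\prime }{\varvec{B}})^{\prime }\right\} \hbox {etr}\left\{ -\frac{1}{2c}({\varvec{A}}-{\varvec{A}}_0 )({\varvec{A}}-{\varvec{A}}_0 )^{\prime }\right\} \\
&\propto \hbox {etr}\left\{ -\frac{1}{2}\left[ {\varvec{A}}^{\prime }\left( \sigma ^{-2}{\varvec{B}}{\varvec{B}}^{\prime }+{\varvec{I}}_T /c\right) {\varvec{A}}-2{\varvec{A}}^{\prime }\left( \sigma ^{-2}{\varvec{B}}{\varvec{Z}}^{\prime }+{\varvec{A}}_0 /c\right) \right] \right\} \\
&\propto \hbox {etr}\left\{ -\frac{1}{2}({\varvec{A}}-\bar{{\varvec{A}}})^{\prime }{\varvec{A}}_l^{-1}({\varvec{A}}-\bar{{\varvec{A}}})\right\} .
\end{aligned}
\]

Under this reading, the columns of \({\varvec{A}}\) are conditionally independent, each normal with covariance \({\varvec{A}}_l\) and mean given by the matching column of \(\bar{{\varvec{A}}}\).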
(iii) Since
the full conditional distribution of \({\varvec{a}}_0\) is \(N\left( {\bar{{\varvec{a}}} ,{\varvec{G}}_{an} } \right) \), where \({\varvec{G}}_{an} =\left( {\frac{N}{c}\mathbf{I}_\mathrm{T} +{\varvec{G}}_a^{-1} } \right) ^{-1}\hbox { and } \bar{{\varvec{a}}} ={\varvec{G}}_{an} \left( {\sum _{i=1}^N \,{\varvec{a}}_i } \right) /c\).
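For (iii), a sketch under the assumptions that \({\varvec{a}}_i \sim N({\varvec{a}}_0 ,c{\varvec{I}}_T )\) independently and that \({\varvec{a}}_0\) has a zero-mean \(N(\mathbf{0},{\varvec{G}}_a )\) prior (the zero mean is inferred from the form of \(\bar{{\varvec{a}}}\), not stated in this appendix):

\[
\begin{aligned}
\pi \left( {\varvec{a}}_0 \mid \hbox {all others}\right)&\propto \exp \left\{ -\frac{1}{2c}\sum _{i=1}^N ({\varvec{a}}_i -{\varvec{a}}_0 )^{\prime }({\varvec{a}}_i -{\varvec{a}}_0 )\right\} \exp \left\{ -\frac{1}{2}{\varvec{a}}_0^{\prime }{\varvec{G}}_a^{-1}{\varvec{a}}_0 \right\} \\
&\propto \exp \left\{ -\frac{1}{2}\left[ {\varvec{a}}_0^{\prime }\left( \frac{N}{c}{\varvec{I}}_T +{\varvec{G}}_a^{-1}\right) {\varvec{a}}_0 -\frac{2}{c}{\varvec{a}}_0^{\prime }\sum _{i=1}^N {\varvec{a}}_i \right] \right\} ,
\end{aligned}
\]

so the precision is \({\varvec{G}}_{an}^{-1}\) and the mean is \({\varvec{G}}_{an}\left( \sum _{i=1}^N {\varvec{a}}_i \right) /c\).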
(iv) Let \({\varvec{B}}_0 =\mathbf{1}^{{{\prime }}}{\otimes } {\varvec{b}}_0 \). Since
the full conditional distribution of \({\varvec{B}}\) is \(\hbox {MN}\left( {\bar{{\varvec{B}}} ,{\varvec{I}}_J ,{\varvec{B}}_l } \right) \), where \({\varvec{B}}_l =(\sigma ^{-2}{\varvec{A}}{\varvec{A}}^{{\prime }}+{\varvec{\Sigma }}^{-1})^{-1}\) and \(\bar{{\varvec{B}}} ={\varvec{B}}_l (\sigma ^{-2}{\varvec{A}}{\varvec{Z}}+{\varvec{\Sigma }}^{-1}({\varvec{B}}_0 +{\varvec{\Theta }}{\varvec{X}}))\).
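The corresponding step for (iv) can be sketched as follows, assuming the columns of \({\varvec{B}}\) have independent normal priors with means given by the columns of \({\varvec{B}}_0 +{\varvec{\Theta }}{\varvec{X}}\) and common covariance \({\varvec{\Sigma }}\) (inferred from the stated \(\bar{{\varvec{B}}}\)):

\[
\begin{aligned}
\pi \left( {\varvec{B}}\mid \hbox {all others}\right)&\propto \hbox {etr}\left\{ -\frac{\sigma ^{-2}}{2}({\varvec{Z}}-{\varvec{A}}^{\prime }{\varvec{B}})^{\prime }({\varvec{Z}}-{\varvec{A}}^{\prime }{\varvec{B}})\right\} \hbox {etr}\left\{ -\frac{1}{2}{\varvec{\Sigma }}^{-1}({\varvec{B}}-{\varvec{B}}_0 -{\varvec{\Theta }}{\varvec{X}})({\varvec{B}}-{\varvec{B}}_0 -{\varvec{\Theta }}{\varvec{X}})^{\prime }\right\} \\
&\propto \hbox {etr}\left\{ -\frac{1}{2}\left[ {\varvec{B}}^{\prime }\left( \sigma ^{-2}{\varvec{A}}{\varvec{A}}^{\prime }+{\varvec{\Sigma }}^{-1}\right) {\varvec{B}}-2{\varvec{B}}^{\prime }\left( \sigma ^{-2}{\varvec{A}}{\varvec{Z}}+{\varvec{\Sigma }}^{-1}({\varvec{B}}_0 +{\varvec{\Theta }}{\varvec{X}})\right) \right] \right\} \\
&\propto \hbox {etr}\left\{ -\frac{1}{2}({\varvec{B}}-\bar{{\varvec{B}}})^{\prime }{\varvec{B}}_l^{-1}({\varvec{B}}-\bar{{\varvec{B}}})\right\} .
\end{aligned}
\]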
(v) Since
the full conditional distribution of \({\varvec{b}}_0\) is \(N\left( {\bar{{\varvec{b}}} ,{\varvec{G}}_{bn} } \right) \), where \({\varvec{G}}_{bn} =\left( {J{\varvec{\Sigma }}^{-1}+{\varvec{G}}_b^{-1} } \right) ^{-1}\hbox { and } \bar{{\varvec{b}}} ={\varvec{G}}_{bn} {\varvec{\Sigma }}^{-1} \sum _{j=1}^J \,\left( { {\varvec{b}}_j -{\varvec{\Theta }}{\varvec{X}}_j } \right) \).
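For (v), a sketch assuming a zero-mean \(N(\mathbf{0},{\varvec{G}}_b )\) prior on \({\varvec{b}}_0\) (the zero mean is inferred from the form of \(\bar{{\varvec{b}}}\)):

\[
\begin{aligned}
\pi \left( {\varvec{b}}_0 \mid \hbox {all others}\right)&\propto \exp \left\{ -\frac{1}{2}\sum _{j=1}^J ({\varvec{b}}_j -{\varvec{b}}_0 -{\varvec{\Theta }}{\varvec{X}}_j )^{\prime }{\varvec{\Sigma }}^{-1}({\varvec{b}}_j -{\varvec{b}}_0 -{\varvec{\Theta }}{\varvec{X}}_j )\right\} \exp \left\{ -\frac{1}{2}{\varvec{b}}_0^{\prime }{\varvec{G}}_b^{-1}{\varvec{b}}_0 \right\} \\
&\propto \exp \left\{ -\frac{1}{2}\left[ {\varvec{b}}_0^{\prime }\left( J{\varvec{\Sigma }}^{-1}+{\varvec{G}}_b^{-1}\right) {\varvec{b}}_0 -2{\varvec{b}}_0^{\prime }{\varvec{\Sigma }}^{-1}\sum _{j=1}^J ({\varvec{b}}_j -{\varvec{\Theta }}{\varvec{X}}_j )\right] \right\} ,
\end{aligned}
\]

which matches a normal density with precision \({\varvec{G}}_{bn}^{-1}\) and mean \(\bar{{\varvec{b}}}\).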
(vi) Since
the full conditional distribution of \({\varvec{\Sigma }}^{-1}\) is \(W(J+\sum _k \,\gamma _k +\nu ,{\varvec{V}}_n )\), where \({\varvec{V}}_n =\left[ \left( {\varvec{B}}-{\varvec{B}}_0 -{\varvec{\Theta }}_{\left( \gamma \right) } {\varvec{X}}_{\left( \gamma \right) } \right) \left( {\varvec{B}}-{\varvec{B}}_0 -{\varvec{\Theta }}_{\left( \gamma \right) } {\varvec{X}}_{\left( \gamma \right) } \right) ^{{\prime }}+V^{-1}{\varvec{I}}_T +{\varvec{\Theta }}_{\left( \gamma \right) } {\varvec{H}}_{\left( \gamma \right) }^{-1} {\varvec{\Theta }}_{\left( \gamma \right) }^{\prime } \right] ^{-1}\).
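A hedged sketch of where the degrees of freedom in (vi) come from. It assumes a \(W(\nu ,V{\varvec{I}}_T )\) prior on \({\varvec{\Sigma }}^{-1}\) and a matrix normal prior on \({\varvec{\Theta }}_{(\gamma )}\) with mean zero and covariance factors \({\varvec{H}}_{(\gamma )}\) and \({\varvec{\Sigma }}\); both are inferred from the stated \({\varvec{V}}_n\), and \(q_\gamma =\sum _k \gamma _k\) is shorthand introduced here for the number of selected covariates. Three factors involve \({\varvec{\Sigma }}^{-1}\):

\[
\begin{aligned}
\hbox {prior on }{\varvec{B}}:&\quad |{\varvec{\Sigma }}^{-1}|^{J/2}\,\hbox {etr}\left\{ -\frac{1}{2}{\varvec{\Sigma }}^{-1}({\varvec{B}}-{\varvec{B}}_0 -{\varvec{\Theta }}_{(\gamma )}{\varvec{X}}_{(\gamma )})({\varvec{B}}-{\varvec{B}}_0 -{\varvec{\Theta }}_{(\gamma )}{\varvec{X}}_{(\gamma )})^{\prime }\right\} \\
\hbox {prior on }{\varvec{\Theta }}_{(\gamma )}:&\quad |{\varvec{\Sigma }}^{-1}|^{q_\gamma /2}\,\hbox {etr}\left\{ -\frac{1}{2}{\varvec{\Sigma }}^{-1}{\varvec{\Theta }}_{(\gamma )}{\varvec{H}}_{(\gamma )}^{-1}{\varvec{\Theta }}_{(\gamma )}^{\prime }\right\} \\
\hbox {prior on }{\varvec{\Sigma }}^{-1}:&\quad |{\varvec{\Sigma }}^{-1}|^{(\nu -T-1)/2}\,\hbox {etr}\left\{ -\frac{1}{2}V^{-1}{\varvec{\Sigma }}^{-1}\right\}
\end{aligned}
\]

Multiplying them gives

\[
\pi \left( {\varvec{\Sigma }}^{-1}\mid \hbox {all others}\right) \propto |{\varvec{\Sigma }}^{-1}|^{(J+q_\gamma +\nu -T-1)/2}\,\hbox {etr}\left\{ -\frac{1}{2}{\varvec{V}}_n^{-1}{\varvec{\Sigma }}^{-1}\right\} ,
\]

the kernel of a Wishart density with \(J+q_\gamma +\nu\) degrees of freedom and scale matrix \({\varvec{V}}_n\).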
(vii) Since
the full conditional distribution of \(w\) is \(\hbox {Beta}(p+\sum _{k=1}^K \,\gamma _k ,q+K-\sum _{k=1}^K \,\gamma _k )\).
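The update in (vii) is the usual beta-Bernoulli conjugacy: assuming the inclusion indicators \(\gamma _k\) are i.i.d. Bernoulli(\(w\)) with a \(\hbox {Beta}(p,q)\) prior on \(w\) (inferred from the stated result), the kernel is \(w^{\sum _k \gamma _k }(1-w)^{K-\sum _k \gamma _k }\times w^{p-1}(1-w)^{q-1}\). A minimal Python sketch of the corresponding Gibbs draw follows; the function name `draw_w` and the example values are illustrative assumptions, not taken from the article.

```python
import random


def draw_w(gamma, p, q, rng):
    """Draw w from Beta(p + sum(gamma), q + K - sum(gamma)).

    gamma is a list of 0/1 inclusion indicators (hypothetical input);
    p and q are the Beta prior parameters.
    """
    n_selected = sum(gamma)   # number of covariates currently included
    n_total = len(gamma)      # K, the number of candidate covariates
    return rng.betavariate(p + n_selected, q + n_total - n_selected)


# Illustrative values only: K = 8 candidates, 4 currently selected,
# and a uniform Beta(1, 1) prior on w.
rng = random.Random(0)
gamma = [1, 0, 1, 1, 0, 0, 0, 1]
draws = [draw_w(gamma, 1.0, 1.0, rng) for _ in range(20000)]
mean_w = sum(draws) / len(draws)
```

With these values the full conditional is Beta(5, 5), so the average of the draws should sit near 0.5.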
(viii) Since
the full conditional distribution of \({\varvec{\Theta }}_{\left( \gamma \right) } \) is \(\hbox {MN}(\tilde{\varvec{\Theta }} _{\left( \gamma \right) } ,\mathbf{K}_{\left( \gamma \right) }^{-1} ,{\varvec{\Sigma }})\), where \({\varvec{K}}_{(\gamma )} ={\varvec{X}}_{(\gamma )} {\varvec{X}}_{(\gamma )}^{{\prime }} +{\varvec{H}}_{(\gamma )}^{-1} \) and \(\tilde{\varvec{\Theta }} _{(\gamma )} =({\varvec{B}}-\mathbf{B}_0 ){\varvec{X}}_{(\gamma )}^{{\prime }} {\varvec{K}}_{(\gamma )}^{-1} \).
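The completing-the-square step behind (viii) can be sketched as follows, under the same assumed priors as in (vi) and writing \({\varvec{R}}={\varvec{B}}-{\varvec{B}}_0\) (a shorthand introduced here):

\[
\begin{aligned}
\pi \left( {\varvec{\Theta }}_{(\gamma )}\mid \hbox {all others}\right)&\propto \hbox {etr}\left\{ -\frac{1}{2}{\varvec{\Sigma }}^{-1}\left[ ({\varvec{R}}-{\varvec{\Theta }}_{(\gamma )}{\varvec{X}}_{(\gamma )})({\varvec{R}}-{\varvec{\Theta }}_{(\gamma )}{\varvec{X}}_{(\gamma )})^{\prime }+{\varvec{\Theta }}_{(\gamma )}{\varvec{H}}_{(\gamma )}^{-1}{\varvec{\Theta }}_{(\gamma )}^{\prime }\right] \right\} \\
&\propto \hbox {etr}\left\{ -\frac{1}{2}{\varvec{\Sigma }}^{-1}\left[ {\varvec{\Theta }}_{(\gamma )}{\varvec{K}}_{(\gamma )}{\varvec{\Theta }}_{(\gamma )}^{\prime }-{\varvec{\Theta }}_{(\gamma )}{\varvec{X}}_{(\gamma )}{\varvec{R}}^{\prime }-{\varvec{R}}{\varvec{X}}_{(\gamma )}^{\prime }{\varvec{\Theta }}_{(\gamma )}^{\prime }\right] \right\} \\
&\propto \hbox {etr}\left\{ -\frac{1}{2}{\varvec{\Sigma }}^{-1}({\varvec{\Theta }}_{(\gamma )}-\tilde{\varvec{\Theta }}_{(\gamma )}){\varvec{K}}_{(\gamma )}({\varvec{\Theta }}_{(\gamma )}-\tilde{\varvec{\Theta }}_{(\gamma )})^{\prime }\right\} ,
\end{aligned}
\]

with \({\varvec{K}}_{(\gamma )}={\varvec{X}}_{(\gamma )}{\varvec{X}}_{(\gamma )}^{\prime }+{\varvec{H}}_{(\gamma )}^{-1}\) and \(\tilde{\varvec{\Theta }}_{(\gamma )}={\varvec{R}}{\varvec{X}}_{(\gamma )}^{\prime }{\varvec{K}}_{(\gamma )}^{-1}\), in agreement with the stated result.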
(ix) Since
we first integrate out \({\varvec{\Theta }}_{\left( \gamma \right) } \) to get the distribution \(\pi \left( {{\varvec{\Sigma }}^{-1},\varvec{\gamma } \hbox {|all others except }{\varvec{\Theta }}_{\left( \gamma \right) } } \right) \) which is proportional to
Then, the required result in (23) is obtained by integrating out \({\varvec{\Sigma }}^{-1}\) from this last expression.
Appendix 2: Proof of Theorem 1
For any generated \({\varvec{A}}\), applying the QR decomposition yields a unique orthogonal matrix \({\varvec{\Gamma }}\) such that \({\varvec{\Gamma }}{\varvec{A}}\) satisfies the identification constraint given in (9). The other identified parameters are then obtained by premultiplying the corresponding unidentified parameters by \({\varvec{\Gamma }}\) (e.g., \({\varvec{\Gamma }}{\varvec{B}}\)). Now \({\varvec{V}}_{(\gamma )}\) in (23) can be rewritten as
so
Since \(|\mathbf{V}_{\left( \gamma \right) } |\) as well as the remaining terms in (23) are unchanged when the unidentified parameters (e.g., \({\varvec{B}}\)) are replaced by the identified parameters (e.g., \({\varvec{\Gamma }}{\varvec{B}})\), the posterior distribution of \(\varvec{\gamma }\) is unchanged when the substitution is made. Thus, the variable selection results are not affected by the proposed post-processing procedure.
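The invariance argument can be illustrated numerically. The sketch below is a minimal, standard-library example under stated assumptions: constraint (9) is not restated in this appendix, so the code assumes it requires the leading \(T\times T\) block of \({\varvec{A}}\) to be upper triangular with a positive diagonal; the matrix `S` is a stand-in symmetric matrix built from \({\varvec{B}}\), not the article's \({\varvec{V}}_{(\gamma )}\); and all function names and numbers are hypothetical.

```python
import math


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def rotation_from_qr(A):
    """Gram-Schmidt QR of the leading T x T block of A (T x N).

    Returns Gamma = Q', so that Gamma times that block is upper
    triangular with a positive diagonal (the assumed form of (9)).
    """
    T = len(A)
    cols = [[A[r][c] for r in range(T)] for c in range(T)]
    basis = []
    for v in cols:
        u = v[:]
        for e in basis:
            proj = sum(vi * ei for vi, ei in zip(v, e))
            u = [ui - proj * ei for ui, ei in zip(u, e)]
        norm = math.sqrt(sum(ui * ui for ui in u))
        basis.append([ui / norm for ui in u])
    return basis  # rows are the orthonormal vectors, i.e. Q'


def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def gram_plus_identity(B):
    """Stand-in symmetric matrix B B' + I (not the article's V)."""
    G = matmul(B, transpose(B))
    return [[G[r][c] + (1.0 if r == c else 0.0) for c in range(len(G))]
            for r in range(len(G))]


# Illustrative values with T = 2, N = 3, J = 4.
A = [[3.0, 1.0, 2.0], [4.0, 2.0, -1.0]]
B = [[1.0, 0.5, -1.0, 2.0], [2.0, -1.0, 0.5, 1.0]]

Gamma = rotation_from_qr(A)
GA, GB = matmul(Gamma, A), matmul(Gamma, B)

fit_before = matmul(transpose(A), B)   # A'B
fit_after = matmul(transpose(GA), GB)  # (Gamma A)'(Gamma B)

S, S_rot = gram_plus_identity(B), gram_plus_identity(GB)
```

Because \({\varvec{\Gamma }}\) is orthogonal, `fit_after` agrees with `fit_before` and `det2(S_rot)` agrees with `det2(S)` up to rounding, which is the mechanism the proof relies on when \({\varvec{B}}\) is replaced by \({\varvec{\Gamma }}{\varvec{B}}\).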
Cite this article
Fong, D.K.H., DeSarbo, W.S., Chen, Z. et al. A Bayesian Vector Multidimensional Scaling Procedure Incorporating Dimension Reparameterization with Variable Selection. Psychometrika 80, 1043–1065 (2015). https://doi.org/10.1007/s11336-015-9449-x
DOI: https://doi.org/10.1007/s11336-015-9449-x