Volume 77, Issue 4, pp 741–762

The Heterogeneous P-Median Problem for Categorization Based Clustering

  • Simon J. Blanchard
  • Daniel Aloise
  • Wayne S. DeSarbo


The p-median offers an alternative to centroid-based clustering algorithms for identifying unobserved categories. However, existing p-median formulations typically require aggregating the data into a single proximity matrix, which masks respondent heterogeneity. A proposed three-way formulation of the p-median problem explicitly considers heterogeneity by identifying groups of individual respondents who perceive similar category structures. Three heuristics for the heterogeneous p-median (HPM) are developed and then illustrated in a consumer psychology context, using a sample of undergraduate students who performed a sorting task of major U.S. retailers, as well as through a Monte Carlo analysis.
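As a point of reference for the aggregated case mentioned above, the following is a minimal sketch of the classical p-median problem on a single dissimilarity matrix: choose p objects as medians (exemplars) so that the total dissimilarity between each object and its nearest median is smallest. It is not the HPM formulation, nor any of the three heuristics developed in the article. The function name `p_median_brute_force` and the toy matrix `D` are assumptions made purely for illustration, and exhaustive enumeration is only practical for very small problems.

```python
from itertools import combinations


def p_median_brute_force(dissim, p):
    """Exhaustively search all sets of p medians (illustrative only).

    dissim is a square list-of-lists dissimilarity matrix.
    Returns (medians, assignment, cost), where assignment[i] is the
    median chosen for object i and cost is the summed dissimilarity
    of every object to its nearest median.
    """
    n = len(dissim)
    best_cost, best_medians = float("inf"), None
    for medians in combinations(range(n), p):
        # Each object contributes its dissimilarity to the closest median.
        cost = sum(min(dissim[i][m] for m in medians) for i in range(n))
        if cost < best_cost:  # strict '<' keeps the first optimum found
            best_cost, best_medians = cost, medians
    assignment = [
        min(best_medians, key=lambda m: dissim[i][m]) for i in range(n)
    ]
    return best_medians, assignment, best_cost


# Hypothetical 5-object dissimilarity matrix with two evident groups:
# objects {0, 1} and objects {2, 3, 4}.
D = [
    [0, 1, 8, 9, 9],
    [1, 0, 7, 8, 9],
    [8, 7, 0, 1, 2],
    [9, 8, 1, 0, 1],
    [9, 9, 2, 1, 0],
]

medians, assignment, cost = p_median_brute_force(D, 2)
print(medians, assignment, cost)
```

Unlike a centroid-based method, each cluster here is represented by an actual object from the data, so only the dissimilarities are needed and no coordinates are averaged.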

Key words

p-median, heterogeneity, sorting task, categorization, clustering, consumer psychology



We wish to thank the Editor, the Associate Editor, and three anonymous reviewers for their constructive comments, which have helped improve the contribution and quality of this manuscript.



Copyright information

© The Psychometric Society 2012

Authors and Affiliations

  • Simon J. Blanchard (1)
  • Daniel Aloise (2)
  • Wayne S. DeSarbo (3)
  1. McDonough School of Business, Georgetown University, Washington, USA
  2. Department of Computer Engineering and Automation, Universidade Federal do Rio Grande do Norte, Natal, Brazil
  3. Department of Marketing, Smeal College of Business, Pennsylvania State University, University Park, USA
