A new spatial (social) interaction discrete choice model accommodating for unobserved effects due to endogenous network formation

Transportation

Abstract

This paper formulates a model that extends the traditional panel discrete choice model to include social/spatial dependencies in the form of dyadic interactions between each pair of decision-makers. In addition, the formulation accommodates spatial correlation effects and allows a global spatial structure to be placed on the individual-specific unobserved response sensitivity to exogenous variables. We interpret these latter two effects, sometimes referred to as spatial drift effects, as originating from endogenous group formation. To our knowledge, we are the first to suggest this endogenous group formation interpretation for spatial drift effects in the social/spatial interactions literature. The formulation is motivated in a travel mode choice context, but is applicable in a wide variety of other empirical contexts. Bhat’s (Transp Res B 45(7):923–939, 2011) maximum approximate composite marginal likelihood (MACML) procedure is used for model estimation. A simulation exercise indicates that the MACML approach recovers the model parameters very well, even in the presence of high spatial dependence and endogenous group formation tendency. In addition, the simulation results demonstrate that ignoring spatial dependence and endogenous group formation when both are actually present will lead to bias in parameter estimation.

Notes

  1. Our approach, in a traditional continuous dependent variable setting, leads to what is referred to as the Kelejian–Pruscha (KP) model (though, as we will discuss later, the model we propose is for an unordered-response discrete dependent variable and extends the KP model in important ways). The alternative of considering exogenous variable effects and ignoring unobserved correlation effects, in a traditional continuous dependent variable setting, is referred to as the spatial Durbin model. The reader is referred to LeSage and Pace (2009) and Elhorst (2010) for discussions. Both sets of authors suggest that, given the identification problems in introducing both these effects (along with the social/spatial interaction or spatial lag effect), there are benefits to starting with the spatial Durbin model as the most general specification. This is because ignoring spatial dependence in the errors in continuous variable models leads only to a loss of efficiency, while ignoring spatial dependence in the exogenous variables leads to biased and inconsistent estimates due to the omitted variable problem. However, this discussion does not extend to discrete choice models, where the typical spatial dependence structure used for the error disturbance also adds to error heteroscedasticity, which leads to biased and inconsistent estimates of the discrete choice model. Further, as discussed in the main text, in many transportation contexts, it may be easier to motivate unobserved error correlation effects.

  2. Econometrically speaking, relative to Lee et al. (2010), our paper differs in that we allow a global network structure (equivalently, a single group in Lee et al.’s study), and also accommodate a network-based unobserved correlation structure for the overall error term as well as for individual coefficients on exogenous variables.

  3. In the spatial literature, spatial drift effects have typically been incorporated using the geographically weighted regression (GWR) approach of Brunsdon et al. (1998) (some other approaches, such as spatial adaptive filtering and multi-level modeling, are either too ad hoc or too restrictive in capturing spatial drift effects, and are not often used; see Mittal et al. 2004 for a discussion). The GWR approach allows spatial dependence in parameters based on spatial proximity, but is not able to disentangle the spatial drift effects from the direct parameter effect. On the other hand, being able to do so is important in many cases. For example, in our mode choice context, the effect of neo-urbanist developments on mode shares, net of residential self-selection effects (as captured by spatial drift), is important in its own right. Our formulation explicitly and directly captures spatial dependence in the random coefficients and is able to isolate the mean direct effect of each exogenous variable, while also controlling for spatial lag and spatial error effects.

  4. In the terminology of the social network formation literature, we accommodate self-selection effects caused by residential patterns in modeling modal choice, though we do not jointly model the coevolution of residential network formation (that is, the W matrix) and the modal outcome. That is, the network literature (see, for example, Steglich et al. 2010 and Christakis and Fowler 2013) considers the network formation and behavior outcomes as closely intertwined and evolving over time, with each influencing the other in a dynamic fashion. We do not examine such a dynamic system of network and behavior coevolution in this paper.

  5. This is the same estimator as the one used by Wang et al. (2013) in a spatial binary probit context, which they label the partial maximum likelihood estimator (PMLE). However, papers using the pairwise CML for spatial econometric contexts and for even more general discrete choice contexts than the spatial binary probit were published by Bhat and colleagues earlier, as in Bhat and Sidharthan (2012), Sener and Bhat (2012), Castro et al. (2012), Bhat (2011), and Bhat et al. (2010). Some of these combine the CML method with the MVNCD approximation, which then becomes the MACML approach of Bhat (2011).

  6. The CML estimator loses some asymptotic efficiency from a theoretical perspective relative to a full likelihood estimator, because information embedded in the higher dimension components of the full information estimator is ignored by the CML estimator. However, as presented in Bhat (2014), many studies have found that the efficiency loss of the CML estimator (relative to the maximum likelihood (ML) estimator) is negligible to small in applications on finite samples. Besides, in spatial models, a maximum simulated likelihood (MSL) approach is needed for estimation because of the high dimensionality of integration. When simulation methods are used, there is also a loss in asymptotic efficiency in the MSL estimator relative to a full likelihood estimator (McFadden and Train 2000). Consequently, it is difficult to state from a theoretical standpoint whether the CML estimator efficiency will be higher or lower than the MSL estimator efficiency. Bhat (2014) presents many studies that empirically compare the CML and MSL finite sample efficiency results in models where it is practical to implement the MSL, and concludes that “….any reduction in the efficiency of the CML approach relative to the MSL approach is in the range of non-existent to small”. In addition, these studies noted that, while the MSL method encountered convergence problems even for relatively simple aspatial models, the CML approach exhibited no such problems and had the benefit of substantially faster computational times.

  7. In the MACML approach, a single random permutation is generated for each choice instance (the random permutation varies across choice instances, but is the same across iterations for a given choice instance) to decompose the MVNCD function into a product sequence of marginal and conditional probabilities (see Sect. 2.1 of Bhat 2011). We also tested a higher number of permutations, but noticed little difference in the estimation results, and hence settled on the single permutation per individual.

  8. As pointed out by Sándor and Train (2004), approximation methods of any kind to evaluate the maximum of an analytically-intractable function will tend to show more variance in the convergent values (in repeated applications of the approximation with different sets of simulation draws in a simulated setting or different permutations of the conditional probability sequence in our MACML estimation setting) as the function gets flatter near the maximum. This is because errors introduced by simulation or other approximations can move the maximum considerably when the function is flat near the maximum. Of course, large sampling variances of the parameters embedded in a function mean a large sampling variance of the function being maximized; that is, the larger the sampling variance of parameters, the higher in general will be the approximation error. Thus, we examine the extent of approximation error as a percentage of the finite sample standard deviation (or FSSD).

  9. A peculiar observation related to the approximation error (as a percentage of the FSSD) is that it declines quite considerably for the λ₂ and λ₃ parameters as one moves from low spatial lag to high spatial lag and from low spatial drift to high spatial drift. Why this is so is left for future exploration.
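
Notes 8 and 9 report approximation error as a percentage of the finite sample standard deviation (FSSD). The following is a minimal Python sketch of that ratio for a single parameter, under assumed definitions that the notes themselves do not spell out: the approximation error is taken as the standard deviation of the estimates across permutations within a dataset, averaged over datasets, and the FSSD as the standard deviation, across datasets, of the per-dataset mean estimates. The function name, the data layout, and the numbers are hypothetical and purely illustrative; they are not results from the paper.

```python
from statistics import mean, stdev


def approximation_error_pct(estimates):
    """Approximation error as a percentage of the FSSD for one parameter.

    estimates[d][p] is the estimate obtained on simulated dataset d when the
    estimation is run with permutation p (hypothetical layout). At least two
    datasets and two permutations per dataset are needed.
    """
    # Mean estimate per dataset, averaging over permutations.
    per_dataset_means = [mean(row) for row in estimates]

    # Assumed FSSD: spread of the per-dataset mean estimates across datasets.
    fssd = stdev(per_dataset_means)

    # Assumed approximation error: spread across permutations within a
    # dataset, averaged over datasets.
    aperr = mean(stdev(row) for row in estimates)

    return 100.0 * aperr / fssd, aperr, fssd


# Made-up estimates: 3 datasets (rows) x 3 permutations (columns).
example = [
    [0.48, 0.50, 0.52],
    [0.58, 0.60, 0.62],
    [0.38, 0.40, 0.42],
]
pct, aperr, fssd = approximation_error_pct(example)
print(f"APERR = {aperr:.3f}, FSSD = {fssd:.3f}, APERR/FSSD = {pct:.1f}%")
```

With these made-up numbers the within-dataset spread is 0.02 and the across-dataset spread is 0.1, so the ratio works out to about 20%. A small ratio would indicate that the variability introduced by the approximation is minor relative to ordinary sampling variability.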

References

  • Anselin, L.: Spatial externalities, spatial multipliers and spatial econometrics. Int. Region. Sci. Rev. 26, 153–166 (2003)

  • Bartolucci, F., Nigro, V.: A dynamic model for binary panel data with unobserved heterogeneity admitting a consistent conditional estimator. Econometrica 78(2), 719–733 (2010)

  • Beron, K.J., Vijverberg, W.P.M.: Probit in a spatial context: a Monte Carlo analysis. In: Anselin, L., Florax, R.J.G.M., Rey, S.J. (eds.) Advances in Spatial Econometrics: Methodology, Tools and Applications. Springer-Verlag, Berlin (2004)

  • Bhat, C.R.: The maximum approximate composite marginal likelihood (MACML) estimation of multinomial probit-based unordered response choice models. Transp. Res. B 45(7), 923–939 (2011)

  • Bhat, C.R.: The composite marginal likelihood (CML) inference approach with applications to discrete and mixed dependent variable models. Found. Trends Econom. 7(1), 1–117 (2014)

  • Bhat, C.R., Guo, J.Y.: A comprehensive analysis of built environment characteristics on household residential choice and auto ownership levels. Transp. Res. B 41(5), 506–526 (2007)

  • Bhat, C.R., Sardesai, R.: The impact of stop-making and travel time reliability on commute mode choice. Transp. Res. B 40(9), 709–730 (2006)

  • Bhat, C.R., Sener, I.N.: A copula-based closed-form binary logit choice model for accommodating spatial correlation across observational units. J. Geogr. Syst. 11(3), 243–272 (2009)

  • Bhat, C.R., Sidharthan, R.: A new approach to specify and estimate non-normally mixed multinomial probit models. Transp. Res. B 46(7), 817–833 (2012)

  • Bhat, C.R., Sener, I.N., Eluru, N.: A flexible spatially dependent discrete choice model: formulation and application to teenagers’ weekday recreational activity participation. Transp. Res. B 44(8–9), 903–921 (2010)

  • Bhat, C.R., Elhorst, J.P., LeSage, J.P.: On spatial effects, interpretation, and estimation in econometric models. In: Paper Based on a Workshop Held at the 9th Invitational Choice Symposium, Noordwijk (2013)

  • Bhat, C.R., Paleti, R., Singh, P.: A spatial multivariate count model for firm location decisions. J. Region. Sci. 54(3), 462–502 (2014a)

  • Bhat, C.R., Astroza, C., Sidharthan, R., Jobair Bin Alam, M., Khushefati, W.H.: A joint count-continuous model of travel behavior with selection based on a multinomial probit residential density choice model. Transp. Res. B 68, 31–51 (2014b)

  • Bille, A.G.: Computational issues in estimation of the spatial probit model: a comparison of various estimators. Rev. Region. Stud. 43, 131–154 (2013)

  • Blume, L.E., Brock, W.A., Durlauf, S.N., Ioannides, Y.M.: Identification of social interactions, chap. 18. In: Benhabib, J., Jackson, M.O., Bisin, A. (eds.) Handbook of Social Economics, vol. 1B, pp. 853–964. North-Holland, San Diego (2011)

  • Bradlow, E.T., Bronnenberg, B., Russell, G.J., Arora, N., Bell, D.R., Duvvuri, S.D., Hofstede, F.T., Sismeiro, C., Thomadsen, R., Yang, S.: Spatial models in marketing. Mark. Lett. 16(3), 267–278 (2005)

  • Bramoullé, Y., Djebbari, H., Fortin, B.: Identification of peer effects through social networks. J. Econom. 150, 41–55 (2009)

  • Brock, W., Durlauf, S.: Discrete choice with social interactions. Rev. Econ. Stud. 68, 235–260 (2001)

  • Brock, W., Durlauf, S.: Multinomial choice with social interactions. In: Blume, L., Durlauf, S. (eds.) The Economy as an Evolving Complex System, vol. 3. Oxford University Press, New York (2006)

  • Brock, W., Durlauf, S.: Identification of binary choice models with social interactions. J. Econom. 140, 52–75 (2007)

  • Brunsdon, C., Fotheringham, S., Charlton, M.: Geographically weighted regression. J. R. Stat. Soc. Ser. D 47(3), 431–443 (1998)

  • Calabrese, R., Elkink, J.A.: Estimators of binary spatial autoregressive models: a Monte Carlo study. J. Region. Sci. 54(4), 664–687 (2014)

  • Carrión-Flores, C., Irwin, E.G.: Determinants of residential land-use conversion and sprawl at the rural-urban fringe. Am. J. Agric. Econ. 86(4), 889–904 (2004)

  • Castro, M., Paleti, R., Bhat, C.R.: A latent variable representation of count data models to accommodate spatial and temporal dependence: application to predicting crash frequency at intersections. Transp. Res. B 46(1), 253–272 (2012)

  • Chakir, R., Parent, O.: Determinants of land use changes: a spatial multinomial probit approach. Pap. Region. Sci. 88(2), 327–344 (2009)

  • Chamberlain, G.: Binary response models for panel data: identification and information. Econometrica 78(1), 159–168 (2010)

  • Christakis, N.A., Fowler, J.H.: Social contagion theory: examining dynamic social networks and human behavior. Stat. Med. 32(4), 556–577 (2013)

  • Cox, D.R., Reid, N.: A note on pseudolikelihood constructed from marginal densities. Biometrika 91(3), 729–737 (2004)

  • Davezies, L., D’Haultfoeuille, X., Fougère, D.: Identification of peer effects using group size variation. Econom. J. 12(3), 397–413 (2009)

  • Elhorst, J.P.: Applied spatial econometrics: raising the bar. Spat. Econ. Anal. 5(1), 10–28 (2010)

  • Franzese, R.J., Hays, J.C., Schaffer, L.M.: Spatial, temporal, and spatiotemporal autoregressive probit models of binary outcomes: estimation, interpretation, and presentation. APSA 2010 Annual Meeting Paper (2010). http://ssrn.com/abstract=1643867

  • Godambe, V.P.: An optimum property of regular maximum likelihood estimation. Ann. Math. Stat. 31(4), 1208–1211 (1960)

  • Guo, J.Y., Bhat, C.R.: Operationalizing the concept of neighborhood: application to residential location choice analysis. J. Transp. Geogr. 15(1), 31–45 (2007)

  • Hartman, W.R., Manchanda, P., Nair, H., Bothner, M., Dodds, P., Godes, D., Hosanagar, K., Tucker, C.: Modeling social interactions: identification, empirical methods and policy implications. Mark. Lett. 19(3–4), 287–304 (2008)

  • Klier, T., McMillen, D.P.: Clustering of auto supplier plants in the U.S.: GMM spatial logit for large samples. ASA J. Bus. Econ. Stat. 26(4), 460–471 (2008)

  • Krauth, B.: Social interactions in small groups. Can. J. Econ. 39, 414–433 (2006)

  • Lee, L.-F.: Identification and estimation of econometric models with group interactions, contextual factors and fixed effects. J. Econom. 140, 333–374 (2007)

  • Lee, L.-F., Liu, X., Lin, X.: Specification and estimation of social interaction models with network structure: contextual factors, correlation and fixed effects. Econom. J. 13, 145–176 (2010)

  • LeSage, J.P.: Bayesian estimation of limited dependent variable spatial autoregressive models. Geogr. Anal. 32(1), 19–35 (2000)

  • LeSage, J.P., Pace, R.K.: Introduction to Spatial Econometrics. Chapman & Hall/CRC, Taylor & Francis Group, Boca Raton (2009)

  • LeSage, J.P., Pace, R.K., Lam, N., Campanella, R., Liu, X.: New Orleans business recovery in the aftermath of Hurricane Katrina. J. R. Stat. Soc. Ser. A 174(4), 1007–1027 (2011)

  • Lindsay, B.G.: Composite likelihood methods. Contemp. Math. 80, 221–239 (1988)

  • Lindsay, B.G., Yi, G.Y., Sun, J.: Issues and strategies in the selection of composite likelihoods. Stat. Sin. 21(1), 71–105 (2011)

  • Manski, C.F.: Identification of endogenous social effects: the reflection problem. Rev. Econ. Stud. 60(3), 531–542 (1993)

  • McFadden, D., Train, K.: Mixed MNL models for discrete response. J. Appl. Econom. 15(5), 447–470 (2000)

  • McMillen, D.P.: Probit with spatial autocorrelation. J. Region. Sci. 32, 335–348 (1992)

  • Mittal, V., Kamakura, W.A., Govind, R.: Geographic patterns in customer service and satisfaction: an empirical investigation. J. Mark. 68, 48–62 (2004)

  • Moffitt, R.: Policy interventions, low-level equilibria, and social interactions. In: Durlauf, S., Young, H.P. (eds.) Social Dynamics. MIT Press, Cambridge (2001)

  • Mokhtarian, P.L., Cao, X.: Examining the impacts of residential self-selection on travel behavior: a focus on methodologies. Transp. Res. B 42(3), 204–228 (2008)

  • Molenberghs, G., Verbeke, G.: Models for Discrete Longitudinal Data. Springer Series in Statistics. Springer Science + Business Media Inc, New York (2005)

  • Pace, R.K., LeSage, J.P.: Fast simulated maximum likelihood estimation of the spatial probit model capable of handling large samples (2011). http://ssrn.com/abstract=1966039

  • Pace, L., Salvan, A., Sartori, N.: Adjusting composite likelihood ratio statistics. Stat. Sin. 21(1), 129–148 (2011)

  • Pinjari, A.R., Bhat, C.R.: On the nonlinearity of response to level of service variables in travel mode choice models. Transp. Res. Rec. 1977, 67–74 (2006)

  • Sándor, Z., Train, K.: Quasi-random simulation of discrete choice models. Transp. Res. B 38, 313–327 (2004)

  • Sener, I.N., Bhat, C.R.: Flexible spatial dependence structures for unordered multinomial choice models: formulation and application to teenagers’ activity participation. Transportation 39(3), 657–683 (2012)

  • Sidharthan, R., Bhat, C.R.: Incorporating spatial dynamics and temporal dependency in land use change models. Geogr. Anal. 44(4), 321–349 (2012)

  • Smirnov, O.A.: Spatial econometrics approach to integration of behavioral biases in travel demand analysis. Transp. Res. Rec. 2157, 1–10 (2010)

  • Steglich, C., Snijders, T.A.B., Pearson, M.: Dynamic networks and behavior: separating selection from influence. Sociol. Methodol. 40(1), 329–393 (2010)

  • Soetevent, A., Kooreman, P.: A discrete-choice model with social interactions: with an application to high school teen behavior. J. Appl. Econom. 22(3), 599–624 (2007)

  • Thomas, T.S.: Impact of economic policy options on deforestation in Madagascar, Companion Paper for the Policy Research Report on Forests, Environment, and Livelihoods, The World Bank, Washington DC (2007). http://siteresources.worldbank.org/INTKNOWLEDGEFORCHANGE/Resources/491519-1199818447826/a_primer_for_bayesian_spatial_probits_with_an_application_to_deforestation_in_madagascar.pdf. Accessed 11 Oct 2014

  • Tsai, Y.: Impacts of self-selection and transit proximity on commute mode choice: evidence from Taipei Rapid Transit System. Ann. Region. Sci. 43(4), 1073–1094 (2009)

  • Varin, C., Reid, N., Firth, D.: An overview of composite likelihood methods. Stat. Sin. 21(1), 5–42 (2011)

  • Wang, H., Iglesias, E.M., Wooldridge, J.M.: Partial maximum likelihood estimation of spatial probit models. J. Econom. 172, 77–89 (2013)

  • Xu, X., Reid, N.: On the robustness of maximum composite likelihood estimate. J. Stat. Plan. Inference 141(9), 3047–3054 (2011)

  • Yi, G.Y., Zeng, L., Cook, R.J.: A robust pairwise likelihood method for incomplete longitudinal binary data arising in clusters. Can. J. Stat. 39(1), 34–51 (2011)

Acknowledgments

This research was partially supported by the U.S. Department of Transportation through the Data-Supported Transportation Operations and Planning (D-STOP) Tier 1 University Transportation Center. The author would also like to acknowledge support from a Humboldt Research Award from the Alexander von Humboldt Foundation, Germany. The author is grateful to Lisa Macias for her help in formatting this document, and to Chrissy Bernardo, Rajesh Paleti, and Subodh Dubey for GAUSS coding help and providing the results from the simulation runs. Four anonymous reviewers provided useful comments on an earlier version of this paper.

Author information

Corresponding author

Correspondence to Chandra Bhat.

About this article

Cite this article

Bhat, C. A new spatial (social) interaction discrete choice model accommodating for unobserved effects due to endogenous network formation. Transportation 42, 879–914 (2015). https://doi.org/10.1007/s11116-015-9651-9

  • DOI: https://doi.org/10.1007/s11116-015-9651-9
