Abstract
In recent decades we have seen an increased interest in the use of seemingly unrelated regression (SUR) models in a spatial context, with compelling case studies in different fields. This upsurge has favoured the development of new and more efficient inference techniques. At present, the user has a basic toolkit to deal with this kind of model, which is, however, in need of improvement. This paper focuses on the question of estimating spatial SUR models quickly and accurately. The most popular procedure is maximum likelihood (ML), which guarantees precision at the cost of a high computational burden. This is especially true in cases of large sample size and strong spatial structure. We explore simpler estimation algorithms such as instrumental variables (IV), which expedite calculation at the cost of a certain loss in the quality of the estimates. We focus on the importance of sample size and the trade-off between accuracy and speed. To that end, we perform a comprehensive simulation experiment in which we compare the ML and IV algorithms, looking for their strengths and weaknesses. The paper includes two applications to the case of Airbnb in the urban area of Madrid. First, we estimate a spatial SUR hedonic model of accommodation prices, using micro-data for three different cross sections à la Anselin, that is, considering temporal correlation plus spatial structure. Then, the data are aggregated by neighbourhoods. We specify a spatial SUR model with two equations (apartments and rooms), also using three cross sections. In both cases, the models are estimated by ML and IV using the spsur R package (Angulo et al. 2019), with the aim of illustrating its capabilities.
Notes
If we allow this matrix to change across equations and/or time periods, the results remain essentially the same.
To save space, we consider only the case of the SUR–SLM models. The results for the SUR–SDM models are available upon request from the authors.
Characterized as having hosted at least 10 trips, maintained a response rate of \(90\%\) or higher, received a 5-star review at least \(80\%\) of the times they have been reviewed, and completed each of their confirmed reservations without cancelling.
References
Angulo A, Lopez FA, Minguez R, Mur J (2019) spSUR: spatial seemingly unrelated regression models. R package version 1.0.0.3
Anselin L (1988a) Spatial econometrics: methods and models. Studies in operational regional science. Kluwer Academic Publishers, Dordrecht
Anselin L (1988b) A test for spatial autocorrelation in seemingly unrelated regressions. Econ Lett 28(4):335–341
Anselin L (1995) SpaceStat, a software program for the analysis of spatial data
Anselin L (2016) Estimation and testing in the spatial seemingly unrelated regression (SUR). Technical report, GeoDa Center for Geospatial Analysis and Computation, Arizona State University. Working Paper 2016-01
Arbia G (1989) Spatial data configuration in statistical analysis of regional economic and related problems. Kluwer Academic Publisher, Dordrecht
Arora SS, Brown M (1977) Alternative approaches to spatial autocorrelation: an improvement over current practice. Int Reg Sci Rev 2:67–78
Baltagi BH, Bresson G (2011) Maximum likelihood estimation and Lagrange multiplier tests for panel seemingly unrelated regressions with spatial lag and spatial errors: an application to hedonic housing prices in Paris. J Urban Econ 69(1):24–42
Baltagi BH, Pirotte A (2011) Seemingly unrelated regressions with spatial error components. Empir Econ 40(1):5–49
Barry RP, Pace RK (1999) Monte Carlo estimates of the log determinant of large sparse matrices. Linear Algebra Appl 289:41–54
Bech M, Hansen M, Lauridsen J, Kronborg C (2012) Can the municipalities prevent medication of mental diseases? J Ment Health Policy Econ 15(2):53–60
Benítez-Aurioles B (2018a) The role of distance in the peer-to-peer market for tourist accommodation. Tour Econ 24(3):237–250
Benítez-Aurioles B (2018b) Why are flexible booking policies priced negatively? Tour Manag 67:312–325
Cotteleer G, van Kooten GC (2012) Expert opinion versus actual transaction evidence in the valuation of non-market amenities. Econ Model 29(1):32–40
Davidson R, MacKinnon JG (1993) Estimation and inference in econometrics. Oxford University Press, Oxford
Durbin J (1954) Errors in variables. Revue de l’institut International de Statistique 23–32
Edelman BG, Luca M (2014) Digital discrimination: the case of Airbnb.com
Egger P, Pfaffermayr M (2004) Distance, trade and FDI: a Hausman–Taylor SUR approach. J Appl Econ 19(2):227–246
Elhorst P (2014) Spatial econometrics from cross-sectional data to spatial panels. Springer, Berlin
Fang B, Ye Q, Law R (2016) Effect of sharing economy on tourism industry employment. Ann Tour Res 57:264–267
Fiebig DG (2007) Seemingly unrelated regression. In: Baltagi B (ed) A companion to theoretical econometrics, vol 5. Wiley-Blackwell, Hoboken, pp 101–121
Fingleton B (2007) Multi-equation spatial econometric model, with application to EU manufacturing productivity growth. J Geogr Syst 9:119–144
Gunter U (2018) What makes an Airbnb host a superhost? Empirical evidence from San Francisco and the Bay Area. Tour Manag 66:26–37
Gurran N, Phibbs P (2017) When tourists move in: How should urban planners respond to Airbnb? J Am Plan Assoc 83:80–92
Gutiérrez J, García-Palomares JC, Romanillos G, Salas-Olmedo MH (2017) The eruption of Airbnb in tourist cities: comparing spatial patterns of hotels and peer-to-peer accommodation in Barcelona. Tour Manag 62:278–291
Hordijk L, Nijkamp P (1977) Dynamic models of spatial autocorrelation. Environ Plan A 9(5):505–519
Hung W-T, Shang J-K, Wang F-C (2010) Pricing determinants in the hotel industry: quantile regression analysis. Int J Hosp Manag 29:378–384
Izón GM, Hand MS, McCollum DW, Thacher JA, Berrens RP (2016) Proximity to natural amenities: a seemingly unrelated hedonic regression model with spatial Durbin and spatial error processes. Growth Change 47(4):461–480
Johnston J (1984) Econometric methods, 3rd edn
Kakamu K, Polasek W, Wago H (2012) Production technology and agglomeration for Japanese prefectures during 1991–2000. Pap Reg Sci 91(1):29–41
Kakar V, Voelz J, Wu J, Franco J (2018) The visible host: Does race guide Airbnb rental rates in San Francisco? J Hous Econ 40:25–40
Kapoor M, Kelejian HH, Prucha IR (2007) Panel data models with spatially correlated error components. J Econ 140(1):97–130
Kelejian HH, Prucha IR (1998) A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. J Real Estate Finance Econ 17(1):99–121
Kelejian HH, Prucha IR (1999) A generalized moments estimator for the autoregressive parameter in a spatial model. Int Econ Rev 40:509–533
Kelejian HH, Prucha IR (2004) Estimation of simultaneous systems of spatially interrelated cross sectional equations. J Econ 118(1–2):27–50
Kelejian HH, Prucha IR, Yuzefovich Y (2004) Instrumental variable estimation of a spatial autoregressive model with autoregressive disturbances: large and small sample results. In: LeSage J, Pace K (eds) Spatial and spatiotemporal econometrics. Emerald Group Publishing Limited, Bingley, pp 163–198
Kennedy P (2003) A guide to econometrics. MIT Press
Lauridsen J, Bech M, López F, Sánchez MM (2010) A spatiotemporal analysis of public pharmaceutical expenditure. Ann Reg Sci 44(2):299–314
Le Gallo J, Chasco C (2008) Spatial analysis of urban growth in Spain, 1900–2001. Empir Econ 34(1):59–80
Le Gallo J, Dall'Erba S (2006) Evaluating the temporal and spatial heterogeneity of the European convergence process, 1980–1999. J Reg Sci 46(2):269–288
Lee L-F (2003) Best spatial two-stage least squares estimators for a spatial autoregressive model with autoregressive disturbances. Econ Rev 22(3):307–335
Lee L-F (2004) Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica 72(6):1899–1925
Lee L-F (2007) GMM and 2SLS estimation of mixed regressive, spatial autoregressive models. J Econ 137(2):489–514
Li J, Moreno A, Zhang DJ (2015) Agent behavior in the sharing economy: evidence from Airbnb. Ross School of Business Working Paper Series, Working paper no 1298, 2015
López FA, Mur J, Angulo A (2014) Spatial model selection strategies in a SUR framework. The case of regional productivity in EU. Ann Reg Sci 53(1):197–220
López FA, Martínez-Ortiz PJ, Cegarra-Navarro J-G (2017) Spatial spillovers in public expenditure on a municipal level in Spain. Ann Reg Sci 58(1):39–65
Lundberg J (2006) Spatial interaction model of spillovers from locally provided public services. Reg Stud 40(6):631–644
Malinvaud E (1970) Statistical methods of econometrics. North Holland, Amsterdam
Mínguez R, López F, Mur J (2018) An R package for specification, estimation and testing of spatial and spatio-temporal SUR econometric model. R package version 1.0.0.3. https://CRAN.R-project.org/package=spsur
Moscone F, Tosetti E, Knapp M (2007) SUR model with spatial effects: an application to mental health expenditure. Health Econ 16(12):1403–1408
Mur J, López F, Herrera M (2010) Testing for spatial effects in seemingly unrelated regressions. Spatial Econ Anal 5:399–440
Pace RK, Barry R (1997) Quick computation of spatial autoregressive estimators. Geogr Anal 29(3):232–247
Rey SJ, Montouri BD (1999) US regional income convergence: a spatial econometric perspective. Reg Stud 33(2):143–156
Sargan JD (1958) The estimation of economic relationships using instrumental variables. Econ J Econ Soc 26(3):393–415
Smirnov O, Anselin L (2001) Fast maximum likelihood estimation of very large spatial autoregressive models: a characteristic polynomial approach. Comput Stat Data Anal 35:301–319
Smith TE (2009) Estimation bias in spatial models with strongly connected weight matrices. Geogr Anal 41(3):307–332
Spanos A (1986) Statistical foundations of econometric modelling. Cambridge University Press, Cambridge
Teubner T, Hawlitschek F, Dann D (2017) Price determinants on Airbnb: how reputation pays off in the sharing economy. J Self-Gov Manag Econ 5:53–80
Theil H (1971) Principles of econometrics. Wiley, Hoboken
Wang D, Nicolau J (2017) Price determinants of sharing economy based accommodation rental: a study of listings from 33 cities on Airbnb.com. Int J Hosp Manag 62:120–131
Wang X, Kockelman KM (2007) Specification and estimation of a spatially and temporally autocorrelated seemingly unrelated regression model: application to crash rates in China. Transportation 34(3):281–300
White EN, Hewings GJ (1982) Space–time employment modeling: some results using seemingly unrelated regressions estimators. J Reg Sci 22(3):283–302
Wooldridge JM (2010) Econometric analysis of cross section and panel data, 2nd edn. MIT Press, Cambridge
Zellner A (1962) An efficient method of estimating seemingly unrelated regressions and test of aggregation bias. J Am Stat Assoc 57:500–509
Zervas G, Proserpio D, Byers J (2017) The rise of the sharing economy: estimating the impact of Airbnb on the hotel industry. J Mark Res 54(5):687–705
Zhang L, Chen L, Wu Z, Xue H, Dong W (2018) Key factors affecting informed consumers’ willingness to pay for green housing: a case study of Jinan, China. Sustainability 10:1711
Zhou BB, Kockelman KM (2009) Predicting the distribution of households and employment: a seemingly unrelated regression model with two spatial processes. J Transp Geogr 17(5):369–376
Acknowledgements
This work has been partially funded by the Spanish Ministry of Economy and Competitiveness Grants MTM2014-52184, ECO2015-65826-P and ECO2015-65758-P and Seneca Foundation Grant 19884/GERM/15. Jesus Mur is grateful for the financial support of the Department of Industry and Innovation of Aragón Government and the European Regional Development Fund, through the GAEC group, project number S37_17R.
Appendix: Additional results on the Monte Carlo
Figures 3, 4, 5, 6, 7 and 8 add details to Sect. 3. Here, we report the estimation bias for the parameters of the experiment, using classical boxplots. For reasons of space, we only distinguish between types of equation, SLM or SDM. The data in the graphs are the differences between the true and the estimated value of the corresponding parameter in each iteration, so an unbiased estimator should produce a boxplot centred on zero. In addition, the IV algorithm has been simulated for a sample size of \(n=5000\) to check its consistency. The results, available upon request, confirm the slow rate of convergence of these estimates.
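The quantity plotted in these figures, the true value minus the estimate in each iteration, can be illustrated with a minimal Python sketch. It is not the experimental design of Sect. 3: a plain OLS regression stands in for the spatial SUR estimators, and the function names, the true slope of 2, the 500 iterations and the sample sizes are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_errors(n, true_slope=2.0, n_iter=500, seed=0):
    """Return (true - estimated) slope for each Monte Carlo iteration.

    Assumption: a simple OLS model replaces the spatial SUR estimators.
    """
    rng = np.random.default_rng(seed)
    errors = np.empty(n_iter)
    for i in range(n_iter):
        x = rng.normal(size=n)
        y = 1.0 + true_slope * x + rng.normal(size=n)
        X = np.column_stack([np.ones(n), x])
        beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
        errors[i] = true_slope - beta_hat[1]
    return errors

def box_summary(errors):
    """Quartiles that define the box of a boxplot."""
    q1, med, q3 = np.percentile(errors, [25, 50, 75])
    return {"q1": q1, "median": med, "q3": q3, "iqr": q3 - q1}

# One summary per (hypothetical) sample size
summaries = {n: box_summary(simulate_errors(n)) for n in (25, 100, 400)}
for n, s in summaries.items():
    print(n, round(s["median"], 4), round(s["iqr"], 4))
```

A median close to zero corresponds to a box centred on zero, and an interquartile range that shrinks as n grows corresponds to the compression of the boxes discussed below.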
This is the case for the boxplots of the estimates of the slope coefficients, i.e. the \(\beta _{2g}\) parameters. Figure 3 reveals several important points: (1) the boxes are centred on zero, which indicates that the estimates are unbiased; (2) the estimates are also consistent, as shown by the tendency of the boxplots to compress around zero as the sample size grows; (3) there are almost no differences between the results of the SLM and of the SDM equations, the shape of the two boxplots being basically the same; (4) both algorithms produce anomalous estimates well outside the upper and lower whiskers, although this propensity decreases with sample size; (5) 400 observations can be taken as a threshold beyond which the risk of outliers falls significantly; and (6) the IV estimates are more affected by outliers, which are also larger than in the ML approach.
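Points (4) to (6) refer to estimates that fall beyond the whiskers. The sketch below shows one common way to flag them, assuming the usual convention that whiskers extend 1.5 interquartile ranges beyond the quartiles; the text does not state which convention the figures use, and the function name and the toy data are hypothetical.

```python
import numpy as np

def whisker_outliers(errors, k=1.5):
    """Return the estimation errors lying beyond the boxplot whiskers.

    Assumption: whiskers are placed k * IQR beyond the quartiles
    (k = 1.5 is the common default, not necessarily the paper's choice).
    """
    errors = np.asarray(errors, dtype=float)
    q1, q3 = np.percentile(errors, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return errors[(errors < lower) | (errors > upper)]

# Toy example: five small errors and one anomalous estimate
sample_errors = [-0.1, -0.05, 0.0, 0.05, 0.1, 3.0]
flagged = whisker_outliers(sample_errors)
print(flagged)
```

Counting the flagged values for each sample size and algorithm gives a simple numerical counterpart to the visual comparison of outliers in the boxplots.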
Figure 4 describes the estimation of the intercepts, i.e. the \(\beta _{1g}\) parameters; here the shape of the boxes changes considerably. The results point to a slight bias for small sample sizes (\(n=25,49\)), which disappears for large n. Moreover, the distribution of the ML estimation errors is asymmetric and skewed towards positive values. The bias is stronger in the SLM case. Another fact to note is the slow rate of convergence towards the true values: the estimation errors concentrate around zero only for large sample sizes, \(n=900\), for both algorithms. In sum, there is a high risk of obtaining anomalous estimates of the intercept terms of the equations for small and medium sample sizes.
Similar comments can be made with respect to the estimation of the \(\theta _{2g}\) parameters in Fig. 5. The distribution of the estimation errors is centred on zero, but the rate of convergence seems to be slow. Outliers are abundant for small and medium sample sizes, which considerably increases the risk of obtaining anomalous estimates.
Figure 6 summarizes the estimation of the spatial autocorrelation coefficients, \(\lambda _{1}\) and \(\lambda _{2}\). The shape of the boxplots confirms the consistency of both algorithms, but also the non-negligible risk of obtaining anomalous estimates even for large sample sizes. The boxes, in the ML case, remain slightly asymmetrical for small sample sizes, \(n=25\) and \(n=49\), and are not exactly centred on zero. This last result corroborates the tendency of the ML algorithm to underestimate the spatial dependence parameters, as noted previously, for example, by Pace and Barry (1997) or Barry and Pace (1999). The graphs corresponding to the IV estimates are blurred by the presence of outliers; nevertheless, these distributions appear to be centred on zero and, as is clear from the graphs, the dispersion of the estimates is very large.
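To make the behaviour of an IV estimate of the spatial lag coefficient more concrete, the following Python sketch simulates a single-equation spatial lag model on a regular lattice and estimates it by spatial two-stage least squares with instruments \(X\), \(WX\) and \(W^{2}X\), in the spirit of Kelejian and Prucha (1998). It is only an illustration under stated assumptions: it is a single equation rather than a SUR system, it does not use the spsur package, and the rook-contiguity weights, the parameter values (\(\lambda =0.5\), intercept 1, slope 2), the 200 iterations and all function names are hypothetical choices.

```python
import numpy as np

def rook_weights(side):
    """Row-standardized rook-contiguity matrix for a side x side lattice."""
    n = side * side
    W = np.zeros((n, n))
    for r in range(side):
        for c in range(side):
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < side and 0 <= cc < side:
                    W[r * side + c, rr * side + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def s2sls_slm(y, x, W):
    """2SLS for y = lam*W y + b0 + b1*x + e; returns [lam, b0, b1]."""
    n = len(y)
    X = np.column_stack([np.ones(n), x])
    Wx = W @ x
    H = np.column_stack([X, Wx, W @ Wx])        # instruments: X, WX, W^2 X
    Z = np.column_stack([W @ y, X])             # endogenous lag plus X
    Z_hat = H @ np.linalg.lstsq(H, Z, rcond=None)[0]  # first-stage fit
    return np.linalg.solve(Z_hat.T @ Z, Z_hat.T @ y)

def lambda_errors(side, lam=0.5, b0=1.0, b1=2.0, n_iter=200, seed=0):
    """(true - estimated) lambda over n_iter simulated samples."""
    rng = np.random.default_rng(seed)
    n = side * side
    W = rook_weights(side)
    A_inv = np.linalg.inv(np.eye(n) - lam * W)  # reduced-form multiplier
    errors = np.empty(n_iter)
    for i in range(n_iter):
        x = rng.normal(size=n)
        y = A_inv @ (b0 + b1 * x + rng.normal(size=n))
        errors[i] = lam - s2sls_slm(y, x, W)[0]
    return errors

def iqr(values):
    q1, q3 = np.percentile(values, [25, 75])
    return q3 - q1

errors_small = lambda_errors(side=5)    # n = 25
errors_large = lambda_errors(side=20)   # n = 400
print(iqr(errors_small), iqr(errors_large), np.median(errors_large))
```

Comparing the dispersion of `errors_small` and `errors_large` mimics, on a much smaller scale, the comparison across sample sizes made in the figure; an ML counterpart would additionally require evaluating the log-determinant of \(I-\lambda W\), which is the source of the computational burden mentioned in the abstract.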
Finally, Figs. 7 and 8 summarize the estimation of the terms in \(\varvec{\varSigma }\), that is, of \(\sigma _{1}^2\), \(\sigma _{2}^2\) and \(\sigma _{12}\). The graphs show a collection of distributions with a similar shape for the ML and IV algorithms, although the dispersion of the latter is larger. Overall, these distributions are centred on zero and confirm the consistency of the algorithms. However, there is a large risk of obtaining anomalous estimates for small and medium sample sizes, \(n=25,49,100\), which only decreases significantly for samples of \(n=400\) observations. The estimation errors are clearly skewed towards positive values.
Cite this article
López, F.A., Mínguez, R. & Mur, J. ML versus IV estimates of spatial SUR models: evidence from the case of Airbnb in Madrid urban area. Ann Reg Sci 64, 313–347 (2020). https://doi.org/10.1007/s00168-019-00914-1