
ML versus IV estimates of spatial SUR models: evidence from the case of Airbnb in Madrid urban area

Abstract

Recent decades have seen growing interest in seemingly unrelated regression (SUR) models in a spatial context, with compelling case studies in different fields. This upsurge has favoured the development of new and more efficient inference techniques. At present, the user has a basic toolkit for dealing with this kind of model, although it is still in need of improvement. This paper focuses on how to estimate spatial SUR models quickly and accurately. The most popular procedure is maximum likelihood (ML), which guarantees precision at the cost of a high computational burden, especially in cases of large sample size and strong spatial structure. We explore simpler estimation algorithms such as instrumental variables (IV), which expedite calculation at the cost of some loss in the quality of the estimates. We focus on the importance of sample size and on the trade-off between accuracy and speed. To that end, we perform a comprehensive simulation experiment in which we compare the ML and IV algorithms, looking for their strengths and weaknesses. The paper includes two applications to the case of Airbnb in the urban area of Madrid. First, we estimate a spatial SUR hedonic model of accommodation prices, using micro-data for three different cross sections à la Anselin, that is, considering temporal correlation plus spatial structure. Then, the data are aggregated by neighbourhoods, and we specify a spatial SUR model with two equations (apartments and rooms), also using three cross sections. In both cases, the models are estimated by ML and IV using the spsur R package (Angulo et al. 2019), with the aim of illustrating its capabilities.



Notes

  1. If we allow this matrix to change across equations and/or time periods, the results remain essentially the same.

  2. To save space, we consider only the case of the SUR–SLM models. The results for the SUR–SDM models are available upon request from the authors.

  3. Characterized as having hosted at least 10 trips, maintained a response rate of \(90\%\) or higher, received a 5-star review at least \(80\%\) of the times they were reviewed, and completed each of their confirmed reservations without cancelling.

References

  • Angulo A, Lopez FA, Minguez R, Mur J (2019) spSUR: spatial seemingly unrelated regression models. R package version 1.0.0.3

  • Anselin L (1988a) Spatial econometrics: methods and models. Studies in operational regional science. Kluwer Academic Publishers, Dordrecht


  • Anselin L (1988b) A test for spatial autocorrelation in seemingly unrelated regressions. Econ Lett 28(4):335–341


  • Anselin L (1995) Spacestat, a software program for the analysis of spatial data

  • Anselin L (2016) Estimation and testing in the spatial seemingly unrelated regression (SUR). Technical report, GeoDa Center for Geospatial Analysis and Computation, Arizona State University. Working Paper 2016-01

  • Arbia G (1989) Spatial data configuration in statistical analysis of regional economic and related problems. Kluwer Academic Publisher, Dordrecht


  • Arora SS, Brown M (1977) Alternative approaches to spatial autocorrelation: an improvement over current practice. Int Reg Sci Rev 2:67–78


  • Baltagi BH, Bresson G (2011) Maximum likelihood estimation and Lagrange multiplier tests for panel seemingly unrelated regressions with spatial lag and spatial errors: an application to hedonic housing prices in Paris. J Urban Econ 69(1):24–42

  • Baltagi BH, Pirotte A (2011) Seemingly unrelated regressions with spatial error components. Empir Econ 40(1):5–49


  • Barry RP, Pace RK (1999) Monte Carlo estimates of the log determinant of large sparse matrices. Linear Algebra Appl 289:41–54

  • Bech M, Hansen M, Lauridsen J, Kronborg C (2012) Can the municipalities prevent medication of mental diseases? J Ment Health Policy Econ 15(2):53–60


  • Benítez-Aurioles B (2018a) The role of distance in the peer-to-peer market for tourist accommodation. Tour Econ 24(3):237–250


  • Benítez-Aurioles B (2018b) Why are flexible booking policies priced negatively? Tour Manag 67:312–325


  • Cotteleer G, van Kooten GC (2012) Expert opinion versus actual transaction evidence in the valuation of non-market amenities. Econ Model 29(1):32–40


  • Davidson R, MacKinnon JG (1993) Estimation and inference in econometrics. Oxford University Press, Oxford


  • Durbin J (1954) Errors in variables. Revue de l’institut International de Statistique 23–32

  • Edelman BG, Luca M (2014) Digital discrimination: the case of Airbnb.com

  • Egger P, Pfaffermayr M (2004) Distance, trade and FDI: a Hausman–Taylor SUR approach. J Appl Econ 19(2):227–246


  • Elhorst P (2014) Spatial econometrics from cross-sectional data to spatial panels. Springer, Berlin


  • Fang B, Ye Q, Law R (2016) Effect of sharing economy on tourism industry employment. Ann Tour Res 57:264–267


  • Fiebig DG (2007) Seemingly unrelated regression. In: Baltagi B (ed) A companion to theoretical econometrics, vol 5. Wiley-Blackwell, Hoboken, pp 101–121


  • Fingleton B (2007) Multi-equation spatial econometric model, with application to EU manufacturing productivity growth. J Geogr Syst 9:119–144

  • Gunter U (2018) What makes an Airbnb host a superhost? Empirical evidence from San Francisco and the Bay Area. Tour Manag 66:26–37


  • Gurran N, Phibbs P (2017) When tourists move in: How should urban planners respond to Airbnb? J Am Plan Assoc 83:80–92


  • Gutiérrez J, García-Palomares JC, Romanillos G, Salas-Olmedo MH (2017) The eruption of Airbnb in tourist cities: comparing spatial patterns of hotels and peer-to-peer accommodation in Barcelona. Tour Manag 62:278–291

  • Hordijk L, Nijkamp P (1977) Dynamic models of spatial autocorrelation. Environ Plan A 9(5):505–519


  • Hung W-T, Shang J-K, Wang F-C (2010) Pricing determinants in the hotel industry: quantile regression analysis. Int J Hosp Manag 29:378–384


  • Izón GM, Hand MS, McCollum DW, Thacher JA, Berrens RP (2016) Proximity to natural amenities: a seemingly unrelated hedonic regression model with spatial Durbin and spatial error processes. Growth Change 47(4):461–480

  • Johnston J (1984) Econometric methods, 3rd edn

  • Kakamu K, Polasek W, Wago H (2012) Production technology and agglomeration for Japanese prefectures during 1991–2000. Pap Reg Sci 91(1):29–41


  • Kakar V, Voelz J, Wu J, Franco J (2018) The visible host: Does race guide Airbnb rental rates in San Francisco? J Hous Econ 40:25–40


  • Kapoor M, Kelejian HH, Prucha IR (2007) Panel data models with spatially correlated error components. J Econ 140(1):97–130


  • Kelejian HH, Prucha IR (1998) A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. J Real Estate Finance Econ 17(1):99–121


  • Kelejian HH, Prucha IR (1999) A generalized moments estimator for the autoregressive parameter in a spatial model. Int Econ Rev 40:509–533


  • Kelejian HH, Prucha IR (2004) Estimation of simultaneous systems of spatially interrelated cross sectional equations. J Econ 118(1–2):27–50


  • Kelejian HH, Prucha IR, Yuzefovich Y (2004) Instrumental variable estimation of a spatial autoregressive model with autoregressive disturbances: large and small sample results. In: LeSage J, Pace K (eds) Spatial and spatiotemporal econometrics. Emerald Group Publishing Limited, Bingley, pp 163–198


  • Kennedy P (2003) A guide to econometrics. MIT Press

  • Lauridsen J, Bech M, López F, Sánchez MM (2010) A spatiotemporal analysis of public pharmaceutical expenditure. Ann Reg Sci 44(2):299–314


  • Le Gallo J, Chasco C (2008) Spatial analysis of urban growth in Spain, 1900–2001. Empir Econ 34(1):59–80


  • Le Gallo J, Dall'Erba S (2006) Evaluating the temporal and spatial heterogeneity of the European convergence process, 1980–1999. J Reg Sci 46(2):269–288

  • Lee L-F (2003) Best spatial two-stage least squares estimators for a spatial autoregressive model with autoregressive disturbances. Econ Rev 22(3):307–335


  • Lee L-F (2004) Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica 72(6):1899–1925


  • Lee L-F (2007) GMM and 2SLS estimation of mixed regressive, spatial autoregressive models. J Econ 137(2):489–514


  • Li J, Moreno A, Zhang DJ (2015) Agent behavior in the sharing economy: evidence from Airbnb. Ross School of Business Working Paper Series, Working paper no 1298, 2015

  • López FA, Mur J, Angulo A (2014) Spatial model selection strategies in a SUR framework. The case of regional productivity in EU. Ann Reg Sci 53(1):197–220


  • López FA, Martínez-Ortiz PJ, Cegarra-Navarro J-G (2017) Spatial spillovers in public expenditure on a municipal level in Spain. Ann Reg Sci 58(1):39–65


  • Lundberg J (2006) Spatial interaction model of spillovers from locally provided public services. Reg Stud 40(6):631–644


  • Malinvaud E (1970) Statistical methods of econometrics. North Holland, Amsterdam


  • Mínguez R, López F, Mur J (2018) An R package for specification, estimation and testing of spatial and spatio-temporal SUR econometric model. R package version 1.0.0.3. https://CRAN.R-project.org/package=spsur

  • Moscone F, Tosetti E, Knapp M (2007) SUR model with spatial effects: an application to mental health expenditure. Health Econ 16(12):1403–1408


  • Mur J, López F, Herrera M (2010) Testing for spatial effects in seemingly unrelated regressions. Spatial Econ Anal 5:399–440


  • Pace RK, Barry R (1997) Quick computation of spatial autoregressive estimators. Geogr Anal 29(3):232–247


  • Rey SJ, Montouri BD (1999) US regional income convergence: a spatial econometric perspective. Reg Stud 33(2):143–156


  • Sargan JD (1958) The estimation of economic relationships using instrumental variables. Econometrica 26(3):393–415

  • Smirnov O, Anselin L (2001) Fast maximum likelihood estimation of very large spatial autoregressive models: a characteristic polynomial approach. Comput Stat Data Anal 35:301–319


  • Smith TE (2009) Estimation bias in spatial models with strongly connected weight matrices. Geogr Anal 41(3):307–332


  • Spanos A (1986) Statistical foundations of econometric modelling. Cambridge University Press, Cambridge


  • Teubner T, Hawlitschek F, Dann D (2017) Price determinants on Airbnb: how reputation pays off in the sharing economy. J Self-Gov Manag Econ 5:53–80


  • Theil H (1971) Principles of econometrics. Wiley, Hoboken


  • Wang D, Nicolau J (2017) Price determinants of sharing economy based accommodation rental: a study of listings from 33 cities on Airbnb.com. Int J Hosp Manag 62:120–131


  • Wang X, Kockelman KM (2007) Specification and estimation of a spatially and temporally autocorrelated seemingly unrelated regression model: application to crash rates in China. Transportation 34(3):281–300

  • White EN, Hewings GJ (1982) Space–time employment modeling: some results using seemingly unrelated regressions estimators. J Reg Sci 22(3):283–302


  • Wooldridge JM (2010) Econometric analysis of cross section and panel data, 2nd edn. Massachusetts Institute of Technology Press, Cambridge

  • Zellner A (1962) An efficient method of estimating seemingly unrelated regressions and test of aggregation bias. J Am Stat Assoc 57:500–509


  • Zervas G, Proserpio D, Byers J (2017) The rise of the sharing economy: estimating the impact of Airbnb on the hotel industry. J Mark Res 54(5):687–705

  • Zhang L, Chen L, Wu Z, Xue H, Dong W (2018) Key factors affecting informed consumers’ willingness to pay for green housing: a case study of Jinan, China. Sustainability 10:1711


  • Zhou BB, Kockelman KM (2009) Predicting the distribution of households and employment: a seemingly unrelated regression model with two spatial processes. J Transp Geogr 17(5):369–376



Acknowledgements

This work has been partially funded by the Spanish Ministry of Economy and Competitiveness Grants MTM2014-52184, ECO2015-65826-P and ECO2015-65758-P and Seneca Foundation Grant 19884/GERM/15. Jesus Mur is grateful for the financial support of the Department of Industry and Innovation of Aragón Government and the European Regional Development Fund, through the GAEC group, project number S37_17R.

Author information

Corresponding author

Correspondence to Fernando A. López.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Additional results on the Monte Carlo


Figures 3–8 add details to Sect. 3. Here, we report the estimation bias for the parameters of the experiment, using classical boxplots. For reasons of space, we distinguish only between types of equation, SLM or SDM. The data in the graphs are the differences between the true and the estimated value of the corresponding parameter in each iteration, so each boxplot should, at the very least, be centred on zero. In addition, the IV algorithm has been simulated for a sample size of \(n=5000\) to check its consistency. The results, available upon request, confirm the slow rate of convergence of these estimates.
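
The bias measure described above (true value minus estimate, collected over iterations) can be sketched as follows. This is only an illustration under strong assumptions, not the design of Sect. 3: it uses a single SLM equation rather than a SUR system, a hypothetical row-standardised ring lattice for the weighting matrix, arbitrary parameter values, and a textbook 2SLS estimator with instruments X, WX and W²X. The function names `ring_weights`, `iv_slm` and `bias_draws` are hypothetical.

```python
import numpy as np

def ring_weights(n):
    """Row-standardised ring lattice: every unit has two neighbours (an assumed W)."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 0.5
    W[idx, (idx + 1) % n] = 0.5
    return W

def iv_slm(y, X, W):
    """2SLS for y = lam*Wy + X b + u, instrumenting Wy with WX and W^2 X."""
    x = X[:, 1:]                        # drop the constant: W @ 1 = 1 adds no instrument
    Wx = W @ x
    Z = np.column_stack([W @ y, X])     # regressors: Wy, intercept, slope variable
    H = np.column_stack([X, Wx, W @ Wx])                # instrument set
    Z_hat = H @ np.linalg.lstsq(H, Z, rcond=None)[0]    # first-stage fitted values
    return np.linalg.solve(Z_hat.T @ Z, Z_hat.T @ y)    # [lam, b1, b2]

def bias_draws(n, reps=200, lam=0.5, beta=(1.0, 2.0), seed=0):
    """Return a (reps, 3) array of true-minus-estimated values for [lam, b1, b2]."""
    rng = np.random.default_rng(seed)
    W = ring_weights(n)
    A_inv = np.linalg.inv(np.eye(n) - lam * W)   # reduced-form multiplier
    true = np.array([lam, *beta])
    draws = np.empty((reps, 3))
    for r in range(reps):
        X = np.column_stack([np.ones(n), rng.normal(size=n)])
        y = A_inv @ (X @ np.array(beta) + rng.normal(size=n))
        draws[r] = true - iv_slm(y, X, W)
    return draws

# Summarise the bias of the spatial parameter as a boxplot would: median and IQR.
for n in (25, 100, 400):
    q1, med, q3 = np.percentile(bias_draws(n)[:, 0], [25, 50, 75])
    print(f"n={n:4d}  median bias(lambda)={med:+.3f}  IQR={q3 - q1:.3f}")
```

Under these assumptions, a consistent estimator should show a box that stays near zero and narrows as n grows, which is the pattern the boxplots are read for.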

This is the case for the boxplots of the estimates of the slope coefficients, i.e. the \(\beta _{2g}\) parameters. Figure 3 reveals several important points: (1) the boxes are centred around a value of zero, which confirms that the estimates are unbiased, (2) the estimates are also consistent, which is clear from the tendency of the boxplots to compress around zero, (3) there are almost no differences between the results of the SLM and of the SDM equations; the shape of the two boxplots is basically the same, (4) both algorithms produce anomalous results that lie well outside the upper and lower whiskers; however, this propensity decreases with sample size, (5) 400 observations can be taken as a threshold beyond which the risk of outliers is significantly reduced, and (6) the IV estimates appear to be more affected by outliers, which are larger than in the ML approach.
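
The results "well outside the whiskers" can be counted explicitly. The sketch below assumes the classical Tukey convention, with whiskers at 1.5 times the interquartile range; the whisker rule used for the figures is not stated here, and `whisker_outliers` and the sample errors are hypothetical.

```python
import numpy as np

def whisker_outliers(errors, k=1.5):
    """Return the estimation errors lying beyond the Tukey fences (assumed rule)."""
    errors = np.asarray(errors, dtype=float)
    q1, q3 = np.percentile(errors, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr   # whisker limits
    return errors[(errors < lower) | (errors > upper)]

# Hypothetical estimation errors (true minus estimated) from six iterations.
errors = [-1.0, -0.5, 0.0, 0.5, 1.0, 10.0]
print(whisker_outliers(errors))   # only the 10.0 draw falls outside the fences
```

Tracking the share of such draws by sample size gives a numerical counterpart to points (4) to (6).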

Figure 4 describes the estimation of the intercepts, i.e. the \(\beta _{1g}\) parameters; here the shape of the boxes changes markedly. The results point to a slight bias for small sample sizes (\(n=25,49\)), which disappears for large n. Moreover, the distribution of the ML estimation errors is asymmetric and skewed towards positive values. The bias is stronger in the SLM case. Another fact to note is the slow rate of convergence towards the true values: for both algorithms, the bias concentrates around zero only for large sample sizes, \(n=900\). In sum, there is a high risk of obtaining anomalous estimates of the intercept terms of the equations for small and medium sample sizes.

Similar comments can be made with respect to the estimation of the \(\theta _{2g}\) parameters in Fig. 5. The distribution of the estimation errors is centred around a value of zero, but the rate of convergence seems to be slow. Outliers are abundant for small and medium sample sizes, which considerably increases the risk of obtaining anomalous estimates.

Figure 6 summarizes the estimation of the spatial autocorrelation coefficients, \(\lambda _{1}\) and \(\lambda _{2}\). The shape of the boxplots confirms the consistency of both algorithms, but also the non-negligible risk of obtaining anomalous estimates even for large sample sizes. In the ML case, the boxes remain slightly asymmetrical for small sample sizes, \(n=25\) and \(n=49\), and are not exactly centred on zero. This last result corroborates the tendency of the ML algorithm to underestimate the spatial dependence parameters, as noted previously, for example, by Pace and Barry (1997) or Barry and Pace (1999). The graphs corresponding to the IV estimates are blurred by the presence of outliers; nevertheless, these distributions appear to be centred around zero and, as is clear in the graphs, the dispersion of the estimates is very large.
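
To make concrete how an ML estimate of a spatial autocorrelation coefficient is obtained, the sketch below maximises the standard concentrated log-likelihood of a single-equation SLM over a grid. It is a simplification under stated assumptions, not the SUR algorithm of the spsur package: one equation only, a hypothetical symmetric ring-lattice W (so its eigenvalues are real), arbitrary parameter values, and a plain grid search. The log-determinant term is the part whose cost grows quickly with n.

```python
import numpy as np

def ring_weights(n):
    """Row-standardised ring lattice with two neighbours per unit (an assumed W)."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 0.5
    W[idx, (idx + 1) % n] = 0.5
    return W

def concentrated_loglik(lam, y, Wy, X, eigvals):
    """Log-likelihood of the SLM with beta and sigma^2 concentrated out (constants dropped)."""
    n = y.size
    e = y - lam * Wy
    resid = e - X @ np.linalg.lstsq(X, e, rcond=None)[0]   # OLS residuals of e on X
    sigma2 = resid @ resid / n
    log_det = np.sum(np.log(1.0 - lam * eigvals))          # ln|I - lam W|
    return log_det - 0.5 * n * np.log(sigma2)

def ml_lambda(y, X, W, grid=None):
    """Grid-search ML estimate of lambda (illustrative; not an optimised routine)."""
    grid = np.linspace(-0.95, 0.95, 381) if grid is None else grid
    eigvals = np.linalg.eigvalsh(W)    # valid because this W is symmetric
    Wy = W @ y
    loglik = [concentrated_loglik(l, y, Wy, X, eigvals) for l in grid]
    return float(grid[int(np.argmax(loglik))])

rng = np.random.default_rng(1)
n, lam_true = 200, 0.5
W = ring_weights(n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = np.linalg.solve(np.eye(n) - lam_true * W,
                    X @ np.array([1.0, 2.0]) + rng.normal(size=n))
lam_hat = ml_lambda(y, X, W)
print(f"true lambda = {lam_true}, ML estimate = {lam_hat:.3f}")
```

Repeating this over many simulated samples and storing `lam_true - lam_hat` would give the kind of error distribution summarised in the boxplots.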

Finally, Figs. 7 and 8 summarize the estimation of the terms in \(\varvec{\varSigma }\), that is, of \(\sigma _{1}^2\), \(\sigma _{2}^2\) and \(\sigma _{12}\). The graphs show a collection of distributions with a similar shape for the ML and IV algorithms, although the dispersion of the latter is larger. Overall, these distributions are centred on zero and confirm the consistency of the algorithms. However, there is a large risk of obtaining anomalous estimates for small and medium sample sizes, \(n=25,49,100\), which only decreases significantly for samples of \(n=400\) observations. The estimation bias is clearly skewed towards positive values.
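
For a two-equation system, the three terms of \(\varvec{\varSigma }\) can be recovered from the equation residuals. The sketch below assumes the usual SUR moment estimator, the cross-product of the residual matrix divided by n, with no degrees-of-freedom correction; whether the algorithms compared here apply a correction is not stated, and both `sur_covariance` and the residual values are hypothetical.

```python
import numpy as np

def sur_covariance(residuals):
    """Estimate Sigma as E'E / n, where column g of E holds the residuals of equation g."""
    E = np.asarray(residuals, dtype=float)
    n = E.shape[0]
    return E.T @ E / n

# Hypothetical residuals for n = 4 observations and G = 2 equations.
E = [[1.0, 2.0],
     [-1.0, -2.0],
     [2.0, 0.0],
     [-2.0, 0.0]]
sigma_hat = sur_covariance(E)
print(sigma_hat)   # diagonal: sigma_1^2, sigma_2^2; off-diagonal: sigma_12
```

The estimation error for each term is then the true value minus the corresponding entry of `sigma_hat`.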

Fig. 3 Estimation bias. Slope coefficients

Fig. 4 Estimation bias. Intercepts

Fig. 5 Estimation bias. Theta terms

Fig. 6 Estimation bias. Lambda terms

Fig. 7 Estimation bias. Sigmas

Fig. 8 Estimation bias. Sigmas


Cite this article

López, F.A., Mínguez, R. & Mur, J. ML versus IV estimates of spatial SUR models: evidence from the case of Airbnb in Madrid urban area. Ann Reg Sci 64, 313–347 (2020). https://doi.org/10.1007/s00168-019-00914-1


  • DOI: https://doi.org/10.1007/s00168-019-00914-1

JEL Classification

  • C4
  • C5
  • R1