Introduction

Access to public services plays a crucial role in shaping people's lives. Social infrastructures (SIs), such as libraries, schools, parks, and hospitals, serve as essential foundations of public services1,2,3,4. Transportation accessibility to SIs is a fundamental pillar of sustainable urban development and social equity, profoundly influencing the quality of life and opportunities available to communities5,6. This accessibility not only drives urban growth but is also critical to addressing the multifaceted challenges faced by cities experiencing population decline and disinvestment.

Despite the imperative for equitable distribution of the essential services and amenities that SIs provide, inequities in access persist, as documented in previous studies7. In South Korea, for instance, urbanization has produced both concentrated expansion and localized decline, so understanding and rectifying spatial disparities in accessibility to essential SIs is crucial for fostering resilient, inclusive communities amid ongoing urban transformations.

Thus, we aimed to examine transportation accessibility to 12 SIs in four critical categories: (1) green and recreation spaces, (2) health and aged care facilities, (3) educational institutions, and (4) justice and emergency services. We employed a multi-dimensional methodological approach (Fig. 1) that combined descriptive and bivariate analyses, including quantile classification, the Gini index with the Lorenz curve, correlation analysis, and bivariate choropleth maps. Additionally, exploratory factor analysis (EFA) and K-means clustering were employed to identify spatial stratification and determine areas with inadequate accessibility to SIs. Furthermore, a multinomial logit (MNL) model was used to investigate the factors associated with the spatial inequalities, shedding light on the underlying determinants of accessibility disparities.

Figure 1

Framework of the multi-dimensional approach used in this study.

The significance of this study extends beyond academic inquiry, aiming to inform evidence-based policymaking and urban planning practices tailored to the realities of shrinking cities in South Korea. By pinpointing areas with deficient accessibility to SIs and proposing targeted interventions, our endeavor promotes resilience, equity, and inclusivity within evolving urban landscapes of South Korea. By addressing research gaps identified in previous literature through the lens of transportation equity theory, our research provides valuable insights into the complex dynamics of transportation accessibility and spatial inequalities. Also, our comprehensive approach advances the field by offering a holistic understanding of SI planning and development, contributing to an in-depth understanding of social equity and urban sustainability.

Background

Theoretical background

Theoretical perspectives on urban planning and transportation equity underscore the importance of equitable access to public services, positing that transportation accessibility to SIs is fundamental to sustainable urban development and social equity8,9,10. Such accessibility profoundly influences the quality of life and opportunities available to communities, driving urban growth while addressing the multifaceted challenges faced by cities experiencing population decline and disinvestment. Accordingly, Willberg et al.11 proposed a conceptual framework for integrating environmental and social sustainability objectives into accessibility research. Despite the theoretical imperative for equitable distribution of the essential services that SIs provide, inequities in access persist.

Literature review

A significant body of literature has explored spatial inequalities in transportation accessibility to SIs. One line of research has focused on developing accessibility measures. For instance, Ling et al.12 introduced a disaggregated excess-commuting framework and found that secondary sectors and lower-skilled industries exhibit greater commuting efficiency than their high-skilled counterparts. Similarly, Chen et al.13 advanced gravity-based models to assess park accessibility, incorporating different assumptions about attractiveness factors and travel impedance.

More importantly, previous studies have documented extensive disparities, highlighting significant spatial heterogeneity in accessibility metrics14. For instance, analyses have shown a consistent decline in accessibility to SIs along the gradient from older to newer urban areas15 and from well-developed areas to slums16. Additionally, rural areas have faced prominent accessibility challenges and attendant inequalities17.

Despite a significant body of literature, existing studies often focus on certain types of SIs or specific geographic areas, providing fragmented insights into accessibility disparities. Moreover, while some research has explored the socio-economic and demographic factors influencing accessibility, there is a lack of comprehensive analyses incorporating a wide range of factors associated with spatial inequalities. Thus, there is a need for a holistic understanding of transportation accessibility to SIs and the underlying determinants of spatial disparities.

Contribution of this study

Our research builds on this body of work by addressing the identified gaps and advancing the field through a comprehensive, multi-dimensional analysis of transportation accessibility to 12 SIs in four critical categories: green and recreation spaces, health and aged care facilities, educational institutions, and justice and emergency services. By conducting a nationwide spatial analysis, we seek to provide a comprehensive assessment of accessibility patterns across South Korea.

Materials and methods

Research questions

We aimed to answer the following research questions.

  • RQ1: What spatial variances exist in the distribution of accessibility to SIs across diverse regions of South Korea?

  • RQ2: Do correlations between accessibility metrics for different categories of SIs signify prevalent spatial disparities?

  • RQ3: Can we categorize spatial units into discrete clusters based on their accessibility profiles to a comprehensive set of SIs?

  • RQ4: What distinguishing characteristics define each cluster, and how do they diverge concerning accessibility to SIs?

  • RQ5: Which factors, such as population density and housing prices, contribute to the observed spatial stratification in accessibility to SIs?

Methods and data

Overview

We adopted a multi-dimensional methodological approach to address our research questions, as outlined in Table 1. First, we used descriptive and bivariate analysis techniques, including quantile classification, the Gini index, correlation analysis, and bivariate choropleth maps. Second, we employed EFA followed by K-means clustering. Third, we estimated an MNL model. This approach enabled us to gain a holistic understanding of spatial inequalities in transportation accessibility to SIs and their underlying determinants. The following subsections explain each method in detail.

Table 1 Overview of the multi-dimensional approaches used in this study.

Methodological approach 1

Methodological approach 1 encompassed four techniques: (1) quantile classification, (2) the Gini index and Lorenz curve, (3) correlation analysis, and (4) bivariate choropleth maps. First, quantile classification divided the Eup/Myeon/Dongs (EMDs), the finest administrative units in South Korea, into ten quantile groups (deciles), allowing us to examine the distribution of each transportation accessibility variable listed in Table 2. Second, the Gini index with the Lorenz curve served as a measure of spatial inequality18, quantifying the degree of inequality in transportation accessibility to SIs across EMDs. Third, correlation analysis was used to identify correlations between accessibility to different types of SIs, facilitating the exploration of spatial inequalities. Lastly, bivariate choropleth maps were employed to visualize the joint distribution of pairs of accessibility variables across EMDs.
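The decile grouping and the Gini index described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the workflow used to produce the figures in this study; the function names and the `distances_km` values are hypothetical, and ties are broken by sort order.

```python
from typing import List, Sequence


def decile_groups(values: Sequence[float]) -> List[int]:
    """Assign each value to one of ten equal-count groups (1 = smallest values)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    groups = [0] * n
    for rank, idx in enumerate(order):
        groups[idx] = rank * 10 // n + 1
    return groups


def lorenz_curve(values: Sequence[float]) -> List[float]:
    """Cumulative share of the total held by the lowest k units, for k = 0..n."""
    ordered = sorted(values)
    total = sum(ordered)
    shares = [0.0]
    running = 0.0
    for v in ordered:
        running += v
        shares.append(running / total)
    return shares


def gini_index(values: Sequence[float]) -> float:
    """Gini index as 1 minus twice the trapezoidal area under the Lorenz curve."""
    shares = lorenz_curve(values)
    n = len(values)
    area = sum((shares[k] + shares[k + 1]) / 2 for k in range(n)) / n
    return 1.0 - 2.0 * area


# Hypothetical network distances (km) for ten spatial units.
distances_km = [0.8, 1.2, 1.5, 2.1, 3.4, 5.0, 7.9, 12.3, 18.6, 26.5]
print(decile_groups(distances_km))
print(round(gini_index(distances_km), 3))
```

A value of 0 corresponds to identical distances in every unit, and the index approaches 1 as the total distance burden concentrates in a few units.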

Table 2 Description and descriptive statistics of transportation accessibility variables.

Data on transportation accessibility to SIs were obtained from the National Geographic Information Institute (NGII) and are publicly available. Accessibility was measured as the network distance between two points, a measure widely used in previous studies19,20. The set of SIs analyzed in this study included 12 variables categorized into four groups. The dataset was compiled at the EMD level and represents the year 2022. Table 2 presents detailed descriptions and descriptive statistics of the 12 accessibility variables.

Methodological approach 2

Methodological approach 2 involved two steps: (1) EFA and (2) K-means clustering, as illustrated in Fig. 2. EFA was first employed to reduce the dimensionality of the 12 transportation accessibility variables, and K-means clustering was then used to classify EMDs into homogeneous clusters based on the latent factors identified by EFA. This facilitated the identification of distinct groups of EMDs characterized by varying levels of accessibility.

Figure 2

Conceptual Diagram of the Method of Exploratory Factor Analysis and K-Means Cluster Analysis (The maps in this figure were generated using a publicly available package in RStudio 2023.03.0+386 called “leaflet” and modified using Microsoft PowerPoint).

In methodological approach 2, we first conducted EFA on the 12 transportation accessibility variables listed in Table 2. This step reduced the number of dimensions, making it easier to identify distinct clusters with varying levels of transportation accessibility to SIs21,22. The scree plot of eigenvalues indicated that either three or four factors were suitable (see Supplementary Fig. 1). Based on our subjective interpretation of the factor loadings, we concluded that three factors better captured the underlying latent constructs. Supplementary Table 1 displays the varimax-rotated loadings for each factor, with shaded areas highlighting the largest loading for each variable. Supplementary Fig. 2 displays the spatial distribution of the three factors, revealing patterns similar to those observed in the 12 transportation accessibility variables.
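For reference, the common factor model and the varimax criterion behind this step can be written as follows. This is the textbook formulation rather than a statement of the exact estimator or software settings used here. The symbols are introduced for illustration: $\mathbf{x}$ is the vector of $p = 12$ standardized accessibility variables, $\mathbf{f}$ the vector of $m = 3$ uncorrelated unit-variance factors, $\boldsymbol{\Lambda} = (\lambda_{ij})$ the $p \times m$ loading matrix, $\boldsymbol{\varepsilon}$ the unique errors with diagonal covariance $\boldsymbol{\Psi}$, and $V$ the raw varimax objective that the rotation maximizes.

```latex
\mathbf{x} = \boldsymbol{\Lambda}\,\mathbf{f} + \boldsymbol{\varepsilon},
\qquad
\operatorname{Cov}(\mathbf{x}) = \boldsymbol{\Lambda}\boldsymbol{\Lambda}^{\top} + \boldsymbol{\Psi}

V(\boldsymbol{\Lambda}) = \sum_{j=1}^{m}\left[\frac{1}{p}\sum_{i=1}^{p}\lambda_{ij}^{4}
  - \left(\frac{1}{p}\sum_{i=1}^{p}\lambda_{ij}^{2}\right)^{2}\right]
```

Maximizing $V$ pushes each factor's squared loadings toward either large or near-zero values, which is why each variable tends to show one dominant loading after rotation.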

Following EFA, we employed K-means clustering23 to categorize EMDs into clusters based on the three factors identified in the EFA. The algorithm assigns each EMD to one of a predetermined number of clusters by minimizing the distance between the EMD and its cluster centroid. Determining the number of clusters is vital because it substantially affects the results. To identify the optimal number, we conducted silhouette analysis, which indicated that three clusters were optimal for our dataset (see Supplementary Fig. 3).
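The assignment step and the silhouette check described above can be sketched as follows. This is a minimal plain-Python illustration of Lloyd's algorithm and the mean silhouette width under stated assumptions, not the implementation used in this study: the `scores` values are hypothetical factor scores, and the initial centroids are picked deterministically from the data rather than at random.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def kmeans(points: Sequence[Point], init_centroids: Sequence[Point],
           max_iter: int = 100) -> Tuple[List[int], List[Point]]:
    """Lloyd's algorithm: assign to the nearest centroid, recompute means, repeat."""
    centroids = [tuple(c) for c in init_centroids]
    labels: List[int] = []
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster empties
                centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


def silhouette_score(points: Sequence[Point], labels: Sequence[int]) -> float:
    """Mean silhouette width; values near 1 indicate well-separated clusters."""
    clusters = sorted(set(labels))
    widths = []
    for i, p in enumerate(points):
        mean_dist = {}
        for k in clusters:
            others = [q for j, q in enumerate(points) if labels[j] == k and j != i]
            mean_dist[k] = sum(math.dist(p, q) for q in others) / len(others) if others else None
        a = mean_dist[labels[i]]
        if a is None:  # singleton cluster: silhouette is defined as 0
            widths.append(0.0)
            continue
        b = min(d for k, d in mean_dist.items() if k != labels[i] and d is not None)
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(widths) / len(widths)


# Hypothetical three-factor scores for nine spatial units.
scores = [(-1.0, -0.9, -1.1), (-1.2, -1.0, -0.8), (-0.9, -1.1, -1.0),
          (0.0, 0.1, -0.1), (0.2, -0.1, 0.0), (-0.1, 0.0, 0.1),
          (1.5, 1.4, 1.6), (1.6, 1.5, 1.4), (1.4, 1.6, 1.5)]
for k in (2, 3, 4):
    init = [scores[i * len(scores) // k] for i in range(k)]
    labels, _ = kmeans(scores, init)
    print(k, round(silhouette_score(scores, labels), 3))
```

The number of clusters with the largest mean silhouette width is the one a silhouette analysis would favor. In practice, several random initializations are usually compared because a single run can settle on a local optimum.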

We chose EFA and K-Means clustering for the following reasons. First, EFA is a robust technique for reducing the dimensionality of a dataset, which is particularly beneficial when dealing with multiple variables, as in our study with 12 accessibility variables. By identifying latent factors, EFA simplifies the dataset while preserving its essential characteristics, facilitating easier interpretation and analysis. Second, K-Means clustering is an effective method for classifying data into distinct groups based on similarity. By utilizing the latent factors identified through EFA, we ensured that the clusters formed were homogeneous in terms of their characteristics. Third, both EFA and K-Means clustering are widely used and well-understood methods in data analysis. Their flexibility allows for easy adjustment of parameters (such as the number of factors or clusters), and the results are straightforward to interpret and visualize, making them suitable for complex spatial analysis. Overall, the combination of EFA and K-Means clustering provided an effective methodological framework for analyzing spatial inequalities.

Methodological approach 3

Methodological approach 3 used an MNL model, a member of the discrete choice modeling family24, to analyze the relationships between the clusters identified in methodological approach 2 and various factors across five sectors: (1) urban density, (2) urban design, (3) urban dynamics, (4) travel behavior, and (5) economics (see Supplementary Table 2). We selected MNL for two reasons. First, MNL is appropriate when the dependent variable has more than two discrete categories. In our study, we categorized EMDs into High, Middle, and Low clusters based on transportation accessibility to SIs, and MNL allowed us to model these categorical outcomes effectively. Second, MNL enabled us to examine how various independent variables contribute to the probability of an EMD belonging to a particular accessibility cluster. This helped in understanding spatial inequalities in accessibility to SIs across different regions of South Korea, since the estimated coefficients offer insights into the significance, direction, and magnitude of each relationship25.
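To make the link between coefficients and cluster membership concrete, the sketch below computes MNL probabilities with the High cluster as the reference category (utility fixed at zero). It is illustrative only: the function name, the two covariates, and every coefficient in `example_coefs` are hypothetical and are not the estimates reported in Table 4.

```python
import math
from typing import Dict, Sequence


def mnl_probabilities(x: Sequence[float],
                      coefs: Dict[str, Sequence[float]],
                      base: str = "High") -> Dict[str, float]:
    """Cluster membership probabilities under a multinomial logit model.

    coefs maps each non-base cluster to [intercept, beta_1, ..., beta_p];
    the base cluster's utility is fixed at zero.
    """
    utilities = {base: 0.0}
    for cluster, beta in coefs.items():
        utilities[cluster] = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    denom = sum(math.exp(u) for u in utilities.values())
    return {cluster: math.exp(u) / denom for cluster, u in utilities.items()}


# Hypothetical coefficients: [intercept, population density, housing price].
example_coefs = {"Middle": [-1.0, -0.3, -0.2], "Low": [-1.5, -0.5, -0.3]}
probs = mnl_probabilities([0.0, 0.0], example_coefs)
print({cluster: round(p, 3) for cluster, p in probs.items()})
```

With negative coefficients for the Middle and Low clusters, raising a covariate shifts probability toward the reference (High) cluster, which is how the sign of each estimate is read.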

The independent variables for the MNL model included, among others, population density, employment density, housing price, and total Average Annual Daily Traffic (AADT) at the EMD level. These variables were chosen based on their relevance to understanding spatial inequalities in transportation accessibility to SIs across South Korea. Population density and employment density serve as fundamental indicators of urban intensity and concentration26. Higher population density typically signifies greater demand for SIs and services, influencing the distribution of accessibility across regions. Similarly, employment density reflects economic activity and spatial patterns of employment, affecting transportation needs and accessibility to workplaces. The density of elderly individuals was included because of their specific mobility and healthcare requirements, highlighting the importance of ensuring adequate accessibility to healthcare facilities and elder care services27. Changes in population were considered to provide insights into demographic shifts and migration patterns, which are pivotal for understanding evolving demands for transportation accessibility and the provision of SIs over time28. Collectively, these variables encompassed various dimensions of urban and regional dynamics pertinent to transportation planning and social equity considerations. Data on the independent variables were obtained from public repositories maintained by NGII, the Korea Transportation Institute, and the Urban Regeneration Information System.

Results

Patterns of spatial inequalities in accessibility to social infrastructures

Figure 3 illustrates the quantile classification for each of the 12 variables pertaining to transportation accessibility to SIs, segmented into ten quantile groups based on network distance measured in kilometers. Visual analysis of the quantile maps reveals discernible spatial patterns: urban centers and densely populated EMDs, particularly within the Seoul, Busan, and Daejeon metropolitan areas, demonstrate notably higher levels of accessibility. This trend is particularly pronounced in areas proximate to major transportation hubs and economic centers. Conversely, rural and remote regions, alongside locales characterized by lower population densities, exhibit diminished levels of accessibility across all analyzed SIs.

Figure 3

Spatial Distribution of Transportation Accessibility Variables Classified into Ten Quantile Groups (The maps in this figure were generated using a publicly available package in RStudio 2023.03.0+386 called “leaflet” and modified using Microsoft PowerPoint).

Figure 4 depicts the Gini index, confirming substantial spatial inequalities in accessibility. Across the SIs, the Gini index values range from roughly 0.4 to 0.6, signifying a moderate to high degree of inequality. Notably, certain SIs, such as Elderly (0.508), Library (0.513), Park (0.578), Welfare (0.544), and General Hospital (0.558), exhibit higher levels of spatial inequality, with Gini index values exceeding 0.5. Conversely, Elementary has the lowest Gini index value (0.384), indicating relatively lower spatial inequality within South Korea for this SI. Collectively, these observations underscore significant spatial disparities in transportation accessibility across South Korea, with certain regions enjoying markedly better access to SIs than others.

Figure 4

Results of Gini Index and Lorenz Curve (This figure was generated using a publicly available package in RStudio 2023.03.0+386 called “ineq” and modified using Microsoft PowerPoint).

Figure 5 presents the correlation matrix of the 12 accessibility variables to SIs, revealing statistically significant and positive correlations among all variables. These findings suggest a notable degree of spatial coherence in accessibility patterns, with implications for equitable access to essential services. For instance, accessibility to healthcare facilities exhibited strong positive correlations with accessibility to education institutions and emergency services, indicating spatial consistency in accessibility across these SIs. Similarly, positive correlations were observed between accessibility to green spaces and recreational facilities, as well as educational institutions, highlighting integrated accessibility patterns in areas offering diverse opportunities. These results underscore the interconnectedness of transportation accessibility to various types of SIs in South Korea, where certain EMDs enjoy more comprehensive and better accessibility to all SIs compared to others.
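A pairwise correlation matrix of this kind can be sketched as below, assuming Pearson's r (the type of coefficient is not stated in the text). The variable names and distances in `example` are hypothetical. Because accessibility is measured as network distance, a positive coefficient means that units far from one type of SI also tend to be far from the other.

```python
import math
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns: Dict[str, Sequence[float]]) -> Dict[str, Dict[str, float]]:
    """Pairwise Pearson correlations between all named columns."""
    return {a: {b: pearson(columns[a], columns[b]) for b in columns} for a in columns}


# Hypothetical network distances (km) to three facility types in six units.
example = {
    "library": [1.0, 2.5, 4.0, 8.0, 15.0, 22.0],
    "hospital": [0.8, 3.0, 3.5, 9.5, 12.0, 25.0],
    "park": [0.5, 1.0, 2.0, 6.0, 9.0, 14.0],
}
matrix = correlation_matrix(example)
print(round(matrix["library"]["hospital"], 3))
```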

Figure 5

Correlation Matrix of Transportation Accessibility Variables (The figure was generated using a publicly available package in RStudio 2023.03.0+386 called “Corrplot” and modified using Microsoft PowerPoint).

Figure 6 shows bivariate choropleth maps illustrating the spatial relationships between selected accessibility variables, providing visual insights into accessibility patterns across different types of SIs. Visual examination of the maps revealed spatial coherence and disparities in accessibility, particularly in rural and remote EMDs. For example, EMDs with low accessibility to exercise facilities often exhibited concurrent low accessibility to cultural institutions, reflecting spatial consistency in accessibility patterns for these SIs. The observed spatial patterns highlight both coherence and disparity in transportation accessibility to various SIs across EMDs in South Korea.
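The classification behind a bivariate choropleth map can be sketched as follows, assuming a 3 × 3 scheme in which each variable is cut into terciles and the two class numbers are combined into one label. The scheme, the function names, and the sample distances are assumptions for illustration; the settings used for Fig. 6 are not stated in the text.

```python
from typing import List, Sequence


def tercile_classes(values: Sequence[float]) -> List[int]:
    """Assign each value to one of three equal-count classes (1 = smallest values)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    classes = [0] * n
    for rank, idx in enumerate(order):
        classes[idx] = rank * 3 // n + 1
    return classes


def bivariate_classes(x: Sequence[float], y: Sequence[float]) -> List[str]:
    """Combine the tercile classes of two variables into labels such as '1-3'."""
    return [f"{a}-{b}" for a, b in zip(tercile_classes(x), tercile_classes(y))]


# Hypothetical distances (km) to two facility types in six units.
exercise_km = [0.5, 1.0, 2.0, 4.0, 9.0, 12.0]
culture_km = [0.7, 1.5, 1.8, 6.0, 8.0, 15.0]
print(bivariate_classes(exercise_km, culture_km))
```

Units labeled "3-3" are far from both facility types, while labels such as "1-3" or "3-1" flag units where the two accessibility measures diverge.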

Figure 6

Selected Bivariate Choropleth Maps between Transportation Accessibility Variables (The maps in this figure were generated using a publicly available package in RStudio 2023.03.0+386 called “biscale” and modified using Microsoft PowerPoint).

Spatial stratification and transportation accessibility burdens

The cluster maps presented in Fig. 7, derived from EFA and K-means clustering, delineated three distinct clusters based on transportation accessibility to SIs: (1) High, comprising EMDs with the highest levels of transportation accessibility (n = 3,599), (2) Middle (n = 660), and (3) Low (n = 549). Disparities in accessibility were evident between clusters, with metropolitan areas and major cities in each province of South Korea (High cluster) demonstrating notably higher levels of accessibility than the Middle and Low clusters. Furthermore, EMDs classified within the Low cluster encompassed areas indicative of shrinking cities in South Korea.

Figure 7

Three Clusters Found in K-Means Clustering: (1) High, (2) Middle, and (3) Low (The maps in this figure were generated using a publicly available package in RStudio 2023.03.0+386 called “leaflet” and modified using Microsoft PowerPoint).

Table 3 compares the mean and median transportation accessibility to SIs across the three clusters, revealing distinct spatial patterns in accessibility across different regions of South Korea. For example, EMDs within the High cluster exhibited the lowest average network distances to all SIs, ranging from 1 to 5 km. In contrast, residents within the Low cluster experienced substantially higher average network distances, such as 26.53 km to access General Hospitals, 22.94 km for Welfare facilities, 20.09 km for Hospitals, and 19.20 km to access Elderly care facilities. Moreover, the SIs that imposed the most notable burdens on EMDs, as indicated by the difference in average network distance between clusters, were General Hospitals, Welfare facilities, Elderly care centers, and Hospitals, which play a crucial role in the well-being of individuals, particularly the elderly population. The differences in average network distances to SIs between the clusters were also statistically significant, indicating that EMDs within the Low cluster face a considerably higher burden in accessing essential SIs in South Korea.

Table 3 Characteristics of each cluster identified in K-means clustering.

Factors associated with the spatial stratification

Table 4 displays the best-fitting MNL model, which incorporates 11 covariates. A forward-stepwise procedure was employed to identify the most suitable set of covariates. Starting with only the constant term, the procedure iteratively added significant variables and discarded non-significant ones. The McFadden R-squared value of 0.468 indicates a reasonably good fit for the final model. Interpreting the model coefficients presented in Table 4, population and employment density emerged as significant factors when categorizing EMDs into the Middle or Low clusters. Specifically, for EMDs categorized into the Low cluster, both population and employment density exhibited significant negative associations (coefficients of −0.549 and −0.285, respectively). Further, this analysis revealed that areas with lower accessibility to SIs tended to have fewer working-age individuals (aged 19 to 64), implying demographic disparities in areas with the lowest levels of SI accessibility. Notably, regions burdened with poorer SI accessibility exhibited a higher proportion of elderly residents compared to those in the High cluster.
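The McFadden R-squared compares the log-likelihood of the fitted model with that of an intercept-only model. The sketch below is illustrative and rests on one assumption that is not confirmed in the text: that all EMDs in the three clusters (3,599, 660, and 549) entered the model. Under that assumption, the intercept-only log-likelihood follows from the cluster sizes alone, and the fitted log-likelihood implied by the reported value of 0.468 can be backed out.

```python
import math
from typing import Sequence


def null_log_likelihood(counts: Sequence[int]) -> float:
    """Log-likelihood of an intercept-only model, where every unit is
    assigned its cluster's sample share as the predicted probability."""
    total = sum(counts)
    return sum(n * math.log(n / total) for n in counts if n > 0)


def mcfadden_r2(ll_model: float, ll_null: float) -> float:
    """McFadden pseudo R-squared: 1 - LL(model) / LL(null)."""
    return 1.0 - ll_model / ll_null


# Cluster sizes reported for the High, Middle, and Low clusters.
ll_null = null_log_likelihood([3599, 660, 549])
# Fitted log-likelihood implied by the reported R-squared of 0.468
# (assumption: every EMD in the three clusters was used in estimation).
ll_model_implied = (1.0 - 0.468) * ll_null
print(round(ll_null, 1), round(ll_model_implied, 1))
```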

Table 4 The results of the best fit multinomial logit model.

Additionally, housing prices emerged as a significant factor; higher housing prices were associated with lower odds of being categorized into the Middle or Low cluster relative to the High cluster (coefficients of −0.168 and −0.278, respectively). Furthermore, floor area and building density showed significant negative associations with categorization into the Middle and Low clusters, indicating that areas with lower SI accessibility tended to exhibit less compact urban development patterns. In terms of transportation indicators, lower accessibility to SIs corresponded with lower traffic volumes across all transportation modes, particularly public transportation. This suggests that EMDs in the Low cluster not only faced challenges in accessing SIs but also experienced reduced public transportation usage. Lastly, EMDs categorized into the Middle and Low clusters experienced significant declines in both business and population, as evidenced by changes in business activity and population composition.
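An MNL coefficient is read most easily as an odds ratio: exponentiating it gives the factor by which the odds of a cluster, relative to the High cluster, are multiplied for a one-unit increase in the covariate as it is scaled in Table 4. The short sketch below applies this to the two housing-price coefficients quoted above; the function name is hypothetical and the scaling of the covariate is not restated here.

```python
import math


def odds_ratio(coefficient: float) -> float:
    """Multiplicative change in the odds (vs. the base cluster) per one-unit increase."""
    return math.exp(coefficient)


# Housing-price coefficients quoted in the text (Middle and Low vs. High).
for cluster, beta in {"Middle": -0.168, "Low": -0.278}.items():
    print(f"{cluster}: odds ratio = {odds_ratio(beta):.3f}")
```

An odds ratio below 1 means that an increase in the covariate lowers the odds of belonging to that cluster rather than to the High cluster.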

Discussion

Major findings

Our study found that urban centers and densely populated areas, particularly within major metropolitan regions such as Seoul, Busan, and Daejeon, exhibit significantly higher levels of accessibility to SIs. Conversely, rural and remote regions, which are spatially disadvantaged areas, experience diminished levels of accessibility across all analyzed SIs, highlighting pronounced spatial inequalities. The findings are largely consistent with those in previous studies29,30,31,32.

Additionally, similar to previous studies33,34,35,36,37, we delineated three distinct clusters based on transportation accessibility to SIs (High, Middle, and Low), which we refer to as spatial stratification. Discrepancies in accessibility were evident between these clusters, with urban centers and major cities displaying markedly higher accessibility levels than the Middle and Low clusters, thereby highlighting regional disparities in accessibility.

Our study adds to the understanding of spatial inequalities in SI accessibility in three ways. First, EMDs within the High cluster exhibited the lowest average network distances to all SIs, ranging from 1 to 5 km. In contrast, residents within the Low cluster experienced substantially higher average network distances, such as 26.53 km to access General Hospitals, 22.94 km for Welfare facilities, 20.09 km for Hospitals, and 19.20 km to access Elderly care facilities.

Second, we observed significant positive correlations among accessibility metrics for various types of SIs, indicative of pervasive spatial disparities in accessibility patterns across all SI categories. This suggests that regions with diminished access to one type of SI also exhibit reduced access to others, emphasizing the interconnection of accessibility patterns.

Third, factors such as population and employment density, housing prices, floor area, and building density emerged as significant determinants of spatial stratification in SI accessibility. Regions with inferior SI accessibility tended to have a lower working-age population and a higher proportion of elderly residents. Moreover, decreased accessibility correlated with diminished traffic volumes across all transportation modes, particularly public transportation. EMDs classified within the Middle and Low clusters experienced notable declines in both business activity and population, underscoring the broader socio-economic implications of spatial accessibility disparities.

Policy implications

Based on the major findings, we propose several policy implications to address spatial inequalities in accessibility to SIs. First, policy interventions should prioritize investments in transportation infrastructure and SIs, particularly in areas characterized by lower accessibility to essential services. This includes improving public transportation networks and increasing the availability of SIs such as healthcare facilities, welfare services, and educational institutions in underserved regions.

Second, to promote spatial equity, policymakers should address disparities in housing prices. Measures should be explored to ensure affordable housing options in areas with high housing prices, thereby mitigating the spatial concentration of socio-economic disadvantage and improving accessibility to essential services for all residents.

Third, given the association between lower accessibility to SIs and a higher proportion of elderly residents, age-friendly policies are essential. This includes improving accessibility features in public spaces and transportation systems, expanding home care services, and promoting social inclusion and community engagement among older adults.

Fourth, enhancing public transportation services, particularly in spatially disadvantaged areas with lower accessibility to SIs, can improve mobility and connectivity for residents. This may include expanding public transportation coverage, increasing service frequency, offering demand response transit (DRT), and implementing fare subsidies to make public transportation more accessible and affordable for all residents.

Limitations of this study and future research directions

Despite the contributions of this research, it is important to acknowledge its limitations and suggest future research directions. First, while this study focuses primarily on transportation accessibility as a proxy for spatial inequalities in accessing SIs, it inherently neglects other significant factors influencing SI access, such as socio-economic disparities, cultural differences, and institutional barriers. Future research should adopt a more holistic and advanced approach that integrates these factors to provide a comprehensive understanding of spatial inequalities in SI accessibility.

Second, the multi-dimensional methodological approach employed in this study, such as EFA and K-means clustering, provides valuable insights into spatial patterns of accessibility. However, these methods have their own limitations and assumptions. EFA assumes linear relationships among variables and normally distributed data, which may not always hold true in real-world datasets. Similarly, K-means clustering relies on initial random assignment of cluster centroids and may not always converge on the optimal solution. Future research should consider alternative methods or sensitivity analyses to validate the robustness of clustering results.

Third, our research is based on data collected at a specific point in time, which limits its ability to capture temporal dynamics or changes in accessibility patterns over time. Using point-in-time data may not fully reflect seasonal variations, dynamic changes in urban development, or shifts in socio-economic conditions that influence accessibility. Longitudinal studies using time-series data could offer deeper insights into how spatial inequalities in transportation accessibility evolve over time and in response to policy interventions.

Fourth, while this study identifies associations between accessibility to SIs and various socio-economic and demographic factors, it does not establish causality. Further research incorporating qualitative methods and longitudinal analyses could elucidate the underlying mechanisms driving spatial inequalities in accessibility and provide causal explanations. Additionally, expanding the set of independent variables could offer a more comprehensive understanding of the factors associated with spatial inequalities, enhancing the depth of future research in this area.

Lastly, while the study offers valuable policy implications, the effectiveness of the proposed interventions has not been empirically tested. Future research should prioritize evaluating the impact of policy interventions aimed at addressing spatial inequalities in transportation accessibility and promoting inclusive development, such as demand response transit.

Concluding remarks

We comprehensively explored spatial inequalities in transportation accessibility to SIs in South Korea using a multi-dimensional methodological approach comprising descriptive analysis, EFA, K-means clustering, and MNL modeling. We believe that this research improves our understanding of spatial inequalities in transportation accessibility to SIs and offers valuable insights for evidence-based policymaking and urban planning practices. By addressing the identified challenges and implementing the policy recommendations outlined in this study, policymakers can take meaningful steps towards creating more resilient, equitable, and inclusive communities in South Korea, particularly in spatially disadvantaged areas.