## Abstract

Data that depict socioeconomic realities are often scarce at the municipal level. Unlike recurrent data, which are collected regularly, nonrecurrent data are sporadic or irregular because compiling them is costly and municipal resources are limited. To address regional data scarcity, we develop a bottom-up top-down (BUTD) methodology that combines a genetic algorithm with regression techniques to construct synthetic socioeconomic indicators. We apply the methodology to assess income inequality in 178 municipalities in Spain. The genetic algorithm draws on the available data on circumstances, or inequalities of opportunity, that give rise to income disparities. The methodology helps mitigate the shortcomings arising from unavailable data and is therefore suitable for assessing socioeconomic conditions at the regional level that are currently obscured by data unavailability. Such assessments give policymakers a fuller socioeconomic overview of regional administrative units, which is relevant to allocating public service funds.

## Notes

Other nature-inspired optimization algorithms could have been used with similar results (especially those with binary structures). Roch-Dupré et al. (2021a, 2021b) compare the genetic algorithm with other nature-inspired optimization algorithms, such as the Particle Swarm Optimization Algorithm and the Fireworks Algorithm, in complex problems of optimal selection. These studies report very similar results across algorithms, with the genetic algorithm slightly outperforming the others.

Although we have selected an arithmetic aggregation to estimate \({Y}_{R}\), the BUTD methodology can be extended to cases in which other aggregation methods may be preferred. We have opted for the arithmetic aggregation due to the low level of substitutability (Lafortune et al., 2018) across the indicators within \({X}_{R}^{*}\) and the similarity between formulas of the OLS regression (Eq. 2) and the arithmetic aggregation. The selected aggregation method determines the model in Eq. 2.

A real-time estimation restricts input data to information available at the time of estimation. In our application, this implies that selection and weights are assigned using only 2015 data, and those are applied to 2016 recurrent indicators to build that year’s synthetic indicator for inequality. Likewise, only data for 2015 and 2016 are used to assign new selection and weights that are then applied to 2017 recurrent indicators for the estimation of that year’s synthetic indicator.
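The real-time scheme described in this note can be sketched in Python as a list of training and target years. The function name `real_time_splits` is hypothetical and is used only for illustration; the sketch shows which years feed the selection and weights for each estimated year, not the estimation itself.

```python
def real_time_splits(years):
    """Pair each target year with the earlier years available for training.

    Selection and weights for a target year are fitted only on the years
    that precede it, mirroring a real-time estimation.
    """
    years = sorted(years)
    return [(years[:k], years[k]) for k in range(1, len(years))]


for training_years, target_year in real_time_splits([2015, 2016, 2017]):
    print(f"fit on {training_years}, apply to {target_year}")
```

With the three years of this application, the first pair fits on 2015 and applies to 2016, and the second fits on 2015 and 2016 and applies to 2017.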

The municipality of Madrid city is not included in our analysis due to its distinctive features, which would require a particular and adapted analytical approach for municipalities with larger populations (Brezzi et al., 2011; Royuela et al., 2014). Madrid represented 50% of the region’s population and 55% of the region’s total economic activity in 2020, according to INE, which makes it an outlier in our analysis.

Unambiguity ensures a homogeneous interpretation of the indicators’ performance (increments or decrements). In our application, the higher the value of an indicator, the more vulnerable the municipality. If GDP were not inverted, its interpretation would be the opposite (i.e., the higher the value, the less vulnerable the municipality).

These categories are merely indicative and different groupings do not affect the estimations.

Results estimated with \({I}_{2015}\) and \({I}_{2016}\) are available upon request.

Maps of relative income inequality for 2015 and 2016 show similar results and are available upon request.

Each \({A}_{j}\) conceptually represents a unique combination of recurrent indicators. Because each \({a}_{j,i}\) can take two values (0 and 1), the total number of possible combinations of recurrent indicators is \({2}^{{N}_{R}}\). The search space therefore grows exponentially with the number of available indicators, which calls for an optimization algorithm rather than exhaustive enumeration.
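To give a sense of scale, the following Python sketch evaluates \({2}^{{N}_{R}}\) for a few values of \({N}_{R}\). The values 10, 20 and 40 are illustrative and are not taken from the application; `search_space_size` is a hypothetical helper name.

```python
def search_space_size(n_r: int) -> int:
    """Number of distinct binary selection vectors of length n_r."""
    return 2 ** n_r


for n_r in (10, 20, 40):
    # Thousands separators make the exponential growth easier to read.
    print(f"N_R = {n_r:>2}: {search_space_size(n_r):,} combinations")
```

Each additional indicator doubles the number of candidate combinations, so evaluating one regression per combination quickly becomes impractical.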

## References

Aaberge, R., & Brandolini, A. (2015). Multidimensional poverty and inequality. In A. B. Atkinson & F. Bourguignon (Eds.), *Handbook of income distribution* (Vol. 2, pp. 141–216). Elsevier.

Aaberge, R., Mogstad, M., & Peragine, V. (2011). Measuring long-term inequality of opportunity. *Journal of Public Economics, 95*(3), 193–204.

Aiyar, S., & Ebeke, C. (2020). Inequality of opportunity, inequality of income and economic growth. *World Development, 136*, 105115.

Alberti, V., Banys, K., Caperna, G., Del Sorbo, M., Fregoni, M., Havari, E., Kovacic, M., Lapatinas, A., Litina, A., Montalto, V., Tacao Moura, C. J., Neher, F., Panella, F., Peragine, V., Pisoni, E., Stuhler, J., Symeonidis, K., Verzillo, S. Y., & Boldrini, M. (2021). Monitoring Multidimensional Inequalities in the European Union. *Joint Research Centre, EUR 30649 EN, JRC123911*. Publications Office of the European Union.

Banerjee, A., Duflo, E., & Sharma, G. (2021). Long-term effects of the targeting the ultra poor program. *American Economic Review: Insights, 3*(4), 471–486.

Bannor, R. K., & Oppong-Kyeremeh, H. (2018). Extent of poverty and inequality among households in the Techiman municipality of Brong Ahafo region, Ghana. *Journal of Energy and Natural Resource Management, 1*(1), 26–36.

Bennett, N., & Lemoine, G. J. (2014). What a difference a word makes: Understanding threats to performance in a VUCA world. *Business Horizons, 57*(3), 311–317.

Boulant, J., Brezzi, M., & Veneri, P. (2016). Income levels and inequality in metropolitan areas: A comparative approach in OECD countries. *OECD Regional Development Working Papers*, No. 2016/06, OECD Publishing, Paris.

Bourguignon, F. (2017). Global inequality. In F. Bourguignon (Ed.), *The globalization of inequality* (pp. 9–40). Princeton University Press.

Bourguignon, F., Ferreira, F. H. G., & Walton, M. (2007). Equity, efficiency and inequality traps: A research agenda. *The Journal of Economic Inequality, 5*, 235–256.

Bouzarovski, S., & Tirado-Herrero, S. (2017). The energy divide: Integrating energy transitions, regional inequalities and poverty trends in the European Union. *European Urban and Regional Studies, 24*(1), 69–86.

Brezzi, M., Dijkstra, L., & Ruiz, V. (2011). OECD extended regional typology: The economic performance of remote rural regions. *OECD Regional Development Working Papers*, No. 2011/06.

Brock, J. M. (2020). Unfair inequality, governance and individual beliefs. *Journal of Comparative Economics, 48*(3), 658–687.

Brunori, P., Hufe, P., & Mahler, D. G. (2018). The roots of inequality: Estimating inequality of opportunity from regression trees. *World Bank Policy Research Working Paper*, No. 8349.

Brunori, P., Salas-Rojo, P., & Verne, P. (2022). Estimating inequality with missing incomes. London School of Economics. International Inequalities Institute. *Working Paper*, No. 82.

Brunori, P., Ferreira, F. H., & Peragine, V. (2013). Inequality of opportunity, income inequality, and economic mobility: Some international comparisons. In *Getting development right: Structural transformation, inclusion, and sustainability in the post-crisis era* (pp. 85–115). Palgrave Macmillan US.

Cabrera, L., Marrero, G. A., Rodríguez, J. G., & Salas-Rojo, P. (2021). Inequality of opportunity in Spain: New insights from new data. *Review of Public Economics, 237*(2), 153–185.

Chatterjee, S., & Turnovsky, S. J. (2012). Infrastructure and inequality. *European Economic Review, 56*(8), 1730–1745.

Checchi, D., Peragine, V., & Serlenga, L. (2016). Inequality of opportunity in Europe: Is there a role for institutions? In L. Cappellari, S. W. Polachek, & K. Tatsiramos (Eds.), *Inequality: Causes and Consequences* (Vol. 43, pp. 1–44). Emerald Publishing Ltd.

Dang, A. T. (2014). Amartya Sen’s capability approach: A framework for well-being evaluation and policy analysis? *Review of Social Economy, 72*(4), 460–484.

Dang, H. A., Jolliffe, D., & Carletto, C. (2019). Data gaps, data incomparability, and data imputation: A review of poverty measurement methods for data-scarce environments. *Journal of Economic Surveys, 33*(3), 757–797.

Dat, L. Q., Linh, D. T. T., Chou, S. Y., & Vincent, F. Y. (2012). Optimizing reverse logistic costs for recycling end-of-life electrical and electronic products. *Expert Systems with Applications, 39*(7), 6380–6387.

De Barros, R. P., Ferreira, F., Vega, J., & Chanduri, J. (2009). *Measuring inequality of opportunities in Latin America and the Caribbean*. World Bank Publications.

Dempster, M. A. H., & Jones, C. M. (2001). A real-time adaptive trading system using genetic programming. *Quantitative Finance, 1*(4), 397.

Diaz, E. M., & Perez-Quiros, G. (2021). GEA tracker: A daily indicator of global economic activity. *Journal of International Money and Finance, 115*, 102400.

Dong, Y., & Peng, C. J. (2013). Principled missing data methods for researchers. *Springerplus, 2*(1), 222.

Efthymiou, D., Chrysostomou, K., Morfoulaki, M., & Aifantopoulou, G. (2017). Electric vehicles charging infrastructure location: A genetic algorithm approach. *European Transport Research Review, 9*(2), 27.

Ertenlice, O., & Kalayci, C. B. (2018). A survey of swarm intelligence for portfolio optimization: Algorithms and applications. *Swarm and Evolutionary Computation, 39*, 36–52.

Espina, P. Z., & Somarriba, N. (2013). An assessment of social welfare in Spain: Territorial analysis using a synthetic welfare indicator. *Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, 111*(1), 1–23.

Eurostat (2018). Methodological manual of territorial typologies. In *Eurostat Statistical Books*.

Ferreira, F. H., & Gignoux, J. (2011). The measurement of inequality of opportunity: Theory and an application to Latin America. *Review of Income and Wealth, 57*(4), 622–657.

Ferreira, F. H., & Peragine, V. (2016). Individual responsibility and equality of opportunity. In M. D. Adler & M. Fleurbaey (Eds.), *The Oxford handbook of well-being and public policy*. Oxford University Press.

Fleurbaey, M., & Peragine, V. (2013). Ex ante versus ex post equality of opportunity. *Economica, 80*(317), 118–130.

Gamboa, L. F., & Waltenberg, F. D. (2012). Inequality of opportunity for educational achievement in Latin America: Evidence from PISA 2006–2009. *Economics of Education Review, 31*(5), 694–708.

Gan, X., Fernandez, I. C., Guo, J., Wilson, M., Zhao, Y., Zhou, B., & Wu, J. (2017). When to use what: Methods for weighting and aggregating sustainability indicators. *Ecological Indicators, 81*, 491–502.

González, E., Cárcaba, A., & Ventura, J. (2011). The importance of the geographic level of analysis in the assessment of the quality of life: The case of Spain. *Social Indicators Research, 102*, 209–228.

Hick, R. (2016). Material poverty and multiple deprivation in Britain: The distinctiveness of multidimensional assessment. *Journal of Public Policy, 36*(2), 277–308.

Holland, J. H. (1975). *Adaptation in natural and artificial systems: An introductory analysis with applications to biology, control, and artificial intelligence*. University of Michigan Press.

Hufe, P., Kanbur, R., & Peichl, A. (2018). Measuring unfair inequality: Reconciling equality of opportunity and freedom from poverty. *CESifo Working Paper Series*, No. 7119.

Jusot, F., Tubeuf, S., & Trannoy, A. (2013). Circumstances and efforts: How important is their correlation for the measurement of inequality of opportunity in health? *Health Economics, 22*(12), 1470–1495.

Kakwani, N. C. (1980). Income inequality and poverty: Methods of estimation and policy implications. *Population and Development Review, 6*, 673.

Kia, R., Khaksar-Haghani, F., Javadian, N., & Tavakkoli-Moghaddam, R. (2014). Solving a multi-floor layout design model of a dynamic cellular manufacturing system by an efficient genetic algorithm. *Journal of Manufacturing Systems, 33*(1), 218–232.

Kilkiş, Ş. (2016). Sustainable development of energy, water and environment systems index for Southeast European cities. *Journal of Cleaner Production, 130*, 222–234.

Kovacic, M., Verzillo, S., & Peragine, V. (2021). Using survey and administrative data to gain insights on the evolution of inequality of opportunity in the EU. In M. Dominguez-Torreiro & E. Papadimitriou (Eds.), *Monitoring multidimensional inequalities in the European union* (pp. 75–97). Publications Office of the European Union.

Kuo, Y. H., Rado, O., Lupia, B., Leung, J. M., & Graham, C. A. (2016). Improving the efficiency of a hospital emergency department: A simulation study with indirectly imputed service-time distributions. *Flexible Services and Manufacturing Journal, 28*(1–2), 120–147.

Kyriacou, A. P., Muinelo-Gallo, L., & Roca-Sagalés, O. (2017). Regional inequalities, fiscal decentralization and government quality. *Regional Studies, 51*(6), 945–957.

Lafortune, G., Fuller, G., Moreno, J., Schmidt-Traub, G., & Kroll, C. (2018). SDG index and dashboards detailed methodological paper. *Sustainable Development Solutions Network, 9*, 1–56.

Lee, C. K. H. (2018). A review of applications of genetic algorithms in operations management. *Engineering Applications of Artificial Intelligence, 76*, 1–12.

Lefranc, A., Pistolesi, N., & Trannoy, A. (2008). Inequality of opportunities vs. inequality of outcomes: Are western societies all alike? *Review of Income and Wealth, 54*(4), 513–546.

Lefranc, A., Pistolesi, N., & Trannoy, A. (2009). Equality of opportunity and luck: Definitions and testable conditions, with an application to income in France. *Journal of Public Economics, 93*(11–12), 1189–1207.

Lustig, N. (2018). *Commitment to equity handbook: Estimating the impact of fiscal policy on inequality and poverty*. Brookings Institution Press.

Lustig, N., Lopez-Calva, L. F., & Ortiz-Juarez, E. (2013). Declining inequality in Latin America in the 2000s: The cases of Argentina, Brazil, and Mexico. *World Development, 44*, 129–141.

Marmot, M. (2005). Social determinants of health inequalities. *Lancet, 365*, 1099–1104.

Marrero, G., & Rodríguez, J. G. (2012). Inequality of opportunity in Europe. *Review of Income and Wealth, 58*(4), 597–621.

Martínez-Galarraga, J., Rosés, J. R., & Tirado, D. A. (2015). The long-term patterns of regional income inequality in Spain, 1860–2000. *Regional Studies, 49*(4), 502–517.

Mavrovouniotis, M., Li, C., & Yang, S. (2017). A survey of swarm intelligence for dynamic optimization: Algorithms and applications. *Swarm and Evolutionary Computation, 33*, 1–17.

Mazinani, M., Abedzadeh, M., & Mohebali, N. (2013). Dynamic facility layout problem based on flexible bay structure and solving by genetic algorithm. *The International Journal of Advanced Manufacturing Technology, 65*(5–8), 929–943.

Millar, C. C. J. M., Groth, O., & Mahon, J. F. (2018). Management innovation in a VUCA world: Challenges and recommendations. *California Management Review, 61*(1), 5–14.

Morini, M., & Pellegrino, S. (2018). Personal income tax reports: A genetic algorithm approach. *European Journal of Operational Research, 264*(3), 994–1004.

Niehues, J., & Peichl, A. (2014). Upper bounds of inequality of opportunity: Theory and evidence for Germany and the US. *Social Choice and Welfare, 43*, 73–99.

Nygård, F., & Sandström, A. (1989). Income inequality measures based on sample surveys. *Journal of Econometrics, 42*(1), 81–95.

O’Rand, A., & Henrettam, J. C. (2018). *Age and inequality: Diverse pathways through later life*. Routledge.

Organization for Economic Co-operation and Development, OECD. (2008). *Handbook on constructing composite indicators*. OECD Publishing.

Organization for Economic Co-operation and Development, OECD. (2016). *OECD Regional Outlook 2016: Productive regions for inclusive societies. Chapter 3: Understanding rural economies*. OECD Publishing.

Pandeya, B., Buytaert, W., Zulkafli, Z., Karpouzoglou, T., Mao, F., & Hannah, D. M. (2016). A comparative analysis of ecosystem services valuation approaches for application at the local scale and in data scarce regions. *Ecosystem Services, 22*, 250–259.

Pike, A., Béal, V., Cauchi-Duval, N., Franklin, R., Kinossian, N., Lang, T., & Velthuis, S. (2023). “Left behind places”: a geographical etymology. *Regional Studies*. https://doi.org/10.1080/00343404.2023.2167972

Ramos, X., & Van de Gaer, D. (2016). Approaches to inequality of opportunity: Principles, measures and evidence. *Journal of Economic Surveys, 30*(5), 855–883.

Ramos, X., & Van de Gaer, D. (2021). Is inequality of opportunity robust to the measurement approach? *Review of Income and Wealth, 67*(1), 18–36.

Robeyns, I. (2017). *Wellbeing, freedom and social justice: The capability approach re-examined*. Open Book Publishers.

Roch-Dupré, D., Gonsalves, T., Cucala, A. P., Pecharromán, R. R., López-López, A. J., & Fernández Cardador, A. (2021a). Determining the optimum installation of energy storage systems in railway electrical infrastructures by means of swarm and evolutionary optimization algorithms. *International Journal of Electrical Power & Energy Systems, 124*, 106295-1–106295-15.

Roch-Dupré, D., Gonsalves, T., Cucala, A. P., Pecharromán, R. R., López-López, A. J., & Fernández Cardador, A. (2021b). Multi-stage optimization of the installation of energy storage systems in railway electrical infrastructures with nature-inspired optimization algorithms. *Engineering Applications of Artificial Intelligence, 104*, 104370-1–104370-18.

Roemer, J. E. (1993). A pragmatic theory of responsibility for the egalitarian planner. *Philosophy & Public Affairs, 22*(2), 146–166.

Roemer, J. E. (2000). *Equality of opportunity*. Harvard University Press.

Roemer, J. E., & Trannoy, A. (2016). Equality of opportunity: Theory and measurement. *Journal of Economic Literature, 54*(4), 1288–1332.

Royuela, V., Veneri, P., & Ramos, R. (2014). Income inequality, urban size and economic growth in OECD regions. *OECD Regional Development Working Papers*, No. 2014/10.

Rubin, D. B. (1976). Inference and missing data. *Biometrika, 63*(3), 581–592.

Sanogo, T. (2019). Does fiscal decentralization enhance citizens’ access to public services and reduce poverty? Evidence from Côte d’Ivoire municipalities in a conflict setting. *World Development, 113*, 204–221.

Sen, A. (1999). *Development as freedom*. Oxford University Press.

Shek, D. T., & Wu, F. K. (2018). The social indicators movement: Progress, paradigms, puzzles, promise and potential research directions. *Social Indicators Research, 135*, 975–990.

Shin, K. S., & Lee, Y. J. (2002). A genetic algorithm application in bankruptcy prediction modeling. *Expert Systems with Applications, 23*(3), 321–328.

Silveira, R. D. M., & Azzoni, C. R. (2011). Non-spatial government policies and regional income inequality in Brazil. *Regional Studies, 45*(4), 453–461.

Soleimani, H., Govindan, K., Saghafi, H., & Jafari, H. (2017). Fuzzy multi-objective sustainable and green closed-loop supply chain network design. *Computers & Industrial Engineering, 109*, 191–203.

Somarriba, N., & Pena, B. (2009). Synthetic indicators of quality of life in Europe. *Social Indicators Research, 94*, 115–133.

Taylor, S. J., & Letham, B. (2018). Forecasting at Scale. *The American Statistician, 72*(1), 37–45.

Wilkinson, R. G., & Pickett, K. E. (2009). Income inequality and social dysfunction. *Annual Review of Sociology, 35*, 493–511.

World Bank. (2005). *World development report 2006: Equity and development*. World Bank.

Yitzhaki, S., & Schechtman, E. (2013). *The Gini methodology: A primer on a statistical methodology*. Springer.

## Acknowledgments

This paper has been supported by Project PID2021-124641NB-I00 of the Ministry of Science and Innovation (Spain).

## Author information

### Contributions

Elisa Aracil: Project administration, Funding acquisition, Conceptualization, Writing – original draft, Writing – review and editing, Supervision. Elena Díaz Aguiluz: Conceptualization, Methodology, Software, Validation, Formal analysis, Data curation, Writing – original draft preparation, Writing – review and editing, Visualization. Gonzalo Gómez-Bengoechea: Resources, Data curation, Writing – original draft preparation, Writing – review and editing, Investigation. Rosalía Mota: Resources, Partial data curation, Writing – original draft preparation, Writing – review and editing. David Roch-Dupré: Conceptualization, Methodology, Software, Validation, Formal analysis, Data curation, Writing – original draft, Writing – review and editing, Visualization.

## Ethics declarations

### Conflict of interest

The authors declare no conflict of interest.

## Appendices

### Appendix 1: Genetic Algorithm to Build a Synthetic Indicator

The genetic algorithm selects a subset of recurrent indicators, \({X}_{R}^{*}\), that best explains a benchmark indicator \({Y}_{NR}\). The iterative process is as follows.

For the selection of recurrent indicators from \({X}_{R}\) that should be included in \({X}_{R}^{*}\), we begin by defining any \(j\)-th solution \({A}_{j}\) as a binary vector of size \({N}_{R}\), where \({N}_{R}\) is the total number of recurrent indicators available, such that

\[{A}_{j}=\left({a}_{j,1},{a}_{j,2},\dots ,{a}_{j,{N}_{R}}\right)\qquad (4)\]

where \({a}_{j,i}\) can take the values of 0 or 1 for all \(i\in \{1,\dots ,{N}_{R}\}\). Whenever \({a}_{j,i}=0\), indicator \({x}_{R,i}\) is not included in subset \({X}_{R,j}\). Therefore, a possible solution \({A}_{j}\) denotes the combination of indicators \({X}_{R,j}\) constituted only by those indicators \({x}_{R,i}\) for which \({a}_{j,i}=1\).^{Footnote 10}

The algorithm is initiated by generating \(J\) possible solutions \({A}_{j}\), for all \(j\in \{1,\dots ,J\}\), in a first iteration. Each \({A}_{j}\) is generated randomly, such that the probability of \({a}_{j,i}=1\) is \(0.5\), \(\forall i\in \{1,\dots ,{N}_{R}\}\) and \(\forall j\in \{1,\dots ,J\}\). We therefore have \(J\) random combinations \({A}_{j}\) of recurrent indicators, each denoted by \({X}_{R,j}\).
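The initialisation step can be sketched in Python as follows. The function name `init_population`, the population size of 6, the 8 indicators and the seed are illustrative assumptions, not values from the application.

```python
import random


def init_population(num_solutions, num_indicators, rng):
    """Return J random binary vectors A_j, each of length N_R.

    Every element a_{j,i} is set to 1 with probability 0.5, independently.
    """
    return [
        [1 if rng.random() < 0.5 else 0 for _ in range(num_indicators)]
        for _ in range(num_solutions)
    ]


rng = random.Random(42)  # fixed seed so the sketch is reproducible
population = init_population(num_solutions=6, num_indicators=8, rng=rng)
for j, a_j in enumerate(population, start=1):
    # Indicators with a_{j,i} = 1 form the candidate subset X_{R,j}.
    selected = [i + 1 for i, bit in enumerate(a_j) if bit == 1]
    print(f"A_{j} = {a_j} -> indicators {selected}")
```

Each printed vector corresponds to one candidate subset \({X}_{R,j}\) that would be evaluated in the next step.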

The genetic algorithm then works through the search space of possible combinations \({A}_{j}\) to find the combination that maximises the \({R}^{2}\)-statistic in a regression of the benchmark indicator \({Y}_{NR}\) against \({X}_{R,j}\) while complying with the nonnegative restriction. In particular, any combination \({A}_{j}\) of indicators is assessed by performing the following OLS regression:

\[{Y}_{NR}={\alpha }_{j}+{X}_{R,j}{\omega }_{R,j}+{\varepsilon }_{j}\qquad (5)\]

where \({\alpha }_{j}\) and \({\omega }_{R,j}\) are the OLS constant and coefficients, respectively, and \({\varepsilon }_{j}\) is a vector of error terms. Next, a fitness value is assigned to \({A}_{j}\) such that

where \({\omega }_{R,j}\) corresponds to the OLS estimates of the weights in Eq. (5), \({R}_{j}^{2}\) is the estimated \({R}^{2}\)-statistic, and \(\lambda\) is a penalisation factor. Whenever the nonnegative restriction is violated, that is, whenever any element of \({\omega }_{R,j}\) is negative, the fitness value of \({A}_{j}\) is heavily penalised with a large value, \(\lambda\). Because the genetic algorithm maximises \(Fitness\left({A}_{j}\right)\), it dismisses all combinations \({A}_{j}\) of indicators that violate the nonnegative restriction while continuing to search for the combination that provides the highest \({R}^{2}\)-statistic.
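A minimal Python sketch of this fitness evaluation, assuming NumPy is available. The penalty form used here (subtracting \(\lambda\) from \({R}_{j}^{2}\) when any weight is negative), the default value of `lam`, and the names `fitness`, `mask`, `X` and `y` are assumptions made for illustration; the text only states that violating combinations are heavily penalised. The data are synthetic.

```python
import numpy as np


def fitness(mask, X, y, lam=1e6):
    """Penalised R^2 of an OLS fit of y on the columns of X selected by mask."""
    mask = np.asarray(mask, dtype=bool)
    # Design matrix: a constant (alpha_j) plus the selected indicators X_{R,j}.
    design = np.column_stack([np.ones(len(y)), X[:, mask]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_total = np.sum((y - y.mean()) ** 2)
    r_squared = 1.0 - np.sum(residuals ** 2) / ss_total
    weights = coef[1:]  # omega_{R,j}, excluding the constant
    # Assumed penalty: subtract lam whenever any weight is negative.
    return r_squared - lam if np.any(weights < 0) else r_squared


rng = np.random.default_rng(0)
X = rng.random((50, 3))                    # 50 units, 3 synthetic indicators
y = 1.0 + 2.0 * X[:, 0] + 0.5 * X[:, 1]    # synthetic benchmark indicator
print(fitness([1, 1, 0], X, y))            # both weights positive, no penalty
print(fitness([1, 0, 0], X, -y))           # negative weight, penalised
```

In practice, a weight whose true value is zero can be estimated as a tiny negative number through floating-point error, so a small tolerance on the sign check may be preferable to a strict comparison.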

Once the fitness values are calculated for all the randomly generated combinations of indicators \({A}_{j}\), these values are rescaled into probabilities. To do so, the combinations \({A}_{j}\), for all \(j\in \{1,\dots ,J\}\), are first ranked from highest to lowest according to their fitness value. Next, the scaled probability \(p\) for each \({A}_{j}\) is defined as

where \({r}_{j}\) is the rank of individual \({A}_{j}\).
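The ranking and rescaling step can be sketched in Python as below. The weighting used, with probabilities proportional to \(1/\sqrt{{r}_{j}}\), is an assumed and commonly used rank-scaling rule and is not necessarily the formula applied in this paper; `rank_probabilities` is a hypothetical name and the fitness values are invented.

```python
import math
import random


def rank_probabilities(fitness_values):
    """Map fitness values to selection probabilities via their ranks.

    Rank 1 is the fittest solution. The 1/sqrt(rank) weighting is an
    assumption made for this sketch.
    """
    order = sorted(range(len(fitness_values)),
                   key=lambda j: fitness_values[j], reverse=True)
    ranks = [0] * len(fitness_values)
    for r, j in enumerate(order, start=1):
        ranks[j] = r
    raw = [1.0 / math.sqrt(r) for r in ranks]
    total = sum(raw)
    return [value / total for value in raw]


fitness_values = [0.20, 0.90, 0.50, -1e6]  # last one violates the restriction
probabilities = rank_probabilities(fitness_values)
rng = random.Random(0)
# Draw two parents (indices into the population) according to p(A_j).
parents = rng.choices(range(len(fitness_values)), weights=probabilities, k=2)
print(probabilities, parents)
```

Because the probabilities depend only on ranks, a heavily penalised solution still receives a small positive probability rather than dominating the scale through its extreme fitness value.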

In a second iteration of the algorithm, a new set of \(J\) possible solutions \({A}_{j}\) is created from the previous set. This is done by first randomly selecting combinations \({A}_{j}\) according to their scaled probabilities \(p\left({A}_{j}\right)\). New combinations are then created by two operators of the genetic algorithm, *crossover* and *mutation*, which mimic the evolutionary theory put forward by Charles Darwin. With *crossover*, two of the randomly selected combinations \({A}_{j}\) are blended; the genetic algorithm uses this operator to explore the search space of possible combinations of indicators. With *mutation*, one of the randomly selected combinations \({A}_{j}\) is altered to add diversity to the set of combinations and to avoid premature convergence to a solution.
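The two operators can be sketched in Python as follows. The text does not specify how combinations are blended or altered, so uniform crossover and bit-flip mutation are assumed here as one common choice for binary vectors; the function names and the mutation rate of 0.05 are illustrative.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each element is copied from either parent at random."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def mutate(parent, rng, rate=0.05):
    """Bit-flip mutation: each element is flipped with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in parent]


rng = random.Random(7)
parent_a = [1, 1, 0, 0, 1, 0]
parent_b = [1, 0, 0, 1, 0, 0]
child = crossover(parent_a, parent_b, rng)
mutant = mutate(parent_a, rng)
print("crossover child:", child)
print("mutated parent: ", mutant)
```

Where both parents agree on an indicator, the crossover child inherits that choice unchanged; mutation is what can reintroduce an indicator that every selected parent has dropped.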

Once the new set of \(J\) possible solutions is created, the process is repeated by calculating the fitness values of the new combinations, rescaling these fitness values into probabilities, and selecting combinations for *crossover* and *mutation*. This is iterated numerous times until the algorithm converges to an optimal solution, denoted \({A}^{*}\), and defined as

\[{A}^{*}=\underset{{A}_{j}}{\mathrm{arg\,max}}\;Fitness\left({A}_{j}\right)\qquad (8)\]

Subset \({X}_{R}^{*}\) will then include all indicators \({x}_{R,i}\) for which \({a}_{i}^{*}=1\), \(\forall i\in \{1,\dots ,{N}_{R}\}\).

### Appendix 2: Recurrent Indicators for Circumstances Underlying Income Inequality Across Municipalities in Madrid

Table 3 summarises the recurrent indicators that depict the circumstances underlying income inequality in the municipalities of Madrid.

The data have been aggregated and made available by the Regional Statistics Office since 2009. Tables 4, 5, 6 and 7 give each indicator's description and primary source by category.

### Appendix 3: Descriptive Statistics

Indicator | Mean | Median | Mode | Std. Dev. | Variance | Skewness | Kurtosis |
---|---|---|---|---|---|---|---|
80/20 Poverty Ratio | 2.995 | 2.9 | 2.7 | 0.417 | 0.174 | 1.243 | 6.792 |
GINI | 33.594 | 33.3 | 35.7 | 3.098 | 9.597 | 0.393 | 2.931 |
**Demography** | | | | | | | |
Total population | 0.09 | 0.017 | 1 | 0.192 | 0.037 | 3.186 | 13.133 |
Female population | 0.925 | 0.94 | 1 | 0.059 | 0.004 | -2.237 | 9.057 |
Youth population | 0.617 | 0.628 | 1 | 0.157 | 0.025 | -0.602 | 4.112 |
Senior population | 0.354 | 0.336 | 0.243 | 0.146 | 0.021 | 1.234 | 5.526 |
Dependency ratio | 0.491 | 0.479 | 0.5 | 0.108 | 0.012 | 1.404 | 7.375 |
Foreign population | 0.432 | 0.413 | 1 | 0.178 | 0.032 | 0.451 | 3.105 |
Foreign female population | 0.554 | 0.526 | 0.5 | 0.11 | 0.012 | -0.222 | 7.28 |
**Labour market** | | | | | | | |
Working ratio | 0.388 | 0.375 | 1 | 0.166 | 0.028 | 0.688 | 3.84 |
Female working population | 0.627 | 0.614 | 0.582 | 0.069 | 0.005 | 2.118 | 10.798 |
Foreign working population | 0.211 | 0.171 | 0.081 | 0.143 | 0.02 | 2.908 | 13.434 |
Young working population | 0.203 | 0.185 | 0.138 | 0.089 | 0.008 | 5.32 | 42.367 |
Senior working population | 0.669 | 0.672 | 0.483 | 0.128 | 0.016 | -0.145 | 2.91 |
Temporary contracts | 0.571 | 0.578 | 0.389 | 0.149 | 0.022 | 0.137 | 2.11 |
Unemployment rate | 0.513 | 0.498 | 0.554 | 0.168 | 0.028 | 0.386 | 2.985 |
Female unemployment | 0.541 | 0.548 | 0.5 | 0.095 | 0.009 | -0.555 | 13.152 |
Unemployment relative variation | -0.14 | -0.136 | -0.25 | 0.201 | 0.04 | 0.189 | 14.982 |
Youth unemployment | 0.21 | 0.155 | 0.078 | 0.165 | 0.027 | 1.386 | 5.902 |
Female youth unemployment | 0.443 | 0.455 | 0 | 0.205 | 0.042 | 0.022 | 4.467 |
Foreigners' unemployment | 0.373 | 0.344 | 0.286 | 0.177 | 0.031 | 0.711 | 3.733 |
Female work insertion | 0.171 | 0.154 | 0.171 | 0.093 | 0.009 | 5.764 | 47.984 |
Foreign intra-EU work insertion | 0.199 | 0.169 | 0.181 | 0.137 | 0.019 | 2.483 | 12.315 |
Foreign extra-EU work insertion | 0.21 | 0.163 | 0.155 | 0.156 | 0.024 | 2.407 | 10.157 |
**Income** | | | | | | | |
GDP per capita | 0.417 | 0.403 | 1 | 0.17 | 0.029 | 0.434 | 3.207 |
Number of tax declarations | 0.424 | 0.405 | 1 | 0.077 | 0.006 | 3.287 | 21.514 |
Tax base amount | 0.333 | 0.311 | 1 | 0.135 | 0.018 | 1.477 | 7.11 |
Taxable saving base | 0.374 | 0.401 | 1 | 0.14 | 0.02 | 0.722 | 5.099 |
Urban tax base per receipt | 0.123 | 0.088 | 1 | 0.132 | 0.018 | 3.757 | 21.708 |
Labour income | 0.685 | 0.693 | 1 | 0.134 | 0.018 | -0.408 | 3.107 |
Gross disposable income | 0.242 | 0.19 | 1 | 0.171 | 0.029 | 1.387 | 5.095 |
Families with Minimum Insertion Income | 0.213 | 0.152 | 1 | 0.182 | 0.033 | 1.636 | 5.892 |
**Living conditions** | | | | | | | |
Electricity consumption | 0.445 | 0.438 | 1 | 0.168 | 0.028 | 0.169 | 3.098 |
Sanitary infrastructure | 0.039 | 0.008 | 0.417 | 0.111 | 0.012 | 6.076 | 47.027 |
Water consumption | 0.008 | 0.002 | 1 | 0.076 | 0.006 | 12.987 | 170.431 |
Passenger cars | 0.689 | 0.74 | 0.783 | 0.211 | 0.044 | -1.866 | 6.213 |
Population dispersion | 0.049 | 0.013 | 1 | 0.106 | 0.011 | 5.234 | 40.462 |
Enrollment rate for basic education | 0.05 | 0.029 | 0 | 0.108 | 0.012 | 5.376 | 39.233 |
Students per teacher | 0.607 | 0.725 | 0 | 0.314 | 0.098 | -0.88 | 2.44 |
Students per school unit | 0.623 | 0.735 | 0 | 0.318 | 0.101 | -0.89 | 2.49 |
Public education | 0.718 | 0.936 | 1 | 0.356 | 0.127 | -1.037 | 2.647 |

## Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

## About this article

### Cite this article

Aracil, E., Diaz, E., Gómez-Bengoechea, G. *et al.* Regional Socioeconomic Assessments with a Genetic Algorithm: An Application on Income Inequality Across Municipalities.
*Soc Indic Res* **173**, 499–521 (2024). https://doi.org/10.1007/s11205-024-03345-4

DOI: https://doi.org/10.1007/s11205-024-03345-4