Decomposition of Inequality of Opportunity in India: An Application of Data-Driven Machine Learning Approach

Mehta, Balwant Singh; Dhote, Siddharth; Srivastava, Ravi

doi:10.1007/s41027-023-00446-5

ARTICLE
Published: 20 July 2023

Volume 66, pages 439–469, (2023)

The Indian Journal of Labour Economics

Balwant Singh Mehta¹,
Siddharth Dhote¹ &
Ravi Srivastava¹

Abstract

This paper introduces a novel measure of inequality of opportunity (IOp) in India by comparing both ex-ante and ex-post results, which aligns with Roemer’s (1998) theory of equality of opportunity. The study uses data-driven machine learning algorithms, namely the conditional inference tree and the conditional inference forest, to measure ex-ante IOp, and a transformation tree to estimate ex-post IOp. The findings indicate that, according to the ex-ante approach, approximately 58–61 percent of overall income inequality can be attributed to variations in circumstances, while around 46 percent of overall income inequality is explained by differences in the degree of effort. The results from the tree-based analysis reveal that parents’ occupation, sector (rural–urban areas), and geographical region are the primary circumstances contributing to IOp, which is further confirmed by the Shapley decomposition exercise. Specifically, individuals residing in rural areas in the eastern and central parts of the country, whose parents are employed in low-skilled and unskilled occupations and have below-secondary or no formal education, and who belong to marginalized social groups, exhibit significantly lower average income. Consequently, it is crucial to implement regional-level development policies that specifically target marginalized groups in order to foster a more equitable society and mitigate overall income inequality.
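
As an illustration of the ex-ante idea summarized above, the following minimal sketch replaces each income by the mean income of its circumstance type and compares the inequality of that smoothed distribution with total inequality. The Gini coefficient, the function names, and the toy data are assumptions made for this example; the abstract does not state which inequality index the authors use, and in the paper the types would come from the terminal nodes of a conditional inference tree rather than from a hand-written label.

```python
from collections import defaultdict


def gini(values):
    """Gini coefficient of a list of non-negative incomes."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted formula with ranks 1..n on the sorted incomes.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def ex_ante_iop(incomes, types):
    """Return (absolute, relative) ex-ante IOp.

    Each income is replaced by the mean income of its type. Inequality of
    this smoothed distribution is the absolute IOp; its ratio to total
    inequality is the relative IOp.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for y, t in zip(incomes, types):
        sums[t] += y
        counts[t] += 1
    smoothed = [sums[t] / counts[t] for t in types]
    absolute = gini(smoothed)
    total = gini(incomes)
    relative = absolute / total if total > 0 else 0.0
    return absolute, relative


# Hypothetical data: four people in two circumstance types.
incomes = [10.0, 20.0, 30.0, 60.0]
types = ["rural", "rural", "urban", "urban"]
absolute, relative = ex_ante_iop(incomes, types)
print(round(absolute, 4), round(relative, 4))
```

In this toy case the type means are 15 and 45, so three quarters of the measured inequality lies between types.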

Notes

The Bernstein polynomial is widely used in computer graphics to model smooth curves (Farouki 2012). It outperforms competitors, such as kernel estimators, in approximating distribution functions (Leblanc 2012).
A conditional distribution function (CDF) is a function of the form \(P_r(Y=j \mid X=x_o)\), i.e., the probability that Y equals j for a given value of X (James et al. 2013, p. 37). A type-specific ECDF (empirical cumulative distribution function), as used in Brunori et al. (2023), describes the probability distribution of a random variable given certain conditions. In the context of this paper, the ECDFs describe the probability distribution of the MPCI given a circumstance type.
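
A type-specific ECDF of the kind described in this note can be sketched as follows. This is only an illustrative example: the function name `type_ecdfs`, the type labels, and the MPCI values are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right
from collections import defaultdict


def type_ecdfs(incomes, types):
    """Build one empirical CDF per circumstance type.

    Returns a dict mapping each type to a function F, where F(y) is the
    share of that type's observations with income <= y.
    """
    by_type = defaultdict(list)
    for y, t in zip(incomes, types):
        by_type[t].append(y)

    def make_ecdf(sorted_values):
        n = len(sorted_values)
        return lambda y: bisect_right(sorted_values, y) / n

    return {t: make_ecdf(sorted(vals)) for t, vals in by_type.items()}


# Hypothetical MPCI values for two circumstance types.
mpci = [800, 1200, 1500, 2500, 3000, 4000]
types = ["A", "A", "A", "B", "B", "B"]
ecdf = type_ecdfs(mpci, types)
share_a = ecdf["A"](1200)  # share of type A with MPCI <= 1200
share_b = ecdf["B"](1200)  # share of type B with MPCI <= 1200
```

Comparing the two functions at the same income level shows how far apart the type-specific distributions are at that point.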

References

Anand, I., and A. Thampi. 2016. Recent trends in wealth inequality in India. Economic and Political Weekly 51 (50). https://www.epw.in/journal/2016/50/special-articles/recent-trends-wealth-inequality-india.html
Arneson, R. 1989. Equality and equal opportunity for welfare. Philosophical Studies 56 (1): 77–93.
Asadullah, M.N., and G. Yalonetzky. 2012. Inequality of educational opportunity in India: Changes over time and across states. World Development 40 (6): 1151–1163. https://doi.org/10.1016/j.worlddev.2011.11.008.
Bourguignon, F. 2004. The poverty-growth-inequality triangle. Indian Council for Research on International Economic Relations Working Paper, 131, 35.
Brunori, P., and G. Neidhöfer. 2020. The evolution of inequality of opportunity in Germany: A machine learning approach. 53.
Brunori, P., and G. Neidhöfer. 2021b. The evolution of inequality of opportunity in Germany: A machine learning approach. Review of Income and Wealth 67 (4): 900–927. https://doi.org/10.1111/roiw.12502.
Brunori, P., P. Hufe, and D.G. Mahler. 2018. The Roots of Inequality: Estimating Inequality of Opportunity from Regression Trees. https://doi.org/10.1596/1813-9450-8349.
Brunori, P., F. Palmisano, and V. Peragine. 2019. Inequality of opportunity in sub-Saharan Africa. Applied Economics 51 (60): 6428–6458. https://doi.org/10.1080/00036846.2019.1619018.
Brunori, P., and G. Neidhöfer. 2021a. Inequality of opportunity in comparative perspective: Recent advances and challenges. In Handbook of income distribution (Vol. 3B, pp. 1393–1479). Elsevier.
Brunori, P., F. Ferreira, and G. Neidhöfer. 2023. Inequality of opportunity and intergenerational persistence in Latin America. UNU-WIDER Working Paper. http://www.wider.unu.edu/publication/inequality-opportunity-and-intergenerational-persistence-latin-america
Chancel, L., and T. Piketty. 2017. Indian income inequality, 1922–2015: From British Raj to billionaire Raj. Working Paper Series No. 2007/11. Paris: World Inequality Database (WID).
Chancel, L., and T. Piketty. 2019. Indian income inequality, 1922–2015: From British Raj to Billionaire Raj? Review of Income and Wealth 65 (S1): S33–S62. https://doi.org/10.1111/roiw.12439.
Chancel, L., T. Piketty, E. Saez, and G. Zucman. 2022. World inequality report 2022. UNDP, World Inequality Lab.
Checchi, D., and V. Peragine. 2010. Inequality of opportunity in Italy. The Journal of Economic Inequality 8 (4): 429–450. https://doi.org/10.1007/s10888-009-9118-3.
Choudhary, A., G.T. Muthukkumaran, and A. Singh. 2019. Inequality of Opportunity in Indian Women. Social Indicators Research 145 (1): 389–413. https://doi.org/10.1007/s11205-019-02097-w.
Cohen, G.A. 1989. On the currency of egalitarian justice. Ethics 99 (4): 906–944.
Das, P., and S. Biswas. 2022. Social identity, gender and unequal opportunity of earning in urban India: 2017–2018 to 2019–2020. Indian Journal of Labour Economics 65 (1): 39–57.
Deaton, A., and A.A. Stone. 2013. Grandpa and the snapper: The wellbeing of the elderly who live with children. Working Paper No. 19100. National Bureau of Economic Research. https://doi.org/10.3386/w19100
Deshpande, A., D. Goel, and S. Khanna. 2018. Bad Karma or Discrimination? Male-Female Wage Gaps Among Salaried Workers in India. World Development 102: 331–344. https://doi.org/10.1016/j.worlddev.2017.07.012.
Dworkin, R. 1981a. What is equality? Part 1: Equality of welfare. Philosophy & Public Affairs 10 (3): 185–246.
Dworkin, R. 1981b. What is equality? Part 2: Equality of resources. Philosophy & Public Affairs 10 (4): 283–345.
Farouki, R.T. 2012. The Bernstein Polynomial Basis: A Centennial Retrospective. Computer Aided Geometric Design 29: 379–419.
Ferreira, F.H.G., and J. Gignoux. 2011. The measurement of inequality of opportunity: Theory and an application to Latin America. Review of Income and Wealth 57 (4): 622–657. https://doi.org/10.1111/j.1475-4991.2011.00467.x.
Ferreira, F.H.G., and J. Gignoux. 2014. The Measurement of Educational Inequality: Achievement and Opportunity. The World Bank Economic Review 28 (2): 210–246. https://doi.org/10.1093/wber/lht004.
Ferreira, F., and V. Peragine. 2015. Equality of Opportunity: Theory and Evidence. IZA Discussion Paper No. 8994.
Fleurbaey, M. 1995. Equal opportunity or equal social outcome? Economics and Philosophy 11 (1): 25–55.
Fleurbaey, M. 2008. Fairness, responsibility, and welfare. Oxford University Press.
Fleurbaey, M., and V. Peragine. 2013. Ex post inequalities and ex ante inequalities. In Justice, political liberalism, and utilitarianism: Themes from Harsanyi and Rawls, 59–77. Cambridge University Press.
Ghose, A.K. 2021. Structural change and development in India. Indian Journal of Human Development 15 (1): 7–29. https://doi.org/10.1177/09737030211005496.
Hothorn, T., and A. Zeileis. 2015. partykit: A modular toolkit for recursive partytioning in R. Journal of Machine Learning Research 16 (118): 3905–3909.
Hothorn, T., and A. Zeileis. 2021. Predictive distribution modeling using transformation forests. Journal of Computational and Graphical Statistics 30 (4): 1181–1196. https://doi.org/10.1080/10618600.2021.1872581.
Hothorn, T., K. Hornik, and A. Zeileis. 2006. Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics 15 (3): 651–674. https://doi.org/10.1198/106186006X133933.
Hothorn, T. 2018. Video: Torsten Hothorn. Transformation forests. http://www.birs.ca/events/2018/5-day-workshops/18w5054/videos/watch/201801161507-Hothorn.html
Hufe, P., A. Peichl, J. Roemer, and M. Ungerer. 2017. Inequality of income acquisition: The role of childhood circumstances. Social Choice and Welfare 49 (3/4): 499–544.
Institute for Human Development, New Delhi. 2014. India Labour and Employment Report, 2014: Workers in the Era of Globalization. Academic Foundation.
James, G., D. Witten, T. Hastie, and R. Tibshirani. 2013. An Introduction to Statistical Learning: With Applications in R. Springer. https://link.springer.com/book/10.1007/978-1-0716-1418-1
Kuznets, S. 1955. Economic growth and income inequality. The American Economic Review 45 (1): 1–28.
Leblanc, A. 2012. On estimating distribution functions using Bernstein polynomials. Annals of the Institute of Statistical Mathematics 64 (5): 919–943.
Lefranc, A., and T. Kundu. n.d. Inequality of opportunity in Indian society. 37.
Motiram, S. 2018. Inequality of Opportunity in India: Concepts, Measurement and Empirics. Indian Journal of Human Development 12 (2): 236–247. https://doi.org/10.1177/0973703018792646.
Piketty, T. 2011. On the long-run evolution of inheritance: France 1820–2050. The Quarterly Journal of Economics 126 (3): 1071–1131.
Piketty, T., and G. Zucman. 2015. Wealth and inheritance in the long run. In Handbook of income distribution, vol. 2, ed. A.B. Atkinson and F. Bourguignon, 1303–1368. Elsevier. https://doi.org/10.1016/B978-0-444-59429-7.00016-9.
Plassot, M., X. Ramos, and D. Van de Gaer. 2022. The ex-ante and ex-post measurement of inequality of opportunity: A normative framework. Review of Income and Wealth 68 (1): 4–31.
Ramos, X., and D. Van de gaer. 2016. Approaches to Inequality of Opportunity: Principles, Measures and Evidence. Journal of Economic Surveys 30 (5): 855–883.
Ramos, X., and D. Van de gaer. 2021. Is Inequality of Opportunity Robust to the Measurement Approach? Review of Income and Wealth 67 (1): 18–36. https://doi.org/10.1111/roiw.12448.
Ravallion, M., and S. Chen. 2003. Measuring pro-poor growth. Economics Letters 78 (1): 93–99. https://doi.org/10.1016/S0165-1765(02)00205-7.
Rawls, J. 1958b. Justice as fairness. The Philosophical Review 67 (2): 164–194.
Rawls, J. 1958a. A theory of justice: Original edition. In A theory of justice. Harvard University Press. https://doi.org/10.4159/9780674042605
Rawls, J. 1971. A Theory of Justice: Original Edition. Harvard University Press. https://doi.org/10.2307/j.ctvjf9z6v
Roemer, J.E. 1993. A pragmatic theory of responsibility for the egalitarian planner. Philosophy & Public Affairs 22 (2): 146–166.
Roemer, J.E. 1998. Equality of opportunity. Harvard University Press, Cambridge. https://doi.org/10.1057/978-1-349-95121-5_2223-1.
Roemer, J.E. 2002. Equality of opportunity: A progress report. Social Choice and Welfare 19 (2): 455–471.
Roemer, J.E., and A. Trannoy. 2016. Equality of opportunity: Theory and measurement. Journal of Economic Literature 54 (4): 1288–1332. https://doi.org/10.1257/jel.20151206.
Salas-Rojo, P., and J.G. Rodríguez. 2022. Inheritances and wealth inequality: A machine learning approach. The Journal of Economic Inequality 20 (1): 27–51.
Sen, A. 1980. Equality of what? In Tanner lectures on human values (Vol. 1). Cambridge University Press.
Shapley, L.S. 1953. A value for n-person games. Princeton: Princeton University Press.
Singh, A. 2012. Inequality of Opportunity in India. 26.
Tagade, N., A. Naik, and S. Thorat. 2018. Wealth Ownership and Inequality in India: A Socio-religious Analysis. Journal of Social Inclusion Studies 4 (2). https://journals.sagepub.com/doi/abs/10.1177/2394481118808107
Van de gaer, D. 1993. Equality of Opportunity and Investment in Human Capital. Katholieke Universiteit Leuven.
Weisskopf, T.E. 2011. Why Worry about Inequality in the Booming Indian Economy? Economic and Political Weekly 46 (47): 41–51.
Wendelspiess, F., and I. Soloaga. 2014. Iop: Estimating ex-ante inequality of opportunity. The Stata Journal 14 (4): 830–846.

Acknowledgements

The authors are grateful to Prof S Madeshwaran, Prof Arup Maitra, Dr Pedro Salas-Rojo, and the audience at the 63rd ISLE conference for their valuable comments and suggestions on the draft paper.

Funding

This paper is the outcome of a study titled ‘Inequalitrees—A Novel Look at Socio-Economic Inequalities using Machine Learning Techniques and Integrated Data Sources’ funded by the Volkswagen Stiftung, Germany. Authors gratefully acknowledge the financial support received from the VW-Stiftung.

Author information

Authors and Affiliations

Centre for Employment Studies, Institute for Human Development, New Delhi, Delhi, India
Balwant Singh Mehta, Siddharth Dhote & Ravi Srivastava

Corresponding author

Correspondence to Balwant Singh Mehta.

Ethics declarations

Conflict of interest

The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Sample Selection and Construction of the Variables

Variable Selection

Six variables were selected from the PLFS 2018–19. Three of them, namely sector, caste, and gender, are used in their existing form, while the other three, namely state, parents’ education, and parents’ occupation, were recoded into new variables. Sector is categorized as rural or urban; gender as male or female; and caste as General Caste (GEN), Scheduled Caste (SC), Scheduled Tribe (ST), or Other Backward Classes (OBC). Transgender respondents were dropped from the gender variable before the analysis.

The state variable, which covers the 36 states/union territories of India, has been reclassified into the following six broad geographical regions:

1. North: Jammu and Kashmir, Himachal Pradesh, Punjab, and Haryana
2. East: Bihar, Jharkhand, Orissa, and West Bengal
3. Central: Uttar Pradesh, Rajasthan, Madhya Pradesh, Uttarakhand, and Chhattisgarh
4. North-East: Sikkim, Arunachal Pradesh, Assam, Nagaland, Meghalaya, Manipur, Mizoram, and Tripura
5. South: Karnataka, Andhra Pradesh, Tamil Nadu, Pondicherry, Kerala, and Lakshadweep
6. West: Gujarat, Daman and Diu, Dadra and Nagar Haveli, Maharashtra, and Goa
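
The regional grouping listed above can be written as a simple lookup table. The sketch below is illustrative only: the names `REGIONS`, `STATE_TO_REGION`, and `region_of` are hypothetical, and states or union territories that the list does not name return `None` rather than a guessed region.

```python
# Region membership exactly as listed in the appendix.
REGIONS = {
    "North": ["Jammu and Kashmir", "Himachal Pradesh", "Punjab", "Haryana"],
    "East": ["Bihar", "Jharkhand", "Orissa", "West Bengal"],
    "Central": ["Uttar Pradesh", "Rajasthan", "Madhya Pradesh",
                "Uttarakhand", "Chhattisgarh"],
    "North-East": ["Sikkim", "Arunachal Pradesh", "Assam", "Nagaland",
                   "Meghalaya", "Manipur", "Mizoram", "Tripura"],
    "South": ["Karnataka", "Andhra Pradesh", "Tamil Nadu", "Pondicherry",
              "Kerala", "Lakshadweep"],
    "West": ["Gujarat", "Daman and Diu", "Dadra and Nagar Haveli",
             "Maharashtra", "Goa"],
}

# Invert the table so each state points to its region.
STATE_TO_REGION = {
    state: region for region, states in REGIONS.items() for state in states
}


def region_of(state):
    """Return the broad region for a state, or None if it is not listed."""
    return STATE_TO_REGION.get(state)
```

For example, `region_of("Bihar")` gives `"East"`, while a name absent from the list gives `None`.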

The education variable is classified into the following four broad categories:

1. Illiterate or no education: (code 1: Illiterate)
2. Below secondary: (code 2–7, literate to up to middle school)
3. Secondary and above secondary: (code 8–10, secondary to higher secondary)
4. Graduate and above: (code 12–13, graduate and post-graduate)

The occupation/skill level is classified into the following four broad categories using the NCO (National Classification of Occupations) at the one-digit level (as per OECD Employment Outlook 2014; NCO 2015, Ministry of Labour and Employment, Government of India):

1. Unskilled or routine manual task: Typically involves the performance of simple and routine physical or manual tasks (NCO code 9: elementary or unskilled occupations, such as domestic helpers, cleaners, street vendors, and garbage collectors).

2. Low-skilled or non-routine manual task: Typically involves the performance of tasks such as operating machinery and electronic equipment, driving vehicles, maintenance and repair of electrical and mechanical equipment, and manipulation, ordering, and storage of information (NCO codes 4–8: low-skilled occupations such as clerical jobs, service workers, shop and market sales workers, craft and related trade workers, etc.).

3. Medium-skilled or non-routine cognitive task: Typically involves the performance of complex technical and practical tasks that require an extensive body of factual, technical, and procedural knowledge in a specialized field (NCO code 3: professional and technical associates).

4. High-skilled or cognitive task: Typically involves the performance of tasks that require complex problem solving, decision making, and creativity based on an extensive body of theoretical and factual knowledge in a specialized field (NCO code 2: professionals and technicians).

The concept of skill level is not applied to NCO code 1 (legislators, managers, etc.), as the skills required for the tasks and duties of these occupations vary to such an extent that it was not feasible to link them with any of the four broad skill levels.
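
The education and occupation recodings described above could be expressed as two small functions. This is a minimal sketch under stated assumptions: the function names are hypothetical, the education codes are taken to be integers, and the NCO skill level is read from the first digit of the code. Values that the lists do not assign (education code 11, NCO code 1) return `None`.

```python
def education_category(code):
    """Map an education code to one of the four broad categories."""
    if code == 1:
        return "Illiterate or no education"
    if 2 <= code <= 7:
        return "Below secondary"
    if 8 <= code <= 10:
        return "Secondary and above secondary"
    if code in (12, 13):
        return "Graduate and above"
    return None  # e.g. code 11, which the list does not assign


def skill_level(nco_code):
    """Map an NCO code to a skill level using its first digit."""
    division = int(str(nco_code)[0])
    if division == 9:
        return "Unskilled"
    if 4 <= division <= 8:
        return "Low-skilled"
    if division == 3:
        return "Medium-skilled"
    if division == 2:
        return "High-skilled"
    return None  # NCO code 1 (legislators, managers): no skill level applied
```

Returning `None` for unassigned codes keeps those cases visible instead of silently folding them into a neighbouring category.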

Sample Selection

The sample was selected using the following multi-stage procedure.

In the first stage, the parent of each respondent was identified using the relation-to-head variable in the data. For an individual identified as self (code 1), the household member with code 7 (labeled Father/Mother/Father-in-Law/Mother-in-Law) was treated as the parent, and the first data set of children and parents was prepared.

In the second stage, individuals recorded as unmarried children (code 5) and married children (code 3) were identified, and the household head labeled self (code 1) was identified as their parent. This yielded the second data set of children and parents.

In case of duplicate records (or multiple parental information), we deleted the duplicate cases after carefully examining the unit records. Once both files were cleaned, we merged the two files, together with the key variables discussed above.
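
The two linking stages can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' procedure: the field names (`hh_id`, `person_id`, `relation`) are hypothetical, the spouse code 2 in the toy data is used only for the example, and duplicates are resolved by keeping the first parent found, whereas the authors inspected the unit records by hand.

```python
from collections import defaultdict

# Relation-to-head codes named in the text.
SELF, MARRIED_CHILD, UNMARRIED_CHILD, PARENT = 1, 3, 5, 7


def link_children_to_parents(members):
    """Return (hh_id, child_id, parent_id) triples, one per child."""
    households = defaultdict(list)
    for m in members:
        households[m["hh_id"]].append(m)

    pairs = {}
    for hh_id, hh in households.items():
        heads = [m for m in hh if m["relation"] == SELF]
        elders = [m for m in hh if m["relation"] == PARENT]
        children = [m for m in hh
                    if m["relation"] in (MARRIED_CHILD, UNMARRIED_CHILD)]
        # Stage 1: the head is the child; a code-7 member is the parent.
        for head in heads:
            for elder in elders:
                # setdefault keeps only the first parent (simplification).
                pairs.setdefault((hh_id, head["person_id"]), elder["person_id"])
        # Stage 2: code-3 and code-5 members are children of the head.
        for child in children:
            for head in heads:
                pairs.setdefault((hh_id, child["person_id"]), head["person_id"])
    return [(hh, child, parent) for (hh, child), parent in pairs.items()]


# Hypothetical household rosters.
roster = [
    {"hh_id": 1, "person_id": 1, "relation": 1},  # head
    {"hh_id": 1, "person_id": 2, "relation": 7},  # head's parent
    {"hh_id": 1, "person_id": 3, "relation": 5},  # unmarried child
    {"hh_id": 1, "person_id": 4, "relation": 2},  # spouse (not linked)
    {"hh_id": 2, "person_id": 1, "relation": 1},  # head
    {"hh_id": 2, "person_id": 2, "relation": 7},  # first parent (kept)
    {"hh_id": 2, "person_id": 3, "relation": 7},  # second parent (dropped)
]
links = link_children_to_parents(roster)
```

In household 1 the head is linked to person 2 and the unmarried child to the head; in household 2 only the first of the two code-7 members is kept.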

Appendix 2: Grid Search CV Process for Conditional Inference Tree and Conditional Inference Forest

In the Grid Search CV process, the data are divided into training and test sets. Different combinations of min-split (the minimum number of observations required to perform a split) and alpha values are tested, and the combination that yields the lowest root mean squared error (RMSE) on the test set is selected. The RMSE is a measure of the model's prediction accuracy. The Grid Search CV was conducted for the conditional inference tree model with MPCI as the dependent variable. After evaluating various combinations, an alpha value of 0.07 and a min-split value of 10,000 provide the lowest RMSE. The robustness of the endogenously chosen alpha is examined by comparing the results with alpha values of 0.01 and 0.05, as given in Table A.2.1.

Table A.2.1 Ctree results MPCI

This comparison follows the approach outlined by Salas-Rojo and Rodríguez (2022).
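
The tuning loop described in this appendix can be sketched as follows. This is a toy illustration, not the authors' implementation: `fit_stump` is a hypothetical one-split stand-in for a conditional inference tree, the significance test is a simple normal approximation rather than the permutation test used by conditional inference trees, and the data are simulated.

```python
import math
import random
from itertools import product


def mean(xs):
    return sum(xs) / len(xs)


def two_sample_p_value(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    var_a = sum((x - mean(a)) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean(b)) ** 2 for x in b) / (len(b) - 1)
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    if se == 0:
        return 1.0 if mean(a) == mean(b) else 0.0
    z = abs(mean(a) - mean(b)) / se
    return math.erfc(z / math.sqrt(2))


def fit_stump(train, alpha, min_split):
    """Split once on a binary circumstance if the node has at least
    min_split observations and the test rejects equality at level alpha."""
    overall = mean([y for _, y in train])
    left = [y for c, y in train if c == 0]
    right = [y for c, y in train if c == 1]
    can_split = (len(train) >= min_split and len(left) > 1
                 and len(right) > 1
                 and two_sample_p_value(left, right) < alpha)
    if not can_split:
        return lambda c: overall
    means = {0: mean(left), 1: mean(right)}
    return lambda c: means[c]


def rmse(model, data):
    return math.sqrt(mean([(y - model(c)) ** 2 for c, y in data]))


def grid_search(train, test, alphas, min_splits):
    """Return (rmse, alpha, min_split) with the lowest test-set RMSE."""
    results = []
    for alpha, min_split in product(alphas, min_splits):
        model = fit_stump(train, alpha, min_split)
        results.append((rmse(model, test), alpha, min_split))
    return min(results)  # ties go to the smaller alpha, then min_split


# Simulated data: income depends on one binary circumstance plus noise.
rng = random.Random(0)
data = [(c, 100 + 50 * c + rng.gauss(0, 10))
        for c in (0, 1) for _ in range(200)]
rng.shuffle(data)
train, test = data[:300], data[300:]
best_rmse, best_alpha, best_min_split = grid_search(
    train, test, alphas=[0.01, 0.05, 0.07], min_splits=[50, 500])
```

With 300 training observations, a min-split of 500 blocks the split entirely, so the search selects the smaller min-split value, which mirrors how an overly large min-split forces a coarser tree.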

Similarly, for the conditional inference forest model, an alpha value of 0.06 and 200 trees provide the lowest RMSE after evaluating various combinations. The robustness of the endogenously chosen alpha is examined by comparing the results with alpha values of 0.01 and 0.05, as given in Table A.2.2.

Table A.2.2 Cforest results MPCI

Appendix 3: Plots for MPCI

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mehta, B.S., Dhote, S. & Srivastava, R. Decomposition of Inequality of Opportunity in India: An Application of Data-Driven Machine Learning Approach. Ind. J. Labour Econ. 66, 439–469 (2023). https://doi.org/10.1007/s41027-023-00446-5

Accepted: 02 June 2023
Published: 20 July 2023
Issue Date: June 2023
DOI: https://doi.org/10.1007/s41027-023-00446-5
