Abstract
This paper gives insight into the effect of education and socio-economic factors on earnings in Pakistan using the data mining technique of classification and regression trees (CART). Labor force survey data are used. The predictors are education, gender, status, training, occupation, location of work, experience, age, and type of industry; monthly income is the dependent variable. For classification, income is divided into quintiles, and the income quintile is used as the dependent variable. Type of industry, education, age, and occupation are found to be significant variables in both the classification tree and the regression tree. The regression tree shows that type of industry, rather than education, is the most important variable, and that sex and education are the least important. The classification tree likewise shows that type of industry is the most significant variable affecting an individual's earnings, followed by age and occupation, with education the least important; the remaining predictors play no role in an individual's earnings.
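The quintile step and the idea of ranking predictors by how well they split the data can be illustrated with a minimal sketch. It is not the authors' code: the income values, the industry labels, and the helper names `quintile_labels`, `gini`, and `gini_decrease` are all hypothetical, and the use of Gini impurity as the split criterion is an assumption (it is the usual CART default, but the abstract does not say which criterion was used).

```python
from bisect import bisect_left
from collections import Counter
from statistics import quantiles


def quintile_labels(incomes):
    """Label each income 1..5 (1 = lowest fifth) using four cut points."""
    cuts = quantiles(incomes, n=5, method="inclusive")
    # bisect_left counts the cut points strictly below x.
    return [bisect_left(cuts, x) + 1 for x in incomes]


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(labels, goes_left):
    """Impurity reduction from a binary split given by boolean flags."""
    left = [y for y, flag in zip(labels, goes_left) if flag]
    right = [y for y, flag in zip(labels, goes_left) if not flag]
    n = len(labels)
    weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
    return gini(labels) - weighted


# Made-up monthly incomes and industry labels, for illustration only.
incomes = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
industry = ["agriculture"] * 4 + ["services"] * 6

labels = quintile_labels(incomes)
split = [ind == "agriculture" for ind in industry]

print(labels)
print(gini_decrease(labels, split))
```

A predictor whose best split yields a larger impurity decrease would rank higher in a tree's variable importance, which is the sense in which the abstract calls type of industry the most important variable.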
Copyright information
© 2017 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Younas, N., Asghar, Z., Qayyum, M., Khan, F. (2017). Education and Socio Economic Factors Impact on Earning for Pakistan - A Bigdata Analysis. In: Ferreira, J., Alam, M. (eds) Future Intelligent Vehicular Technologies. Future 5V 2016. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 185. Springer, Cham. https://doi.org/10.1007/978-3-319-51207-5_22
DOI: https://doi.org/10.1007/978-3-319-51207-5_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-51206-8
Online ISBN: 978-3-319-51207-5
eBook Packages: Computer Science, Computer Science (R0)