Using Crowdsourcing to Identify a Proxy of Socio-economic Status

  • Adil E. Rajput
  • Akila Sarirete
  • Tamer F. Desouky
Conference paper
Part of the Springer Proceedings in Complexity book series (SPCOM)


Social media provides researchers with an unprecedented opportunity to gain insight into various facets of human life. Researchers place great emphasis on pinpointing the socioeconomic status (SES) of individuals, as they can use it to predict numerous outcomes of interest. Crowdsourcing is a term that refers to gathering intelligence from an online user community. To group online users into a common conversation, researchers have made use of hashtags, which label users and user content with tags that can be easily searched. In this paper, we propose a mechanism to group users based on their geographic background and build a corpus for each such group. Specifically, we looked at online discussion forums for commercial vehicles, where the website has established forums for different geographic areas in which users share information, hold discussions, and provide additional information about the vehicle of interest. From such discussions, it was possible to glean the vocabulary that each group of users adheres to. We compared the corpora of different communities and noted the differences in their choice of language. This provided us with the groundwork for predicting a proxy of the SES of such communities. More work is underway to account for out-of-vocabulary (OOV) words and emojis and to assess the average score as special cases.
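
The corpus comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which comparison measure was used, so Jaccard overlap of distinct words is an assumption chosen for simplicity, and the region names, sample posts, and function names (`tokenize`, `build_vocabulary`, `jaccard_overlap`) are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract runs of letters and apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_vocabulary(posts):
    """Count how often each token appears across all posts of one community."""
    counts = Counter()
    for post in posts:
        counts.update(tokenize(post))
    return counts


def jaccard_overlap(vocab_a, vocab_b):
    """Share of distinct words common to both vocabularies (0 = disjoint, 1 = identical)."""
    words_a, words_b = set(vocab_a), set(vocab_b)
    union = words_a | words_b
    if not union:
        return 0.0
    return len(words_a & words_b) / len(union)


# Hypothetical posts standing in for two regional forums (not data from the paper).
forum_posts = {
    "region_a": ["The truck needs a new clutch", "Clutch replaced, truck runs fine"],
    "region_b": ["My lorry needs a new gearbox", "Gearbox fitted, lorry runs fine"],
}

vocabularies = {region: build_vocabulary(posts) for region, posts in forum_posts.items()}
overlap = jaccard_overlap(vocabularies["region_a"], vocabularies["region_b"])
only_a = sorted(set(vocabularies["region_a"]) - set(vocabularies["region_b"]))
print(f"overlap={overlap:.2f}", "words only in region_a:", only_a)
```

A low overlap, together with the list of words that appear in only one community, gives a first indication of how the communities differ in their choice of language. Any step from such differences to a proxy of SES would need a separate scoring method, which this sketch does not attempt.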



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Adil E. Rajput (1)
  • Akila Sarirete (1)
  • Tamer F. Desouky (1)
  1. College of Engineering, Effat University, Jeddah, Kingdom of Saudi Arabia
