Soft skills are crucial for job candidates, and analyzing the soft skills listed in job ads can help identify those that recruiters value most. Such analysis benefits from a taxonomy that supports soft skill extraction. However, most prior work has focused on building hard skill taxonomies, and the methodologies developed for hard skills do not work well for soft skills because job ads use a wide variety of terminology to describe them. Moreover, prior work has mainly extracted soft skills from job ads using simple keyword search, which can miss the different forms in which soft skills are expressed. In this paper, we develop TaxoSoft, a methodology for building a soft skill taxonomy that uses DBpedia and Word2Vec to find terms related to different soft skills and social network analysis to organize these terms into a hierarchy. We use this methodology to build soft skill taxonomies in both English and French. We evaluate TaxoSoft on a sample of job ads and find that it achieves an F-score of 0.84, whereas taxonomies developed in prior work achieve an F-score of only 0.54. We then use the proposed methodology to analyze the soft skills listed in job ads and identify the skills most in demand in the American and Moroccan job markets. Our findings can offer universities insight into the soft skills most requested in the job market.
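As a rough illustration of the F-score reported above, the sketch below computes a set-based F-score for the skills extracted from a single job ad against a manually annotated gold set. The abstract does not describe the exact evaluation protocol, so the set-based matching, the function name f_score, and the example skills are all assumptions made for illustration only.

```python
def f_score(predicted, gold, beta=1.0):
    """Set-based F-score of predicted skills against gold annotations.

    Assumption: a predicted skill counts as correct only if it exactly
    matches a gold skill; the paper's matching rule may differ.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical data for one job ad (not taken from the paper).
gold = {"teamwork", "communication", "leadership", "adaptability"}
predicted = {"teamwork", "communication", "leadership", "creativity"}

score = f_score(predicted, gold)
print(round(score, 2))  # prints 0.75
```

Here three of the four predicted skills are correct and three of the four gold skills are found, so precision and recall are both 0.75 and the F-score is 0.75. Averaging such scores over a sample of ads is one plausible way to obtain a single figure, though the paper may aggregate differently.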
This work is supported in part by the United States Agency for International Development (USAID) under grant AID-OAAA-11-00012 and by a Google Africa PhD fellowship. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of USAID or Google. The authors would like to thank Mehdi Zakroum and Ibtissam Makdoun for useful comments and discussion.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Khaouja, I., Mezzour, G., Carley, K.M. et al. Building a soft skill taxonomy from job openings. Soc. Netw. Anal. Min. 9, 43 (2019). https://doi.org/10.1007/s13278-019-0583-9