Semantic Analysis for Identifying Security Concerns in Software Procurement Edicts
Abstract
Brazilian federal institutions must acquire software tools through procurement, so their software teams have to develop, verify, and audit the specifications to ensure that the edicts properly address software security risk concerns. This work presents the Automated Analyst of Edicts tool, which aids the analysis of a document by automatically identifying absent relationships between its sentences and concepts related to software security risks or weaknesses. Its performance was compared with that of software security experts in multi-label classification into five of the OWASP Top 10 risks. A specificity of over 80% was achieved when analyzing individual sentences for multiple risks, and a negative predictive value of 90% was obtained when the tool was applied to specific risk–sentence relationships.
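The two figures reported above are standard diagnostic metrics. As a rough illustration only, and not the authors' evaluation code, the sketch below computes specificity and negative predictive value for a single risk label over a set of sentences. The function name and the sample labels are hypothetical and chosen purely for demonstration.

```python
from typing import Sequence, Tuple


def specificity_and_npv(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (specificity, negative predictive value) for binary labels.

    1 means "sentence relates to the risk", 0 means "no relationship".
    """
    pairs = list(zip(y_true, y_pred))
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Specificity: share of truly unrelated sentences that were flagged as unrelated.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    # NPV: share of "unrelated" predictions that are actually unrelated.
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return specificity, npv


# Hypothetical expert labels and tool predictions for ten sentences and one risk.
expert = [0, 0, 0, 0, 1, 1, 0, 0, 1, 0]
tool = [0, 0, 1, 0, 1, 0, 0, 0, 1, 0]
spec, npv = specificity_and_npv(expert, tool)
print(f"specificity={spec:.3f}, npv={npv:.3f}")
```

A high negative predictive value matters in this setting because the tool reports absent relationships: when it says a sentence does not cover a risk, that claim should rarely be wrong.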
Keywords
Security risks, Natural language processing, Text mining