Abstract
Online vulnerability databases provide a wealth of information about vulnerabilities in application software, operating systems, and firmware. Extracting this information in a form that applications such as vulnerability scanners and security monitoring tools can use is a challenging task. This paper presents two approaches to information extraction from online vulnerability databases: a machine-learning-based solution and a solution that exploits linguistic patterns revealed by part-of-speech tagging. The two systems are evaluated to compare their accuracy in recognizing security concepts in previously unseen vulnerability descriptions. We discuss design considerations that should be taken into account when implementing information retrieval systems for the security domain.
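To make the pattern-based idea concrete, the following is a minimal sketch and not the system described in the paper. It assumes a vulnerability description has already been part-of-speech tagged; the Penn Treebank style tags are assigned by hand here, and the pattern (a run of proper nouns optionally closed by a number, read as a software name and version) is a hypothetical illustration. The function names and the precision/recall scoring are likewise assumptions chosen for the example.

```python
from typing import Iterable, List, Tuple

Tagged = List[Tuple[str, str]]  # (token, part-of-speech tag) pairs


def extract_software_mentions(tagged: Tagged) -> List[str]:
    """Collect runs of proper nouns (NNP), optionally closed by a number (CD).

    Hypothetical pattern for illustration only.
    """
    mentions: List[str] = []
    run: List[str] = []
    for token, tag in tagged:
        if tag == "NNP":
            run.append(token)
            continue
        if run:
            if tag == "CD":  # treat a trailing number as a version string
                run.append(token)
            mentions.append(" ".join(run))
            run = []
    if run:  # flush a run that reaches the end of the sentence
        mentions.append(" ".join(run))
    return mentions


def precision_recall_f1(predicted: Iterable[str], gold: Iterable[str]):
    """Score predicted mentions against hand-labelled gold mentions."""
    predicted_set, gold_set = set(predicted), set(gold)
    true_positives = len(predicted_set & gold_set)
    precision = true_positives / len(predicted_set) if predicted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hand-tagged example sentence (tags are assumptions, not tagger output).
sentence: Tagged = [
    ("Buffer", "NN"), ("overflow", "NN"), ("in", "IN"),
    ("Adobe", "NNP"), ("Reader", "NNP"), ("9.1", "CD"),
    ("allows", "VBZ"), ("remote", "JJ"), ("attackers", "NNS"),
    ("to", "TO"), ("execute", "VB"), ("arbitrary", "JJ"),
    ("code", "NN"), (".", "."),
]

found = extract_software_mentions(sentence)
# Hypothetical gold labels: one software mention, one attacker mention.
scores = precision_recall_f1(found, ["Adobe Reader 9.1", "remote attackers"])
print(found, scores)
```

A single hand-written pattern like this finds the software mention but misses the attacker concept, which is the kind of gap a comparison against a learned tagger is meant to expose.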
This material is based upon work supported by the National Science Foundation under Grant No. 0905232.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Weerawardhana, S., Mukherjee, S., Ray, I., Howe, A. (2015). Automated Extraction of Vulnerability Information for Home Computer Security. In: Cuppens, F., Garcia-Alfaro, J., Zincir Heywood, N., Fong, P. (eds) Foundations and Practice of Security. FPS 2014. Lecture Notes in Computer Science, vol 8930. Springer, Cham. https://doi.org/10.1007/978-3-319-17040-4_24
DOI: https://doi.org/10.1007/978-3-319-17040-4_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-17039-8
Online ISBN: 978-3-319-17040-4
eBook Packages: Computer Science, Computer Science (R0)