
Estimating software robustness in relation to input validation vulnerabilities using Bayesian networks

Published in: Software Quality Journal

Abstract

Estimating the robustness of software in the presence of invalid inputs has long been a challenging task, because developers often fail to take the necessary steps to validate inputs during the design and implementation of software. We propose a method for estimating the robustness of software with respect to input validation vulnerabilities using Bayesian networks. The proposed method runs on all program functions and/or methods: it calculates a robustness value from information on the presence of input validation code in each function, combined with the common weakness scores of known input validation vulnerabilities. In the case study, ten well-known software libraries implemented in JavaScript, chosen for their increasing popularity among software developers, are evaluated. Using our method, software development teams can track the changes made to software to handle invalid inputs.




Author information

Corresponding author: Tugkan Tuglular.

Appendices

Appendix 1. Descriptions of input validation vulnerabilities in the scope of work

CWE-79: Improper Neutralization of Input During Web Page Generation (“Cross-site Scripting”).

Description Summary.

The software does not neutralize or incorrectly neutralizes user-controllable input before it is placed in output that is used as a web page that is served to other users.

http://cwe.mitre.org/data/definitions/79.html

CWE-89: Improper Neutralization of Special Elements used in an SQL Command (“SQL Injection”).

Description Summary.

The software constructs all or part of an SQL command using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the intended SQL command when it is sent to a downstream component.

http://cwe.mitre.org/data/definitions/89.html

CWE-120: Buffer Copy without Checking Size of Input (“Classic Buffer Overflow”).

Description Summary.

The program copies an input buffer to an output buffer without verifying that the size of the input buffer is less than the size of the output buffer, leading to a buffer overflow.

http://cwe.mitre.org/data/definitions/120.html

CWE-22: Improper Limitation of a Pathname to a Restricted Directory (“Path Traversal”).

Description Summary.

The software uses external input to construct a pathname that is intended to identify a file or directory that is located underneath a restricted parent directory, but the software does not properly neutralize special elements within the pathname that can cause the pathname to resolve to a location that is outside of the restricted directory.

http://cwe.mitre.org/data/definitions/22.html

CWE-78: Improper Neutralization of Special Elements used in an OS Command (“OS Command Injection”).

Description Summary.

The software constructs all or part of an OS command using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the intended OS command when it is sent to a downstream component.

http://cwe.mitre.org/data/definitions/78.html

CWE-134: Uncontrolled Format String.

Description Summary.

The software uses externally-controlled format strings in printf-style functions, which can lead to buffer overflows or data representation problems.

https://cwe.mitre.org/data/definitions/134.html

Appendix 2. Input validation vulnerability examples

CWE-78: Improper Neutralization of Special Elements used in an OS Command (“OS Command Injection”).

An OS command injection using Node.js.

figure f

Another example of an OS Command injection.

figure g
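The listings above are rendered as figures; as a hedged sketch of the same pattern (the function names and the `ping` scenario are illustrative assumptions, not the paper's code), an unvalidated parameter concatenated into a shell command lets an attacker chain a second command:

```javascript
// Vulnerable pattern: in real code this string would be passed straight
// to child_process.exec(); building it from raw user input is the flaw.
function buildPingCommand(host) {
  return 'ping -c 1 ' + host;
}

// Attacker-supplied value: terminates the ping and chains a new command.
const payload = '127.0.0.1; cat /etc/passwd';
const cmd = buildPingCommand(payload);
// cmd is now "ping -c 1 127.0.0.1; cat /etc/passwd"

// Validated variant: accept only hostname/IP characters.
function buildPingCommandSafe(host) {
  if (!/^[A-Za-z0-9.-]+$/.test(host)) {
    throw new Error('invalid host');
  }
  return 'ping -c 1 ' + host;
}
```

The validating variant illustrates the kind of input validation code whose presence the proposed method checks for.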

CWE-89: Improper Neutralization of Special Elements used in an SQL Command (“SQL Injection”).

Assume that the user has defined a database with JavaScript.

figure h

The attacker can then inject SQL as follows:

figure i
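A minimal sketch of the injection (the `users` table and the function names are assumptions for illustration, not the paper's listing): concatenating the username into the query string lets a quoted payload rewrite the WHERE clause:

```javascript
// Vulnerable: user input is concatenated directly into the SQL string.
function findUserQuery(username) {
  return "SELECT * FROM users WHERE name = '" + username + "'";
}

const injected = findUserQuery("alice' OR '1'='1");
// The resulting query matches every row in the table:
// SELECT * FROM users WHERE name = 'alice' OR '1'='1'

// Validated variant: whitelist the characters a username may contain.
function findUserQuerySafe(username) {
  if (!/^[A-Za-z0-9_]+$/.test(username)) {
    throw new Error('invalid username');
  }
  return "SELECT * FROM users WHERE name = '" + username + "'";
}
```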

CWE-120: Buffer Copy without Checking Size of Input (“Classic Buffer Overflow”).

JavaScript heap spraying example (https://crypto.stanford.edu/cs155old/cs155-spring11/lectures/03-ctrl-hijack.pdf)

figure j
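The slide deck's listing is not reproduced here; the following is a scaled-down sketch of the heap-spraying idea it presents (the byte values and repeat counts are illustrative assumptions, far smaller than a real spray would use):

```javascript
// Heap-spray sketch: fill the heap with many copies of a block made of
// a long "NOP sled" followed by shellcode, so that a hijacked pointer
// is likely to land somewhere in a sled and slide into the payload.
const nopSled = '\u9090'.repeat(64); // stand-in for NOP-like bytes
const shellcode = '\u4141\u4242';    // stand-in for the payload bytes
const spray = [];
for (let i = 0; i < 100; i++) {
  spray.push(nopSled + shellcode);
}
```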

Appendix 3. Five JavaScript functions for Proof of Concept

Original JavaScript functions (in regular font) augmented with input validation code (in bold).

figure k
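Since the listing is rendered as a figure, here is a hedged sketch of the pattern it describes (the `divide` function is an illustrative assumption, not one of the paper's five functions): the guard clauses are the added input validation code, the final line is the original function body:

```javascript
function divide(a, b) {
  // Added input validation: reject non-numeric or out-of-range inputs.
  if (typeof a !== 'number' || typeof b !== 'number') {
    throw new TypeError('arguments must be numbers');
  }
  if (b === 0) {
    throw new RangeError('division by zero');
  }
  // Original function body.
  return a / b;
}
```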

Appendix 4. Effect of values at vulnerability node on robustness estimation

Different values for "not validated" and "contains no validation code" at the vulnerability nodes, and their effect on the robustness estimation:

| Node value | 0.90 | 0.91 | 0.92 | 0.93 | 0.94 | 0.95 | 0.96 | 0.97 | 0.98 | 0.99 | 1.00 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Robustness | 0.1004 | 0.0904 | 0.0804 | 0.0704 | 0.0604 | 0.0504 | 0.0404 | 0.0304 | 0.0204 | 0.0104 | 0.0004 |
| Format String Vulnerability | 0.90 | 0.91 | 0.92 | 0.93 | 0.94 | 0.95 | 0.96 | 0.97 | 0.98 | 0.99 | 1.00 |
| Path Traversal | 0.90 | 0.91 | 0.92 | 0.93 | 0.94 | 0.95 | 0.96 | 0.97 | 0.98 | 0.99 | 1.00 |
| XSS | 0.90 | 0.91 | 0.92 | 0.93 | 0.94 | 0.95 | 0.96 | 0.97 | 0.98 | 0.99 | 1.00 |
| SQL Injection | 0.90 | 0.91 | 0.92 | 0.93 | 0.94 | 0.95 | 0.96 | 0.97 | 0.98 | 0.99 | 1.00 |
| Buffer Overflow | 0.8979 | 0.9078 | 0.9178 | 0.9278 | 0.9377 | 0.9477 | 0.9577 | 0.9676 | 0.9776 | 0.9875 | 0.9975 |
| OS Command Injection | 0.90 | 0.91 | 0.92 | 0.93 | 0.94 | 0.95 | 0.96 | 0.97 | 0.98 | 0.99 | 1.00 |

  1. Since input validation code exists only for Buffer Overflow, its node values differ from those of the other vulnerabilities.

Appendix 5. Full Application Node Probabilistic Table

figure l

About this article

Cite this article

Ufuktepe, E., Tuglular, T. Estimating software robustness in relation to input validation vulnerabilities using Bayesian networks. Software Qual J 26, 455–489 (2018). https://doi.org/10.1007/s11219-017-9359-5
