Abstract
Customizing program and communication features is a commonly adopted strategy for countering the security threats that arise from the rapid growth of software features. In this paper, we propose Hecate, a novel framework that leverages dynamic execution and tracing to create customized, self-contained programs in order to minimize the potential attack surface. Hecate automatically identifies program features (i.e., independent, well-contained operations, utilities, or capabilities) in application binaries and their communication functions, then tailors or eliminates those features to create customized program binaries in accordance with user needs, in a fully unsupervised fashion. Hecate makes novel use of deep learning to identify program features and their constituent functions by mapping dynamic instruction traces to functions in the binaries. This enables us to modularize program features and efficiently create customized program binaries at large scale. We implement a prototype of Hecate using a number of open-source tools, such as DynInst and TensorFlow. Evaluation on real-world executables, including OpenSSL and LibreOffice, demonstrates that Hecate can create a wide range of customized binaries for diverse feature requirements, with up to 96.28% accuracy for feature/function identification and up to 67% reduction of the program attack surface.
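To make the idea of mapping a dynamic instruction trace to functions concrete, the following is a minimal sketch and not Hecate's actual method: it replaces the deep-learning step with a plain address-range lookup, and the symbol table, addresses, function names, and the `reduction` measure are all hypothetical values chosen for illustration.

```python
from bisect import bisect_right


def map_trace_to_functions(trace, symbols):
    """Return the names of functions whose address range contains a traced address.

    trace:   iterable of instruction addresses (ints) recorded while one
             feature executes.
    symbols: list of (start_address, size, name) tuples; a hypothetical
             symbol table, not the output of any specific tool.
    """
    table = sorted(symbols)
    starts = [start for start, _, _ in table]
    touched = set()
    for addr in trace:
        # Find the last function starting at or before this address.
        i = bisect_right(starts, addr) - 1
        if i >= 0:
            start, size, name = table[i]
            if addr < start + size:
                touched.add(name)
    return touched


def reduction(all_functions, kept_functions):
    """Fraction of functions removed; a crude, assumed proxy for attack surface."""
    return 1.0 - len(kept_functions) / len(all_functions)


# Hypothetical binary layout and a trace for a single "encrypt" feature.
symbols = [
    (0x1000, 0x40, "parse_args"),
    (0x1040, 0x80, "encrypt"),
    (0x10C0, 0x60, "decrypt"),
    (0x1120, 0x20, "print_help"),
]
encrypt_trace = [0x1004, 0x1010, 0x1044, 0x10A0]

kept = map_trace_to_functions(encrypt_trace, symbols)
print(sorted(kept))
print(reduction([name for _, _, name in symbols], kept))
```

In this toy setting, a customized binary for the traced feature would keep only the functions in `kept`; a real system would also need to handle indirect calls, shared helpers, and code that the trace never exercised.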
Acknowledgments
This work was supported by the US Office of Naval Research (ONR) under Awards N00014-15-1-2210 and N00014-17-1-2786. Any opinions, findings, conclusions, or recommendations expressed in this article are those of the authors, and do not necessarily reflect those of ONR.
Copyright information
© 2019 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Xue, H., Chen, Y., Venkataramani, G., Lan, T. (2019). Hecate: Automated Customization of Program and Communication Features to Reduce Attack Surfaces. In: Chen, S., Choo, KK., Fu, X., Lou, W., Mohaisen, A. (eds) Security and Privacy in Communication Networks. SecureComm 2019. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 305. Springer, Cham. https://doi.org/10.1007/978-3-030-37231-6_17
DOI: https://doi.org/10.1007/978-3-030-37231-6_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37230-9
Online ISBN: 978-3-030-37231-6
eBook Packages: Computer Science, Computer Science (R0)