Abstract
As licensed programs are pirated and illegally spread over the Internet, it is necessary to filter illegally distributed or cracked programs. Conventional software filtering systems prevent unauthorized dissemination of the programs registered in their databases by exact matching: a suspicious program is filtered only when its feature is identical to that of a program stored in the database. Consequently, these systems are limited in dealing with cracked programs and with new programs that are not in their databases. To address these limitations, we design and implement an efficient and intelligent software filtering system based on software similarity. Our system measures the similarity between the characteristics extracted from an original program and those extracted from a suspicious (possibly cracked) one, and then determines from that similarity measure whether the suspicious program is a cracked version of the copyrighted original. In addition, the proposed system can handle a new program by categorizing it with a machine learning scheme, which narrows the search space and thus helps identify an unknown program. To demonstrate the effectiveness of the proposed system, we perform a series of experiments on a number of executable programs under Microsoft Windows. The experimental results show that our system achieves comparable performance.
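The similarity-based decision described in the abstract can be illustrated with a minimal sketch. The abstract does not state which features or which similarity measure the system uses, so everything here is an assumption made for illustration: features are modeled as sets of imported API names, similarity is the Jaccard index, the threshold of 0.7 is arbitrary, and the names `jaccard`, `best_match`, `database`, and `candidates` are hypothetical. The optional `candidates` argument stands in for the category produced by the machine learning step, which restricts the comparison to a subset of the database.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard index |a & b| / |a | b| (assumed measure, not the paper's)."""
    if not a and not b:
        return 1.0  # two empty feature sets are treated as identical
    return len(a & b) / len(a | b)


def best_match(
    suspect: FrozenSet[str],
    database: Dict[str, FrozenSet[str]],
    threshold: float = 0.7,  # arbitrary illustrative cut-off
    candidates: Optional[Iterable[str]] = None,
) -> Tuple[Optional[str], float]:
    """Return (program name, score) for the most similar registered program,
    or (None, best score) when no score reaches the threshold.

    `candidates` optionally narrows the search to a subset of the database,
    e.g. the programs in the category predicted for the suspect.
    """
    names = database.keys() if candidates is None else candidates
    best_name, best_score = None, 0.0
    for name in names:
        score = jaccard(suspect, database[name])
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score


# Hypothetical feature sets: imported Windows API names per registered program.
database = {
    "editor": frozenset(
        {"CreateFileW", "ReadFile", "WriteFile", "RegOpenKeyExW", "MessageBoxW"}
    ),
    "player": frozenset(
        {"CreateFileW", "waveOutOpen", "waveOutWrite", "timeGetTime"}
    ),
}

# A modified copy of "editor" with one import removed: exact matching on the
# full feature set would miss it, while similarity matching still finds it.
cracked = frozenset({"CreateFileW", "ReadFile", "WriteFile", "RegOpenKeyExW"})

print(best_match(cracked, database))
print(best_match(cracked, database, candidates=["player"]))
```

In this toy data the modified copy shares four of the five features of its original, so it clears the threshold even though its feature set is no longer identical. Restricting the search to the wrong category shows why the categorization step matters: it reduces the number of comparisons, but a misclassified program can no longer be matched.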
Acknowledgments
This research was supported by the MSIP (Ministry of Science, ICT and Future Planning), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2015-H8501-15-1012) supervised by the IITP (Institute for Information & communications Technology Promotion), and by the Ministry of Culture, Sports and Tourism (MCST) and the Korea Copyright Commission in 2014.
Additional information
Communicated by A. Jara, M. R. Ogiela, I. You and F.-Y. Leu.
Cite this article
Kim, D., Kim, Y., Cho, Sj. et al. An effective and intelligent Windows application filtering system using software similarity. Soft Comput 20, 1821–1827 (2016). https://doi.org/10.1007/s00500-015-1678-5
DOI: https://doi.org/10.1007/s00500-015-1678-5