An oversampling approach for mining program specifications

Frontiers of Information Technology & Electronic Engineering

Abstract

Automatic protocol mining is a promising approach for inferring accurate and complete API protocols. However, as with any data-mining technique, it requires sufficient training data (object usage scenarios). Existing approaches address this problem by analyzing more programs, which may cause significant runtime overhead. In this paper, we propose an inheritance-based oversampling approach for object usage scenarios (OUSs). Our technique is based on the inheritance relationship in object-oriented programs. Given an object-oriented program p, the number of OUSs that can be collected from a run of p is generally no greater than the number of objects used during the run. With our technique, up to n times more OUSs can be collected, where n is the average number of super-classes of all general OUSs. To investigate the effect of our technique, we implement it in our previous prototype tool, ISpecMiner, and use the tool to mine protocols from several real-world programs. Experimental results show that our technique can collect 1.95 times more OUSs than general approaches. As a result, accurate and complete API protocols are more likely to be obtained. Furthermore, our technique can mine API protocols for classes that are never used in the programs, which are valuable for validating software architectures and for program documentation and understanding. Our technique does introduce some runtime overhead, but the overhead is small and acceptable.
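The abstract does not spell out how an OUS is derived for a super-class, so the following is only a minimal sketch of one plausible reading: an OUS observed on an object is projected onto each super-class by keeping the calls that the super-class exposes. The class names, the `HIERARCHY` table, and the functions `superclasses`, `visible_methods`, and `oversample` are all hypothetical and are not taken from ISpecMiner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OUS:
    """An object usage scenario: the ordered method calls observed on one object."""
    class_name: str
    calls: tuple


# Hypothetical toy hierarchy: class -> (direct super-class, methods it declares).
HIERARCHY = {
    "Stream": (None, {"read", "close"}),
    "FilterStream": ("Stream", set()),
    "BufferedStream": ("FilterStream", {"mark", "reset"}),
}


def superclasses(name):
    """Return the proper super-classes of `name`, nearest first."""
    chain = []
    parent = HIERARCHY[name][0]
    while parent is not None:
        chain.append(parent)
        parent = HIERARCHY[parent][0]
    return chain


def visible_methods(name):
    """Methods declared by `name` or inherited from its super-classes."""
    methods = set(HIERARCHY[name][1])
    for sup in superclasses(name):
        methods |= HIERARCHY[sup][1]
    return methods


def oversample(ous):
    """Assumed projection rule: derive one extra OUS per super-class,
    keeping only the calls that the super-class exposes."""
    derived = []
    for sup in superclasses(ous.class_name):
        allowed = visible_methods(sup)
        projected = tuple(c for c in ous.calls if c in allowed)
        if projected:  # skip super-classes that see none of the calls
            derived.append(OUS(sup, projected))
    return derived


# One OUS observed on a BufferedStream object during a run.
observed = OUS("BufferedStream", ("mark", "read", "reset", "read", "close"))
extra = oversample(observed)
```

In this toy case the single observed OUS yields two additional ones, one per super-class, which is consistent with the abstract's bound of up to n times more OUSs. It also shows how a class that is never instantiated in the program (here `Stream`) could still receive usage data.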

Author information

Corresponding author

Correspondence to Deng Chen.

Additional information

Project supported by the Scientific Research Project of the Education Department of Hubei Province, China (No. Q20181508), the Youths Science Foundation of Wuhan Institute of Technology (No. k201622), the Surveying and Mapping Geographic Information Public Welfare Scientific Research Special Industry (No. 201412014), the Educational Commission of Hubei Province, China (No. Q20151504), the National Natural Science Foundation of China (Nos. 41501505, 61502355, 61502355, and 61502354), the China Postdoctoral Science Foundation (No. 2015M581887), the Key Program of Higher Education Institutions of Henan Province, China (No. 17A520040), and the Natural Science Foundation of Henan Province, China (No. 162300410177)

A preliminary version was presented at the 27th International Conference on Software Engineering and Knowledge Engineering, Pittsburgh, USA, July 6–8, 2015

About this article

Cite this article

Chen, D., Zhang, Yd., Wei, W. et al. An oversampling approach for mining program specifications. Frontiers Inf Technol Electronic Eng 19, 737–754 (2018). https://doi.org/10.1631/FITEE.1601783

  • DOI: https://doi.org/10.1631/FITEE.1601783
