SNIFF: A Search Engine for Java Using Free-Form Queries
Reusing existing libraries simplifies software development. However, these libraries are often complex, and learning to use their APIs involves a steep learning curve. A programmer often uses a search engine such as Google to discover code snippets that show how a library is used to perform a common task. A problem with search engines is that they return many pages that the programmer has to mine manually to discover the desired code. Recent research efforts have tried to address this problem by automating the generation of code snippets from user queries. However, these queries need to include type information and therefore require the user to have partial knowledge of the APIs.
We propose a novel code search technique, called SNIFF, which retains the flexibility of performing code search in plain English, while obtaining a small set of relevant code snippets to perform the desired task. Our technique is based on the observation that the library methods called by user code are often well-documented. We use the documentation of these library methods to add plain English meaning to otherwise undocumented user code. The annotated user code is then indexed for free-form query search. Another novel contribution of our technique is that we take a type-based intersection of the candidate code snippets obtained from a query search to generate a small set of highly relevant code snippets.
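The pipeline described above (annotate, index, query, intersect) can be illustrated with a minimal sketch. It is not the SNIFF implementation: the class `SniffSketch`, the `Snippet` type, the one-line documentation strings, and the demo corpus are all hypothetical, and the type-based intersection is simplified here to keeping the type-qualified library calls shared by every candidate snippet. Query words are matched exactly, with no stemming.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; all names and data here are assumptions. */
final class SniffSketch {

    /** Hypothetical stand-in for library documentation (not real Javadoc text). */
    static final Map<String, String> LIBRARY_DOCS = new LinkedHashMap<>();
    static {
        LIBRARY_DOCS.put("FileReader.<init>", "open a character file for reading");
        LIBRARY_DOCS.put("BufferedReader.<init>", "buffer a character input stream");
        LIBRARY_DOCS.put("BufferedReader.readLine", "read a line of text");
        LIBRARY_DOCS.put("PrintStream.println", "print a string and end the line");
        LIBRARY_DOCS.put("String.trim", "remove leading and trailing whitespace");
        LIBRARY_DOCS.put("FileWriter.<init>", "open a character file for writing");
        LIBRARY_DOCS.put("FileWriter.write", "write text to the file");
    }

    /** A user-code snippet, reduced to the library calls it makes, in order. */
    static final class Snippet {
        final String name;
        final List<String> calls;

        Snippet(String name, String... calls) {
            this.name = name;
            this.calls = Arrays.asList(calls);
        }
    }

    /** Splits text into lower-case words, dropping empty tokens. */
    static Set<String> words(String text) {
        Set<String> result = new LinkedHashSet<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!w.isEmpty()) {
                result.add(w);
            }
        }
        return result;
    }

    /** Annotates a snippet with the words documenting each library call it makes. */
    static Set<String> annotate(Snippet snippet) {
        Set<String> annotation = new LinkedHashSet<>();
        for (String call : snippet.calls) {
            annotation.addAll(words(LIBRARY_DOCS.getOrDefault(call, "")));
        }
        return annotation;
    }

    /** Returns the snippets whose annotation contains every word of the query. */
    static List<Snippet> search(List<Snippet> corpus, String query) {
        Set<String> queryWords = words(query);
        List<Snippet> hits = new ArrayList<>();
        for (Snippet snippet : corpus) {
            if (annotate(snippet).containsAll(queryWords)) {
                hits.add(snippet);
            }
        }
        return hits;
    }

    /** Simplified intersection: calls that occur in every candidate snippet. */
    static List<String> intersect(List<Snippet> candidates) {
        if (candidates.isEmpty()) {
            return new ArrayList<>();
        }
        List<String> common = new ArrayList<>(candidates.get(0).calls);
        for (Snippet other : candidates.subList(1, candidates.size())) {
            common.retainAll(other.calls);
        }
        return common;
    }

    /** Hypothetical corpus of three undocumented user-code snippets. */
    static List<Snippet> demoCorpus() {
        return Arrays.asList(
            new Snippet("printFile", "FileReader.<init>", "BufferedReader.<init>",
                        "BufferedReader.readLine", "PrintStream.println"),
            new Snippet("trimLines", "FileReader.<init>", "BufferedReader.<init>",
                        "BufferedReader.readLine", "String.trim"),
            new Snippet("saveText", "FileWriter.<init>", "FileWriter.write"));
    }

    public static void main(String[] args) {
        List<Snippet> hits = search(demoCorpus(), "read line file");
        System.out.println(intersect(hits));
    }
}
```

In this sketch the query "read line file" matches the two reading snippets but not the writing one, and the intersection drops the calls that only one of them makes (`PrintStream.println`, `String.trim`), leaving the shared core that opens a file and reads a line.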
We have implemented SNIFF for Java and have performed evaluations and user studies to demonstrate the utility of SNIFF. Our evaluations show that SNIFF performed better than most of the existing online search engines as well as related tools.
Keywords: Search Engine, User Query, Longest Common Subsequence, User Code