Field Studies: A Methodology for Construction and Evaluation of Recommendation Systems in Software Engineering

Abstract

One way to implement recommendation systems in software engineering and to evaluate their effectiveness is to conduct field studies. Field studies are important because they extend laboratory experiments into the real-life settings of organizations and society, bringing greater realism to the phenomena under study. However, field studies require a rigorous research approach and pose many challenges, such as difficulties in implementing the research design and in achieving sufficient control, replication, validity, and reliability. In practice, a further challenge is finding organizations that are willing to be studied. In this chapter, we provide a step-by-step process for constructing and deploying recommendation systems in software engineering in the field. We also emphasize three main groups of challenges (organizational, data-related, and design-related) encountered during field studies, both in general and specifically in software organizations.

Keywords

Building Recommender Systems, Conducted Field Studies, Software Organizations, Software Engineering, Uncertainty Avoidance Index

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  1. University of Oulu, Oulu, Finland
  2. Ryerson University, Toronto, Canada
  3. Boğaziçi University, Istanbul, Turkey