MSRBot: Using bots to answer questions from software repositories

Empirical Software Engineering

Abstract

Software repositories contain a wealth of information that can be used to improve software projects. Prior work has leveraged repository data to improve many aspects of the software development process, such as extracting requirement decisions, identifying potentially defective code, and improving maintenance and evolution. However, in many cases, project stakeholders cannot fully benefit from their software repositories because mining them requires special expertise. Moreover, extracting and linking data from different types of repositories (e.g., source code control and bug repositories) requires dedicated effort and time, even when the stakeholder has that expertise. Therefore, in this paper, we use bots to automate and ease the process of extracting useful information from software repositories. In particular, we lay out an approach in which bots, layered on top of software repositories, answer some of the most common software development and maintenance questions facing developers. We perform a preliminary study with 12 participants to validate the effectiveness of the bot. Our findings indicate that using the bot achieves very promising results compared to not using it (the baseline). Most of the participants (90.0%) find the bot to be either useful or very useful. With the bot, they completed 90.8% of the tasks correctly, with a median time of 40 seconds per task. Without the bot, the participants completed 25.2% of the tasks, with a median time of 240 seconds per task. Our work has the potential to transform the MSR field by significantly lowering the barrier to entry, making the extraction of useful information from software repositories as easy as chatting with a bot.
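To make the idea of a bot layered on top of repository data more concrete, the following is a minimal sketch and not the MSRBot implementation. It assumes a tiny in-memory commit history and uses simple regular expressions in place of a real natural language understanding component; the `Commit` record, the sample data, and the `answer` function are all hypothetical names introduced for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    """Hypothetical, simplified view of one commit linked to a bug report."""
    sha: str
    author: str
    files: tuple
    fixes_issue: Optional[str] = None


# Illustrative data standing in for mined source control and bug repositories.
COMMITS = [
    Commit("a1", "alice", ("core/Parser.java",), "BUG-7"),
    Commit("b2", "bob", ("core/Parser.java", "README.md")),
    Commit("c3", "alice", ("ui/View.java",)),
]


def commits_touching(path):
    """Return the ids of commits that changed the given file."""
    return [c.sha for c in COMMITS if path in c.files]


def commits_fixing(issue_id):
    """Return the ids of commits linked to the given bug report."""
    return [c.sha for c in COMMITS if c.fixes_issue == issue_id]


# Each entry pairs a question shape with the query that answers it.
# A real bot would use a trained intent classifier instead of these patterns.
INTENTS = [
    (re.compile(r"which commits? (?:changed|touched|modified) (\S+?)\??$", re.I),
     commits_touching),
    (re.compile(r"which commits? fixed (\S+?)\??$", re.I),
     commits_fixing),
]


def answer(question):
    """Map a natural language question to a repository query, or None."""
    text = question.strip()
    for pattern, handler in INTENTS:
        match = pattern.search(text)
        if match:
            return handler(match.group(1))
    return None
```

Under these assumptions, `answer("Which commits changed core/Parser.java?")` returns the ids of the two commits that touch that file, and a question the sketch does not recognize returns `None`, which a bot could turn into a clarification prompt.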

Notes

  1. Since we gave a maximum of 30 minutes for participants to complete a task, tasks that were not answered after 30 minutes were considered to be incomplete and also to have taken 30 minutes.
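The timing rule in this note can be sketched as follows. This is an illustrative example only: the function names and the sample attempts are hypothetical and are not taken from the study's data.

```python
from statistics import median

# Maximum time allowed per task, as stated in the note (30 minutes).
TIME_LIMIT_SECONDS = 30 * 60


def effective_time(seconds_taken, completed):
    """Return the time to record for one task attempt.

    An attempt that was not completed is recorded as having taken
    the full time limit, regardless of the measured time.
    """
    if not completed:
        return TIME_LIMIT_SECONDS
    return seconds_taken


def median_task_time(attempts):
    """Median recorded time over (seconds_taken, completed) pairs."""
    return median(effective_time(s, done) for s, done in attempts)


# Hypothetical attempts: two completed tasks and one incomplete task.
sample_attempts = [(35, True), (50, True), (None, False)]
sample_median = median_task_time(sample_attempts)
```

Because incomplete attempts are recorded at the limit, they pull the median upward only when they make up a large share of the attempts, which is one reason a median is a reasonable summary for capped timings.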

Author information

Corresponding author

Correspondence to Ahmad Abdellatif.

Additional information

Communicated by: Sven Apel

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Abdellatif, A., Badran, K. & Shihab, E. MSRBot: Using bots to answer questions from software repositories. Empir Software Eng 25, 1834–1863 (2020). https://doi.org/10.1007/s10664-019-09788-5

  • DOI: https://doi.org/10.1007/s10664-019-09788-5
