MSRBot: Using bots to answer questions from software repositories

Abdellatif, Ahmad; Badran, Khaled; Shihab, Emad

doi:10.1007/s10664-019-09788-5

Published: 03 March 2020

Volume 25, pages 1834–1863, (2020)

Empirical Software Engineering

Abstract

Software repositories contain a wealth of information that can be used to improve software projects. Prior work has leveraged repository data to improve many aspects of the software development process, for example by helping to extract requirement decisions, identify potentially defective code, and improve maintenance and evolution. In many cases, however, project stakeholders cannot fully benefit from their software repositories because mining them requires special expertise. Moreover, extracting and linking data from different types of repositories (e.g., source code control and bug repositories) requires dedicated time and effort, even for stakeholders who have that expertise. In this paper, we therefore use bots to automate and ease the process of extracting useful information from software repositories. In particular, we lay out an approach showing how bots, layered on top of software repositories, can answer some of the most common software development and maintenance questions facing developers. We perform a preliminary study with 12 participants to validate the effectiveness of the bot. Our findings indicate that using the bot achieves very promising results compared to not using it (the baseline). Most of the participants (90.0%) find the bot either useful or very useful. With the bot, they completed 90.8% of the tasks correctly, with a median time of 40 seconds per task. Without the bot, they completed 25.2% of the tasks, with a median time of 240 seconds per task. Our work has the potential to transform the mining software repositories (MSR) field by significantly lowering the barrier to entry, making the extraction of useful information from software repositories as easy as chatting with a bot.

Notes

Since we gave a maximum of 30 minutes for participants to complete a task, tasks that were not answered after 30 minutes were considered to be incomplete and also to have taken 30 minutes.
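
The rule in this note can be sketched in a few lines of Python. This is only an illustration of how unanswered tasks could be folded into a median completion time, not the authors' analysis script: the function name `effective_times`, the use of `None` for an unanswered task, and the sample timings are all hypothetical, and treating any recorded time above the limit as exactly 30 minutes is an assumption.

```python
from statistics import median
from typing import List, Optional, Sequence

# The 30-minute limit stated in the note, expressed in seconds.
TIME_LIMIT_SECONDS = 30 * 60


def effective_times(raw_times: Sequence[Optional[float]]) -> List[float]:
    """Return the time counted for each task.

    None marks a task the participant never answered (hypothetical encoding).
    Unanswered tasks, and any time at or above the limit (an assumption),
    are counted as exactly 30 minutes.
    """
    counted = []
    for t in raw_times:
        if t is None or t >= TIME_LIMIT_SECONDS:
            counted.append(float(TIME_LIMIT_SECONDS))
        else:
            counted.append(float(t))
    return counted


# Hypothetical timings in seconds; the third task was never answered.
sample = [40, 240, None, 95]
print(effective_times(sample))
print(median(effective_times(sample)))
```

Because an unanswered task still contributes a 30-minute value, it pulls the median upward instead of silently dropping out of the sample.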

Author information

Authors and Affiliations

Data-Driven Analysis of Software (DAS) Lab, Department of Computer Science and Software Engineering, Concordia University, Montreal, QC, H3G 1M8, Canada
Ahmad Abdellatif, Khaled Badran & Emad Shihab

Corresponding author

Correspondence to Ahmad Abdellatif.

Additional information

Communicated by: Sven Apel

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Abdellatif, A., Badran, K. & Shihab, E. MSRBot: Using bots to answer questions from software repositories. Empir Software Eng 25, 1834–1863 (2020). https://doi.org/10.1007/s10664-019-09788-5

Issue Date: May 2020
