Abstract
Agility gives teams the ability to respond quickly to change and to improve the way an agile project is documented. However, just-in-time, just-barely-good-enough documentation may miss important executable specifications. Interestingly, the frequently asked ‘how-to-do’ questions on popular question-answering (Q&A) websites such as Stack Overflow are strong indicators of gaps in documentation, and respondents' answers can complement conventional software documentation practices. Social interaction within these Q&A websites generates partially structured content, commonly referred to as crowd knowledge, which can offer peer-reviewed re-documentation by integrating the answers to these ‘how-to-do’ concerns. But finding the best, value-adding answer to a question, one that can contribute toward enriched and curated documentation, is computationally difficult. Moreover, duplicate questions cause seekers to spend more time finding these best answers. As a solution, this research proposes a novel question-answering crowd documentation model (QACDoc) based on socially mediated documentation mechanics that draw on the dynamics of the community-based web. First, duplicate questions are detected using a Siamese neural architecture in which two identical hierarchical attention networks generate vectors for similarity matching. Semantic matching is done using the Manhattan distance function, and a multi-layer perceptron is trained to output the predictions. Next, all respondents to semantically matched questions are grouped to form an intent-based crowd. Lastly, the top-k experts are identified using representative social presence features for expert ranking. The crowd documentation is then filtered to include only the answers of these identified experts.
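The three stages described above (duplicate matching, crowd formation, expert filtering) can be illustrated with a minimal Python sketch. It is not the QACDoc implementation. The question vectors are hand-written stand-ins for the outputs of the hierarchical attention networks, and a fixed similarity threshold replaces the trained multi-layer perceptron. The exponential mapping of the Manhattan distance, the feature names `reputation` and `accepted_ratio`, the weights, and all function names are assumptions made for illustration only.

```python
import math

def manhattan_similarity(u, v):
    """Map the L1 (Manhattan) distance between two vectors into (0, 1].

    Assumption: exp(-distance) is one common way to turn the distance
    into a similarity; the abstract only names the distance function.
    """
    distance = sum(abs(a - b) for a, b in zip(u, v))
    return math.exp(-distance)

def intent_crowd(query_vec, questions, threshold=0.5):
    """Group the respondents of every stored question that matches the query.

    Assumption: a fixed threshold stands in for the trained classifier.
    """
    crowd = set()
    for question in questions:
        if manhattan_similarity(query_vec, question["vector"]) >= threshold:
            crowd.update(question["respondents"])
    return crowd

def top_k_experts(crowd, features, weights, k):
    """Rank crowd members by a weighted sum of hypothetical social features."""
    def score(user):
        return sum(w * features[user].get(name, 0.0) for name, w in weights.items())
    # Sort by descending score, breaking ties by user name for determinism.
    return sorted(crowd, key=lambda user: (-score(user), user))[:k]

def filter_answers(answers, experts):
    """Keep only the answers written by the identified experts."""
    expert_set = set(experts)
    return [answer for answer in answers if answer["author"] in expert_set]

# Toy data: vectors, users, and feature values are invented for the example.
questions = [
    {"id": 1, "vector": [0.2, 0.4], "respondents": ["ana", "raj"]},
    {"id": 2, "vector": [0.25, 0.35], "respondents": ["raj", "li"]},
    {"id": 3, "vector": [2.0, -1.0], "respondents": ["sam"]},
]
features = {
    "ana": {"reputation": 0.9, "accepted_ratio": 0.5},
    "raj": {"reputation": 0.4, "accepted_ratio": 0.9},
    "li": {"reputation": 0.2, "accepted_ratio": 0.1},
    "sam": {"reputation": 0.8, "accepted_ratio": 0.8},
}
weights = {"reputation": 0.5, "accepted_ratio": 0.5}
answers = [
    {"author": "ana", "text": "Use the builder API."},
    {"author": "li", "text": "Not sure, try restarting."},
    {"author": "raj", "text": "Configure it in the constructor."},
]

crowd = intent_crowd([0.2, 0.4], questions)
experts = top_k_experts(crowd, features, weights, k=2)
curated = filter_answers(answers, experts)
print(experts, [a["author"] for a in curated])
```

With these toy numbers the third question lies far from the query in Manhattan distance, so its respondent never enters the crowd regardless of that user's feature values. This mirrors the ordering in the abstract, where expert ranking is applied only within the intent-based crowd.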
About this article
Cite this article
Kumar, A. Leveraging crowd knowledge to curate documentation for agile software industry using deep learning and expert ranking. Multimedia Systems 29, 1799–1813 (2023). https://doi.org/10.1007/s00530-020-00741-x
DOI: https://doi.org/10.1007/s00530-020-00741-x