QUARE: towards a question-answering model for requirements elicitation

Requirements elicitation is a stakeholder-centered approach; therefore, natural language remains an effective way of documenting and validating requirements. As the scope of the software domain grows, software analysts must process a higher number of requirements documents, generating delays and errors while characterizing the software domain. Natural language processing is key in such a process, allowing software analysts to speed up the requirements elicitation process and mitigate the impact of the ambiguity and misinterpretations arising from natural-language-based requirements documents. However, natural-language-processing-based proposals for requirements elicitation are mainly focused on specific domains and still fail to understand several requirements writing styles. In this paper, we present QUARE, a question-answering model for requirements elicitation. The QUARE model comprises a meta-ontology for requirements elicitation, easing the generation of requirements-elicitation-related questions and the initial structuring of any software domain. In addition, the QUARE model includes a named entity recognition and relation extraction system focused on requirements elicitation, allowing software analysts to process several requirements writing styles. Although software analysts address one software domain at a time, they use the same kind of questions for identifying and characterizing requirements abstractions such as actors, concepts, and actions from a software domain. Such a process may be framed into the QUARE model workflow. We validate our proposal by using an experimental process including real-world requirements documents coming from several software domains and requirements writing styles. The QUARE model is a novel proposal aimed at supporting software analysts in the requirements elicitation process.


Introduction
Requirements Elicitation (RE) is focused on identifying and characterizing the stakeholders and their requirements (Dick et al. 2017). While there are several techniques for RE, including workshops, dialogues, and sample scenarios, interviews remain the most used technique for RE (Arruda et al. 2019). Such activity becomes challenging as the scope of the software product grows due to the number of requirements documents to be addressed, generating errors and delays. Also, since natural language is the most common way of documenting requirements, RE is prone to text ambiguity, incompleteness, and inconsistency (Lim et al. 2021).
Natural Language Processing (NLP) is a multidisciplinary field comprising research areas such as linguistics, statistics, and logic (Aguilar and Sierra M., 2017). NLP deals with automatically analyzing, understanding, and generating natural language (Gelbukh 2013). Since most requirements documents are based on natural language (McZara et al. 2015), NLP is key for improving RE. Dalpiaz et al. (2018) introduce the term NLP4RE (Natural Language Processing for Requirements Engineering) for framing the NLP-based approaches for requirements engineering, including all its activities. An important part of the NLP4RE approaches is focused on RE. Software analysts use such approaches to speed up the RE process and make it more reliable (Zhao et al. 2021).
However, such approaches still fail to address several software domains and requirements writing styles, making it harder to generalize them to other RE domains (Lim et al. 2021). This is a significant drawback, since RE is aimed at identifying and characterizing the requirements of stakeholders coming from several software domains, e.g., medicine, biochemistry, finance, and law. Also, software analysts use diverse requirements writing styles for documenting requirements, including natural language, semi-structured and structured natural language (e.g., use cases and boilerplates, respectively), graphical notation, and mathematical specifications (Raharjana et al. 2021). Thus, supporting the RE process in a more general way becomes necessary for an NLP4RE approach focused on RE.
Consequently, in this paper, we propose QUARE (Question Answering for Requirements Elicitation), a question-answering model for improving the RE process. Such a model is intended to support software analysts in the RE process by answering RE-related questions, e.g., "what are the functions of the actor X?" and "what are the attributes of the concept Y?," extracting the answers from requirements documents regardless of the requirements writing style by using an RE-oriented Named Entity Recognition (NER) and relation extraction system, and structuring them with an RE meta-ontology including domain-independent requirements abstractions such as actors, actions, and concepts. We extend the semi-automated nature of Question Answering Systems (QASs) by defining a rule-based approach for generating RE-related questions, providing a novel, fully automated NLP4RE approach for RE.
QASs are used for extracting specific answers after retrieving and processing several data sources, relying on Information Retrieval (IR), NLP, and Artificial Intelligence (AI) techniques (Young et al. 2018). QASs may be classified (Bouziane et al. 2015; Xie et al. 2016) into two main models: IR-based factoid question answering is focused on finding small fractions of text representing answers for factoid questions such as "what is the capital of Colombia?" and "who is the writer of Foundation?," retrieving candidate documents from the web and large collections of text sources, and extracting fragments of text representing the target answer; knowledge question answering is concerned with extracting answers from structured data sources, e.g., databases and domain-specific ontologies, responding to query-based question representations, including predicate calculus (Zelle and Mooney 1996), SQL (Iyer et al. 2017), and question decomposition meaning representation (Wolfson et al. 2020). While software analysts address software products from several domains, they use the same kind of questions for eliciting requirements abstractions (Lim et al. 2021), fitting well into the QASs functionality.
The QUARE model is a new NLP4RE approach focused on RE. Such a model allows software analysts to identify, extract, and structure key abstractions from requirements documents, taking advantage of our meta-ontology for RE. Also, such a meta-ontology may be used for representing key requirements abstractions from any software domain. In addition, the QUARE model makes it possible to process requirements documents written by using several requirements writing styles, allowing for a wider range of requirements documents to be analyzed. Furthermore, the QUARE model functionality is closer to a real-life RE domain, providing software analysts with a comprehensive tool for gaining a broader understanding of the software domain by using RE-related questions.
We validate the QUARE model by using two case studies including real-life requirements coming from the PURE dataset, a collection of public requirements aimed at supporting NLP-based tasks focused on requirements engineering. Such validation is performed by using the experimental process of software engineering: planning, executing, and analyzing a mechanism experiment (Wieringa 2014; Wohlin et al. 2012). Our preliminary results indicate that the QUARE model is a promising NLP4RE approach for RE, outperforming some state-of-the-art proposals with an average F1 score of 0.78.
The remainder of this paper is structured as follows: in Sect. 2 we introduce some background concepts; in Sect. 3 we analyze related work and state the research problem; in Sect. 4 we present the QUARE model; in Sect. 5 we validate our proposal; in Sect. 6 we discuss results and threats to validity; in Sect. 7 we state conclusions and challenges.

Background
In this section, we present two key concepts for our proposal: NLP for requirements engineering and question-answering systems.

Natural language processing for requirements engineering
The interest in NLP-based approaches for supporting requirements engineering activities is continuously increasing due to the recent advances in Deep Learning (DL) and NLP (Ferrari et al. 2021). Dalpiaz et al. (2018) and Zhao et al. (2021) introduce and define the term NLP4RE as a set of NLP-based techniques (e.g., part-of-speech tagging and tokenization), tools (e.g., practical methods and processes such as NLTK and Stanford CoreNLP), and resources (e.g., dictionaries and corpora) for supporting linguistic-focused tasks related to requirements engineering and its activities.
According to Zhao et al. (2021), such approaches may be classified into six tasks: Detection is used for supporting manual requirements review activities by identifying linguistic weaknesses such as passive voice, vague phrases, and weak verbs; Extraction is focused on identifying key requirements abstractions and domain concepts; Classification is aimed at classifying requirements into several categories based on the nature of the problem, including functional requirements, non-functional requirements, bug reports, new features, and feedback; Modeling is related to the identification of modeling concepts which are used for building conceptual models; Tracing and relating are concerned with finding relationships between requirements and other software artifacts such as existing models, code fragments, and test cases; Search and retrieval are related to searching, identifying, and retrieving existing requirements which may be totally or partially reused for developing a new software product.

Question-answering systems
Question-answering systems are used for automatically answering natural-language-based questions by using Information Retrieval (IR), NLP, and AI techniques (Young et al. 2018). In contrast to search engines, where users get a ranked list of candidate documents in which they may find the answer based on keywords, QASs allow users to extract specific answers in the form of paragraphs, complete sentences, or sets of words. QASs may be classified (Bouziane et al. 2015; Xie et al. 2016) into two main models: IR-based factoid question answering is focused on extracting small spans of text representing answers for factoid questions such as "what is the capital of Colombia?" and "who is the writer of Foundation?". Such systems comprise two main components: the information retrieval component is used for retrieving and ranking candidate documents from the web and large collections of text sources; the reading comprehension system is used for analyzing such candidate documents and extracting fragments of text representing the target answer.
Knowledge question answering is concerned with extracting answers from structured data sources, e.g., databases and domain-specific ontologies, and answering query-based questions in the form of predicate calculus (Zelle and Mooney 1996), SQL (Iyer et al. 2017), and question decomposition meaning representation (Wolfson et al. 2020).

IR-based factoid question-answering systems
IR-based factoid question-answering is one of the most common models for building QASs. Such a model is focused on answering factoid questions such as "who is the vocalist of Deep Purple?" and "where is the rock band AC/DC from?," where the answers are short spans of text likely to be named entities such as "person," e.g., Ian Gillan, and "location," e.g., Australia (Jurafsky and Martin 2009).
The basic architecture of such models (Bouziane et al. 2015; Jurafsky and Martin 2009) comprises three stages: question analysis is focused on identifying the question type based on the interrogative words, e.g., "what," "which," and "who," and the named entity categories, e.g., "person," "location," and "organization," in the question statement; document retrieval is concerned with the identification and classification of candidate documents and fragments of text which are likely to contain the target answer; answer extraction is related to the processing and extraction of the answer from the candidate documents.
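The three stages above can be sketched as a minimal pipeline. This is a hedged illustration, not the architecture of any specific QAS: the function names, the keyword-overlap retrieval (standing in for a real IR component), and the toy corpus are our own assumptions.

```python
# Illustrative three-stage factoid QA pipeline: question analysis,
# document retrieval, and answer extraction.

WH_WORDS = {"what", "which", "who", "where", "when"}

def analyze_question(question):
    """Question analysis: identify the interrogative word and keywords."""
    tokens = question.lower().rstrip("?").split()
    wh = next((t for t in tokens if t in WH_WORDS), None)
    return {"wh_word": wh, "keywords": [t for t in tokens if t not in WH_WORDS]}

def retrieve_passages(keywords, corpus):
    """Document retrieval: rank passages by simple keyword overlap."""
    scored = [(sum(k in p.lower() for k in keywords), p) for p in corpus]
    return [p for score, p in sorted(scored, reverse=True) if score > 0]

def extract_answer(passages):
    """Answer extraction: return the top-ranked passage as a stand-in for
    span extraction (a real system extracts a short span, not a sentence)."""
    return passages[0] if passages else None
```

A real IR-based factoid QAS would replace the keyword overlap with a retrieval model and the last step with a reading comprehension component.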

Related work
The QUARE model is a question-answering-based approach focused on the NLP4RE extraction task. Therefore, in this section, we analyze and characterize NLP4RE proposals focused on the RE activity which are based on QASs and on tools with similar functionality such as chatbots, dialogue models, and query-based tools for extracting requirements abstractions from requirements documents.

Boquist (2014) presents a semi-automated dialogue system as a practice assistant for RE. Software analysts use such an approach for asking RE-related questions and extracting requirements abstractions such as actors, concepts, and constraints, written by using structured natural language, from a pre-populated database focused on the education domain. Sleimi et al. (2019) propose a knowledge-based QAS for extracting legal-related requirements abstractions from semi-structured natural language legal requirements documents. They define four categories for classifying requirements abstractions: concepts comprise actors, agents, targets, auxiliary parties, artifacts, situations, locations, and time; definitions are related to the formal meaning of a given domain concept; prescriptions are concerned with the obligations, prohibitions, permissions, conditions, and constraints for a given domain concept; sanctions are all about penalties and violations. Such a proposal relies on law-related ontologies and databases for extracting and structuring candidate abstractions. Also, software analysts should use natural language questions combined with queries for extracting such abstractions, making the RE process more complex.
Lian et al. (2020) develop an RE-oriented ontology for structuring and storing requirements abstractions coming from semi-structured requirements. Such a proposal is used for detecting candidate requirements sentences by computing semantic similarity among requirements document passages and abstractions retrieved from domain-specific ontologies by using queries. Laiq and Dieste (2020) propose a chatbot-based interview simulator for training novice software analysts. Unlike other chatbot-based proposals for RE (Arruda et al. 2019; Rajender Kumar Surana et al. 2019), where the chatbots are used for extracting requirements directly from stakeholders, such a proposal is focused on answering natural language questions related to the identification and extraction of requirements abstractions such as actors, goals, and actions from user stories by using a predefined knowledge base. While such a proposal is the closest to the QUARE model, software analysts may only use it for answering RE-related questions in the context of the education domain.
Some proposals are aimed at directly identifying and extracting requirements abstractions from stakeholder responses and interactions. Dwitama and Rusli (2020) develop a tool for building use cases by using a chatbot with which software analysts may gather information about actors and actions. Rajender Kumar Surana et al. (2019) propose a chatbot-based interview assistant for supporting novice software analysts while they gather requirements abstractions related to the finance domain, such as account type and operative system. Arruda et al. (2019) extract goals, expectations, actions, and actors by using a chatbot, allowing software analysts to interact directly with stakeholders. Grigorious and Symeonidis (2014) introduce a search engine for identifying requirements abstractions such as actor, action, and concept so they may be traced and related to existing requirements by using domain-specific ontologies. While such proposals use a question-answering functionality, they are not intended to identify and extract requirements abstractions from requirements documents.
Even though the QASs functionality is close to the RE process, as the most used techniques for RE are question-answering processes (Pacheco et al. 2018), e.g., structured interviews, laddering, and questionnaires, some research is focused on integrating QASs and similar systems with RE. Formally, one proposal (Sleimi et al. 2019) is a factoid QAS; however, three more proposals (Boquist 2014; Laiq and Dieste 2020; Lian et al. 2020) are close to the functionality of a knowledge-based question-answering system. Such proposals are the closest to the QUARE model, as software analysts may use them for answering RE-related questions from requirements documents.
Software analysts still face some challenges in using NLP4RE proposals in a generalized fashion. Such challenges are focused on two problems: limitations in understanding some requirements writing styles and limitations on the software domain scope. Software analysts use a high number of requirements writing styles for documenting requirements (Henriksson and Zdravkovic 2020). However, most of the NLP4RE proposals rely on manually crafted syntactic and semantic rules for identifying and extracting requirements abstractions, making it harder to process some requirements writing styles and their variations (Lim et al. 2021). In addition, software analysts elicit requirements coming from different software domains; yet, some NLP4RE proposals are focused on specific software domains, limiting the scope of the software domains to be elicited. The lack of such capabilities remains a significant drawback for the NLP4RE research area.

The QUARE model
In this section, we introduce QUARE, a question-answering model for RE that is intended to contribute to the NLP4RE research area, focused on the extraction task. The QUARE model allows software analysts to identify, extract, and structure key requirements abstractions such as actors, actions, objects, and attributes from several requirements writing styles and several software domains.
The QUARE model comprises two main components: (i) a meta-ontology for representing a software domain in the context of RE which is built upon requirements abstractions coming from recurrent RE-related questions and (ii) the RENER system, a NER and relation extraction system focused on RE designed for identifying and extracting such requirements abstractions.

Characterizing RE-related questions
Structured and semi-structured interviews are useful techniques for RE. While a software analyst is concerned with one software domain at a time, she should use the same kind of questions for identifying and extracting some requirements abstractions (Pacheco et al. 2018). We analyze (Calle 2022) interview-driven RE proposals comprising either a set of predefined RE-related questions or a set of RE-related topics software analysts should ask about while interviewing stakeholders.
While the RE process comprises a well-defined set of steps and its goal remains the same regardless of the software domain, setting a fixed set of questions may be a challenging task (Pacheco et al. 2018). Some authors propose a set of general questions for guiding the interview process: Actors are explored by using questions such as "who is the client?" (Gause and Weinberg 1989), "what are the actors of the system?" (Hunt, 1997), and "what are the actors who are going to use the system?" (Burnay et al. 2014); Actions are studied by using questions such as "what is the function of an information system?" (Wijers and Heijes 1990), "what are the tasks of the actor?" (Perepletchikov and Padgham 2005), and "what are the workflows and tasks to be performed by the user?" (Düchting et al. 2007; Yamanaka and Komiya, 2010; Zapata and Carmona 2010); Concepts are analyzed by using questions such as "what are the concepts of the system?," "what are the attributes of the actor?" (Cysneiros and Yu 2003), "what are the objects that could be wired to the system?," and "what are the concepts of the domain?" (Burnay et al. 2014; Kücherer and Paech 2018).
Some authors define RE-related topics to be covered during the interview process, allowing software analysts to use a less structured approach for addressing the interview. Several approaches such as ERAE (Dubois et al. 1986), HyperQuest (Manago et al. 1992), ViewPoints (Do Prado Leite and Gilvaz, 1996), WebRE (Escalona and Koch, 2006), Elicitation Topic Map (Burnay et al. 2014), and GORE (Adikara et al. 2016) are used for representing key requirements abstractions such as actors, concepts, actions, objects, and constraints, giving software analysts guidelines on what to ask but not on how to do it.
In summary (Calle 2022), most of the proposals on predefined questions for RE are focused on questions targeting concepts (27%), actors (24%), actions (24%), problem-related concepts (11%), goals (6%), and constraints (8%). In addition, most of the proposals are concerned with topics related to concepts (34%), actions (27%), actors (16%), and goals, events, constraints, and problems (23%). Such results describe how important key requirements abstractions such as actors, actions, and domain concepts are for properly characterizing the software product. Such triads (actor-action-concept) include, in a general way, the requirements of the stakeholders. We use such key requirements abstractions for building the base structure of the QUARE model, so the QUARE components remain consistent and aligned with the RE process.

Building a meta-ontology for RE
Domain ontologies are used for representing the knowledge coming from a specific domain, including domain concepts and the semantic relationships among them. In the context of RE, such representations provide a unified language for stakeholders and software analysts to understand and communicate the software domain (Lian et al. 2020). In contrast to domain ontologies, which are focused on a specific domain, meta-ontologies are concerned with the representation of any domain within a more general context such as RE (Carlos M. Zapata et al. 2010).
In this section, we adapt and extend previous work (Carlos M. Zapata et al. 2010) for representing RE-related concepts such as objects, actors, and actions by using a meta-ontology. In the remainder of this section, we refer to requirements abstractions and their relationships as classes and properties, respectively. Such classes and properties are used for defining the scope of the QUARE model related to the requirements abstractions that may be identified and extracted, considering the importance of the proposed classes based on their recurrence in the RE process, as we state in the previous subsection. In addition, such classes (actor-action-object) are used in the NLP4RE context as key components of RE-oriented ontologies (Grigorious and Symeonidis 2014; Sleimi et al. 2019; Vlas and Robinson 2011; Zhao et al. 2021).
Also, such a meta-ontology allows software analysts to build an early conceptualization of the software domain. The structure of the meta-ontology for RE is shown in Fig. 1 and it is described as follows:
• Concept is a class used for representing the abstractions of a software domain; we define the property hasAttribute for relating an instance of the class Concept to another concept describing it.
• Actor is a subclass of Concept used for representing the animated and unanimated concepts performing actions in the software domain, e.g., "user" and "system." We define two properties for such a class: hasRole is used for describing the role of the actor, e.g., the string "baker"; hasAction is used for representing the relationship between an instance of the class Actor and the instances of the class Action such an actor performs.
• Object is a subclass of Concept adapted from Zapata et al. (2010) which is used for representing concepts related to a software domain, e.g., "bread," "username," and "electrical circuit." We define the property hasName for representing the name of the object, e.g., the string "bread."
• Action is a class adapted from Zapata et al. (2010) which is used for representing all the actions performed by an actor. We define three properties for such a class: hasActionVerb is used for describing the action verb related to a specific action, e.g., "makes" and "uploads"; hasRelatedActor is used for representing the relationship between an instance of the class Action and an instance of the class Actor, who performs the action, i.e., it is the inverse of the hasAction property; hasRelatedConcept is used for representing the relationship between an instance of the class Action and the instance of the class Concept on which the action is performed.
Properties are classified into two types: data properties are used for representing properties based on primitive types such as strings and integers; object properties are used for describing properties related to other classes such as Actor and Action. In addition, properties include two features describing how classes relate to each other: the domain of a property is used for describing the classes including such a property, e.g., the domain of the property hasRole is Actor, so each instance of the class Actor has such a property; the range of a property is used for representing the class type of the possible values for such a property, e.g., the range of the property hasRole is the string class, so a possible value for such a range is the string "baker." Domains and ranges for all properties are summarized in Table 1.
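To make the class and property structure concrete, the following sketch encodes the meta-ontology as plain Python data classes. The encoding is our own assumption (data classes rather than an OWL ontology), and for brevity hasName is placed on the shared Concept class here; the property names follow the text.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    hasName: str                                    # data property, range: string
    hasAttribute: list = field(default_factory=list)  # object property, range: Concept

@dataclass
class Actor(Concept):
    hasRole: str = ""                               # data property, range: string
    hasAction: list = field(default_factory=list)   # object property, range: Action

@dataclass
class Action:
    hasActionVerb: str                              # e.g., "uploads"
    hasRelatedActor: Actor = None                   # inverse of hasAction
    hasRelatedConcept: Concept = None               # concept the action acts on

# Example instances for the sentence "the user uploads her profile picture"
picture = Concept(hasName="profile picture")
user = Actor(hasName="user", hasRole="end user")
upload = Action(hasActionVerb="uploads", hasRelatedActor=user, hasRelatedConcept=picture)
user.hasAction.append(upload)
```

The inverse relationship between hasAction and hasRelatedActor is visible in the example: the user instance points at the upload action, and the action points back at the user.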

The RENER system
We present RENER, a named entity recognition and relation extraction system focused on RE, allowing software analysts to identify and extract requirements abstractions, and the semantic relationships among them, from requirements documents comprising several requirements writing styles.
The RENER system comprises two models: the abstraction extraction model is concerned with identifying and tagging RE-related entities such as an actor, an object, and an action; the relation extraction model is focused on defining the semantic relationship among such abstractions.

Data preparation
Some analysis related to the PURE dataset (Ferrari et al. 2017) shows software requirements comprise a controlled vocabulary and similar expressions. We manually annotate our dataset by using Prodigy (Montani and Honnibal 2018) with three labels, according to the meta-ontology structure: the label Actor is used for representing animated (e.g., "the user[Actor] uploads her profile picture") and unanimated (e.g., "the system[Actor] sends an alert") concepts performing an action in the software domain; the label Action is used for representing a specific action performed by an actor, e.g., "the user[Actor] uploads[Action] her profile picture"; the label Object is used for representing objects related to the software domain, e.g., "the user[Actor] uploads[Action] her profile picture[Object]." Similarly, we annotate the dataset with three semantic relationships, following the structure of the meta-ontology: hasAction is used for binding together an actor with an action; hasRelatedConcept is used for relating actions to the concepts (actors or objects) on which an action is performed; hasAttribute is used for representing the relationship between a concept and another concept, following the definition of the concept class (actor or object). In summary, in the tagged sentence "the user[Actor] uploads[Action] her profile picture[Object]" there is a hasAction relationship between the entities "user" and "uploads," a hasRelatedConcept relationship between "uploads" and "profile picture," and a hasAttribute relationship between "user" and "profile picture." We summarize the annotated entities and semantic relationships in Table 2.
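The annotation scheme can be illustrated with character-offset spans in the style spaCy and Prodigy expect. The sentence and offsets below are our own toy example, not taken from the PURE dataset.

```python
# One annotated training example: entity spans as (start, end, label)
# character offsets, plus the relation triples among the entity texts.

sentence = "the user uploads her profile picture"

entities = [
    (4, 8, "Actor"),    # "user"
    (9, 16, "Action"),  # "uploads"
    (21, 36, "Object"), # "profile picture"
]

relations = [
    ("user", "hasAction", "uploads"),
    ("uploads", "hasRelatedConcept", "profile picture"),
    ("user", "hasAttribute", "profile picture"),
]

# sanity-check: every offset pair must cut out a non-empty span
for start, end, label in entities:
    assert sentence[start:end], f"empty span for label {label}"
```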

Training the models
We use spaCy for building and training the abstraction and relation extraction models. spaCy is an open-source NLP library comprising pre-trained pipelines and neural networks for tagging, parsing, and NER, among other NLP tasks (Honnibal et al. 2020). spaCy provides a simple command-line interface for building and training custom models and NLP components, easing the workflow from the annotation phase with Prodigy (Montani and Honnibal 2018) to the production phase.
The training process is based on supervised learning, so each prediction coming from the models is compared against the manually annotated labels. First, the sentence is split into a list of recognized entities, e.g., ["user," "download," "profile picture"], and each entity is transformed into its vector representation by using the spaCy language models. Such a list is used for building candidate triplets "abstraction-relation-abstraction," representing all the possible combinations among labeled entities, e.g., ("user," "hasAction," "download"), ("user," "hasAttribute," "profile picture"), ("download," "hasRelatedConcept," "profile picture"), ("user," "hasRelatedConcept," "profile picture"), and so on, including the remaining combinations. Both the entity vector representations and the candidate triplets are used for building an instance tensor representing the data passed to the prediction module in a single form, resulting in a prediction matrix. The prediction matrix includes a score between 0 and 1 for each candidate triplet, indicating the confidence of the labeled relation; e.g., if the resulting prediction matrix has a score of 0.1 for the relation hasAction between the entities "user" and "profile picture," such a result indicates that a relation between such entities is unlikely. On the contrary, if the prediction matrix has a score of 0.9 for the relation hasAction related to the entities "user" and "download," a relation of such a type is more likely to exist, according to the model prediction. The resulting model may be used for identifying and extracting RE-related relationships between pairs of requirements abstractions coming from requirements documents based on several requirements writing styles and software domains. We summarize the RENER system workflow in Fig. 2.
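The candidate-triplet step described above can be sketched as follows. The scoring function is a toy stand-in for the trained spaCy prediction module, and all names and values are illustrative assumptions.

```python
from itertools import product

# All relation labels defined by the meta-ontology structure.
RELATIONS = ["hasAction", "hasRelatedConcept", "hasAttribute"]

def candidate_triplets(entities):
    """Every ordered entity pair combined with every relation label."""
    return [
        (head, rel, tail)
        for head, tail in product(entities, repeat=2)
        if head != tail
        for rel in RELATIONS
    ]

def score_triplet(triplet, labels):
    """Placeholder confidence in [0, 1]; the real system derives this
    from entity vectors via the trained prediction module."""
    head, rel, tail = triplet
    # toy rule standing in for the learned model:
    if rel == "hasAction" and labels[head] == "Actor" and labels[tail] == "Action":
        return 0.9
    return 0.1

labels = {"user": "Actor", "download": "Action", "profile picture": "Object"}
triplets = candidate_triplets(list(labels))
scores = {t: score_triplet(t, labels) for t in triplets}
```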

QUARE architecture
The QUARE model is used for answering factoid RE-related questions such as "what are the actors of the discourse?," "what are the attributes of the actor X?," and "what are the actions of the actor X?". Such RE-related questions follow a well-defined structure: "what are|is the property|class of the class|class instance?". Such questions are intended to be answered by short spans of text representing specific requirements abstractions such as actors, objects, and actions. Therefore, the QUARE model architecture follows the guidelines of the classical IR-based factoid QASs (Jurafsky and Martin 2009), comprising four stages: question analysis, passage retrieval, requirements abstraction extraction, and requirements abstraction structuring, as represented in Fig. 3.

Question analysis
The question analysis stage comprises two steps: focus detection and question classification.
The focus detection step is aimed at identifying the classes, instances, and properties of the RE-related question according to the meta-ontology structure (see Sect. 4.1). Since the questions are well defined by the proposed structure, i.e., "what are|is the property of the class|instance?," extracting such components is accomplished by using regular expressions and manually crafted rules. We define two types of questions: class-focused questions are used for extracting all the instances of a given class, e.g., "what are the actors of the discourse?" and "what are the objects of the discourse?". The focuses related to such questions are "actors" and "objects," respectively, because those are the abstractions of interest to be extracted. Since "discourse" is the most general class in the proposed meta-ontology for RE and the question is focused only on a class, the focus of such a type of question is at the class level; property-focused questions are used for extracting one or more abstractions related to a specific property of a given instance, e.g., "what are the attributes of the writer?". The focus of such types of questions comprises two components: the target property, which is used for representing the type of the property of the expected answer, e.g., "attributes," and the target instance, which is the instance or class related to the property, e.g., "writer".
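A minimal sketch of focus detection for the question template is shown below; the regular expression and the returned dictionary keys are our own assumptions, not the model's actual rules.

```python
import re

# Matches the template "what are|is the <property|class> of the <class|instance>?"
QUESTION_RE = re.compile(
    r"what (?:are|is) the (?P<focus>[\w ]+?) of the (?P<target>[\w ]+)\?",
    re.IGNORECASE,
)

def detect_focus(question):
    """Classify a question as class-focused or property-focused."""
    match = QUESTION_RE.match(question.strip())
    if match is None:
        raise ValueError("question does not follow the RE-related template")
    focus, target = match.group("focus"), match.group("target")
    if target == "discourse":   # class-focused question
        return {"type": "class", "focus": focus}
    return {"type": "property", "target_property": focus, "target_instance": target}
```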
The question classification step is aimed at defining the expected answer class. In contrast to the classical question classification approach (Jurafsky and Martin 2009), where the type of wh-word defines the type of expected answer, the question classification step in the QUARE model is represented by the range of the target property, e.g., the classification of the question "what are the attributes of the writer?" should be Concept, as the range of the property hasAttribute is the class Concept. Thus, the expected answer should be an instance of such a class, e.g., either an actor or an object (concept subclasses).

Passage retrieval
The passage retrieval stage is focused on identifying candidate passages coming from the requirements documents to be analyzed. Such a stage comprises two steps: document processing and passage ranking.
The document processing step comprises two tasks. Sentence segmentation is focused on splitting each requirements document of the corpus into sentences. Since the expected answers (requirements abstractions) are mainly words and small word sequences, e.g., "user," "uploads," and "profile picture," splitting the requirements documents is key for achieving a better analysis in the passage ranking step. Sentence vectorizing is aimed at representing each sentence in the form of dense embeddings. Such a representation allows for computing the similarity between two texts at different levels, i.e., token, word, sentence, and paragraph (Karpukhin et al. 2020). In contrast to classical frequency-based approaches for representing words, where exactly matching two words or a sequence of words is needed for finding a similarity, dense embeddings allow for handling synonymy, so finding candidate passages based on their semantic similarity is easier (Xie et al. 2016).
The passage ranking step is concerned with ranking candidate passages coming from processed requirements documents based on their semantic similarity with the identified question focus. The dense embeddings of each sentence are semantically compared with the dense embedding of the question focus by computing the dot product between each pair of embeddings (Karpukhin et al. 2020). The passages with higher semantic similarity are identified and ranked as candidate requirements sentences since they are likely to contain information about the target property and the target class. Such candidate requirements sentences are then passed to the answer extraction stage for extracting answers, i.e., requirements abstractions.
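A minimal sketch of the ranking computation, assuming the sentences and the question focus have already been encoded as dense embeddings by some sentence encoder; the toy vectors in the usage example are illustrative only:

```python
import numpy as np

def rank_passages(focus_embedding: np.ndarray,
                  sentence_embeddings: np.ndarray,
                  sentences: list,
                  top_k: int = 3) -> list:
    """Return the top-k candidate sentences by dot-product similarity."""
    # One similarity score per sentence: dot product with the focus embedding.
    scores = sentence_embeddings @ focus_embedding
    # Indices of the highest-scoring sentences, in descending order.
    ranked = np.argsort(scores)[::-1][:top_k]
    return [(sentences[i], float(scores[i])) for i in ranked]

# Toy usage with 2-dimensional embeddings (real embeddings are much larger):
sents = ["the user uploads her profile picture",
         "this document describes the system",
         "the admin deletes accounts"]
embs = np.array([[0.9, 0.1], [0.0, 1.0], [0.7, 0.3]])
focus = np.array([1.0, 0.0])
top = rank_passages(focus, embs, sents, top_k=2)
```

The highest-ranked sentences are then handed to the answer extraction stage.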

Answer extraction
The answer extraction stage is concerned with analyzing the retrieved passages in search of instances of the target property and the target class. Answers are represented by specific combinations of instances and properties. The answer extraction stage relies on identifying and extracting the target property related to a specific class instance. Once such components are identified, a new answer may be generated. The answer extraction stage comprises two steps: abstraction extraction and relation extraction. We address such steps by using the RENER system (see Sect. 4.2).
The abstraction extraction step is focused on labeling class instances and extracting the instances related to the target class from the ranked candidate requirements sentences, e.g., the labels and instances resulting from the abstraction extraction step on the ranked requirements sentence "…the user uploads her profile picture…" are "user [Actor]," "uploads [Action]," and "profile picture [Object]." Such instances may be relevant to a specific question, e.g., the only relevant instance for answering the question "what are the actors of the discourse?" is "user [Actor]" since the classification of such a question is "Actor." Also, for more complex questions such as "what are the actions of the user?", the labeled instances by themselves are insufficient for extracting a proper answer because a relationship among such labeled instances is still undefined.
The relation extraction step is concerned with labeling the properties (semantic relationships) related to specific class instances (abstractions), including hasAttribute, hasAction, and hasRelatedConcept. Such relations are used for semantically connecting the labeled class instances, e.g., the relations extracted from the requirements sentence "…the user uploads profile picture…" are "user [hasAction] uploads" and "uploads [hasRelatedConcept] profile picture." Such labeled properties may be used for answering more complex questions such as "what are the actions of the user?". Once the target class instance (e.g., "user") and the target property (e.g., "action") are identified, the answer may be composed following the rule "the [property|properties] of the [class instance] is|are [property instance|instances]," e.g., "the action of the user is uploads a profile picture."
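The answer composition rule can be sketched as follows, assuming the extracted abstractions and relations are available as (head, relation, tail) triples. The function name and the chain expansion via hasRelatedConcept are illustrative simplifications, not the paper's implementation:

```python
def compose_answer(target_instance: str, target_property: str,
                   relations: list) -> str:
    """Compose an answer from (head, relation, tail) triples following the
    rule "the [property|properties] of the [class instance] is|are [...]"."""
    wanted = "has" + target_property.capitalize()  # e.g., "action" -> "hasAction"
    values = []
    for head, rel, tail in relations:
        if head == target_instance and rel == wanted:
            # Expand each property instance with its related concepts,
            # e.g., "uploads" -> "uploads profile picture".
            related = [t for h, r, t in relations
                       if h == tail and r == "hasRelatedConcept"]
            values.append(" ".join([tail] + related))
    noun = target_property if len(values) == 1 else target_property + "s"
    verb = "is" if len(values) == 1 else "are"
    return f"the {noun} of the {target_instance} {verb} {' and '.join(values)}"

# Triples extracted from "…the user uploads profile picture…":
triples = [("user", "hasAction", "uploads"),
           ("uploads", "hasRelatedConcept", "profile picture")]
```

Under these triples, `compose_answer("user", "action", triples)` yields "the action of the user is uploads profile picture".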

Abstraction structuring
The QUARE model allows software analysts to answer RE-related questions and structure the answers by using the meta-ontology structure, giving them an early conceptualization of the software domain. In contrast to classical QASs, where the questions may be unrelated to each other and the user aims to answer one or a few specific questions, the QUARE model is designed for answering a set of RE-related questions aimed at characterizing a software domain. Such answers represent key requirements abstractions such as actors, actions, and concepts that may be reused for other RE-related tasks.
The abstraction structuring stage is focused on populating the meta-ontology with the extracted answers. For each instance in the extracted answer, one instance of its class type is created and stored in the meta-ontology. Also, each pair of requirements abstractions is linked within the meta-ontology following the target class and target property type, e.g., the answer "the user uploads a profile picture" triggers the creation of an instance of the class Object for "profile picture" and an instance of the class Action with the value "uploads" for its property "hasActionVerb," a reference to the instance "profile picture" for its property "hasRelatedConcept," and a reference to the instance "user" for its property "hasRelatedActor," as we show in Fig. 4.
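A minimal sketch of populating the meta-ontology with the example answer; the in-memory data structure is an assumption, while the class and property names follow the meta-ontology for RE:

```python
from collections import defaultdict

class MetaOntology:
    """Toy stand-in for the meta-ontology store (an illustrative assumption)."""
    def __init__(self):
        self.instances = defaultdict(set)   # class name -> set of instances
        self.properties = []                # (instance, property, value) links

    def add_instance(self, cls: str, name: str):
        self.instances[cls].add(name)

    def link(self, instance: str, prop: str, value: str):
        self.properties.append((instance, prop, value))

# Structuring the answer "the user uploads a profile picture":
onto = MetaOntology()
onto.add_instance("Actor", "user")
onto.add_instance("Object", "profile picture")
onto.add_instance("Action", "uploads")
onto.link("uploads", "hasActionVerb", "uploads")
onto.link("uploads", "hasRelatedConcept", "profile picture")
onto.link("uploads", "hasRelatedActor", "user")
```

Each answered question adds instances and property links in this fashion, incrementally building the conceptualization of the software domain.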

Eliciting requirements by using the QUARE model
In this Section, we develop a prototype based on the QUARE model and evaluate it with two case studies coming from the PURE dataset (Ferrari et al. 2017) by using well-defined metrics related to QASs. Also, we independently evaluate the performance of the RENER system by using a validation dataset. Such a process is based on the case study research process of experimentation in software engineering (Wieringa 2014; Wohlin et al. 2012).

Motivating scenario
Assume a software analyst preparing an interview-driven RE process. The software analyst uses the QUARE model for supporting such a process by performing three major steps: (i) uploading requirements documents, including technical documents, previous interview documents, and any other text-based requirements artifacts that may contain business-related knowledge; (ii) processing such documents with the QUARE model by asking RE-related questions based on the components of the proposed meta-ontology (actor-action-concept). Such questions are answered by using the QUARE model, giving the software analyst a view of the knowledge within the analyzed requirements documents. Such answers are then automatically structured into the meta-ontology; (iii) analyzing pairs of questions and answers for gaining a better understanding of the software domain of interest, and validating the populated meta-ontology with stakeholders so that further discussion on specific concerns may be developed.
The QUARE model is aimed at supporting the RE process, not replacing it. The output of the model is a starting point for the RE process, making it easier for software analysts to gain a broader view of the context of the software domain to be characterized. Software analysts may use the QUARE model components in several RE tasks such as interview preparation, interview execution by generating RE-related questions, and domain model building from early documentation or final interview transcriptions.

Case studies
The case studies and validation dataset are randomly chosen from the PURE dataset (Ferrari et al. 2017). Such documents are excluded from the training dataset of the RENER system components. We use the validation dataset, comprising 800 requirements sentences, for validating the performance of the abstraction extraction model and the relation extraction model of the RENER system. Furthermore, we use two new requirements documents coming from the PURE dataset for validating the QUARE prototype as follows: case study #1 (CS1) comprises 64 user stories (semi-structured natural-language-based requirements) representing the requirements of a data-science-oriented software product; case study #2 (CS2) comprises 192 requirements sentences coming from use case specifications (semi-structured and natural-language-based requirements) representing the requirements of a water management software product. Rajpurkar et al. (2016) define two metrics for evaluating factoid-based QASs: the exact match metric is used for evaluating QASs in terms of the percentage (%) of the predicted answers matching the gold answer (reference expected answer); the F1 score is used for computing the average overlap of words/tokens between the predicted answer and the gold answer. Since the expected answers coming from the QUARE prototype comprise one-word answers and small word sequences, we use the F1 score for evaluating both the QUARE prototype and the RENER system.

Theory
The F1 score is formally described in Eq. (1) according to Rajpurkar et al. (2016):

F1 = 2 × (Precision × Recall) / (Precision + Recall)  (1)

Precision and recall are used for representing the fraction of predicted answers that are relevant and the fraction of all relevant answers that are predicted, respectively. Such metrics are described in Eq. (2) and Eq. (3):

Precision = True positives / (True positives + False positives)  (2)

Recall = True positives / (True positives + False negatives)  (3)
In the context of the QUARE prototype, true positives represent the number of shared tokens between the gold answer and the predicted answer, e.g., if the gold answer is "upload profile picture" and the predicted answer is "upload profile picture," the value of true positives is 3. Also, in the context of the RENER system, a true positive represents a correctly predicted requirements abstraction or entity relation according to the gold answer, e.g., the model predicts the label Actor for the word "user" as it is labeled in the gold answer.
False positives represent tokens and labels which are included in the predicted answer but missing in the gold answer, e.g., if the predicted answer is "upload profile picture" but the gold answer is "upload picture," the number of false positives for the QUARE model is 1 because the predicted answer includes an additional token ("profile"). Also, if the predicted answer includes the label Action on the token "want," but such a token is not labeled as Action in the gold answer, such a prediction represents a false positive in the RENER system.
False negatives are tokens and labels of the gold answer which are not included in the predicted answer, e.g., in the context of the QUARE model, if the gold answer is "the baker makes soft bread" and the predicted answer is "the baker makes bread," the number of false negatives is 1 because the token "soft" is missing in the predicted answer. Similarly, if a label from the gold answer is missing in the predicted answer, e.g., if the gold answer includes a labeled entity such as "designer [Actor]" and such an entity is missing in the predicted answer, the false-negative count is incremented by 1.
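The token-overlap definitions above can be sketched as a small function; counting shared tokens as a multiset intersection is an assumption consistent with the examples:

```python
from collections import Counter

def token_f1(predicted: str, gold: str) -> float:
    """Token-overlap F1 between a predicted answer and a gold answer."""
    pred, ref = Counter(predicted.split()), Counter(gold.split())
    tp = sum((pred & ref).values())          # shared tokens (true positives)
    if tp == 0:
        return 0.0
    precision = tp / sum(pred.values())      # tp / (tp + false positives)
    recall = tp / sum(ref.values())          # tp / (tp + false negatives)
    return 2 * precision * recall / (precision + recall)
```

For the example above, `token_f1("upload profile picture", "upload picture")` gives precision 2/3, recall 1, and therefore an F1 score of 0.8.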

Methods
We use spaCy for computing the F1 score of the components of the RENER system on the validation dataset. On the other hand, we validate the QUARE prototype by using the requirements documents coming from the proposed case studies (CS1 and CS2) by following three steps: (i) we pre-process the requirements documents, deleting non-relevant textual and graphical data such as titles, figures, and introductory paragraphs; (ii) we use the structure of the proposed meta-ontology for manually formulating and answering RE-related questions from the two case studies; (iii) we ask the same questions to the QUARE prototype and compare the predicted answers with the manually extracted answers, computing the F1 score by using a Python script that counts and compares the tokens in the answers.

Validating the RENER system
The answer to VRQ1 and VRQ2 is the following: the abstraction extraction model achieves an F1 score of 0.87 and the relation extraction model achieves an F1 score of 0.74, as shown in Fig. 5.

Validating the QUARE model
We develop a QUARE prototype following the structure and workflow of the QUARE model by using Python and Google Colaboratory (Bisong 2019), providing an interactive environment for asking and answering RE-related questions.
We validate such a prototype by using two case studies (CS1 and CS2) according to the proposed method (see Sect. 5.1.3): (i) we start by pre-processing the requirements documents coming from the case studies, removing non-relevant information such as titles, figures, and figure descriptions by using a Python script; (ii) we manually formulate RE-related questions and extract the answers from the requirements documents based on the structure of the meta-ontology for RE, as shown in Algorithm 1, e.g., the first step of the algorithm is asking the trigger question "what are the actors of the discourse?", as the class Actor is the main class of the discourse; after answering such a question, we generate new questions for each instance of the class Actor in the answer according to its properties, such as "what are the actions of the Actor x?" and "what are the attributes of the Actor x?", and so on until all the extracted instances in the answers and their properties are covered; finally, (iii) we use a Python script for asking the same questions to the QUARE prototype, allowing us to compare the answers to the manually extracted answers (gold answers) and compute the F1 score according to the results.
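The question-generation loop of step (ii) can be sketched as follows; Algorithm 1 itself is not reproduced here, so the loop below is a hedged paraphrase, with `answer_question` standing in for the QUARE prototype and the property list being an illustrative assumption:

```python
def generate_questions(answer_question, properties=("actions", "attributes")):
    """Breadth-first question generation starting from the trigger question."""
    questions = ["what are the actors of the discourse?"]  # trigger question
    asked = []
    while questions:
        question = questions.pop(0)
        asked.append(question)
        # For each instance extracted from the answer, generate one
        # follow-up question per property of the meta-ontology.
        for instance in answer_question(question):
            for prop in properties:
                questions.append(f"what are the {prop} of the {instance}?")
    return asked

# Toy answerer: only the trigger question yields an instance.
def fake_answers(question):
    return ["user"] if question == "what are the actors of the discourse?" else []

asked = generate_questions(fake_answers)
```

With this toy answerer, the loop asks the trigger question plus one follow-up per property of the single extracted actor.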
The CS1 includes 64 semi-structured natural-language-based sentences (user stories) such as "as a user, I want to update a single property of a dataset instance without knowing all other properties" and "as a data scientist, I want to be able to create a dataset instance to a new version of its code." We generate 55 RE-related questions by using Algorithm 1: 41 out of 55 questions are focused on object abstractions, 13 out of 55 questions are related to action abstractions, and 1 out of 55 questions is focused on the actor abstraction. The process of manually formulating the RE-related questions and extracting the answers is performed in 78 min. Afterward, we use such questions for extracting and structuring the answers from the CS1 requirements document by using the QUARE model, achieving results in 21 s for uploading and processing the requirements document and 2.16 s for answering the 55 questions. We summarize some of the formulated RE-related questions, the manually extracted answers, and the answers extracted by using the QUARE prototype in Table 3. The resulting F1 score of the QUARE prototype on the CS1 is 0.77.
The CS2 comprises 107 semi-structured natural-language-based and natural-language-based requirements sentences (use case specifications) such as "This will allow users of the system to view pumpage data, as an estimated value, even for permittees that do not submit their pumpage information" and "This use case will be used when an actor needs to view information about the relocation of permitted quantities associated with a specific water use permit." We generate 58 RE-related questions following Algorithm 1. Such questions are focused on object abstractions (41 out of 58), action abstractions (16 out of 58), and actor abstractions (1 out of 58). Manually formulating and extracting the answers from the CS2 requirements documents is performed in 64 min. Also, we use such questions for extracting and structuring the answers by using the QUARE prototype. Such a process takes 27 s for uploading and processing the requirements document and 3.24 s for answering the 58 questions. We summarize the formulated questions, the manually extracted answers, and the answers extracted by using the QUARE prototype in Table 4. The resulting F1 score of the QUARE prototype on the CS2 is 0.74.
The average F1 score of the QUARE prototype is 0.76, according to the two case study results. We show the automated structuring of the extracted abstractions coming from the CS1 and CS2 by using the proposed meta-ontology for RE in Figs. 6 and 7, respectively.
We answer VRQ3 and summarize the case study results in Table 5.

User Study
We use a prototype of the QUARE model for validating its usage among the attendees of LASES (Latin American Software Engineering Symposium) 2022. We gathered a group of 14 undergraduate and postgraduate software engineering students from Universidad Nacional de Colombia and Universidad de Medellin, divided into four validation groups: validation group 1 (vg1) and validation group 2 (vg2) comprise 4 and 4 undergraduate students, respectively; validation group 3 (vg3) and validation group 4 (vg4) comprise 3 and 3 postgraduate students, respectively. All the involved students have taken the requirements elicitation course (undergraduate students) or have real-life experience in the RE process (postgraduate students).
The validation groups are asked to extract as much information as they can from a requirements document by using a question-answering process, following these instructions: (i) vg1 has no limitations on the questions they may ask; (ii) vg3 may ask RE-related questions on specific requirements abstractions, including actors, actions, and objects; (iii) vg2 and vg4 use the QUARE model for extracting the knowledge from the given requirements document.
We use the requirements document "Information Technology BLIT102," coming from the PURE dataset (Ferrari et al. 2017), for the validation groups to extract the answers (requirements abstractions). Such a requirements document comprises 55 natural-language-based requirements sentences of the data management domain, such as "the user shall be able to view the hyperlinks and descriptions listed" and "the user must associate at least one role, division, designator code, and lab location at the time of creating/adding a new user." We summarize some of the questions and answers generated by the validation groups during the elicitation process in Table 6.
We summarize the number of generated questions, their answers, and the type of extracted requirements abstractions in Table 7.

Comparing the QUARE model with other NLP4RE approaches
In this Section, we compare the performance of the QUARE model with other NLP4RE approaches focused on the extraction task. Laiq and Dieste (2020) propose a chatbot for training novice requirements engineers. Such a proposal includes an abstraction identification step which is focused on identifying goals, actions, and actors. Also, they validate the chatbot by using a public requirements document. Since the QUARE model is focused on extracting actors, actions, and objects, we compare the performance of the proposals on actors and actions. We use the same requirements document for answering the questions "what are the actors of the discourse?" and "what are the actions of the discourse?". We identify 14 out of 18 requirements abstractions by using the QUARE model. On the other hand, the chatbot for novice software engineers (Laiq and Dieste 2020) manages to identify 12 out of 18. Sleimi et al. (2019) propose a query-based approach for extracting legal-related requirements abstractions, including actors and actions. Such a proposal includes an abstraction extraction step based on syntactic patterns for extracting actions and actors. We develop a Python script according to such syntactic patterns for extracting actions and actors from the testing dataset, so we can compare the extracted abstractions. The QUARE model achieves a precision score of 0.88 and a recall score of 0.89 for actor abstractions, and a precision score of 0.86 and a recall score of 0.89 for action abstractions. The query-based approach (Sleimi et al. 2019) achieves a precision score of 0.76 and a recall score of 0.69 for actor abstractions, and a precision score of 0.74 and a recall score of 0.68 for action abstractions.

Discussion
We achieve promising results with the RENER system, as shown in Fig. 5. The annotation dataset comprises 4000 requirements sentence examples, encompassing key RE-related features including several writing styles, software product domains, and requirements abstractions. Such a variety of examples provides a robust foundation for training generalized RE-oriented NER models, allowing software analysts to address several writing styles and software domains, avoiding the limitations coming from classical NLP4RE proposals, which are mostly focused on syntactic and semantic patterns and domain-specific structures. Although the RE process comprises several software domains, software requirements are based on recurrent terminology and structures regardless of their nature. Such features make software requirements suitable for a better generalization in the automated RE analysis. Also, such particularities make NLP tools trained on generic English corpora not suitable for NLP4RE tasks (Ferrari et al. 2017). The abstraction extraction model achieves an F1 score of 0.87 and the relation extraction model achieves an F1 score of 0.74. Even though we use the same dataset for training both models, a lower result on relation extraction is expected as it is a more complex task. Also, since we train the models of the RENER system based on supervised learning, the lower number of examples of some relations such as hasAttribute (see Table 2) directly affects the performance of the relation extraction model. We perform some requirements abstraction and relation extractions from some requirements and non-requirements sentences for illustrating the functionality and the flexibility of the RENER system, as shown in Figs. 8 and 9.
The abstraction extraction model allows software analysts to identify and extract requirements abstractions from natural-language-based sentences (see Fig. 8, example a), passive voice variations of requirements sentences (see Fig. 8, example b), non-interactive requirements sentences (see Fig. 8, example c), complex requirements sentences comprising multiple objects and actions (see Fig. 8, examples d, e, g, and h), multiple requirements writing styles (see Fig. 8, examples f and c), and requirements sentences including candidate actions without explicit actors, such as the action "made" (see Fig. 8, example f).
Software analysts may use the relation extraction model for linking such requirements abstractions, defining their semantic relationships, as we show highlighted in red in Fig. 9.
Such an example includes the abstractions "painter" and "canvas" related to the hasAction relationship with a score of 0.97. On the other hand, the abstraction "painter" should have a hasAttribute relationship with the abstractions "canvas" and "brushes" because of the pronoun "his" before such words, i.e., "canvas" and "brushes" are objects belonging to the actor "painter." However, the hasAttribute relations between the actor "painter" and the objects "canvas" and "brushes" are predicted with scores of 0.01 and 0.0, respectively, i.e., no such relationships exist according to the relation extraction model predictions. As we state in the labeled dataset summary (see Table 2), the relation hasAttribute is the label with fewer examples, so the model is expected to have some limitations for identifying such a relation.
The QUARE model allows software analysts to answer RE-related questions from requirements documents including several requirements writing styles and software domains. Such a process is close to a real-life RE scenario based on structured and semi-structured interviews, allowing practitioners to elicit key requirements abstractions such as actors, actions, and objects characterizing the software domain. In addition, the QUARE model is used for building an early conceptualization of the software domain, making it easier to communicate and validate the software domain. The meta-ontology for RE is a key component of the QUARE model as it is used for defining the requirements abstractions that may be identified, extracted, and structured. Also, such a meta-ontology for RE is used for describing the type of requirements abstractions and semantic relationships to be labeled by the RENER system. As the scope of the meta-ontology grows, the scope of the QUARE model grows, allowing practitioners to comprehensively elicit requirements from requirements documents regardless of the software domain. The QUARE model relies on the RENER system for identifying and labeling the requirements abstractions and their semantic relationships, so limitations of the RENER system directly affect the QUARE model, as we notice in most of the questions of the type "what are the attributes of Concept X?". Such questions are often answered incorrectly due to the low performance of the relation extraction model on such specific relations, e.g., the gold answer to the question "what are the attributes of the water use permit?" coming from the CS2 is "object upgrade and administrative operation"; however, the predicted answer is "app, code, type, dataset type, and version," showing a clear limitation of the model for such a type of relation. On the other hand, we achieve promising results for answering RE-related questions focused on the requirements abstractions Actor, Action, and Object. We manage to answer 113 RE-related questions coming from two different software domains and requirements writing styles with an F1 score of 0.76, reducing the average manual extraction time by 110 times while formulating RE-related questions, analyzing requirements documents, extracting requirements abstractions and their relationships, and structuring them by using the proposed meta-ontology for RE.
The validation groups show that the QUARE model may be used for consistently supporting the RE process. Vg1 manages to generate 7 RE-related questions and extract 14 requirements abstractions from the proposed requirements document. Such questions are mostly focused (3 out of 7) on the action abstraction, including questions such as "what are the functions of the system?". Some questions are intended to explore the goals of the system, e.g., "what is the main goal of the system?", and its potential weaknesses, e.g., "what are the weaknesses of the system?". While there are some questions focused on key requirements abstractions such as actors and their actions, there is no clear link between the proposed questions, making it harder to understand the aim of the interview. Such questions show a bias in the process, as the students ask the questions they have used before in their requirements elicitation courses and projects. Vg3 comprises postgraduate students who have real-life experience in requirements elicitation, so the proposed questions are more accurate, showing a logical connection among them. Such a validation group manages to generate 33 RE-related questions, allowing them to extract actors (7), actions (44), and objects. While they extract a high number of requirements abstractions, they often overlook some objects and their attributes, as they did not think those would be important for the software domain of interest. On the other hand, vg2 and vg4 manage to generate 70 RE-related questions by using the QUARE model. Most of the questions are focused on object abstractions (61 out of 70). Some other questions are focused on the action abstraction (8 out of 70) and the actor abstraction (1 out of 70). Such a set of questions allows vg2 and vg4 to extract 123 requirements abstractions and relationships, following a structured-interview process. Such an approach is intended to generate questions in depth according to the extracted answers, making it easier to generate new RE-related questions as the software analysts gather more knowledge about the software domain they are working with. The QUARE model allows vg2 and vg4 to reduce the time spent identifying and extracting the requirements abstractions by twenty times compared with the other validation groups. Also, vg2 and vg4 have an early representation of the software domain, represented by the populated meta-ontology, which they can use for validation purposes. Although vg2 and vg4 have different levels of skills and experience related to the RE process, both validation groups manage to use the QUARE model for extracting RE-related knowledge.
The QUARE model outperforms other NLP4RE approaches due to its flexibility and independence of the software domain and requirements writing styles. In contrast to the chatbot-based proposal (Laiq and Dieste 2020), which is limited to extracting requirements abstractions from the educational domain, the QUARE model extracts the requirements abstractions and their relationships from several software domains. In addition, since the query-based approach is based on syntactic patterns (Sleimi et al. 2019), such an approach is constrained by specific writing styles, hardening the identification of requirements abstractions from complex sentences such as "the project team must demonstrate mockups of UI changes to project stakeholders early in the development process." Such an example comprises two actors: "project team" and "project stakeholders," yet the query-based approach labels "project team" as the only actor because it is the nominal subject of the main verb of the sentence ("demonstrate"). The QUARE model manages to identify both actors, allowing for a better characterization of the software domain.
The work products resulting from the QUARE model comprise the RENER system, the QUARE prototype, the meta-ontology for RE, and the datasets used for validating both the RENER system and the QUARE prototype. Since the focus of the QUARE model is on the extraction task of the NLP4RE field, such work products may be used for supporting and improving other NLP4RE tasks. We summarize such work products and their availability in Table 8.

Threats to validity
In the context of external validity, we identify some threats related to the generalization of the QUARE model results in other RE scenarios, including other software domains and requirements writing styles. However, the QUARE model is based on the structure of the meta-ontology for RE, comprising common requirements abstractions belonging to any software domain, such as actors, actions, and objects. Also, the RENER system provides a precise approach for handling multiple requirements writing styles and natural-language-based texts. Such components may be reused and extended, allowing practitioners to scale them up as the scope of the software domain grows. Such a process may be done by including new classes and properties in the meta-ontology for RE (e.g., goals, events, and constraints), extending the provided annotated datasets, and retraining the RENER system models.
On the other hand, we identify some threats that may impact the internal validity related to the training and building process of the RENER system models. We annotate the dataset for the abstraction extraction model and the relation extraction model ourselves, making it prone to experimenter bias. However, we mitigate such a threat by randomly cross-reviewing 20% of the annotations labeled by each annotator. Also, we use widely used definitions for requirements abstractions such as actors, actions, and objects (Lim et al. 2021). Another threat is concerned with the language model we use for training the models, as there are several language models available. However, the usage of the spaCy language models in the NLP4RE context has been increasing constantly (Zhao et al. 2021). In addition, practitioners may achieve state-of-the-art results by using such models (Honnibal et al. 2020).

Conclusions and challenges
In this paper, we proposed QUARE, a question-answering model for requirements elicitation aimed at mitigating the limitations of NLP4RE proposals for addressing several requirements writing styles and software domains. We proposed a meta-ontology for RE including key requirements abstractions such as actor, action, and object, allowing software analysts to structure extracted requirements and link them by using semantic relationships such as hasAction, hasAttribute, hasRelatedConcept, and hasRelatedActor regardless of the nature of the software domain. A new annotated requirements dataset was composed, comprising 4000 requirements sentences and more than 30,000 requirements abstraction and semantic relationship annotations, following the structure of the meta-ontology for RE. Such a dataset was used for building and training the RENER system, a NER and relation extraction system focused on RE, allowing software analysts to analyze requirements documents written in several requirements writing styles such as natural language, semi-structured natural language, and structured natural language. The RENER system may also be used for other NLP4RE tasks such as extraction and modeling. The QUARE model is a new contribution to the NLP4RE research field, integrating RE, QASs, and NLP. A QAS was developed following the QUARE model architecture for validating its performance by using real-life case studies coming from the PURE dataset. Such case studies comprise several requirements writing styles and software domains. We achieved promising results with an F1 score of 0.76 for the QUARE model, answering RE-related questions focused on some key requirements abstractions such as actors, actions, and objects, outperforming manual RE processes.
The main contributions of the QUARE model are three: (i) we build a QAS for RE, allowing software analysts to identify, extract, and structure requirements abstractions coming from several software domains. Such an approach provides an experience closer to a real RE scenario, as software analysts may extract such abstractions by using RE-related questions; (ii) we develop a NER and relation extraction system for RE, allowing software analysts to process requirements documents comprising several requirements writing styles, eliminating the dependency on manually crafted syntactic and semantic rules. Such a system may be retrained, extended, and reused in other NLP4RE scenarios; (iii) we provide an annotated dataset of 4000 publicly available requirements sentences coming from the PURE dataset (Ferrari et al. 2017), including label tags (e.g., actor, action, and object) and semantic relationships (e.g., "hasAction," "hasRelatedActor," and "hasAttribute"). Such a dataset comprises more than 30,000 requirements abstractions and semantic relationships covering several software domains and requirements writing styles. Some challenges are identified: (i) the QUARE model is focused on extracting actors, actions, and objects; however, other key requirements abstractions may be identified and extracted, including goals, events, and constraints; (ii) the RENER system may be used in other NLP4RE scenarios such as real-time detection of actors, actions, and objects from interviews by using speech-to-text techniques. Also, the RENER system may be used for supporting RE-oriented chatbots and dialogue systems; (iii) the QUARE model may be used as an educational environment for novice software analysts and requirements engineering students, allowing them to improve their skills in asking RE-related questions and modeling software domains.
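To make the dataset contribution concrete, the following is a minimal sketch of what one annotated example could look like, with entity spans (actor, action, object) and semantic relationships, and how triplets in the form abstraction-relation-abstraction can be derived from it. The record layout is an illustrative assumption; the actual dataset built from PURE may use a different schema.

```python
# Hypothetical record layout for one annotated requirements sentence.
# Entity offsets are character indices into "text"; relations point at
# entity list positions. This schema is assumed for illustration only.
example = {
    "text": "The librarian registers a new book.",
    "entities": [
        {"start": 4, "end": 13, "label": "Actor"},    # "librarian"
        {"start": 14, "end": 23, "label": "Action"},  # "registers"
        {"start": 30, "end": 34, "label": "Object"},  # "book"
    ],
    "relations": [
        {"head": 0, "tail": 1, "label": "hasAction"},
        {"head": 1, "tail": 2, "label": "hasRelatedConcept"},
    ],
}

def triplets(record):
    """Turn entity/relation annotations into (abstraction, relation, abstraction) triplets."""
    spans = [record["text"][e["start"]:e["end"]] for e in record["entities"]]
    return [(spans[r["head"]], r["label"], spans[r["tail"]]) for r in record["relations"]]

print(triplets(example))
# [('librarian', 'hasAction', 'registers'), ('registers', 'hasRelatedConcept', 'book')]
```

Representing relations by entity index rather than by repeated text keeps annotations unambiguous when the same word occurs twice in a sentence.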

• Discourse is a new class used for representing the extracted requirements abstractions in the context of the requirements document, encapsulating the software domain from the stakeholder perspective. Since the discourse class encompasses the whole software domain, such a class should include one property "hasX" for each other class in the meta-ontology, e.g., hasAction and hasConcept.
• Concept is a superclass adapted from Zapata et al. (2010). The concept class is abstract, so it has no instances. We define two subclasses related to the concept class: Actor and Object. In addition, we include the property hasAttribute, which is inherited by both the classes Actor and Object and is used for representing concept-attribute relationships, e.g., "the user [Actor] has a name [Object]," "the car [Object] has a wheel [Object]," and "the employer [Actor] has an employee [Actor]."
• Actor is a class adapted from Zapata et al. (2010) which is used for representing the roles performing some action in the software, e.g., "baker," "user," and "electrical engineer." We define two properties for the class Actor: hasRole is used for representing the noun related to a specific actor, e.g., the string "baker"; hasAction is used for relating actors, actions, and concept subclasses (i.e., actors and objects) at the instance level, e.g., "baker [Instance of Actor] makes [Instance of Action] bread [Instance of Object]." Such a class is the main class of the meta-ontology.
• Object is a class adapted from Zapata et al. (2010).
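The class hierarchy above can be sketched in code. The class and property names (Discourse, Concept, Actor, Object, hasRole, hasAction, hasAttribute) follow the meta-ontology; the Python structure itself is an illustrative assumption, not the authors' Protégé implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Concept:
    """Abstract superclass; Actor and Object inherit hasAttribute."""
    name: str
    has_attribute: List["Concept"] = field(default_factory=list)

@dataclass
class Object(Concept):
    """A concept that is not a role, e.g., 'bread'."""

@dataclass
class Action:
    """A verb performed by an actor, optionally on a target concept."""
    verb: str
    target: Optional[Concept] = None

@dataclass
class Actor(Concept):
    """The role performing some action; the main class of the meta-ontology."""
    has_role: str = ""
    has_action: List[Action] = field(default_factory=list)

@dataclass
class Discourse:
    """Encapsulates the whole software domain: one has_* link per class."""
    has_actor: List[Actor] = field(default_factory=list)
    has_object: List[Object] = field(default_factory=list)
    has_action: List[Action] = field(default_factory=list)

# "baker [Instance of Actor] makes [Instance of Action] bread [Instance of Object]"
bread = Object("bread")
baker = Actor("baker", has_role="baker", has_action=[Action("makes", bread)])
domain = Discourse(has_actor=[baker], has_object=[bread])
```

Making Concept the shared superclass means hasAttribute relationships such as "the car has a wheel" and "the employer has an employee" need no special cases.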

Fig. 1 A meta-ontology for RE. Generated by using Protégé™ (Knublauch et al. 2004). Adapted from Zapata et al. (2010)

The models make predictions based on weighted values resulting from learning the patterns and features of the annotated examples. The abstraction extraction model is a NER system focused on RE, including three entities: Actor, Action, and Object. The model building process starts with the spaCy configuration file and the annotated dataset. The spaCy configuration file includes all the initial settings of the model, including default hyperparameters, embedding algorithms, optimizers, and batch size. The annotated dataset is randomly split into training data (80% of the examples) and validation data (20% of the examples), allowing for testing and validating the model by using unseen data (Gholamy et al. 2018). The training process is based on an iterative approach where predictions are compared against the training examples to estimate how the weighted values of the model should be changed, so the predicted values fit better with the expected values. The resulting model may be used for identifying and labeling RE-related concepts from several requirements writing styles and software domains. The relation extraction model is focused on classifying the relationship between two identified entities (requirements abstractions) into one of the predefined RE-related semantic relationships: hasAction, hasAttribute, and hasRelatedConcept. Such relations are used for making triplets in the form abstraction-relation-abstraction, representing how requirements abstractions relate to other requirements abstractions in the software domain. We use spaCy for building and training the relation extraction model. The input of the relation extraction model is a labeled requirements sentence coming from the abstraction extraction model, e.g., "as a user [Actor], I want to download [Action] my profile picture [Object]
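The random 80/20 train/validation split described above can be sketched with the standard library alone; the actual pipeline uses spaCy's own configuration file and corpus utilities, so the function below is a simplified illustration.

```python
import random

def split_dataset(examples, train_ratio=0.8, seed=42):
    """Shuffle annotated examples and split them into train/validation sets.

    A fixed seed keeps the split reproducible across training runs.
    """
    shuffled = examples[:]               # copy so the input order is untouched
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# With 4000 annotated sentences, the split yields 3200 training
# and 800 validation examples.
train, dev = split_dataset(list(range(4000)))
print(len(train), len(dev))  # 3200 800
```

Holding out the 20% validation slice from training is what lets the reported metrics reflect performance on unseen data.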

Fig. 4 QUARE abstraction structuring stage. Generated by using Protégé™ (Knublauch et al. 2004)

We define three validation research questions: (VR1) What is the performance of the RENER abstraction extraction model? (VR2) What is the performance of the RENER relation extraction model? (VR3) What is the performance of the QUARE prototype?
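The validation questions above are answered with standard information-retrieval metrics. As a minimal sketch, precision, recall, and the F1 score can be computed from true-positive, false-positive, and false-negative counts; the counts below are illustrative, chosen only to reproduce the F1 value of 0.76 reported for the QUARE model.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = harmonic mean of precision and recall."""
    precision = tp / (tp + fp)   # correct extractions / all extractions made
    recall = tp / (tp + fn)      # correct extractions / all expected extractions
    return 2 * precision * recall / (precision + recall)

# Illustrative counts: 76 correct, 24 spurious, 24 missed extractions.
print(round(f1_score(76, 24, 24), 2))  # 0.76
```

Using the harmonic mean penalizes a model that extracts many abstractions indiscriminately (high recall, low precision) as much as one that extracts too cautiously.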

Fig. 6 CS1 answers representation by using the meta-ontology for RE. Generated by using Protégé™ (Knublauch et al. 2004)

Table 1

Requirements documents come from several software domains, making it harder for NLP-based tools trained on regular English texts to achieve results in the context of NLP4RE (Gacitua et al. 2010; Janssens 2019). Therefore, we build a dataset comprising 4000 requirements sentences coming from the PURE dataset (Ferrari et al. 2017), which comprises 79 publicly available requirements documents. Such requirements documents come from several software domains and comprise several requirements writing styles: structured natural language, e.g., specific domain templates; semi-structured natural language, e.g., user stories and use case specifications; and natural-language-based requirements. Prodigy is a web-based annotator supporting entity and semantic relationship labeling.

Table 5
Case study results summary

Table 7
Generated RE-related questions and extracted abstractions summary

Table 8
Work products summary