1 Introduction

Search engines are commonly used to seek health information, an activity considered the third most popular on the Internet [1]. Despite the increasing use of the Web to search for health-related information, inequalities in access to health information may exist [6]. Users with low levels of health literacy can struggle to satisfy their information needs because health-related information usually contains medico-scientific expressions that are not easily understandable [13]. The gap between lay and medico-scientific terminologies limits this access and can be narrowed through query modification techniques [10]. There is evidence that multilingual query suggestions in lay and medico-scientific terminologies improve health information retrieval by laypeople [9].

Taking this into account, Health Suggestions was developed as an extension for Google Chrome that suggests queries in lay and medico-scientific terminologies, in both English and Portuguese, based on the Consumer Health Vocabulary (CHV) [8]. To improve the system, we propose and evaluate strategies for query suggestion that involve multi-concept recognition and information from the Unified Medical Language System (UMLS). For evaluation, each newly generated query is used to retrieve documents from an English and a Portuguese test collection. The strategies are evaluated on the relevance of the retrieved documents and their understandability by lay users, and are compared with the results of the queries originally suggested by Health Suggestions.

2 Related Work

When users try to express their information need, they might use keywords that are too general or that differ from the ones included in documents, or they might use too few terms, making the query difficult for the system to “understand” [5]. Techniques such as query expansion, query refinement, and query suggestion have been proposed to solve this problem, improving the relevance and comprehensibility of the retrieved documents.

Zeng et al. [12] developed a system that suggests alternative or additional terms to the query using logs and the co-occurrence of concepts in medical documents, as well as the semantic relationships existing in medical vocabularies. Liu and Wesley [7] proposed a query expansion method that exploited the UMLS, appending additional relevant terms to the original query.

A query suggestion system was developed by Lopes and Ribeiro [9], combining multilingual alternatives (in Portuguese and English) with the use of lay and medico-scientific terminology. The authors used the CHV, which maps technical terms to consumer-friendly language. For each query, they identify the associated concept and then return its CHV-preferred and UMLS-preferred names in English and Portuguese. Lopes and Fernandes [8] created HealthSuggestions, an extension for Google Chrome that uses the CHV to assist users in obtaining high-quality search results in the health domain.

3 Proposed Methods for Suggesting Queries

To generate the query suggestions, we implemented several methods that use multi-concept recognition to detect the medical concepts included in the initial query and use the information from UMLS as a knowledge source. All methods follow the approach described in Fig. 1. Briefly, the initial query is translated into English, and its medical concepts are identified. For each of these concepts, we select lay and medico-scientific expressions, concatenate them to compose the corresponding suggestions in English and, in the end, we translate them to the original language. All translations are done with Google Translator.

Fig. 1. Process followed for the generation of query suggestions.
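
The steps of this process can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the three callables (`translate`, `recognize`, `select`) are hypothetical stand-ins for the translation service, the concept recognizer, and one of the selection methods, and the dictionary keys are invented for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins (none of these names come from the paper):
#   translate(text, source, target) -> translated text
#   recognize(text)                 -> identifiers of the concepts found
#   select(concept)                 -> (lay expression, medico-scientific expression)
Translate = Callable[[str, str, str], str]
Recognize = Callable[[str], List[str]]
Select = Callable[[str], Tuple[str, str]]


def suggest_queries(query: str, lang: str, translate: Translate,
                    recognize: Recognize, select: Select) -> Dict[str, str]:
    """Return lay and medico-scientific suggestions in English and in `lang`."""
    # Step 1: work in English, translating only when needed.
    english = query if lang == "en" else translate(query, lang, "en")

    # Step 2: identify the medical concepts in the English query.
    concepts = recognize(english)

    # Step 3: pick one lay and one medico-scientific expression per concept.
    pairs = [select(concept) for concept in concepts]

    # Step 4: concatenate the expressions to compose the English suggestions.
    suggestions = {
        "lay_en": " ".join(lay for lay, _ in pairs),
        "sci_en": " ".join(sci for _, sci in pairs),
    }

    # Step 5: translate the suggestions back to the original language.
    if lang != "en":
        suggestions[f"lay_{lang}"] = translate(suggestions["lay_en"], "en", lang)
        suggestions[f"sci_{lang}"] = translate(suggestions["sci_en"], "en", lang)
    return suggestions
```

Passing the three components as arguments keeps the sketch independent of any particular translation or recognition service, so each selection method can be swapped in without changing the surrounding steps.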

Several strategies were analyzed for multi-concept recognition. We chose MetaMap, a rule-based concept recognition system that discovers UMLS concepts referred to in free text [2], which fits well with our use of UMLS as the knowledge source. MetaMap provides a list of mappings for each identified concept. In each query suggestion method, we used two approaches to select the best mapping. In the first, we choose the first mapping, that is, the one with the highest score. In the second, we use the Word-Sense Disambiguation (WSD) feature, which favors mappings that are semantically consistent with the surrounding text [3]. For each approach, we used the UMLS Concept Unique Identifier (CUI) and the name of the concepts as input.
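
The first selection approach can be illustrated with a short sketch. The `Mapping` record and its fields are hypothetical and do not reproduce MetaMap's output format; the sketch only assumes that each candidate carries a score, a CUI, and a name, and that a higher score is better, as described above. For the WSD approach, the sketch assumes the same function would simply be applied to the candidate list produced by a disambiguated run.

```python
from typing import List, NamedTuple


class Mapping(NamedTuple):
    """Hypothetical record for one candidate mapping of a query phrase."""
    score: float  # higher means a better match, as assumed in the text
    cui: str      # UMLS Concept Unique Identifier
    name: str     # concept name


def best_mapping(mappings: List[Mapping]) -> Mapping:
    """Return the highest-scoring mapping; on ties, the earliest one wins."""
    if not mappings:
        raise ValueError("no mappings were produced for this phrase")
    return max(mappings, key=lambda m: m.score)
```

The CUI and the name of the selected mapping are then the values handed to the suggestion method.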

Table 1. Proposed methods.

The selection of lay and medico-scientific synonyms is what differentiates the suggestion methods. All the methods use the UMLS, a knowledge base that aggregates multiple thesauri of the medical domain [4], each composed of concepts related to health, their various names, and the relationships that exist between them. One of the UMLS vocabularies is the CHV, a vocabulary that connects simple, everyday health words to technical terms used by health care professionals. For each concept, it stores the best way to express it for a lay audience (CHV-preferred) and the same for a professional audience (UMLS-preferred).

Differences between the methods are summarized in Table 1. In the CHV-preferred/UMLS-preferred method, the selected synonyms correspond to the CHV-preferred and UMLS-preferred expressions for each concept. This is the only method using exclusively one vocabulary.

The other methods use the whole UMLS to obtain an expression or a subset of expressions, from which we select the lay and medico-scientific synonyms. The lay synonym is the expression with the highest similarity to the lay terminology, and the medico-scientific synonym is the expression closest to the medico-scientific terminology. To determine the closeness of the expressions to these terminologies, we used a previously created algorithm [11].
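
This selection step can be sketched as below. The closeness algorithm of [11] is not reproduced here: `lay_score` and `sci_score` are hypothetical scoring functions supplied by the caller, each assumed to return a higher value for expressions closer to the respective terminology.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical signature for a closeness measure: expression -> score.
Scorer = Callable[[str], float]


def pick_synonyms(expressions: Iterable[str], lay_score: Scorer,
                  sci_score: Scorer) -> Tuple[str, str]:
    """Return (lay synonym, medico-scientific synonym) from the candidates."""
    candidates = list(expressions)
    if not candidates:
        raise ValueError("the concept has no candidate expressions")
    # Each synonym is the candidate that maximizes the matching closeness score.
    lay = max(candidates, key=lay_score)
    sci = max(candidates, key=sci_score)
    return lay, sci
```

When a concept has a single candidate expression, as in the Preferred Atoms method, both synonyms are necessarily the same expression.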

The Preferred Atoms method uses the default preferred atom associated with the CUI. The All preferred/synonym atoms method retrieves a list of all English atoms that are preferred names or synonyms in the various vocabularies of the UMLS. The All Atoms method retrieves all the English atoms, instead of extracting only the preferred and synonym ones. To explore other atoms associated with a concept, the All Atoms + Child/Parent/Same Relations method identifies all English atoms associated with a concept and then retrieves the atoms related to them through parent, child, or same relationships. Finally, the Broader/Narrower Concepts method retrieves broader and narrower atoms that are directly connected with the initially identified concept, instead of looking for atoms associated with the concept.
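
The difference between the three atom-based methods that do not follow relations can be illustrated on a toy in-memory structure. The `Atom` record, its flags, and the method keys are hypothetical and do not mirror the UMLS data model or API; the relation-based methods are left out because they require traversing UMLS relationships.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Atom:
    """Hypothetical, simplified view of one name of a concept."""
    text: str
    language: str             # e.g. "ENG"
    is_preferred: bool        # preferred name in its source vocabulary
    is_synonym: bool          # synonym in its source vocabulary
    is_default: bool = False  # default preferred atom of the concept


def candidates(atoms: List[Atom], method: str) -> List[str]:
    """Return the candidate expressions that a given method would consider."""
    english = [a for a in atoms if a.language == "ENG"]
    if method == "preferred_atom":
        return [a.text for a in atoms if a.is_default]
    if method == "all_preferred_synonym":
        return [a.text for a in english if a.is_preferred or a.is_synonym]
    if method == "all_atoms":
        return [a.text for a in english]
    raise ValueError(f"unknown method: {method}")
```

Each method only widens or narrows the candidate set; the choice of the lay and medico-scientific synonym within that set is the same in all of them.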

4 Evaluation

To assess and compare the effectiveness of the developed methods, we used two test collections, one in English and the other in Portuguese. The English collection is provided by the Consumer Health Search Task in the 2018 edition of the CLEF eHealth Lab. This task uses a set of 50 English queries and a document corpus with 5,535,120 web pages acquired from a CommonCrawl dump. It also provides 26,025 judgments of relevance and understandability.

The Portuguese collection was built specifically for this work. We used the English queries provided by the User-Centred Health Information Retrieval and Patient-Centred Information Retrieval Tasks of the 2015 and 2016 editions of the CLEF eHealth Lab. We translated the 208 queries into Portuguese with the collaboration of a medical doctor. Although the dataset of the 2015 edition included Portuguese translations of the queries, some of them were in Brazilian Portuguese (PT-BR), so we decided to translate them manually into European Portuguese (PT-PT).

The queries were used in a user study with 104 participants. These participants were students who, as part of a work assignment, were given two tasks regarding two different queries. In each task, they were asked to judge the relevance and understandability of the top 30 documents retrieved by four search engines: Google, Bing, Yahoo!, and HONSearch. The 16,505 assessed documents and the judgments of the participants complete this collection. The number of documents is lower than 24,960 (208 × 4 × 30) because the documents retrieved by the four search engines overlap and because a search engine may return fewer than 30 results.

We indexed the document corpora in Elasticsearch. For each query, we compute four types of suggestions: lay and medico-scientific, each in English and Portuguese. Using the judgments of each test collection as ground truth, we assessed the performance of each suggestion through the top 10 documents retrieved by Elasticsearch for that suggestion. For this evaluation, our baseline is the performance of the suggestions provided by Health Suggestions.
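
As an illustration of the retrieval step, the sketch below builds a search request body for the top 10 documents. The query type and the indexed field are not stated in this work: the use of a `match` query and the field name `content` are assumptions made only for the example.

```python
from typing import Any, Dict


def top_k_request(suggestion: str, field: str = "content",
                  k: int = 10) -> Dict[str, Any]:
    """Build a search request body asking for the k best full-text matches.

    `field` is a hypothetical name for the indexed document text.
    """
    return {"size": k, "query": {"match": {field: suggestion}}}
```

The same request would be issued once per suggestion, so that the baseline and every proposed method are compared on ranked lists of equal length.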

The performance was assessed through the Understandability-based RBP (uRBP) and uRBP graded (uRBPgr). uRBP is a measure that increases when the user chooses a document that is considered both relevant and understandable, based on binary assessments. The uRBPgr allows graded assessment values [14]. For each method, we conduct one evaluation considering word-sense disambiguation and one without it.
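
A minimal sketch of how such a measure can be computed is given below. It assumes the usual rank-biased precision form, in which the gain at each rank is the product of the relevance and the understandability of the document, discounted by a persistence parameter `rho`. The value of `rho` is not stated in this work, so the default of 0.8 is only a placeholder. With binary values the function corresponds to uRBP, and with graded values in [0, 1] to uRBPgr.

```python
from typing import Sequence


def urbp(relevance: Sequence[float], understandability: Sequence[float],
         rho: float = 0.8) -> float:
    """Understandability-biased RBP over a ranked list (best document first).

    `rho` is the persistence parameter; 0.8 is an assumed placeholder value.
    """
    if len(relevance) != len(understandability):
        raise ValueError("both sequences must describe the same ranked list")
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must be in [0, 1)")
    total = 0.0
    # `rank` is zero-based, so rho ** rank is the discount for position rank + 1.
    for rank, (rel, und) in enumerate(zip(relevance, understandability)):
        total += (rho ** rank) * rel * und
    return (1.0 - rho) * total
```

Because the gain is a product, a document that is relevant but not understandable contributes nothing, which is why the measure rewards suggestions that retrieve documents satisfying both criteria near the top of the ranking.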

5 Results

The best methods select the CHV-preferred expressions for lay suggestions and the UMLS-preferred expression for the medico-scientific suggestions (Table 2). Both methods outperform the baseline.

Table 2. Evaluation of the methods using the English and Portuguese test collections.

Globally, the methods with better performance are the ones that consider the preferred atoms of the different vocabularies from the UMLS, mainly the CHV. Using child relations does not help, probably due to the specificity of the suggestion. Using broader terms (parent and broader relations) proved to be more useful since other designations for the same concept are being explored.

In the English test collection, WSD does not improve the performance of the methods that use UMLS-preferred terms but is useful when exploring relations. In the Portuguese collection, results are, in general, slightly better with WSD. However, this difference is so small that we conclude it is better to disambiguate in methods that explore relations and not to disambiguate in methods that pick the preferred terms. Context is essential in methods that use relations, which may explain why disambiguation matters for them.

The average number of seconds needed to formulate a suggestion is presented, for each method, in Table 3. As can be seen, methods that consider the relationships of atoms take longer than the others. Relations between concepts should be preferred over relations between atoms, since they take less time to process and yield similar performance. In English, the use of WSD helps to reduce the processing time because fewer atoms are retrieved and, therefore, less processing is needed afterward. The CHV-preferred/UMLS-preferred methods are the fastest since they only need to identify the concept and retrieve the corresponding CHV-preferred or UMLS-preferred expression.

Table 3. Average number of seconds to generate a suggestion.

6 Conclusions

The majority of the developed methods proved to be better than the baseline, helping the user to retrieve more relevant and understandable documents. Using UMLS-preferred terms resulted in better performance. Other methods explored broader, more specific, and similar terms but did not achieve results as good. The best method for lay suggestions is the one that uses the CHV-preferred expressions (the most familiar ones) to substitute the identified concepts. The best method for medico-scientific suggestions uses the UMLS-preferred expressions. These methods are better not only in relevance and understandability but also in generation time. Since word-sense disambiguation reduces the time needed to generate new suggestions and either slightly improves or does not affect the overall performance, we conclude that it should be used.