Radiologists frequently search the Web to find information they need to improve their practice, and knowing the types of information they seek could be useful for evaluating Web resources. Our goal was to develop an automated method to categorize unstructured user queries using a controlled terminology and to infer the type of information users seek. We obtained the query logs from two commonly used Web resources for radiology. We created a computer algorithm to associate terms from the RadLex controlled vocabulary with the user queries. Using the RadLex hierarchy, we determined the high-level category associated with each RadLex term to infer the type of information users were seeking. To test the hypothesis that the assignment of term categories to user queries is non-random, we compared the distribution of term categories in RadLex with that in user queries using the chi-square test. Of the 29,669 unique search terms found in user queries, 15,445 (52%) could be mapped to one or more RadLex terms by our algorithm. Each query contained an average of one to two RadLex terms, and the dominant categories of RadLex terms in user queries were diseases and anatomy. While the same types of RadLex terms were predominant in both RadLex itself and user queries, the distributions of term types in user queries and in RadLex were significantly different (p < 0.0001). We conclude that RadLex can enable processing and categorization of user queries of Web resources and can support understanding of the types of information users seek from radiology knowledge resources on the Web.
Ontologies, terminologies, vocabularies, RadLex, software tools, controlled vocabulary, natural language processing, web technology
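
The pipeline summarized in the abstract (map query text to vocabulary terms, roll each term up to a high-level category, then compare category distributions) can be illustrated with a minimal sketch. This is not the authors' algorithm: the toy vocabulary, the whole-token matching rule, and the function names `map_query`, `category_counts`, and `chi_square_statistic` are all assumptions made for illustration, and the statistic shown is a plain goodness-of-fit form that may differ from the test actually used.

```python
from collections import Counter

# Hypothetical toy vocabulary: term -> high-level category.
# These entries are illustrative only and are not taken from RadLex.
TOY_VOCAB = {
    "pneumonia": "disease",
    "fracture": "disease",
    "lung": "anatomy",
    "femur": "anatomy",
    "ct": "modality",
}


def map_query(query, vocab=TOY_VOCAB):
    """Return vocabulary terms that appear as whole tokens in the query."""
    tokens = query.lower().split()
    return [token for token in tokens if token in vocab]


def category_counts(queries, vocab=TOY_VOCAB):
    """Count the high-level categories of all terms matched across queries."""
    counts = Counter()
    for query in queries:
        for term in map_query(query, vocab):
            counts[vocab[term]] += 1
    return counts


def chi_square_statistic(observed, expected_props):
    """Goodness-of-fit statistic: sum of (observed - expected)^2 / expected."""
    total = sum(observed.values())
    stat = 0.0
    for category, proportion in expected_props.items():
        expected = total * proportion
        stat += (observed.get(category, 0) - expected) ** 2 / expected
    return stat


# Toy query log (invented for this sketch).
queries = ["lung pneumonia ct", "femur fracture", "pneumonia treatment"]
observed = category_counts(queries)

# Expected proportions come from the category mix of the vocabulary itself.
vocab_counts = Counter(TOY_VOCAB.values())
expected_props = {c: n / len(TOY_VOCAB) for c, n in vocab_counts.items()}

stat = chi_square_statistic(observed, expected_props)
print(dict(observed), stat)
```

In this sketch, tokens with no vocabulary match (such as "treatment") are simply skipped, which mirrors the observation that only part of the search terms could be mapped. A real implementation would also need multi-word term matching and a p-value computed from the statistic and its degrees of freedom.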