Predicting Consumer Familiarity with Health Topics by Query Formulation and Search Result Interaction

  • Ira Puspitasari
  • Ken-ichi Fukui
  • Koichi Moriyama
  • Masayuki Numao
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8862)


Searching for understandable health information on the Internet remains difficult for most consumers. Consumers differ in their familiarity with health topics, and this diversity may cause misunderstanding because the information presented during a health information search may not match the consumer’s level of understanding. This study aimed to develop health topic familiarity prediction models based on the consumer’s searching behavior, specifically how consumers formulate queries and how they interact with the search results. The experimental results show that Naïve Bayes and Sequential Minimal Optimization classifiers achieved high accuracy in predicting consumers’ health topic familiarity using the combination of query formulation and search result interaction feature sets. This finding suggests that identifying health topic familiarity from query formulation and search result interaction is feasible and effective.
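As a rough illustration of the kind of classifier the abstract mentions, the sketch below fits a Gaussian Naïve Bayes model to a handful of search sessions and predicts a familiarity label for a new one. The three feature names, the two class labels, the choice of the Gaussian variant, and all numeric values are assumptions made for this example; they are not the study's feature sets, data, or implementation, and the direction of the pattern in the toy values is arbitrary. The Sequential Minimal Optimization classifier is not shown.

```python
import math
from collections import defaultdict

# Hypothetical feature order (not taken from the paper):
# [query length in words, number of query reformulations,
#  number of results clicked]
FEATURES = ["query_length", "reformulations", "results_clicked"]


def fit_gaussian_nb(rows, labels):
    """Estimate per-class priors, feature means, and feature variances."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)

    model = {}
    for label, members in grouped.items():
        n = len(members)
        means = [sum(col) / n for col in zip(*members)]
        # A small constant is added to avoid zero variance on tiny samples.
        variances = [
            sum((x - m) ** 2 for x in col) / n + 1e-6
            for col, m in zip(zip(*members), means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one session."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            # Log of the Gaussian density for this feature value.
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy values invented for illustration only; they are not study data.
train_rows = [[2, 4, 6], [3, 5, 7], [2, 3, 5], [6, 1, 2], [7, 1, 1], [5, 2, 2]]
train_labels = ["unfamiliar"] * 3 + ["familiar"] * 3

model = fit_gaussian_nb(train_rows, train_labels)
prediction = predict(model, [6, 1, 2])
```

Combining query formulation and search result interaction features, as the abstract describes, corresponds here to placing both kinds of measurement in the same feature vector for each session.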


Keywords: health topic familiarity, familiarity prediction, query formulation feature, search result interaction feature





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Ira Puspitasari
    • 1
  • Ken-ichi Fukui
    • 1
  • Koichi Moriyama
    • 1
  • Masayuki Numao
    • 1
  1. Institute of Scientific and Industrial Research, Osaka University, Ibaraki, Japan
