Comparative Study of Using Word Co-occurrence to Extract Disease Symptoms from Web Documents

  • Conference paper
Knowledge and Systems Sciences (KSS 2017)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 780)

Abstract

This research is a comparative study of two word co-occurrence sizes, two-word co-occurrence and N-word co-occurrence, applied to verb phrases to extract disease symptom explanations from downloaded hospital documents. The results are used to construct semantic relations between disease-topic names and symptom explanations, with the aim of enhancing an automatic problem-solving system. To determine the boundary of the simple sentences that explain the symptoms, a machine learning technique, Support Vector Machine, is proposed for the two-word co-occurrence, and similarity score determination is proposed for the N-word co-occurrence. On these documents, symptom extraction with the N-word co-occurrence achieves higher precision than extraction with the two-word co-occurrence.
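As a rough illustration of the difference between the two co-occurrence sizes, the following minimal Python sketch counts contiguous two-word and N-word co-occurrences over tokenized verb phrases. Everything in it is an assumption made for illustration: the abstract does not specify how co-occurrences are windowed, which features feed the Support Vector Machine, or which similarity score is used. The function names, the English stand-in phrases, and the Jaccard overlap used as a similarity score are hypothetical and are not the authors' method.

```python
from collections import Counter


def cooccurrence_counts(phrases, n):
    """Count contiguous n-word co-occurrences across tokenized phrases.

    Assumption: a co-occurrence is a window of n adjacent words.
    """
    counts = Counter()
    for tokens in phrases:
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def jaccard(a, b):
    """Jaccard overlap of two token lists (a stand-in similarity score)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical, already-tokenized verb phrases (English stand-ins).
verb_phrases = [
    ["have", "high", "fever"],
    ["have", "severe", "headache"],
    ["have", "high", "fever"],
]

two_word = cooccurrence_counts(verb_phrases, 2)    # two-word co-occurrence
three_word = cooccurrence_counts(verb_phrases, 3)  # N-word co-occurrence, N = 3

print(two_word.most_common(2))
print(three_word.most_common(1))
print(jaccard(verb_phrases[0], verb_phrases[1]))
```

In this toy setting, a larger window keeps a whole symptom expression such as "have high fever" together as one unit, whereas the two-word window splits it into overlapping pairs.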

Acknowledgements

This work has been supported by the Thailand Research Fund, MU-SSIRB:2014/092(B1). The medical-care knowledge and the pharmacology knowledge applied in this research were provided by Puangthong Kraipiboon, a clinician in the Division of Medical Oncology, Department of Medicine, Ramathibodi Hospital, and Uraiwan Janviriyasopak, a pharmacist at RexPharmcy, respectively.

Author information

Corresponding author

Correspondence to Chaveevan Pechsiri.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Pechsiri, C., Sukharomana, R. (2017). Comparative Study of Using Word Co-occurrence to Extract Disease Symptoms from Web Documents. In: Chen, J., Theeramunkong, T., Supnithi, T., Tang, X. (eds) Knowledge and Systems Sciences. KSS 2017. Communications in Computer and Information Science, vol 780. Springer, Singapore. https://doi.org/10.1007/978-981-10-6989-5_8

  • DOI: https://doi.org/10.1007/978-981-10-6989-5_8

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-6988-8

  • Online ISBN: 978-981-10-6989-5

  • eBook Packages: Computer Science, Computer Science (R0)
