Abstract
Healthcare information support (HIS) is essential for managing, gathering, and disseminating information for healthcare decision support through the Internet. Text classification (TC) is a core component of HIS. Upon receiving a text expressing a healthcare need (e.g. a symptom description from a patient) or carrying healthcare information (e.g. medical literature or news), a text classifier may determine its corresponding categories (e.g. diseases), so that subsequent HIS tasks (e.g. online healthcare consultancy and information recommendation) can be conducted. The key challenge lies in high-quality TC, which aims to classify most texts into suitable categories (i.e. recall is very high) while avoiding misclassification of most texts (i.e. precision is very high). High-quality TC is particularly essential in healthcare, a domain where an error may incur high costs or cause serious problems. Unfortunately, high-quality TC was seldom achieved in previous studies. In this paper, we present a case study in which a high-quality classifier is built to support HIS for Chinese disease-related information, covering the cause, symptoms, treatment, side effects, and prevention of cancer. The results show that, without relying on domain knowledge or complicated processing, cancer information may be classified into suitable categories with a controlled number of confirmations.
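To make the two quality measures and the idea of confirmations concrete, here is a minimal Python sketch. It is not the paper's method, which the abstract does not describe. The function names, the score dictionary, and the `accept` and `reject` thresholds are all hypothetical, introduced only for illustration. The first function computes micro-averaged precision and recall over category assignments; the second assumes a classifier that outputs a score per category and routes borderline scores to a human for confirmation.

```python
from typing import Dict, List, Set, Tuple


def precision_recall(
    predicted: List[Set[str]], actual: List[Set[str]]
) -> Tuple[float, float]:
    """Micro-averaged precision and recall over (text, category) decisions."""
    # A true positive is a category that was both predicted and correct.
    true_pos = sum(len(p & a) for p, a in zip(predicted, actual))
    n_predicted = sum(len(p) for p in predicted)
    n_actual = sum(len(a) for a in actual)
    precision = true_pos / n_predicted if n_predicted else 0.0
    recall = true_pos / n_actual if n_actual else 0.0
    return precision, recall


def decide(
    scores: Dict[str, float], accept: float, reject: float
) -> Tuple[Set[str], Set[str]]:
    """Split categories into auto-accepted and needs-confirmation sets.

    Hypothetical rule: scores at or above `accept` are accepted directly,
    scores in [reject, accept) are sent for confirmation, and lower
    scores are rejected without confirmation.
    """
    accepted = {c for c, s in scores.items() if s >= accept}
    to_confirm = {c for c, s in scores.items() if reject <= s < accept}
    return accepted, to_confirm


# Toy data, invented for illustration only.
predicted = [{"symptom"}, {"cause", "prevention"}, set()]
actual = [{"symptom"}, {"cause"}, {"side-effect"}]
precision, recall = precision_recall(predicted, actual)

accepted, to_confirm = decide(
    {"cause": 0.9, "symptom": 0.5, "prevention": 0.1}, accept=0.8, reject=0.3
)
```

In this sketch, widening the band between `reject` and `accept` sends more decisions to a human, which tends to raise precision and recall on the remaining automatic decisions at the cost of more confirmations.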
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Liu, RL. (2007). Text Classification for Healthcare Information Support. In: Okuno, H.G., Ali, M. (eds) New Trends in Applied Artificial Intelligence. IEA/AIE 2007. Lecture Notes in Computer Science, vol 4570. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73325-6_5
DOI: https://doi.org/10.1007/978-3-540-73325-6_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73322-5
Online ISBN: 978-3-540-73325-6
eBook Packages: Computer Science, Computer Science (R0)