Integrated Mining for Cancer Incidence Factors from Healthcare Data

Zhang, Xiaolong; Narita, Tetsuo

doi:10.1007/11423270_19

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3430)

Abstract

This paper describes how data mining is used to identify the primary factors of cancer incidence and the living habits of cancer patients from a set of health and living habit questionnaires. Three methods are employed in this case study: decision trees, radial basis functions, and back propagation neural networks. Decision tree classification uncovers the primary factors of cancer patients from the rules it produces. The radial basis function method is well suited to comparing the living habits of a group of cancer patients with those of a group of healthy people. The back propagation neural network helps elicit the important factors of cancer incidence. This case study provides a useful data mining template for characteristic identification in healthcare and other areas.
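
As a rough illustration of how a decision tree can surface primary factors, the sketch below ranks questionnaire attributes by information gain, one of the impurity-based criteria that decision tree learners use to choose splits. It is not the authors' implementation and does not reproduce their results: the attribute names (`habit_a`, `habit_b`, `group`), the eight records, and the helper functions are all hypothetical and invented for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` after splitting `rows` on `attribute`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def rank_factors(rows, target):
    """Return (attribute, gain) pairs sorted from most to least informative."""
    attributes = [key for key in rows[0] if key != target]
    gains = {a: information_gain(rows, a, target) for a in attributes}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Synthetic toy records (assumption: NOT data from the paper).
ROWS = [
    {"habit_a": "yes", "habit_b": "low",  "group": "patient"},
    {"habit_a": "yes", "habit_b": "high", "group": "patient"},
    {"habit_a": "yes", "habit_b": "low",  "group": "patient"},
    {"habit_a": "no",  "habit_b": "high", "group": "patient"},
    {"habit_a": "no",  "habit_b": "low",  "group": "healthy"},
    {"habit_a": "no",  "habit_b": "high", "group": "healthy"},
    {"habit_a": "no",  "habit_b": "low",  "group": "healthy"},
    {"habit_a": "yes", "habit_b": "high", "group": "healthy"},
]

if __name__ == "__main__":
    for name, gain in rank_factors(ROWS, "group"):
        print(f"{name}: information gain = {gain:.3f}")
```

In this toy data `habit_a` is unevenly distributed between the two groups while `habit_b` is evenly split, so a tree learner using this criterion would split on `habit_a` first. A full decision tree repeats the same ranking recursively within each branch.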

Author information

Authors and Affiliations

School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, 430081, China
Xiaolong Zhang
ISV Solutions, IBM-Japan Application Solution Co., Ltd., 1-14 Nissin-cho, Kawasaki-ku, Kanagawa, 210-8550, Japan
Tetsuo Narita

Editor information

Editors and Affiliations

Shimane University, 89-1 Enya-cho Izumo, 6938501, Shimane, Japan
Shusaku Tsumoto
Faculty of Science and Technology, Keio University, 3-14-1 Hiyoshi Kohoku-ku, 223-8522, Yokohama, Japan
Takahira Yamaguchi
The Institute of Scientific and Industrial Research, Osaka University, Japan
Masayuki Numao
Institute of Scientific and Industrial Research, Osaka University, 8-1 Mihogaoka, Ibaraki, 567-0047, Osaka, Japan
Hiroshi Motoda

About this paper

Cite this paper

Zhang, X., Narita, T. (2005). Integrated Mining for Cancer Incidence Factors from Healthcare Data. In: Tsumoto, S., Yamaguchi, T., Numao, M., Motoda, H. (eds) Active Mining. Lecture Notes in Computer Science, vol 3430. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11423270_19

DOI: https://doi.org/10.1007/11423270_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26157-5
Online ISBN: 978-3-540-31933-7
eBook Packages: Computer Science, Computer Science (R0)
