The World Health Organization predicted that depression would be the world's leading cause of disability by 2020, a forecast that calls for urgent intervention. Most mental illnesses are caused by a combination of genetic and environmental factors, and many different types of mental illness exist. Identifying the precise combination of genetic and environmental causes for each type is therefore crucial to preventing and effectively treating mental illness. Sophisticated data analysis tools, such as data mining, can contribute greatly to identifying these patterns of genetic and environmental factors and can inform prevention and intervention strategies. One factor that complicates data mining in this area is that much of the information is not in a strictly structured form. In this paper, we demonstrate the application of tree mining algorithms to semi-structured mental health information. The extracted patterns can provide useful information to help prevent mental illness and to assist in the delivery of effective and efficient mental health services.
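To make the idea of mining patterns from semi-structured records concrete, the following is a minimal sketch, not the algorithm used in the chapter. It deliberately simplifies frequent subtree mining to counting frequent root-to-node label paths, using transaction-based support (the number of records in which a path occurs). The record layout, the labels, and the function names `label_paths` and `frequent_paths` are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def label_paths(node, prefix=()):
    """Yield every root-to-node label path in a nested-dict tree.

    Leaf values are treated as one extra label at the end of the path.
    """
    for label, child in node.items():
        path = prefix + (label,)
        yield path
        if isinstance(child, dict):
            yield from label_paths(child, path)
        else:
            yield path + (str(child),)


def frequent_paths(records, min_support):
    """Return paths that occur in at least `min_support` records."""
    counts = Counter()
    for record in records:
        # A set ensures each path is counted once per record.
        counts.update(set(label_paths(record)))
    return {path: n for path, n in counts.items() if n >= min_support}


# Hypothetical, made-up records; not data from the chapter.
records = [
    {"patient": {"illness": "depression",
                 "factor": {"environmental": "stress"}}},
    {"patient": {"illness": "depression",
                 "factor": {"genetic": "family_history"}}},
    {"patient": {"illness": "depression",
                 "factor": {"environmental": "stress"}}},
]

for path, support in sorted(frequent_paths(records, min_support=2).items()):
    print("/".join(path), support)
```

Paths are a special case of subtrees, so this sketch misses patterns that branch, such as a record containing both a genetic and an environmental factor under the same node. General induced or embedded subtree miners handle those cases with more elaborate candidate generation.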
Copyright information
© 2009 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Hadzic, M., Hadzic, F., Dillon, T.S. (2009). Domain Driven Tree Mining of Semi-structured Mental Health Information. In: Cao, L., Yu, P.S., Zhang, C., Zhang, H. (eds) Data Mining for Business Applications. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-79420-4_9
DOI: https://doi.org/10.1007/978-0-387-79420-4_9
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-79419-8
Online ISBN: 978-0-387-79420-4
eBook Packages: Computer Science (R0)