Abstract
This paper presents an application of data mining in healthcare and discusses how the generated patterns can be used by physicians for the early detection, and hence prevention, of oral cancer. Apriori, a popular association rule mining algorithm, is used to extract a set of significant rules from data on the clinical examination, history, and survivability of cancer patients. These rules suggest various investigations and also help predict the distribution of cancer in the oral cavity. Although clinical judgment relies on examination of the oral cavity and tongue using various diagnostic tools, the majority of cases present to healthcare setups at later stages of tumor subtypes, and the resulting delay in diagnosis lessens the chances of survival. The data mining rules would nevertheless assist practitioners in the early detection of oral cancer and in predicting the distribution of cancer in the oral cavity, which can help in preventing the disease. The experimental results demonstrate that all the generated rules hold the highest confidence level, making them useful for the early detection and prevention of oral cancer.
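To make the notions of support and confidence used above concrete, the following is a minimal Python sketch of Apriori-style rule mining. It is an illustration under stated assumptions, not the authors' implementation: the toy records, the attribute names (`tobacco`, `ulcer`, `biopsy_advised`), the function names, and the thresholds are all hypothetical and do not come from the paper's dataset.

```python
from itertools import combinations

# Hypothetical toy records; the attribute names are illustrative only
# and are not taken from the paper's clinical dataset.
records = [
    {"tobacco", "ulcer", "biopsy_advised"},
    {"tobacco", "ulcer", "biopsy_advised"},
    {"tobacco", "biopsy_advised"},
    {"ulcer", "biopsy_advised"},
    {"tobacco"},
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    return sum(itemset <= r for r in records) / len(records)


def apriori(records, min_support):
    """Return {frequent itemset: support}, built level by level."""
    items = sorted({i for r in records for i in r})
    level = [frozenset([i]) for i in items
             if support(frozenset([i]), records) >= min_support]
    frequent = {}
    k = 1
    while level:
        for s in level:
            frequent[s] = support(s, records)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent
                             for s in combinations(c, k - 1))}
        level = [c for c in candidates
                 if support(c, records) >= min_support]
    return frequent


def rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    out = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
                conf = sup / frequent[lhs]
                if conf >= min_confidence:
                    out.append((lhs, itemset - lhs, sup, conf))
    return out


if __name__ == "__main__":
    for lhs, rhs, sup, conf in rules(apriori(records, 0.6), 1.0):
        print(sorted(lhs), "->", sorted(rhs), sup, conf)
```

With these hypothetical thresholds (minimum support 0.6, minimum confidence 1.0), the only rule that survives on the toy data is `ulcer -> biopsy_advised`, since every record containing `ulcer` also contains `biopsy_advised`. A rule "holding the highest confidence level" in the sense of the abstract corresponds to a confidence of 1.0 in this sketch.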
Acknowledgments
The authors would like to thank the management and staff of the Indian School of Mines for their constant support and motivation.
Cite this article
Sharma, N., Om, H. Significant patterns for oral cancer detection: association rule on clinical examination and history data. Netw Model Anal Health Inform Bioinforma 3, 50 (2014). https://doi.org/10.1007/s13721-014-0050-5
DOI: https://doi.org/10.1007/s13721-014-0050-5