Abstract
Breast cancer is one of the most commonly occurring cancers among women, affecting about 2 million people. The 5-year survival rate is 98% if the cancer is detected at an early stage. The breast cancer data used in this paper is the Wisconsin dataset, taken from Kaggle. The task is a binary classification problem with two classes (0 representing a non-malignant tumor, 1 representing malignancy). A Min-Max scaler is used in preprocessing to rescale the data to a fixed range (known as scaling). The algorithms used for classification are Support Vector Classifier, Random Forest, Naïve Bayes, Decision Tree, and K-Nearest Neighbours. Support Vector Classifier and Random Forest gave the highest accuracy. The evaluation metrics are the Area Under the Receiver Operating Characteristic curve (AUC-ROC), the confusion matrix, the recall score, and accuracy. To avoid overfitting, k-fold cross-validation is used with k = 3.
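The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' code: the function names (min_max_scale, confusion_counts, accuracy, recall, k_fold_indices) are hypothetical, the inputs are toy values rather than the Wisconsin data, and the classifiers themselves are omitted. It only assumes the label convention stated above (1 = malignant, treated as the positive class) and k = 3 folds.

```python
from typing import List, Sequence, Tuple


def min_max_scale(column: Sequence[float]) -> List[float]:
    """Rescale one feature column to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant column: map every value to 0.0 to avoid dividing by zero.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp), with 1 (malignant) as the positive class."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1 and p == 0:
            fn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of all samples that were classified correctly."""
    tn, fp, fn, tp = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tn + fp + fn + tp)


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of truly malignant samples that were predicted malignant."""
    tn, fp, fn, tp = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def k_fold_indices(n_samples: int, k: int = 3) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds; each fold is the test set once."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    splits, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        splits.append((train, test))
        start += size
    return splits
```

In a k-fold setup the scaling minimum and maximum would normally be computed on each training fold only and then applied to the held-out fold, so that the test data does not influence preprocessing. Library implementations such as scikit-learn's MinMaxScaler and KFold cover the same steps.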
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Guru Sai Sarma Chilukuri, N.V.S., Bano, S., Tholeti, G.S.R., Kamma, S.P., Niharika, G.L. (2022). An Analytical Prediction of Breast Cancer Using Machine Learning. In: Kumar, A., Senatore, S., Gunjan, V.K. (eds) ICDSMLA 2020. Lecture Notes in Electrical Engineering, vol 783. Springer, Singapore. https://doi.org/10.1007/978-981-16-3690-5_17
DOI: https://doi.org/10.1007/978-981-16-3690-5_17
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-3689-9
Online ISBN: 978-981-16-3690-5
eBook Packages: Intelligent Technologies and Robotics; Intelligent Technologies and Robotics (R0)