Abstract
Regression is the branch of machine learning that focuses purely on predicting continuous values. Supervised learning offers many regression methods, built on both parametric and nonparametric models. In this paper, we examine a subtle point about one distance-based regression model, the k-nearest neighbors regressor, which is a supervised nonparametric method. We aim to show how the model's k parameter, and fluctuations in its value, affect the evaluation metrics. The metrics we use are root mean squared error and R-squared goodness of fit, and we present their values visually with respect to k.
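As a rough illustration of the kind of experiment the abstract describes, the sketch below sweeps k for a small k-nearest neighbors regressor and reports root mean squared error and R-squared at each value. It is not the authors' setup: the synthetic noisy sine data, the uniform-weight Euclidean averaging, and all function names (`knn_predict`, `rmse`, `r_squared`, `sweep_k`, `make_data`) are assumptions made for this example.

```python
import math
import random


def knn_predict(train_x, train_y, query, k):
    """Predict by averaging the targets of the k training points nearest to query."""
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def r_squared(y_true, y_pred):
    """R-squared: 1 minus residual sum of squares over total sum of squares."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def sweep_k(train_x, train_y, test_x, test_y, k_values):
    """Return {k: (rmse, r_squared)} evaluated on the test set for each k."""
    results = {}
    for k in k_values:
        preds = [knn_predict(train_x, train_y, q, k) for q in test_x]
        results[k] = (rmse(test_y, preds), r_squared(test_y, preds))
    return results


def make_data(n, seed):
    """Hypothetical data: one feature x in [0, 6], target sin(x) plus Gaussian noise."""
    rng = random.Random(seed)
    xs = [(rng.uniform(0.0, 6.0),) for _ in range(n)]
    ys = [math.sin(x[0]) + rng.gauss(0.0, 0.2) for x in xs]
    return xs, ys


train_x, train_y = make_data(200, seed=0)
test_x, test_y = make_data(100, seed=1)

# Print both metrics for a range of k values to see how they move together.
for k, (err, r2) in sweep_k(train_x, train_y, test_x, test_y, range(1, 52, 5)).items():
    print(f"k={k:2d}  RMSE={err:.3f}  R2={r2:.3f}")
```

Because both metrics are computed from the same residuals on the same test set, they move in opposite directions as k changes: a lower RMSE corresponds to a higher R-squared. Plotting the two columns against k would give the kind of visual comparison the abstract mentions.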
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Gupta, A., Joshi, R., Kanvinde, N., Gerela, P., Laban, R.M. (2023). Metric Effects Based on Fluctuations in Values of k in Nearest Neighbor Regressor. In: Jacob, I.J., Kolandapalayam Shanmugam, S., Izonin, I. (eds) Data Intelligence and Cognitive Informatics. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-19-6004-8_12
DOI: https://doi.org/10.1007/978-981-19-6004-8_12
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-6003-1
Online ISBN: 978-981-19-6004-8
eBook Packages: Intelligent Technologies and Robotics (R0)