Improved Euclidean Distance in the K Nearest Neighbors Method

  • Conference paper
Innovations for Community Services (I4CS 2023)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1876)

Abstract

The k-nearest neighbors (KNN) algorithm is one of the best-known algorithms in data mining. It computes the distance between a query and every item in the reference set. In this paper, we present an approach to standardizing variables that makes no assumptions about the presence of outliers or the number of classes. For each variable, our method computes the ranks of the values within the dataset and uses these ranks to standardize the variable. We then calculate a dissimilarity index between the standardized data, called the Rank-Based Dissimilarity Index (RBDI), which we use instead of the Euclidean distance to find the K nearest neighbors. Finally, we combine the Euclidean distance and the RBDI to exploit the advantages of both dissimilarity indices: the Euclidean distance reflects the Euclidean geometry of the data space, whereas the RBDI is not constrained by distance or geometry in the data space. We evaluate our approach on multidimensional open datasets.
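The abstract does not give the RBDI formula or the rule used to combine it with the Euclidean distance, so the following Python sketch is only one plausible reading and not the authors' definition. It assumes that a value is standardized as the fraction of reference values it is greater than or equal to, that the rank-based dissimilarity is the mean absolute difference of these standardized ranks, and that the two indices are mixed by a weighted sum. The names `rank_scores` and `knn_indices` and the weight `alpha` are hypothetical.

```python
import numpy as np


def rank_scores(reference, points):
    """Rank-standardize `points` against `reference`, one variable at a time.

    Assumption: the score of a value is the fraction of reference values
    that are <= it, so every score lies in [0, 1].
    """
    reference = np.asarray(reference, dtype=float)
    points = np.atleast_2d(np.asarray(points, dtype=float))
    # counts[i, j] = number of reference rows whose variable j is <= points[i, j]
    counts = (reference[None, :, :] <= points[:, None, :]).sum(axis=1)
    return counts / reference.shape[0]


def knn_indices(reference, query, k, alpha=0.5):
    """Return the indices of the k reference rows closest to `query`.

    Assumption: the combined index is
    alpha * (scaled Euclidean distance) + (1 - alpha) * (rank dissimilarity).
    """
    reference = np.asarray(reference, dtype=float)
    query = np.asarray(query, dtype=float)

    # Rank-based part: mean absolute difference of standardized ranks.
    ref_ranks = rank_scores(reference, reference)
    query_ranks = rank_scores(reference, query)[0]
    rank_diss = np.abs(ref_ranks - query_ranks).mean(axis=1)

    # Euclidean part, divided by its maximum so both parts share a [0, 1] scale.
    eucl = np.linalg.norm(reference - query, axis=1)
    max_eucl = eucl.max()
    eucl_scaled = eucl / max_eucl if max_eucl > 0 else eucl

    combined = alpha * eucl_scaled + (1 - alpha) * rank_diss
    return np.argsort(combined, kind="stable")[:k]


# Toy reference set with one extreme value on the first variable.
reference = [[1, 10], [2, 20], [3, 30], [4, 40], [1000, 50]]
print(knn_indices(reference, [3.1, 31], k=2))
```

Because ranks depend only on the ordering of the values, the extreme value 1000 in the toy data gets the same standardized rank it would get if it were 5, which illustrates why a rank-based standardization needs no assumption about outliers. Setting `alpha=1` in this sketch gives a purely Euclidean ranking, while `alpha=0` uses the rank-based part alone.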

Notes

  1. https://www.univ-reims.fr/demetere.

Author information

Corresponding author

Correspondence to Chérifa Boucetta.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Boucetta, C., Hussenet, L., Herbin, M. (2023). Improved Euclidean Distance in the K Nearest Neighbors Method. In: Krieger, U.R., Eichler, G., Erfurth, C., Fahrnberger, G. (eds) Innovations for Community Services. I4CS 2023. Communications in Computer and Information Science, vol 1876. Springer, Cham. https://doi.org/10.1007/978-3-031-40852-6_17

  • DOI: https://doi.org/10.1007/978-3-031-40852-6_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-40851-9

  • Online ISBN: 978-3-031-40852-6

  • eBook Packages: Computer Science, Computer Science (R0)
