
Threats on Machine Learning Technique by Data Poisoning Attack: A Survey

  • Conference paper
  • Part of the conference proceedings: Advances in Cyber Security (ACeS 2021)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1487)

Abstract

As machine learning systems provide ever more services in our daily lives, attacks on these services increase every day. Attackers try to distort the functionality of these services and divert them from their real task by corrupting the model through poisoning. A poisoned system can grant an unauthorized person the right to enter and exit the system as a legitimate user at any time and from anywhere, degrading the credibility of systems built with intelligent technologies. This paper extensively introduces the mechanisms of data poisoning attacks, which target systems based on machine learning technology, explaining how these attacks strike data sources and the learning model during either the training or the testing phase. It also describes the defense methods and strategies presented in the literature, outlines the risks and effects caused by this attack, and discusses future directions that offer researchers in this field opportunities to prevent and repel this attack effectively.
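As a concrete illustration of the training-phase attack surface the survey describes, the sketch below shows a simple label-flipping poisoning attack against a standard classifier. This is a minimal, hypothetical example, not the paper's own method: the dataset, model, and poisoning rates are illustrative assumptions, chosen only to show how corrupting a fraction of training labels degrades the learned model's accuracy on clean test data.

```python
# Minimal sketch of a label-flipping data poisoning attack.
# Assumptions: a scikit-learn workflow, a synthetic binary dataset,
# and illustrative poisoning rates; none of these come from the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Clean binary classification data standing in for an ML service's training set.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def flip_labels(y, rate, rng):
    """Poison the training set by flipping the labels of a random fraction."""
    y_poisoned = y.copy()
    idx = rng.choice(len(y), size=int(rate * len(y)), replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]  # binary labels: 0 <-> 1
    return y_poisoned

# Train on increasingly poisoned labels and measure accuracy on clean test data.
for rate in (0.0, 0.1, 0.3):
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, flip_labels(y_train, rate, rng))
    print(f"poison rate {rate:.0%}: test accuracy {model.score(X_test, y_test):.3f}")
```

Running this typically shows test accuracy falling as the poisoning rate rises, which is the basic degradation effect that the training-phase attacks surveyed here exploit in more sophisticated, targeted forms.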



Acknowledgment

The authors would like to thank the University of Mosul, College of Computer Sciences and Mathematics, for the facilities provided.

Author information


Correspondence to Ibrahim M. Ahmed or Manar Younis Kashmoola.


Copyright information

© 2021 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Ahmed, I.M., Kashmoola, M.Y. (2021). Threats on Machine Learning Technique by Data Poisoning Attack: A Survey. In: Abdullah, N., Manickam, S., Anbar, M. (eds) Advances in Cyber Security. ACeS 2021. Communications in Computer and Information Science, vol 1487. Springer, Singapore. https://doi.org/10.1007/978-981-16-8059-5_36


  • DOI: https://doi.org/10.1007/978-981-16-8059-5_36


  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-8058-8

  • Online ISBN: 978-981-16-8059-5

  • eBook Packages: Computer Science (R0)
