Application of the Bag-of-Words Algorithm in Classification the Quality of Sales Leads

  • Marcin Gabryel
  • Robertas Damaševičius
  • Krzysztof Przybyszewski
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10841)


The article presents a sales lead classification method that uses an adapted version of the Bag-of-Words algorithm. Data collected on the website of a financial institution, and evaluated by that institution, undergo a classification process. A customer who submits data through a web form is expected to be a person interested in a particular financial product. Often, however, the submitter is not a human user but a bot, a computer program that simulates human behavior, and bots deliver lower-quality sales leads. A bot handles a web form differently from the way a human user completes it. It is therefore possible to analyze behavior on the website and to link it with the evaluation of the submitted data. The Bag-of-Words algorithm has been adapted to this task. Experimental research on real-life data obtained from a bank shows how effective the algorithm is in classifying sales lead quality.
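As a rough illustration of the general idea only, the sketch below treats each form session as a sequence of interaction events, converts it into a normalized bag-of-words histogram, and assigns the label of the nearest class centroid. The event names, the toy sessions, the L1 distance, and the nearest-centroid rule are all assumptions made for this example; the abstract does not specify which features or classifier the adapted algorithm uses.

```python
from collections import Counter


def build_vocabulary(sessions):
    """Collect every distinct event token seen in the training sessions."""
    return sorted({event for session in sessions for event in session})


def to_histogram(session, vocabulary):
    """Turn one session into a normalized bag-of-words histogram."""
    counts = Counter(session)
    # Tokens outside the vocabulary are ignored; avoid division by zero.
    total = sum(counts[word] for word in vocabulary) or 1
    return [counts[word] / total for word in vocabulary]


def class_centroids(histograms, labels):
    """Average the histograms of each class into one centroid per label."""
    sums, sizes = {}, Counter(labels)
    for hist, label in zip(histograms, labels):
        acc = sums.setdefault(label, [0.0] * len(hist))
        for i, value in enumerate(hist):
            acc[i] += value
    return {label: [v / sizes[label] for v in acc] for label, acc in sums.items()}


def classify(histogram, centroids):
    """Return the label whose centroid is closest in L1 distance."""
    def l1(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))
    return min(centroids, key=lambda label: l1(histogram, centroids[label]))


# Hypothetical toy data: event names and labels are invented for illustration.
train_sessions = [
    ["focus", "keydown", "keydown", "mousemove", "keydown", "submit"],
    ["mousemove", "focus", "keydown", "mousemove", "keydown", "submit"],
    ["focus", "submit"],
    ["submit"],
]
train_labels = ["human", "human", "bot", "bot"]

vocabulary = build_vocabulary(train_sessions)
train_hists = [to_histogram(s, vocabulary) for s in train_sessions]
centroids = class_centroids(train_hists, train_labels)

new_session = ["focus", "keydown", "mousemove", "keydown", "keydown", "submit"]
predicted = classify(to_histogram(new_session, vocabulary), centroids)
print(predicted)
```

In this toy setup a session with typing and mouse movement lands near the "human" centroid, while a session that only submits the form lands near the "bot" centroid. A real system would need a far richer event vocabulary and a classifier validated on labeled lead data.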


Bot detection, Online Ad-fraud, Security



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Marcin Gabryel (1)
  • Robertas Damaševičius (2)
  • Krzysztof Przybyszewski (3, 4)

  1. Institute of Computational Intelligence, Czestochowa University of Technology, Częstochowa, Poland
  2. Software Engineering Department, Kaunas University of Technology, Kaunas, Lithuania
  3. Information Technology Institute, University of Social Sciences, Łódź, Poland
  4. Clark University, Worcester, USA
