Fitting Aggregation Functions to Data: Part II - Idempotization
The use of supervised learning techniques for fitting weights and/or generator functions of weighted quasi-arithmetic means (a special class of idempotent and nondecreasing aggregation functions) to empirical data has already been considered in a number of papers. Nevertheless, some important issues have not yet been discussed in the literature. In the second part of this two-part contribution we deal with a fairly common situation: the inputs come from different sources and describe a similar phenomenon, but they have not been properly normalized. In such a case, idempotent and nondecreasing functions cannot be used to aggregate them unless proper pre-processing is performed. The proposed idempotization method, based on the notion of B-splines, allows for an automatic calibration of independent variables. The introduced technique is applied in an R source code plagiarism detection system.
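To make the normalization issue concrete, the following minimal Python sketch may help. It is an illustration under stated assumptions, not the method proposed in the paper: the function names, the knot values, and the toy data are hypothetical, and a hand-picked nondecreasing piecewise-linear map (a degree-1 spline) stands in for the fitted B-spline calibration. A weighted quasi-arithmetic mean is taken here in its usual form, phi^{-1}(sum_i w_i phi(x_i)), with nonnegative weights summing to 1.

```python
import math
from bisect import bisect_right


def weighted_quasi_arithmetic_mean(x, w, phi, phi_inv):
    """Return phi_inv(sum_i w[i] * phi(x[i])) for weights summing to 1."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    if any(wi < 0 for wi in w) or not math.isclose(sum(w), 1.0):
        raise ValueError("weights must be nonnegative and sum to 1")
    return phi_inv(sum(wi * phi(xi) for wi, xi in zip(w, x)))


def piecewise_linear_calibration(knots_x, knots_y):
    """Build a nondecreasing piecewise-linear map through the given knots.

    Hypothetical stand-in for a fitted spline calibration; the knots are
    chosen by hand here rather than estimated from data.
    """
    if len(knots_x) != len(knots_y) or len(knots_x) < 2:
        raise ValueError("need at least two knots of matching length")
    if any(b <= a for a, b in zip(knots_x, knots_x[1:])):
        raise ValueError("knots_x must be strictly increasing")
    if any(b < a for a, b in zip(knots_y, knots_y[1:])):
        raise ValueError("knots_y must be nondecreasing")

    def g(t):
        # Clamp outside the knot range, interpolate linearly inside it.
        if t <= knots_x[0]:
            return knots_y[0]
        if t >= knots_x[-1]:
            return knots_y[-1]
        j = bisect_right(knots_x, t) - 1
        x0, x1 = knots_x[j], knots_x[j + 1]
        y0, y1 = knots_y[j], knots_y[j + 1]
        return y0 + (y1 - y0) * (t - x0) / (x1 - x0)

    return g


def identity(t):
    return t


# Toy data (assumed): both sources describe the same similarity level 0.5,
# but the second source reports it on a distorted scale as 0.25.
inputs = [0.5, 0.25]
weights = [0.5, 0.5]

# Aggregating the raw inputs: the result cannot exceed max(inputs).
raw = weighted_quasi_arithmetic_mean(inputs, weights, identity, identity)

# Calibrate the second input first, then aggregate.
g = piecewise_linear_calibration([0.0, 0.25, 1.0], [0.0, 0.5, 1.0])
calibrated = weighted_quasi_arithmetic_mean(
    [inputs[0], g(inputs[1])], weights, identity, identity
)

print(raw, calibrated)
```

With the raw inputs the arithmetic mean gives 0.375, and no idempotent nondecreasing function can return more than the largest input, so a systematically distorted source biases the result. After the second input is mapped through the calibration function, both inputs equal 0.5 and the aggregate is 0.5. In the setting described above, such calibration functions would be fitted to data rather than fixed in advance.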
Keywords: Aggregation functions, Weighted quasi-arithmetic means, Least squares fitting, Idempotence
This study was supported by the National Science Center, Poland, research project 2014/13/D/HS4/01700.
- 1. Bartoszuk, M., Beliakov, G., Gagolewski, M., James, S.: Fitting aggregation functions to data: part I - linearization and regularization. In: Carvalho, J.P., Lesot, M.-J., Kaymak, U., Vieira, S., Bouchon-Meunier, B., Yager, R.R. (eds.) IPMU 2016, Part II. CCIS, vol. 611, pp. 767–779. Springer, Heidelberg (2016)
- 2. Bartoszuk, M., Gagolewski, M.: A fuzzy R code similarity detection algorithm. In: Laurent, A., Strauss, O., Bouchon-Meunier, B., Yager, R.R. (eds.) IPMU 2014, Part III. CCIS, vol. 444, pp. 21–30. Springer, Heidelberg (2014)
- 3. Bartoszuk, M., Gagolewski, M.: Detecting similarity of R functions via a fusion of multiple heuristic methods. In: Alonso, J., Bustince, H., Reformat, M. (eds.) Proceedings of IFSA/Eusflat 2015, pp. 419–426. Atlantis Press (2015)
- 8. Beliakov, G., Bustince, H., Calvo, T.: A Practical Guide to Averaging Functions. STUDFUZZ, vol. 329. Springer, Heidelberg (2016)
- 11. Gagolewski, M.: Data Fusion: Theory, Methods, and Applications. Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland (2015)
- 15. R Development Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria (2016). http://www.R-project.org