Fitting Aggregation Functions to Data: Part II - Idempotization

  • Maciej Bartoszuk
  • Gleb Beliakov
  • Marek Gagolewski
  • Simon James
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 611)

Abstract

The use of supervised learning techniques for fitting weights and/or generator functions of weighted quasi-arithmetic means – a special class of idempotent and nondecreasing aggregation functions – to empirical data has already been considered in a number of papers. Nevertheless, there are still some important issues that have not been discussed in the literature yet. In the second part of this two-part contribution we deal with a quite common situation in which we have inputs coming from different sources, describing a similar phenomenon, but which have not been properly normalized. In such a case, idempotent and nondecreasing functions cannot be used to aggregate them unless proper pre-processing is performed. The proposed idempotization method, based on the notion of B-splines, allows for an automatic calibration of independent variables. The introduced technique is applied in an R source code plagiarism detection system.
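As a rough illustration of the setting described above, the following minimal Python sketch shows a weighted quasi-arithmetic mean, its idempotence, and why an uncalibrated input distorts the result. It is not the authors' method: the paper bases its calibration on B-splines, whereas this sketch substitutes a simple monotone piecewise-linear map. The function names (`weighted_qam`, `calibrate`), the knots, and the numbers are all hypothetical and chosen only for illustration.

```python
import math


def weighted_qam(x, w, phi, phi_inv):
    """Weighted quasi-arithmetic mean: phi_inv(sum_i w_i * phi(x_i)).

    Assumes the weights are nonnegative and sum to one, and that phi is
    continuous and strictly monotone on the input range.
    """
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    if any(wi < 0 for wi in w) or not math.isclose(sum(w), 1.0):
        raise ValueError("weights must be nonnegative and sum to 1")
    return phi_inv(sum(wi * phi(xi) for wi, xi in zip(w, x)))


def calibrate(value, knots_x, knots_y):
    """Monotone piecewise-linear calibration of one input variable.

    Hypothetical stand-in for a fitted monotone spline: knots_x must be
    strictly increasing and knots_y nondecreasing. Values outside the
    knot range are clamped to the end points.
    """
    if value <= knots_x[0]:
        return knots_y[0]
    if value >= knots_x[-1]:
        return knots_y[-1]
    for i in range(1, len(knots_x)):
        if value <= knots_x[i]:
            x0, x1 = knots_x[i - 1], knots_x[i]
            y0, y1 = knots_y[i - 1], knots_y[i]
            return y0 + (value - x0) * (y1 - y0) / (x1 - x0)


if __name__ == "__main__":
    weights = [0.5, 0.5]

    def identity(t):
        # Generator of the weighted arithmetic mean (it is its own inverse).
        return t

    # Idempotence: aggregating identical inputs returns that same value.
    print(weighted_qam([0.8, 0.8], weights, identity, identity))

    # Assumed scenario: source B reports on [0, 0.5] instead of [0, 1],
    # so its raw value 0.4 describes the same degree as 0.8 from source A.
    raw_a, raw_b = 0.8, 0.4
    print(weighted_qam([raw_a, raw_b], weights, identity, identity))

    # Calibrating source B first puts both inputs on a common scale.
    fixed_b = calibrate(raw_b, [0.0, 0.5], [0.0, 1.0])
    print(weighted_qam([raw_a, fixed_b], weights, identity, identity))
```

In this toy scenario the calibration map is known in advance; in the situation the abstract describes, it would have to be fitted from the data together with the aggregation function.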

Keywords

Aggregation functions; Weighted quasi-arithmetic means; Least squares fitting; Idempotence

Notes

Acknowledgments

This study was supported by the National Science Center, Poland, research project 2014/13/D/HS4/01700.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Maciej Bartoszuk (1)
  • Gleb Beliakov (2)
  • Marek Gagolewski (1, 3)
  • Simon James (2)

  1. Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
  2. School of Information Technology, Deakin University, Burwood, Australia
  3. Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
