An MCDM approach towards handling outliers in web data: a case study using OWA operators

Artificial Intelligence Review

Abstract

The World Wide Web has emerged as one of the primary modes of information sharing and searching. Its reach now extends to everyday aspects of life, from business to education. As more information moves online, finding correct, precise and appropriate information becomes more complex. Many online companies rely heavily on the analysis of web data to stay in business and to make strategic decisions. One of the problems in analyzing web data is the web user: a typical web user exhibits a highly uncertain pattern of web browsing, which is captured in the form of web server logs. Various data mining techniques, such as regression, are used to analyze this kind of data, but the inherently complex nature of web data introduces outlier values during mining. Minimizing these outliers has always been a challenging task for data scientists and researchers. This paper uses an aggregation-based approach built on various ordered weighted averaging (OWA) operators to reduce the outlier values in regression analysis. A regression problem is first formulated and then solved with the help of concepts from multi-criteria decision making (MCDM). The results show that this approach can reduce outliers significantly.
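
For readers unfamiliar with the operator named in the abstract, the following is a minimal sketch of the standard OWA operator as defined by Yager (1988): the arguments are sorted in descending order and the j-th weight is applied to the j-th largest value. It is not the authors' procedure. The function name `owa`, the sample predictions, and the weight vectors are hypothetical, chosen only to show how weights placed on the middle ranks damp an extreme value.

```python
from typing import Sequence


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average: weights attach to ranks, not to sources."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    # Reorder the arguments from largest to smallest before weighting.
    ordered = sorted(values, reverse=True)
    return sum(w * b for w, b in zip(weights, ordered))


# Hypothetical predictions for one observation; 50.0 plays the outlier.
predictions = [10.0, 12.0, 11.0, 50.0]

# Assumed weight vectors, for illustration only.
uniform = [0.25, 0.25, 0.25, 0.25]   # reduces to the arithmetic mean
middle_only = [0.0, 0.5, 0.5, 0.0]   # ignores the largest and smallest values

mean_value = owa(predictions, uniform)
damped_value = owa(predictions, middle_only)
print(mean_value, damped_value)
```

With uniform weights the operator reduces to the arithmetic mean, so the extreme value pulls the result upward. With all weight on the two middle ranks, the largest and smallest arguments contribute nothing. The weight vectors [1, 0, 0, 0] and [0, 0, 0, 1] recover the maximum and minimum respectively.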

Acknowledgments

This work has been partially funded under Grant F.No.42-134/2013(SR) in the Major Research Project Scheme of the University Grants Commission, India. We are thankful to the anonymous reviewers who provided critical comments to improve the contents of this manuscript.

Author information

Corresponding author

Correspondence to Ankit Gupta.

About this article

Cite this article

Gupta, A., Kohli, S. An MCDM approach towards handling outliers in web data: a case study using OWA operators. Artif Intell Rev 46, 59–82 (2016). https://doi.org/10.1007/s10462-015-9456-4

  • DOI: https://doi.org/10.1007/s10462-015-9456-4
