Classifying univariate uncertain data

Liu, Ying-Ho; Fan, Huei-Yu

doi:10.1007/s10489-020-01911-0

Published: 07 November 2020

Applied Intelligence, Volume 51, pages 2622–2650 (2021)

Abstract

In univariate uncertain data, as defined in the literature, each attribute in each transaction takes a quantitative interval, accompanied by a probability density function that indicates how likely each value in the interval is to occur. To the best of our knowledge, the classification of univariate uncertain data has seldom been addressed in the literature. Here, we propose the AssoU2Classifier algorithm to address this research gap. The AssoU2Classifier algorithm retrieves association rules from the univariate uncertain data to serve as a classification model. In addition, the U2Pruning procedure is developed to prune the association rules. The U2Pruning procedure not only reduces the number of association rules, which considerably accelerates the classification process, but also achieves high classification accuracy. In the experiments, the AssoU2Classifier algorithm was compared with 14 existing algorithms on 12 modified UCI datasets. The AssoU2Classifier algorithm obtained better classification accuracy than the compared algorithms on most of the datasets. Statistical tests (the Friedman test and pairwise Wilcoxon tests) also supported the advantage of the AssoU2Classifier algorithm. In addition, the learning time of the AssoU2Classifier algorithm was average.
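To make the data model in the abstract concrete, the following is a minimal Python sketch of a univariate uncertain value and of how the probability that it falls inside a rule's range might be computed. It is an illustration only, not the AssoU2Classifier algorithm or the U2Pruning procedure: the uniform density, the class name `UncertainValue`, the helper `expected_support`, the attribute name `temp`, and the sample intervals are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UncertainValue:
    """An interval [low, high]; a uniform density over it is ASSUMED here."""
    low: float
    high: float

    def prob_in(self, a: float, b: float) -> float:
        """Probability that the true value lies in [a, b] under the uniform assumption."""
        if self.high == self.low:
            # Degenerate interval: all probability mass sits on a single point.
            return 1.0 if a <= self.low <= b else 0.0
        overlap = min(self.high, b) - max(self.low, a)
        return max(0.0, overlap) / (self.high - self.low)


def expected_support(transactions, attribute, a, b):
    """Hypothetical helper: mean probability that `attribute` falls in [a, b]."""
    probs = [t[attribute].prob_in(a, b) for t in transactions]
    return sum(probs) / len(probs)


# Made-up sample transactions with one uncertain attribute each.
transactions = [
    {"temp": UncertainValue(20.0, 24.0)},
    {"temp": UncertainValue(22.0, 26.0)},
    {"temp": UncertainValue(30.0, 32.0)},
]

# How strongly the data supports the range condition 22 <= temp <= 25.
support = expected_support(transactions, "temp", 22.0, 25.0)
print(round(support, 4))
```

A rule-based classifier over such data would attach a class label to range conditions of this kind; how the paper scores and prunes those rules is not described in the abstract, so it is not sketched here.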

Acknowledgements

This research was supported in part by the Ministry of Science and Technology of the Republic of China under Grant No. MOST 103-2221-E-259-019-MY2.

Author information

Authors and Affiliations

Department of Information Management, National Dong Hwa University, No. 1, Sec. 2, Da Hsueh Road, Hualien, 97401, Taiwan, Republic of China
Ying-Ho Liu & Huei-Yu Fan

Corresponding author

Correspondence to Ying-Ho Liu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, YH., Fan, HY. Classifying univariate uncertain data. Appl Intell 51, 2622–2650 (2021). https://doi.org/10.1007/s10489-020-01911-0

Published: 07 November 2020
Issue Date: April 2021
DOI: https://doi.org/10.1007/s10489-020-01911-0
