Abstract
A logical framework for Machine Learning by Rule Generation (MLRG) from tables with non-deterministic information is proposed, and a prototype system is implemented in SQL. In MLRG, the certain rules defined in Rough Non-deterministic Information Analysis (RNIA) are obtained first, and each uncertain attribute value is then estimated so as to yield as many certain rules as possible, because the certain rules represent the most reliable information. This strategy is similar to maximum likelihood estimation in statistics. By repeating this process, a standard table and the rules in that table are learned (or estimated) from a given table with non-deterministic information. Although the actual unknown values are hard to determine, MLRG gives plausible estimates of them.
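The loop described in the abstract can be illustrated with a small Python sketch. It is not the authors' SQL prototype or the NIS-Apriori algorithm. It makes several simplifying assumptions: one condition attribute and one decision attribute, a rule counted as certain when it meets hypothetical support and accuracy thresholds (`min_supp`, `min_acc`) in every derived table, brute-force enumeration of derived tables, and a greedy choice of one uncertain cell per iteration. The table and all function names are invented for illustration.

```python
from itertools import product

# Hypothetical toy table: each row is [condition, decision]; a set holding
# more than one value marks a non-deterministic (uncertain) attribute value.
TABLE = [
    [{"a"}, {"x"}],
    [{"a"}, {"x"}],
    [{"a", "b"}, {"x"}],   # the condition value is either "a" or "b"
    [{"b"}, {"y"}],
    [{"b"}, {"y"}],
]

def derived_tables(table):
    """Yield every deterministic table obtained by fixing one value per cell."""
    width = len(table[0])
    cells = [sorted(cell) for row in table for cell in row]
    for choice in product(*cells):
        yield [choice[i:i + width] for i in range(0, len(choice), width)]

def rule_holds(dt, cond, dec, min_supp, min_acc):
    """Test the rule [condition=cond] => [decision=dec] in one derived table."""
    with_cond = [row for row in dt if row[0] == cond]
    with_both = [row for row in with_cond if row[1] == dec]
    if not with_cond:
        return False
    support = len(with_both) / len(dt)
    accuracy = len(with_both) / len(with_cond)
    return support >= min_supp and accuracy >= min_acc

def certain_rules(table, min_supp, min_acc):
    """Return the rules that satisfy both thresholds in every derived table."""
    conds = set().union(*(row[0] for row in table))
    decs = set().union(*(row[1] for row in table))
    tables = list(derived_tables(table))
    return {(c, d) for c in conds for d in decs
            if all(rule_holds(dt, c, d, min_supp, min_acc) for dt in tables)}

def estimate(table, min_supp, min_acc):
    """Greedily fix one uncertain cell at a time, keeping the value that
    leaves the largest number of certain rules (ties: first candidate)."""
    table = [[set(cell) for cell in row] for row in table]
    while True:
        uncertain = [(i, j) for i, row in enumerate(table)
                     for j, cell in enumerate(row) if len(cell) > 1]
        if not uncertain:
            return table, certain_rules(table, min_supp, min_acc)
        best = None
        for i, j in uncertain:
            for value in sorted(table[i][j]):
                trial = [[set(cell) for cell in row] for row in table]
                trial[i][j] = {value}
                score = len(certain_rules(trial, min_supp, min_acc))
                if best is None or score > best[0]:
                    best = (score, i, j, value)
        _, i, j, value = best
        table[i][j] = {value}

standard_table, rules = estimate(TABLE, min_supp=0.3, min_acc=0.8)
```

On this toy table the only certain rule before estimation is `a => x`. Fixing the uncertain value in the third row to `a` also makes `b => y` certain, so the sketch selects `a`. Enumerating derived tables grows exponentially with the number of uncertain cells, so this approach is suitable only for very small examples.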
Acknowledgment
The authors are grateful to the anonymous referees for their useful comments. This work is supported by JSPS (Japan Society for the Promotion of Science) KAKENHI Grant Number 26330277.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Sakai, H., Nakata, M., Watada, J. (2017). A Proposal of Machine Learning by Rule Generation from Tables with Non-deterministic Information and Its Prototype System. In: Polkowski, L., et al. Rough Sets. IJCRS 2017. Lecture Notes in Computer Science, vol 10313. Springer, Cham. https://doi.org/10.1007/978-3-319-60837-2_43
DOI: https://doi.org/10.1007/978-3-319-60837-2_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60836-5
Online ISBN: 978-3-319-60837-2
eBook Packages: Computer Science, Computer Science (R0)