Abstract
This paper presents a novel approach to predicting null values in relational databases, based on the notion of analogical proportion. In particular, we show how an algorithm initially proposed in a classification context can be adapted to this purpose. This work focuses on the case of a transactional database, where attributes are Boolean. The experimental results reported here, although preliminary, are encouraging: on average, the approach yields better precision than the classical nearest neighbors technique.
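The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the standard Boolean definition of an analogical proportion a : b :: c : d (a differs from b exactly as c differs from d) and a simple voting scheme over triples of complete rows. The function names `is_proportion`, `solve`, and `predict_missing` are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import permutations


def is_proportion(a, b, c, d):
    """True if the Boolean analogical proportion a : b :: c : d holds."""
    return (a and not b) == (c and not d) and (not a and b) == (not c and d)


def solve(a, b, c):
    """Return x such that a : b :: c : x holds, or None if no x exists."""
    if a == b:
        return c
    if a == c:
        return b
    return None  # a differs from both b and c: the equation has no solution


def predict_missing(rows, target, j):
    """Guess target[j] by majority vote (illustrative assumption).

    Every ordered triple (a, b, c) of complete rows that is in proportion
    with `target` on all attributes other than j casts one vote for the
    solution of a[j] : b[j] :: c[j] : x, when that solution exists.
    """
    others = [i for i in range(len(target)) if i != j]
    votes = Counter()
    for a, b, c in permutations(rows, 3):
        if all(is_proportion(a[i], b[i], c[i], target[i]) for i in others):
            x = solve(a[j], b[j], c[j])
            if x is not None:
                votes[x] += 1
    return votes.most_common(1)[0][0] if votes else None


# Toy transactional data: each row is a tuple of Boolean attributes.
rows = [
    (True, False, True),
    (True, True, True),
    (False, False, False),
]
target = (False, True, None)  # the third attribute is the null to estimate
prediction = predict_missing(rows, target, 2)
```

Enumerating all ordered triples is cubic in the number of rows, so this sketch only suits small examples. It also requires exact proportions on every known attribute, which is a simplifying assumption made here.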
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Correa Beltran, W., Jaudoin, H., Pivert, O. (2014). Estimating Null Values in Relational Databases Using Analogical Proportions. In: Laurent, A., Strauss, O., Bouchon-Meunier, B., Yager, R.R. (eds) Information Processing and Management of Uncertainty in Knowledge-Based Systems. IPMU 2014. Communications in Computer and Information Science, vol 444. Springer, Cham. https://doi.org/10.1007/978-3-319-08852-5_12
DOI: https://doi.org/10.1007/978-3-319-08852-5_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08851-8
Online ISBN: 978-3-319-08852-5
eBook Packages: Computer Science, Computer Science (R0)