Measuring Data Imperfection in a Neighborhood Based Method

Cadenas, José M.; Garrido, M. Carmen; Martínez, Raquel

doi:10.1007/978-3-319-24598-0_19

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9422)

Included in the following conference series:

Conference of the Spanish Association for Artificial Intelligence

Abstract

In this paper, we present an extension of the k-nearest neighbors method that can perform imputation/classification on datasets with low quality data. The method weights neighbors according to their imperfection and the distance of classes. It thus lets us state explicitly two things: the average degree of imperfection of the neighbors that is accepted when carrying out the imputation/classification, and the allowed average distance of classes to the class of the example being imputed/classified. We carry out several experiments with both real-world and synthetic datasets with low quality data to test the proposed method.
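The abstract does not give the weighting formulas, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes, purely for illustration, that imperfection is the fraction of missing attributes in an example, that distance is Euclidean over the attributes observed in both examples, and that a neighbor's vote is `(1 - imperfection) / (1 + distance)`. The names `imperfection`, `distance`, `classify` and `max_avg_imperfection` are hypothetical, and the class-distance criterion mentioned in the abstract is not modeled here.

```python
import math
from collections import defaultdict


def imperfection(x):
    """Assumed measure: fraction of missing (None) attributes in an example."""
    return sum(v is None for v in x) / len(x)


def distance(a, b):
    """Euclidean distance over attributes observed in both examples,
    rescaled to the full attribute count; inf if nothing is comparable."""
    shared = [(u, v) for u, v in zip(a, b) if u is not None and v is not None]
    if not shared:
        return math.inf
    sq = sum((u - v) ** 2 for u, v in shared)
    return math.sqrt(sq * len(a) / len(shared))


def classify(query, examples, labels, k=3, max_avg_imperfection=0.5):
    """Return the weighted-vote label, or None when there are no neighbors
    or the k nearest neighbors are, on average, too imperfect."""
    ranked = sorted(range(len(examples)),
                    key=lambda i: distance(query, examples[i]))
    neighbors = ranked[:k]
    if not neighbors:
        return None
    avg = sum(imperfection(examples[i]) for i in neighbors) / len(neighbors)
    if avg > max_avg_imperfection:
        return None  # abstain: neighborhood exceeds the accepted imperfection
    votes = defaultdict(float)
    for i in neighbors:
        d = distance(query, examples[i])
        # Assumed weighting: cleaner and closer neighbors count more.
        votes[labels[i]] += (1.0 - imperfection(examples[i])) / (1.0 + d)
    return max(votes, key=votes.get)


if __name__ == "__main__":
    examples = [[1.0, 1.0], [1.2, None], [5.0, 5.0], [5.1, 4.9]]
    labels = ["a", "a", "b", "b"]
    print(classify([1.1, 1.0], examples, labels, k=3))
    print(classify([1.1, 1.0], examples, labels, k=3, max_avg_imperfection=0.1))
```

In this sketch the threshold acts as the user-supplied bound on the average imperfection of the neighborhood: lowering it makes the classifier abstain more often instead of relying on poor-quality neighbors.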

Acknowledgements

Supported by the projects TIN2011-27696-C02-02, TIN2014-52099-R and TIN2014-56381-REDT (“Red de Lógica Difusa y Soft Computing (LODISCO)”) of the Ministry of Economy and Competitiveness of Spain.

Author information

Authors and Affiliations

Department of Information Engineering and Communication, University of Murcia, Murcia, Spain
José M. Cadenas & M. Carmen Garrido
Department of Computer Engineering, Catholic University of San Antonio, Murcia, Spain
Raquel Martínez

Corresponding author

Correspondence to Raquel Martínez.

Editor information

Editors and Affiliations

University of Castilla-La Mancha, Albacete, Spain
José M. Puerta
University of Castilla-La Mancha, Albacete, Spain
José A. Gámez
University of Cadiz, Cadiz, Spain
Bernabe Dorronsoro
Public University of Navarre, Pamplona, Spain
Edurne Barrenechea
Pablo de Olavide University, Sevilla, Spain
Alicia Troncoso
Department of Civil Engineering, University of Burgos, Burgos, Spain
Bruno Baruque
Public University of Navarre, Pamplona, Spain
Mikel Galar

About this paper

Cite this paper

Cadenas, J.M., Garrido, M.C., Martínez, R. (2015). Measuring Data Imperfection in a Neighborhood Based Method. In: Puerta, J., et al. Advances in Artificial Intelligence. CAEPIA 2015. Lecture Notes in Computer Science, vol 9422. Springer, Cham. https://doi.org/10.1007/978-3-319-24598-0_19

DOI: https://doi.org/10.1007/978-3-319-24598-0_19
Published: 14 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24597-3
Online ISBN: 978-3-319-24598-0
eBook Packages: Computer Science, Computer Science (R0)
