A Novel Integrated Classifier for Handling Data Warehouse Anomalies
Anomalies persist in databases used across various commercial sectors and undermine the overall integrity of data. Typically, duplicate, wrong and missed observations of spatial-temporal data prevent users from accurately utilising recorded information. The literature describes different data cleaning methods, which fall into either deterministic or probabilistic approaches. However, we believe that to ensure maximum integrity, a data cleaning methodology must have properties of both categories to eliminate anomalies effectively. To realise this, we propose a method that integrates deterministic and probabilistic classifiers using fusion techniques. We have empirically evaluated the proposed concept against state-of-the-art techniques and found that our approach improves the integrity of the resulting data set.
Keywords: Bayesian Network, Fusion Technique, Commercial Sector, Capture Cycle, Monotonic Reasoning