Analyzing Data Through Data Fusion Using Classification Techniques

Conference paper
Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 32)


Knowledge is the ultimate output of decisions made on a dataset, and applying classification rules is one of the vital methods for extracting it. In a distributed setting, knowledge is derived by combining, or fusing, such rules. In the standard approach this is generally done either by combining the classifiers' outputs or by combining the sets of classification rules. In this paper, we propose a new approach that fuses classifiers at the level of their parameters using classification rules. The approach relies on fusing probabilistic generative classifiers that use multinomial distributions for categorical input dimensions and multivariate normal distributions for the continuous ones. These distributions are used to produce results such as valid/invalid data, error rates, etc. Two (or more) classifiers can be fused by multiplying the hyper-distributions of their parameters. The main advantages of this fusion approach are that it requires less time to classify the data and that it extends easily to large datasets.
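The fusion step described above, multiplying the hyper-distributions of the parameters, can be illustrated for the categorical case. A minimal sketch, assuming the hyper-distribution over each multinomial parameter vector is a Dirichlet (its conjugate prior) and that both classifiers share the same prior; the function name and the concrete counts are illustrative only:

```python
import numpy as np

def fuse_dirichlet(alpha_a, alpha_b, alpha_prior):
    """Fuse two Dirichlet hyper-distributions by multiplying their densities.

    Multiplying two Dirichlet densities yields another Dirichlet (up to
    normalization): the per-category exponents add. Dividing out the shared
    prior once prevents it from being counted twice, so the fused
    concentration is alpha_a + alpha_b - alpha_prior.
    """
    return alpha_a + alpha_b - alpha_prior

# Uniform prior over three categories.
alpha_prior = np.ones(3)

# Posterior concentrations after each classifier saw its own local data.
alpha_a = alpha_prior + np.array([4.0, 1.0, 0.0])  # counts seen by classifier A
alpha_b = alpha_prior + np.array([2.0, 3.0, 5.0])  # counts seen by classifier B

fused = fuse_dirichlet(alpha_a, alpha_b, alpha_prior)
print(fused)  # [7. 5. 6.] -- as if all counts had been observed jointly
```

The fused concentration vector equals the prior plus the pooled counts from both classifiers, which is why parameter-level fusion avoids reprocessing the raw data.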


Data fusion · Classification · Multinomial distribution · Hyper-distribution



Copyright information

© Springer India 2015

Authors and Affiliations

  1. Department of Computer Science, Avinashilingam Deemed University, Coimbatore, India
