Incremental Schema Mapping

  • Sarawat Anam
  • Yang Sok Kim
  • Qing Liu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8863)

Abstract

Schema mapping, which provides users with a unified view, is essential for managing schema heterogeneity among different sources. It can be conducted by a machine learning approach or by a knowledge engineering approach. The machine learning approach needs a training dataset to build models, but training data are usually very difficult to obtain for large datasets, and it is also very difficult to modify a learned model using human knowledge. The knowledge engineering approach encodes human knowledge directly, so the knowledge base can be constructed with limited data, but it requires time-consuming knowledge acquisition. This research proposes an incremental schema mapping method that employs Ripple-Down Rules (RDR) with censored production rules (CPR). Our experimental results show that the RDR approach achieves performance comparable to the machine learning approaches and that the RDR knowledge base can be expanded incrementally as the number of classified cases increases.
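
To make the idea of an incrementally expanded rule base concrete, the following is a minimal sketch of a single-classification ripple-down rule tree applied to pairs of schema elements. It is an illustration under stated assumptions, not the method evaluated in the paper: the feature names (name_sim, same_type), the thresholds, and the helper names (Rule, classify, add_rule) are all hypothetical, and the censor ("unless") clauses of censored production rules are not modelled.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical case format: similarity features for one pair of schema elements.
Case = dict


@dataclass
class Rule:
    condition: Callable[[Case], bool]
    conclusion: str
    except_child: Optional["Rule"] = None  # tried next when this rule fires
    else_child: Optional["Rule"] = None    # tried next when this rule does not fire


def classify(root: Rule, case: Case):
    """Return (conclusion, last rule evaluated, whether that rule fired).

    The conclusion is that of the last rule that fired along the path.
    """
    conclusion = None
    node, last, fired = root, root, False
    while node is not None:
        last = node
        fired = node.condition(case)
        if fired:
            conclusion = node.conclusion
            node = node.except_child
        else:
            node = node.else_child
    return conclusion, last, fired


def add_rule(root: Rule, case: Case, condition, conclusion: str) -> None:
    """Attach a correcting rule at the end of the path taken by a misclassified case."""
    _, last, fired = classify(root, case)
    new_rule = Rule(condition, conclusion)
    if fired:
        last.except_child = new_rule  # refines the rule that gave the wrong answer
    else:
        last.else_child = new_rule    # covers a case no rule on the path handled


# Default rule: assume two schema elements do not match.
root = Rule(lambda c: True, "no-match")

# An expert sees a misclassified case and adds a rule for it.
case_a = {"name_sim": 0.9, "same_type": True}
add_rule(root, case_a, lambda c: c["name_sim"] >= 0.8, "match")

# A later case exposes an exception to that rule, so the rule base grows again.
case_b = {"name_sim": 0.85, "same_type": False}
add_rule(root, case_b, lambda c: not c["same_type"], "no-match")

print(classify(root, case_a)[0])
print(classify(root, case_b)[0])
```

Each correction is local: a new rule is added only where the misclassified case ended up, so conclusions already validated for earlier cases are left unchanged.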

Keywords

Schema mapping, incremental knowledge acquisition techniques, machine learning techniques



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Sarawat Anam (1, 2)
  • Yang Sok Kim (1)
  • Qing Liu (2)
  1. School of Computing and Information Systems, University of Tasmania, Sandy Bay, Australia
  2. Intelligent Sensing and Systems Laboratory, CSIRO Computational Informatics, Hobart, Australia
