Learning Link-Based Naïve Bayes Classifiers from Ontology-Extended Distributed Data

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 5871))

Abstract

We address the problem of learning predictive models, from a user’s point of view, from multiple large, distributed, autonomous, and hence almost invariably semantically disparate relational data sources. We show, under fairly general assumptions, how to exploit data sources annotated with relevant metadata to build predictive models (e.g., classifiers) from a collection of distributed relational data sources without a centralized data warehouse, while offering strong guarantees of exactness of the learned classifiers relative to their centralized relational learning counterparts. We demonstrate the approach by learning link-based Naïve Bayes classifiers and present results of experiments on a text classification task that show its feasibility.
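
The exactness claim can be illustrated with a minimal sketch. It assumes a plain count-based Naïve Bayes over words, not the link-based classifier or the ontology mappings of the paper, whose full text is not shown here. The function names (`local_counts`, `merge_counts`, `estimate`) and the toy data are hypothetical. Each source reports only its counts; summing those counts gives the same parameters as pooling the raw records.

```python
from collections import Counter


def local_counts(records):
    """Sufficient statistics for one source: class counts and (class, word) counts."""
    class_counts = Counter()
    word_counts = Counter()
    for label, words in records:
        class_counts[label] += 1
        for w in words:
            word_counts[(label, w)] += 1
    return class_counts, word_counts


def merge_counts(stats):
    """Sum the statistics reported by each source; no raw records are moved."""
    total_class, total_word = Counter(), Counter()
    for class_counts, word_counts in stats:
        total_class.update(class_counts)
        total_word.update(word_counts)
    return total_class, total_word


def estimate(class_counts, word_counts, vocabulary):
    """Naive Bayes priors and Laplace-smoothed likelihoods from aggregated counts."""
    n = sum(class_counts.values())
    priors = {c: class_counts[c] / n for c in class_counts}
    likelihoods = {}
    for c in class_counts:
        total = sum(word_counts[(c, w)] for w in vocabulary)
        for w in vocabulary:
            likelihoods[(c, w)] = (word_counts[(c, w)] + 1) / (total + len(vocabulary))
    return priors, likelihoods


# Hypothetical toy data split across two autonomous sources.
source_a = [("ml", ["bayes", "classifier"]), ("db", ["query", "index"])]
source_b = [("ml", ["classifier", "kernel"]), ("db", ["query", "join"])]
vocabulary = sorted({w for _, ws in source_a + source_b for w in ws})

# Distributed: each source computes counts locally, then counts are summed.
distributed = estimate(
    *merge_counts([local_counts(source_a), local_counts(source_b)]), vocabulary
)
# Centralized baseline: all records pooled in one place.
centralized = estimate(*local_counts(source_a + source_b), vocabulary)
```

Because the parameters depend on the data only through counts, and counts add across disjoint sources, the two estimates coincide in this sketch. Reconciling semantically disparate sources before counting is a separate step that is not modeled here.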

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Caragea, C., Caragea, D., Honavar, V. (2009). Learning Link-Based Naïve Bayes Classifiers from Ontology-Extended Distributed Data. In: Meersman, R., Dillon, T., Herrero, P. (eds) On the Move to Meaningful Internet Systems: OTM 2009. OTM 2009. Lecture Notes in Computer Science, vol 5871. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05151-7_28

  • DOI: https://doi.org/10.1007/978-3-642-05151-7_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-05150-0

  • Online ISBN: 978-3-642-05151-7

  • eBook Packages: Computer Science, Computer Science (R0)
