Abstract
Emerging Web standards promise a network of heterogeneous yet interoperable Web Services. Web Services would greatly simplify the development of many kinds of data integration and knowledge management applications. Unfortunately, this vision requires that services describe themselves with large amounts of semantic metadata “glue”. We explore a variety of machine learning techniques to semi-automatically create such metadata.
We make three contributions. First, we describe a Bayesian learning and inference algorithm that classifies HTML forms into semantic categories and assigns semantic labels to each form’s fields. These techniques are important as legacy HTML interfaces are migrated to Web Services. Second, we apply the Naive Bayes and SVM algorithms to the task of Web Service classification, and show that an ensemble approach that treats Web Services as structured objects is more accurate than an unstructured approach. Finally, we describe a clustering algorithm that automatically discovers the semantic categories of Web Services. All of our algorithms are evaluated using large collections of real HTML forms and Web Services.
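To make the idea of treating a Web Service as a structured object more concrete, the following is a minimal sketch and not the authors' implementation. It assumes each service is split into two hypothetical text "views" (`names` for operation and parameter names, `docs` for documentation text), trains one multinomial Naive Bayes classifier per view, and combines them by summing per-class log scores. The view names, the toy training data, and the combination rule are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over bags of tokens, with Laplace smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def log_scores(self, tokens):
        """Return an unnormalized log-probability for every class."""
        total = sum(self.class_counts.values())
        scores = {}
        for c in self.classes:
            score = math.log(self.class_counts[c] / total)
            denom = sum(self.token_counts[c].values()) + len(self.vocab)
            for t in tokens:
                if t in self.vocab:  # tokens unseen in training are ignored
                    score += math.log((self.token_counts[c][t] + 1) / denom)
            scores[c] = score
        return scores


def ensemble_predict(models, views):
    """Sum the per-view log scores (one simple combination rule) and pick the best class."""
    combined = defaultdict(float)
    for view_name, model in models.items():
        for label, score in model.log_scores(views.get(view_name, [])).items():
            combined[label] += score
    return max(combined, key=combined.get)


# Hypothetical toy training set: each service has two views and a category.
TRAINING = [
    ({"names": ["get", "weather", "forecast"],
      "docs": ["returns", "weather", "forecast", "city"]}, "weather"),
    ({"names": ["get", "temperature"],
      "docs": ["current", "temperature", "city"]}, "weather"),
    ({"names": ["get", "stock", "quote"],
      "docs": ["returns", "stock", "price", "ticker"]}, "finance"),
    ({"names": ["convert", "currency"],
      "docs": ["exchange", "rate", "currency"]}, "finance"),
]

labels = [label for _, label in TRAINING]
models = {
    view: NaiveBayes().fit([svc[view] for svc, _ in TRAINING], labels)
    for view in ("names", "docs")
}

if __name__ == "__main__":
    query = {"names": ["get", "forecast"], "docs": ["weather", "city"]}
    print(ensemble_predict(models, query))
```

An unstructured baseline, by contrast, would concatenate all tokens into a single bag and train one classifier. Keeping the views separate lets each classifier specialize on its own vocabulary before the scores are combined.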
Keywords
- Bayesian Network
- Semantic Category
- Normalize Mutual Information
- Reference Class
- Conditional Probability Table
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Heß, A., Kushmerick, N. (2003). Learning to Attach Semantic Metadata to Web Services. In: Fensel, D., Sycara, K., Mylopoulos, J. (eds) The Semantic Web - ISWC 2003. ISWC 2003. Lecture Notes in Computer Science, vol 2870. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39718-2_17
DOI: https://doi.org/10.1007/978-3-540-39718-2_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20362-9
Online ISBN: 978-3-540-39718-2
eBook Packages: Springer Book Archive