Abstract
Despite unified data models, such as the Resource Description Framework (Rdf) on structural level and the corresponding query language Sparql, the integration and usage of Linked Open Data faces major heterogeneity challenges on the semantic level. Incorrect use of ontology concepts and class properties impede the goal of machine readability and knowledge discovery. For example, users searching for movies with a certain artist cannot rely on a single given property artist, because some movies may be connected to that artist by the predicate starring. In addition, the information need of a data consumer may not always be clear and her interpretation of given schemata may differ from the intentions of the ontology engineer or data publisher.
It is thus necessary to either support users during query formulation or to incorporate implicitly related facts through predicate expansion. To this end, we introduce a data-driven synonym discovery algorithm for predicate expansion. We applied our algorithm to various data sets as shown in a thorough evaluation of different strategies and rule-based techniques for this purpose.
Chapter PDF
Similar content being viewed by others
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Abedjan, Z., Lorey, J., Naumann, F.: Reconciling ontologies and the web of data. In: Proceedings of the International Conference on Information and Knowledge Management (CIKM), Maui, Hawaii, pp. 1532–1536 (2012)
Abedjan, Z., Naumann, F.: Context and target configurations for mining RDF data. In: Proceedings of the International Workshop on Search and Mining Entity-Relationship Data (SMER), Glasgow (2011)
Agrawal, R., Imieliński, T., Swami, A.: Mining association rules between sets of items in large databases. In: Proceedings of the ACM International Conference on Management of Data (SIGMOD), Washington, D.C., USA, pp. 207–216 (1993)
Agrawal, R., Srikant, R.: Fast Algorithms for Mining Association Rules in Large Databases. In: Proceedings of the International Conference on Very Large Databases (VLDB), Santiago de Chile, Chile, pp. 487–499 (1994)
Antonie, M.-L., Zaïane, O.R.: Mining positive and negative association rules: An approach for confined rules. In: Boulicaut, J.-F., Esposito, F., Giannotti, F., Pedreschi, D. (eds.) PKDD 2004. LNCS (LNAI), vol. 3202, pp. 27–38. Springer, Heidelberg (2004)
Baeza-Yates, R.A., Ribeiro-Neto, B.: Modern Information Retrieval. Addison-Wesley Longman Publishing Co., Inc., Boston (1999)
Baroni, M., Bisi, S.: Using cooccurrence statistics and the web to discover synonyms in technical language. In: International Conference on Language Resources and Evaluation, pp. 1725–1728 (2004)
Bizer, C., Lehmann, J., Kobilarov, G., Auer, S., Becker, C., Cyganiak, R., Hellmann, S.: DBpedia - A crystallization point for the Web of Data. Journal of Web Semantics (JWS) 7, 154–165 (2009)
Böhm, C., Naumann, F., Abedjan, Z., Fenz, D., Grütze, T., Hefenbrock, D., Pohl, M., Sonnabend, D.: Profiling linked open data with ProLOD. In: Proceedings of the International Workshop on New Trends in Information Integration (NTII), pp. 175–178 (2010)
Cafarella, M.J., Halevy, A., Wang, D.Z., Wu, E., Zhang, Y.: WebTables: exploring the power of tables on the web. Proceedings of the VLDB Endowment 1, 538–549 (2008)
Doan, A., Domingos, P., Halevy, A.Y.: Reconciling schemas of disparate data sources: a machine-learning approach. In: Proceedings of the ACM International Conference on Management of Data (SIGMOD), New York, NY, pp. 509–520 (2001)
Elbassuoni, S., Ramanath, M., Weikum, G.: Query relaxation for entity-relationship search. In: Antoniou, G., Grobelnik, M., Simperl, E., Parsia, B., Plexousakis, D., De Leenheer, P., Pan, J. (eds.) ESWC 2011, Part II. LNCS, vol. 6644, pp. 62–76. Springer, Heidelberg (2011)
Gottlob, G., Senellart, P.: Schema mapping discovery from data instances. Journal of the ACM 57(2), 6:1–6:37 (2010)
Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. In: Proceedings of the ACM International Conference on Management of Data (SIGMOD), pp. 1–12 (2000)
Harris, Z.: Distributional structure. Word 10(23), 146–162 (1954)
Heath, T., Bizer, C.: Linked Data: Evolving the Web into a Global Data Space. Morgan Claypool Publishers (2011)
Kuramochi, M., Karypis, G.: Frequent subgraph discovery. In: Proceedings of the IEEE International Conference on Data Mining (ICDM), Washington, D.C., pp. 313–320 (2001)
Li, W.-S., Clifton, C.: Semint: A tool for identifying attribute correspondences in heterogeneous databases using neural networks. Data and Knowledge Engineering (DKE) 33(1), 49–84 (2000)
Naumann, F., Ho, C.-T., Tian, X., Haas, L.M., Megiddo, N.: Attribute classification using feature analysis. In: Proceedings of the International Conference on Data Engineering (ICDE), p. 271 (2002)
Rahm, E., Bernstein, P.A.: A survey of approaches to automatic schema matching. VLDB Journal 10(4), 334–350 (2001)
Rettinger, A., Lösch, U., Tresp, V., d’Amato, C., Fanizzi, N.: Mining the semantic web - statistical learning for next generation knowledge bases. Data Min. Knowl. Discov. 24(3), 613–662 (2012)
Völker, J., Niepert, M.: Statistical schema induction. In: Antoniou, G., Grobelnik, M., Simperl, E., Parsia, B., Plexousakis, D., De Leenheer, P., Pan, J. (eds.) ESWC 2011, Part I. LNCS, vol. 6643, pp. 124–138. Springer, Heidelberg (2011)
Wei, X., Peng, F., Tseng, H., Lu, Y., Dumoulin, B.: Context sensitive synonym discovery for web search queries. In: Proceedings of the International Conference on Information and Knowledge Management (CIKM), New York, NY, USA, pp. 1585–1588 (2009)
Zaki, M.J.: Scalable Algorithms for Association Mining. IEEE Transactions on Knowledge and Data Engineering (TKDE) 12, 372–390 (2000)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Abedjan, Z., Naumann, F. (2013). Synonym Analysis for Predicate Expansion. In: Cimiano, P., Corcho, O., Presutti, V., Hollink, L., Rudolph, S. (eds) The Semantic Web: Semantics and Big Data. ESWC 2013. Lecture Notes in Computer Science, vol 7882. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38288-8_10
Download citation
DOI: https://doi.org/10.1007/978-3-642-38288-8_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38287-1
Online ISBN: 978-3-642-38288-8
eBook Packages: Computer ScienceComputer Science (R0)