On the Discovery of Relational Patterns in Semantically Similar Annotated Linked Data

  • Guillermo Palma
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8465)

Abstract

A wide variety of publicly linked datasets have been annotated with domain-specific ontologies. Annotations can be naturally represented with graphs, and the knowledge encoded in these annotations can be mined to discover potential novel relationships. We propose novel mining techniques that exploit the semantics represented by these graphs to discover relational patterns. Initial experimental results suggest that our approach can be effectively applied in different biomedical domains, and that it exhibits performance comparable to state-of-the-art solutions.

Keywords

#eswcphd2014Palma Mining Patterns Linking Data Annotated Graph 

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Guillermo Palma, Universidad Simón Bolívar, Venezuela
