Semantic Graph Analysis for Federated LOD Surfing in Life Sciences

  • Atsuko Yamaguchi
  • Kouji Kozaki
  • Yasunori Yamamoto
  • Hiroshi Masuya
  • Norio Kobayashi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10675)


Linked Open Data (LOD) is increasingly used to publish life science databases. To facilitate flexible use of such databases, we employ a method that performs federated query search along a path of class–class relationships. However, making federated query search effective requires understanding the structure that these relationships form across LOD datasets. We therefore constructed a graph of class–class relationships among 43 SPARQL endpoints and analyzed its connectivity. We found that (1) the sizes of the connected components follow a power law, so classes should be handled separately according to the size of the connected component they belong to; (2) only the largest and second-largest connected components contain paths among classes from two or more SPARQL endpoints, and the datasets within each of these two components share ontologies; and (3) the key classes that connect SPARQL endpoints are primarily upper-level concepts in the biological domain.
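The connectivity analysis described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the endpoint labels, and the `connected_components` helper are hypothetical, and the toy edge list stands in for the class–class relationships collected from the real SPARQL endpoints. The sketch finds the connected components of an undirected class–class graph, tallies their sizes, and picks out the components whose classes come from two or more endpoints.

```python
from collections import Counter, defaultdict


def connected_components(edges):
    """Return the connected components of an undirected graph, largest first."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Iterative depth-first search from a class not yet visited.
        stack = [start]
        seen.add(start)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Hypothetical toy data: each class is tagged with the endpoint it comes from.
endpoint_of = {
    "ep1:Gene": "ep1",
    "ep1:Protein": "ep1",
    "ep2:Protein": "ep2",
    "ep2:Disease": "ep2",
    "ep3:Paper": "ep3",
    "ep3:Author": "ep3",
}

# Hypothetical class-class relationships (undirected edges).
edges = [
    ("ep1:Gene", "ep1:Protein"),
    ("ep1:Protein", "ep2:Protein"),  # a link that crosses endpoints
    ("ep2:Protein", "ep2:Disease"),
    ("ep3:Paper", "ep3:Author"),
]

components = connected_components(edges)

# Number of components of each size; a power law would show up in this tally.
size_histogram = Counter(len(c) for c in components)

# Components whose classes come from two or more endpoints.
multi_endpoint = [
    c for c in components if len({endpoint_of[n] for n in c}) >= 2
]

print(size_histogram)
print(multi_endpoint)
```

Classes that take part in no relationship never appear in the edge list, so this sketch does not count them as components of size one; a fuller analysis would need to add them explicitly.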


Keywords: Linked Open Data, Class–class relationships, Data integration, Federated query search



This work was supported by JSPS KAKENHI grant numbers 17K00434, 17K00424 and 17H01789, and by the National Bioscience Database Center (NBDC) of the Japan Science and Technology Agency (JST).



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Atsuko Yamaguchi (1), corresponding author
  • Kouji Kozaki (2)
  • Yasunori Yamamoto (1)
  • Hiroshi Masuya (3, 4)
  • Norio Kobayashi (3, 4)

  1. Database Center for Life Science (DBCLS), Research Organization of Information and Systems, Kashiwa, Japan
  2. The Institute of Scientific and Industrial Research (ISIR), Osaka University, Ibaraki, Japan
  3. RIKEN BioResource Center (BRC), Tsukuba, Japan
  4. Advanced Center for Computing and Communication (ACCC), RIKEN, Wako, Japan
