Knowledge and Information Systems, Volume 34, Issue 3, pp 671–690

A semantic approach for the requirement-driven discovery of web resources in the Life Sciences

  • María Pérez-Catalán
  • Rafael Berlanga
  • Ismael Sanz
  • María José Aramburu
Regular Paper


Research in the Life Sciences depends on the integration of large, distributed and heterogeneous web resources (e.g., data sources and web services). Discovering which of these resources are the most appropriate to solve a given task is a complex research question, since there are many candidate resources and there is little, mostly unstructured, metadata on which to base a decision among them. In this paper, we contribute a semi-automatic approach, based on semantic techniques, that assists researchers in discovering the most appropriate web resources to fulfill a set of requirements. The main feature of our approach is that it exploits broad knowledge resources to annotate the unstructured texts that are available in the emerging web-based repositories of web resource metadata. The results show that the web resource discovery process benefits from a semantic-based approach in several important aspects. One advantage is that the user can express her requirements in natural language, avoiding the use of specific vocabularies or query languages. Moreover, the discovery exploits not only the categories or tags of web resources, but also their description and documentation.


Web resources discovery, Requirements-driven methods, Life Sciences, Knowledge resources





Copyright information

© Springer-Verlag London Limited 2012

Authors and Affiliations

  • María Pérez-Catalán (1)
  • Rafael Berlanga (2)
  • Ismael Sanz (1)
  • María José Aramburu (1)
  1. Department of Computer Science and Engineering (DICC), Universitat Jaume I, Castelló de la Plana, Spain
  2. Department of Computer Languages and Systems, Universitat Jaume I, Castelló de la Plana, Spain
