A semantic approach for the requirement-driven discovery of web resources in the Life Sciences

  • Regular Paper
Knowledge and Information Systems

Abstract

Research in the Life Sciences depends on the integration of large, distributed and heterogeneous web resources (e.g., data sources and web services). Discovering which of these resources are the most appropriate for a given task is a complex research question, since there are many candidate resources and little, mostly unstructured, metadata with which to decide among them. In this paper, we contribute a semi-automatic approach, based on semantic techniques, to assist researchers in discovering the most appropriate web resources to fulfill a set of requirements. The main feature of our approach is that it exploits broad knowledge resources to annotate the unstructured texts available in the emerging web-based repositories of web resource metadata. The results show that web resource discovery benefits from a semantic-based approach in several important respects. One advantage is that the user can express her requirements in natural language, avoiding the use of specific vocabularies or query languages. Moreover, the discovery exploits not only the categories or tags of web resources, but also their descriptions and documentation.

Author information

Corresponding author

Correspondence to María Pérez-Catalán.

About this article

Cite this article

Pérez-Catalán, M., Berlanga, R., Sanz, I. et al. A semantic approach for the requirement-driven discovery of web resources in the Life Sciences. Knowl Inf Syst 34, 671–690 (2013). https://doi.org/10.1007/s10115-012-0498-5

  • DOI: https://doi.org/10.1007/s10115-012-0498-5
