Abstract
Research in the Life Sciences depends on the integration of large, distributed and heterogeneous web resources (e.g., data sources and web services). Discovering which of these resources are the most appropriate for a given task is a complex research problem: there are many candidate resources, and the little metadata available for choosing among them is mostly unstructured. In this paper, we contribute a semi-automatic approach, based on semantic techniques, that assists researchers in discovering the web resources that best fulfill a set of requirements. The main feature of our approach is that it exploits broad knowledge resources to annotate the unstructured texts available in the emerging web-based repositories of web resource metadata. The results show that the web resource discovery process benefits from a semantic-based approach in several important respects. First, the user can express her requirements in natural language, avoiding the use of specific vocabularies or query languages. Second, the discovery exploits not only the categories or tags of web resources, but also their descriptions and documentation.
About this article
Cite this article
Pérez-Catalán, M., Berlanga, R., Sanz, I. et al. A semantic approach for the requirement-driven discovery of web resources in the Life Sciences. Knowl Inf Syst 34, 671–690 (2013). https://doi.org/10.1007/s10115-012-0498-5
DOI: https://doi.org/10.1007/s10115-012-0498-5