Bringing Named Entity Recognition on Drupal Content Management System

Ferrnandes, José; Lourenço, Anália

doi:10.1007/978-3-319-07581-5_31

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 294)

Abstract

Content management systems and frameworks (CMS/F) play a key role in Web development. They support common Web operations and provide for a number of optional modules to implement customized functionalities. Given the increasing demand for text mining (TM) applications, it seems logical that CMS/F extend their offer of TM modules. In this regard, this work contributes to Drupal CMS/F with modules that support customized named entity recognition and enable the construction of domain-specific document search engines. Implementation relies on well-recognized Apache Information Retrieval and TM initiatives, namely Apache Lucene, Apache Solr and Apache Unstructured Information Management Architecture (UIMA). As proof of concept, we present here the development of a Drupal CMS/F that retrieves biomedical articles and performs automatic recognition of organism names to enable further organism-driven document screening.
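To make the organism-driven screening idea concrete, the following is a minimal sketch of dictionary-based name recognition followed by an organism-to-document index. It is not the authors' implementation: the abstract states that the modules rely on Apache Lucene, Apache Solr and Apache UIMA, none of which is used here. The dictionary contents and the function names `recognize_organisms` and `build_organism_index` are hypothetical, chosen only for illustration.

```python
import re
from collections import defaultdict

# Hypothetical mini-dictionary mapping surface forms to canonical names.
# A real system would load a curated organism lexicon instead.
ORGANISM_DICTIONARY = {
    "escherichia coli": "Escherichia coli",
    "e. coli": "Escherichia coli",
    "saccharomyces cerevisiae": "Saccharomyces cerevisiae",
    "homo sapiens": "Homo sapiens",
}

# Longest surface forms first, so longer names win over shorter overlaps.
# The lookarounds prevent matches that start or end inside another word.
_PATTERN = re.compile(
    "|".join(
        r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        for name in sorted(ORGANISM_DICTIONARY, key=len, reverse=True)
    ),
    re.IGNORECASE,
)


def recognize_organisms(text):
    """Return (start, end, canonical_name) for each dictionary match in text."""
    return [
        (m.start(), m.end(), ORGANISM_DICTIONARY[m.group(0).lower()])
        for m in _PATTERN.finditer(text)
    ]


def build_organism_index(documents):
    """Map each canonical organism name to the sorted ids of documents mentioning it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for _, _, organism in recognize_organisms(text):
            index[organism].add(doc_id)
    return {organism: sorted(ids) for organism, ids in index.items()}
```

Given a mapping of document ids to article text, `build_organism_index` returns the documents that mention each organism, which is the kind of lookup an organism-driven search filter would need. In a production setting, the annotations would more plausibly be produced by an annotation pipeline and stored as fields in a search index rather than in an in-memory dictionary.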

Author information

Authors and Affiliations

ESEI - Escuela Superior de Ingeniería Informática, Edificio Politécnico, University of Vigo, Campus Universitario As Lagoas s/n, 32004, Ourense, Spain
José Ferrnandes & Anália Lourenço
IBB - Institute for Biotechnology and Bioengineering, Centre of Biological Engineering, University of Minho, Campus de Gualtar, 4710-057, Braga, Portugal
Anália Lourenço

Corresponding author

Correspondence to José Ferrnandes.

Editor information

Editors and Affiliations

EMBL Outstation - Hinxton, European Bioinformatics Institute, Hinxton, United Kingdom
Julio Saez-Rodriguez
Department of Informatics, University of Minho, Braga, Portugal
Miguel P. Rocha
Department of Informatics, Campus Universitario As Lagoas s/n, University of Vigo, Ourense, Spain
Florentino Fdez-Riverola
Department of Computing Science, University of Salamanca, Salamanca, Spain
Juan F. De Paz Santana

About this paper

Cite this paper

Ferrnandes, J., Lourenço, A. (2014). Bringing Named Entity Recognition on Drupal Content Management System. In: Saez-Rodriguez, J., Rocha, M., Fdez-Riverola, F., De Paz Santana, J. (eds) 8th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB 2014). Advances in Intelligent Systems and Computing, vol 294. Springer, Cham. https://doi.org/10.1007/978-3-319-07581-5_31

DOI: https://doi.org/10.1007/978-3-319-07581-5_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07580-8
Online ISBN: 978-3-319-07581-5
eBook Packages: Engineering, Engineering (R0)
