Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 294)

Abstract

Content management systems and frameworks (CMS/F) play a key role in Web development. They support common Web operations and provide optional modules for implementing customized functionality. Given the increasing demand for text mining (TM) applications, it is logical for CMS/F to extend their offering of TM modules. To that end, this work contributes modules to the Drupal CMS/F that support customized named entity recognition and enable the construction of domain-specific document search engines. The implementation relies on well-established Apache information retrieval and TM projects, namely Apache Lucene, Apache Solr and the Apache Unstructured Information Management Architecture (UIMA). As a proof of concept, we present the development of a Drupal CMS/F that retrieves biomedical articles and automatically recognizes organism names to enable further organism-driven document screening.
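
To make the idea of organism-driven document screening concrete, the following is a minimal sketch of dictionary-based organism name recognition followed by a simple filter over a document set. It is an illustration under stated assumptions, not the modules described in this work: it uses plain Python regular expressions rather than Apache Solr or UIMA, and the dictionary contents and the names `ORGANISM_DICTIONARY`, `recognize_organisms` and `screen_documents` are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical toy dictionary (an assumption for illustration):
# lower-cased surface form -> canonical organism name.
ORGANISM_DICTIONARY: Dict[str, str] = {
    "escherichia coli": "Escherichia coli",
    "e. coli": "Escherichia coli",
    "saccharomyces cerevisiae": "Saccharomyces cerevisiae",
    "homo sapiens": "Homo sapiens",
}


def build_pattern(dictionary: Dict[str, str]) -> "re.Pattern[str]":
    """Compile one case-insensitive pattern covering every surface form."""
    # Longest forms first, so a longer name is preferred over a shorter one
    # that starts at the same position.
    forms = sorted(dictionary, key=len, reverse=True)
    alternatives = "|".join(re.escape(form) for form in forms)
    # The lookarounds keep matches from starting or ending inside a word.
    return re.compile(r"(?<!\w)(" + alternatives + r")(?!\w)", re.IGNORECASE)


def recognize_organisms(
    text: str, dictionary: Dict[str, str] = ORGANISM_DICTIONARY
) -> List[Tuple[int, int, str]]:
    """Return (start, end, canonical name) for each dictionary match in text."""
    pattern = build_pattern(dictionary)
    return [
        (match.start(), match.end(), dictionary[match.group(1).lower()])
        for match in pattern.finditer(text)
    ]


def screen_documents(documents: Dict[str, str], organism: str) -> List[str]:
    """Return the ids of documents that mention the given canonical organism."""
    return [
        doc_id
        for doc_id, text in documents.items()
        if any(name == organism for _, _, name in recognize_organisms(text))
    ]


if __name__ == "__main__":
    docs = {
        "a": "E. coli was grown in LB medium.",
        "b": "Yeast (Saccharomyces cerevisiae) strains were compared.",
        "c": "No organism is mentioned here.",
    }
    print(screen_documents(docs, "Escherichia coli"))
```

Mapping several surface forms to one canonical name is what lets a screening step treat "E. coli" and "Escherichia coli" as the same organism. A production system would typically delegate matching and indexing to a search and annotation stack instead of scanning each document with a regular expression.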

Author information

Corresponding author

Correspondence to José Ferrnandes.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Ferrnandes, J., Lourenço, A. (2014). Bringing Named Entity Recognition on Drupal Content Management System. In: Saez-Rodriguez, J., Rocha, M., Fdez-Riverola, F., De Paz Santana, J. (eds) 8th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB 2014). Advances in Intelligent Systems and Computing, vol 294. Springer, Cham. https://doi.org/10.1007/978-3-319-07581-5_31

  • DOI: https://doi.org/10.1007/978-3-319-07581-5_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07580-8

  • Online ISBN: 978-3-319-07581-5

  • eBook Packages: Engineering, Engineering (R0)
