Kachako: A Hybrid-Cloud Unstructured Information Platform for Full Automation of Service Composition, Scalable Deployment and Evaluation

Natural Language Processing as an Example
  • Yoshinobu Kano
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7759)


Automation is the key concept when designing a service platform, because automation reduces human work. Focusing on unstructured information such as text, images and audio, we implemented our service platform “Kachako” as a hybrid-cloud system in which the services themselves are transferred on demand. We propose that each service be specified by its input and output types, and that the service's executable be portable, compatible and interoperable. Assuming such services, Kachako automates every step that users need. Kachako provides graphical user interfaces that allow end users to complete their tasks within Kachako without programming. Kachako is designed in a modular way, complying with well-known frameworks such as UIMA, Hadoop and Maven, which allows partial reuse and customization. We show that Kachako is practically useful by integrating our natural language processing (NLP) services. Kachako is the world's first freely available full-automation system for NLP.
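The idea of specifying each service by its input and output types can be illustrated with a minimal sketch. The code below is not Kachako's implementation and does not use the UIMA API; the `Service` class, the `compose` function, and the service and type names are all hypothetical, chosen only to show how a workflow could be assembled automatically once every service declares what it consumes and produces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Hypothetical service description: a name plus declared I/O types."""
    name: str
    inputs: frozenset
    outputs: frozenset


def compose(services, available, goal):
    """Return an ordered list of services that derives `goal` from `available`.

    Works backwards from each goal type: find a service that produces it,
    recursively satisfy that service's inputs, then schedule the service.
    """
    have = set(available)
    plan = []

    def satisfy(type_name, pending):
        if type_name in have:
            return
        if type_name in pending:
            raise ValueError(f"cyclic dependency on type {type_name!r}")
        # Pick the first service that declares this type as an output.
        provider = next((s for s in services if type_name in s.outputs), None)
        if provider is None:
            raise ValueError(f"no service produces type {type_name!r}")
        for needed in sorted(provider.inputs):
            satisfy(needed, pending | {type_name})
        plan.append(provider)
        have.update(provider.outputs)

    for type_name in sorted(goal):
        satisfy(type_name, frozenset())
    return plan


# Hypothetical NLP services, deliberately listed out of execution order.
SERVICES = [
    Service("ner", frozenset({"Token", "POS"}), frozenset({"NamedEntity"})),
    Service("parser", frozenset({"Token", "POS"}), frozenset({"Parse"})),
    Service("pos_tagger", frozenset({"Token"}), frozenset({"POS"})),
    Service("tokenizer", frozenset({"Sentence"}), frozenset({"Token"})),
    Service("sentence_splitter", frozenset({"Text"}), frozenset({"Sentence"})),
]

plan = compose(SERVICES, available={"Text"}, goal={"NamedEntity"})
plan_names = [s.name for s in plan]
print(plan_names)
```

In this sketch the unrelated `parser` service is left out of the plan because nothing in the goal requires its output, which is the kind of decision that declared input and output types make possible without user intervention.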


Keywords: Automation, Unstructured Information, Service Composition, Scalability, Natural Language Processing



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Yoshinobu Kano (1)
  1. PRESTO, Japan Science and Technology Agency (JST), Japan
