What You Learn from This Chapter

This chapter introduces definitions of three types of bioimage analysis software: Component, Collection, and Workflow. The aim is to promote the structured design of bioimage analysis methods and to improve related learning and teaching.

1.1 Introduction

Software tools used for bioimage analysis tend to be seen as utilities that solve problems off the shelf. The extreme version of this view is: “If I know where to click, I can get good results!” With gaming software, the more familiar the user becomes with the software, the faster the user can reach the final stage. To some extent this may also be true of bioimage analysis software, but there is a big difference. Because bioimage analysis is part of scientific research, the goal is not to clear a common final stage that everyone heads toward, but to reach something original that others have not yet found. The difficulty of using bioimage analysis software therefore lies not only in hidden commands, but also in the fact that the user needs to come up with a more or less original analysis. How, then, can we do something original using tools that are publicly available?

In this short chapter, we define three terms that describe the world of bioimage analysis software, “workflows”, “components”, and “collections”, and explain their relationships. We believe that clarifying these definitions can greatly help those who want to learn bioimage analysis, as well as those who need to design its teaching. The reason is that these terms link the generality of publicly available software packages with the specificity and originality of the analysis that one needs to achieve.

1.2 Types of Bioimage Analysis Software

Software packages such as ImageJ (Schneider et al. 2012), MATLAB, CellProfiler (Carpenter et al. 2006) or ICY (de Chaumont et al. 2012) are often used to analyze image data in the life sciences. These software packages are “collections” of implementations of image processing and analysis algorithms. Libraries such as ImgLib2 (Pietzsch et al. 2012), OpenCV (Bradski 2000), ITK (Johnson et al. 2015a,b), VTK (Schroeder et al. 2006), and Scikit-Image (van der Walt et al. 2014) are also packages of image processing and analysis algorithms, although they are accessed through a programming interface rather than a graphical one. We refer to all of these as “collections”. To analyze image data scientifically and address the underlying biological problem, one needs to hand-pick some algorithms from these collections, carefully adjust their parameters to the problem, and assemble them in a meaningful order. Such a sequence of image processing algorithms with a specified parameter set is what we call a “workflow”. The implementations of the algorithms used in a workflow are the “components” constituting that workflow (or “workflow components”). From the point of view of the expert who needs to assemble a workflow, a collection is a package bundling many different components. For example, many plugins offered for ImageJ are themselves collections (e.g. TrackMate (Tinevez et al. 2016), 3D Suite (Ollion et al. 2013), MosaicSuite…), as they bundle multiple components. On the other hand, some plugins, such as the Linear Kuwahara filter plugin, are a single component implemented as a single plugin.
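The three terms can be illustrated with a minimal sketch in plain Python. Everything in it is hypothetical and chosen only for illustration: the toy collection, the component names (`mean_filter`, `threshold`, `invert`) and the parameter values do not correspond to any real library. The point is only the structure: components are individual functions, the collection bundles them, and the workflow is an ordered selection of components with a fixed parameter set.

```python
from typing import Any, Callable, Dict, List, Tuple

Image = List[List[float]]


# --- Components: single implementations of one algorithm each (toy versions) ---

def mean_filter(image: Image, radius: int) -> Image:
    """Smooth each pixel with the mean of its neighbourhood, clipped at the borders."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(sum(values) / len(values))
        result.append(row)
    return result


def threshold(image: Image, level: float) -> Image:
    """Binarize: 1.0 where the pixel value exceeds `level`, otherwise 0.0."""
    return [[1.0 if value > level else 0.0 for value in row] for row in image]


def invert(image: Image) -> Image:
    """Invert intensities, assuming values in the range [0, 1]."""
    return [[1.0 - value for value in row] for row in image]


# --- Collection: a bundle of components; not all of them are used in a given workflow ---
COLLECTION: Dict[str, Callable[..., Image]] = {
    "mean_filter": mean_filter,
    "threshold": threshold,
    "invert": invert,
}

# --- Workflow: hand-picked components, in a meaningful order, with fixed parameters ---
WORKFLOW: List[Tuple[str, Dict[str, Any]]] = [
    ("mean_filter", {"radius": 1}),
    ("threshold", {"level": 0.5}),
]


def run_workflow(image: Image, workflow, collection) -> Image:
    """Apply each workflow step in order, looking up its component in the collection."""
    for component_name, parameters in workflow:
        image = collection[component_name](image, **parameters)
    return image


# Toy input: a 3x3 bright square centred in a 5x5 dark image.
toy_image = [[1.0 if 1 <= y <= 3 and 1 <= x <= 3 else 0.0 for x in range(5)] for y in range(5)]
segmented = run_workflow(toy_image, WORKFLOW, COLLECTION)
foreground_area = sum(sum(row) for row in segmented)
```

Changing the order of the steps, the parameters, or the selected components yields a different workflow, even though the collection stays the same.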

Each workflow is uniquely associated with a specific biological research project, because the question asked therein and the quality of the acquired images are often unique. This calls for a unique combination of components and parameters. Some collections, especially those designed with a GUI, offer workflow templates. These templates are pre-assembled sequences of image processing tasks that solve a typical bioimage analysis problem; all one needs to do is adjust the parameters of each step. For example, in the TrackMate plugin for ImageJ (Tinevez et al. 2016), a GUI wizard guides the user in choosing an algorithm for each step from several candidates and in adjusting its parameters to obtain a successful particle tracking workflow (see Chap. 4). Once these algorithms and parameters are set, the workflow is built. CellProfiler also has a helpful GUI that assists the user in building a workflow based on workflow templates (Carpenter et al. 2006). It allows the user to easily swap the algorithms for each step and test various parameter combinations. Figure 1.1 summarizes the above explanations.

Fig. 1.1 Relationship between components, collection and workflow. Components (e.g. Gaussian blurring filter) are selected from a collection (e.g. ImageJ) and assembled into a specific workflow (red arrow) for analyzing image data in each research project (e.g. scripts associated with journal papers)

Though such templates are available for some typical tasks, collections generally do not provide helpful clues for constructing a workflow: the choice of components and the approach taken to assemble them depend on expert knowledge, empirical knowledge or testing. Since biological questions are so diverse, the workflow often needs to be original and might not match any available workflow template. Building a workflow from scratch requires solid knowledge of the components and of the ways to combine them. It also requires an understanding of the biological problem itself. Each workflow is in essence associated with a specific biological question, and this question, together with the image acquisition setup, affects the required precision of the analysis. For example, image data in general should not be analyzed at a precision higher than the physical resolution of the imaging system that captured those data. In some cases, a higher precision does not yield more meaningful results, simply because such precision is irrelevant to the biological question. These aspects should be carefully considered when planning the analysis and choosing the components, together with the choice of statistical treatment.

Many biologists find it difficult to analyze image data because they lack the skills and knowledge needed to close the gap between a collection of components and a practical workflow. A collection bundles components without workflows, but it is often erroneously assumed that installing a collection is enough to solve a bioimage analysis problem. The truth is that expert knowledge is required to choose components, adjust their parameters and build a workflow (Fig. 1.1, red arrows). Correctly assembling components into an executable script is in general even more difficult, as it requires some programming skills. Using components directly from library-type collections, which host many useful components, also requires programming skills to access their API. Bioimage analysts may fill this gap, but even they, who analyze image data professionally, must constantly search for the most suitable components to solve a problem, whether to reach the required accuracy or to cope with huge data in a practical time.

Another important aspect, and difficulty, is the reproducibility of workflows. We often want to know how other people have performed image analysis and to learn new bioimage analysis strategies from them. In such cases, we look for workflows addressing a similar biological problem. However, many articles do not document the workflows they used in sufficient detail for the results to be reproduced. As an extreme example, we found articles whose image analysis description in Materials and Methods merely states that ImageJ was used for the image analysis. Such minimalism should be strictly avoided. On the other hand, some workflows are described in detail as text in the Materials and Methods sections of publications. We go even further and recommend publishing workflows as executable scripts, i.e. computer programs, with documented parameter sets, for clarity and reproducibility of the analysis and results. In our opinion, the best format is a version-tracked script, because the version used for the published results can be clearly stated and reused by others. A script embedded in a Docker image is even better, as it avoids problems caused by differences in execution environments.
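One possible way to document the parameter set and the script version alongside the results is sketched below in plain Python. This is only an illustration under stated assumptions: the version string, the parameter names and the function `build_provenance_record` are hypothetical, and in practice the version would correspond to a tag or commit in a version control system.

```python
import hashlib
import json
import platform

# Hypothetical version label; it would normally match a tag or commit
# of the version-tracked workflow script.
WORKFLOW_VERSION = "1.0.0"

# Hypothetical parameter set of the workflow, kept in one documented place.
PARAMETERS = {
    "smoothing_radius": 1,    # neighbourhood radius of the smoothing step, in pixels
    "threshold_level": 0.5,   # intensity above which a pixel counts as foreground
}


def build_provenance_record(parameters: dict, workflow_version: str, input_bytes: bytes) -> dict:
    """Collect what a reader would need to rerun the analysis on the same data."""
    return {
        "workflow_version": workflow_version,
        "parameters": parameters,
        # The checksum identifies the exact input data that was analyzed.
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),
        # The interpreter version is a small part of the execution environment.
        "python_version": platform.python_version(),
    }


# Placeholder bytes stand in for the content of an image file.
record = build_provenance_record(PARAMETERS, WORKFLOW_VERSION, b"example image bytes")
record_as_json = json.dumps(record, indent=2, sort_keys=True)
print(record_as_json)
```

Saving such a record next to the output files makes it possible to state in a publication exactly which script version and parameters produced the reported results.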

To make the design of workflows more efficient, the Network of European Bioimage Analysts (NEUBIAS) has been developing a searchable index named the Bioimage Informatics Search Engine (BISE). This service is accessible online at https://biii.eu and hosts a manually curated registry of collections, workflows and components.

Two ontologies are used for annotating resources registered in BISE: the BISE ontology for properties of resources, e.g. programming language; and the EDAM Bioimaging Ontology (Kalaš et al. 2019), an extension of the EDAM ontology (Ison et al. 2013) developed together with ELIXIR, for applications of these resources, e.g. image processing step and imaging modality. “Component”, “Workflow” and “Collection” are implemented as part of the BISE ontology to classify the type of software and allow more distinctive filtering of search results.

While BISE allows researchers to search for bioimage analysis resources at all these levels, general web search engines, such as Google, typically return hits for collections but not for the details of their components. In addition, workflows are in many cases hidden in biological papers and difficult to discover. BISE is also designed to feature users' impressions of the usability of components and workflows, so that individual experiences can be swiftly shared within the community.
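The idea of filtering search results by software type can be sketched with a toy in-memory registry. This is not the BISE schema or API: the class names, the field names and the registry below are assumptions made purely for illustration, and the workflow entry is a made-up placeholder. The classification of the other entries follows the examples given earlier in this chapter.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class SoftwareType(Enum):
    """The three types of bioimage analysis software defined in this chapter."""
    COMPONENT = "component"
    COLLECTION = "collection"
    WORKFLOW = "workflow"


@dataclass(frozen=True)
class Resource:
    """A registered resource annotated with its software type (hypothetical record)."""
    name: str
    software_type: SoftwareType


# Toy registry; the last entry is a hypothetical placeholder, not a real resource.
REGISTRY: List[Resource] = [
    Resource("ImageJ", SoftwareType.COLLECTION),
    Resource("TrackMate", SoftwareType.COLLECTION),
    Resource("Linear Kuwahara filter", SoftwareType.COMPONENT),
    Resource("Example particle tracking script", SoftwareType.WORKFLOW),
]


def filter_by_type(resources: List[Resource], software_type: SoftwareType) -> List[Resource]:
    """Return only the resources annotated with the requested software type."""
    return [resource for resource in resources if resource.software_type is software_type]


collection_names = [resource.name for resource in filter_by_type(REGISTRY, SoftwareType.COLLECTION)]
component_names = [resource.name for resource in filter_by_type(REGISTRY, SoftwareType.COMPONENT)]
```

With such an annotation, a search for components returns the single-algorithm implementations directly, instead of only the collections that contain them.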

Take Home Message

Within the world of bioimage analysis software, various types of tools coexist, which can be classified as “collections”, “components”, or “workflows”, yet they are all presented to the public without distinction as “software tools”. A clear definition of these types and recognition of the role of each is a foundation for learning and teaching bioimage analysis.

Further Readings

  1. Miura and Tosi (2016) discusses the general challenges of bioimage analysis.

  2. Miura and Tosi (2017) provides more details on the structure and designing of bioimage analysis workflows.

  3. Details about NEUBIAS can be found at the following web pages: