FormalPara What You Will Learn in This Chapter

The aim of this workflow is to quantify the morphology of pancreatic stem cells lying on a 2D polystyrene substrate from phase contrast microscopy images. For this purpose, the images are first processed with a Deep Learning model trained for semantic segmentation (cell/background); next, the result is refined and individual cell instances are segmented before characterizing their morphology. Through this workflow the readers will learn the nomenclature and understand the principles of Deep Learning applied to image processing. Having followed all the steps in this chapter, the reader is expected to know how to use Google Colaboratory (Bisong, 2019) notebooks, ImageJ/Fiji (Schindelin et al., 2012; Schneider et al., 2012; Rueden et al., 2017), DeepImageJ (Gómez-de Mariscal et al., 2019) and MorpholibJ (Legland et al., 2016). This complete workflow sets the basis to develop further methods in the field of Bioimage Analysis using Deep Learning. All the material needed for this chapter is provided in the following GitHub repository (under chap 4): https://github.com/NEUBIAS/neubias-springer-book-2021.Footnote 1

4.1 Why You Should Know About Deep Learning

The workflow presented in this Chapter extracts binary masks for cells in 2D phase contrast microscopy images, identifies the cells in the image and quantifies their morphology. The central component of the workflow is the step to obtain a binary mask to distinguish the pixels belonging to the cells from the rest of pixels in the image. In particular, we will train a well established Deep Learning architecture called U-Net (Ronneberger et al., 2015; Falk et al., 2019) to perform this task.

Machine Learning and Deep Learning have become common technical terms in the life sciences. They are now large fields of study that have boosted both research and industry. While both are strongly related, they also belong to a larger field called Artificial Intelligence, which pursues mimicking (or even surpassing) human intelligence with a machine (Goodfellow et al., 2016). The techniques used to extract the relevant information from data and exploit it in an intelligent way are what we call Machine Learning (ML). ML techniques are commonly divided into two main groups: supervised and unsupervised methods. Supervised learning is the task of learning a function that maps an input to an output based on sample input-output pairs; that is, such a function is inferred from labeled training data consisting of a set of training examples. When no labels or information about the correct output are given, we are talking about unsupervised learning, and the corresponding function is inferred from the data structure only. All clustering methods are thus included in the latter group.

A simple example of ML is a linear classifier, technically called a perceptron (Rosenblatt, 1961), which is able, for example, to split a set of 2D points into two different classes. In practice, ML classifiers operate on objects of much higher dimension (e.g., images) and solve tasks far more complex than classifying input data into two groups. For this reason, multiple perceptrons are stacked together to build what is known as an Artificial Neural Network (ANN). That is, we define deep architectures that support richer mathematical representations of our data. This, combined with a suitable training schedule, allows the computer to learn the patterns needed to perform the desired task. This approach is called Deep Learning (DL from now on) and, at the moment, it has proven to be among the most powerful frameworks for supervised learning.

What sets DL apart from classical approaches is that the system learns automatically from the data, without any definition or explicit programming of complex heuristic rules. A pioneering work using DL for bioimage analysis is the Convolutional Neural Network (CNN) architecture called U-Net (Ronneberger et al., 2015). It was first introduced to the community in 2015 at the International Symposium on Biomedical Imaging (ISBI) and then published at the Medical Image Computing and Computer Assisted Intervention (MICCAI) conference, two of the most important conferences for biomedical image analysis. Since then, a growing number of manuscripts (about 390 in 2020 according to PubMed) related to biomedical image analysis using DL are published every year (Litjens et al., 2017).

Note that DL techniques require not only sophisticated algorithms but also large sets of (manually) annotated images and an enormous amount of computational power. Data collection can be a whole project in itself in Computer Vision (Roh et al., 2021), not only because it is critical for the success of ML techniques, but also because of the complexity, time, and economic cost involved in handling large amounts of data. In contrast with other fields of Computer Vision, the availability of useful, large and robustly annotated datasets in bioimage analysis is still a bottleneck for the use of DL. This is due to the high economic cost of their acquisition and the expertise needed to generate manual annotations. Indeed, preparing manual annotations can be tedious and often not feasible. Some freely available annotation tools are QuPath (Bankhead et al., 2017), 3D Slicer (Kapur et al., 2016), Paintera,Footnote 2 Mastodon,Footnote 3 Catmaid (Saalfeld et al., 2009), TrakEM2 (Cardona et al., 2012), Napari (Sofroniew et al., 2020) and ITK-SNAP (Yushkevich et al., 2006); they offer a wide range of possibilities to simplify the annotation process and make it reasonably efficient. However, there is still a need for a general approach to annotate complex structures in higher dimensions (i.e., 3D, time, multiple channels, multi-modality images). Additionally, the large variability among images acquired with exactly the same setup but in a different laboratory or by a different technician often prevents the direct transfer of trained DL models. For this reason, we want to warn the reader about the necessity of retraining the provided DL model on the target data to be processed. Fortunately, as will be demonstrated, this is quite simple to do with a basic knowledge of Python and libraries such as TensorFlow (Abadi et al., 2016), Keras (Chollet et al., 2015), or Pytorch (Paszke et al., 2019), which release the user from many computational and programming technicalities. Other, even more user-friendly frameworks are Ilastik (Berg et al., 2019), ImJoy (Ouyang et al., 2019), ZeroCostDL4Mic (von Chamier et al., 2020), and the ones integrated in Fiji/ImageJ, CSBDeep (Weigert et al., 2018) and deepImageJ (Gómez-de Mariscal et al., 2019). These tools allow the direct use and/or retraining of DL models using zero code.

(Re)training DL models requires considerable computational power. The use of a graphics processing unit (GPU) such as the ones found in modern graphics boards, or specialized tensor processing units (TPU), is strongly recommended in most cases to speed up the training process. Access to these resources is possible through non-free cloud computing services such as the ones provided by Amazon or Google. Fortunately, there is a free alternative available for Google users through the Google Colaboratory ("Google Colab") framework (Bisong, 2019). It provides serverless Python Jupyter notebooks running on this hardware with pre-installed DL libraries. The use of these resources is limited but most of the time sufficient to train and test bioimage analysis (BIA) models.

4.2 Dataset

Fig. 4.1
figure 1

Example of training data. From left to right: phase contrast microscopy image (scale bar: 150 \(\mu \)m), ground truth (GT) manually annotated cells, corresponding cell-contours, and a mask with 3 labels (background, cell or cell contour)

The original data processed by this workflow can be found on the web page of the Cell Tracking Challenge (CTC) (Maška et al., 2014; Ulman et al., 2017).Footnote 4 It is provided as two independent datasets (training and challenge) since it aims to benchmark (evaluate) cell segmentation and tracking computational methods. The training set is the only one for which Ground TruthFootnote 5 (GT) is publicly available. Additionally, the CTC provides a set called Silver TruthFootnote 6 (ST). The ST set is much larger than the GT set, so it is more suitable for DL tasks. An example of training data is illustrated in Fig. 4.1.

For this work, we will use the training set of the challenge and the ST annotations to train and evaluate our method. The ST is processed to extract the contours of each cell that will be used by the workflow (Fig. 4.1). A ready-to-use dataset is provided.Footnote 7 Note that the data is distributed into three groups (training, validation and test). We will elaborate more on this in the following sections. For the final step of the workflow, we will apply the trained models to unseen data for which manual annotations are not available. For this, we will use the challenge data provided at the CTC web page.Footnote 8 In a real case scenario, the trained models are always applied to unseen data, with no GT available, otherwise we would not need to train any method!

4.3 Tools

Some tools and software packages need to be installed to run the workflow:

  • FijiFootnote 9

    To install Fiji plugins, in Fiji, click on Help > Update... Once the ImageJ Updater opens, click on Manage update sites. There you need to select the IJPB-plugins for MorpholibJ. To install deepImageJ, you need to click on Add update site. Then, fill the fields with Name: DeepImageJ and update site URL. Click on Close and Apply changes.

  • Python Notebooks: they can be executed locally or in Google ColaboratoryFootnote 10 which provides free access to cloud GPU. The latter requires a Google account.

    • Link to the notebook.Footnote 11

    • Link to open the notebook directly in Google Colaboratory.Footnote 12 It is recommended to make a local copy of the Notebook, as it will be editable.

4.4 Workflow

The steps of the workflow covered in this chapter are summarized in Fig. 4.2.

Fig. 4.2
figure 2

Summary of the proposed workflow

4.4.1 Step 1: Setting up a Google Colaboratory Notebook

Fig. 4.3
figure 3

Setting up a Google Colab notebook. (a) Go to "Change runtime type" and (b) make sure to choose GPU hardware

After opening a Google Colab notebook, we configure the hardware needed for its execution. In this case, we set up a GPU runtime (Fig. 4.3). Now we can run the notebook. The way to proceed is by clicking on the "play" button on the left side of each code cell. For example, the first cell installs the correct version of the required DL libraries (TensorFlow and Keras). This is critical for the reproducibility of the results, since the behavior of some functions can differ between versions, or the code may even crash (Fig. 4.4).

4.4.2 Step 2: Download and Split the Data into Training, Validation and Test

When using ML methods, we need to split the available annotated (GT) data into three exclusive sets: training, validation and test. The training set is used to train the method and let it learn the task of interest (e.g., binary segmentation). Such a set needs to be large enough to cover all representative scenarios (e.g., poor signal-to-noise ratio, blurred images) and events visible in the data (e.g., artifacts, debris, mitosis, apoptosis, clusters of cells). The validation set, as indicated by its name, serves to evaluate the performance of the method during training, to ensure that it is learning and to prevent over-fitting.Footnote 13 The test set is used to assess the performance of the method once the training procedure has finished. Both validation and test sets need to be independent of the training set, so that when the accuracy of the model becomes acceptable on the validation set, we can be confident that this is because the model is properly trained and has not over-fit the training set. The evaluation of the model on the test set assesses its ability to generalize to unseen data.

Fig. 4.4
figure 4

Execution of the first code cell. Every piece of code is run by clicking on the play button (red square) of each code cell

The GT data, in this particular case, consists of two independent time-lapse videos (sequences 01 and 02). Some frames from sequence 01 are used as training data, while frames from sequence 02 are used for both validation (frames \(140, 150, \ldots , 250\)) and test (frames \(151, 152,\ldots ,248, 249\)). This data organization is compiled in a zip file that needs to be downloaded and unzipped (in the cloud, if running the workflow in Google Colab). These operations are performed in the second code cell by the following commands:

figure a
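The exact commands are shown in the code figure above. For orientation only, a download-and-unzip cell in Python might look like the following sketch, where the dataset URL is a hypothetical placeholder and not the actual link (see Footnote 7):

```python
# Minimal sketch (Google Colab): download the ready-to-use dataset and unzip it.
# The URL below is a hypothetical placeholder; use the dataset link given in the
# chapter's notebook.
import urllib.request
import zipfile

dataset_url = "https://example.org/dataset.zip"   # placeholder, not the real link
urllib.request.urlretrieve(dataset_url, "dataset.zip")

with zipfile.ZipFile("dataset.zip", "r") as zf:
    zf.extractall(".")                            # creates the 'dataset' folder
```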

After decompression, the new folder called dataset contains three sub-folders (input, binary_masks and contours) for the three different sets.

4.4.3 Step 3: Train a Deep Learning Model for Binary Segmentation

A U-Net DL network is designed and trained to segment the cells in the images. We train the network using the original 2D phase contrast microscopy images as input, and a set of three binary masks as output: (1) a background mask (pixel value 1 for background and 0 elsewhere), (2) a cell mask (1 for cells and 0 elsewhere), and (3) a cell-contour mask (1 for cell contours and 0 elsewhere). In other words, the network will learn to classify each input pixel as belonging to one of three classes: background, foreground or contour.

Since the classification is performed per pixel, this process is called semantic segmentation, as opposed to instance segmentation, for which the model outputs a unique label per object of interest (here, independent cells).

4.4.3.1 Step 3.1: Preparing the Data for Training

Read the images for training and store them into memory by running the following code:

figure b
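The chapter's code is shown in the figure above; a minimal sketch of such an image-loading cell, with illustrative folder and variable names, could be:

```python
# Minimal sketch: load the training images and their annotations into memory.
# Folder layout and variable names are illustrative and may differ from the notebook.
import os
from skimage import io

train_input_filenames = sorted(os.listdir('dataset/input/train'))
# assuming the annotation files share the input file names
train_img      = [io.imread(os.path.join('dataset/input/train', f))
                  for f in train_input_filenames]
train_masks    = [io.imread(os.path.join('dataset/binary_masks/train', f))
                  for f in train_input_filenames]
train_contours = [io.imread(os.path.join('dataset/contours/train', f))
                  for f in train_input_filenames]
print('{} training images loaded.'.format(len(train_img)))
```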

You should get the following message together with the figures from Fig. 4.5.

figure c
Fig. 4.5
figure 5

Output of "Preparing the data for training" code section displaying one training image and corresponding annotations

The U-Net network we are going to train has \(\sim 500,000\) trainable parameters, which requires a large amount of memory. Thus, to reduce memory usage and make the training fit into the hardware offered by Google Colab, we crop small random patches of size \(256\times 256\) pixels from the original images. To do so, we create a function that crops a fixed number of patches from each image. We need to make sure that the patches cropped from the input image and from the output annotations (binary masks) correspond to each other. Then, we use this function to crop patches from the training data in the following code section:

figure d
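A minimal sketch of such a patch-extraction function is given below; the function and variable names are illustrative and not necessarily those used in the chapter's notebook:

```python
# Minimal sketch of random patch extraction. imgs, masks and contours are assumed
# to be lists of 2D NumPy arrays of matching sizes.
import numpy as np

def create_random_patches(imgs, masks, contours, n_patches, size=(256, 256)):
    """Crop n_patches random patches per image, using the same coordinates for
    the input image and its annotations so that they stay aligned."""
    patch_x, patch_y, patch_c = [], [], []
    for img, msk, cnt in zip(imgs, masks, contours):
        h, w = img.shape[:2]
        for _ in range(n_patches):
            top = np.random.randint(0, h - size[0] + 1)
            left = np.random.randint(0, w - size[1] + 1)
            patch_x.append(img[top:top + size[0], left:left + size[1]])
            patch_y.append(msk[top:top + size[0], left:left + size[1]])
            patch_c.append(cnt[top:top + size[0], left:left + size[1]])
    return np.array(patch_x), np.array(patch_y), np.array(patch_c)
```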

We choose to normalize the intensity values of the input and output images between 0.0 and 1.0. This way, a common range of values is set for all the images without changing the differences among them or their properties. It helps the network find optimal parameters that generalize well and, in some cases, it speeds up the training.

Note that the class of each pixel is mathematically written using a one-hot encoding representation, for which we need three binary matrices (one per class) for each image. Hence, a background pixel is encoded as \(\left[ 1,0,0\right] \), a foreground (cell) pixel as \(\left[ 0,1,0\right] \), and a cell-contour pixel as \(\left[ 0,0,1\right] \). This is performed by the following code section:

figure e
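The following sketch illustrates one possible way to perform the normalization and one-hot encoding described above (variable names are illustrative):

```python
# Minimal sketch: intensity normalization to [0, 1] and one-hot encoding of the
# three classes (background, cell, contour). Variable names are illustrative.
import numpy as np

# X: input patches, Y_cells: binary cell masks, Y_cont: binary contour masks
X = X.astype(np.float32)
X = (X - X.min()) / (X.max() - X.min() + 1e-10)    # intensities scaled to [0, 1]

cell       = (Y_cells > 0).astype(np.float32)
contour    = (Y_cont > 0).astype(np.float32)
background = 1.0 - np.clip(cell + contour, 0, 1)

# One-hot encoded output of shape (n, 256, 256, 3): [background, cell, contour]
Y = np.stack([background, cell, contour], axis=-1)
X = X[..., np.newaxis]                             # add channel axis: (n, 256, 256, 1)
```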

Exercise 1

Repeat the same procedure for the validation set. You should obtain two variables X_val and Y_val with shapes \(n\times 256\times 256\times 1\) and \(n\times 256\times 256\times 3\), respectively, n being the total number of patches generated from the validation set. We recommend generating 6 patches for each image, as there are only 11 images in the validation set and only small patches are cropped from them.

4.4.3.2 Step 3.2: Building a U-Net Shaped Convolutional Neural Network

Fig. 4.6
figure 6

(a) Convolution of an image using a kernel of size \(3\times 3\). (b) 2 level encoding of an input image into a feature space using convolutions and downsamplings. (c) 2 level decoding of a set of features into the original spatial dimension. In (b) and (c), the convolutional layers have 3 and 9, and 4 and 3 filters, respectively. All the kernels have size \(3\times 3\) and their weights are trainable parameters that are optimized during the training. Downsampling and upsampling have size \(2\times 2\), so the image size is halved and doubled, respectively

Fig. 4.7
figure 7

Architecture of the U-Net-like convolutional neural network used in the workflow

The key component of any DL method used for image analysis is the convolutional layer, which convolves the input image with a filter kernel (convolution matrix), i.e., a small matrix of coefficients (see Fig. 4.6a). Convolution is a (linear) operation that sums the elements in a local neighbourhood of the image, each weighted by the corresponding kernel coefficient, with the aim of producing an effect on the input image (e.g., blurring, enhancement, edge detection). In the DL context, we use the word kernel when referring to this small matrix, and its coefficients are called the kernel weights. The learning process consists of finding the optimal weights for each convolutional kernel. Most of the time, the features extracted by the convolutional layers alone are not complex enough to represent and analyze the relevant information in the image. A common strategy is to encode the features into a high-dimensional space, process them, and recover the original spatial representation by decoding the processed features. In the encoding path, the number of filters in the convolutional layers is increased while the size of the image is decreased; this way, a higher-dimensional feature space is reached (see Fig. 4.6b). To recover the original spatial representation, the number of filters is decreased as the spatial dimensions are increased (see Fig. 4.6c). The architectures that follow this scheme are called encoder-decoders. A well-established encoder-decoder for biomedical image analysis is the U-Net, which has encoding levels in the contracting path (the encoder), a bottleneck, and decoding levels in the expanding path (the decoder). See Fig. 4.7 for a graphical description of the U-Net-like architecture used in the current workflow.

The layers in Keras can be defined as output = Operation(number of filters, size)(input). Some additional arguments that can be specified are: the type of activation function used in the convolutional layer (activation), the initial distribution of the weights (kernel_initializer), and whether to use zero padding or not to preserve the size of the images after every convolution (padding).

The encoding path of the U-Net can be programmed simply by downsampling the image; here we use AveragePooling2D.Footnote 14 Similarly, the decoding can be achieved by upsampling. However, in this case, we decided to use transposed (sometimes called inverse) convolutions (Conv2DTranspose), whose weights are trained just like those of the convolutional layers. The final configuration is as follows:

figure f
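The code figure above contains the chapter's exact network definition. As an illustration only, a small U-Net-like model built from the same ingredients (convolutions, AveragePooling2D downsampling, Conv2DTranspose upsampling, skip connections and a three-class softmax output) could be assembled in Keras as follows; the number of levels and filters of the actual network (Fig. 4.7) may differ:

```python
# Minimal sketch of a U-Net-like model in Keras (TensorFlow backend). It only
# illustrates how such an architecture is assembled; it is not the chapter's model.
from tensorflow.keras.layers import (Input, Conv2D, AveragePooling2D,
                                     Conv2DTranspose, concatenate)
from tensorflow.keras.models import Model

def conv_block(x, n_filters):
    # Two 3x3 convolutions with ReLU activation and zero padding ('same')
    x = Conv2D(n_filters, (3, 3), activation='relu', padding='same',
               kernel_initializer='he_normal')(x)
    x = Conv2D(n_filters, (3, 3), activation='relu', padding='same',
               kernel_initializer='he_normal')(x)
    return x

inputs = Input(shape=(None, None, 1))          # grayscale input of arbitrary size

# Contracting path (encoder)
c1 = conv_block(inputs, 16)
p1 = AveragePooling2D((2, 2))(c1)              # downsampling: image size halved
c2 = conv_block(p1, 32)
p2 = AveragePooling2D((2, 2))(c2)

# Bottleneck
b = conv_block(p2, 64)

# Expanding path (decoder) with skip connections
u2 = Conv2DTranspose(32, (2, 2), strides=(2, 2), padding='same')(b)
u2 = concatenate([u2, c2])                     # skip connection from the encoder
c3 = conv_block(u2, 32)
u1 = Conv2DTranspose(16, (2, 2), strides=(2, 2), padding='same')(c3)
u1 = concatenate([u1, c1])
c4 = conv_block(u1, 16)

# One output channel per class (background, cell, contour), softmax per pixel
outputs = Conv2D(3, (1, 1), activation='softmax')(c4)
model = Model(inputs=inputs, outputs=outputs)
model.summary()
```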

Note that the layers are sequentially connected, that is, the output of a layer is the input of the following layer.

4.4.3.3 Step 3.3: Loss and Accuracy Measures

The training schedule is a common optimization process. During each iteration of the training, the output of the CNN is compared with the corresponding GT through a loss function (which summarizes the differences between them as a numerical value). Hence, the learning process consists in minimizing the loss function. To perform this optimization, the gradient of the loss function is computed and the network parameters (the kernel weights) are updated accordingly, taking steps along the gradient direction that decreases the loss, with step sizes proportional to the learning rate.

The most common loss functions are the mean squared error (MSE), the binary cross-entropy (BCE) and the categorical cross-entropy (CCE). MSE is used for regression problems (when the output is not a class but a continuous value), while BCE and CCE are used in classification tasks. Patterson and Gibson (2017) provide further details about loss functions in DL. TensorFlow and Keras also provide many ready-to-use loss functions.Footnote 15 Standard optimizers for neural networks are Stochastic Gradient Descent (SGD) (Kiefer et al., 1952), Root Mean Square propagation (RMSprop)Footnote 16 and Adaptive Moment Estimation (Adam) (Kingma and Ba, 2014). The latter is an optimization algorithm specifically designed for DL.

Here, we use the CCE loss function (Eq. 4.1), and the Adam optimizer with a learning rate set to 0.0003 (experimentally estimated but learning rates are typically in this range of values; see comments in Appendix):

$$\begin{aligned} CCE(y, p) = -\sum \limits _{c=1}^{C}y_{i,c}\log (p_{i,c}) \end{aligned}$$
(4.1)

where y is the GT, p the predicted value, C the total number of classes (\(C=3\) in this case), \(y_{i,c}=1\) if the class of observation i is c and 0 otherwise, and \(p_{i,c}\) is the predicted probability for observation i of being of class c. The values of the loss function are usually difficult to interpret on their own: the lower the value, the better the performance, but there is no fixed scale. The accuracy measure gives an indication of how close the output of the network is to the Ground Truth. This metric is easier to interpret and visualize than the loss value, but it is not suitable to guide the network optimization during training. Its values are limited to the \(\left[ 0,1\right] \) range, 1 being a perfect match between the result and the GT. Some standard accuracy measures for classification are the Jaccard index (also called Intersection over Union (IoU)), the Dice coefficient, the Hausdorff distance and the rate of True or False Positives and Negatives.

In Keras, many standard loss functions are available but we need to define a suitable accuracy measure for the problem at hand. As we deal with a segmentation task, we will use the Jaccard index, a good indicator of the overlap between our predicted and target segmented cells. It is defined for a binary image as:

$$\begin{aligned} J(y, p) = \frac{|y\cap p|}{| y\cup p|} = \frac{TP}{TP + FN + FP} \end{aligned}$$
(4.2)

where y is the GT, p the predicted value, TP the true positives, FN the false negatives and FP the false positives. Note that the Jaccard index measures the overlap between the predicted and GT masks as the ratio of correctly classified foreground pixels over the union of both masks. Although the network output has three channels (background, foreground and object-contours), we compute the accuracy measure as the average Jaccard index of the last two classes (channels). Since many pixels belong to the background class, including it in the computation would produce misleadingly high Jaccard index values. A function computing this metric can be implemented in TensorFlow as follows:

figure g
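A possible sketch of such a metric, averaging the Jaccard index over the cell and contour channels, is shown below; it is not necessarily identical to the chapter's implementation:

```python
# Minimal sketch of the Jaccard index as a Keras metric, averaged over the cell
# and contour channels (channels 1 and 2) and ignoring the background channel.
from tensorflow.keras import backend as K

def jaccard_index(y_true, y_pred):
    jac = 0.0
    for c in [1, 2]:                                  # skip the background channel
        t = K.round(y_true[..., c])
        p = K.round(y_pred[..., c])
        intersection = K.sum(t * p)
        union = K.sum(t) + K.sum(p) - intersection
        jac += intersection / (union + K.epsilon())   # avoid division by zero
    return jac / 2.0
```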

Once the network and all the required functions have been defined, we can compile the model by calling:

figure h
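As a sketch, and assuming the metric defined above is called jaccard_index, the compilation call could look like this (in older TensorFlow/Keras versions the Adam argument is lr instead of learning_rate):

```python
# Minimal sketch: compile the model with the Adam optimizer (learning rate 0.0003),
# the categorical cross-entropy loss and the Jaccard index defined above as metric.
from tensorflow.keras.optimizers import Adam

model.compile(optimizer=Adam(learning_rate=0.0003),
              loss='categorical_crossentropy',
              metrics=[jaccard_index])
```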

4.4.3.4 Step 3.4: Executing the Training Schedule

We set up the training schedule with a maximum of 100 epochsFootnote 17 and a batch sizeFootnote 18 of 10. The validation accuracy is monitored during the training. If it does not improve for a certain number of epochs (called the patience), the training process is interrupted and the best performing instance of the model is returned. The patience is initially set to 50 using the EarlyStopping callback of Keras.

To execute the training process, we just need to specify the training (X_train and Y_train) and the validation data (X_val and Y_val). During the training, the model (variable model) is automatically updated:

figure i
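A sketch of this training call, assuming the metric is named jaccard_index so that Keras exposes its validation value as val_jaccard_index, could be:

```python
# Minimal sketch of the training call: 100 epochs, batch size 10, early stopping
# on the validation Jaccard index with a patience of 50 epochs.
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_jaccard_index', mode='max',
                           patience=50, restore_best_weights=True)

history = model.fit(X_train, Y_train,
                    validation_data=(X_val, Y_val),
                    epochs=100, batch_size=10,
                    callbacks=[early_stop])
```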

It is possible to store the details of the training for each epoch (variable history in the code) and plot them afterwards (Fig. 4.8):

figure j
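A minimal plotting sketch using the history variable (assuming the metric name jaccard_index used above) could be:

```python
# Minimal sketch: plot the loss and Jaccard index stored in `history` for both the
# training and the validation data (compare with Fig. 4.8).
import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.plot(history.history['loss'], label='training')
ax1.plot(history.history['val_loss'], label='validation')
ax1.set_xlabel('epoch'); ax1.set_ylabel('CCE loss'); ax1.legend()

ax2.plot(history.history['jaccard_index'], label='training')
ax2.plot(history.history['val_jaccard_index'], label='validation')
ax2.set_xlabel('epoch'); ax2.set_ylabel('Jaccard index'); ax2.legend()
plt.show()
```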
Fig. 4.8
figure 8

Plotting the training loss and Jaccard index per epoch. The training was set to 100 epochs and values stored in the variable history are displayed. Two metrics are calculated: Categorical Cross Entropy (CCE) and Jaccard index, as loss and accuracy. The values for the training data are shown in blue, and for validation in orange

In Fig. 4.8, we can observe that the loss value on the training dataset decreases after each epoch, while the loss on the validation data only decreases until about epoch 40 and then starts to increase slightly. This is a sign that the training cannot further improve the model and could even degrade it by over-fitting to the training dataset. A similar behavior can be observed when looking at the Jaccard index: the method could still improve on the training dataset but not on the validation set. This is a second hint that the model has been optimized as much as possible given the training data.

Exercise 2

Train the network using a smaller number of images. This can easily be done by reducing the file lists train_input_filenames, train_masks_filenames and train_contours_filenames in Step 3.1. You will notice that, when using few images, the accuracy of the network on the validation and test data decreases. We suggest increasing the number of epochs so you can also visualize any over-fitting, or determine whether the network needs a longer training process.

4.4.4 Step 4: Evaluating the Trained Model

Keras enables a simple evaluation of the performance of the method, as long as the same information as used for training (input and GT images) is available for the test dataset. For this, we just need to initialize two variables X_test and Y_test (see Exercise 3):

figure k
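A sketch of this evaluation call could be:

```python
# Minimal sketch: evaluate the trained model on the test set. The returned values
# are the test loss (CCE) and the metric given at compile time (Jaccard index).
test_loss, test_jaccard = model.evaluate(X_test, Y_test, batch_size=1)
print('Test loss (CCE): {:.4f}'.format(test_loss))
print('Test Jaccard index: {:.4f}'.format(test_jaccard))
```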

Exercise 3

As in Exercise 1, read the images in the test folder and create two normalized Numpy arrays X_test and Y_test. However, note that random patches are not used this time, as we want to evaluate the performance on whole images. Additionally, the size of the network input needs to be a multiple of 16 due to the downsampling layers and skip connections (Fig. 4.7). Hence, crop the largest possible (\(560\times 704\) pixels) central patch from each image and its manual annotations. The expected shapes of X_test and Y_test are \(90\times 560\times 704\times 1\) and \(90\times 560\times 704\times 3\), respectively.

4.4.5 Step 5: Building a DeepImageJ Bundled Model to Process New Data

4.4.5.1 Step 5.1: Saving the Trained Model in TensorFlow’s Format

DeepImageJ is a plugin toolset in Fiji/ImageJ designed to load and run TensorFlow models. Next, we show how to store the model in a SavedModel ProtoBuffer format (default file format in TensorFlow), so that deepImageJ can read it and process an image directly loaded from ImageJ using the trained model:

figure l
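A sketch of this export step is shown below; the exact saving call may differ depending on the TensorFlow version used in the notebook:

```python
# Minimal sketch: export the trained Keras model as a TensorFlow SavedModel
# (ProtoBuffer) so that deepImageJ can load it.
import tensorflow as tf

tf.keras.models.save_model(model, 'DeepImageJ-model')
# The 'DeepImageJ-model' folder now contains saved_model.pb and a 'variables' folder.
```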

A new folder called DeepImageJ-model is created with two items inside: saved_model.pb and a folder called variables. We recommend compressing this folder into a DeepImageJ-model.zip file and downloading it so you can work on it locally with Fiji/ImageJ:

figure m
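A sketch of this compression and download step (the files module only works inside Google Colab) could be:

```python
# Minimal sketch (Google Colab): compress the exported model folder and download it.
import shutil
from google.colab import files   # only available inside Google Colab

shutil.make_archive('DeepImageJ-model', 'zip', root_dir='.', base_dir='DeepImageJ-model')
files.download('DeepImageJ-model.zip')
```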

Unzip the file on your local machine. Note that the folder should look exactly like the one we had in the cloud (DeepImageJ-model).

4.4.5.2 Step 5.2: Creating a DeepImageJ Bundled Model

DeepImageJ comprises three different plugins: Run, Explore and Build Bundled Model. First, the TensorFlow model needs to be converted into a deepImageJ bundled model. Click on ImageJ > Plugins > DeepImageJ > Build Bundled Model and open an example image for this processing. We opened the image t199.tif from the test set. A dialog box pops up indicating the steps to follow (see Fig. 4.9).

Fig. 4.9
figure 9

DeepImageJ build bundled model process: (a) Open a test image in Fiji and call Build Bundled Model; (b) Load a model indicating the path to the unzipped DeepImageJ-model folder; (c) Specify input and output dimension order (N: batch number, H: height, W: width, C: channels); and also (d) input size (32) and padding (47); (e) Write the name of the model, authors, credits, citations or any other relevant information; (f, g) Write the pre- and post-processing macro routines needed for the correct image processing; (h) Run the image processing routine and test that you get the desired output; (i) If so, specify a new name for the bundled model and save it under ImageJ’s recently created models folder

Fig. 4.10
figure 10

Example of network output. Given an input image (top-left, scale bar: 150 \(\mu \)m), the output of our U-Net is an image with three channels, each of them indicating the probability of being background, foreground or cell contour (columns 2–4). The color intensity of the three channels is equally calibrated from 0 to 1. Notice these predictions contain continuous values from 0 to 1 so they need to be post-processed in order to get a binary mask for each class as in the GT (last row). Note that the cells touching the image borders are discarded from the CTC GT

The pre-processing ImageJ macroFootnote 19 is used to normalize the input images:

figure n

If no post-processing macro is set, we get the raw output of the network (Fig. 4.10). However, we would like to identify each independent cell in the mask (i.e., instance segmentation). So, a distance transform Watershed routine is included in the post-processing macroFootnote 20 together with some morphological operations to split cell clusters and refine the results:

figure o

4.4.6 Step 6: Process All Images in Fiji Using DeepImageJ and MorpholibJ

We are now reaching the final stage of the workflow! We are ready to quantify the morphology of the cells from the test set. Download the data from the CTC web page (Sect. 4.2) and unzip it. Use the Fiji/ImageJ macro provided in this chapterFootnote 21 to process the new images. Please update the path in the macro to the location of the unzipped CTC images on your computer.

Fig. 4.11
figure 11

Final step. From an ImageJ macro, the images stored in the folder images_to_process are processed using the trained model and for each detected cell, a complete list of morphological features are calculated

With this macro, the individual masks of the cells extracted from the downloaded CTC images will be stored (one label image per input image) together with their corresponding morphological measurements in an easy-to-read comma-separated values (CSV) file (see Fig. 4.11). More precisely, for each segmented cell, the area, perimeter, circularity, Euler number, bounding box, centroid coordinates, equivalent ellipse, ellipse elongation, convexity, maximum Feret diameter, oriented box, oriented box elongation, geodesic diameter, tortuosity, maximum inscribed disc, average thickness and geodesic elongation will be recorded. For a detailed description of each measurement, see the latest version of MorphoLibJ manual.Footnote 22

Take-Home Message

In this chapter, we have presented a complete bioimage analysis workflow leveraging a DL model to segment cells in phase contrast images. The proposed workflow is versatile and meant to be customizable to other image segmentation tasks. As demonstrated, DL models for bioimage processing can be easily used in Fiji/ImageJ. However, trained models generally do not perform as well on new (and different) images unless they are re-trained. That being said, the proposed workflow can be effortlessly applied to new (similar) datasets by simply modifying the input folders and reproducing the steps described in this document.