Morphology on Categorical Distributions

Mathematical morphology (MM) is an indispensable tool for post-processing. Several extensions of MM to categorical images, such as multi-class segmentations, have been proposed. However, none provide satisfactory definitions for morphology on probabilistic representations of categorical images. The categorical distribution is a natural choice for representing uncertainty about categorical images. Extending MM to categorical distributions is problematic because categories are inherently unordered. Without ranking categories, we cannot use the standard framework based on supremum and infimum. Ranking categories is impractical and problematic. Instead, we consider the probabilistic representation and operations that emphasize a single category. In this work, we review and compare previous approaches. We propose two approaches for morphology on categorical distributions: operating on Dirichlet distributions over the parameters of the distributions and operating directly on the distributions. We propose a "protected" variant of the latter and demonstrate the proposed approaches by fixing misclassifications and modeling annotator bias.

Multi-class segmentation problems are common in the analysis of biomedical images. A typical solution is to train a neural network pixel classifier. Commonly, these networks predict a probability distribution over all classes in each pixel, which can be thresholded to obtain a final segmentation. These predictions often contain holes, partial misclassifications, shrinkage of small classes and rough borders between classes, resulting in errors in the final segmentation. To improve the segmentation, post-processing is often used to close holes, reclassify uncertain pixel labels based on proximity, grow objects, and smooth rough boundaries.
Mathematical morphology is a powerful framework for post-processing binary and grayscale images. Binary and grayscale morphology are special cases of morphology on complete lattices [1]. A complete lattice is a partially ordered set (poset), where each non-empty subset has an infimum and a supremum. For complete lattices the core operators, dilation and erosion, can be defined using supremum and infimum: for binary morphology using set union and intersection; for grayscale morphology using maximum and minimum under the standard total ordering of the reals. See [1] for an in-depth treatment of the theoretical foundations of mathematical morphology.
For general multi-class images, there is no natural ordering of the classes, and hence, they do not form a complete lattice. For example, for a segmentation of microscope images of cells into cell membrane, mitochondria and background, any ordering of the classes is task dependent and not given by the images themselves. A natural representation of this kind of data is the categorical distribution, which can represent both crisp segmentation masks and uncertainty as encountered in prediction images. In the remainder of this work we will use the term "categorical" instead of "multi-class".
In this work, we provide a thorough review of previously proposed approaches to morphology on categorical images. We then propose two approaches for morphology on categorical distributions: an indirect approach where we operate on Dirichlet distributions that are then transformed to categorical distributions, and a direct approach where we operate on the categorical distributions themselves. We then define protected variants of the direct operations that allow finer control over the processing. Finally, we illustrate the utility of the proposed approach on two tasks: fixing misclassified mitochondria and modeling annotator bias.

Background and related work
In this section we briefly restate morphology on complete lattices and on binary and grayscale images, before we review the most relevant literature [2][3][4][5][6][7][8]. What we refer to as categorical images have various names in the literature: color-coded images, label images and n-ary images. In the sections below we will use the original names in the section titles, but otherwise we will refer to categorical images and categorical morphology.
In the literature, there are three main approaches for extending morphology to images with values that do not have a natural ordering: impose an order on the values, which is the common approach for color images; operate on all categories simultaneously [2,5]; and operate on a single category at a time [6,7]. Morphology on color images has received a lot of attention, with the primary focus on ordering colors by exploiting the relationship between dimensions of color spaces. See for example [9] for an overview of approaches for defining an ordering of colors. Our focus is on categorical images, where such approaches are less relevant.

Morphology on complete lattices
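Let $\Gamma$ be a set with the partial order $\leq$. The poset $(\Gamma, \leq)$ is a complete lattice if every subset of $\Gamma$ has an infimum $\wedge$ and a supremum $\vee$. We define an image as a function $f$ from pixel coordinates $D = \mathbb{Z}^d$ to $\Gamma$, and a structuring element $B$ as a subset of $D$,
$$f : D \to \Gamma, \tag{1}$$
$$B \subseteq D. \tag{2}$$
The dilation ($\delta$) and erosion ($\varepsilon$) of $f$ by $B$ are then defined as the supremum and infimum over the local neighborhoods in $f$ given by $B$,
$$\delta(f; B)(x) = \bigvee_{y \in B} f(x - y), \tag{3}$$
$$\varepsilon(f; B)(x) = \bigwedge_{y \in B} f(x + y). \tag{4}$$
Opening ($\gamma$) and closing ($\varphi$) are the compositions of dilation and erosion,
$$\gamma(f; B) = \delta(\varepsilon(f; B); B), \tag{5}$$
$$\varphi(f; B) = \varepsilon(\delta(f; B); B). \tag{6}$$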

Binary and grayscale morphology
We define a grayscale image as in (1) with $\Gamma = [0, 1]$. Let $\leq$ be the usual ordering of the reals; then the poset $([0, 1], \leq)$ is a complete lattice, where the min function gives the infimum and the max function the supremum. Let $B$ be defined as in (2). Dilation and erosion can then be obtained from (3) and (4) as
$$\delta(f; B)(x) = \max_{y \in B} f(x - y), \tag{7}$$
$$\varepsilon(f; B)(x) = \min_{y \in B} f(x + y). \tag{8}$$

If we restrict Γ to {0, 1} we obtain binary morphology.
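The grayscale and binary cases can be illustrated with a minimal NumPy sketch. It assumes a 2-D image, a structuring element given as a list of (row, column) offsets, and that pixels outside the image take the neutral value (0 for dilation, 1 for erosion). The names `shifted`, `dilate`, `erode` and `CROSS` are illustrative choices, not part of the original work.

```python
import numpy as np

def shifted(f, dy, dx, fill):
    """Return g with g[r, c] = f[r + dy, c + dx]; `fill` is used outside the image."""
    h, w = f.shape
    p = max(abs(dy), abs(dx))
    padded = np.pad(f, p, constant_values=fill)
    return padded[p + dy:p + dy + h, p + dx:p + dx + w]

def dilate(f, B):
    """Flat grayscale dilation: max over y in B of f(x - y)."""
    return np.max([shifted(f, -dy, -dx, 0.0) for dy, dx in B], axis=0)

def erode(f, B):
    """Flat grayscale erosion: min over y in B of f(x + y)."""
    return np.min([shifted(f, dy, dx, 1.0) for dy, dx in B], axis=0)

# Hypothetical cross-shaped structuring element, given as (row, column) offsets.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]

f = np.zeros((5, 5))
f[2, 2] = 0.8                                   # a single bright pixel
dilated = dilate(f, CROSS)                      # the pixel grows into a cross
closed = erode(dilated, CROSS)                  # closing: erosion of the dilation
binary = dilate((f > 0).astype(float), CROSS)   # binary case: values in {0, 1}
```

With values restricted to {0, 1}, the same two functions compute set union and intersection over the neighborhood, which is the binary special case.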

Morphology on color-coded images
In [2] the authors propose a framework for categorical morphology where pixels have a set of categories. Let $C = \{c_1, c_2, \ldots, c_n\}$ be a set of $n$ categories. The powerset of $C$, $P_C$, is the set of all subsets of $C$, including the empty set. An image $f$ is then defined as in (1) with $\Gamma = P_C$. In this framework the value of a pixel can be any element of $P_C$, e.g. $\{c_1\}$, $\{c_1, c_n\}$ or $\{\}$. Let $\subseteq$ be the usual subset relation; then the poset $(P_C, \subseteq)$ is a complete lattice where set intersection is the infimum and set union is the supremum. In [2] the authors propose to use structuring elements of the same form as $f$, that is $B \in F$. For the sake of comparison, we first consider the simpler case where $B$ is defined as in (2). Dilation and erosion can then be obtained from (3) and (4) as
$$\delta(f; B)(x) = \bigcup_{y \in B} f(x - y), \tag{9}$$
$$\varepsilon(f; B)(x) = \bigcap_{y \in B} f(x + y). \tag{10}$$
An example of these operations is shown in Fig. 1a.
Let $B \in F$. Under this scheme, an operation is only performed when one or more categories in the structuring element match a category in the image, and the result depends on the categories in both image and structuring element. Several variations of dilation and erosion are proposed in [2]; here we only consider the "transparent" operations. Let $D_f$ be the domain of $f$ and $D_B$ the domain of $B$. A specified reference point, $y_0 \in D_B$, is used to determine if $B$ matches $f$ and could for example be the center of a ball-shaped $B$. Dilation and erosion are then defined relative to this reference point; see [2] for the explicit definitions. An example of these operations is shown in Fig. 1b using a cross-shaped structuring element with $y_0$ in the center.
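For the simpler case where $B$ is a plain subset of $D$, set-valued pixels can be sketched with integer bitmasks, so that set union becomes bitwise OR and set intersection becomes bitwise AND. The encoding (bit $k-1$ set means $c_k$ is present), the function names, and the border handling (empty set outside the image for dilation, full set for erosion) are assumptions made for this illustration; the "transparent" operations of [2] are not covered.

```python
import numpy as np

def shifted(f, dy, dx, fill):
    """Return g with g[r, c] = f[r + dy, c + dx]; `fill` is used outside the image."""
    h, w = f.shape
    p = max(abs(dy), abs(dx))
    padded = np.pad(f, p, constant_values=fill)
    return padded[p + dy:p + dy + h, p + dx:p + dx + w]

def set_dilate(f, B):
    """Union over y in B of f(x - y), with sets encoded as bitmasks."""
    out = np.zeros_like(f)
    for dy, dx in B:
        out |= shifted(f, -dy, -dx, 0)
    return out

def set_erode(f, B, n):
    """Intersection over y in B of f(x + y), for n categories."""
    full = (1 << n) - 1                  # bitmask of the full set C
    out = np.full_like(f, full)
    for dy, dx in B:
        out &= shifted(f, dy, dx, full)
    return out

# Two categories: bit 0 is c_1, bit 1 is c_2, so 3 encodes {c_1, c_2}.
f = np.array([[3, 3, 1, 1, 2]], dtype=np.uint8)
B = [(0, -1), (0, 0), (0, 1)]            # horizontal line of length 3
dilated = set_dilate(f, B)
eroded = set_erode(f, B, n=2)
```

In this example the eroded image contains the value 0, the empty set, which shows that pixels can end up with no category under this representation.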

Morphology on label images
In [5] the authors propose a framework for categorical morphology where pixels have no category ($\bot$), a unique category or conflicting categories ($\top$). Let $C = \{c_1, c_2, \ldots, c_n\}$ be a set of $n$ categories and let $C^* = C \cup \{\bot, \top\}$. An image $f$ is then defined as in (1) with $\Gamma = C^*$. The poset $(C^*, \leq)$, where $\leq$ satisfies $[\forall c \in C](\bot \leq c \leq \top)$, is a complete lattice. Let $B$ be defined as in (2) and let $V(x) = \{f(x - y) \mid y \in B\}$. Dilation and erosion are then defined in terms of $V(x)$; see [5] for the explicit definitions. An example of these operations is shown in Fig. 1c. In the context of categorical distributions, where we have detailed information about label uncertainty, this approach is unsuitable due to the loss of information.
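The order $\bot \leq c \leq \top$, with distinct categories incomparable, fully determines supremum and infimum in this lattice. The sketch below computes them and, assuming dilation is the lattice supremum over $V(x)$ as in (3), applies it to a 1-D image. The function names and the string encoding of $\bot$ and $\top$ are illustrative only.

```python
BOT, TOP = "⊥", "⊤"

def sup(values):
    """Least upper bound in the lattice where BOT <= c <= TOP for every category c."""
    cats = {v for v in values if v != BOT}
    if not cats:
        return BOT                       # only "no category" was seen
    if TOP in cats or len(cats) > 1:
        return TOP                       # two different categories conflict
    return next(iter(cats))

def inf(values):
    """Greatest lower bound in the same lattice."""
    cats = {v for v in values if v != TOP}
    if not cats:
        return TOP
    if BOT in cats or len(cats) > 1:
        return BOT
    return next(iter(cats))

def dilate(f, B):
    """Supremum over V(x) = {f(x - y) | y in B} for a 1-D image given as a list."""
    n = len(f)
    return [sup(f[x - y] for y in B if 0 <= x - y < n) for x in range(n)]

f = ["a", BOT, "b"]
dilated = dilate(f, [-1, 0, 1])
```

The middle pixel, which has both categories in its neighborhood, becomes $\top$. The result records that a conflict exists but not how strongly each category is supported, which is the loss of information referred to above.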

N-ary morphology
In [6] the authors propose a framework for categorical morphology where pixels have a unique category. Let $C = \{c_1, c_2, \ldots, c_n\}$ be a set of $n$ categories. An image $f$ is then defined as in (1) with $\Gamma = C$. Instead of operating on all categories simultaneously, the authors propose to operate on a single category at a time. Let $B$ be defined as in (2) and let $i$ be the category we operate on. We use subscripts to distinguish single category operations from standard operations. Dilation $\delta_i$ and erosion $\varepsilon_i$ are then defined using a function $\theta$ that assigns a value in the case where there are different categories in the neighborhood of $x$. A natural choice for $\theta$, which is also suggested in [6], is to pick the value of the closest pixels. However, this does not help when the closest pixels have different values, which is a fundamental problem when pixel values cannot represent uncertainty. This is solved by ranking the categories and using the ranking to break ties. In general there is no obvious way of ranking categories based on the image alone, and as the number of multi-category interfaces increases it becomes more difficult to understand how one particular ranking influences the outcome.
Without ranking categories a priori, the above definition implies an ordering $\leq_i$, which is not a partial order, and thus $(C, \leq_i)$ is not a complete lattice.
In [7] the authors show that $\leq_i$ is a preorder, and formalize constraints for choosing $\theta$ such that dilation and erosion form an adjunction and their compositions are an opening and a closing. However, this does not help decide which category to choose when multiple categories are closest, as the constraints on $\theta$ do not yield a unique rule for breaking ties. An example of these operations is shown in Fig. 1d, where the question marks highlight two pixels that cannot be assigned a value without a method for breaking ties.

Fuzzy n-ary morphology
In [6] the authors also propose an extension of n-ary morphology to images of categorical distributions. Let $C = \{c_1, c_2, \ldots, c_{n+1}\}$ be a set of $n+1$ categories. The categorical distribution of $n+1$ categories is completely determined by a point in the $n$-simplex
$$\Delta^n = \Big\{\pi \in \mathbb{R}^{n+1} \;\Big|\; \sum_{k=1}^{n+1} \pi_k = 1,\ \pi_k \geq 0\Big\},$$
where $\pi_k$ is the probability of $c_k$. An image $f$ is then defined as in (1) with $\Gamma = \Delta^n$. Operations are again defined on a single category at a time. Let $B_r$ be a closed ball of radius $r$ centered at the origin and let $i$ be the category we operate on. Let $f_k(x) = f(x)_k$ be the probability of observing category $c_k$ in pixel $x$ and let $\omega_k(x) = 1 - f_k(x)$. Dilation is then defined as
$$\delta_i(f; B_r)_k(x) = \begin{cases} \delta(f_i; B_r)(x) & k = i \\ \dfrac{f_k(x)}{\omega_i(x)} \big(1 - \delta(f_i; B_r)(x)\big) & k \neq i \end{cases} \tag{17}$$
where the second case is taken to be 0 when $\omega_i(x) = 0$.
Two variations on erosion are proposed in [6], neither of which we find satisfactory. The first requires that we pick a ranking of all categories and does not yield idempotent opening and closing. The second assumes that the image is restricted to the edges of the simplex (at most two categories are non-zero in any pixel), and opening and closing are again not idempotent. We refer the reader to [6] for the motivation for these formulations and their properties.

Fuzzy Pareto morphology
In [3] the authors propose fuzzy Pareto morphology for color images. An RGB color image can be seen as a 3-dimensional fuzzy set, where the membership function for each set corresponds to the value of each color channel. This can equivalently be seen as a point in the half-open unit cube. An image $f$ is then defined as in (1) with $\Gamma = (0, 1]^d$. With each $a \in \Gamma$ we can associate a hyperrectangle defined by the vector from the origin to $a$. Fuzzy Pareto morphology is based on the idea of dominance. For $a, b \in \Gamma$, let $a \wedge b$ be the intersection of $a$ and $b$, that is, the hyperrectangle of the elementwise minimum. Let $A(a) = \prod_i a_i$ be the area function, yielding the area of the hyperrectangle of $a$. The degree to which $a$ dominates $b$ is then
$$\frac{A(a \wedge b)}{A(b)},$$
which measures how much of the hyperrectangle of $b$ is contained in the hyperrectangle of $a$.
Let $B(x) = \{x + y \mid y \in B\}$. Dilation and erosion, (21) and (22), are then defined over $B(x)$ using the degree of dominance; see [3] for the explicit definitions. Although not directly applicable to categorical distributions, fuzzy Pareto morphology could easily be extended by either restricting $\Gamma$ to $\{v \in (0, 1]^d \mid \sum_i v_i = 1\}$ or by considering it in the context of the Dirichlet distribution. However, (21) and (22) are not guaranteed to yield a unique solution, requiring us to come up with an arbitration rule.
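The degree of dominance described above can be sketched in a few lines of NumPy. The sketch assumes that the intersection of two origin-anchored hyperrectangles is the hyperrectangle of the elementwise minimum; the function names `area` and `dominance` are hypothetical, and the dilation and erosion of [3] are not reproduced.

```python
import numpy as np

def area(a):
    """Area (volume) of the hyperrectangle spanned by the origin and a."""
    return float(np.prod(a))

def dominance(a, b):
    """Fraction of the hyperrectangle of b that lies inside the hyperrectangle of a."""
    return area(np.minimum(a, b)) / area(b)

a = np.array([0.8, 0.5, 0.5])
b = np.array([0.4, 0.5, 1.0])
a_over_b = dominance(a, b)   # how much a dominates b
b_over_a = dominance(b, a)   # how much b dominates a
```

Here both degrees are equal although `a` and `b` differ, a small instance of how dominance alone may fail to single out a unique value.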

Morphology on the unit circle
In [10] the authors propose morphology on the unit circle for processing the hue space of color images. The idea is to use structuring elements from the hue space and define an ordering based on the shortest distance along the unit circle between values in the image and values in the structuring element.
Although not directly applicable to categorical images, it could be relevant to consider structuring elements that are themselves categorical distributions and base morphology on distance between distributions. Morphology on the unit circle is also considered in [4], where the authors propose three approaches: using difference operators (e.g. gradient), using grouped data, and using "labeled openings". It is the labeled openings that are most relevant in our context. Let $f$ be an image as defined in (1) with $\Gamma = [0, 2\pi]$. In a labeled opening the unit circle is partitioned into segments $S(\omega) = \{[0, \omega), [\omega, 2\omega), \ldots, [2\pi - \omega, 2\pi)\}$ and each segment $s \in S(\omega)$ gives rise to a binary image $f(x; s) = [f(x) \in s]$. A labeled opening is then the union of the binary openings of all segments, $\gamma_\omega(f) = \bigcup_{s \in S(\omega)} \gamma(f(\cdot\,; s))$, indicating for each pixel if it was opened. Categorical images have a natural partitioning based on the categories, leading to the set-based morphology in Section 2.3, where a labeled opening is the pixels that do not change when opened.
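A labeled opening as described above can be sketched as follows, assuming a 2-D hue image with values in $[0, 2\pi)$, a segment width $\omega$ that divides $2\pi$, and that pixels outside the image belong to no segment. All function names are illustrative and not taken from [4].

```python
import numpy as np

def shifted(m, dy, dx):
    """Return g with g[r, c] = m[r + dy, c + dx]; False outside the image."""
    h, w = m.shape
    p = max(abs(dy), abs(dx))
    padded = np.pad(m, p, constant_values=False)
    return padded[p + dy:p + dy + h, p + dx:p + dx + w]

def binary_open(m, B):
    """Binary opening: erosion by B followed by dilation by B."""
    eroded = np.all([shifted(m, dy, dx) for dy, dx in B], axis=0)
    return np.any([shifted(eroded, -dy, -dx) for dy, dx in B], axis=0)

def labeled_opening(f, omega, B):
    """Union over all hue segments of the opening of each segment's binary image."""
    k = int(round(2 * np.pi / omega))              # number of segments
    labels = np.floor(f / omega).astype(int) % k   # segment index per pixel
    opened = np.zeros(f.shape, dtype=bool)
    for s in range(k):
        opened |= binary_open(labels == s, B)
    return opened

f = np.full((6, 6), 3.0)      # background hue
f[1:4, 1:4] = 0.1             # a 3x3 region in another hue segment
f[5, 5] = 5.0                 # an isolated pixel in a third segment
SQUARE = [(0, 0), (0, 1), (1, 0), (1, 1)]
opened = labeled_opening(f, np.pi / 2, SQUARE)
```

The 3x3 region survives the opening of its segment, whereas the isolated pixel is too small for the structuring element and is therefore not marked as opened.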

Morphology on component graphs
In [8] the authors propose a framework for morphology on multi-valued images based on component graphs. Let an image be defined as in (1) with $\Gamma = \mathbb{R}^d$. The component graph is constructed from the connected components of the level sets of an image. For example, for $d = 2$ and $f(x) \in \{0, 1\}^2$ the level sets are $\{(0, 0), (0, 1), (1, 0), (1, 1)\}$. For each level set we obtain a set of connected components. Each connected component is a node in the component graph and the children of this node are the connected components that are contained in it. In order to construct the component graph it is required that $\Gamma$ allows a minimum, e.g. $\{0\}^d$, such that the graph will be connected. For categorical images this would require that we have a special background category as in Section 2.3 and Section 2.4. Further, it requires that each pixel can have multiple categories, otherwise no component will be nested inside another and the graph will be the root with all connected components as children.
Because the component graph directly exposes the spatial relationship between differently valued regions, it is possible to apply morphological filters, e.g. noise reduction, by pruning some nodes and reconstructing the image from the pruned component graph. Directly pruning the component graph can lead to ambiguity in the reconstruction when a node with two non-comparable parents is removed. The authors propose to solve this by building a component tree of the component graph, pruning the tree, then reconstructing the graph from the tree, and the image from the graph. In order to construct the component tree it is necessary to impose a total order on the nodes of the component graph, for example by using a shape measure on the connected components in the component graph.
Because the component graph only captures spatial relationships when connected components overlap for different level sets, some common post-processing operations, such as closing holes in segmentations, are challenging to perform.

Morphology on categorical distributions
In this section we propose two approaches for morphology on categorical distributions. In Section 3.1 we show how to operate on all categories simultaneously by operating on Dirichlet distributions. The limitations of this approach will then motivate single category operations that work directly on categorical distributions, which we will define in Section 3.2.

Morphology on Dirichlet distributions
Let $\mathbb{R}_+$ be the positive real line. We consider the Dirichlet distribution of order $n \geq 2$ with parameters $\alpha \in \mathbb{R}^n_+$, written as $\mathrm{Dir}(\alpha)$, as a distribution over the open $(n-1)$-simplex with density
$$p(\pi; \alpha) = \frac{1}{B(\alpha)} \prod_{k=1}^{n} \pi_k^{\alpha_k - 1},$$
where $B(\cdot)$ is the Beta function. A sample from the Dirichlet distribution of order $n$ can be seen as parameters of the categorical distribution with $n$ categories. Here, we only consider the expectation
$$\mathrm{E}[\pi]_k = \frac{\alpha_k}{\sum_{j=1}^{n} \alpha_j}, \tag{24}$$
which maps each Dirichlet distribution to a specific categorical distribution. Note that $0 < \alpha_k < \infty$ implies that we can only represent categorical distributions in the open simplex. In practice this is not a problem as we can get arbitrarily close to the boundary of the simplex.
Let $f_k$ be the $k$'th category in $f$. An image $f$ is defined as in (1) with $\Gamma = \mathbb{R}^n_+$; ordering each category independently by the usual ordering of the reals, we obtain a complete lattice. Dilation and erosion are then defined as their grayscale counterparts applied to each category independently,
$$\delta(f; B)_k = \delta(f_k; B), \qquad \varepsilon(f; B)_k = \varepsilon(f_k; B).$$
An example of these operations is provided in Fig. 2. Opening and closing are possibly the most interesting operations as they, respectively, decrease and increase uncertainty at the boundaries between overlapping categories. We can easily extend these operators to operate on a subset of categories $S$ by only updating the categories in $S$ and leaving $f_k$ unchanged for $k \notin S$. An example of these operations is provided in Fig. 3, where we operate on the green category.
Consider the gray/blue region surrounded by green that is indicated with a white ellipse in the left image of the second row. When we dilate the green category we would expect this region to become green in the probability image, but in the Dirichlet space these pixels already have the same green value as the green region, so they are unaffected by the dilation. We could partly solve this by carefully setting the $\alpha$ values, e.g. setting the pixels with only green to have very large green values. However, if our goal is to work on categorical distributions, this becomes too large a burden to be practical, and we now turn our attention to morphological operators that work directly on categorical distributions.
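The limitation described above can be reproduced with a small NumPy sketch. It assumes parameter images of shape (H, W, n), per-category flat grayscale dilation, and the expectation as the map to categorical distributions; `dilate_alpha`, `expectation` and the optional subset argument `S` are illustrative names, not notation from the original work.

```python
import numpy as np

def shifted(a, dy, dx, fill):
    """Return g with g[r, c] = a[r + dy, c + dx]; `fill` is used outside the image."""
    h, w = a.shape
    p = max(abs(dy), abs(dx))
    padded = np.pad(a, p, constant_values=fill)
    return padded[p + dy:p + dy + h, p + dx:p + dx + w]

def dilate_alpha(alpha, B, S=None):
    """Grayscale dilation of each category in S (all categories if S is None)."""
    out = alpha.copy()
    cats = range(alpha.shape[-1]) if S is None else S
    for k in cats:
        layers = [shifted(alpha[..., k], -dy, -dx, 0.0) for dy, dx in B]
        out[..., k] = np.max(layers, axis=0)
    return out

def expectation(alpha):
    """Expected categorical distribution of Dir(alpha) in each pixel."""
    return alpha / alpha.sum(axis=-1, keepdims=True)

# Categories (green, blue): two green pixels around one mixed pixel.
alpha = np.array([[[5.0, 0.1], [5.0, 5.0], [5.0, 0.1]]])
B = [(0, -1), (0, 0), (0, 1)]
green_only = dilate_alpha(alpha, B, S=[0])   # dilate the green category only
all_cats = dilate_alpha(alpha, B)            # dilate every category
```

Dilating only green leaves `alpha` unchanged because the mixed pixel already has the same green parameter as its neighbors, so its expected distribution stays at (0.5, 0.5). Dilating all categories instead spreads blue into the green pixels and increases their uncertainty.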

Morphology on categorical distributions
Recall from Section 2.6 that for a set of $n+1$ categories, $C = \{c_1, c_2, \ldots, c_{n+1}\}$, the categorical distribution over these categories is completely determined by a point in the $n$-simplex $\Delta^n = \{\pi \in \mathbb{R}^{n+1} \mid \sum_{k=1}^{n+1} \pi_k = 1,\ \pi_k \geq 0\}$, where $\pi_k$ is the probability of $c_k$. An image $f$ is then defined as in (1) with $\Gamma = \Delta^n$. Operations are again defined on a single category at a time. Let $B_r$ be a closed ball of radius $r$ centered at the origin and let $i$ be the category we operate on. Let $f_k(x) = f(x)_k$ be the probability of observing category $c_k$ in pixel $x$.

Fig. 3: Morphology on Dirichlet distributions using a subset of categories, in this case the green category {g}. See also Fig. 2.

Dilation
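For the dilated category $i$ the operation is the same as standard grayscale dilation. For the remaining categories the operation is a rescaling to ensure that the probabilities sum to one, while the conditional probabilities $\Pr(k \mid x, k \neq i)$ are unchanged,
$$\delta_i(f; B_r)_k(x) = \begin{cases} \delta(f_i; B_r)(x) & k = i \\ \dfrac{f_k(x)}{1 - f_i(x)} \big(1 - \delta(f_i; B_r)(x)\big) & k \neq i,\ \delta(f_i; B_r)(x) < 1 \\ 0 & k \neq i,\ \delta(f_i; B_r)(x) = 1 \end{cases}$$
If $\delta(f_i; B_r)(x) = 1$ then the conditional probabilities are not defined and we simply set the probabilities to $1 - \delta(f_i; B_r)(x) = 0$. This definition is the same as (17) and equivalent to the definition from [6].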
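The dilation described above can be sketched in NumPy as follows, assuming an image of shape (H, W, n) whose last axis sums to one and a structuring element given as (row, column) offsets rather than a ball $B_r$. The function names are illustrative.

```python
import numpy as np

def shifted(a, dy, dx, fill):
    """Return g with g[r, c] = a[r + dy, c + dx]; `fill` is used outside the image."""
    h, w = a.shape
    p = max(abs(dy), abs(dx))
    padded = np.pad(a, p, constant_values=fill)
    return padded[p + dy:p + dy + h, p + dx:p + dx + w]

def dilate_category(f, i, B):
    """Dilate category i and rescale the other categories so each pixel sums to one."""
    fi = f[..., i]
    d = np.max([shifted(fi, -dy, -dx, 0.0) for dy, dx in B], axis=0)
    rest = 1.0 - fi                       # mass of all other categories
    scale = np.zeros_like(fi)             # stays 0 where f_i = 1 (then d = 1 as well)
    ok = rest > 0
    scale[ok] = (1.0 - d[ok]) / rest[ok]  # keeps ratios between other categories
    out = f * scale[..., None]
    out[..., i] = d
    return out

# Categories (red, green, blue); we dilate green (index 1).
f = np.array([[[0.0, 1.0, 0.0],
               [0.2, 0.2, 0.6],
               [0.5, 0.0, 0.5]]])
B = [(0, -1), (0, 0), (0, 1)]
out = dilate_category(f, 1, B)
```

In the last pixel green rises from 0 to 0.2 while red and blue are scaled by the same factor, so their ratio is preserved. In the middle pixel the dilated green value is 1, so the remaining categories are set to 0.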

Erosion
Erosion is defined similarly to dilation, with the exception of the case $f_i(x) = 1$, where we cannot rescale the remaining categories because they all have zero probability. In that case the values of the remaining categories are given by a function $\theta$. The function $\theta$ must only depend on the neighborhood defined by $B_r$, and is subject to further consistency requirements when disregarding discretization issues. Since $\theta$ is only used in the case where $f_i(x) = 1$, it must only depend on the smallest possible neighborhood $B_{r^*}$ in which other categories are present. This amounts to picking the closest category as suggested for crisp categorical images in [6,7], although without the need for breaking ties since multiple closest categories are now handled by rescaling. In Appendix A we show that these definitions have the same properties as the definitions in [7] for operating on n-ary images.
An example of the proposed operations is provided in Fig. 4, where we operate on the green category. Compared to morphology on Dirichlet distributions using subsets in Fig. 3, the operations now work directly on the probabilities, making them much easier to understand and control.
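The rescaling part of erosion can be sketched in the same way as dilation. The sketch below is a simplification under stated assumptions: it handles pixels with $f_i(x) < 1$, and pixels with $f_i(x) = 1$ that are not eroded, but it does not implement the $\theta$ rule and raises an error where that rule would be needed. Function names and the offset-based structuring element are illustrative.

```python
import numpy as np

def shifted(a, dy, dx, fill):
    """Return g with g[r, c] = a[r + dy, c + dx]; `fill` is used outside the image."""
    h, w = a.shape
    p = max(abs(dy), abs(dx))
    padded = np.pad(a, p, constant_values=fill)
    return padded[p + dy:p + dy + h, p + dx:p + dx + w]

def erode_category(f, i, B):
    """Erode category i; the other categories are rescaled to fill the freed mass."""
    fi = f[..., i]
    e = np.min([shifted(fi, dy, dx, 1.0) for dy, dx in B], axis=0)
    rest = 1.0 - fi
    if np.any((rest == 0) & (e < 1)):
        # f_i(x) = 1 but the pixel is eroded: this needs the theta rule.
        raise NotImplementedError("theta rule is not part of this sketch")
    scale = np.ones_like(fi)
    ok = rest > 0
    scale[ok] = (1.0 - e[ok]) / rest[ok]
    out = f * scale[..., None]
    out[..., i] = e
    return out

# Categories (red, green, blue) with an isolated, uncertain green prediction.
f = np.array([[[0.5, 0.0, 0.5],
               [0.1, 0.8, 0.1],
               [0.5, 0.0, 0.5]]])
B = [(0, -1), (0, 0), (0, 1)]
eroded = erode_category(f, 1, B)
```

The isolated green prediction is removed and its mass is redistributed to red and blue in proportion to their original probabilities. A subsequent dilation of green would leave this image unchanged, so the opening removes the spike.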

Protected morphological operations
In [2] the authors introduce the concept of protected morphological operations, where a subset of categories is protected from being updated. Here we adapt the idea of protected morphological operations to categorical distributions and define protected dilation and erosion.
Let $L$ be a set of categories; we then write $\varepsilon_i(f; B_r \mid L)$ for an erosion of $i$ that protects $L$. Let $J = C \setminus (\{i\} \cup L)$ be the set of categories that are neither protected nor operated on. Let $f_K(x) = \sum_{k \in K} f_k(x)$ be the sum over a set of categories $K \subset C$. If $L$ is empty, or $[\forall x](f_L(x) = 0)$, protected operations reduce to their non-protected counterparts. Because $L$ can change the topology of the domain, distances have to be computed on a domain with holes. Computing exact Euclidean distance on a Euclidean domain with holes is nontrivial. Here we use the simplified fast marching method (FMM) from [11] with the update rule defined in [12], which results in a small approximation error. For brevity, when possible we leave out function application and write $f$ instead of $f(x)$ in the following.

Protected dilation
Let $\Omega_p = \{x \in D \mid f_L(x) \leq 1 - p\}$; this is the part of $f$ where it is possible to set $f_i = p$. Protected dilation is then defined in terms of $\Omega_p$.

Protected erosion
Protected erosion is defined similarly to protected dilation, with the added complication of normalization, and its definition has five cases. The first case ensures that all protected categories are unchanged. The second case ensures that a pixel $x$ is not updated unless there is a path, not blocked by $f_L$, to a pixel $y$ with $f_J(y) > 0$. The importance of this is easily seen by considering the case where $f_i$ varies in a region, but $f_i + f_L = 1$ in the region. The third case states that if there is such a path, then the pixel can be eroded. The fourth and fifth cases handle normalization. The $\theta$ function is defined in a similar manner as for non-protected erosion in (33). An example of these operations is provided in Fig. 5, where the red category is protected while we operate on the green category. Compared to the non-protected operations in Fig. 4, we can see that changes are restricted to the green and blue categories.

Examples
The first example illustrates how morphology on categorical distributions (Section 3.2) can be used to remove noisy predictions. The second example illustrates how protected morphology on categorical distributions (Section 4) can be used to model annotator bias.

Removing noisy predictions
Despite the impressive performance of neural networks for segmentation, the results are rarely perfect. Fig. 6 shows part of an electron microscopy image of the hippocampus, along with multi-class predictions and segmentations obtained from [13]. Notice the noisy mitochondria predictions resulting in misclassifications highlighted in Fig. 6c. We can remove these misclassifications by opening the mitochondria class before the final classification. Fig. 7 shows the opened predictions along with the final classifications. Notice in particular how the errors in circle 2 in Fig. 7c are fixed, such that the vesicle (teal) and the endoplasmic reticulum (yellow) are separated by cytosol. This would have been very difficult to achieve by working directly on the final segmentations. That the vesicle and endoplasmic reticulum are probably misclassified just illustrates that not all things should be fixed in post-processing.

Modeling annotator bias
Expert annotation is the gold standard in most clinical practice as well as for evaluating computer methods. However, annotation tasks are inherently subjective and prone to substantial inter-rater variation [14,15]. When investigating the influence of this variation on statistics and decisions it can be interesting to consider specific hypotheses regarding the variation. Consider the brain tumor annotation in Fig. 8. The annotation is derived from the QUBIQ challenge brain tumor dataset, where three annotators each annotated whole tumor, tumor core and active tumor. From this we obtain an image with four categories: background, edema, active core, inactive core. Although the annotators have a high level of agreement, there is still substantial variation in the extent of edema and in how much of the tumor core is active.
Using protected dilation we can for example hypothesize how the merged annotation would appear under the assumption that the tumor core is oversegmented but the active part is undersegmented. Fig. 9 shows the results where we first dilate the active core while protecting edema and background, then dilate edema while protecting background. This would allow us to easily investigate if statistical differences in a case-control study could be explained by biased annotations.

Fig. 9: What could the annotation look like if the core was oversegmented, but the active part undersegmented? Dilation of active core while protecting edema and background, followed by dilation of edema while protecting background using $B_1$, $B_2$, $B_3$.

Discussion & Conclusion
We have provided a thorough review of morphology on categorically valued images. Based on this we have defined morphology on Dirichlet distributions and morphology on categorical distributions. Inspired by [2] we have further defined protected morphology on categorical distributions. We have demonstrated the behavior of the proposed operations and shown how they can be used in real-world applications such as noise removal in multi-class predictions and modeling annotator bias.
The definition of dilation is straightforward and no obvious alternatives present themselves. This is not so for erosion. In our definition, erosion corresponds to conditioning on a change in probability of the eroded category. An equally valid approach would be to also condition on where this change came from. Instead of simply rescaling the categories with non-zero mass we could include information from the neighborhood. For example, when eroding $i$ we would fill the difference $f_i(x) - \varepsilon(f_i; B_r)(x)$ based on the pixels that contribute to the difference, that is, those with minimum mass for $i$. This would result in smoother boundaries, which could be a better representation of uncertainty. A downside is that categories can leak into each other, leading to undesirable results.
In this work we have focused on the basic morphological operations, dilation and erosion, and their compositions, closing and opening. A logical next step is to investigate more complex morphological operations, such as the morphological gradient, which may be used to investigate the spatial relationship between categories by measuring the change in one category as a function of change in another category.
We have defined protected versions of dilation and erosion. From these we could define opening and closing in the standard way. Alternatively, by changing which categories are protected for dilation and erosion we get more control over how a category is opened or closed. In [2] the authors explore similar ideas for so-called "tunneling" and "bridging" operations on their set-based morphology, which would be interesting to consider in the context of categorical distributions.
Our aim in this work was to bring morphological operations to probabilistic representations of categorical images. These representations can be considered as generative processes that can be sampled. Naive sampling will result in noisy and unrealistic samples. Combining the sampling process with the proposed morphological operations could be an easy approach to obtain smoother and more realistic samples.
In summary, morphology is an indispensable tool for post-processing segmentations. Extending morphology to categorical images and their probabilistic counterparts presents a particular problem since there is no inherent ordering of categories. In this paper, we have proposed to view categorical images as images of categorical distributions and defined morphological operations that are consistent with this view.

arXiv:2012.07315v2 [cs.CV] 9 Jan 2022

Fig. 1: (a) Morphology on color-coded images using (2) as structuring element. The bold boundary highlights the pixels with one or more categories and is only for illustration. (b) Morphology on color-coded images using the structuring element from [2]. This figure extends Figure 2 in [2] with the closing operation. Notice that $B(y_0) = \{\text{plus}, \text{square}\}$, meaning that $B$ will match any pixel with either plus or square. (c) Morphology on label images. (d) N-ary morphology. The coloring of the structuring element with plus indicates that operations are on the plus category. The question mark (?) indicates pixels that cannot be assigned a value without additional ordering of the categories. Notice that it is not possible to assign pixels "no-category" as in (a), (b) or (c).

Fig. 2: Morphology on Dirichlet distributions. The top-left image is an RGB representation of an image $f$ with three categories, where the colors red, green, and blue correspond to points very close to the vertices of $\Delta^2$ and the remaining colors are mixtures of these three colors. The first row is the Dirichlet distribution. The second row is the probability vectors obtained from (24). The third row is entropy of the probability vectors, and the fourth row is magnitude of the parameter vectors. We can see that dilation increases both entropy and magnitude, whereas erosion decreases magnitude and increases or decreases entropy depending on the local distribution.

Fig. 4: Morphology on categorical distributions. Here we operate on the green category g.

Fig. 5: Protected morphology on categorical distributions. The red category {r} is protected while we operate on the green category g.

Fig. 8: Inter-rater variation in annotation of brain tumors. White is background, blue edema, yellow inactive core and purple active core. Variation is indicated by color mixing. The black circles highlight two regions with large variation.