Trimming Prototypes of Handwritten Digit Images with Subset Infinite Relational Model
We propose a new probabilistic model for constructing efficient prototypes of handwritten digit images. We assume that all digit images are of the same size and obtain one color histogram for each pixel by counting the number of occurrences of each color over multiple images. For example, when we conduct the counting over the images of digit “5”, we obtain a set of histograms as a prototype of digit “5”. After normalizing each histogram to a probability distribution, we can classify an unknown digit image by multiplying, for each prototype, the probabilities of the colors appearing at each pixel of the unknown image. We regard this method as the baseline and compare it with a method using our probabilistic model, called the Multinomialized Subset Infinite Relational Model (MSIRM), which yields a prototype in which color histograms are clustered column-wise and row-wise. The number of clusters is adjusted flexibly with the Chinese restaurant process. Further, MSIRM can detect irrelevant columns and rows. An experiment comparing our method with the baseline and with a method using a Dirichlet process mixture showed that MSIRM could neatly detect irrelevant columns and rows in the peripheral part of digit images. That is, MSIRM could “trim” the irrelevant parts. By utilizing this trimming, we could speed up the classification of unknown images.
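The baseline described above (per-pixel color histograms, normalized and multiplied) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_prototype`, `log_likelihood`, and `classify` are hypothetical, the additive `smoothing` term is an assumption added to avoid zero probabilities, and the product of probabilities is computed as a sum of logarithms for numerical stability.

```python
import numpy as np

def build_prototype(images, n_colors, smoothing=1.0):
    """Build one normalized color histogram per pixel.

    images: integer array of shape (n_images, height, width) whose
    values are color indices in [0, n_colors).
    smoothing: additive pseudo-count (an assumption, not from the paper).
    """
    images = np.asarray(images)
    # one_hot[i, r, c, k] is 1 when image i has color k at pixel (r, c).
    one_hot = np.eye(n_colors)[images]
    counts = one_hot.sum(axis=0) + smoothing
    # Normalize each pixel's histogram to a probability distribution.
    return counts / counts.sum(axis=-1, keepdims=True)

def log_likelihood(image, prototype):
    """Sum of log-probabilities of the observed color at every pixel."""
    image = np.asarray(image)
    rows, cols = np.indices(image.shape)
    return np.log(prototype[rows, cols, image]).sum()

def classify(image, prototypes):
    """Return the label whose prototype gives the highest log-likelihood."""
    return max(prototypes, key=lambda label: log_likelihood(image, prototypes[label]))
```

With one prototype per digit stored in a dictionary, `classify(image, prototypes)` returns the most probable digit label. The cost of `log_likelihood` grows with the number of pixels visited, which is why skipping rows and columns judged irrelevant can speed up classification.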
Keywords: Bayesian nonparametrics, Prototype, Classification