An Improved Mixture of Experts Model: Divide and Conquer Using Random Prototypes
The Mixture of Experts (ME) is one of the most popular ensemble methods used in Pattern Recognition and Machine Learning. This algorithm stochastically partitions the input space of a problem into a number of subspaces, with each expert becoming specialized on one subspace. To manage this process, the ME uses a special expert called the gating network, which is trained together with the other experts. In this chapter, we propose a modified version of the ME algorithm which first partitions the original problem into centralized regions and then uses a simple distance-based gating function to specialize the expert networks. Each expert contributes to classifying an input sample according to the distance between the input and a prototype embedded by the expert. The Hierarchical Mixture of Experts (HME) is a tree-structured architecture that can be considered a natural extension of the ME model. The training and testing strategies of the standard HME model are also modified, based on the same insight applied to the standard ME. In both cases, the proposed approach does not require training the gating networks, as they are implemented with simple distance-based rules. As a result, the overall time required to train a modified ME/HME system is considerably lower. Moreover, centralizing the input subspaces and adopting a random strategy for selecting prototypes increases both the individual accuracy and the diversity of ME/HME modules, which in turn increases the accuracy of the overall ensemble. Experimental results on a binary toy problem and on selected datasets from the UCI machine learning repository show the robustness of the proposed methods compared to the standard ME/HME models.
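To make the distance-based gating concrete, the sketch below implements one plausible reading of it in Python/NumPy: each expert is paired with a prototype drawn at random from the training set, and the gate weights each expert by its prototype's proximity to the input. The softmax over negative distances, the `temperature` parameter, and the linear placeholder experts are illustrative assumptions, not the chapter's exact formulation.

```python
import numpy as np

def distance_gate(x, prototypes, temperature=1.0):
    """Weight each expert by the proximity of input x to its prototype.

    A minimal sketch of a distance-based gating rule: experts whose
    prototype lies closer to x receive a larger mixing weight. The
    softmax-over-negative-distances form is an assumption.
    """
    dists = np.linalg.norm(prototypes - x, axis=1)  # one distance per expert
    scores = -dists / temperature                   # closer prototype -> higher score
    scores -= scores.max()                          # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()                  # convex mixing weights

def mixture_predict(x, experts, prototypes):
    """Combine expert outputs, weighted by the distance-based gate."""
    g = distance_gate(x, prototypes)
    outputs = np.stack([expert(x) for expert in experts])
    return g @ outputs                              # weighted sum of expert outputs

# Toy usage: prototypes drawn at random from the training set, mirroring the
# random-prototype strategy; the linear "experts" are placeholders.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 2))
prototypes = X_train[rng.choice(len(X_train), size=3, replace=False)]
experts = [lambda x, W=rng.normal(size=2): float(x @ W) for _ in range(3)]
print(mixture_predict(rng.normal(size=2), experts, prototypes))
```

Because the gate is a fixed rule over prototype distances, it has no trainable parameters, which is the source of the training-time savings described above.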
Keywords: Input Space, Ensemble Member, Expert Model, Gating Function, Expert Network
- 5. Jacobs, R., Jordan, M.I., Barto, A.: Task decomposition through competition in a modular connectionist architecture: The what and where vision tasks. Technical Report 90-44, Univ. of Massachusetts, Amherst (1991)
- 9. Tang, B., Heywood, M., Shepherd, M.: Input partitioning to mixture of experts. In: Proc. of the 2002 Int. Joint Conf. on Neural Networks, Honolulu, HI, pp. 227–232. IEEE Comp. Society, Los Alamitos (2002)
- 10. Wan, E., Bone, D.: Interpolating earth-science data using RBF networks and mixtures of experts. In: Mozer, M., Jordan, M.I., Petsche, T. (eds.) Advances in Neural Inf. Proc. Syst., vol. 9, pp. 988–994. MIT Press, Cambridge (1997)
- 11. Waterhouse, S., Cook, G.: Ensemble methods for phoneme classification. In: Mozer, M., Jordan, M.I., Petsche, T. (eds.) Advances in Neural Inf. Proc. Syst., vol. 9, pp. 800–806. MIT Press, Cambridge (1997)
- 12. UCI Repository of Machine Learning Databases, Dept. of Inf. and Comp. Sci., Univ. of California, Irvine, http://archive.ics.uci.edu/ml/