We have discussed the maximum entropy discrimination framework for optimizing the discriminative power of generative models. It maximizes accuracy on the given task through large margins, just as a support vector machine optimizes the margin of linear decision boundaries in Hilbert space. The MED framework is straightforward to solve for exponential family distributions and, in the special case of Gaussian means, subsumes the support vector machine. We also noted other useful models it can handle, such as arbitrary-covariance Gaussians for classification, multinomials for classification, and general exponential family distributions. Nevertheless, despite the generality of the exponential family, we are still restricting the potential generative distributions in MED to only a subclass of the popular models in the literature. To harness the power of generative modeling, we must go beyond such simple models and consider mixtures or latent variable models. Such interesting extensions include sigmoid belief networks, latent Bayesian networks, and hidden Markov models, which play a critical role in many applied domains. These models are potential clients for our MED formalism and could benefit greatly from a discriminative estimation technique instead of their traditional maximum-likelihood incarnations.
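As a concrete illustration of the large-margin objective that MED recovers in the Gaussian-mean special case, the following sketch (a toy example of our own, not drawn from the text) fits a linear support vector machine by subgradient descent on the regularized hinge loss; the data, learning rate, and iteration count are all illustrative assumptions.

```python
import numpy as np

# Toy linearly separable data: two Gaussian clusters, labels -1 and +1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, size=(20, 2)),
               rng.normal(+2.0, 0.5, size=(20, 2))])
y = np.array([-1] * 20 + [+1] * 20)

# Minimize (1/2)||w||^2 + C * sum_i max(0, 1 - y_i (w.x_i + b))
# by plain subgradient descent -- the SVM primal that MED subsumes
# when the discriminant is parameterized by Gaussian means.
w, b, C, lr = np.zeros(2), 0.0, 1.0, 0.01
for _ in range(2000):
    margins = y * (X @ w + b)
    viol = margins < 1            # points violating the unit margin
    grad_w = w - C * (y[viol, None] * X[viol]).sum(axis=0)
    grad_b = -C * y[viol].sum()
    w -= lr * grad_w
    b -= lr * grad_b

pred = np.sign(X @ w + b)
accuracy = (pred == y).mean()
print(accuracy)
```

On this well-separated toy set the learned hyperplane classifies every point correctly; the point of the sketch is only the objective being optimized, not the (deliberately naive) solver.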
Keywords: Support Vector Machine; Lagrange Multiplier; Mixture Model; Convex Hull; Hidden Markov Model