Cost-sensitive learning with conditional Markov networks

Sen, Prithviraj; Getoor, Lise

doi:10.1007/s10618-008-0090-5

Published: 07 March 2008

Data Mining and Knowledge Discovery, volume 17, pages 136–163 (2008)

Abstract

There has been a recent, growing interest in classification and link prediction in structured domains. Methods such as conditional random fields and relational Markov networks support flexible mechanisms for modeling correlations due to the link structure. In addition, in many structured domains, there is an interesting structure in the risk or cost function associated with different misclassifications. There is a rich tradition of cost-sensitive learning applied to unstructured (IID) data. Here we propose a general framework which can capture correlations in the link structure and handle structured cost functions. We present two new cost-sensitive structured classifiers based on maximum entropy principles. The first determines the cost-sensitive classification by minimizing the expected cost of misclassification. The second directly determines the cost-sensitive classification without going through a probability estimation step. We contrast these approaches with an approach which employs a standard 0/1-loss structured classifier to estimate class conditional probabilities followed by minimization of the expected cost of misclassification and with a cost-sensitive IID classifier that does not utilize the correlations present in the link structure. We demonstrate the utility of our cost-sensitive structured classifiers with experiments on both synthetic and real-world data.
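
The step of "minimizing the expected cost of misclassification" can be illustrated for a single, unstructured (IID) instance. The following is a minimal sketch under stated assumptions, not the authors' structured classifier: it assumes class probabilities have already been estimated by some model, and the function name `min_expected_cost_label`, the argument names, and the numbers are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def min_expected_cost_label(
    probs: Sequence[float],
    cost: Sequence[Sequence[float]],
) -> Tuple[int, List[float]]:
    """Pick the label with the lowest expected misclassification cost.

    probs[j]   : estimated probability that the true class is j
                 (assumed to come from some probabilistic classifier).
    cost[i][j] : cost of predicting class i when the true class is j
                 (hypothetical cost matrix, supplied by the user).
    """
    # Expected cost of predicting i is sum_j probs[j] * cost[i][j].
    expected = [sum(p * c for p, c in zip(probs, row)) for row in cost]
    # Choose the prediction whose expected cost is smallest.
    best = min(range(len(expected)), key=expected.__getitem__)
    return best, expected


if __name__ == "__main__":
    # Illustrative numbers only: class 0 is more probable, but missing
    # class 1 is ten times as costly as a false alarm.
    probs = [0.8, 0.2]
    cost = [[0.0, 10.0],
            [1.0, 0.0]]
    label, expected = min_expected_cost_label(probs, cost)
    print("expected costs:", expected)
    print("cost-sensitive label:", label)
```

With these illustrative numbers the most probable class is 0, yet the rule selects label 1 because wrongly predicting 0 carries the larger cost. Under a symmetric 0/1 cost matrix the same rule reduces to picking the most probable class.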

Author information

Authors and Affiliations

Department of Computer Science, University of Maryland, College Park, MD, 20783, USA
Prithviraj Sen & Lise Getoor

Corresponding author

Correspondence to Prithviraj Sen.

Additional information

Responsible editor: Bianca Zadrozny.

About this article

Cite this article

Sen, P., Getoor, L. Cost-sensitive learning with conditional Markov networks. Data Min Knowl Disc 17, 136–163 (2008). https://doi.org/10.1007/s10618-008-0090-5

Received: 20 November 2006
Accepted: 17 January 2008
Published: 07 March 2008
Issue Date: October 2008
DOI: https://doi.org/10.1007/s10618-008-0090-5
