Inference Algorithms for Pattern-Based CRFs on Sequence Data

Algorithmica

Abstract

We consider conditional random fields (CRFs) with pattern-based potentials defined on a chain. In this model the energy of a string (labeling) \(x_1\ldots x_n\) is the sum of terms over intervals \([i,j]\), where each term is non-zero only if the substring \(x_i\ldots x_j\) equals a prespecified pattern w. Such CRFs can be naturally applied to many sequence tagging problems. We present efficient algorithms for the three standard inference tasks in a CRF, namely computing (i) the partition function, (ii) marginals, and (iii) the MAP. Their complexities are respectively \(O(\textit{nL})\), \(O(\textit{nL} \ell _{\max })\) and \(O(\textit{nL} \min \{|D|,\log (\ell _{\max }\!+\!1)\})\), where L is the combined length of input patterns, \(\ell _{\max }\) is the maximum length of a pattern, and D is the input alphabet. This improves on the previous algorithms of Ye et al. (NIPS, 2009), whose complexities are respectively \(O(\textit{nL} |D|)\), \(O\left( n |\varGamma | L^2 \ell _{\max }^2\right) \) and \(O(\textit{nL} |D|)\), where \(|\varGamma |\) is the number of input patterns. In addition, we give an efficient algorithm for sampling, and revisit the case of MAP with non-positive weights.
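To make the model concrete, the following is a minimal brute-force sketch of the energy and the partition function. It is not one of the algorithms of the paper: it enumerates all \(|D|^n\) labelings and is useful only as a reference for tiny inputs. Several details are assumptions made for illustration: the function names `energy` and `brute_force_partition` are hypothetical, each pattern carries a single position-independent cost (the abstract allows the term to depend on the interval), and the partition function is taken to be the sum of \(\exp(-E(x))\) over all labelings.

```python
from itertools import product
from math import exp


def energy(x, patterns):
    """Sum the cost of every occurrence of every pattern in the string x.

    `patterns` maps a pattern string w to a cost; position-independent
    costs are an assumption of this sketch.
    """
    total = 0.0
    for w, cost in patterns.items():
        # Slide the pattern over every interval of matching length.
        for i in range(len(x) - len(w) + 1):
            if x[i:i + len(w)] == w:
                total += cost
    return total


def brute_force_partition(n, alphabet, patterns):
    """Sum exp(-E(x)) over all labelings of length n (exponential time)."""
    return sum(
        exp(-energy("".join(x), patterns))
        for x in product(alphabet, repeat=n)
    )
```

Overlapping occurrences are counted separately, so with the single pattern `"aa"` the string `"aaa"` contributes the cost twice. A sketch like this can serve as a correctness check for a faster implementation on short strings.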

Notes

  1. Some of the bounds stated in [11] are actually weaker. However, it is not difficult to show that their algorithms can be implemented in the times stated above, using our Lemma 1.

  2. Note that we still claim complexity \(O(n{P})\) where \({P}\) is the number of distinct non-empty prefixes of words in the original set \(\varGamma \). Indeed, we can assume w.l.o.g. that each letter in D occurs in at least one word \(w\!\in \!\varGamma \). (If not, then we can “merge” non-occurring letters into a single letter and add this letter to \(\varGamma \); clearly, any instance over the original pair \((D,\varGamma )\) can be equivalently formulated as an instance over the new pair. The transformation increases \({P}\) only by 1.) The assumption implies that \(|D|\le {P}\). Adding D to \(\varGamma \) increases \({P}\) by at most \({P}\), and thus does not affect the bound \(O(n{P})\).

  3. The assumption \(|I(\varGamma )|\sim k\) will hold if e.g. we have \(|\varGamma _\delta |\ll k\). The assumption \(|\widehat{\varGamma }|\sim k\overline{\ell }\) means, roughly speaking, that words \(w_1,\ldots ,w_k\) rarely have common prefixes. It will hold, for example, if all words \(sw_is\) have the same length \(\overline{\ell }\) and their prefixes of length \(\overline{\ell }/2\) are all unique.
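The quantity \({P}\) from Note 2 can be illustrated with a short sketch. The function name `count_prefixes` and the example words are hypothetical; the code simply collects every non-empty prefix in a set, which is quadratic in the word length (a trie would avoid this) but is enough to check the counting argument on small inputs.

```python
def count_prefixes(words):
    """Return the number of distinct non-empty prefixes of the given words."""
    prefixes = set()
    for w in words:
        # w[:k] for k = 1..len(w) enumerates all non-empty prefixes of w.
        for k in range(1, len(w) + 1):
            prefixes.add(w[:k])
    return len(prefixes)


# Hypothetical example: every letter of D = {a, b, c} occurs in some word.
gamma = ["ab", "abc", "ba"]
alphabet = ["a", "b", "c"]

p_before = count_prefixes(gamma)             # prefixes: a, ab, abc, b, ba
p_after = count_prefixes(gamma + alphabet)   # only "c" is new
```

In this example the alphabet has 3 letters and the original set has 5 distinct prefixes, so adding the single letters as patterns can add at most 3 new prefixes, consistent with the argument in Note 2.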

References

  1. Berkman, O., Vishkin, U.: Recursive star-tree parallel data structure. SIAM J. Comput. 22(2), 221–242 (1993)

  2. Bystroff, C., Thorsson, V., Baker, D.: HMMSTR: a hidden Markov model for local sequence-structure correlation in proteins. J. Mol. Biol. 301, 173–190 (2000)

  3. Komodakis, N., Paragios, N.: Beyond pairwise energies: efficient optimization for higher-order MRFs. In: CVPR (2009)

  4. Lafferty, J., McCallum, A., Pereira, F.: Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: ICML (2001)

  5. Nguyen, V.C., Ye, N., Lee, W.S., Chieu, H.L.: Semi-Markov conditional random field with high-order features. In: ICML 2011 Structured Sparsity: Learning and Inference Workshop (2011)

  6. Qian, X., Jiang, X., Zhang, Q., Huang, X., Wu, L.: Sparse higher order conditional random fields for improved sequence labeling. In: ICML (2009)

  7. Rother, C., Kohli, P., Feng, W., Jia, J.: Minimizing sparse higher order energy functions of discrete variables. In: CVPR (2009)

  8. Sarawagi, S., Cohen, W.: Semi-Markov conditional random fields for information extraction. In: NIPS (2004)

  9. Takhanov, R., Kolmogorov, V.: Inference algorithms for pattern-based CRFs on sequence data. In: ICML (2013)

  10. Vose, M.D.: A linear algorithm for generating random numbers with a given distribution. IEEE Trans. Softw. Eng. 17(9), 972–975 (1991)

  11. Ye, N., Lee, W.S., Chieu, H.L., Wu, D.: Conditional random fields with high-order features for sequence labeling. In: NIPS (2009)

Acknowledgments

The authors thank Herbert Edelsbrunner for helpful discussions. This work has been partially supported by the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement no. 616160.

Author information

Correspondence to Vladimir Kolmogorov.

Additional information

A preliminary version of this paper appeared in Proceedings of the 30th International Conference on Machine Learning (ICML), 2013 [9]. This expanded version contains proofs that were missing in [9], and also revisits the case of MAP with non-positive weights (see Sect. 8).

About this article

Cite this article

Kolmogorov, V., Takhanov, R. Inference Algorithms for Pattern-Based CRFs on Sequence Data. Algorithmica 76, 17–46 (2016). https://doi.org/10.1007/s00453-015-0017-7

  • DOI: https://doi.org/10.1007/s00453-015-0017-7
