Bi-directional Joint Inference for Entity Resolution and Segmentation Using Imperatively-Defined Factor Graphs

  • Sameer Singh
  • Karl Schultz
  • Andrew McCallum
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5782)


There has been growing interest in using joint inference across multiple subtasks as a mechanism for avoiding the cascading accumulation of errors in traditional pipelines. Several recent papers demonstrate joint inference between the segmentation of entity mentions and their de-duplication; however, each has at least one of the following weaknesses: inference information flows in only one direction, the number of uncertain hypotheses is severely limited, or the subtasks are only loosely coupled. This paper presents a highly coupled, bi-directional approach to joint inference based on efficient Markov chain Monte Carlo sampling in a relational conditional random field. The model is specified with our new probabilistic programming language, which leverages imperative constructs to define factor graph structure and operation. Experimental results show that our approach provides a dramatic reduction in error while also running faster than the previous state-of-the-art system.
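As a rough illustration of the idea summarized above, the following sketch runs Metropolis-Hastings sampling over a tiny factor graph whose factors are plain Python functions. It is not the authors' model or their probabilistic programming language: the `FactorGraph` class, the variable names (`label_a`, `label_b`, `same_entity`), and all weights are hypothetical, chosen only to show how a segmentation-style label and a de-duplication decision can influence each other through a shared factor.

```python
import math
import random


class FactorGraph:
    """Toy factor graph whose factors are ordinary Python functions."""

    def __init__(self):
        self.values = {}    # variable name -> current value
        self.domains = {}   # variable name -> list of allowed values
        self.touching = {}  # variable name -> factors whose scope includes it

    def add_variable(self, name, domain, initial):
        self.values[name] = initial
        self.domains[name] = list(domain)
        self.touching[name] = []

    def add_factor(self, scope, score_fn):
        factor = (tuple(scope), score_fn)
        for name in factor[0]:
            self.touching[name].append(factor)

    def local_score(self, name):
        # Sum of log-scores of only the factors that neighbor `name`.
        return sum(fn(*(self.values[v] for v in scope))
                   for scope, fn in self.touching[name])


def sample(graph, steps, rng):
    """Metropolis-Hastings with a symmetric single-variable proposal."""
    names = sorted(graph.values)
    counts = {n: {} for n in names}
    for _ in range(steps):
        name = rng.choice(names)
        old = graph.values[name]
        before = graph.local_score(name)
        graph.values[name] = rng.choice(graph.domains[name])
        delta = graph.local_score(name) - before
        # Accept with probability min(1, exp(delta)); otherwise revert.
        if rng.random() >= math.exp(min(0.0, delta)):
            graph.values[name] = old
        for n in names:
            v = graph.values[n]
            counts[n][v] = counts[n].get(v, 0) + 1
    return {n: {v: c / steps for v, c in counts[n].items()} for n in names}


def build_toy_model():
    # Hypothetical model: two mentions, one label each, one coreference bit.
    g = FactorGraph()
    g.add_variable("label_a", ["author", "title"], "title")
    g.add_variable("label_b", ["author", "title"], "title")
    g.add_variable("same_entity", [True, False], False)
    # Local evidence: mention A looks like an author field.
    g.add_factor(["label_a"], lambda a: 2.0 if a == "author" else 0.0)
    # Local evidence: the two mentions look like duplicates.
    g.add_factor(["same_entity"], lambda s: 1.0 if s else 0.0)
    # Coupling: coreferent mentions should carry the same label.
    g.add_factor(["label_a", "label_b", "same_entity"],
                 lambda a, b, s: (2.0 if a == b else -2.0) if s else 0.0)
    return g


def run_demo(seed=0, steps=20000):
    return sample(build_toy_model(), steps, random.Random(seed))


if __name__ == "__main__":
    for name, dist in run_demo().items():
        print(name, dist)
```

In this toy setup `label_b` has no local evidence of its own, yet its marginal for `"author"` ends up well above one half (about 0.83 by exact enumeration of the eight joint states), because the coupling factor passes information from `label_a` through `same_entity`. The same factor also lets label agreement raise or lower belief in `same_entity`, which is the bi-directional flow the abstract refers to. Each proposal rescores only the factors touching the changed variable, which is what keeps a sampling step cheap.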


Keywords (machine-generated, not supplied by the authors): Markov Chain Monte Carlo, Joint Model, Factor Graph, Entity Resolution, Markov Logic Network



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

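  • Sameer Singh, Department of Computer Science, University of Massachusetts, Amherst, USA
  • Karl Schultz, Department of Computer Science, University of Massachusetts, Amherst, USA
  • Andrew McCallum, Department of Computer Science, University of Massachusetts, Amherst, USA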
