Entity Network Extraction Based on Association Finding and Relation Extraction

  • Ridho Reinanda
  • Marta Utama
  • Fridus Steijlen
  • Maarten de Rijke
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8092)

Abstract

One of the core aims of semantic search is to directly present users with information instead of lists of documents. Various entity-oriented tasks have been or are being considered, including entity search and related entity finding. In the context of digital libraries for computational humanities, we consider another task, network extraction: given an input entity and a document collection, extract related entities from the collection and present them as a network. We develop a combined approach for entity network extraction that consists of a co-occurrence-based approach to association finding and a machine learning-based approach to relation extraction. We evaluate our approach by comparing its results against a ground truth obtained using a pooling method.
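
To make the network extraction task concrete, the following is a minimal sketch of the co-occurrence side only. It assumes that entities have already been recognized in each document, and it uses a raw document-level co-occurrence count as the association score. That score is an illustrative stand-in, not necessarily the measure used in the paper. The function name `cooccurrence_network`, the `top_k` parameter, and the toy data are hypothetical, and the relation extraction component is not shown.

```python
from collections import Counter


def cooccurrence_network(query, docs, top_k=3):
    """Return (query, entity, count) edges for the entities that share
    the most documents with `query`.

    `docs` is a list of documents, each given as a list of entity names
    (assumed to come from an upstream entity recognizer).
    """
    counts = Counter()
    for entities in docs:
        unique = set(entities)          # count each entity once per document
        if query in unique:
            counts.update(unique - {query})
    # Sort by count (descending), then by name, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(query, other, n) for other, n in ranked[:top_k]]


# Toy collection: each inner list holds the entities found in one document.
docs = [
    ["A", "B", "C"],
    ["A", "B"],
    ["B", "C"],
    ["A", "D"],
]
edges = cooccurrence_network("A", docs)
print(edges)
```

Each returned triple can be read as a weighted edge from the input entity to a related entity, which is the network form the task asks for. A raw count favors frequent entities, so a practical system would likely normalize it or use a statistical association measure instead.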

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Ridho Reinanda (1, 2)
  • Marta Utama (1)
  • Fridus Steijlen (2)
  • Maarten de Rijke (1)

  1. ISLA, University of Amsterdam, The Netherlands
  2. Royal Netherlands Institute of Southeast Asian and Caribbean Studies, The Netherlands
