Distributed Human Computation Framework for Linked Data Co-reference Resolution

  • Yang Yang
  • Priyanka Singh
  • Jiadi Yao
  • Ching-man Au Yeung
  • Amir Zareian
  • Xiaowei Wang
  • Zhonglun Cai
  • Manuel Salvadores
  • Nicholas Gibbins
  • Wendy Hall
  • Nigel Shadbolt
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6643)

Abstract

Distributed Human Computation (DHC) solves computational problems by incorporating the collaborative effort of a large number of humans. It also offers a way to tackle AI-complete problems such as natural language processing. The Semantic Web, with its roots in AI, has many research problems that are considered AI-complete. For example, co-reference resolution, which involves determining whether different URIs refer to the same entity, is a significant hurdle to overcome in the realisation of large-scale Semantic Web applications. In this paper, we propose a framework for building a DHC system on top of the Linked Data Cloud to solve various computational problems. To demonstrate the concept, we focus on handling co-reference resolution when integrating distributed datasets. Traditionally, machine-learning algorithms are used for this task, but they are often computationally expensive, error-prone and do not scale. We designed a DHC system named iamResearcher, which solves the author identity co-reference problem for scientific publications when integrating distributed bibliographic datasets. In our system, we aggregated 6 million bibliographic records from various publication repositories. Users can sign up to the system to audit and align their own publications, thus solving the co-reference problem in a distributed manner. The aggregated results are dereferenceable in the Open Linked Data Cloud.
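
As an illustration of the co-reference idea described in the abstract, the following minimal sketch groups URIs that users have confirmed as referring to the same author. It is not the iamResearcher implementation, which the abstract does not detail: the `CoreferenceIndex` class, its method names, and the `example.org` URIs are all hypothetical, and the union-find structure is simply one common way to maintain equivalence classes.

```python
from collections import defaultdict


class CoreferenceIndex:
    """Hypothetical helper: a union-find structure over URIs.

    Each resulting set is treated as one presumed real-world entity.
    """

    def __init__(self):
        self._parent = {}

    def _find(self, uri):
        # Register unseen URIs as their own singleton set.
        self._parent.setdefault(uri, uri)
        root = uri
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every node on the path at the root.
        while self._parent[uri] != root:
            self._parent[uri], uri = root, self._parent[uri]
        return root

    def assert_same(self, uri_a, uri_b):
        """Record a user's confirmation that two URIs denote one entity."""
        root_a, root_b = self._find(uri_a), self._find(uri_b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_entity(self, uri_a, uri_b):
        """Return True if the URIs have been linked, directly or transitively."""
        return self._find(uri_a) == self._find(uri_b)

    def bundles(self):
        """Return the current equivalence classes as a list of sets."""
        groups = defaultdict(set)
        for uri in list(self._parent):
            groups[self._find(uri)].add(uri)
        return list(groups.values())


# Hypothetical author URIs from three repositories (assumed for illustration).
A = "http://example.org/repoA/author/42"
B = "http://example.org/repoB/person/yang-y"
C = "http://example.org/repoC/creator/981"
D = "http://example.org/repoA/author/77"

index = CoreferenceIndex()
index.assert_same(A, B)  # a user claims both records as their own
index.assert_same(B, C)
print(index.same_entity(A, C))  # linked transitively through B
print(index.same_entity(A, D))  # never linked by any user
```

Under this assumed design, each user's alignment contributes only a few pairwise assertions, and the equivalence classes emerge from combining them, which mirrors the distributed manner of resolution the abstract describes.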

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Yang Yang (1)
  • Priyanka Singh (1)
  • Jiadi Yao (1)
  • Ching-man Au Yeung (2)
  • Amir Zareian (1)
  • Xiaowei Wang (1)
  • Zhonglun Cai (1)
  • Manuel Salvadores (1)
  • Nicholas Gibbins (1)
  • Wendy Hall (1)
  • Nigel Shadbolt (1)

  1. Intelligence, Agents, Multimedia (IAM) Group, School of Electronics and Computer Science, University of Southampton, UK
  2. NTT Communication Science Laboratories, Kyoto, Japan
