
Automatic discovery of person-related named-entity in news articles based on verb analysis

Multimedia Tools and Applications

Abstract

The verb is the most important word in a sentence, as it asserts an action, event, or feeling about the subject and object discussed in the sentence. In news articles, it is observable that at least one verb is always attached to each person mentioned. We therefore hypothesize that there exist verbs that specifically describe human conduct within a news article. In this paper, we propose an approach that automatically identifies named entities (NEs) that perform human activities. More specifically, our approach attempts to identify person-related NEs in general, and the predefined “person name” type in particular, by studying the nature of verbs associated with human activity via TreeTagger, the Stanford packages, and WordNet. The experimental results show that it is viable to use verbs to identify the “person name” entity type. In addition, our empirical study shows that the approach is applicable to short articles. Another significant contribution of our approach is that it requires neither a training data set nor anaphora resolution.
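The core idea of the abstract, that verbs describing human conduct can flag person-related entities, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the verb set `HUMAN_VERBS` is hand-picked rather than derived with TreeTagger, the Stanford packages, or WordNet, the (subject, verb lemma) pairs are hand-written rather than produced by a dependency parser, and the function name `person_candidates` is hypothetical. It is not the authors' method.

```python
from collections import Counter

# Hypothetical, hand-picked verb lemmas that usually take a human subject.
# This small set is only a stand-in for verbs derived from lexical resources.
HUMAN_VERBS = {"say", "tell", "announce", "believe", "add", "explain"}


def person_candidates(subject_verb_pairs, human_verbs=HUMAN_VERBS):
    """Count, per capitalised subject, how many of its verbs are human-activity verbs."""
    counts = Counter()
    for subject, verb_lemma in subject_verb_pairs:
        # Keep only capitalised subjects (a crude proper-noun test)
        # whose governing verb is in the human-activity set.
        if subject[:1].isupper() and verb_lemma in human_verbs:
            counts[subject] += 1
    return counts


# Hand-written (subject, verb lemma) pairs, of the kind a dependency
# parser's nominal-subject relations might yield for a short news article.
pairs = [
    ("Ahmad", "say"),
    ("Ahmad", "announce"),
    ("Exports", "rise"),   # capitalised, but the verb is not a human-activity verb
    ("Lim", "explain"),
    ("market", "say"),     # human-activity verb, but not a capitalised subject
]
candidates = person_candidates(pairs)
```

In this toy input, only `Ahmad` and `Lim` are kept as person-related candidates. No training data or pronoun resolution is involved, which mirrors the claim in the abstract, though a real system would need proper part-of-speech tagging and parsing instead of the capitalisation test used here.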


Notes

  1. http://www.daviddlewis.com/resources/testcollections/reuters21578/

  2. http://kdd.ics.uci.edu/databases/reuters21578/reuters21578.html

  3. http://thestar.com.my/

  4. http://www.nst.com.my/

  5. http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/

  6. http://nlp.stanford.edu/software/stanford-dependencies.shtml

  7. http://nlp.stanford.edu/software/lex-parser.shtml

  8. http://kdd.ics.uci.edu/databases/reuters21578/reuters21578.html

  9. http://thestar.com.my/

  10. http://www.nst.com.my/

  11. http://cpansearch.perl.org/src/KWILLIAMS/reuters-21578/README

  12. http://gate.ac.uk/

  13. http://nlp.stanford.edu/software/CRF-NER.shtml

  14. http://www.opencalais.com/


Author information


Corresponding author

Correspondence to Hui-Ngo Goh.


About this article

Cite this article

Goh, HN., Soon, LK. & Haw, SC. Automatic discovery of person-related named-entity in news articles based on verb analysis. Multimed Tools Appl 74, 2587–2610 (2015). https://doi.org/10.1007/s11042-013-1618-2


  • DOI: https://doi.org/10.1007/s11042-013-1618-2
