Abstract
The verb is the most important word in a sentence, as it asserts an action, event, or feeling about the subject and object discussed in the sentence. In news articles, it is observable that at least one verb is always attached to the person(s) mentioned. We therefore hypothesize that there exist verbs that specifically describe human conduct within a news article. In this paper, we propose an approach that automatically identifies named entities (NEs) that perform human activity. More specifically, our approach attempts to identify person-related NEs in general, and the predefined “person name” type in particular, by studying the nature of verbs associated with human activity via TreeTagger, the Stanford packages, and WordNet. The experimental results show that it is viable to use verbs to identify the “person name” entity type. In addition, our empirical study shows that the approach is applicable to short articles. Another significant contribution of our approach is that it requires neither a training data set nor anaphora resolution.
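The core idea, using the verb attached to an entity as evidence that the entity is a person, can be illustrated with a minimal sketch. This is not the authors' implementation: the verb list, the function name `person_candidates`, and the sample (subject, verb) pairs are all hypothetical, and the pairs are assumed to have been extracted beforehand by a tagger and dependency parser.

```python
from collections import defaultdict

# Hypothetical seed list of verb lemmas that usually take a human subject.
# The paper derives such verbs from its own analysis; this set is an assumption.
HUMAN_ACTIVITY_VERBS = {"say", "announce", "resign", "marry", "testify"}


def person_candidates(subject_verb_pairs, human_verbs=HUMAN_ACTIVITY_VERBS):
    """Map each entity to the human-activity verbs it is the subject of.

    subject_verb_pairs: iterable of (entity, verb_lemma) tuples, assumed to
    come from an upstream POS tagger and dependency parser.
    Entities with no human-activity verb are left out of the result.
    """
    evidence = defaultdict(set)
    for entity, verb_lemma in subject_verb_pairs:
        lemma = verb_lemma.lower()
        if lemma in human_verbs:
            evidence[entity].add(lemma)
    return {entity: sorted(verbs) for entity, verbs in evidence.items()}


# Illustrative input only; not data from the paper.
pairs = [
    ("John Smith", "resign"),
    ("John Smith", "say"),
    ("Acme Corp", "rise"),
    ("Kuala Lumpur", "host"),
]
candidates = person_candidates(pairs)
print(candidates)
```

Because the check is a plain lexicon lookup, a sketch like this needs no training data, which matches the claim made in the abstract. Its quality depends entirely on the verb list and on the upstream parse.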
Cite this article
Goh, HN., Soon, LK. & Haw, SC. Automatic discovery of person-related named-entity in news articles based on verb analysis. Multimed Tools Appl 74, 2587–2610 (2015). https://doi.org/10.1007/s11042-013-1618-2
DOI: https://doi.org/10.1007/s11042-013-1618-2