Automatic discovery of person-related named-entity in news articles based on verb analysis

Goh, Hui-Ngo; Soon, Lay-Ki; Haw, Su-Cheng

doi:10.1007/s11042-013-1618-2

Published: 14 August 2013

Multimedia Tools and Applications, volume 74, pages 2587–2610 (2015)

Abstract

The verb is the most important word in a sentence, as it asserts an action, an event, or a feeling about the subject and object discussed in the sentence. In news articles, at least one verb is observably attached to each person mentioned. We therefore hypothesize that some verbs specifically describe human conduct within a news article. In this paper, we propose an approach that automatically identifies named entities (NEs) that perform human activity. More specifically, the approach attempts to identify person-related NEs in general, and the predefined “person name” type in particular, by studying the nature of verbs associated with human activity using TreeTagger, the Stanford packages, and WordNet. The experimental results show that it is viable to use verbs to identify the “person name” entity type. In addition, our empirical study shows that the approach is applicable to short articles. Another significant contribution is that the approach requires neither a training data set nor anaphora resolution.
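The core idea of the abstract, that the verbs governing an entity hint at whether it is a person, can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so the verb list `HUMAN_VERBS`, the function `person_candidates`, the `min_ratio` threshold, and the sample entities are all hypothetical. The sketch also assumes that (subject, verb lemma) pairs have already been extracted by a tagger and dependency parser, a step the paper performs with TreeTagger and the Stanford packages.

```python
from collections import defaultdict

# Hypothetical seed list of verb lemmas that typically take a human subject.
# The paper derives its verbs with the help of WordNet; this list is invented.
HUMAN_VERBS = {"say", "tell", "announce", "resign", "argue", "promise"}


def person_candidates(subject_verb_pairs, min_ratio=0.5):
    """Return entities whose governing verbs are mostly human-activity verbs.

    subject_verb_pairs: iterable of (entity, verb_lemma) tuples, assumed to
    come from an upstream parser that links each verb to its subject.
    min_ratio: assumed cut-off on the share of human-activity verbs.
    """
    counts = defaultdict(lambda: [0, 0])  # entity -> [human-verb hits, total verbs]
    for entity, verb_lemma in subject_verb_pairs:
        tally = counts[entity]
        tally[1] += 1
        if verb_lemma.lower() in HUMAN_VERBS:
            tally[0] += 1
    return sorted(
        entity
        for entity, (hits, total) in counts.items()
        if hits / total >= min_ratio
    )


# Invented example pairs, as a parser might emit them for a short news item.
pairs = [
    ("Aminah Tan", "say"),
    ("Aminah Tan", "announce"),
    ("Acme Corp", "rise"),
    ("Acme Corp", "announce"),
    ("Acme Corp", "fall"),
    ("Cyberjaya", "lie"),
]
print(person_candidates(pairs))  # ['Aminah Tan']
```

Because the decision rests only on a verb lexicon and subject-verb links, a sketch of this kind needs no labelled training data, which matches the property the abstract highlights.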

Author information

Authors and Affiliations

Faculty of Computing and Informatics, Multimedia University, Jalan Multimedia, 63100, Cyberjaya, Selangor, Malaysia
Hui-Ngo Goh, Lay-Ki Soon & Su-Cheng Haw

Corresponding author

Correspondence to Hui-Ngo Goh.

About this article

Cite this article

Goh, HN., Soon, LK. & Haw, SC. Automatic discovery of person-related named-entity in news articles based on verb analysis. Multimed Tools Appl 74, 2587–2610 (2015). https://doi.org/10.1007/s11042-013-1618-2

Issue Date: April 2015
