Personalized e-news monitoring agent system for tracking user-interested Chinese news events
Abstract
Many print newspapers have been converted to digital format and published on the Internet, and digital newspapers are becoming a popular electronic medium for conveying information immediately. Google developed a news service, Google news alert, built on the Google news aggregator, that tracks news events of interest to users through keyword matching. Because the service monitors and tracks news events only by keyword matching, it retrieves many irrelevant news events and sends them to users. In other words, the current service cannot monitor news events by a specific news topic: its recall rate is high, but its precision rate is low. This study therefore presents a novel personalized e-news monitoring agent system that uses a topic-tracking-based approach to remedy this flaw of the keyword-based approach when tracking user-interested news events on the Google News site. The proposed scheme considers both the similarities and the semantic relationships among news topics when tracking news events. In addition, to further improve accuracy in tracking user-interested Chinese news events, the Chinese word segmentation system ECScanner (An Extension Chinese Lexicon Scanner), which includes a new-word extension mechanism, is proposed for the Chinese word segmentation process. Experimental results demonstrate that the proposed topic-based scheme achieves a higher precision rate than the keyword-based approach used by Google news alert while retaining a high recall rate. Experimental results also confirm that, compared with the conventional Chinese word segmentation system CKIP (Chinese Knowledge Information Processing), the proposed ECScanner with its new-word extension mechanism improves the accuracy rate in tracking user-interested news events.
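The contrast between keyword matching and topic-based tracking described above can be illustrated with a minimal sketch. It is not the authors' system: it assumes whitespace tokenization of toy English headlines in place of Chinese word segmentation, uses plain bag-of-words cosine similarity as a stand-in for topic similarity (the semantic relationships among topics are not modeled), and picks an arbitrary threshold of 0.5. All function names and data are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokens stand in for a real word segmenter.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two term-frequency vectors (Counters).
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_match(keywords, docs):
    # Retrieve every document that contains at least one keyword.
    return {i for i, d in enumerate(docs)
            if any(k in tokenize(d) for k in keywords)}


def topic_track(seed_docs, docs, threshold):
    # Build a topic vector from seed articles, then keep documents
    # whose similarity to that vector reaches the threshold.
    topic = Counter()
    for s in seed_docs:
        topic.update(tokenize(s))
    return {i for i, d in enumerate(docs)
            if cosine(topic, Counter(tokenize(d))) >= threshold}


def precision_recall(retrieved, relevant):
    # precision = hits / retrieved, recall = hits / relevant.
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy data (hypothetical): the user follows the phone launch, items 0 and 1.
DOCS = [
    "apple releases new phone model this autumn",
    "apple phone sales rise after new model launch",
    "farmers report record apple harvest this autumn",
    "central bank holds interest rates steady",
]
RELEVANT = {0, 1}

by_keyword = keyword_match(["apple"], DOCS)
by_topic = topic_track(["apple unveils new phone model"], DOCS, threshold=0.5)

print("keyword:", sorted(by_keyword), precision_recall(by_keyword, RELEVANT))
print("topic:  ", sorted(by_topic), precision_recall(by_topic, RELEVANT))
```

On this toy data the keyword query also retrieves the unrelated harvest story, so recall stays at 1.0 while precision drops to 2/3; the similarity-based tracker retrieves only the two launch stories. The example only shows how the two measures respond to over-retrieval and says nothing about the results reported in the paper.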
Keywords
News events; News events monitoring agent system; Information retrieval; Intelligent agent