Scientometrics, Volume 119, Issue 2, pp 987–1008

In quest of new document relations: evaluating co-opinion relations between co-citations and its impact on Information retrieval effectiveness

  • Maryam Yaghtin
  • Hajar Sotudeh
  • Mahdieh Mirzabeigi
  • Seyed Mostafa Fakhrahmad
  • Mehdi Mohammadi

Abstract

Document relational networks have been effective in retrieving and evaluating papers. Despite their effectiveness, relational measures, including co-citation, are far from ideal and need improvement. The co-citation relation assumes content relevance and opinion relatedness between cited and citing papers. This may imply some kind of co-opinionatedness between co-cited papers, which could be used to improve the measure. The present study therefore tests the existence of this phenomenon and its role in improving information retrieval. To do so, a medical test collection based on CITREC was developed, consisting of 30 queries (seed documents) and 4823 of their co-cited papers. Using NLP techniques, the co-citances of the queries and their co-cited papers were analyzed, and their similarities were computed with a 4-gram similarity measure. Opinion scores were extracted from the co-citances using SentiWordNet. nDCG values were then calculated and compared for the citation proximity index (CPI) and co-citedness measures before and after normalization by the co-opinionatedness measure. The reliability of the test collection was measured using generalizability theory. The findings suggested that a majority of the co-citations exhibited a high level of co-opinionatedness, in that they were mostly similar either in opinion strength or in polarity. Although the number of anti-polar co-citations was not trivial, significantly more co-citations were co-polar, with a majority being positive. Normalizing the CPI and co-citedness by co-opinionatedness yielded a generally significant improvement in retrieval effectiveness. While anti-polar similarity reduced the effectiveness of the measure, co-polar similarity proved effective in improving co-citedness. Consequently, co-opinionatedness can be presented as a new document relation and used as a normalization factor to improve retrieval performance and research evaluation.
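
The evaluation described in the abstract compares nDCG for a ranking before and after a co-opinionatedness adjustment. The following is a minimal sketch of that comparison, not the authors' implementation. It assumes the common log2(rank + 1) discount for nDCG and treats the "normalization" as a simple multiplication of co-citedness by a co-opinionatedness score in [0, 1], which the abstract does not specify. All document ids, relevance grades, and scores are hypothetical.

```python
import math


def dcg(gains):
    """Discounted cumulated gain using a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


def rank_by(scores):
    """Return document ids ordered by descending score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical graded relevance judgments for one seed document (query).
relevance = {"d1": 3, "d2": 0, "d3": 2, "d4": 1}
# Hypothetical co-citedness counts with the seed document.
cocitedness = {"d1": 5, "d2": 9, "d3": 4, "d4": 2}
# Hypothetical co-opinionatedness scores in [0, 1].
co_opinion = {"d1": 0.9, "d2": 0.1, "d3": 0.8, "d4": 0.5}

# Baseline ranking: co-citedness alone.
baseline = rank_by(cocitedness)
# Adjusted ranking: co-citedness multiplied by co-opinionatedness
# (an assumed form of the normalization, chosen for illustration only).
weighted = rank_by({d: cocitedness[d] * co_opinion[d] for d in cocitedness})

ndcg_baseline = ndcg([relevance[d] for d in baseline])
ndcg_weighted = ndcg([relevance[d] for d in weighted])

print(f"baseline nDCG: {ndcg_baseline:.3f}")
print(f"weighted nDCG: {ndcg_weighted:.3f}")
```

In this toy data the adjustment demotes a frequently co-cited but irrelevant document, so the adjusted ranking scores higher. Real results depend on how the opinion scores and relevance judgments are obtained.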

Keywords

Opinion mining; Co-citations; Co-opinion; Information retrieval

Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2019

Authors and Affiliations

  • Maryam Yaghtin (1)
  • Hajar Sotudeh (1, corresponding author)
  • Mahdieh Mirzabeigi (1)
  • Seyed Mostafa Fakhrahmad (2)
  • Mehdi Mohammadi (3)

  1. Department of Knowledge and Information Sciences, Faculty of Education and Psychology, Eram Campus, Shiraz University, Shiraz, Iran
  2. Department of Computer Science and Engineering, School of Electrical and Computer Engineering, Shiraz University, Shiraz, Iran
  3. Department of Educational Management and Planning, Faculty of Education and Psychology, Eram Campus, Shiraz University, Shiraz, Iran
