Abstract
A publication venue authority file stores the variant names of the journals and conferences that publish scientific articles. It is useful for building search tools and for data disambiguation, and it is of special interest to agencies that fund research and evaluate graduate programs, which use the quality of publication venues as a basis for assessing the publications of researchers and research groups. Keeping an authority file up to date, however, is not a trivial task: different names are used to refer to the same publication venue, venues sometimes change their names, new venues emerge regularly, and journal bibliometrics are updated frequently. This paper presents the Publication Venue Authority File (PVAF), an environment for the disambiguation of scientific publication venues. It consists of an authority file and a set of tools for updating and querying its data. We describe and experimentally evaluate each of these tools. We also propose a search algorithm based on an associative classifier that supports incremental updates of its learning model. The results show that the PVAF has coverage greater than 86% for publication venues in several fields of knowledge, and that its tools attain good accuracy in classifying publication venues from curricula vitae formatted in various citation styles.
Acknowledgements
This work was partially supported by the Brazilian National Council for Scientific and Technological Development (Conselho Nacional de Desenvolvimento Científico e Tecnológico - CNPq) and the Minas Gerais Research Support Foundation (Fundação de Amparo à Pesquisa do Estado de Minas Gerais - FAPEMIG).
Cite this article
Paraizo, T.A., Pereira, D.A. PVAF: an environment for disambiguation of scientific publication venues. Int J Digit Libr 21, 407–421 (2020). https://doi.org/10.1007/s00799-020-00289-1
DOI: https://doi.org/10.1007/s00799-020-00289-1