
PVAF: an environment for disambiguation of scientific publication venues

International Journal on Digital Libraries

Abstract

A publication venue authority file stores the name variants of the journals and conferences that publish scientific articles. It is useful for building search tools and for data disambiguation, and it is of special interest to agencies that fund research and evaluate graduate programs, which use the quality of publication venues as a basis for evaluating the publications of researchers and research groups. However, keeping an authority file up to date is not a trivial task: different names are used to refer to the same publication venue, venues sometimes change their names, new venues emerge regularly, and journal bibliometrics are updated frequently. This paper presents the publication venue authority file (PVAF), an environment for the disambiguation of scientific publication venues. It consists of an authority file and a set of tools for updating and querying its data. We describe and experimentally evaluate each of these tools. We also propose a search algorithm based on an associative classifier that allows incremental updates of its learning model. The results show that the PVAF has coverage greater than 86% for publication venues in several fields of knowledge, and that its tools achieve good accuracy in classifying publication venues from curricula vitae formatted in various citation styles.
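To make the idea of an incrementally updatable associative classifier concrete, the following is a minimal Python sketch. It is not the algorithm proposed in the paper: the class name `VenueClassifier`, the venue identifiers, the rule length limit, and the confidence-weighted voting are all assumptions chosen for illustration. Each name variant is broken into tokens, every token set of size one or two acts as a rule antecedent, and the confidence of a rule "token set implies venue" is the fraction of stored variants containing that token set that belong to the venue. Because the model is only a table of counts, adding a new variant updates it without retraining.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations


def tokenize(name):
    """Lowercase a venue string and return its distinct tokens, sorted."""
    return sorted(set(re.findall(r"[a-z0-9]+", name.lower())))


class VenueClassifier:
    """Hypothetical associative classifier over venue name variants."""

    def __init__(self, max_len=2):
        self.max_len = max_len                  # largest antecedent size (assumption)
        self.itemset_count = Counter()          # antecedent -> number of variants
        self.rule_count = defaultdict(Counter)  # antecedent -> venue -> count

    def _itemsets(self, name):
        tokens = tokenize(name)
        for k in range(1, self.max_len + 1):
            yield from combinations(tokens, k)

    def add_variant(self, venue_id, name):
        """Incremental update: only the counts touched by this variant change."""
        for itemset in self._itemsets(name):
            self.itemset_count[itemset] += 1
            self.rule_count[itemset][venue_id] += 1

    def classify(self, query, min_conf=0.5):
        """Return the venue with the highest weighted vote, or None."""
        scores = Counter()
        for itemset in self._itemsets(query):
            total = self.itemset_count.get(itemset, 0)
            if total == 0:
                continue  # no rule has this antecedent
            for venue, n in self.rule_count[itemset].items():
                conf = n / total
                if conf >= min_conf:
                    # Longer antecedents are more specific, so weight them more.
                    scores[venue] += conf * len(itemset)
        if not scores:
            return None
        return scores.most_common(1)[0][0]


clf = VenueClassifier()
clf.add_variant("IJDL", "International Journal on Digital Libraries")
clf.add_variant("IJDL", "Int J Digit Libr")
clf.add_variant("JCDL", "ACM/IEEE Joint Conference on Digital Libraries")
clf.add_variant("JCDL", "JCDL")

print(clf.classify("Int. J. on Digital Libraries"))
print(clf.classify("Proc. JCDL 2014"))
```

In this sketch a query that shares no token with any stored variant returns `None`; calling `add_variant` for a new venue makes it retrievable immediately, which is the property the abstract refers to as incremental updating of the learning model.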

Notes

  1. http://www.capes.gov.br/

  2. https://www.webofknowledge.com/JCR

  3. https://www.scopus.com/sources

  4. http://www.scimagojr.com

  5. https://scholar.google.com/intl/en/scholar/metrics.html

  6. http://lattes.cnpq.br/

  7. https://sucupira.capes.gov.br/

Acknowledgements

This work was partially supported by the Brazilian National Council for Scientific and Technological Development (Conselho Nacional de Desenvolvimento Científico e Tecnológico - CNPq) and the Minas Gerais Research Support Foundation (Fundação de Amparo à Pesquisa do Estado de Minas Gerais - FAPEMIG).

Author information

Corresponding author

Correspondence to Denilson Alves Pereira.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Paraizo, T.A., Pereira, D.A. PVAF: an environment for disambiguation of scientific publication venues. Int J Digit Libr 21, 407–421 (2020). https://doi.org/10.1007/s00799-020-00289-1

  • DOI: https://doi.org/10.1007/s00799-020-00289-1
