The study of subject-classification based on journal coupling and expert subject-classification system

Scientometrics

Abstract

As the framework of scientific research, subject classification plays an important role in the development of science. To reconcile the development of science with the current expert subject-classification system, and to describe scientific output more appropriately at the subject level, we study the relationships among the natural-science sub-categories of the Chinese Library Classification (CLC) using objective, computerized scientometric methods, and propose modifications to the first two levels of the existing system. Taking the Chinese Science Citation Database (CSCD) as our data source, we measure the similarity of subjects by journal coupling strength. We then build an improved subject-classification system whose top categories rely on the CLC and whose sub-categories are the ensemble clustering result based on the journal coupling measure. To help identify and interpret the rationality of this improved system, we use text mining methods, such as keyword recognition and topic detection, to explain the similarity between some subjects from a semantic perspective. Our study shows that the improved subject-classification system not only conforms to previous experience and understanding but also incorporates knowledge of subject development.

Notes

  1. In this study, we use the term “cross-citation” to refer to citing and being cited among articles, journals, authors, and so on. We also use the term “coupling”, as in “journal coupling”, to refer to the measure we use to study the similarity between journals or subjects based on the cross-citation behavior among them. In other words, we use “journal coupling” to study the cross-citation relationships among subjects, so cross-citation is the basis of the journal coupling similarity measure.

  2. The 80 % threshold was determined by repeated trials, so that the sparsity of the adjacency matrix is reduced significantly without losing much of the raw information.

  3. In the subject-journal matrix derived in step 3, the entry in row i and column j is the number of times subject i cites journal j. To avoid the influences described in step 4, we calculate the journal-based coupling strength between subjects with the simple method (a basic way of calculating coupling strength), which considers only the number of journals coupled by two subjects, not the number of citations. We therefore convert the original citation counts in the matrix to 0–1 values indicating whether a citation from the subject to the journal exists. The simple method does not make full use of the original information. However, the bias from under-using the data is smaller than the bias from the sensitive citation counts, so we chose this method. In future work we will try to improve our data quality and apply other coupling calculation methods, such as the binary one proposed by Rousseau et al. (2004).

  4. We choose the general Gower coefficient because it is suitable for nominal, ordinal, and binary data. Moreover, because it assigns weights to the different variables, the distance calculation is more robust.

  5. For each number of clusters k, the Gap statistic compares log(W(k)) with E^*[log(W(k))], where the latter is defined via bootstrapping, i.e. by simulating from a reference distribution. The optimal number of clusters is the one at which log(W(k)) decreases fastest, that is, at which the Gap statistic increases fastest towards its maximum.

  6. We believe the subject classification derived in this paper is applicable to other settings because the journals in the CSCD source list are all core journals related to natural science. According to Garfield's Law of Concentration, the citation behavior of these core journals is highly representative, so the modified citation-based subject system can, to some extent, be adopted wherever the CLC is used.
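
The simple coupling calculation described in Note 3 can be illustrated with a minimal sketch. It assumes a small subject-by-journal matrix of raw citation counts; the function names `binarize` and `coupling_strength` and the toy data are hypothetical and are not taken from the article, and the sketch is not the authors' implementation.

```python
from typing import List


def binarize(cites: List[List[int]]) -> List[List[int]]:
    """Replace each citation count with 1 if the subject cites the journal at all, else 0."""
    return [[1 if count > 0 else 0 for count in row] for row in cites]


def coupling_strength(binary: List[List[int]]) -> List[List[int]]:
    """Entry [a][b] is the number of journals cited by both subject a and subject b."""
    n = len(binary)
    return [
        [sum(x & y for x, y in zip(binary[a], binary[b])) for b in range(n)]
        for a in range(n)
    ]


# Hypothetical toy data: 3 subjects (rows) x 4 journals (columns), raw citation counts.
cites = [
    [5, 0, 2, 0],
    [1, 3, 0, 0],
    [0, 0, 7, 1],
]

strength = coupling_strength(binarize(cites))
for row in strength:
    print(row)
```

Because only the presence or absence of a citation is kept, a subject that cites a journal once counts the same as one that cites it a hundred times. This is the trade-off Note 3 describes: less sensitivity to noisy counts at the cost of discarding some information.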

References

  • Ahlgren, P., & Colliander, C. (2009). Document–document similarity approaches and science mapping: Experimental comparison of five approaches. Journal of Informetrics, 3(1), 49–63. doi:10.1016/j.joi.2008.11.003.

  • Archambault, É., Beauchesne, O. H., & Caruso, J. (2011). Towards a multilingual comprehensive and open scientific journal ontology. In E. C. M. Noyons, P. Ngulube, & J. Leta (Eds.), Proceedings of the 13th international conference of the international society for scientometrics and informetrics (pp. 66–77).

  • Börner, K., Klavans, R., Patek, M., Zoss, A. M., Biberstine, J. R., Light, R. P., et al. (2012). Design and update of a classification system: The UCSD map of science. PLoS One, 7(7), e39464. doi:10.1371/journal.pone.0039464.

  • Boyack, K. W., Klavans, R., & Börner, K. (2005). Mapping the backbone of science. Scientometrics, 64(3), 351–374.

  • Braam, R. R., Moed, H. F., & van Raan, A. F. J. (1991). Mapping of science by combined co-citation and word analysis: I: Structural Aspects. Journal of the American Society for Information Science and Technology, 42(4), 233–251.

  • Cason, H., & Lubotsky, M. (1936). The influence and dependence of psychological journals on each other. Psychological Bulletin, 33(2), 95–103.

  • Chang, Y. F., & Chen, C.-M. (2011). Classification and visualization of the social science network by the minimum span clustering method. Journal of the American Society for Information Science and Technology, 62(8), 2404–2413.

  • Chen, C. M., Ibekwe-SanJuan, F., & Hou, J. H. (2010). The structure and dynamics of co-citation clusters: A multiple-perspective co-citation analysis. Journal of the American Society for Information Science and Technology, 61(7), 1386–1409.

  • Daniel, R. S., & Loutitt, C. M. (1953). Professional problems in psychology. New York: Prentice Hall.

  • Everitt, B. (1974). Cluster analysis. London: Heinemann Educ.

  • Glänzel, W., & Schubert, A. (2003). A new classification scheme of science fields and subfields designed for scientometric evaluation purposes. Scientometrics, 56(3), 357–367.

  • Gómez-Núñez, A. J., Batagelj, V., Vargas-Quesada, B., Moya-Anegón, F., & Chinchilla-Rodríguez, Z. (2014). Optimising SCImago journal & country rank classification by community detection. Journal of Informetrics, 8(2), 369–383.

  • Gómez-Núñez, A. J., Vargas-Quesada, B., & Moya-Anegón, F. (2015). Updating the SCImago journal and country rank classification: A new approach using Ward's clustering and alternative combination of citation measures. Journal of the Association for Information Science and Technology, 67(1), 178–190.

  • Hartigan, J. A., & Wong, M. A. (1979). A K-means clustering algorithm. Applied Statistics, 28(1), 100–108.

  • Katz, J. S., & Hicks, D. (1995). The classification of interdisciplinary journals: A new approach (Version 2.0). In M. E. D. Koenig & A. Bookstein (Eds.), Proceedings of the Fifth Biennial Conference of the International Society for Scientometrics and Informetrics (pp. 245–254). Medford: Learned Information.

  • Kaufman, L., & Rousseeuw, P. J. (1990). Finding groups in data: An introduction to cluster analysis. New York: Wiley.

  • Kessler, M. M. (1963). Bibliographic coupling between scientific papers. American Documentation, 14(1), 10–25.

  • Kronegger, L., Mali, F., & Ferligoj, A. (2013). Classifying scientific disciplines in Slovenia: A study of the evolution of collaboration structures. Journal of the American Society for Information Science and Technology, 66(2), 321–339.

  • Leydesdorff, L. (2002). Dynamic and evolutionary updates of classificatory schemes in scientific journal structures. Journal of the American Society for Information Science and Technology, 53(12), 987–994.

  • Leydesdorff, L. (2004a). Clusters and maps of science journals based on bi-connected graphs in the Journal Citation Reports. Journal of Documentation, 60(4), 371–427.

  • Leydesdorff, L. (2004b). Top-down decomposition of the Journal Citation Report of the Social Science Citation Index: Graph- and factor-analytical approaches. Scientometrics, 60(2), 159–180.

  • Leydesdorff, L. (2006). Can scientific journals be classified in terms of aggregated journal-journal citation relations using the Journal Citation Reports? Journal of the American Society for Information Science and Technology, 57(5), 601–603.

  • Leydesdorff, L., & Cozzens, S. E. (1993). The delineation of specialties in terms of journals using the dynamic journal set of the SCI. Scientometrics, 26(1), 135–156.

  • Leydesdorff, L., & Rafols, I. (2008). A global map of science based on the ISI discipline categories. Journal of the American Society for Information Science and Technology, 60(2), 348–362.

  • Leydesdorff, L., & Rafols, I. (2012). Interactive overlays: A new method for generating global journal maps from Web-of-Science data. Journal of Informetrics, 6(2), 318–332. doi:10.1016/j.joi.2011.11.003.

  • MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability (pp. 281–297). Berkeley, CA: University of California Press.

  • Marshakova, S. I. (1973). System of Document Connections Based on References. Scientific and Technical Information Serial of VINITI, 6(2), 3–8.

  • Narin, F. (1976). Evaluative bibliometrics: The use of publication and citation analysis in the evaluation of scientific activity. Washington, DC: National Science Foundation.

  • Narin, F., Carpenter, M., & Berlt, N. C. (1972). Interrelationships of scientific journals. Journal of the American Society for Information Science, 23(5), 323–331.

  • Ni, C., Sugimoto, C. R., & Jiang, J. (2013). Venue-author-coupling: A measure for identifying disciplines through author communities. Journal of the American Society for Information Science and Technology, 64(2), 265–279.

  • Qiu, J., & Dong, K. (2013). A comparative study on the ability of author co-occurrence network in revealing scientific structure. Journal of library science china, 39(1), 15–24. (In Chinese).

  • Qiu, J., & Liu, G. (2014). Research of discipline knowledge aggregation based on the journal-author coupling method. Journal of intelligence, 33(4), 17–22. (In Chinese).

  • Reynolds, A., Richards, G., de la Iglesia, B., & Rayward-Smith, V. J. (1992). Clustering rules: A comparison of partitioning and hierarchical clustering algorithms. Journal of Mathematical Modeling and Algorithms, 5(4), 475–504.

  • Rousseau, R., & Zuccala, A. (2004). A classification of author co-citations: Definitions and search strategies. Journal of the American Society for Information Science and Technology, 55(6), 513–529.

  • Small, H. (1973). Co-citation in the Scientific Literature: A New Measure of the Relationship Between Two Documents. Journal of the American Society for Information Science, 24(4), 265.

  • Tibshirani, R., Walther, G., & Hastie, T. (2001). Estimating the number of data clusters via the Gap statistic. Journal of the Royal Statistical Society B, 63(2), 411–423.

  • Waltman, L., & Van Eck, N. J. (2012). A new methodology for constructing a publication-level classification system of science. Journal of the Association for Information Science and Technology, 63(12), 2378–2392. doi:10.1002/asi.22748.

  • White, H. D., & McCain, K. W. (1998). Visualizing a Discipline: An Author Co-Citation Analysis of Information Science, 1972–1995. Journal of the American Society for Information Science, 49(4), 327–355.

  • Zhang, L., Janssens, F., Liang, L., & Glänzel, W. (2010). Journal cross-citation analysis for validation and improvement of journal-based discipline classification in bibliometric research. Scientometrics, 82(5), 687–706.

  • Zhang, L., Liang, L., Liu, Z., & Glänzel, W. (2012). The analysis of science structure based on journal clustering and SOOI classification system. Study in science of science, 30(9), 14–22. (In Chinese).

  • Zhao, D. Z., & Strotmann, A. (2008). Evolution of research activities and intellectual influences in information science 1996–2005: Introducing author bibliographic-coupling analysis. Journal of the American Society for Information Science and Technology, 59(13), 2070–2086.

Author information

Corresponding author

Correspondence to Jing Zhang.

Appendix 1

See Table 7.

Table 7 Original subject-classification system of CLC

About this article

Cite this article

Zhang, J., Liu, X. & Wu, L. The study of subject-classification based on journal coupling and expert subject-classification system. Scientometrics 107, 1149–1170 (2016). https://doi.org/10.1007/s11192-016-1890-9

  • DOI: https://doi.org/10.1007/s11192-016-1890-9
