Abstract
Knowledge networks are large, interconnected data sets of knowledge that can be represented, studied and modeled using complex-network concepts and methodologies. One aspect of particular interest in this type of network concerns how much the topological properties change along the successive neighborhoods of each node. Another issue of special importance consists in quantifying how much the structure of a knowledge network changes between two points in time. Here, we report a cross-relation study of two model-theoretical networks (Erdős–Rényi, ER, and Barabási–Albert, BA) as well as real-world knowledge networks corresponding to the areas of Physics and Theology, obtained from Wikipedia and taken at two dates separated by 4 years. The two versions of each network were characterized in terms of their cross-relation signatures, which were summarized by modification indices obtained for each of the nodes preserved between the two versions. For both types of theoretical models, the nodes at the core yielded similar modification indices among themselves, as did the nodes at the periphery, but the values differed between these two groups. The study of the real-world networks indicated that the Physics and Theology networks have signatures similar to those of the BA and ER models, respectively, and that higher modification values tended to occur at the periphery nodes than at the core nodes.
Data availability statement
Data sharing not applicable to this article as no data sets were generated or analyzed during the current study.
References
A. Abbasi, K.S.K. Chung, L. Hossain, Egocentric analysis of co-authorship network structure, position and performance. Inform. Process. Manag. 48(4), 671–679 (2012)
R. Albert, A.L. Barabási, Statistical mechanics of complex networks. Rev. Mod. Phys. 74(1), 47 (2002)
K. Börner, C. Chen, K.W. Boyack, Visualizing knowledge domains. Annu. Rev. Inform. Sci. Technol. 37(1), 179–255 (2003)
U. Brandes, D. Wagner, Analysis and visualization of social networks. In: Graph drawing software. Springer, pp. 321–340 (2004)
E.O. Brigham, R. Morrow, The fast Fourier transform. IEEE Spectr. 4(12), 63–70 (1967)
C. Chen, CiteSpace II: detecting and visualizing emerging trends and transient patterns in scientific literature. J. Am. Soc. Inform. Sci. Technol. 57(3), 359–377 (2006)
X. Chen, S. Jia, Y. Xiang, A review: knowledge reasoning over knowledge graph. Expert Syst. Appl. 141, 112948 (2020)
A. Clauset, C. Moore, M.E. Newman, Hierarchical structure and the prediction of missing links in networks. Nature 453(7191), 98–101 (2008)
C. Consonni, D. Laniado, A. Montresor, WikiLinkGraphs: a complete, longitudinal and multi-language dataset of the Wikipedia link networks. In: Proceedings of the International AAAI Conference on Web and Social Media, pp. 598–607 (2019)
M.H. DeGroot, M.J. Schervish, Probability and statistics. Pearson Education (2012)
G.S. Domingues, E. Tokuda, L. da F Costa, Identification of city motifs: a method based on modularity and similarity between hierarchical features of urban networks. J. Phys. Complex. 3(4), 045003 (2022)
C. Donnat, S. Holmes, Tracking network dynamics: a survey using graph distances. Ann. Appl. Stat. 12(2), 971–1012 (2018)
N.J.V. Eck, L. Waltman, CitNetExplorer: a new software tool for analyzing and visualizing citation networks. J. Informet. 8(4), 802–823 (2014)
D.E. Edmunds, W.D. Evans, Spectral theory and differential operators (Oxford University Press, Oxford, 2018)
P. Erdős, A. Rényi, On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci. 5(1), 17–60 (1960)
L. da F Costa, A compact guide to PCA. ResearchGate (2020). https://www.researchgate.net/publication/346656784_A_Compact_Guide_to_PCA_CDT-47
L. da F Costa, Comparing cross correlation-based similarities. HAL Open Sci. (2021). https://hal.science/hal-03406688v4
L. da F Costa, Further generalizations of the Jaccard index. ResearchGate (2021). https://www.researchgate.net/publication/355381945_Further_Generalizations_of_the_Jaccard_Index
L. da F Costa, Multisets. ResearchGate (2021). https://www.researchgate.net/publication/355437006_Multisets
L. da F Costa, On complexity and the prospects for scientific advancement. Revista Brasileira de Ensino de Física 43, e20200442 (2021)
L. da F Costa, Autorrelation and cross-relation of graphs and networks. J. Phys. Complex. 3(4), 045009 (2022)
L. da F Costa, Coincidence complex networks. J. Phys. Complex. 3(1), 015012 (2022). https://doi.org/10.1088/2632-072x/ac54c3
L. da F Costa, On similarity. Phys. A: Stat. Mech. Appl. 127456 (2022). https://doi.org/10.1016/j.physa.2022.127456
L. da F Costa, Multiset neurons. Phys. A: Stat. Mech. Appl. 609, 128318 (2023)
L. da F Costa, F.A. Rodrigues, G. Travieso et al., Characterization of complex networks: a survey of measurements. Adv. Phys. 56(1), 167–242 (2007)
L. da F Costa, F.A. Rodrigues, A.S. Cristino, Complex networks: the key to systems biology. Genet. Mol. Biol. 31, 591–601 (2008)
L. da F Costa, M.A.R. Tognetti, F.N. Silva, Concentric characterization and classification of complex network nodes: application to an institutional collaboration network. Phys. A: Stat. Mech. Appl. 387(24), 6201–6214 (2008)
L. da F Costa, O.N. Oliveira Jr., G. Travieso et al., Analyzing and modeling real-world phenomena with complex networks: a survey of applications. Adv. Phys. 60(3), 329–412 (2011)
M. Färber, F. Bartscherer, C. Menne et al., Linked data quality of DBpedia, Freebase, OpenCyc, Wikidata, and Yago. Semant. Web 9(1), 77–129 (2018)
S. Fortunato, Community detection in graphs. Phys. Rep. 486(3–5), 75–174 (2010)
T.M.J. Fruchterman, E.M. Reingold, Graph drawing by force-directed placement. Softw. Pract. Exp. 21(11), 1129–1164 (1991)
F.L. Gewers, G.R. Ferreira, H.F. de Arruda et al., Principal component analysis: a natural approach to data exploration. ACM Comput. Surv. (CSUR) 54(4), 1–34 (2021)
Q. Guan, F.R. Yu, S. Jiang et al., Prediction-based topology control and routing in cognitive radio mobile ad hoc networks. IEEE Trans. Veh. Technol. 59(9), 4443–4452 (2010)
R.A. Johnson, D.W. Wichern et al., Applied multivariate statistical analysis (Prentice Hall, Upper Saddle River, 2002)
I.T. Jolliffe, Principal component analysis for special types of data (Springer, Berlin, 2002)
T. Kuhn, M. Perc, D. Helbing, Inheritance patterns in citation networks reveal scientific memes. Phys. Rev. X 4(4), 041036 (2014)
R. Lambiotte, J.C. Delvenne, M. Barahona, Random walks, Markov processes and the multiscale modular organization of complex networks. IEEE Trans. Netw. Sci. Eng. 1(2), 76–90 (2014)
A. Li, S.P. Cornelius, Y.Y. Liu et al., The fundamental advantages of temporal networks. Science 358(6366), 1042–1046 (2017)
A. Mellor, A. Grusovin, Graph comparison via the nonbacktracking spectrum. Phys. Rev. E 99(5), 052309 (2019)
C.A. Moreira-Filho, S.Y. Bando, F.B. Bertonha et al., Methods for gene co-expression network visualization and analysis. In: Transcriptomics in Health and Disease. Springer, pp. 143–163 (2022)
M. Newman, Networks (Oxford University Press, Oxford, 2018)
J.P. Onnela, K. Kaski, J. Kertész, Clustering and information in correlation based financial networks. Eur. Phys. J. B 38(2), 353–362 (2004)
J.P. Onnela, D.J. Fenn, S. Reid et al., Taxonomies of networks from community structure. Phys. Rev. E 86(3), 036104 (2012)
M.A. Porter, J.P. Onnela, P.J. Mucha et al., Communities in networks. Not. AMS 56(9), 1082–1097 (2009)
S.M. Ross, Introduction to probability models (Academic Press, Cambridge, 2014)
F.N. Silva, L. da F Costa, Visualizing complex networks. ResearchGate (2018). https://www.researchgate.net/publication/328811693_Visualizing_Complex_Networks_CDT-5
F.N. Silva, L. da F Costa, Self-correspondence along multipartite complex networks. ResearchGate (2022). https://www.researchgate.net/publication/372986844_Self-Correspondence_Along_Multipartite_Complex_Networks
F.N. Silva, M.P. Viana, B.A.N. Travençolo et al., Investigating relationships within and between category networks in Wikipedia. J. Informet. 5(3), 431–438 (2011)
B.A.N. Travençolo, L. da F Costa, Hierarchical spatial organization of geographical networks. J. Phys. A: Math. Theor. 41(22), 224004 (2008)
H. Wolda, Similarity indices, sample size and diversity. Oecologia 50, 296–302 (1981)
F. Xie, D. Levinson, Topological evolution of surface transportation networks. Comput. Environ. Urban Syst. 33(3), 211–223 (2009)
Acknowledgements
E. K. Tokuda thanks FAPESP (2019/01077-3 and 2021/14310-8) for financial support. L. da F. Costa thanks CNPq (307085/2018-0) and FAPESP (2015/22308-2) for support. The work of R. Lambiotte was supported by EPSRC Grants EP/V013068/1 and EP/V03474X/1, and by the NSFC grant 62373169.
Author information
Contributions
The authors contributed equally to the manuscript.
Ethics declarations
Conflict of interest
The authors declare they have no financial interests.
Appendix A: Direct comparison of hierarchical features
Given two networks A and B with aligned nodes, an alternative way to obtain signatures characterizing the similarity between their topological structures, along all the hierarchies defined by each reference node, is to compare the features, pairwise, between the nodes at the same hierarchical levels of the two networks (e.g., [8, 25]).
Figure 14 illustrates this method for a specific pair of networks A and B, considering the first neighborhood (\(\delta =1\)) of the reference node i. The node degrees are taken as the features. Also illustrated is the comparison, by means of the coincidence similarity index, of the degrees of the nodes in the first neighborhood of networks A and B, which yields the value 0.7. The direct approach reported in this appendix considers all neighborhoods around each reference node.
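The comparison of one hierarchical level can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the coincidence index is taken here as the product of the real-valued Jaccard and interiority (overlap) indices; the degree vectors are aligned by node identity, with 0 assigned to a node absent from a level; and the function names and the toy networks are hypothetical (they are not the networks of Fig. 14).

```python
from collections import deque

def hierarchical_levels(adj, ref):
    """Breadth-first search: map each distance to the set of nodes at that distance from ref."""
    dist = {ref: 0}
    queue = deque([ref])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    levels = {}
    for node, d in dist.items():
        levels.setdefault(d, set()).add(node)
    return levels

def coincidence(x, y):
    """Coincidence index of two non-negative vectors (assumed: Jaccard times interiority)."""
    s_min = sum(min(a, b) for a, b in zip(x, y))
    s_max = sum(max(a, b) for a, b in zip(x, y))
    if s_max == 0:
        return 1.0  # both vectors are all zero: treated as identical (assumption)
    smaller = min(sum(x), sum(y))
    if smaller == 0:
        return 0.0  # one vector is all zero, the other is not
    return (s_min / s_max) * (s_min / smaller)

def level_similarity(adj_a, adj_b, ref, delta):
    """Compare the degrees of the nodes at distance delta from ref in networks A and B."""
    ring_a = hierarchical_levels(adj_a, ref).get(delta, set())
    ring_b = hierarchical_levels(adj_b, ref).get(delta, set())
    nodes = sorted(ring_a | ring_b)  # aligned node labels shared by both networks
    x = [len(adj_a[n]) if n in ring_a else 0 for n in nodes]
    y = [len(adj_b[n]) if n in ring_b else 0 for n in nodes]
    return coincidence(x, y)

# Hypothetical toy networks: B is A plus one extra node (4) attached to node 1.
net_a = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
net_b = {0: {1, 2}, 1: {0, 2, 4}, 2: {0, 1, 3}, 3: {2}, 4: {1}}
sim_first = level_similarity(net_a, net_b, ref=0, delta=1)   # 5/6
sim_second = level_similarity(net_a, net_b, ref=0, delta=2)  # 1/2
```

Repeating `level_similarity` for every value of `delta` reachable from the reference node gives one similarity value per hierarchical level, which is the kind of signature the direct approach works with.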
The performance of this method is quantified using the same experiments described in Sect. 5. Starting from a reference network, a set of m nodes of interest is defined. To each of these nodes, a new neighbor is added, chosen uniformly at random. The resulting network, with m additional edges, is then compared with the original network using the approach described above.
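The perturbation step of this experiment might be sketched as follows. The sketch rests on assumptions that the text does not state: the new neighbor is drawn only among nodes not yet connected to the node of interest (excluding the node itself), the network is stored as a dictionary of neighbor sets, and the function names and the ring network are hypothetical.

```python
import random

def add_random_neighbors(adj, nodes_of_interest, seed=0):
    """Return a copy of adj in which each node of interest gains one new random neighbor."""
    rng = random.Random(seed)
    new_adj = {u: set(vs) for u, vs in adj.items()}  # leave the original network untouched
    for u in nodes_of_interest:
        # Candidates: every node that is neither u itself nor already linked to u.
        candidates = sorted(v for v in new_adj if v != u and v not in new_adj[u])
        if not candidates:
            continue  # u is already connected to all other nodes
        v = rng.choice(candidates)  # uniform choice among the candidates
        new_adj[u].add(v)
        new_adj[v].add(u)
    return new_adj

def edge_count(adj):
    """Number of undirected edges in a dictionary of neighbor sets."""
    return sum(len(vs) for vs in adj.values()) // 2

# Hypothetical reference network: a ring of 8 nodes, with m = 3 nodes of interest.
ring = {i: {(i - 1) % 8, (i + 1) % 8} for i in range(8)}
perturbed = add_random_neighbors(ring, [0, 3, 5], seed=42)
```

Because the candidates are recomputed after each insertion, every node of interest contributes a distinct new edge, so the perturbed network has exactly m more edges than the reference one whenever candidates exist.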
In Fig. 15, we show an ER network and two sets of reference nodes (shown in red) corresponding, respectively, to the core (Fig. 15a) and the periphery (Fig. 15b) of the network. The size of each node corresponds to the coincidence index between the node degrees along the successive neighborhoods. The core nodes, as well as the periphery nodes, display markedly distinct sizes within their own group, which indicates a great dispersion of the coincidence index values, in contrast to the uniformity of topological properties otherwise expected among the nodes of each type. This result indicates that the direct comparison of neighborhoods does not provide a stable way to characterize the topological alterations undergone by each modified node along its hierarchies, unlike the cross-relation approach described in Sect. 5.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Tokuda, E.K., Lambiotte, R. & Costa, L.d.F. Cross-relation characterization of knowledge networks. Eur. Phys. J. B 96, 144 (2023). https://doi.org/10.1140/epjb/s10051-023-00608-w
DOI: https://doi.org/10.1140/epjb/s10051-023-00608-w