Abstract
Extracting information from very large collections of structured, semi-structured, or even unstructured data is a considerable challenge when much of the information is implicit in the relationships among entities in the data. Social networks are data collections of this kind: relationships play a vital role in the knowledge these networks convey. A bibliographic database is an essential tool for the research community, yet finding and using the relationships contained in the social network it encodes is difficult. In this paper we introduce DBconnect, a prototype that exploits the social network encoded in the DBLP database, drawing on a new random walk approach to reveal interesting knowledge about the research community and even to recommend collaborations.
This work is based on an earlier work: DBconnect: mining research community on DBLP data, in Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis, COPYRIGHT ACM, 2007, http://portal.acm.org/citation.cfm?doid=1348549.1348558
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zaïane, O.R., Chen, J., Goebel, R. (2009). Mining Research Communities in Bibliographical Data. In: Zhang, H., et al. Advances in Web Mining and Web Usage Analysis. SNAKDD 2007. Lecture Notes in Computer Science, vol 5439. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00528-2_4
DOI: https://doi.org/10.1007/978-3-642-00528-2_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00527-5
Online ISBN: 978-3-642-00528-2
eBook Packages: Computer Science, Computer Science (R0)