Characterizing the Hypergraph-of-Entity Representation Model

  • José Devezas
  • Sérgio Nunes
Conference paper
Part of the Studies in Computational Intelligence book series (SCI, volume 882)

Abstract

The hypergraph-of-entity is a joint representation model for terms, entities and their relations, used as an indexing approach in entity-oriented search. In this work, we characterize the structure of the hypergraph at both the microscopic and macroscopic scales, as well as over time, as the number of documents increases. We use a random-walk-based approach to estimate shortest distances and node sampling to estimate clustering coefficients. We also propose computing a general mixed hypergraph density from the corresponding bipartite mixed graph. We analyze these statistics for the hypergraph-of-entity, finding that hyperedge-based node degrees follow a power law, while node-based node degrees and hyperedge cardinalities are log-normally distributed. We also find that most statistics tend to converge after an initial period of accentuated growth in the number of documents.
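
To make the statistics named in the abstract concrete, the following is a minimal Python sketch on a toy hypergraph. It is not the authors' implementation. It assumes an undirected hypergraph only (the paper works with a mixed one), it takes "hyperedge-based degree" to mean the number of incident hyperedges and "node-based degree" to mean the number of distinct neighbouring nodes, and it computes density on the plain bipartite incidence graph. The data and all function names are hypothetical.

```python
import random

# Toy undirected hypergraph (hypothetical data): hyperedge id -> member nodes.
HYPEREDGES = {
    "e1": {"a", "b", "c"},
    "e2": {"c", "d"},
    "e3": {"d", "e", "f"},
    "e4": {"a", "f"},
}


def incidence(hyperedges):
    """Map each node to the set of hyperedges that contain it."""
    inc = {}
    for edge, nodes in hyperedges.items():
        for node in nodes:
            inc.setdefault(node, set()).add(edge)
    return inc


def hyperedge_based_degree(hyperedges, node):
    """Number of hyperedges incident to the node (assumed definition)."""
    return len(incidence(hyperedges)[node])


def node_based_degree(hyperedges, node):
    """Number of distinct nodes sharing a hyperedge with the node (assumed definition)."""
    neighbours = set()
    for edge in incidence(hyperedges)[node]:
        neighbours |= hyperedges[edge]
    neighbours.discard(node)
    return len(neighbours)


def bipartite_density(hyperedges):
    """Density of the node-hyperedge incidence graph: links / (|V| * |E|).

    Simplification: the undirected case only, not the mixed-hypergraph formula.
    """
    links = sum(len(nodes) for nodes in hyperedges.values())
    return links / (len(incidence(hyperedges)) * len(hyperedges))


def random_walk_distance(hyperedges, source, target, walks=200, max_steps=20, seed=0):
    """Estimate the shortest distance as the fewest steps any walk needs to hit target.

    Each step picks a random incident hyperedge, then a random other member of it.
    The result is an upper bound on the true distance, or None if no walk arrives.
    """
    rng = random.Random(seed)
    inc = incidence(hyperedges)
    best = None
    for _ in range(walks):
        node = source
        for step in range(1, max_steps + 1):
            edge = rng.choice(sorted(inc[node]))
            node = rng.choice(sorted(hyperedges[edge] - {node}))
            if node == target:
                if best is None or step < best:
                    best = step
                break
    return best


if __name__ == "__main__":
    print("hyperedge-based degree of a:", hyperedge_based_degree(HYPEREDGES, "a"))
    print("node-based degree of a:", node_based_degree(HYPEREDGES, "a"))
    print("hyperedge cardinalities:", sorted(len(n) for n in HYPEREDGES.values()))
    print("bipartite density:", bipartite_density(HYPEREDGES))
    print("estimated distance a -> d:", random_walk_distance(HYPEREDGES, "a", "d"))
```

The random-walk estimate trades exactness for cost: it never underestimates the true distance, and more walks tighten the bound. On a large index, the same sampling idea would apply to clustering coefficients, by averaging a local coefficient over a random subset of nodes rather than over all of them.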

Keywords

Hypergraph-of-entity · Combined data · Indexing · Representation model · Hypergraph analysis · Characterization

Notes

Acknowledgements

José Devezas is supported by research grant PD/BD/128160/2016, provided by the Portuguese national funding agency for science, research and technology, Fundação para a Ciência e a Tecnologia (FCT), within the scope of Operational Program Human Capital (POCH), supported by the European Social Fund and by national funds from MCTES.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. INESC TEC and Faculty of Engineering, University of Porto, Rua Dr. Roberto Frias, Porto, Portugal