Social Ontologies as Generalized Nearly Acyclic Directed Graphs: A Quantitative Graph Model of Social Tagging

  • Alexander Mehler
Chapter

Abstract

In this paper, we introduce a quantitative graph model of social ontologies, exemplified by the category system of Wikipedia. The model is used to contrast structure formation in distributed cognition with classification schemes (exemplified by the DDC and MeSH), formal ontologies (exemplified by OpenCyc and SUMO), and terminological ontologies (exemplified by WordNet). Our basic finding is that social ontologies have a characteristic topology that clearly separates them from the other types of ontologies. In this context, we introduce the notion of Zipfian bipartivity to analyze the relationship between categories and categorized units in distributed cognition.
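The abstract does not spell out how "nearly acyclic" is measured, so the following is only an illustrative sketch under stated assumptions, not the chapter's model. It treats a category system as a directed graph of subcategory-to-supercategory links and reports the fraction of links that lie on some directed cycle. A link lies on a cycle exactly when both of its endpoints belong to the same strongly connected component. The function names and the toy category data are hypothetical.

```python
from collections import defaultdict


def strongly_connected_components(nodes, edges):
    """Kosaraju's algorithm (iterative); returns a dict node -> component id."""
    succ, pred = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)

    # Pass 1: record nodes in order of DFS completion on the original graph.
    seen, order = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(succ[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(succ[nxt])))
                    break
            else:
                # All successors explored: the node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: sweep the reversed graph in decreasing completion order.
    comp = {}
    for root in reversed(order):
        if root in comp:
            continue
        comp[root] = root
        todo = [root]
        while todo:
            node = todo.pop()
            for prv in pred[node]:
                if prv not in comp:
                    comp[prv] = root
                    todo.append(prv)
    return comp


def cyclic_edge_fraction(nodes, edges):
    """Share of edges lying on a directed cycle (0.0 for an acyclic graph)."""
    comp = strongly_connected_components(nodes, edges)
    cyclic = [(u, v) for u, v in edges if comp[u] == comp[v]]
    return len(cyclic) / len(edges) if edges else 0.0


# Hypothetical toy category graph: edges point from subcategory to supercategory.
nodes = ["Graph theory", "Mathematics", "Science", "Knowledge", "Linguistics"]
edges = [
    ("Graph theory", "Mathematics"),
    ("Mathematics", "Science"),
    ("Science", "Knowledge"),
    ("Knowledge", "Science"),  # a cycle-inducing link, as social tagging may produce
    ("Linguistics", "Science"),
]
print(cyclic_edge_fraction(nodes, edges))  # 0.4
```

A value of 0.0 corresponds to a directed acyclic graph, and small positive values to a graph that is acyclic apart from a few links. This proxy was chosen for simplicity; other measures of near-acyclicity exist and may differ from the one used in the chapter.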

Keywords

Generalized nearly acyclic directed graphs; Quantitative network analysis; Social ontology; Wikipedia; Zipfian bipartivity

Notes

Acknowledgment

Financial support from the German Federal Ministry of Education and Research (BMBF) through the research project Linguistic Networks, from the German Research Foundation (DFG) through the Excellence Cluster 277 Cognitive Interaction Technology (via the project KnowCIT), and from the SFB 673 Alignment in Communication (via the projects A3 Dialogue Games and Group Dynamics and X1 Multimodal Alignment Corpora: Statistical Modeling and Information Management) is gratefully acknowledged. We also thank Dietmar Esch, Tobias Feith, and Roman Pustylnikov for downloading the ontologies, as well as Rüdiger Gleim, Olga Abramov, and Paul Warner for their helpful hints, which reduced the number of errors in this chapter.

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. Faculty of Computer Science and Mathematics, Goethe University Frankfurt am Main, Frankfurt am Main, Germany
