Social media filtering based on collaborative tagging in semantic space

Kim, Heung-Nam; Roczniak, Andrew; Lévy, Pierre; El Saddik, Abdulmotaleb

doi:10.1007/s11042-010-0557-4

Published: 26 June 2010

Multimedia Tools and Applications, volume 56, pages 63–89 (2012)

Abstract

We propose a semantic collaborative filtering method that enhances the quality of recommendations derived from user-generated tags. Social tagging is used to capture and filter users’ preferences for items. In addition, we explore several advantages of semantic tagging in addressing ambiguity, synonymy, and semantic interoperability, which are notable challenges in information filtering. The proposed approach first determines semantically similar users using social tagging and subsequently discovers semantically relevant items for each user. Experimental results show that our method offers significant advantages both in improving recommendation quality and in dealing with ambiguity, synonymy, and interoperability issues.

Notes

1. http://ieml.org
2. http://linkeddata.org/
3. The star before an English expression marks a *tag, a natural language descriptor of an IEML expression. A *tag holds the place of an IEML expression by suggesting its meaning rather than uttering the IEML expression.
4. http://www.opencalais.com/
5. Detailed formal models are presented in Appendix A.
6. In IEML notation, the former “Java” can be expressed as (l.i.-k.i.-’)[Java], which means “Java as a geographic unit”, whereas the latter “Java” is (b.-’ b.e.-t.u.-wa.e.-’ E:T:.p.-’,)[Java], which means “Java as a programming language”.
7. http://bibsonomy.org
8. A Cartesian product of two sets X and Y is written as follows: X × Y = {(x, y) | x ∈ X, y ∈ Y}.
9. A powerset of S is the set of all subsets of S, including the empty set ∅.

Acknowledgment

Since 2009, this work has been funded mainly by the Canada Research Chair in Collective Intelligence at the University of Ottawa.

Author information

Authors and Affiliations

Collective Intelligence Lab, University of Ottawa, Ottawa, Ontario, Canada
Heung-Nam Kim & Andrew Roczniak
Collective Intelligence Lab, Canada Research Chair in Collective Intelligence, University of Ottawa, Ottawa, Ontario, Canada
Pierre Lévy
Multimedia Communication Research Lab, University of Ottawa, Ottawa, Ontario, Canada
Heung-Nam Kim & Abdulmotaleb El Saddik

Corresponding author

Correspondence to Heung-Nam Kim.

Appendix A

1.1 IEML language model

We present the model of the IEML language, along with the model of semantic variables. Let ∑ be a nonempty, finite set of symbols, ∑ = {S, B, T, U, A, E}. A string s is a finite sequence of symbols chosen from ∑, and its length is denoted by |s|. The empty string ε contains zero occurrences of symbols, so |ε| = 0. The set of all strings of length k over ∑ is defined as ∑^k = {s where |s| = k}. Note that ∑⁰ = {ε} and ∑¹ = {S, B, T, U, A, E}. Although the members of ∑ and ∑¹ look identical, the former contains symbols and the latter contains strings of length one. The set of all strings over ∑ is defined as ∑^* = ∑⁰ ∪ ∑¹ ∪ ∑² ∪ ∑³ ∪ …

A useful operation on strings is concatenation. For any s_i = a₁a₂a₃a₄…a_i ∈ ∑^* and s_j = b₁b₂b₃b₄…b_j ∈ ∑^*, the concatenation s_i s_j is the string a₁a₂a₃a₄…a_i b₁b₂b₃b₄…b_j, so that |s_i s_j| = i + j. The IEML language over ∑ is a subset of ∑^*, L_IEML ⊆ ∑^*:

$$ L_{IEML} = \left\{ s \in \Sigma^{*} \mid |s| = 3^{l},\ 0 \leqslant l \leqslant 6 \right\} $$

(9)
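As a rough illustration of Eq. 9, membership in L_IEML can be checked by testing that a string uses only the six primitive symbols and that its length is a power of three between 3⁰ and 3⁶. The names `layer_of` and `is_semantic_sequence` are illustrative choices made for this sketch and do not come from the paper.

```python
ALPHABET = frozenset("SBTUAE")  # the six primitive symbols of the alphabet
MAX_LAYER = 6                   # layers run from 0 to 6 (Eq. 9)


def layer_of(s):
    """Return l if |s| == 3**l for some 0 <= l <= 6, otherwise None."""
    for l in range(MAX_LAYER + 1):
        if len(s) == 3 ** l:
            return l
    return None


def is_semantic_sequence(s):
    """Check membership in L_IEML: valid symbols and a valid length."""
    return set(s) <= ALPHABET and layer_of(s) is not None
```

Under this reading, the empty string is rejected because its length 0 is not of the form 3^l, and a string such as "SB" is rejected because 2 is not a power of three.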

1.2 Model of semantic sequences

Definition 3 (Semantic sequence)

A string s is called a semantic sequence if and only if s ∈ L_IEML.

A semantic sequence s of length 3^l is said to be of layer l. To denote the n-th primitive of a sequence s of layer l, we use a superscript n, where 1 ≤ n ≤ 3^l, and write s^n. Note that for any sequence s of layer l, s^n is undefined for any n > 3^l. Two semantic sequences are distinct if and only if at least one of the following holds: i) their layers are different, ii) they are composed of different primitives, iii) their primitives do not follow the same order. Equivalently, for any s_a and s_b,

$$ {s_a} = {s_b} \Leftrightarrow \forall n,s_a^n = s_b^n \wedge |{s_a}| = |{s_b}| $$

(10)

We now consider binary relations between semantic sequences in general. These are obtained from the Cartesian product of two sets (see Note 8). For any sets of semantic sequences X and Y, where s_a ∈ X and s_b ∈ Y, and using Eq. 10, we define four binary relations whole ⊆ X × Y, substance ⊆ X × Y, attribute ⊆ X × Y, and mode ⊆ X × Y as follows:

$$ \begin{array}{l} \text{whole} = \left\{ (s_a, s_b) \mid s_a = s_b \right\} \\ \text{substance} = \left\{ (s_a, s_b) \mid s_a^{n} = s_b^{n} \wedge |s_a| = 3|s_b|,\ 1 \leqslant n \leqslant |s_b| \right\} \\ \text{attribute} = \left\{ (s_a, s_b) \mid s_a^{n + |s_b|} = s_b^{n} \wedge |s_a| = 3|s_b|,\ 1 \leqslant n \leqslant |s_b| \right\} \\ \text{mode} = \left\{ (s_a, s_b) \mid s_a^{n + 2|s_b|} = s_b^{n} \wedge |s_a| = 3|s_b|,\ 1 \leqslant n \leqslant |s_b| \right\} \end{array} $$

(11)

Any two semantic sequences that are equal are in a whole relationship. In addition, two semantic sequences may be in a substance, attribute, or mode relationship when the shorter one coincides with a specific third of the longer one. For any two semantic sequences s_a and s_b, if they are in one of the above relations, then we say that s_b plays a role with respect to s_a, and we call s_b a seme of the sequence s_a.
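The four sequence-level relations can be sketched on plain Python strings. This is a minimal sketch, assuming that substance, attribute, and mode compare s_b with the first, second, and last third of s_a respectively (for mode, this reads the index condition as s_a^{n+2|s_b|} = s_b^n, since s_b has no primitive beyond position |s_b|). The helper `_third_matches` and the function `is_seme` are hypothetical names, and inputs are assumed to be valid semantic sequences.

```python
def _third_matches(sa, sb, k):
    """True if |sa| == 3|sb| and the k-th third of sa (k = 0, 1, 2) equals sb."""
    n = len(sb)
    return len(sa) == 3 * n and sa[k * n:(k + 1) * n] == sb


def whole(sa, sb):
    return sa == sb


def substance(sa, sb):
    return _third_matches(sa, sb, 0)  # first third of sa


def attribute(sa, sb):
    return _third_matches(sa, sb, 1)  # middle third of sa


def mode(sa, sb):
    return _third_matches(sa, sb, 2)  # last third of sa (assumed reading)


def is_seme(sa, sb):
    """True if sb plays a role with respect to sa in any of the four relations."""
    return any(rel(sa, sb) for rel in (whole, substance, attribute, mode))
```

For the layer-1 sequence "SBT", the layer-0 sequences "S", "B", and "T" are its substance, attribute, and mode semes under these assumptions.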

Definition 4 (Seme of a sequence)

For any semantic sequences s_a and s_b, if (s_a, s_b) ∈ whole ∪ substance ∪ attribute ∪ mode, then s_b plays a role with respect to s_a, and s_b is called a seme.

We can now group distinct semantic sequences together into sets. A useful grouping is based on the layer of those semantic sequences.

1.3 Model of semantic categories

A category of L_IEML is a subset such that all strings of that subset have the same length:

$$ c \subseteq L_{IEML} \quad \text{such that} \quad \forall s_i, s_j \in c,\ |s_i| = |s_j| $$

(12)

Definition 5 (Semantic category)

A semantic category c is a set containing semantic sequences at the same layer.

The layer of any category c is exactly the same as the layer of the semantic sequences included in that category. The set of all categories of layer l is given as the powerset (see Note 9) of the set of all strings of layer l of L_IEML:

$$ C_l = \mathrm{Powerset}\left( \left\{ s \in L_{IEML} \mid |s| = 3^{l} \right\} \right) $$

(13)
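To give a feel for the size of C_l in Eq. 13, the following sketch enumerates the categories of layer 0 and counts the sequences of higher layers. It assumes the six-symbol alphabet defined earlier; the function names are illustrative and not taken from the paper.

```python
from itertools import combinations, product

ALPHABET = "SBTUAE"


def sequences_of_layer(l):
    """All strings of length 3**l over the alphabet (only feasible for small l)."""
    return {"".join(p) for p in product(ALPHABET, repeat=3 ** l)}


def count_sequences(l):
    """Number of semantic sequences of layer l: 6 ** (3 ** l)."""
    return len(ALPHABET) ** (3 ** l)


def powerset(items):
    """All subsets of items, each as a frozenset, including the empty set."""
    items = sorted(items)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


# Layer 0 has 6 sequences, hence 2**6 = 64 categories.
C0 = powerset(sequences_of_layer(0))
```

Layer 1 already has 6³ = 216 sequences and therefore 2²¹⁶ categories, so C_l cannot be enumerated explicitly beyond layer 0; in practice a category would be stored as an explicit set of sequences.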

Two categories are distinct if and only if they differ by at least one element. For any c_a and c_b:

$$ {c_a} = {c_b} \Leftrightarrow {c_a} \subseteq {c_b} \wedge {c_b} \subseteq {c_a} $$

(14)

A weaker condition can be applied to categories of distinct layers (since two categories are different if their layers are different) and is written as:

$$ l({c_a}) \ne l({c_b}) \Rightarrow {c_a} \ne {c_b} $$

(15)

where l(c_a) and l(c_b) denote the layers of categories c_a and c_b, respectively. Analogously to sequences, we consider binary relations between any categories c_i and c_j where l(c_i), l(c_j) ≥ 1. For any sets of categories X and Y, where c_a ∈ X and c_b ∈ Y, we define four binary relations whole_C ⊆ X × Y, substance_C ⊆ X × Y, attribute_C ⊆ X × Y, and mode_C ⊆ X × Y as follows:

$$ \begin{array}{*{20}c} {{{\text{whole}}_{{\text{C}}} = {\left\{ {{\left( {c_{a} ,c_{b} } \right)}\left| {c_{a} = c_{b} } \right.} \right\}}}} \\ {{{\text{substance}}_{{\text{C}}} = {\left\{ {{\left( {c_{a} ,c_{b} } \right)}\left| {\forall s_{a} \in c_{a} ,\exists s_{b} \in c_{b} ,{\left( {s_{a} ,s_{b} } \right)} \in {\text{substance}}} \right.} \right\}}}} \\ {{{\text{attribute}}_{{\text{C}}} = {\left\{ {{\left( {c_{a} ,c_{b} } \right)}\left| {\forall s_{a} \in c_{a} ,\exists s_{b} \in c_{b} ,{\left( {s_{a} ,s_{b} } \right)} \in {\text{attribute}}} \right.} \right\}}}} \\ {{{\text{mode}}_{{\text{C}}} = {\left\{ {{\left( {c_{a} ,c_{b} } \right)}\left| {\forall s_{a} \in c_{a} ,\exists s_{b} \in c_{b} ,{\left( {s_{a} ,s_{b} } \right)} \in {\text{mode}}} \right.} \right\}}}} \\ \end{array} $$

(16)

For any two categories c_a and c_b, if they are in one of the above relations, that is, (c_a, c_b) ∈ whole_C ∪ substance_C ∪ attribute_C ∪ mode_C, then we say that c_b plays a role with respect to c_a, and c_b is called a seme of the category c_a.
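The category-level relations of Eq. 16 lift the sequence-level relations with a "for all s_a, there exists s_b" quantifier. The sketch below represents a category as a Python set of strings and assumes that substance, attribute, and mode match s_b against the first, second, and last third of s_a. The helper names `third_matches` and `lift` are hypothetical.

```python
def third_matches(sa, sb, k):
    """True if |sa| == 3|sb| and the k-th third of sa (k = 0, 1, 2) equals sb."""
    n = len(sb)
    return len(sa) == 3 * n and sa[k * n:(k + 1) * n] == sb


def substance(sa, sb):
    return third_matches(sa, sb, 0)


def attribute(sa, sb):
    return third_matches(sa, sb, 1)


def mode(sa, sb):
    return third_matches(sa, sb, 2)


def lift(relation):
    """Lift a sequence relation to categories: every sa in ca needs some sb in cb."""
    def relation_c(ca, cb):
        return all(any(relation(sa, sb) for sb in cb) for sa in ca)
    return relation_c


def whole_c(ca, cb):
    return set(ca) == set(cb)


substance_c = lift(substance)
attribute_c = lift(attribute)
mode_c = lift(mode)
```

For example, with c_a = {"SBT", "SAT"}, the category {"S"} is a substance seme of c_a, while {"B"} alone is not an attribute seme because "SAT" has "A" in its middle position.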

1.4 Model of catsets

A catset is a set of distinct categories of the same layer, as defined in Definition 6.

Definition 6 (Catset)

A catset κ is a set containing categories such that κ = {c_n | ∀ i ≠ j: c_i ≠ c_j, l(c_i) = l(c_j)}.

The layer of a catset is given by the layer of any of its members: if some c ∈ κ, then l(κ) = l(c). Note that a category c can be written as c ∈ C_l, while a catset κ can be written as κ ⊆ C_l. All standard set operations, such as union (∪) and intersection (∩), can be performed on catsets of the same layer.
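A catset can be modelled as a Python set of frozensets, which makes its members distinct by construction, so only the same-layer condition needs checking. This is a sketch under two assumptions that are not stated in the paper: an empty category has no determinable layer, and an empty collection is not treated as a catset. The function names are illustrative.

```python
def category_layer(category):
    """Layer shared by all sequences in the category, or None if ill-formed."""
    lengths = {len(s) for s in category}
    if len(lengths) != 1:
        return None          # empty category or mixed lengths
    n = lengths.pop()
    for l in range(7):       # layers 0..6
        if n == 3 ** l:
            return l
    return None


def is_catset(categories):
    """True if all categories are well formed and share one layer."""
    layers = {category_layer(c) for c in categories}
    return len(layers) == 1 and None not in layers


def catset_layer(categories):
    """Layer of a catset, given by the layer of any of its members."""
    return category_layer(next(iter(categories))) if is_catset(categories) else None
```

Because catsets are ordinary sets here, union and intersection of two catsets of the same layer are simply `k1 | k2` and `k1 & k2`.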

1.5 Model of uniform semantic locator

A USL is composed of up to seven catsets of different layers as follows:

Definition 7 (Uniform Semantic Locator, USL)

A USL υ is a set containing catsets of different layers such that υ = {κ_n | ∀ i ≠ j: l(κ_i) ≠ l(κ_j)}.

Note that since there are seven distinct layers, a USL can have at most seven members. All standard set operations on USLs, such as union (∪) and intersection (∩), are always performed on sets of categories (and therefore on sets of sequences), layer by layer. Since at each layer l there are |C_l| distinct catsets, the whole semantic space is defined by the tuple: Ł = C₀ × C₁ × C₂ × C₃ × C₄ × C₅ × C₆.
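One way to picture layer-by-layer operations on USLs is to store a USL as a mapping from layer number to catset. The sketch below makes several assumptions not found in the paper: a USL is a dict keyed by layer (0 to 6), a catset is a frozenset of categories, a category is a frozenset of strings, and layers whose intersection is empty are dropped from the result. It operates at the level of whole categories; the names `usl_union` and `usl_intersection` are hypothetical.

```python
def usl_union(u, v):
    """Union of two USLs, taken independently at each layer."""
    empty = frozenset()
    return {l: u.get(l, empty) | v.get(l, empty)
            for l in sorted(u.keys() | v.keys())}


def usl_intersection(u, v):
    """Intersection of two USLs, layer by layer; empty layers are dropped."""
    result = {}
    for l in sorted(u.keys() & v.keys()):
        common = u[l] & v[l]
        if common:
            result[l] = common
    return result


# Two small example USLs (layer -> catset).
cat_s = frozenset({"S"})
cat_bt = frozenset({"B", "T"})
cat_sbt = frozenset({"SBT"})

u = {0: frozenset({cat_s, cat_bt})}
v = {0: frozenset({cat_s}), 1: frozenset({cat_sbt})}

union_uv = usl_union(u, v)          # layers 0 and 1
common_uv = usl_intersection(u, v)  # only layer 0, containing cat_s
```

Keying the USL by layer keeps the "at most seven members" constraint visible in the data structure, since each layer can hold only one catset.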

About this article

Cite this article

Kim, HN., Roczniak, A., Lévy, P. et al. Social media filtering based on collaborative tagging in semantic space. Multimed Tools Appl 56, 63–89 (2012). https://doi.org/10.1007/s11042-010-0557-4

Published: 26 June 2010
Issue Date: January 2012
DOI: https://doi.org/10.1007/s11042-010-0557-4
