Abstract
We propose a semantic collaborative filtering method that uses user-generated tags to improve recommendation quality. Social tagging is used to capture and filter users’ preferences for items. In addition, we explore how semantic tagging helps with ambiguity, synonymy, and semantic interoperability, which are notable challenges in information filtering. The proposed approach first determines semantically similar users from their social tagging and then discovers semantically relevant items for each user. Experimental results show that our method offers significant advantages both in improving recommendation quality and in dealing with ambiguity, synonymy, and interoperability issues.
Notes
The star before an English expression marks a *tag, a natural language descriptor of an IEML expression. A *tag holds the place of an IEML expression by suggesting its meaning rather than uttering the IEML expression.
Detailed formal models are presented in Appendix A.
In IEML notation, the former “Java” can be expressed as (l.i.-k.i.-’)[Java], which means “Java as a geographic unit”, whereas the latter “Java” is (b.-’ b.e.-t.u.-wa.e.-’ E:T:.p.-’,)[Java], which means “Java as a programming language”.
A Cartesian product of two sets X and Y is written as follows: X × Y = {(x, y) | x∈X, y∈Y}.
A powerset of S is the set of all subsets of S, including the empty set ∅.
Acknowledgment
Since 2009, this work has been funded mainly by the Canada Research Chair in Collective Intelligence at the University of Ottawa.
Appendix A
1.1 IEML language model
We present the model of the IEML language, along with the model of semantic variables. Let $\Sigma$ be a nonempty, finite set of symbols, $\Sigma = \{S, B, T, U, A, E\}$. A string $s$ is a finite sequence of symbols chosen from $\Sigma$, and its length is denoted by $|s|$. The empty string $\varepsilon$ contains no symbols, so $|\varepsilon| = 0$. The set of all strings of length $k$ over $\Sigma$ is $\Sigma^k = \{s : |s| = k\}$. Note that $\Sigma^0 = \{\varepsilon\}$ and $\Sigma^1 = \{S, B, T, U, A, E\}$. Although $\Sigma$ and $\Sigma^1$ look identical, the members of the former are symbols and the members of the latter are strings. The set of all strings over $\Sigma$ is $\Sigma^* = \Sigma^0 \cup \Sigma^1 \cup \Sigma^2 \cup \Sigma^3 \cup \dots$
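The sets $\Sigma^k$ can be enumerated directly for small $k$. The following is a minimal sketch in Python; the names `SIGMA` and `strings_of_length` are illustrative and do not come from the paper.

```python
from itertools import product

# The six primitive symbols of the alphabet Sigma.
SIGMA = ("S", "B", "T", "U", "A", "E")

def strings_of_length(k):
    """Return Sigma^k: the set of all strings of length k over SIGMA."""
    return {"".join(symbols) for symbols in product(SIGMA, repeat=k)}

sigma_0 = strings_of_length(0)  # only the empty string
sigma_1 = strings_of_length(1)  # six one-symbol strings
sigma_2 = strings_of_length(2)  # 6 ** 2 strings

print(sigma_0 == {""}, len(sigma_1), len(sigma_2))
```

Since $|\Sigma^k| = 6^k$, full enumeration is only practical for small $k$; the sketch is meant to make the definitions concrete, not to materialize $\Sigma^*$.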
A useful operation on strings is concatenation, defined as follows. For any $s_i = a_1 a_2 a_3 a_4 \dots a_i \in \Sigma^*$ and $s_j = b_1 b_2 b_3 b_4 \dots b_j \in \Sigma^*$, $s_i s_j$ denotes the concatenated string $s_i s_j = a_1 a_2 a_3 a_4 \dots a_i b_1 b_2 b_3 b_4 \dots b_j$, with $|s_i s_j| = i + j$. The IEML language over $\Sigma$ is a subset of $\Sigma^*$, $L_{IEML} \subseteq \Sigma^*$:
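One way to write this subset is shown below. It is an inferred form rather than necessarily the authors' exact equation: it assumes that a sequence of layer $l$ consists of exactly $3^l$ primitives (suggested by the bound on $n$ in Section 1.2) and that there are seven layers, $0 \le l \le 6$ (as stated in Section 1.5).

```latex
L_{IEML} = \{\, s \in \Sigma^{*} \;\mid\; |s| = 3^{l},\ 0 \le l \le 6 \,\}
```

Under this reading, the admissible string lengths are 1, 3, 9, 27, 81, 243, and 729.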
1.2 Model of semantic sequences
Definition 3 (Semantic sequence)
A string $s$ is called a semantic sequence if and only if $s \in L_{IEML}$.
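A membership test for Definition 3 can be sketched as follows. It rests on an assumption inferred from the rest of this appendix, namely that a sequence of layer $l$ has length $3^l$ with $0 \le l \le 6$; the function names are hypothetical.

```python
SIGMA = frozenset("SBTUAE")
MAX_LAYER = 6  # assumption: seven layers, numbered 0..6

def layer_of(s):
    """Return l if len(s) == 3**l for some 0 <= l <= MAX_LAYER, else None."""
    for l in range(MAX_LAYER + 1):
        if len(s) == 3 ** l:
            return l
    return None

def is_semantic_sequence(s):
    """True if s uses only symbols of SIGMA and has an admissible length."""
    return layer_of(s) is not None and set(s) <= SIGMA

print(is_semantic_sequence("SBT"), is_semantic_sequence("SB"))
```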
To denote the $n$th primitive of a sequence $s$, we use a superscript $n$, where $1 \le n \le 3^l$, and write $s^n$. Note that for any sequence $s$ of layer $l$, $s^n$ is undefined for any $n > 3^l$. Two semantic sequences are distinct if and only if at least one of the following holds: (i) their layers are different, (ii) they are composed of different primitives, (iii) their primitives do not follow the same order: for any $s_a$ and $s_b$,
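The indexing convention and the three distinctness conditions can be sketched in Python. This is an illustration only: it assumes that sequences are plain strings over $\Sigma$ and that sequences of different layers have different lengths; `primitive` and `are_distinct` are hypothetical names.

```python
def primitive(s, n):
    """Return s^n, the n-th primitive of s (1-indexed), or None if undefined."""
    if 1 <= n <= len(s):
        return s[n - 1]
    return None  # s^n is undefined beyond the length of the sequence

def are_distinct(s_a, s_b):
    """True if the sequences differ in layer, in primitives, or in their order."""
    if len(s_a) != len(s_b):
        return True  # condition (i): different layers
    # Conditions (ii) and (iii): same layer, so compare position by position.
    return any(
        primitive(s_a, n) != primitive(s_b, n) for n in range(1, len(s_a) + 1)
    )

print(are_distinct("SBT", "STB"), are_distinct("SBT", "SBT"))
```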
We now consider binary relations between semantic sequences in general. Each such relation is a subset of the Cartesian product of two sets (see the Cartesian product definition in the Notes). For any sets of semantic sequences $X$, $Y$ where $s_a \in X$, $s_b \in Y$, and using Eq. 2, we define four binary relations $\mathit{whole} \subseteq X \times Y$, $\mathit{substance} \subseteq X \times Y$, $\mathit{attribute} \subseteq X \times Y$, and $\mathit{mode} \subseteq X \times Y$ as follows:
Any two semantic sequences that are equal are in a whole relationship. In addition, any two semantic sequences that share specific subsequences may be in a substance, attribute, or mode relationship. For any two semantic sequences $s_a$ and $s_b$, if they are in one of the above relations, then we say that $s_b$ plays a role with respect to $s_a$, and we call $s_b$ a seme of sequence.
Definition 4 (Seme of a sequence)
For any semantic sequences $s_a$ and $s_b$, if $(s_a, s_b) \in \mathit{whole} \cup \mathit{substance} \cup \mathit{attribute} \cup \mathit{mode}$, then $s_b$ plays a role with respect to $s_a$ and $s_b$ is called a seme.
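The sketch below illustrates how such role relations could be checked. The formal definitions of the four relations are not reproduced in this text, so the code makes an explicit assumption: a sequence of layer $l \ge 1$ is read as three consecutive subsequences of layer $l - 1$, which play the substance, attribute, and mode roles in that order. The helper names are hypothetical.

```python
def thirds(s):
    """Split a sequence of layer >= 1 into three equal-length parts."""
    if len(s) < 3 or len(s) % 3 != 0:
        raise ValueError("sequence must have layer >= 1")
    k = len(s) // 3
    return s[:k], s[k:2 * k], s[2 * k:]

def roles(s_a, s_b):
    """Return the set of roles that s_b plays with respect to s_a."""
    found = set()
    if s_a == s_b:
        found.add("whole")
    if len(s_a) >= 3 and len(s_a) % 3 == 0:
        # Assumed order of the three parts: substance, attribute, mode.
        for name, part in zip(("substance", "attribute", "mode"), thirds(s_a)):
            if part == s_b:
                found.add(name)
    return found

def is_seme(s_a, s_b):
    """True if s_b is in at least one of the four relations with s_a."""
    return bool(roles(s_a, s_b))

print(roles("SBT", "B"), is_seme("SBT", "U"))
```

Note that one sequence can play several roles at once under this reading: for `"SSS"`, the sequence `"S"` is simultaneously substance, attribute, and mode.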
We can now group distinct semantic sequences together into sets. A useful grouping is based on the layer of those semantic sequences.
1.3 Model of semantic categories
A category of $L_{IEML}$ is a subset such that all strings of that subset have the same length:
Definition 5 (Semantic category)
A semantic category c is a set containing semantic sequences at the same layer.
The layer of any category $c$ is exactly the same as the layer of the semantic sequences included in that category. The set of all categories of layer $l$ is given as the powerset (see the powerset definition in the Notes) of the set of all strings of layer $l$ of $L_{IEML}$:
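To make the powerset construction concrete, the following sketch builds the set of all layer-0 categories. It assumes that the layer-0 sequences are exactly the six one-symbol strings; the names `powerset`, `category_layer`, and `C_0` are illustrative.

```python
from itertools import chain, combinations

SIGMA = ("S", "B", "T", "U", "A", "E")
LENGTH_TO_LAYER = {3 ** l: l for l in range(7)}  # assumption: |s| == 3**l

def powerset(items):
    """Return every subset of items, including the empty set, as frozensets."""
    items = list(items)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(subset) for subset in subsets}

def category_layer(c):
    """Return the layer shared by all sequences of a non-empty category c."""
    lengths = {len(s) for s in c}
    if len(lengths) != 1 or next(iter(lengths)) not in LENGTH_TO_LAYER:
        raise ValueError("a category needs sequences of one admissible length")
    return LENGTH_TO_LAYER[lengths.pop()]

C_0 = powerset(SIGMA)  # all categories of layer 0
print(len(C_0))        # 2 ** 6 categories
```

The number of categories grows as $2^{6^{3^l}}$ with the layer, so explicit enumeration is feasible only at layer 0.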
Two categories are distinct if and only if they differ by at least one element. For any $c_a$ and $c_b$:
A weaker condition can be applied to categories of distinct layers (since two categories are different if their layers are different) and is written as:
where $l(c_a)$ and $l(c_b)$ denote the layers of categories $c_a$ and $c_b$, respectively. Analogously to sequences, we consider binary relations between any categories $c_i$ and $c_j$ where $l(c_i), l(c_j) \ge 1$. For any sets of categories $X$, $Y$ where $c_a \in X$, $c_b \in Y$, we define four binary relations $\mathit{whole}_C \subseteq X \times Y$, $\mathit{substance}_C \subseteq X \times Y$, $\mathit{attribute}_C \subseteq X \times Y$, and $\mathit{mode}_C \subseteq X \times Y$ as follows:
For any two categories $c_a$ and $c_b$, if they are in one of the above relations, $(c_a, c_b) \in \mathit{whole}_C \cup \mathit{substance}_C \cup \mathit{attribute}_C \cup \mathit{mode}_C$, then we say that $c_b$ plays a role with respect to $c_a$, and $c_b$ is called a seme of category.
1.4 Model of catsets
A catset is a set of distinct categories of the same layer, as defined in Definition 4.
Definition 6 (Catset)
A catset $\kappa$ is a set containing categories such that $\kappa = \{c_n \mid \forall i, j: c_i \neq c_j,\ l(c_i) = l(c_j)\}$.
The layer of a catset is given by the layer of any of its members: if some $c \in \kappa$, then $l(\kappa) = l(c)$. Note that a category $c$ can be written as $c \in C_l$, while a catset $\kappa$ can be written as $\kappa \subseteq C_l$. All standard set operations, such as union ($\cup$) and intersection ($\cap$), can be performed on catsets of the same layer.
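A catset maps naturally onto a set of sets. The sketch below represents a category as a `frozenset` of strings and a catset as a `frozenset` of categories, so distinctness of members is automatic. It assumes non-empty categories and the length convention $|s| = 3^l$; all function names are hypothetical.

```python
def seq_layer(s):
    """Layer l of a sequence, assuming len(s) == 3**l."""
    l, length = 0, 1
    while length < len(s):
        l, length = l + 1, length * 3
    if length != len(s):
        raise ValueError(f"{s!r} does not have an admissible length")
    return l

def make_catset(categories):
    """Build a catset: distinct categories that all share one layer."""
    catset = frozenset(frozenset(c) for c in categories)
    layers = {seq_layer(s) for c in catset for s in c}
    if len(layers) > 1:
        raise ValueError("all categories in a catset must have the same layer")
    return catset

def catset_layer(catset):
    """Layer of a non-empty catset, read from any sequence of any member."""
    return seq_layer(next(iter(next(iter(catset)))))

def catset_union(a, b):
    """Union of two catsets, allowed only when their layers agree."""
    if catset_layer(a) != catset_layer(b):
        raise ValueError("catsets must be of the same layer")
    return a | b

a = make_catset([{"SBT"}, {"UAE", "SBT"}])
b = make_catset([{"SBT"}, {"EEE"}])
print(len(catset_union(a, b)), len(a & b))
```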
1.5 Model of uniform semantic locator
A USL is composed of up to seven catsets of different layers as follows:
Definition 7 (Uniform Semantic Locator, USL)
A USL $\upsilon$ is a set containing catsets of different layers such that $\upsilon = \{\kappa_n \mid \forall i, j: l(\kappa_i) \neq l(\kappa_j)\}$.
Note that since there are seven distinct layers, a USL can have at most seven members. All standard set operations on USLs, such as union ($\cup$) and intersection ($\cap$), are always performed on sets of categories (and therefore on sets of sequences), layer by layer. Since at each layer $l$ there are $|C_l|$ distinct catsets, the whole semantic space is defined by the tuple: $Ł = C_0 \times C_1 \times C_2 \times C_3 \times C_4 \times C_5 \times C_6$.
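Layer-by-layer operations on USLs can be sketched with a dictionary that maps each layer to its catset, which enforces at most one catset per layer. This representation and the function names are assumptions made for illustration; categories are `frozenset`s of strings and catsets are `frozenset`s of categories.

```python
LAYERS = range(7)  # the seven layers 0..6

def make_usl(catsets_by_layer):
    """Validate and copy a {layer: catset} mapping."""
    for layer in catsets_by_layer:
        if layer not in LAYERS:
            raise ValueError(f"layer {layer} is outside 0..6")
    return dict(catsets_by_layer)

def usl_union(u, v):
    """Union of two USLs, taken separately at each layer."""
    empty = frozenset()
    return {l: u.get(l, empty) | v.get(l, empty) for l in sorted(set(u) | set(v))}

def usl_intersection(u, v):
    """Intersection of two USLs, keeping only layers with common categories."""
    result = {}
    for l in sorted(set(u) & set(v)):
        common = u[l] & v[l]
        if common:
            result[l] = common
    return result

cat = frozenset  # shorthand for building categories and catsets
u = make_usl({0: cat({cat({"S"})}), 1: cat({cat({"SBT"})})})
v = make_usl({0: cat({cat({"B"})}), 1: cat({cat({"SBT"})})})

print(len(usl_union(u, v)[0]), sorted(usl_intersection(u, v)))
```

In this example the two USLs share a category only at layer 1, so the intersection keeps that layer alone, while the union holds two categories at layer 0.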
Kim, HN., Roczniak, A., Lévy, P. et al. Social media filtering based on collaborative tagging in semantic space. Multimed Tools Appl 56, 63–89 (2012). https://doi.org/10.1007/s11042-010-0557-4
DOI: https://doi.org/10.1007/s11042-010-0557-4