FamilyID: A Hybrid Approach to Identify Family Information from Microblogs

Conference paper

DOI: 10.1007/978-3-319-20810-7_14

Part of the Lecture Notes in Computer Science book series (LNCS, volume 9149)
Cite this paper as:
Gopal J., Huang S., Luo B. (2015) FamilyID: A Hybrid Approach to Identify Family Information from Microblogs. In: Samarati P. (eds) Data and Applications Security and Privacy XXIX. DBSec 2015. Lecture Notes in Computer Science, vol 9149. Springer, Cham

Abstract

With the growing popularity of social networks, an extremely large number of users routinely post messages about their daily lives to online social networking services. In particular, we have observed that family-related information, including some very sensitive information, is freely available and easily extracted from Twitter. In this paper, we present a hybrid information retrieval mechanism, namely FamilyID, to identify and extract family-related information about a user from his/her microblogs (tweets). The proposed model takes into account part-of-speech tagging, pattern matching, lexical similarity, and semantic similarity of the tweets. Experimental results show that FamilyID provides both high precision and recall. We expect the project to serve as a warning to users that they may have accidentally revealed too much personal/family information to the public. It could also help microblog users evaluate the amount of information that they have already revealed.

Copyright information

© IFIP International Federation for Information Processing 2015

Authors and Affiliations

  1. IBM, San Jose, USA
  2. Microsoft, Seattle, USA
  3. Department of EECS, University of Kansas, Lawrence, USA
