FamilyID: A Hybrid Approach to Identify Family Information from Microblogs

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9149)


With the growing popularity of social networks, an extremely large number of users routinely post messages about their daily lives to online social networking services. In particular, we have observed that family-related information, including some very sensitive information, is freely available and easily extracted from Twitter. In this paper, we present a hybrid information retrieval mechanism, namely FamilyID, to identify and extract family-related information about a user from his/her microblogs (tweets). The proposed model takes into account part-of-speech tagging, pattern matching, lexical similarity, and semantic similarity of the tweets. Experimental results show that FamilyID provides both high precision and high recall. We expect the project to serve as a warning to users that they may have accidentally revealed too much personal/family information to the public. It could also help microblog users evaluate the amount of information that they have already revealed.



Copyright information

© IFIP International Federation for Information Processing 2015

Authors and Affiliations

  1. IBM, San Jose, USA
  2. Microsoft, Seattle, USA
  3. Department of EECS, University of Kansas, Lawrence, USA
