Reliable Online Social Network Data Collection

  • Chapter
Computational Social Networks

Abstract

Large quantities of information are shared through online social networks, making them attractive sources of data for social network research. When studying the usage of online social networks, however, these data may not properly describe users’ behaviours. For instance, the data collected often include only content shared by the users, or only content accessible to the researchers, hence obscuring a large amount of data that would help to understand users’ behaviours and privacy concerns. Moreover, the data collection methods employed in experiments may also affect data reliability when participants self-report inaccurate information or are observed while using a simulated application. Understanding the effects of these collection methods on data reliability is paramount for studying social networks, for understanding user behaviour, for designing socially aware applications and services, and for mining data collected from such social networks and applications. This chapter reviews previous research on social network data collection and user behaviour in these networks. We highlight shortcomings in the methods used in these studies and introduce our own methodology and user study based on the experience sampling method; we claim that our methodology leads to the collection of more reliable data by capturing both the data that are shared and the data that are not shared. We conclude with suggestions for collecting and mining data from online social networks.

Notes

  1. http://www.facebook.com/press/info.php?statistics

  2. That said, one of the most popular OSNs, Twitter, has recently made some effort to provide researchers with access to part of its data by donating an archive of public data to the US Library of Congress for preservation and research (http://blog.twitter.com/2010/04/tweet-preservation.html).

  3. Regional networks were removed from Facebook in 2009.

  4. http://www.openstreetmap.org/

  5. http://www.google.com/latitude/

  6. We did not mention the Federal government and Watergate, as they were not relevant to participants in the UK.

  7. To realistically simulate publishing for the simulation group, the information was published using Facebook’s “only visible to me” privacy option. Therefore, each user was able to see exactly the information which would have been shared.

  8. We conducted the experiment in four runs because of resource constraints: we had 20 mobile phones available, but 80 participants over the course of the experiment.

Author information

Corresponding author

Correspondence to Fehmi Ben Abdesslem.

Copyright information

© 2012 Springer-Verlag London

About this chapter

Cite this chapter

Abdesslem, F.B., Parris, I., Henderson, T. (2012). Reliable Online Social Network Data Collection. In: Abraham, A. (ed.) Computational Social Networks. Springer, London. https://doi.org/10.1007/978-1-4471-4054-2_8

  • DOI: https://doi.org/10.1007/978-1-4471-4054-2_8

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-4053-5

  • Online ISBN: 978-1-4471-4054-2

  • eBook Packages: Computer Science, Computer Science (R0)
