Invisible market for online personal data: An examination

Abstract

Despite the widespread knowledge that corporations collect and exchange user online personal data (OPD) between themselves in a market for OPD, there have been few attempts to systematically understand the nature and structure of these markets or answer basic questions about the behavior of parties in these markets. This paper addresses these questions using records of data sharing behavior by 218 websites across eight economic sectors. Two datasets, collected 4 years apart, are analyzed using social network analysis (SNA). Findings indicate linear preferential attachment is the most likely coordinating mechanism in the OPD market. Further, this market has a much higher number of brokers (intermediary corporations that facilitate exchange between other corporations) than comparable markets. Building on these findings, implications for research and practice are presented along with future research directions.

Author information

Corresponding author

Correspondence to David Agogo.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1

Dataset 1: Spring 2016

Lightbeam, originally called Collusion, was created by Atul Varma, a software developer at Mozilla. The application visualizes the network of websites that collect data about a user's browsing behavior on each page the user visits online. In February 2012, Gary Kovacs, Mozilla's CEO at the time, spoke about Collusion in a TED talk, which led to the plugin going viral. In September 2012, Mozilla, along with faculty and student researchers at Emily Carr University of Art + Design, extended the plugin, and it was relaunched as Lightbeam in 2013. The application was supported by the Ford Foundation and the Natural Sciences and Engineering Research Council (NSERC). The full reference for this application and its source code can be accessed from: https://github.com/mozilla/lightbeam/blob/master/doc/data_format.v1.1.md

The data retrieved from Lightbeam has the following layout:

[source, target, timestamp, contentType, cookie, sourceVisited, secure, sourcePathDepth, sourceQueryDepth, sourceSub, targetSub, method, status, cacheable]
For instance, ["nytimes.com", "doubleclick.net", 1456366106722, "text\/html", true, false, true, 1, 0, "www.", "cm.g.", "GET", 204, true, false]
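As a minimal sketch of how such a record could be read, the Python below pairs each value with the field names listed in the layout. The function name `parse_record` and the `extra` key are illustrative assumptions, not part of Lightbeam. The example row carries one more value than the layout names, so the sketch keeps the surplus value unnamed rather than guessing its meaning.

```python
# Field names exactly as listed in the layout above.
FIELDS = [
    "source", "target", "timestamp", "contentType", "cookie",
    "sourceVisited", "secure", "sourcePathDepth", "sourceQueryDepth",
    "sourceSub", "targetSub", "method", "status", "cacheable",
]


def parse_record(values):
    """Pair each value with its field name; keep unnamed trailing values."""
    record = dict(zip(FIELDS, values))
    # Any values beyond the named fields are stored under a hypothetical
    # "extra" key instead of being dropped or given an invented name.
    record["extra"] = list(values[len(FIELDS):])
    return record


# The example row from the text, written as a Python list.
example = ["nytimes.com", "doubleclick.net", 1456366106722, "text/html",
           True, False, True, 1, 0, "www.", "cm.g.", "GET", 204, True, False]

row = parse_record(example)
print(row["source"], "->", row["target"], "status", row["status"])
```

Reading records into named fields makes the later steps (selecting source and target, filtering on cookies) less error-prone than working with positional indexes.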

WHOIS Lookup is a query-and-response protocol used to query internet registry databases that store the registered users or assignees of an internet resource, such as a domain name, an IP address block, or an autonomous system.

Steps in creating the dataset:

  1. Selected 8 different economic sectors.

  2. Identified the top 20–25 ranked websites in each sector using Alexa.

  3. Visited only the homepage of each top-ranked website.

  4. Saved the data on websites that ‘talked’ to the visited page using Lightbeam.

  5. Retrieved the name of the corporation that owns each website in the dataset using a WHOIS lookup tool.
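The last two steps can be sketched as follows, assuming the saved connections are available as (source, target) domain pairs and the WHOIS results as a domain-to-owner mapping. The owner names, the helper `corporate_edges`, and the choice to skip connections within a single corporation are illustrative assumptions, not the procedure documented by the authors.

```python
from collections import Counter

# Placeholder WHOIS results (domain -> owning corporation); illustrative
# values only, not actual WHOIS output.
OWNERS = {
    "nytimes.com": "Corporation A",
    "doubleclick.net": "Corporation B",
    "google.com": "Corporation B",
}


def corporate_edges(connections, owners):
    """Count connections between distinct owning corporations."""
    edges = Counter()
    for source, target in connections:
        src_owner = owners.get(source, source)  # fall back to the domain name
        tgt_owner = owners.get(target, target)
        if src_owner != tgt_owner:  # assumption: ignore same-owner traffic
            edges[(src_owner, tgt_owner)] += 1
    return edges


connections = [
    ("nytimes.com", "doubleclick.net"),
    ("nytimes.com", "google.com"),
    ("google.com", "doubleclick.net"),
]
edges = corporate_edges(connections, OWNERS)
for (src, tgt), count in edges.items():
    print(src, "->", tgt, count)
```

Collapsing domains to their owners turns a website-level record of connections into a corporation-level edge list, which is the kind of input a social network analysis package expects.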

Dataset 2: Spring 2020

OpenWPM is an automated web privacy measurement framework that makes it easy to collect data from thousands to millions of websites. It is built on top of Firefox and runs in a windowed or windowless state, automatically crawling a supplied list of websites according to the supplied configuration. This tool is still in active development. The full reference for this application and its source code can be accessed from: https://github.com/mozilla/OpenWPM

The data retrieved from OpenWPM takes the form of an SQLite database with several tables. More information about the HTTP requests table can be found at this link: https://github.com/mozilla/OpenWPM/wiki/Instrumentation-Schema-Documentation#http-requests
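As a rough illustration of working with such a database, the sketch below uses Python's built-in `sqlite3` module on a small in-memory stand-in. The table name `http_requests` and the columns `top_level_url` and `url` are assumptions that should be checked against the schema documentation linked above, and the `.test` URLs are made up.

```python
import sqlite3
from urllib.parse import urlsplit

# Stand-in for a crawl database: a simplified table with two assumed columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE http_requests (top_level_url TEXT, url TEXT)")
conn.executemany(
    "INSERT INTO http_requests VALUES (?, ?)",
    [
        ("https://www.example-news.test/",
         "https://cdn.example-news.test/app.js"),
        ("https://www.example-news.test/",
         "https://ads.example-tracker.test/pixel"),
    ],
)


def host(url):
    """Return the host part of a URL, or an empty string if there is none."""
    return urlsplit(url).hostname or ""


# Collect unique (visited host, contacted host) pairs.
pairs = set()
for top_level_url, url in conn.execute(
    "SELECT top_level_url, url FROM http_requests"
):
    pairs.add((host(top_level_url), host(url)))
conn.close()

print(sorted(pairs))
```

Reducing hosts to registrable domains (for example, `cdn.example-news.test` to `example-news.test`) would need a public suffix list and is left out of this sketch.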

Steps in creating the dataset:

  1. Crawled the websites from dataset 1 using an OpenWPM script.

  2. Extracted the same information as used for dataset 1.

  3. Retrieved the name of the corporation that owns each website in the dataset using the WHOIS records collected for dataset 1.

  4. Updated the WHOIS records for those websites that were new in this dataset.
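Steps 3 and 4 amount to reusing existing ownership records and flagging the domains that still need a lookup. A minimal sketch, assuming the dataset 1 WHOIS records are held in a dictionary; the function name, domains, and owner labels are hypothetical.

```python
def split_known_and_new(domains, whois_records):
    """Reuse known owners and list domains that need a fresh WHOIS lookup."""
    known = {d: whois_records[d] for d in domains if d in whois_records}
    new = sorted(d for d in domains if d not in whois_records)
    return known, new


# Hypothetical records carried over from dataset 1.
records_2016 = {"tracker-one.test": "Corporation A"}
# Hypothetical third-party domains observed in dataset 2.
domains_2020 = {"tracker-one.test", "tracker-two.test"}

known, new = split_known_and_new(domains_2020, records_2016)
print(known)
print(new)
```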

Complete List of Websites Crawled.

Adult
adam4adam.com
adultfriendfinder.com
cam4.com
cams.com
clips4sale.com
digitalplayground.com
ebaumsworld.com
fetlife.com
flirt4free.com
freeones.com
imlive.com
literotica.com
livejasmin.com
mrskin.com
newgrounds.com
nudevista.com
playboy.com
xnxx.com
youporn.com
planetsuzy.org
squirt.org
furaffinity.net
e-hentai.org
manhunt.net
nhentai.net

eCommerce
amazon.com
bestbuy.com
bhphotovideo.com
costco.com
ebay.com
gap.com
hm.com
homedepot.com
etsy.com
groupon.com
ikea.com
kohls.com
lowes.com
macys.com
netflix.com
newegg.com
nike.com
nordstrom.com
overstock.com
sears.com
steampowered.com
target.com
walmart.com
wayfair.com
amazon.co.uk
Health
drugs.com
medscape.com
express-scripts.com
health.com
healthgrades.com
medicinenet.com
medscape.com
mensfitness.com
menshealth.com
mercola.com
myfitnesspal.com
prevention.com
psychologytoday.com
webmd.com
weightwatchers.com
cdc.gov
fda.gov
kaiserpermanente.org
mayoclinic.org
mayoclinic.org/diseases-conditions
ncbi.nlm.nih.gov/pmc/
nhs.uk
nih.gov
who.int
Hotels
conradhotels3.hilton.com/en/index.html
courtyard.marriott.com
doubletree3.hilton.com/en/index.html
embassysuites3.hilton.com/en/index.html
hamptoninn3.hilton.com/en/index.html
hiltongardeninn3.hilton.com/en/index.html
homewoodsuites3.hilton.com/en/index.html
www.ihg.com/crowneplaza/hotels/us/en/reservation
ihg.com/intercontinental/hotels/gb/en/reservation
marriott.com/towneplace-suites/travel.mi
starwoodhotels.com
starwoodhotels.com/alofthotels/index.html
starwoodhotels.com/design/index.html
starwoodhotels.com/element/index.html
starwoodhotels.com/fourpoints/index.html
starwoodhotels.com/lemeridien/index.html
starwoodhotels.com/luxury/index.html
starwoodhotels.com/sheraton/index.html
starwoodhotels.com/stregis/index.html
starwoodhotels.com/tributeportfolio/index.html
starwoodhotels.com/whotels/index.html
http://www3.hilton.com/en/index.html
http://jw.marriott.com/
http://renaissance-hotels.marriott.com
fairfieldinn.com/
hiltongrandvacations.com/
ihg.com/candlewood/hotels/us/en/reservation
ihg.com/holidayinn/hotels/us/en/reservation
ihg.com/holidayinnexpress/hotels/us/en/reservation
ihg.com/hotelindigo/hotels/us/en/reservation
ihg.com/staybridge/hotels/us/en/reservation
marriott.com
residenceinn.com
ritzcarlton.com
springhillsuites.com
News
go.com
accuweather.com
bloomberg.com
cbsnews.com
cnn.com
drudgereport.com
indiatimes.com
forbes.com
foxnews.com
google.com
reddit.com
huffingtonpost.com
cnn.com
nbcnews.com
yahoo.com
nytimes.com
reuters.com
shutterstock.com
theguardian.com
indiatimes.com
usatoday.com
weather.com
wsj.com
wunderground.com
bbc.co.uk
Social Media
badoo.com
classmates.com
facebook.com
fiverr.com
flickr.com
foursquare.com
hi5.com
hootsuite.com
myspace.com
twitter.com
twitter.com
couchsurfing.com
facebook.com
linkedin.com
pinterest.com
livejournal.com
meetup.com
ning.com
okcupid.com
google.com
skyrock.com
stumbleupon.com
tagged.com
xing.com
last.fm
Society
ancestry.com
yahoo.com
biblegateway.com
complex.com
correios.com.br
dailykos.com
digg.com
esquire.com
legacy.com
match.com
salon.com
siteadvisor.com
slate.com
sulekha.com
theguardian.com
aarp.org
europa.eu
europa.eu
change.org
irs.gov
jw.org
lds.org
nih.gov
japanpost.jp
state.gov
Insurance
aetna.com
aflac.com
allstate.com
anthem.com
aon.com
bcbsm.com
carefirst.com
cigna.com
esurance.com
farmers.com
geico.com
travelers.com
humana.com
libertymutual.com
massmutual.com
metlife.com
nationwide.com
progressive.com
prudential.com
statefarm.com
thehartford.com
usaa.com
vsp.com
fepblue.org
kaiserpermanente.org

Appendix 2

A t-test was performed to compare the incidence of different forms of brokerage between the observed network and the commensurate random networks. The full results of this test, by brokerage type and for each observed network, are shown below in Tables 6 and 7. Table 8 contains details of the companies with the most brokerage positions at both times.
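The exact form of the test is not spelled out here. One plausible reading, shown below as an assumption rather than the procedure actually used, is a one-sample t-test that treats the brokerage counts from the random networks as the sample and the observed count as the reference value. The function name and the example counts are hypothetical.

```python
import math
import statistics

def one_sample_t(random_counts, observed):
    """t statistic and degrees of freedom for mean(random_counts) versus observed.

    Assumes a one-sample design; requires at least two counts with nonzero variance.
    """
    n = len(random_counts)
    mean = statistics.mean(random_counts)
    sd = statistics.stdev(random_counts)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("random_counts has zero variance; t is undefined")
    t = (mean - observed) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical counts of one brokerage type in five random networks,
# compared with a hypothetical observed count of 20.
t_stat, df = one_sample_t([10, 12, 14, 12, 12], observed=20)
```

A large negative statistic, as in this made-up example, would indicate that the observed network has more brokers of that type than the random networks do on average.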

Table 6 T-tests comparing observed networks to random networks (Time 1)
Table 7 T-tests comparing observed networks to random networks (Time 2)
Table 8 Companies and the number of Brokerage positions occupied (Time 1)


Cite this article

Agogo, D. Invisible market for online personal data: An examination. Electron Markets (2020). https://doi.org/10.1007/s12525-020-00437-0


Keywords

  • Online personal data
  • Social network analysis
  • Cookie-syncing
  • Personal data markets
  • Third-party tracking

JEL classification

  • M15
  • L10