Journal on Data Semantics, Volume 5, Issue 1, pp 3–17

Random Query Answering with the Crowd

Original Article


Random data generators play an important role in computer science and engineering because they are used to simulate reality in IT systems. Software random data generators are not reliable enough for critical applications because of their intrinsic determinism, while hardware random data generators are difficult to integrate into applications and are not always affordable. We present an approach that uses entropic data sources to perform random data generation. In particular, our approach exploits the chaotic phenomena that occur in the crowd. We extract these phenomena from social networks, since they reflect the behavior of the crowd. We have implemented the approach in a database system, RandomDB, to show its efficiency and flexibility compared with competing approaches. We ran RandomDB on data taken from Twitter, Facebook and Flickr. The experiments show that these social networks are sources of reliable randomness and that RandomDB is a system that can be used for the task. We hope that our experience will drive the development of a series of applications that reuse the same data in several different scenarios.


Keywords: Random number generation, Data randomization, RandomDB, Social phenomena, Crowdsourcing



Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. Dipartimento di Ingegneria, Università Roma Tre, Rome, Italy
