Abstract
Random data generators play an important role in computer science and engineering because IT systems rely on them to simulate the unpredictability of the real world. Software random data generators are not reliable enough for critical applications because they are intrinsically deterministic, while hardware random data generators are difficult to integrate into applications and are not always affordable. We present an approach that uses entropic data sources to perform the random data generation task. In particular, our approach exploits the chaotic phenomena that occur in crowds. We extract these phenomena from social networks, since social networks reflect crowd behavior. We have implemented the approach in a database system, RandomDB, to show its efficiency and flexibility compared with competing approaches. We evaluated RandomDB on data taken from Twitter, Facebook and Flickr. The experiments show that these social networks are viable sources of reliable randomness and that RandomDB is a system suited to the task. We hope that our experience will encourage the development of applications that reuse the same data across different scenarios.
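The idea of turning crowd-generated content into random data can be illustrated with a minimal sketch. It is not RandomDB's method, which the abstract does not detail. It assumes that a batch of social-network posts is already available as strings, and it uses SHA-256 purely as an illustrative conditioning step. The function name `random_bytes_from_posts` and the sample posts are hypothetical.

```python
import hashlib
from typing import Iterable


def random_bytes_from_posts(posts: Iterable[str], n_bytes: int) -> bytes:
    """Condense crowd-generated text into n_bytes of output by hashing.

    Assumption: the posts carry some unpredictability (wording, ordering).
    SHA-256 is an illustrative conditioner, not the paper's technique.
    """
    pool = hashlib.sha256()
    for post in posts:
        data = post.encode("utf-8")
        # Length-prefix each post so ("ab", "c") and ("a", "bc") hash differently.
        pool.update(len(data).to_bytes(4, "big"))
        pool.update(data)
    seed = pool.digest()

    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        # Hash the pooled digest with a counter to produce as many bytes as requested.
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n_bytes])


# Hypothetical sample input standing in for posts fetched from a social network.
sample_posts = ["rainy morning in Rome", "new photo uploaded!", "who is watching the match?"]
print(random_bytes_from_posts(sample_posts, 16).hex())
```

Stretching the pooled digest with a counter does not create entropy: the output is only as unpredictable as the posts fed into the pool, so a real system would need to estimate how much entropy the source actually provides.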
Notes
CAPTCHA: Completely Automated Public Turing test to tell Computers and Humans Apart.
In mathematics, a real-valued function f(x) defined on an interval is called convex if the line segment between any two points on the graph of the function lies on or above the graph. Equivalently, f(t x + (1 − t) y) ≤ t f(x) + (1 − t) f(y) for all x, y in the interval and all t in [0, 1].
Cite this article
De Virgilio, R., Maccioni, A. Random Query Answering with the Crowd. J Data Semant 5, 3–17 (2016). https://doi.org/10.1007/s13740-015-0051-2
DOI: https://doi.org/10.1007/s13740-015-0051-2