Abstract
Distributions of data taken from crystal structure databases display unusual statistical properties such as non-Gaussian shape, polymodality, and heavy tails. These features, typical of the statistics of numerical data in social systems, appear in larger sets (from hundreds to many thousands of points) of database-derived parameters. The non-classical statistics of physical data collected over many years reflect a strong impact of social factors (financial support, exchange of information, competition, etc.) on research activity. The values of a given structural parameter, determined (and deposited in a database) by different authors, are generally neither independent nor random; handling them with common statistical tools may lead to incorrect conclusions and wrong predictions. These statements are illustrated by several sets of reliable structural data whose statistics look ‘bad’, or inconclusive, in contemporary structural chemistry.
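As a rough illustration of what "non-Gaussian shape" and "heavy tails" mean in practice, the sketch below computes the skewness and excess kurtosis of a sample; both are close to zero for Gaussian data, and a large positive excess kurtosis points to heavy tails. This is a minimal sketch and not the procedure used in the article: the function names are hypothetical, and the numbers in the usage note are synthetic, not values taken from any structural database.

```python
from statistics import fmean


def central_moment(values, order):
    """Population central moment of the given order (illustrative helper)."""
    mean = fmean(values)
    return fmean([(v - mean) ** order for v in values])


def skewness(values):
    """Third standardized moment; 0 for a symmetric sample."""
    m2 = central_moment(values, 2)
    return central_moment(values, 3) / m2 ** 1.5


def excess_kurtosis(values):
    """Fourth standardized moment minus 3; 0 for a Gaussian distribution."""
    m2 = central_moment(values, 2)
    return central_moment(values, 4) / m2 ** 2 - 3.0


# Synthetic sample (an assumption for illustration only): most points sit
# at the centre, with two far outliers that mimic a heavy-tailed shape.
synthetic_sample = [0.0] * 98 + [-10.0, 10.0]
print(skewness(synthetic_sample), excess_kurtosis(synthetic_sample))
```

For the synthetic sample the skewness is zero because the outliers are symmetric, while the excess kurtosis is strongly positive. Moment-based checks like these say nothing about polymodality, which would need a histogram or a density estimate.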
Acknowledgments
The author is indebted to Viktor Rybakov and Vladimir Chernyshov for their kind help with numerical data and stimulating discussions.
Additional information
Dedicated to the memory of Prof. Oleg Valeryevich Shishkin.
Cite this article
Slovokhotov, Y.L. ‘Bad’ distributions of good data: unusual statistics of structural databases. Struct Chem 27, 389–400 (2016). https://doi.org/10.1007/s11224-015-0716-3
DOI: https://doi.org/10.1007/s11224-015-0716-3