‘Bad’ distributions of good data: unusual statistics of structural databases

  • Original Research
Structural Chemistry

Abstract

Distributions of data taken from crystal structure databases display unusual statistical properties such as non-Gaussian shape, polymodality, and heavy tails. These features, typical of the statistics of numerical data in social systems, appear in larger sets (from hundreds to many thousands of points) of database-derived parameters. The non-classical statistics of physical data collected over many years reflect a strong impact of social factors (financial support, exchange of information, competition, etc.) on research activity. The values of a given structural parameter, determined (and deposited in a database) by different authors, are generally neither independent nor random; handling them with common statistical tools may lead to incorrect conclusions and wrong predictions. These statements are illustrated by several sets of reliable structural data whose statistics look ‘bad,’ or inconclusive, in contemporary structural chemistry.

Acknowledgments

The author is indebted to Viktor Rybakov and Vladimir Chernyshov for their kind help with numerical data and stimulating discussions.

Author information

Corresponding author

Correspondence to Yuri L. Slovokhotov.

Additional information

Dedicated to the memory of Prof. Oleg Valeryevich Shishkin.

About this article

Cite this article

Slovokhotov, Y.L. ‘Bad’ distributions of good data: unusual statistics of structural databases. Struct Chem 27, 389–400 (2016). https://doi.org/10.1007/s11224-015-0716-3

  • DOI: https://doi.org/10.1007/s11224-015-0716-3
