
Variations in Scientific Data Production: What Can We Learn from #Overlyhonestmethods?


In recent months the hashtag #overlyhonestmethods has steadily been gaining popularity. Posts under this hashtag—presumably by scientists—detail aspects of daily scientific research that differ considerably from the idealized interpretation of scientific experimentation as standardized, objective and reproducible. Over and above its entertainment value, the popularity of this hashtag raises two important points for those who study both science and scientists. Firstly, the posts highlight that the generation of data through experimentation is often far less standardized than is commonly assumed. Secondly, the popularity of the hashtag, together with its relatively blasé reception by the scientific community, reveals that the actions reported in the tweets are far from shocking and indeed may be considered just “part of scientific research”. Such observations give considerable pause for thought, and suggest that current conceptions of data might be limited by failing to recognize this “inherent variability” within the actions of generation—and thus within data themselves. Is it possible, we must ask, that epistemic virtues such as standardization, consistency, reportability and reproducibility need to be reevaluated? Such considerations are, of course, of particular importance to data sharing discussions and the Open Data movement. This paper suggests that the notion of a “moral professionalism” for data generation and sharing needs to be considered in more detail if the inherent variability of data is to be addressed in any meaningful manner.



  1. (accessed 27/05/2014).

  2. While it cannot, of course, be assumed that these tweets can themselves be relied upon as a source of data, they nonetheless remain an important indicator of the existence of variation in practice among scientists, and of the less-than-scientific approach to method exhibited by some. Furthermore, while the tweets also do not, of course, form a representative sample of the international science community, they have been received into the wider science arena predominantly with humour rather than shock.

  3. Although, of course, it must be noted that journals have widely varying requirements for the methods sections of papers. Some, it must be admitted, have very low reporting requirements that do not oblige authors to include much detail, while others have space restrictions that prevent extensive details from being included. Indeed, it may be suggested that these factors contribute to a prevailing culture of including only the minimal information.

  4. This issue is also being addressed by a number of other bodies, such as PLoS and the MIBBI project.

  5. For more discussion see (Accessed 18/08/2014).

  6. While the use of kits may offset some of these issues, the problems of inter-user variation, sample preparation and interpretation all remain considerations.

  7. A popular example might be For examples of DNA extraction protocols check out.

  8. By formal instruction I refer to dedicated lessons from a recognized instructor. Within laboratories many of the daily processes are taught on a far less formal basis. Unfortunately, despite being common practice, this aspect of laboratory life is poorly documented in published literature.

  9. Although buffers are, on the whole, very stable, buffers that have been kept for long periods (beyond the recommended disposal date) are liable to bacterial contamination, small amounts of precipitation that may change the concentrations of the reagents, and other problems.


  11. (Accessed 18/08/2014).

  12. Such as, for example, the NIH and NSF.

  13. “Cleaned” refers to the practice whereby raw data are structured, unnecessary information removed, outlying data removed, and so forth.

  14. Moreover, it necessitates that a differentiation be made between data that cannot be reproduced due to limitations in the way the methods are reported, and genuine non-reproducibility due to variations in the methods being overlooked, forgotten or unrepeatable.

  15. In this I differ from the philosophy of science discussions about “paradigms”, as I am not talking about how bad data get discarded from the body of knowledge, but about how reusable and reproducible data get sorted from those that cannot be reproduced.

  16. The integrity of peer review is further undermined by scepticism of reviewer competency. As one tweet says: “We did not make the corrections suggested by reviewer 1 because we think reviewer 1 is a f***ing idiot”.

  17. It is likely that further consideration of this field will find an interesting interface with discussions on methodological iteration (such as O’Malley et al. 2010). Both fields offer a means of considering how science is rooted in, and defined by, daily repetitive practices and not by the extraordinary circumstances of experimentation.


  • Begley, C. G., & Ellis, L. M. (2012). Drug development: Raise standards for preclinical cancer research. Nature, 483(7391), 531–533.


  • Ben-David, J., & Sullivan, T. A. (1975). Sociology of science. Annual Review of Sociology, 1(1), 203–222.


  • US National Committee for CODATA. (1997). Bits of power: Issues in global access to scientific data. Washington DC: National Academies Press.


  • Collins, H. M. (2001). Tacit knowledge, trust and the Q of Sapphire. Social Studies of Science, 31(1), 71–85.


  • Hayden, E. C. (2013). Weak statistical standards implicated in scientific irreproducibility. Nature, November 11, 2013.

  • Jones, N. L. (2007). A code of ethics for the life sciences. Science and Engineering Ethics, 13, 25–43.


  • Knoppers, B. M., Harris, J. R., Tasse, A. M., Budin-Ljøsne, I., Kaye, J., Deschenes, M., & Zawati, M. H. (2011). Towards a data sharing code of conduct for international genomic research. Genome Medicine, 3, 46.

  • Kuhn, T. (1962). The structure of scientific revolutions. Chicago: University of Chicago Press.


  • Mobley, A., Linder, S. K., Braeuer, R., Ellis, L. M., & Zwelling, L. (2013). A survey on data reproducibility in cancer research provides insights into our limited ability to translate findings from the laboratory to the clinic. PLoS One, 8(5), e63221.


  • Mole, B. M. (2013). Overly honest methods, The Scientist, January 10, 2013. Accessed 17 Dec 2014.

  • O’Malley, M., Elliott, K., & Burian, R. (2010). From genetic to genomic regulation: Iterative methods in miRNA research. Studies in History and Philosophy of Biological and Biomedical Sciences, 41, 407–417.


  • Ruben, A. (2014). Forgive me, scientists, for I have sinned. Science, May 20, 2014.

  • Smith, R. (2004). Scientific articles have hardly changed in 50 years. BMJ, 328, 1533.


  • Vasilevsky, N. A., Brush, M. H., Paddock, H., Ponting, L., Tripathy, S. J., et al. (2013). On the reproducibility of science: Unique identification of research resources in the biomedical literature. PeerJ, 1, e148.




Many thanks to Prof Brian Rappert, Dr Ann-Sophie Barwich and Ms Helena van der Vegt for their invaluable comments on this manuscript and subject. I also thank the two anonymous reviewers for their helpful contributions to the manuscript.

Author information



Corresponding author

Correspondence to Louise Bezuidenhout.


About this article


Cite this article

Bezuidenhout, L. Variations in Scientific Data Production: What Can We Learn from #Overlyhonestmethods?. Sci Eng Ethics 21, 1509–1523 (2015).



  • #Overlyhonestmethods
  • Research methods
  • Tacit knowledge
  • Moral professionalism
  • Data sharing
  • Scientific data
  • Open data