Challenges in Data Intensive Analysis at Scientific Experimental User Facilities

  • Kerstin Kleese van Dam
  • Dongsheng Li
  • Stephen D. Miller
  • John W. Cobb
  • Mark L. Green
  • Catherine L. Ruby
Chapter

Abstract

Today’s scientific challenges, such as routes to a sustainable energy future, materials by design, or biological and chemical environmental remediation methods, are complex problems that can only be addressed successfully by integrating a wide range of complementary expertise. Experimental and computational research methods can offer fundamental insights toward their solution.

Keywords

Experimental Facility; Data Ownership; Pacific Northwest National Laboratory; User Facility; Spallation Neutron Source
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgements

S.D.M. acknowledges that the research at Oak Ridge National Laboratory’s Spallation Neutron Source was sponsored by the Scientific User Facilities Division, Office of Basic Energy Sciences, U.S. Department of Energy.

S.D.M. and J.W.C. acknowledge that the submitted manuscript has been co-authored by a contractor of the U.S. Government under Contract No. DE-AC05-00OR22725. Accordingly, the U.S. Government retains a non-exclusive, royalty-free license to publish or reproduce the published form of this contribution, or allow others to do so, for U.S. Government purposes.

J.W.C. acknowledges that this material is based upon work supported by the National Science Foundation under Grant No. 050474. This research was supported in part by the National Science Foundation through TeraGrid resources provided by the Neutron Science TeraGrid Gateway.

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Kerstin Kleese van Dam (1)
  • Dongsheng Li (1)
  • Stephen D. Miller (2)
  • John W. Cobb (2)
  • Mark L. Green (3)
  • Catherine L. Ruby (3)

  1. Fundamental and Computational Science Department, Pacific Northwest National Laboratory, Richland, USA
  2. Data Systems Group, Neutron Scattering Science Division, Oak Ridge National Laboratory, Oak Ridge, USA
  3. Systems Integration Group, Tech-X Corporation, Williamsville, USA
