Abstract
The SunPy Project developed a 13-question survey to understand the software and hardware usage of the solar-physics community. In total, 364 members of the solar-physics community across 35 countries responded to our survey. We found that \(99\pm 0.5\)% of respondents use software in their research and 66% use the Python scientific-software stack. Students are twice as likely as faculty, staff scientists, and researchers to use Python rather than Interactive Data Language (IDL). In this respect, the astrophysics and solar-physics communities differ widely: 78% of solar-physics faculty, staff scientists, and researchers in our sample use IDL, compared with 44% of astrophysics faculty and scientists sampled by Momcheva and Tollerud (2015). \(63\pm 4\)% of respondents have not taken any computer-science courses at an undergraduate or graduate level. We also found that most respondents use consumer hardware to run software for solar-physics research. Although 82% of respondents work with data from space-based or ground-based missions, some of which (e.g. the Solar Dynamics Observatory and Daniel K. Inouye Solar Telescope) produce terabytes of data a day, 14% use a regional or national cluster, 5% use a commercial cloud provider, and 29% use exclusively a laptop or desktop. Finally, we found that \(73\pm 4\)% of respondents cite scientific software in their research, although only \(42\pm 3\)% do so routinely.
1 Introduction
The SunPy Project (The SunPy Community et al., 2020) facilitates and promotes the use and development of community-led, free, and open source (Footnote 1) data-analysis software for solar physics based on the scientific Python environment. To better understand the software and hardware preferences of the solar-physics community, the Project developed a 13-question survey (reproduced in Appendix A) and disseminated it internationally (Footnote 2) over a six-month period between 7 February 2019 and 28 July 2019.
Many of the survey questions were similar (and in some cases, identical) to those posed by Momcheva and Tollerud (2015) in an informal survey of 1142 members of the astrophysics community. The SunPy Project did this deliberately to compare software preferences between the solar and astrophysics communities.
This article presents the survey results, derived from analyzing 364 responses from community members across 35 countries. All of the survey responses, along with the code (Reback et al., 2020; Caswell et al., 2020; Waskom et al., 2020; van der Walt, Colbert, and Varoquaux, 2011; Bobra, Mumford, and Pereira, 2020) to analyze these data and produce the figures in this article, are publicly available at github.com/sunpy/survey.
2 Demographics
Since the SunPy Project relies largely on volunteer efforts, we chose to construct and disseminate this survey ourselves (instead of going through a formal channel such as the Statistical Research Center at the American Institute of Physics). As a result, we recognize that this survey may suffer from coverage error.
Our survey garnered 368 responses. Most of the survey respondents fit into one of four career stages: 56% (\(n=205\)) described themselves as a faculty member, staff scientist, or researcher, 15% (\(n=53\)) as a postdoc, 23% (\(n=84\)) as an undergraduate or graduate student, and 6% (\(n=22\)) as a software or instrument developer. This adds up to \(n=364\). Four respondents did not fit into any career stage, and we dropped their responses from our analysis.
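The career-stage percentages above follow directly from the reported counts. The short sketch below reproduces that arithmetic; the variable names are ours, chosen for illustration, and are not taken from the analysis code at github.com/sunpy/survey.

```python
# Respondent counts per self-identified career stage, as reported in the text.
career_stage_counts = {
    "faculty member, staff scientist, or researcher": 205,
    "postdoc": 53,
    "undergraduate or graduate student": 84,
    "software or instrument developer": 22,
}

total_responses = 368                         # all responses received
analyzed = sum(career_stage_counts.values())  # responses kept for the analysis
dropped = total_responses - analyzed          # responses outside the four stages

# Percentage of the analyzed sample in each career stage, rounded to an integer.
percentages = {
    stage: round(100 * count / analyzed)
    for stage, count in career_stage_counts.items()
}

for stage, pct in percentages.items():
    print(f"{stage}: {pct}%")
print(f"analyzed: {analyzed}, dropped: {dropped}")
```

All percentages quoted in the remainder of this article use the 364 analyzed responses as the denominator.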
Community members across 35 countries (Footnote 3) responded to our survey. About three-quarters of the respondents came from the US, UK, Germany, India, and Japan. Together, these five countries include about 1150 solar physicists (Footnote 4); therefore, our survey sampled roughly a quarter of the solar-physics community. Our results are based on the assumption that our sample is representative of the solar-physics community overall.
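The "roughly a quarter" estimate can be checked with a few lines of arithmetic. The sketch below is an illustration only: it uses the membership figures quoted in the Notes, treats "about three-quarters" as exactly 0.75 (an assumption), and ignores any overlap in membership between societies.

```python
# Membership figures quoted in the Notes; several are approximate.
community_sizes = {
    "AAS Solar Physics Division": 521,
    "UK Solar Physics": 150,                              # "over 150 scientists"
    "European Solar Physics Division": 222,
    "Astronomical Society of India": 100,                 # approximate
    "Astronomical Society of Japan (Renraku-kai)": 150,   # approximate
}

n_respondents = 364
top_five_share = 0.75  # assumption: "about three-quarters" taken as exactly 0.75

population = sum(community_sizes.values())   # about 1150 solar physicists
sampled = top_five_share * n_respondents     # respondents from these countries
sampling_fraction = sampled / population

print(f"population: {population}")
print(f"sampling fraction: {sampling_fraction:.2f}")
```

Under these assumptions the sampling fraction comes out slightly below one quarter, consistent with the estimate in the text.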
We asked respondents to identify all of the areas of research relevant to their career. Most respondents identified multiple sub-disciplines of expertise. We found that 76% (\(n=275\)) work with space-based observational data, 46% (\(n=169\)) work with ground-based observational data, and 26% (\(n=93\)) work on building instruments. A vast majority of respondents, 82%, work with ground-based or space-based data. 29% (\(n=105\)) identified theory as a relevant sub-discipline, and 47% (\(n=171\)) identified numerical simulations.
Most of the survey respondents (82%) chose to answer an optional question about whether they self-identified as an underrepresented minority; 16% of this subset (13% of the total sample) said yes. 79% of respondents chose to answer another optional question about whether they self-identified as an underrepresented gender identity; 11% of this subset (9% of the total sample) said yes.
3 Software Tools
In our survey of the solar-physics community, we found that \(99\pm 0.5\)% of respondents use software in their research (Footnote 5). In a survey of the astrophysics community, Momcheva and Tollerud (2015) found that 100% of respondents use software in their research.
We asked users to list all of the scientific-software tools, including programming languages, software development tools, and data-analysis frameworks, that they utilized within the last year. We summarized their responses in Figure 1. We found that 66% of respondents use the Python scientific-software stack and 73% use IDL (Footnote 6). Overall, respondents listed 42 different software tools and the average respondent used five tools in the past year.
Figure 1. Summary of results for survey Question 9 “Which of the following [software tools] have you personally utilized in your work within the last year?” Results are grouped by self-identified career stage (Question 2). Respondents listed 42 different software tools; only tools used by 5% or more of respondents are shown.
We observe a stark contrast in usage between the two primary data-analysis languages in solar-physics research, Python and IDL, when viewed by respondent career stage. The earlier the career stage, the greater the percentage of Python users: 59% of faculty, staff scientists, and researchers, 75% of postdocs, and 79% of students use Python. The earlier the career stage, the fewer IDL users: 78% of faculty, staff scientists, and researchers, 75% of postdocs, and 60% of students use IDL.
Of course, these tools are not necessarily used in isolation – about half (45%) of respondents use both Python and IDL. Figure 2 shows that 28% of respondents use IDL exclusively (in other words, they use IDL and do not use Python), while 21% use Python exclusively. The ratio of exclusive IDL users to exclusive Python users is roughly 2:1 for faculty, staff, and research scientists and the opposite, 1:2, for students.
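The exclusive and shared usage figures above are consistent with the overall Python and IDL percentages reported earlier, as the following sketch shows. It works only with the rounded percentages quoted in the text, so the implied share of respondents who use neither language is approximate and is our inference rather than a number reported by the survey.

```python
# Percentages of all respondents, taken from the text (rounded values).
both = 45         # use both Python and IDL
idl_only = 28     # use IDL and do not use Python
python_only = 21  # use Python and do not use IDL

idl_total = both + idl_only        # everyone who uses IDL
python_total = both + python_only  # everyone who uses Python

# Respondents using at least one of the two languages, and the remainder.
either = both + idl_only + python_only
neither = 100 - either  # implied by the rounded figures, not reported directly

print(f"IDL users: {idl_total}%, Python users: {python_total}%")
print(f"either: {either}%, neither (implied): {neither}%")
```

The totals recover the 73% IDL usage and 66% Python usage quoted in the survey summary.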
Figure 10 of Momcheva and Tollerud (2015) shows that Python is not only the most popular programming language within their sample of the astrophysics community, but it is also the most popular within every individual career category. Our survey results show that Python is the most popular programming language only among students; IDL and Python are at parity for postdocs, and IDL is more popular than Python for faculty, staff scientists, researchers, software developers, and instrument developers. In this respect, the astrophysics and solar-physics communities differ widely: 78% of solar-physics faculty, staff scientists, and researchers in our sample use IDL (Footnote 7), compared with 44% of astrophysics faculty and scientists sampled by Momcheva and Tollerud (2015).
The two groups of respondents share the same statistics, however, when it comes to writing software. In both the astrophysics and solar-physics communities, roughly a third of respondents write their own software most of the time (see Figure 3 of this article and Figure 3 of Momcheva and Tollerud, 2015). Furthermore, about 90% of respondents in both communities often or occasionally write their own software (see the same figures).
4 Education and Training
Although \(99\pm 0.5\)% of respondents use software in their research and \(91\pm 5\)% often or occasionally write their own software, \(63\pm 4\)% of respondents have not had any formal training (e.g. computer-science courses) at an undergraduate or graduate level. We found that people who write mostly their own software are no better trained than everyone else: \(44\pm 6\)% of people who write their own software reported “a lot (e.g. computer-science courses)” of formal training, compared with \(37\pm 3\)% overall. We also found that students today are twice as likely to have a lot of formal training in programming compared with faculty, researchers, and staff scientists (see Figure 4). The amount of training does not vary with area of expertise; each sub-discipline shows roughly the same amount of formal training as the general population (\(37\pm 3\)%).
5 Hardware Tools
We also found that most respondents utilize consumer hardware to run software for solar-physics research. Although 82% of respondents work with space-based or ground-based data, and some of these missions (e.g. the Solar Dynamics Observatory and Daniel K. Inouye Solar Telescope) produce terabytes of data per day, 14% use a regional or national cluster (Footnote 8) and 5% use a commercial cloud provider (see Figure 5). 29% use exclusively a laptop or desktop. The community puts considerable effort into maintaining clusters and workstations, with 40% of respondents using a shared workstation, 51% using a local cluster, and 96% using a laptop or desktop.
These percentages vary significantly by sub-discipline. A larger percentage of respondents in the numerical simulations and theory sub-disciplines use local clusters (63% and 60%, respectively, compared with 51% overall) and regional or national clusters (26% and 26%, respectively, compared with 14% overall) (Figure 6).
6 Citing Scientific Software
Figure 7 shows that \(73\pm 4\)% of respondents cite scientific software in their research, although only \(42\pm 3\)% do so routinely. Roughly a quarter (\(27\pm 3\)%) never cite scientific software in their research. When asked why, about half (\(53\pm 8\)%) responded that they do not know how to appropriately cite scientific software (see Figure 8); we note that only \(4\pm 1\)% of respondents do not think software belongs in citations.
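The uncertainties quoted above follow the square-root rule for counting experiments described in the Notes. The sketch below illustrates that rule; the function names are ours, and because we recover counts from the rounded percentages in the text (an assumption made for illustration), the exact counts in the analysis code at github.com/sunpy/survey may differ slightly.

```python
import math

N_RESPONDENTS = 364  # responses analyzed in the survey


def counting_error_percent(count, total=N_RESPONDENTS):
    """Square-root rule: a count n has uncertainty sqrt(n).

    Returns that uncertainty as a percentage of `total`.
    """
    return 100 * math.sqrt(count) / total


def error_from_percent(percent, total=N_RESPONDENTS):
    """Approximate the error for a response reported as a rounded percentage.

    The count is reconstructed from the percentage, which is an assumption.
    """
    count = round(percent / 100 * total)
    return counting_error_percent(count, total)


# Question 6: three "no" responses give the 0.5% uncertainty.
print(f"Question 6: {counting_error_percent(3):.1f}%")

# Software-citation percentages quoted in this section.
for percent in (73, 42, 27):
    print(f"{percent}% -> +/- {error_from_percent(percent):.0f}%")
```

For questions that allowed multiple selections (Questions 9 and 12), no such error is calculated, as stated in the Notes.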
7 Discussion
Scientific software is an indispensable component of the modern scientific research workflow (Rüde et al., 2018). Virtually all of the solar-physics community uses software in their research. Based on this fact, we find three of the statistics presented in this article worrisome. First, similar to the astrophysics community (Footnote 9), a significant fraction of the solar-physics community (\(63\pm 4\)% of respondents) have not taken any computer-science courses at an undergraduate or graduate level. Second, most of the solar-physics community (82% of respondents) works with space-based or ground-based facilities, several of which produce terabyte- or petabyte-sized data sets, and nearly a third of the community (29% of respondents) uses exclusively a laptop or desktop to run software for solar-physics research. It is unclear whether the computing power offered by laptops and desktops limits the type of scientific endeavors in solar physics. Finally, less than half of the community (\(42\pm 3\)% of respondents) routinely cites scientific software in their research.
The United States National Academies of Sciences, Engineering, and Medicine (2018) report entitled Open Source Software Policy Options for NASA Earth and Space Sciences recognizes the lack of education in software development among scientists. The report recommends initiating and sponsoring “programs to educate and train researchers in open source best practices,” suggesting topics such as “export controls, licensing and intellectual property, workflows, and software development.” This includes sponsoring community members to attend conferences about open-source software development, such as Python in Astronomy (openastronomy.org/pyastro) or Scientific Computing with Python (conference.scipy.org), take online courses about software development, available on learning platforms such as Coursera (coursera.org) and edX (edx.org), join workshops like those led by The Carpentries (carpentries.org), and develop training programs, such as the Large Synoptic Survey Telescope’s Data Science Fellowship program (astrodatascience.org). Our findings in Section 4 show that the solar-physics community could benefit immensely from education and training in open-source software.
The Ford Foundation’s report, entitled Roads and Bridges: The Unseen Labor Behind Our Digital Infrastructure (Eghbal, 2016), also suggests “expanding the pool of contributors so that more people, and more types of people, can build and sustain public software together.” Increasing the diversity of the talent pool, which is still lacking in the solar-physics community, will help sustain a long-term future for open source software in solar physics.
However, maximizing the scientific return of large data sets, such as those produced by the Solar Dynamics Observatory and the Daniel K. Inouye Solar Telescope, requires both skill in software development and computational resources. The United States National Academies of Sciences, Engineering, and Medicine (2020) report entitled Progress Toward Implementation of the 2013 Decadal Survey for Solar and Space Physics: A Midterm Assessment and the Kavli Foundation series of workshops called Petabytes To Science (Bauer et al., 2019) recommend adopting science platforms that co-locate both data and computational resources required to analyze these data. In this paradigm, users run software in an external computing environment where the data lives, instead of moving the data to a desktop or laptop where the software lives. The astrophysics community already developed several science platforms, such as the ASTRO Data Lab (datalab.noao.edu), run by the NSF’s National Optical-Infrared Astronomy Research Laboratory. We encourage the solar-physics community to fund the development of science platforms so that scientists are not restricted by the computational power of consumer hardware for analyses involving terabytes of data.
Finally, we recognize that software development, and hardware development, takes a vast amount of time. This time is rarely recognized by the academic community, which largely rewards publications. Therefore, we encourage the community to publish scientific software (by submitting articles that describe research software to refereed journals and archiving this software in publicly available digital repositories; see guides.github.com/activities/citable-code), cite scientific software (see Appendix B about how to cite scientific software), and count scientific software as a co-equal research artifact when considering career evaluation. This has two benefits: it gives academic credit and career recognition to those who write software and it makes it easier to reproduce studies in solar physics.
Some of the earliest advocates for scientific reproducibility, Claerbout and Karrenbach (1992) and Buckheit and Donoho (1995), suggested that a journal article “about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship.” The actual scholarship, they argue, is the code and development environment used to generate the results. Preserving these elements of scholarship requires tools like version control, which creates snapshots of software or data as they change over time. At the moment, less than half the community (44% of respondents) uses version control (Footnote 10). The United States National Academies of Sciences, Engineering, and Medicine (2019) report entitled Reproducibility and Replicability in Science recommends that “researchers should convey clear, specific, and complete information about any computational methods and data products that support their published results in order to enable other researchers to repeat the analysis,” including the data, study methods, and computational environment.
Scientists make a critical choice when selecting a computational environment, because the quality of our tools informs the quality of our research. A large fraction of the community uses the Python scientific-software stack (66% of respondents). This number will only grow over time, since Python is the most popular programming language among students in the solar-physics community (79% of students who took our survey use Python).
There are a number of reasons why the Python scientific-software stack is growing in prominence both in the solar-physics community and in many other scientific disciplines (Footnote 11). Interoperability between many packages for plotting, numerical methods, astronomy, statistics, and computing (e.g. Hunter, 2007; McKinney, 2010; Pedregosa et al., 2011; van der Walt, Colbert, and Varoquaux, 2011; VanderPlas et al., 2012; Rocklin, 2015; The Astropy Collaboration et al., 2018; Virtanen et al., 2020) allows researchers to write code with relative speed and ease. The rise of more than 50 packages in heliophysics alone (see heliopython.org) enables interdisciplinary analysis across traditionally isolated fields. The open-development model (Footnote 12), adopted by most of the scientific Python ecosystem, improves the longevity of software since anyone can contribute to the codebase and no single institution or person controls the software.
For these reasons, the United States National Academies of Sciences, Engineering, and Medicine (2018) report entitled Open Source Software Policy Options for NASA Earth and Space Sciences recommends that the “NASA Science Mission Directorate should explicitly recognize the scientific value of open source software and incentivize its development and support, with the goal that open source science software becomes routine scientific practice.” As the SunPy Advisory Board, we endorse this recommendation not only for the NASA Science Mission Directorate but for scientific funding agencies worldwide.
Notes
According to the Open Source Initiative, stewards of the Open Source Definition (available at opensource.org/osd), open source software consists of source code under an open source license. Open source licenses “allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.” In addition, open source software must not discriminate against persons, groups, or fields and the associated licenses must be non-specific, non-restrictive, and technology-neutral.
The following organizations advertised the survey to their members: UK Solar Physics, the European Physical Society’s Solar Physics Division, the American Astronomical Society’s Solar Physics Division, the solar-physics subdivision of the Astronomical Society of Japan, the Astronomical Society of India, and the Brazilian Astronomical Society. The SunPy Project also advertised the survey on the @SunPyProject Twitter account and sent it to the sunpy and sunpy-dev e-mail lists, both of which are public.
See the analysis code, available at github.com/sunpy/survey, for a full list.
The Solar Physics Division of the American Astronomical Society includes 521 members (private communication, S. Savage, 28 January 2020). The UK Solar Physics community estimates “over 150 scientists” on its website, uksolphys.org, as of 27 January 2020. The European Solar Physics Division counts 222 members (private communication, T.M.D. Pereira, 28 January 2020). The Astronomical Society of India includes approximately 100 solar physicists (private communication, D. Banerjee, 30 January 2020). The communication newsletter for the solar-physics subdivision of the Astronomical Society of Japan, called Renraku-kai, counts about 150 subscribers (private communication, K. Hayashi, 27 January 2020).
While three respondents did apparently indicate that they do not use software in their research, their further answers on the survey about software package usage suggest that those might have been erroneous responses.
Where relevant, we supplied our counting error for non-demographic software and hardware related questions (Questions 6 – 12). For Question 6, we report \(\sqrt{3}/364\), or 0.5%, as the percentage error in the number of no responses. Since this question required respondents to pick one response from a binary choice, we apply that same uncertainty to the yes responses. For Questions 7, 8, 10, and 11, which required respondents to pick only one response from a list of options, we quantified the percent error in each response simply by applying the square-root rule for counting experiments (Taylor, 1997). For Questions 9 and 12, which allowed respondents to select as many options as they liked, we do not calculate a percent error.
The use of IDL by the solar-physics community may be explained partly by how instrument teams provide their data. Many instrument teams provide data that have been calibrated to a low level, plus software that allows the data to be further calibrated for scientific use. The advantage of this model of scientific-data provision is that as knowledge of the instrument improves over time, the software can be updated to provide better high-level science-ready data products. A side-effect of this model of scientific-data provision is that scientific use of the data requires use of a particular package/language. Since many instrument teams chose to take advantage of the significant functionality provided by the SolarSoftWare (SSW: Freeland and Handy 1998) package, much of the software required to create higher-level data products is written in the primary language of SSW: IDL. Hence the model of scientific-data provision may explain why IDL is used by a significant proportion of respondents.
We recognize that some countries, such as the United States, require citizenship or permanent residence status to use these clusters.
Momcheva and Tollerud (2015) found that only \(8\pm 1\)% of the astrophysics community received substantial training; however, their question did not define “a lot” or “a little”.
We found that 44% of respondents selected the option “Github (or similar)” in Question 9. However, we realize this option is ambiguous. In retrospect, we should have provided “Git, Github, or similar” instead of “Github (or similar)” as an option in Question 9.
The number of contributors to the SunPy codebase grew by an average rate of one per month since 2011 (see The SunPy Community et al., 2020, Figure 1). According to the 2019 Stack Overflow developer survey, Python is the fastest-growing major programming language today (see insights.stackoverflow.com/survey/2019); furthermore, most universities use Python to teach computer science (Guo, 2014).
An open-development model goes beyond providing open-source software; it also includes making project-level decisions in publicly visible and accessible spaces, such as mailing lists, and inviting input from the user and developer communities (Tollerud et al., 2019).
References
Bauer, A.E., Bellm, E.C., Bolton, A.S., Chaudhuri, S., Connolly, A.J., Cruz, K.L., Desai, V., Drlica-Wagner, A., Economou, F., Gaffney, N., Kavelaars, J., Kinney, J., Li, T.S., Lundgren, B., Margutti, R., Narayan, G., Nord, B., Norman, D.J., O’Mullane, W., Padhi, S., Peek, J.E.G., Schafer, C., Schwamb, M.E., Smith, A.M., Tollerud, E.J., Weijmans, A.-M., Szalay, A.S.: 2019, Petabytes to science. arXiv . ADS .
Bobra, M., Mumford, S., Pereira, T.M.D.: 2020, sunpy/survey: survey v0.2.0 (2020-03-09), Zenodo. DOI .
Buckheit, J., Donoho, D.L.: 1995, Wavelab and Reproducible Research, Springer, Berlin. statweb.stanford.edu/~wavelab .
Caswell, T.A., Droettboom, M., Lee, A., Hunter, J., Firing, E., Stansby, D., Klymak, J., Hoffmann, T., de Andrade, E.S., Varoquaux, N., Nielsen, J.H., Root, B., Elson, P., May, R., Dale, D., Lee, J.-J., Seppänen, J.K., McDougall, D., Straw, A., Hobson, P., Gohlke, C., Yu, T.S., Ma, E., Vincent, A.F., Silvester, S., Moad, C., Kniazev, N., Ivanov, P., Ernest, E., Katins, J.: 2020, matplotlib/matplotlib v3.1.3, Zenodo. DOI .
Claerbout, J.F., Karrenbach, M.: 1992, Electronic documents give reproducible research a new meaning. In: Tech. Program Expanded Abs., Soc. Explor. Geophys., 601.
Eghbal, N.: 2016, Roads and bridges: the unseen labor behind our digital infrastructure. www.fordfoundation.org/work/learning/research-reports/roads-and-bridges-the-unseen-labor-behind-our-digital-infrastructure .
Freeland, S., Handy, B.N.: 1998, Data analysis with the SolarSoft system. Solar Phys. 182, 497. DOI .
Guo, P.: 2014, Python is now the most popular introductory teaching language at top U.S. universities. cacm.acm.org/blogs/blog-cacm/176450 .
Hunter, J.D.: 2007, Matplotlib: a 2d graphics environment. Comput. Sci. Eng. 9, 90. DOI .
McKinney, W.: 2010, Data structures for statistical computing in python. In: van der Walt, S., Millman, J. (eds.) Proc. 9th Python Science Conf., 51. DOI .
Momcheva, I., Tollerud, E.: 2015, Software use in astronomy: an informal survey, arXiv . ADS .
Mumford, S.J., Christe, S., Freij, N., Mayer, F., Hughitt, K., Ryan, D.F., Liedtke, S., Shih, A.Y., Pérez-Suárez, D., Chakraborty, P., Vishnunarayan, K.I., Inglis, A.R., Pattnaik, P., Sipőcz, B.M., Sharma, R., Leonard, D., Hewett, R.J., Alex-Ian-Hamilton, Stansby, D., Panda, A., Earnshaw, M., Choudhary, N., Kumar, A., Hayes, L., Chanda, P., Haque, M.A., Konge, S., mdmueller, Kirk, M., haathi: 2020, Sunpy, Zenodo. DOI .
National Academies of Sciences, Engineering, and Medicine: 2018, Open Source Software Policy Options for NASA Earth and Space Sciences, National Academies Press, Washington. ISBN 978-0-309-48271-4. DOI .
National Academies of Sciences, Engineering, and Medicine: 2019, Reproducibility and Replicability in Science, National Academies Press, Washington. ISBN 978-0-309-48616-3. DOI .
National Academies of Sciences, Engineering, and Medicine: 2020, Progress Toward Implementation of the 2013 Decadal Survey for Solar and Space Physics: A Midterm Assessment, National Academies Press, Washington. ISBN 978-0-309-67127-9. DOI .
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: 2011, Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825.
Reback, J., McKinney, W., jbrockmendel, den Bossche, J.V., Augspurger, T., Cloud, P., gfyoung, Sinhrks, Klein, A., Roeschke, M., Tratner, J., She, C., Hawkins, S., Ayd, W., Petersen, T., Schendel, J., Hayden, A., Garcia, M., MomIsBestFriend, Jancauskas, V., Battiston, P., Seabold, S., chris-b1, h-vetinari, Hoyer, S., Overmeire, W., alimcmaster1, Mehyar, M., Dong, K., Whelan, C.: 2020, pandas-dev/pandas: Pandas 1.0.1, Zenodo. DOI .
Rocklin, M.: 2015, Dask: parallel computation with blocked algorithms and task scheduling. In: Huff, K., Bergstra, J. (eds.) Proc. 14th Python Science Conf., 126. DOI .
Rüde, U., Willcox, K., McInnes, L.C., Sterck, H.D.: 2018, Research and education in computational science and engineering. SIAM Rev. 60, 707. DOI .
Taylor, J.: 1997, An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements, University Science Books, Sausalito.
The Astropy Collaboration, Price-Whelan, A.M., Sipőcz, B.M., Günther, H.M., Lim, P.L., Crawford, S.M., Conseil, S., Shupe, D.L., Craig, M.W., Dencheva, N., Ginsburg, A., VanderPlas, J.T., Bradley, L.D., Pérez-Suárez, D., de Val-Borro, M., Aldcroft, T.L., Cruz, K.L., Robitaille, T.P., Tollerud, E.J., Ardelean, C., Babej, T., Bach, Y.P., Bachetti, M., Bakanov, A.V., Bamford, S.P., Barentsen, G., Barmby, P., Baumbach, A., Berry, K.L., Biscani, F., Boquien, M., Bostroem, K.A., Bouma, L.G., Brammer, G.B., Bray, E.M., Breytenbach, H., Buddelmeijer, H., Burke, D.J., Calderone, G., Rodríguez, J.L.C., Cara, M., Cardoso, J.V.M., Cheedella, S., Copin, Y., Corrales, L., Crichton, D., D’Avella, D., Deil, C., Depagne, É., Dietrich, J.P., Donath, A., Droettboom, M., Earl, N., Erben, T., Fabbro, S., Ferreira, L.A., Finethy, T., Fox, R.T., Garrison, L.H., Gibbons, S.L.J., Goldstein, D.A., Gommers, R., Greco, J.P., Greenfield, P., Groener, A.M., Grollier, F., Hagen, A., Hirst, P., Homeier, D., Horton, A.J., Hosseinzadeh, G., Hu, L., Hunkeler, J.S., Ivezić, Ž., Jain, A., Jenness, T., Kanarek, G., Kendrew, S., Kern, N.S., Kerzendorf, W.E., Khvalko, A., King, J., Kirkby, D., Kulkarni, A.M., Kumar, A., Lee, A., Lenz, D., Littlefair, S.P., Ma, Z., Macleod, D.M., Mastropietro, M., McCully, C., Montagnac, S., Morris, B.M., Mueller, M., Mumford, S.J., Muna, D., Murphy, N.A., Nelson, S., Nguyen, G.H., Ninan, J.P., Nöthe, M., Ogaz, S., Oh, S., Parejko, J.K., Parley, N., Pascual, S., Patil, R., Patil, A.A., Plunkett, A.L., Prochaska, J.X., Rastogi, T., Janga, V.R., Sabater, J., Sakurikar, P., Seifert, M., Sherbert, L.E., Sherwood-Taylor, H., Shih, A.Y., Sick, J., Silbiger, M.T., Singanamalla, S., Singer, L.P., Sladen, P.H., Sooley, K.A., Sornarajah, S., Streicher, O., Teuben, P., Thomas, S.W., Tremblay, G.R., Turner, J.E.H., Terrón, V., van Kerkwijk, M.H., de la Vega, A., Watkins, L.L., Weaver, B.A., Whitmore, J.B., Woillez, J., Zabalza, V.: 2018, The astropy project: building an open-science project and status of the v2.0 core package. Astron. J. 156, 123. DOI .
The SunPy Community, Barnes, W.T., Bobra, M.G., Christe, S.D., Freij, N., Hayes, L.A., Ireland, J., Mumford, S., Perez-Suarez, D., Ryan, D.F., Shih, A.Y., Chanda, P., Glogowski, K., Hewett, R., Hughitt, V.K., Hill, A., Hiware, K., Inglis, A., Kirk, M.S.F., Konge, S., Mason, J.P., Maloney, S.A., Murray, S.A., Panda, A., Park, J., Pereira, T.M.D., Reardon, K., Savage, S., Sipőcz, B.M., Stansby, D., Jain, Y., Taylor, G., Yadav, T., Rajul, Dang, T.K.: 2020, The SunPy project: open source development and status of the version 1.0 Core package. Astrophys. J. 890, 68. DOI . ADS .
Tollerud, E., Smith, A., Price-Whelan, A., Cruz, K., Norman, D., Narayan, G., Mumford, S., Allen, A., Chan, C.-K., Cherinka, B., Drlica-Wagner, A., Foreman-Mackey, D., Ginsburg, A., Gradvhol, A., Harrington, J., Hogg, D., Jartaltepe, J., Kinney, J., Merchant, N., Momcheva, I., Murphy, N., Peek, J., Peeples, M.S., Pickering, T., Rodriguez, D., Shamir, L., Sinha, M., Sipőcz, B., Sobeck, J., Sosey, M., Stevance, H., Teuben, P., Vohl, D., Weiner, B., Aldcroft, T., Allen, A., Alpaslan, M., Anderson, L., Barentsen, G., Bektesevic, D., Benavides, J., Berriman, B., Blanton, M., Bosch, J., Bouquin, D., Bradley, L., Bryan, G., Burke, D., Burns, K., Buzasi, D., Cabral, J.B., Cardoso, J.V.d.M., Chen, B., Clarkson, W., Collins, M., Corrales, L., Craig, M., Crawford, S., Domagal-Goldman, S., Dong, C., Durbin, M., Faherty, J.K., Farr, W., Forschini, L., Golkhou, V.Z., Günther, H.M., Hafok, H., Hahn, C., Hathi, N., Hedges, C., Huang, S., Hummels, C., Hunt, E., Huppenkothen, D., Juneau, S., van Kerkwijk, M., Kerzendorf, W., Laginja, I., Law, C., de Leon, J., Li, T., Lim, P.L., Malz, A.I., Mao, Y.-Y., Melchior, P., Merin, B., Miller, B., Modjaz, M., Morton, T., Mullally, S., Ogando, R., Parejko, J.K., Paz, D., Pearson, S., Pontoppidan, K., Pope, B., Rapetti, D., Rawls, M., Read, J., Robitaille, T., Rudnick, G., Sharma, S., Sharma, S., Shupe, D., Speagle, J., Starkenburg, T., Stasyszyn, F., Streicher, O., Tremblay, G., Villaescusa-Navarro, F., Vos, J.M., Weaver, B.A., Weltman, A., Wetzel, A., Williams, P.K.G., Winkel, B.: 2019, Sustaining community-driven software for astronomy in the 2020s. Bull. Am. Astron. Soc. 51, 180. ADS .
van der Walt, S., Colbert, S.C., Varoquaux, G.: 2011, The numpy array: a structure for efficient numerical computation. Comput. Sci. Eng. 13, 22. DOI .
VanderPlas, J., Connolly, A.J., Ivezic, Z., Gray, A.: 2012, Introduction to astroML: machine learning for astrophysics. In: Proc. Conf. Intelligent Data Understanding (CIDU), IEEE Press, New York, 47. DOI . ADS .
Virtanen, P., Gommers, R., Oliphant, T.E., Haberland, M., Reddy, T., Cournapeau, D., Burovski, E., Peterson, P., Weckesser, W., Bright, J., van der Walt, S.J., Brett, M., Wilson, J., Jarrod Millman, K., Mayorov, N., Nelson, A.R.J., Jones, E., Kern, R., Larson, E., Carey, C., Polat, İ., Feng, Y., Moore, E.W., VanderPlas, J., Laxalde, D., Perktold, J., Cimrman, R., Henriksen, I., Quintero, E.A., Harris, C.R., Archibald, A.M., Ribeiro, A.H., Pedregosa, F., van Mulbregt, P., SciPy 1.0 Contributors: 2020, SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261. rdcu.be/b08Wh.
Waskom, M., Botvinnik, O., Ostblom, J., Lukauskas, S., Hobson, P., MaozGelbart, Gemperline, D.C., Augspurger, T., Halchenko, Y., Cole, J.B., Warmenhoven, J., de Ruiter, J., Pye, C., Hoyer, S., Vanderplas, J., Villalba, S., Kunter, G., Quintero, E., Bachant, P., Martin, M., Meyer, K., Swain, C., Miles, A., Brunner, T., O’Kane, D., Yarkoni, T., Williams, M.L., Evans, C.: 2020, mwaskom/seaborn: v0.10.0 (January 2020), Zenodo.
Acknowledgements
We thank Ivelina Momcheva, Erik Tollerud, and Nabil Freij for their help and guidance on this project, the numerous professional astronomy societies that helped publicize this survey, and everyone who took this survey.
Ethics declarations
Disclosure of Potential Conflicts of Interest
The authors, all members of the SunPy Board serving in a volunteer, unpaid capacity, declare that they have no conflicts of interest.
Appendices
Appendix A: Survey Questions
The full contents of the survey, distributed as a Google Form, appear below. All responses to Question 13, an optional question that solicited general, free-form comments, are publicly available at github.com/sunpy/survey.
- i) Which of these areas of solar physics do you work in? Check all that apply.
  - \(\square\) Observational (Space-Based)
  - \(\square\) Observational (Ground-Based)
  - \(\square\) Numerical Simulations
  - \(\square\) Theory
  - \(\square\) Instrumentation
- ii) How would you describe the stage of your career?
  - \(\bigcirc\) Undergraduate student
  - \(\bigcirc\) Graduate student
  - \(\bigcirc\) Postdoc
  - \(\bigcirc\) Faculty, Staff Scientist, Researcher
  - \(\bigcirc\) Software Developer
  - \(\bigcirc\) Instrument Developer
  - \(\bigcirc\) Retired
  - \(\bigcirc\) My role is something other than solar physics or software development
  - \(\bigcirc\) Other (Respondents can enter their own description.)
- iii) What country is your institution in? (Respondents check appropriate country from a list of options.)
- iv) Do you self-identify as one or more underrepresented minorities in solar physics? This question is optional.
  - \(\bigcirc\) Yes
  - \(\bigcirc\) No
- v) Do you self-identify as a unrepresented gender identity in Solar Physics? This question is optional.
  - \(\bigcirc\) Yes
  - \(\bigcirc\) No
- vi) Do you use software in your research?
  - \(\bigcirc\) Yes
  - \(\bigcirc\) No
- vii) Have you had formal training in programming?
  - \(\bigcirc\) Yes, a lot (e.g. CS courses at an undergraduate or graduate level)
  - \(\bigcirc\) Yes, a little (e.g. online classes, books, workshops)
  - \(\bigcirc\) No
- viii) Which of the following statements is most applicable to you?
  - \(\bigcirc\) I write mostly my own software.
  - \(\bigcirc\) I mostly use software written by others.
  - \(\bigcirc\) Somewhere in between.
- ix) Which of the following have you personally utilized in your work within the last year? Check all that apply.
  - \(\square\) IDL
  - \(\square\) SolarSoft
  - \(\square\) Python
  - \(\square\) SunPy
  - \(\square\) Shell Scripting
  - \(\square\) C
  - \(\square\) C++
  - \(\square\) Fortran
  - \(\square\) IRAF
  - \(\square\) Perl
  - \(\square\) Javascript
  - \(\square\) Julia
  - \(\square\) MATLAB
  - \(\square\) Java
  - \(\square\) R
  - \(\square\) SQL
  - \(\square\) Ruby
  - \(\square\) HTML/CSS
  - \(\square\) Spreadsheets (e.g. Excel)
  - \(\square\) Mathematica
  - \(\square\) MPI
  - \(\square\) Github (or similar)
  - \(\square\) Other (Respondents can enter their own description.)
- x) Have you cited software papers in your published research?
  - \(\bigcirc\) Yes
  - \(\bigcirc\) Sometimes
  - \(\bigcirc\) No
- xi) If ‘No’ for the previous question: Why haven’t you cited software in your research?
  - \(\bigcirc\) I am not sure how to appropriately cite software
  - \(\bigcirc\) I do not think it is necessary
  - \(\bigcirc\) I do not think software belongs in citations
- xii) On which of these have you run software for solar-physics research?
  - \(\square\) Laptop / Desktop computer
  - \(\square\) Shared workstation
  - \(\square\) Local Cluster
  - \(\square\) Regional or National Cluster
  - \(\square\) GPU
  - \(\square\) Commercial cloud
- xiii) Do you have any comments? (This is a free-form response; comments are not required. Please feel free to give us feedback about topics like: version control, collaborative coding platforms such as Github, standard or best practices in coding, operating systems, text editors, or your personal experience with writing code and releasing software, or general thoughts about SunPy).
Appendix B: Citing Scientific Software
To cite scientific software, please follow these two steps:
- i) Cite the refereed journal article describing the research software. To find this article, visit the website for a software package and look for citation instructions. For example, the SunPy website includes dedicated citation instructions and an associated BibTeX entry.
- ii) Cite the software archive. Publicly available digital repositories, such as Zenodo, issue a Digital Object Identifier (DOI) for archived software. (Some institutions also provide digital repositories as part of their library system.) Generally, open-source software projects in the Python scientific stack archive their software every time they release a new version. For example, the SunPy GitHub page includes the Zenodo DOI for the most recent release (as of this writing, v1.1.1); clicking on it leads to the Zenodo deposit, which provides an associated BibTeX entry.
Many projects release multiple versions of their software per year but publish a refereed journal article only occasionally (for example, the SunPy Project published an article about the v1.0 release but will not publish one about the v1.1.1 release). The journal article alone therefore does not identify which version produced a given result, so creating reproducible results requires citing both the journal article and the software archive. Here is an example: This research used version 1.1.1 (Mumford et al., 2020) of the SunPy open source software package (The SunPy Community et al., 2020).
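The two-part citation can also be assembled programmatically, which helps keep the stated version in step with the release actually used. The following is a minimal sketch only: the function `software_citation_statement` and its parameter names are hypothetical (they are not part of SunPy or any other package), and the citation strings are taken from the example sentence in this appendix.

```python
def software_citation_statement(package, version, archive_cite, article_cite):
    """Build a sentence citing both the archived release and the journal article.

    Hypothetical helper for illustration; not part of any real package.
    """
    # Both citations are required: the archive pins the exact version,
    # while the article credits the refereed description of the software.
    if not archive_cite or not article_cite:
        raise ValueError("cite both the software archive and the journal article")
    return (
        f"This research used version {version} ({archive_cite}) of the "
        f"{package} open source software package ({article_cite})."
    )


statement = software_citation_statement(
    package="SunPy",
    version="1.1.1",
    archive_cite="Mumford et al., 2020",
    article_cite="The SunPy Community et al., 2020",
)
print(statement)
```

In practice the version string would be read from the installed package rather than typed by hand (many Python packages expose it as a `__version__` attribute), so that the statement reflects the release that produced the results.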
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Bobra, M.G., Mumford, S.J., Hewett, R.J. et al. A Survey of Computational Tools in Solar Physics. Sol Phys 295, 57 (2020). https://doi.org/10.1007/s11207-020-01622-2