Introduction

Statistical tables are ubiquitous in agricultural, biological, and biomedical studies (Steel et al. 1997). An example is shown in Table 1, which reports the effects of oral administration of interferon tau (IFNT) on concentrations of amino acids, glucose, lipids, and hormones in the plasma of Zucker diabetic fatty (ZDF) rats (Tekwe et al. 2013). Here, we focus on generating tables from one-way analysis of variance (ANOVA) models where measurements are summarized as mean ± SEM for each treatment group. Typically, post-hoc test results are also included in these tables using a letter-based algorithm (Piepho 2004) to indicate which treatment groups are significantly different. With this algorithm, treatment means are assigned letters (e.g., a, b, and c): means that are not significantly different share a common letter, whereas two treatments without a common letter differ significantly at the chosen level of significance (e.g., P ≤ 0.05 or ≤ 0.01). The Tukey-Kramer (TK), Student-Newman-Keuls (SNK), Fisher’s least significant difference (LSD), Duncan (DC), and Bonferroni (BF) tests are among the most popular multiple comparison procedures used in life science research (Steel et al. 1997), including amino acid biochemistry, nutrition, pharmacology, and physiology (Wang et al. 2014a,b; Wu and Meininger 1997; Wu 1997).
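To make the letter convention concrete, the following is a minimal Python sketch. It is not the algorithm of Piepho (2004), nor the code used by the software described in this paper. It simply enumerates every maximal set of mutually non-different groups and gives each set one letter, which is workable only for a handful of groups and may use more letters than strictly necessary. The function name `letter_display` and its input format are illustrative assumptions.

```python
from itertools import combinations
import string


def letter_display(groups, different_pairs):
    """Return {group: letters} so that two groups share a letter
    if and only if they are NOT significantly different.

    groups: list of group labels (hypothetical input format).
    different_pairs: iterable of (g1, g2) pairs declared significantly different.
    """
    diff = {frozenset(pair) for pair in different_pairs}

    def mutually_similar(subset):
        # True when no pair inside the subset was declared different.
        return all(frozenset(p) not in diff for p in combinations(subset, 2))

    # Brute force: every subset whose members are pairwise non-different.
    candidates = [
        set(s)
        for size in range(1, len(groups) + 1)
        for s in combinations(groups, size)
        if mutually_similar(s)
    ]
    # Keep only the maximal subsets; each one receives its own letter.
    maximal = [c for c in candidates if not any(c < other for other in candidates)]
    # Order the letters by the position of the groups in the input list.
    maximal.sort(key=lambda c: sorted(groups.index(g) for g in c))

    letters = {g: "" for g in groups}
    # Sketch limitation: at most 26 letters are assigned.
    for letter, members in zip(string.ascii_lowercase, maximal):
        for g in groups:
            if g in members:
                letters[g] += letter
    return letters


# Hypothetical example: only groups A and C differ significantly.
display = letter_display(["A", "B", "C"], [("A", "C")])
```

In this example, A and C receive different letters, while B shares one letter with each of them. Published tables often order the letters by the magnitude of the means; this sketch orders them by input position instead.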

Table 1 Effects of oral administration of IFNT on concentrations of amino acids, glucose, lipids and hormones in the plasma of ZDF rats

In this paper, we introduce two software programs for one-way ANOVA, freely available online at https://houssein-assaad.shinyapps.io/TableReport/ and https://houssein-assaad.shinyapps.io/SumAOV/. Within a few clicks, the software can generate publication-ready tables for multiple data sets and export them to Microsoft Word or any RTF reader, with all the post-hoc test results included. The software can also handle situations where only summary data are available (i.e., sample size, mean, and SD or SEM per group), without the need for the original individual observations. We believe that our new method will save biologists, and applied scientists in general, a considerable amount of time and spare them from entering by hand the superscript letters (see Table 1) derived from the appropriate statistical tests. This offers a distinct advantage over the traditional manual processes for computing and presenting results in tables, which are not only tedious but also prone to errors.

Several software packages can perform one-way ANOVA followed by post-hoc analysis (e.g., R, SAS, JMP, and SPSS). To our knowledge, none of them can export the multiple comparison results to an RTF reader in a format similar to that of Table 1 without advanced knowledge of the corresponding programming language. In addition, SAS, SPSS, and JMP are not free. The main challenge lies in exporting the superscripts used to summarize the significance results to an RTF reader. A simple Google search for the terms “ANOVA calculator” or “ANOVA from summary data” reveals many free web-based programs (endnote a) that can construct ANOVA tables based on either original or summary data. Despite their simple interface, these programs suffer from major drawbacks. The majority cannot perform post-hoc analysis of any kind. Additionally, none of them can export results to an RTF reader in a publication-ready format, making their usage by a broad community very unlikely. To overcome these disadvantages, we wrote our software in the R language (R Core Team 2014) and used the following R packages: gridExtra (Auguie 2012), XLConnect (Mirai Solutions GmbH 2014), agricolae (Mendiburu 2014), rtf (Schaffer 2013), and shiny (RStudio Inc 2013).

In the following sections, we introduce the necessary background material for one-way ANOVA coupled with multiple comparison techniques. The main goal is to highlight some of the limitations of the statistical tests included in the software. We also underline the assumptions required by one-way ANOVA and emphasize that the software should be used only when these assumptions are nearly satisfied. In addition, we present several options for preparing the data for input into the software. Toy datasets, which can be downloaded from the software webpage, are used throughout the paper to illustrate the functionality of our software. We also describe the different components of the software and the steps required to generate the tables in MS Word. Furthermore, we offer various tips and useful links to cover more input and output scenarios. Concluding remarks are given towards the end of this article.

Background and materials

1. One-way ANOVA

Here, we present a brief non-technical description of one-way ANOVA and introduce a few terms that will be used throughout the rest of this paper. One-way ANOVA, also known as single-factor ANOVA, involves the analysis of data sampled from two or more numerical populations (probability distributions). The characteristic that labels the different populations is called the factor under study. This factor variable can take different values known as factor levels. For example, in a published study involving dietary supplementation with 0, 0.5, 1, 2, and 4% monosodium glutamate to young pigs (Rezaei et al. 2013), the experiment consisted of one factor (i.e., monosodium glutamate) with five different levels. As another example, consider an experiment to assess the effect of four brands of automobile gasoline on engine operating efficiency (measured in mpg). Here, the brand of gasoline is the factor variable and it has four levels (the four brands). The response variable is the engine operating efficiency. One-way ANOVA assumes that the numerical populations or probability distributions of each factor level follow a normal distribution with a common variance, and differ only with respect to their means. Therefore, differences in the means reflect the effect of the factor levels, and it is for this reason that ANOVA focuses on the mean responses for the different factor levels. If the factor has only two levels, ANOVA is equivalent to an unpaired t-test comparing two group means. One-way ANOVA usually proceeds in two steps. First, it determines whether or not the factor level means are the same using an overall test. Second, if the factor level means differ, the researcher will conduct a follow-up analysis, known as post-hoc analysis, to examine how they differ. Our software offers a variety of statistical tests to perform pair-wise comparisons in the post-hoc analysis step.
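The overall test in the first step rests on the F statistic, which compares the variation between factor levels with the variation within them. The sketch below computes it from raw observations using only the Python standard library. The function name `one_way_anova` and the small data set are hypothetical, and the p-value (which requires the F distribution) is deliberately left out.

```python
from statistics import mean


def one_way_anova(samples):
    """Compute the one-way ANOVA F statistic.

    samples: dict mapping each factor level to its list of observations.
    Returns (F, df_between, df_within).
    """
    all_values = [x for obs in samples.values() for x in obs]
    grand_mean = mean(all_values)
    k = len(samples)          # number of factor levels
    n_total = len(all_values)

    # Between-level sum of squares: level means around the grand mean.
    ss_between = sum(
        len(obs) * (mean(obs) - grand_mean) ** 2 for obs in samples.values()
    )
    # Within-level sum of squares: observations around their own level mean.
    ss_within = sum(
        (x - mean(obs)) ** 2 for obs in samples.values() for x in obs
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical data: three factor levels, three observations each.
f_stat, df_b, df_w = one_way_anova(
    {"level1": [1, 2, 3], "level2": [2, 3, 4], "level3": [3, 4, 5]}
)
```

A large F relative to the F distribution with (df_between, df_within) degrees of freedom indicates that the factor level means are unlikely to be all equal.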

2. Multiple comparison methods

The main purpose of this section is to provide the reader with some insight into the limitations of the different testing procedures available in the software. For any testing problem, there are two types of errors (endnote b). A false positive (also called Type I error) occurs when we detect an effect that does not really exist. A false negative (Type II error) occurs when we fail to detect an effect that truly exists. Most of the classical multiple comparison procedures (MCP), such as the DC, SNK and LSD tests, control the Type I error [more precisely, the family-wise error rate (FWER), which is the probability of committing at least one Type I error in a series of hypothesis tests] in the weak sense. Namely, all computations (e.g., p-values) are conducted under the assumption that all null hypotheses are true. In practice, this assumption is rarely expected to hold, allowing the Type I error to exceed the usual 5% value. Therefore, a stronger control of the Type I error rate under less restrictive assumptions is often required. An MCP controls the Type I error rate in the strong sense if this error is controlled under any partial configuration of true and false hypotheses. While TK and BF do control the FWER in the strong sense, they have relatively low power. In other words, TK and BF rarely reject a null hypothesis that is true, but they may also fail to reject a null hypothesis that is false (both methods tend to produce conservative, i.e., inflated, adjusted p-values). A summary of the previous discussion is given in Table 2, which is taken from Christensen (2011) with some modifications. Ideally, one should choose a method that controls the FWER in the strong sense while achieving the highest possible power. Power can be increased by extending single-step (endnote c) testing procedures into stepwise procedures via a technique known as the closure principle (Bretz et al. 2010). For instance, the stepwise Holm procedure is an extension of the single-step BF test. By construction, such step-wise procedures are at least as powerful as the single-step procedures they extend and still control the FWER in the strong sense. The general recommendation is to use a testing procedure that controls the Type I error in the strong sense while accounting for logical constraints (endnote d) and potential correlation among the tests. The books by Westfall et al. (2011) and Bretz et al. (2010) offer a thorough and accessible introduction to MCPs. Furthermore, these books provide the necessary code in SAS and R, respectively.
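The relationship between the single-step BF adjustment and its stepwise Holm extension can be illustrated with a short Python sketch. The function names and the raw p-values are hypothetical; the adjustments follow the standard textbook definitions rather than any code taken from the software.

```python
def bonferroni(pvalues):
    """Single-step adjustment: multiply every p-value by the number of tests."""
    m = len(pvalues)
    return [min(1.0, m * p) for p in pvalues]


def holm(pvalues):
    """Step-down Holm adjustment.

    The smallest p-value is multiplied by m, the next by m - 1, and so on;
    a running maximum keeps the adjusted values monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * pvalues[i])
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical raw p-values from three pair-wise comparisons.
raw = [0.01, 0.02, 0.04]
alpha = 0.05
bf_rejections = sum(p < alpha for p in bonferroni(raw))
holm_rejections = sum(p < alpha for p in holm(raw))
```

With these illustrative values, Holm rejects more hypotheses than BF at the same α, and a Holm-adjusted p-value never exceeds the corresponding BF-adjusted one.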

Table 2 Summary of multiple comparison methods

The software

1. Working with software 1

Software 1 (https://houssein-assaad.shinyapps.io/TableReport/) can handle multiple scenarios; in each of them, the data must be arranged as described below for the software to produce the sought results without errors. For illustration purposes, toy data sets corresponding to each scenario can be downloaded (endnote e) from the software webpage under the “About” panel (see Figure 1). We distinguish the following settings:

Figure 1

A screenshot of software 1 for setting (S1).

  • (S1). A single data set arranged in one Excel sheet: The file should be saved as an Excel workbook ‘Filename.xls’ or ‘Filename.xlsx’, depending on which version of Microsoft Excel the researcher is using (see file ‘Single_data.xlsx’).

  • (S2). Multiple data sets arranged within multiple Excel sheets (one data set per sheet) and saved in one Excel workbook (see file ‘Multiple_data.xlsx’).

  • (S3). Single data set of summary measurements arranged in one Excel sheet (see file ‘Single_Summary_Data.xlsx’).

  • (S4). Multiple data sets of summary measurements in multiple Excel sheets (see file ‘Multiple_Summary_Data.xlsx’).

For the first two scenarios, data rows should correspond to different subjects or experimental units, whereas data columns should describe different variables. The Excel sheets must contain only the data set, without any comments or explanations (see file ‘Single_data.xlsx’ for an example). Also, an appropriate name should be assigned to each variable. Note that each data set (in one Excel sheet) should contain exactly one factor variable and at least one response variable. For instance, the file Single_data.xlsx contains a single data set with one factor variable (group) with four levels A, B, C and D and six response variables V1 to V6. In this case, the software will conduct six one-way ANOVAs, one for each response variable, and summarize the results in one table in a format similar to Table 1. It should be borne in mind that all six one-way ANOVAs share the same factor variable ‘group’. Missing values should be left as empty cells. The data set, within an Excel sheet, does not necessarily have to start from the top left cell (cell A1), as long as the tabular (rectangular) form is maintained.
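The layout described above (one row per experimental unit, one factor column, one or more response columns, empty cells for missing values) maps directly onto the per-group summaries reported in the output table. The following Python sketch shows that mapping for a single response variable. The rows are hypothetical, missing cells are represented by `None`, and the function name `summarize` is an assumption rather than part of the software.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(rows, factor, response):
    """Return {level: (n, mean, SEM)} for one response variable.

    rows: list of dicts, one per experimental unit.
    Missing values (None) are skipped; each level needs at least two values.
    """
    groups = {}
    for row in rows:
        value = row.get(response)
        if value is None:  # empty cell in the sheet
            continue
        groups.setdefault(row[factor], []).append(value)
    return {
        level: (len(values), mean(values), stdev(values) / sqrt(len(values)))
        for level, values in groups.items()
    }


# Hypothetical rows mimicking a sheet with a factor 'group' and a response 'V1'.
rows = [
    {"group": "A", "V1": 1.0},
    {"group": "A", "V1": 3.0},
    {"group": "B", "V1": 2.0},
    {"group": "B", "V1": None},  # missing value
    {"group": "B", "V1": 4.0},
]
summary = summarize(rows, "group", "V1")
```

Repeating this for every response column yields one table row per response variable, with the factor levels as columns.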

The last two scenarios are especially useful in cases where the original individual observations are not available, and where only the sample size, mean, and SD or SEM for each factor level are known. For example, this might happen if the researcher wants to analyze data that have been summarized in a submitted or published article. Refer to files Single_Summary_Data.xlsx and Multiple_Summary_Data.xlsx to prepare data for scenarios (S3) and (S4), respectively. Note that the software can acquire the mean and SEM or SD for each treatment group from the summarized table, but requires the user to enter the sample size. For example, consider the file Single_Summary_Data.xlsx, which has two response variables Var1 and Var2 and one factor variable with 4 levels L1 to L4. Once this file is uploaded, the software automatically detects the mean and SEM or SD for each group for all the response variables. All that is left is to specify the sample sizes as shown in Figure 2. If the design is balanced, enter the common value for sample size per group (e.g., 15). If the design is unbalanced, enter one value for each factor level, in the order they appear in the Excel file, separated by spaces (e.g., 15 14 15 16 for L1 to L4, respectively). Having borders around the researcher’s table cells does not affect the functionality of the software. In the next section, we present software 2, which offers a more user-friendly interface to deal with summary data. At first, it might seem that some effort is needed to get the summary data ready for the software in scenarios S3 and S4 (see Single_Summary_Data.xlsx). However, several free online programs (endnote f) are currently available to convert a PDF document, which is the standard format for submitted or published papers, to a Word file. Once the table is opened in Word, it can be copied to Excel and then loaded into the software after removing all the superscripts from the table. The latter procedure can be done easily using the “Find and Replace” feature (press CTRL + F to open it) in Excel by replacing all the superscripts with an empty space.
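That summary data suffice for one-way ANOVA can be seen from the following Python sketch, which rebuilds the F statistic from the sample size, mean, and SD or SEM of each group. It uses the standard identities SD = SEM × √n and SS_within = Σ (n − 1) SD². The function names, the parsing of the sample-size field, and the numbers are illustrative assumptions and do not reproduce the software's actual code.

```python
from math import sqrt


def expand_sample_sizes(text, k):
    """Parse a sample-size entry such as '15' (balanced) or '15 14 15 16'."""
    sizes = [int(token) for token in text.split()]
    if len(sizes) == 1:
        sizes = sizes * k  # one common value for all k groups
    if len(sizes) != k:
        raise ValueError(f"expected 1 or {k} sample sizes, got {len(sizes)}")
    return sizes


def anova_from_summary(ns, means, spreads, spread_is_sem=True):
    """Return (F, df_between, df_within, MSE) from per-group summaries."""
    # Convert SEM to SD when needed: SD = SEM * sqrt(n).
    sds = [s * sqrt(n) if spread_is_sem else s for s, n in zip(spreads, ns)]
    k, n_total = len(ns), sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total

    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))

    df_between, df_within = k - 1, n_total - k
    mse = ss_within / df_within
    return (ss_between / df_between) / mse, df_between, df_within, mse


# Hypothetical summary table: three groups, balanced design with n = 3, SD given.
ns = expand_sample_sizes("3", 3)
f_stat, df_b, df_w, mse = anova_from_summary(
    ns, means=[2.0, 3.0, 4.0], spreads=[1.0, 1.0, 1.0], spread_is_sem=False
)
```

The same call with `spread_is_sem=True` and SEM values of 1/√3 gives the same F statistic, which is why either SD or SEM can be supplied as long as the sample sizes are known.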

Figure 2

A screenshot of software 1 for setting (S3).

The user of our software should follow the steps below:

  1. Upload an Excel workbook (both .xls and .xlsx formats are supported) and select the level of significance α.

  2. Specify a statistical test to perform all pair-wise comparisons. Currently available are the Tukey-Kramer (also known as Tukey’s HSD) test, the Duncan test, the Student-Newman-Keuls (SNK) test, and the least significant difference (LSD) test. If the researcher selects the LSD test, multiple methods (e.g., Bonferroni and Holm) are available for adjusting p-values.

  3. Choose the table’s output format. Two formats are widely used in the literature. By selecting ‘Per group SEM’, the table will report the mean and SEM for each group (see Table 1). The ‘Pooled SEM’ option will report only the means for each group and one pooled SEM for all the treatment groups (see Table 3).

Table 3 Effects of oral administration of IFNT on concentrations of amino acids, glucose, lipids and hormones in the plasma of ZDF rats

  4. Choose a data format. For setting (S1) select “Full single data” and then specify the factor variable in the researcher’s data set (names are case-sensitive). For (S2) select “Workbook (multiple sheets)”. Choose “Summary data” for setting (S3), while for (S4) select “Workbook (multiple sheets)” and check the summary data checkbox. For both (S3) and (S4), the researcher has to provide the sample size per group, as well as SDs or SEMs in the summary data.

  5. Click on the green button to download the table with all statistical results included.

The publication-ready table for one-way ANOVA and multiple comparison results should now open in MS Word or in the default RTF reader on the researcher’s computer system. The table can now be edited as desired (e.g., adding rows, columns, and borders).

2. Working with software 2

Our main intention behind this software (https://houssein-assaad.shinyapps.io/SumAOV/) is to provide reviewers of scientific papers with a quick and simple tool to check the statistical results summarized in a certain table. This method might be cumbersome if used to check results in a relatively large table containing several response variables (e.g., Table 1) because results must be checked one row at a time. An efficient alternative is to consider using software 1 under the (S3) and (S4) settings, which allow researchers to feed the whole table to the software at once after removing the superscripts from it (after all, the main goal is to check whether these superscripts are correct!). The user should carry out these steps in the given order (see Figure 3):

Figure 3

A screenshot of software 2.

  1. Choose the level of significance α. By default, it equals 0.05.

  2. Indicate the number of treatment/group means to compare. Enough fields to input the researcher’s data will be available based on that number.

  3. Specify whether the researcher will provide SD or SEM for each group.

  4. Select a statistical test for pair-wise comparisons.

  5. Input the following:

    a. Sample size for each group, n1, n2, n3, etc.

    b. Mean (average) for each treatment group, mean1, mean2, mean3, etc.

    c. The SEM or SD for each treatment group, depending on the researcher’s selection in step 3.

The ANOVA table and a table containing all the pair-wise comparisons should now appear on the right (see Figure 3). Note that the results are automatically updated if the researcher changes any input (e.g., the statistical test or the sample sizes).
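As an indication of what a pair-wise comparison table can be built from, the sketch below computes the textbook LSD-type t statistic for every pair of groups from the same three inputs (sample size, mean, SD). It is a hedged illustration, not the software's implementation: the function name `pairwise_t` and the data are hypothetical, and converting each statistic to a p-value (using the t distribution with the within-group degrees of freedom) is left out.

```python
from itertools import combinations
from math import sqrt


def pairwise_t(ns, means, sds):
    """Return ({(i, j): t}, df_within) for all pairs of groups.

    Each t statistic uses the pooled mean squared error (MSE):
    t = (mean_i - mean_j) / sqrt(MSE * (1/n_i + 1/n_j)).
    """
    df_within = sum(ns) - len(ns)
    mse = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds)) / df_within
    stats = {}
    for i, j in combinations(range(len(ns)), 2):
        standard_error = sqrt(mse * (1 / ns[i] + 1 / ns[j]))
        stats[(i, j)] = (means[i] - means[j]) / standard_error
    return stats, df_within


# Hypothetical input for three groups (indices 0, 1, 2).
t_stats, df_w = pairwise_t(ns=[3, 3, 3], means=[2.0, 3.0, 4.0], sds=[1.0, 1.0, 1.0])
```

Procedures such as TK, SNK, or DC use the same differences and pooled MSE but compare them against different critical values.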

3. Caution regarding the names of variables

Spaces in variable names should be avoided, as they might cause the software to generate an error instead of a correct table output. Also, if a variable’s name in the dataset is longer than 10 characters, which might be the rule rather than the exception in many biological studies, the software will abbreviate the name. This can lead to ambiguous or unpleasant terms. We therefore advise researchers to assign their own descriptive abbreviations to variables with long names before loading their dataset into the software.
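Variable names can be screened for both issues before uploading. The following Python sketch is one way to do so; the function name `check_variable_names` and the example names are hypothetical, and the 10-character limit is taken from the text above.

```python
def check_variable_names(names, max_length=10):
    """Return {name: [issues]} for names that contain spaces or are too long."""
    problems = {}
    for name in names:
        issues = []
        if " " in name:
            issues.append("contains a space")
        if len(name) > max_length:
            issues.append(f"longer than {max_length} characters")
        if issues:
            problems[name] = issues
    return problems


# Hypothetical column headers from a data sheet.
flagged = check_variable_names(["Glucose", "Free fatty acids", "Triglycerides"])
```

Any name reported here is a candidate for a shorter, space-free abbreviation chosen by the researcher.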

4. Transposing the output table

The software generates a table for one-way ANOVA and multiple comparisons in a format similar to that of Table 1. The response variables are in different rows and the factor levels occupy different columns. We do not include a functionality that reverses this order because such a task can be easily done in Word or Excel. Typing “Transposing table in Word/Excel” in the Google search engine returns many helpful links. Choose the one that corresponds to the researcher’s version of Word or Excel.
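For readers who post-process the table programmatically rather than in Word or Excel, transposition amounts to swapping rows and columns. A minimal Python sketch, with a hypothetical table held as a list of rows:

```python
def transpose(table):
    """Swap rows and columns of a rectangular table (a list of equal-length rows)."""
    return [list(column) for column in zip(*table)]


# Hypothetical table: response variables in rows, factor levels in columns.
table = [
    ["", "A", "B"],
    ["V1", "1.2", "3.4"],
    ["V2", "5.6", "7.8"],
]
flipped = transpose(table)
```

After transposition, each factor level occupies a row and each response variable a column.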

Concluding remarks

We presented two free web-based software programs capable of generating publication-ready RTF tables for one-way ANOVA with pair-wise comparison results included. Such tables are often prepared when writing agricultural, biological, and medical science papers. Owing to their simple interface, the programs spare the researcher a considerable amount of time and avoid errors introduced by manual entry. The software can handle an Excel workbook with multiple datasets saved in multiple sheets, creating one table per dataset. It also supports two of the most commonly used table outputs in life science articles (see Tables 1 and 3 for examples). Additionally, tables can be generated based solely on summary results (i.e., the sample size, mean, and SD or SEM for each treatment group). This need might arise if the researcher wants to analyze data that have been summarized in a submitted or published manuscript. The software can be extended in several directions. For instance, it is possible to include additional multiple comparison tests that might improve on the power of the currently available methods. Another option is to cover more families of hypotheses to be tested, in addition to all pair-wise comparisons, such as general contrasts and linear functions.

Endnotes

a. See http://statpages.org/anova1sm.html, http://vassarstats.net/anova1u.html, and http://www.danielsoper.com/statcalc3/calc.aspx?id=43.

b. There is also a Type III error in two-sided test problems. It is defined as the correct rejection of the null hypothesis coupled with a wrong directional decision.

c. When testing multiple hypotheses, a test procedure is called a single-step method if the rejection or non-rejection of a null hypothesis does not take the decision of any other hypothesis into account, e.g., the BF and TK tests. On the other hand, step-wise methods differ from single-step procedures in that the results of a given test depend upon the results of other tests, e.g., Holm.

d. For example, consider all pair-wise comparisons of 3 treatment means M1, M2 and M3. If M1 ≠ M2, then logically, M1 = M3 and M2 = M3 cannot be true simultaneously. Choosing a test that does not account for these logical constraints might lead to problems with the interpretation of the test results.

e. Access to Dropbox is required in order to download the corresponding toy datasets.

f. See http://www.pdfonline.com/pdf-to-word-converter/.

Availability and requirements