Introduction

The EuraHS (European Registry for Abdominal Wall HerniaS) working group was formed under the auspices of the European Hernia Society (EHS) board in 2009. An online platform for registration and outcome measurement of operations for ventral abdominal wall hernias has been developed. For this, a set of definitions and classifications was proposed [1]. The EuraHS working group organised a consensus meeting to prepare recommendations relating to the reporting of outcome results in abdominal wall hernia repair (footnote 1).

Materials and methods

The scientific methodology of clinical studies, including systematic reviews and meta-analyses, was discussed with researchers and a statistician invited to the consensus meeting. Recommendations relating to study methodology, description of the patient population and statistical approach were proposed for research on abdominal wall surgery (footnote 2). Specific recommendations on abdominal wall surgery for describing hernia variables, treatment variables and for reporting the outcome results in a uniform manner were formulated by consensus.

Results

Description of study methodology

A study describes a sample or cohort of patients. It is of utmost importance to know how the study population was decided upon, how the study was conducted, what the primary aim or endpoint of the study was and how the endpoint was analysed. This knowledge is essential to judge whether the results of the study can be extrapolated and generalised to the larger group of patients with the disease treated, so that the study result might influence the treatment of future patients. Knowledge of the sampling procedures used to determine the study population from the screened patients allows readers to identify potential sources of bias and thus assess the external validity of the study results. Hernia-related examples of the different types of studies are given in the footnotes for additional reading.

Study types

All reported studies should have a clear description of the study type, which should be mentioned in the title and/or the abstract of the manuscript. There is a fundamental distinction between observational studies and interventional studies (Fig. 1). In an observational study, one or more outcome variables (aka dependent variables) are studied in relation to one or more predictor variables (aka independent variables or risk factors). Analysis will focus on the association of the predictor(s) with the outcome(s) over a defined time period. A cohort study is a type of observational study in which a group (cohort) is defined, e.g. all patients undergoing a particular operation or having a certain type of hernia (footnote 3). Most publications on ventral abdominal wall repair are classified as non-comparative cohort studies because there is no control group in the study. Rather, the results are discussed in relation to other studies published on similar patient populations. In a comparative cohort study or case-control study at least two different populations are compared within the study (footnote 4). A registry is a type of cohort study that has a specific purpose, defined in advance. The data entered are carefully crafted to answer important questions about the condition or symptom being studied. Results from registry studies are often very informative because such care is taken to assure consistent data definition, consistent data entry and the enrolment of a large number of patients in relation to the total affected population (footnote 5). A cross-sectional study is an observational study, which by definition is not longitudinal because subjects are studied at a single point in time. An example would be a study investigating the impact of the patients’ BMI on the prevalence of incisional hernias in a population of patients with previous laparotomies.

Table 1 Summary of recommendations for reporting outcome results in abdominal wall surgery as formulated by the panel of a consensus meeting held by the EuraHS working group in Palermo, Italy, June 2012

In an interventional study the result of an intervention on a specific outcome variable is examined. The patient samples compared in the study should ideally differ only in the predictor variable that is influenced by the intervention. Other variables, called confounders, should be equally distributed between the study groups. Randomization for the predictor variable in a randomized controlled trial (RCT) is the best method to ensure “equality” of the study groups, provided the study population is large enough (footnote 6). For this reason RCTs are assigned a high level of evidence: if the randomisation is performed adequately, they have the smallest risk of bias between the study populations. In a comparative non-randomized clinical trial, it is less clear why a specific patient receives the intervention or not (footnote 7).

In a systematic review, a comprehensive literature search is performed on a specific topic and a qualitative critical appraisal of the individual studies is performed. Only data from studies that are considered of sufficient methodological quality are summarised (footnote 8). In a meta-analysis the quantitative data of the individual studies are pooled and statistically analysed (footnote 9). A meta-analysis of RCTs is considered the highest level of evidence and thus allows for the highest grade of recommendation.

A case report or case series describes an observation or a treatment, which is considered by the authors as rare or novel and thus worthy of publishing in a manuscript.

As shown in Fig. 1, guidelines are available on the web for specific types of studies which provide step-by-step instructions including a checklist for authors to assure correct conduct and reporting of their work [11–16]. The Cochrane Collaboration at http://www.cochrane.org summarises the websites. Many journals only accept manuscripts that conform to these guidelines and require their reviewers and editors to use them when assessing the quality of submissions. Critical appraisal sheets to assess the quality of a study report can be found on the website of the Centre for Evidence Based Medicine from Oxford [17].

Fig. 1

Types of clinical studies: it is recommended to include the type of study clearly in the title and/or the abstract of a manuscript. Reporting guidelines (column 3) are available on the web to help authors in preparing manuscripts for publication. a CONSORT statement: Consolidated standards of reporting trials. http://www.consort-statement.org [11], b TREND statement: Transparent Reporting of Evaluations with Non-randomized Designs. http://www.cdc.gov/trendstatement/ [12], c STROBE statement: Strengthening the reporting of observational studies in epidemiology. http://www.strobe-statement.org [13], d STARLITE statement: Standards for reporting literature searches [14], e PRISMA statement: Preferred reporting items for systematic reviews and meta-analyses. http://www.prisma-statement.org [15], f MOOSE statement: Meta-analysis of Observational Studies in Epidemiology [16]

Prospective versus retrospective studies

In a prospective study, a cohort of patients is observed for a period of time to look at outcome, e.g. complications, and then relate this to the predictor variables, e.g. type of surgical technique. Interventional studies are prospective studies focused on the outcome of a specific intervention that is controlled but different in the study groups that are compared. A study qualifies as prospective if the outcome measurement of the primary endpoint is decided before the start of the study, and the endpoint measurements are performed in the future after the start of the study. Prospective studies are methodologically superior to retrospective studies because the measurements can be controlled and standardised. Moreover, the data gathered are usually more homogeneous and complete.

In a retrospective study the investigator looks backwards in time and examines exposure to possible risk or protective factors in relation to an outcome that is established before the start of the study. Thus the study looks at measurements made before the study was started and, therefore, the data will be less controlled and less homogeneous.

The research question and the primary endpoint

The manuscript of an interventional study should clearly state the research question and/or aim of the study. This research question is translated into a scientific hypothesis that will be the basis for the study design and the number of patients required to answer the research question. A clinically relevant primary endpoint will be chosen for which the hypothesis is formulated. The primary endpoint or primary variable of a study is the outcome parameter to be measured and compared, either to the control group in a comparative study or to results from the literature in non-comparative studies. For abdominal wall repair, the primary endpoint is most often hernia recurrence, but many other outcome parameters can be used to formulate the hypothesis: acute or chronic pain, Quality of Life, complications, reoperation rates, wound infections, mesh infections, etc. A superiority study investigates whether the intervention is superior in comparison with the control group. The results of the study will be compared with the null hypothesis (H0), that there is no difference between the groups in the primary endpoint measurement. The analysis has to be performed on an Intention-to-treat (ITT) basis. In ITT analysis, patient outcome is analysed according to the treatment allocated by randomization, regardless of whether the patient actually received the treatment or not [18] (footnote 10). In some specific clinical situations, an equivalence or non-inferiority design is preferred. An equivalence study investigates whether a new treatment is equivalent to the control with respect to a predefined margin of indifference. The analysis will be performed on the Per Protocol Population (PP), i.e. the patients who adhered strictly to the protocol and actually received the intervention called for by the protocol. These different types of analysis aid investigators in determining whether a new treatment or device is better than, or as good as but cheaper than, what is now available.
As for most clinical studies, the involvement of a biomedical statistician at both the study design and study analysis stage is recommended.
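The difference between the ITT and PP populations can be illustrated with a minimal sketch. The `Patient` record, its field names and the arm labels `mesh_A` and `mesh_B` are hypothetical and serve only as an illustration; the example shows which patients enter each analysis, not a complete statistical evaluation.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    allocated: str    # arm assigned by randomisation (hypothetical labels)
    received: str     # treatment actually delivered
    recurrence: bool  # was the primary endpoint reached?


def recurrence_rate_by_arm(patients, per_protocol=False):
    """Return {arm: recurrence rate}, grouping patients by allocated arm.

    ITT (default): every randomised patient is analysed in the allocated arm.
    PP: only patients who received the allocated treatment are analysed.
    """
    counts = {}
    for p in patients:
        if per_protocol and p.received != p.allocated:
            continue  # protocol deviation: excluded from the PP population
        events, total = counts.get(p.allocated, (0, 0))
        counts[p.allocated] = (events + int(p.recurrence), total + 1)
    return {arm: events / total for arm, (events, total) in counts.items()}


# Invented example cohort, not data from any study.
cohort = [
    Patient("mesh_A", "mesh_A", False),
    Patient("mesh_A", "mesh_A", True),
    Patient("mesh_A", "mesh_B", True),   # crossed over during surgery
    Patient("mesh_B", "mesh_B", False),
    Patient("mesh_B", "mesh_B", False),
]

itt = recurrence_rate_by_arm(cohort)
pp = recurrence_rate_by_arm(cohort, per_protocol=True)
```

In this invented cohort the patient who crossed over counts towards arm `mesh_A` in the ITT analysis but is dropped from the PP analysis, so the two populations give different recurrence rates for the same arm.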

The sample size

When designing a clinical trial it is important to estimate the number of patients needed to answer the research question. Performing a clinical trial is time consuming and expensive. It is also ethically mandatory to keep the number of patients as small as is compatible with valid study results. Therefore, it is important to estimate at the outset the number of patients that should be included in the study to answer the clinical question and the scientific hypothesis the study is exploring. If the sample size is too small the study might not be able to reject the H0. In other words the study sample is too small to show a difference in the primary outcome, although in reality there is a difference (false negative; type II error). On the other hand, if the sample size is too large, scarce resources will be spent unnecessarily. To calculate the sample size needed, there has to be agreement on several elements. First, the hypothesis type has to be clear: superiority, equivalence or non-inferiority. The expected mean value of the primary outcome parameter in the two groups and the difference in outcome considered clinically important have to be estimated, based on preliminary findings or results from similar studies in the literature. The significance level, i.e. the α or Type I error we accept (usually 5 %) and the statistical power (usually 80 % = 1 − β, where β denotes the Type II error level) have to be defined. These assumptions will provide the number of patients in each group needed to evaluate the primary endpoint. All studies have “dropouts” because patients are lost to follow-up, die, or are not willing to continue participation. Therefore, the number of patients to enter in the study should be increased in line with the number of “dropout” patients anticipated, often 10–20 %.
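As an illustration of how these elements combine, the following sketch estimates the sample size for a superiority comparison of two proportions (for example recurrence rates). It assumes the common normal-approximation formula, a two-sided α and equal group sizes; the function names and the example rates are hypothetical, and dividing by (1 − dropout rate) is one conventional way of allowing for dropouts, not a rule stated by the consensus panel. It does not replace the advice of a statistician.

```python
from math import ceil
from statistics import NormalDist


def patients_per_group(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate patients per group to detect p_control vs p_treatment.

    Normal-approximation formula for two independent proportions,
    two-sided significance level alpha, power = 1 - beta.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_control - p_treatment
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


def inflate_for_dropouts(n, dropout_rate):
    """Increase n so that about n patients remain after expected dropouts."""
    return ceil(n / (1 - dropout_rate))


# Assumed example: 20 % recurrence in the control arm, 10 % with the new repair.
n_evaluable = patients_per_group(0.20, 0.10)
n_to_enrol = inflate_for_dropouts(n_evaluable, dropout_rate=0.15)
```

With these assumed inputs the sketch asks for 197 evaluable patients per group, or 232 per group to be enrolled if 15 % dropouts are anticipated.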

Interim analysis

Prior to the onset of the study, the protocol of the study should state if an interim analysis will be conducted and the statistical rules should be given. An interim analysis is usually done for safety reasons. Therefore, an analysis of the patients “as treated” is the best approach. There are different interim analysis procedures and the procedure should be chosen carefully and described in the study protocol.

During an interim analysis the progress of the study inclusions, the occurrence of serious adverse events and the quality of the raw data can also be evaluated. A decision can be made to prolong the inclusion time to increase the sample size or to stop the trial prematurely. Ideally, an independent data monitoring committee (IDMC) takes such a decision.

An example is the study by Itani et al. [19] on ventral hernia repair comparing laparoscopic with conventional surgery. The infection rate was so much higher in the conventional group that the data safety monitoring board insisted the trial be stopped.

Description of patient population

The ultimate goal of a study is to generalise the findings in the study to the larger population of which the study population is a sample. To assess the external validity of a study, the exact method of determining the study sample or study cohort has to be clear.

Mono-centre versus multi-centre studies

There are advantages and disadvantages for both study strategies. Mono-centre interventional studies have a greater chance of having two comparable groups by excluding the variations in the confounding variables that arise from including patients treated in different centres. Multi-centre studies have a greater chance of correct inference and generalisation of the study results to the larger population in the community. But multi-centre studies are logistically more difficult to perform. Moreover, the homogeneity and the quality of the raw data are often inferior in the participating centres compared with the centre of the primary investigator. On the other hand, including patients from several centres will create a larger group of eligible patients and thus a higher likelihood of achieving the sample size in a shorter time period. For some less common conditions, a multi-centre approach is a prerequisite to enrol a large enough cohort of patients. It is essential that the authors report variations in expertise related to the surgical technique under investigation.

Inclusion criteria, exclusion criteria and eligibility

To minimise selection bias all consecutive eligible patients during the study period should be considered for inclusion. The reasons for non-inclusion in the trial and the number of these should be monitored and reported. To know which patients are eligible a clear and detailed description of inclusion and exclusion criteria should be given.

Dropouts and lost to follow-up

Inevitably subjects will become lost to follow-up and will not be available for measurement of the primary endpoint. Some patients will not receive the allocated treatment according to the randomization because of errors, a preoperative surgical decision, an intraoperative change in therapy or because the patient withdraws consent to participate. Nevertheless, a description of the entire intention-to-treat (ITT) population has to be provided and every patient accounted for, preferably in a flow diagram. This will make it clear to the reader which patients are included in the study analysis. The baseline data of the study population with the distribution of the predictor variables and possible confounding variables should be provided for the ITT population in the first table of the manuscript. This table will allow evaluation of the concordance between different groups in comparative studies. The variables should be listed with their frequency or mean value, their range and their standard deviation. For analysis of the primary and secondary endpoints of the study the decision about the use of the ITT or PP population is based on the type of statistical hypothesis (superiority versus equivalence).

Description of the hernia variables, operative procedure and mesh variables

The literature dealing with the treatment of abdominal wall hernias would benefit from using a common standard for description of the hernias themselves, the operation performed and the mesh materials used. The European Hernia Society has previously published classifications for inguinal and ventral hernias [20, 21]. Moreover, during the development of the EuraHS platform for registration of ventral hernias many definitions and recommendations for describing variables of interest were proposed by consensus amongst the EuraHS working group members [1]. A general recommendation of the consensus meeting in Palermo is to use these existing classifications and terminologies to describe the hernia patients included in a study.

Hernia variables

It is recommended to use the EHS classifications for inguinal and ventral hernias. Primary ventral hernias and incisional ventral hernias should be distinguished and classified accordingly. The hernia size of ventral hernias is preferably an intra-operative measurement and the width and length will be described in centimetres (cm) as the mean and the standard deviation. If the hernia defect surface is reported, the method of calculation of the defect size in cm² should be given. Multiplying width by length gives the area of the enclosing rectangle, which is larger than the true hernia defect and thus overestimates the true abdominal wall defect size. Alternatively, the formula of an ellipse can be used to get a better estimation of the true hernia defect size. For calculating the real surface area of a hernia defect or several defects of an incisional hernia many measurements are needed and calculations depend on the form of the defect. Ammaturo and Bassi have published a method for calculating the wall defect surface and comparing it with the surface of the anterior abdominal wall [22]. This method involves the use of transparent paper, a computer scanner and software to calculate the exact surface. For routine use in surgical practice this is not practical.
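The difference between the two ways of calculating the defect surface can be shown with a short sketch. It assumes that width and length are taken as the two axes of the ellipse, so that the area is π × (width/2) × (length/2); the function names and the example measurements are illustrative only.

```python
from math import pi


def rectangle_area_cm2(width_cm, length_cm):
    """Width x length: area of the rectangle enclosing the defect."""
    return width_cm * length_cm


def ellipse_area_cm2(width_cm, length_cm):
    """Area of an ellipse whose axes are the defect width and length."""
    return pi * (width_cm / 2) * (length_cm / 2)


# Assumed example measurements: a defect 4 cm wide and 6 cm long.
rectangle = rectangle_area_cm2(4, 6)
ellipse = ellipse_area_cm2(4, 6)
ratio = ellipse / rectangle
```

Whatever the dimensions, the ellipse estimate is π/4 (about 79 %) of the rectangle estimate, which is why a manuscript should state which method was used.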

In order to classify the dimensions of an abdominal wall hernia the consensus is to use the terminology proposed in the previous classifications. For primary ventral hernias three groups are created using the hernia defect diameter: small (<2 cm), medium (≥2–4 cm) and large (≥4 cm). For incisional hernias, there is no common standard yet. The consensus panel recommends using the EHS classification and thus the width of the incisional hernia is the distinguishing parameter between groups: W1 (<4 cm), W2 (≥4–10 cm) and W3 (≥10 cm). If descriptive terminology like “large, giant, huge” is used, a clear definition should be given. However, the use of such adjectives to define the hernia size is discouraged.
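The size groups above translate directly into simple threshold rules. In the sketch below the function names are hypothetical, and the upper limits of the middle groups are read as exclusive (medium is 2 cm up to but not including 4 cm, W2 is 4 cm up to but not including 10 cm), which is the only reading consistent with the lower limits of the largest groups.

```python
def classify_primary_ventral(diameter_cm):
    """Size group of a primary ventral hernia from the defect diameter."""
    if diameter_cm < 2:
        return "small"
    if diameter_cm < 4:
        return "medium"
    return "large"


def classify_incisional_width(width_cm):
    """Width group (W1 to W3) of an incisional hernia from the defect width."""
    if width_cm < 4:
        return "W1"
    if width_cm < 10:
        return "W2"
    return "W3"


# Assumed example measurements.
primary_group = classify_primary_ventral(3.5)
incisional_group = classify_incisional_width(10)
```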

Operative techniques and mesh variables

Surgical techniques and their outcomes are an important issue in surgical studies. A detailed description of the surgical techniques used is important for the readers to understand the procedure(s) used in the patients studied. It should allow the technique to be reproduced in future patients. Authors should be encouraged to use clear terminology such as that proposed by the EuraHS working group [1]. For prosthetic materials, fixation devices and other equipment, we recommend using not only the generic name of the material but also providing the product and company name. When comparing different meshes the classification of meshes proposed by Klinge and Klosterhalfen is recommended [23]. A complete description of the size of implanted mesh, the overlap of the hernia defect and the detailed technique used for fixation will help the reader to understand the procedure used.

Assessment of outcome: recurrences, complications and quality of life

Recurrences

The outcome parameter recurrence is the primary endpoint in most studies of abdominal wall hernia surgery. A hernia recurrence is defined as “A protrusion of the contents of the abdominal cavity or preperitoneal fat through a defect in the abdominal wall at the site of a previous repair of an abdominal wall hernia.” [1]. Recurrence is a categorical dichotomous variable, which means the outcome cannot be quantified, but is a yes or no response. The definition used in the study of what constitutes a recurrence should be given as well as the method of follow-up that is used to look for possible recurrence. If the primary endpoint of the study is recurrence, the consensus is that only clinical follow-up will be considered adequate. In an interventional study, blinding of the evaluator to the treatment arm will minimize investigator bias and improve the quality of the data and is to be strongly encouraged.

Basically, there are two options to describe the primary endpoint recurrence in a cohort of patients. The “recurrence rate” can be measured at a specific time point (Tx) during follow-up, as the number of patients of the ITT population that have developed a recurrence between the operation date (T0) and Tx. This will leave us with the problem of what to do with the patients that were “lost to follow-up”. This uncertainty about the status, i.e. recurrence or no recurrence, of the lost to follow-up patients will cause serious bias in the estimation of the calculated recurrence rate. A specific cohort of patients has no fixed recurrence rate because the recurrence rate will increase over time with longer follow-up. The result of a study with a recurrence rate at a specific point in time during follow-up should include 95 % confidence intervals. It is recommended that the statistical analysis of recurrence rates at a specified time in a comparative study be performed with the Fisher exact test and logistic regression to include prognostic factors.
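A recurrence rate with its 95 % confidence interval and a Fisher exact test can be sketched with the Python standard library alone. The choice of the Wilson score interval is an assumption made here, as the text does not prescribe a particular interval method; the function names and all counts are invented for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist


def wilson_interval(events, n, confidence=0.95):
    """Wilson score interval for a proportion such as a recurrence rate."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = events / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are the two study arms; columns are recurrence / no recurrence.
    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= observed * (1 + 1e-9))


# Invented counts: 10 recurrences among 100 patients examined at Tx.
low, high = wilson_interval(events=10, n=100)
# Invented 2x2 table: arm 1 has 8 recurrences / 2 none, arm 2 has 1 / 5.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

Neither function deals with patients lost to follow-up; only patients whose status at Tx is known can be counted.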

A more sensitive method of reporting the outcome is by “time-to-event analysis” as introduced by Kaplan and Meier several decades ago for survival analysis [24]. The main reason to favour this approach is that patients lost to follow-up, the dropouts, are accounted for. In abdominal wall surgery, the event studied is most often recurrence and thus “survival rate” can be best described as the “freedom-of-recurrence”. For every patient in the study the time period of follow-up will be defined by the date of the hernia repair (T0) to the date of recurrence or the date of the last follow-up recorded (T1). At T1 the status of the patient will be recorded: recurrence or no recurrence. The difference between T1 and T0 is the time the patient was at risk of development of a recurrence and was under “surveillance”. During the study period the number of patients at risk will gradually decrease with every patient that has a recurrence or that is lost to follow-up, i.e. censored cases. The outcome of time-to-event data for hernia recurrence is given by a Kaplan–Meier plot of the freedom-of-recurrence and by calculating freedom-of-recurrence rates at predetermined time endpoints. Statistical analysis of time-to-event data is performed using the log rank test or Cox’s regression model if prognostic factors are included. Time-to-event analysis is more powerful than comparing recurrence rates, thus requiring a smaller sample size to test a specific scientific hypothesis of an interventional study.
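The way censored patients contribute to the estimate can be seen in a minimal Kaplan–Meier sketch. Each observation is a pair of follow-up time (T1 minus T0, here in months) and a flag that is true for a recurrence and false for a censored patient; the function name and the data are invented for illustration, and a dedicated statistics package would normally be used in practice.

```python
def freedom_of_recurrence(observations):
    """Kaplan-Meier estimate as a list of (time, freedom-of-recurrence).

    observations: iterable of (follow_up_time, recurred) pairs.
    Censored patients (recurred=False) stay in the number at risk up to
    their last follow-up and then leave it without counting as an event.
    """
    observations = list(observations)
    curve = []
    survival = 1.0
    event_times = sorted({t for t, recurred in observations if recurred})
    for t in event_times:
        at_risk = sum(1 for ti, _ in observations if ti >= t)
        events = sum(1 for ti, recurred in observations if ti == t and recurred)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Invented follow-up data in months: two recurrences, three censored patients.
follow_up = [(6, True), (12, False), (18, True), (24, False), (24, False)]
curve = freedom_of_recurrence(follow_up)
```

In this invented cohort the patient censored at 12 months is still counted as at risk at 6 months but not at 18 months, so the second recurrence lowers the estimate by one third rather than one quarter.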

Complications

The consensus group recommends using the Clavien-Dindo classification as was proposed previously by the EuraHS working group [25–27]. A clear definition of the different complications evaluated and reported must be given, preferably using published classifications. Of specific interest for abdominal wall surgery is postoperative seroma. The seroma classification proposed by Morales-Conde is recommended [28].

The method of follow-up

The method for assessment of the primary and other endpoints of the study should be described clearly in the manuscript. Indeed, the recurrence rate measured will be influenced by the method of follow-up. Figure 2 illustrates an increase in quality of follow-up which can range from the number of reoperations for recurrences seen to systematic investigation with medical imaging. The Palermo consensus group considered that follow-up without clinical examination of the patient is likely to give an important underestimation of the true recurrence rate and thus should be avoided. For other endpoints such as quality of life assessment, a follow-up by phone or mail might be adequate.

Fig. 2

The validity of data for recurrence after hernia repair is dependent on the method of follow-up performed. It is recommended to consider only follow-up including clinical investigation as adequate

For large registries like the Danish Hernia Database, the Swedish Hernia Registry and the Herniamed database a clinical follow-up of all patients is neither practical nor achievable [29, 30]. In the population-based Danish Ventral Hernia Database the reoperation rate for recurrence is the primary outcome measurement as a “surrogate for recurrence”. Helgstrand et al. [31] demonstrated, using a questionnaire and subsequent selective request for clinical follow-up, that the reoperation rate underestimated the overall risk for recurrence by four- to fivefold. In the Herniamed registry patients are followed up using a questionnaire sent to the patient at 1, 5 and 10 years [29]. Patients reporting a problem are invited for an examination by a physician.

Blinding of the patient and the evaluator at the primary endpoint to the treatment group in an interventional study has some organisational and logistic difficulties, but should be considered when writing a study protocol because of the enhancement of the quality of the outcome data and the diminished risk of patient or investigator bias.

Ethical and financial considerations

Studies should be performed according to the guidelines of the International Conference on Harmonisation (ICH) of Good Clinical Practice (GCP) [18]. This includes the approval by the ethical committee of the centre where the study is performed. Informed consent of the patients to be included in the study is mandatory.

Registration of the study protocol in an international database like http://www.clinicaltrials.gov is recommended and is mandatory for acceptance in some peer reviewed journals.

For studies of abdominal wall surgery it is very important that financial sponsors of the study are disclosed. The manuscript should state how the study was initiated: as an Investigator Initiated Study (IIS) or initiated by a commercial sponsor of the study. Conflicts of interest should be clearly stated at the end of the manuscript. If a research grant was received for the study, the name of the sponsoring organisation or company should be disclosed. Also the involvement of the sponsor in initiating or conducting the study and in reporting the results should be clearly delineated.

The consensus group also encourages investigators to report negative trial results. If the study methodology is appropriate, a negative outcome should not hinder the acceptance for publication.

Discussion

The literature dealing with abdominal wall surgery often fails to meet good reporting standards and statistical methodology. Moreover the terminology used to describe the hernias and their therapies is very heterogeneous, often due to the lack of commonly accepted standards and definitions. This was the impetus for the formation of the EuraHS working group. By organising a consensus meeting including the editors of Hernia—the World Journal of Hernia and Abdominal Wall Surgery—and some specialists in statistics or systematic reviews, the aim was to suggest a set of recommendations to provide a standard for investigators writing a study protocol and to authors preparing a manuscript for submission. The recommendations are listed in Table 1.

The CONSORT statement is the common standard to use as guidance in performing and reporting RCTs (http://www.consort-statement.org). However, for ventral hernia repair, RCTs are not frequent and the majority of the literature is comparative retrospective studies or non-comparative cohort studies. For those studies the STROBE statement (STrengthening the Reporting of OBservational studies in Epidemiology) is the relevant guideline (http://www.strobe-statement.org) and the quality of the studies can be scored using the MINORS scale [32].

We consider that an author checklist specifically targeted at abdominal wall surgery based on accepted statements and scoring systems would increase the quality of submissions. Editors and reviewers can use a similar checklist for their evaluations.

The consensus panellists strongly believe that an effort is needed to strengthen the statistical and methodological basis of abdominal wall research. Considering recurrence, which is the primary interest of most studies on hernia repair, it is recommended to use time-to-event data of the freedom of recurrence to analyse and report study results. The number of dropouts from studies on hernia repair before the measurement of the primary endpoint is often high. Therefore, the use of time-to-event data is more suitable in hernia repair studies.

To reduce the heterogeneity of the description of the variables studied and the surgical techniques performed, we recommend using previously published terminology and definitions. Understanding the study population and the surgical technique is essential for the inference of the results to the larger population of which the study population is part. The external validity of a study is the main goal of scientific research and exact description of the study parameters is thus important.

Several clinicians and researchers feel that for most of our clinical questions we will never get answers from RCTs and meta-analyses because the number of variables is too large. Their frustration is that at this moment guidelines are focused mainly on this type of EBM research. Registers may be an important source of information for health care. In our particular field of research, a population-based register like the Danish Ventral Hernia Database or large surgical datasets of variables and outcomes like the Herniamed database and that of Würzburg University provide us with very interesting data [4, 29, 30]. However, the statements resulting from the analysis of register data, even by sound scientific multivariate statistical analysis, can be limited by various sources of bias. The selective inclusion of patients and their data may introduce selection bias. Some confounding variables may not be included in the dataset of the register and thus result in confounder bias. Nevertheless, we think that in practice registers may be good to generate scientific hypotheses and consider safety questions.

The EuraHS working group encourages researchers in abdominal wall surgery to use the EuraHS platform to gather the data of their patients [1]. The platform can be used for clinical studies like RCTs and observational studies or for prospective registration of consecutive patients. The platform can be used individually, as an institutional registry, or in groups of participants (e.g. as a national registry). Use of the platform will conform to the recommendation of using the consensus-based definitions and classifications of the EuraHS working group.

Knowledge of study design and statistical issues is of minimal interest to many surgeons. We think that a series of short statistical reviews related specifically to abdominal wall surgery would be a good start to improve awareness of the importance of a sound statistical approach to hernia repair research. Moreover, we would encourage the surgical societies to include courses on clinical research and statistical items in the program or in pre-congress courses during meetings of the societies.