Background

River fisheries, defined as both capture and aquaculture of river fish species for food, income, or recreation, contribute substantially to meeting challenges faced by individuals, society, and the environment in a changing global landscape [1, 2]. For example, in the Lower Mekong Basin, 80% of the 60 million inhabitants directly rely upon the river fisheries for food and livelihoods [3]. Additionally, with upwards of 1700 fish species, the Mekong River is a global ‘hotspot’ of fish biodiversity [4], so understanding the relationship between the human fisheries system and the natural ecological system is critical for maintaining both the biodiversity of the resident fishes and the well-being of the local human communities in the area. While the social, economic, and ecological importance of inland fish and fisheries is difficult to overstate, they are often undervalued and underappreciated [5]. This is because accurate information about these highly dispersed fisheries is inherently difficult to acquire, often unreported, and not collected in a standardized format globally [6, 7]. Consequently, these fisheries are often given low priority in planning and policy discussions relative to other uses of river ecosystem services such as drinking water, agriculture, or energy production [5, 8].

Data related to riverine fisheries are not collected in any standardized format globally [9] and thus the extent and distribution of these fisheries have never been adequately assessed in aggregate. Targeted analyses have been conducted on certain river systems such as the Mekong [10, 11], or regions such as Southeast Asia [12], but the approaches (e.g., consumption surveys, intensive field sampling) would not be feasible at a global scale due to the cost and effort involved [13]. Most river fisheries are highly diffuse, small-scale in nature, and located in areas lacking the infrastructure necessary for regular reporting, so the data collection that is occurring is generally not systematically distributed, but instead tends to be focused in the most developed countries [14]. Understanding the catch trends and predictions of river fisheries harvests is critical for the future of stakeholders who depend on these systems for food and livelihoods, but these fisheries harvests have not yet been quantitatively assessed at the global level in the ways that marine fisheries have been [15, 16].

The proposed systematic map protocol will provide a database of river fish harvest and an assessment of available data. This study will also provide the first spatially and temporally located systematic map of river fisheries data regarding what species are fished, how much fish is being harvested, and how those fish are being removed from river systems (for a systematic review of marine fisheries, see Chassot et al. [17]). It will provide location information where available as well as an overview of the knowledge base, including hotspots of data collection and information gaps, and will be especially important to studies and management at larger spatial scales (i.e., watershed, regional, or global scales). This database will be useful for biodiversity conservation as well as research, including improving valuation methods for river fish and fisheries to more accurately recognize the full breadth of provided services and assessing the relative importance of river fisheries to human populations globally. These data are necessary to develop biomass models that can be used to predict how river fisheries could change under different scenarios, particularly in the context of global change [18]. In doing so, river fish and fisheries can be better incorporated into decision making to support sustainable river fish and freshwater management. As recently described for North American inland fisheries, inland waters often have multiple public uses, and management goals may be overlapping, conflicting, or mutually dependent [19]. Resolving these interactions depends on the availability of accurate data and models to predict current fishery production and forecasts given pending local, regional, and global changes.

Identification of the topic

The “Rome declaration: ten steps to responsible inland fisheries” [20] synthesized the results of the 2015 global conference on inland fisheries, attended by nearly 250 scientists, policy makers, and members of the development community from more than 40 countries, into a list of ten key actions to help ensure sustainable inland fisheries. This declaration was emphatically endorsed by the member parties at the 2016 FAO committee on fisheries as a sign of growing recognition of the importance of inland freshwater fisheries in many countries. The first step in this list is to “improve the assessment of biological production” of inland fisheries. The work described in this protocol was identified as a principal need to address this critical knowledge gap and has since been refined by the research team. A companion study focused on inland lakes was conducted by other members who participated in the 2015 global conference [21]. Ultimately, the goal of the conference and its outcomes is to improve the sustainability of freshwater aquatic resources and to bring greater awareness of the value and sustainability challenges of inland fisheries around the world [20].

Objective of the map

This project aims to identify, collate, and describe information on the geographical distribution of river fish harvest. Relevant information consists of fisheries-dependent and fisheries-independent data on river fish, including where and how river fish were captured. Data on river fisheries harvest will be identified and collected using the systematic mapping method described below.

Systematic mapping provides a comprehensive, transparent, and objective method for collecting and describing the state and distribution of current knowledge on a topic [22]. Through systematic mapping, our research attempts to fill the river fisheries knowledge gap by aggregating river fishery data globally. We expect two products from this endeavor:

  1. River fisheries database (available at no cost online through a USGS portal).

  2. Systematic map of the state of river fisheries information (submitted for publication in this journal).

These products will provide an important data resource for researchers, policy makers, and managers of river fisheries and will be used for assessing variation and trends in river fisheries or for modeling production and predicting changes due to factors such as climate or land–water use change. This information may also be used as a basis for future quantitative systematic review questions related to the map findings, such as modeling changes in harvest over time or impacts from different harvest methods.

Primary question

What is the global distribution of river fisheries harvest?

This question targets information about the distribution and scale of an activity, so it can be broken down into three main components using the “PIO” format: the population (P), the intervention (I), and the outcome (O), per Table 1. In this study, the intervention is actually an activity, though the question formation follows the same PIO format used for interventions.

Table 1 Key elements of the primary review question

Methods

Searches

River list

The geographic scope of this initial systematic map targets 60 rivers around the world (Table 2). These rivers were identified by selecting (1) the largest rivers by drainage size that are currently flowing [23, 24], and (2) the most intensely harvested rivers according to the expert opinion of the authors as well as consultation with a subject matter expert (Cowx, University of Hull, pers. comm.). These criteria were used as a starting point to focus our global analysis on the largest and most intensely harvested rivers where data should be more readily available in comparison to smaller rivers or streams. Each river search will be conducted independently and the names of the rivers are used in the search process described below.

Table 2 Target rivers as identified by Vörösmarty et al. [23, 24] and Cowx (pers. comm.)

Search terms

Search terms were developed using keywords from fisheries articles then tested against a list of relevant articles from an independent literature review provided by a colleague (Cowx, University of Hull, pers. comm.). Additional file 1 describes the preliminary search process and provides the original search terms used for scoping. The final list of search terms reflects the need to include river names to target relevant terms given the broad scope of the research question. In some cases, the initial searches returned millions of results via Google Scholar and this strategy is aiming for a more targeted, relevant database consisting of hundreds of files. Additional search terms from the original test set did not increase the number of relevant articles returned from the search and were thus eliminated; however, the search string may be modified during the full searches as necessary. Search terms used for each river are provided in Additional file 1 and will be included as part of final project metadata and supplemental information.

All searches will be conducted in web browsers with cookies and browser history disabled and in private settings (e.g., using “incognito mode” in ‘Google Chrome’) to reduce bias generated by user-specific returns.

Databases and search engines

Table 3 provides a list of the databases, search engines, and organizational sites that will be searched. The large number of grey literature sites (67) reflects the amount of practitioner-based research and data generated regarding inland fisheries. This list of databases, search engines, and sites was designed to return as comprehensive and broad a set of results as possible by including general search engines (e.g., Google Scholar), databases (e.g., Web of Science), and institution-specific databases [25]. Colleagues and participants of the global conference on inland fisheries also reviewed and suggested additions to the list to increase inclusivity of global regions. Organizations were selected based on involvement in river fisheries and potential for hosting relevant data because they are known by our team or collaborators to either collect or aggregate river fishery data.

Table 3 Search sources

The publication search engines and databases will be queried using the Boolean phrasing described below in the topic search categories. Web crawling software [26] will be used to query and collate data from the grey literature databases and regional/country organization sites. The first 100 returns from each grey literature site (Table 3) will be reviewed for inclusion. Google Scholar, Web of Science, and Scopus will all be searched because there is relatively little overlap between the results of the search engine and those of the databases [27]. The China Knowledge Resource Integrated Database (CNKI) and Baidu Scholar will be used to target rivers where data or research may be presented in Mandarin. ProQuest Aquatic Science Collection will also be included to capture reports and grey literature that may not be available through our directed grey literature searches.

Search terms and languages

Search engines and databases will be searched using the following terms and Boolean phrasing (* denotes a wildcard character to include multiple word endings). The list of search phrases is available in Additional file 1.

Example phrase: (Congo OR Zaire) AND river* AND (fish* OR fisher* OR aquaculture)
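
Because each river is searched independently, the phrase for every river can be generated from the same template. The following is a minimal sketch in Python of one way to do this, not the search tooling used by the authors. The function name `build_search_phrase` and the small `rivers` dictionary are hypothetical, and omitting the parentheses when a river has only one name is an assumption; the actual phrases are those listed in Additional file 1.

```python
def build_search_phrase(river_names):
    """Build the Boolean phrase for one river from its name and any alternate names.

    river_names: list of names for the same river, e.g. ["Congo", "Zaire"].
    The fixed tail of the phrase follows the example phrase in the text.
    """
    if not river_names:
        raise ValueError("at least one river name is required")
    if len(river_names) == 1:
        # Assumption: a single name needs no OR group or parentheses.
        name_clause = river_names[0]
    else:
        name_clause = "(" + " OR ".join(river_names) + ")"
    return f"{name_clause} AND river* AND (fish* OR fisher* OR aquaculture)"


# Hypothetical input: river label mapped to the names used in the search.
rivers = {"Congo": ["Congo", "Zaire"], "Mekong": ["Mekong"]}

for label, names in rivers.items():
    print(label, "->", build_search_phrase(names))
```

For the Congo entry this reproduces the example phrase given above; translated terms (Table 4) would need their own template per language.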

English is the dominant language for scientific publications [28], but we recognize the growing literature base in other prominent languages [29]. As such, searches will be conducted in English only for the publication databases and grey literature websites. Searches in Google Scholar will be conducted in English, Spanish, French, Portuguese, and Mandarin (Table 4) based on the primary languages spoken in the countries where the 60 major rivers flow. Scoping of search terms in the 5 non-English United Nations Languages plus the major languages spoken in countries containing selected rivers is located in Additional file 2. We recognize that fisher* would be automatically included with the use of fish*, but both fish* and fisher* are used because the words do not have the same root in non-English languages (Table 4).

Table 4 Search terms by language

Estimating the comprehensiveness of the search

Our search strategy was designed to provide a broad scope of results regarding river fisheries data to be as comprehensive as possible. Primary studies will be targeted while review papers will be used to identify primary studies and data. All articles gathered, including review papers, will be included in the final reference database. We recognize that a large amount of data may be reported only through grey literature or stored on organizational sites. As such, our search strategy includes 37 grey literature databases and 30 regional or country sites (Table 3). Further sites may be added as revealed by the search results to be potentially relevant.

For the search engine (Table 3), an accumulation or discovery curve strategy will be used to assess the return on investment for continuing through search results. As each river constitutes a distinct search, this process will be conducted for each river search, within each distinct database. Because this process includes a unique search for each river in each database, duplicates of the same articles will be collected. These will be removed during the “Article screening and study inclusion criteria” process described below, but they also allow calculation of overlap between databases. The process is similar to that used for recording the cumulative number of species in a particular environment as a function of search effort [30]. As new relevant articles are collected and entered into the database, the total number of potentially relevant articles per 100 search results will be calculated and plotted. The search continues, considering new articles in groups of 100 returns at a time to plot potentially relevant articles per unit of search effort. This step will be conducted during the screening process described below. In this way, the number of new, potentially relevant returns that will be discovered with continued effort can be estimated. Once the asymptote of the curve has been reached (no new relevant articles revealed for 3 groups of 100 search returns evaluated), searching in this database will be discontinued [30].
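
The stopping rule above can be sketched in a few lines of Python. This is an illustrative sketch only, under the assumption that the three groups without new relevant articles must be consecutive; the function name `batches_until_asymptote` and the list of relevance flags are hypothetical and stand in for a reviewer's judgments on ranked search returns.

```python
def batches_until_asymptote(relevance_flags, batch_size=100, empty_batches_to_stop=3):
    """Apply the discovery-curve stopping rule to one river search in one database.

    relevance_flags: booleans in ranked order, True when a search return is a
    new, potentially relevant article.
    Returns (number of batches reviewed, cumulative relevant count per batch).
    """
    cumulative = []
    total = 0
    empty_run = 0  # consecutive batches with no new relevant articles
    for start in range(0, len(relevance_flags), batch_size):
        batch = relevance_flags[start:start + batch_size]
        new_relevant = sum(batch)
        total += new_relevant
        cumulative.append(total)  # one point on the accumulation curve
        empty_run = empty_run + 1 if new_relevant == 0 else 0
        if empty_run >= empty_batches_to_stop:
            break  # asymptote reached under the assumed rule
    return len(cumulative), cumulative


# Made-up example: 5 hits in the first 100 returns, 2 in the second,
# then 300 returns with none, then 1 hit that is never reached.
flags = [True] * 5 + [False] * 95 + [True] * 2 + [False] * 98 + [False] * 300 + [True] + [False] * 99
print(batches_until_asymptote(flags))  # (5, [5, 7, 7, 7, 7])
```

The example also shows the trade-off the rule accepts: a relevant article ranked after the stopping point is not found.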

Article screening and study inclusion criteria

Screening

Article screening will occur in two steps. First, the title and abstract will be reviewed for potentially relevant articles. Second, the full text of articles retained in step one will be read for relevant data. Initial screening will be conducted by two members of the research team until consistency in screening is established between screeners. Consistency will be measured using the kappa statistic, which measures the degree of agreement between two coders [31]. This systematic map process will employ two to four coders. All coders will search and screen the same river and the kappa statistic will be calculated. Criteria and differences between included article sets will be reviewed until kappa statistic values are moderate to high (> 0.5) [32]. Meetings to discuss search strategy and study inclusion will occur at regular intervals to maintain consistency throughout the search and study inclusion.
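
For illustration, agreement between two screeners can be computed as follows. This sketch assumes the statistic is Cohen's kappa for two coders (the text above does not name the variant), computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name and the example decisions are hypothetical.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' decisions on the same articles."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of articles")
    n = len(coder_a)
    # Observed proportion of articles on which the coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label for every article
    return (observed - expected) / (1 - expected)


# Made-up screening decisions for ten articles.
coder_a = ["include"] * 4 + ["exclude"] * 6
coder_b = ["include"] * 3 + ["exclude"] + ["include"] + ["exclude"] * 5

kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 3), "meets threshold" if kappa > 0.5 else "review criteria")
```

With more than two coders, one option is to compute the statistic for each pair of coders; that choice is an assumption here, not something the protocol specifies.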

The inclusion criteria described below will be applied during the search process for collecting potentially relevant articles and then screening collected articles for inclusion in the final geodatabase. The number of all included and excluded articles will be recorded at each stage of the screening and study inclusion process per the PRISMA flow chart [33].

Study inclusion criteria

All potentially relevant citations selected during the search process will be saved into a reference manager or systematic review software (such as Zotero or Mendeley) for full text review. Articles that are not open access will be requested through inter-library loan or via university library subscriptions from university-affiliated researchers on our team. Articles will be selected from the search if they meet all the following criteria, erring on the side of inclusion. At the first screening step, an article will be selected if it meets the first two criteria and if the title and abstract indicate that the article contains species-specific and location-specific information about fish biomass. After duplicates are removed, titles and abstracts will be reviewed and studies excluded based on the following criteria:

  • Not river related.

  • No information provided on fish biomass.

  • No or insufficient location information provided. Sufficient location information includes information (text description, coordinates, or a map) to pinpoint a location or specific area on a map where fish were extracted.

  • Insufficient methodological information to determine how data were acquired.

During the second screening step of the full text, only those studies that do actually contain species-specific and location-specific information about fish biomass will be selected for inclusion.

  • Is primary research, a review, a dataset, a book, or a report

  • Was published between 1950 and 2016.

    • The start date of 1950 was selected because it is also the first year that FAO provides global fisheries statistics [13].

  • The article contains species-specific and location-specific information about fish biomass.

In order to be included in the final database and map, a study must meet all of the following criteria:

Relevant subjects

River or river aquaculture fish species (as identified by Fishbase.org).

Relevant interventions

Capture/extraction of populations and communities of fish for food, income, or recreation.

Relevant outcomes

Fish biomass extraction from rivers. These data will include different units for weight, and different articles may report data at different spatial and temporal scales as well as use different definitions for catch, harvest, yield, production, etc. The original data will be extracted from the articles and then, when possible, converted to consistent units for analysis as described below in the “Data coding strategy” and Table 5.
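
As a minimal sketch of the unit-conversion step, reported weights could be mapped to a single unit while the original value is kept alongside. The choice of kilograms as the common unit, the set of units handled, and the names `KG_PER_UNIT` and `to_kilograms` are assumptions made for illustration; the protocol does not specify them.

```python
# Conversion factors to kilograms for a few common weight units (assumed list).
KG_PER_UNIT = {
    "g": 0.001,
    "kg": 1.0,
    "t": 1000.0,       # metric tonne
    "lb": 0.45359237,  # international avoirdupois pound
}


def to_kilograms(value, unit):
    """Convert a reported weight to kilograms, or return None if the unit is unknown.

    Returning None leaves the entry unconverted, so the original value and unit
    can still be stored as reported ("when possible" in the text above).
    """
    key = unit.strip().lower()
    if key not in KG_PER_UNIT:
        return None
    return value * KG_PER_UNIT[key]


# Made-up records: (reported value, reported unit).
for value, unit in [(2.5, "t"), (120, "lb"), (10, "basket")]:
    print(value, unit, "->", to_kilograms(value, unit))
```

Differences in the definitions of catch, harvest, yield, and production cannot be resolved by unit conversion and would still need to be recorded as reported.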

Table 5 Field names and definitions for data collected through the systematic map process

Relevant study designs

Quantitative research, including experimental, quasi-experimental, and observational studies, will be included. Secondary studies, including literature reviews and systematic reviews, will be used to identify additional primary sources of information.

The original source of the data and type of organization (non-governmental, governmental, etc.) will be included in the final database. A list of articles excluded at full text review with reasons for exclusion will be provided.

Study quality assessment

Because of variation in fish harvest methods and differing purpose for data collection in peer-reviewed and grey literature, critical appraisal will not be applied to this systematic map. However, descriptive and demographic information about researchers and data collection will be captured as they may be pertinent for how the data are used in modeling projects or estimations. Notes regarding the methodological descriptions will also be collected, including reliability of sources.

Data coding strategy

The following information will be collected from each article, when available. Other categories or target information may be added during the search process.

Capture effort and methods fields

Fishing effort, gear type and size, vessel type and size, sampled area and location.

Sector fields

Subsistence, commercial, aquaculture, recreation, research, funding source.

Information type fields

Primary information (i.e., the report authors collected the information themselves); secondary information (i.e., the article reports data collected by another party).

Map fields

Author affiliations, research question or objective, outcome or conclusion, replication present or absent, control present or absent, review (R), before/after (BA), comparator/intervention (CI), before/after/comparator/intervention (BACI), randomized controlled trial (RCT), objective or purpose of data collection.

Regarding the spatial component of fish biomass extraction, this information will consist of point locations, sets of point locations, areas, general descriptions, and river or watershed level scales. If the specific area cannot be determined from the information provided, that article will not meet study inclusion criteria and will be excluded from the final geodatabase. All other data will be included in the format provided by the original documentation. The final geodatabase will consist of point, multi-point, polygon, and multi-polygon shapefiles. The original spatial description, map, or coordinate information will be provided in the final geodatabase along with the spatial information. Changes in fish biomass over time or time-series data will include individual entries for each time point.

Intercoder reliability will be established by comparing extracted data between researchers. Table 5 provides the field names and definitions of the data targeted for extraction from the search results. All persons contributing to data collection will be provided the same set of 20 articles and the actual data extracted will be compared. Discrepancies in collected data will be discussed and assessed until data collection is consistent. Other types of data may be added as identified by the search process. Data from these sources will then be synthesized into a geodatabase.

Study mapping and presentation

The systematic map database will be presented as an open-access geodatabase hosted by USGS ScienceBase. The geodatabase will be available as both a file geodatabase and a series of folders. All files within the geodatabase will consist of shapefiles, which are not proprietary. This systematic map protocol will accompany the geodatabase as metadata. Additionally, a systematic map describing the data collection process and results of the search will be produced. A geographic map of data density and data collected will show the distribution of information to identify knowledge gaps or concentrations of information and will also address the temporal scale of available information. Understanding the distribution of current information can help target future studies to fill these gaps and reduce redundant data collection. This information could also be used to frame systematic review questions and research regarding river fisheries.