Master equation model for solute transport in river basins: part I channel fluvial scale

Normal and anomalous diffusion are ubiquitous in many physical complex systems. Here we deﬁne a system of diffusion equations generalized in time and space, using the conservation principles of mass and momentum at channel scale by a master equation. A numerical model for describing the steady one-dimensional advection-dispersion equation for solute transport in streams and channels imposed with point-loading is presented. We ﬁnd the numerical model parameter as the solution of this system by estimating the transition probability that characterizes the physical phenomenon in the diffusion regime. The results presented (Part I) refer to the channel scale and represent the ﬁrst part of a research project that has been extended to the basin scale.


Introduction
Dispersion models in rivers play a crucial role in understanding the transport and spread of contaminants in natural channel networks.These models simulate the movement and distribution of pollutants, providing information for water resource management and pollution control (Rodriguez-Iturbe et al. 2009;Rinaldo et al. 2018).Dispersion models can be used to evaluate the potential impact of sources of pollution and to design effective strategies for mitigating the risk to human health and the environment (Chatwin and Allen 1985;Duarte and Boaventura 2008).By utilizing mathematical and computational methods, dispersion models can simulate the movement of contaminants and help identify the most vulnerable areas, allowing for proactive measures to be taken to minimize the impact of pollution on the ecosystem (Chatwin and Allen 1985;Duarte and Boaventura 2008;Deng and Jung 2009;Alley 2007;Altenburger et al. 2015;Novotny 1994).The accuracy and effectiveness of these models depend on several factors, including the quality of input data, the choice of model structure, and the appropriate calibration and validation of the model (Chatwin and Allen 1985;Duarte and Boaventura 2008;Rodriguez-Iturbe et al. 2009;Rinaldo et al. 2018).Other models often require large amounts of data and intensive computational processing, making them challenging to use in real-time applications (for example Hydrodynamic Models or Lagrangian Models).Additionally, they can be computationally expensive and may require high-performance computing resources, which can be difficult to obtain (Alley 2007;Altenburger et al. 2015;Novotny 1994).The complexity of the models and the large data sets used can also make them difficult to use for non-technical users, limiting their application in decision-making processes.Nevertheless, despite these problems, dispersion models remain an essential tool for promoting sustainable water resource management and protecting the health and wellbeing of communities and ecosystems (Alley 2007;Altenburger et al. 2015;Novotny 1994).This information is essential for decision-makers, water resource managers, and environmental scientists in developing effective management strategies to minimize the impact of pollutants on water quality and ecosystem health, and however, it should be used in conjunction with other methods in order to gain a comprehensive understanding of pollutant behavior in rivers (Alley 2007;Altenburger et al. 2015;Novotny 1994;Rodriguez-Iturbe et al. 2009;Rinaldo et al. 2018).In river networks or environmental catchments, the movement of polluting particles generally from a region of higher concentration to a region of lower concentration, is ubiquitous and well-known as diffusion.In the absence of obstacles to diffusion and traps, the diffusive particles describe a random walk motion on the channels of the river network, such that their mean squared displacements scale linearly with time.This process is known as normal diffusion.However, the existence of obstacles in the river network may trigger long-jumps of the diffusive particle (King and Turner 2021), such that the mean displacement of the particles is bigger than that of the normally diffusing ones in the same period (King and Turner 2021).This type of process is known as super-diffusive (King and Turner 2021).On the other hand, the river network may have regions acting as traps for the particles, where they are retained for longer times than in a normal diffusive process.In this case the process is known as sub-diffusive (King and Turner 2021).The scope of the present work is to propose a numerical technique that is computationally less expensive than the existing diffusive methods.Our approach in the framework of geomorphology river network offers interesting information in terms of instability, numerical dispersion and oscillations of diffusive phenomena.This technique, from a physical point of view, is based on two physical conservation laws: the principle of mass conservation, expressed through the use of master equations (MEs), and the principle of momentum conservation, that is condensed into the parameter P ij , expressing the transition probability between two nodes on a channel network graph.MEs describe the probability time evolution of the different states in the system, with the dynamics being due to transitions between these states (Fernengel and Drossel 2022;Van Kampen 1992;Honerkamp 2012).The most common form is an initial value problem of a linear differential equation (Van Kampen 1992).The ME peculiarity is that of probabilities flowing between states like a fluid, where their total amount is being conserved (Van Kampen 1992).For example, a ME special case is the Fokker-Planck equation, which describes the time evolution of continuous probability distribution (Keizer 1972;Van Kampen 1992;Gardiner et al. 1985).P ij , the transition probability plays a critical role in transport scales, especially in the case of river networks.River networks are complex geomorphological systems, in which the scaling behaviour of network structures is of fundamental importance to describe the phenomena of many hydraulics and hydrological processes (De Bartolo et al. 2006, 2009a, 2022).The scales essentially are two: basin scale (Rodriguez-Iturbe and Rinaldo 2001) and channel scale (Raudkivi 2020).Regarding to the first scale, there are numerous applications in the fluvial-environmental domain (Botter et al. 2011(Botter et al. , 2010;;Rodriguez-Iturbe et al. 2009;Rinaldo et al. 2018), in which the parameter P ij is not directly found to be dependent on resistance laws, namely, associated with motion (steady and unsteady conditions).Generally, it is validated through relationships of a local nature dictated by the presence of more or less tributaries incident at junctions (Rodriguez-Iturbe et al. 2009;Rinaldo et al. 2018).At channel scale, P ij turns out to be well focused and mainly related to the physical phenomena of motion resistance.Specifically, P ij is strictly related to solid transport and equilibrium conditions (degradation-aggradation) of the longitudinal and transversal channel geomorphology (Raudkivi 2020).In this framework, the motion types can be uniform, steady and unsteady.Therefore, the P ij parameter plays a critical role in defining the transport principle between two nodes in the same channel.At the basin scale, the complexity of P ij is averaged, node by node, over the entire graph-tree structure of the river network (Rodriguez-Iturbe et al. 2009;Rinaldo et al. 2018).Specifically, the node represents a junction between two streams or channels, if the former are hierarchized according to a numerical ordering (see for example, Horton-Strahler (De Bartolo et al. 2009b, a)).At the channel scale, this same approach can be simplified and developed using the same methodology as above indicated, applying it between two consecutive nodes belonging to the same channel.
The aim of this work is to highlight what are the elements of validation, applicability and limits of the basinscale methodology over the channel-scale.Specifically, the analysis of the data provided by the USGS was performed and the same master equations laws were applied on a channel reach belonging to the Maumee River basin in Ohio (USA) and finally the results were obtained.The results presented here refer to the first part (Part I) of our research conducted at the channel scale.
A numerical method for solving the steady one-dimensional advection-dispersion problem for solute transport in streams and channels imposed with multiple point-loading is presented.The proposed method requires a partition of the channel into smaller reaches here defined as computation lines, L ij .The inter-lines boundaries, referred to as the internal nodes, are the locations where the unknown concentrations are evaluated at subsequent time steps.Since the flow parameters are assumed to remain constant over time, t, as we shall see, the dispersion and decay coefficients are also assumed to remain time-invariant (Fischer et al. 1979;Chapra 2008), although their spatial variations may be accepted.Therefore, each segment assumes in time line-constant parameters of velocity, dispersion coefficient, or transition probability.This assumption, although limiting in the initial phase of phenomena of pollutant dispersion in rivers, represents the minimum condition of constraint respect to the data available for experimentation.
In the absence of available data on riverbed geometries, sections and velocity scale measurements these assumptions can provide first approximation results.In this framework our model and methodology will exhibit the first experimental comparisons.
The proposed method models the spatial variation of the solute concentration within a stream by an analytical expression in terms of the concentrations at two boundary nodes of the segment and the influx rate of point-loading, if any, located within the lines (see Fig. 1a).The channel length is partitioned into segments and the solution aims to evaluate the concentrations at the segment interfaces or the internal nodes, at prescribed time steps.The results are computed starting from a specified time level with a known spatial concentration within the domain.It is assumed that the boundary nodes at the two ends of the channel, one each at the upstream and the downstream, are specified, at all times, with solute concentration values or their spatial gradient.The solute-loading rates of the source points discharging in the channel are also considered, possibly varying with time, t.The computed concentration values of the solute along the channel length may thus vary with time, although, the flow velocities in the channel are assumed to be in a steady state.In particular, the spatial variation of the solute concentration is obtained as a solution of a system of linear equations: the principle of mass conservation, expressed through the use of MEs, and the principle of momentum conservation, condensed into the P ij parameter.Solute concentrations or their gradients at the upstream and downstream channel boundaries, defined as time-dependent functions, are used for closing the solution and performing the model.The main objective of this work is to find the transition probabilities, P ij , by a non-direct best fitting procedure.In the next section the mathematical formulations of this methodology is provided.

Master equations and mathematical formulation at the channel scale
The computation steps explained in this section assume that the channel is one-dimensional and devoid of any transient storage, thus allowing to be accepted as a suitable model for simulating the solute transport in open channels.Therefore, we seek to find the solution for the one dimensional equation governing the longitudinal spread (dispersion) of the solute concentration in space and time (Benedini and Tsakiris 2013).Considering a stream of length L, it can be discretized, using n nodes, in small reaches of length, L ij , where i and j are the extremeboundaries of each reach.The simplest discretization is the elemental structure of two nodes and one line, namely, upperstream and downstream nodes and the stream L.
As shown in Fig. 1a, considering the elemental structure of two nodes i, j and one line-cell L ij we can apply the wellknow local conservation mass equation as a ME in a where q i and q j express the solute concentration over time (discrete time t k ) in nodes i and j, and P ij represents the transition probability of the reach L ij .This equation is obtained considering the stream mean velocity constant over time (v i ¼ v j ¼ v, steady flow) as consequence of the real mass conservation on the temporal size (t f À t i ) in which the phenomenon occurred.Hence, the integral form of the mass conservation is: where t i , t f are the time interval extremes, Dt is the shifttime travel interval along the channel from the initial node to the final node and v i ¼ v j ¼ v the mean stream velocity.
Simplifying the constant mean velocity, equation ( 2) turns in a simple form: It is important to highlight that if the condition expressed in equation ( 3) is not satisfied the mass balance along the two nodes i and j loses its significance.This approach described for a single node is applied for each node of the discretized stream in order to define a system whose solutions are the transition probabilities, P ij .An example of this approach is shown in the next paragraph.The procedure is repeated over discrete time steps, till a specified final-time is reached.A matrix formulation of the equation set (1) is here described and used to solve the system and finally, to find the transition probabilities: in whichqðt k Þ,qðt kÀ1 Þ are the solute concentration vectors at the time t k and t kÀ1 , and Pu , Pd are the transition probability matrices estimated considering each local reaction.Moreover, qu ðt kÀ1 Þ, qd ðt kÀ1 Þ are the solute concentrations at the upper and lower nodes at the time t kÀ1 , respectively.For example, considering the scheme in Fig. 1b, Eq. ( 4) turns into his complete local form: or equivalently, Moreover, Fig. 1a shows a hypothetical stretch of a typical one-dimensional channel that is subdivided into m segments of which n are loaded with single-point sources (where n m).The coordinate axis, X, varies from the upstream end of the channel, pointing downstream and represents the abscissa curvilinea of the stream.A representative reach, having boundary coordinates X i and X j , and bound by nodes i and j, respectively, may be assumed to contain a solute-loading point.The load point can be the j node itself or confluent in j from an external stream.Therefore, the nodal concentrations vary in time and logically can be written, for any node along the computational domain.The formulation presented here (Part I) will be extended to the basin scale (Part II) taking into account the recursive process of the river network structure in view of one of the known hierarchical criteria in quantitative fluvial geomorphology, namely that of Horton-Strahler (De Bartolo et al. 2009aBartolo et al. , 2016)).

Study area and sampling dataset localization
Calibration of the model previously defined, is performed considering the physiographic hydrologic unity of the Maumee river network in the State of Ohio (USA).One of The Maumee River is a river network running from northeastern Indiana into northwestern Ohio and drains a primarily rural farming region in the watershed of Lake Erie (for further details see (Martin and Pederson 2022;Stothers and Tucker 2006)).This river network can be represented by a tree-graph, characterized by segments (arcs) and nodes.We focus on the St. Marys River considering the elemental structure scheme characterized by two nodes and one stream (arc).Subsequently, we consider a more complex system adding internal nodes to define how the transition probability varies.
Different types of solute concentration can be analyzed: for each one of these a transition probability can be carried out.To achieve this objective the equation system defined in the paragraph (2.2) is performed over the dataset provided by the USGS (see https://www.usgs.gov).The dataset consists of daily average pollutant concentration data measured from 2015 to the present.Considering the historical context, the environmental problems were caused by sediment contamination and agricultural runoff.The runoff Marys River, eventually leading to cultural eutrophication in Lake Erie (Fryar et al. 2000).Sediments at the site contained high levels of polychlorinated biphenyls and heavy metals, which came from old dumps, contaminated industrial sites, combined sewer overflows and disposal of dredged materials (Fryar et al. 2000).It was found that about 90 % of toxic discharges were nitrates, ammonia and phosphorus, including farm runoff and waste water from industrial processes such as steel production.In consideration of this environmental status, different solutes have been taken in consideration in our analysis: ammonia filtered and unfiltered, phosphorus, nitrate plus nitrite, ammonia plus organic nitrogen, ortophosphates and moreover suspended sediments.Some results are given in the following paragraph.The computations are handled by a first-order scheme in discrete time.Two simulation cases are given below.The first case analyzed concerns the ammonia solute NH 3 .The different schemes used for this first case are shown in Fig. 4 (schemes (a), (b) and (c)).All three schemes are used to better highlight the differences in results respect to the increasing system complexity.The second case concerns the nitrate solute NO À 3 .In this case, the only scheme used is (a) in order to better highlight the differences in results by considering several peaks of the pollutant wave.The data, provided by the USGS, for the first case refers to the entire month of October 2018, while for the second case they range from February to mid-March 2021.
Considering Fig. 3 and the schemes shown in Fig. 4, the upstream node and the downstream node correspond to nodes A A 0 and B B 0 respectively.For the ammonia solute, we analyze the first concentration peak.Results are shown in Table (1).In the case of scheme (a), the optimal results were obtained by solving the MEs system with a P A 0 B 0 transition probability equal to 0.65.In this case, but also in the subsequent cases reported, the optimal P A 0 B 0 value is obtained in an iterative manner by finding a simulated concentration curve that comes closest to the experimental one and thus, minimising the errors between the concentration peaks and the respective time lags.The trend of the concentrations is shown in Fig. 5.
The percentage error defined as jq N Àq N 0 j q N (where N is the generic node) is evaluated for all nodes in all cases.Thus, for node A 0 it is 0.29 %, while for node B 0 it is higher and about 8.81 %.In addition, there is a time-lag of 0.5 days in the peak B node compared to the simulated peak B 0 .
In scheme (b) the system becomes more complex with the addition of a central node in the stream, namely C 0 node, for which an ammonia concentration trend can be assumed (see Fig. 6).The transition probability P A 0 C 0 ¼ P C 0 B 0 found is 0.51, while the percentage errors evaluated are, for node A, 0.01 % and, for node B, 1.02 %, respectively, with an estimated time-lags of 0.3 and 0.5 days (about 7.2 and 12 h) (t p;A À! t p;A 0 and t p;B À! t p;B 0 ).
Finally, in scheme (c) the system becomes even more complex with the addition of two central nodes in the stream, C 0 and D 0 nodes.The concentration trends are shown in Fig. 7.The transition probability P A 0 C 0 ¼ P C 0 D 0 ¼ P D 0 C 0 identified is 0.39, while the percentage errors evaluated are for A node equal to 4.78 % and for B node equal to 5.76 %, respectively, with estimated time-lag of 0.2 days (about 4.8 h) for both nodes (t p;A À! t p;A 0 and t p;B À! t p;B 0 ).We can observe that by adding more nodes, the discretized system changes, and for each node the transition probability also changes.Similar behaviours can be found in other frameworks and investigations (Rodriguez-Iturbe et al. 2009;Rinaldo et al. 2018).At the channel scale, this makes it possible to extrapolate forecasts of the pollutant wave, at intermediate nodes.
In the case of nitrate solute instead, the upstream node and downstream node correspond to A A 0 and B B 0 nodes in scheme (a) as shown in Fig. 4.
The concentration trends are shown in the Fig. 8. Compared with the previous case, the two peaks' values of the concentration curves are given in Table (2).The transition probability estimated by the MEs resolution is equal to 0.66.For the first peak, the percentage errors evaluated in A 0 and B 0 nodes are equal to 3.95 % and 7.91 %, respectively, with a time-lag in A 0 and B 0 nodes of 0.4 days and 0.7 days (about 9.6 h and 16.8 h).For the second peak (in Fig. 9), the percentage errors evaluated in A and B nodes are equal to 0.72 % and 1.37 %, respectively, with a time-lag in B 0 node of 0.3 days.It should be noted that for all cases reported, the conditions expressed in relation (3) are verified.Although dispersion models in rivers provide valuable indications of how pollutants are spread, it is important to recognize their limitations.According to (Huang et al. 2022;Ito 1992;Ajami et al. 2007;Van Kampen 1992;Hervouet 2007), dispersion models are based on a set of assumptions and mathematical representations of physical processes, which can introduce uncertainties and errors into the predictions.The model relies on quality data inputs, including information on river flows, water quality parameters, (and meteorological conditions can be introduced), which may not always be available or accurate (Ito 1992;Hervouet 2007).Additionally, the model may capture the complexities of real-world river systems, such as irregular shaped channels, variable water flow patterns, and the presence of structures that can affect the dispersion of pollutants and thus, affecting the transition probability, a parameter that takes into account all these variables and here numerically defined (Rodriguez-Iturbe et al. 2009;Rinaldo et al. 2018).It is important to consider the limitations of dispersion models when using them to make decisions, including field measurements and observational data, to validate the results: the process considered must be 3 ) versus time considering scheme (a).The first peak and their respective time and lags are highlighted a normal-diffusive type (as defined previously).Furthermore, the model evidences in the results a time-lag in dispersion considering the peaks but it is approximately 7-12 h, considering the complete time span in which the dispersion phenomenon takes place (about 15-20 days).Despite these limitations, dispersion models continue to play an important role in the management of water quality in rivers, and the model here described provides valuable information for reducing the impact of pollutants on the environment.It is of paramount importance to emphasize the choice of the number of intermediate nodes.In our case, this optimal number was greater than or equal to one, for a minimum complexity of three nodes connected by two arcs (streams).By increasing the complexity and thus, the number of internal nodes, the error on the peak concentration in the downstream nodes may also increase This investigation will be extended in Part II to a basin scale and this number will be defined on the basis of minimal Hortonian structures.Based on the comforting results obtained, it will be interesting to investigate the hydrodispersive behaviour on complex hydrographic networks characterized by a larger number of nodes and channels.This investigation will be supported by a system of recursive MEs on hydrographic network structures ordered according to the criteria of quantitative fluvial geomorphology (De Bartolo et al. 2022, 2009a, b).

Conclusions
In this work the dispersion analysis of some contaminants within the catchment area of the St. Marys River in the Ohio State (USA) was addressed at channel scale (Part I).
Specifically, a set of data concerning ammonia and nitrates were analyzed, on the basis of data provided by the American USGS service.The time period concerned the months October 2018 for ammonia, and February-March for nitrates.The analysis on the geomorphological channel scale concerned a reach of approximately 33 kms.The hydrodynamic dispersion model used for this purpose has been derived from a particular class of dynamical differential system, derived from the Fokker-Planck equations, called master equations, MEs.The MEs model here proposed, according to (Rodriguez-Iturbe et al. 2009;Rinaldo et al. 2018), based on mass and momentum conservation principles, has been implemented on an equivalent graph to the channel reach, characterized by a single arc (stream) and two terminal nodes.The MEs proposed transcends the difficulties of large amounts of data and intensive computational processing and, based on the analyzed data.The model in its peculiar characteristics has offered interesting results in terms of peak time and maximum solute concentration.It allows a reconnaissance of the dispersion phenomenon at the channel scale to be carried out quickly and rapidly with an interesting low percentage error (0.1-9 %) and time-lag (7-12 h).Moreover, according to the scientific literature (Rodriguez-Iturbe et al. 2009;Rinaldo et al. 2018;Botter et al. 2010Botter et al. , 2011;;Rodriguez-Iturbe and Rinaldo 2001), numerical validity was given to the transition probabilities, P ij , considering real cases.In fact, having identified these latter for each node, it also makes it possible to carry out forecasts of pollutant dispersion in the channel-scale river network in the study basin and all those with similar geomorphological characteristics.These are the first results for future purposes, including that of extending the described method to the basin scale in order 3 ) versus time considering scheme (a).The second peak and their respective time and lags are highlighted to carry out analyses and forecasts of pollutant dispersions.The extension of the method to the basin should imply a variation to the definition of the transition probabilities P i;j in order to take into account the possible local delays or accelerations occurring at the channel scale and thus the different velocities.Consequently, the transition probabilities P i;j will not be constant over time but time-dependent.Moreover, it was found that the optimal number of channel partitions is two (with one internal node) and as the number of partitions increases, the percentage error at the maximum concentration increases at the nodes following the upstream one.Based on these encouraging results obtained, the model proposed here can be further generalized through a formal recursive development of the same MEs, which can take into account a greater complexity of the river network, for example Hortonian structures characterized by a large number of internal and external nodes (Part II).

Fig. 1 a
Fig. 1 a Discretization of the channel in the monodimensional domain using nodes and arcs.b Balanced minimal tree-like structure for local interaction

Fig. 2
Fig. 2 Map of the Maumee river hydrological network in the State of Ohio (USA) with water quality gauging stations represented as points.In yellow the St. Marys tributary taken into consideration for the present work (geographic coordinates 41 04'58'' N 85 07'56'' W)

Fig. 5
Fig. 5 Development of ammonia concentrations (q NH3 ) versus time considering scheme (a).The peaks and their respective times are highlighted

Fig. 6
Fig. 6 Development of ammonia concentrations (q NH3 ) versus time considering scheme (b).The peaks and their respective times are highlighted

Fig. 7
Fig. 7 Development of ammonia concentrations (q NH3 ) versus time considering scheme (c).The peaks and their respective times are highlighted

Fig. 9
Fig.9Development of nitrate concentrations (q NO À 3 ) versus time considering scheme (a).The second peak and their respective time and lags are highlighted

Table 1
Summary results for ammonia solute NH 3 his tributaries, the St. Marys River channel is characterized by different hydrological sub-basins, connected to each other in the river network of the Maumee catchment, as shown in Fig.2.The main stream of St. Marys River subbasin has a length of about 33 kms, and in the same Fig.2it is represented in yellow color.

Table 2
Summary results for nitrate solute NO À