Introduction

The COVID-19 pandemic, caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is an unprecedented event that has claimed more than 0.9 million lives across the globe. The novel coronavirus (nCoV) belongs to the same subfamily, Orthocoronavirinae, as MERS-CoV and SARS-CoV, yet it differs markedly from the other members in its contagiousness and evolution rate [1,2,3]. While infants and older people are believed to be more vulnerable to infection, young asymptomatic carriers are of significant concern because they unknowingly aid virus transmission [4]. Several reports have documented the presence of the virus and its genetic material in the faecal matter and urine of infected persons. Concerns have also been raised about possible leaching and infiltration of effluents from health care facilities, sewage and drainage water [5]. Thus, at a time of accelerated transmission of the novel coronavirus, a robust surveillance system is crucial to keep pace with the spread of the disease. The wastewater-based epidemiology (WBE) approach can be used to understand and evaluate the degree of establishment, penetrance and infectivity of a specific infection in the community [6]. In this approach, the presence and abundance of specific biomarkers (e.g. viral genes) in wastewater are examined, as they may reflect the general status of the inhabitants of a given wastewater catchment with respect to those biomarkers [7]. This approach has proved useful for studying the epidemiology of various infectious diseases and is a crucial tool for disease prevention, control and intervention. Thus, this paper reviews the potential of WBE in establishing a vigilant and robust surveillance system to track the spread of the coronavirus in the community.

Although SARS viruses are of zoonotic origin and several countries are already facing secondary transmission of SARS-CoV-2, developing an early warning system always presents a challenge. In addition to WBE, several modelling approaches might aid in assessing the disease spread at both regional and global scales. Compartmental models have traditionally relied on the researcher’s understanding of human behaviour, but their relevance deteriorates when the response of the government and knowledge about the virus are poorly understood, as depicted in Fig. 1. Statistical models have consequently gained prominence at the wrong time, as is evident from recent studies that advocated using Bacillus Calmette–Guérin (BCG) and polio vaccination to counter the health implications of COVID-19 based on a simple correlation with early COVID-19 mortality data from countries like Iran and India [8]. Though BCG or polio vaccines were helpful, predictions based solely on correlation analysis may compromise the entire response system against the virus. A greater challenge lies in helping a wider audience understand the very foundation on which these assumption-based theories of virus prediction rest. Our aim in this article is to simplify these complex models into understandable theories so that the conclusions of the various prediction models may be understood in context. This paper aims to collate information on recent developments in WBE for monitoring the trend of community-scale SARS-CoV-2 prevalence, as well as on models to predict virus spread and transmission among populations.

Fig. 1 Making waves perspectives of COVID-19 pandemic: monitoring, modelling, myth and mental health

Transmission, Infectivity and Inactivation of SARS-CoV-2

Faecal-Oral Route of SARS-CoV-2 Transmission

The receptor-binding domain (RBD) of the heavily glycosylated S protein of SARS-CoV-2 interacts with the angiotensin-converting enzyme 2 (ACE2) receptor and attaches to the surface of the host cell, leading to receptor-mediated endocytosis of the virion [9]. The viral envelope fuses with the endosomal membrane with the help of host proteases and releases the viral genome into the cytoplasm of the host cell [10]. The virus generally binds to ACE2 receptors in the lungs and intestinal tract and replicates further, resulting in severe consequences. More than 10% of patients without acute infection experienced diarrhoea and nausea 1–2 days before the disease advanced and critical symptoms, including fever and respiratory illness, developed [11,12,13]. This raised the possibility that the virus is carried in faecal matter in either an active or infective state.

This was later confirmed by the presence of the infectious virus and its genetic material (RNA) in the faecal matter of several patients infected by SARS-CoV-2 [14]. The RNA was continuously detected in the faecal matter of more than 23% of patients, whose ages ranged from 10 months to 78 years [15]. During the infamous SARS (2002–03) and MERS (2012) outbreaks, patients suffered from several gastrointestinal symptoms, including diarrhoea, at the beginning of the disease [16, 17]. It has been reported that 16–73% of patients with SARS had diarrhoea during the course of the illness, mostly during the first week of the disease, and that viral RNA continued to be found in the faecal matter of patients even after 30 days of illness [17, 18]. As the disease progressed, MERS-CoV RNA was also detected in the faecal matter of most of the positive patients, which was attributed to replication in human primary intestinal epithelial cells [19, 20]. While the ability to bind ACE2 receptors determines the infectivity of SARS-CoV, SARS-CoV-2 was found to use human ACE2 more efficiently than other strains of SARS-CoV, including the strain identified in 2003 [21].

Although several recent publications have highlighted the presence of SARS-CoV-2 genetic material [22,23,24,25,26,27], there is to date no evidence of faecal-oral transmission of SARS-CoV-2. However, the presence of viral genetic material in toilet bowls, sinks, saliva and respiratory secretions of infected patients [28, 29] and in raw sewage in different parts of the globe [22,23,24,25,26,27] has raised several concerns, including enteric involvement of this virus. Available evidence on the SARS and MERS viruses suggests that coronaviruses can survive for several weeks and retain their viability for several days over a range of temperatures. For example, SARS-CoV remained viable for 5 days at temperatures of 22 to 25 °C and a relative humidity of 40 to 50% [30]. Interestingly, the same virus remained infectious for 2 weeks at 4 °C, but this period fell drastically to 2 days when the temperature was raised from 4 to 20 °C [31]. Similarly, a recent study on SARS-CoV-2 used computational modelling to study the travel time and survival of the virus from source to wastewater treatment plants (WWTPs) [32]. Considering the evidence of faecal discharge for both SARS-CoV and MERS-CoV and their capacity to stay viable under conditions that could encourage faecal-oral transmission, it is conceivable that SARS-CoV-2 could likewise be transmitted by this route. Thus, the enteric involvement of SARS-CoV-2 is of greater concern in developing and underdeveloped countries where poor hygiene practices and open defecation are prevalent.

Infectivity of SARS-CoV-2

Several techniques have been developed to detect viruses and their genetic material in various environmental matrices. In general, some techniques test the infectivity of viruses, while others focus on nucleic acid isolation. Most approaches to detect and diagnose SARS-CoV-2 are based on isolating the genetic material of the virus to confirm infection among patients (Fig. 2). In the past, however, both cell culture and PCR-based molecular techniques have been used [33, 34]. Cell culture assays investigate both virus presence and infectivity by monitoring the response (cytopathic effects, cell bursting, plaques) of suitable host cells. However, the difficulty of culturing and slow growth rates discourage the application of this method, particularly to the novel virus, for which the expense of the cell culture assay and the requirement for biosafety level 3 facilities are additional concerns. Recently, many researchers have reported the presence of SARS-CoV-2 RNA in wastewater using the reverse transcriptase-PCR (RT-PCR) technique. This technique involves RNA concentration, extraction and amplification to detect its presence at the lowest available concentration, without giving much information on the infectious nature of the virus. There is much discussion around the infectious nature of SARS-CoV-2 and the role of its genetic material.

Fig. 2 Extraction and analysis protocol of SARS-CoV-2 (adapted from Ahmed et al. (2020b), Kumar et al. (2020) and Medema et al. (2020))

However, SARS-CoV-2 needs much more than its genetic material to spread infection. Outside the host, the virus is considered to be a non-living particle, i.e. a virion, which is mainly composed of RNA surrounded by a lipid envelope and cannot self-replicate. Infection is possible only when viable virus is present, not merely when its genetic material is present in the host body. The virus is said to be viable if it replicates and increases the viral load in the host body. SARS-CoV-2 attaches through its spike proteins (S protein) to the host cell receptor (ACE2), which is commonly found in the respiratory and intestinal tracts of human beings, and subsequently enters the host cell to replicate [10]. As PCR targets specific parts of the viral genetic material, this technique cannot establish the viability of the virus or the degree to which it has been affected by the disinfectants present in wastewater. Hence, the detection of SARS-CoV-2 RNA does not imply that the virus is infectious. Therefore, cell culture assay-based techniques are more appropriate for studying the infectious nature of the virus.

Inactivation of SARS-CoV-2 and Its Genetic Materials During Chlorination at WTPs and WWTPs

While several studies have identified the presence of SARS-CoV-2 in the faecal matter of infected patients [35, 36], there is growing concern about the transmission of the virus through water treatment plants (WTPs) and WWTPs. Several studies have also detected the genetic material of the virus in raw wastewater across the globe [22, 26, 27]. Although the viability and infectivity of the virus in faecal matter are yet to be documented, the possibility of virus transmission through wastewater and drinking water is of major concern. Previous studies have highlighted the persistence of coronaviruses, including the SARS virus, in wastewater, which ranged from hours to several days in the absence of disinfection [31, 37]. Chlorination is the most commonly used disinfection technique in both WTPs and WWTPs in developed and developing countries, mainly as a tertiary treatment step [38]. Although the effect of chlorine on SARS-CoV-2 has not yet been documented, available data on other enveloped viruses and coronaviruses can provide insight into the viability and infectious nature of the virus. Chlorination was found to be effective against several enveloped viruses, such as vesicular stomatitis virus, African swine fever virus, equine viral arteritis virus and porcine reproductive and respiratory syndrome virus, within 10 min of exposure [39].

Among the enveloped viruses, equine viral arteritis virus was 100% inactivated when exposed to chlorine (0.015%) for 1 min. Similarly, the SARS virus was reported to be disinfected adequately at a free chlorine dose of 0.2 to 0.5 mg/L [31]. Another enveloped virus, Ebola, was reported to be 100% disinfected at chlorine doses of 5 and 10 mg/L, and a 3.5 log10 reduction was reported in the presence of 0.16 mg/L free residual chlorine with a 20 s contact time [40]. Free chlorine is known to penetrate the membrane of a model enveloped virus (Pseudomonas phage Phi6) and react rapidly with its proteins and polymerase complex. Indeed, the peptides of the enveloped virus were 150 times more reactive than those of the non-enveloped coliphage MS2 studied. In such a scenario, inactivation of the order of 4 log10 was reported for Phi6 [41]. Recently, acidic electrolyzed water (EW) with a high concentration of free available chlorine has shown strong potential for inactivating SARS-CoV-2. It was found that virucidal activity of the virus significantly increased with the loss of free chlorine at long residence time [42]. Hence, the risk of SARS-CoV-2 transmission through drinking water and wastewater is low where chlorination is carried out, as it is expected to inactivate the virus and its genetic material, as has been seen for other enveloped viruses. However, community exposure through sewage overflow, buildings with faulty plumbing systems [43] or aerosol-mode transmission at WWTPs [44] cannot be ruled out.
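To put the log10 reductions quoted above into perspective, the following minimal sketch converts a log10 reduction into the fraction of virus inactivated and the titre that survives. The function names and the starting titre of 10^6 infectious units are illustrative assumptions, not values taken from the cited studies.

```python
def fraction_inactivated(log10_reduction: float) -> float:
    """Fraction of the initial virus inactivated for a given log10 reduction."""
    return 1.0 - 10.0 ** (-log10_reduction)


def surviving_titre(initial_titre: float, log10_reduction: float) -> float:
    """Titre remaining after disinfection, in the same units as initial_titre."""
    return initial_titre * 10.0 ** (-log10_reduction)


if __name__ == "__main__":
    initial = 1e6  # hypothetical starting titre (infectious units per litre)
    for lr in (3.5, 4.0):  # reductions quoted in the text for Ebola and Phi6
        pct = 100 * fraction_inactivated(lr)
        left = surviving_titre(initial, lr)
        print(f"{lr} log10 -> {pct:.4f}% inactivated, {left:.0f} units remain")
```

Even a 4 log10 reduction leaves a small residual fraction, which is why "100%" inactivation in practice usually means a titre below the detection limit of the assay.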

Perspectives of Wastewater-Based Epidemiology: Monitoring COVID-19

The WHO has highlighted that since the 1970s, over 1500 novel pathogens have been discovered and close to 40 new transmissible diseases have been identified [17]. Most of these diseases were reported to have a severe impact at the community level, with many outbreaks reported during the last 20 years, most significantly SARS (severe acute respiratory syndrome) during 2002–2003 and COVID-19 during 2019–2020 [17]. The consequences of such epidemics and pandemics have made it critical to monitor the spread and trend of disease. However, existing surveillance systems have various limitations, mainly in coping with rapid population growth and changing environmental conditions.

The wastewater-based epidemiology (WBE) approach could be applied as a monitoring tool for surveillance and early warning of transmissible disease hotspots (Fig. 3). According to recent reports [35, 45], SARS-CoV-2 was shed in patients’ faecal matter for much longer (e.g. 22 days) than in upper respiratory swabs (10 days). Hence, identification and quantification of the viral genome in wastewater can be a reliable indicator of disease prevalence among communities, as shown for other similar viruses [46,47,48]. Recent studies have reported the detection of SARS-CoV-2 RNA in wastewater [27, 46, 49,50,51,52,53], with the amount of RNA detected being higher than expected from the reported infection cases. Moreover, SARS-CoV-2 was detected in wastewater 7–10 days before cases were clinically reported [46, 53, 54]. In summary, wastewater analysis is recommended as a non-invasive early-warning tool that can help monitor the trend and status of COVID-19 spread and fine-tune the public health response [35, 55, 56]. Thus, under present conditions, such an environmental surveillance tool also has the potential to be implemented in wastewater treatment systems to help authorities manage and regulate the exit strategy.

Fig. 3 Wastewater-based epidemiology and monitoring of SARS-CoV-2

Predictive Models

Earlier knowledge regarding 2019-nCoV hinted that transmission was limited to that between animals and humans, with less emphasis on inter-human transmission, which limited earlier epidemiological predictions [57, 58]. Recent developments in phylogenesis and molecular epidemiology, together with refinements in evolutionary models, have helped build a broader understanding of the spread and transmission of the current pandemic [59]. Phylogenetic evolution can be traced and validated using different mathematical models, primarily aimed at the detection of episodic mutating diversification and pervasive selection, including techniques of projection-based fitting of nucleotide substitution [60]. Further, homology models, which enable protein structures to be analysed and compared in three-dimensional space, provide greater leverage for understanding the structural templates [61].

However, a serious shortcoming of various risk-based projection models concerns their ability to predict the spread of infection, which changes considerably and dynamically in response to the social behaviour of people under a time-accelerated learning model [62]. This void creates considerable scope for purely statistics-based mathematical models in which epidemiological considerations take a back seat, and so do the accuracy and precision of the predictions. Therefore, we have reviewed the historical, mathematical and epidemiological evolution of various models with regard to their capabilities in early detection of (i) disease spread, (ii) co-variability of the virus with environmental factors and (iii) transmission among and between different multi-cellular species.

Disease Spread

The potential scale of the spread of a virus can be understood in terms of its basic reproduction number ($R_0$). However, studies show that the variability of $R_0$ with societal intervention is more severe than previously assumed, as was the case in Wuhan, China, where $R_0$ declined from 3.86 to 0.32 after the implementation of lockdown [62, 63]. The models used for such estimates, often referred to as compartmental models, segregate the population into broad categories such as susceptible (S), exposed (E), infectious (I) and recovered (R). They have gained widespread popularity in the field of epidemic spread due to their greater mathematical emphasis on simulating and adapting to real conditions. Most of these epidemiological models trace their origin to the theory of Kermack and McKendrick (1927) [64] and the early works of Ross and Hudson [65], which stressed finding the critical causal factor that affects the severity and frequency of an epidemic. A brief overview of their work is therefore essential to understanding the assumptions as well as the accuracy of these models. Kermack and McKendrick formulated a simple hypothesis: an epidemic propagates in the community via direct contact with an infected person, while other, indirect transmission sources are negligible.
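As an illustration of how a compartmental model behaves, the sketch below integrates a textbook S-E-I-R model with forward Euler steps. It is not the model of any cited study: the function `simulate_seir`, the 5-day infectious period, the 5.2-day latent period and the population of 10,000 are all assumptions chosen for demonstration, and the transmission rate is set from the textbook SEIR relation $R_0 = \beta / \gamma$ using the two Wuhan values quoted above.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, rec0, days, dt=0.1):
    """Integrate a basic SEIR model with fixed-step forward Euler.

    beta  : transmission rate per day
    sigma : rate of leaving the exposed compartment (1 / latent period)
    gamma : rate of leaving the infectious compartment (1 / infectious period)
    Returns a list of (time, S, E, I, R) tuples.
    """
    n = s0 + e0 + i0 + rec0  # total population, held constant
    s, e, i, r = float(s0), float(e0), float(i0), float(rec0)
    history = [(0.0, s, e, i, r)]
    for step in range(1, int(round(days / dt)) + 1):
        new_exposed = beta * s * i / n * dt      # S -> E
        new_infectious = sigma * e * dt          # E -> I
        new_recovered = gamma * i * dt           # I -> R
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        history.append((step * dt, s, e, i, r))
    return history


if __name__ == "__main__":
    gamma = 1 / 5.0  # assumed 5-day infectious period (hypothetical)
    sigma = 1 / 5.2  # assumed 5.2-day latent period (hypothetical)
    for r_nought in (3.86, 0.32):  # pre- and post-lockdown values from the text
        run = simulate_seir(r_nought * gamma, sigma, gamma,
                            s0=9990, e0=0, i0=10, rec0=0, days=200)
        peak_infectious = max(row[3] for row in run)
        print(f"R0 = {r_nought}: peak infectious = {peak_infectious:.1f}")
```

With the higher value the infectious compartment grows to a large peak, whereas with the lower value it never exceeds its initial size, which is the qualitative behaviour the threshold argument describes.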

With the rapid spread of infection, the number of sick people in the population increases and then stabilizes on account of increasing deaths and recoveries. Also, the chances of recovery, death or transmission to other persons change with each passing day in the infection cycle of a sick person. A primary assumption of the theory was that epidemics are usually short-lived and therefore cannot change the population at large. Thus, the size of the community under scrutiny was kept constant for modelling. The study established that for each set of recovery, death and infectivity rates, there exists a threshold population density. If the current population density of the region exceeds the threshold, then the addition of any newly infected case will restart the epidemic cycle. An important observation concerned predicting the end of the epidemic. Kermack and McKendrick held that infection persists in a region as long as its population density exceeds the critical density, with the epidemic driving the community towards the critical threshold value, where it finally wanes and dies out. The theory was further modified to compare the virulence and severity of different viruses for populations with identical population density, recovery rate and death rate. To give a mathematical basis to this hypothesis, the two researchers divided time into a large number of small constant units, with the assumption that infection starts at the beginning of a time unit and that no new outbreak occurs during the passage from one time unit to the next (eq. 1).

$$ Y_{t,\theta} = \sum_{\theta=0}^{t} P_{t,\theta} $$
(1)

Here, $t$ denotes time and $\theta$ the total number of time intervals. $Y_{t,\theta}$ is the total sick population and $P_{t,\theta}$ denotes the infected people at any given time instant. $P_{0,0}$ indicates the total infected population at the beginning of the modelling, including the newly infected as well as the existing infected people, who are the potential hosts for the epidemic spread in the population. If we take the rate of removal to be $\Psi$ (the sum of the recovery and death rates), then the total population removed from consideration will be $\Psi Y_{t,\theta}$, which is equal to $Y_{t,\theta} - Y_{t+1,\theta+1}$. Further, if we take the rate of infectivity after the passage of $\theta$ time intervals to be $\phi_{\theta}$ and the total unaffected population to be $x_t$, then the total number of people becoming infected in the unit area as a function of unaffected people is given by eq. 2.

$$ P_t = x_t \sum_{\theta=1}^{t} \phi_{\theta} P_{t,\theta} $$
(2)
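The bookkeeping behind eqs. 1 and 2 can be sketched in a few lines. In this simplified reading, `cohorts[k]` holds the people infected `k + 1` intervals ago, a constant removal rate `psi` stands in for $\Psi$, the callable `phi` stands in for the age-dependent infectivity, and new infections are the unaffected population times the infectivity-weighted sum over cohorts with $\theta \ge 1$. The function name, the constant removal rate and all numerical values are assumptions made for illustration; they are not taken from Kermack and McKendrick's paper.

```python
def kermack_mckendrick_discrete(x0, initial_infected, phi, psi, steps):
    """Discrete-time cohort model in the spirit of eqs. 1 and 2.

    x0               : initial unaffected population (x_t at t = 0)
    initial_infected : P_{0,0}, people infected at the start
    phi              : callable theta -> infectivity rate after theta intervals
    psi              : constant removal rate per interval (recovery + death)
    Returns (final unaffected population, list of total sick per interval).
    """
    x = float(x0)
    cohorts = [float(initial_infected)]  # infected-age cohorts, youngest first
    totals = [sum(cohorts)]              # eq. 1: total sick population
    for _ in range(steps):
        # Removal: each cohort ages by one interval and loses a fraction psi.
        aged = [(1.0 - psi) * p for p in cohorts]
        # eq. 2: new infections = x_t * sum over theta >= 1 of phi * P_{t,theta}
        force = sum(phi(theta) * p for theta, p in enumerate(aged, start=1))
        new_infections = min(x, x * force)  # cannot infect more than remain
        x -= new_infections
        cohorts = [new_infections] + aged
        totals.append(sum(cohorts))
    return x, totals


if __name__ == "__main__":
    # Hypothetical values: constant infectivity, 20% removal per interval.
    remaining, sick = kermack_mckendrick_discrete(
        x0=1000, initial_infected=10, phi=lambda theta: 0.001, psi=0.2, steps=30)
    print(f"unaffected after 30 intervals: {remaining:.1f}")
    print(f"peak sick population: {max(sick):.1f}")
```

Setting `phi` to zero reduces the model to pure geometric removal, which is a convenient sanity check on the cohort ageing step.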

Further simplification leads to the expression of these terms in a more complex differential form, drawing heavily upon the concepts of infinite integration, which is beyond the scope of this review. However, a more straightforward yet more accurate modified form of this model was used in [62, 63] to evaluate the coronavirus spread in Wuhan, China. The study divided the population of Wuhan into six categories: Susceptible (S), Latent (E), Reported Infectious (I), Unreported Infectious (A), Hospitalized (H) and Recovered (R). Further, two key parameters, the transmission rate ($b$) and the ascertainment rate ($r$), were used to develop a dynamic network of parameters, which could be represented by eq. 3.

$$ \frac{dR}{dt} = \frac{I+A}{D_i} + \frac{H}{D_h} + \frac{nR}{N-I-H} $$
(3)

where $D_i$ and $D_h$ are the infection and hospitalization periods, respectively. Further, the effective reproduction number of the disease was calculated as per eq. 4.

$$ R_t = \frac{D_i\, b}{A+I}\left(\alpha A + \frac{D_q\, I}{D_i + D_q}\right) $$
(4)

where $\alpha$ is the ratio of the transmission rate of unascertained cases to that of ascertained cases and $D_q$ is the period from the onset of symptoms to hospitalization. Further, the distribution parameters were estimated by Markov chain Monte Carlo (MCMC) simulations using likelihood and Poisson distribution functions.
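Eq. 4 is straightforward to evaluate once the compartment sizes and parameters are known. The sketch below is a direct transcription of the formula as written in this review; the function name and the numerical inputs are hypothetical and serve only to show the arithmetic.

```python
def effective_reproduction_number(b, alpha, A, I, D_i, D_q):
    """Evaluate eq. 4: R_t = D_i*b/(A+I) * (alpha*A + D_q*I/(D_i+D_q)).

    b     : transmission rate
    alpha : weighting applied to unreported infectious cases
    A, I  : unreported and reported infectious populations
    D_i   : infection period
    D_q   : period from symptom onset to hospitalization
    """
    if A + I <= 0:
        raise ValueError("A + I must be positive")
    return (D_i * b) / (A + I) * (alpha * A + (D_q * I) / (D_i + D_q))


if __name__ == "__main__":
    # Hypothetical inputs chosen only to illustrate the calculation.
    rt = effective_reproduction_number(b=1.0, alpha=0.5, A=20, I=10,
                                       D_i=3.0, D_q=6.0)
    print(f"R_t = {rt:.3f}")
```

When there are no unreported cases ($A = 0$) the expression reduces to $b\,D_i D_q/(D_i + D_q)$, which gives a quick check on any implementation.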

The same model was also used in [66] to predict an $R_0$ of around 2.68 for 2019-nCoV. Moreover, the baseline projection of that study estimated the mean number of imported infectious cases in the cities of Beijing, Shanghai and Shenzhen to be 113, 98 and 80, respectively. The authors of [67] employed similar compartmental models that divided the population of Hubei province into five categories (susceptible, asymptomatic, infectious with symptoms, isolated and recovered) to estimate an $R_0$ value of 6.49. The most conservative estimate of $R_0$ for China has come from the WHO, which predicted a value between 1.4 and 2.5. The study in [68] used two additional categories, deaths and cumulative cases, to estimate an $R_0$ of 4.08, assuming an average latent period of infection of 9 days. The study in [69] overestimated the $R_0$ value for China at 6.47, despite being based on compartmental models and including appropriate social interventions. While these epidemiological models are more flexible in incorporating dynamic changes in the spread of the virus, the variability of their predictions has increased researchers’ dependency on statistical trend-based models. In this regard, the study in [70] used a simple statistical growth function with an assumed incubation period and serial interval of 5.2 and 7.5 days, respectively.

Similarly, the authors of [71] used outbreak trajectories from stochastic simulations to obtain an $R_0$ value of 2.2, while those of [72] used a more precise estimation method involving epidemiological data on hospitalization, death and onset of illness. Through a doubly interval-censored likelihood function and Bayesian methods, different parameter values within the intervals were determined. Statistical models have also been used to estimate the serial interval (the time between two successive cases in an epidemiological transmission chain) of nCoV using a similar Bayesian approach [73]. A few studies have relied on fitting data to different probability density functions to estimate the incubation period of 2019-nCoV to be around 5.1 days [74].
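The idea of fitting a probability density function to incubation data can be illustrated with a log-normal distribution, whose maximum-likelihood parameters are simply the mean and (population) standard deviation of the log-transformed observations. The choice of a log-normal is an assumption made here for simplicity, the function name is hypothetical, and the sample values are invented for demonstration; none of them come from the cited studies.

```python
import math
import statistics


def fit_lognormal(samples):
    """Maximum-likelihood log-normal fit.

    Returns (mu, sigma, median), where mu and sigma are the mean and
    population standard deviation of log(samples) and median = exp(mu).
    """
    if any(x <= 0 for x in samples):
        raise ValueError("log-normal fit needs strictly positive samples")
    logs = [math.log(x) for x in samples]
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs)
    return mu, sigma, math.exp(mu)


if __name__ == "__main__":
    # Invented incubation periods in days, for illustration only.
    incubation_days = [2.5, 3.0, 4.0, 4.5, 5.0, 5.5, 6.0, 7.0, 9.0, 12.0]
    mu, sigma, median = fit_lognormal(incubation_days)
    print(f"mu = {mu:.3f}, sigma = {sigma:.3f}, median = {median:.2f} days")
```

In practice several candidate distributions would be fitted and compared, and censoring of the exposure window would need to be handled, as the doubly interval-censored likelihood mentioned above does.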

Effect of Environmental Factors

The association of environmental conditions with the spread of 2019-nCoV grew out of early work on SARS-CoV, which found that, of the four structural proteins of SARS-CoV, i.e. (i) spike protein (S), (ii) nucleocapsid (N), (iii) envelope protein (E) and (iv) membrane protein (M), the N protein exhibits the highest hydrophilicity and the least stability due to the absence of disulphide bonds. Its high isoelectric point (pI = 10.1) and the absence of cysteine residues make the N protein the weakest link of the virus [75]. Any disruption of the N protein inhibits the replication of viral RNA. Further, sub-genomic transcription and translation also suffer, leading to instability of the virus. An earlier study [76] suggested that poliovirus and rhinovirus suffer a similar fate, owing to viral capsid inactivation at 42–45 °C following the denaturation of protein structures. A study by Jane-Valbuena et al. [77] further showed that a few outer capsid proteins are highly thermally sensitive. These studies provided sufficient reason to model the spread of 2019-nCoV against external environmental conditions, on the assumption that infections through secondary routes will decrease due to inactivation of the virus on different surfaces, while the possibility of transmission via direct contact remains unchanged.

While the 2019-nCoV transmission rate depends directly on close contact with an infected person, aerosol suspension of droplets from contagious people remains a significant source of secondary infection. A study [78] showed that a known coronavirus (HCV/229E) could survive with a half-life of between 27 and 67 h at humidity between 30 and 50%. Further evidence suggests that a decrease in temperature of 6 °C can increase the half-life of the virus by 3 h, even at 80% humidity [79]. An interesting observation in this regard was made during the SARS outbreak, where it was suggested that, due to the low humidity inside an airplane, the virus travelled more than 2 m from the infected person [80]. It is worth noting that all the previous studies have explicitly pointed to humidity and air temperature as the controlling variables in virus spread. However, there is still no clear consensus on the magnitude of their impact. Published articles have stated that warmer weather suppresses the spread of contagion, but no previous outbreak reached the scale of that in 2019–2020, which limits the applicability of their inferences [81,82,83,84]. The various modelling studies done in this regard are presented in Table 1.
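The half-lives quoted above translate into survival times through simple first-order (exponential) decay. The sketch below assumes such decay holds, which is a modelling assumption rather than a finding of the cited studies; the function names are hypothetical.

```python
import math


def fraction_remaining(hours: float, half_life_hours: float) -> float:
    """Fraction of viable virus left after `hours`, assuming first-order decay."""
    return 0.5 ** (hours / half_life_hours)


def hours_to_log10_reduction(half_life_hours: float, log10_reduction: float) -> float:
    """Time needed to reach a given log10 reduction under first-order decay."""
    return half_life_hours * log10_reduction * math.log2(10)


if __name__ == "__main__":
    for half_life in (27.0, 67.0):  # range of half-lives quoted in the text
        after_day = fraction_remaining(24.0, half_life)
        t_2log = hours_to_log10_reduction(half_life, 2.0)
        print(f"half-life {half_life:.0f} h: {100 * after_day:.1f}% viable "
              f"after 24 h, 2 log10 reduction after {t_2log:.0f} h")
```

Under this assumption, the difference between a 27 h and a 67 h half-life more than doubles the time needed to reach any given reduction, which is why the humidity and temperature dependence matters for surface and aerosol transmission.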

Table 1 Models developed to study virus transmission in the environment

Transmission

The spread of the coronavirus is sporadic and occurs through human-to-human transmission, mainly via droplets produced during coughing or sneezing. Two pathways are involved in transmission: direct contact with a symptomatic patient and indirect contact via the environment [92]. Direct contact implies being in close proximity to a COVID-19-infected patient, that is, within a radius of 1 m. Indirect contact, on the other hand, happens via microbes within droplet nuclei (i.e. particles < 5 μm in diameter) in the air or on surfaces [35, 36]. These droplet nuclei can survive in the air or on surfaces for a longer period, and the virus can therefore be transmitted through contact with a contaminated surface. However, the probability of airborne transfer depends on how long droplet nuclei are retained in the air or on surfaces. Indirect transmission is less likely than direct transmission. Further, there has been some evidence suggesting faecal-oral transmission of the virus, as SARS-CoV-2 infects the intestinal tract, which implies the faecal presence of the virus [93,94,95,96,97,98,99]. However, as mentioned above, no such route has been established as of now. Based on the available evidence, the WHO has recommended the use of masks to avoid direct and indirect transmission of the virus.

Conclusions

The impact of the coronavirus pandemic is devastating. It has raised several concerns among the population, driven by the fear of catching the infection. Although several enveloped viruses are reported to survive for many days, even at 25 °C, countries with hot weather have reported cases of COVID-19. Several factors, including virus transmission, infectivity and inactivation, must be considered before assessing WBE. This review raises concern about the faecal-oral route of virus transmission, especially in developing countries that lack disinfection steps in water and wastewater treatment plants. Additionally, countries that lack proper sanitation facilities and practise open defecation are more vulnerable to SARS-CoV-2 infection. The high variance in the predictions of various statistical models makes them less useful to government authorities in containing the spread of the disease. Compartmental models, however, can offer higher adaptability under these circumstances and greater leverage in policy planning.