1 Introduction

Host response to a pathogen is a result of a complex, highly dynamic set of interactions between the two species. The precise set of interactions in a given situation decides the outcome, leading to one of the possible end results, which are (a) clearance, where the infection is fully eliminated from the host;1 (b) active disease, where the infectious agent is successful in evading the host immune response2 and (c) a stalemate, where there is no clear winner but the infection persists in some sort of a dormant form.3 Thus, there can be extremely diverse outcomes between the same pair of species based on the specific set of interactions. These are governed by the specific set of signalling cascades or pathways that are triggered, as also their relative extents, making it important to understand both the qualitative and quantitative aspects of such cascades.

The host system has a demanding task of sensing, recognising and responding to a pathogen, while the pathogen also has a daunting task of sensing the specific response from the host, subverting and modulating it where possible, not only to survive, but in fact also to thrive in the hostile environment of its host.4 Signalling in pathogens can also enable them to hijack host cells to derive nutrients and other important machinery for their survival.5 Signalling occurs in cells through complex networks of molecular interactions, geared to respond to a variety of chemical, mechanical or electrical cues.6 The networks provide a framework to understand complex properties such as multi-specificity, pleiotropy, redundancy and cross-talk.7 Besides responding to infections, the networks also cater to several diverse activities within a cell such as maintenance of cellular homeostasis, communication between cells and tissue repair.8 Many of these involve the same set of signalling components, but may vary in the relative extents to which each component is triggered. It is not surprising that the host response depends on the relay of information to and from other cells around it9 and is intricately interconnected to its nutrition state and metabolism. A large number of molecules are involved in signalling, that include small molecule ligands, peptide and non-peptide hormones, complex glycans and glycan attachments of various proteins, cell surface receptors, effector proteins, enzymes such as kinases and phosphatases and other mediators. 
Information is processed through a range of molecular events spanning molecular recognition, binding, conformational changes and quaternary structure rearrangements, enzyme action, post-translational modifications, making or breaking of protein–protein interactions, translocation to the nucleus or other cellular compartments, manipulating transcription and regulating gene expression, finally resulting in regulation of cellular processes.10 The whole process can span multiple spatial and temporal scales.11 For example, the process of apoptosis is regulated by multiple signalling pathways, which together orchestrate the precise temporal and spatial distribution of the corresponding end-players.12

2 Need for a Systems Approach

It is clear from the extensive interconnectedness and contextual dependence at multiple levels that an understanding of multiple organisational levels from molecular interactions to epidemiological consequences is required. A comprehensive appreciation of the various signalling pathways, their inter-connections and how, when and to what extent they influence each other requires a systems approach, where individual components can be studied in the larger context of a connected system.13 In the case of signalling due to infections, there is an additional layer of complexity that arises due to the interactions between the signalling pathways of the pathogen and those of the host, making it a study of a ‘system of systems’ (Fig. 1).

Figure 1:

A schematic representation of the study of signalling networks involved in host response to pathogens. Networks in both host and pathogen are shown with three types of edges: host-specific (blue), pathogen-specific (red) and HPI (green) edges. Examples of modelling approaches are illustrated as also the applications that emerge from them.

Systems biology aims to reconstruct systems by a bottom-up approach, with detailed knowledge about the individual components and their interactions, providing a basis to study the emergent properties of systems. A molecular systems approach has the added advantage of providing a helicopter view that has emerged from the full knowledge of the individual components, making it a high-resolution way of studying biological processes.14,15 A synergy between experimental and computational efforts, involving a series of experimental measurements, model building, refinement and validation, captures and rationalises knowledge about the system. Systematic screening of the parameter space, perturbations of various kinds including individual component knock-outs are commonly performed to choose the best model.16 Increasingly, the model building methods are leaning upon large-scale data collected through high-throughput omics experiments. Their simulations generate testable hypotheses that can be validated through experiments (Fig. 2). A high-resolution view of signalling systems that encompasses aspects ranging from molecular interactions to cellular responses is still not within easy reach. However, there is substantial progress in our understanding of discrete levels in cellular organisation, especially in genome-wide interaction networks, quantitative modelling of individual pathways, receptor structures and dynamics. In the following sections, we discuss some of the widely used modelling approaches for studying signalling systems and the insights gained from them in several cases.

Figure 2:

Hierarchical representation of cellular complexity from a systems biology perspective. The range of input data that can facilitate systems biology studies and various modelling strategies are shown.

3 Signalling Pathway Data

It is well known that signalling mechanisms involve a series of steps, starting from receptor activation by a signalling molecule leading to the cellular response, which are studied as signalling pathways.17 Several signalling pathways are well characterised,18,19 where the pathways have been traced step-by-step to identify the effect of an extracellular stimulus, connecting the signal at the cell surface to its effects in the nucleus. A review by Seger and Krebs elucidates how the MAPK signalling pathway was traced.19 For several other pathways, while genetic studies have identified the individual components and the direction of flow of information in them, biochemical experiments have typically characterised the nature and strength of interactions involved in signalling.20 Novel molecules involved in signalling pathways and finer details are continuously being added to our knowledge base. To access and comprehend this vast amount of information, several databases of signalling pathways have been developed and made publicly available. Some common ones are listed in Table 1. SignaLink2,21 developed a few years ago, is a comprehensive resource of curated information about signalling pathways, their transcriptional and post-transcriptional regulators, modifier enzymes as well as the downstream targets of seven major pathways. These correspond to the RTK [receptor tyrosine kinase, including EGF (epidermal growth factor)/MAPK and IGF (insulin-like growth factor)/Insulin], TGF-β (Transforming growth factor-beta), Wingless/Wnt, Hedgehog, JAK/STAT (Janus kinase/signal transducer and activator of transcription), Notch and the NHR (Nuclear hormone receptor) pathways. 
Signal transduction, though once considered an assembly of linear steps, is now well appreciated to involve a highly interconnected network with extensive cross-talk among different linear components in the network.22 In addition, studies elucidating genome-wide protein–protein interactions also provide a considerable amount of information about signalling pathways. Examples of such resources are KEGG,23 Panther,24 Protein Lounge (http://proteinlounge.com) and NetPath.25 Data related to signalling pathways can be retrieved from these databases26 and can be used to build quantitative models or network-based models to represent a dynamic system. A protein sequence database such as Ensembl contains more than 4000 entries annotated with the term ‘signal transduction’, providing a list of proteins known to be involved in signalling in one species or another. Sequence homology searches, which are now routine in many laboratories, can extend the annotations to homologues, thereby providing an expanded list of such molecules. It is estimated that about 5% of the total protein content in a given cell is composed of signalling molecules.27 About 300 different signalling pathways are documented in various databases.28 There are considerable data about the quantitative measurement of individual signalling molecules and the rates of the reactions they are involved in, facilitating construction of quantitative models, as described in the next section.

Table 1: List of online databases for signalling pathways.

4 Modelling Approaches to Study Signalling Systems

Comprehending large amounts of complex information and using it for a higher-order understanding of biological systems has necessitated the use of mathematical and systematic computational analyses. A model that mimics a natural process can be used to gain insights into process mechanisms and predict outcomes for a given set of input parameters. Models can be built at different levels of coverage and resolution. What constitutes a model depends upon the question that needs to be addressed through the simulations and the data available about the components at that point of time. Theoretical models serve as useful tools to probe various aspects of a signalling pathway, particularly for addressing systems level questions such as the effect of any variation in the pathway on the cellular behaviour, the most impacted interacting pathways, its regulators and so on. Models are useful to obtain an understanding of (a) the role of individual components and their interactions resulting in a given phenotype, (b) the effect of various perturbations such as alterations in concentration of a component (e.g., capturing gene expression or copy number changes), (c) the alterations in binding affinity during complex formation (e.g., due to single nucleotide polymorphisms or disease mutations) and also (d) the effect of loss of existing components or gain of new components, in a quantitative manner. Needless to say, once a model is built, available computational methods facilitate high-throughput simulations. In some cases, they also enable a study of what may be highly impractical or even impossible through available biochemical and molecular biology techniques.
System models are best suited to study emergent properties of signalling systems such as signal amplification, cooperativity, cross-talk, synergy, antagonism and the balance among them.29 The significance of single-step versus multi-step signalling pathways can also be systematically investigated through these models, to gain both qualitative and quantitative insights. Various toy models, regarded as belonging more to the realm of theoretical biology, have also been used to study consequences of structural design or topology of the models, which have provided significant insights into the topologies and conditions that exhibit bistability and switch-like behaviour.29–32 The influence of graded stimuli, the rate of signal transmission and the length and complexity of the pathway are also investigated through these models, which have led to the identification of control points in different systems as well as strategies for their regulation.30,33

With increasing availability of genome-wide data, the landscape of modelling strategies has expanded considerably (Fig. 2). About a decade ago, modelling of signalling pathways was predominantly based on use of the physicochemical parameters for events such as protein modifications (e.g., phosphorylation), intermolecular associations, translocation and intracellular localisation. These have been mathematically captured as a set of connected ordinary differential equations (ODEs), which are widely referred to as ODE models or kinetic models.30 A detailed kinetic model requires information about the reaction mechanism and corresponding kinetic parameters and can perform time-course simulations. Such models were built based on prior knowledge obtained from biochemical and molecular biology experiments. These models have provided quantitative insights such as the influence of a change in concentration of a signalling ligand or a mutation in a receptor or the rate of a particular reaction, on the outcome of the pathway. Such models, however, are small involving only a few reactions and a handful of parameters. Several tools such as Kinetikit,29 COPASI34 and CellDesigner35 have been developed to build and simulate such models for different conditions.36 Well-known examples in this category include models of the EGF receptor (EGFR) and MAPK pathways.37,38 A model of EGFR signalling using an ODE-based approach was developed by Muller et al. to study the signal response when EGF binds to its receptor and its effect on the downstream signalling cascade.38 The model was generated using kinetic parameters available in literature and cellular protein concentrations were either compiled from literature or experimentally determined. They extensively studied the effect of ligand concentration and its binding affinity on the downstream signalling cascade. 
They concluded that the EGF-induced response is consistent over a 100-fold range of ligand concentrations, as the internal amplification mechanism contributes to a maximal ERK (extracellular signal-regulated kinase) signal. However, the initial velocity of receptor activation was identified as a critical parameter for signal efficacy. The results of their simulations correlate with the experimental findings that report the effect of EGF resulting in phosphorylation of ERK-1/2 and subsequent expression of the target gene c-fos. However, kinetic data are not available in such detail for many pathways, making it necessary to adopt different approaches. A second, more compelling reason to seek different approaches is to expand on the scale of the model. ODE models cater to studying individual pathways with tens of reactions and are not best suited to modelling larger networks involving hundreds to thousands of components and interactions.
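As a minimal sketch of the kind of kinetic model discussed above, the following integrates reversible ligand-receptor binding (L + R ⇌ LR) under mass-action kinetics. The rate constants, concentrations and function names are illustrative assumptions and are not taken from the Muller et al. model; a fixed-step fourth-order Runge-Kutta integrator is used only to keep the example free of dependencies.

```python
def binding_rhs(state, k_on, k_off):
    """Mass-action rates for L + R <-> LR; state = (L, R, LR)."""
    L, R, LR = state
    flux = k_on * L * R - k_off * LR  # net forward flux
    return (-flux, -flux, flux)


def rk4_step(f, state, dt, *args):
    """Advance state by one classical Runge-Kutta step."""
    k1 = f(state, *args)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), *args)
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), *args)
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)), *args)
    return tuple(
        s + dt / 6.0 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_binding(L0, R0, k_on=1.0, k_off=0.1, dt=0.01, t_end=50.0):
    """Return the time course [(t, L, R, LR), ...] (arbitrary units)."""
    state = (L0, R0, 0.0)
    n_steps = int(round(t_end / dt))
    course = [(0.0, *state)]
    for i in range(1, n_steps + 1):
        state = rk4_step(binding_rhs, state, dt, k_on, k_off)
        course.append((i * dt, *state))
    return course


if __name__ == "__main__":
    # Scan a 100-fold range of ligand concentration (hypothetical values).
    for L0 in (0.1, 1.0, 10.0):
        t, L, R, LR = simulate_binding(L0=L0, R0=1.0)[-1]
        print(f"L0={L0:5.1f}  bound receptor at t={t:.0f}: {LR:.3f}")
```

Scanning the initial ligand concentration in this way mirrors, in a very reduced form, the dose-response questions that detailed kinetic models address; a realistic pathway model would add the downstream cascade and experimentally derived parameters.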

The ‘omics’ data provided by genome-wide gene expression and protein quantification experiments can be used for generating networks. Some other approaches used for modelling are Boolean modelling and rule-based modelling.39 Rule-based models are developed on the principle of Gillespie’s algorithm,40 according to which a cell is considered as a well-mixed system and interaction between any two molecules in the cell is dependent on the abundances of each interacting molecule as well as the rate of the corresponding reaction. A model is constructed using a defined set of rules for the components, their transformations and their respective interactions in the system. At each step, the most probable state is predicted from a large set of possible interactions in the system, based on mass-action kinetics or stochastic simulations, thus eliminating the need for enumerating them explicitly. A model can be systematically studied with a range of input conditions and parameters, to predict outcomes of specific scenarios.41 BioNetGen,42 Kappa43 and RuleMonkey44 are some of the software tools that can be used for this approach. An example is the modelling of the EGFR pathway by Danos and coworkers,44 where the authors tackled the problem of combinatorial complexity and captured physicochemical causality at individual steps. They investigated the pathway structure and its dynamics and obtained insights for a range of system behaviours. Another example, a study of iron-dependent signalling in the host and pathogen, is described in a later section.
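The stochastic step underlying such simulations can be sketched as follows, using Gillespie's direct method for a single reversible binding reaction (A + B ⇌ AB) in a well-mixed volume. This is not a rule-based engine of the kind provided by BioNetGen or Kappa, which derive the applicable reactions from rules; the function name, species and rate constants are hypothetical and serve only to show how abundances and rate constants determine which event fires next.

```python
import math
import random


def gillespie_binding(n_a, n_b, n_ab, k_bind, k_unbind, t_end, seed=0):
    """Direct-method stochastic simulation of A + B <-> AB.

    Returns a list of (time, n_a, n_b, n_ab) tuples, one per event.
    """
    rng = random.Random(seed)
    t = 0.0
    trajectory = [(t, n_a, n_b, n_ab)]
    while True:
        # Propensities depend on current abundances and rate constants.
        a_bind = k_bind * n_a * n_b
        a_unbind = k_unbind * n_ab
        a_total = a_bind + a_unbind
        if a_total == 0.0:
            break  # no reaction can fire
        # Waiting time to the next event is exponential with rate a_total.
        t += -math.log(1.0 - rng.random()) / a_total
        if t > t_end:
            break
        # Choose which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_bind:
            n_a, n_b, n_ab = n_a - 1, n_b - 1, n_ab + 1
        else:
            n_a, n_b, n_ab = n_a + 1, n_b + 1, n_ab - 1
        trajectory.append((t, n_a, n_b, n_ab))
    return trajectory
```

Calling, for example, `gillespie_binding(50, 40, 0, k_bind=0.01, k_unbind=0.1, t_end=20.0)` yields one stochastic trajectory; repeating the call with different seeds gives the distribution of outcomes that a deterministic ODE model averages over.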

Large systems are also often represented as networks, where the nodes form the individual molecular components and the interactions between them form the edges. Interactions of the structural type referring to complexes between two proteins, and also of the functional type, referring to functional influences of one protein on the other, are both included as interactions. Experimentally identified interactions available in various databases provide a rich source of such interactions, which are further augmented through a variety of knowledge-based predictions. Algorithms providing such predictions are based on the concepts of Rosetta Stone,45 phylogenetic profiling 46 or gene neighbourhood.47 Databases such as STRING,48 provide a useful resource for this purpose. The edges in a signalling pathway have specific directions that indicate the direction of reactions in the pathway. The nodes can also be weighted, to indicate the relative importance of the individual nodes. Weighting schemes can be chosen such that it helps in addressing the specific question being asked. For example, abundance data of individual proteins can be used to weight the nodes, to identify specific pathways in a complex network. Graph theoretical methods are then used to identify important components, the most feasible routes as well as variations in the network under different conditions. For the signalling networks, interactions can also be tagged as stimulatory or inhibitory edges.49 The importance of each node can be systematically determined by a perturbation analysis where a node may be deleted or down-weighted and its effect on the network studied. Networks can be used for addressing a range of questions including pathway finding or identifying a minimal set of nodes sufficient for the particular signal propagation process.
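The perturbation analysis described above can be illustrated with a small directed graph. The sketch below deletes each node in turn and checks whether a signal can still propagate from a source to a target; the toy network and all node names are hypothetical, and reachability stands in for the richer weighted measures a real study would use.

```python
from collections import deque


def reachable(edges, source, target, removed=frozenset()):
    """Breadth-first search on a directed graph, skipping removed nodes."""
    if source in removed or target in removed:
        return False
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in edges.get(node, ()):
            if nxt not in seen and nxt not in removed:
                seen.add(nxt)
                queue.append(nxt)
    return False


def essential_nodes(edges, source, target):
    """Nodes whose individual deletion disconnects source from target."""
    nodes = set(edges) | {n for targets in edges.values() for n in targets}
    return sorted(
        n for n in nodes - {source, target}
        if not reachable(edges, source, target, removed={n})
    )


# Hypothetical toy network: receptor -> adaptor -> two parallel kinases -> tf
toy = {
    "receptor": ["adaptor"],
    "adaptor": ["kinase1", "kinase2"],
    "kinase1": ["tf"],
    "kinase2": ["tf"],
}
print(essential_nodes(toy, "receptor", "tf"))  # ['adaptor']
```

In this toy case only the adaptor is essential, because the two kinases provide redundant routes; the same idea extends to down-weighting rather than deleting nodes.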

In any modelling exercise, model building is a critical step, which is often iterated and refined with validation steps. Different models built at different levels of abstraction provide different levels of insights, thus making it important to understand the model, before drawing inferences from it. A number of tools are available to model biological pathways. For example, ODEs can be modelled through JWS Online,50 COBRA,51 VirtualCell,52 Cytoscape,53 PathVisio,54 Pathwaytools,55 Cell Illustrator,56 while tools such as BoolNet,57 BooleanNet,58 SimBoolNet,59 ViSiBooL,60 BooleSim61 can be used for Boolean modelling and Kappa and BioNetGen can be used for rule-based modelling. Several R and Matlab packages are also available for network-based simulations. A pictorial representation of the model can be generated using visualising engines such as Cytoscape53 and Gephi.62 For a detailed description of the modelling methods and how they have helped in understanding cell signalling in specific cases, the reader is referred to excellent reviews available in the literature.12,15 In the next few sections, we will focus on how modelling has helped in understanding signalling in infectious diseases. Only representative examples are described, as a description of all modelling studies in this category is beyond the scope of this article.

5 Case Studies in Infectious Diseases and Insights Gained by Modelling

5.1 Models for Studying Mycobacterium tuberculosis Infection

Mycobacterium tuberculosis (Mtb) employs a complex interplay between cellular signalling and transcriptional regulation to combat a wide variety of stresses inside the host.63 Several aspects of pathogenicity, virulence and the role of different signalling pathways have been studied in a variety of modelling studies, carried out at different levels of abstraction. A model based on nonlinear ODEs capturing cellular level immune response due to Mtb infection has simulated conditions of active disease, latency and reactivation.64 The study indicated that latency might result in tissue damage if the response is not adequately regulated and further suggested that cytokines interferon-γ (IFN-γ), interleukin-10 (IL-10) and IL-4 could play a prominent role in maintaining a balance between Th1 and Th2 immune responses—so as to minimise tissue damage. Another model by the same group indicated that dendritic cells play an important role in establishing latency and a delay in dendritic cells and T cell migration may lead to a different outcome, that of reactivation into active tuberculosis (TB).65 Simulation results have indicated that IFN-γ produced by CD4+ T-cells is instrumental in controlling the infection. However, it also indicated that CD4+ T-cells make a larger contribution towards the control of infection, independent of IFN-γ production.66,67 A comparatively recent study extends an ODE-model to a Bayesian computational model that describes CD137 signalling in human TB. In this model, a set of differential equations were used to model the induction of the initial infection state. Another set of equations were used to integrate the observable parameters, such as the antigen-presenting cell activation rate and duration of apoptosis, to specific states of the model.68 This model successfully predicted the underlying mechanism of the cytokine modulation by CD137 and established a framework by which other scenarios could be tested.

5.2 Models for Studying Other Infectious Diseases

Models of signalling systems have been constructed and simulated in many other infectious diseases as well, some examples of which are described below. A model of the mucosal immune responses induced during Clostridium difficile infection causing nosocomial diarrhoea and colitis was used to study the immunoregulatory role of peroxisome proliferator-activated receptor-γ (PPARγ), in modulating host responses to the pathogen using a mouse infection model, using wild-type and T-cell-specific PPARγ null mice.69 The ODE model correctly predicted the effect of the upregulation of miR-146b, downregulation of the PPARγ and its co-activator NCOA4 on the upregulation of IL-17, explaining the regulation of Th17 responses due to this infection. Use of a PPARγ agonist pioglitazone was seen to reduce the symptoms of colitis and suppress pro-inflammatory gene expression. Another model has addressed the crosstalk between effector molecules and pattern recognition receptors that mediate the complex immune response during Helicobacter pylori infection. Interactions between macrophages and intracellular H. pylori were captured through a set of ODEs. The predictions from this model that were subsequently validated experimentally showed that the bacterial load in macrophages is regulated by nucleotide-binding domain and leucine-rich repeat-containing protein X1 (NLRX1) expression. It was further identified that a negative feedback circuit is responsible for downregulation of NLRX1 by the early host response cytokines.70

Molina-Paris and coworkers developed a stochastic, within-host, computational model of Francisella tularensis infection in mice.71 The model considered bacterial replication in macrophages in spatial compartments of lung, liver or spleen and macrophages in three different states of ‘resting’, ‘suppressed’ and ‘activated’ conditions, to capture immune subversion by the pathogen. The simulations of immune cells and bacterial kinetics used probabilistic expressions to describe the time taken for rupture of infected macrophages and subsequent bacterial release and provided mechanistic insights into the early stages of pathogenesis. This also provided a framework to explore the benefits of candidate therapeutics that activate the immune responses.

There have been several models that have studied aspects of viral infections as well. An example is a mathematical model of the immune response to primary influenza infection. The study suggested that, in the presence of high viral titers, 5 days were required for an effective adaptive immune response to start.72 By considering viral titers during primary infection, the simulations showed that the viral replication rate was more important than viral infectivity. It further suggested that virus-specific IgM and CD8+ T lymphocytes were critical contributors to clear the viral load during primary infection, indicating that future vaccine development efforts could include strategies that boost virus-specific CD8+ T-cell responses. Among the anti-viral host responses, interferon-β stimulation and subsequent induction of JAK-STAT signalling pathways are prominent events.73 Through a set of differential equations capturing phosphorylation and dephosphorylation, formation and dissociation of protein complexes, kinase and phosphatase activation, nuclear import and export, constitutive and induced gene transcription, mRNA translation and degradation of molecules, Kimmel and coworkers have simulated the pathway dynamics, which were subsequently experimentally verified. Dynamic negative control mechanisms, previously unknown in this system, were identified through this. The study also suggested an important role for one of the phosphatases as well as for the inhibition of IRF1. Another example in this category is a study of the early response of human monocyte-derived dendritic cells to H1N1 influenza A infection, where the interplay between the immune response and viral antagonism was modelled, for two clinically different viruses. The results showed that strain variation had a significant impact on the temporal behaviour of several genes.
Further, the study indicated that variation in innate immune response corresponding to different H1N1 viruses74 was consistent with an experimentally observed time shift in the interferon response that was identified by a genomics study.75

5.3 Boolean Models

A modelling approach that complements the ODE modelling strategy is Boolean modelling,76 where systems are represented as logical rules or statements that describe the interactions among system components. The components themselves can be at any level of biological hierarchy and rules are written based on prior knowledge derived from experimental studies. For example, nodes can be molecules or molecular states such as activated receptors, cells, cellular states, reactions or even entire processes. The nodes take up either an ‘on’ or true state or an ‘off’ or false state. The flow of information between the nodes is determined by Boolean transfer functions, which are used to simulate the evolution of the states and predict a final set of outcomes, for a set of initial conditions. A major advantage is that a Boolean model can make use of any knowledge already available about the model components and convert it into qualitative rules. An added advantage is that a Boolean model is not constrained by the lack of detailed kinetic parameters. However, Boolean models cannot provide quantitative descriptions of system dynamics. Instead, they can be regarded as excellent alternatives for modelling large, incompletely characterised systems and for understanding the role of the network structure in defining the dynamics of the system. Integrated modelling schemes, where Boolean networks are first used for modelling regulatory and signalling networks, followed by detailed continuous models for some components, subject to availability of kinetic parameters, are suggested to offer the best of both strategies.77–79
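The way Boolean transfer functions drive the evolution of node states can be sketched with a three-node toy network. The nodes, rules and the synchronous update scheme below are illustrative assumptions and do not correspond to any published model; asynchronous updating, which several tools also support, can give different dynamics.

```python
def step(state, rules):
    """Synchronous update: every node is recomputed from the current state."""
    return {node: rule(state) for node, rule in rules.items()}


def run_to_attractor(state, rules, max_steps=100):
    """Iterate until a previously visited state recurs; return the cycle."""
    history = []
    for _ in range(max_steps):
        key = tuple(sorted(state.items()))
        if key in history:
            return history[history.index(key):]
        history.append(key)
        state = step(state, rules)
    raise RuntimeError("no attractor found within max_steps")


# Hypothetical toy: a signal activates a receptor, the receptor induces a
# response, and the response feeds back to switch the receptor off.
rules = {
    "signal": lambda s: s["signal"],
    "receptor": lambda s: s["signal"] and not s["response"],
    "response": lambda s: s["receptor"],
}

start = {"signal": True, "receptor": False, "response": False}
for snapshot in run_to_attractor(start, rules):
    print(dict(snapshot))
```

With the signal on, the negative feedback produces a four-state oscillation; with the signal off, the network rests in a single all-off fixed point. A knock-out can be simulated in the same framework by replacing a node's rule with one that always returns `False`.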

Boolean modelling has been used to study host responses in several infectious diseases. Modelling the host immune response to infection for two closely related Bordetella species by Albert and co-workers was one of the early examples.80 The model included various components of the immune system and their interactions with bacteria capturing synergy, dependence and antagonism among various signalling pathways governing the immune system. The model contained specific rules for Toll-like receptor (TLR4) mediated signalling in response to pathogen-associated molecular patterns such as the lipopolysaccharide, the production of cytokines including IL-1, IL-6, tumour necrosis factor-α (TNF-α) and TNF-β and T-cell differentiation including a positive feedback, regulating the differentiation process. Using this model, a variety of scenarios were simulated, which provided insights about active immune processes, growth patterns of the infective agent, clearance of a secondary infection and loss of antibody production on the dynamics of infection.

A Boolean model of the host response to Mtb to predict which of the three outcomes of infection—active disease, clearance of infection or persistence would be prominent under different initial conditions, was described in Raman et al.81 The model encompassed several host and pathogen molecules, cells, cellular states and processes. It captured signalling pathways mediated by TLRs, complement receptors, mannose receptors and scavenger receptors, as well as cross-talk among them. It also included quantitative parameters to account for bacterial load, growth rates and efficiency of pathogen clearance due to drug intervention. Using systematic single and double knock-out of the individual nodes, it was identified that the system’s response underlined the pivotal role of pathways leading to TNF-α in favouring bacterial clearance. Persistence was indeed seen to be the prominent outcome in a majority of systematic perturbations, indicating the predisposition to this state.

The crosstalk between EGFR, IGF-1 receptor (IGF-1R) and insulin receptor (IR) signalling pathways forms a pro-survival signalling network, which was translated into a Boolean network model and combined with stochastic modelling of signal propagation by Capala et al.82 The authors used node importance criteria, based on static, topological properties of the network and extensions of centrality measures, which facilitated the selection of the most influential nodes in maintaining connectivity among the three pathways (EGFR, IGF-1R and IR). The top-ranking nodes, such as ERK1/2, AKT1, p70 ribosomal S6 kinase alpha (P70S6K) and JNK, identified from the model were found to be critical for signalling crosstalk. The model was also successfully tested to predict activity in response to various levels of stimuli. Such an approach can be used to find critical nodes involved in a pathological condition. In a separate study, Naumann and colleagues used logical modelling to model H. pylori induced c-Met signal transduction.83 They showed that activation of the ERK signalling pathway induced by H. pylori is mediated by phospholipase Cγ1 (PLCγ1). Inhibiting PLCγ1 inactivated the ERK pathway, thus suggesting a new strategy for treating invasive stomach cancers caused by H. pylori infection. Similarly, Tan and Tay generated a Boolean network to study the pathogenesis of dengue haemorrhagic fever. The simulations identified TGF-β, IL-8 and IL-13 as the critical players in pathogenesis and as possible targets for intervention for treating the disease.84

5.4 Orchestration of Multiple Signalling Pathways into Networks

With the availability of large amounts of omics data on various fronts, an obvious next step is to integrate them and reconstruct condition-specific networks which can facilitate in silico predictions. Several attempts have been made to automate the reconstruction of signalling pathways with integration of genomics or transcriptomics data. Some examples are listed here to present the logic behind the integrated reconstructions. The pathway reconstruction by Zhu et al. in 2006 was one of the earliest in this direction, where they developed an algorithm capable of constructing a signalling pathway based on literature and transcriptomics or proteomics data.85 The drawback of this approach is that it considers a signalling pathway as a linear pathway model and excludes regulation via positive and negative feedback loops, which are integral parts of most signalling pathways. Signalling and dynamic regulatory events miner (SDREM)86 is a method that integrates condition-specific time series expression data and protein interaction information to obtain condition-specific networks using an input/output hidden Markov model. SDREM was used to study the human immune response during H1N1 influenza infection. It revealed several signalling pathways known to be involved in the H1N1 influenza response and also predicted targets.87 SDREM uses condition-specific time series data and thus has an edge over static literature-based models.

Another approach, formulated as a prize-collecting Steiner forest (PCSF) problem, was used to reconstruct multiple signalling pathways from proteomic data. This model reports both overlapping and independent signalling pathways based on the functional enrichment and clinical properties of the relevant proteins. An extension of this approach, based on reverse engineering principles, combined the PCSF and integer linear programming methods and was used to analyse a temporal signalling network using phosphoproteomic data from Salmonella-infected human cells. This revealed hidden components of signalling such as the soluble NSF-attachment protein receptor and mammalian target of rapamycin signalling, cytoskeleton organisation and apoptosis pathways.88 Two recent approaches for reconstruction of dynamic signalling pathways are PathLinker and TimePath. PathLinker was used to reconstruct a comprehensive set of signalling pathways using information from the NetPath and KEGG databases.89 It connects receptors to transcriptional regulators by computing shortest paths and generates a high-precision network. It provides a ranked list based on curated pathways and thus helps in prioritising proteins and interactions for experimental studies. A reconstruction of the Wnt signalling pathway using PathLinker revealed cystic fibrosis transmembrane conductance regulator (CFTR) as a novel protein mediating signalling from Receptor-Like Tyrosine Kinase (Ryk) to disabled homolog 2 (Dab2). Another reconstruction framework, TimePath, is based on an integer programming approach that selects a subset of pathways to generate a response network, which can facilitate rationalisation of experimentally observed phenomena.90 A response network was simulated for the HIV-1 immune response using HIV expression data and protein interaction data from VirHostNet,91 which could correctly identify several well-known pathways and predict other novel pathways, some of which were subsequently validated experimentally.
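The idea of connecting receptors to transcriptional regulators through ranked shortest paths can be sketched as follows. This is a minimal illustration and not the PathLinker implementation: the interaction confidences, the node names and the choice of the negative logarithm of the confidence as the edge cost are all assumptions made for the example.

```python
import heapq
import math


def k_shortest_paths(edges, sources, targets, k):
    """Return up to k loop-free source-to-target paths in order of increasing cost.

    edges maps (u, v) to a confidence in (0, 1]; the cost of an edge is taken
    as -log(confidence), so the cheapest path is the most probable one
    (an assumption of this sketch).
    """
    graph = {}
    for (u, v), conf in edges.items():
        graph.setdefault(u, []).append((v, -math.log(conf)))

    # Best-first search over partial paths; costs are non-negative, so
    # complete paths leave the heap in order of non-decreasing cost.
    heap = [(0.0, [s]) for s in sorted(sources)]
    heapq.heapify(heap)
    found = []
    while heap and len(found) < k:
        cost, path = heapq.heappop(heap)
        node = path[-1]
        if node in targets:
            found.append((cost, path))
            continue
        for nxt, weight in graph.get(node, []):
            if nxt not in path:  # keep paths simple (no revisited nodes)
                heapq.heappush(heap, (cost + weight, path + [nxt]))
    return found


if __name__ == "__main__":
    # Hypothetical interactome: receptor R, intermediates A and B, regulator TF.
    toy_edges = {
        ("R", "A"): 0.9, ("A", "TF"): 0.9,
        ("R", "B"): 0.5, ("B", "TF"): 0.9,
        ("R", "TF"): 0.3,
    }
    for cost, path in k_shortest_paths(toy_edges, {"R"}, {"TF"}, k=3):
        print(" -> ".join(path), f"(cost {cost:.3f})")
```

The rank at which a protein or interaction first appears in such a list can then serve as a simple priority score for experimental follow-up. The exhaustive search used here is adequate for toy graphs only; interactome-scale networks need a dedicated k-shortest-paths algorithm.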

Network models are useful for extending our understanding of signalling beyond known pathways by predicting new routes of information flow and new probable paths in a given condition. For instance, it is possible to identify signalling pathways that are perturbed in disease and those that are differentially active in response to a given drug, vaccine or even an engineered miRNA intervention. PathwayLinker is one such method, made available as an online tool that uses a network approach to analyse signalling networks integrated with protein–protein interactions from multiple sources.92 The approach is yet to be widely exploited for studying infectious diseases, although methods that are successful with other diseases are likely to be useful for studying signalling in infectious diseases in the future. PathwayLinker was used to study possible signalling pathways through which Gap Junction Alpha-1 (GJA1) is affected by three cancer drugs, viz. cisplatin, mercaptopurine and methotrexate. The analysis identified that GJA1 has 47 interactors, including ERK1, ERK5 and SRC, which are involved in several signalling pathways. This information may additionally indicate possible side-effects of drugs that target GJA1. Networks that integrate data from pharmacological intervention experiments with protein interactions have also been explicitly used to predict signalling pathways responsible for unexpected experimental outcomes.93 Using the EGFR network in human breast cancer cells, Ram and coworkers predicted new signalling paths, which included paths between MAPK/ERK kinase (MEK1) and c-Src via SEK1 and p38. The method has been developed into a toolkit called PathwayOracle.94 Another approach to identify therapeutic targets from cancer signalling pathways, based on logical modelling, was presented by Han et al.95 Using a breast cancer signalling pathway derived from literature, a set of vulnerable components was identified using a stochastic logical model. 
In this case, they predicted that JAK2, STAT3, S6K, JUN, FOS, MYC and MCL1 would have a higher impact on cell viability, while identifying nuclear factor kappa-B (NF-κB) and eIF4E to have no significant effect at all. This result was validated using siRNA-mediated silencing of the respective genes in breast cancer cell lines. Such an approach can give an estimate of the response of cells when a particular gene is targeted, facilitating target validation studies.
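The principle of ranking components by in silico knockout in a stochastic logical model can be sketched as below. The network, the placeholder node names, the update probability and the viability read-out are all hypothetical and serve only to illustrate the idea; they do not reproduce the breast cancer model of Han et al.

```python
import random

# Hypothetical toy wiring with placeholder node names.
RULES = {
    "signal": lambda s: True,                # constitutive upstream input
    "hub": lambda s: s["signal"],            # lies on the only route to viability
    "side_branch": lambda s: s["signal"],    # has no route to viability
    "viability": lambda s: s["hub"],         # read-out node
}


def simulate(knockout, rng, steps=30, p_update=0.5):
    """One stochastic run: each node fires its rule with probability p_update."""
    state = {node: False for node in RULES}
    for _ in range(steps):
        for node, rule in RULES.items():
            if node == knockout:
                continue                     # a silenced node stays off
            if rng.random() < p_update:
                state[node] = bool(rule(state))
    return state["viability"]


def viability_score(knockout=None, runs=200, seed=1):
    """Fraction of runs that end with the viability node switched on."""
    rng = random.Random(seed)
    return sum(simulate(knockout, rng) for _ in range(runs)) / runs


def knockout_impact(runs=200):
    """Drop in viability score caused by silencing each non-output node."""
    baseline = viability_score(None, runs)
    return {
        node: baseline - viability_score(node, runs)
        for node in RULES
        if node != "viability"
    }


if __name__ == "__main__":
    for node, impact in sorted(knockout_impact().items(), key=lambda kv: -kv[1]):
        print(f"{node:12s} impact on viability: {impact:.2f}")
```

Nodes on the only route to the read-out receive a high impact score, whereas a node on a side branch receives a score close to zero, which mirrors the distinction between vulnerable and dispensable components drawn in the text.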

6 Survival Strategies of Pathogen Inside the Host

Defects in signalling pathways have been clearly associated with increased disease susceptibility or risk of higher disease severity.96 For example, defects in the IFN-γ pathway arising from a few mutations in the coding regions of the gene are associated with susceptibility to TB,97 while defects in a key signalling molecule, zeta chain of T-cell receptor associated protein kinase 70 (ZAP-70), in T-cells lead to chronic autoimmune arthritis in mice.98 Modelling studies providing insights into the mechanisms by which signalling pathways affect the pathogenesis process are described in this section.

6.1 Subversion of Signalling Pathways by Pathogens

The pathogenicity of bacteria is usually caused by bacterial virulence factors. These virulence factors are known to interfere with several host pathways, including major signalling pathways such as NF-κB, MAPK and TLR signalling.99,100 Several pathogens are known to invade a host’s signalling machinery and impair downstream signalling to meet their survival needs. Strategies employed by pathogens include (i) inhibiting a critical signalling component, for example, the bacterial virulence factor Yersinia outer protein J (YopJ) inhibits phosphorylation of MAPK kinase 2 (MKK2), which leads to further inhibition of all downstream signalling, including the ERK, p38, JNK and NF-κB pathways;101 (ii) impairing the leukocyte recruitment process, as seen in the case of Shigella, whose protein effector OspF blocks the activation of a few NF-κB-responsive genes, leading to impaired recruitment of polymorphonuclear leukocytes to infected tissues;102 (iii) inhibiting proinflammatory responses, for example, SpvC in Salmonella significantly downregulates the release of cytokines such as IL-8 and TNF-α in tissue-cultured cells103 and (iv) inducing apoptosis, as observed in the case of Yersinia enterocolitica, where YopP is translocated to the cytosol of host cells and inhibits MKKs and IKK-beta, which in turn inhibits the production of the cytokine TNF-α, thus inducing apoptosis.104–106 One study proposed that pathogens involved in acute infection often target central hubs of cellular signalling pathways, which leads to global disruption of the host defence mechanism. In contrast, pathogens involved in chronic infections avoid targeting the central hubs in order to maintain their persistent state; instead, they may affect peripheral components of the signalling network.107 An understanding of the underlying mechanisms by which pathogens manipulate host signalling is necessary for deriving efficient strategies to combat them. 
Vilaplana and colleagues studied the multimodal relationship among lung, spleen and lymph node during the initial stages of Mtb infection.108 They generated a mathematical model to evaluate the effect of bacillary load on specific IFN-γ responses in the lung, spleen and lymph node. The simulation results showed that a critical bacillary load has to be reached to trigger the IFN-γ response, that the higher the bacillary load, the earlier the IFN-γ response is initiated, and that control of the bacillary load is not immediate after the onset of the IFN-γ response. They suggested that vaccination strategies targeting a rapid IFN-γ response alone may be inadequate to cope with the infection.

6.2 Signalling for Nutrient Acquisition

Pathogens hamper the host system in multiple ways. Along with hijacking the cellular signalling pathways, pathogens can also acquire nutrition from the host’s system. Iron is one of the most essential nutrients for the development of pathogens and several efforts have been made to understand the mechanism by which iron homeostasis is achieved. There is a constant competition between the host and the pathogen to maintain the desired iron concentration. This is achieved through complex molecular networks. In Mtb, iron contributes significantly to virulence and the pathogen thus benefits from extracting iron from the host. In response, the host tends to withdraw the iron, making it unavailable to the pathogen and thereby restricting the pathogen’s growth. This is achieved by (i) increasing the expression level of transferrin receptor, which is involved in internal iron transport in the host system; (ii) increasing the expression of ferritin, which is responsible for iron storage in the cytoplasm;109 (iii) increasing the levels of hepcidin, which can degrade ferroportin, involved in export of iron from the cell110 and (iv) through the presence of lactoferrin in mucosal secretion, which binds free iron with high affinity and thereby limits iron availability at mucosal surfaces. These molecules are also released by neutrophils at sites of infection to lower iron availability to pathogens.111 In response to restricted availability of iron, pathogens have evolved other mechanisms to obtain iron from their respective hosts, which have been described in several comprehensive review articles.112 A rule-based model to study iron homeostasis in infected versus uninfected hosts comprised 92 components and 85 interactions among them.113 The model captured iron-dependent signalling cascades of the host and pathogen, protein synthesis and decay rates, and bacterial growth and death rates as a set of 194 rules. 
The simulations, carried out using Kappa, mimicked infection scenarios and various systematic perturbations, including single and double knock-outs, in which the behavioural changes of important proteins and metabolites were monitored. The study led to the identification of key controlling factors for maintaining iron homeostasis and to correct prediction of the role of ideR, an iron-dependent regulator in Mtb, and of the role of transferrin in the host. The study suggested that an increase in iron storage in the host paradoxically benefits the pathogen owing to its capability of extracting iron from such reservoirs. Decreasing the rate of iron uptake by transferrin thus emerges as a possible strategy to control infection, suggesting transferrin to be a possible drug target.

Guthke and coworkers generated a regulatory network based on gene expression data to study the regulation of iron uptake mechanism in C. albicans when it infects human oral epithelial cells.114 The model was composed of 19 differentially expressed genes, of which 15 were iron acquisition genes and 4 were regulators. The study identified Rim101, Hap3, Sef1 and Tup1 as novel target genes for transcriptional regulators. The model was able to propose a regulatory mechanism to explain iron acquisition during adhesion to and invasion of human oral epithelial cells. Such an approach can identify sparse and robust networks and predict biologically relevant regulatory interactions. The same group has also presented a regulatory network of iron homeostasis in A. fumigatus, making use of a reverse engineering approach with a set of linear differential equations.115 This computational model predicted that expression of the hapX gene gets activated by a transcriptional regulator srbA that was previously not known as a regulator of iron homeostasis. This prediction was later experimentally confirmed.

6.3 Signalling Molecules as Specific Biomarkers of Disease

Biomarker discovery research has seen tremendous progress in recent years due to the availability of systematic data on the genome, transcriptome, proteome and metabolome for a number of diseases. A systems perspective makes it possible to interpret large and complex data sets and to identify genes and gene signatures. Biomarkers can distinguish between different conditions, such as diseased versus healthy, stages of a disease, subtypes of patients, untreated versus treated and so on. Network models integrating genomics data are the most widely used for identification of biomarker candidates.116 Given that signalling pathways are at the heart of several pathologies, it is no surprise that specific signalling molecules serve as discriminating features for a variety of diseases. Examples in this category include biomarker signatures that predict reactivation risks in latent TB patients.117 A systems approach used by Pulendran and colleagues identified gene signatures that can predict immune responses in individuals vaccinated with yellow fever vaccine 17-D (YF-17D).118 These signatures are suggested to be indicators of the strength of the adaptive immune response. They reported that the cytokines IP-10 and IL-1 were induced post-vaccination and can be considered reliable markers of vaccination. TNFRSF17, a B-cell growth factor, was shown to predict the neutralising antibody response with up to 100% accuracy. A combination of complement protein C1qB and eukaryotic translation initiation factor 2 alpha kinase 4 (EIF2AK4) can predict the CD8+ T-cell response with up to 90% accuracy. A multi-organ model capable of computing Mtb-specific T-cell frequencies over time, generated by Kirschner’s group using non-human primate data, was utilised to discover potential biomarkers for TB.119 The model, constructed with machine learning approaches and a set of ODEs, indicated that the frequencies of Mtb-specific CD4+ and CD8+ T-cells present in blood can be used to predict the infection outcome.

In the field of cancer, ample progress has been made on this front. There are markers to estimate severity of the disease and to predict progression rates for early diagnosis, response to therapy and recurrence.120 A study by Ottenhoff and his group identified a specific status of cytokine signalling to be indicative of the effect of treatment. They showed that IFN-γ/IL-10 and IFN-γ/TNF-α ratios strongly correlate with the treatment process, where successful treatment is marked by a shift toward a proinflammatory cytokine profile.121 Such methods are likely to be useful for discovering biomarkers for different aspects of infectious diseases as well.

6.4 Variations in Signalling Networks upon Treatment

The precise state of the intracellular signalling network can also suggest effective therapeutic strategies specific to the molecular profile. Disease-specific networks can be used to infer variations in signalling networks upon treatment. A network study122 using transcriptomic data from whole blood samples of TB patients123 revealed that the top-ranking differentially active sub-network seen in individuals after 12 months of TB treatment is more similar to the path profiles seen in healthy individuals than to those seen in active disease. These paths were deduced based on the least path cost, where path cost was calculated using node weights derived from gene expression data in a large protein-interaction network. This study illustrates the significance of a systems biology approach that can reveal molecular paths which are otherwise difficult to trace. The network analysis further indicated that the path profiles after 2 months of TB treatment differ considerably from those of healthy individuals and that they revert to similar paths after 12 months of treatment. One particular example is the interferon response network, where the IFN-α response genes GBP1, PIM1, CD40, IFI30, IRF9, OAS1, IFIT2 and ISG15 show elevated levels during TB and are later found to be considerably lowered during the course of treatment. The presence of paths that remain unchanged upon treatment indicates the plasticity in variations due to disease and that the treatment restores the physiological state only partially. Such inferences can also open up new avenues for biomarker research.
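The notion of a least-cost path computed from expression-derived node weights can be made concrete with a small sketch. The weighting scheme used here, in which the cost of a node is the reciprocal of its expression level so that highly expressed genes are cheap to traverse, is an assumption made for illustration, as are the toy network and the expression values; the cited study may define node weights differently.

```python
import heapq


def costs_from_expression(expression):
    """Assumed weighting: highly expressed nodes are cheaper to pass through."""
    return {gene: 1.0 / level for gene, level in expression.items()}


def least_cost_path(neighbours, node_cost, source, target):
    """Dijkstra's algorithm with costs on nodes instead of edges.

    The cost of a path is the sum of node_cost over every node on it,
    including the source and the target. Returns (cost, path) or None.
    """
    best = {source: node_cost[source]}
    heap = [(node_cost[source], source, [source])]
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == target:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue                          # stale heap entry
        for nxt in neighbours.get(node, ()):
            new_cost = cost + node_cost[nxt]
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt, path + [nxt]))
    return None


if __name__ == "__main__":
    # Hypothetical network with two alternative routes from S to T.
    network = {"S": ["X", "Y"], "X": ["T"], "Y": ["T"], "T": []}
    disease = {"S": 1.0, "X": 4.0, "Y": 1.0, "T": 1.0}   # X strongly induced
    treated = {"S": 1.0, "X": 0.5, "Y": 1.0, "T": 1.0}   # X lowered on treatment
    for label, expr in (("disease", disease), ("treated", treated)):
        print(label, least_cost_path(network, costs_from_expression(expr), "S", "T"))
```

With the same network topology, a change in the expression of a single node is enough to switch the least-cost route, which is how condition-specific path profiles can differ between disease, treatment and healthy states.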

7 Targeting Signalling Networks—Drugs and Vaccines

A majority of physiological responses, including the response to vaccination, can be attributed directly or indirectly to signalling pathways. The most evident of all is TLR signalling, which has been studied in great detail to understand the mechanism by which vaccines contribute to protective immunity. For live attenuated vaccines, TLR activation is known to play an important role.124 Known vaccines with a TLR-dependent protective mechanism include BCG, which confers protection against TB and activates the TLR2 and TLR4 signalling pathways.125 Similarly, multiple TLRs are activated by YF-17D, leading to activation of dendritic cell subsets.126 Pulendran and colleagues showed that a nanoparticle vaccine made of ligands for TLR4 and TLR7 can trigger neutralising antibody responses with lifetime persistence. This vaccine has been shown to protect against lethal avian and swine influenza viral strains in mice and can also provide immunity against H1N1 influenza in rhesus macaques.127

Since signalling pathways are involved in immune responses induced by vaccines, several attempts have been made to model the signalling response to study the persistence of immunity. Systems biology approaches are also being increasingly used in vaccinology to predict the efficacy of vaccines. Nakaya et al. used a systems approach to study innate and adaptive responses to vaccination against influenza in humans during three consecutive influenza seasons and were able to predict immunogenicity of vaccines. They studied candidates vaccinated with trivalent inactivated influenza vaccine (TIV) or live attenuated influenza vaccine (LAIV) and found that both TIV and LAIV vaccination induced distinct molecular signatures in the blood. They further demonstrated that calcium/calmodulin-dependent protein kinase type IV (CaMKIV) plays an important role in the regulation of antibody responses to vaccination against influenza.128 These results signify the role of systems biology approaches in providing new mechanistic insights to vaccine response.

8 Conclusion

In this review, we have described the systems biology approaches and mathematical models widely used to study signalling systems and how they are applied to host–pathogen interactions. The models span a wide range, from simple differential equations describing large processes in a compressed form to complex, detailed, condition-specific genome-wide networks. The availability of a large amount of systematic genome-wide data, on the one hand, and biochemical, genetic and cellular level data on individual molecules, reactions and pathways, on the other hand, has made it feasible to employ mathematical modelling as a strategy to study signalling systems. However, considerable judgement is required in choosing the appropriate models, based on the initial data and the type of results required.

Systems-level models complement experimental studies and contribute to a deeper comprehension of the processes they influence, especially when intuitive reasoning does not suffice. Several models of host–pathogen interactions, in which signalling systems occupy the centre-stage, have been built and analysed as listed in this review. Some of these have provided a framework to integrate information of diverse types and at various levels of abstraction. A number of modelling studies have provided mechanistic insights into different aspects of host response to pathogens, which are generating hypotheses on critical molecules, interactions and pathways, in specific cases. Systems level studies can also throw light on the generic mechanisms by which different pathogens affect host networks, which will help us understand commonalities in host response to diverse triggers. Several applications readily emerge from such understanding, including identification of diagnostic markers and therapeutic intervention strategies.

An integration with temporal data leading to generation of dynamic models over the course of infection or treatment will facilitate an understanding of the basis for conditional responses of the host systems. It can be foreseen that these models can be further extended into generating precise patient-specific models and account for comorbidities and other patient-specific genetic and biochemical information. Such models can be expected to provide a basis to explain the effects of heterogeneity in the patient population and help in patient sub-typing, which can be used to guide us in the development of sub-type specific biomarkers and set the stage for exploring options for personalised treatment.

Standard gene names have been used throughout the article.