Overview of various circadian systems

A circadian clock (CC) is an endogenous, self-sustaining, time-keeping system. Circadian clocks exist in most examined biological life forms, ranging from unicellular bacteria to highly complex higher organisms, including humans [1,2,3]. These clocks predict daily changes in the environment and regulate various physiological and metabolic processes [4, 5]. Clock genes across the kingdoms show limited conservation; nonetheless, the basic regulatory and time-keeping mechanism appears to be similar. CCs have an intrinsic period length of approximately 24 hours under constant conditions. Environmental cues, such as light and temperature, act as zeitgebers (time givers) that can reset the clock and also affect the rhythmic amplitude of clock outputs [4, 6, 7]. The process by which the clock is reset in response to day–night environmental changes is called entrainment. This synchronization is necessary because of variation in sunrise and sunset, as well as gradual retardation of Earth’s revolution periodicity, which necessitates responding to both seasonal and evolutionary timescales. Circadian rhythms are also temperature-compensated such that they can occur within a similar period over a wide range of biologically relevant temperatures [8,9,10]. Clocks in diverse organisms can be cell autonomous. For example, robust circadian rhythms of transcription have been observed in the single cells of Cyanobacteria and isolated mammalian fibroblasts, with minimal synchronization between the adjacent cells [11,12,13]. An oversimplified basic circadian network can be defined as consisting of three elements: input pathways that perceive and transmit signals that synchronize the clock to the environment, a central oscillator, and output pathways that link the oscillator to various biological processes. However, with the addition of new components to the clock network, our models of the circadian system are increasingly complex (Fig. 1). 
A given circadian oscillator consists of an autoregulatory network of multiple transcriptional and translational feedback loops, where the clock genes are activated or repressed by the rhythmic cycling of the proteins encoded by them. The input pathways themselves can also be rhythmically regulated by the circadian clock outputs [2,3,4, 14,15,16,17]. Together, the linear concept from input to clock outputs is actually an interwoven system of feedbacks.

Fig. 1

Generic model of the circadian clock. The complex network of multiple coupled feedback oscillators is represented by solid colored lines and ovals. Clock genes forming a functional oscillator regulate the input and output pathways (blue dashed lines). Feedback from output pathways can also regulate the oscillator and the input pathways (red dashed lines). In addition to external input signal transduction for clock entrainment, input pathways can also directly affect clock output and vice versa (solid black line). The model is adapted from a model depicted in [3].

CCs are well studied in prokaryotes (cyanobacteria) and eukaryotes (fungi, plants, insects, and mammals). In cyanobacteria, transcriptome expression of almost the entire genome is under circadian control [18, 19]. In fungal species, asexual spore formation, metabolism and stress responses, as well as other physiological [14, 20] and developmental processes [21, 22], show circadian rhythms. In humans, many physiological and behavioral processes, such as the sleep–wake cycle, body temperature, blood pressure, hormone production, and the immune system, are regulated in a circadian manner [23,24,25]. In plants, leaf and stomatal movement, hypocotyl elongation, hormonal signaling, and the expression of a large number of genes show circadian rhythms [26,27,28,29]. The circadian regulation of these physiological and developmental processes is ultimately a consequence of oscillating biochemical activities in each cell type. A circadian clock, to put it simply, is formed by a system of oscillating reactions.

Another characteristic feature of the circadian networks across life is the existence of multiple oscillators that coordinate differentially [30, 31]. This has introduced the concept of “pacemaker” and “slave” oscillators, wherein the pacemaker is the central oscillator that entrains to the external environmental cues and regulates the rhythmic output directly and/or by synchronizing slave oscillators, which then regulate given outputs. The slave oscillators are entrained by the central oscillator and may not exhibit all the circadian characteristics of a central oscillator. Multiple oscillators have been observed in cyanobacteria and Neurospora crassa. A self-sustained circadian oscillator composed of cyanobacterial core clock components has been reconstituted in vitro. In cyanobacteria, this suggests that a biochemical oscillator acts as a pacemaker and that a transcriptional–translational feedback loop (TTFL) is not important for driving circadian rhythms. However, circadian expression of genes was observed even when the biochemical oscillator was disrupted, suggesting that these two oscillators exist independently. When coupled to the biochemical pacemaker, the TTFL contributed to the robustness of the circadian clock [1, 32]. It has been proposed that these could be widespread in circadian-containing organisms, as a non-transcriptional oscillator is present in all three kingdoms of life [33].

Multicellular organisms have a complex architecture that consists of multiple cellular layers, tissues, and organs. In mammals, a hierarchical system of multiple circadian oscillators exists. The central pacemaker that is directly entrained by the external environmental cues is located in the suprachiasmatic nucleus (SCN) of the hypothalamus and synchronizes the peripheral clocks present throughout the organism. Transplantation of the fetal SCN into SCN-lesioned rats restored rhythmicity in a manner characteristic of the donor [34, 35]. The peripheral oscillators have clock components and properties similar to the pacemaker; however, they affect only the respective tissue or organ. Circadian rhythms of luciferase (LUC) expression were dampened after a few cycles in the non-SCN tissue culture from transgenic rat lines in which LUC was under the control of clock gene Period 1 (Per1) promoter, but continued to show robust rhythms for many weeks in the cultured SCN tissue [36]. The rhythms of the peripheral oscillators are phase-delayed by 4–12 hours and less rapidly entrained as compared to the pacemaker, indicating that the SCN pacemaker is required to synchronize the self-sustained peripheral oscillators and that the signals for synchronization take some time, as suggested by the phase delay [36, 37]. Unlike mammals, studies suggest that the circadian network in Drosophila consists of multiple self-sustained, cell autonomous circadian oscillators with a pacemaker function in most of the cells. Isolated tissues from head, thorax, and abdomen exhibited a functional circadian oscillator that could be entrained by light [38]. Interestingly, rhythms for eclosion [39] and locomotor activity are driven by circadian oscillators placed in the brain. 
Studies indicate that oscillator neurons in the brain are coupled and communicate via Pigment-dispersing factor to drive the locomotor activity under constant conditions (constant light (LL) and constant darkness (DD)) [40,41,42,43]. Thus, it is very likely that coupled oscillators drive circadian rhythms.

Circadian rhythms can be represented as sinusoidal waves and are mathematically described by period, phase, and amplitude (Fig. 2). Entrainment by environmental cues (light and temperature stimuli) results in phase shifts. The phase can be delayed, advanced, or unchanged, depending on the time of the subjective day/night at which the stimulus is applied. If the stimulus appears in the early subjective night, the rhythm is delayed, whereas if given later in the subjective night, the rhythm is advanced. During the middle of subjective day/night, time points with little or no phase shift occur, and these are called "dead zones". Phase response curves demonstrate the transient phase shifts in the oscillation induced by a brief stimulus under constant conditions, as a function of the phase at which they are applied, and they are the best way to study entrainment in an organism by zeitgebers. The amplitude and the duration of the advances or delays are species-specific [44, 45].
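As a minimal sketch of the three descriptors above, the following Python snippet models a rhythm as a cosine wave and applies a toy phase response curve. The function names, the 2-hour shift magnitudes, and the circadian-time boundaries are illustrative assumptions, not values from the cited studies; real phase response curves are continuous and species-specific.

```python
import math


def rhythm(t_hours, period=24.0, phase=0.0, amplitude=1.0, mean=0.0):
    """Cosine model of a circadian output.

    `phase` is the time of the peak in hours, `period` the cycle length in
    hours, and `amplitude` the peak deviation from `mean`.
    """
    return mean + amplitude * math.cos(2.0 * math.pi * (t_hours - phase) / period)


def toy_phase_shift(circadian_time):
    """Hypothetical step-shaped phase response curve, in hours.

    Negative values are delays, positive values are advances. The +/-2 h
    magnitudes and the window boundaries are placeholders chosen only to
    mirror the qualitative pattern: delay in the early subjective night,
    advance in the late subjective night, no shift during the subjective day.
    """
    ct = circadian_time % 24.0
    if 12.0 <= ct < 18.0:   # early subjective night
        return -2.0
    if 18.0 <= ct < 24.0:   # late subjective night
        return 2.0
    return 0.0              # subjective day, treated as a dead zone here


def shifted_peak(phase, circadian_time, period=24.0):
    """Peak time after a brief stimulus; an advance moves the peak earlier."""
    return (phase - toy_phase_shift(circadian_time)) % period
```

In this toy model, a stimulus at circadian time 14 moves a peak from hour 6 to hour 8 (a delay), whereas the same stimulus at circadian time 20 moves it to hour 4 (an advance).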

Fig. 2

Box 1

The clockwork operates by the actions of positive and negative regulatory elements that form a complex network of multiple interlocked transcriptional and translational feedback loops that are self-sustained with robust and tunable molecular oscillators [3, 13, 14, 16, 17, 46, 47]. Recent work emphasizes the importance of post-translation regulation on the stability and functionality of clock components and, hence, circadian timing. Hetero- and homo-oligomerization and nuclear shuttling of the core-clock proteins are common features shared across the kingdoms. Sequential phosphorylation plays an important role in the stability of the oligomeric states, subcellular localization and, hence, the transcriptional activity of the clock proteins during the course of the day [48,49,50,51,52]. It is likely that formation of transient complexes, which form and reform relatively easily, is essential for accurate functioning of the CC. Eukaryotic clocks are therefore a complex system of transcriptional/translational regulators and kinases/phosphatases. A complete understanding of the molecular mechanisms of such clockworks requires a full structural characterization of the clock components and their complexes, which leads to hypothesis-driven understanding of the biochemical basis of cellular clocks. The structural aspects of CC regulation are relatively poorly understood in eukaryotes, but well defined for the cyanobacterial clock [1, 32]. This review summarizes the ongoing efforts to understand the function and physical interactions of the CC components, with special emphasis on structural aspects.

The cyanobacterial circadian clock

The cyanobacterial CC has been studied extensively using Synechococcus elongatus (Se) as the model organism. Three proteins form the core oscillator: KaiC, KaiA, and KaiB (Fig. 3a). Their circadian rhythms are driven by a transcription- and translation-based autoregulatory loop of KaiBC gene expression, wherein KaiA and KaiC act, respectively, as positive and negative regulators of KaiBC gene expression [53]. A fully functional, temperature-compensated clock with an approximately 24-hour periodicity could be reconstituted in vitro with KaiA, KaiB, KaiC, and ATP [54]. Also, KaiC phosphorylation was found to be rhythmic in S. elongatus in continuous dark conditions in the absence of transcription and translation [55], suggesting that post-translational KaiC phosphorylation is central to Kai protein-based timekeeping. Further research revealed that the transcription/translation-based loop, though not a requisite for maintaining circadian rhythms in prokaryotes, is still important. Circadian gene expression has been observed in the absence of KaiC phosphorylation cycles. However, over shorter periods, KaiBC gene expression and accumulation of KaiB and KaiC proteins were observed to be rhythmic and temperature-compensated in the KaiA-overexpressing strain that forces constitutive KaiC phosphorylation. Dampened rhythms over a longer period were observed in KaiC mutant strains that were phospholocked or KaiC mutants that lacked autokinase activity, thus leaving KaiC unphosphorylated. These observations demonstrate that two pathways are important for the regulation of circadian rhythms: KaiC phosphorylation and the transcription/translation-based KaiC abundance cycle. 
The period and amplitude of the transcription/translation cyclic rhythms were modified in the absence of the KaiC phosphorylation cycle, and rhythms at low temperature were observed only when both oscillatory pathways are intact [56], suggesting that multiple coupled oscillatory systems are important for a robust and precise circadian clock in cyanobacteria. The mechanisms that control these two pathways are still unclear [1, 32].

Fig. 3

A simple schematic representation of the circadian clock in a cyanobacteria, b fungi, c insects, d mammals, and e plants

Structural studies have guided the understanding of the cyanobacterial clock components with an insight into their interactions that promote conformational changes and phosphorylation events vital for a functional clock. The atomic structures for KaiC, KaiA, and KaiB (Fig. 4) and/or their domains have been determined using X-ray crystallography, NMR spectroscopy, and electron microscopy [57,58,59,60]. KaiC from S. elongatus was shown to be a double-doughnut-shaped hexamer with 12 ATP binding sites between the N-terminal KaiC I and the C-terminal KaiC II rings (Fig. 4a) [58]. The S. elongatus KaiA protein forms a 3D domain-swapped dimer (Fig. 4b). It has three domains: an N-terminal domain (residues 1–129), a linker (130–179), and a C-terminal dimerization domain (180–283) [60]. The C-terminal domain forms a four-helix bundle, which has been confirmed by the structures of the C-terminal domain of KaiA from Anabaena sp. PCC7120 [59] and from Thermosynechococcus elongatus (Te) BP-1 [61, 62]. The crystal structure of KaiB (Fig. 4c) revealed a protein with a thioredoxin-like fold [59, 63, 64], which forms dimers and tetramers [63].

Fig. 4

Crystal structures of cyanobacterial clock proteins KaiC, KaiA, and KaiB. a Side view of the KaiC hexamer (PDB 2GBL) with 12 ATP (magenta) binding sites. b KaiA dimer (PDB 1R8J). c KaiB tetramer (PDB 1WWJ). d KaiC monomer with CI and CII domains and the S-shaped loop. e Structure showing two chains (A and B) of the hexameric KaiC depicting the key phosphorylation sites (Ser431 and Thr432) and the bound ATP at the subunit–subunit interface. On the right is a detailed view of the interface showing the glutamates close to ATP that help to activate phosphorylation.

KaiC is a kinase/phosphatase/ATPase: The principal clock component, KaiC, is the only protein with known enzymatic activity in the cyanobacterial clock. It acts as an autokinase, an autophosphatase, and an ATPase, exhibiting these functions both in vitro and in vivo [65,66,67]. The crystal structure of full-length KaiC revealed a homologous two-domain fold, resulting from gene duplication, in the monomer, with N-terminal CI and C-terminal CII domains (Fig. 4d). The C-terminal tail of CI links the two domains, whereas the C-terminal tail of CII protrudes out of the domain region, following an S-shaped loop on the periphery of the hexamer [58].

KaiC functions as a kinase/phosphatase: The key phosphorylation sites identified in the KaiC CII domain are Ser431 and Thr432. Phosphorylation at these sites occurs at the subunit–subunit interface (Fig. 4e), where they are close to an ATP molecule bound to an adjacent subunit [68, 69]. The cycle of KaiC phosphorylation during the day, as well as hypophosphorylation at night, over a ~24-hour period proceeds in four steps: from unphosphorylated KaiC (ST; Ser431 and Thr432 both unphosphorylated) to Thr432 phosphorylation (SpT), Ser431 phosphorylation (pSpT), Thr432 dephosphorylation (pST), and Ser431 dephosphorylation (ST). The reaction at each step is regulated by the product from the preceding step [70, 71]. That Thr432 is the first site to be phosphorylated is consistent with the crystal structure of KaiC, in which all Thr432 residues in the six subunits are phosphorylated, whereas only four of the six Ser431 residues are. Thr432 phosphorylation, which leads to new contacts across the subunit interface, enhances Ser431 phosphorylation [68, 69]. Complete phosphorylation of both Thr432 and Ser431 converts KaiC from an autokinase to an autophosphatase. Thus, the KaiC phosphorylation cycle determines KaiC enzymatic activity [67]. A third phosphorylation site was found at Thr426, which forms a hydrogen bond with the phosphate group of pSer431. T426A, T426E, and T426N mutants were observed to be arrhythmic. Thr426 was also observed to be phosphorylated in the crystal structures of S431A and T432E/S431A KaiC mutants [68, 72, 73]. In summary, the phosphorylation state of these key residues governs the functionality of the KaiC protein.
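The ordered four-step cycle described above can be sketched as a small state machine. This is only an illustration of the ordering of phosphoforms; the names `CYCLE`, `STEP_LABEL`, `next_state`, and `walk` are hypothetical, and the sketch deliberately ignores kinetics, KaiA/KaiB regulation, and the Thr426 site.

```python
# Phosphoform order as described in the text: ST -> SpT -> pSpT -> pST -> ST.
# "p" before a letter marks that residue (S = Ser431, T = Thr432) as phosphorylated.
CYCLE = ["ST", "SpT", "pSpT", "pST"]

# Reaction that converts each phosphoform into the next one.
STEP_LABEL = {
    ("ST", "SpT"): "Thr432 phosphorylation",
    ("SpT", "pSpT"): "Ser431 phosphorylation",
    ("pSpT", "pST"): "Thr432 dephosphorylation",
    ("pST", "ST"): "Ser431 dephosphorylation",
}


def next_state(state):
    """Return the phosphoform that follows `state` in the cycle."""
    index = CYCLE.index(state)  # raises ValueError for unknown labels
    return CYCLE[(index + 1) % len(CYCLE)]


def walk(start="ST", steps=4):
    """List the phosphoforms visited over `steps` transitions."""
    path = [start]
    for _ in range(steps):
        path.append(next_state(path[-1]))
    return path


if __name__ == "__main__":
    states = walk()
    for before, after in zip(states, states[1:]):
        print(f"{before:>4} -> {after:<4} {STEP_LABEL[(before, after)]}")
```

Encoding the order as a list keeps the one-directional nature of the cycle explicit: each state has exactly one successor, which matches the description that each step is regulated by the product of the preceding one.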

KaiC ATPase activity: KaiC shows extremely weak but stable ATPase activity (~15 ATP per KaiC per phosphorylation cycle) [53]. There are two ATPase activity regions in KaiC: i) slow KaiC CI ATPase activity that plays a role in time delay and the conformational switch needed for KaiC–KaiB interaction [74,75,76]; and ii) ATPase activity in the CII domain of KaiC that catalyzes phosphorylation/dephosphorylation [76,77,78]. Work by Terauchi et al. [76] shows that ATPase activity displays circadian rhythms in the presence of KaiA and KaiB. KaiC variants mimicking the dephosphorylated and doubly phosphorylated states influenced its ATPase activity (non-phosphorylated state, more active; fully phosphorylated state, less active), suggesting that the kinase and ATPase activities are closely linked. The mutants exhibiting long and short periods displayed a linear correlation between ATP hydrolysis and circadian frequency. Temperature compensation is intrinsic to the ATPase activity. The ATPase activity showed strong temperature compensation in KaiC-only incubations and was only slightly affected in the presence of KaiA and KaiB in the temperature range 25–35°C. Terauchi et al. [76] proposed that the temperature-compensated ATPase activity of KaiC is the most basic molecular mechanism governing the period of the cyanobacterial circadian clock.
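Temperature compensation is commonly quantified with the Q10 temperature coefficient, the factor by which a rate changes for a 10 °C rise. The text above does not report Q10 values, so the sketch below is an assumption about how one might express the idea numerically; the function name `q10` and the rates in the usage note are hypothetical placeholders, not measurements.

```python
def q10(rate_low, rate_high, temp_low_c, temp_high_c):
    """Q10 temperature coefficient from two rates measured at two temperatures.

    Q10 = (rate_high / rate_low) ** (10 / (temp_high_c - temp_low_c)).
    A value near 1 indicates a temperature-compensated process; many
    uncompensated biochemical reactions have values of roughly 2 to 3.
    """
    if temp_high_c == temp_low_c:
        raise ValueError("the two temperatures must differ")
    if rate_low <= 0 or rate_high <= 0:
        raise ValueError("rates must be positive")
    return (rate_high / rate_low) ** (10.0 / (temp_high_c - temp_low_c))
```

For example, with made-up rates, `q10(15.0, 15.0, 25.0, 35.0)` gives 1.0 (fully compensated over 25–35°C), whereas `q10(10.0, 20.0, 25.0, 35.0)` gives 2.0 (rate doubles over the same range).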

Analysis of the crystal structures of wild-type KaiC (4TL8) and its period-modulating variants in the pre- and post-hydrolysis states (PDB entries 4TL9 and 4TLA) revealed two structural bases of slow KaiC CI ATPase activity [79]. First, hydrogen bonding of the lytic water molecule with the carbonyl oxygen of F199, the side-chain nitrogen of R226 of KaiC, and another water molecule creates steric hindrance, positioning the lytic water farther away and making it inaccessible to the γ-phosphate of the ATP (refer to the figures in [79]). Second, the slow cis-trans isomerization of a peptide (D145–S146) accompanying ATP hydrolysis (PDB entries 4TL9, 4TLC, and 4TLA; refer to the figures in [79]) substantially raises the energy barrier that must be overcome to break the γ-phosphate–O bond of the ATP. CI and CII ATPases together form a coupled CI–CII ATPase system that is driven predominantly by the slow CI ATPase [79].

Crystal structures of KaiC (PDB 3DVL) and KaiC mutants (3JZM, 3K0A, 3K09, 3K0E, 3K0F, and 3K0C) [73] reveal that the ATP molecules bound between two subunits are recognized differently in the two subunits. The ATP phosphates are in close proximity to two glutamates in CII and are coordinated with Mg2+ (Fig. 4e). The glutamate close to the γ-phosphate (γ-P) group is also observed to be close to Thr432 and may therefore act as a general base for the hydrolysis and proton abstraction from Thr432 and Ser431 that help activate phosphorylation. The resulting γ-P transfer might increase the interaction between the subunits, thus forming a more compact hyperphosphorylated KaiC, as also observed in small-angle X-ray scattering (SAXS) measurements of the KaiC mutants mimicking various phosphorylation states [80]. Thr432/Ser431/Thr426 in CII corresponds to Glu198/Glu197/Asp192 in CI. X-ray crystallography, mass spectrometry, and KaiC T432E/S431E mutations showed no phosphorylation in CI, suggesting that ATP hydrolysis in CI generates the energy required for the enzymatic activity in the CII domain, rather than phosphoryl transfer [68, 69, 73, 79].

Kai protein interactions and the phosphorylation cycle: Both in vitro and in vivo, KaiA is an enhancer of KaiC phosphorylation, while KaiB antagonizes the action of KaiA [66, 67, 81, 82]. Structural and biophysical studies using various biochemical, spectroscopic, and crystallographic methods have helped to understand the KaiAC and KaiBC complexes and provided insight into the interaction of KaiA and KaiB with KaiC. KaiA binds through its C-terminal domain to the KaiC C-terminal tail at two interfaces: CIIABD peptide and the ATP binding pocket [62, 83]. KaiA contains an amino-terminal pseudodomain that is proposed to receive environmental cues transmitted to potentiate entrainment [66, 67, 81, 82, 84]. KaiB interacts with the pSer431:Thr432-KaiC phosphoforms that inactivate KaiA in the KaiABC complex [68, 69]. The balance between the two activities is modulated by an “A-loop” switch (residues 488–497) in the C-terminal tail of the KaiC CII domain. KaiA stabilizes the exposed A-loops and stimulates KaiC autokinase activity, while KaiB prevents KaiA interaction with the loops, thereby stabilizing the internal core structure and, hence, locking the switch in the autophosphatase phase. A dynamic equilibrium between the buried and exposed states of the loops determines the levels of KaiC phosphorylation. It was hypothesized that binding of KaiA might disrupt the loop fold of a single unit that is engaged in the hydrogen bonding network across the subunits at the periphery [58], resulting in a weakened interface between the adjacent CII domains. This would lead to conformational changes within the CII ring that support serine/threonine phosphorylation. Initially, ATP is too distant from the phosphorylation sites to effect a phosphoryl transfer reaction; however, changes within the CII ring might relocate the bound ATP closer to the phosphorylation sites and/or enhance the retention time of ATP by sealing the ATP binding cleft [83, 84]. 
In contrast, KaiB interacts with the phosphoform of the KaiC hexamer. These structural analyses support the hypothesis that KaiA and KaiB act as regulators of the central KaiC protein.

Structural studies [75, 85] provide a detailed analysis to explain how these protein–protein interactions among KaiC, KaiA, and KaiB and their cooperative assembly alter the dynamics of rhythmic phosphorylation/dephosphorylation, in addition to ATP hydrolytic activity of KaiC, generating output that regulates the metabolic activities of the cell. An earlier spectroscopic study [86] proposed a model for the KaiC autokinase-to-autophosphatase switch, which suggests that rhythmic KaiC phosphorylation/dephosphorylation is an example of dynamics-driven allostery that is controlled mainly by the flexibility of the CII ring of KaiC. Using various KaiC CII domain phosphomimetics that mimic the various KaiC phosphorylation states, the authors observed that in the presence of KaiA and KaiB, different dynamic states of the CII ring followed the pattern ST (flexible) → SpT (flexible) → pSpT (rigid) → pST (very rigid) → ST (flexible). KaiA interaction with exposed A-loops of the flexible KaiC CII ring activates KaiC autokinase activity. KaiC hyperphosphorylation at S431 changes the flexible CII ring to a rigid state that allows a stable complex formation between KaiB and KaiC. The resulting conformational change in KaiB exposes a KaiA binding site that tightens the binding between KaiB and the KaiA linker, thus sequestering KaiA from A-loops in a stable KaiCB(A) complex and activating the autophosphatase activity of KaiC [86]. KaiB binding and dephosphorylation are accompanied by an exchange of KaiC subunits, a mechanism that is crucial for maintaining a stable oscillator [1].

KaiB is the only known clock protein that is a member of a rare category of proteins called the metamorphic proteins [87, 88]. These can switch reversibly between distinct folds under native conditions. The two states in which KaiB exists are: the ground state KaiB (gsKaiB; Fig. 4c) and a rare active state called the fold switch state KaiB (fsKaiB) [88]. Chang et al. [88] showed that it is the fsKaiB that binds the phosphorylated KaiC, thus sequestering KaiA and starting KaiC dephosphorylation. Hence, the previously known crystal structures of KaiB are of gsKaiB.

A high-resolution (1.8 Å; Fig. 5a, b) structure of KaiBfs-cryst (fsKaiB mutant: KaiBfs [G89A and D91R], partially truncated at the C-terminus) and CIcryst (truncation at the N-terminus of the isolated CI domain of the KaiC monomer) complex (PDB 5JWO) shows an interface that primarily consists of the residues from the fold-switched C-terminal half of KaiB and the B-loop of the CIcryst [75]. KaiB in its fold switch state adopts a thioredoxin-like fold similar to that in the N-terminus of SasA that binds KaiC (Fig. 5c) [88, 89]. Previous deletion and substitution mutation studies of the KaiC B-loop show an absence of or weakened interaction between KaiB and KaiC and between SasA and KaiC. Binding of fsKaiB inhibits the interaction between SasA and KaiC as both SasA and fsKaiB compete for the same binding site on the KaiC CI domain [88, 90]. fsKaiB interaction with KaiC sequesters KaiA, thus switching a fully phosphorylated KaiC from a kinase to a phosphatase and commencing a phase transition. The same rare active state KaiB (fsKaiB), in complex with KaiC, interacts with CikA, which then dephosphorylates RpaA (discussed later in the “Light: input to the clock” section), thus regulating the expression of class 1 (repressing) and class 2 (activating) genes.

Fig. 5

Rare active fold-switched form of KaiB (fsKaiB) binds to the post-hydrolysis state of KaiC CI domain. a A 1.8-Å resolution structure of the KaiBfs-cryst–CIcryst complex (PDB 5JWO; from T. elongatus). The ribbon diagram shows KaiBfs-cryst in pink, CIcryst in cyan, and bound ADP in yellow. Enclosed dotted box depicts the binding interface between KaiBfs-cryst and CIcryst. b Enlarged view of the KaiBfs-cryst–CIcryst complex binding interface depicting the interacting residues. c Structural comparison of KaiB ground state (gs) and fsKaiB: (i) KaiBTe (gsKaiB; PDB 2QKE, subunit A) in green, KaiBfs-cryst in pink; (ii) superposition of KaiBfs-cryst in pink with N-SasASe (PDB 1T4Y; Se: S. elongatus) in cornflower blue. Residues K58, G89, and D91 are highlighted in yellow, red, and orange, respectively. d Comparison of the ATP binding site of the KaiBfs-cryst–CIcryst complex with ATP binding site of KaiC CI structures (from S. elongatus) in the pre- and post-hydrolysis states: superposition of ADP-bound CIcryst (cyan) with the CISe structure (green) in the (i) pre-ATP hydrolysis state (PDB 4TLC, subunit C) and (ii) post-ATP hydrolysis state (PDB 4TLA, subunit E).

ATP hydrolysis in the KaiC CI ring is a prerequisite for KaiC interaction with fsKaiB [74, 91]. A comparison (Fig. 5d) of a high-resolution (1.8 Å) crystal structure of the KaiBfs-cryst–CIcryst complex bound to ADP [75] with the structures of the KaiC CI domain (from S. elongatus) in pre- and post-hydrolysis states displayed large conformational changes in the KaiC CI domain at the ATP binding site after ATP hydrolysis. Residue F200, near the ATP binding site and the α6 and α7 helices, moves “downward” as a result. Residues Q154 and Y155 of α6 then constitute the KaiBfs-cryst–CIcryst interface. Another crystal structure at 3.87 Å resolution (Fig. 6) of the KaiBfs-cryst* (KaiBfs-cryst variant with I88A substitution)–phosphomimetic KaiC S431E complex hexamer, crystallized in the presence of ATP, showed densities of ADP between each CI subunit [75] as opposed to previous crystals of KaiC and its mutants captured in the pre-hydrolysis state [92]. The structure also shows conformational changes at α6 and α7 helices of KaiC CI that accompany ATP hydrolysis. These analyses reveal that the energy provided by the ATP hydrolysis results in a much-needed conformational switch of the KaiC CI domain that captures fsKaiB [75].

Fig. 6

Kai clock protein complex assembly. a A 3.87-Å structure of KaiBfs-cryst* and KaiC S431E complex hexamer (PDB 5JWQ) with KaiBfs-cryst* in hot pink, the KaiC CI domain ring in cyan, CII in green, and ADP densities in yellow.

Dynamic structural analysis of KaiC CI ring tryptophan mutants using fluorescence spectroscopy demonstrated a link between slow ATP hydrolysis and the KaiC CI binding to KaiB. The structural change triggered by slow ATP hydrolysis results in a structural rearrangement in the CI ring at the inner hexamer radius side (includes α7) and the D145–S146 peptide, without altering the overall hexameric framework of the KaiC CI ring. A slow KaiC CI ring conformational change (from pre- to post-hydrolysis state) coupled with the phosphorylation of KaiC results in a KaiC conformation that is receptive to the incoming active KaiB. This conformational switch in KaiC, coupled with ATPase activity and KaiC phosphorylation state, signals KaiC–active KaiB complex assembly and provides an explanation for the slowness of the cyanobacterial clock [91].

A 2.6 Å crystal structure (Fig. 7a) of the ternary complex of KaiAcryst (KaiAΔN–C272S: KaiAΔN is a KaiA variant missing the N-terminus; PDB 5JWR) in complex with KaiBfs-cryst–CIcryst provides a molecular-level understanding of the cooperative assembly of the Kai components and the regulation of output signaling pathways by the Kai oscillator. Ternary complex analysis indicates that the presence of KaiA results in an increase in the affinity of KaiB for the KaiC CI domain (Fig. 7b), as indicated by electrostatic interactions that form a triple junction between CIcryst, KaiBfs-cryst, and KaiAcryst and an increase in the number of hydrogen bonds and the interfacial surface area between KaiBfs-cryst and CIcryst [75]. Thus, KaiA drives the cooperative assembly of KaiB–KaiC. KaiA-activated KaiC phosphorylation drives the tightening of the CII ring, stacking CI over CII. Additionally, it is observed that the enhanced interaction between the CI and CII domains, as a result of CII rigidity, in turn suppresses KaiC ATPase activity [86].

Fig. 7

KaiCBA ternary complex depicting the KaiA autoinhibition mechanism. a A 2.6-Å ternary complex between KaiAcryst and KaiBfs-cryst–CIcryst (PDB 5JWR; KaiAcryst in yellow, KaiBfs-cryst in pink, and CIcryst in cyan). b Enlarged view of the enclosed box in a depicting the binding interface of the ternary complex. Dashed lines show the electrostatic interactions. c Conformational changes in the KaiA dimer when sequestered into a KaiCBA complex. (i) Structure of KaiA in orange bound to CII peptides in blue (from S. elongatus; PDB 5C5E) highlighting the α5 helices and β6 strands of the two KaiA monomers. (ii) KaiASe (orange) and KaiAcryst in ternary complex (yellow) superimposed, showing only the α5 helices and β6 strands of the two monomers. (iii) The CIcryst–KaiBfs-cryst–KaiAcryst ternary complex. Panels (i), (ii), and (iii) highlight only the α5 helices and β6 strands of the two KaiA monomers, depicting the structural basis of the mechanism of KaiA autoinhibition. d Top and side views of higher KaiCBA complex assembly (PDB 5N8Y) depicting the KaiC hexamer in green, the hexameric ring of KaiB monomers in pink, and KaiA homodimers in red and orange.

Analysis of the ternary complex also reflects on the auto-inhibitory role of KaiA (Fig. 7c). The bound KaiAcryst dimer in the ternary complex shows large conformational changes compared to the KaiA structure from S. elongatus. β6 strands of KaiAcryst monomers rotate by 70° and β6 of one monomer forms an antiparallel β-sheet by docking onto β2 of KaiBfs-cryst. This rotates the α5 helices of both KaiAcryst monomers downwards onto α7 and α9 (the KaiC binding site) at the KaiAcryst dimer interface and blocks it. Thus, KaiB binding to KaiA induces changes in KaiA conformation and, as a result, KaiA inhibits itself from binding to KaiC. Structure-guided mutagenesis of the α5 helix and α7 and α9 helices of KaiA weakened ternary complex formation. Mutations in the β2 strand of fsKaiB disrupted the antiparallel β-sheet formation, eliminating the interaction between KaiAΔN and the fsKaiB–KaiC CI complex. The mutation did not affect complex formation between fsKaiB and KaiC CI. The analogous mutations in kaiBSe disrupted the circadian rhythms in vivo [75].

In a parallel study [85], when Kai proteins were incubated in an excess of MgATP at 30°C, Snijder et al. observed that multiple stoichiometries of the phosphorylation-dependent Kai protein complex assemble simultaneously over a period of 24 hours. Initial formation of KaiCA complexes with autophosphorylation activity drives the cooperative assembly of phosphorylated KaiCB complexes (C6B1, C6B2 … C6B6), followed by the formation of higher order KaiCBA complexes (C6B6A2 … C6B6A6) that peaks in 12 hours. A dephosphorylation phase follows, in which KaiCBA complex disassembly is not the reverse of complex assembly. Incubation at 4°C favored autophosphorylation, with KaiCBA complex levels increasing even after 24 hours. A protocol based on these observations is used to obtain Kai complex assemblies "frozen" in various states for structural analysis. A KaiCBA complex assembly with near complete occupancy of the KaiA binding site could be obtained by prolonged incubation of KaiC, KaiB, and KaiA in a 1:3:3 molar ratio. Structural maps of the KaiC6B6A12 and KaiC6B6 complex assemblies, obtained at 4.7 Å and 7 Å resolution using mass spectrometry and single particle cryo-electron microscopy (EM) and fitted with previous crystal structures of the individual Kai proteins, reveal that the KaiCB assembly consists of three stacked rings, of which the bottom two correspond to KaiC and the top ring is formed by KaiB (Fig. 7d). The KaiB ring sits on top of KaiC CI [85]. Consistent with the previous study [88], analysis of KaiCBA complex cryo-EM maps indicates that KaiC-bound KaiB in the KaiCBA complex is fsKaiB. Also, it is the KaiBC complex assembly that guides the formation of higher KaiCBA assemblies [85].
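
As a rough bookkeeping aid for the assembly series described above, the following Python sketch enumerates the KaiCB and KaiCBA stoichiometry labels and compares the subunits supplied by a 1:3:3 mix with those needed for a fully occupied C6B6A12 particle. It assumes that the molar ratio counts monomers and that KaiA is added in dimer steps. Both are illustrative assumptions, and the function names are hypothetical rather than taken from the cited study.

```python
# Illustrative bookkeeping only; names and defaults are assumptions.

def kai_cb_series(max_b=6):
    """Labels for KaiC hexamers carrying 1..max_b KaiB monomers."""
    return [f"C6B{n}" for n in range(1, max_b + 1)]


def kai_cba_series(max_a=12, step=2):
    """Labels for C6B6 particles carrying KaiA (assumes dimer-sized steps)."""
    return [f"C6B6A{n}" for n in range(step, max_a + 1, step)]


def excess_per_hexamer(ratio=(1, 3, 3), target=(6, 6, 12)):
    """Monomers supplied per KaiC hexamer minus those needed for the target.

    ratio  -- KaiC:KaiB:KaiA molar ratio, assumed to count monomers
    target -- subunit counts of the target complex (default C6B6A12)
    """
    c, b, a = ratio
    hexamer_size = target[0]
    supplied_b = hexamer_size * b // c
    supplied_a = hexamer_size * a // c
    return {"KaiB": supplied_b - target[1], "KaiA": supplied_a - target[2]}


if __name__ == "__main__":
    print(kai_cb_series())
    print(kai_cba_series())
    print(excess_per_hexamer())
```

Under these assumptions, a 1:3:3 mix leaves both KaiB and KaiA in excess of what a C6B6A12 particle requires, which is in line with the near complete occupancy reported for prolonged incubation.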

Analysis of KaiCBA using the KaiA dimer crystal structure confirms the participation of KaiA as a dimer in the formation of Kai complex assemblies. KaiB interacts with KaiA through its β2 strand and the binding is asymmetric, suggesting involvement of only one KaiB monomer in binding. Structure-guided mutagenesis of KaiC Ala106 and KaiB Lys42, combined with native mass spectrometry, indicated their significance in KaiC–KaiB and KaiB–KaiA interactions, respectively [85]. The KaiB Lys42 mutation in S. elongatus and its analogous Lys43 mutation in T. elongatus disrupted clock rhythmicity in vivo [75].

Although KaiC’s autokinase and ATPase activities are fairly well characterized, KaiC dephosphorylation is less clear. The KaiC CII domain does not share the typical motif of the serine/threonine phosphatase family [93], but it does have a unique kinase/phosphatase activity at the subunit interface [78]. Egli and coworkers [78] hypothesized that this unique feature of KaiC is consistent with an unusual mechanism of dephosphorylation wherein ATP is regenerated from ADP in the CII half of KaiC, attributing a phosphoryl-transferase, rather than phosphatase, activity to KaiC. Also, Thr426 was observed to be phosphorylated in the T432E/S431E mutant crystal structure (PDB 3S1A) [66], supporting the hypothesis of phosphate transfer from ATP via a mechanism similar to Thr432 and Ser431 phosphorylation. The observation that KaiC does not appreciably consume ATP fits this model [78].

The three-dimensional structures of the clock components KaiA, KaiB, and KaiC are well defined and understood. Earlier studies of the complexes of these proteins using spectroscopic, computational, and hybrid structural approaches all support a likely mechanistic model resembling a switch from autokinase to phosphatase, or a possible autophosphotransferase activity of KaiC, and explain how it is related to KaiC ATPase activity. Further structural studies have made efforts to decipher the precise state underlying the switch; high-resolution crystal structures of KaiC–KaiB and KaiC–KaiB–KaiA complexes emphasize the importance of ATP hydrolysis by KaiC and of the conformational changes that trigger the assembly and disassembly of the KaiC, KaiB, and KaiA proteins. Kai components exist in a dynamic equilibrium between a ground (inactive) state and a rare active state. The structures provide a molecular basis for the mechanism wherein an ATP hydrolysis-induced conformational change in KaiC captures and stabilizes the interacting partner KaiB in the active state and simultaneously induces a switch between the varied enzymatic roles of KaiC that governs the phosphorylation/dephosphorylation cycles and regulates the circadian oscillator. Further studies of the KaiC–KaiA complex and the structures of the complexes that occur during the disassembly of the Kai complex are needed to understand the core circadian oscillator system and its regulation.

Circadian clocks in eukaryotes

This section briefly summarizes the various models for known eukaryotic circadian clocks and provides insight into structural research in progress.

The circadian clock in fungi

The Neurospora crassa circadian oscillator is arguably the best understood eukaryotic circadian system [31, 94, 95]. It has assisted in the elucidation of the concepts in eukaryotic clock mechanisms, yet many questions remain unanswered. With the limited structural knowledge of fungal clock proteins, the mechanism that underlies the functioning of core-clock components and posttranslational regulation is obscure. In the fungal CC (Fig. 3b), WHITE COLLAR 1 (WC-1), WHITE COLLAR 2 (WC-2), FREQUENCY (FRQ), and FRQ-INTERACTING RNA HELICASE (FRH) form crucial components of the clock. WC-1 and WC-2 are GATA-type zinc-finger DNA binding transcription factors that form the positive elements of the rhythmic loop [2, 47]. Together, they form a heterodimeric WHITE COLLAR COMPLEX (WCC) via their PER-ARNT-SIM (PAS) domains; the WCC binds to two light-responsive elements (LREs) of the frq promoter and activates the transcription of frq. In the late subjective night in constant darkness, the heterodimeric WCC (D-WCC) binds to the distal LRE region of the frq promoter to activate frq transcription. frq mRNA levels peak in the early subjective morning and subsequently lead to FRQ accumulation that peaks in the late subjective day [2, 15, 96]. FRQ acts as the key negative element and is expressed in two isoforms: a long and a short form [10]. The two isoforms form a dimeric complex that interacts with WCC and inhibits frq transcription [15]. The WCC–FRQ interaction is mediated by FRH [47, 97]. FRQ is simultaneously and progressively phosphorylated to release the repression on D-WCC and is degraded via a ubiquitin-proteasome-mediated pathway. FRQ also forms a positive loop, interlocked with the primary loop, by positively regulating the expression of WC-1 [2, 98].
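
The interlocked loops described above can be summarized, in a deliberately simplified way, as a signed graph in which the product of the edge signs around a closed path gives the polarity of the loop. The Python sketch below is a toy representation only: the node labels, the reduction of each regulatory step to a single +1 or -1 edge, and the function name are assumptions made for illustration, not a quantitative model of the Neurospora clock.

```python
# Toy signed-graph view of the Neurospora loops described in the text.
# +1 = activation or production, -1 = repression. Node names are labels only.
EDGES = {
    ("WCC", "frq mRNA"): +1,   # WCC activates frq transcription
    ("frq mRNA", "FRQ"): +1,   # frq mRNA is translated into FRQ
    ("FRQ", "WCC"): -1,        # FRQ (via FRH) inhibits WCC activity
    ("FRQ", "WC-1"): +1,       # FRQ positively regulates WC-1 expression
    ("WC-1", "WCC"): +1,       # WC-1 is a subunit of the WCC
}


def loop_sign(path):
    """Multiply edge signs around a closed path; -1 marks negative feedback."""
    sign = 1
    # Pair each node with its successor, wrapping the last node to the first.
    for source, target in zip(path, path[1:] + path[:1]):
        sign *= EDGES[(source, target)]
    return sign


primary = ["WCC", "frq mRNA", "FRQ"]
secondary = ["WCC", "frq mRNA", "FRQ", "WC-1"]

if __name__ == "__main__":
    print("primary loop sign:", loop_sign(primary))
    print("secondary loop sign:", loop_sign(secondary))
```

In this representation the primary loop has a negative sign, matching the role of FRQ as the negative element, whereas the path through WC-1 has a positive sign, matching the interlocked positive loop.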

Among the core-clock components, WC-1 consists of three PAS domains: PAS-A, PAS-B, and PAS-C. Of the three PAS domains, PAS-A belongs to a specialized class of light, oxygen, or voltage (LOV) domain and functions as a blue-light photoreceptor. The function of PAS-B is unclear, and PAS-C is required for the interaction between WC-1 and WC-2 [99, 100]. WC-2 consists of a single PAS domain, important for interaction with WC-1, a coiled-coil domain with unknown function, and a putative nuclear localization signal (NLS) [99, 101, 102]. FRQ is a phosphoprotein with a coiled-coil domain close to its N-terminus that mediates homodimerization. An NLS next to the coiled-coil domain of FRQ is essential for clock function [103]. The central and C-terminal part of FRQ is predicted to be largely unstructured and has no sequence similarity to any known protein domain [97, 104]. Apart from its role in the clock feedback loop, WC-1 is also a blue-light photoreceptor important for photomorphogenesis [2, 47, 96]. Light activation of WC-1 possibly results in the formation of a large WCC complex (L-WCC) that binds to the LREs, leading to the activation of transcription of the light-induced genes (frq and vivid (vvd) are two of them) [2, 101, 105,106,107]. VIVID (VVD) protein is another flavin-binding blue-light receptor in fungi that plays a role in phase regulation, entrainment, transient light responses, and temperature compensation in Neurospora circadian rhythms [2, 105, 106]. VVD and WC-1 are two LOV domain-containing photoreceptors that share sequence similarity in the core domain and bind FAD as the photosensory element [2]. The mechanism by which VVD inhibits nuclear WCC is unclear [2, 107]. Thus far, the LOV/PAS domain is the only recurring domain observed in the Neurospora clock. VVD is the only LOV domain-containing protein for which the crystal structure has been solved in both the light and dark states, by Zoltowski et al. [106] (see below).

Circadian clocks in insects and mammals

Identification and isolation of the first clock gene, period (per), in Drosophila and subsequent analysis of its expression led to the first molecular model of an animal circadian oscillator [108, 109]. The Drosophila and mammalian clock genes share a high level of sequence similarity and have orthologs. The primary feedback loop of the clock (Fig. 3c, d) consists of the positive elements CLOCK (dCLK) and CYCLE (CYC) in Drosophila and CLOCK and BMAL1 in mouse. These positive elements in Drosophila and mouse are members of the basic helix-loop-helix (bHLH)-PAS (Period-Arnt-Single-minded) transcription factor family, and they heterodimerize to activate the transcription of genes containing E-box cis-regulatory elements in their promoter region: Period (dPer) and timeless (dTim) in Drosophila and period genes (Per1, Per2, and Per3) and cryptochrome genes (Cry1 and Cry2) in mouse [46, 110,111,112,113,114]. mPER/mCRY (dPER/dTIM in Drosophila) proteins translocate to the nucleus and repress their own transcription by acting on CLOCK/BMAL1 (dCLK/dCYC) activity [17, 112, 115,116,117,118,119]. A putative homolog of dTim is retained in mammals (mTim); however, unlike a central role for dTim, the function of mTim in the mammalian circadian clock is not clear. An essential role similar to that of dTim is performed by Cry genes in the mammalian circadian clock [120,121,122]. Interestingly, studies have shown that Cry genes, both in Drosophila and mammals, regulate the circadian clock in a light-dependent (photoreceptors) and a light-independent manner [112, 123,124,125,126,127]. However, their role as photoreceptors in mammals is still debated (discussed in the “Light: input to the clock” section).

The core-clock loop integrates with other regulatory systems that further fine-tune the mammalian clock system. CLOCK/BMAL1 activates transcription of members of the orphan nuclear receptor family in mammals, namely Rev-erbα and Rev-erbβ (Rev-ErbA/NR1D, Nuclear receptor family 1 group D) and the Retinoic acid receptor (RAR)-related orphan receptors (RORα, β, and γ), via recognition of their E-box elements [128,129,130,131]. RORs and Rev-erbs, in turn, regulate the rhythmic expression of BMAL1 by alternatively binding to the retinoic-acid-related orphan receptor response elements (ROREs) on its promoter [132, 133]. RORs act as transcriptional activators of BMAL1 [129, 131, 132], whereas Rev-erbs act as repressors [128, 132]. Also, the genome-wide binding patterns of both Rev-erbα and BMAL1 showed regulatory regions that bind to most of the clock proteins and the proteins involved in various metabolic pathways, emphasizing the importance of Rev-erb/BMAL1 association with the circadian clock and metabolic functions (discussed later in the Rev-erb interactions section). Similarly, in Drosophila, dCLK/dCYC activates the transcription of vrille (vri) and Par domain protein 1ɛ (Pdp1ɛ), whose products bind to the VRI/PDP1-box (V/P) of the clk promoter to form the second loop [134, 135].

The interactions of PERIOD proteins: Crystal structures of fragments of Drosophila PERIOD (dPER residues 232–599) and mouse PERIOD (mPER1 residues 191–502, mPER2 residues 170–473, and mPER3 residues 108–411) proteins (Figs. 8 and 9) provide insights into the physical mechanism underlying circadian rhythm generation. The fragments include the two PAS domains (A and B), residues N-terminal to PAS-A, named the “N-terminal cap”, and the αE helix C-terminal to PAS-B. Thus, the molecular pattern established by the crystal structure tries to explain how the differential protein–protein interaction of the PAS domains in these proteins defines their distinct functions [49, 52, 136]. The occurrence of PAS domains and their interaction is found in many eukaryotic clock proteins [137]. The crystal structure of dPER (Fig. 8a, c) shows a noncrystallographic dimer where the PAS-A domain of one molecule interacts with the PAS-B domain of another molecule. Each PAS domain consists of a five-stranded antiparallel β-sheet (βA-βE) that is covered on one face with several α-helices (αA-αD). PAS-A and PAS-B in each monomer are connected by a short linker. In addition, each monomer has a highly conserved C-domain [138] that includes two long C-terminal α-helices (αE and αF). The αE helix is packed against PAS-B, parallel to αC’ of PAS-B, and the αF helix is directed away from the PAS-B core domain. Also, the crystal structure showed two different conformations for αF in the two dPER monomers [136]. The crystal structure of mPER2 (Fig. 8b, c) reveals a dimer that includes the two PAS domains, the αE helix, and a short N-terminal extension to the PAS-A domain [49].
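
For orientation, the sizes of the crystallized fragments follow directly from the residue ranges quoted above. The short Python sketch below computes them under the assumption that each range is inclusive of both boundary residues; the dictionary and function names are hypothetical and used only for this illustration.

```python
# Residue ranges as quoted in the text (assumed inclusive at both ends).
FRAGMENTS = {
    "dPER": (232, 599),
    "mPER1": (191, 502),
    "mPER2": (170, 473),
    "mPER3": (108, 411),
}


def fragment_length(first, last):
    """Number of residues in the inclusive range first..last."""
    if last < first:
        raise ValueError("last residue must not precede first residue")
    return last - first + 1


LENGTHS = {name: fragment_length(*span) for name, span in FRAGMENTS.items()}

if __name__ == "__main__":
    for name, length in LENGTHS.items():
        print(f"{name}: {length} residues")
```

With inclusive numbering this gives 368 residues for the dPER fragment, 312 for mPER1, and 304 each for mPER2 and mPER3.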

Fig. 8

Crystal structures of the period proteins. a dPER (PDB 1WA9) and b mPER2 (PDB 3GDI) dimers in cartoon representation. The conserved Trp482 (dPER, dark blue) and Trp419 (mPER2, cyan) residues are shown in stick representation. c The domain architecture of dPER and mPER2 proteins. The two PAS domains (PAS-A and PAS-B), the cytoplasmic localization domain (CLD, green), the conserved C-domain (light brown), nuclear localization signals (NLS, purple), NES (red), the threonine-glycine (TG) repeat region, and the dCLK:CYC inhibition domain (CCID, blue) of dPER and/or mPER2 are shown. CKIe, mCRY1/2, and dTIM are shown at their binding sites. d dPER structure representing the PAS-A–αF interaction (encircled region) interface and depicting the location of V243 (blue)

Fig. 9

Crystal structures of mPER1 (PDB 4DJ2) and mPER3 (PDB 4DJ3) fragments. a Cartoon representation of mPER1 (residues 191–502). The conserved Trp448 (yellow) is shown in stick representation. b Comparison of the mPER1 (cyan) and mPER2 (pink) crystal structures. Movement of the PAS-A/αC helix of molecule 2 is indicated by a black arrow. c Closeup view of the structural comparison of the PAS-A/αC dimer interface of mPER1 (cyan) and mPER3 (yellow). Gly residues in mPER1 are shown in red and Arg residues in mPER3 are labeled. d Cartoon representation of mPER3 (108–411). The conserved Trp359 (blue) is shown in stick representation. e Comparison of the mPER3 (yellow) and mPER2 (pink) crystal structures. The black arrow indicates the location of movement of the PAS-A/αC helix of molecule 2. f Closeup view of the structural comparison of the PAS-A/αC dimer interface of mPER2 (pink) and mPER3 (yellow). PAS-A/αC dimer interaction is present in mPER1 and mPER3, but absent in mPER2, because of the different relative orientation of the monomers in (mPER2)2 compared to the mPER1 and mPER3 homologues

The PERIOD proteins are known to form homo- and heterodimers in the circadian clock, likely mediated via their PAS domains [138,139,140,141,142,143]. A detailed structural and biochemical analysis of the PAS domains of the dPER and mPER2 fragments has shown homodimer formation in solution and in crystal. The two structures reveal the use of different PAS interfaces for dimerization. The dPER fragment forms a dimer via intermolecular interactions of PAS-A with Trp482 in the βD’–βE’ loop of PAS-B (PAS-A-Trp482 interface) and with αF in PAS-B (PAS-A-αF interface), whereas in mPER2, the dimerization is stabilized by interactions of two PAS-B domains in antiparallel fashion. Trp419, which corresponds to Trp482 in dPER, is an important conserved residue involved in this interaction [49]. The PAS domains of dPER mediate interactions with dTIM in the Drosophila CC [144, 145]. Homodimerization might be important for dPER stabilization in the absence of dTIM and might have a possible role in dTIM-independent transcriptional repression and translocation of dPER [146,147,148,149,150,151]. However, dPER also interacts with dTIM, and in the absence of structural studies of the heterodimeric complexes a detailed analysis of such an association is difficult. A low-resolution structure of a HIF α (Hypoxia inducible factor α) PAS-B heterodimer (PDB 2A24) was obtained by docking the high-resolution structures of ARNT and the HIF-2α PAS-B domain using experimentally derived NMR restraints for the association. It demonstrated the use of a common β-sheet interface for hetero- and homodimerization in PAS [152]. Additionally, a crystal structure of a dPER fragment lacking αF, combined with a mutant analysis using analytical gel filtration and analytical ultracentrifugation, showed no dimer formation, suggesting that helix αF contributed to dPER homodimer formation [49].

Structural analysis of dPER has shown the importance of the PAS-A-αF interface in homodimer formation in solution. A dPERL (V243D) mutant, which has a temperature-dependent 29-hour long period phenotype, existed as a monomer in the solution [108]. The analysis of dPER structure (Fig. 8d) has shown that V243 is located in the center of the PAS-A-αF interface; thus, the structure provides a mechanistic explanation for the 29-hour long period phenotype of this encoded mutation variant, reflecting the significance of this interface in clock function [49]. Consistent with this study, a PAS-B triple mutation (E474R/H492S/R494D) in a dPER fragment lacking αF disrupted the dPER-dTIM heterodimer in yeast two-hybrid studies, but not the dPER homodimer in gel filtration conditions. The study suggested that the PAS-B β-sheet surface is a common surface in dPER-dTIM heterodimer formation and (mPER2)2 homodimerization [49]. The crystal structures of mPER1 and mPER3 (Fig. 9a–f) were analyzed and compared with the previously reported mPER2 structure. In addition to the PAS-B-Trp419 interactions in mPER2 (Trp448 in mPER1 and Trp359 in mPER3), it was revealed that their homodimers are stabilized by further interactions in the PAS-A domain, which are mediated by two antiparallel PAS-A/αC motifs, not by an mPER2-type PAS-A–PAS-B/αE interaction. In the center of the interface is the Tyr267 residue in mPER1 (Tyr179 in mPER3) (Fig. 9c, f). The corresponding residue in dPER is Ala287, which facilitates the introduction of Trp482 into the PAS-A domain binding pocket in dPER and dimer formation that is different from that of mPERs [49, 52].
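
Mutant labels such as V243D or E474R/H492S/R494D follow the common shorthand of wild-type residue, position, and replacement residue, with a slash separating the substitutions in a multiple mutant. As a reading aid, the Python sketch below splits such labels into their parts. The function name and the restriction to the 20 standard one-letter codes are assumptions of this example, not part of the cited studies.

```python
import re

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
_MUTATION = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")


def parse_mutations(label):
    """Split 'E474R/H492S/R494D' into (wild type, position, mutant) tuples."""
    parsed = []
    for token in label.split("/"):
        match = _MUTATION.match(token.strip())
        if match is None:
            raise ValueError(f"not a point-mutation label: {token!r}")
        wild_type, position, mutant = match.groups()
        parsed.append((wild_type, int(position), mutant))
    return parsed


if __name__ == "__main__":
    print(parse_mutations("V243D"))
    print(parse_mutations("E474R/H492S/R494D"))
```

Read this way, V243D denotes replacement of Val243 by Asp, and the PAS-B triple mutant combines Glu474 to Arg, His492 to Ser, and Arg494 to Asp.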

Despite the conserved domain composition of the mPER proteins, the different interacting interfaces of the homodimers could play a role in defining their distinct functions. Of the three mammalian period proteins, mPER1 and mPER2 have been shown to be more important for maintaining the circadian rhythmicity. mPER2 regulates the expression of the clock genes (interaction with REV-ERBs), while mPER1 maintains their stability and subcellular localization via protein–protein interactions [153,154,155]. Knockout mouse studies of mPER3 showed only mild circadian phenotypes [156] but affected sleep homeostasis, suggesting that its role is directed more towards the regulation of the output processes than the core clock [157]. Period proteins contribute to the circadian regulation of metabolic pathways in peripheral tissues (adipose, liver, and muscle tissue) via the nuclear receptor signaling pathways. mPER3 interaction, via its PAS domains, with the nuclear receptor Peroxisome proliferator-activated receptor gamma (PPAR-γ) represses the receptor and inhibits adipogenesis [158]. mPER1 interacts with the mineralocorticoid receptor to positively regulate the basal and aldosterone-mediated expression of the alpha subunit of the renal epithelial sodium channel (αENaC) in the renal cortical collecting duct cells, by binding of the complex to the E-box in the αENaC promoter [159].

Analytical gel filtration analysis of the mPER homodimers in solution revealed a higher affinity for the mPER1 homodimer than for mPER2 and mPER3. Structural analysis of the PAS-A/αC interface (Fig. 9c, f) showed small (Gly) residues in mPER1, resulting in tighter PAS-A/αC dimer interaction compared to mPER3, which has a bulky Arg residue. Additionally, all mPER structures showed a highly conserved nuclear export signal (NES) in the αE helix. Mutation of a Met residue in this region of mPER2 disrupted its nuclear export activity, whereas mutation of the corresponding Leu in mPER1 and mPER3 had no effect. Structural analysis revealed the involvement of that Met in homodimer formation, in contrast to its Leu counterpart, which is exposed on the surface because of different orientations of the monomers in mPER1 and mPER3 compared to the (mPER2)2 homodimer [49, 52]. These observations suggest that homo- and heterodimerization events direct NES activity.

The N-terminal cap was observed to be unstructured in mPER2, whereas it formed a long helix followed by a β-strand in mPER1 and a shorter helix in mPER3. Sequence analysis of the mPER proteins predicts the presence of an HLH motif N-terminal to the PAS-A domain. In the absence of the basic region found in bHLH transcription factors, the mPER HLH region might be engaged in heterodimeric interactions with other HLH proteins. Analytical gel filtration and mutation studies showed that mPER3 utilizes the HLH motif as a second interface to further stabilize homodimer formation, instead of the PAS-A/αC interface used in mPER1, and forms a more stable homodimer than mPER2. Also, an LXXLL coactivator motif was observed in the PAS-A βE strand of mPER2 [49, 52], which was shown to play a role in the interaction of mPER2 with Rev-erbs [153]. The corresponding motif in mPER1 (PXXLL) and in mPER3 (PXXLT) is buried deep in the hydrophobic pocket formed by a Trp (in PAS-A) and a Leu residue in the N-terminal cap in mPER1 and mPER3, but not in mPER2. In addition, the coactivator motif in mPER2 is preceded by a less ordered βD-βE loop, suggesting that the motif in mPER2 is more easily available for interaction with nuclear receptors based on the higher flexibility of the adjoining regions [52]. Analysis of the interacting interfaces and the resulting orientation of the monomers in mPER homodimers suggests the availability of distinct surfaces for interaction with other clock proteins and nuclear receptors. A recent study developed three new mouse cellular clock models in fibroblasts, adipocytes, and hepatocytes to study cell type-specific functions of clock genes in peripheral tissues. Such studies showed that, although core-clock gene knockdowns displayed similar phenotypes, the period and Rev-erbs knockdowns showed cell-specific phenotypes [160].

Structural analysis of the PERIOD protein fragments is a step towards understanding PAS domains and the interactions of the PERIOD proteins. Future mutation studies of the key surfaces identified by the structural studies, and of the interacting partners, will provide a detailed understanding of their functions and the mechanisms involved, which are not yet clear. Also, the newly developed cell-autonomous clock model approach can be applied to other cell types and used to study mutants designed from the structural analysis, in order to understand the tissue-specific functional differences of various clock genes.

The CLOCK–BMAL1 complex: The CLOCK–BMAL1 complex is central to the core oscillator in the mammalian clock. In the primary loop, these positive elements activate the transcription of Per and Cry genes. The PER–CRY complex in return represses their transcription by acting on CLOCK and BMAL1 expression. Another regulatory loop is formed by CLOCK–BMAL1 and Rev-erbs and RORs, wherein the complex activates their transcription. Rev-erbs and RORs subsequently regulate the rhythmic expression of BMAL1 [17, 128, 131].

An important step towards understanding the mammalian circadian clock has been the crystal structure of the mouse transcriptional activator CLOCK–BMAL1 heterodimeric complex that is central to the oscillator [161]. The 2.3-Å resolution structure (Fig. 10) of the complex between CLOCK residues 26–384 and BMAL1 residues 162–447 revealed a tightly intertwined heterodimer formed by the interaction between their corresponding bHLH, PAS-A, and PAS-B domains. The crystal structure showed a striking difference in the spatial arrangement of the corresponding domains in the two proteins. The bHLH domain consists of two helices, α1 and α2, of which α2 is connected to the N-terminal A'α helix of the PAS-A domain via a linker, L1. The CLOCK α2 helix is arranged in such a way that it is in direct contact with the CLOCK PAS-A domain, whereas no such feature is observed in the bHLH and PAS-A domains of BMAL1. Part of helix α1 and helix α2 are involved in the dimerization of the bHLH domains of the two proteins, forming a typical bHLH four-helix bundle similar to that observed in the bHLH-leucine-zipper (LZ)-containing heterodimer MYC-MAX [162]. However, the additional PAS or LZ domain guides their selective and differential partner preference among members of the bHLH superfamily [163]. A proper bHLH four-helix bundle conformation is important for the stability of the CLOCK–BMAL1 complex and its DNA binding activity, as deduced from mutations in the bHLH domain, which resulted in reduced formation of a stable complex and elimination of its transactivation property.

Fig. 10

Crystal structure of the mouse CLOCK–BMAL1 complex (PDB 4F3L). a The ribbon diagram of the complex shows the CLOCK subunit in green and BMAL1 in pink. Yellow and blue highlight the respective linker regions between the domains. b Domain architecture of CLOCK and BMAL1 depicting the basic helix-loop-helix domain and the two PAS domains

The CLOCK and BMAL1 PAS-A domain consists of a typical PAS fold that is made of five β-strands and several helices. External to the PAS fold is the N-terminal A'α helix that is packed between the β-sheet surfaces of the two PAS-A domains and contributes to the heterodimeric interactions between the two domains. The interactions between CLOCK A'α and the BMAL1 β-sheet, and vice versa, are highly hydrophobic, forming a parallel heterodimer. Simultaneous mutation of the interface residues in both CLOCK and BMAL1 greatly reduced both the heterodimer formation and its transactivation potential compared to single mutations involving the individual proteins. The PAS-B domains of the two proteins are connected to the PAS-A domains by the 15-residue linker L2, which is ordered and buried within the CLOCK–BMAL1 interface in CLOCK, whereas in BMAL1, the linker is exposed to the surface and flexible. The crystal structure showed a translation of 26 Å in the PAS-B domains of CLOCK and BMAL1. The two PAS-B domains interact via surface-exposed hydrophobic residues in CLOCK and BMAL1. Trp427 of BMAL1 stacks with the CLOCK Trp284 located in the hydrophobic cleft between the Fα helix and the AB loop of the CLOCK PAS-B domain (Fig. 10). The tandem mutation of W427A in BMAL1 and W284A in CLOCK resulted in reduced complex formation and reduced the activity of the complex [161].

Lack of similarity among the clock proteins indicates that while the mechanisms are conserved across the kingdoms and are fundamental to clock machinery, the proteins are not structurally related, and further research is required to understand the structural differences. The crystal structures of the PAS domain homodimers of dPER and mPERs provide an interesting view of the interactions and their nonredundant functions. The PAS domains of Drosophila dPER share a significant similarity with mammalian PER proteins and bHLH-PAS transcription factors (CYC, BMAL, CLK, and NPAS2) [138]. WC-1, the functional analogue of CLOCK–BMAL1 from fungi, shows some similarity to BMAL1 within the PAS domain, as well as outside of the immediate PAS domain [98], suggesting a common ancestor and providing a link between fungi and animals. A bHLH-PAS domain has also been identified in phytochrome-interacting factor-3 (PIF3), which shows high similarity in the bHLH region to other members of the bHLH protein superfamily. Outside of the bHLH domain, PIF3 shows limited similarity to the PAS domains in phytochromes, but not to animal PAS domains [164]. The secondary dimer interface observed in mPER1 and mPER3 homodimers was absent in (mPER2)2 and is a conserved feature of mPER1 and mPER3, but not of other PERs or the bHLH-PAS-containing transcription factors [52]. Thus, the structural studies on dPER and mPER emphasized the need for detailed structural and biochemical analyses of the PERs and the bHLH-PAS transcription factors to determine if similar or different modes of interaction exist among these clock components.

The crystal structure of the heterodimeric complex between mouse CLOCK and BMAL1 revealed an unusual 3D arrangement of the two PAS domains in the two proteins. The conformation and the spatial arrangement of the PAS domains of BMAL1 were similar to that observed in the crystal structure of the PAS domains of dPER and mPER. Trp362 in CLOCK is involved in an interaction with CRY. The corresponding Trp427 in BMAL1 interacts with CLOCK. In PERIOD proteins, Trp at a similar position is involved in homodimer formation [49], suggesting high structural and functional conservation of the BMAL1 and PER PAS domains. Also, the dimerization mode in the PER homodimer crystal structure and in the solution NMR structure of the HIF-2α–ARNT heterodimer was antiparallel, whereas it was parallel in the CLOCK–BMAL1 heterodimer, which, despite the similarity in the structure of the domains, suggests that their protein–protein interactions and/or function are highly influenced by the spatial arrangement [161]. Homo- and hetero-dimerization has also been observed in the components of the plant clock CCA1/LHY that contains the Myb-like domains instead of the bHLH-PAS domain. The interaction occurs in the region at the N-terminus, probably near the Myb domain. Two Myb domains are necessary for DNA binding, and dimerization was observed in the case of single Myb-domain-containing proteins. CCA1 was also observed to form homodimers [165]. However, the functional significance of its dimerization is yet to be determined.

Structural insight into Rev-erb interactions: The crystal structure of human Rev-erbβ was reported in a dimeric arrangement (Fig. 11a; monomer) [166]. Rev-erbs belong to the family of nuclear receptors that consist of ligand-sensitive transcription factors. These nuclear receptors contain two domains important for their activation: the ligand-insensitive activation domain, called activation function-1 (AF-1), at the N-terminus and the ligand-dependent activation domain, known as AF-2, present within the ligand-binding domain (LBD) at the C-terminus. Rev-erbs are unique within the family in that they lack the AF-2 domain [167]. In addition to being crucial components of the mammalian circadian clock, Rev-erbs are also suggested to play an important role in coordinating the metabolic process [168]. In the crystal structure, each monomer has an α-helical fold that consists of nine α-helices (H3–H11) and short β-strands (s1–s2). The putative LBD is filled with bulky hydrophobic residues, resulting in a small cavity unable to accommodate any potential ligand. Also, in the absence of helix H12 (AF2-helix), helix H11 adopts a unique kinked conformation that establishes contacts with H3, thereby stabilizing the hydrophobic core. H11 provides a structural platform for binding of a co-repressor and is important for constitutive repression activity.

Fig. 11

Structure of Rev-erb β LBD monomer. a The apo form (green), without heme (PDB 2V0V), and b as a heme-containing (yellow) complex (PDB 3CQV), with the prosthetic group bound in the ligand-binding pocket. The conserved Cys384 (cyan) and His568 (red) residues involved in heme-binding are shown in stick representation. Helices H11 (red) and H3 undergo conformational changes to accommodate the heme prosthetic group. c The domain architecture of the Rev-erbs depicting the variable N-terminal A/B region (orange), DNA-binding domain (DBD) and the ligand-binding domain (LBD)

The molecular model of Rev-erbα LBD, constructed using Rev-erbβ LBD as a template, showed a similar configuration in the putative LBD [166], in contrast to the molecular model of LBD for E75, a Drosophila orthologue of human Rev-erbα [169], in which the putative LBD was large enough to accommodate a heme ligand. Previously, Rev-erbs were reported to be true orphan nuclear receptors showing no ligand-binding activity and acting as constitutive repressors by their binding to the nuclear co-repressor (N-coR). Similar to heme protein E75 [169], studies showed that heme is required to maintain the stability of Rev-erbs. The heme binding was found to be reversible, and the transcriptional regulation of Rev-erbs is altered with changes in concentration of the heme in the intracellular environment. Heme binding is required to stabilize N-coR interaction with the Rev-erbs [170, 171]. The crystal structure of the heme-bound Rev-erbβ LBD (Fig. 11b) [172] coupled with spectroscopic analysis provides the structural basis to show that heme and gas molecule (NO or CO) binding and the redox state are important for the regulation of Rev-erb activity. Conserved Cys and His residues were observed to be essential for heme binding, where Cys384 coordinates oxidized Fe(III), but not reduced Fe(II). These redox-dependent structural changes, resulting in functional changes, are common in heme proteins, such as E75 and NPAS2, but it is not known if the same changes happen in Rev-erbs [169, 173]. The reduced form was also able to bind gas molecules. Compared to the apo LBD structure, in the Rev-erbβ LBD complexed with oxidized Fe(III), helix H3 becomes straight, and H11 undergoes a conformational change in its C-terminal half to allow accommodation of the two heme-binding residues. The hydrophobic residues filling the LBD stabilize heme binding via van der Waals interactions, suggesting a significant contribution to binding strength and specificity. 
Heme has been shown to influence circadian cycles and to be a component not only of Rev-erbs but also of other CC proteins, such as mPER and NPAS2 [172].

In the absence of the AF2 domain, the Rev-erbs regulate the activity of various genes via association with the nuclear receptor-co-repressor (N-coR) [168, 174, 175]. N-coR consists of two regions, called interaction domains (ID) 1 and 2, through which it binds to the nuclear receptor LBD. Rev-erbs regulate gene activity by specifically binding to the ID1 CoRNR motif [176,177,178]. Structures of apo-Rev-erbβ and heme-bound Rev-erbβ, however, are unable to help in understanding the Rev-erb–N-coR association, which is important for its repressive function. Phelan et al. [179] studied a co-crystal structure of interaction domain 1 (ID1) peptide bound to the hRev-erbα LBD (Fig. 12). The structure revealed formation of β-structures at the C-terminal region of the LBD that have not been observed in other nuclear receptors or in apo- or heme-bound Rev-erbβ. The N-coR ID1 peptide association with the C-terminal region of the Rev-erbα LBD results in an antiparallel β-sheet formation. The N-terminal β-strand (β1N) of the N-coR ID1 peptide is followed by a well-defined α-helix (α1N) that extends into the coactivator groove of the LBD. Structure-based alignment of the N-coR ID1 peptide-bound Rev-erbα with N-coR2/SMRT1 ID2-bound PPARα defines a new and extended CoRNR motif (I/LxxI/VIxxxF/Y/L) (Fig. 12b) that best describes the binding requirements for ID1 and ID2. Mutations at the +1, +4, and +5 positions that form the core of the CoRNR motif showed significant reduction in binding affinity towards Rev-erbα. Similar results were observed in a mammalian two-hybrid assay. Mutation at the +9 position resulted in nine-fold reduction of the interaction. These observations suggest that the core CoRNR motif (ICQII) and the right-extended flanking region are required for the interaction with Rev-erbα. Comparison of the N-coR ID1-bound Rev-erbα LBD with apo-Rev-erbβ and the heme-bound Rev-erbβ (Fig. 12c) showed that heme binding brings about changes in the conformation of H11 that result in large changes in H3, which then occupies the space for ID1 N-coR-binding. Based on homology, if the heme-bound Rev-erbα adopts similar changes, they will affect the binding of Rev-erbα with N-coR ID1 [179]. Also, the binding of heme with Rev-erbα destabilized interaction with the N-coR peptide [170], suggesting that N-coR associates differentially with the Rev-erbs in the absence of heme to perform the repression function and that the interaction between full-length Rev-erb and N-coR in the presence of heme might require additional contacts between the two proteins [174, 179].
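The extended CoRNR motif quoted above (I/LxxI/VIxxxF/Y/L) can be written as a simple sequence pattern. The following is a minimal sketch, assuming that "x" stands for any residue and that the motif spans nine consecutive positions (+1 to +9); the function name and the test peptide are hypothetical and serve only as an illustration.

```python
import re

# Extended CoRNR motif as quoted in the text: I/L-x-x-I/V-I-x-x-x-F/Y/L.
# Assumption: "x" is any single residue, so the motif covers nine positions.
EXTENDED_CORNR = re.compile(r"[IL]..[IV]I...[FYL]")
MOTIF_LENGTH = 9


def find_cornr_motifs(sequence):
    """Return (start_index, matched_segment) for every motif occurrence.

    Every start position is tested separately so that overlapping
    occurrences are also reported. Indices are 0-based.
    """
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - MOTIF_LENGTH + 1):
        match = EXTENDED_CORNR.match(seq, start)
        if match:
            hits.append((start, match.group()))
    return hits


# Hypothetical peptide built around the ICQII core named in the text.
peptide = "AAHICQIITQDFAA"
print(find_cornr_motifs(peptide))
```

Replacing the residue at the +1 position of the hypothetical peptide (I to A) removes the match, which mirrors, in a purely formal way, the sensitivity of binding to mutations at the core positions.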

Fig. 12

Structure of N-CoR ID1 peptide and interactions. a N-CoR ID1 CoRNR peptide (pink) bound to Rev-erb αΔ 323-423 LBD (sea green; PDB 3N00) depicting the N-CoR ID1 peptide β-strand (β1N) and α-helix (α1N) and the new C-terminal β-strand sY of Rev-erb α LBD. The backbone of the contact residues in H3, H4, H5, and the new Yβ-strand are shown in yellow and the supporting H3 residues in orange. b Representation of the amino acid residue positions in the N-CoR ID1 peptide defining the new extended motif for NRCoR. c Comparison of the N-CoR ID1 CoRNR peptide (pink) bound to Rev-erb αΔ 323-423 LBD (sea green) with apo-Rev-erb β (gray) and heme (red)-bound Rev-erb β (yellow). The region within the black box represents the changes in H3 as a result of conformational changes in H11 when Rev-erb binds to N-CoR ID1/heme.

One study has shown that both Rev-erbα and β [180] are central to the circadian clock, playing an important role in the regulation of the core-clock components and the clock output genes, rather than forming an accessory loop contributing to clock function. Analysis of the genome-wide cis-acting targets of the two isoforms and comparison with the BMAL1-binding sites [181] showed an extensive overlap highly enriched in the circadian clock genes and lipid metabolism genes. The integral role of Rev-erbs closely links metabolic regulation to the core-clock machinery, and any alterations in the core-clock genes would disturb energy homeostasis and metabolic activities, which could eventually lead to metabolic diseases. The double-knockout mutant of Rev-erbα and β showed phenotypes with severely disrupted circadian expression of the core-clock components and deregulated lipid homeostatic genes. The circadian phenotypes were similar to those observed in other core-clock mutants (Bmal1-/-, Per1-/-Per2-/-, Cry1-/-Cry2-/-), suggesting that, together, the two Rev-erbs work with BMAL1 and other core-clock components to regulate circadian rhythms and metabolism [180]. Additionally, the knowledge that a small-molecule ligand, like heme, is essential in the regulation of Rev-erbs’ activity has driven scientists to develop synthetic Rev-erb agonists as a new therapeutic approach for the treatment of metabolic diseases and resetting of altered circadian rhythms [182].

The plant circadian clock

The plant CC has been comprehensively studied using Arabidopsis thaliana as a model. The present clock paradigm consists of at least three interlocking transcriptional–translational feedback loops (Fig. 3e) [183, 184]. The core loop includes two related Myb-like transcription factors, CIRCADIAN CLOCK ASSOCIATED 1 (CCA1) and LATE ELONGATED HYPOCOTYL (LHY), whose expression peaks in the morning, and TIMING OF CAB EXPRESSION 1 (TOC1), which is expressed in the evening. TOC1 is a member of the pseudo-response regulator (PRR) gene family (PRR3, PRR5, PRR7, PRR9, and TOC1), whose members are clock-regulated but peak at different times of the day [185,186,187,188,189]. The nuclear-localized TOC1 protein, earlier suggested to activate CCA1/LHY expression [190], is a transcriptional repressor of CCA1 and LHY [191], and CCA1 and LHY repress TOC1 activity [192,193,194]. In the morning loop, CCA1/LHY promotes PRR9 and PRR7 expression, which, in turn, have negative feedback on CCA1/LHY [195,196,197]. In the evening loop, TOC1 represses an unknown, mathematically defined factor ‘Y’ that, in turn, activates TOC1 expression. GIGANTEA (GI) [198] is thought to be a part of the Y factor. GI itself is negatively regulated by CCA1/LHY and TOC1 [199].
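The three loops described above can be summarized as a small signed network. The sketch below is only an illustrative encoding, not a published model: it assumes that CCA1/LHY and PRR9/PRR7 can each be collapsed into a single node, that every interaction is either activation (+1) or repression (-1), and it treats the hypothetical factor Y as an ordinary node. The sign of a two-component loop is the product of its edge signs.

```python
# Signed regulatory edges taken from the description in the text.
# +1 = activation, -1 = repression. Node grouping is a simplifying assumption.
EDGES = {
    ("TOC1", "CCA1/LHY"): -1,       # TOC1 represses CCA1 and LHY
    ("CCA1/LHY", "TOC1"): -1,       # CCA1 and LHY repress TOC1
    ("CCA1/LHY", "PRR9/PRR7"): +1,  # morning loop: activation
    ("PRR9/PRR7", "CCA1/LHY"): -1,  # morning loop: negative feedback
    ("TOC1", "Y"): -1,              # evening loop: TOC1 represses Y
    ("Y", "TOC1"): +1,              # evening loop: Y activates TOC1
    ("CCA1/LHY", "GI"): -1,         # GI is negatively regulated by CCA1/LHY
    ("TOC1", "GI"): -1,             # ... and by TOC1
}


def two_node_loops(edges):
    """Return {(node_a, node_b): loop_sign} for every reciprocal pair.

    loop_sign is the product of the two edge signs: -1 marks a negative
    feedback loop, +1 a mutual-repression or mutual-activation loop.
    """
    loops = {}
    for (src, dst), sign in edges.items():
        # src < dst keeps each reciprocal pair only once.
        if src < dst and (dst, src) in edges:
            loops[(src, dst)] = sign * edges[(dst, src)]
    return loops


for pair, sign in sorted(two_node_loops(EDGES).items()):
    kind = "negative feedback" if sign < 0 else "mutual repression/activation"
    print(pair, kind)
```

In this encoding, the morning and evening loops come out as negative feedback loops, whereas the CCA1/LHY–TOC1 core loop is a mutual-repression pair. GI appears only as a target because the text does not state which edge it contributes as part of Y.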

Another evening-expressed Myb domain-containing SHAQYF-type GARP transcription factor, LUX ARRHYTHMO (LUX), functions in a feedback role similar to that of TOC1 [200, 201] and is a possible component of a proposed Y activity [200]. Other components important for the clock, such as EARLY FLOWERING 3 and 4 (ELF3 and ELF4), are necessary for the gating of light signal inputs into the clock via an unclear mechanism. ELF3 and ELF4 are highly conserved plant-specific nuclear proteins with unknown function that normally accumulate in the evening [202,203,204,205,206]. Loss-of-function mutations in these three clock components result in arrhythmia under conditions of constant light and in darkness [200, 201, 205, 206]. Recent studies have shown them to be integral components of the evening repressor complex of the core molecular oscillator important for proper functioning of the circadian clock, and they have been implicated in the regulation of the transcript levels of PRR9 [206,207,208,209,210,211]. Repression by the evening genes was inferred from the genetic studies of ELF4 and ELF3 [212, 213]. Taken together, the plant CC appears to comprise a series of transcriptional regulators specific to plants.

The plant clock components and their interactions have primarily been studied using reporter assays, the yeast two-hybrid assay, and co-immunoprecipitation. However, the lack of structural knowledge largely limits our understanding of the clock components. In silico approaches have been applied to predict the structural features and thereby gain insight into the underlying functional aspects of some components. However, in the absence of experimental validation, a cautious approach is required. Using such an approach, TOC1 was predicted to be a multidomain protein, having an N-terminal signaling domain as well as a C-terminal domain that might be involved in metal binding and transcriptional regulation. A middle linker predicted to lack structure connects the two domains [214]. The N-terminal domain fold is predicted to be similar to the canonical fold of the bacterial RR protein structures [215, 216], hence the name PRR. The RR class of proteins is involved in phosphorelay signaling in bacteria and plants [217, 218]. Gendron et al. [191] have recently defined the biochemical function of TOC1 in transcriptional repression, which resides within its PRR domain. The extreme end of the C-domain is predicted to have two α-helices and represent a CCT (for CONSTANS, CONSTANS-like and TOC1) subdomain similar to the CCT domain of CONSTANS (CO). Since CO interacts with the HEME ACTIVATOR PROTEIN (HAP) transcription factor, Wenkel et al. [219] suggested that the CCT subdomain of TOC1 could have a similar interaction with this class of DNA-binding proteins, thus implicating TOC1 as a co-regulator of transcription [214]. Work by Gendron et al. [191] confirmed this structural hypothesis [214] by showing that TOC1 belongs to the family of DNA-binding transcriptional regulators. They showed that TOC1 could bind to DNA through its CCT domain and that a functional CCT domain is a prerequisite for the repressor activity of the PRR domain [191].

Another study utilizing bioinformatics approaches [212] has predicted that ELF4 is a protein with a single domain of unknown function and that it belongs to a functionally conserved family of ELF4 and ELF4-like proteins. The conserved region is predicted (Fig. 13a) to be α-helical with a coiled-coil structure and disordered N- and C-termini. The secondary structure analysis using CD spectroscopy showed signals for disordered regions and an α helix, but not for β-sheet conformation. The protein migrated as a dimer on a native gel. Using docking programs, ELF4 was predicted to form a homodimer with an asymmetrical electrostatic-potential surface (Fig. 13b, c). Additionally, expression analysis of elf4 hypomorphic alleles showed phenotypes at both morning and evening genes, suggesting a dual role for ELF4 linked with both morning and evening loops [212]. ELF4 influenced the clock period by regulating the expression of LUX under LL, in addition to TOC1, PRR9, and PRR7 expression under DD. The effect of ELF4 on morning and evening loops did not alter CCA1 or LHY expression [212].

Fig. 13

Predicted structural models of ELF4. The a ELF4 monomer, b ELF4 dimer, and c electrostatic potential surface calculated for the ELF4 dimer. Surface areas colored red and blue represent negative and positive electrostatic potential, respectively

Identification of the evening complex, comprised of ELF4, ELF3, and LUX, which are all crucial for the transcriptional repression of the morning genes, addresses the importance of protein–protein interactions in a functional rhythmic oscillator [207]. ELF4, previously predicted to activate a transcriptional repressor [212], was shown to interact genetically and physically, both in vivo and in vitro, with a middle domain in ELF3. The interaction between the two proteins increased the nuclear levels of ELF3, suggesting that ELF4 acts as an anchor that helps in nuclear accumulation of ELF3. Both the nuclear-localization region in the C-terminal domain and the ELF4-binding middle domain of ELF3 were observed to be important for functional activity of ELF3 [211]. Although the biochemical activity of ELF3 is unclear, it has been proposed to be a co-repressor of PRR9 transcription [209].

Light: input to the clock

Light is one of the major environmental cues influencing the CC. Organisms have evolved sophisticated light-signaling networks that synchronize the clock to day/night cycles in order to regulate their metabolic and physiological processes.

Cyanobacteria

Cyanobacterial rhythms have been shown to be synchronized indirectly by light via the redox state of metabolism in the cell. The type of input that the clock perceives was previously unclear. Further work revealed Circadian input kinase A (CikA), a histidine kinase bacteriophytochrome [220], and light-dependent period A (LdpA), an iron-sulfur protein [221], to be important candidates for input signaling to the core oscillator. These proteins transmit the input signals by sensing the redox states of the plastoquinone (PQ) pool. The PQ redox state in photosynthetic organisms varies with the intensity of light: PQ is oxidized under low light intensities and reduced at high light intensities [222]. A CikA mutant showed a shorter free-running period and was unable to reset after a dark pulse [220]. Like CikA mutants, LdpA mutants also showed a short circadian period; however, they were able to reset after the dark pulse [221]. CikA protein levels vary inversely to the light intensity in the wild type, but were observed to be light insensitive in the absence of LdpA [221, 223, 224]. S. elongatus CikA (SyCikA) consists of a cGMP phosphodiesterase/adenylate cyclase/FhlA-like domain (GAF) similar to that in other bacteriophytochromes, followed by a characteristic histidine protein kinase (HPK) domain. However, the GAF domain lacks the conserved Cys and His needed for the binding of the chromophore in other bacteriophytochromes. Also, binding with a chromophore was not observed in vivo. C-terminal to the kinase motif is the receiver domain homologous to the receiver domain of the response regulators of the bacterial two-component signaling systems. It lacks a conserved Asp present in the receiver domains of the bacterial RRs that is phosphorylated by the HPK domain, hence the name pseudoreceiver domain (PsR) [220, 225]. A family of PsRs is also observed in the plant circadian clock (PRRs) [185].

The solution structure of the PsR of CikA (PDB 2J48) [226] consists of a doubly wound five-stranded β-sheet with five α-helices (α1 and α5 on one face and α2–4 on the other). CikA mutants lacking the PsR domain showed a significant increase in autokinase activity [225]. The interaction between the PsR domain and the HPK domain of CikA was analyzed by superimposing a predicted model of CikA-HPK (using PDB 2C2A as template [227]) and the solution structure of CikA-PsR over the Spo0F–Spo0B complex (PDB 1F51 [228]) crystal structure. The PsR domain physically blocked the H393 of the HPK domain, making it unavailable for phosphoryl transfer (Fig. 14a), which explains the role of PsR in the attenuation of CikA-HPK autophosphorylation activity [226]. Phosphorylation of the receiver domain in the bacterial RRs results in a conformation change, an effect that is probably mediated by the protein–protein interaction in CikA. Like CikA, KaiA also consists of a pseudo-response receiver domain at the N-terminus. In KaiA homodimers, the interaction between the two protomers occurs via the α4-β5-α5 surface of the PsR domain of one subunit with the swapped C-terminal domain of the other [44, 60]. It was expected that CikA might use the same PsR surface to mediate protein–protein interactions.

Fig. 14

Structure of the PsR domain of CikA. a CikA-PsR (yellow, PDB 2J48) superimposed on the Spo0F–Spo0B complex (blue and orange, PDB 1F51) depicting the structural difference in the HPK-PsR domain interaction interface in CikA and bacterial Spo0F–Spo0B. b The complete phytochrome sensory module of Synechocystis 6803 Cph1 (PDB 2VEA). The tongue region is encircled. The N-terminal region is shown in yellow, the PAS domain in pink, the GAF domain in orange, and the PHY domain in green. The phycocyanobilin (PCB) chromophore is shown in blue stick representation

The phosphatase activity of CikA is enhanced significantly in the presence of KaiC and KaiB. In vivo, strains lacking CikA showed high levels of phosphorylated RpaA, indicating that CikA promotes dephosphorylation of RpaA [229]. Also, relative to gsKaiB, fsKaiB variants showed a threefold increase in phosphatase activity of CikA and suppressed RpaA phosphorylation, suggesting that the interaction of the rare active-state KaiB with KaiC activates signaling through CikA. Shortened periods of oscillation were observed in vivo and in vitro in the presence of an excess of the pseudo-receiver domain of CikA (PsR-CikA). CikA was proposed to interact physically through its pseudo-receiver domain. Also, interactions were observed between KaiB variants (that adopt the fsKaiB state) and the PsR-CikA domain, but not between the PsR-CikA domain and gsKaiB [88]. To understand the molecular basis of this interaction, a study was undertaken using methyl-TROSY NMR spectroscopy, which revealed an interaction between PsR-CikA and the KaiC CI domain–fsKaiB complex. The NMR spectra were similar for PsR-CikA bound to fsKaiB–KaiC CI or wild-type KaiB–KaiC CI complexes. Co-operative assembly is also essential for the formation of the CikA–KaiB–KaiC complex, similar to what is observed during the formation of the KaiA–KaiB–KaiC complex, as indicated by the weak interaction between PsR-CikA and fsKaiB in the absence of the KaiC CI domain [75].

The solution structure of the complex between a fsKaiB variant with N29A substitution (KaiBfs-nmr; binds to PsR-CikA in the absence of KaiC CI) and PsR-CikA (Fig. 15a) shows a binding interface of parallel nine-stranded β-sheets that includes β2 of PsR-CikA and β2 of KaiBfs-nmr. Structural analysis shows hydrophobic interactions between A29 of KaiBfs-nmr and I641 and L654 of PsR-CikA. The residue I641 of PsR-CikA is located in the center of the β2–β2 heterodimeric-binding interface. The interface center also shows an interaction between C630 of PsR-CikA and A41 of KaiBfs-nmr. C630R substitution eliminated complex formation. Comparison of the binding interface of the PsR-CikA and fsKaiB N29A variant complex with that of the KaiA and fsKaiB complex (Fig. 15b) shows that fsKaiB uses the same β2 strand to interact with KaiA and CikA. Also, mutations in the β2 strand of KaiB weakened its binding to both KaiA and CikA [75]. CikA and KaiA compete for the same overlapping binding site of the active-state KaiB; thus, the rare active fold-switched state is important for CikA interaction with the Kai oscillator to regulate input signals, as it is for the inactivation of SasA and the regulation of output pathways.

Fig. 15

Structural analysis of the PsR–CikA–KaiBfs-nmr complex and the interacting interface. a NMR structure of the PsR–CikA–KaiBfs-nmr complex. Yellow, PsR-CikA; red, KaiBfs-nmr. b An expanded, close-up view of the boxed region depicting the complex interface is shown. c Comparison of the PsR–CikA–KaiBfs-nmr and KaiAcryst–KaiBfs-cryst complex interfaces. PsR–CikA and KaiAcryst compete for the same β2 strand of rare active fsKaiB

CikA and KaiA co-purify with LdpA [224]. LdpA, an iron-sulfur center-containing protein, has been reported to be involved in redox sensing [221, 224]. Treatment of cells expressing LdpA with 2,5-dibromo-3-methyl-6-isopropyl-p-benzoquinone (DBMIB), which inhibits electron transfer from PQ to cytochrome bf, thus reducing the PQ pool, significantly affected the stability of LdpA, CikA, and KaiA. Additionally, lack of LdpA in DBMIB-treated cells further reduced CikA stability, suggesting that LdpA can affect CikA sensitivity to the cellular redox state [224]. Interestingly, CikA and KaiA bind directly to quinone analogues [223, 230], suggesting they can input light signals by sensing the redox state of metabolism in a manner independent of LdpA. Thus, CikA and LdpA might be a part of an interactive network of input pathways that entrains the core oscillator by sensing the redox state of the cell as a function of light.

Fungi

Known light-induced responses in Neurospora are mediated by the blue light photoreceptors WC-1 and VVD [231, 232]. Light activation and photoadaptation mechanisms are crucial for robust circadian rhythms in Neurospora and are driven by the two LOV domain-containing components, the WCC and VVD [233, 234]. VVD is smaller than WC-1 and works in an antagonistic way to tune the Neurospora clock in response to blue light [2]. Light irradiation of the WCC results in the formation of a slowly migrating, large WCC homodimer that binds rapidly to the LREs (light responsive elements) and drives the expression of many downstream light-dependent genes (e.g., frq and vvd) [2, 101, 105, 107]. Light-induced gene expression is a transient process, as hypophosphorylated WCC, when activated, is simultaneously phosphorylated and rapidly degraded. Phosphorylation of WCC results in the dissociation of the complex, making it unavailable for photoactivation. The gene transcripts and proteins reach a maximum level in the initial 15 and 30 minutes, respectively, and then decrease to a steady-state level within an hour on prolonged light exposure, a process called photoadaptation. A second pulse of high intensity can again activate gene expression in the adapted state, elevating the levels to a second steady state [2, 232, 233]. As shown in phototropin-LOV2 domains, illumination of the LOV domain results in the formation of a covalent cysteinyl-flavin adduct between the LOV domain and FAD/FMN. The conversion of this light-induced adduct back to the dark state is a slow process in fungi, in contrast to the phototropins, where conversion occurs within seconds [97, 235, 236].

The expression of vvd is under the control of photoactive WCC, and it accumulates rapidly upon irradiation. VVD indirectly regulates the light input to the Neurospora clock by repressing the activity of the WCC. Studies show that VVD plays a role in modulating the photoadaptation state by sensing changes in light intensity [232]. Recent studies suggest that the competitive interaction of the two antagonistic photoreceptors (WCC and VVD) is the underlying molecular mechanism that leads to photoadaptation. VVD binds to the activated WCC, thus competing with the formation of the large WCC homodimer and, in turn, resulting in the accumulation of inactive WCC and attenuation of the transcriptional activity of the light-activated WCC [237]. Direct interaction of VVD with WCC prevents its degradation and stabilizes it through the slow cycle of conversion back to dark-state WCC [237, 238]. Therefore, the level of VVD helps to maintain a pool of photoactive and dark-state-inactive WCC in equilibrium. Perturbation by a light pulse of high intensity can again result in the photoactivation of the dark-state WCC, disturbing the equilibrium, until the transiently transcriptionally active WCC again drives the accumulation of more VVD to reach a second steady state. Thus, VVD plays a dual role of desensitizing the clock to moderate fluctuations in the light intensity while promoting light resetting in response to increases in the light intensity. VVD levels gradually decline during the night as a result of degradation, but enough protein is still present to suppress the activation of highly light-sensitive WCC by light of lower intensity (moonlight). Hence, the accumulated levels of VVD provide a memory of the previous daylight to prevent light resetting by ambiguous light exposures [2, 233, 234].
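The qualitative logic of photoadaptation described above (light activates the WCC, active WCC drives VVD accumulation, and VVD in turn inactivates the WCC) can be illustrated with a toy two-variable model. The sketch below is not a published or fitted model: the equations, all rate constants, the light intensities, and the time units are hypothetical choices made only to show how a negative feedback of this kind produces a transient peak followed by a lower steady state, and a second, higher steady state after a step increase in light.

```python
def simulate(light, t_end=200.0, dt=0.01, k_inh=10.0, k_syn=1.0, k_deg=0.1):
    """Forward-Euler integration of a toy WCC/VVD feedback model.

    a: fraction of light-activated WCC (0..1); v: VVD level (arbitrary units).
        da/dt = light(t) * (1 - a) - k_inh * v * a
        dv/dt = k_syn * a - k_deg * v
    All parameters are hypothetical and carry arbitrary units.
    """
    a, v = 0.0, 0.0
    trace = []
    for i in range(int(round(t_end / dt))):
        t = i * dt
        da = light(t) * (1.0 - a) - k_inh * v * a
        dv = k_syn * a - k_deg * v
        a += dt * da
        v += dt * dv
        trace.append((t + dt, a, v))
    return trace


def step_light(t):
    """Hypothetical light regime: moderate light, then a tenfold step at t = 100."""
    return 1.0 if t < 100.0 else 10.0


trace = simulate(step_light)
first = [a for t, a, _ in trace if t <= 100.0]
second = [a for t, a, _ in trace if t > 100.0]
print("first peak  %.3f, adapted level %.3f" % (max(first), first[-1]))
print("second peak %.3f, adapted level %.3f" % (max(second), second[-1]))
```

In this sketch, the slow turnover of VVD relative to WCC activation is what separates the early peak from the adapted level; the same slow variable plays the role of the "memory" of prior light exposure mentioned in the text.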

The LOV domain forms a subclass of the PAS domain superfamily; it mediates blue light-induced responses in bacteria, plants, and fungi [2]. In Neurospora, VVD and WC-1 are the two LOV domain-containing photoreceptors, and in Arabidopsis, the LOV-containing families include phototropins (phot 1 and phot 2) and the ZEITLUPE family (ZTL, LOV kelch Protein 2 (LKP2), and Flavin-binding Kelch F-box1 (FKF1)). They bind the flavin mononucleotide (FMN) chromophore [239].

The crystal structure of VVD-36 (36-residue N-terminal truncation for increased solubility and stability) that retains wild-type behavior studied in both dark- and light-adapted states explains the light-induced conformational changes that are important for VVD activity (refer to [106] for structure figures). The protein exists in the crystal as a symmetrical dimer formed via hydrophobic interactions at the N-terminal cap surface. The structure revealed a typical PAS domain [106] as seen in other PAS domain-containing proteins [240, 241]. Specific to the VVD-like LOV domain is an 11-residue loop between Eα and Fα that accommodates the FAD adenosine moiety exposed to the solvent, and an N-terminal cap (residues 37–70) comprised of helix aα and strand bβ [106]. On photoexcitation, the crystal structure of the light-adapted VVD-36 reveals the formation of a stable covalent cysteinyl-flavin adduct that leads to conformational changes at Gln182 and protonation of the flavin ring. Gln182 flips to overcome unfavorable interactions and, at the same time, maintain hydrogen bonding with the protonated N5 atom of the flavin ring. A rotation of Cys71 breaks its S-H…O hydrogen bond with the carbonyl of Asp68. This brings Cys71 to a more exposed position where it interacts with the peptide N atom of Asp68. The changes in Cys71 conformation shift bβ towards the PAS core and disrupt the interactions that stabilize the packing of the N-terminal cap against the PAS β-sheet. A Q182L mutant showed similar spectral properties on photoexcitation, but it was unable to switch from a compact to a fully expanded form. Conformational changes, which involved the N-terminal cap, were also absent in a C71S mutant, but the photochemical changes at the active center were unaffected. The crystal structure of a C71S mutant (PDB 2PD8) [242] revealed a stronger hydrogen bond formation between Ser71 and Asp68 than in the wild type, which might stiffen the fold, preventing movement.

The crystal structures reveal that light-induced changes in the flavin protonation state lead to conformational changes of N-cap, creating a new interface for dimerization in the light-state VVD. Observations from size-exclusion chromatography together with static and dynamic light scattering (SLS and DLS, respectively) studies show that the light-adapted VVD forms a rapidly exchanging dimer relative to the dark-state monomer, with an expanding hydrodynamic radius. Dimer formation was observed to be concentration-dependent. The increase in the hydrodynamic radius was observed to be highly dependent on the length of the N-terminus. Studies of the various light-state N-cap variants indicate that residues 39–42 are important for dimerization and contain a Pro-Gly-Gly signature that is highly conserved among the dimer-forming variants. The aa39–42 segment adopts two distinct conformations in the crystallographic C71V VVD dimer (PDB 3D72), with a 180° rotation about the Pro residue between the two subunits [242], highlighting the importance of the proline in projecting the N-terminus towards the other subunit. Thus, conformational changes at the hinge to the PAS core and the N-cap Pro-Gly-Gly sequence are critical for light-induced dimerization.
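The concentration dependence of dimer formation noted above follows from simple mass action. For a reversible monomer–dimer equilibrium, 2M ⇌ D, with dissociation constant Kd = [M]²/[D] and total protein concentration C = [M] + 2[D], the free monomer concentration is [M] = (−Kd + √(Kd² + 8·Kd·C))/4. The sketch below evaluates the fraction of protein in the dimeric form; the Kd value used is purely hypothetical, since the text does not report one for light-state VVD.

```python
import math


def dimer_fraction(total_conc, kd):
    """Fraction of protein molecules found in dimers for 2M <-> D.

    total_conc: total protein concentration in monomer units.
    kd: dimer dissociation constant, Kd = [M]^2 / [D] (same units).
    """
    if total_conc <= 0:
        return 0.0
    # Positive root of 2[M]^2 + Kd[M] - Kd*C = 0.
    monomer = (-kd + math.sqrt(kd * kd + 8.0 * kd * total_conc)) / 4.0
    return 1.0 - monomer / total_conc


# Hypothetical Kd of 10 (arbitrary concentration units, e.g. micromolar).
KD_ASSUMED = 10.0
for conc in (1.0, 10.0, 100.0, 1000.0):
    print(conc, round(dimer_fraction(conc, KD_ASSUMED), 3))
```

Under this simple scheme, exactly half of the protein is dimeric when the total concentration equals Kd, and the dimeric fraction rises monotonically with concentration, consistent with a rapidly exchanging, concentration-dependent dimer.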

Plants

Light input in plants is mediated by a suite of photoreceptors involved in photoentrainment: phytochromes (red/far-red light photoreceptors), cryptochromes (UV-A/blue light photoreceptors), UVR8 (UV receptor), and the blue light receptors ZEITLUPE (ZTL), FLAVIN-BINDING, KELCH REPEAT, F-BOX 1 (FKF1), and LOV KELCH PROTEIN 2 (LKP2) [243]. Arabidopsis contains five phytochromes (PHYA–E) [244, 245] and two classic cryptochromes (CRY1 and CRY2, which localize in the nucleus) [246, 247]. A third cryptochrome identified in Arabidopsis is called CRY3/Arabidopsis CRY-DASH and has sequence similarity to CRY-DASH from Synechocystis. It localizes in chloroplasts and mitochondria. CRY3 shows sequence-independent DNA binding, but its role in biological signaling remains unknown [248, 249]. ZTL, FKF1, and LKP2 are single LOV domain blue light receptors [250].

Recent studies have suggested several light-signaling pathways involved in clock entrainment, but they are not yet well understood. One such pathway is proposed by the PIF hypothesis [244]. PIFs negatively control light-mediated gene expression to regulate plant development. PIF3 interaction with the light-activated form (Pfr form) of phyB in the nucleus results in its phosphorylation and subsequent degradation, thus relieving the repressive function of PIF3. Also, PIF3 binds to the G-box promoter region of CCA1 and LHY [251] in vitro, suggesting that phyB can also interact with the bound PIF3. In the second signaling pathway, ZTL, FKF1, and LOV-KELCH proteins possibly interact with phyB/CRY1, thus affecting the phyB/CRY1 response to the clock [252]. Also, the F-box and Kelch repeat domains of ZTL/FKF1/LKP2 proteins play a role in the regulation of protein stability and mediate ubiquitin/proteasome-dependent degradation as a function of light [253]. Some of their targets include GI, TOC1, and PRR5 [189, 254, 255]. Another possible path consists of the PRR family of Arabidopsis, which shows light-dependent effects on clock period [199]. Moreover, proteins that do not possess a chromophore, including ELF4, ELF3, and TIME FOR COFFEE (TIC), are involved in gating light inputs to the clock. ELF3 negatively regulates the light input to the clock. Interaction between ELF3 and phyB results in ELF3's inhibitory function in the subjective night [199, 202, 206, 256].

Phytochromes are red/far-red light-sensing photochromic biliprotein photoreceptors that are involved in the regulation of various developmental processes. Of the five phytochromes found in plants, phyA and phyB are the best characterized [257]. PHYs have been implicated in the regulation of circadian rhythms in plants [257, 258], but their role in clock entrainment in other organisms has not been clearly defined [259]. Phytochromes share a common domain organization. The N-terminal photosensory core module consists of PAS, GAF (cGMP phosphodiesterase/adenylate cyclase/FhlA), and PHY (phytochrome-specific) domains. The PAS and GAF domains are connected by a figure-eight knot. The C-terminal transmitter module consists of an HLH dimerization/phosphor-acceptor domain and an ATPase catalytic domain, and transmits signals perceived by the N-terminal region to the signal transduction pathways. In addition, plant phytochromes contain a "Quail module" between the photosensory and transmitter modules. Fungal phytochromes have an N-terminal variable extension preceding the PAS domain and are not homologous to plant phytochromes. In cyanobacteriochromes, unlike other cyanobacterial phytochromes, the knot and the preceding PAS domain are absent altogether. The GAF domain is self-sufficient for photoperception [256, 260, 261].

The crystal structures of the PAS-GAF two-domain constructs of the bacteriophytochromes DrBphP from Deinococcus radiodurans [262] and RpBphP3 from Rhodopseudomonas palustris [263] lack the PHY domain, which is important for the light sensory function because changes in it prevent the conversion from the Pr state (the photochromic state absorbing maximally in the red region) to the Pfr state (maximal absorption in the far-red wavelength range). The crystal structures of the complete sensory module were solved for Synechocystis 6803 Cph1 (PDB 2VEA) in the Pr ground state [264] and for the bacteriophytochrome PaBphP (PDB 3C2W) in the unusual Pfr state [265]. Together, these structures show the sensory module to be an asymmetrical dumbbell of PAS-GAF and a smaller PHY fragment. The structure of 2VEA (Fig. 14b) reveals that the PHY domain is a member of the GAF family. It is connected to the PAS-GAF lobe by a long α9 helix. The PHY domain has an unusual tongue-like hairpin loop that contacts the PAS-GAF domain and seals the chromophore pocket. Unlike 2VEA, where the tongue makes intimate contact with the N-terminal helix α1, 3C2W exhibits a different structure for the N-terminal part and the tongue, where the biliverdin chromophore sits and remains exposed. In addition, two salt bridges are important for phytochrome function. One is formed between Arg472 of the tongue and Asp207 in the chromophore pocket. Arg254 forms the second bridge with ring B of the chromophore. In addition to this salt bridge, the main-chain O atom of Asp207 forms hydrogen bonds with the protonated N atoms of rings A, B, and C of the chromophore [261, 265]. This interaction seems to be important for the functioning of Pfr, as mutations disturbing the salt bridge affect Pfr function.

The structures of the Pr (2VEA) and Pfr (3C2W) states show that, on excitation, transition from Pr→Pfr leads to Z–E isomerization of the chromophore D ring, consistent with Pr–Pfr photochemistry. Differences were also seen in the position of several tyrosine residues around the D ring in 3C2W. These Tyr residues were shown to be important for photoconversion [261].

Based on the structures of the bacterial phytochromes and the Arabidopsis phytochrome mutants studied previously, Nagatani [266] detailed the structure–function relationship of plant phytochromes. The core light-signaling domain corresponds to the N-terminal moiety (N-terminal extension and PAS/GAF domains) rather than to the C-terminal region containing the histidine-kinase-related domain (HKRD), as was previously proposed, because an N-terminal phyB fragment lacking the PHY domain continued to transduce the light signal. Also, loss-of-function mutations (in the chromophore pocket and the PAS domain) and gain-of-function mutations (in the GAF domain) in phyA and phyB affected chromophore incorporation and phytochrome stability, respectively. The mutations affecting phyB–PIF interaction were largely found in the light-sensing knot and were identical to the signaling mutants, suggesting the involvement of the light-sensing knot region in the phyB–PIF interaction that initiates the downstream light signaling pathway. Additionally, mutations in the PHY domain that positively or negatively affected Pfr stability were mainly confined to the tongue region, defining the importance of this region in modulating phytochrome activity. Lastly, mutational analysis of the C-terminal region that comprises the HKRD suggested that it plays a role in protein–protein interaction and nuclear localization [266].

The LOV domain-containing ZTL/FKF1/LKP2 family is involved in the regulation of photoperiod-dependent flowering and the entrainment of the circadian clock [239]. The structure of the FKF1-LOV polypeptide, a distant relative of VVD, was studied using size-exclusion chromatography and SAXS. FKF1-LOV was observed to be a homodimer with an overall structure similar to that of phot1-LOV (phototropin-LOV domain). Although only small conformational changes were seen in the FKF1-LOV core on dark-to-light activation, interactions with other segments, such as the F-box and/or Kelch repeats, may amplify these changes to initiate a photoperiodic response [267].

The LOV domain in the ZTL/FKF1/LKP2 family undergoes photochemical cycles similar to phot-LOV domains in vitro [253, 268,269,270]. Upon blue light absorption by phot-LOV, the FMN chromophore in the LOV domain converts from the ground state to a singlet-excited state and further to a triplet-excited state that results in stable photo-adduct formation between FMN and a conserved Cys of the LOV domain. Reversion to the ground state is also rapid [271]. The slower adduct formation and dark recovery rates of the FKF1-LOV polypeptides [272, 273] were attributed to the additional nine-residue loop insertion between Eα near a conserved Cys and the Fα helix found in the ZEITLUPE family. A FKF1-LOV polypeptide lacking the loop insertion showed a faster recovery rate in the dark compared to the FKF1-LOV with the loop intact, where no conformational change was detected [272]. This could reflect the importance of the loop in conformational changes upon light excitation and light signal transduction. In phototropins, one of the two LOV domains (LOV1) is required for dimerization [274, 275], while LOV2 is solely involved in photoreceptor activity. The single LOV domain in FKF1-LOV forms stable dimers [267], suggesting that the LOV domains in the ZTL/FKF1/LKP2 family function both as photoreceptors for blue light signal transduction and mediators for protein–protein interactions [253]. Detailed crystallographic and spectroscopic studies of the light-activated full-length proteins and their complexes are necessary to understand these interactions and the functional mechanism of the LOV domains.
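The contrast between fast and slow dark recovery can be made concrete with a minimal sketch that treats thermal decay of the FMN–Cys adduct as a single first-order process. This is a simplifying assumption made only for illustration, and the two rate constants below are hypothetical placeholders rather than values taken from [271, 272, 273].

```python
import math


def adduct_fraction(t_s: float, k_per_s: float) -> float:
    """Fraction of LOV domains still in the adduct state after t_s seconds
    in the dark, assuming simple first-order decay (illustrative model)."""
    if t_s < 0 or k_per_s <= 0:
        raise ValueError("time must be non-negative and rate constant positive")
    return math.exp(-k_per_s * t_s)


def half_life_s(k_per_s: float) -> float:
    """Half-life of the adduct state for a first-order decay."""
    if k_per_s <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / k_per_s


if __name__ == "__main__":
    # Hypothetical rate constants, chosen only to contrast a fast-cycling
    # and a slow-cycling LOV domain; they are not measured values.
    for label, k in (("fast-cycling", 1e-2), ("slow-cycling", 1e-5)):
        print(f"{label}: half-life = {half_life_s(k):.0f} s, "
              f"adduct remaining after 1 h = {adduct_fraction(3600, k):.3f}")
```

Under this assumption, a slower recovery rate means that the signaling state persists much longer after a light pulse, which is one way to read the proposed role of the loop insertion in light signal transduction.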

Cryptochromes (CRYs) are flavoproteins that show overall structural similarity to DNA repair enzymes known as DNA photolyases [276]. They were first identified in Arabidopsis, where a CRY mutant showed abnormal growth and development in response to blue light [277]. In response to light, photolyases and cryptochromes use the same FAD cofactor to perform dissimilar functions; specifically, photolyases catalyze DNA repair, while CRYs tune the circadian clock in animals and control developmental processes in plants, such as photomorphogenesis and photoperiodic flowering [125, 278,279,280,281]. Cryptochromes can be classified into three subfamilies that include the two classic cryptochromes from plants and animals and a third cryptochrome subfamily called DASH (DASH for Drosophila, Arabidopsis, Synechocystis, Homo sapiens) [249], whose members are more closely related to photolyases than to the classic cryptochromes. They bind DNA, and their role in biological signaling remains unclear [247, 249].

Cryptochromes have 1) an N-terminal photolyase homology region (PHR) and 2) a variable C-terminal domain that contains the nuclear localization signal (absent in photolyase and CRY-DASH proteins and with no obvious sequence similarity to known protein domains). The PHR region can bind two different chromophores: FAD and pterin [125, 276, 281].

In the absence of any high-resolution structure for a CRY protein, the function of this blue-light receptor was long unclear. Although the structure of CRY-DASH is known from Synechocystis [249], it does not clearly explain its role as a photoreceptor [282]. The crystal structure (Fig. 16a) of the PHR region of CRY1 (CRY1-PHR) from Arabidopsis [282], solved using the DNA photolyase PHR (PDB 1DNP) from a bacterial species as a molecular replacement probe [283,284,285], led to elucidation of the differences between the structures of photolyases and CRY1 and clarified the structural basis for the function of these two proteins. CRY1-PHR consists of an N-terminal α/β domain and a C-terminal α domain. The α/β domain consists of five parallel β-strands surrounded by four α-helices and a 3₁₀-helix. The α domain is the FAD binding region and consists of fourteen α-helices and two 3₁₀-helices. The two domains are linked by a helical connector composed of 77 residues. FAD binds to CRY1-PHR in a U-shaped conformation and is buried deep in a cavity formed by the α domain [282]. In contrast to photolyases, which have a positively charged groove near the FAD cavity for docking of the dsDNA substrate [283], the CRY1-PHR structure reveals a negatively charged surface with a small positive charge near the FAD cavity (Fig. 16b), strongly suggesting the absence of DNA-binding activity. Trp277 and Trp324 in bacterial photolyases are important for thymine-dimer binding and DNA binding [283,284,285]. In CRY1-PHR, they are replaced by Leu296 and Tyr402. These differences, combined with a larger FAD cavity and a unique chemical environment in CRY1-PHR created by different amino acid residues and charge distribution [282], explain the different functions of the two proteins. Still, the mechanism of blue-light signaling by CRYs is not completely clear.
The CRY1-PHR structure lacks the C-terminal domain of the full-length CRY1 that is crucial in the interaction with proteins downstream in the blue-light signaling pathway [286, 287]. CRY1 and CRY2 regulate COP1, an E3 ubiquitin ligase, through direct interaction via the C-terminus. Also, expression of β-glucuronidase (GUS)-fused CCT1/CCT2 in Arabidopsis mediates a constitutive light response [286, 287]. However, a recent study has shown N-terminal domain (CNT1) constructs of Arabidopsis CRY1 to be functional and to mediate blue light-dependent inhibition of hypocotyl elongation even in the absence of CCT1 [288]. Another study has identified potential CNT1-interacting proteins: CIB1 (cryptochrome interacting basic helix-loop-helix1) and its homolog, HBI1 (HOMOLOG OF BEE2 INTERACTING WITH IBH 1) [289]. The two proteins promote hypocotyl elongation in Arabidopsis [290,291,292]. The study showed that HBI1 acts downstream of CRYs and that CRY1 interacts directly with HBI1 through its N-terminus in a blue light-dependent manner to regulate its transcriptional activity and hence hypocotyl elongation [289]. Previous studies have shown that the interaction of the CRY2 N-terminus with CIB1 regulates the transcriptional activity of CIB1 and floral initiation in Arabidopsis in a blue light-dependent manner [293]. These studies suggest new or alternative mechanisms of blue-light-mediated signaling for CRY1/2 that are independent of CCTs.

Fig. 16

a CRY1-PHR structure (PDB 1U3D) with helices in cyan, β-strands in red, FAD cofactor in yellow, and AMPPNP (ATP analogue) in green. b Electrostatic potential in CRY1-PHR and E. coli DNA photolyase (PDB 1DNP). Surface areas colored red and blue represent negative and positive electrostatic potential, respectively. c dCRY (PDB 4JZY) and d 6-4 dPL (PDB 3CVU). The C-terminal tail of dCRY (orange) replaces the DNA substrate in the DNA-binding cleft of dPL. The N-terminal α/β domain (blue) is connected to the C-terminal helical domain (yellow) through a linker (gray). FAD cofactor is in green. e Structural comparison of dCRY (blue; PDB 4JZY) with dCRY (beige; PDB 3TVS, initial structure; 4GU5, updated) [308, 309]. Significant changes are in the regulatory tail and adjacent loops. f Structural comparison of mCRY1 (pink; PDB 4K0R) with the dCRY (cyan; PDB 4JZY) regulatory tail and adjacent loops depicting the changes. The conserved Phe (Phe428dCRY and Phe405mCRY1) that facilitates C-terminal lid movement is depicted. g dCRY photoactivation mechanism: Trp342, Trp397, and Trp290 form the classic Trp e⁻ transfer cascade. Structural analysis suggests the involvement of the e⁻-rich sulfur loop (Met331 and Cys337), the tail connector loop (Cys523), and Cys416, which are in close proximity to the Trp cascade, in the gating of electrons via the cascade. h Comparison of the FAD binding pocket of dCRY (cyan) and mCRY1 (pink). Asp387mCRY1 occupies the binding pocket. The mCRY1 residues (His355 and Gln289), corresponding to His378 and Gln311 in dCRY, at the pocket entrance are rotated to "clash" with the FAD moiety. Gly250mCRY1 and His224mCRY1 superimpose on Ser265dCRY and Arg237dCRY, respectively. i Crystal structure of the complex (PDB 4I6J) between mCRY2 (yellow), Fbxl3 (orange), and Skp1 (green). The numbers 1, 8, and 12 display the position of the respective leucine rich repeats (LRR) present in Fbxl3.

Insects and mammals

Identification of the cryptochromes in plants subsequently led to their identification in Drosophila and mammals. Interestingly, studies have shown that cry genes, both in Drosophila and mammals, regulate the circadian clock in a light-dependent [123,124,125] and light-independent manner [126, 127]. An isolated cryb mutant [294] in Drosophila did not respond to brief light impulses under constant darkness, whereas overexpressing wild-type cry caused hypersensitivity to light-induced phase shifts [124]. Light signal transduction in Drosophila is mediated through light-dependent degradation of TIM. Light-activated CRY undergoes a conformational change that allows it to migrate to the nucleus, where it binds to the dPER–dTIM complex, thus inhibiting its repressive action [295]. This block by dCRY leads to phosphorylation of the complex and its subsequent degradation by the ubiquitin-proteasome pathway [296]. However, flies lacking CRY could still be synchronized, suggesting the presence of other photoreceptors. Light input to the Drosophila clock can also occur via external photoreceptors, namely the compound eyes and the Hofbauer-Buchner eyelets behind them, in which rhodopsin is the main photoreceptor [297,298,299,300]. CRY-mediated input signals occur through lateral neurons and dorsal neurons in the brain, which function as internal photoreceptors [301]. In the case of external photoreceptors, the downstream signaling pathway that leads to TIM degradation is not clear. However, lack of both external and internal photoreceptors completely abolished photoentrainment in Drosophila [302].

The C-terminal extensions that are characteristic of CRYs in the Cryptochrome/Photolyase family gained considerable attention owing to their crucial role in various cryptochrome functions (reviewed in [125, 247, 281]). Despite the high similarity of the PHR regions among the CRYs in a given kingdom, the C-terminal extensions are variable in sequence, as well as in size. In plants, the C-terminal extension has three conserved motifs that are collectively referred to as DAS motifs and are comprised of DQXVP in the N-terminal end of the C-terminal extension, a region made up of acidic residues (E or D) and a STAES region followed by GGXVP at the C-terminal end of the extension [246]. A nuclear-localization domain is present in the C-terminal domain of plants and is required for function. In animals, the cryptochromes have been categorized into two types: one that acts as circadian photoreceptors (in insects) and another that acts as light-independent transcriptional repressors that function as integral components of the circadian clock (in vertebrates). Their functional diversity is attributed to the C-terminal extension. Various genetic and biochemical studies have reflected the importance of the C-terminal extension in subcellular localization, protein–protein interaction, and cryptochrome degradation via a proteasome-dependent pathway. The C-terminal extension is sufficient for nucleocytoplasmic trafficking of CRYs. Reports on Arabidopsis and Drosophila cryptochromes showed that the presence of both the PHR domain and C-terminal extension is essential to cryptochrome-mediated functions. However, like a functional N-terminal domain of Arabidopsis CRYs independent of the CCTs, studies on N-terminal domain constructs lacking the C-terminal domain of Drosophila CRY demonstrate it to be functional. 
A Drosophila cry mutant allele (crym) expressing only the N-terminal CRY domain was observed to be capable of light detection and phototransduction independent of the C-terminus [303]. Also, transgenic Drosophila lines overexpressing CRYΔ lacking the C-terminus resulted in a constitutively active form that did not degrade [304]. CRYs undergo a blue light-dependent conformational change, making the C-terminal extension available for protein–protein interaction with downstream signaling partners, subsequently leading to CRY/CRY-mediated degradation.

Studies report direct interaction between CRY and COP1/phyB/ZTL/LKP1/ADO1 in plants, and mPER in animals, mediated via the C-terminus. Studies of chimeric proteins made by fusion of Arabidopsis (6-4) photolyase-PHR-CRY1-CCT domains showed that the features of both domains are obligatory for the repressive action of the CRY protein. The C-terminus is not sufficient to mediate the transcriptional repressor function [125, 247, 281]. In Drosophila, the C-terminal extension has been shown to be critical to the role of dCRY as a magnetoreceptor [305, 306]. Many organisms have a magnetosensing ability, utilizing the Earth’s magnetic field for navigation and orientation [247]. Lack of the dCRY C-terminus disrupts the electromagnetic field-sensing ability of CRY, thus affecting the negative geotaxis ability of Drosophila [305, 306]. The Drosophila clock showed increasingly slow rhythms in response to an applied magnetic field in the presence of blue light. The magnetosensitivity was also affected by the field strength. cry mutants with an impaired FAD or mutants lacking cry were observed to be unresponsive to the applied magnetic field. Drosophila clock neurons overexpressing CRYs showed robust sensitivity to an applied field [306, 307].

Structural studies on the animal cryptochromes contributed immensely to the understanding of their function. Structures have been solved for both full-length and truncated CRYs (Drosophila and mammalian) and show overall similarities. There are, however, significant differences, and these are implicated in defining their diverse functions [308,309,310,311]. A full-length dCRY structure (PDB 3TVS) by Zoltowski et al. [308] includes the variable C-terminal tail (CTT) attached to the photolyase homology region. The dCRY structure, excluding the intact C-terminal domain, resembles (6-4) photolyases, with significant differences in the loop structures, antenna cofactor-binding site, FAD center, and C-terminal extension connecting to the CTT. The CTT mimics the DNA substrates of photolyases [308]. This structure of dCRY was subsequently improved (PDB 4GU5) [309], and another structure (PDB 4JZY) was reported by Czarna et al. [310] (Fig. 16c, d); together, these showed that the regulatory CTT and the adjacent loops are functionally important regions (Fig. 16e). As a result, it now appears that the conserved Phe534 is the residue that extends into the CRY catalytic center, mimicking the 6-4 DNA photolesions. Together, it was shown that the CTT is surrounded by the protrusion loop, the phosphate binding loop, the loop between α5 and α6, the C-terminal lid, and the electron-rich sulfur loop [310].

The structure of animal CRY did not reveal any cofactor other than FAD. In CRYs, flavin can exist in two forms: the oxidized form (FADox) or the anionic semiquinone (FAD•⁻). During photoactivation, dCRY changes to the FAD•⁻ form, while photolyases can form the neutral semiquinone (FADH•). Unlike photolyases, where an Asn residue can only interact with the protonated N5 atom, the corresponding Cys416 residue of dCRY readily forms a hydrogen bond with unprotonated N5 and O4 of FAD, thus stabilizing the negative charge and preventing further activation to FADH⁻, which is the form required for DNA repair in photolyases [308]. Structural analysis and the mutational studies of dCRY have defined the tail regions as important for the FAD photoreaction and phototransduction to the tail (Fig. 16g). The residues in the electron-rich sulfur loop (Met331 and Cys337) and Cys523 in the tail connector loop, owing to their close proximity to the classic tryptophan electron transport cascade (formed by Trp420, Trp397, and Trp342), influence the FAD photoreaction, play an important role in determining the lifetime of FAD•⁻ formation and decay, and regulate the dynamics of the light-induced tail opening and closing. Additionally, Phe534, Glu530 (tail helix), and Ser526 (connector loop) stabilize the tail interaction with the PHR in the dark-adapted state [310]. These are important structural features that determine why these CRYs now lack photolyase activity.

The structure of the apo-form of mCRY1 by Czarna et al. [310] shows an overall fold similar to dCRY and (6-4) photolyase. Differences are observed in the extended loop between the α6 and α8 helices, which was found to be partially disordered and structurally different from that in dCRY. Conformational differences (Fig. 16f) are also observed in the protrusion loop (seven residues shorter in mCRY1 and containing Ser280, the AMPK phosphorylation site), the phosphate-binding loop (structurally different from that in dCRY and partly disordered), and the C-terminal lid, which was unstructured. The lid forms the wall between the FAD binding pocket, the predicted coiled-coil helix α22, and the sulfur loop [310]. Helix α11 (Tyr287/Gly288), following the protrusion loop, has previously been shown to be important for the repressive action of mCRY1 [312, 313]. Comparison of dCRY and apo mCRY1 did not show drastic changes in the FAD binding pocket (Fig. 16h). Highly conserved Asp and Arg residues form a salt bridge that contributes to the stability of FAD binding in all the known CRYs/photolyases, and these residues are moved inside the FAD binding pocket in mCRY1 [310].

Structural analysis [310] of the previously reported mutational studies [314] indicates that the C-terminal α22 helix and the C-terminal mCRY1 tail are essential for the transcriptional repression function of mCRY1 and for the interaction with other core clock proteins. Analysis of various mutants of mCRY1 with mutations in the C-terminal lid (F504A), the predicted coiled-coil α22 helix (K485D/E, G336D), charged surfaces, the FAD-binding pocket (H355E, H224E), and the phosphate binding loop (S247D) suggested that even though common regions are utilized in the interaction with mPER2, FBXL3, and mCLOCK/BMAL1, the mode of interaction differs. The mCRY1 α22 helix and the acidic region play an essential role in the transcriptional repression function as well as in protein–protein interactions. In addition, the C-terminal lid and the basic and acidic surface regions near the FAD binding pocket regulate mPER2, FBXL3, and mCLOCK/BMAL1 binding [310].

The crystal structure of the apo form of mCRY2-PHR [311] showed a photolyase-like fold. The crystal structure of the FAD-bound mCRY2-PHR showed that it retains FAD binding activity (Kd ~ 40 μM). FAD adopted a U-shaped conformation similar to that observed in other CRYs and photolyases. However, it was reported that the FAD moiety is only partially buried in the binding pocket [311]. In (6-4) photolyases, FAD was found to be deep inside the pocket, hidden under a well-ordered phosphate binding loop and a nearby protrusion motif. The central lysine of the phosphate binding loop forms a hydrogen bond with the N7 atom of the adenine on the cofactor [282, 312]. In contrast, in FAD-bound mCRY2-PHR the phosphate binding loop is completely disordered and the protrusion motif is moved away from FAD [311].
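To give a feel for what a dissociation constant of about 40 μM implies, the following minimal sketch computes the fraction of mCRY2-PHR that would carry FAD at a given free FAD concentration, assuming simple one-site equilibrium binding. Only the Kd value comes from the text [311]; the function name and the example concentrations are illustrative assumptions, not measured cellular values.

```python
def fraction_bound(free_ligand_um: float, kd_um: float = 40.0) -> float:
    """Fractional occupancy for one-site equilibrium binding.

    theta = [L] / (Kd + [L]), with both concentrations in micromolar.
    The default Kd is the ~40 uM reported for FAD binding to mCRY2-PHR.
    """
    if free_ligand_um < 0 or kd_um <= 0:
        raise ValueError("ligand must be non-negative and Kd positive")
    return free_ligand_um / (kd_um + free_ligand_um)


if __name__ == "__main__":
    # Hypothetical free FAD concentrations (uM), chosen only for illustration.
    for conc in (1.0, 10.0, 40.0, 400.0):
        print(f"[FAD] = {conc:6.1f} uM -> occupancy {fraction_bound(conc):.2f}")
```

In this model, half of the protein is occupied when the free ligand concentration equals Kd, so substantial occupancy would require free FAD in the tens of micromolar range.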

mCRY2 with an intact CCT was found to form a stable complex with Fbxl3–Skp1 [311]. A crystal structure determined for the heterotrimeric complex mCRY2–Fbxl3–Skp1 (Fig. 16i) shows a globular mCRY2 fitted on a cup-shaped Fbxl3–Skp1 complex via its α-helical domain. Fbxl3 consists of a three-helix F-box motif, a C-terminal leucine rich repeat (LRR) domain that contains 12 LRRs, followed by a 12 amino acid residue-long C-terminal tail (conserved in vertebrates) that ends with a Trp. The LRR domain forms a semicircular solenoid structure with parallel β-strands on its concave surface and α-helical coils on its convex surface. Based on the structural irregularities of LRR7 and 8 (longer β-strands compared to the others), the LRR domain could be divided into two halves: LRR-N (1–6) and LRR-C (7–12). The mCRY2–Fbxl3 interaction interface analysis showed a closer contact between Fbxl3 LRR-C and the mCRY2 α-helical domain. The Fbxl3 C-terminal tail caps the solenoid structure and enters into the α-helical domain of mCRY2. The terminal Trp occupies the core of the FAD-binding pocket, similar to the (6-4) DNA lesion in the d(6-4)photolyase–DNA complex structure. The interface was observed to be highly hydrophobic and revealed a large surface adjacent to the cofactor binding pocket on mCRY2. This surface is formed by three structural motifs: the interface loop, the C-terminal helix, and the 11 amino acid-long conserved segment (CSS) preceding the C-terminal tail. Binding activity analysis of various Fbxl3 and mCRY2 mutants showed that complex formation is significantly affected by mutations in the Fbxl3 tail and the mCRY2 cofactor pocket [311].

The phosphorylation sites at Ser71 and Ser280 alter mCRY stability [315] and thus its binding affinity to its protein partners by restructuring the local environment. The addition of free FAD disrupted the Fbxl3–mCRY2 complex, suggesting an antagonistic role for FAD in regulating the Fbxl3–mCRY2 interaction [311]. The C-terminal helix of mCRY2 is essential for PER binding [247] and is masked by the LRR domain in the mCRY2–Fbxl3–Skp1 complex [311]. Together, these findings suggest that PER abundance and the metabolic state inside the cell regulate CRY stability and ultimately clock rhythmicity. Such knowledge can guide the design of compounds that influence CRY stability, which has been proposed as a strategy for treating metabolic anomalies [316,317,318].

Light input in mammals occurs via the eyes and reaches the retina, from which signals for clock entrainment are sent to the pacemaker SCN. Circadian rhythms can be entrained in mice lacking classic visual photoreceptors (rods and cones), but not in enucleated mice, suggesting that nonvisual photoreceptors could play a role in photoentrainment of the mammalian circadian clock [319, 320]. Studies showed that a subset of intrinsically photosensitive retinal ganglion cells (ipRGCs) located in the inner nuclear layer of the retina are responsible for circadian light resetting. The ipRGCs form a retinohypothalamic tract (RHT) that projects into the pacemaker SCN. Lesion of the RHT abolished circadian responses to light [319, 320].

Melanopsin (Opn4), a new opsin molecule that has emerged over the past decade as a potential photoreceptor for photoentrainment, is enriched in the ipRGCs [321, 322]. Mice lacking melanopsin (Opn4-/-) showed less sensitivity to brief light perturbations under DD [323]. However, the phase and period responses in the Opn4-/- mice were not completely absent, indicating the involvement of other photoreceptors in the entrainment process. mCRY1 and mCRY2 are found in the inner layer of the retina [313]. Also, hCRY1 expressed in living Sf21 insect cells showed photoconversion similar to that observed in plant and Drosophila cryptochromes upon light irradiation, suggesting a possible role as photoreceptors in mammals [324, 325]. However, the role of mammalian cryptochromes in photoreception is complicated by the fact that they are a crucial part of the core oscillator machinery. Gene knockout results in an arrhythmic clock, thus making it difficult to assay their role as photoreceptors [126, 127]. Work by Dkhissi-Benyahya et al. [326] demonstrated that with changing light intensity, mammals recruit multiple photoreceptor systems to entrain the clock in a wavelength-dependent manner. They discovered the role of medium wavelength opsin (MW-opsin, located in the outer retina) in photoentrainment, in addition to melanopsin [326]. Thus, light entrainment in mammals resembles that in other organisms, such as insects and plants, where the existence of multiple photoreceptors helps the organism adapt to the diurnal changes in light intensity and wavelength to synchronize the circadian rhythms. Several downstream light signaling pathways have been described for transmitting light to the circadian clock [321, 322]. Glutamate and the pituitary adenylate cyclase-activating polypeptide (PACAP) are the key putative neurotransmitters of the RHT; they are responsible for signal transduction to the SCN, which ultimately drives the induction of the Per genes [319, 320].
In addition to RHT, other neuronal inputs to the SCN have been identified. However, that is beyond the scope of this review.

Summary

An exciting chapter of circadian clock research, which is focused on structural aspects, has brought with it new challenges. Whereas the structural aspects of the circadian clockwork in prokaryotes are relatively well studied, the picture regarding eukaryotic CCs is fragmentary, trivial, and far from complete. Much remains to be done. A targeted protein complex, which is a structural feature common to all the clocks, has recently gained center-stage in bench science. Multimeric protein complex formations have been shown to be important for the regulation of several core oscillators. We know that the proteins contain identical conserved domains with their typical folds. However, structural analysis of the CLOCK–BMAL1 complex and the PERIOD homodimers suggests that the dynamics of the assembly and disassembly of hetero-multimeric protein complexes depend on the differential spatial arrangement of the domains. Additionally, the CLOCK/BMAL1 proteins show potential for a differential electrostatic surface that endows the complex with asymmetry, indicating that differential surface potential might be responsible for the disparity in their interaction with PER/CRY and, hence, for distinct functions.

Sequential phosphorylation is another feature that influences protein–protein interactions in circadian clocks. The dynamics of the cyanobacterial KaiC phosphorylation cycle have been observed to be driven by regulated cycles of interaction with KaiA and KaiB that trigger the enzymatic switch in KaiC. However, both the precise time point for the switch and an understanding of how the information relayed between the phosphorylation/dephosphorylation event and the physical protein–protein interaction triggers the switch are issues that remain to be elucidated. Sequential phosphorylation has also been observed in the eukaryotic clock. Protein–protein and/or protein–DNA interactions coupled with progressive phosphorylation and dephosphorylation events have been shown to be important for stability, subcellular distribution, and the function of the core-clock components [4, 48, 51, 150, 165]. PER-mediated inhibition of dCLK/dCYC activity involves association with DOUBLETIME (DBT), a kinase. DBT phosphorylates CLK, resulting in its inhibition and degradation [327]. Similarly, in Neurospora, FRQ interaction with FRH and kinases results in WCC phosphorylation, thus repressing its activity [97, 104]. CCA1 and TOC1 function and stability are also subject to phosphorylation regulation [165, 328]. However, it is not clear which event, phosphorylation or oligomerization, occurs first such that nuclear accumulation and activity result. Phosphorylation of the Drosophila CLK protein is not only sequential, but is also compartment-specific. Although phosphorylation of FRQ is crucial for its transcriptional repression activity, Cha et al. [51] showed that it is not important for the regulation of the cellular distribution of FRQ. Future structural studies of these proteins individually and in complex assemblies will provide the mechanistic details with which to understand the dynamics of these events.

The dynamics of phosphorylation and dephosphorylation are also important for the transmission of external environmental cues and for resetting the clock. A light-dependent conformational change of the photoreceptors directs a downstream cascade of phosphorylation and protein–protein interactions that defines the period length and the phase shifts. Another interesting mechanism of clock resetting has been observed in the cyanobacterial clock, where the metabolic state of the cell entrains the clock in a light-dependent manner. Circadian metabolic rhythms are also observed in higher organisms [329]. Feeding can entrain the circadian clock in rat liver independent of synchronization with the SCN or light cycle [330]. The nutritional status of the organism drives adenosine monophosphate-activated protein kinase-mediated phosphorylation of cryptochromes and entrains the peripheral clocks [331]. However, the mechanism of entrainment is not clear. Structural analysis of the CRY proteins depicts how phosphorylation and the metabolic state of the cell direct its interaction with different protein partners that regulate CRY stability and function. The extended overlapping binding interface for PER and Fbxl3 prevents them from interacting simultaneously. Interaction of Fbxl3 with CRY requires the binding of the Fbxl3 tail to the FAD binding pocket in CRY. One small molecule (KL001; a carbazole derivative) can modulate circadian period by interacting directly with CRY at its FAD binding pocket and protecting CRY from SCF(Fbxl3)-mediated ubiquitination. The crystal structure of the mCRY2 PHR–KL001 complex shows that the compound is buried deep in the pocket and mimics the cofactor [332].
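The idea that a pocket-binding ligand (free FAD or a small molecule) protects CRY by displacing the Fbxl3 tail can be sketched with a textbook competitive single-site binding model. This is an illustrative assumption, not the analysis used in the cited studies, and every concentration and dissociation constant in the example is a hypothetical placeholder.

```python
def fbxl3_occupancy(
    fbxl3_um: float,
    kd_fbxl3_um: float,
    competitor_um: float = 0.0,
    kd_competitor_um: float = 1.0,
) -> float:
    """Fraction of CRY whose FAD pocket is occupied by the Fbxl3 tail.

    Assumes one shared site, mutually exclusive binding, and free ligand
    concentrations (i.e. both ligands in excess over CRY):
        theta = a / (1 + a + b),  a = [Fbxl3]/Kd_F,  b = [competitor]/Kd_C
    """
    if min(fbxl3_um, competitor_um) < 0:
        raise ValueError("concentrations must be non-negative")
    if kd_fbxl3_um <= 0 or kd_competitor_um <= 0:
        raise ValueError("dissociation constants must be positive")
    a = fbxl3_um / kd_fbxl3_um
    b = competitor_um / kd_competitor_um
    return a / (1.0 + a + b)


if __name__ == "__main__":
    # All numbers are hypothetical and serve only to show the trend.
    for competitor in (0.0, 1.0, 10.0, 100.0):
        theta = fbxl3_occupancy(1.0, 1.0, competitor, 1.0)
        print(f"competitor = {competitor:5.1f} uM -> Fbxl3-bound CRY {theta:.2f}")
```

In this simplified picture, raising the competitor concentration lowers the Fbxl3-bound fraction, which is qualitatively in line with the reported disruption of the Fbxl3–mCRY2 complex by free FAD.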

The cyanobacterial CC is an enzymatic clock wherein KaiC, central to the clock, exhibits all the enzymatic activities. The eukaryotic circadian system is, instead, a complex network of transcription factors, regulatory proteins, kinases, and phosphatases. The common elements in the CC systems in different kingdoms of life are fairly well known. However, notwithstanding the coarse models we have, enough differences have been brought about by the different evolutionary paths and different environmental adaptations to justify detailed studies of CCs in different organisms. From this perspective, the efforts invested by us and others, especially with regard to the structural dissection of the circadian systems, are timely and well placed.