Introduction

Chimeric antigen receptor (CAR) T cell therapy, in which a patient’s own T lymphocytes are engineered to recognize and kill cancer cells, has achieved remarkable success in some hematological malignancies in preclinical and clinical trials, resulting in six FDA-approved CAR-T products currently available in the market [1,2,3,4,5,6] (Table 1) (Fig. 1). CARs are synthetic immune receptors that connect a single-chain variable fragment (scFv), derived from a monoclonal antibody, to T cell signaling domains to eradicate tumor cells independent of the major histocompatibility complex (MHC). Despite the impressive rate of complete remission (CR) in patients with certain B-cell malignancies [7,8,9], there are still some concerns about treatment failure associated with the low efficacy of CAR-T cells [10,11,12,13]. In the current review, we will discuss how the molecular components of CAR construct and elements of lentiviral vector backbone plasmid transferring CAR expression cassette can contribute to CAR-T cell therapy success or failure.

Table 1 FDA-approved CAR-T products
Fig. 1

A schematic picture of the structural elements of six FDA-approved CAR-T products. Tisagenlecleucel (a), Axicabtagene ciloleucel (b), Brexucabtagene autoleucel (c), Lisocabtagene maraleucel (d), Idecabtagene vicleucel (e), Ciltacabtagene autoleucel (f)

Structural elements contribute to CAR’s potency

Once equipped with a CAR, T cells, known as CAR-T cells, act as living drugs and recognize and eliminate the target tumor cells. The conventional CAR structure consists of three modular components: the ectodomain, the transmembrane domain, and the endodomain, each of which has specific components and functions and thus the potential to be optimized.

Ectodomain

The ectodomain is the domain of a membrane protein that is outside the cytoplasm and exposed to the extracellular space. The ectodomain in the case of CAR consists of the antigen recognition region and hinge domain.

Antigen recognition domain

The predominant antigen-recognition domain in CARs is the single-chain variable fragment (scFv), in which the variable fragments of a conventional monoclonal antibody, commonly of the IgG type, are connected by a short peptide linker or disulfide bond [14]. Four crucial features of the scFv that may affect the clinical outcomes of CAR-T cell therapy are affinity, immunogenicity, specificity, and structure. CAR binding affinity, along with the CAR expression level, determines the antigen-binding characteristics of the CAR and the efficacy of target cell recognition [15]. The CAR affinity must be high enough to recognize the target antigen [16] but not so high that it triggers on-target off-tumor toxicities [17]. It can be fine-tuned according to the target antigen density on tumor cells [18, 19]. Constructing CARs with an affinity that discriminates between malignant and normal cells without causing toxicity is therefore crucial. Several studies have demonstrated that CARs with reduced affinity can efficiently distinguish tumors from normal tissues that express the same antigen at lower levels while maintaining potent antitumor activity and prolonged persistence [18,19,20,21,22].

Most CARs developed and tested in clinical studies have utilized scFvs, usually derived from murine monoclonal antibodies [23, 24]. Both humoral and cellular responses triggered by the murine-derived scFvs included in the CAR structure may lead to quick clearance of the CAR-T cells from circulation and thus increase the risk of relapse [25, 26]. Therefore, it seems that developing humanized or fully human scFvs, which are likely to be less immunogenic, would avert the anti-CAR responses and, consequently, circumvent treatment failure [27]. However, these scFvs may still have non-self-sequences since these variable fragments are usually generated through multiple recombination events and somatic hypermutation or fused at junctions that do not typically exist [25]. Lymphodepleting chemotherapy treatment before CAR-T cell infusion has a beneficial effect on reducing anti-CAR-T immune responses [28, 29].

An ideal target antigen must be expressed with high specificity and coverage in tumor cells. Most antigens recognized by CARs are not tumor-specific (TSA), restricted to the tumor cells, but are tumor-associated (TAA), expressed on the surface of normal tissues as well, albeit at a low level. Targeting a TAA, in most cases, leads to unwanted on-target off-tumor toxicities, such as B cell aplasia, resulting from a direct attack on healthy tissues having a shared expression of the targeted antigen [30,31,32]. Therefore, finding a target with high specificity to tumor cells requires researchers to implement a comprehensive assessment. Tumor antigen heterogeneity, observed predominantly in solid tumors such as malignant mesothelioma (MM), glioblastoma multiforme (GBM), and so forth, is also one of the main impediments restricting the efficacy of monovalent immunotherapeutic strategies directed against only one particular antigen [33]. Therefore, efforts to develop CAR-T cell immunotherapy must confront this high diversity of potential target antigen expression; otherwise, treatment failure or tumor recurrence may occur [33]. Designing CARs with two scFvs in which two corresponding scFvs target two different antigens, such as tandem CARs (TanCAR), dual CARs, loop CARs, AND-gate CARs (synNotch-CAR), and inhibitory CARs (iCARs), is a common strategy to improve the specificity of CARs [30, 34]. In the TanCAR concept, also referred to as OR-gate CAR, two different scFvs are connected outside the cell (in series), usually by a glycine-serine linker [35, 36]. The TanCAR can be activated when any one of the scFvs binds to a target antigen. When two scFvs simultaneously bind to their respective target antigens, the TanCAR will be activated and produce synergistic effects, which results in further activation of CAR-T cells and boosting their tumor-killing ability [35,36,37]. 
Dual CAR-T cells express two CARs in the same T cell, with each CAR having its own signaling function and a distinct extracellular antigen recognition domain [38,39,40]. Wang et al. showed that dual CAR-T cells targeting IL-23 and PSMA secreted more cytokines in vitro and functioned significantly better in mouse models of prostate cancer compared to TanCAR-T cells expressing the same scFvs in a single CAR [37]. Like TanCAR-T cells, loop CAR-T cells carry two scFvs in a single CAR molecule. In TanCARs, the VL-VH of one scFv is directly linked to the VL-VH of the other scFv, whereas the loop structure is formed with the VL-VH of one scFv separated by the VL-VH of the other scFv [41]. Loop CAR-T cells have been shown to be more effective than TanCAR-T cells in eradicating tumor cells and prolonging survival in xenograft models [42]. Loop CAR-T cells targeting CD19 and CD22 showed promising results in a phase II clinical trial (NCT03196830) of patients with relapsed/refractory (R/R) non-Hodgkin lymphoma (NHL) [43] and in a phase I clinical trial (NCT03233854) of adults with R/R acute lymphoblastic leukemia (ALL) and large B-cell lymphoma (LBCL) [44]. A schematic picture of TanCAR, dual CAR, and loop CAR is shown in Fig. 2.

Fig. 2

A schematic picture of TanCAR (a), DualCAR (b), and Loop CAR (c). In TanCAR, two different scFvs are connected outside the cell (in series), usually by a glycine-serine linker. TanCAR can be activated when any one of the scFvs binds to a target antigen. Dual CAR-T cells refer to the expression of two CARs in the same T cell, with each CAR having its own signaling function and distinct extracellular antigen recognition domains. Like TanCAR-T cells, loop CAR-T cells consist of two scFvs in a single CAR molecule. In TanCARs, the VL-VH of one scFv is directly linked to the VL-VH of the other scFv, whereas the loop structure is formed with the VL-VH of one scFv separated by the VL-VH of the other scFv

In contrast to OR-gate CARs, AND-gate CARs, with synthetic Notch (synNotch) receptors [45] as the core element, require T cells to sense two antigens to activate (Fig. 3a). When the synNotch receptor recognizes a target antigen by its extracellular recognition domain, the transcriptional activator domain of the receptor is released, which can then enter the nucleus and drive the expression of a CAR for a second antigen. These combinatorially gated T cells have shown a remarkable degree of therapeutic discrimination both in vitro and in vivo [46]. Like dual CAR-T cells, iCAR-T cells express two CARs on the same T cell: a typical tumor-antigen-specific CAR and an iCAR [47]. The iCAR consists of an scFv specific to an antigen expressed exclusively on normal tissue and an inhibitory signaling domain derived from an immunoinhibitory receptor (programmed cell death protein-1 (PD-1) or cytotoxic T-lymphocyte-associated antigen 4 (CTLA-4)) to restrict T cell activity despite concurrent engagement of activating receptors, allowing T cells to distinguish target cells from off-target cells [47, 48] (Fig. 3b).

Fig. 3

A schematic picture of synNotch-CAR (a) and iCAR (b). SynNotch-CARs require T cells to sense two antigens to activate. SynNotch receptors are engineered to sense a target antigen on the surface of tumor cells and induce the expression of a CAR specific to a second tumor antigen. iCAR-T cells express two CARs on the same T cells, including a typical tumor-antigen-specific CAR and an iCAR. Recognizing a target antigen on the surface of a normal cell by iCAR leads to the inhibition of the second CAR. iCAR, Inhibitory CAR; TAA, Tumor-Associated Antigen; A and B, Target Antigens
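The gating behaviors described above for TanCAR (OR-gate), synNotch-CAR (AND-gate), and iCAR can be summarized as Boolean logic. The following is only a minimal illustrative sketch: the function names are hypothetical, and real T cell activation is graded and depends on affinity, antigen density, and timing rather than on true/false inputs.

```python
# Illustrative Boolean abstraction of CAR gating strategies.
# All function names are hypothetical; inputs are simplified to
# "antigen present on the encountered cell" (True/False).

def or_gate_tancar(antigen_a: bool, antigen_b: bool) -> bool:
    """TanCAR: binding of either scFv to its antigen is enough to activate."""
    return antigen_a or antigen_b


def and_gate_synnotch(priming_antigen: bool, car_antigen: bool) -> bool:
    """synNotch-CAR: the synNotch receptor must sense the first antigen to
    induce CAR expression; the CAR must then bind the second antigen."""
    car_expressed = priming_antigen
    return car_expressed and car_antigen


def icar_gate(tumor_antigen: bool, normal_tissue_antigen: bool) -> bool:
    """iCAR: the activating CAR signal is suppressed when the inhibitory CAR
    engages an antigen found only on normal tissue."""
    return tumor_antigen and not normal_tissue_antigen


# Print a small truth table for the three strategies.
for a in (False, True):
    for b in (False, True):
        print(a, b, or_gate_tancar(a, b), and_gate_synnotch(a, b), icar_gate(a, b))
```

In this abstraction, the OR-gate widens coverage against heterogeneous antigen expression, while the AND-gate and iCAR designs narrow activation to improve discrimination between tumor and normal cells.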

ScFvs have a high propensity for self-aggregation, resulting from their structure, which leads to ligand-independent constitutive signaling, known as tonic signaling [49]. This tonic activation can induce early exhaustion of CAR-T cells and, consequently, limit their antitumor efficacy [50]. Amino acid point mutations or substitutions can partially correct the tonic signaling of CARs caused by their scFv [50, 51]. Although scFvs are currently the most often used antigen-recognition domain in CARs, some associated drawbacks, such as immunogenicity or their tendency for aggregation, may pose potential risks and challenges in their applications. Therefore, alternative antigen-binding domains would be beneficial [52]. The nanobody, also known as the VHH domain, is the variable domain of the heavy-chain-only antibodies (HcAb) naturally found in sera of camelids [53]. The VHH domain is the smallest fragment with antigen-binding capability, comparable to conventional antibodies in affinity and specificity [54]. Properties such as small size, high solubility and stability, low immunogenicity, high tissue penetration, and no need for additional folding and assembly steps or linker optimization, owing to the lack of a variable light chain, make nanobodies a promising alternative to scFvs in CARs [54]. Using VHHs as antigen recognition domains for CAR-T cells appears to be more favorable than using scFvs, particularly for solid tumors, as they can access epitopes that are hard or impossible for scFvs to reach [55,56,57]. The first report about the successful use of nanobodies in CAR constructs emerged from our lab, where CAR-modified T cells used an anti-MUC1 VHH as the target-binding domain [55]. Anti-MUC1 CAR-T cells showed increased proliferation and IL-2 secretion upon stimulation and could effectively kill MUC1-positive tumor cell lines [55]. For building more complex CARs, nanobodies are more favorable than scFvs due to their compact size and lack of a VL chain.
Furthermore, the potential cross-pairing of VH and VL, commonly observed among two independent scFv molecules but not among nanobodies, may result in the affinity loss of these scFv-based CARs [58]. Interestingly, the sixth CAR-T product, recently approved by the US FDA for medical use in patients with R/R multiple myeloma (MM), utilizes two VHHs targeting two different B-cell maturation antigen (BCMA) epitopes [59] (Fig. 1). The search for antigen-binding domains other than scFvs has led to new ligand-based CARs. Recently, Wang et al. used the intrinsic binding properties of a natural toxin to develop targeted CAR-T cells. They incorporated chlorotoxin (CLTX), a 36-amino acid peptide isolated from the deathstalker scorpion venom previously used to deliver radiation therapy and imaging reagents to tumor sites of patients with GBM, into a CAR construct to redirect cytotoxic T cells against GBM cells. They found the antitumor effects of CLTX-CAR-T cells to be robust and specific, with negligible off-tumor toxicity [33]. Zetakine-CARs employ a membrane-tethered cytokine ligand as the antigen recognition domain [60, 61]. Brown et al. developed a zetakine-CAR targeting IL-13 receptor α2 (IL-13Rα2) by fusing a membrane-tethered IL13 ligand, mutated at a single site (E13Y), to intracellular signaling domains for the treatment of recurrent GBM [62]. Moeller et al. also developed an IL3-specific zetakine-CAR targeting the alpha-chain of the IL3 cytokine receptor (CD123), a promising target for acute myeloid leukemia (AML) as it is overexpressed in both leukemic stem cells (LSCs) and blasts [61].
Other ligand-based CARs have been developed and tested in preclinical and clinical studies across a range of malignancies, including those incorporating FMS-like tyrosine kinase 3 ligand (FLT3L) to target FLT3-positive AML [63, 64], Natural Killer Group 2D (NKG2D) receptor to target NKG2D ligands on the surface of tumor cells [65], a proliferation-inducing ligand (APRIL), also known as TNFSF13, to target BCMA and transmembrane activator and calcium-modulator and cyclophilin ligand (TACI), two receptors implicated in the pathogenesis of multiple myeloma [66], and granulocyte-macrophage colony-stimulating factor (GM-CSF) to target the GM-CSF receptor (CD116) involved in the pathogenesis of juvenile myelomonocytic leukemia (JMML) [67].

Linker

The antigen-binding domain of CARs is composed of two distinct critical components: the scFv and the linker, whose rational design is often neglected while designing a CAR construct. The content and length of linkers strongly influence the structural and functional properties of CARs. Although linkers are highly divergent in their sequences, those commonly used in CARs contain repeats of glycine (Gly) and serine (Ser) residues to provide the flexibility necessary for antigen-binding sites to change conformation [68] and maintain good stability in aqueous solutions [69]. The combination of Gly and Ser residues also prevents the formation of secondary structures and reduces the likelihood of the linker interfering with the folding and function of the scFv [70]. The length of the linker between V-domains affects various structural characteristics of the scFv, such as size, flexibility, and valency. A linker longer than 12 residues helps V-domains fold in the natural orientation and form a monovalent antigen-binding site [71]. In contrast, shorter linkers restrict V-domain flexibility, prevent intramolecular variable chain pairing, and instead favor intermolecular oligomerization and formation of scFv multimers [72,73,74]. A recent study demonstrated in detail how linker length can drastically affect the clinical outcomes of CAR-T cell therapy. Singh et al. conducted a study at the University of Pennsylvania (Penn)/Children’s Hospital of Philadelphia (CHOP) to evaluate CD22 as a target for CAR-T cell therapy. After analyzing the results of two pilot clinical trials (NCT02650414 and NCT02588456) using anti-CD22 CAR-T cells in six pediatric and three adult patients with R/R ALL, they found that response rates were unexpectedly lower than those from a similar trial at the National Cancer Institute (NCI) [11, 75].
Further investigation revealed similar patient characteristics, manufacturing processes, and CAR constructs in these clinical trials; the only difference was the length of the scFv linker. The anti-CD22 CAR evaluated by the NCI joined the V-domains using a five-amino-acid linker (Gly4Ser; CAR22-short), while the trials at Penn/CHOP connected the V-domains using a 20-amino-acid linker ((Gly4Ser)4; CAR22-long) [76]. They found that the CAR22-long was monomeric and distributed uniformly on the cell surface, while the CAR22-short was homodimerized and clustered on the cell surface. The CAR22-short demonstrated greater F-actin polarization and perforin accumulation at the immune synapse (IS), more durable immune synapse formation, and enhanced downstream receptor activation. CAR22-short T cells also secreted more IFN-γ and showed greater cytotoxicity against target tumor cells than their long-linker counterparts. Antigen-independent signaling associated with linker-driven clustering causes CAR22-short T cells to be ‘primed’, so that they rapidly initiate intracellular signaling and activate immune response programs upon target engagement. Interestingly, the tonic signaling associated with the shortened linker was beneficial for CD22 CARs bearing the 4-1BB costimulatory domain, while replacing 4-1BB with CD28 led to T cell exhaustion and dysfunction. Furthermore, they found that the association between linker length and clustering is not a universal phenomenon. In CARs targeting CD19, shortening the scFv linker did not result in cell surface clustering and tonic signaling [76].
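As a small worked example of the linker arithmetic above, the sketch below builds (Gly4Ser)n linkers in one-letter amino acid code and applies the 12-residue rule of thumb cited in the text. The function names and labels are hypothetical, and the length rule is only the general tendency reported for scFvs; as noted for CD19 CARs, it does not hold universally.

```python
# Minimal sketch: build (Gly4Ser)n linkers and apply the ~12-residue
# rule of thumb from the text. Function names are illustrative only.

GLY4SER = "GGGGS"  # one-letter code for Gly-Gly-Gly-Gly-Ser


def gly_ser_linker(repeats: int) -> str:
    """Return the (Gly4Ser)n linker sequence for n = repeats."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return GLY4SER * repeats


def expected_pairing(linker: str) -> str:
    """Linkers longer than 12 residues tend to allow intramolecular V-domain
    pairing (monovalent scFv); shorter linkers tend to favor intermolecular
    pairing (multimers). This is a tendency, not a guarantee."""
    return "intramolecular" if len(linker) > 12 else "intermolecular"


car22_short = gly_ser_linker(1)  # 5-residue linker (Gly4Ser)
car22_long = gly_ser_linker(4)   # 20-residue linker (Gly4Ser)4

for name, linker in (("CAR22-short", car22_short), ("CAR22-long", car22_long)):
    print(name, linker, len(linker), expected_pairing(linker))
```

Under these assumptions, the 5-residue linker falls in the multimer-prone range and the 20-residue linker in the monovalent range, which matches the clustering and monomeric behavior reported for CAR22-short and CAR22-long, respectively.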

Hinge domain

A domain usually overlooked while evaluating CAR functionality is the hinge domain (HD). The HD in CARs serves as a spacer that holds scFvs beyond the plasma membrane and gives them the flexibility necessary to access antigen epitopes on the surface of target cells [77, 78]. The origin, length, flexibility, and composition of the HD influence CAR antitumor activity and the occurrence of side effects as well [79,80,81]. The HDs typically used in CARs are Ig-based, originating from the constant region of human immunoglobulin molecules, or non-Ig based derived from the components naturally expressed on T cells. Ig-based spacers commonly use the hinge-CH2-CH3 regions of IgG molecules, mostly IgG1 and IgG4 [82,83,84,85]. Accumulated research studies have shown that effective antigen recognition by CAR relies on the length of HD and the target epitope distance from the cell membrane [78, 86, 87]. A study by Qin et al. showed that hinge-containing anti-CD19 CAR-T cells had a tumor-eradication capacity similar to their hinge-free counterparts. These results suggested that for those antigen epitopes like CD19, which are membrane-distal and expressed on the cell surface with high density, embedding a hinge domain in CAR construct does not improve their killing activities [78]. However, they found that hinge incorporation improved the expansion and migratory capacity of anti-CD19 CAR-T cells. They also achieved similar results with anti-mesothelin CAR-T cells [78]. To access antigen epitopes residing closer to the membrane of target cells or those embedded within heavily glycosylated structures, a hinge domain with proper length is needed to decrease the distance or amend the steric inhibitory effects between the scFv and its epitope [78, 86]. Hudecek et al. designed three CAR constructs bearing either a full-length or truncated IgG4-Fc as a spacer to study the effect of HD length on CAR antitumor activity [87]. 
All CAR constructs recognize a membrane-distal epitope of the receptor-tyrosine kinase-like orphan receptor 1 (ROR1). They found that HD length did not affect the CAR expression levels; however, CARs with intermediate and short spacer showed superior T-cell cytokine secretion and proliferation after target-antigen recognition [87]. Therefore, the HD length in CARs needs to be carefully tailored regarding the target epitope distance from the cell membrane to achieve an improved antitumor efficacy.

Several clinical studies have reported a lack of persistence in CAR-T cells bearing IgG-derived spacers [82, 88]. Studies have shown that the IgG-derived spacers have ligand-binding capacity. Several amino acid sequences within the CH2 domain can bind to Fcγ receptors (FcγRs) on the innate immune cells, including monocytes/ macrophages, dendritic cells (DC), neutrophils, and natural killer (NK) cells [89]. The Fc: FcγR binding may result in unwanted innate immune response, including antibody-dependent cell-mediated cytotoxicity (ADCC) and phagocytosis, which may, in turn, result in depletion of CAR-T cells having Ig-based spacers in their CAR constructs [90, 91]. The Fc: FcγR interaction may also lead to ligand-independent tonic signaling and subsequently activation-induced T-cell death (AICD) [90, 92]. Therefore, modification of the IgGs-derived spacers, such as replacing the IgG1-CH2 framework with the corresponding IgG2 amino acids [26, 29], which has a lower binding capacity to both human [32] and murine [31] FcγRs, or complete deletion of CH2 region may solve the problems [22]. Jonnalagadda et al. generated several anti-CD19 CARs with different IgG4-derived spacers; one with a nonmutated CH2 domain, one with complete deletion of the CH2 region, and the others with single- or double-point mutations in the CH2 region. They found that CAR with nonmutated CH2 failed to engraft and persist in xenograft models, probably due to Fc binding to FcγR. However, the engraftment and persistence were partially restored by blocking this interaction using intravenous immunoglobulin (IVIG) administration [91]. Moreover, they observed that double point mutations or complete deletion of the CH2 region resulted in an improved persistence and antitumor efficacy in xenograft models compared with CARs containing a nonmutated or single-mutated CH2 [91]. 
Thus, introducing mutations, truncations, or complete deletion in the IgG-derived spacers is necessary to diminish the adverse consequences of Fc:FcγR binding and improve CAR persistence and efficacy in vivo [91, 93]. However, CAR-T cells with no in vivo therapeutic efficacy or persistence were reported even after similar modifications in their IgG1-derived spacers [94]. Of note, Breyanzi, the FDA-approved CAR product of Juno Therapeutics/Bristol Myers Squibb, harbors a 12-amino acid IgG4-derived spacer without the CH2-CH3 sequence (IgG4 hinge only) [93, 95].

To minimize the possibility of potential immunological interactions elicited by the Ig-based spacers and attain the safety needed for clinical use, spacers derived from components naturally expressed on T cells, such as CD8 and CD28, can be incorporated into CAR structure [96]. Alabanza et al. investigated the effect of the hinge and transmembrane (TM) domains derived from human CD28 or CD8α on the biology of fully human or murine-derived CD28-based CARs targeting CD19. They found that regardless of scFv origin, both CARs with CD8α or CD28 HD/TMD showed similar expression levels on the T cell surface. Compared with CARs containing HD/TMD of CD8α, those with CD28 HD/TMD produced significantly higher inflammatory cytokines and underwent more AICD [79]. T cell exhaustion markers such as PD-1 and lymphocyte activation gene-3 (LAG-3) were also higher in CAR-T cells containing CD28 HD/TMD. Based on crystal structures, they realized that enhanced inflammatory cytokine production and AICD observed in CAR-T cells with CD28 HD/TMD resulted from an increased tendency of these CARs to dimerize compared with CARs containing CD8α HD/TMD. They also found that HD/TMD does not affect T cell memory phenotype [79].

In patients with a high tumor burden, CAR-T cell therapy may result in adverse reactions and side effects, such as cytokine release syndrome (CRS), associated with the over-activation of CAR-T cells [97,98,99]. Considering that the spacer appears to be involved in T cell activation and cytokine production, its modification may provide the safety and efficacy needed for patients with a high tumor burden. To decrease the over-activation of CAR-T cells, Zhang et al. removed two consecutive Gly residues in the CD8-derived spacer of a second-generation anti-CD19 4-1BB-based CAR to reduce spacer flexibility. They found that this modification resulted in better tumor control and lower release of inflammatory cytokines in vivo. Also, they observed a downward trend in tumor load and prolonged survival in xenograft models treated with CAR-T cells bearing a less flexible spacer [100]. In an endeavor to develop long spacer domains with a favorable functionality profile for membrane-proximal targets, Schafer et al. introduced a novel class of spacer derived from the Sialic acid-binding immunoglobulin-type lectins (Siglecs). A long spacer derived from Siglec showed potential cytotoxicity, and its performance was similar to the CD8α spacer in a CAR targeting CD20 in vitro and in vivo while maintaining a favorable cell phenotype profile and cytokine release pattern [94].

The spacer in the CAR construct can also be used for identification, purification, and in vivo tracking of CAR-positive subsets of T cells after engineering [96, 101]. Casucci et al. demonstrated that the incorporation of the nerve growth factor receptor (NGFR) as a spacer into the CAR backbone enables the enrichment of CAR-T cells before infusion into patients and facilitates the in vivo tracking, phenotypic characterization, and isolation of CAR-T cells for ex vivo analysis [101]. They also showed that NGFR, when incorporated into the CAR molecule, cannot trigger signaling upon in vivo growth factor encounter [101]. Bister et al. also inserted a CD34-derived spacer into the CAR backbone to facilitate the detection and enrichment of CAR-T cells before infusion. This spacer was functionally similar to the CD8 spacer in in vitro and in vivo experiments [96].

Consistent with studies concerning scFv-based CARs, spacers can affect the expression level and functional activity of nanobody-based CARs [102]. In our previous study, we incorporated three different spacers, including CH3–CH2-hinge and CH3–CH2-hinge-hinge regions derived from human IgG3 and the hinge region of the FcγRII, into anti-MUC1 CARs and observed a greater expression of CARs containing those spacers derived from human IgG3. We also observed that CARs having two repeats of hinge sequence in their spacers (CH3–CH2-hinge-hinge) showed more flexibility which may induce homodimerization and increase the avidity of CAR for the target antigen [102, 103].

Transmembrane domain

The transmembrane domain (TMD), like the hinge domain (HD), is a component in the CAR structure that connects the antigen recognition moiety to the intracellular signaling domain. It is mainly derived from type-I single-spanning proteins, such as CD3ζ, CD4, CD8α, or CD28. The TMD is primarily considered a structural block in the CAR that anchors the receptor in the cell membrane. However, the functional importance of TMD in CAR expression level and stability has been well-established [104, 105]. Fujiwara et al. studied the effect of HD and TMD, derived from various molecules, such as CD4, CD8α, or CD28, on the expression level and antigen-specific cytotoxic activity of CAR [104]. They found that the CAR expression level was enhanced much more in HD/TMD-modified than in HD-modified CARs, suggesting that CAR expression level and stability on the T cells were highly affected by TMD rather than HD [104]. A high surface expression was also reported by Zhang et al. following the incorporation of either CD8α or CD28 TMD into the CAR construct [106]. The TMD of CARs mediates CAR dimerization and interaction with endogenous proteins, forming dimers or trimers [79, 80, 105]. Annenkov et al. showed that CARs containing the transmembrane region of FcεRIγ mediate T cell activation by heterodimerizing with CD3ζ [107]. Bridgeman et al. demonstrated that embedding the transmembrane region of CD3ζ in the CAR structure facilitates signal transmitting and T cell activation via mediating the homodimerization of chimeric receptors or their interaction with the endogenous TCR [105, 108]. Alabanza et al. also showed that CD28-derived HD/TMD has a greater tendency to drive homodimer formation compared to CD8α HD/TMD. This homodimerization can cause an increased tonic signal and AICD in T cells expressing CD28-HD/TMD CARs [79]. Muller et al. found that CD28-TMD can also interact with the endogenous CD28 receptor and form CD28-CAR heterodimers [80]. 
The CD28-CAR dimers may cause higher on-target off-tumor toxicities by enhancing CAR sensitivity to ectopically expressed low-density antigens, such as the CD19 on brain mural cells [80, 109]. Nevertheless, Majzner et al. demonstrated that the CD28 HD/TMD provides a more stable and efficient immune synapse and decreases the antigen-density threshold for T-cell activation in CD19-specific CARs compared to their CD8 counterparts [110].

The CAR hinge and transmembrane regions can also influence CAR-T cell cytokine production [81]. To find how HD/TMD affects the expansion, cytokine production, and memory generation of CAR-T cells, Ying et al. incorporated CD8α-derived HD/TMD with different lengths into CAR. They found that a CAR harboring an 86-amino-acid HD/TMD produces potent antitumor responses without enhancing serum cytokine concentrations responsible for CRS and neurological toxicity. Thus, their results suggest that modification of CAR hinge and transmembrane regions can modulate cytokine secretion and help ameliorate CAR-T cell-associated toxicities [81]. Guedan et al. also showed that a third-generation CAR composed of ICOS and 4-1BB intracellular domains (ICDs) displayed superior antitumor activity and increased persistence in vivo only when the ICOS ICD was directly fused to an ICOS transmembrane domain [111]. The TMD of ICOS has a constitutive, albeit weak, association with the tyrosine kinase Lck. This association facilitates p85 recruitment to ICOS and subsequent PI3K activation. When incorporated into the CAR, ICOS TMD augments the proximal signaling output by providing an extra pool of Lck [112].

The TMD in the CAR structure can be designed according to the transmembrane-mediated interaction and functionality desired. Schmith et al. used the TMD of 4-1BB to form trimeric CARs, thus enhancing the antigen-binding capacity of the CAR and reducing antigen escape [113]. Wang et al. also generated a novel chimeric receptor in which the transmembrane and cytoplasmic domains of KIR2DS2, a stimulatory killer immunoglobulin-like receptor (KIR) expressed naturally in CD4 and CD8 T cells and NK cells, were fused to an scFv. They then introduced this KIR-based CAR into human T cells expressing DAP12, an immunoreceptor tyrosine-based activation motif (ITAM)-containing adaptor. They found that T cells expressing the KIR-CAR fail to trigger cytotoxic activity without KIR/DAP12 association. They also found that the KIR-based CAR exhibited more potent antitumor activity in vivo compared with 2nd-generation CD3ζ-based CARs due to enhanced stability of the KIR/DAP12 complex within the plasma membrane following antigen engagement [114].

Endodomain

Costimulatory domain

T cells require at least two distinct signals for full activation [115]. The first is delivered into T cells when the TCR binds to its cognate antigenic peptides bound to MHC molecules on an antigen-presenting cell (APC). The second signal is provided when the costimulatory receptor on the T cell binds to its cognate ligand on the APC [115]. Antigenic stimulation without costimulation results in T cell anergy and unresponsiveness [116, 117]. CARs in which the antigen recognition domain is linked to the Fc receptor gamma chain (FcRγ) or TCR zeta chain (CD3ζ) alone are known as “first-generation” CARs [118, 119]. Although the 1st-generation CARs exhibited cytotoxicity against target cells in vitro and in vivo, limited antitumor efficacy and poor persistence were observed in early-phase clinical trials [82, 120,121,122]. The insertion of a costimulatory unit into 1st-generation CARs, yielding the second-generation CARs, enhanced T cell proliferation and in vivo persistence [122]. Third-generation CARs were also developed by fusing two costimulatory domains in series. Most costimulatory molecules used in CARs belong to the Ig superfamily, such as CD28 and ICOS, or the TNF receptor superfamily (TNFRSF), such as 4-1BB, OX40, and CD27 [123]. A schematic picture of the above-mentioned costimulatory molecules and their ligands is shown in Fig. 4a.
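The generation scheme described above depends only on how many costimulatory domains are fused to the primary signaling domain. A minimal sketch of that classification follows; the class and function names are hypothetical and serve only to make the counting rule explicit.

```python
# Minimal sketch of the CAR "generation" rule described in the text:
# 0 costimulatory domains -> first, 1 -> second, 2 -> third generation.
# Class and function names are illustrative only.
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CarEndodomain:
    signaling_domain: str                         # e.g. "CD3ζ" or "FcRγ"
    costimulatory_domains: Tuple[str, ...] = ()   # e.g. ("CD28",)


def car_generation(endodomain: CarEndodomain) -> str:
    """Classify a CAR endodomain by its number of costimulatory domains."""
    names = {0: "first", 1: "second", 2: "third"}
    count = len(endodomain.costimulatory_domains)
    if count not in names:
        raise ValueError("this sketch only covers zero to two costimulatory domains")
    return names[count]


examples = [
    CarEndodomain("CD3ζ"),
    CarEndodomain("CD3ζ", ("CD28",)),
    CarEndodomain("CD3ζ", ("ICOS", "4-1BB")),
]
for endo in examples:
    print(endo.costimulatory_domains, car_generation(endo))
```

The rule deliberately ignores which costimulatory molecule is used; as the following paragraphs discuss, domains such as CD28 and ICOS differ in their downstream effects even within the same generation.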

Fig. 4
figure 4

A schematic picture of costimulatory molecules commonly used in CAR constructs and their ligands on antigen-presenting cells (a). Three CAR configurations (19BBz-CD80, 1928z-41BBL, and 19z1-CD80-41BBL) from Zhao’s study that are coupled with complementary costimulatory ligands (b). The 19BBz-CD80 and 1928z-41BBL are two 2nd-generation CARs coupled with CD80 and 41BBL, respectively. The 19z1-CD80-41BBL is a 1st-generation CAR combined with both CD80 and 41BBL

Perhaps the best-characterized T cell costimulatory molecule is CD28, constitutively expressed on the surface of naïve T cells and some subsets of memory T cells [124]. It was first described in the 1980s as a co-receptor that enhanced TCR-induced proliferation and stimulated the differentiation of naïve CD4 T cells [125, 126]. CD28, encoded by the CD28 gene located on human chromosome 2 (2q33), is a 44 kDa type I integral membrane glycoprotein that usually forms homodimers via disulfide bonds between cysteine residues positioned in the juxta-transmembrane domain [125,126,127]. A single Ig-V-like domain on the extracellular portion of CD28 provides the structural specificity for interactions with its ligands, CD80 (B7–1) and CD86 (B7–2), expressed on APCs upon activation [128]. CD28 contributes to proliferation [125, 129], IL-2 production [130, 131], survival [132,133,134], and metabolic activity [135] of naïve T cells by regulating the expression and activity of nuclear factor-κB (NF-κB), nuclear factor of activated T cells (NFAT), and activator protein 1 (AP-1) [136,137,138]. It is also involved in cytoskeletal rearrangement, actin polymerization, and the recruitment of membrane rafts into the immune synapse, which maintains and boosts TCR-induced signaling [139,140,141].

The inducible T cell co-stimulator (ICOS) (CD278), the third member of the CD28/cytotoxic T lymphocyte-associated antigen-4 (CTLA-4) family, is expressed on activated T cells [123, 142]. ICOS interacts with another B7-related molecule, ICOSL, also known as B7H, B7RP-1, and GL-50, expressed by APCs. Although ICOS and CD28 are similar in structure and downstream pathways, they are not identical [123, 143]. The CD28/B7 engagement is required for primary immune responses, whereas ICOS/B7RP-1 is essential for secondary immune responses [144, 145]. ICOS is crucial for the development and maintenance of human T helper 17 (Th17) cells [146] and also directs immunity towards humoral or inflammatory responses [144]. Although both molecules costimulate T cell proliferation and cytokine secretion, the costimulatory effects of ICOS appear less potent than those of CD28, probably because ICOS cross-linking does not induce IL-2 production [147].

The expression of 4-1BB (TNFRSF9, CD137, ILA), first discovered in the late 1980s as an inducible costimulatory molecule on activated T cells, is mainly activation-induced; it is not restricted to T cells and is detected on various types of non-T cells as well [148,149,150]. 4-1BB interaction with its ligand, 4-1BBL, can costimulate T cells by activating the NF-κB, c-Jun, and p38 downstream pathways independently of CD28 signals. In contrast to CD28, 4-1BB enhances T cell effector responses by stimulating proliferation, cytokine production, and cytolytic activity and by inhibiting AICD of effector T cells rather than naïve T cells [151, 152]. Furthermore, numerous studies have suggested that 4-1BB predominantly promotes CD8 T cell-mediated responses [153, 154], and some have also described a dual immunoregulatory function for 4-1BB. While it supports both CD4 and CD8 T cell-mediated responses in vitro, it preferentially augments clonal expansion and survival of CD8 T cells in vivo while suppressing CD4 T cell function [155, 156].

The expression of OX40 (TNFRSF4, CD134), first described as a T cell activation marker, can be induced on activated CD4 and CD8 T cells as well as on several other lymphoid and non-lymphoid cells. Low-level expression can be detected within the first hours of T cell activation in vitro and in vivo, peaking anywhere from 1 to 5 days after initial stimulation. Although TCR signals are sufficient for inducing OX40, the interaction of CD28 with its ligands can increase and maintain OX40 expression; T cell- and APC-derived cytokines such as IL-1, IL-2, and TNF may also modulate the level and duration of its expression [157]. Its ligand, OX40L (TNFSF4), is not constitutively expressed but can be induced on APCs upon activation [157]. Known as the late costimulatory molecules of the TNFR family, 4-1BB and OX40 prolong T cell persistence and promote the generation and survival of effector and memory T cells [158,159,160]. Hombach et al. demonstrated that, in contrast with T cells bearing CD28-based CARs, T cells bearing OX40-costimulated CARs do not secrete IL-2 and IL-10. They also found that in a CD28-OX40 dual costimulatory CAR, OX40 represses IL-10 secretion induced by CD28 without affecting IL-2 and IFNγ production. This favorable feature of OX40 can be exploited, particularly for treating solid tumors in which IL-10, an immunosuppressive cytokine, is secreted into the TME by tumor and stromal cells [161]. Concerning nanobody-based CAR-T cells, we observed that anti-HER2 nanobody-based CAR-T cells containing a combination of CD28 and OX40 showed increased expansion and cytotoxicity in vitro compared to CAR-T cells lacking OX40 [56]. Recently, Zhang et al. showed that OX40 signaling enhanced CAR-T cell survival through up-regulation of anti-apoptotic Bcl-2-like molecules and improved proliferation through increased activation of the NF-κB, MAPK, and PI3K-AKT pathways [162].

CD27 (TNFRSF7) is constitutively expressed on T cells, NKT cells, NK cells, and other immune cells. Its expression on T cells is strongly upregulated following activation [158]. CD27 engagement by its ligand, CD70, supports CD28-mediated costimulation and the survival of proliferating T cells [163]. CD27/CD70 interaction can also augment the expansion and survival of effector cells and enhance the development of memory CD8 T cells. Likewise, it can promote proliferation, polarization, and cytokine production by CD4 T cells [164]. It has been reported that CD27-costimulated CAR-T cells exhibited antitumor activity similar to that of 4-1BB- or CD28-based CAR-T cells while providing persistence comparable to that of 4-1BB-based CAR-T cells. The combination of CD27 and CD28 costimulatory domains in a 3rd-generation CAR produced encouraging clinical results in neuroblastoma, AML, and lymphoma patients [165].

Most clinical trials to date have used CD28- or 4-1BB-costimulated CARs [166]. Although T cells expressing CARs with either a 4-1BB or CD28 costimulatory domain have demonstrated similar antitumor activity, particularly against lymphomas [7, 10, 167, 168], T cells expressing CD28-costimulated CARs display higher cytokine production but lower persistence [111, 169,170,171]. The persistence of CD28-costimulated CAR-T cells was identical to that achieved with CD3ζ alone, indicating that CD28 does not support human T cell survival in vivo [172, 173]. The persistence of CD28-costimulated CAR-T cells can be improved by replacing CD28 with 4-1BB [174, 175] or CD27 [176] or by adding 4-1BB alone [175] or a combination of 4-1BB and CD27 [177]. CARs harboring the 4-1BB costimulatory domain mediate long-term survival of T cells in the circulation by maintaining a central memory phenotype and relying on oxidative metabolism, whereas CD28-costimulated CARs promote effector memory differentiation and rely on aerobic glycolysis [50, 169, 178]. In an effort to find an ideal combination of CD28 and 4-1BB that would combine the superior tumoricidal capacity of CD28-based CARs with the persistence afforded by 4-1BB-based CARs, Zhao et al. developed seven different structural configurations of CARs, three of which (1928z-41BBL, 19BBz-CD80, and 19z1-CD80-41BBL) were coupled with costimulatory ligands [179] (Fig. 4b). They found that the 1928z-41BBL and 19BBz-CD80 configurations showed more favorable properties regarding in vivo tumoricidal cytotoxicity, proliferation, persistence, and IRF7/IFNβ pathway induction. The 1928z-41BBL CAR, however, consistently outperformed the 19BBz-CD80 and emerged as the most potent configuration. They also observed that CD28 downregulation, which occurs following activation, averts the activity of its constitutively expressed ligand (CD80) provided by the 19BBz-CD80 configuration [179].
Furthermore, the 19z1-CD80-41BBL configuration, in which the CAR is coupled with both CD80 and 41BBL simultaneously, was the least effective, expanding steadily but exerting inferior tumor control [179]. The lack of durable antitumor responses in CD28-costimulated CAR-T cells may also be due to tonic signaling and T cell exhaustion mediated by the CD28 costimulatory domain. Recently, Guedan et al. reduced CD28-driven T cell exhaustion, and thus enhanced in vivo persistence of CD28-based CARs targeting mesothelin, by disrupting the interaction between the CD28 signaling domain and the SH2 domain of Grb2 via a single amino acid alteration [180]. CARs containing ICOS also showed better persistence than CD28-based CARs, likely because ICOS activates PI3K signaling and, consequently, Akt signaling more potently than CD28. As shown by Guedan et al., expressing an ICOS-based CAR in CD4 T cells not only improves the persistence of these cells but also enhances the in vivo persistence of CD8 T cells expressing either a 4-1BB– or CD28-based CAR [111]. Despite accumulated data indicating poor persistence of CD28-based CARs, prolonged survival of anti-CD19 CAR-T cells containing CD28 costimulatory domains has been reported in more recent trials, suggesting that, aside from CD28, other elements incorporated into the CAR structure may also affect the in vivo persistence of CAR-T cells [7, 79, 81, 111, 181, 182].

Different costimulatory domains can induce various downstream signaling pathways. Selecting a costimulatory unit or a combination that would heighten antitumor activity and maintain the long-term persistence of CAR-T cells is crucial. Besides costimulatory molecules discussed in this review, alternative costimulatory molecules, such as CD40 (TNFRSF5) [183], Herpes Virus Entry Mediator (HVEM) (TNFRSF14, CD270) [184], Glucocorticoid-Induced TNFR-Related protein (GITR) (TNFRSF18, CD357) [185], Myeloid Differentiation Primary Response 88 (MYD88)/CD40 [186], Toll-Like Receptor 2 (TLR2) [187], and Dectin-1, a C-type Lectin Receptor [188], have been explored to determine whether the incorporation of these molecules into CAR structure could improve clinical outcomes in hematological malignancies or heighten potential applications of CAR-T cell therapy in patients with solid tumors.

Activation domain

In the CAR construct, the activation motif is the critical component that triggers T cell activation signaling, including the initiation of cytotoxicity. CD3ζ is the most common activation molecule used in CAR-T cells [189]; however, early studies also utilized the FcγR as the primary activating domain in CARs [118, 190]. All FDA-approved CARs employ CD3ζ as the cytoplasmic activation domain (Fig. 1). CD3ζ (CD247) is a component of the TCR complex comprising three tyrosine-rich sequences known as ITAMs, named ITAM1, ITAM2, and ITAM3 from the membrane-proximal to the membrane-distal direction. The ITAMs in the cytoplasmic region of CD3ζ are the phosphorylation sites that recruit the tyrosine-protein kinase ZAP-70, which, in turn, triggers downstream signaling cascades. The two distal ITAMs (ITAM2 and ITAM3) display a lower binding affinity for ZAP-70 than ITAM1. In T cells, the quantity and diversity of ITAMs affect optimal signaling; however, Feucht et al. demonstrated that a single functional ITAM is sufficient for potent antitumor efficacy. CARs containing a single ITAM (either ITAM1, 2, or 3) outperformed the triple- and double-ITAM-containing CARs in vivo and limited T cell differentiation, resulting in an increased fraction of central memory CAR-T cells and increased persistence [191]. Fisher et al. designed a novel CAR construct in which an scFv targeting GD2 was linked to DAP10 as the sole signaling domain and transduced into γ/δ T cells. Due to the absence of CD3ζ, these CAR-T cells depended on their native γ/δ TCR for activation. They could efficiently eradicate neuroblastoma cells; however, target cells lacking γ/δ TCR ligands could escape [192]. A summary of all structural elements of the CAR reviewed here is shown in Fig. 5.

Fig. 5
figure 5

A schematic picture of all structural elements of CAR reviewed here

Vector backbone

Retroviral vectors are attractive tools for gene therapy because they integrate into the genome of target cells, a requisite for long-term expression, and offer a large cloning capacity, which is ideal for most clinical situations. In addition to these features common to all retroviral vectors, vectors derived from lentiviruses offer a unique advantage over their oncoretroviral counterparts: they can translocate across an intact nuclear membrane to transduce nondividing cells, the main potential targets of most gene therapies [193]. Lentiviral (LV) vectors used for gene therapy are predominantly derived from Human Immunodeficiency Virus type 1 (HIV-1). Two components are needed to make a virus-based gene delivery system: first, the packaging elements, encompassing the structural proteins and the enzymes required to generate an infectious particle, and second, the vector itself, that is, the genetic material transferred to the target cell [194]. The viral coding sequences are usually removed from the viral backbone and replaced by the gene of interest, such as the CAR. In this way, the transgene sequence is flanked by two long terminal repeats (LTRs), which are essential for transgene integration into the host genome. The LTRs are identical in nucleotide sequence and organization and consist of U3-R-U5 regions, in which the U3 is necessary for retroviral RNA transcription because it contains endogenous enhancer/promoter sequences. Therefore, in wild-type retroviral-based vectors, transgene expression can be driven by the transcriptional sequences within the LTRs. However, to meet the safety requirements for clinical applications and minimize the risk of generating replication-competent recombinants (RCRs), self-inactivating (SIN) vectors have been developed by removing most of the U3 region of the 3′ LTR. This deletion is transferred to the 5′ LTR of the proviral DNA during reverse transcription.
Thus, SIN LV vectors retain all properties of their parent while lacking the ability to produce full-length vector RNA in transduced cells [193]. SIN LV vectors are the gold standard for CAR-T cell production owing to their high transduction efficiency. However, the efficiency of gene delivery by these vectors relies on several elements that should be present in the vector backbone or expression cassette (Fig. 6).

Fig. 6
figure 6

A schematic picture of a SIN LV vector backbone indicating where modular regulatory elements should be placed. LTR: Long Terminal Repeat, RRE: Rev-Response Element, cPPT: Central Polypurine Tract, IP: Internal Promoter, WPRE: Woodchuck Hepatitis Virus (WHV) Posttranscriptional Regulatory Element, USE: Upstream Sequence Element, PA: Polyadenylation Signal

Promoter

A gene transfer approach must be effectively directed to the specific tissues/cells where it is desired, and the resulting transgene expression should be at the level required for a specific application [195]. Regarding CAR-T cell therapy, the CAR density on the T cell surface must be high enough to recognize the target antigen but not so high as to trigger antigen-independent tonic signaling or drive on-target off-tumor cytotoxicity resulting from recognition of the target antigen on cells other than tumor cells. Lentiviral vectors can transfer the CAR construct into activated CD4 and CD8 human T cells with high efficiency, but its expression level depends on the promoter that drives its transcription. The U3 region of the 5′ LTR possesses endogenous enhancer/promoter sequences; however, it is removed in SIN LV vectors to enhance safety. Accordingly, internal promoters are employed for transgene expression in SIN vectors. Selecting an appropriate promoter is a fundamental step toward successful CAR-T cell therapy [196]. The promoter is a cis-acting element within the vector backbone that can dictate the level and duration of transgene expression and restrict expression to specific tissues/cells, two critical goals for clinical applications. Various viral promoters (immediate-early cytomegalovirus [CMV], murine stem cell virus [MSCV], or spleen focus forming virus [SFFV] promoter) and cellular promoters (human elongation factor 1α-subunit [EF1α], human phosphoglycerate kinase 1 [PGK], human ubiquitin C [UbiC], or the chicken β-actin promoter and its derivative CAG) have been evaluated for transgene expression in lentiviral vectors. CMV and EF1α are the most commonly used promoters for CAR expression in T cells.

During productive infection with Human CMV (HCMV), viral genes are expressed in a temporal cascade, from immediate-early (IE) to early (E) and late (L) genes [197]. The expression of IE genes is driven by a potent enhancer-containing promoter. This Major IE Enhancer and Promoter (MIEP) is active in various cell types and is the most commonly used promoter in mammalian expression plasmids [197]. Successful gene therapy with CAR-expressing T cells relies on the ability of T cells to maintain adequate receptor expression for a long time. The EF1α promoter is a strong promoter frequently used for CAR expression, as it often drives strong and stable expression regardless of tissue specificity or the T cell activation state [169, 198]. Milone et al. evaluated several promoters, including EF1α, CMV, PGK, and UbiC, to identify the one that drives the highest stable expression of a transgene in primary CD4 and CD8 T cells [169]. They found that although the CMV promoter produced a high expression level of green fluorescent protein (GFP) early after transduction, expression dropped to < 25% of the initial level after 10 days of culture. In contrast, the EF1α promoter induced the highest level of GFP expression and optimally maintained it in both CD4 and CD8 T cells [169]. The sustained proliferation of CAR-T cells depends on CAR structure and high expression, the latter of which is required but not sufficient. When CARs having the CD28 transmembrane and cytosolic domains are expressed under the control of the EF1α promoter, they display a constitutive growth phenotype. Despite this phenotype, these CARs showed inferior antitumor effects and engraftment in vivo [199]. This probably explains why the two FDA-approved CD28-based CAR products, Yescarta (axicabtagene ciloleucel, Kite Pharma Inc.) and Tecartus (brexucabtagene autoleucel, Kite Pharma Inc.), use the MSCV promoter instead of EF1α for CAR expression. Retroviral vectors incorporating the MSCV LTR have been widely used in pre-clinical and clinical studies to drive high-level and long-term maintenance of CAR expression [200, 201].

Heterogeneous expression of the CAR makes it challenging to ensure consistent behavior among individual CAR-T cells, as their avidity toward the antigen can vary. Targeted insertion of the CAR into the first exon of the constant chain of the TCRα gene (TRAC) allows for homogeneous and consistent TCR-like expression, thus avoiding issues related to variegated transgene expression [202]. Eyquem et al. used CRISPR/Cas9 paired with an adeno-associated virus (AAV) vector repair matrix to insert a CD19-specific CAR construct into the TRAC locus. CAR-T cells generated by this CRISPR knock-in strategy outperformed CAR-T cells generated via retroviral transduction, both in vitro and in vivo [203]. These studies also demonstrated that targeting the CAR to the TRAC locus results in a receptor internalization and re-expression cycle that is much more closely matched to that of the normal receptor, ultimately leading to a more sustained and effective antitumor response. Furthermore, this elegant CAR-knock-in and TCR-knockout strategy not only averts tonic CAR signaling by precisely regulating the expression of the CAR but also minimizes the risk of graft versus host disease (GVHD) by diminishing the expression of αβ TCRs on the T cell surface [203].

The expression of multiple genes requires self-cleaving 2A-like peptides of the Foot and Mouth Disease Virus (FMDV) [204], an internal ribosome entry site (IRES) [205], or the use of several promoters. The two latter strategies are the most widely used. It has been shown that the expression levels of the genes upstream and downstream of the IRES element vary, with the downstream gene typically expressed at lower levels [206]. Using two separate, divergent promoters may significantly increase the size of expression cassettes, and differences in tissue specificity and mutual interference between the promoters may prevent efficient co-expression in the same target cell [207, 208]. An alternative approach is to employ a single, compact bidirectional promoter [209]. He et al. successfully used a bidirectional promoter to express dual CAR cassettes in the Sleeping Beauty system; however, they found that the investigated bidirectional promoters are sub-optimal for lentiviral production of the long RNA encoding dual-CAR constructs [210]. EF1α appears to be the best choice for driving long RNAs from lentiviral vectors in CAR-T cells. Hosseini Rad et al. evaluated four strong, well-characterized promoters, EF1α, CMV, hPGK, and RPBSA, for optimal expression of a long and complicated RNA encoding multiple gene products in CAR-T cells and found that EF1α performed best for short as well as long RNAs [196]. They also found that the EF1α promoter yielded the best transduction efficiency, killing ability, and cytokine production. Furthermore, the authors observed reduced CAR expression with the hPGK and RPBSA promoters, which retained acceptable killing ability but showed reduced cytokine production [196].

Signal peptide

Recognition of the target antigen typically requires CAR expression on the T cell surface. The CAR protein is a type I membrane protein that must be trafficked through the secretory pathway to the plasma membrane, where it is anchored and exerts its function. Accordingly, the CAR coding sequence starts with a signal peptide (SP). The SP, also known as the leader peptide, was first described in 1975 as a short transient peptide, predominantly found at the N-terminus of secretory and type I membrane proteins, that directs the nascent polypeptide chain to the endoplasmic reticulum (ER) membrane [211, 212]. SPs typically contain 25–30 aa; however, longer SPs (up to 140 aa) are also seen in eukaryotes. These are predominantly organelle-targeting, remain stable even after protein maturation, and usually add functions beyond protein targeting [213]. SPs act like address tags and mediate the translocation of secretory proteins across intracellular membranes and their final localization. More precisely, they are required for protein translocation across the first membrane on the secretory pathway and thus universally control the entry of all proteins into the secretory pathway [211]. Although SPs vary in length and sequence, they are found in both prokaryotes and eukaryotes and share a conserved tripartite structure, commonly characterized by a positively charged N-terminal region, a 9- to 12-residue-long hydrophobic stretch in the center that forms an α-helical conformation, and a polar C-terminal region with the cleavage site for signal peptidase [212]. As a nascent protein emerges from the ribosome, the signal peptide is recognized by the signal recognition particle (SRP), a cytoplasmic ribonucleoprotein [214]. SRP pauses elongation of the nascent polypeptide chain until the SRP–SP–ribosome complex interacts with an SRP receptor on the ER membrane. Upon interaction, the nascent chain is inserted into the ER translocon, and polypeptide chain elongation resumes.
The SP is then cleaved off by a signal peptidase residing in the ER while the growing protein passes through the ER membrane [215, 216]. The SPs usually incorporated into CARs are derived from human CD8α [217], IL-2 [218], the GM-CSF receptor (GM-CSFR) α chain [219], or murine Ig-kappa (IgK) [220]. Wang et al. placed the leader sequence of GM-CSFRα at the beginning of the coding sequences of a CD19-specific CAR and a truncated form of human epidermal growth factor receptor (huEGFRt), based on its capacity to sort type I transmembrane proteins to the plasma membrane in T cells [221]. Recently, Ping et al. designed CAR-T cells that could secrete an α-PD-1 scFv in solid tumors. They compared six signal peptides frequently used in engineering secretory proteins, including Secrecon, murine IgK V-III region (IgKVIII), human IgKVIII, CD33, human tissue plasminogen activator (TPA), a consensus 16aa signal peptide, and native secreted alkaline phosphatase (SEAP) [222], to find the one that best enhances extracellular accumulation of the anti-PD-1 scFv. They observed that the SP derived from human IgKVIII was the best choice, as it significantly enhanced secretion of the anti-PD-1 scFv [223].
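The tripartite signal-peptide layout described above (positively charged N-region, central hydrophobic stretch of 9 to 12 residues, polar C-region) can be illustrated with a small sketch. This is not a validated predictor: the hydrophobic alphabet `AVLIMFW`, the function name `describe_signal_peptide`, and the toy sequence are all assumptions made for illustration, and the sequence is not taken from any real protein.

```python
from typing import Dict

# Assumption: a deliberately simple hydrophobic alphabet, for illustration only.
HYDROPHOBIC = set("AVLIMFW")
POSITIVE = set("KR")


def describe_signal_peptide(sp: str) -> Dict[str, object]:
    """Split a candidate SP around its longest hydrophobic run.

    The longest uninterrupted run of hydrophobic residues is treated as the
    central (h) region; everything before it is the N-region and everything
    after it is the C-region.
    """
    sp = sp.upper()
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, aa in enumerate(sp):
        if aa in HYDROPHOBIC:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    n_region = sp[:best_start]
    return {
        "n_region": n_region,
        "h_region": sp[best_start:best_start + best_len],
        "c_region": sp[best_start + best_len:],
        "n_region_positive_residues": sum(aa in POSITIVE for aa in n_region),
        "h_region_in_9_to_12_range": 9 <= best_len <= 12,
    }


if __name__ == "__main__":
    toy_sp = "MKRLLAVLLLAVLSQGS"  # hypothetical sequence, not a real leader peptide
    print(describe_signal_peptide(toy_sp))
```

Real signal-peptide prediction relies on trained models rather than a fixed alphabet; the sketch only makes the three-region layout concrete.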

RRE

Retroviruses such as lentiviruses employ various mechanisms to regulate the expression of alternatively spliced viral mRNAs. The presence of suboptimal splice sites allows for differential expression of several mRNAs from a single pre-mRNA. Lentiviruses, including HIV-1, utilize Rev/Rex proteins that act in trans to regulate the nuclear export of unspliced or singly-spliced mRNAs required for the expression of structural and enzymatic proteins as well as progeny viral RNA genomes [224]. After integration into the host genome, during the early phase of the HIV-1 life cycle, the viral DNA, now called the provirus, is acted upon by cellular transcription factors to express viral genes. The early population of transcripts is fully spliced mRNA (2 kb) that is exported to the cytoplasm and translated into viral regulatory proteins, including Tat, Nef, and Rev. Rev is a nucleocytoplasmic shuttle protein necessary for virus replication. Later in the viral life cycle, Rev is imported into the nucleus to mediate the export of Rev Response Element (RRE)-containing unspliced (9 kb) or partially spliced (4 kb) mRNAs to the cytoplasm, where they are translated into viral proteins, such as Gag, Gag-Pol, Env, and accessory proteins, or packaged as the viral genome into newly budding virions [225]. The interaction between Rev and the RRE is necessary for exporting such intron-containing mRNAs, which would otherwise be retained in the nucleus to be spliced or degraded. The RRE is a cis-acting element located at the junction between the SU (gp120) and TM (gp41) domains of the env gene on viral genomic RNA [226], with a well-conserved sequence (351 nucleotides) and a highly branched structure (an approximately equimolar mixture of 4 and 5 stem-loop conformations) [227] that provides an architectural scaffold with a high affinity for Rev binding.
DiMattia et al., using X-ray crystallography and cryo-EM, described an “A”-shaped architecture for the RRE, which allows 8–12 Rev subunits to bind and mediate nuclear export [228]. Both Rev oligomerization and its interaction with the RRE, initiated by binding of the Rev arginine-rich motif (ARM) to a high-affinity Rev-binding site known as stem-loop IIB on the RRE, are critical for viral RNA export and, consequently, virus replication [229]. After assembly of the Rev-RRE ribonucleoprotein (RNP) complex, the nuclear export sequences (NESs) displayed on Rev interact with the host Crm1/RanGTP nuclear-export machinery and facilitate the nuclear export of viral intron-containing mRNAs [225, 230, 231].

cPPT

To be inserted into the host genome, the single-stranded RNA genome needs to be converted into double-stranded DNA before nuclear import. While synthesis of the first (minus) strand of DNA is initiated by a cellular tRNA molecule already packaged into the retroviral particles, synthesis of the plus-strand DNA is primed by a short purine-rich remnant of the viral RNA known as the polypurine tract (PPT), which is selectively preserved when the RT digests the viral RNA from the nascent RNA/DNA hybrid [232]. Besides this copy of the PPT, shared by all retroviruses, HIV-1 carries a second copy, known as the cPPT, placed near the center of the genome in the integrase open reading frame [233]. Thus, synthesis of the plus-strand DNA in HIV-1 is primed by both the PPT and the cPPT, resulting in two discrete plus-strand segments, each covering half of the viral genome [234]. Synthesis of the upstream plus-strand DNA initiated at the PPT continues for approximately 99 nucleotides downstream of the cPPT, displacing the centrally initiated downstream plus-strand DNA, and is then stopped at the central termination sequence (CTS), creating a triple-stranded DNA structure at the cPPT. The 99-nucleotide-long overlapping DNA, also known as the central DNA flap, is vital for pre-integration complex (PIC) formation and thus nuclear import [235]. Defects in DNA flap formation result in the trapping of the PIC at the cytoplasmic side of the nuclear pore, prohibiting nuclear entry of the HIV-1 genome [236]. It has been shown that the insertion of a 118-bp HIV-1 sequence containing the cPPT/CTS element into lentiviral vectors enhances transduction efficiency, up to 85% in T cells pre-activated with IL-2 and PHA, by facilitating the nuclear import of the transgene through a central DNA flap [237, 238].
Studies have also demonstrated that the presence of the cPPT/CTS in lentiviral vectors leads to a 10-fold increase in the amount of integrated DNA and a 5- to 10-fold increase in transduction efficiency, depending on cell type [239]. Studies have further revealed that initiating plus-strand synthesis at two separate sites in HIV-1-derived vectors decreases the time during which minus-strand viral DNA remains single-stranded, thus limiting its exposure to cellular enzymes that act on newly synthesized single-stranded viral DNA, particularly cytidine deaminases of the APOBEC family, known as natural defense barriers against HIV infection [232].

Poly(A) signal and USE

A critical step in the mRNA 3′-end processing is polyadenylation, the consequences of which may impact mRNA stability and translation efficiency. Like eukaryotes, retroviruses need two cis-acting elements (together known as the core polyA site) for transcription termination and polyadenylation, including an almost invariant AAUAAA sequence, placed 10–30 nucleotides upstream of the cleavage site, and a more variant GU/U-rich sequence positioned immediately downstream [240]. In some retroviruses, the AAUAAA sequence is placed in the U3 region, while in all lentiviruses, including HIV-1, it is present in the R region [241].
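The spacing rule for the core polyA site can be sketched as a simple sequence scan. The following is a minimal illustration only: the function name `candidate_cleavage_windows` is hypothetical, the 10–30 nucleotide distance is assumed to be measured from the 3′ end of the AAUAAA hexamer, and the downstream GU/U-rich element is not evaluated.

```python
import re
from typing import List, Tuple

HEXAMER = "AAUAAA"


def candidate_cleavage_windows(
    rna: str, min_gap: int = 10, max_gap: int = 30
) -> List[Tuple[int, int, int]]:
    """Locate AAUAAA hexamers and the window where cleavage would be expected.

    Returns one (hexamer_start, window_start, window_end) tuple per hexamer,
    using 0-based positions with an exclusive window end, clipped to the
    sequence length. DNA input is accepted and converted (T -> U).
    Assumption: the gap is counted from the 3' end of the hexamer.
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    # A lookahead is used so that overlapping hexamers are all reported.
    for match in re.finditer(f"(?={HEXAMER})", rna):
        hex_end = match.start() + len(HEXAMER)
        window_start = min(hex_end + min_gap, len(rna))
        window_end = min(hex_end + max_gap + 1, len(rna))
        hits.append((match.start(), window_start, window_end))
    return hits


if __name__ == "__main__":
    toy_rna = "GG" + HEXAMER + "C" * 40  # hypothetical sequence for illustration
    print(candidate_cleavage_windows(toy_rna))
```

An empty window (start equal to end) indicates that the sequence ends before the expected cleavage region is reached.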

HIV-1, in common with all retroviruses, has two LTRs (U3-R-U5), which are identical in nucleotide sequence and organization. Transcription starts at the U3-R junction, and polyadenylation occurs at the R-U5 junction. Therefore, a transcript initiated at the +1 nucleotide in the 5′ LTR would be expected to be polyadenylated at the end of the R sequence in the 5′ LTR itself, resulting in a truncated viral RNA [240]. To prevent premature termination and polyadenylation, the promoter-proximal polyA site in the 5′ LTR must be suppressed, while the promoter-distal polyA site in the 3′ LTR has to be selectively and efficiently used to polyadenylate all resulting viral RNAs [241]. Lentiviruses have therefore evolved additional sequences that ensure efficient transcriptional termination and prevent read-through into cellular genes. In SIN vectors, most of the U3 region is deleted from the LTRs to enhance safety. However, in the case of HIV, this deletion results in leaky polyadenylation, suggesting that U3 contains termination enhancer motifs or USEs. These USEs are U-rich sequences placed upstream of the AAUAAA sequence that play a critical role in selecting the 3′ polyA site in preference to an identical polyA site at the 5′ end of the mRNA [242, 243]. Efficient 3′-end processing of HIV-1 has been shown to depend on a stem-loop structure that places the USE and the core polyA site close together [244].

Schambach et al. incorporated seven different USEs derived from various viral or cellular genes into SIN LV vectors to identify the one that best improved viral titer and gene expression. They observed that the USE derived from simian virus 40 (SV40) late mRNA, especially when duplicated (2xSV), provided the best results, improving viral titer (up to threefold) and gene expression (by 45–100%) [244]. Interestingly, the relatively small 2xSV USE (100 bp) was nearly as potent as the WPRE (600 bp) in enhancing viral titer and transgene expression. However, the magnitude of these effects depended on other factors, such as the promoter and target cell type. They also found that the 2xSV USE was superior to the WPRE in suppressing transcriptional read-through, thus improving vector efficiency and biosafety [244]. Hager et al. investigated whether including a strong polyA signal as part of an internal transcription unit could overcome problems of insufficient termination. They found that an internal polyA signal increases transgene expression to levels comparable to those typically observed in the presence of the WPRE but decreases viral titer in a promoter-dependent manner [245]. An expression cassette combining a shortened WPRE (247 bp), an SV40 late USE, and a full-length SV40 late polyA signal has been shown to drive higher levels of gene expression than vectors containing each element alone [246] (Fig. 6).

WPRE

The WPRE, a post-transcriptional regulatory element (PRE) of the woodchuck hepatitis virus (WHV), is primarily known as an RNA export element that facilitates the cytoplasmic accumulation of surface antigen transcripts from the intronless hepadnavirus genome [247]. The WPRE can substantially enhance viral titers and transgene expression from various RNA and DNA viral vectors [248,249,250] when placed in the sense orientation in the 3′-untranslated region (3′-UTR), upstream of the polyadenylation signal [251]. Xu et al. observed that inserting the WPRE into viral vectors could increase reporter gene expression up to 7-fold in vitro and up to 50-fold in vivo [250]. The positive impact of the WPRE on gene expression is not due to an increased rate of transcription, a longer viral mRNA half-life, or enhanced nuclear export; instead, the WPRE acts post-transcriptionally and increases the efficiency of mRNA 3′-end processing [248, 251]. Studies have demonstrated that oncoretroviruses, but not lentiviruses, usually display high transcriptional readthrough activity in the 3′ LTR [252]. This reduced 3′ termination efficiency may cause RNA instability during transgene expression and also raises safety concerns regarding the increased risk of activating or capturing downstream cellular oncogenes [252]. SIN LV vectors, which lack most of the U3 region, also suffer from leaky transcription termination and exhibit high transcriptional readthrough activity, suggesting that additional termination signals are present within the U3 sequence [253]. Further studies showed that, besides the USE, two additional elements in U3, the transcriptional control region and the nuclear factor of activated T cells/upstream stimulatory factor (NFAT/USF) binding region, contribute significantly to lentiviral LTR transcriptional termination (Fig. 7).
Restoration of the transcriptional control region alone reduces readthrough by 70–80%, while insertion of the NFAT/USF binding region reduces RNA readthrough to a level even lower than that of the wild-type LTR [254]. However, instead of restoring U3, which raises safety issues, genetic elements such as the WPRE can be inserted into SIN LV vectors to reduce readthrough without causing any adverse effects.

Fig. 7

A schematic picture of the LTR and the termination signals present within the U3 region. Besides the USE, two additional elements in U3, the transcriptional control region and the nuclear factor of activated T cells/upstream stimulatory factor (NFAT/USF) binding region, contribute significantly to lentiviral LTR transcriptional termination.

Like the WPRE, the PRE found in the human hepatitis B virus (HBV), termed HPRE, can increase the level of gene expression; however, studies showed that the WPRE is two to three times more potent [247]. The WPRE (600 bp) and HPRE (553 bp) share two homologous subelements, called α and β, while the WPRE contains an additional subelement named γ [247]. Both the WPRE and HPRE contain AU-rich motifs in the β-loop that resemble the USE core sequence. The WPRE also possesses a second USE core sequence-like motif in the γ-loop, which may explain its greater potency in transcript termination and the higher total viral mRNA levels relative to HPRE [251]. Notably, Choi et al. found that inserting a shortened WPRE sequence (247 bp; approximately 41.2% of the original length) into adeno-associated virus (AAV) vectors yields a level of transgene expression comparable to that of the wild-type WPRE while leaving more space for larger transgenes [246].
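The size figures quoted in this section can be checked with simple arithmetic. The sketch below uses only the element lengths stated in the text (WPRE 600 bp, HPRE 553 bp, shortened WPRE 247 bp, 2xSV USE 100 bp); the dictionary and function names are hypothetical, and no overall vector capacity is assumed.

```python
# Element lengths in base pairs, as quoted in the text.
ELEMENT_SIZES_BP = {
    "WPRE": 600,
    "HPRE": 553,
    "short WPRE": 247,
    "2xSV USE": 100,
}


def relative_size_percent(element, reference="WPRE"):
    """Length of an element as a percentage of the reference element."""
    return 100 * ELEMENT_SIZES_BP[element] / ELEMENT_SIZES_BP[reference]


def freed_space_bp(element, reference="WPRE"):
    """Base pairs made available by using an element instead of the reference."""
    return ELEMENT_SIZES_BP[reference] - ELEMENT_SIZES_BP[element]


print(round(relative_size_percent("short WPRE"), 1))  # 41.2
print(freed_space_bp("short WPRE"))                   # 353
print(freed_space_bp("2xSV USE"))                     # 500
```

Replacing the full-length WPRE with the shortened version thus frees 353 bp of cargo space, and replacing it with the 2xSV USE frees 500 bp.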

Conclusion and future perspectives

The optimal design of a CAR construct requires a comprehensive understanding of the features of each component, both alone and in combination with the others. As CAR constructs become more complex and more elements come into play, a deeper understanding of the impact of distinct domains should improve the rational design of CAR-T cells to fit the specific needs of individual patients. However, even an optimized CAR may not overcome all the obstacles presented by the complex nature of tumors. Furthermore, although viral vectors, particularly lentiviral vectors, are highly efficient for CAR-T cell production, several features discourage their clinical use and support nonviral approaches [255]. The possibility of insertional mutagenesis caused by the random integration of viral DNA into the host genome, the limited cargo capacity, and the high manufacturing complexity and cost are drawbacks that have prompted researchers to look for alternatives. Transposable elements (transposons) are the most common alternatives to viral vectors and have vast potential for diverse applications in genetic engineering, including CAR-T cell therapy: being plasmid-based, they are easier and less expensive to manufacture, and they offer a larger cargo capacity for delivering multiple transgenes. Various transposon-based systems, including Sleeping Beauty (SB) [256], piggyBac (PB) [257], and Tol2 [258], have been reported for CAR-T cell production, as these systems provide safe and reliable DNA transfer into T cells. Like virus-based CAR-T cells, clinical applications of transposon-engineered CAR-T cells have mainly focused on blood malignancies and the CD19 antigen [259,260,261,262]. Kebriaei et al. (2016) reported the first human application of the SB system in 26 patients with advanced NHL or ALL. 
All patients received a single dose of patient- or donor-derived CD19-specific CAR-T cells generated with SB in phase I trials, in the adjuvant setting following autologous (NCT00968760) or allogeneic (NCT01497184) hematopoietic stem cell transplantation (HSCT). The authors used high-throughput sequencing to analyze CAR integration patterns in T cells genetically modified with SB transposon/transposase plasmids and observed that integrations were widely distributed throughout the genome with no bias [259]. For patients who received CAR-T cells after autologous HSCT (n = 7), the 30-month progression-free survival (PFS) and overall survival (OS) rates were around 83% and 100%, respectively. For those who received CAR-T cells after allogeneic HSCT (n = 19), the respective 12-month rates were 53% and 63% [259]. Recently, Li et al. used piggyBac-produced CD19-specific CAR-T cells to treat a male patient with triple-hit R/R DLBCL carrying a TP53 mutation (ChiCTR1800018111). Triple-hit lymphoma (THL), a relatively rare subset identified in approximately 1% of DLBCL cases, carries concurrent MYC, BCL2, and BCL6 rearrangements. Despite grade 2 cytokine release syndrome, the patient achieved CR two months after CAR-T cell infusion and has remained in CR for over 24 months [262]. Recent studies on transposon-mediated CAR-T cell therapy have mainly focused on two areas: (1) developing novel transposon systems to deliver multiple transgenes, such as the hyperactive TcBuster, which was originally isolated from the red flour beetle and has shown transposition efficiency comparable to that of SB and PB [263]; and (2) designing novel platforms to generate universal allogeneic CAR-T cells, such as the CRISPR-Cas9 ribonucleoprotein (RNP)-minicircle (mc) SB transposon platform optimized by Tipanee et al. to express a CD19-specific CAR while inactivating allogeneic TCRs [264]. 
Although viral and nonviral techniques each have benefits and drawbacks, the open question is whether nonviral methods can meet these challenges and take CAR technology beyond hematological malignancies.