A novel supra coarse-grained model for cellulose

Cellulose being the most widely available biopolymer on Earth is attracting significant interest from the industry and research communities. While molecular simulations can be used to understand fundamental aspects of cellulose nanocrystal self-assembly, a model that can perform on the experimental scale is currently missing. In our study we develop a supra coarse-grained (sCG) model of cellulose nanocrystal which aims to bridge the gap between molecular simulations and experiments. The sCG model is based on atomistic molecular dynamics simulations and it is developed with the force-matching coarse-graining procedure. The validity of the model is shown through comparison with experimental and simulation results of the elastic modulus, self-diffusion coefficients and cellulose fiber twisting angle. We also present two representative case studies, self-assembly of nanocrystal during solvent evaporation and simulation of a chiral nematic phase ordering. Finally, we discuss possible future applications for our model.


Introduction
Cellulose is the most abundant biopolymer on Earth, and has been gaining growing interest from the research community and industry in the recent years. The main sources of cellulose are wood, plants and bacteria. These sources produce cellulose fibers with different cross sections and lengths (Habibi et al. 2010;Moon et al. 2011;Trache et al. 2017. The cellulose fibers are broken down into cellulose nanofibril (CNF) and cellulose nanocrystal (CNC), where the latter are often referred to ''whiskers''. CNC are the product of acid hydrolysis of CNF, which removes the amorphous (disordered) regions of CNF (Kontturi et al. 2018). The combination of wide availability, bio-compatibility, mechanical strength and possibility for surface chemical modifications make cellulose (both CNC and CNF) promising material for numerous applications, e.g. electrically conducting materials, hydrogels and aerogels, nanocellulose films and membranes and biomedical applications, to mention a few (Kontturi et al. 2018;Berglund and Burgert 2018;Dufresne 2013;Tang et al. 2017;Eyley and Thielemans 2014;Chen et al. 2018;Sharma et al. 2019;Bayer et al. 2016). Cellulose often serves as a template scaffold in the aforementioned applications and is usually combined with other materials to achieve the desired functionality and thus forms functional composite materials. These materials often possess complex morphologies and therefore understanding and tuning the interactions between the different components can modulate and improve their properties.
The interactions and self-assembly of CNC or CNF have been subject of numerous experimental studies (Kontturi et al. 2018;Lagerwall et al. 2014;Ding et al. 2014;Orts et al. 1998;Benselfelt et al. 2019;Lombardo et al. 2019;Honorato-Rios et al. 2018). Depending on the experimental conditions, surface modifications, additives, etc. CNC and CNF can assemble into variety of different configurations (Kontturi et al. 2018;Lagerwall et al. 2014;Ding et al. 2014;Orts et al. 1998;Benselfelt et al. 2019;Lombardo et al. 2019;Honorato-Rios et al. 2018). CNC can also self-assemble into chiral nematic phases which possess interesting photonic properties and may have applications as optical material (Schütz et al. 2015;Surov et al. 2017;Liu et al. 2019Liu et al. , 2018Pospisil et al. 2018;Bertsch et al. 2019;Parker et al. 2016;Yao et al. 2017). The reflected wavelengths of the chiral nematic phases are depended of the pitch (p) of the self-assembled CNC. The pitch can be controlled in many ways either chemically or physically Frka-Petesic et al. 2017;Nguyen et al. 2013;Parker et al. 2018a). Although, there are plentiful experiments studying this phenomenon, to the best of our knowledge there are no molecular simulations that can provide valuable insight for the self-organization and properties of CNC into chiral nematic phases. This is mainly due to the large time and spatial scales encountered in these systems compared to the systems usually studied by molecular simulations.
Cellulose has been the subject of many molecular simulation studies. Some of these studies focus only at the surface phenomena and modifications of CNC, (Chen et al. 2017;Zhou et al. 2015;Bergenstråhle et al. 2008;Paajanen et al. 2016;Chundawat et al. 2011) while many others simulate whole CNC, both at atomistic (Chen et al. 2017;Zhou et al. 2015;Bergenstråhle et al. 2008;Paajanen et al. 2016;Chundawat et al. 2011;Matthews et al. 2006;Conley et al. 2016;Oehme et al. 2015;Djahedi et al. 2015;Paajanen et al. 2019;Ciesielski et al. 2019) and coarsegrained representations (López et al. 2009;Wohlert and Berglund 2011;López et al. 2015;Mehandzhiyski and Zozoulenko 2019;Glass et al. 2012;Fan and Maranas 2015;Markutsya et al. 2013;Shishehbor et al. 2018;Shishehbor and Zavattieri 2019;Ramezani and Golchinfar 2019;Srinivas et al. 2011;Poma et al. 2015Poma et al. , 2016Poma et al. , 2017. All-atom molecular dynamics (AA-MD) studies are able to simulate the crystal structure of cellulose in a perfect agreement with the experimental crystal structure of different cellulose allomorphs and provide a detailed atomistic picture for the structure and dynamics of CNC (Chundawat et al. 2011). However, AA-MD simulations are way too expensive to study the interactions and aggregation of large CNC in the order of 100-200 nm, which are typical lengths of CNC observed experimentally. Coarse-grained MD (CG-MD) simulations on the other hand are able to significantly reduce the degrees of freedom in molecular simulations and therefore to explore larger systems at a much longer time scale compared to AA-MD. Several CG models for CNC have been developed through the years. They have different degree of mapping (grouping the atoms from the atomistic model in one interaction bead) varying from 4 heavy atoms mapped into one CG bead, in the case of the most popular CG force field, MARTINI, (López et al. 2009;Wohlert and Berglund 2011;López et al. 2015;Mehandzhiyski and Zozoulenko 2019) to 1 or even 2 glucose units represented by 1 CG bead (Glass et al. 2012;Fan and Maranas 2015;Markutsya et al. 2013;Shishehbor et al. 2018; Shishehbor and Zavattieri 2019; Ramezani and Golchinfar 2019). These approaches help to decrease the computational cost of the simulations and enable one to explore much larger systems, but sacrifices the fine atomistic details presented in the AA models. Therefore, it is of the crucial importance for the CG models to be carefully parameterized and to reproduce key aspects of CNC structure and interactions. Most commonly the parameterization of CG models is based of AA-MD, which is shown to reproduce experiments very well (Zhou et al. 2015;Chundawat et al. 2011;Oehme et al. 2015;Paajanen et al. 2019).
In the present study, we introduce a new supra coarse-grained model for the simulation of large CNC systems, spanning into the micrometer and microsecond scales. The paper is organized as follows: first, the supra coarse-grained representation and the parametrization based on the force matching procedure are presented in the next section; the validation of the model and application of the model to the selfassembly of CNC are demonstrated in section three; finally some possible applications are discussed and conclusions are presented in the last section.

Methods and models
Supra coarse-grained representation Constructing a coarse-grained model begins with choosing the mapping scheme for a specific molecule. We already mentioned in the introduction that CNC were modeled with a different degree of mapping, varying from four heavy atoms in one CG bead to as much as 1 or 2 glucose units in one CG bead, in the most spatially reduced models. We would like to be able to simulate CNC, their interactions and selfassembly, on a mesoscopic scale (% 100 nm-1 lm) and also on a longer time scale than a typical time scale of AA-MD which is on the order of hundreds of ns. These length-and time-scales are beyond the reach of already existing CG cellulose models.
Therefore, we have chosen to build a CG model with even more reduced resolution than the existing ones. The new CNC supra coarse-grained (sCG) model is based on the 36 chains model of cellulose Ib arranged in hexagonal shape, which has two surfaces (110 and 200) and is presented in Fig. 1. It should be here noted that we chose to model the Ib allomorph, because this is the dominant allomorphic structure of cellulose in higher plants. Our model can be easily adapted to the Ia allomorph, because all the beads and form of the potential functions remain the same. A new parametrization has to be done using e.g. the force-matching coarse graining procedure as described in this paper. The 36 chain model of cellulose Ib CNC has dimensions of 3Â5 nm and represents a wood derived CNC, which has a typical width of 3-5 nm (Habibi et al. 2010;Lagerwall et al. 2014). In our model, we have mapped a part of the CNC cross-section in one CG bead, as shown in Fig. 1. The cross-section of the CNC is divided into surface and inner region, which are represented by the O/I and C bead in Fig. 1, respectively. The I bead represents 110 surface and the O bead-the 200 surface, which are often referred to hydrophilic and hydrophobic, respectively. Both of these two type of beads (I, O) group 20 glucose units,-5 glucose chains in the crosssection and 4 glucose units in the axial direction of the crystal. The third bead type, C, groups the interior of the CNC (which corresponds to the remaining 24 glucose units-6 and 4 glucose chains in the crosssectional and axial directions, respectively). Thus, the overall shape and the specific surfaces of the CNC are preserved by the sCG model and are represented with the minimal amount of CG beads. It should be mentioned that a similar philosophy of coarse-graining, called ultra-coarse-graining, has been applied earlier, where a region of a protein structure or large amphiphilic molecules has been represented by one CG bead (Dama et al. 2013;Mehandzhiyski and Grimes 2016). Figure 1, depicts the cellulose chains mapped into the respective CG beads and shows both schematically and with simulation screenshots the sCG CNC model. The next two sections present in details the parameterization procedure of the model.

Force matching coarse-graining -bonded interactions
Different coarse-graining techniques exist which can be utilized to derive and parameterize a CG model (Noid 2013;Rühle et al. 2009). We parameterize our model with the force matching coarse-graining method, (Izvekov and Voth 2005b, a) as implemented in the VOTCA package (Rühle et al. 2009). In general, the force-matching method (often referred to multiscale coarse-graining) is trying to match the forces acting on the respective CG bead to some reference forces obtained from AA-MD or ab initio simulations. This matching is achieved in the least-squared manner by minimizing the objective function equivalent to: where F ref il and F p il are the reference and predicted force acting on the ith atom in the lth configuration, respectively, and depends on the set of M parameters; N is the number of atoms in the AA configuration and L ¼ 5000 is the total total number of atomic configurations used in the fit (Izvekov and Voth 2005b, a).
In order to properly capture the physical properties CNC structure at such reduced resolution, we have fitted 3 bonds, 6 bending angles, 2 dihedral angles and 1 improper dihedral. It should be noted here that one bond (CI- Fig. 2b) and one angle (ICO) possess two distinctive distributions and therefore they can not be fitted with a single harmonic potential. Therefore, in practice these interactions are represented with two terms each (CI-1, CI-2) and (ICO-1, ICO-2), see Table S1 in the Supporting Information, such that the total number of bonds and angles are 4 and 7, respectively. More detailed information of the different bonds and angles in the model can be found in the Supporting Information. The full potential function can be expressed as a sum of the bonded and nonbonded terms: VðrÞ nonÀbonded ¼ 4 ij r ij r ij where k's are the respective force constants, r is distance between two atoms, H is the angle between three beads, / is the dihedral angle, n is the multiplicity of the dihedral angle, n is the improper dihedral, and r are the Lennard-Jones (LJ) potential parameters. The non-bonded interactions and parameters are discussed in more details in the next section. The multiplicity of the dihedral angles in our model equals n ¼ 1 and therefore produces one population in the probability distribution presented in Fig. 4e, f). The force matching procedure in VOTCA produces tables with the respective potentials which can be directly used in GROMACS (Abraham et al. 2015). Alternatively, one can fit the force field parameters to reproduce the VOTCA obtained tables and to use the standard GROMACS files. We have chosen to perform the latter and all the force field parameters can be found in the Supporting Information. We have presented in Fig. 2

Umbrella sampling: non-bonded interactions
The non-bonded interactions in our model are fitted to reproduce the potential of mean forces (PMF) obtained from the AA-MD simulations. The atomistic PMF are obtained from umbrella sampling simulations which will be reported elsewhere and are similar to those reported by Paajanen et al. (2016), where two fibrils were simulated in water. It should be noted that the atomistic PMF are obtained for CNC with crosssection of 18 chains and they are normalized per unit surface area (projected surface area) of CNC. This normalization allows us to fit the non-bonded interactions of our sCG model, even though we build the model with 36 chains. The procedure for the umbrella sampling simulations is as follows: first the two CNC are brought together, then a pulling force is applied to the center of mass of one of the CNC while the other CNC is kept restrained at its original place. By pulling the second CNC we generate a series of configurations which are separated by 0.1 nm and then each configuration was run for 5 ns. Finally, to obtain the PMF we use the Weighted Histogram Analysis Method (WHAM) implemented in GROMACS (Hub et al. 2010). For a more detailed discussion of the umbrella sampling simulations please see (Paajanen et al. 2016). The results from the AA and CG-MD are presented in Fig. 3, together with representative simulation snapshots of the two models. In our model we use the Lennard-Jones 12-6 potential for the nonbonded interactions and therefore we fitted the r and parameters in the LJ potential, which are given in the Supporting Information. The CNC possesses two different types of surfaces and therefore we have calculated the interactions between the 200-200 (Fig. 3a), 110-110 (Fig. 3b) and 200-110 (Fig. 3c) surfaces. The non-bonded interactions between the central bead (C) and the surface beads (I/O) in the same fibril are excluded, since they are connected by bonds and the non-bonded interactions between different fibrils with these beads are beyond the cut-off distance used in the simulations. The PMF obtained from the sCG model are in a very good agreement with those obtained with the atomistic model. Both the magnitude of the interactions and the equilibrium distance are well represented by the sCG model, as seen from Fig. 3. However, from Fig. 3c, it can be seen that the strength of 200-110 interaction is overestimated in the sCG model. This is related to a particular problem arising during parameterization of this interaction, where the unconstrained CNC always rotate out of the 200-110 configuration, such that two CNC adopt either the 200-200 or 110-110 configuration. To overcome this problem we also constrained the second CNC during the CG umbrella sampling, thus not allowing this rotation to happen. This most likely resulted in a stronger interaction compared to the AA-MD. Nevertheless, the 200-110 interactions are still weaker than the other two types of interactions for both the sCG and the AA models; note that the equilibrium distance between the fibers is well represented in the sCG model.

Simulation details
The simulations were carried out with the GROMACS 2019 simulation package (Abraham et al. 2015). The sCG model is parameterized against AA-MD in water. Note that a similar procedure was used by Izvekov and Voth (2009)  is shown to reproduce energetics and dynamics of the system and an obvious strength of the CG model is that the solvent is not considered and it is possible to simulate large systems going from nm scale to submicrometer/micrometer scales. All the CG-MD simulations are performed in an implicit solvent, and the equations of motion are integrated with the stochastic dynamic integrator with the friction in the Langevin equation set with a time constant of s t ¼ 4:0 ps. The time step Dt was set to 40 fs and the temperature to 300 K, unless specified otherwise. The VdW cutoff and the neighbor list were set to 3.2 nm. The simulation snapshots were prepared with VMD and pymol (Humphrey et al. 1996;Schrödinger 2015).

Results and discussion
In this section we validate our model against both experimental data and atomistic simulations. Then we discuss possible applications for the model and present several examples where our model can be used.

Twisting of a single fibril
The right-hand twist of cellulose was observed and discussed in many experimental and simulation studies (Matthews et al. 2006;Conley et al. 2016;Paajanen et al. 2019;Paavilainen et al. 2011;Zhao et al. 2013;Hadden et al. 2013;Hanley et al. 1997;Arcari et al. 2019). Although the twist can be clearly seen in TEM (Conley et al. 2016) and AFM (Hanley et al. 1997;Arcari et al. 2019) pictures, it is difficult to deduce exactly how much is the twist per length of fibril. In this respect molecular simulations can be a valuable tool to determine the twist of a CNC. Generally, the twist of the CNC depends on the diameter i.e. the number of constituent chains forming the CNC (Matthews et al. 2006). Zhao et al. (2013) showed that the twisting angle varies between 9.9 and 1:3 =nm for CNC with cross-sections between 2 Â 2 chains and 6 Â 6 chains, respectively. Two earlier MD studies found a twist around 1:4 =nm (Matthews et al. 2006;Zhao et al. 2013). We calculated the twisting angle for the sCG model as a function of the simulation time and presented it together with a simulation snapshot in Fig. 4a. As Fig. 4a shows, the CNC quickly twists in the right-hand direction in the first 10 ns and remains stable afterwards with a twisting angle around 1:5 =nm. The twist of the fibril in our model is described by the dihedral angles used in it. This value is in a very good agreement with the AA-MD results discussed above.

Diffusion coefficients
We calculated the translational diffusion coefficients as a function of the CNC length and presented the results in Fig. 4b and Table S3 in the Supporting Information. The CNC length is varied between 50 and 1200 nm. 20 CNC were placed randomly in a simulation box where the initial separation distance between CNC was at least 100 nm, thus the CNC did not interact with each other. Note that typical experimental conditions correspond to a solution at an infinite dilution. Hence, our simulation box was sufficiently large to prevent the interactions and aggregation between CNC. The concentration of CNC in these simulations is very low (( 1 %) and therefore corresponds to a very diluted solution. The simulations were run for 1 ls each and the diffusion coefficients D is calculated in a standard way from the mean-square displacement hDr 2 ðtÞi and the Einstein relation D ¼ hDr 2 ðtÞi=6t. The time-dependence of the mean-square displacements for all the systems is shown in the Supporting Information. The calculated diffusion coefficients of CNC show an exponential decay with the increase of the length and vary between 17 and % 1 Â 10 À12 m 2 =s for lengths between 50 and 1200 nm, respectively. Together with the calculated diffusion coefficients, we present several experimental values for CNC in Fig. 4b (Khouri et al. 2014;Lima et al. 2003;Mao et al. 2017;Boluk and Danumah 2014). It can be seen that the calculated values reproduce the experimental results very well for a wide range of CNC lengths. Although the different experiments often have CNC from different sources that might have different crosssection, the agreement of our model with the experimental values is remarkably good. It should be also noted that the diffusion coefficients for the same CNC length obtained from different experiments differ from each other as well.

Elastic modulus of a single fibril
We have calculated the axial elastic modulus (E) for the sCG model by pulling one end of the CNC while keeping the other end fixed (Glass et al. 2012;Shishehbor et al. 2018;Ramezani and Golchinfar 2019). Hereby we obtained the stress-strain curve (Fig. 4c) and by a linear fit to it, we obtain E ¼ 110 GPa. Values in the literature, both experimental and theoretical estimates, vary in the range of 93-270 GPa, (Glass et al. 2012;Tanaka and Iwata 2006;Nishino et al. 1995;Wu et al. 2013Wu et al. , 2014 and the calculated value for our model fells into this range. It should be noted that the elastic modulus in our model is entirely depended on the C-C bond, thus on the bonded interactions. All-atom MD simulations show that E depends around 75% on the bonded interactions and the rest 25% on the non-bonded interactions -VdW, electrostatics and hydrogen bonds (Wohlert et al. 2012). This means that our sCG bonded interactions effectively includes contribution from both bonded and non-bonded AA interactions.

Evaporation simulations
Cellulose fibrils and CNC can self-assemble into various structures to produce materials with different properties and applications (Kontturi et al. 2018). Some of these materials, which find broad applications in various fields can be easily prepared by evaporating the solvent in cellulose dispersions or by freeze drying (Bayer et al. 2016;Honorato-Rios et al. 2018;Schütz et al. 2015;Zhang et al. 2019;France et al. 2017).
During the evaporation and depending on the fabrication conditions CNC can form a percolating network which manifests itself in the form of hydrogels and membranes (Kontturi et al. 2018;Bayer et al. 2016;France et al. 2017). These materials are usually selfstanding, porous and contain large amount of water. In the current section, we present ''water'' evaporation simulations prepared with our sCG model. Since in our approach we use an implicit solvent MD simulation, the solvent evaporation can not be done in the usual way as in the explicit solvent MD, where solvent molecules are removed individually (Mehandzhiyski and Zozoulenko 2019). For our simulations the system was prepared by randomly placing 200 CNC with length 100 nm in a simulation box with the periodic boundary condition with a side of 300 nm, shown in Fig. 5a. The system is then simulated for 1 ls in the NVT ensemble at T ¼ 300 K. We calculated the volume fraction (/) of CNC to be 1 vol% for the initial system. After 1 ls, we perform a simulation in the NPT ensemble for 10 ns with the isotropic pressure P ¼ 2 bar at T ¼ 300 K. This is the evaporation step in our simulation in which the system changes its volume and the CNC are brought closer together due to the removal of solvent and thus enhancing the interaction and increasing the number of contacts between them. Due to the implicit MD simulation, the effect of water evaporation can only be modeled by shrinking the simulation box size. We used pressure of 2 bar in order to accelerate the shrinkage of the box and to speed up the simulations. After the evaporation step, the system is simulated in NVT ensemble for 1 ls, followed again by an evaporation step. This procedure is repeated  Fig. 4e. The snapshots reveal that the CNC form highly percolated cross-linked network throughout the box. The system is in a kinetically arrested state, which represents a macroscopic gel (Honorato-Rios et al. 2018). We further characterize the final system (/ ¼ 4:5 vol%) by calculating its elastic modulus. To simulate the stress-strain dependencies, we applied uniaxial deformation of the periodic simulation box in the positive x, y and z axis (Lyulin et al. 2004;Nazarychev et al. 2016). This was achieved by applying a constant rate of deformation to the box in the respective direction with two different rates, 2 Â 10 À4 nm=ps and 1 Â 10 À3 nm=ps. In the direction of the applied strain the compressibility of the box was set to zero (incompressible), and in the transverse directions it was set to 4:5 Â 10 À5 bar À1 and the pressure to P ¼ 1 bar. During the simulations we calculated the pressure tensor P i (i= x, y, z) and monitored the system size in the direction of the deformation L i . The stress-strain characteristics are then obtained from: where L 0i is the initial box size before the deformation. The elastic modulus E is obtained from a linear fit to the stress-strain curve in the initial linear elastic regime. E has been averaged over three deformation directions i = x, y, z. Figure 6 depicts the box prior and after deformation in the y-direction. We have obtained 7.2 MPa and 12.7 MPa for deformation rates of 2 Â 10 À4 nm=ps and 1 Â 10 À3 nm=ps, respectively. These values are in the range of the reported elastic modulus of cellulose hydrogels, 1-50 MPa (France et al. 2017;  Nakayama et al. 2004;Ye et al. 2018;Gao et al. 2015). Although, it is often not straightforward to directly compare the simulation with the experimental elastic modulus results, it is a good indication that our results fell in the experimental range for E and that our model is a good representation of a physical gel.

Chiral nematic phase simulations
The ordering of CNC into chiral nematic phases has been a topic of numerous studies and might be used in applications such as optical encryption, chiral templates or optical sensors (Schütz et al. 2015;Surov et al. 2017;Liu et al. 2019Liu et al. , 2018Pospisil et al. 2018;Bertsch et al. 2019;Parker et al. 2016;Yao et al. 2017;Park et al. 2014;Frka-Petesic et al. 2017;Nguyen et al. 2013;Parker et al. 2018a;Zhang et al. 2012;Shopsowitz et al. 2010;Kamita et al. 2016;Parker et al. 2018b). One of the main characteristics of CNC chiral nematic phases is their pitch (p), where often the half pitch (p/2) is reported. CNC chiral nematic phases can form during solvent evaporation, by applying a shear force or by a geometry confinement of CNC. The pitch can be experimentally measured and usually is of the order of two to tens of micrometers, but frequently it can be as small as hundreds of nanometers ). Parker et al. have also discussed that the optical resolution limit is 2 lm, which can make it difficult to quantify the exact pitch size if it is less than 2 lm (Parker et al. 2016). The concentration at which the chiral nematic phases occur has been determined and discussed in the literature as an essential factor for understanding the self-assembly of CNC. However, the self-assembly of CNC into chiral nematic phases is still poorly understood Parker et al. 2016). Here, we apply the sCG model to simulate a chiral nematic ordering of CNC. The CNC dispersion needs hours to undergo a transition from an isotropic to anisotropic phase  which is beyond the time scales of any MD simulations. Therefore, we had to accelerate the dynamics and formation of the chiral nematic phase. To this end we construct an initial system as depicted in Fig. 7a. First, 10 CNC are placed in a row with their 200 surfaces pointing at each other and run for 40 ns in the NVT ensemble in a large periodic box. After the initial 40 ns we observed that the CNC spontaneously form a twist. Finally, this twisted structure is duplicated 18 times, rotated and put next to each other as shown in Fig. 7a. As it can be seen from the inset in Fig. 7a that the individual subset of 10 CNC are placed at a certain distance so they do not overlap. After that this system is run for 1 ls at 300 K. Simulation snapshots of the final structure are presented in Fig. 7b. The half-pitch (p/2) is around 500 nm and the twist angle is H ¼ 171 . It should be noted that in our simulations we use native cellulose without surface modifications while in the experiments there are always sulfonate or carboxylate groups on the CNC surface. The presence of the surface groups will therefore increase the equilibrium surface-surface distance and respectively the observed pitch will be larger. However, the obtained structure represents very well the expected chiral nematic ordering of CNC and its left-handed helix. It should be noted here that the twist of a single fiber is right-handed, but the chiral nematic phase possess left-handed helix (Parker et al. 2018b). The twisting of the individual CNC in our simulations was also observed in the chiral nematic phase composed of a bundle of cellulose fibrils. However, the degree of twisting is lower than the twisting of the Fig. 6 Simulation snapshots of the system with / ¼ 4:5 vol% prior to deformation and after applying deformation in the y-axis with the rate of 1 Â 10 À3 nm=ps single CNC, presented in Section 3.1.1. This can be explained as a competition between a tendency to twist, and an efficient packing of the fibrils in the chiral nematic phase.

Conclusions
In the present paper, we developed a novel supra coarse-grained model for molecular dynamics simulations of cellulose nanocrystal on large time and spatial scales (microsecond-and sub-micrometer/ micrometer scales respectively). The model is based on all-atom MD simulations and is parameterized with the help of the force-matching coarse-graining and umbrella sampling simulations. We demonstrated that our model reproduced sufficiently well experimental and simulation results for CNC twisting angle, selfdiffusion coefficients and axial elastic modulus. Additionally, we carried out simulations that represents the self-assembly of CNC during water evaporation and the formation of a cross-linked network, which closely resembles the formation of a cellulose hydrogel. Finally, a simplified and idealistic representation of a chiral nematic phase ordering was simulated and is shown to be a stable configuration of CNC. Thus, our model is versatile and can be applied to study different CNC self-assembled structures on scales comparable with the experiments, which was not possible with the existing AA and CG CNC models. The sCG model can be re-parameterized to represent CNC with other shapes, number of constituent cellulose chains, inclusion of amorphous regions and various surface functional groups. The latter is particularly important since in most of the experiments CNC possess surface functional groups like sulfonate or carboxylate with a different degree of substitution. We plan to include these functional groups in our future studies, as well as to use this model to study in more details the formation and characteristics of chiral nematic phases and gelation observed in cellulose systems. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.