Martini 3 model of surface modified cellulose nanocrystals: investigation of aqueous colloidal stability

The Martini coarse-grained force field is one of the most popular coarse-grained models for molecular dynamics (MD) modelling in biology, chemistry, and material science. Recently, a new force field version, Martini 3, had been reported with improved interaction balance and many new bead types. Here, we present a new cellulose nanocrystal (CNC) model based on Martini 3. The calculated CNC structures, lattice parameters, and mechanical properties reproduce experimental measurements well and provide an improvement over previous CNC models. Then, surface modifications with COO− groups and interactions with Na+ ions were fitted based on the atomistic MD results to reproduce the interactions between surface-modified CNCs. Finally, the colloidal stability and dispersion properties were studied with varied NaCl concentrations and a good agreement with experimental results was found. Our work brings new progress toward CNC modelling to describe different surface modifications and colloidal solutions that were not available in previous coarse-grained models.


Introduction
Cellulose is an ageless and abundant biopolymer (Klemm et al. 2005), meeting renewable and sustainable requirements for innovation and development. Cellulose consists of glucose monomers chemically bonded through the β-(1,4) glycoside linkage and can be extracted from various sources, such as trees, plants, algae, etc. (Moon et al. 2011). In recent years, research studies on cellulose nanocrystals (CNCs) have been rapidly increasing due to numerous advantages of cellulose over other materials, such as biocompatibility and recyclability. With chemical treatments (George and Sabapathi 2015), the CNCs, which are rod-like or needle-shaped nanometric particles, are produced by a simple preparation process and possess high yield, large specific surface area, high aspect ratio (10-100), strong strength and stiffness, and biodegradability (Moon et al. 2011). These characteristics encourage CNC applications in biomedical engineering, wastewater treatment, energy & electronics, and other sectors (Grishkewich et al. 2017).
Due to the inherent hydrophilicity, CNCs cannot be uniformly dispersed in most non-polar (Heshmati et al. 2018) and hydrophobic (Inai et al. 2018) polymer matrices. However, the surface primary hydroxyl groups of CNCs can be easily modified to achieve different surface properties, which can affect the dispersion in various matrix polymers, self-assembly, and the strength regulation of particle-particle and particle-matrix bonding (Kalia et al. 2014). These chemical modifications can extend the application of CNCs to high-end engineering and biomedical fields (Paajanen et al. 2016;Shojaeiarani et al. 2021). Chemical modifications such as esterification, cationization, silylation, polymer grafting, and 2,2,6,6-tetramethylpiperidinyloxyl radical (TEMPO) oxidation are usually used for surface modifications to improve the compatibility and uniform dispersion within a polymer matrix or aqueous solutions (Hasani et al. 2008;Morandi et al. 2009;Oliveira Taipina et al. 2013;Raquez et al. 2013;Kalia et al. 2014;Huang et al. 2015).
With the advancement of computer algorithms and computational power, molecular simulations have become an integral part of the research environment. Atomistic molecular dynamics (MD) simulations provide valuable insight into the dynamics, structure, and properties of materials and help to understand the complex interplay of molecular interactions at the atomistic level. However, all-atomistic (AA) MD simulations can become computationally expensive when applied to large macromolecules like cellulose and particles like CNCs. Thus, models, which combine several atoms into a so-called bead or super atom, are often developed to simulate more extensive systems at larger time scales. Such models are referred to as coarse-grained (CG) models, and numerous CG models have been developed to study cellulose, CNCs, and fibers in recent years (Mehandzhiyski and Zozoulenko 2021).

3
Vol.: (0123456789) The Martini CG force field has become the most widely used CG force field in soft material science nowadays (Kholina et al. 2020;Lee et al. 2020;Shivgan et al. 2020;Jackson 2021;Sousa et al. 2021). By mapping 2 to 4 heavy atoms (C, O, N) to 1 CG bead, the Martini force field keeps the explicit description of the system and provides sufficient chemical accuracy with much higher computational efficiency. Therefore, CG simulations can explore material properties on a larger time and spatial scale than AA simulations. Martini 2 was originally developed for lipids and surfactants, and later many other models for different systems were developed, such as the Martini 2 extension for carbohydrates (López et al. 2009). CNC models were also developed with the Martini 2 (Wohlert and Berglund 2011;López et al. 2015), and they found applications for studying native cellulose as well as composite materials made of cellulose and conducting polymers (Mehandzhiyski and Zozoulenko 2019). However, these models were developed for native (pristine) cellulose, whereas surface chemical modifications that are inevitably present in most experimentally studied CNCs, were not included in the models as the modelling of ions is only qualitatively accurate in Martini 2 (Marrink et al. 2007). This seriously limits the applicability of these models.
Recently, a new version of Martini (version 3) had been reported with improved interaction balance between beads and more abundant bead types ). In addition, a Martini 3 model of a single native fiber corresponding to different cellulose allomorphs, such as cellulose I (Iα and Iβ) and cellulose II was recently presented by Moreira et al (2022). At the same time, Martini 3 models of surface group modifications of CNCs, as well as models of interacting CNCs in the dispersions, are still missing.
In this work, we report a new CNC model based on Martini 3, where we first developed a model for native cellulose and then introduced a model for the surface chemical modifications. The CNC lattice parameters and mechanical properties of the native CNC show results closely reproducing experimental measurements and providing improved lattice constants over previous Martini 2 CNC models. Then, we introduced TEMPO-modified CNC and fitted the CG model to reproduce AA results to accurately describe the electrostatic double layer characteristic for the surface-modified CNCs. Furthermore, the dispersion properties of CNCs were studied with varied NaCl concentrations. Surface-modified CNCs show colloidal stability in low NaCl concentration (< 60 mM), while aggregation was observed when the NaCl concentration is above 100 mM, which is in agreement with the experimental observations. The work brings new progress toward the modelling of surface-modified CNCs and their colloidal solutions.

Martini 3 model and simulation details
The first step in building the Martini 3 CG model was to map the atomistic structure of cellulose chains with the newly introduced beads in Martini 3. Then we proceeded to construct the cellulose nanocrystals (CNCs) in the Iβ allomorph configuration. We used two β-D-glucopyranose units (glucose) to map the AA structure to the CG model, as shown in Fig. 1a. Each Martini 3 bead in our model represents 2 or 3 heavy atoms. The choice of the bead types is discussed in the Results section. Thus, cellobiose has 10 Fig. 1 a Mapping glucose unit from AA to CG beads based on Martini 3; b Mapping cellobiose from AA to CG beads based on Martini 3; c Diagram for mapping the AA cellulose chain to the CG cellulose chain. Each glucose unit consists of 5 beads (B1 ~ B5) beads (5 beads per glucose unit), and the CG beads were mapped in the center of geometry (COG) of the atomistic structure. The glucose unit was then replicated to construct a chain with the desired length, as shown in Fig. 1c. We built our Martini 3 CG model with the clear idea that we have a fundamental base, a native (unmodified) chain, which can be easily further modified by adding different functional groups. Experimentally, the modifications are usually done by chemical reactions of the primary hydroxyl groups. Thus, we used the SN3a bead types (B1 and B2 beads) to represent the chain backbone (orange Martini 3 beads). The primary hydroxyl (B5) and secondary hydroxyl (B3, B4) groups were mapped to TP1 beads, as shown in Fig. 1c. We can easily modify or add other beads (covalently bonded functional groups) to the B5 bead, and therefore by this model we can simulate different cellulose functionalities. One typical CNC modification is TEMPO oxidation (Huang et al. 2015), where the primary hydroxyl group is substituted with the carboxylic group, which is often present in experiments in its deprotonated carboxylate form (COO − ). Thus, after establishing the basic native cellulose chain model, we focused on the surface-modified CNCs. The type of bead B5 in the case of TEMPO modified CNC was changed from TP1 to SQ5n. Another chemical modification gaining experimental interest is dialcohol and dialdehyde cellulose (Larsson et al. 2014a, b;Plappert et al. 2018), where the C2-C3 covalent bond is broken. This can be easily achieved with our model by simply removing the bond between beads B3 and B4, which shows the versatility of our mapping scheme. However, in this work, we focus on the TEMPO-modified CNCs, and further cellulose chemical modifications will be addressed in future studies. Finally, the non-reducing and reducing ends of the cellulose chains were mapped with the SP1 and SN2 beads, respectively (Fig. 1a).
The GROMACS 2020.3 simulation package was used to carry out all the simulations in this work (Páll et al. 2020). The leap-frog algorithm was used for integrating Newton's equations of motion (Van Gunsteren and Berendsen 1988) with a timestep of 0.01 ps. (Note that the chosen timestep is smaller than the one usually used in Martini 3 CG simulation (0.02 ps) . This is because, for our system, for certain cases, the 0.02 ps timestep led to numerical instabilities. We speculate that the need for smaller timesteps can be related to a large number of tiny and small beads used to map the system at hand). The Particle Mesh Ewald (PME) method was used for calculations of the non-bonded interactions between charge beads (Essmann et al. 1995). The bonds were constrained with the LINCS algorithm (Hess et al. 1997). The cut-off distance for the Coulomb and van der Waals interactions was set to 1.2 nm. The velocity rescaling (v-rescale) with a stochastic term was used for temperature coupling (Bussi et al. 2007), and the pressure was coupled with the time constant by Parrinello-Rahman (for high accuracy free energy calculations) and Berendsen (for dispersing calculations in solutions) algorithms (Parrinello and Rahman 1981;Berendsen et al. 1984). All systems in this work were simulated at 350 K except the free energy simulations, where 310 K was used to keep consistency with the AA simulations. More details and molecular dynamics parameters can be found in Supporting Information and are available on request from the authors.

Atomistic simulations
The OPLSAA force field was used for CNCs in AA calculations in this work by GROMACS software (Damm et al. 1997). SPCE water force field was used (Berendsen et al. 1987). The non-bonded interactions were calculated with PME, and the bonds were constrained with the LINCS algorithm (Essmann et al. 1995;Hess et al. 1997). The long-range dispersion corrections were applied for both energy and pressure calculations (Guo and Lu 1998). The cut-off for the Coulomb and van der Waals interactions was 1.2 nm, and the timestep was 0.002 ps. Parrinello-Rahman and v-rescale were used for the pressure and temperature couplings (Parrinello and Rahman 1981).

Partitioning free energy calculations
To reproduce the octanol/water partition coefficients ( P OW ) reported previously by López et al. (2009), the solvation free energy of CG glucose/cellobiose in the aqueous and organic phase is calculated between the water ( ΔG W ) and water-saturated octanol ( ΔG O ) solution: The ΔG W and ΔG O represent the free energy difference ( ΔF ) of the glucose/cellobiose in vacuum and (1) ΔΔG OW = −2.3RT log P OW the solutions, computed by the thermodynamic integration (TI) method (Jarzynski 1997): The potential energy function of solute-solvent interaction is represented by the U UV ( ) .
is the coupling parameter that varies linearly from zero ( A = 0 ) to one ( B = 1 ). The for each solvation free energy calculation was divided into 21 steps, which guarantees the accuracy of the free energy difference curves. For each individual step, we ran a 25 ns CG calculation, and the last 20 ns of each simulation was used for analysis.

X-ray diffraction (XRD) analysis
The XRD curves of the cellulose fibrils were calculated with Debyer, which is analysed based on the Debye scattering equation (https:// github. com/ wojdyr/ debyer). The X-ray powder pattern was used, and the wavelength for the analysis was 0.15 nm. The resolution of the calculated pattern was 0.1 degrees. The input structure of CNC was exported from molecular dynamics calculations. The model displays and format transfers of the models in this work were done by VMD 1.9.3 (William et al. 1996).

Elastic modulus
The umbrella potential between the reference group and pulling group was used to calculate the elastic modulus by GROMACS. The CNC was placed in the simulation box without water beads to avoid the influences of pressure from the solution when pulling the CNC. The reference group, including 5 glucose units (the first 5 units in the CNC), was restrained. The last 5 glucose units on the other side of the CNC were subjected to pulling force. For the pulling process, a 0.002 nm/ps pulling rate was used, and the total pulling time was 400 ps. The harmonic force constant for the umbrella pulling was set to 5000 kJ/mol/ nm. Finally, the elastic modulus in the axial direction of the CNC was calculated by a linear fit to the strain-stress curve.

Potential of mean force
The potential of mean force (PMF) was obtained by performing umbrella sampling simulations. The simulation protocol was as follows: two TEMPO modified CNCs were placed in the simulation box approximately 6 nm apart, and the box was filled with water beads and Na + counter ions. Then a pulling force was applied to the center of mass of one of the CNC while the other CNC was kept restrained at its original place. By pushing one of the CNC towards the other, we effectively describe the approach of two CNCs in the dispersion. This generated a series of configurations (umbrella windows) separated by 0.1 nm from which the PMF was calculated by the Weighted Histogram Analysis Method (WHAM) (Kumar et al. 1992) as implemented in GROMACS. Each umbrella window was run for 5 ns AA (45 ns CG) calculations, and the PMF was calculated from the last 4 ns (40 ns) of each simulation. To reproduce the CG PMF curves to AA results, the COM distance of the COO − groups distances on the surface of 2 CNCs were computed and compared. Table 1 Thermodynamic parameters of glucose/cellobiose and partition coefficient of octanol-water based on the difference between the free energy of hydration (∆ G W ) and in water-saturated octanol (∆ G O ) CG simulations were calculated at 310 K, consistent with the AA reference data from ref (López et al. 2009). The experimental data for the octanol/water partition coefficients, Log P ow (exp), is taken from ref (Mazzobre et al. 2005) and corresponds to T = 300 K

Number of contacts
To quantify the state of the systems (dispersibility of CNCs in aqueous solution), we calculated the number of contacts between CNCs. First, the gmx mindist from GROMACS was used to calculate the minimum distance between each 2 CNCs. As there are 7 CNCs in the box, 21 minimum distance results for each system were calculated. The 2 CNCs are defined to be in contact when the minimal distance between them is 0.35 nm or less. And then, after summing up the number of contacts for each model (maximum 21 contacts for each model) and dividing by 7 CNCs, we plotted the number of contacts/CNC curves.

Results and discussion
Cellulose modelling and analysis The Martini 3 bead types in the present model ( Fig. 1a) were assigned based on the calculation of the solvation and partitioning free energy between water and water-saturated octanol, and the results are shown in Table 1. As a reference for the solvation and partitioning free energies, we used previously published atomistic data for glucose/cellobiose used by Lopez et al. (2009) for the development of the Martini 2 cellulose model. In the work of Lopez et al. (2009), a water/octanol molar fraction of 0.255 was used, and to be consistent with their simulations, we used the same molar fraction for the octanol phase (constructed by 1 glucose/cellobiose molecule, 42 water molecules, and 125 octanol molecules) in our CG calculations. As shown in Table 1, the results for the partitioning free energy match very well the previous atomistic data. Then, the cellulose chain was modelled based on the bead types of glucose unit by Martini 3, as shown in Fig. 1c. The bond, angle, and dihedral angle force field parameters of the CG cellulose chains were fitted to match distributions that are based on the AA cellulose nanocrystal and are presented in Fig. S1-S3. A graphical representation of all the bonds, angles and dihedrals is presented in Table S1-S3, along with all force field parameters. Further, we created different CNCs consisting of 36, 24, and 18 cellulose chains with 16 glucose units per chain, as shown in Fig. 2a, b, and c, respectively. The exact shape and number of constituent cellulose chains in the primitive fibril is still a matter of ongoing debate (Turner and Kumar 2018;Rosén et al. 2020). However, recent works show that, most likely, the primitive fibril consists of 18 or 24 chains (Nixon et al. 2016;Kubicki et al. 2018). Therefore, we created CNCs with different numbers of chains and two different shapes (cubic and hexagonal) to test the stability of our model.
After creating the different CNCs explained above, we solvated them in water and carried out 1000 ns NPT simulations. The final CNC configurations for the three cases are shown in Fig. 2. It can be noticed that all three models possess a right-handed twist and keep their initial shape. In the rest of this work, we chose to use the CNC consisting of 36 chains with a square shape (Fig. 2a), which corresponds to experimentally measured CNC widths of ~ 3 nm (Chen et al. 2021), and therefore, all the results presented below correspond to this model. We calculated the unit  To further illustrate the validity of our model, we calculated the XRD curves of the CNC with the Debyer tool, as shown in Fig. 3b. The overall XRD profile for CG CNC resembles well the experimental measurements and the AA CNC results. The characteristic peaks at 16, 22, and 33 degrees corresponding to the 1-10/110, 200, and 004 surfaces, respectively, are shown to match the experimental XRD peaks closely. It should be noted that the peak at about 22 degrees corresponding to the 200 surface has moved to the right to ≈ 24.5 degrees. This could be explained by the slightly lower unit cell constant a in our model shown in Table 2.
The elastic modulus in the axial direction of the CNC was calculated by a linear fit to the strain-stress curve and is presented in Fig. 4. The diagram for the pulling of CNC, which consists of 36 chains and 40 glucose units per chain, is shown in the top left in Fig. 4. Based on the strain-stress curve, Young's modulus is found to be 120 GPa. From previous theoretical and experimental studies, a single fibril's elastic modulus generally ranges from 93 to 255 GPa (Tashiro and Kobayashi 1991;Reiling and Brickmann Kroon-Batenburg and Kroon 1997;Tanaka and Iwata 2006;Santiago Cintrón et al. 2011;Zhang et al. 2011). It should be noted that the elastic modulus in our Martini 3 CNC model mainly depends on the bonded interactions of cellulose chains, while in AA simulations, the internal hydrogen bonding stabilizes the CNC internal structure and contributes to the elastic modulus by 15% ~ 40% (Tashiro and Kobayashi 1985;Eichhorn and Davies 2006;Mehandzhiyski et al. 2020).

Surface-modified CNCs
TEMPO-modification with carboxylic groups (COO − ) on the surface of CNCs is a typical modification used in experiments. Surface-modified CNCs can be dispersed in low-concentration salt water and show colloidal stability. In this work, we aim to explore the dispersion properties of CNCs with COO − surface modification (20% substitution of primary hydroxyl groups corresponding to experimental charge density of 0.64 mmol/g) (Nordenström et al. 2022) in water with different NaCl concentrations. In Martini 3, the non-bonded interactions between Martini 3 beads are Native cellulose in NaCl water: a and b AA model and corresponding number density curves of Na + , Cl − , and cellulose atoms; c and d CG model and corresponding number density curves of Na + , Cl − , and cellulose beads. The r = 15 was used in the calculations for the native cellulose. TEMPOmodified cellulose in water with Na + ions: e and f AA model and corresponding number density curves of Na + and cellulose with surface modifications; g and h CG model and corresponding number density curves of Na + and cellulose with surface modifications. The r = 50 was used in the calculations for the TEMPO-modified cellulose. Green, gray, and blue beads represent the Na + , Cl − ions and COO − groups. Water molecules (beads) are not shown calculated by the Lennard-Jones (LJ) 12-6 potential energy function and the Coulomb potential energy function , as shown in Eq. 3: To guarantee the accuracy of CG simulations, we divided our work into two steps. First, we considered the native cellulose and fitted the interaction between Na + ion beads and neutral beads (native cellulose and water), which only has LJ interactions, to reproduce the AA results. Second, we considered TEMPO-modified cellulose and fitted the interaction between Na + ion beads and charge beads (COO − surface groups), including both LJ and Coulomb interactions, to reproduce the AA results.
As the first step, we improved interactions between native cellulose and Na + beads in Martini 3 by changing the non-bonded parameters ij and ij (named interaction levels in Martini 3 )) between Na + beads and the native cellulose model to match the AA number density distributions. We constructed a simulation box where native cellulose was centered in the middle, and the cellulose was periodic along the x and y directions, as shown in Fig. 5a. Then we inserted water molecules and ions (36 Na + and 36 Cl − ions to keep the system neutral). The AA and CG models have the same chain length and number of cellulose chains, ions, and water molecules. Finally, we calculated the density profiles from 20 ns NPT simulation of Na + and Cl − ions across the z-direction (normal direction with respect to the cellulose surface) of the box. We found that by using the default non-bonded parameters from Martini 3 between the charged beads (Na + and Cl − ) and native cellulose beads , the ions in the CG model are strongly adsorbed on the surface, as seen from Fig. S5. This is in contrast to the AA results shown in Fig. 5b, where the ions are mostly found in water, and there are practically no ions in close proximity to the surface of cellulose. The non-bonded parameters of charged beads with neutral beads in Martini 3 are designed to better reproduce these interactions compared to Martini 2, and, as a result, some of these interactions are stronger compared to the interactions between neutral-neutral beads, presented in the Supporting information of Souza et al.'s work (2021). However, for the present system, we found that the charged-neutral beads' attractions are way too strong, and they cannot reproduce the AA results. Thus, we decreased these attractions' strength to fit the results to the AA simulations. The topology changes for the interactions between Na + /Cl − and native cellulose are given in Table S4 in the Supporting information. The CG model and corresponding density curves after modifying the interaction levels are presented in Fig. 5c and d, respectively. After the modifications, we found that the CG density profiles' shape and bulk densities of the ions are not only qualitatively but also quantitatively in agreement with the corresponding AA results.
After fitting the interactions between Na + /Cl − and native cellulose surface, we proceed by fitting the interactions between Na + and COO − groups. The AA model and corresponding density curves are shown in Fig. 5e and f, and the final CG model and corresponding density curves are shown in Fig. 5g and h. In our slab model (Fig. 5e), the surface chains contain 36 COO − groups and therefore, we added corresponding 36 Na + ions to keep the system electroneutral. After 20 ns NPT simulations, as shown in Fig. 5e and f, due to the strong Coulomb attraction between Na + and COO − charge groups, most Na + ions are adsorbed on the cellulose surface. However, it should be noted that not all Na + ions are adsorbed on the surface, as seen from the nonzero value of the Na + density profile in the bulk water far away from the surface (Fig. 5f). The Na + ions are in dynamical association/dissociation equilibrium, and they can be exchanged between the cellulose surface and bulk water, which leads to nonzero density in the water. However, due to the charged surface, the association with the cellulose surface is preferred. This is an important characteristic of the surface-modified CNC dispersion because freely moving ions in solution results in the double layer repulsive force between charged colloidal particles. Thus, to correctly describe this phenomenon, we need to reproduce the association/dissociation equilibrium between the Na + and COO − beads in Martini 3 by fitting non-bonded parameters to reproduce the atomistic density profiles. The association/ dissociation equilibrium is directly related to the Gibbs free energy. Furthermore, this equilibrium is also related to the ratio of the coordination complex of the carboxylate group with Na + [COO − Na + ] and the free COO − at the surface and Na + ions in bulk [COO − ][Na + ]. Therefore, by reproducing the density profiles, by fitting the non-bonded parameters, the association/dissociation phenomena are properly described.
As in the case of native cellulose, we found that the attraction between COO − and Na + ions using the default non-bonded parameters of Martini 3 is too strong, see Fig. S6, where all Na + ion beads are adsorbed at the cellulose surface and no ions in the bulk water phase are present. However, in contrast to the case of native cellulose, modifying only the LJ parameters proved to be insufficient to reproduce the atomistic results. (Note that in addition to the variation of non-bonded LJ parameters, we explored variation of different bead types for Na + ions, COO − surface groups, and water molecules, and different cut-offs for the van der Waals and Coulomb interactions, and none of them worked, sees Sec. S1 in SI for details). Thus, the only parameter that is left for the adjustment of the non-bonded interactions is the dielectric constant r , see Eq. 3. Usually, the Martini 2 and 3 models use a screening constant of r = 15 or 20 (Marrink et al. 2007;Shivgan et al. 2020;Souza et al. 2021), which was originally fitted for lipid systems and the coordination number of ions in bulk solutions. However, by systematically increasing the dielectric constant, we found that r = 50 reproduces well the atomistic density curves in our system, as seen in Fig. 5g and h. Thus, larger r = 50 was used for the rest of the CG simulations when charged beads are present in the system. Moreover, the potential of mean forces (PMF) between two CNCs can only be correctly described using r = 50, which will be discussed in detail in the next section.
We believe a modification of the effective dielectric constant (from r = 15 to r = 50) for the accurate description of the system at hand is not coincidental but has an important physical reason related to the features of the electrostatics interaction of the system, namely to the formation of the double layers close to a charged surface of the nanocellulose. This is qualitatively different from charged systems previously studied with Martini 3, (such as polypeptide coacervates (Tsanai et al. 2021), lipid membranes with calcium ions (Bergfreund et al. 2021), ionic liquids (Vazquez-Salazar et al. 2020), and deep eutectic solvents (Vainikka et al. 2021), where no double layers were formed and where a standard Martini value r = 15 was used. Therefore, the important conclusion of our study is that the correct description of the systems with strong electrostatic screening arising due to the formation of the double layers requires adjustment of the standard Martini 3 parameter r describing the efficiency of screening. It should be noted that we found this to be the only solution for our system, and this hypothesis should be tested for similar systems where double layer formation and high electrostatic screening CNCs. Green and blue beads represent the Na + ions and COO − groups, respectively. Water molecules are shown as cyan volume. The number of chains and chain lengths of CNC are consistent in the AA and CG models. Both AA and CG boxes consist of 2 CNCs (110 COO − groups per CNC, 16 glucose units per chain), 220 Na + ions, and ~ 38,000 water molecules play an important role. In addition, it should also be mentioned that the increase of r to 50 is only for the system with surface charge modifications. Thus, the pristine cellulose model is fully compatible with the standard Martini 3 interaction levels and beads.

Surface-modified CNC interactions
Surface-modified CNCs are known to form stable water dispersions (Kaboorani and Riedl 2015). The colloidal stability of charged particles is induced by electrostatic forces and the overlap of the electric double layer surrounding them (Matijević 1979). The forces between charged colloidal particles can be obtained by the Derjaguin-Landau-Verwey-Overbeek (DLVO) theory (Verwey 1947;Derjaguin and Landau 1993;Israelachvili 2011) or by MD simulations, in the form of the potential of mean force (PMF) ). Thus, we calculated the PMF by both AA and CG simulations to understand the interaction between two surface-modified CNCs (Fig. 6). In both AA and CG models, we used the same CNC size and surface degree of COO − substitution. Only primary hydroxyl groups are modified to COO − groups, and the degree of substitution is 20%. The surface-modified CNC consists of 36 chains and 16 glucose units per chain, as shown in Fig. 6a and b. Each surfacemodified CNC has 110 COO − groups. An equal number of Na + ions were inserted into the box to keep the system neutral.
The PMF curves obtained from the AA simulations, along with the curves from the CG simulations with r =15 and 50, are presented in Fig. 6c. The AA simulations show that 20% surface modified CNCs repel each other (i.e., the PMF curve is positive as Fig. 6c shows). This is consistent with The Na + , Cl − , and COO − beads are represented by gray, green, and blue colors, respectively. The scale bar is shown in a, and it applies to all models in Figure. Time t = 300 ns the experimental findings where negatively charged CNCs are dispersed in aqueous media (Cao and Elimelech 2021). The PMF curve obtained from the CG simulations with r = 15 shows qualitatively different behaviour compared to the AA PMF curve. First, the CG PMF curve is lower than the AA PMF curve in Fig. 6c. Second, the CG PMF curve has minima at negative values, which implies attraction at the corresponding distances, while the AA PMF curve is always positive (repulsive). Because of this attraction, the CG surface-modified CNCs will aggregate in the CG simulations, while the AA surfacemodified CNCs are repulsive and, in agreement with the experiment, expected to be dispersed in the AA simulations. (Note that the CG PMF curve has peaks at the distances 1.6 nm and 2.1 nm, which is attributed to the formation of water layers between the cellulose surfaces).
The inability of CG PMF calculations with r = 15 to reproduce the corresponding AA results is directly The scale bar is shown in the first snapshot of g, and it applies to all models in Figure  related to the qualitative difference of the CG and AA density profiles between the double layers, as outlined in the previous section. We therefore systematically varied r and, as expected, found that Martini 3 PMF results with r = 50 show a close quantitative agreement with the corresponding AA results. This further confirms the usage of r = 50 in our simulations for charged particles where the electrostatic interaction and screening are of paramount importance.

Dispersions and colloidal stability
In this section, we explored the dispersibility of CNCs in an aqueous solution with varied salt concentrations. According to the experimental findings, low salt concentration promotes dispersibility while its increase promotes aggregation (Phan-Xuan et al. 2016;Cao and Elimelech 2021). In our simulations, we used 7 TEMPO-modified CNCs inserted randomly in the simulation boxes, with each CNC consisting of 36 chains and 32 glucose units per chain. In addition, we also considered a system composed of pristine CNCs (without surface modifications). Then, the box was filled with water and ions, where the weight ratio of CNCs is 7%; the box also contains Na + ions needed to neutralize negatively charged surface groups on CNCs. (No Na + ions were added for the case of pristine CNCs). Figure 7 shows representative snapshots of pristine CNCs in pure water (Fig. 7a) and TEMPO-modified CNCs at different ion concentrations 0-600 mM at time t = 300 ns. In order to quantify the state of the system (i.e. dispersed or aggregated), we calculated the number of contacts between CNCs for each system and its evolution during time 0 < t < 1000 ns, see Fig. 8a-f. (Two CNCs are defined to be in contact if the minimal distance between them is 0.35 nm or less). It should be noted that TEMPO-group modifications were introduced at the side surfaces along the axial direction (110/1-10 surfaces), and they are not present on end (top) surfaces of CNCs. Due to the relatively short length of CNCs in our simulations, the contacts between two ends of CNCs can happen at these points as there is no surface modification there (see Fig. 8g for illustration), which may overestimate the number of contacts between surface-modified CNCs compared to experiments. Thus, both the top contacts and surface contacts (excluding contacts between the end surfaces) between CNCs were calculated and plotted, respectively.
Let us start with the case of pristine CNCs in pure water, as shown in Fig. 7a. A visual inspection shows that each CNC is in contact with at least another CNC. Figure 8a shows the number of contacts for pristine CNCs, where we observed a fast onset of aggregation (starting around 50 ns), after which the number of contacts remains practically unchanged after 500 ns. From 500 to 1000 ns the system is in volume spanning arrested state (Nordenström et al. 2017). This is illustrated in the snapshots in Fig. 8g, where the spatial arrangements of the CNCs practically do not change in the time interval 500 < t < 1000 ns. Our finding shows that the aggregated structure of pristine CNCs in water is fully consistent with the corresponding experimental observation of the aggregation of pristine CNCs using AFM (Chen et al. 2020).
For the case of TEMPO-modified CNCs, when the NaCl concentration is 0 mM or 60 mM, the CNCs are dispersed, and no aggregation was observed after 1000 ns of NPT simulations. The number of contacts between CNCs for 0 mM is zero, see Fig. 8b. Some contacts between the ends of CNCs were detected for 60 mM case as no surface modifications with COO − groups present there, see Fig. 8c. Nevertheless, there is no significant aggregation along the side surfaces, and we can still conclude that the system is dispersed. Figure 8h shows snapshots illustrating the evolution of the system during the simulation interval 0 < t < 1000 ns. In contrast to the aggregated state discussed above, here, the spatial arrangement of CNCs changes with time, which is expected for a dispersed state.
For NaCl concentrations larger or equal to 100 mM, the number of contacts between TEMPOmodified CNCs rapidly increases, as shown in Fig. 8d-f (see also snapshots of the systems in Fig. 7d-f). An interesting observation can be made from the case 600 mM (Fig. 8f), where the contacts are mainly due to the end parts of CNCs where no surface modifications are present. However, this might also be associated with the finite simulation time (1000 ns), which is much smaller than the typical experimental time, and thus the dynamics of the systems with high salt concentrations are slower. Similar aggregation trends were observed in previous experiments of sulfated CNCs in aqueous NaCl solutions (Cao and Elimelech 2021). The critical point of NaCl concentration for the sulfated CNC aggregation is about 90-150 mM. Therefore, we believe that our model is suitable for studying the dispersion behaviour of negatively charged surfacemodified CNCs. With this respect, it is noteworthy that the relative number of surface contacts with respect to the top contacts changes with the ion concentration, being small (or close to zero) at lower and at higher concentrations and reaching a maximum at intermediate concentrations of ≈ 200 mM, where, instead, the number of top contacts becomes small, see Fig. 8b-f. It would be interesting to test this prediction experimentally.

Conclusion
A new CG model for CNCs was developed based on the Martini 3 force field. First, the native CNC was introduced and verified against atomistic and experimental results. Then, a surface modification based on TEMPO-oxidized cellulose was introduced and verified against atomistic data. The surface modifications of CNC with COO − groups and interactions with Na + ions were fitted to the AA results to reproduce the interactions between surface-modified CNCs in aqueous media. Then, the colloidal dispersion of surface-modified CNCs and the effect of the variation of the electrolyte concentration were explored. Surface-modified CNCs show colloidal stability at low NaCl concentration (< 60 mM), while they start aggregating when NaCl concentration is above 100 mM, which is consistent with the experimental data. The work brings new progress toward CNC modelling, providing an accurate description of surface-modified CNCs in colloidal solutions. The developed model can be easily modified for other surface functional groups, and we plan to expand this model to incorporate various surface modifications used in experimental studies.
Acknowledgments The authors are greatly thankful to Siewert-Jan Marrink and Paulo Souza for numerous discussions and input concerning Martini 3 implementation for the present system. The computations were conducted on resources provided by the Swedish National Infrastructure for Computing (SNIC) at NSC and HPC2N.
Author contributions All authors contributed to the study conception and design. Coarse grained molecular dynamics simulations and data collection were done by JP, atomistic molecular dynamics simulations were done by AYM, and analysis were done by all authors. The first draft of the manuscript was written by JP. All authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Funding Open access funding provided by Linköping University. The authors acknowledge funding from the Knut and Alice Wallenberg foundation through the Wallenberg Wood Science Center at Linköping University.

Data availability
The supporting Information is available free of charge on the Cellulose Publications website.

Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.

Consent for publication Not applicable.
Ethics approval Not applicable.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.