Using the Isalos platform to develop a (Q)SAR model that predicts metal oxide toxicity utilizing facet-based electronic, image analysis-based, and periodic table derived properties as descriptors

Engineered nanoparticles (NPs) are being studied for their potential to harm humans and the environment. Biological activity, toxicity, physicochemical properties, fate, and transport of NPs must all be evaluated and/or predicted. In this work, we explored the influence of metal oxide nanoparticle facets on their toxicity towards bronchial epithelial (BEAS-2B), Murine myeloid (RAW 264.7), and E. coli cell lines. To estimate the toxicity of metal oxide nanoparticles grown to a low facet index, a quantitative structure–activity relationship ((Q)SAR) approach was used. The novel model employs theoretical (density functional theory calculations) and experimental studies (transmission electron microscopy images from which several particle descriptors are extracted and toxicity data extracted from the literature) to investigate the properties of faceted metal oxides, which are then utilized to construct a toxicity model. The classification mode of the k-nearest neighbour algorithm (EnaloskNN, Enalos Chem/Nanoinformatics) was used to create the presented model for metal oxide cytotoxicity. Four descriptors were identified as significant: core size, chemical potential, enthalpy of formation, and electronegativity count of metal oxides. The relationship between these descriptors and metal oxide facets is discussed to provide insights into the relative toxicities of the nanoparticle. The model and the underpinning dataset are freely available on the NanoSolveIT project cloud platform and the NanoPharos database, respectively.


Introduction
Metal oxide nanomaterials (NMs) have demonstrated exceptional structural, electronic, and chromic properties, and as such they find applications in various fields of technology. These include applications in cosmetics, sunscreens, textile, devices, and water treatment systems [1][2][3]. The vast application of metal oxides in various nanotechnology-related fields has drawn concerns with regard to their safety and toxicity, as some of these nanoparticles (NPs) have been reported to be toxic to some organisms [4,5]. In this regard, there is a high possibility that the sunscreens, which are intended to protect individuals from UV radiation, may cause more harm than the UV radiation itself [6]. Furthermore, metal oxide-based water remediation systems may pose different risks to the environment than the conventional remediation methods [7]. This provides grounds for the need to evaluate metal oxide toxicity in parallel to development of applications, as part of a responsible innovation approach. Several experimental studies have been conducted to assess the toxicity of metal oxides, examining various parameters that can cause toxicity [4,8,9].
The rate at which metal oxides are applied in different fields of technology far exceeds the rate at which the toxicity of the materials is studied. Furthermore, conventional (i.e. experimental) risk assessment techniques are often time intensive and insufficient for ensuring the safety by design of newly produced materials in rapidly expanding application areas. To keep up with the rapid rates of metal oxide production, while also lowering the number of tests and the amount of consumable reagents utilized, a method that can predict the toxicity of metal oxides is required. The application of quantitative structure–activity relationships ((Q)SAR) is one approach that has been used to evaluate the toxicity of metal oxides, but to a lesser extent than for chemical compounds [10]. As a result, extensive studies on the use of (Q)SAR models in predicting the toxicity of metal oxide NPs are required to support the validation and acceptance of NM (Q)SARs in regulatory risk assessment.
Metal oxide NP (Q)SAR is a relatively new type of (Q)SAR that is characterized as a mathematical relationship between the properties (descriptors) of metal oxides and their biological activity. This type of model is typically referred to as "nano-(Q)SAR," and the accompanying descriptors are referred to as "nano descriptors." Once the physicochemical parameters of metal oxides and their toxicity towards living cell lines have been obtained and the model has been validated using a test set of NMs, an appropriate mathematical model can be used to compute the activity of additional NPs for which toxicity tests have not yet been carried out. Predictions for new NMs can be made provided that their properties fall within the domain of applicability of the (Q)SAR model, where reliable predictions can be made [11].
Classical NP properties, such as hydrodynamic size, core size, and surface charge based on 21 NPs, were used to develop a (Q)SAR model for predicting their toxicity towards human keratinocyte cells (HaCaT) and bronchial epithelium transformed with Ad12-SV40 2B (BEAS-2B) cells [12]. Puzyn et al. also reported on a (Q)SAR model used to predict the toxicity of 17 metal oxides, using enthalpy of formation as a descriptor for the model [13]. This model was able to successfully predict the cytotoxicity of metal oxides towards E. coli cells. Meanwhile, another model, based on seven metal oxides from different experimental conditions, has been reported [14]. On the other hand, Toporova et al. proposed (Q)SAR models based on the simplified molecular input-line entry system (SMILES) [15][16][17].
The reliability of a (Q)SAR model depends on the set of data that was used to develop it [10]. There is no standardized dataset for the formulation of (Q)SAR models, as data used for model development are obtained from different sources and for varying experimental conditions. Density functional theory (DFT) calculations can play a vital role in the calculation of metal oxide descriptors as they have been successfully used for the prediction of metal oxide properties in various studies [18][19][20][21]. Properties obtained via DFT calculations can be used as descriptors for (Q)SAR model development. Venigalla and co-workers [22] reported on the development of a (Q)SAR model using 17 metal oxides, the descriptors of which were calculated using DFT methods. Meanwhile, Yunsong [23] explained the significance of using DFT and empirical-based descriptors to derive toxicity reaction mechanisms. In silico methods are significant because they provide a better understanding of the features that influence metal oxide NP potencies and can predict toxic responses and effect thresholds.
In the current study, a (Q)SAR model based on an extended database of the cytotoxicity of 26 metal oxide NPs to human BEAS-2B, murine myeloid (RAW 264.7) cell lines, and E. coli was developed and validated. The datasets were combined with the aim of achieving a model with a broader spectrum of application, in which the effects of facets on toxicity are taken into account. The bacteria were specifically selected for cytotoxicity assessment, since they are considered a good ecological indicator for assessing the persistence and impact of chemicals on the environment and human health. In addition, uncontrolled release of toxic substances to the bacterial environment may disturb their natural balance, resulting in unwanted effects on the environment [24]. Meanwhile, the pulmonary epithelial BEAS-2B (non-tumorigenic human lung epithelial cells) and macrophage RAW 264.7 cell lines are good models for mimicking human cells during inhalation exposures. The catalytic properties of NPs are determined by the nature of the NP surface [25]. The goal of this study was to provide a method for generating metal oxide nano-descriptors that are related to exposed facets (i.e. one of the indicators specifically related to the surface nanotopography), which may be used to characterize the observed activity of NPs towards biological cell lines.
To the best of our knowledge, descriptors on metal oxide NPs that are facet-specific have not previously been reported in the literature and are reported herein for the first time. An interpretive nano-(Q)SAR model for an extended dataset of metal oxide NPs toxicity, based on various cell lines, relies on the combined theoretical and experimental work.

Data
Toxicity data used for the model development were obtained from the literature [13] and comprised 26 metal oxides. (Q)SAR models have also been documented with significantly smaller datasets [26]. DFT simulations for low-index facets were used to calculate descriptors based on the surface energy and electronic properties of metal oxides. Descriptors were also generated using atomic periodicity, experimental data, and TEM data. Recent developments have focused on growing NPs with specific facets for specific applications and improved efficiency. These give a rationale for looking into the impact of exposed facets on the toxicity of NPs from a computational point of view. Various other publications have used these data, looking into different descriptors and models [27,28]. There were no data gaps in the dataset, which included 26 distinct metal oxide NPs. The descriptors were classified into six categories: physicochemical, structural, image, periodic table, experimental, and molecular. Molecular descriptors included binding energy, Fermi energy, HOMO (highest occupied molecular orbital), LUMO (lowest unoccupied molecular orbital), band gap, hardness, chemical potential, enthalpy of formation, and electronegativity. It is worth noting that molecular descriptors were derived from DFT calculations and were based on low index facet(s). The full list of descriptors collected from literature, calculated, and utilized in the model is given in the supporting information (Table S1).

Calculation of molecular descriptors
The molecular descriptors are numerical representations of the properties of metal oxides, which were based on their electronic properties. A calculated pool of molecular descriptors is presented in the supporting information (Table S1). These include surface energy, binding energy, HOMO, LUMO, band gap, electronegativity, Fermi level energy, chemical potential, and hardness. In spite of its shortcomings in estimating lattice constants and electronic band gaps, DFT is credited with giving insights into experimentally observed phenomena such as catalysis, photonics, and defects; hence its use in this study to generate descriptors for (Q)SAR [29,30]. Additionally, DFT calculations have been used to generate datasets for high throughput screening of materials, compounds and alloys, rational design of catalysts, and for training machine learning models [31,32]. Metal oxide chemical space was limited to four types of metal oxides (M3O4, M2O3, MO2, and MO).
It is worth mentioning that, in the DFT calculations, the primitive cell (lattice) and not the unit cell was used. All the crystal structures were obtained from https://materialsproject.org. DFT calculations have been documented to over- or underestimate the lattice constants by almost 20%, depending on whether generalized gradient approximation (GGA) pseudopotentials are used to account for core and valence electrons [21,33]. It is for this reason that DFT does not reproduce the same lattice constants as experiments. However, this has not prevented DFT from being used to offer insights into experimentally observed phenomena [19,22,34]. DFT calculations are ab initio in nature and hence do not account for temperature in the calculations. All metal oxides considered in this work were chosen with respect to their lowest convex hull energy.
All ab initio calculations used to generate the descriptors in this study were performed within the DFT formalism as implemented in the DMol3 code within Materials Studio (Accelrys, San Diego, CA) [35]. The GGA with the Perdew-Burke-Ernzerhof (PBE) functional was used to describe exchange-correlation effects [36]. The geometry optimization convergence tolerances were set at $10^{-5}$ Ha (1 Ha = 27.21 eV) for energy, 0.002 Ha for maximum force, and 0.005 Å for maximum displacement. The tolerance for the electronic self-consistent field (SCF) was set at $10^{-6}$ Ha, while a smearing of 0.02 Ha was applied to the orbital occupation. Low index surfaces ((100), (110), and (111)) were constructed from the optimized bulk structures using a slab model, with a vacuum of 30 Å along the c-axis between the periodic slabs to eliminate spurious interactions.
The full, computationally enriched dataset was made publicly available through the NanoPharos database (https://db.nanopharos.eu/), which was developed via the NanoSolveIT project and continues to be maintained by NovaMechanics Ltd.

Model development
Model development was performed using the Isalos Analytics Platform, powered by the Enalos+ Tools [37,38]. The dataset was checked for completeness, and no gaps were identified. The first step in the process was to feed the dataset into a low-variance filter in order to remove those descriptors that did not present significant variance and could not contribute to the model's predictive capacity [39]. This also reduces the workload and time needed for the computational workflow to complete. The cut-off threshold for the filter was set to 0.3, which meant that descriptors with 30% or more similarity of values to another descriptor were removed. The next step was to account for the different numerical ranges of the included descriptors in the filtered dataset, for which Z-score normalisation was utilized. This placed all descriptors on a common scale, with a mean value of 0.0 and a standard deviation of 1.0 [40]. The model was then developed following random partitioning of the dataset into training and test sets, using a 75%:25% ratio. The training set was used for model development and training as well as to evaluate the model's performance and fine tune its parameters through cross-validation [41,42]. The descriptors that were used to develop the model (i.e. those presenting the highest statistical significance with respect to the dataset's variance) were identified using the Correlation based Feature Selection (CfsSubset) algorithm combined with the BestFirst evaluator [43].
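The preprocessing steps described above can be illustrated with a minimal sketch. It is not the Isalos implementation: the function names and the toy descriptor values are hypothetical, and the low-variance filter is assumed here to be a plain per-descriptor variance threshold, which may differ from the rule the platform applies with its 0.3 cut-off.

```python
import random
import statistics


def low_variance_filter(columns, threshold):
    """Keep only descriptors whose sample variance exceeds `threshold` (assumed rule)."""
    return {name: vals for name, vals in columns.items()
            if statistics.variance(vals) > threshold}


def z_score(values):
    """Rescale values to mean 0 and sample standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def train_test_split(n_rows, train_fraction=0.75, seed=0):
    """Randomly partition row indices into training and test sets."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    cut = round(train_fraction * n_rows)
    return sorted(idx[:cut]), sorted(idx[cut:])


# Hypothetical toy descriptor table (illustrative values, not from the dataset).
columns = {
    "core_size": [10.0, 25.0, 40.0, 55.0, 70.0, 85.0, 100.0, 115.0],
    "nearly_constant": [1.0, 1.0, 1.1, 1.0, 1.0, 1.1, 1.0, 1.0],
}
kept = low_variance_filter(columns, threshold=0.3)
scaled = {name: z_score(vals) for name, vals in kept.items()}
train_idx, test_idx = train_test_split(n_rows=8)
```

In this toy table only `core_size` survives the filter, and the eight rows are split six to two, mirroring the 75%:25% ratio.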
The presented metal oxide cytotoxicity model was developed using the Enalos implementation of the classification mode of the k-nearest neighbour algorithm (EnaloskNN, Enalos Chem/Nanoinformatics) [44]. EnaloskNN is an instance-based (lazy) method that uses the distance of the predicted endpoint from its k (k = 1, 2, 3, …) nearest neighbours in the feature space $\mathbb{R}^n$ created by the "n" identified significant descriptors, which are used to make the prediction, with "k" being defined based on the model's best performance. The prediction is based on the Euclidean distance, a similarity measure, of the target variable from its "k" closest neighbours [45]. The prediction is performed via the weighted average of the independent variable values of these neighbours, with the inverse of the Euclidean distance being used as the weighting factor [45,46]. In the case of nominal descriptors, the individual values are compared and if the values are the same, the Euclidean distance is set equal to 0; otherwise, it is set to 1 [47].
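The following sketch shows one way an inverse-distance-weighted kNN classification of this kind can work. It is an assumption-laden illustration rather than the EnaloskNN code: for class labels the weighted average is interpreted as a weighted vote, and the function names and the two-descriptor toy data are hypothetical.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(train_X, train_y, query, k):
    """Return (predicted class, k nearest neighbours) using inverse-distance voting."""
    dists = sorted((euclidean(x, query), label) for x, label in zip(train_X, train_y))
    neighbours = dists[:k]
    votes = defaultdict(float)
    for d, label in neighbours:
        votes[label] += 1.0 / (d + 1e-12)  # small constant avoids division by zero
    return max(votes, key=votes.get), neighbours


# Hypothetical, already normalised descriptor vectors (not from the dataset).
train_X = [(-1.0, -1.0), (-0.8, -1.2), (1.0, 1.0), (1.2, 0.8), (0.9, 1.1)]
train_y = ["toxic", "toxic", "non-toxic", "non-toxic", "non-toxic"]

predicted, neighbours = knn_classify(train_X, train_y, query=(-0.9, -0.9), k=4)
```

With k = 4 the query has two toxic and two non-toxic neighbours, but the two toxic ones are far closer, so the inverse-distance weights decide the vote in favour of "toxic". The returned neighbour distances are the same quantities that can later be reused for grouping and for the applicability domain.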
The Euclidean distance calculated through model development and usage can be used as a metric not only to predict a specific endpoint, but also to identify groups of neighbouring nanomaterials (NMs). The identification and analysis of these groups can lead to mapping the prediction space into specific NM groups, which can then be used for the development of read-across strategies [48]. Therefore, by taking advantage of the EnaloskNN ability to provide the Euclidean distances, we can visualize and study the entire predictive space $\mathbb{R}^n$. As a result, the kNN algorithm can be used according to ECHA's read-across framework [49] for NMs only if the following criteria are satisfied:
• Gathering of required descriptors for each NM.
• Construction of a data matrix including properties and endpoints.
• Development of a correlation between endpoints and reactivity properties.
• Assessment of the applicability of the approach.
• Assurance that there are no missing data.
• Assessment of the robustness of the grouping.
• Justification of the method.

Model validation
The model's validity and robustness were tested using its sensitivity (Sn), specificity (Sp), and accuracy (Ac). These metrics describe the proportion of toxic NMs that were correctly predicted, the proportion of NMs that were correctly classified as non-toxic, and the model's overall success rate, respectively [50]. Cohen's κ, which measures the model's reliability while accounting for the possibility that successful predictions arise from chance correlation [51], was calculated as well.
$$Sn = \frac{TP}{TP + FN} \tag{1}$$
$$Sp = \frac{TN}{TN + FP} \tag{2}$$
$$Ac = \frac{TP + TN}{TP + TN + FP + FN} \tag{3}$$
where TP are true positives, TN are true negatives, FP are false positives, and FN are false negatives.
The model was furthermore evaluated using the Matthews correlation coefficient (MCC) [52], which is used as a quality measure for the development of predictive classification workflows. The MCC takes into account the true and false positive and negative outcomes of the developed model, and it is considered a good quality metric of the model, even in cases of unbalanced datasets [53]. The MCC results range between −1 and +1. An MCC value of −1 corresponds to total disagreement between observed (experimental) and predicted results, while a value of +1 corresponds to total agreement. An MCC value of 0 corresponds to a random prediction [54]. The MCC is calculated using Eq. (4):
$$MCC = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{4}$$
where TP are true positives, TN are true negatives, FP are false positives, and FN are false negatives.
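As a small worked illustration of these validation metrics, the sketch below computes Sn, Sp, Ac, MCC, and Cohen's κ from confusion-matrix counts using their standard definitions. The function name is hypothetical, and the counts used in the example are an assumption: the source does not report a confusion matrix, and these values are simply one combination that reproduces the metric values reported in the Results.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Return Sn, Sp, Ac, MCC and Cohen's kappa from confusion-matrix counts."""
    n = tp + tn + fp + fn
    sn = tp / (tp + fn)            # sensitivity
    sp = tn / (tn + fp)            # specificity
    ac = (tp + tn) / n             # accuracy
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    # Cohen's kappa: observed agreement corrected for agreement expected by chance.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (ac - p_chance) / (1 - p_chance)
    return {"Sn": sn, "Sp": sp, "Ac": ac, "MCC": mcc, "kappa": kappa}


# Assumed counts (NOT given in the source), chosen to match the reported metrics.
metrics = classification_metrics(tp=8, tn=5, fp=0, fn=1)
rounded = {name: round(value, 3) for name, value in metrics.items()}
print(rounded)
```

For these assumed counts the rounded values are Sn = 0.889, Sp = 1.0, Ac = 0.929, MCC = 0.861, and κ = 0.851.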
Y-randomisation (n = 10) was used to verify that the created model was not the result of chance correlation and to test its statistical significance and robustness. Randomly shuffling the endpoint values while keeping all original descriptors yielded different datasets. The model acceptance criteria given above were calculated for each iteration. The resulting values were expected to be lower than those of the original model for the model to be considered valid [55][56][57].
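The idea behind Y-randomisation can be sketched as follows. This is only an illustration under stated assumptions: a leave-one-out 1-nearest-neighbour accuracy on a single hypothetical descriptor stands in for the actual model and its acceptance criteria, and all names and data are invented for the example.

```python
import random


def loo_1nn_accuracy(X, y):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on a 1-D descriptor."""
    hits = 0
    for i, xi in enumerate(X):
        others = [j for j in range(len(X)) if j != i]
        nearest = min(others, key=lambda j: abs(X[j] - xi))
        hits += y[nearest] == y[i]
    return hits / len(X)


def y_randomisation(score_fn, X, y, n_iter=10, seed=0):
    """Score the model on n_iter copies of the data with shuffled endpoint labels."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_iter):
        shuffled = list(y)
        rng.shuffle(shuffled)  # break the descriptor-endpoint relationship
        scores.append(score_fn(X, shuffled))
    return scores


# Hypothetical, well-separated toy data (not from the dataset).
X = [0.0, 1.0, 2.0, 3.0, 4.0, 50.0, 51.0, 52.0, 53.0, 54.0]
y = ["toxic"] * 5 + ["non-toxic"] * 5

original = loo_1nn_accuracy(X, y)
shuffled_scores = y_randomisation(loo_1nn_accuracy, X, y)
```

If the scores obtained with shuffled labels are clearly lower than the score on the original labels, the original relationship is unlikely to be a chance correlation.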

Applicability domain
To ensure the model's accessibility to the scientific community and interested stakeholders, as well as its validity, reliable usage, and applicability to external datasets, the applicability domain (APD) of the (Q)SAR model was calculated using the Euclidean distances calculated and retrieved via EnaloskNN. The following equation was used:
$$\mathrm{APD} = \langle d \rangle + Z\sigma \tag{5}$$
where $\langle d \rangle$ and σ are the average and standard deviation of all Euclidean distances in the training set, respectively, and Z is the empirical cut-off value, which is usually 0.5. Any predictions made outside these defined limits are regarded as unreliable [58]. An analytical summary of the produced model, along with the full demonstration that it meets the OECD criteria for the validation of (Q)SAR models for regulatory purposes, is provided in the completed (Q)SAR Model Reporting Format (QMRF) template, which is included in the supplementary information (S1).

Webservice development
The fully documented model and relevant tutorials have been made publicly available through the NanoSolveIT cloud platform (https://cloud.nanosolveit.eu/) as a user-friendly webservice (https://facetcytotoxicity.cloud.nanosolveit.eu/) to ensure accessibility within the scientific community and to interested stakeholders. The model has been complemented with a REST API (Fig. 1; https://facetcytotoxicity.cloud.nanosolveit.eu/swagger-ui/) to make it easily accessible and usable programmatically and to enable its implementation into a computational workflow, e.g. as a KNIME node. The API has been implemented using the POST request method to be able to transfer and handle the large amounts of data that are necessary to run the model. Following analysis, the results are returned in JSON format.
Fig. 1 The REST API environment for the facet-driven cytotoxicity model presented in this study. Through the API, users can implement the model into their own computational workflows

Results and discussion
The goal of this study was to show that the cytotoxicity of NPs may be predicted using a combination of physicochemical, molecular, and periodic table-based descriptors derived for low index facets of metal oxide NMs. The dataset included 26 metal oxides in replicates of varied experimental settings, as well as a total of 32 descriptors, i.e. 8 molecular descriptors, 9 periodic table-based descriptors, 10 TEM image-based descriptors, and 5 experimental/physicochemical descriptors (Table S1). Although the dataset is statistically modest in size, it yields a solid predictive model. The datasets were merged in a way that ensured they were interoperable, in order to identify probable reasons for variability. This also emphasized the importance of including sufficient metadata in public datasets to improve their quality, FAIRness score, and consequently reproducibility, reusability, and scientific transparency in general [59,60].

Development of the model
Following a random division of the dataset into training and test sets using a ratio of 75%:25%, respectively (Fig. 1), the predictive model was created. Using the CfsSubset algorithm and the BestFirst evaluator, the descriptors that contributed the most to the model's variance, and that were subsequently used to perform the classifications, were determined. As a result, four descriptors were identified as significant, i.e. core size, chemical potential, metal electronegativity count, and enthalpy of formation of the metal oxide. These results contain a mixture of descriptors, which includes physicochemical, periodic table, and molecular-based descriptors calculated with low index facets in consideration. They are also in good agreement with a similar study by Papadiamantis et al. [44] on metal oxide toxicity to BEAS-2B and RAW 264.7 cell lines, where the core size was also identified as significant. Papadiamantis et al. identified the energy of the conduction band as significant, which is directly correlated with the metal electronegativity, and subsequently the metal electronegativity count presented herein (see also Eqs. 10 and 11 below) and the enthalpy of formation [61]. Furthermore, the chemical potential, which is linked to the arrangement of the atoms on the NP surface, is correlated to the identified average coordination number of metal atoms in the surface region of the NPs and the average length of the surface normal component of the force vector of atoms in the surface region of the NP, which describes the potential energy (stability and activity) of the atoms on the surface of NPs [61]. This parameter is unique for each exposed facet (e.g. {100} or {111}) on metal oxide NMs and hence explains their varied toxicity behaviour. The produced model had high predictivity, having an Ac value of 0.929, Sn of 0.889, and Sp of 1.000. Cohen's κ was calculated to be 0.851, and the APD is 1.951. The MCC value of the produced model was 0.861, denoting a good prediction for both classes. The predictions were performed for k = 4 closest neighbours (Fig. 2).
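To make the role of the APD threshold concrete, the sketch below computes a threshold as the mean plus Z times the standard deviation of the training-set distances and uses it to flag queries. Several details are assumptions not stated in the source: all pairwise training distances are used, the population standard deviation is taken, and a query is judged by its distance to the nearest training NM. The names and toy coordinates are hypothetical.

```python
import math
import statistics
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def apd_threshold(train_X, z=0.5):
    """APD = <d> + Z * sigma over all pairwise training-set distances (assumed)."""
    dists = [euclidean(a, b) for a, b in combinations(train_X, 2)]
    return statistics.mean(dists) + z * statistics.pstdev(dists)


def in_domain(train_X, query, threshold):
    """Treat a prediction as reliable if the nearest training NM lies within the APD."""
    nearest = min(euclidean(x, query) for x in train_X)
    return nearest <= threshold


# Hypothetical training points in a two-descriptor space (not from the dataset).
train_X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
threshold = apd_threshold(train_X, z=0.5)

inside = in_domain(train_X, (0.5, 0.5), threshold)
outside = in_domain(train_X, (5.0, 5.0), threshold)
```

Here the query at (0.5, 0.5) falls inside the domain, while the query at (5.0, 5.0) lies far beyond the threshold and its prediction would be regarded as unreliable. In the same way, a new NM would be compared against the reported APD value of 1.951.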

Justification of the descriptors implemented
There are four main mechanisms for NP toxicity: (i) the release of chemical constituents from nanomaterials; (ii) the size and shape of particle, which produce steric hindrances or interference with important binding sites of macromolecules; (iii) the surface properties of the material such as photochemical and redox potentials; and (iv) the capacity of nanomaterials to act as vectors for the transport of other toxic chemicals to sensitive tissues [13].
Core size was found to be one of the significant descriptors of the model presented herein, and this is in agreement with mechanism (ii). The effect of core size on toxicity has been reported in the past, where a trend of decreasing core size has been connected to greater toxicity [62][63][64]. Earlier studies by Karlsson et al. compared metal oxide NPs of various sizes when they were exposed to adenocarcinomic human alveolar basal epithelial A549 cells [65]. Their findings showed that the toxicity of NPs cannot be generalized solely on the basis of their core size. The toxicity of CuO NPs increased with the observed decrease in core size, whereas the toxicity of TiO2, Fe2O3, and Fe3O4 was unaffected by core size, regardless of having the same chemical composition. Furthermore, Warheit and co-workers reported that the toxicity of TiO2 particles was determined by surface properties, rather than size and surface area [66]. Similarly, Ivask and colleagues discovered that while human colorectal adenocarcinoma (Caco2) cells ingested TiO2 particles, no cytotoxicity was induced [4].
Facets are among the surface properties that have a significant influence on the toxicity of metal oxides. Liu and colleagues observed that faceted TiO2 metal oxides (e.g. {001} facet) are more toxic than spherical metal oxides, due to their preferentially exposed crystallographic facets with large densities of unsaturated bonds [67]. The precise property-activity relationship was used to evaluate the toxicity of faceted metal oxides, using TiO2 bipyramids with varying percentages of exposed {001} and {101} facets on the surface. The {001} facet was found to elicit more severe toxicity than the {101} facet, and this was attributed to the high production rate of hydroxyl radicals on the {001} surface. Similarly, differently exposed facets ({100} and {111}) of Cu2O nanocrystals were reported to have varied cytotoxic effects on RAW 264.7 cells [68]. Plausible mechanistic explanations were attributed to the formation of hydroxyl radicals on the facet surfaces of Cu2O for short-term exposure, and the release of Cu+ ions for long-term exposure. The {100} facet was considered to release a higher concentration of Cu ions than the {111} facet. This was attributed to the alternating stacking of Cu+ and O2− ions on the {100} facet, while the Cu+ ions in the {111} facet are packed between the O2− ions, making the release of Cu2+ ions more challenging [69].
Chemical stability, which is connected to particle dissolution, catalytic properties, and redox alteration on the surface, is the most important regulating parameter for metal oxide toxicity [70]. This is consistent with the aforementioned mechanisms (i) and (iii). Ions can be released by breaking chemical bonds in the metal oxide's lattice structure. Such reactions are widespread near the material's surface and are influenced by the metal oxide's exposed facet, as indicated for Cu2O NMs [71]. Furthermore, the growth of metal oxide NMs towards certain facets is strongly linked to a specific lattice energy, which defines the dissolution of NMs without oxidation or reduction. As a result, the lattice energy of metal oxides varies depending on their facets, and hence, their stability varies.
Negative values for lattice energy increase with increasing cation charge (n). Similarly, the increase in positive value of enthalpy of formation is associated with an increase in cation charge. Consequently, the release of a cation $\mathrm{Me}^{n+}$ with a smaller charge is more energetically favoured than the release of ions with larger cation charges. This explains why the toxicity of the studied metal oxides decreases in the following order: Me2+ > Me3+ > Me4+. Furthermore, because the creation of $\mathrm{Me}^{n+}$ cations necessitates sublimation followed by ionization processes, the enthalpy of formation ($\Delta H_f$) is linked to the sum of ionization potentials of a specific metal, and thus can be calculated as
$$\Delta H_f = \Delta H_s + \sum_{i=1}^{n} IP_i \tag{6}$$
where $\Delta H_s$ is the enthalpy of sublimation and $IP_i$ is the i-th ionization potential of the metal.
Enthalpy of formation is not connected to metal oxide NM size, which supports our previous assertion that size cannot be the only contributing factor to metal oxide NM toxicity. Furthermore, as much as enthalpy of formation influences the release of metal ions, it also serves as an indicator of the average metal-oxygen bond strength [72]. The metal-oxygen bond strengths vary within the same crystal structure, and this is most noticeable with different facets. This explains why the ion release varies for different facets of the same metal oxide crystallographic structure.
Curvature is a fundamental variable that modulates the forces and controls the size and shape of NMs. The shape index is defined as follows:
$$S = \frac{2}{\pi}\arctan\left(\frac{K_2 + K_1}{K_2 - K_1}\right) \tag{7}$$
where $K_1$ and $K_2$ are the maximum and minimum principal curvatures, respectively. It represents a visual interpretation of curvature with a single range of values (−1 ≤ S ≤ 1). A positive value depicts a concave surface, a negative value depicts a convex surface, and a zero value represents a near flat surface. Furthermore, the surface curvature radius of an NP determines the chemical potential (µ) of atoms on an NP surface, which is given by the Young-Laplace equation [73]:
$$\mu = \frac{2\gamma\Omega}{R} \tag{8}$$
where R denotes the radius of a spherical NP, Ω denotes the volume of the particles, and γ is the surface energy. Chemical potential determines the crystallographic structure, ionicity of metal-oxygen bonds, and electrostatic potential. The latter is essential for explaining experimentally observed interactions between NPs and the cellular membrane, where the positively charged surface of the NP tends to be attracted to the negatively charged cell membrane surfaces. Meanwhile, induced stress on the crystallographic structure tends to alter the unit cell parameters and, thus, causes structural changes such as size [74]. Ionicity is inversely proportional to size, but has a direct influence on properties such as chemical reactivity [75]. However, the observed behaviour of metal oxides is linked to the arrangement of atoms on the exposed surface (facet). As a result, a relationship is established between the facets of metal oxides and their chemical potential. Consequently, using chemical potential as a descriptor aids in model interpretation and demonstrates how facets influence metal oxide toxicity.
The electronegativity count of metal is defined as follows: where metal is defined by metal = * ; is defined by Z v metal is the valence electron of metal; Z metal is the atomic number of metal.
Electronegativity was also used as a descriptor in the model. The electronegativity (χ) value for a given metal oxide is strongly related to the electronegativity of the corresponding cation. The cation electronegativity depends on the ionic radius and formal charge of the cation; i.e. higher cation electronegativity values characterize cations with a broad charge distribution over a narrow atomic radius. Because electronegativity is a measure of an atom's tendency to attract a bonding pair of electrons, increasing cationic electronegativity will enhance the catalytic property and, hence, the toxicity of metal oxide NPs.
The electronegativity scale also gives the potential of a metal oxide to transfer an electron in a chemical reaction. It is an electronic property that relates to the HOMO and LUMO energies through Eqs. 10 and 11, where $E_{HOMO}$ and $E_{LUMO}$ denote the potentials of the highest occupied and lowest unoccupied molecular orbitals, respectively; $\chi_m$ and $\chi_O$ denote the absolute electronegativity of the metal and oxygen atoms, respectively [76]; a and b are the numbers of metal and oxygen atoms in the chemical formula; n is the total number of atoms in the chemical formula; $E_o$ is the standard electrode potential, which assumes the value of 4.5 eV on the hydrogen scale; and $E_g$ is the estimated band gap.
Changes in the wave function due to quantum confinement of electrons result in a change of electronic properties for metal oxides. Hence, the electronegativity of the same kind of atom in different systems is not the same. Moreover, for different facets of the same material, the electron wave function is not the same, and thus neither is the electronegativity. It follows that different facets of a metal oxide will yield different oxidative and reductive reactions as a result of their different electronic properties (Fig. 3). This result was also observed experimentally and reported in the literature [8].
Metal electronegativity count was identified as a significant descriptor for predicting the toxicity of metal oxides and is related to the band gap energy ($E_g$), the LUMO, and the HOMO according to Eqs. 10 and 11. Thus, while the model descriptor is the electronegativity with respect to the exposed facet, the band edges can be used to describe the toxicological potential of metal oxides in cellular environments where the aqueous redox potential ranges from −4.12 to −4.84 eV [77]. In our study, HOMO energies for La2O3, Sb2O3, In2O3, TiO2, and NiO metal oxides fall within the range of the aqueous redox potential. These findings show that the apparent toxicity of metal oxides is due to a distinct mechanism involving the ability to transport electrons between the surface of NPs and intracellular redox couples. When a metal oxide is irradiated with UV light, electrons are excited from the valence band into the conduction band, leaving unoccupied electron states (holes). The holes (h+) are capable of transferring between biological media and the metal oxide, reacting with OH− and/or H2O to produce hydroxyl radicals (•OH). The free electrons may react with O2 to form superoxide radical anions (O2•−). These reactive oxygen species (ROS) are capable of causing membrane disruptions that can lead to cell death [78][79][80].

Conclusion
The study used toxicity data from the literature to develop a (Q)SAR model for the prediction of the toxicity of metal oxide NPs using a combination of physicochemical, molecular, and periodic table-based descriptors. The additional descriptors used in this study were calculated with the aid of DFT on the basis of up to 3 different crystal facets per NM composition (i.e. {100}, {110} and {111}) and derived from TEM images of the NMs (extracted from the original publications and processed via the NanoXtract tool). The materials' core size, chemical potential, enthalpy of formation, and electronegativity count of metal oxides were found to be the most significant descriptors of the model. All DFT calculations for the metal oxides were based on low index facets. Metal oxide chemical space was limited to four types of metal oxides (M3O4, M2O3, MO2, and MO). A model that is both reliable and simple for the theoretical assessment of the toxicity of untested metal oxides was successfully developed and validated following the OECD guidelines for the validation of (Q)SAR models for regulatory purposes. The model provides an insight into how facet tuning could lead to different degrees of toxicity. Finally, the types of mechanistic pathways that can lead to toxicity have been explained based on the electronic properties of the metal oxide NMs. The structure-activity relationships defined in the study could play a vital role in the design of safer nanomaterials.

Acknowledgements
The authors are grateful to the Center for High Performance Computing (CHPC) in Cape Town, for the computational resources provided.

Fig. 3
A faceted metal oxide demonstration with specific behaviours observed on its various facets (surfaces). Some facets are inactive and thus non-responsive, while others are highly reactive. The variable electron wavefunction experienced on each facet can be used to explain this occurrence