Molecular vibrations described by normal modes are in general delocalized over the molecular structure of interest [1]. This delocalization hampers the direct comparison of the vibrations of a molecule in different environments (e.g., gas phase and solution) or of a molecule before and after it binds to a host compound. Over the past two decades, considerable effort has been put into finding “localized” normal vibrational modes (which can also be called “localized modals” by analogy with “localized orbitals” [2]) that belong to a molecule within a noncovalently bonded complex or to a fragment of one molecule. The proposed approaches include partial Hessian diagonalization [3], partial Hessian vibrational analysis (PHVA) [4, 5], mobile block Hessian (MBH) [6,7,8,9,10], vibrational subsystem analysis (VSA) [11,12,13], and local Hessian transformation [14], to name just a few [15]. However, all these approaches share a common deficiency: the unphysical partitioning of the full Hessian matrix, which causes loss of information about the interaction between the subsystem and its environment.

Recently, we proposed the generalized subsystem vibrational analysis (GSVA) as a new solution to obtain intrinsic fragmental vibrations [16, 17]. The key feature of GSVA compared to its predecessors lies in avoiding the partitioning of the full Hessian matrix. Instead, GSVA extracts for the subsystem a unique effective Hessian matrix \(\mathbf {F}^x_\mathrm{sub}\)

$$\begin{aligned} \mathbf {F}^x_\mathrm{sub} = \mathbf {B}^{\prime \dagger }_\mathrm{sub} (\mathbf {B}^{\prime } (\mathbf {F}^x)^{+} \mathbf {B}^{\prime \dagger })^{-1} \mathbf {B}^{\prime }_\mathrm{sub} \end{aligned}$$
(1)

where \(\mathbf {F}^x\) is the full Hessian matrix expressed in Cartesian coordinates, of dimension (\(3N\times 3N\)), for the whole molecular system composed of N atoms, of which n atoms belong to the target subsystem and \(N-n\) atoms to the environment. The Wilson \(\mathbf {B}\)-matrices [1] \(\mathbf {B}^{\prime }\) and \(\mathbf {B}^{\prime }_\mathrm{sub}\) both define, row by row, a non-redundant set of (\(3n-k_\mathrm{sub}\)) internal coordinates for the subsystem fragment; \(\mathbf {B}^{\prime }\) has the full 3N columns, whereas \(\mathbf {B}^{\prime }_\mathrm{sub}\) is truncated to the 3n columns of the subsystem atoms (excluding the environment atoms). \(k_\mathrm{sub}\) is the total number of translations and rotations of the subsystem, which is 5 for a linear and 6 for a nonlinear subsystem geometry. \((\mathbf {F}^x)^+\) is the Moore–Penrose inverse [18] of \(\mathbf {F}^x\), which is used because \(\mathbf {F}^x\) is singular. \(\mathbf {F}^x_\mathrm{sub}\) on the left-hand side is a symmetric matrix of dimension \((3n\times 3n)\) with exactly \(k_\mathrm{sub}\) zero eigenvalues. The \(\dagger\) superscript denotes the matrix transpose.

With the effective Hessian matrix \(\mathbf {F}^x_\mathrm{sub}\) expressed in Cartesian coordinates, the conventional normal mode analysis (NMA) machinery [19, 20], which is implemented in most quantum chemical packages, can be employed to calculate a new type of localized normal modes for the subsystem, which we termed intrinsic fragmental vibrations. These normal vibrations are called “intrinsic” because the effective Hessian matrix \(\mathbf {F}^x_\mathrm{sub}\) retains the curvature of the potential energy surface (PES) in the direction defined by any internal coordinate within the subsystem [16, 17]. In other words, the subsystem fragment “feels” exactly the same curvature of the PES as the whole system described by the full Hessian matrix \(\mathbf {F}^x\). This property of \(\mathbf {F}^x_\mathrm{sub}\) gives the GSVA method a solid physical basis [16].
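One way to make the curvature statement concrete is sketched here, under the assumption that the curvature along an internal coordinate is measured through the Moore–Penrose inverses of the two Hessians. The symbols \(\mathbf {b}\) and \(\mathbf {b}_\mathrm{sub}\) are introduced only for this illustration: \(\mathbf {b}_\mathrm{sub}\) is the \(1\times 3n\) Wilson row vector of an internal coordinate within the subsystem (a linear combination of the rows of \(\mathbf {B}^{\prime }_\mathrm{sub}\)), and \(\mathbf {b}\) is the same vector padded with zeros to length 3N. Inserting Eq. 1 then gives

$$\begin{aligned} \mathbf {b}_\mathrm{sub} (\mathbf {F}^x_\mathrm{sub})^{+} \mathbf {b}^{\dagger }_\mathrm{sub} = \mathbf {b} (\mathbf {F}^x)^{+} \mathbf {b}^{\dagger } \end{aligned}$$

so that the subsystem described by \(\mathbf {F}^x_\mathrm{sub}\) and the whole system described by \(\mathbf {F}^x\) assign the same stiffness to that coordinate.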

As discussed in our earlier work on the original GSVA implementation [16], GSVA requires a complete and non-redundant set of internal coordinate parameters whose Wilson \(\mathbf {B}\)-matrices (see Eq. 1) span the internal vibration space of the subsystem. However, constructing this non-redundant parameter set is nontrivial: it requires either judicious manual selection of parameters based on expert knowledge, or a dedicated algorithm that automatically selects the non-redundant parameters from a redundant set in a trial-and-error manner.

In this work, we propose an alternative formulation of GSVA that removes the need to construct the non-redundant parameter set for the subsystem. The new implementation replaces the Wilson \(\mathbf {B}\)-matrices in Eq. 1 with a different matrix that also spans the internal vibrational space of the subsystem and is obtained via the following procedure.

First, we collect the Cartesian coordinates of the subsystem fragment in a \(3n\times 1\) column vector \(\mathbf {R}_\mathrm{cart}\) and apply the massless Eckart conditions [21, 22] (i.e., all atomic masses are assumed to be identical) to generate a set of five or six mutually orthonormal translational and rotational vectors; see Eq. 2:

$$\begin{aligned} \mathbf {R}_{tr.+ro.} = \{\mathbf {r}_1, ..., \mathbf {r}_i, ..., \mathbf {r}_k \} \end{aligned}$$
(2)

where k equals 5 or 6 depending on whether the subsystem is linear or not and \(\mathbf {r}_i\) is a column vector of length 3n.
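As a minimal sketch of this first step, the Python/NumPy function below builds the translational and rotational vectors of a fragment with all atoms weighted equally. The function name `eckart_tr_rot`, the tolerance, and the use of a singular value decomposition to orthonormalize the six generators and to detect the linear case are assumptions made for illustration; they are not taken from the rev-GSVA implementation.

```python
import numpy as np


def eckart_tr_rot(coords, tol=1e-8):
    """Return a (3n, k) matrix whose orthonormal columns span the massless
    translations and rotations of a fragment with Cartesian coordinates
    `coords` of shape (n, 3); k is 5 for a linear fragment, 6 otherwise."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    # Massless Eckart conditions: all atoms carry the same weight, so the
    # reference point is the geometric center rather than the center of mass.
    centered = coords - coords.mean(axis=0)

    raw = []
    for axis in np.eye(3):
        # Rigid translation: every atom moves along the same Cartesian axis.
        raw.append(np.tile(axis, n))
        # Infinitesimal rotation: atom i moves along (axis x r_i).
        raw.append(np.cross(axis, centered).ravel())
    raw = np.array(raw).T  # shape (3n, 6)

    # The SVD orthonormalizes the six generators and exposes linear
    # dependence (one rotation is missing for a linear fragment).
    u, s, _ = np.linalg.svd(raw, full_matrices=False)
    return u[:, s > tol * s.max()]
```

For a nonlinear fragment such as methane the returned matrix has six columns, while a linear triatomic fragment yields five.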

Next, a Gram–Schmidt orthonormalization starting from \(\mathbf {R}_{tr.+ro.}\) is conducted to generate the remaining \(n_\mathrm{vib} = 3n-k\) vectors, which are orthonormal to each other and to \(\mathbf {R}_{tr.+ro.}\) and are collected in \(\mathbf {V}\)

$$\begin{aligned} \mathbf {V} = \{\mathbf {v}_1, ..., \mathbf {v}_j, ..., \mathbf {v}_{n_\mathrm{vib}} \} \end{aligned}$$
(3)

where \(\mathbf {v}_{j}\) is a column vector of length 3n. Note that matrix \(\mathbf {V}\) is equivalent to matrix \(\mathbf {B}^{\prime \dagger }_\mathrm{sub}\) in that it spans the internal coordinate/vibration space of the subsystem. To obtain the equivalent of matrix \(\mathbf {B}^{\prime \dagger }\), we pad each column vector \(\mathbf {v}_j\) with \(3(N-n)\) zeros associated with the environment atoms, resulting in the matrix \(\mathbf {V}_\mathrm{full}\) of dimension \((3N\times n_\mathrm{vib})\).
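The completion and padding steps can be sketched as follows in Python/NumPy. The function names, the choice of Cartesian unit vectors as Gram–Schmidt trial vectors, and the 0-based atom index list `sub_atoms` are hypothetical choices made for this illustration, not details of the published implementation.

```python
import numpy as np


def vibrational_basis(R, tol=1e-8):
    """Extend the orthonormal columns of R (3n, k) by Gram-Schmidt to the
    (3n, 3n - k) matrix V spanning the remaining, internal subspace."""
    dim = R.shape[0]
    basis = [R[:, i] for i in range(R.shape[1])]
    n_fixed = len(basis)
    # Trial vectors: Cartesian unit vectors (an arbitrary but complete choice).
    for trial in np.eye(dim):
        v = trial.copy()
        # Two projection passes keep the result orthogonal despite round-off.
        for _ in range(2):
            for b in basis:
                v -= (b @ v) * b
        norm = np.linalg.norm(v)
        if norm > tol:  # skip trials already contained in the current span
            basis.append(v / norm)
        if len(basis) == dim:
            break
    return np.array(basis[n_fixed:]).T


def pad_to_full(V, sub_atoms, n_atoms_total):
    """Place the rows of V at the Cartesian slots of the subsystem atoms and
    fill the environment rows with zeros, giving V_full of shape (3N, n_vib)."""
    rows = np.array([3 * a + c for a in sub_atoms for c in range(3)])
    V_full = np.zeros((3 * n_atoms_total, V.shape[1]))
    V_full[rows, :] = V
    return V_full
```

Any other complete set of trial vectors would give a different \(\mathbf {V}\) spanning the same subspace, so the choice of unit vectors here is only a matter of convenience.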

In this way, the effective Hessian matrix \(\mathbf {F}^x_\mathrm{sub}\) for the subsystem can be written as

$$\begin{aligned} \mathbf {F}^x_\mathrm{sub} = \mathbf {V} (\mathbf {V}^{\dagger }_\mathrm{full} (\mathbf {F}^x)^{+} \mathbf {V}_\mathrm{full})^{-1} \mathbf {V}^{\dagger }. \end{aligned}$$
(4)
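A minimal Python/NumPy sketch of Eq. 4 is given below, assuming real-valued matrices so that the \(\dagger\) operation reduces to a transpose. The helper `sym_pinv`, its relative threshold for treating eigenvalues as zero, and the 0-based index list `sub_atoms` are assumptions for illustration; in practice the threshold would have to be matched to the numerical noise of the computed Hessian.

```python
import numpy as np


def sym_pinv(A, rel_tol=1e-8):
    """Moore-Penrose inverse of a symmetric matrix via its eigendecomposition;
    eigenvalues below rel_tol * max|eigenvalue| are treated as exact zeros."""
    w, U = np.linalg.eigh(A)
    keep = np.abs(w) > rel_tol * np.abs(w).max()
    return (U[:, keep] / w[keep]) @ U[:, keep].T


def effective_hessian(F_full, V, sub_atoms):
    """Evaluate Eq. 4: F_sub = V (V_full^T F^+ V_full)^-1 V^T."""
    rows = np.array([3 * a + c for a in sub_atoms for c in range(3)])
    V_full = np.zeros((F_full.shape[0], V.shape[1]))
    V_full[rows, :] = V  # environment rows stay zero
    # n_vib x n_vib matrix in the internal space of the subsystem
    compliance = V_full.T @ sym_pinv(F_full) @ V_full
    return V @ np.linalg.inv(compliance) @ V.T
```

Because the eigendecomposition keeps the sign of each eigenvalue, the same helper can be applied to a Hessian with a negative eigenvalue, as encountered at a transition state.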

Then, the conventional NMA machinery is applied to obtain the intrinsic fragmental vibrations for the subsystem as in our earlier formulation [16]. The new formulation has two major advantages for practical implementation: (1) it avoids the complicated process of finding the complete and non-redundant internal coordinate parameter set for the subsystem, and (2) the code in a modern quantum chemical package that finds the translation/rotation vectors from the Eckart conditions and conducts the Gram–Schmidt orthonormalization can be reused, which facilitates implementing GSVA. To distinguish it from the original implementation of GSVA, we name this revised implementation rev-GSVA.
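The final normal mode step can be sketched in Python/NumPy as below. This is an illustrative reduction of a conventional NMA, not the code used in any particular package: it assumes the effective Hessian is given in hartree/bohr\(^2\) and the masses in amu, removes the `n_zero` eigenvalues closest to zero instead of projecting out translations and rotations explicitly, and reports imaginary frequencies as negative numbers. The function name and the constant names are hypothetical.

```python
import numpy as np

# CODATA 2018 constants for converting sqrt(hartree / (bohr^2 amu)) to cm^-1.
HARTREE_J = 4.3597447222071e-18
BOHR_M = 5.29177210903e-11
AMU_KG = 1.66053906660e-27
C_CM_S = 2.99792458e10
TO_WAVENUMBER = np.sqrt(HARTREE_J / (BOHR_M**2 * AMU_KG)) / (2 * np.pi * C_CM_S)


def fragment_frequencies(F_sub, masses, n_zero):
    """Vibrational wavenumbers (cm^-1) from a (3n, 3n) Cartesian Hessian in
    hartree/bohr^2 and n atomic masses in amu. The n_zero eigenvalues closest
    to zero (translations/rotations) are dropped; negative values stand for
    imaginary frequencies."""
    inv_sqrt_m = 1.0 / np.sqrt(np.repeat(np.asarray(masses, dtype=float), 3))
    # Mass-weighted Hessian: F_ij / sqrt(m_i m_j)
    F_mw = F_sub * np.outer(inv_sqrt_m, inv_sqrt_m)
    eigvals = np.linalg.eigvalsh(F_mw)
    order = np.argsort(np.abs(eigvals))
    vib = np.sort(eigvals[order[n_zero:]])
    return np.sign(vib) * np.sqrt(np.abs(vib)) * TO_WAVENUMBER
```

For a subsystem Hessian obtained from Eq. 4, `n_zero` would be set to k, i.e., 5 for a linear and 6 for a nonlinear subsystem.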

As a showcase example, we have employed rev-GSVA to calculate the intrinsic fragmental vibrations of the methane (CH\(_4\)) molecule in (1) the methane-intercalated B\(_{36}\)N\(_{36}\) complex (Fig. 1a), (2) the methane-intercalated C\(_{60}\) structure (Fig. 1b) [23] and (3) the gas phase as reference. Unlike the methane-intercalated C\(_{60}\) complex, the methane-intercalated B\(_{36}\)N\(_{36}\) system [24] has not been synthesized so far, and it is interesting to compare the intrinsic fragmental vibrations of the methane molecule in B\(_{36}\)N\(_{36}\) and C\(_{60}\) in order to explore the different encapsulation effects. These three molecular systems were modeled at the M06-2X/6-31G(d,p) level [25,26,27] with Grimme’s D3(0) dispersion correction [28] using the Gaussian 16 package [29].

Fig. 1 Structures of (a) a methane molecule encapsulated in the B\(_{36}\)N\(_{36}\) cage with \(T_d\) symmetry and (b) methane encapsulated in fullerene (C\(_{60}\)) with T symmetry

The results in Table 1 show that the methane molecules encapsulated inside the two cages retain \(T_d\) symmetry, as does the reference methane molecule in the gas phase. The non-degenerate A\(_1\) vibration describes the symmetric stretching of the four C–H bonds. The doubly degenerate E modes specify the relative turnstile twisting motions of two H–C–H fragments. The triply degenerate T\(_2\) modes (1–3) with lower frequencies denote the bending of methane, while the other triply degenerate T\(_2\) modes (1’–3’) with higher frequencies are antisymmetric stretching motions of the four C–H bonds. We found the largest deviations in the vibrational frequencies relative to methane in the gas phase for the A\(_1\) mode and the T\(_2\)(1’–3’) modes, both of which involve C–H bond stretching. For the methane molecule contained in the B\(_{36}\)N\(_{36}\) cage, these two vibrations are redshifted by 29 and 38 cm\(^{-1}\), respectively, compared to the gas phase. In contrast, the same two vibrations of methane in fullerene are blueshifted by 70 and 45 cm\(^{-1}\), respectively. This suggests that the fullerene cage strengthens the C–H bonds of methane, while the B\(_{36}\)N\(_{36}\) cage weakens the C–H bonds of the contained methane molecule. One might argue that comparing the normal mode frequencies of the whole system (the approach spectroscopists usually adopt) could lead to a similar conclusion because the methane molecule is well separated from the cage structure; however, only the (rev-)GSVA method provides localized normal modes and frequencies that can be legitimately compared across different systems containing the same target subsystem.

Table 1 Normal mode frequencies (in cm\(^{-1}\)) of CH\(_4\) determined with rev-GSVA in different environments

In addition to local minima on the PES, we have also tested a first-order saddle point (i.e., transition state, TS) structure. Since it has been proven in our earlier work [16] that GSVA retains the curvature of the PES, it is of interest to explore whether (rev-)GSVA can retain the imaginary vibrational mode describing the bond breaking/forming in the subsystem. In this pilot study, we investigated the TS of the chemical reaction involving a potential \(\alpha\)-ketoamide inhibitor of the SARS-CoV-2 main protease (Mpro), which is assumed to inhibit the activity of the SARS-CoV-2 virus by blocking viral replication [30]. According to a recent X-ray structure of the \(\alpha\)-ketoamide–Mpro complex [30], the ketoamide and the enzyme are linked via a cysteine side chain of the enzyme. According to the suggested catalytic mechanism, the chemical reaction starts with a nucleophilic attack of the sulfur atom on a C=O carbon atom of the \(\alpha\)-ketoamide moiety, which is followed by proton transfer from the –SH group to a nearby oxygen atom of the inhibitor, as shown in Fig. 2. Work is in progress to model the reaction in the enzyme.

Fig. 2 Transition-state structure of proton transfer from methanethiol to an \(\alpha\)-ketoamide inhibitor [30]. The methanethiol group is a simplified model of cysteine in the SARS-CoV-2 main protease. The minimal 3-atom subsystem is highlighted in green. The 5-atom subsystem is highlighted in green and blue. The 8-atom subsystem includes the 5-atom subsystem and the atoms highlighted in orange. The 15-atom subsystem includes the 8-atom subsystem and the atoms highlighted in purple. The 20-atom subsystem includes the 15-atom subsystem and the atoms highlighted in cyan. This system was modeled at the B3LYP/6-31G(d,p) level in Gaussian 16

We started with a minimal subsystem of 3 atoms containing the proton and its donor/acceptor atoms, and rev-GSVA was applied to calculate the corresponding intrinsic fragmental vibrations. Surprisingly, no imaginary frequency exists for this small 3-atom subsystem. This is probably because the reaction center should also include the carbon atom in order to give a more complete picture of bond forming and breaking. However, one imaginary frequency emerges when more surrounding atoms are included in the subsystem (see Table 2), and its value quickly converges to that of the full system once the subsystem contains 20 atoms. This result indicates that if a normal vibration is localized in a particular part of the molecule (e.g., bond breaking/forming or C=O bond stretching), (rev-)GSVA is expected to reproduce this vibration using a subsystem containing this segment and a few surrounding atoms. This feature of (rev-)GSVA can provide important first insights into the role of the surrounding atoms in the reaction mechanism through an analysis of the TS, before a more complex reaction path following procedure is started.

Table 2 Imaginary frequency (in cm\(^{-1}\)) within the subsystem of different sizes for the transition state of proton transfer reaction

We have implemented the new formulation of GSVA (rev-GSVA) introduced in this work into the open-source package UniMoVib [20] for interested readers to use in their own research.