Abstract
Break junction (BJ) measurements provide insights into the electrical properties of diverse molecules, enabling the direct assessment of single-molecule conductances. The BJ method displays potential for use in determining the dynamics of individual molecules, single-molecule chemical reactions, and biomolecules, such as deoxyribonucleic acid and ribonucleic acid. However, conductance data obtained via single-molecule measurements may be susceptible to fluctuations due to minute structural changes within the junctions. Consequently, clearly identifying the conduction states of these molecules is challenging. This study aims to develop a method of precisely identifying conduction states in individual traces. We propose a novel single-molecule analysis approach that employs total variation denoising (TVD) in signal processing, focusing on the integration of information technology with measured single-molecule data. We successfully applied this method to simulated conductance traces, effectively denoising the data and elucidating multiple conduction states. The proposed method facilitates the identification of well-defined plateau lengths and supervised machine learning with enhanced accuracy. The introduced TVD-based analytical method is effective in elucidating the states within the measured single-molecule data. This approach exhibits the potential to offer novel perspectives regarding the formation of molecular junctions, conformational changes, and cleavage.
1 Introduction
Single-molecule conductance measurements have garnered significant attention in diverse applications ranging from the realization of molecular devices to the development of novel analytical techniques and the exploration of nanoscale physical properties [1,2,3]. Prominent techniques for measuring single-molecule conductance include the mechanically controllable break junction (BJ) and scanning tunneling microscope BJ techniques, both of which are based on the BJ method [4,5,6]. This method involves breaking the atomic contacts of metals to yield metal nanogap electrodes. When a molecule forms a bridge between the nanogap electrodes, the resulting tunneling current manifests as a conductance plateau in the measurement trace. Accurately discerning the conductance state of the single molecule using this trace is crucial in detecting chemical reactions [7,8,9] and identifying deoxyribonucleic acid nucleobases within these nanogaps [1, 10, 11]. However, unlike conventional methods that analyze multiple molecules, the BJ method focuses on measuring the conductance of a single molecule. Consequently, this measurement is highly susceptible to external noise and fluctuations in the molecular junction structure, leading to considerable variations in the observed conductance and impeding the analyses of individual traces containing single-molecule data [12,13,14,15]. Moreover, certain single-molecule junctions exhibit multiple conduction states owing to differences in the junction structures or molecular oxidation states [16,17,18]. The dynamics of single-molecule junctions remain inadequately understood because individual conductance traces cannot be analyzed reliably. To mitigate noise, the construction of conductance histograms is a typical analytical method employed in BJ studies [3, 4]. However, the histograms often display broad peaks attributable to substantial variations in the conductance traces, rendering differentiation between distinct conduction states challenging.
Recently, remarkable advancements in analytical techniques have been reported, particularly with respect to machine learning [19]. The integration of machine learning methodologies has progressively expanded to single-molecule measurements [10, 20,21,22,23,24,25,26,27]. Innovative analytical techniques, such as supervised machine learning and unsupervised clustering methods, have enabled the classification of various types of conductance traces. However, current methods of single-molecule signal analysis have yet to provide clear insights into individual conductance traces without relying on statistical signals. The application of novel signal processing and optimization techniques to single-molecule experimental data exhibits potential in detecting changes in the conduction state within a conductance trace. The aim of this study was to develop an analytical method of discriminating between different states based on established signal processing techniques. Our objective was to eliminate the reliance on statistical signals and derive precise data directly from individual conductance traces.
2 Method
2.1 Total variation denoising
We propose total variation denoising (TVD)-based signal reconstruction as a novel method of analyzing single-molecule conductance traces. A schematic of the proposed method is shown in Fig. 1. TVD minimizes the objective function given in Eq. 1 [28, 29]:

$$\min_{G_{\mathrm{rec}}}\; \frac{1}{2}\sum_{n}\left(G_{\mathrm{rec},n}-G_{\mathrm{raw},n}\right)^{2}+\lambda\sum_{n}\left|G_{\mathrm{rec},n+1}-G_{\mathrm{rec},n}\right| \qquad (1)$$

Here, G_raw,n and G_rec,n respectively represent the conductances at the n-th point in the raw and reconstructed conductance traces, and λ is the regularization parameter. In this study, differences in conductance were assessed using a logarithmic scale. For implementation, we employed the alternating direction method of multipliers (ADMM) to minimize the nondifferentiable total variation [30, 31]. The analysis was performed with Python 3.10.9, using numpy version 1.23.5 and scikit-learn version 1.2.1 [32].
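The minimization described above can be sketched in a few lines of numpy. The following is a minimal illustration rather than the implementation used in this study: it assumes the standard one-dimensional TVD objective (half the squared difference between the reconstructed and raw log-conductance traces plus λ times the sum of absolute differences between adjacent reconstructed points), and the function names `tvd_admm` and `soft_threshold`, the penalty parameter `rho`, and the fixed iteration count are all hypothetical choices.

```python
import numpy as np


def soft_threshold(v, kappa):
    """Element-wise soft thresholding, the proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)


def tvd_admm(y, lam=1.0, rho=1.0, n_iter=500):
    """Sketch of 1D total variation denoising solved with ADMM.

    Minimizes 0.5 * sum((x - y)**2) + lam * sum(|x[n+1] - x[n]|),
    where y is a raw trace (here: log conductance) and x the reconstruction.
    rho and n_iter are illustrative settings, not values from the paper.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    # First-difference operator D with shape (n-1, n): (D @ x)[i] = x[i+1] - x[i]
    D = np.diff(np.eye(n), axis=0)
    # The x-update solves (I + rho * D^T D) x = y + rho * D^T (z - u).
    # A dense inverse is acceptable for short traces; longer traces would
    # call for a banded or sparse solver instead.
    A_inv = np.linalg.inv(np.eye(n) + rho * D.T @ D)

    x = y.copy()
    z = D @ x               # auxiliary variable standing in for D @ x
    u = np.zeros(n - 1)     # scaled dual variable
    for _ in range(n_iter):
        x = A_inv @ (y + rho * D.T @ (z - u))
        z = soft_threshold(D @ x + u, lam / rho)
        u = u + D @ x - z
    return x
```

Calling `tvd_admm(trace, lam=1.0)` on a noisy two-level trace returns an approximately piecewise-constant reconstruction. A fixed number of iterations keeps the sketch short; a practical implementation would more likely stop on the primal and dual residuals.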
2.2 Generation method of simulated conductance traces
In this study, the proposed analytical method was applied to simulated single-molecule traces. Three classes of simulated traces were generated as follows. First, the ground truth traces were generated. In the molecular conductance region, the plateau conductance (log(G/G0), where G0 = 2e²/h is the conductance quantum, and e and h are the elementary charge and Planck constant, respectively) was drawn from a normal distribution with center G and standard deviation σG, and the plateau length was drawn from a normal distribution with center L and standard deviation σL. The parameters are shown in Table 1. Here, we assumed the conductance traces of typical small organic molecules in BJ measurements, such as 4,4’-bipyridine and 1,4-benzenedithiol [4, 5, 12, 17, 33, 34]. These molecules have multiple conductance states [12, 17, 33, 34]. Some traces exhibit both conductance states, whereas others exhibit only one. For instance, 4,4’-bipyridine has high- and low-conductance states with a log conductance difference of about 0.5 [17, 34]. We set σG = 0.1 because the fluctuation of the observed plateau conductance has a width of about 0.1 on the log scale [13, 17, 34]. In single-molecule measurements, a plateau at two or three times the single-molecule conductance is often interpreted as a two- or three-molecule junction [4, 13, 18]. Therefore, it is reasonable to regard states with a log conductance difference smaller than log(2) ≈ 0.3 as identical. For the plateau length, a plateau length L of 0.5–1 nm and a standard deviation σL of 0.1 nm, which are typical for small-molecule BJ measurements, were set based on experimental results [4, 33]. The stretch length was set to 0.01 nm per data point.
In the metal contact region (G > G0), the ground truth consists of two plateaus: one with a probability of occurrence of 0.3, whose conductance follows a normal distribution with center 0.3 and standard deviation 0.02, and one with a probability of occurrence of 0.7, whose conductance follows a normal distribution with center 0 and standard deviation 0.01. For all classes, the length of these plateaus follows a normal distribution with a center of 50 points and a standard deviation of 10 points. Then, Gaussian noise was added to the ground truth traces. The standard deviation of the noise is 0.1 in the metal junction region (G > G0) and 0.5 in the other regions. For each of the three classes, 1000 traces were generated.
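The generation procedure can be sketched as follows. This is an illustration under several assumptions that are not taken from Table 1: the state centers of −2.5 and −3.0, the plateau length of 0.5 nm, the broken-junction background level `tail_level` and its length `tail_points`, and the reading that each metal plateau is included independently with its stated probability are all hypothetical, as is the function name `simulate_trace`.

```python
import numpy as np


def simulate_trace(rng, state_centers=(-2.5, -3.0), sigma_g=0.1,
                   len_nm=0.5, sigma_len_nm=0.1, step_nm=0.01,
                   tail_level=-6.0, tail_points=50):
    """Return (ground_truth, noisy) log(G/G0) traces for one simulated pull.

    state_centers lists the molecular plateaus in order of appearance;
    passing a single center gives a one-plateau trace. tail_level and
    tail_points describe an assumed background after junction rupture.
    """
    segments = []
    # Metal contact region: (probability, center, standard deviation)
    for prob, center, sd in ((0.3, 0.3, 0.02), (0.7, 0.0, 0.01)):
        if rng.random() < prob:
            n_pts = max(1, int(round(rng.normal(50, 10))))
            segments.append(np.full(n_pts, rng.normal(center, sd)))
    n_metal = sum(seg.size for seg in segments)

    # Molecular region: one plateau per requested conduction state
    for center in state_centers:
        length_nm = rng.normal(len_nm, sigma_len_nm)
        n_pts = max(1, int(round(length_nm / step_nm)))
        segments.append(np.full(n_pts, rng.normal(center, sigma_g)))

    # Assumed background after the junction breaks
    segments.append(np.full(tail_points, tail_level))
    truth = np.concatenate(segments)

    # Gaussian noise: sd 0.1 in the metal region, 0.5 everywhere else
    noise_sd = np.where(np.arange(truth.size) < n_metal, 0.1, 0.5)
    noisy = truth + rng.normal(0.0, 1.0, truth.size) * noise_sd
    return truth, noisy
```

For example, `simulate_trace(np.random.default_rng(0))` gives a two-plateau trace, while `state_centers=(-2.5,)` or `state_centers=(-3.0,)` gives single-plateau traces of the kind assumed here for the other two classes.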
3 Results and discussion
Equation 1 comprises two terms. The first term is based on the L2 norm, i.e., the Euclidean distance between the reconstructed and raw signals, and quantifies their proximity. The second term, which is the sum of the absolute differences between adjacent observation points, is referred to as the total variation. This term decreases as the signal becomes flatter. The regularization parameter λ sets the weight of the total variation term: higher λ values penalize the total variation more strongly, thus reducing the variation within the reconstructed trace. The results reconstructed using TVD for conductance traces at different λ values are shown in Fig. 2. The simulated conductance trace of Class 1 is shown in Fig. 2a. Two conduction plateaus appear at log(G/G0) = −2.5 and −3. The presence of noise obscures the distinction between the two states in the signal. For comparison, Fig. 2b shows the trace reconstructed using simple moving average smoothing, which reduces the conductance variations by averaging but blurs the boundaries between conduction states. Moreover, accurate conductance determination is impeded because the conductances of the metal contacts are averaged into the molecular region. Figures 2c–f show the denoising results obtained via our proposed TVD-based method at various λ values. At smaller λ values, the first term in Eq. 1 dominates and the reconstruction remains similar to the original signal (Fig. 2c). Conversely, larger λ values generally result in the overestimation of the step effect, aligning the conductances in the metal junction and post-molecular junction break regions more closely with the molecular conductance (Fig. 2f). In addition, higher λ values tend to equalize, and thereby erase, the changes in the molecular state. Notably, a clear depiction of two distinct steps emerges in the reconstructed trace at λ = 1.
In this study, reconstructed traces with λ in the range 0.5–3 exhibit no difference in the molecular conductance region, as shown in Additional file 1: Fig. S1. Optimizing λ does not eliminate the difference from the ground truth because the analyzed traces contain noise. In the analysis of real single-molecule data, no ground truth traces are available. To determine λ, an assumption regarding which conductance states are regarded as equivalent is therefore necessary. This assumption depends on the objectives underpinning the analysis. The plot showing the total variation loss as a function of the number of iterations in the ADMM algorithm (Additional file 1: Fig. S2) confirms that the total variation loss diminishes as the conductance steps become more distinguishable. Thus, the proposed method excels in reconstructing single-molecule measurement data, emphasizing clear changes in state. Subsequently, further validation was conducted at λ = 1, where the steps are distinctly reconstructed.
Fig. 2 Reconstruction results of a simulated conductance trace. a Original trace. The blue line is the noise-free ground truth. The gray line was generated by adding Gaussian noise to the blue line. b The magenta line is the trace reconstructed by moving average smoothing. c–f The red lines are the traces reconstructed by TVD analysis with λ = 0.1, 1, 5, and 20, respectively. The gray lines in all panels are the analyzed raw conductance trace
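The blurring introduced by moving average smoothing can be seen on a noise-free step. The sketch below is only an illustration of that effect; the helper name `moving_average`, the window of 11 points, and the edge padding are arbitrary choices that are not taken from this study.

```python
import numpy as np


def moving_average(y, window=11):
    """Centered moving average with edge padding (odd window assumed)."""
    pad = window // 2
    padded = np.pad(np.asarray(y, dtype=float), pad, mode="edge")
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")


# Noise-free step between two conduction states at log(G/G0) = -2.5 and -3.0
step = np.concatenate([np.full(50, -2.5), np.full(50, -3.0)])
smoothed = moving_average(step, window=11)

# Count points that lie strictly between the two plateau levels
in_ramp = (smoothed < -2.5 - 1e-9) & (smoothed > -3.0 + 1e-9)
transition = int(np.sum(in_ramp))
```

Here the abrupt step is replaced by a ramp of `window - 1` intermediate points, so a wider window suppresses more noise but also widens the region in which the conduction state is ambiguous. The total variation penalty, by contrast, favors a piecewise-constant reconstruction.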
The observed conductance often exhibits variability in single-molecule measurements. Determining single-molecule conductance typically involves the construction of a conductance histogram using numerous conductance traces [3, 4]. In this study, histogram analyses were conducted for three distinct classes of traces, as shown in Fig. 3. The 2D conductance-stretch length histograms of the three classes before TVD processing are shown in Figs. 3a, d, and g. For Class 1, the raw signal displays two plateaus, yet the resulting histogram exhibits only a single broad peak. Conventional single-molecule analyses using typical histograms may fail to distinguish between the two states. Conversely, the histograms derived from the TVD-reconstructed traces distinctly reveal two peaks, underscoring the efficacy of this method as a preprocessing step in differentiating between multiple conduction states. Even when the conductance trace displays a single plateau, as observed in Classes 2 and 3 (Figs. 3d–i), the proposed TVD-based method generates distinct peaks in the histograms constructed using the reconstructed traces. These refined histograms contribute significantly to accurately determining the conductance. In instances where the BJ method reveals two conduction states, the observations encompass traces displaying both plateaus and traces exhibiting only one plateau. In our analysis, we assumed a dataset comprising traces from Classes 1, 2, and 3, each with an equal appearance rate. The distinguishability between the two states is clear in the 2D histogram (Fig. 3j) and the resultant conductance histograms (Fig. 3k). The total variation-denoised histogram exhibits two distinct peaks, underscoring the applicability of the proposed method to datasets with various types of conductance traces.
Fig. 3 Histogram analysis of simulated data for Class 1 (a–c), Class 2 (d–f), Class 3 (g–i), and the dataset constructed from the three classes with equal appearance rates (j, k). a, d, g, j 2D conductance-stretch length histograms for Class 1 (a), Class 2 (d), Class 3 (g), and the dataset constructed from the three classes (j). b, e, h Raw (gray) and reconstructed (red) conductance traces of Class 1 (b), Class 2 (e), and Class 3 (h). c, f, i, k Conductance histograms constructed from raw (gray) and reconstructed (red) traces
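The two kinds of histograms discussed above can be built with numpy as sketched below. The bin edges, the normalization by the number of traces, and the choice of measuring stretch length from the first point of each trace are assumptions made for illustration, and the function names are hypothetical. Applying the same functions to raw and to reconstructed traces gives the gray and red histograms, respectively.

```python
import numpy as np


def conductance_histogram(traces, g_edges=None):
    """1D histogram of log(G/G0) values, in counts per trace."""
    if g_edges is None:
        g_edges = np.linspace(-7.0, 1.0, 161)   # assumed 0.05-wide bins
    values = np.concatenate([np.asarray(t, dtype=float) for t in traces])
    counts, _ = np.histogram(values, bins=g_edges)
    return counts / len(traces), g_edges


def conductance_stretch_histogram(traces, step_nm=0.01,
                                  x_edges=None, g_edges=None):
    """2D histogram of stretch length (nm) against log(G/G0)."""
    if x_edges is None:
        x_edges = np.linspace(0.0, 3.0, 61)     # assumed 0.05 nm bins
    if g_edges is None:
        g_edges = np.linspace(-7.0, 1.0, 161)
    # Stretch length is taken as index * step, starting at the first point
    stretch = np.concatenate([np.arange(len(t)) * step_nm for t in traces])
    values = np.concatenate([np.asarray(t, dtype=float) for t in traces])
    counts, _, _ = np.histogram2d(stretch, values, bins=[x_edges, g_edges])
    return counts, x_edges, g_edges
```

Experimental 2D histograms usually align traces at a common reference, such as the point where the metal contact breaks; that alignment step is omitted here for brevity.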
In contrast to conventional smoothing methods, the TVD-based approach yields distinct step-like conductance transitions. This method facilitates not only denoising and precise conductance determination but also plateau analysis, as shown in Fig. 4a. Although previous studies analyzed plateau lengths [33], our method offers the advantage of defining plateaus without explicitly specifying the conductance region. This enables the determination of the plateau length and conductance under consistent conditions, devoid of arbitrariness, and is particularly beneficial in delineating multiple conduction states with subtle differences. Further details regarding the definition of the plateau region may be found in Additional file 1. With the TVD-based reconstruction, plateau analysis successfully detects the high- and low-conductance states with a 92% detection rate, based on 1000 Class 1 traces. Even when the assigned plateaus are counted for the high- or low-conductance state during validation, this method demonstrates a high detection rate. Its effectiveness in identifying both conduction states, even in traces with fuzzy boundaries owing to noise, underscores its robustness in state identification. The histograms of the conductances and lengths of the detected plateaus (Fig. 4b–e) show respective average log(G/G0) values of −2.49 and −3.00 for the high- and low-conductance states. The log conductance values of the high- and low-conductance states exhibit slight differences of 0.08 from the ground truth values. Average plateau lengths of 0.52 and 0.53 nm are observed. The histograms are consistent with the original conductance and plateau length. Estimating the plateau lengths is critical in inferring the stabilities of individual single-molecule junction states and molecular junction structures.
Furthermore, identifying regions of identical conduction states via plateau analysis facilitates the evaluation of the noise magnitude within each conduction state [26, 35]. In a single-molecule junction, the conductance and its variation are crucial in representing the junction-specific state. The proposed method enables the evaluation of the conductance variation within each conduction state during BJ measurements, revealing novel insights into molecular junctions.
Fig. 4 Results of the plateau detection analysis. a Example of detected plateaus. Three plateaus (yellow, magenta, cyan) are detected based on the TVD-reconstructed trace (red). The gray line is the raw signal. b, c Histograms of the plateau conductances of the high- (b) and low-conductance (c) states. d, e Histograms of the lengths of the detected plateaus of the high- (d) and low-conductance (e) states. The gray histograms in b–e represent the conductances and plateau lengths in the ground truth conductance traces. The magenta and cyan histograms represent the detected plateaus of the high- and low-conductance states, respectively
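Because a TVD-reconstructed trace is nearly piecewise constant, plateaus can be read off as runs between jumps. The definition used in this study is given in Additional file 1 and is not reproduced here; the sketch below uses a hypothetical rule instead (split wherever adjacent reconstructed points differ by more than `jump_tol`, keep runs of at least `min_points` points), with both thresholds chosen arbitrarily. When the raw trace is supplied, it also reports the standard deviation of the raw signal within each plateau as a simple measure of the noise magnitude in that state.

```python
import numpy as np


def detect_plateaus(rec, raw=None, jump_tol=0.05, min_points=10,
                    step_nm=0.01):
    """Segment a reconstructed trace into plateaus (hypothetical rule).

    Returns a list of dicts with start/stop indices, the mean log(G/G0),
    the plateau length in nm and, if raw is given, the raw-signal
    standard deviation inside the plateau.
    """
    rec = np.asarray(rec, dtype=float)
    # Indices where a new run starts because the reconstruction jumps
    breaks = np.flatnonzero(np.abs(np.diff(rec)) > jump_tol) + 1
    bounds = np.concatenate([[0], breaks, [rec.size]])

    plateaus = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        if stop - start < min_points:
            continue  # too short to count as a plateau
        plateau = {
            "start": int(start),
            "stop": int(stop),
            "log_g": float(rec[start:stop].mean()),
            "length_nm": float((stop - start) * step_nm),
        }
        if raw is not None:
            segment = np.asarray(raw, dtype=float)[start:stop]
            plateau["noise_sd"] = float(np.std(segment))
        plateaus.append(plateau)
    return plateaus
```

This rule returns every sufficiently long run, including metal-contact and broken-junction segments, so assigning plateaus to the high- or low-conductance state would need a further step that is not shown.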
Recently, efforts to enhance the discrimination accuracy of measured single-molecule data have emerged via the application of supervised machine learning techniques [10, 20, 24,25,26,27, 36]. Rather than focusing solely on a single average conductance, machine learning also exploits the conductance variation, improving single-molecule discrimination. Our method serves as a viable preprocessing step for use in supervised machine learning applications. Figure 5a shows a schematic comparison between the conventional identification method for single-molecule traces and the proposed method. In the conventional approach, conductance histograms derived from individual traces are used as features in supervised machine learning classification [24, 27]. In this study, by contrast, we first conducted denoising and then used the reconstructed traces as inputs for classification in the same manner. Conductance traces belonging to Classes 1, 2, and 3 were discriminated using the conventional and TVD methods. The details of the supervised machine learning are described in Additional file 1. The confusion matrices displaying the classification results obtained using the conventional and proposed methods are shown in Fig. 5b and c, respectively. In the conventional method, the misclassification of Class 1, which is characterized by two plateaus, occurs more frequently, primarily owing to noise hindering the differentiation between the two conduction states. The classification performance was evaluated using the F-measure, a performance indicator defined as the harmonic mean of precision and recall (sensitivity). The respective F-measure scores of the conventional and proposed methods are 0.87 and 0.95. Machine learning classifies the traces using the histograms of individual traces as features, and thus benefits from the more distinct histograms generated via TVD-based denoising.
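As a worked illustration of the F-measure, the sketch below computes per-class precision, recall, and their harmonic mean from a confusion matrix. The function name `f_measure`, the convention that rows are true classes and columns are predicted classes, and the use of an unweighted (macro) average over classes are assumptions; the averaging used in this study is not stated in this section.

```python
import numpy as np


def f_measure(confusion):
    """Per-class and macro-averaged F-measure from a confusion matrix.

    confusion[i, j] is the number of traces of true class i that were
    predicted as class j. Every class is assumed to appear at least once
    among both the true and the predicted labels (no zero division).
    """
    cm = np.asarray(confusion, dtype=float)
    true_pos = np.diag(cm)
    precision = true_pos / cm.sum(axis=0)   # column sums: predicted counts
    recall = true_pos / cm.sum(axis=1)      # row sums: true counts
    per_class = 2.0 * precision * recall / (precision + recall)
    return per_class, float(per_class.mean())
```

For a two-class matrix `[[8, 2], [1, 9]]`, the per-class values are 16/19 and 18/21, which follows from the equivalent form 2TP / (2TP + FP + FN).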
Our developed method not only enhances the classification accuracy but also serves as an effective preprocessing step for use in machine learning applications.
4 Conclusion
In summary, the TVD method introduced in this study clarified the plateaus and the transitions between conduction states in the conductance traces. By revealing the conductances and plateau lengths while enhancing trace discrimination, this method contributed significantly to unraveling conductance transitions within molecular junctions with multiple conduction states. Its application should provide previously inaccessible insights into the dynamics of changes within single-molecule junctions.
Data availability
The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.
Code availability
The source codes are available from the corresponding author on reasonable request.
References
Di Ventra M, Taniguchi M. Decoding DNA, RNA and peptides with quantum tunnelling. Nat Nanotechnol. 2016;11:117–26. https://doi.org/10.1038/nnano.2015.320.
Su TA, Neupane M, Steigerwald ML, et al. Chemical principles of single-molecule electronics. Nat Rev Mater. 2016;1:1–15. https://doi.org/10.1038/natrevmats.2016.2.
Song H, Reed MA, Lee T. Single molecule electronic devices. Adv Mater. 2011;23:1583–608. https://doi.org/10.1002/adma.201004291.
Xu B, Tao NJ. Measurement of single-molecule resistance by repeated formation of molecular junctions. Science. 2003;301:1221–3. https://doi.org/10.1126/science.10874.
Reed MA, Zhou C, Muller CJ, et al. Conductance of a molecular junction. Science. 1997;278:252–4. https://doi.org/10.1126/science.278.5336.252.
Smit RHM, Noat Y, Untiedt C, et al. Measurement of the conductance of a hydrogen molecule. Nature. 2002;419:906–9. https://doi.org/10.1038/nature01103.
Yang C, Zhang L, Lu C, et al. Unveiling the full reaction path of the Suzuki–Miyaura cross-coupling in a single-molecule junction. Nat Nanotechnol. 2021;16:1214–23. https://doi.org/10.1038/s41565-021-00959-4.
Huang X, Tang C, Li J, et al. Electric field–induced selective catalysis of single-molecule reaction. Sci Adv. 2019;5:eaaw3072. https://doi.org/10.1126/sciadv.aaw3072.
Aragones AC, Haworth NL, Darwish N, et al. Electrostatic catalysis of a Diels–Alder reaction. Nature. 2016;531:88–91. https://doi.org/10.1038/nature16989.
Taniguchi M, Ohshiro T, Komoto Y, et al. High-precision single-molecule identification based on single-molecule information within a noisy matrix. J Phys Chem C. 2019;123:15867–73. https://doi.org/10.1021/acs.jpcc.9b03908.
Ohshiro T, Konno M, Asai A, et al. Single-molecule RNA sequencing for simultaneous detection of m6A and 5mC. Sci Rep. 2021;11:19304. https://doi.org/10.1038/s41598-021-98805-z.
Kim Y, Pietsch T, Erbe A, et al. Benzenedithiol: a broad-range single-channel molecular conductor. Nano Lett. 2011;11:3734–8. https://doi.org/10.1021/nl201777m.
Li X, He J, Hihath J, et al. Conductance of single alkanedithiols: conduction mechanism and effect of molecule−electrode contacts. J Am Chem Soc. 2006;128:2135–41. https://doi.org/10.1021/ja057316x.
Li Z, Mejía L, Marrs J, et al. Understanding the conductance dispersion of single-molecule junctions. J Phys Chem C. 2020;125:3406–14. https://doi.org/10.1021/acs.jpcc.0c08428.
Kim HS, Kim Y-H. Conformational and conductance fluctuations in a single-molecule junction: multiscale computational study. Phys Rev B. 2010;82:075412. https://doi.org/10.1103/PhysRevB.82.075412.
Li Y, Wang H, Wang Z, et al. Transition from stochastic events to deterministic ensemble average in electron transfer reactions revealed by single-molecule conductance measurement. Proc Natl Acad Sci USA. 2019;116:3407–12. https://doi.org/10.1073/pnas.18148251.
Quek SY, Kamenetska M, Steigerwald ML, et al. Mechanically controlled binary conductance switching of a single-molecule junction. Nat Nanotechnol. 2009;4:230–4. https://doi.org/10.1038/nnano.2009.10.
Komoto Y, Fujii S, Iwane M, Kiguchi M. Single-molecule junctions for molecular electronics. J Mater Chem C Mater. 2016;4:8842–58. https://doi.org/10.1039/C6TC03268K.
Brown KA, Brittman S, Maccaferri N, et al. Machine learning in nanoscience: big data at small scales. Nano Lett. 2019;20:2–10. https://doi.org/10.1021/acs.nanolett.9b04090.
Komoto Y, Ryu J, Taniguchi M. Machine learning and analytical methods for single-molecule conductance measurements. Chem Commun. 2023;59:6796–810. https://doi.org/10.1039/D3CC01570J.
Albrecht T, Slabaugh G, Alonso E. Deep learning for single-molecule science. Nanotechnology. 2017;28:423001. https://doi.org/10.1088/1361-6528/aa8334.
Lemmer M, Inkpen MS, Kornysheva K, et al. Unsupervised vector-based classification of single-molecule charge transport data. Nat Commun. 2016;7:1–10. https://doi.org/10.1038/ncomms12922.
Bro-Jørgensen W, Hamill JM, Bro R, Solomon GC. Trusting our machines: validating machine learning models for single-molecule transport experiments. Chem Soc Rev. 2022;51:6875–92. https://doi.org/10.1039/d1cs00884f.
Magyarkuti A, Balogh N, Balogh Z, et al. Unsupervised feature recognition in single-molecule break junction data. Nanoscale. 2020;12:8355–63. https://doi.org/10.1039/d0nr00467g.
Lauritzen KP, Magyarkuti A, Balogh Z, et al. Classification of conductance traces with recurrent neural networks. J Chem Phys. 2018;148:084111. https://doi.org/10.1063/1.5012514.
Balogh Z, Mezei G, Tenk N, et al. Configuration-specific insight into single-molecule conductance and noise data revealed by the principal component projection method. J Phys Chem Lett. 2023;14:5109–18. https://doi.org/10.1021/acs.jpclett.3c00677.
Fu T, Zang Y, Zou Q, et al. Using deep learning to identify molecular junction characteristics. Nano Lett. 2020;20:3320–5. https://doi.org/10.1021/acs.nanolett.0c00198.
Rudin LI, Osher S, Fatemi E. Nonlinear total variation based noise removal algorithms. Physica D. 1992;60:259–68. https://doi.org/10.1016/0167-2789(92)90242-F.
Chambolle A. An algorithm for total variation minimization and applications. J Math Imaging Vis. 2004;20:89–97. https://doi.org/10.1023/B:JMIV.0000011325.36760.1e.
Wahlberg B, Boyd S, Annergren M, Wang Y. An ADMM algorithm for a class of total variation regularized estimation problems. IFAC Proc Vol. 2012;45:83–8. https://doi.org/10.3182/20120711-3-BE-2027.00310.
Gabay D, Mercier B. A dual algorithm for the solution of nonlinear variational problems via finite element approximation. Comput Math with Appl. 1976;2:17–40. https://doi.org/10.1016/0898-1221(76)90003-1.
Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: machine learning in Python. J Mach Learn Res. 2011;12:2825–30. https://doi.org/10.5555/1953048.2078195.
Kamenetska M, Quek SY, Whalley AC, et al. Conductance and geometry of pyridine-linked single-molecule junctions. J Am Chem Soc. 2010;132:6817–29. https://doi.org/10.1021/ja1015348.
Isshiki Y, Fujii S, Nishino T, Kiguchi M. Fluctuation in interface and electronic structure of single-molecule junctions investigated by current versus bias voltage characteristics. J Am Chem Soc. 2018;140:3760–7. https://doi.org/10.1021/jacs.7b13694.
Kamenetska M, Koentopp M, Whalley AC, et al. Formation and evolution of single-molecule junctions. Phys Rev Lett. 2009;102:126803. https://doi.org/10.1103/PhysRevLett.102.126803.
Lin L, Tang C, Dong G, et al. Spectral clustering to analyze the hidden events in single-molecule break junctions. J Phys Chem C. 2021;125:3623–30. https://doi.org/10.1021/acs.jpcc.0c11473.
Funding
This work was supported by Japan Society for the Promotion of Science (JSPS) KAKENHI Grant nos. 19H00852, 22K14566, 22H00281 and Japan Science and Technology Agency (JST) Core Research for Evolutional Science and Technology (CREST) Grant nos. JPMJCR1666, JPMJCR2234, and JST Support for Pioneering Research Initiated by the Next Generation (SPRING) Grant no. JPMJSP2138, Japan and “Crossover Alliance to Create the Future with People, Intelligence and Materials” J225102501 from MEXT, Japan.
Author information
Contributions
Conceptualization, YK; Analysis, YK; Visualization, YK and JR; Writing of the original draft, YK; Review and editing, YK and MT. All authors have approved the final version of the manuscript.
Ethics declarations
Competing interests
There are no conflicts to declare.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Additional file 1. Supplementary Information.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Komoto, Y., Ryu, J. & Taniguchi, M. Total variation denoising-based method of identifying the states of single molecules in break junction data. Discover Nano 19, 20 (2024). https://doi.org/10.1186/s11671-024-03963-4
DOI: https://doi.org/10.1186/s11671-024-03963-4