A protocol for automated timber species identification using metabolome profiling
Using chemical fingerprints for timber species identification is a relatively new but promising technique. However, little is known about how the parameter settings used to pre-process the spectral data affect timber species classification accuracy. Therefore, this study presents an extensive and automated analysis method using the random forest machine learning algorithm on a set of highly valuable timber species from the Meliaceae family. Metabolome profiles were collected using direct analysis in real-time (DART™) ionisation coupled with time-of-flight mass spectrometry (TOFMS) analysis of heartwood specimens for 175 individuals (representing 10 species). To analyse the variability in classification accuracy, 110 combinations of data pre-processing parameters, each consisting of a mass tolerance for binning and a relative abundance cut-off threshold, were tested. Furthermore, for each set of parameters (designated “binning/threshold setting”), a random search was performed for one hyperparameter of interest, i.e. the number of variables (in this case ions) drawn randomly for each random forest analysis. The best classification accuracy (82.2%) was achieved with 47 variables and a binning and threshold combination of 40 mDa and 4%, respectively. Entandrophragma angolense was mostly confused with Entandrophragma candollei and Khaya anthotheca, and several Swietenia species were confused with each other due to the high similarity of their chemical fingerprints. Entandrophragma cylindricum, Entandrophragma utile, Khaya ivorensis, Lovoa trichilioides and Swietenia macrophylla were easy to discriminate and showed fewer misclassifications. The choice of parameter settings, whether in the data pre-processing (binning and threshold) or in the classification algorithm (hyperparameters), results in variability in classification accuracy.
Therefore, a preliminary parameter screening is proposed before constructing the final model when using the random forest algorithm for classification. Overall, DART-TOFMS in combination with random forest is a powerful tool for species identification.
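The two pre-processing parameters described above (mass tolerance for binning and relative abundance cut-off) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the choice to sum intensities within a bin, the normalisation to the most intense bin, and the order of operations (bin first, then threshold) are all assumptions made for illustration, and the grid values in the usage example are hypothetical.

```python
import math
from collections import defaultdict
from itertools import product


def bin_and_threshold(peaks, bin_width_mda=40.0, cutoff_pct=4.0):
    """Turn a list of (m/z, intensity) peaks into a binned fingerprint.

    Assumptions (not taken from the paper): intensities in the same bin
    are summed, abundances are expressed relative to the most intense bin,
    and the cut-off is applied after binning.
    Returns {bin_start_in_Da: relative_abundance_pct}.
    """
    sums = defaultdict(float)
    for mz, intensity in peaks:
        # Bin index from the mass expressed in mDa.
        idx = math.floor(mz * 1000.0 / bin_width_mda)
        sums[idx] += intensity

    if not sums:
        return {}
    base = max(sums.values())
    if base <= 0:
        return {}

    fingerprint = {}
    for idx in sorted(sums):
        rel = 100.0 * sums[idx] / base
        if rel >= cutoff_pct:  # drop ions below the relative abundance cut-off
            fingerprint[idx * bin_width_mda / 1000.0] = rel
    return fingerprint


def parameter_grid(bin_widths_mda, cutoffs_pct):
    """Enumerate every binning/threshold setting to be screened."""
    return list(product(bin_widths_mda, cutoffs_pct))


if __name__ == "__main__":
    # Hypothetical spectrum; the values are illustrative only.
    spectrum = [(301.141, 1000.0), (301.150, 500.0), (315.21, 30.0), (450.33, 200.0)]
    # Hypothetical grid values; the paper screened 110 combinations.
    for width, cutoff in parameter_grid([20.0, 40.0], [1.0, 4.0]):
        print(width, cutoff, bin_and_threshold(spectrum, width, cutoff))
```

In a screening of the kind proposed here, each fingerprint produced by one binning/threshold setting would then be passed to a classifier (random forest in the study) and the resulting accuracies compared across settings.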
The authors would like to thank Stijn Willem (UGent-Woodlab), Pam McClure and Erin Price (US Fish and Wildlife Forensic Laboratory) for their help with the sample preparation. This research was conducted under the HerbaXylaRedd BELSPO-project (Brain.be – code: BR/143/A3/HERBAXYLAREDD). The findings and conclusions in the article are those of the authors and do not necessarily represent the views of the U.S. Fish and Wildlife Service.
Compliance with ethical standards
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.