Abstract
In the present study, sunset yellow (SY), a synthetic colour that is a common adulterant in tea powders, was analysed using FT-IR spectral data coupled to machine learning tools for efficient classification and quantification of SY adulteration. A previously established real coded genetic algorithm (RCGA) was used as the variable selection method to identify the key fingerprints of SY in the FT-IR spectra. Here, RCGA was used to select 20, 30, 40, 50 and 60 characteristic wavenumbers for SY. Classification was carried out using support vector machine (SVM), random forest (RF) and extreme gradient boosting (XGB) classifiers. Among the three classifiers, SVM with 50 variables performed best, with an accuracy of 0.90. Quantification models for SY based on partial least squares (PLS), least squares-SVM (LS-SVM), RF and XGBoost were built on the characteristic wavenumbers. Both the RF and LS-SVM models were observed to be superior to PLS when coupled to the 20 fingerprint variables obtained by RCGA. Overall, the RCGA-LS-SVM model gave the lowest RMSECV (0.1956), with regression coefficient values RC² = 0.9989 and RP² = 0.9979, when 50 fingerprint variables were used. These results demonstrate that FT-IR combined with the RCGA-LS-SVM procedure could be a robust technique for rapid detection of SY in tea powder.
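The figures of merit quoted above (RMSECV and the calibration and prediction regression coefficients) can be illustrated with a minimal sketch. It is not the authors' pipeline: an ordinary one-variable least-squares line stands in for the PLS and LS-SVM models, the cross-validation uses a simple interleaved k-fold split, and the absorbance and concentration values are hypothetical numbers invented for illustration. All function names are assumptions of this sketch.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def fit_line(x, y):
    """Least-squares fit y = a + b*x (a stand-in for PLS or LS-SVM)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def rmsecv(x, y, k=5):
    """k-fold cross-validated RMSE: every sample is predicted by a
    model that was fitted without the fold containing that sample."""
    preds = [0.0] * len(y)
    for fold in range(k):
        test_idx = [i for i in range(len(y)) if i % k == fold]
        train_idx = [i for i in range(len(y)) if i % k != fold]
        a, b = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            preds[i] = a + b * x[i]
    return rmse(y, preds)

# Hypothetical data: absorbance at one selected wavenumber vs. SY level.
absorbance = [0.02 * i for i in range(20)]
noise = [0.01 * (-1) ** i for i in range(20)]
concentration = [5.0 * a + 0.5 + e for a, e in zip(absorbance, noise)]

a, b = fit_line(absorbance, concentration)
fitted = [a + b * xi for xi in absorbance]
print("RMSEC :", rmse(concentration, fitted))
print("R2 (calibration):", r_squared(concentration, fitted))
print("RMSECV:", rmsecv(absorbance, concentration))
```

RMSEP and the prediction regression coefficient would be obtained by applying the same `rmse` and `r_squared` functions to an independent test set that was held out before calibration.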
Data availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Code availability
All computational code will be made available upon request.
Abbreviations
- ATR: Attenuated total reflection
- BOSS-LightGBM: Bootstrapping soft shrinkage-light gradient boosting machine
- CARS: Competitive adaptive reweighted sampling
- CatBoost: Categorical boosting
- DMC: Dry matter content
- DS: Direct sampling
- FDA: Food and Drug Administration
- FSSAI: Food Safety and Standards Authority of India
- FT-IR: Fourier transform infrared
- GA: Genetic algorithm
- iPLS: Interval partial least squares
- KS: Kennard and Stone
- KSPXY: Kernel distance-based sample set partitioning based on joint X–Y distances
- LightGBM: Light gradient boosting machine
- LS-SVM: Least squares-support vector machine
- LVs: Latent variables
- MAE: Mean absolute error
- mTry: Number of input variables
- NIPALS: Nonlinear iterative partial least squares
- nTree: Number of regression trees
- PCs: Principal components
- PCA: Principal component analysis
- PCR: Principal component regression
- PLS: Partial least squares
- RBF: Radial basis function
- RCGA: Real coded genetic algorithm
- RC² and RP²: Regression coefficients of calibration and prediction
- RF: Random forest
- RMSEC: Root mean square error of calibration
- RMSECV: Root mean square error of cross validation
- RMSEP: Root mean square error of prediction
- RS: Random sampling
- SG: Savitzky-Golay
- SNV: Standard normal variate
- SPA: Successive projections algorithm
- SPXY: Sample set partitioning based on joint X–Y distances
- SVD: Singular value decomposition
- SVM: Support vector machine
- TLC: Thin layer chromatography
- SY: Sunset yellow
- XGB: Extreme gradient boosting
Acknowledgements
RA wishes to express sincere thanks to the Indian Council of Medical Research (ICMR) for granting an SRF fellowship to carry out the research work. The authors would like to thank Ms. Asha M of Central Instruments Facility & Services (CFTRI) and Mr. Punil HN of the Microbiology & Fermentation Technology Dept. (CFTRI) for their assistance during experimentation. The authors also acknowledge the Director, CSIR-CFTRI, Mysuru, for providing infrastructure and support during the research work.
Funding
Not applicable.
Author information
Contributions
RA carried out the experiments and wrote the original manuscript; SM conceived and supervised the study and edited the manuscript.
Ethics declarations
Conflict of interest
Both authors declare no conflict of interest.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Ethics approval
Not applicable.
About this article
Cite this article
Amsaraj, R., Mutturi, S. Rapid detection of sunset yellow adulteration in tea powder with variable selection coupled to machine learning tools using spectral data. J Food Sci Technol 60, 1530–1540 (2023). https://doi.org/10.1007/s13197-023-05694-3
DOI: https://doi.org/10.1007/s13197-023-05694-3