Rapid detection of sunset yellow adulteration in tea powder with variable selection coupled to machine learning tools using spectral data

  • Original Article
  • Journal of Food Science and Technology

Abstract

In the present study, sunset yellow (SY), a synthetic colour and a common adulterant in tea powders, was analysed using FT-IR spectral data coupled to machine learning tools for efficient classification and quantification of SY adulteration. A previously established real-coded genetic algorithm (RCGA) was used as the variable selection method to identify the key fingerprints of SY in the FT-IR spectra. Here, RCGA was used to select 20, 30, 40, 50 and 60 characteristic wavenumbers for SY. Classification was carried out using support vector machine (SVM), random forest (RF) and extreme gradient boosting (XGB) classifiers. Among the three, the SVM classifier using 50 variables gave an accuracy of 0.90. Quantification models for SY based on partial least squares (PLS), least squares-SVM (LS-SVM), RF and XGBoost were built on the characteristic wavenumbers. Both RF and LS-SVM models were observed to be superior to PLS when coupled to the 20 fingerprint variables obtained by RCGA. Overall, the RCGA-LS-SVM model resulted in the lowest RMSECV (0.1956), with regression coefficient values RC² = 0.9989 and RP² = 0.9979, when 50 fingerprint variables were used. These results demonstrate that FT-IR combined with the RCGA-LS-SVM procedure could be a robust technique for rapid detection of SY in tea powder.
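The variable selection step described in the abstract can be illustrated with a minimal sketch of a real-coded genetic algorithm. This is not the authors' implementation, which is available only on request. Every name here (`make_data`, `decode`, `rcga_select`) is hypothetical, the "spectra" are synthetic, and the fitness function (mean absolute correlation of the selected columns with the concentration) is a simple stand-in assumed for illustration, not the model-based criterion the study may have used. Each chromosome is a vector of real-valued weights, one per wavenumber, and the k highest-weighted positions are taken as the selected variables.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def make_data(n=40, p=60, informative=(5, 17, 42), seed=1):
    """Synthetic stand-in for spectra: p noisy columns, a few tracking y."""
    rng = random.Random(seed)
    y = [rng.uniform(0, 5) for _ in range(n)]
    X = []
    for yi in y:
        row = [rng.gauss(0, 1) for _ in range(p)]
        for j in informative:
            row[j] += yi  # informative columns carry the concentration signal
        X.append(row)
    return X, y


def decode(chrom, k):
    """Map a real-valued chromosome to the k highest-weighted variable indices."""
    order = sorted(range(len(chrom)), key=lambda j: chrom[j], reverse=True)
    return sorted(order[:k])


def rcga_select(X, y, k, pop_size=30, generations=40, seed=0):
    """Return (selected indices, best fitness recorded per generation)."""
    rng = random.Random(seed)
    p = len(X[0])
    # Assumed stand-in fitness: mean |correlation| of selected columns with y.
    score = [abs(pearson([row[j] for row in X], y)) for j in range(p)]

    def fitness(chrom):
        return sum(score[j] for j in decode(chrom, k)) / k

    pop = [[rng.random() for _ in range(p)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        children = [pop[0][:]]  # elitism: carry the best chromosome forward
        while len(children) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            a = max(rng.sample(pop, 3), key=fitness)
            b = max(rng.sample(pop, 3), key=fitness)
            # Arithmetic crossover: gene-wise convex combination of parents.
            w = rng.random()
            child = [w * ga + (1 - w) * gb for ga, gb in zip(a, b)]
            # Gaussian mutation with per-gene probability 0.05, clipped to [0, 1].
            for j in range(p):
                if rng.random() < 0.05:
                    child[j] = min(1.0, max(0.0, child[j] + rng.gauss(0, 0.2)))
            children.append(child)
        pop = children
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return decode(best, k), history


X, y = make_data()
selected, history = rcga_select(X, y, k=3)
print("selected column indices:", selected)
print("best fitness, first vs last:", history[0], history[-1])
```

Because the best chromosome is copied unchanged into each new generation, the recorded best fitness never decreases. In a setting like the one in the abstract, the stand-in fitness would be replaced by a cross-validated error of the chosen model, and k would take values such as 20 to 60.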

Data availability

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Code availability

All computational code will be made available upon request.

Abbreviations

ATR:

Attenuated total reflection

BOSS-LightGBM:

Bootstrapping soft shrinkage-light gradient boosting machine

CARS:

Competitive adaptive reweighted sampling

CatBoost:

Categorical boosting

DMC:

Dry matter content

DS:

Direct sampling

FDA:

Food and Drug Administration

FSSAI:

Food Safety and Standards Authority of India

FT-IR:

Fourier transform infrared

GA:

Genetic algorithm

iPLS:

Interval partial least squares

KS:

Kennard and Stone

KSPXY:

Kernel distance-based sample set partitioning based on joint X–Y distances

LightGBM:

Light gradient boosting machine

LS-SVM:

Least squares-support vector machine

LVs:

Latent variables

MAE:

Mean absolute error

mTry:

Number of input variables

NIPALS:

Nonlinear iterative partial least squares

nTree:

Number of regression trees

PCs:

Principal components

PCA:

Principal component analysis

PCR:

Principal component regression

PLS:

Partial least squares

RBF:

Radial basis function

RCGA:

Real coded genetic algorithm

RC² and RP²:

Regression coefficients of calibration and prediction

RF:

Random forest

RMSEC:

Root mean square error of calibration

RMSECV:

Root mean square error of cross validation

RMSEP:

Root mean square error of prediction

RS:

Random sampling

SG:

Savitzky-Golay

SNV:

Standard normal variate

SPA:

Successive projections algorithm

SPXY:

Sample set partitioning based on joint X–Y distances

SVD:

Singular value decomposition

SVM:

Support vector machine

TLC:

Thin layer chromatography

SY:

Sunset yellow

XGB:

Extreme gradient boosting

Acknowledgements

RA wishes to express sincere thanks to the Indian Council of Medical Research (ICMR) for granting an SRF fellowship to carry out the research work. The authors would like to thank Ms. Asha M of Central Instruments Facility & Services (CFTRI) and Mr. Punil HN of Microbiology & Fermentation Technology Dept. (CFTRI) for their assistance during experimentation. The authors also acknowledge the Director, CSIR-CFTRI, Mysuru, for providing infrastructure and support during the research work.

Funding

Not applicable.

Author information

Contributions

RA carried out the experiments and wrote the original manuscript; SM conceived and supervised the study and edited the manuscript.

Corresponding author

Correspondence to Sarma Mutturi.

Ethics declarations

Conflict of interest

Both authors declare no conflict of interest.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Ethics approval

Not applicable.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Amsaraj, R., Mutturi, S. Rapid detection of sunset yellow adulteration in tea powder with variable selection coupled to machine learning tools using spectral data. J Food Sci Technol 60, 1530–1540 (2023). https://doi.org/10.1007/s13197-023-05694-3

  • DOI: https://doi.org/10.1007/s13197-023-05694-3
