Machine Learning in VLSI Computer-Aided Design, pp. 233–263

# Machine Learning for VLSI Chip Testing and Semiconductor Manufacturing Process Monitoring and Improvement

## Abstract

Machine learning and big data analytics are in the spotlight, with attention ranging from media coverage to booming start-up companies to eye-catching mergers and acquisitions. In contrast, the $336 billion semiconductor industry has been seen as an “old-fashioned” business, with fading interest from the best and brightest among young graduates and engineers. This chapter argues that it does not have to be this way, because many research problems and solutions studied in the semiconductor industry are in fact closely related to machine learning and big data problems. To illustrate this point, we discuss a number of practical but challenging problems arising from the semiconductor manufacturing process. We first show how machine learning techniques, especially regression-related ones, often under the “disguise” of optimization problems, have frequently been used (often with nontrivial modeling skill and mathematical sophistication) to solve semiconductor problems. We discuss examples such as process variation modeling and VLSI chip testing. For other types of semiconductor problems, such as manufacturing process monitoring and improvement, we show that some existing machine learning algorithms are not necessarily well positioned to solve them, and that novel machine learning techniques exploiting temporal, structural, and hierarchical properties need to be developed further. In either scenario, we convey the message that machine learning and existing semiconductor industry research are closely related, and that researchers in the two fields often contribute to and benefit from each other.
