Knowledge-Driven Dimension Reduction and Reduced Order Surrogate Models

Liu, Wing Kam; Gan, Zhengtao; Fleming, Mark

doi:10.1007/978-3-030-87832-0_5

Abstract

This chapter focuses on the knowledge-driven dimension reduction aspect of mechanistic data science. Two types of dimension reduction methods are introduced: clustering and reduced order modeling. Clustering reduces the total number of data points in a dataset by grouping similar data points into clusters; the data points within a cluster are considered more like each other than like data points in other clusters. There are multiple methods and algorithms for clustering. In the first part of this chapter, three clustering algorithms are presented, ranging from entry level to advanced level: Jenks natural breaks, K-means clustering, and the self-organizing map (SOM). In the second part of this chapter, Singular Value Decomposition (SVD) and Principal Component Analysis (PCA) are introduced as reduced order modeling techniques that reduce the number of features by eliminating redundant and dependent features, leading to a new set of principal features. The resulting model is called a reduced order model. Proper Generalized Decomposition (PGD), a higher order extension of PCA, is also introduced.
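
The two kinds of reduction named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the chapter: the function names `kmeans_1d` and `pca_2d`, the evenly spaced centroid initialisation, and the toy data are all assumptions made for illustration. The first function reduces the number of data points by K-means clustering of scalar values; the second reduces two features to one principal feature. For brevity it uses the closed-form eigen-decomposition of the 2x2 covariance matrix rather than a general SVD routine.

```python
import math


def kmeans_1d(values, k, max_iter=100):
    """Lloyd's K-means on scalar data. Returns (centroids, clusters)."""
    data = sorted(values)
    n = len(data)
    # Assumed deterministic start: k evenly spaced samples of the sorted data.
    centroids = [data[(2 * i + 1) * n // (2 * k)] for i in range(k)]

    def assign(cents):
        # Put every value into the group of its nearest centroid.
        groups = [[] for _ in cents]
        for v in data:
            j = min(range(len(cents)), key=lambda c: abs(v - cents[c]))
            groups[j].append(v)
        return groups

    for _ in range(max_iter):
        clusters = assign(centroids)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        updated = [sum(g) / len(g) if g else centroids[j]
                   for j, g in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(centroids)


def pca_2d(points):
    """PCA for 2-feature data via the closed-form 2x2 covariance eigenproblem.

    Returns (direction, scores, explained_ratio) for the first principal
    component. Assumes at least two distinct points.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)

    # Angle of the major axis of the covariance ellipse.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))

    # Eigenvalues of [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2.0
    radius = math.hypot((sxx - syy) / 2.0, sxy)
    lam1, lam2 = half_trace + radius, half_trace - radius

    # One principal feature per point: projection onto the first component.
    scores = [(p[0] - mx) * direction[0] + (p[1] - my) * direction[1]
              for p in points]
    return direction, scores, lam1 / (lam1 + lam2)


# Nine data points are summarised by three cluster centroids.
centroids, clusters = kmeans_1d(
    [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.1, 8.9], k=3)

# Two dependent features (y = 2x) collapse onto a single principal feature.
direction, scores, ratio = pca_2d([(float(x), 2.0 * x) for x in range(5)])
```

In the toy PCA data the second feature is fully determined by the first, so the first principal component carries essentially all of the variance. This is the sense in which redundant and dependent features are eliminated.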

Author information

Authors and Affiliations

Northwestern University, Evanston, IL, USA
Wing Kam Liu, Zhengtao Gan & Mark Fleming

About this chapter

Cite this chapter

Liu, W.K., Gan, Z., Fleming, M. (2021). Knowledge-Driven Dimension Reduction and Reduced Order Surrogate Models. In: Mechanistic Data Science for STEM Education and Applications. Springer, Cham. https://doi.org/10.1007/978-3-030-87832-0_5

DOI: https://doi.org/10.1007/978-3-030-87832-0_5
Published: 01 January 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87831-3
Online ISBN: 978-3-030-87832-0
eBook Packages: Mathematics and Statistics (R0)
