Knowledge-Driven Dimension Reduction and Reduced Order Surrogate Models

Mechanistic Data Science for STEM Education and Applications

Abstract

This chapter focuses on the knowledge-driven dimension reduction aspect of mechanistic data science. Two types of dimension reduction methods are introduced: clustering and reduced order modeling. Clustering reduces the total number of data points in a dataset by grouping similar data points into clusters; the data points within a cluster are considered more similar to each other than to data points in other clusters. There are multiple methods and algorithms for clustering. In the first part of this chapter, three clustering algorithms are presented, ranging from entry level to advanced level: Jenks natural breaks, K-means clustering, and the self-organizing map (SOM). In the second part of this chapter, Singular Value Decomposition (SVD) and Principal Component Analysis (PCA) are introduced as reduced order modeling techniques that reduce the number of features by eliminating redundant and dependent features, leading to a new set of principal features. The resulting model is called a reduced order model. Proper Generalized Decomposition (PGD), a higher order extension of PCA, is also introduced.
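The two ideas in the abstract can be illustrated with a minimal sketch. It is not the chapter's own code: the function names `kmeans_1d` and `pca_2d_first_component`, the toy data, and the initial centers are all assumptions made for illustration. The first function runs the standard K-means (Lloyd) iteration on one-dimensional data, so that nine data points are summarized by three cluster centers. The second uses the closed-form eigen-decomposition of a 2×2 covariance matrix, as a stand-in for a general SVD, to replace two dependent features with a single principal feature.

```python
import math


def kmeans_1d(values, centers, n_iter=100):
    """K-means (Lloyd's algorithm) on 1D data; `centers` are initial guesses."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(n_iter):
        # Assignment step: each value joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its members
        # (an empty cluster keeps its previous center).
        new_centers = [sum(c) / len(c) if c else centers[k]
                       for k, c in enumerate(clusters)]
        if new_centers == centers:  # assignments can no longer change
            break
        centers = new_centers
    return centers, clusters


def pca_2d_first_component(points):
    """First principal component of 2D points via the 2x2 covariance matrix.

    Assumes at least two points that are not all identical.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # Angle of the maximum-variance axis: tan(2*theta) = 2*sxy / (sxx - syy).
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))
    # Scores: coordinates of the centered points along the principal axis.
    scores = [(x - mx) * direction[0] + (y - my) * direction[1]
              for x, y in points]
    # Largest eigenvalue divided by total variance = explained variance ratio.
    largest = 0.5 * (sxx + syy) + math.hypot(0.5 * (sxx - syy), sxy)
    return direction, scores, largest / (sxx + syy)


# Toy data (hypothetical): three visible groups, and points lying on y = 2x.
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.1, 8.9]
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0, 10.0])
print("cluster centers:", centers)

points = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
direction, scores, explained = pca_2d_first_component(points)
print("principal direction:", direction, "explained variance:", explained)
```

In the second example the feature y is fully determined by x, so a single principal feature carries essentially all of the variance. For real, higher-dimensional data one would normally rely on a library SVD routine rather than this two-dimensional closed form.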

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Liu, W.K., Gan, Z., Fleming, M. (2021). Knowledge-Driven Dimension Reduction and Reduced Order Surrogate Models. In: Mechanistic Data Science for STEM Education and Applications. Springer, Cham. https://doi.org/10.1007/978-3-030-87832-0_5
