Rigid transformations for stabilized lower dimensional space to support subsurface uncertainty quantification and interpretation

  • Original Paper
  • Computational Geosciences

Abstract

Subsurface datasets are commonly big data: they meet criteria such as large data volume, significant feature variety, high sampling velocity, and limited data veracity. Data volume is further increased by the large number of features needed to honor physical, engineering, and geological inputs and constraints, which may invoke the curse of dimensionality. Existing dimensionality reduction (DR) methods are either linear or nonlinear; for subsurface datasets, nonlinear dimensionality reduction (NDR) methods are most applicable because of data complexity. Metric multidimensional scaling (MDS) is a suitable NDR method that retains the data's intrinsic structure and can quantify the uncertainty space. However, like other NDR methods, MDS cannot provide a stabilized, unique solution for the low dimensional space (LDS) that is invariant to Euclidean transformations, and it has no extension for including out-of-sample points (OOSP). To support subsurface inferential workflows, it is imperative to transform these datasets into meaningful, stable representations of reduced dimensionality that permit OOSP without model recalculation.

We propose using rigid transformations to obtain a unique, stabilized representation of the LDS that is invariant to Euclidean transformations. First, compute a dissimilarity matrix with a distance metric, use it as the MDS input to obtain the LDS for \(N\) samples, and repeat for multiple realizations. Then, select a base case and perform a rigid transformation on the remaining realizations to obtain rotation and translation matrices that enforce Euclidean transformation invariance under ensemble expectation. Anchor positions in the expected stabilized solution are identified with a convex hull algorithm and compared with the \(N+1\) case from the prior matrices to obtain a stabilized representation that includes the OOSP. Next, the loss function and normalized stress are computed from distances between samples in the high-dimensional space and in the LDS to quantify and visualize distortion in a 2-D registration problem. To test the proposed workflow, an experiment with different sample sizes is conducted on a synthetic dataset, using Euclidean and Manhattan distance metrics for the MDS dissimilarity matrix inputs.
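The alignment and averaging steps of the workflow can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes NumPy, uses the SVD-based least-squares rigid fit described in references [56] and [63], and replaces actual MDS realizations with a toy 2-D configuration that is randomly rotated and translated. The function names `rigid_align` and `stabilize` are hypothetical.

```python
import numpy as np


def rigid_align(source, target):
    """Least-squares rotation R and translation t so that R @ p + t ~ q.

    `source` and `target` are (N, d) arrays of corresponding points.
    SVD-based solution in the style of refs. [56] and [63]; reflections are excluded.
    """
    mu_s, mu_t = source.mean(axis=0), target.mean(axis=0)
    H = (source - mu_s).T @ (target - mu_t)        # d x d cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.eye(H.shape[0])
    if np.linalg.det(Vt.T @ U.T) < 0:              # guard against a reflection
        D[-1, -1] = -1.0
    R = Vt.T @ D @ U.T
    t = mu_t - R @ mu_s
    return R, t


def stabilize(realizations, base_index=0):
    """Align every realization to the base case, then average them
    (a sample estimate of the ensemble expectation)."""
    base = realizations[base_index]
    aligned = []
    for Z in realizations:
        R, t = rigid_align(Z, base)
        aligned.append(Z @ R.T + t)                # apply R and t row-wise
    return np.mean(aligned, axis=0)


# Toy stand-in for MDS realizations (an assumption for illustration):
# one configuration under random rotations and translations.
rng = np.random.default_rng(0)
base = rng.normal(size=(20, 2))
realizations = [base]
for _ in range(5):
    theta = rng.uniform(0.0, 2.0 * np.pi)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    realizations.append(base @ rot.T + rng.normal(size=2))

S_E = stabilize(realizations)
print(np.abs(S_E - base).max())  # residual after removing the rigid differences
```

Because the fit is restricted to proper rotations, a mirrored realization would not be fully aligned by this sketch; real MDS realizations would also differ from the base case by small distortions, so the residual would not vanish.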

The workflow is also demonstrated using wells from the Duvernay Formation and OOSP with different petrophysical properties typically found in unconventional reservoirs, to track and understand their behavior in the LDS. The results show that the method enables NDR methods to obtain unique, repeatable, stable representations of the LDS that are invariant to Euclidean transformations. In addition, we propose a distortion-based metric, the stress ratio (SR), that quantifies and visualizes the uncertainty space for samples in subsurface datasets, which is helpful for model updating and inferential analysis for OOSP. We therefore recommend integrating the workflow as an invariant transformation mitigation unit in the LDS to ensure unique solutions, repeatability, and rational comparison in NDR methods for inferential workflows on subsurface energy resource engineering big data, e.g., analog data selection and sensitivity analysis.
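One possible reading of the anchor-based handling of an OOSP is sketched below. It is an assumption-laden illustration rather than the published procedure: the hull vertices of the stabilized \(N\)-sample solution serve as anchors, the same samples in an \(N+1\) realization are rigidly fitted to them, and the fitted transformation is applied to all \(N+1\) points. The paper cites the quickhull algorithm [58]; this sketch swaps in Andrew's monotone chain to stay NumPy-only. All function names are hypothetical.

```python
import numpy as np


def rigid_align(source, target):
    """SVD-based least-squares rigid fit: R @ p + t ~ q, reflections excluded."""
    mu_s, mu_t = source.mean(axis=0), target.mean(axis=0)
    U, _, Vt = np.linalg.svd((source - mu_s).T @ (target - mu_t))
    D = np.eye(source.shape[1])
    if np.linalg.det(Vt.T @ U.T) < 0:
        D[-1, -1] = -1.0
    R = Vt.T @ D @ U.T
    return R, mu_t - R @ mu_s


def convex_hull_indices(points):
    """Indices of the 2-D convex hull vertices (Andrew's monotone chain)."""
    order = sorted(range(len(points)), key=lambda i: (points[i][0], points[i][1]))

    def cross(o, a, b):
        return ((points[a][0] - points[o][0]) * (points[b][1] - points[o][1])
                - (points[a][1] - points[o][1]) * (points[b][0] - points[o][0]))

    def half_hull(sequence):
        hull = []
        for i in sequence:
            # Drop points that would make a clockwise or collinear turn.
            while len(hull) >= 2 and cross(hull[-2], hull[-1], i) <= 0:
                hull.pop()
            hull.append(i)
        return hull[:-1]                           # last point starts the other half

    return half_hull(order) + half_hull(reversed(order))


def register_oosp(S_E, Z_new):
    """Align an (N+1)-point realization to the stabilized N-point solution.

    Assumes rows 0..N-1 of Z_new are the original samples, in the same
    order as S_E, and that the last row is the OOSP.
    """
    anchors = convex_hull_indices(S_E)
    R, t = rigid_align(Z_new[anchors], S_E[anchors])
    return Z_new @ R.T + t, anchors


# Toy example: the N+1 realization is a rotated, shifted copy of the
# stabilized solution with one extra point appended.
rng = np.random.default_rng(1)
S_E = rng.normal(size=(15, 2))
oosp = np.array([[0.1, -0.2]])
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
Z_new = np.vstack([S_E, oosp]) @ rot.T + np.array([2.0, -1.0])

registered, anchors = register_oosp(S_E, Z_new)
print(len(anchors), registered[-1])                # anchor count, OOSP position
```

Fitting on the anchors alone keeps the transformation independent of the OOSP's own coordinates, which is one way to place the new point without recomputing the stabilized solution.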

Data availability

The data and well-documented workflow used are publicly available in the corresponding author's GitHub repository: https://github.com/Mide478/LowerDimensionalSpace-Stabilization-RT-UQI. No proprietary data is within the GitHub repository, and the data used for the synthetic case study is publicly available in the GeoDataSets repository: https://github.com/GeostatsGuy/GeoDataSets/blob/master/unconv_MV_v4.csv.

References

  1. Eldawy, A., Mokbel, M.F.: The era of Big Spatial Data. In: Proceedings of the VLDB Endowment. pp. 1992–1995 (2017)

  2. Zhu, L., Yu, F.R., Wang, Y., Ning, B., Tang, T.: Big data analytics in intelligent transportation systems: A Survey. IEEE Trans. Intell. Transp. Syst. 20, 383 (2019). https://doi.org/10.1109/TITS.2018.2815678

  3. Mohammadpoor, M., Torabi, F.: Big Data analytics in oil and gas industry: An emerging trend. (2020)

  4. Mabadeje, A., Salazar, J., Garland, L., Ochoa, J., Pyrcz, M.: A machine learning workflow to support the identification of subsurface resource analogs. Energy Explor. Exploit. 1, 23 (2023). https://doi.org/10.1177/01445987231210966

  5. Aziz, K., Sarma, P., Durlofsky, L.J., Chen, W.H.: Efficient real-time reservoir management using adjoint-based optimal control and model updating. (2014). https://doi.org/10.1007/s10596-005-9009-z

  6. He, J., Sarma, P., Durlofsky, L.J.: Reduced-order flow modeling and geological parameterization for ensemble-based data assimilation. Comput. Geosci. 55, 54–69 (2013). https://doi.org/10.1016/J.CAGEO.2012.03.027

  7. Jiang, S., Durlofsky, L.J.: Treatment of model error in subsurface flow history matching using a data-space method. J. Hydrol. 603, 127063 (2021). https://doi.org/10.1016/J.JHYDROL.2021.127063

  8. Shirangi, M.G., Durlofsky, L.J.: Closed-loop field development under uncertainty by use of optimization with sample validation. SPE J. 20, 908–922 (2015). https://doi.org/10.2118/173219-PA

  9. Bellman, R.: A Markovian Decision Process. Indiana Univ. Math. J. 6, 679–684 (1957). https://doi.org/10.1512/IUMJ.1957.6.56038

  10. Aggarwal, C.C., Hinneburg, A., Keim, D.A.: On the surprising behavior of distance metrics in high dimensional space. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). pp. 420–434. Springer Verlag (2001)

  11. Giannella, C.R.: Instability results for Euclidean distance, nearest neighbor search on high dimensional Gaussian data. Inf. Process. Lett. 169, 106115 (2021). https://doi.org/10.1016/j.ipl.2021.106115

  12. Kabán, A.: Non-parametric detection of meaningless distances in high dimensional data. Stat. Comput. 22, 375–385 (2012). https://doi.org/10.1007/s11222-011-9229-0

  13. Tan, X., Tahmasebi, P., Caers, J.: Comparing training-image based algorithms using an analysis of distance. Math. Geosci. 46, 149–169 (2014). https://doi.org/10.1007/s11004-013-9482-1

  14. Josset, L., Ginsbourger, D., Lunati, I.: Functional error modeling for uncertainty quantification in hydrogeology. Water Resour. Res. 51, 1050–1068 (2015). https://doi.org/10.1002/2014WR016028

  15. Josset, L., Demyanov, V., Elsheikh, A.H., Lunati, I.: Accelerating Monte Carlo Markov chains with proxy and error models. Comput. Geosci. 85, 38–48 (2015). https://doi.org/10.1016/J.CAGEO.2015.07.003

  16. Pachet, F.: Hit song science. In: Music Data Mining (2012)

  17. Turchetti, C., Falaschetti, L.: A manifold learning approach to dimensionality reduction for modeling data. Inf. Sci. (Ny) 491, 16–29 (2019). https://doi.org/10.1016/J.INS.2019.04.005

  18. Pearson, K.: LIII. On lines and planes of closest fit to systems of points in space. London, Edinburgh, Dublin Philos. Mag. J. Sci. 2, 559–572 (1901). https://doi.org/10.1080/14786440109462720

  19. Jolliffe, I.: Principal Component Analysis. Encycl. Stat. Behav. Sci. (2005). https://doi.org/10.1002/0470013192.BSA501

  20. Schmid, P.J.: Dynamic mode decomposition of numerical and experimental data. J. Fluid Mech. 656, 5–28 (2010). https://doi.org/10.1017/S0022112010001217

  21. Rowley, C.W., Mezić, I., Bagheri, S., Schlatter, P., Henningson, D.S.: Spectral analysis of nonlinear flows. J. Fluid Mech. 641, 115–127 (2009). https://doi.org/10.1017/S0022112009992059

  22. Kutz, J., Brunton, S., Brunton, B., Proctor, J.: Dynamic mode decomposition: data-driven modeling of complex systems. (2016)

  23. Rao, C.R.: The utilization of multiple measurements in problems of biological classification. J. R. Stat. Soc. Ser. B 10, 159–193 (1948). https://doi.org/10.1111/j.2517-6161.1948.tb00008.x

  24. Fisher, R.A.: The use of multiple measurements in taxonomic problems. Ann. Eugen. 7, 179–188 (1936). https://doi.org/10.1111/j.1469-1809.1936.tb02137.x

  25. He, X., Niyogi, P.: Locality preserving projections. In: Advances in Neural Information Processing Systems (2004)

  26. He, X., Cai, D., Yan, S., Zhang, H.J.: Neighborhood preserving embedding. Proc. IEEE Int. Conf. Comput. Vis. II, 1208–1213 (2005). https://doi.org/10.1109/ICCV.2005.167

  27. Sarma, P., Durlofsky, L.J., Aziz, K.: Kernel principal component analysis for efficient, differentiable parameterization of multipoint geostatistics. Math. Geosci. 40, 3–32 (2008). https://doi.org/10.1007/S11004-007-9131-7

  28. Schölkopf, B., Smola, A., Müller, K.-R.: Kernel principal component analysis. In: Gerstner, W., Germond, A., Hasler, M., Nicoud, J.-D. (eds.) Artificial Neural Networks — ICANN'97. Lecture Notes in Computer Science, vol. 1327. Springer, Berlin, Heidelberg (1997). https://doi.org/10.1007/BFb0020217

  29. Kao, Y.H., Van Roy, B.: Learning a factor model via regularized PCA. Mach Learn. 91, 279–303 (2013). https://doi.org/10.1007/s10994-013-5345-8

  30. Roweis, S.T., Saul, L.K.: Nonlinear dimensionality reduction by locally linear embedding. Science. 290, 2323–2326 (2000). https://doi.org/10.1126/science.290.5500.2323

  31. Torgerson, W.S.: Multidimensional scaling: I. Theory and method. Psychometrika. 17, 401–419 (1952). https://doi.org/10.1007/BF02288916

  32. Cox, T., Cox, M.: Multidimensional Scaling. Multidimens. Scaling. (2000). https://doi.org/10.1201/9780367801700

  33. Borg, I., Groenen, P.: Modern multidimensional scaling: Theory and applications. (2005)

  34. Tenenbaum, J.B., De Silva, V., Langford, J.C.: A global geometric framework for nonlinear dimensionality reduction. Science. 290, 2319–2323 (2000). https://doi.org/10.1126/science.290.5500.2319

  35. Belkin, M., Niyogi, P.: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 15, 1373–1396 (2003). https://doi.org/10.1162/089976603321780317

  36. Van Der Maaten, L., Hinton, G.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2625 (2008)

  37. Coifman, R.R., Lafon, S.: Diffusion maps. Appl. Comput. Harmon. Anal. 21, 5–30 (2006). https://doi.org/10.1016/j.acha.2006.04.006

  38. Cunningham, J.P., Ghahramani, Z.: Linear dimensionality reduction: Survey, Insights, and Generalizations. J. Mach. Learn. Res. 16, 2859–2900 (2015)

  39. Young, G., Householder, A.S.: Discussion of a set of points in terms of their mutual distances. Psychometrika 3, 19–22 (1938). https://doi.org/10.1007/BF02287916

  40. Xia, Y.: Correlation and association analyses in microbiome study integrating multiomics in health and disease. Prog. Mol. Biol. Transl. Sci. 171, 309–491 (2020). https://doi.org/10.1016/BS.PMBTS.2020.04.003

  41. Trosset, M.W., Priebe, C.E., Park, Y., Miller, M.I.: Semisupervised learning from dissimilarity data. Comput. Stat. Data Anal. 52, 4643 (2008). https://doi.org/10.1016/J.CSDA.2008.02.030

  42. Kouropteva, O., Okun, O., Pietikäinen, M.: Incremental locally linear embedding algorithm. In: Lecture Notes in Computer Science. pp. 521–530 (2005)

  43. Law, M.H.C., Zhang, N., Jain, A.K.: Nonlinear manifold learning for data stream. In: SIAM Proceedings Series. pp. 33–44 (2004)

  44. Bengio, Y., Paiement, J.-F., Vincent, P., Delalleau, O., Le Roux, N., Ouimet, M.: Out-of-sample extensions for LLE, Isomap, MDS, Eigenmaps, and Spectral Clustering. Adv. Neural Inf. Process. Syst. 16, (2003)

  45. Villena-Martinez, V., Oprea, S., Saval-Calvo, M., Azorin-Lopez, J., Fuster-Guillo, A., Fisher, R.B.: When deep learning meets data alignment: A Review on Deep Registration Networks (DRNs). Appl. Sci. 10, 7524–7524 (2020). https://doi.org/10.3390/APP10217524

  46. Verleysen, M., Lee, J.A.: Nonlinear dimensionality reduction for visualization. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). pp. 617–622 (2013)

  47. Lee, J.A., Verleysen, M.: Quality assessment of dimensionality reduction: Rank-based criteria. Neurocomputing 72, 1431–1443 (2009). https://doi.org/10.1016/J.NEUCOM.2008.12.017

  48. Dayawansa, W.P.: Recent advances in the stabilization problem for low dimensional systems. IFAC Proc. 25, 1–8 (1992). https://doi.org/10.1016/S1474-6670(17)52250-8

  49. Buehrer, R.M., Wymeersch, H., Vaghefi, R.M.: Collaborative sensor network localization: algorithms and practical issues. Proc. IEEE 106, 1089–1114 (2018). https://doi.org/10.1109/JPROC.2018.2829439

  50. Aicardi, I., Nex, F., Gerke, M., Lingua, A.M., Melgani, F., Pajares Martinsanz, G., Li, X., Thenkabail, P.S.: An image-based approach for the co-registration of multi-temporal UAV image datasets. Remote Sens. 8, 779–779 (2016). https://doi.org/10.3390/RS8090779

  51. Saval-Calvo, M., Azorin-Lopez, J., Fuster-Guillo, A., Villena-Martinez, V., Fisher, R.B.: 3D non-rigid registration using color: Color coherent point drift. Comput. Vis. Image Underst. 169, 119–135 (2018). https://doi.org/10.1016/J.CVIU.2018.01.008

  52. Eggert, D.W., Lorusso, A., Fisher, R.B.: Estimating 3-D rigid body transformations: a comparison of four major algorithms. Mach. Vis. Appl. 9, 272–290 (1997)

  53. Miyakoshi, M.: Correcting whole-body motion capture data using rigid body transformation. Eur. J. Neurosci. 54, 7946–7958 (2021). https://doi.org/10.1111/EJN.15531

  54. Rodrigues, M.A., Liu, Y.: On the representation of rigid body transformations for accurate registration of free-form shapes. Rob. Auton. Syst. 39, 37–52 (2002)

  55. Yang, T., Liu, J., McMillan, L., Wang, W.: A fast approximation to multidimensional scaling. Proc. ECCV. 1–8 (2006)

  56. Arun, K.S., Huang, T.S., Blostein, S.D.: Least-Squares Fitting of Two 3-D Point Sets. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-9, 698–700 (1987). https://doi.org/10.1109/TPAMI.1987.4767965

  57. Kruskal, J.B.: Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29, 1–27 (1964). https://doi.org/10.1007/BF02289565

  58. Barber, C.B., Dobkin, D.P., Huhdanpaa, H.: The quickhull algorithm for convex hulls. ACM Trans. Math. Softw. 22, 469–483 (1996). https://doi.org/10.1145/235815.235821

  59. De Leeuw, J., Stoop, I.: Upper bounds for Kruskal’s stress. Psychometrika 49, 391–402 (1984). https://doi.org/10.1007/BF02306028

  60. Pyrcz, M.: GeoDataSets: Synthetic Subsurface Data Repository. (2021)

  61. Pyrcz, M.J., Deutsch, C.V.: Geostatistical Reservoir Modeling. 448 (2014)

  62. Lal, A.K., Pati, S.: Linear Algebra through Matrices. (2018)

  63. Sorkine, O., Rabinovich, M.: Least-squares rigid motion using svd: Technical notes, p. 1–6. (2017)

Acknowledgements

The authors sincerely appreciate Equinor and the Digital Reservoir Characterization Technology (DIRECT) consortium's industry partners at the Hildebrand Department of Petroleum and Geosystems Engineering, University of Texas at Austin for financial support. We acknowledge Equinor for granting permission to use the Duvernay case study dataset presented here.

Author information

Corresponding author

Correspondence to Ademide O. Mabadeje.

Ethics declarations

Competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A

Definition 5.8.5 in Lal and Pati [62] states that a map \(T: \mathbb{R}^{n}\to \mathbb{R}^{n}\) is a rigid motion if \(\Vert T(\mathbf{x}) - T(\mathbf{y}) \Vert = \Vert \mathbf{x}-\mathbf{y}\Vert , \forall \mathbf{x}, \mathbf{y}\in \mathbb{R}^{n}\). Here \(T\) is the rigid transformation operator which, when applied to point sets \(\mathbf{x}, \mathbf{y}\), each of dimension \(N\times 3\), leads to an invariant matrix. Thus, if a rigid transformation operator is applied to an MDS realization that preserves pairwise dissimilarity in the lower dimensional space, the same properties from the prior definition are implied and an invariant dissimilarity matrix is obtained. The rigorous mathematical proof for rigid transformations is in Sorkine-Hornung et al. [63]. Hence, we propose Theorem A.1.

Theorem A.1: Let \(\mathbf{S}_{E}\) be the ensemble expectation of the rigidly transformed solutions for multiple realizations in the workflow, known as the stabilized solution, i.e., \(\mathbf{S}_{E}=\mathbb{E}\left[ T\left( \mathbf{Z}_{k}\right)\right]\), in a lower-dimensional space. If an out-of-sample point (OOSP) that falls within a 95% confidence interval for each predictor feature of interest, \(P\ \forall m=1,\dots , M\), with \(M=\{ m \mid m\in \mathbb{N}\}\), is added, then the stabilized solution obtained by applying a rigid transformation to the representation obtained from the low dimensional space, \(\mathbf{Z}_{OOSP}\), is the same as the ensemble expectation of the stabilized solution for the \(N\)-sample case, which is Euclidean transformation invariant, i.e., \(\mathbf{S}_{E}\approx \mathbf{S}_{OOSP}\).

Proof: Suppose adding an OOSP within the 95% confidence interval causes the stabilized solution obtained by rigid transformation on \(\mathbf{Z}_{OOSP}\) to differ from \(\mathbf{S}_{E}\). Let \(T\) be the rigid transformation such that \(T: \mathbb{R}^{n}\to \mathbb{R}^{n}\). This implies there is no applicable rigid transformation that ensures \(\Vert T(\mathbf{x}) - T(\mathbf{y}) \Vert = \Vert \mathbf{x}-\mathbf{y}\Vert , \forall \mathbf{x}, \mathbf{y}\in \mathbb{R}^{n}\) when \(\dim\{\mathbf{x}\}\ne \dim\{\mathbf{y}\}\). This contradicts the definition of a rigid transformation as stated in Lal and Pati [62] and Sorkine-Hornung et al. [63].

Therefore, assume the shape and size of the sets of anchor points from the \(N\)-sample and \(N+1\)-sample cases are equal, i.e., \(\dim\{\mathbf{A}_{n}\}=\dim\{\mathbf{A}_{OOSP}\}=n\), because the OOSP is within a 95% confidence interval for each predictor feature of interest. Then a map \(T: \mathbb{R}^{n}\to \mathbb{R}^{n}\) that enforces rigid motion in the low dimensional space is possible if \(\Vert T(\mathbf{A}_{n}) - T(\mathbf{A}_{OOSP}) \Vert = \Vert \mathbf{A}_{n}-\mathbf{A}_{OOSP}\Vert , \forall \mathbf{A}_{n}, \mathbf{A}_{OOSP}\in \mathbb{R}^{n}\), provided that the OOSP does not reside in the tail of the distribution of each \(P\). This constraint is imposed to restrict the dissimilarity matrix, \(\mathbf{D}\), from changing significantly. Therefore, Theorem A.1 holds.
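As a supplementary check on the definition of rigid motion used in this appendix, assume the standard parametrization \(T(\mathbf{x}) = \mathbf{R}\mathbf{x} + \mathbf{t}\) with \(\mathbf{R}^{\top}\mathbf{R} = \mathbf{I}\) and \(\det \mathbf{R} = 1\). The symbols \(\mathbf{R}\) and \(\mathbf{t}\) are introduced here for illustration only. Distance preservation then follows in one line, because the translation cancels and the rotation is orthogonal:

\[
\Vert T(\mathbf{x}) - T(\mathbf{y}) \Vert^{2} = \Vert \mathbf{R}(\mathbf{x}-\mathbf{y}) \Vert^{2} = (\mathbf{x}-\mathbf{y})^{\top}\mathbf{R}^{\top}\mathbf{R}\,(\mathbf{x}-\mathbf{y}) = \Vert \mathbf{x}-\mathbf{y} \Vert^{2}.
\]

Applied row-wise to a realization \(\mathbf{Z}_{k}\), such a map leaves every pairwise Euclidean distance in the LDS unchanged, and hence the dissimilarity matrix computed from it.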

Table 1 Criteria for evaluating normalized stress and its corresponding interpretation [57]
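The normalized stress that Table 1 grades can be computed along the following lines. This is a hedged sketch, not the paper's exact formula: it assumes a Kruskal-type stress-1 [57] with the high-dimensional dissimilarities in the denominator (other conventions normalize by the LDS distances), Euclidean distances in the LDS, and NumPy. The function names are hypothetical.

```python
import numpy as np


def pairwise_distances(X, metric="euclidean"):
    """Dense (N, N) distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    if metric == "euclidean":
        return np.sqrt((diff ** 2).sum(axis=-1))
    if metric == "manhattan":
        return np.abs(diff).sum(axis=-1)
    raise ValueError(f"unsupported metric: {metric}")


def normalized_stress(D_high, Z):
    """sqrt( sum (delta_ij - d_ij)^2 / sum delta_ij^2 ) over pairs i < j.

    delta_ij: dissimilarities in the high-dimensional space (D_high).
    d_ij:     Euclidean distances between rows of the LDS coordinates Z.
    """
    D_low = pairwise_distances(Z)
    upper = np.triu_indices_from(D_high, k=1)      # count each pair once
    numerator = ((D_high[upper] - D_low[upper]) ** 2).sum()
    denominator = (D_high[upper] ** 2).sum()
    return np.sqrt(numerator / denominator)


# Toy check: a projection that shrinks every distance by half.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
D_high = pairwise_distances(X)
print(normalized_stress(D_high, X))        # no distortion
print(normalized_stress(D_high, 0.5 * X))  # uniform shrinkage by one half
```

A value of zero means the LDS reproduces the input dissimilarities exactly; larger values indicate more distortion. With a Manhattan input matrix, `D_high` would be built with `metric="manhattan"` while the LDS distances stay Euclidean in this sketch.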

Credit author statement

Ademide O. Mabadeje: Data curation, Conceptualization, Methodology, Software, Validation, Visualization, Formal analysis, Writing – Original draft.

Michael Pyrcz: Data curation, Conceptualization, Supervision, Funding acquisition, Writing – Reviewing and Editing.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mabadeje, A.O., Pyrcz, M.J. Rigid transformations for stabilized lower dimensional space to support subsurface uncertainty quantification and interpretation. Comput Geosci (2024). https://doi.org/10.1007/s10596-024-10278-x

  • DOI: https://doi.org/10.1007/s10596-024-10278-x