Statistical Learning as a Regression Problem

Statistical Learning from a Regression Perspective

Part of the book series: Springer Texts in Statistics (STS)

Abstract

This chapter makes four introductory points: (1) regression analysis is defined by the conditional distribution of Y |X, not by a conventional linear regression model; (2) different forms of regression analysis are properly viewed as approximations of the true relationships, which is a game changer; (3) statistical learning can be just another kind of regression analysis; and (4) properly formulated regression approximations can have, asymptotically, most of the desirable estimation properties. The emphasis on regression analysis is justified in part by the rebranding, by some, of least squares regression as a form of supervised machine learning. Once these points are made, the chapter turns to several key statistical concepts needed for statistical learning: overfitting, data snooping, loss functions, linear estimators, linear basis expansions, the bias–variance tradeoff, resampling, algorithms versus models, and others.

Notes

  1.

    Regularization will have a key role in much of the material ahead. Its goals and features will be addressed as needed.

  2.

    “Realized” here means produced through a random process. Random sampling from a finite population is an example. Data generated by a correct linear regression model can also be said to be realized. After this chapter, we will proceed almost exclusively with a third way in which data can be realized.

  3.

    The data, birthwt, are from the MASS package in R.
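
    For readers who want to follow along, a minimal sketch for loading and inspecting these data (assuming the MASS package is installed):

        library(MASS)                        # birthwt ships with MASS
        data(birthwt)
        str(birthwt)                         # 189 observations on 10 variables
        table(Low = birthwt$low, Smoke = birthwt$smoke)  # low birth weight by smoking status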

  4.

    The \(\chi^2\) test assumes that the marginal distributions of both variables are fixed in repeated realizations of the data. Only the distribution of counts within cells can change. Whether this is a plausible assumption depends on how the data were generated. If the data are a random sample from a well-defined population, the assumption of fixed marginal distributions is not plausible. Both marginal distributions would almost certainly change in new random samples. The spine plot and the mosaic plot were produced using the R package vcd, which stands for “visualizing categorical data.” Its authors are Meyer et al. (2007).
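
    A minimal sketch of how such displays and the \(\chi^2\) test can be produced, using the birthwt variables smoke and low purely for illustration (the vcd package must be installed):

        library(MASS)                 # for the birthwt data
        library(vcd)                  # spine() and mosaic() displays
        tab <- xtabs(~ smoke + low, data = birthwt)
        chisq.test(tab)               # chi-square test of independence
        spine(tab)                    # spine plot
        mosaic(tab)                   # mosaic plot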

  5.

    Although there are certainly no universal naming conventions, “predictors” can be seen as variables that are of subject-matter interest, and “covariates” can be seen as variables that improve the performance of the statistical procedure being applied. Then, covariates are not of subject-matter interest. Whatever the naming conventions, the distinction between variables that matter substantively and variables that matter procedurally is important. An example of the latter is a covariate included in an analysis of randomized experiments to improve statistical precision.

  6.

    A crime is “cleared” when the perpetrator is arrested. In some jurisdictions, a crime is cleared when the perpetrator has been identified, even if there has been no arrest.

  7.

    But nature can certainly specify different predictor values for different students.

  8.

    By “asymptotics,” one loosely means what happens to the properties of an estimate as the number of observations increases without limit. Sometimes, for example, bias in the estimate shrinks to zero, which means that in sufficiently large samples the bias will likely be small. Thus, the desirable estimation properties of logistic regression only materialize asymptotically. This means that one can get very misleading results from logistic regression in small samples if one is working at Level II.

  9.

    This is sometimes called “the fallacy of accepting the null” (Rozeboom 1960).

  10.

    Model selection in some disciplines is called variable selection, feature selection, or dimension reduction. These terms will be used interchangeably.

  11.

    Actually, it can be more complicated. For example, if the predictors are taken to be fixed, one is free to examine the predictors alone. Model selection problems surface when associations with the response variable are examined as well. If the predictors are taken to be random, the issues are even more subtle.

  12.

    If one prefers to think about the issues in a multiple regression context, the single predictor can be replaced by the predictor adjusted, as usual, for its linear relationships with all other predictors.

  13.

    Recall that x is fixed and does not change from dataset to dataset. The new datasets result from variation around the true conditional means.

  14.

    We will see later that by increasing the complexity of the mean function estimated, one has the potential to reduce bias with respect to the true response surface. But an improved fit in the data on hand is no guarantee that one is more accurately representing the true mean function. One complication is that greater mean function complexity can promote overfitting.

  15.

    The next several pages draw heavily on Berk et al. (2019) and Buja et al. (2019a,b).

  16.

    Each case is composed of a set (i.e., vector) of values for the random variables that are included.

  17.

    The notation may seem a little odd. In a finite population, these would be matrices or vectors, and the font would be bold. But the population is of limitless size because it constitutes what could be realized from the joint probability distribution. These random variables are really conceptual constructs. Bold font might have seemed odder still. Another notational scheme could have been introduced for these statistical constructs, but that seems a bit too precious and, in context, unnecessary.

  18.

    For exposition, working with conditional expectations is standard, but there are other options such as conditional probabilities when the predictor is categorical. This will be important in later chapters.

  19.

    They are deviations around a mean, or more properly, an expected value.

  20.

    For example, experiences in the high school years will shape variables such as the high school GPA, the number of advanced placement courses taken, the development of good study habits, an ability to think analytically, and performance on the SAT or ACT test, which, in turn, can be associated with college grades in the freshman year. One can easily imagine representing these variables in a joint probability distribution.

  21.

    We will see later that some “weak” forms of dependence are allowed.

  22.

    This intuitively pleasing idea has in many settings solid formal justification (Efron and Tibshirani 1993: chapter 4).

  23.

    There is no formal way to determine how large is large enough because such determinations are dataset specific.

  24.

    Technically, a prediction interval is not a confidence interval. A confidence interval provides coverage for a parameter such as a mean or regression coefficient. A prediction interval provides coverage for a response variable value. Nevertheless, prediction intervals are often called confidence intervals.
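
    The numerical difference is easy to see with predict() for a linear model; a small sketch with simulated data (not data from the text):

        set.seed(1)
        x <- rnorm(100)
        y <- 2 + 3 * x + rnorm(100)
        fit <- lm(y ~ x)
        new <- data.frame(x = 1)
        predict(fit, new, interval = "confidence")   # interval for the conditional mean
        predict(fit, new, interval = "prediction")   # wider interval for a new response value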

  25.

    The use of split samples means that whatever overfitting or data snooping might result from the fitting procedure applies to the first split and does not taint the residuals from the second split. Moreover, there will typically be no known or practical way to do proper statistical inference that includes uncertainty from the training data and fitting procedure when there is data snooping.

  26.

    This works because the data are assumed to be IID, or at least exchangeable. Therefore, it makes sense to consider the interval in which forecasted values fall with a certain probability (e.g., .95) in limitless IID realizations of the forecasting data.

  27.

    Because of the random split, the way some of the labels line up in the plot may not be quite right when the code is run again. But that is easily fixed.

  28.

    The use of split samples can be a disadvantage. As discussed in some detail later, many statistical learning procedures are sample-size dependent when the fitting is undertaken. Smaller samples lead to fitted values and forecasts that can have more bias with respect to the true response surface. But in trade, no additional assumptions need be made when the second split is used to compute residuals.

  29.

    If the sampling were without replacement, the existing data would simply be reproduced unless the sample size was smaller than N. More will be said about this option in later chapters based on work by Buja and Stuetzle (2006).
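
    A two-line sketch of the contrast (with an arbitrary sample size):

        set.seed(2)
        N <- 10
        sort(sample(1:N, N, replace = TRUE))    # with replacement: duplicates and omissions
        sort(sample(1:N, N, replace = FALSE))   # without replacement: the original N indices reappear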

  30.

    The boot procedures stem from the book by Davison (1997). The code was written by Angelo Canty and Brian Ripley.
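
    As a minimal, hypothetical illustration of the boot package (bootstrapping a sample mean; the data and statistic are invented for this sketch):

        library(boot)
        set.seed(3)
        dat <- data.frame(y = rnorm(50))
        mean_stat <- function(d, i) mean(d$y[i])   # statistic recomputed on each resample
        out <- boot(data = dat, statistic = mean_stat, R = 1000)
        boot.ci(out, type = "perc")                # percentile bootstrap confidence interval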

  31.

    The second-order conditions differ substantially from conventional linear regression because the 1s and 0s are a product of Bernoulli draws (McCullagh and Nelder 1989: Chapter 4). It follows that unlike least squares regression for linear models, logistic regression depends on asymptotics to obtain desirable estimation properties.

  32.

    Some treatments of machine learning include logistic regression as a form of supervised learning. Whether in these treatments logistic regression is seen as a model or an algorithm is often unclear. But it really matters, which will be more apparent shortly.

  33.

    As a categorical statement, this is a little too harsh. Least squares itself is an algorithm that can in fact be used on some statistical learning problems. But regression analysis formulated as a linear model incorporates many additional features that have little or nothing to do with least squares. This will become clearer shortly.

  34.

    There is also “semisupervised” statistical learning that typically concentrates on Y |X, but for which there are more observations for the predictors than for the response. The usual goal is to fit the response better using not just the set of observations for which both Y and X are observed, but also the observations for which only X is observed. Because an analysis of Y |X will often need to consider the joint probability distribution of X as well, the extra data on X alone can be very useful.

  35.

    Recall that because we treat the predictors that constitute X as random variables, the disparities between the approximation and the truth are also random, which allows them to be incorporated in ε.

  36.

    Some academic disciplines like to call the columns of X “inputs,” and Y  an “output” or a “target.” Statisticians typically prefer to call the columns of X “predictors” and Y  a “response.” By and large, the terms predictor (or occasionally, regressor) and response will be used here except when there are links to computer science to be made. In context, there should be no confusion.

  37.

    In later chapters, several procedures will be discussed that can help one consider the “importance” of each input and how inputs are related to outputs.

  38.

    A functional is a function that takes one or more functions as arguments.

  39.

    An estimand is a feature of the joint probability distribution whose value(s) are of primary interest. An estimator is a computational procedure that can provide estimates of the estimand. An estimate is the actual numerical value(s) produced by the estimator. For example, the expected value of a random variable may be the estimand. The usual expression for the mean in an IID dataset can be the estimator. The value of the mean obtained from the sample is the estimate. These terms apply to Level II statistical learning, but with a more complicated conceptual scaffolding.

  40.

    Should a linear probability model be chosen for binomial regression, one could write G = f(X) + ε, which, unlike logistic regression, can be estimated by least squares. However, it has several undesirable properties, such as sometimes returning fitted values larger than 1.0 or smaller than 0.0 (Hastie et al. 2009: section 4.2).
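
    A small simulated example (illustrative only) makes the contrast concrete: least squares applied to a binary response can return fitted values outside [0, 1], whereas logistic regression cannot.

        set.seed(4)
        x <- rnorm(200)
        p <- 1 / (1 + exp(-(-1 + 2 * x)))     # true conditional probabilities
        g <- rbinom(200, 1, p)                # binary response
        lpm <- lm(g ~ x)                      # linear probability model by least squares
        logit <- glm(g ~ x, family = binomial)
        range(fitted(lpm))                    # can fall below 0 or above 1
        range(fitted(logit))                  # always strictly between 0 and 1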

  41.

    The term “mean function” can be a little misleading when the response variable is G. It would be more correct to use “proportion function” or “probability function.” But “mean function” seems to be standard, and we will stick with it.

  42.

    Clustering algorithms have been in use since the 1940s (Cattell 1943), long before the discipline of computer science was born. When rebranded as unsupervised learning, these procedures are just old wine in new bottles. There are other examples of conceptual imperialism, many presented as a form of supervised learning. A common instance is logistic regression, which dates back to at least the 1950s (Cox 1958). Other academic disciplines also engage in rebranding. The very popular difference-in-differences estimator claimed by economists as their own (Abadie 2005) was initially developed by educational statisticians a generation earlier (Linn and Slinde 1977), and the associated research design was formally proposed by Campbell and Stanley (1963).

  43.

    Tuning is somewhat like setting the dials on a coffee grinder by trial and error to determine how fine the grind should be and how much ground coffee should be produced.

  44.

    For linear models, several in-sample solutions have been proposed for data snooping (Berk et al. 2014a,b; Lockhart et al. 2014; Lei et al. 2018), but they are not fully satisfactory, especially for statistical learning.

  45.

    The issues surrounding statistical inference are more complicated. The crux is that uncertainty in the training data and in the algorithm is ignored when performance is gauged solely with test data. This is addressed in the chapters ahead.

  46.

    Loss functions are also called “objective functions” or “cost functions”.

  47.

    In R, many estimation procedures have a prediction procedure that can easily be used with test data to arrive at test data fitted values.
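
    For example, with a random training/test split and a linear model (simulated data, for illustration):

        set.seed(5)
        n <- 200
        dat <- data.frame(x = rnorm(n))
        dat$y <- 1 + 2 * dat$x + rnorm(n)
        train <- sample(1:n, n / 2)                        # indices of the training split
        fit <- lm(y ~ x, data = dat[train, ])              # fit with training data only
        yhat <- predict(fit, newdata = dat[-train, ])      # test-data fitted values
        mean((dat$y[-train] - yhat)^2)                     # test mean squared error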

  48.

    As noted earlier, one likely would be better off using evaluation data to determine the order of the polynomial.

  49.

    The transformation is linear because the \(\hat{y}_i\) are a linear combination of the \(y_i\). This does not mean that the relationships between X and y are necessarily linear.
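
    The point can be checked directly: the fitted values equal \(Hy\), where \(H = X(X'X)^{-1}X'\) is the hat matrix. A short sketch with simulated data:

        set.seed(6)
        x <- rnorm(25)
        y <- 1 + 2 * x + rnorm(25)
        X <- cbind(1, x)                           # design matrix with an intercept column
        H <- X %*% solve(t(X) %*% X) %*% t(X)      # hat matrix
        all.equal(as.vector(H %*% y),
                  unname(fitted(lm(y ~ x))))       # TRUE: the fitted values are linear in y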

  50.

    This is a carry-over from conventional linear regression in which X is fixed. When X is random, Eq. (1.16) does not change. There are, nevertheless, important consequences for estimation that we have begun to address. One may think of the \(\hat{y}_i\) as estimates of the population approximation, not of the true response surface.

  51.

    Emphasis in the original.

  52.

    The residual degrees of freedom can then be computed by subtraction (see also Green and Silverman 1994: Sect. 3.3.4).
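
    As a sketch of the arithmetic, using the hat matrix of a simple linear fit as the smoother matrix (for illustration only):

        set.seed(7)
        x <- rnorm(30)
        y <- 1 + 2 * x + rnorm(30)
        X <- cbind(1, x)
        S <- X %*% solve(t(X) %*% X) %*% t(X)   # smoother (hat) matrix for this fit
        sum(diag(S))                            # effective degrees of freedom: trace of S (2 here)
        length(y) - sum(diag(S))                # residual degrees of freedom by subtraction (28 here)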

  53.

    The symbol I denotes an indicator function. The result is equal to 1 if the argument in brackets is true and equal to 0 if the argument in brackets is false. The 1s and 0s constitute an indicator variable. Sometimes indicator variables are called dummy variables.
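
    In R, such an indicator variable can be built in one line; an arbitrary numeric example:

        z <- c(2, 5, 7, 1)
        as.numeric(z > 3)    # I(z > 3) returns 1 when true, 0 when false: 0 1 1 0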

  54.

    To properly employ a Level II framework, lots of hard thought would be necessary. For example, are the observations realized independently, as the joint probability distribution approach requires? And if not, then what?

References

  • Abadie, A. (2005). Semiparametric difference-in-differences estimators. Review of Economic Studies, 72(1), 1–19.

  • Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In B. N. Petrov & F. Csáki (Eds.), International symposium on information theory (pp. 267–281). Budapest: Akadémiai Kiadó.

  • Angrist, J. D., & Pischke, J. (2009). Mostly harmless econometrics. Princeton: Princeton University Press.

  • Barber, D. (2012). Bayesian reasoning and machine learning. Cambridge: Cambridge University Press.

  • Berk, R. A. (2003). Regression analysis: A constructive critique. Newbury Park, CA: SAGE.

  • Berk, R. A. (2005). New claims about executions and general deterrence: Déjà vu all over again? Journal of Empirical Legal Studies, 2(2), 303–330.

  • Berk, R. A., & Freedman, D. A. (2003). Statistical assumptions as empirical commitments. In T. Blomberg & S. Cohen (Eds.), Law, punishment, and social control: Essays in honor of Sheldon Messinger, Part V (pp. 235–254). Aldine de Gruyter; first published November 1995, revised in the second edition, 2003.

  • Berk, R. A., Kriegler, B., & Ylvisaker, D. (2008). Counting the homeless in Los Angeles County. In D. Nolan & T. Speed (Eds.), Probability and statistics: Essays in honor of David A. Freedman. Monograph series of the Institute of Mathematical Statistics.

  • Berk, R. A., Brown, L., & Zhao, L. (2010). Statistical inference after model selection. Journal of Quantitative Criminology, 26, 217–236.

  • Berk, R. A., Brown, L., Buja, A., Zhang, K., & Zhao, L. (2014a). Valid post-selection inference. Annals of Statistics, 41(2), 802–837.

  • Berk, R. A., Brown, L., Buja, A., George, E., Pitkin, E., Zhang, K., et al. (2014b). Misspecified mean function regression: Making good use of regression models that are wrong. Sociological Methods and Research, 43, 422–451.

  • Berk, R. A., Buja, A., Brown, L., George, E., Kuchibhotla, A. K., Su, W., et al. (2019). Assumption lean regression. The American Statistician. Published online, April 12, 2019.

  • Bishop, C. M. (2006). Pattern recognition and machine learning. New York: Springer.

  • Bolen, C. (2019). Goldman banker snared by AI as U.S. government embraces new tech. Bloomberg Government. Posted July 8, 2019.

  • Bound, J., Jaeger, D. A., & Baker, R. M. (1995). Problems with instrumental variables estimation when the correlation between the instruments and the endogenous explanatory variable is weak. Journal of the American Statistical Association, 90(430), 443–450.

  • Box, G. E. P. (1976). Science and statistics. Journal of the American Statistical Association, 71(356), 791–799.

  • Breiman, L. (2001b). Statistical modeling: Two cultures (with discussion). Statistical Science, 16, 199–231.

  • Buja, A., & Stuetzle, W. (2006). Observations on bagging. Statistica Sinica, 16(2), 323–352.

  • Buja, A., Berk, R., Brown, L., George, E., Pitkin, E., Traskin, M., et al. (2019a). Models as approximations—Part I: A conspiracy of random regressors and model deviations against classical inference in regression. Statistical Science, 34(4), 523–544.

  • Buja, A., Berk, R., Brown, L., George, E., Kuchibhotla, A. K., & Zhao, L. (2019b). Models as approximations—Part II: A general theory of model-robust regression. Statistical Science, 34(4), 545–565.

  • Campbell, D. T., & Stanley, J. C. (1963). Experimental and quasi-experimental designs for research. Boston: Cengage Learning.

  • Cattell, R. B. (1943). The description of personality: Basic traits resolved into clusters. Journal of Abnormal and Social Psychology, 38(4), 476–506.

  • Cochran, W. G. (1977). Sampling techniques (3rd edn.). New York: Wiley.

  • Cook, R. D., & Weisberg, S. (1999). Applied regression including computing and graphics. New York: Wiley.

  • Cox, D. R. (1958). The regression analysis of binary sequences (with discussion). Journal of the Royal Statistical Society, Series B, 20(2), 215–242.

  • Cristianini, N., & Shawe-Taylor, J. (2000). Support vector machines. Cambridge, UK: Cambridge University Press.

  • Dasu, T., & Johnson, T. (2003). Exploratory data mining and data cleaning. New York: Wiley.

  • Davison, A. C. (1997). Bootstrap methods and their application. Cambridge, UK: Cambridge University Press.

  • Edgington, E. S., & Onghena, P. (2007). Randomization tests (4th edn.). New York: Chapman & Hall.

  • Efron, B. (1986). How biased is the apparent error rate of a prediction rule? Journal of the American Statistical Association, 81(394), 461–470.

  • Efron, B., & Tibshirani, R. (1993). An introduction to the bootstrap. New York: Chapman & Hall.

  • Eicker, F. (1963). Asymptotic normality and consistency of the least squares estimators for families of linear regressions. Annals of Mathematical Statistics, 34, 447–456.

  • Eicker, F. (1967). Limit theorems for regressions with unequal and dependent errors. In Proceedings of the fifth Berkeley symposium on mathematical statistics and probability (Vol. 1, pp. 59–82).

  • Faraway, J. J. (2014). Does data splitting improve prediction? Statistics and Computing, 26(1–2), 49–60.

  • Freedman, D. A. (1981). Bootstrapping regression models. Annals of Statistics, 9(6), 1218–1228.

  • Freedman, D. A. (1987). As others see us: A case study in path analysis (with discussion). Journal of Educational Statistics, 12, 101–223.

  • Freedman, D. A. (2004). Graphical models for causation and the identification problem. Evaluation Review, 28, 267–293.

  • Freedman, D. A. (2009a). Statistical models. Cambridge, UK: Cambridge University Press.

  • Freedman, D. A. (2009b). Diagnostics cannot have much power against general alternatives. International Journal of Forecasting, 25, 833–839.

  • Freedman, D. A. (2012). On the so-called ‘Huber sandwich estimator’ and ‘robust standard errors.’ The American Statistician, 60(4), 299–302.

  • Geisser, S. (1993). Predictive inference: An introduction. New York: Chapman & Hall.

  • Green, P. J., & Silverman, B. W. (1994). Nonparametric regression and generalized linear models. New York: Chapman & Hall.

  • Hall, P. (1997). The bootstrap and Edgeworth expansion. New York: Springer.

  • Hand, D., Mannila, H., & Smyth, P. (2001). Principles of data mining. Cambridge, MA: MIT Press.

  • Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning (2nd edn.). New York: Springer.

  • Huber, P. J. (1967). The behavior of maximum likelihood estimates under nonstandard conditions. In Proceedings of the fifth Berkeley symposium on mathematical statistics and probability (Vol. 1, pp. 221–233).

  • Janson, L., Fithian, W., & Hastie, T. (2015). Effective degrees of freedom: A flawed metaphor. Biometrika, 102(2), 479–485.

  • Jöreskog, K. G. (1979). Advances in factor analysis and structural equation models. Cambridge, MA: Abt Books.

  • Kaufman, S., & Rosset, S. (2014). When does more regularization imply fewer degrees of freedom? Sufficient conditions and counterexamples from the lasso and ridge regression. Biometrika, 101(4), 771–784.

  • Leamer, E. E. (1978). Specification searches: Ad hoc inference with nonexperimental data. New York: Wiley.

  • Leeb, H., & Pötscher, B. M. (2005). Model selection and inference: Facts and fiction. Econometric Theory, 21, 21–59.

  • Leeb, H., & Pötscher, B. M. (2006). Can one estimate the conditional distribution of post-model-selection estimators? The Annals of Statistics, 34(5), 2554–2591.

  • Leeb, H., & Pötscher, B. M. (2008). Model selection. In T. G. Anderson, R. A. Davis, J.-P. Kreiß, & T. Mikosch (Eds.), The handbook of financial time series (pp. 785–821). New York: Springer.

  • Lei, J., G’Sell, M., Rinaldo, A., Tibshirani, R. J., & Wasserman, L. (2018). Distribution-free predictive inference for regression. Journal of the American Statistical Association, 113(523), 1094–1111.

  • Linn, R. L., & Slinde, J. A. (1977). The determination of the significance of change between pre- and post-testing periods. Review of Educational Research, 47, 121–150.

  • Lockhart, R., Taylor, J., Tibshirani, R. J., & Tibshirani, R. (2014). A significance test for the lasso (with discussion). Annals of Statistics, 42(2), 413–468.

  • Mallows, C. L. (1973). Some comments on \(C_p\). Technometrics, 15(4), 661–675.

  • Marsland, S. (2014). Machine learning: An algorithmic perspective (2nd edn.). New York: Chapman & Hall.

  • McCullagh, P., & Nelder, J. A. (1989). Generalized linear models (2nd edn.). New York: Chapman & Hall.

  • Meyer, D., Zeileis, A., & Hornik, K. (2007). The strucplot framework: Visualizing multi-way contingency tables with vcd. Journal of Statistical Software, 17(3), 1–48.

  • Michelucci, P., & Dickinson, J. L. (2016). The power of crowds: Combining humans and machines to help tackle increasingly hard problems. Science, 351(6268), 32–33.

  • Murdoch, D., Tsai, Y., & Adcock, J. (2008). P-values are random variables. The American Statistician, 62, 242–245.

  • Murphy, K. P. (2012). Machine learning: A probabilistic perspective. Cambridge, MA: MIT Press.

  • Nagin, D. S., & Pepper, J. V. (2012). Deterrence and the death penalty. Washington, D.C.: National Research Council.

  • Open Science Collaboration (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.

  • Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 2(11), 559–572.

  • Rice, J. A. (2007). Mathematical statistics and data analysis (3rd edn.). Belmont, CA: Duxbury Press.

  • Rozeboom, W. W. (1960). The fallacy of null-hypothesis significance tests. Psychological Bulletin, 57(5), 416–428.

  • Rubin, D. B. (1986). Which ifs have causal answers. Journal of the American Statistical Association, 81, 961–962.

  • Rubin, D. B. (2008). For objective causal inference, design trumps analysis. Annals of Applied Statistics, 2(3), 808–840.

  • Rummel, R. J. (1988). Applied factor analysis. Evanston, IL: Northwestern University Press.

  • Ruppert, D., Wand, M. P., & Carroll, R. J. (2003). Semiparametric regression. Cambridge, UK: Cambridge University Press.

  • Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464.

  • Stigler, S. M. (1981). Gauss and the invention of least squares. The Annals of Statistics, 9(3), 465–474.

  • Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning (2nd edn.). Cambridge, MA: MIT Press.

  • Torgerson, W. (1958). Theory and methods of scaling. New York: Wiley.

  • Weisberg, S. (2013). Applied linear regression (4th edn.). New York: Wiley.

  • White, H. (1980a). Using least squares to approximate unknown regression functions. International Economic Review, 21(1), 149–170.

  • Witten, I. H., & Frank, E. (2000). Data mining. San Francisco: Morgan Kaufmann.


Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Berk, R.A. (2020). Statistical Learning as a Regression Problem. In: Statistical Learning from a Regression Perspective. Springer Texts in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-030-40189-4_1
