Introduction to Empirical Data Analysis

Backhaus, Klaus; Erichson, Bernd; Gensler, Sonja; Weiber, Rolf; Weiber, Thomas

doi:10.1007/978-3-658-32589-3_1

Abstract

This chapter introduces, characterizes and classifies the eight methods of multivariate data analysis (MVA) covered in this book. In MVA, several variables are considered simultaneously and their relationships are analyzed quantitatively. MVA aims to describe and explain these relationships or to predict future developments. Bivariate analyses, which consider just two variables at a time, are a special case of MVA. However, reality is usually much more complex and requires the consideration of more than two variables. Furthermore, this chapter presents the fundamentals of empirical data analysis that are relevant to all methods discussed in the book. Since most readers will be familiar with these basics, this material serves primarily as a refresher or as a reference for important aspects of quantitative data analysis, such as basic statistical concepts (e.g. mean, standard deviation, covariance), the difference between correlation and causality, and the basics of statistical testing. Finally, the handling of outliers and missing values is discussed, and the statistical package IBM SPSS Statistics, which is used in this book, is briefly introduced.

Notes

1. Both SPSS and R use the point-biserial calculation of a correlation if one of the variables has only two calculation-relevant values.
2. On www.multivariate-methods.info, the reader will also find an Excel sheet with information on the calculation of the various statistical parameters using Excel.
3. In Excel, the mean of a variable can be calculated by: =AVERAGE(matrix), where (matrix) is the range of cells containing the data of the variable. For example, “=AVERAGE(C6:C55)” calculates the mean of the 50 cells C6 to C55 in column C.
4. In Excel, the sample variance can be calculated by: \(s_{x}^{2}\) = VAR.S(matrix). The population variance can be calculated by: \(\sigma_{x}^{2}\) = VAR.P(matrix).
5. In Excel, the sample standard deviation can be calculated by: \(s_{x}\) = STDEV.S(matrix). The population standard deviation can be calculated by: \(\sigma_{x}\) = STDEV.P(matrix).
6. Variance and standard deviation cannot be interpreted meaningfully for the variable “gender”. However, columns E and F are required for the calculation of covariance and correlations.
7. In Excel, the covariance can be calculated as follows: \(s_{xy}\) = COVARIANCE.S(matrix1;matrix2).
8. In Excel, the correlation between variables can be calculated as follows: \(r_{xy}\) = CORREL(matrix1;matrix2).
9. Cf. the correlation of binary variables with metrically scaled variables in Sect. 1.1.2.2.
10. For statistical testing, also see Sect. 1.3.
11. The p-value may be calculated in Excel as follows: p = TDIST(ABS(t);N–2;2) or p = 1–F.DIST(F;1;N–2;1).
12. The central limit theorem states that the sum or mean of n independent random variables tends toward a normal distribution if n is sufficiently large, even if the original variables themselves are not normally distributed. This is the reason why a normal distribution can be assumed for many phenomena.
13. In Excel we can calculate the critical value \(t_{\alpha /2}\) for a two-tailed t-test by using the function T.INV.2T(α;df). We get: T.INV.2T(0.05;99) = 1.98. The values in the last line of the t-table are identical to those of the normal distribution. With df = 99 the t-distribution comes very close to the normal distribution.
14. In Excel we can calculate the p-value by using the function T.DIST.2T(ABS(t_emp);df). For the variable in our example we get: T.DIST.2T(ABS(–1.90);99) = 0.0603 or 6.03%.
15. In Excel we can calculate the critical value \(t_{\alpha }\) for the lower tail by using the function T.INV(α;df). We get: T.INV(0.05;99) = –1.66. For the upper tail we have to switch the sign or use the function T.INV(1–α;df).
16. In Excel we can calculate the p-value for the left tail by using the function T.DIST(t_emp;df;1). We get: T.DIST(–1.90;99;1) = 0.0302 or 3.02%. The p-value for the right tail is obtained by the function T.DIST.RT(t_emp;df).
17. Cf., e.g., Hastie et al. 2011; Pearl and Mackenzie 2018; Gigerenzer 2002.
18. The histogram was created with Excel by selecting “Data/Data Analysis/Histogram”. In SPSS, histograms are created by selecting “Analyze/Descriptive Statistics/Explore”.
19. In SPSS we can create boxplots (just like histograms) by selecting “Analyze/Descriptive Statistics/Explore”.
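
The Excel formulas in the notes above have close counterparts in Python's standard library. The following is a minimal sketch under stated assumptions, not the procedure used in the book: the data values are invented for illustration, and the helper names sample_covariance, pearson_r, t_pdf and t_two_tailed_p are hypothetical. Because the standard library has no t-distribution function, the two-tailed p-value is approximated here by numerically integrating the t density with Simpson's rule.

```python
import math
import statistics

# Hypothetical data, invented for illustration only (not from the book).
x = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
y = [1.0, 3.0, 2.0, 5.0, 6.0, 7.0]


def sample_covariance(a, b):
    """Sample covariance with divisor n - 1 (counterpart of COVARIANCE.S)."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    cross = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    return cross / (len(a) - 1)


def pearson_r(a, b):
    """Pearson correlation (counterpart of CORREL)."""
    return sample_covariance(a, b) / (statistics.stdev(a) * statistics.stdev(b))


def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def t_two_tailed_p(t, df, steps=2000):
    """Approximate two-tailed p-value (counterpart of T.DIST.2T).

    Integrates the density over [0, |t|] with Simpson's rule;
    steps must be even.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # P(-|t| <= T <= |t|)
    return 1 - central_area


print("mean:", statistics.mean(x))                   # AVERAGE
print("sample variance:", statistics.variance(x))    # VAR.S
print("population variance:", statistics.pvariance(x))  # VAR.P
print("sample std. dev.:", statistics.stdev(x))      # STDEV.S
print("population std. dev.:", statistics.pstdev(x))    # STDEV.P
print("covariance:", sample_covariance(x, y))
print("correlation:", pearson_r(x, y))
print("two-tailed p for t = -1.90, df = 99:", t_two_tailed_p(-1.90, 99))
```

The last line uses the values from note 14 (t_emp = –1.90, df = 99), so its result can be compared with the 0.0603 reported there. Halving it gives the one-tailed p-value for the left tail discussed in note 16.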

References

Campbell, D. T., & Stanley, J. C. (1966). Experimental and quasi-experimental designs for research. Chicago: Rand McNally.
du Toit, S. H. C., Steyn, A. G. W., & Stumpf, R. H. (1986). Graphical exploratory data analysis. New York: Springer.
Freedman, D. (2002). From association to causation: Some remarks on the history of statistics. Technical Report No. 521. Berkeley: University of California.
Gigerenzer, G. (2002). Calculated risks. New York: Simon & Schuster.
Green, P. E., Tull, D. S., & Albaum, G. (1988). Research for marketing decisions (5th ed.). Englewood Cliffs (NJ): Prentice Hall.
Hastie, T., Tibshirani, R., & Friedman, J. (2011). The elements of statistical learning. New York: Springer.
Pearl, J., & Mackenzie, D. (2018). The book of why: The new science of cause and effect. New York: Basic Books.
Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103(2684), 677–680.
Tukey, J. W. (1977). Exploratory data analysis. Reading (MA): Addison-Wesley.
Watson, J., Whiting, P. F., & Brush, J. E. (2020). Interpreting a covid-19 test result. British Medical Journal, 369, m1808.

Author information

Authors and Affiliations

Institute of Business-to-Business Marketing, Marketing Center Münster, University of Münster, Münster, Germany
Klaus Backhaus
Otto-von-Guericke-University Magdeburg, Magdeburg, Germany
Bernd Erichson
Chair for Value-Based-Marketing, Marketing Center Münster, University of Münster, Münster, Germany
Sonja Gensler
Chair of Marketing and Innovation, University of Trier, Trier, Germany
Rolf Weiber
Munich, Germany
Thomas Weiber

Corresponding author

Correspondence to Klaus Backhaus.

About this chapter

Cite this chapter

Backhaus, K., Erichson, B., Gensler, S., Weiber, R., Weiber, T. (2021). Introduction to Empirical Data Analysis. In: Multivariate Analysis. Springer Gabler, Wiesbaden. https://doi.org/10.1007/978-3-658-32589-3_1

DOI: https://doi.org/10.1007/978-3-658-32589-3_1
Published: 14 October 2021
Publisher Name: Springer Gabler, Wiesbaden
Print ISBN: 978-3-658-32588-6
Online ISBN: 978-3-658-32589-3
eBook Packages: Business and Economics (German Language)
