Solutions
1.1.1 Exercise 1.1: Problem 1
The following probabilities can be extracted from the given information. Note that fail represents the event of a machine tool failure, good represents the event that there is no machine tool failure, vibration represents that there was excessive vibration, and overheat represents the event of overheating.
Given these initial probabilities, the conditional probability that there was a failure given an observed vibration can be calculated as follows.
This conditional probability of failure is now used as the prior probability of failure in the next calculation. Since the probabilities of the two complementary outcomes must sum to one, the probability of the good event (no failure) must be 0.66. With these updated probabilities, the probability of failure given the next event can be calculated.
So, after both events occurred, the probability of a failure is 0.392.
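The sequential update described above can be sketched in Python. This is only an illustration: the prior and the likelihood values below are hypothetical placeholders, not the numbers of the exercise, and the sketch assumes that the observed events are conditionally independent given the machine state.

```python
def bayes_update(prior_fail, p_event_given_fail, p_event_given_good):
    """Posterior P(fail | event) from a prior and the two likelihoods."""
    numerator = p_event_given_fail * prior_fail
    evidence = numerator + p_event_given_good * (1.0 - prior_fail)
    return numerator / evidence


def sequential_update(prior_fail, events):
    """Apply Bayes' rule once per observed event, reusing each posterior as the next prior."""
    p_fail = prior_fail
    for p_event_given_fail, p_event_given_good in events:
        p_fail = bayes_update(p_fail, p_event_given_fail, p_event_given_good)
    return p_fail


# Hypothetical numbers, chosen only to show the mechanics
prior = 0.10
vibration = (0.8, 0.2)  # P(vibration | fail), P(vibration | good)
overheat = (0.6, 0.3)   # P(overheat | fail), P(overheat | good)

after_vibration = bayes_update(prior, *vibration)
after_both = sequential_update(prior, [vibration, overheat])
```

The same loop extends to any number of consecutive events, since each posterior simply becomes the prior for the next step.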
1.1.2 Exercise 1.1: Problem 2
The following frequencies (probabilities) can be extracted from the Example data.
Now, the probability of poor quality given event A at 10 am:
The new probabilities are:
Now, the probability of poor quality given the consecutive event B at 2 pm:
The new probabilities are:
Now, the probability of poor quality given the consecutive event C at 4 pm:
Given the three sequential events, the probability of poor product quality becomes 99.86 %.
1.1.3 Exercise 1.1: Problem 3
The probability of passing the requirements can be calculated for the original system as the area between two z-scores on a standard normal curve. The upper and lower bound z-scores are calculated as:
The probability inside these bounds is 0.7335, or 73.4 %.
Now, to check the improvement from the controls added, the same process will be done with the statistical data for the controlled process.
The probability inside these bounds is 0.9991, or 99.91 %.
Given this information, we can see that the probability of passing the requirements rose from 73.4 % to 99.91 % with the addition of controls. This is an increase of about 26.5 percentage points in the success rate of the procedure.
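The area between two z-scores can be computed with the Python standard library. The sketch below is a minimal example: the process mean, standard deviation, and specification limits are hypothetical, because the exercise data are not reproduced here; only the two probabilities 0.7335 and 0.9991 come from the solution above.

```python
from statistics import NormalDist


def pass_probability(mean, sd, lower, upper):
    """P(lower <= X <= upper) for X ~ Normal(mean, sd), via z-scores."""
    z_low = (lower - mean) / sd
    z_high = (upper - mean) / sd
    standard_normal = NormalDist()
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)


# Hypothetical process: mean 10.0, sd 0.5, limits 9.5 to 10.5 (one sd each side)
p_hypothetical = pass_probability(10.0, 0.5, 9.5, 10.5)

# Change between the two probabilities quoted in the solution, in percentage points
improvement_points = (0.9991 - 0.7335) * 100
```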
1.1.4 Exercise 1.1: Problem 4
In this problem, we are looking for the amount of time sufficient for 90 % of students to complete the reading task. For this, we model the reading time as a normal distribution with a mean of 2.5 min and a standard deviation of 0.6 min. The z-value with 90 % of the probability below it is 1.282.
The reading time value associated with this z is 3.3 min. We can conclude that within 3.3 min, 90 % of students will be done reading one page.
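The same quantile can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library with the mean and standard deviation stated in the problem; the variable names are illustrative.

```python
from statistics import NormalDist

# Reading time per page, in minutes, as stated in the problem
reading_time = NormalDist(mu=2.5, sigma=0.6)

# z-value with 90 % of the standard normal probability below it
z_90 = NormalDist().inv_cdf(0.90)

# Time within which 90 % of students finish: mean + z * sd
time_90 = reading_time.inv_cdf(0.90)
```

Rounded to one decimal place, `time_90` agrees with the 3.3 min quoted above.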
1.1.5 Exercise 1.1: Problem 5
1.1.5.1 Part A
For a 90 % confidence interval and 17 students, the t-value for this calculation will be
The 90 % confidence interval for mean:
The 90 % confidence interval for standard deviation:
For a 95 % confidence interval and 17 students, the t-value for this calculation will be
The 95 % confidence interval for mean:
The 95 % confidence interval for standard deviation:
1.1.5.2 Part B
Doubling the accuracy with a 90 % confidence interval would require the following N.
With N = 63, the interval width is halved, which doubles the accuracy.
Doubling the accuracy of a 95 % confidence interval would require the following N.
With N = 61, the interval width is halved, which doubles the accuracy.
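One way to arrive at sample sizes like these is to search for the smallest N whose interval half-width is at most half of the half-width at N = 17. The sketch below makes several assumptions that are not stated in the solution: the half-width has the form t·s/√N, the t-value uses N − 1 degrees of freedom, and the sample standard deviation s does not change with N. The t quantile is computed numerically in pure Python (Simpson's rule plus bisection) so that no external library is needed.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=200):
    """CDF for x >= 0, using Simpson's rule on [0, x] and symmetry about 0."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Quantile for p > 0.5, found by bisection on the CDF."""
    lo, hi = 0.0, 20.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def relative_half_width(n, confidence):
    """Half-width of the interval for the mean in units of s: t / sqrt(n)."""
    p = 1 - (1 - confidence) / 2
    return t_quantile(p, n - 1) / math.sqrt(n)


def n_for_double_accuracy(n0, confidence):
    """Smallest N whose half-width is at most half of the half-width at n0."""
    target = relative_half_width(n0, confidence) / 2
    n = n0 + 1
    while relative_half_width(n, confidence) > target:
        n += 1
    return n


n_90 = n_for_double_accuracy(17, 0.90)
n_95 = n_for_double_accuracy(17, 0.95)
```

Under these assumptions the search gives N = 63 for the 90 % interval and N = 61 for the 95 % interval.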
1.1.5.3 Part C
Doubling the accuracy for a 90 % confidence interval would require the following N.
With N = 60, the interval width is halved, which doubles the accuracy.
Doubling the accuracy of a 95 % confidence interval would require the following N.
With N = 61, the interval width is halved, which doubles the accuracy.
1.1.6 Exercise 1.2: Problem 1
The correlation matrix was calculated with the following configuration:
In which the correlation coefficient for two variables, x and y, is defined as
The correlation matrix for x, y, and z is:
Then, statistical significance was evaluated by comparing the half-width of the confidence interval for each correlation coefficient to the coefficient itself. A correlation coefficient was deemed significant if it lay outside the confidence interval, meaning that its magnitude was greater than the half-width of the interval. The half-widths of the intervals were calculated as
- Δxy = 0.14956 and r_xy is 0.02506, so the x-y correlation is not significant.
- Δxz = 0.14888 and r_xz is 0.071861, so the x-z correlation is not significant.
- Δyz = 0.094818 and r_yz is 0.60531, so the y-z correlation is significant.
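The significance check can be sketched in Python. The three half-widths listed above are consistent with the form Δ = t(1 − r²)/√N, but the t-value and N are not restated here, so the sketch treats that form as an assumption and infers the common factor t/√N from the x-y pair. `pearson_r` is the usual sample correlation coefficient; the helper names are illustrative.

```python
import math


def pearson_r(x, y):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def half_width(r, scale):
    """Assumed interval half-width for r, with scale = t / sqrt(N)."""
    return scale * (1 - r * r)


# Coefficients and half-widths quoted in the solution
r = {"xy": 0.02506, "xz": 0.071861, "yz": 0.60531}
delta = {"xy": 0.14956, "xz": 0.14888, "yz": 0.094818}

# Common factor t / sqrt(N), inferred from the x-y pair (an assumption)
scale = delta["xy"] / (1 - r["xy"] ** 2)

# A coefficient is called significant when it exceeds its half-width
significant = {pair: abs(r[pair]) > half_width(r[pair], scale) for pair in r}
```

With the inferred factor, `half_width` reproduces the other two quoted half-widths to about four significant figures, and only the y-z pair is flagged as significant.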
1.1.7 Exercise 1.2: Problem 2
The multiple correlation coefficient Ry,xvz is:
1.1.8 Exercise 1.2: Problem 3
First, the array of X values was sorted from minimum to maximum and then split into two equal parts. The associated Z values were split correspondingly into two equal-length sets. The mean value of Z for each set was calculated separately. The difference between these two mean values was compared to the half-widths of their confidence intervals.
The half-widths were calculated as:
The difference between the mean values of Z1 and Z2 is 0.3387, and the half-widths of the 95 % confidence intervals for the two Z sets are 0.5586 and 0.5934. Because the difference is smaller than either half-width, there is no evidence that the value of variable X has an effect on the mean value of variable Z.
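The split-and-compare procedure can be sketched as follows. The half-width form t·s/√n and the decision rule (the difference must exceed the larger half-width to count as evidence) are assumptions made for illustration; the t-value is passed in rather than computed, and the function names are hypothetical.

```python
import math
from statistics import stdev


def split_by_x(x, z):
    """Sort the (x, z) pairs by x and return the z values of each half."""
    pairs = sorted(zip(x, z), key=lambda pair: pair[0])
    half = len(pairs) // 2
    z_low = [z_value for _, z_value in pairs[:half]]
    z_high = [z_value for _, z_value in pairs[half:]]
    return z_low, z_high


def mean_half_width(values, t_value):
    """Assumed half-width of the interval for the mean: t * s / sqrt(n)."""
    return t_value * stdev(values) / math.sqrt(len(values))


def x_affects_mean_z(difference, half_width_1, half_width_2):
    """Assumed rule: evidence of an effect only if the difference exceeds both half-widths."""
    return abs(difference) > max(half_width_1, half_width_2)


# Numbers quoted in the solution
effect_found = x_affects_mean_z(0.3387, 0.5586, 0.5934)
```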
1.1.9 Exercise 1.2: Problem 4
The cross-correlation function is a function calculated with respect to discrete interval m that varies as m=0,1,2,…,N/4. The value of this function is:
The resulting cross-correlation function is plotted below against m.
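A cross-correlation estimate over lags m = 0, 1, …, N/4 can be sketched in Python. The exact estimator used in the solution is not reproduced here, so the form below is an assumption: means are removed and each lag is averaged over the N − m available products, with no normalisation by the standard deviations. The test signal is hypothetical.

```python
def cross_correlation(x, y, max_lag=None):
    """Estimate r_xy(m) for m = 0..max_lag (default N // 4), mean-removed."""
    n = len(x)
    if max_lag is None:
        max_lag = n // 4
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    result = []
    for m in range(max_lag + 1):
        products = ((x[i] - mean_x) * (y[i + m] - mean_y) for i in range(n - m))
        result.append(sum(products) / (n - m))
    return result


# Hypothetical example: y is x delayed by one sample
x = [0, 1, 0, -1] * 10
y = x[-1:] + x[:-1]

r_xy = cross_correlation(x, y)
best_lag = max(range(len(r_xy)), key=lambda m: r_xy[m])
```

For this example the largest value occurs at lag 1, which matches the one-sample delay built into `y`.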
1.1.10 Exercise 1.2: Problem 5
A frequency analysis tool in MATLAB was used to break down the frequency spectrum of x(i).
The detected peaks are consistent with the frequencies and magnitudes of the sinusoidal components of the signal:
- Peak at 0.016 Hz (0.10 radians/s) with amplitude 7
- Peak at 0.11 Hz (0.70 radians/s) with amplitude 2
- Peak at 0.28 Hz (1.77 radians/s) with amplitude 0.5
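The peak detection can be illustrated with a direct discrete Fourier transform in Python, as a stand-in for the MATLAB tool. This is a sketch on a synthetic signal, not the exercise data: the sampling rate and record length are hypothetical, and the component frequencies are set to 0.016, 0.112, and 0.28 Hz so that they fall exactly on DFT bins and avoid spectral leakage.

```python
import cmath
import math


def amplitude_spectrum(x, fs):
    """Single-sided amplitude spectrum from a direct DFT (O(n^2), fine for short records)."""
    n = len(x)
    spectrum = []
    for k in range(1, n // 2):
        x_k = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        spectrum.append((k * fs / n, 2 * abs(x_k) / n))
    return spectrum


fs = 1.0   # hypothetical sampling rate, Hz
n = 250    # hypothetical record length, samples
components = [(0.016, 7.0), (0.112, 2.0), (0.28, 0.5)]  # (frequency in Hz, amplitude)

signal = [
    sum(a * math.sin(2 * math.pi * f * i / fs) for f, a in components)
    for i in range(n)
]

# Keep only the bins whose amplitude stands clearly above numerical noise
peaks = [(f, a) for f, a in amplitude_spectrum(signal, fs) if a > 0.1]
```

Multiplying each peak frequency by 2π converts it from Hz to radians per second, which is how the values in parentheses above are obtained.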
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this chapter
Skormin, V.A. (2016). Statistical Methods and Their Applications. In: Introduction to Process Control. Springer Texts in Business and Economics. Springer, Cham. https://doi.org/10.1007/978-3-319-42258-9_1
DOI: https://doi.org/10.1007/978-3-319-42258-9_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42257-2
Online ISBN: 978-3-319-42258-9
eBook Packages: Business and Management (R0)