Analytics to Infer Population Characteristics and Differences

Fraser, Cynthia

doi:10.1007/978-3-031-42555-4_2


Abstract

Analytics to infer population characteristics and segment differences rely on samples. Samples are collected and analyzed to efficiently estimate population characteristics. Inference from samples enables tests of hypotheses about what may be true in the population. Those hypotheses are tested, and population parameters are estimated with confidence intervals. This chapter includes tests of hypotheses and confidence intervals. Inference relies on the properties of Normally distributed sample means. Those properties of Normal distributions are explored first, below.


Author information

Authors and Affiliations

McIntire School of Commerce, University of Virginia, Charlottesville, VA, USA
Cynthia Fraser


2.1 Electronic Supplementary Material(s)

DC hotels master (XLSX 16 kb)

Latte master (XLSX 15 kb)

Nationals salaries master (XLSX 18 kb)

Polaski master (XLSX 15 kb)

Appendices

Excel 2.1: Inference with Pairs

2.1.1 Find the Confidence Interval for the Mean Difference Between Alternate Pairs

Polaski

To estimate the population difference in perceived taste of (identical) vodka samples poured from a leading brand bottle and a new brand concept bottle, from sample data, construct a confidence interval of the average rating difference.

Activate the Analysis ToolPak: File, Options, Add-ins, Analysis ToolPak.

Find the mean and standard deviation of the difference and the margin of error of the difference (labelled Confidence, in row 16 of descriptives).

Data Analysis, Descriptives. Ask for Summary Statistics and the Confidence Level for Mean.

Subtract and add the margin of error (in B16) from the mean difference to find the 95% confidence interval bounds for the difference.

A screenshot of an Excel sheet has 2 columns A and B and 20 rows of data. The mean difference in B 3, confidence level difference in B 18, upper 95% c i difference in B 19, and lower 95% c i difference in B 20 are highlighted.
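The same arithmetic can be sketched outside Excel. The following is a minimal Python illustration, not the chapter's method: the `differences` list is invented for the example (it is not the Polaski data), and the critical t of 2.262 is the standard table value for 9 degrees of freedom, which applies only because the made-up sample has 10 pairs.

```python
import math
import statistics

# Hypothetical rating differences (Stoly bottle minus Polaski bottle), one per taster.
# These values are invented for illustration; they are not the Polaski data.
differences = [1, 1, 2, 0, 1, 1, 1, 2, 0, 2]

n = len(differences)
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)   # sample standard deviation (divides by n - 1)
se_diff = sd_diff / math.sqrt(n)          # standard error of the mean difference

# Two-tail critical t for 95% confidence with n - 1 = 9 degrees of freedom,
# taken from a standard t table (the value T.INV.2T(.05, 9) returns in Excel).
t_critical = 2.262

# Margin of error: the quantity the descriptives output labels Confidence.
margin_of_error = t_critical * se_diff
lower = mean_diff - margin_of_error
upper = mean_diff + margin_of_error

print(f"mean difference: {mean_diff:.2f}")
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

With a different number of pairs, the critical t would need to be looked up again for the new degrees of freedom.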

Test Expectations Regarding the Mean Difference Between Alternate Pairs with a Paired T.Test

Managers expect that vodka poured from the leading brand package, Stoly, will be perceived as better tasting than the identical vodka poured from the proposed new Polaski package:

$$ \mathrm{H}_1: \mu_{\mathrm{Stoly}-\mathrm{Polaski}} > 0 $$

If that is not the case, then the difference in perceived taste ratings will be less than or equal to zero:

$$ \mathrm{H}_0: \mu_{\mathrm{Stoly}-\mathrm{Polaski}} \le 0 $$

To test the null expectation, use the function T.TEST(array1, array2,tails,type) to calculate a paired t test. For array1, enter the Stoly package taste ratings. For array2, enter the Polaski package taste ratings. For tails, enter 1 for a one tail test (since the hypotheses contain inequalities), and for type, enter 1 to specify a paired t test.

A screenshot of an Excel sheet. The column headers are Stoly bottle, Polaski bottle, difference, and perceived taste difference scale. Columns A and B are highlighted. The p-value is calculated in row 32.

If managers were, instead, confident that the Polaski name and package were as effective as the Stoly name and package in influencing taste ratings, the null hypothesis would contain an equality and a two tail test would be used.

In such a case, for tails, enter 2 for a two tail test (since the null hypothesis contains an equality).

A screenshot of an Excel sheet. The column headers are Stoly bottle, Polaski bottle, difference, and perceived taste difference scale. The calculated p-value is 7.4274 E minus 06.
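For readers who want to see what a paired t test computes, here is a rough Python sketch using only the standard library. The ratings are invented, and `t_cdf` is a hypothetical helper that approximates the Student t distribution by numerical integration; it is an illustration under those assumptions, not a description of how Excel implements T.TEST.

```python
import math
import statistics

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) for Student's t using Simpson's rule on the density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = abs(t) / steps
    total = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3                      # integral of the density from 0 to |t|
    return 0.5 + area if t >= 0 else 0.5 - area

# Hypothetical taste ratings from the same ten tasters (invented, not the Polaski data).
stoly = [2, 1, 3, 2, 0, 1, 2, 3, 1, 2]
polaski = [1, 0, 1, 2, -1, 0, 1, 1, 1, 0]

diffs = [s - p for s, p in zip(stoly, polaski)]
n = len(diffs)
t_stat = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

# One tail: H1 says the mean difference is greater than zero (upper tail).
p_one_tail = 1 - t_cdf(t_stat, n - 1)
# Two tail: H0 says the mean difference equals zero.
p_two_tail = 2 * (1 - t_cdf(abs(t_stat), n - 1))

print(f"t = {t_stat:.3f}, one tail p = {p_one_tail:.5f}, two tail p = {p_two_tail:.5f}")
```

The two tail p value is twice the one tail p value here because the sample difference falls in the direction that H1 predicts.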

Excel 2.2: Inference with Two Population Samples

DC Hotels

Estimate the difference between Hilton’s and competitors’ mean ratings.

Use descriptives to find the segment sample means, standard deviations, and standard errors for Hilton hotels and for competitors’ hotels.

A screenshot of an Excel sheet. The column headers are Hilton, guest, competitors, and guest. The values are provided from the rows 3 to 15.

Find the difference between segment means, B2-D2, and the pooled standard error of the difference from the segment sample variances and sample sizes: SQRT(B3^2 + D3^2).

A screenshot of an Excel sheet. The column headers are Hilton, guest, other, and guest. The values are listed from rows 2 to 16. The difference between means is 0.157051 and pooled s e is 0.057506.

Find the degrees of freedom required for the critical t. First, find the numerator from the segment standard errors: (B4^2 + D4^2)^2

A screenshot of an Excel sheet. The column headers are Hilton, guest, competitors, and guest. The standard errors in B 4 and D 4 cells are highlighted.

Next, find the denominator from the segment standard errors and sample sizes: B4^4/(B15–1) + D4^4/(D15–1)

A screenshot of an Excel sheet. The column headers are Hilton, guest, competitors, and guest. The values are provided from rows 3 to 15. Values in B 4, B 15, D 4, and D 15 cells are highlighted.

In E5, find degrees of freedom of 125.4 by dividing the numerator by the denominator.
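The numerator and denominator steps above can be checked with a few lines of Python. In this sketch the two standard errors are invented placeholders for the values in B4 and D4, and the sample sizes of 37 and 92 are assumed from the data ranges that appear in the T.TEST formula later in this appendix, so the result will not reproduce the 125.4 in E5.

```python
# Hypothetical segment standard errors (stand-ins for cells B4 and D4).
se_hilton = 0.045
se_competitors = 0.036

# Sample sizes (stand-ins for cells B15 and D15); assumed, not confirmed by the text.
n_hilton = 37
n_competitors = 92

# Numerator: (B4^2 + D4^2)^2
numerator = (se_hilton**2 + se_competitors**2) ** 2

# Denominator: B4^4/(B15-1) + D4^4/(D15-1)
denominator = (se_hilton**4 / (n_hilton - 1)
               + se_competitors**4 / (n_competitors - 1))

degrees_of_freedom = numerator / denominator
print(f"degrees of freedom: {degrees_of_freedom:.1f}")
```

The result always lies between the smaller of the two segment sample sizes minus one and the combined sample size minus two, which is a quick sanity check on the worksheet value.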

Find the critical t with 125.4 degrees of freedom: T.INV.2T(.05,E5)

A screenshot of an Excel sheet. The column headers are Hilton, guest, competitors, and guest. The values are provided from rows 3 to 15. E 5 cell is highlighted. The difference, s e, and critical t values are in rows 16, 17, and 18.

Find the margin of error, .11, by multiplying the critical t by the pooled SE.

Subtract the margin of error from the difference, and add it to the difference, to find the lower and upper bounds of the 95% confidence interval, .04 to .27.

A screenshot of an Excel sheet. The column headers are Hilton, guest, competitors, and guest. The values are provided in rows 3 to 15. B 16, 19, and 21 cells are highlighted. The difference, s e, critical t, me, lower 95% c i, and upper 95% c i values are listed in rows 16 to 21.
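As a cross-check on the interval, the following Python sketch starts from the figures reported in the worksheet (difference 0.157051, pooled SE 0.057506, 125.4 degrees of freedom). The helpers `t_cdf` and `t_inv_two_tail` are hypothetical names written for this illustration; they approximate the t distribution numerically and are only a stand-in for T.INV.2T.

```python
import math

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) for Student's t using Simpson's rule on the density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = abs(t) / steps
    total = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_inv_two_tail(alpha, df):
    """Find t such that P(|T| > t) = alpha, by bisection on the approximate CDF."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 2 * (1 - t_cdf(mid, df)) > alpha:
            lo = mid          # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

# Values reported in the worksheet screenshots.
difference = 0.157051
pooled_se = 0.057506
degrees_of_freedom = 125.4

t_critical = t_inv_two_tail(0.05, degrees_of_freedom)
margin_of_error = t_critical * pooled_se
lower = difference - margin_of_error
upper = difference + margin_of_error

print(f"critical t: {t_critical:.3f}")
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

Rounded to two decimals, the bounds agree with the .04 to .27 interval stated above.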

Excel 2.3: Analytics to Test the Expectations Regarding the Difference Between Two Segment Means

Use the Excel function T.TEST(array1,array2,tails,type) to find the p value from a one tail t test of the difference between average guest ratings of the two segments. For array1, enter the sample Hilton guest ratings. For array2, enter the sample competitors’ guest ratings. Assuming that managers are confident that Hilton’s guest ratings are higher, enter 1 for a one tail test, and for type, enter 3 to signal a two sample t test which allows the standard deviations to differ between segments: T.TEST(B2:B38,E2:E93,1,3)

A screenshot of an Excel sheet. The column headers are the Hilton Hotel, guest, price, and competitors. The data are provided in rows 34 to 39. B 34 to 37 and B 39 cells are highlighted.
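T.TEST works directly from the two columns of ratings. As a rough cross-check, the sketch below approximates the one tail p value from the summary figures reported earlier (difference 0.157051, pooled SE 0.057506, 125.4 degrees of freedom) instead of the raw data, so small discrepancies from the worksheet result are possible. `t_cdf` is a hypothetical helper that integrates the t density numerically.

```python
import math

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) for Student's t using Simpson's rule on the density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = abs(t) / steps
    total = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

# Summary figures reported in the Excel 2.2 worksheet.
difference = 0.157051          # Hilton mean minus competitors' mean
pooled_se = 0.057506           # standard error of the difference
degrees_of_freedom = 125.4     # allows unequal standard deviations (type 3)

t_stat = difference / pooled_se

# One tail test of H1: Hilton's mean rating is higher than competitors'.
p_one_tail = 1 - t_cdf(t_stat, degrees_of_freedom)

print(f"t = {t_stat:.3f}, one tail p = {p_one_tail:.4f}")
```

A p value below the chosen significance level, such as .05, would lead to rejecting the null hypothesis that Hilton's mean rating is no higher than competitors'.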

Case 2.1: Moneyball in 2022

A.
In 2021, the Washington Nationals finished last in the National League East. Some Nats players had heard that, back in 2018, Phillies players had complained to their General Manager that their last place performance was due to their low salaries. The Phillies got raises and finished second in the NLE in 2021. Now, the Nats players allege that their last place performance is due to their low salaries. The General Manager believes that they may have a point.

Determine whether or not the Nationals’ salaries are less than those of other NLE players. Data are in Nationals salaries.xlsx.
1. This is: ___ a one tail OR ___ a two tail test.
2. This is: ___ an independent two sample test OR ___ a paired test.
3. State the null hypothesis using symbols:
4. Can the null hypothesis be rejected? ___ yes ___ no
5. What is the p value from your test?
6. Find the 95% confidence interval for the difference between other National League East salaries and Nationals’ salaries.
B.
The General Manager knows that his strategy to hire younger players resulted in lower salaries. Can he justify the lower salaries by pointing to the younger ages of his players?

Determine whether or not the Nationals’ ages are less than those of other NLE players.
1. This is: ___ a one tail OR ___ a two tail test.
2. This is: ___ an independent two sample test OR ___ a paired test.
3. State the null hypothesis using symbols:
4. Can the null hypothesis be rejected? ___ yes ___ no
5. What is the p value from your test?

Case 2.2: McLattes

McDonalds recently sponsored a blind taste test of lattes from Starbucks and their own McCafes. A sample of 30 Starbucks customers tasted both lattes from unmarked cups and provided ratings on a −3 (=worst latte I’ve ever tasted) to +3 (=best latte I’ve ever tasted) scale. McDonalds managers had conducted taste comparisons with a sample of coffee experts. They were confident that their lattes would be rated equivalently to Starbucks lattes. These data are in Latte.

1. This is ___ a one tail OR ___ a two tail test.
2. This is ___ a two independent samples test OR ___ a paired test.
3. State the null hypothesis using symbols:
4. Test the hypotheses and report your results to management, including the p value:
- ___ reject the null, based on p value: ____ OR ___ fail to reject the null, based on p value: ___


About this chapter

Cite this chapter

Fraser, C. (2024). Analytics to Infer Population Characteristics and Differences. In: Business Statistics for Competitive Advantage with Excel and JMP . Springer, Cham. https://doi.org/10.1007/978-3-031-42555-4_2


DOI: https://doi.org/10.1007/978-3-031-42555-4_2
Published: 05 March 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-42554-7
Online ISBN: 978-3-031-42555-4
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
