Unit – 5
Applied Statistics
Problem 1: two-tailed test
- Suppose Acme Drug Company develops a new drug, designed to prevent colds. The company claims that the drug is equally effective for men and women. To test this claim, they chose a simple random sample of 100 women and 200 men from a population of 100,000 volunteers.
- At the end of the study, 38% of the women had a cold; and 51% of men caught a cold. Based on these results, can we rule out the company's claim that the drug is equally effective for men and women? Use a significance level of 0.05.
Solution:
The solution to this problem involves four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze the sample data, and (4) interpret the results. We work through the following steps:
State the hypotheses. The first step is to state the null hypothesis and an alternative hypothesis.
Null hypothesis: P1 = P2
Alternative hypothesis: P1 ≠ P2
Note that these hypotheses constitute a two-tailed test. The null hypothesis will be rejected if the proportion of population 1 is too large or too small.
Problem 2:
Formulate an analysis plan. For this analysis, the significance level is 0.05. The test method is a two-tailed, two-proportion z-test.
Analyze the sample data. Using the sample data, we calculate the pooled sample proportion (p) and the standard error (SE). Using these measures, we compute the z-score test statistic (z).
Solution:
p = (p1 * n1 + p2 * n2) / (n1 + n2)
p = [(0.38 * 100) + (0.51 * 200)] / (100 + 200)
p = 140/300 = 0.467
SE = sqrt {p * (1 - p) * [(1 / n1) + (1 / n2)]}
SE = sqrt [0.467 * 0.533 * (1/100 + 1/200)]
SE = sqrt [0.003733] = 0.061
z = (p1 - p2) / SE = (0.38 - 0.51) /0.061 = -2.13
Here p1 is the sample proportion in sample 1, p2 is the sample proportion in sample 2, n1 is the size of sample 1 and n2 is the size of sample 2.
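Since we have a two-tailed test, the P-value is the probability that the z-score is less than -2.13 or greater than 2.13.
We use the normal distribution calculator to find P(z < -2.13) = 0.017 and P(z > 2.13) = 0.017. Therefore, the P-value = 0.017 + 0.017 = 0.034.
Interpret the results. Since the P-value (0.034) is less than the significance level (0.05), we reject the null hypothesis that the drug is equally effective for men and women.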
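The arithmetic above can be checked with a short Python sketch. It uses only the standard library; the function name `two_proportion_z` is a hypothetical helper introduced here for illustration, not something from the original notes. Because the code does not round intermediate values, its P-value may differ from 0.034 in the third decimal place (the text adds two tail probabilities that were each already rounded to 0.017).

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(p1, n1, p2, n2):
    """Return (pooled proportion, standard error, z) for a pooled two-proportion z-test."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return pooled, se, z


# Sample 1: women (38% of 100 caught a cold); sample 2: men (51% of 200).
pooled, se, z = two_proportion_z(0.38, 100, 0.51, 200)

# Two-tailed P-value: probability of a z-score at least this extreme in either tail.
p_two_tailed = 2 * NormalDist().cdf(-abs(z))

print(f"pooled p = {pooled:.3f}, SE = {se:.3f}, z = {z:.2f}, P = {p_two_tailed:.3f}")
print("reject H0 at 0.05:", p_two_tailed < 0.05)
```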
Problem 3:
Suppose the previous example is stated slightly differently. Suppose Acme Drug Company develops a new drug, designed to prevent colds. The company claims that the drug is more effective for women than for men. To test this claim, they chose a simple random sample of 100 women and 200 men from a population of 100,000 volunteers.
At the end of the study, 38% of the women had a cold; and 51% of men caught a cold. Based on these results, can we conclude that the medication is more effective for women than for men? Use a significance level of 0.01.
Solution:
The solution to this problem involves four steps: (1) state the hypotheses, (2) formulate an analysis plan, (3) analyze the sample data, and (4) interpret the results. We work through the following steps:
State the hypotheses. The first step is to state the null hypothesis and an alternative hypothesis.
Null hypothesis: P1 ≥ P2
Alternative hypothesis: P1 < P2
Note that these hypotheses constitute a one-tailed test. The null hypothesis will be rejected if the proportion of women who have a cold (p1) is sufficiently lower than the proportion of men who have a cold (p2).
Formulate an analysis plan. For this analysis, the significance level is 0.01. The test method is a one-tailed, two-proportion z-test.
Analyze the sample data. Using the sample data, we calculate the pooled sample proportion (p) and the standard error (SE). Using these measures, we compute the z-score test statistic (z).
p = (p1 * n1 + p2 * n2) / (n1 + n2)
p = [(0.38 * 100) + (0.51 * 200)] / (100 + 200)
p = 140/300 = 0.467
SE = sqrt{ p * ( 1 - p ) * [ (1/n1) + (1/n2) ] }
SE = sqrt[ 0.467 * 0.533 * ( 1/100 + 1/200 ) ]
SE = sqrt [0.003733] = 0.061
z = (p1 - p2) / SE = (0.38 - 0.51)/0.061 = -2.13
Here p1 is the sample proportion in sample 1, p2 is the sample proportion in sample 2, n1 is the size of sample 1 and n2 is the size of sample 2.
Since we have a one-tailed test, the P-value is the probability that the z-score is less than -2.13. We use the normal distribution calculator to find P(z < -2.13) = 0.017. Therefore, the P-value = 0.017.
Interpret the results. Since the P-value (0.017) is greater than the significance level (0.01), we cannot reject the null hypothesis.
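The one-tailed version of the test can be sketched the same way. This is a minimal illustration using only the Python standard library; the variable names are chosen here and are not part of the original notes. The only change from the two-tailed case is that the P-value is the area in the left tail alone, matching the alternative hypothesis P1 < P2.

```python
from math import sqrt
from statistics import NormalDist

p1, n1 = 0.38, 100   # women: sample proportion with a cold, sample size
p2, n2 = 0.51, 200   # men: sample proportion with a cold, sample size
alpha = 0.01         # significance level stated in the problem

# Pooled proportion and standard error, as in the worked solution.
pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se

# Left-tailed P-value for the alternative hypothesis P1 < P2.
p_value = NormalDist().cdf(z)
reject_null = p_value < alpha

print(f"z = {z:.2f}, P = {p_value:.3f}, reject H0: {reject_null}")
```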
4. Write the steps to calculate a CI for the difference between two population proportions.
Ans:
1. Determine the confidence level and find the appropriate z*-value.
Refer to the above table.
2. Find the sample proportion p1 for the first sample by taking the total number from the first sample that are in the category of interest and dividing by the sample size, n1. Similarly, find p2 for the second sample.
3. Take the difference between the sample proportions, p1 - p2.
4. Find p1 * (1 - p1) and divide that by n1. Find p2 * (1 - p2) and divide that by n2. Add these two results together and take the square root.
5. Multiply z* times the result from Step 4. This step gives you the margin of error.
6. Take p1 - p2 plus or minus the margin of error from Step 5 to obtain the CI. The lower end of the CI is p1 - p2 minus the margin of error, and the upper end of the CI is p1 - p2 plus the margin of error.
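The six steps can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `diff_proportion_ci` is hypothetical, the z*-value is computed from the standard normal distribution rather than read from a table, and the counts (38 of 100 and 102 of 200) are taken from the Acme drug example purely as sample input.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 from counts x1 of n1 and x2 of n2."""
    # Step 1: z*-value for the chosen confidence level.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Step 2: sample proportions.
    p1, p2 = x1 / n1, x2 / n2
    # Step 3: difference between the sample proportions.
    diff = p1 - p2
    # Step 4: unpooled standard error of the difference.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Step 5: margin of error.
    margin = z_star * se
    # Step 6: lower and upper ends of the CI.
    return diff - margin, diff + margin


lower, upper = diff_proportion_ci(38, 100, 102, 200)
print(f"95% CI for p1 - p2: ({lower:.4f}, {upper:.4f})")
```

Note that the CI uses the unpooled standard error from Step 4, whereas the hypothesis tests earlier use the pooled proportion, so the two standard errors differ slightly.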
5. Eleven students were given a test in statistics. They were given a month's further tuition, and a second test of equal difficulty was held at the end of it. Do the marks give evidence that the students have benefitted from the extra coaching?
Boys | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
Marks I test | 23 | 20 | 19 | 21 | 18 | 20 | 18 | 17 | 23 | 16 | 19 |
Marks II test | 24 | 19 | 22 | 18 | 20 | 22 | 20 | 20 | 23 | 20 | 17 |
Sol. We compute the mean and the S.D. of the differences between the marks of the two tests as under:
Students | x1 (test I) | x2 (test II) | d = x2 - x1 | d - d̄ | (d - d̄)^2 |
1 | 23 | 24 | 1 | 0 | 0 |
2 | 20 | 19 | -1 | -2 | 4 |
3 | 19 | 22 | 3 | 2 | 4 |
4 | 21 | 18 | -3 | -4 | 16 |
5 | 18 | 20 | 2 | 1 | 1 |
6 | 20 | 22 | 2 | 1 | 1 |
7 | 18 | 20 | 2 | 1 | 1 |
8 | 17 | 20 | 3 | 2 | 4 |
9 | 23 | 23 | 0 | -1 | 1 |
10 | 16 | 20 | 4 | 3 | 9 |
11 | 19 | 17 | -2 | -3 | 9 |
Total | | | Σd = 11 | | Σ(d - d̄)^2 = 50 |
d̄ = Σd / n = 11/11 = 1
s^2 = Σ(d - d̄)^2 / (n - 1) = 50/10 = 5, so s = 2.24
Assuming that the students have not benefitted from the extra coaching, the mean of the differences between the marks of the two tests is zero, i.e. μ = 0.
Then t = (d̄ - μ) * sqrt(n) / s = (1 - 0) * sqrt(11) / 2.24 = 1.48 nearly, with degrees of freedom v = 11 - 1 = 10.
From table IV, we find that t0.05 (for v = 10) = 2.228. As the calculated value of t is less than t0.05, the value of t is not significant at the 5% level of significance, i.e. the test provides no evidence that the students have benefitted from the extra coaching.
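The paired t-test above can be reproduced with a few lines of Python. This sketch uses only the standard library and compares the statistic against the tabulated critical value 2.228 quoted in the solution, rather than computing a P-value; the variable names are chosen here for illustration.

```python
from math import sqrt

test1 = [23, 20, 19, 21, 18, 20, 18, 17, 23, 16, 19]  # marks in test I
test2 = [24, 19, 22, 18, 20, 22, 20, 20, 23, 20, 17]  # marks in test II

# Differences d = (test II) - (test I) for each student.
d = [after - before for before, after in zip(test1, test2)]
n = len(d)
d_mean = sum(d) / n

# Sample standard deviation of the differences (divisor n - 1).
s = sqrt(sum((x - d_mean) ** 2 for x in d) / (n - 1))

# t statistic under the null hypothesis that the mean difference is zero.
t = (d_mean - 0) * sqrt(n) / s

T_CRIT = 2.228  # tabulated value for v = 10 at the 5% level, as quoted in the solution
significant = abs(t) > T_CRIT

print(f"mean d = {d_mean:.2f}, s = {s:.2f}, t = {t:.2f}, significant: {significant}")
```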
6. From a random sample of 10 pigs fed on diet A, the increases in weight in a certain period were 10, 6, 16, 17, 13, 12, 8, 14, 15, 9 lbs. For another random sample of 12 pigs fed on diet B, the increases in the same period were 7, 13, 22, 15, 12, 14, 18, 8, 21, 23, 10, 17 lbs. Test whether diets A and B differ significantly as regards their effect on increases in weight.
Sol. We calculate the means and standard deviations of the samples as follows:
Diet A: x | x - x̄ | (x - x̄)^2 | Diet B: y | y - ȳ | (y - ȳ)^2 |
10 | -2 | 4 | 7 | -8 | 64 |
6 | -6 | 36 | 13 | -2 | 4 |
16 | 4 | 16 | 22 | 7 | 49 |
17 | 5 | 25 | 15 | 0 | 0 |
13 | 1 | 1 | 12 | -3 | 9 |
12 | 0 | 0 | 14 | -1 | 1 |
8 | -4 | 16 | 18 | 3 | 9 |
14 | 2 | 4 | 8 | -7 | 49 |
15 | 3 | 9 | 21 | 6 | 36 |
9 | -3 | 9 | 23 | 8 | 64 |
 | | | 10 | -5 | 25 |
 | | | 17 | 2 | 4 |
120 | 0 | 120 | 180 | 0 | 314 |
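From the table, x̄ = 120/10 = 12 and ȳ = 180/12 = 15, and the pooled variance is
s^2 = [Σ(x - x̄)^2 + Σ(y - ȳ)^2] / (n1 + n2 - 2) = (120 + 314) / 20 = 21.7, so s = 4.66
Assuming that the samples do not differ in weight so far as the two diets are concerned, i.e. μ1 = μ2,
t = (x̄ - ȳ) / (s * sqrt(1/n1 + 1/n2)) = (12 - 15) / (4.66 * sqrt(1/10 + 1/12)) = -1.5 nearly
For v = 20, we find t0.05 = 2.09.
The calculated value of |t| is less than t0.05.
Hence the difference between the sample means is not significant, i.e. the two diets do not differ significantly as regards their effects on increase in weight.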
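The two-sample comparison can be checked with the following Python sketch. It assumes a pooled-variance (equal-variance) t-test with v = n1 + n2 - 2 degrees of freedom, which is what the quoted v = 20 implies, and it compares against the tabulated value 2.09 from the solution. The helper `sum_sq_dev` is a hypothetical name introduced for this example.

```python
from math import sqrt

diet_a = [10, 6, 16, 17, 13, 12, 8, 14, 15, 9]
diet_b = [7, 13, 22, 15, 12, 14, 18, 8, 21, 23, 10, 17]


def sum_sq_dev(values):
    """Sum of squared deviations of the values from their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


n1, n2 = len(diet_a), len(diet_b)
mean_a = sum(diet_a) / n1
mean_b = sum(diet_b) / n2

# Pooled standard deviation with n1 + n2 - 2 degrees of freedom.
df = n1 + n2 - 2
s = sqrt((sum_sq_dev(diet_a) + sum_sq_dev(diet_b)) / df)

# t statistic under the null hypothesis of equal population means.
t = (mean_a - mean_b) / (s * sqrt(1 / n1 + 1 / n2))

T_CRIT = 2.09  # tabulated value for v = 20 at the 5% level, as quoted in the solution
significant = abs(t) > T_CRIT

print(f"means: {mean_a:.1f}, {mean_b:.1f}; s = {s:.2f}; t = {t:.2f}; significant: {significant}")
```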