UNIT – 2
Statistical Average
Mean
Merits of Mean
1) Arithmetic mean rigidly defined by Algebraic Formula.
2) It is easy to calculate and simple to understand.
3) It is based on all observations of the given data.
4) It is capable of being treated mathematically hence it is widely used in statistical analysis.
5) Arithmetic mean can be computed even if the derailed distribution is not known but some of the observation and number of the observation are known.
6) It is least affected by the fluctuation of sampling.
7) For every kind of data mean can be calculated.
Demerits of Arithmetic mean :
Computation of sample mean -
If X1, X2, ………………Xn are data values then arithmetic mean is given by
Computation of the mean for ungrouped data
Example 1 – The marks obtained in 10 class test are 25, 10, 15, 30, 35
The mean = X = 25+10+15+30+35 = 115 =23
5 5
Analysis – The average performance of 5 students is 23. The implication is that students who got below 23 did not perform well. The students who got above 23 performed well in exam.
Example 2 – Find the mean
Xi | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
Freq (Fi) | 2 | 5 | 12 | 17 | 14 | 6 | 3 |
Xi | Freq (Fi) | XiFi |
9 | 2 | 18 |
10 | 5 | 50 |
11 | 12 | 132 |
12 | 17 | 204 |
13 | 14 | 182 |
14 | 6 | 84 |
15 | 3 | 45 |
| Fi = 59 | XiFi= 715 |
|
|
|
Then, N = ∑ fi = 59, and ∑fi Xi=715
X = 715/59 = 12.11
Mean for grouped data/ Weighted Arithmetic Mean
Grouped data are the data that are arranged in a frequency distribution
Frequency distribution is the arrangement of scores according to category of classes including the frequency.
Frequency is the number of observations falling in a category
The formula in solving the mean for grouped data is called midpoint method. The formula is
Where, X = Mean
Xm = midpoint of each class or category
f = frequency in each class or category
∑f Xm = summation of the product of fXm
Example 3 – the following data represent the income distribution of 100 families. Calculate mean income of 100 families?
Income | 30-40 | 40-50 | 50-60 | 60-70 | 70-80 | 80-90 | 90-100 |
No. of families | 8 | 12 | 25 | 22 | 16 | 11 | 6 |
Solution:
Income | No. of families | Xm (Mid point) | fXm |
30-40 | 8 | 35 | 280 |
40-50 | 12 | 34 | 408 |
50-60 | 25 | 55 | 1375 |
60-70 | 22 | 65 | 1430 |
70-80 | 16 | 75 | 1200 |
80-90 | 11 | 85 | 935 |
90-100 | 6 | 95 | 570 |
| n = 100 |
| ∑f Xm = 6198 |
X = ∑f Xm/n = 6330/100 = 63.30
Mean = 63.30
Example 4 – Calculate the mean number of hours per week spent by each student in texting message.
Time per week | 0 - 5 | 5 - 10 | 10 - 15 | 15 - 20 | 20 - 25 | 25 – 30 |
No. of students | 8 | 11 | 15 | 12 | 9 | 5 |
Solution:
Time per week (X) | No. of students (F) | Mid point X | XF |
0 - 5 | 8 | 2.5 | 20 |
5 – 10 | 11 | 7.5 | 82.5 |
10 - 15 | 15 | 12.5 | 187.5 |
15 - 20 | 12 | 17.5 | 210 |
20 - 25 | 9 | 22.5 | 202.5 |
25 – 30 | 5 | 27.5 | 137.5 |
| 60 |
| 840 |
Mean = 840/60 = 14
Example 5 –
The following table of grouped data represents the weights (in pounds) of all 100 babies born at a local hospital last year.
Weight (pounds) | Number of Babies |
[3−5) | 8 |
[5−7) | 25 |
[7−9) | 45 |
[9−11) | 18 |
[11−13) | 4 |
Solution:
Weight (pounds) | Number of Babies | Mid point X | XF |
[3−5) | 8 | 4 | 32 |
[5−7) | 25 | 6 | 150 |
[7−9) | 45 | 8 | 360 |
[9−11) | 18 | 10 | 180 |
[11−13) | 4 | 12 | 48 |
| 100 |
| 770 |
Mean = 770/100 = 7.7
Geometric mean
Geometric mean is a type of mean or average, which indicates the central tendency of a set of numbers by using the product of their values.
Definition
The Geometric Mean (G.M) of a series containing n observations is the nth root of the product of the values.
For ungrouped data
Geometric Mean, GM = Antilog ∑logxi
N
Merits of Geometric Mean
Demerits of Geometric Mean
Example 1 – find the G.M of the values
X | Log X |
45 | 1.653 |
60 | 1.778 |
48 | 1.681 |
65 | 1.813 |
Total | 6.925 |
GM = Antilog ∑logxi
N
= Antilog 6.925/4
= Antilog 1.73
= 53.82
For grouped data
Geometric Mean, GM = Antilog ∑ f logxi
N
Example 2 – calculate the geometric mean
X | f |
60 – 80 | 22 |
80 – 100 | 38 |
100 – 120 | 45 |
120 – 140 | 35 |
|
|
Solution
X | f | Mid X | Log X | f log X |
60 – 80 | 22 | 70 | 1.845 | 40.59 |
80 – 100 | 38 | 90 | 1.954 | 74.25 |
100 – 120 | 45 | 110 | 2.041 | 91.85 |
120 – 140 | 35 | 130 | 2.114 | 73.99 |
Total | 140 |
|
| 280.68 |
GM = Antilog ∑ f logxi
N
= antilog 280.68/140
= antilog 2.00
GM = 100
Example 3 – calculate geometric mean
class | frequency |
2-4 | 3 |
4-6 | 4 |
6-8 | 2 |
8-10 | 1 |
Solution
class | frequency | x | Log x | flogx |
2-4 | 3 | 3 | 1.0986 | 3.2958 |
4-6 | 4 | 5 | 1.2875 | 6.4378 |
6-8 | 2 | 7 | 0.5559 | 3.8918 |
8-10 | 1 | 9 | 0.2441 | 2.1972 |
| 10 |
|
| 15.8226 |
GM = Antilog ∑ f logxi
N
= antilog 15.8226/10
= antilog 1.5823
GM = 4.866
Harmonic mean
Harmonic mean is quotient of “number of the given values” and “sum of the reciprocals of the given values
For ungrouped data
Merits of Harmonic Mean
Demerits of Harmonic Mean
Example 1 - Calculate the harmonic mean of the numbers 13.2, 14.2, 14.8, 15.2 and 16.1
Solution
X | 1/X |
13.2 | 0.0758 |
14.2 | 0.0704 |
14.8 | 0.0676 |
15.2 | 0.0658 |
16.1 | 0.0621 |
Total | 0.3147 |
H.M of X = 5/0.3147 = 15.88
Example 2 - Find the harmonic mean of the following data {8, 9, 6, 11, 10, 5} ?
X | 1/X |
8 | 0.125 |
9 | 0.111 |
6 | 0.167 |
11 | 0.091 |
10 | 0.100 |
5 | 0.200 |
Total | 0.794 |
H.M of X = 6/0.794 = 7.560
For grouped data
Example 3 - Calculate the harmonic mean for the below data
Marks | 30-39 | 40-49 | 50-59 | 60-69 | 70-79 | 80-89 | 90-99 |
F | 2 | 3 | 11 | 20 | 32 | 25 | 7 |
Solution
Marks | X | F | F/X |
30-39 | 34.5 | 2 | 0.0580 |
40-49 | 44.5 | 3 | 0.0674 |
50-59 | 54.4 | 11 | 0.2018 |
60-69 | 64.5 | 20 | 0.3101 |
70-79 | 74.5 | 32 | 0.4295 |
80-89 | 84.5 | 25 | 0.2959 |
90-99 | 94.5 | 7 | 0.0741 |
Total |
| 100 | 1.4368 |
HM = 100/1.4368 = 69.59
Example 4 – find the harmonic mean of the given class
Ages | 4 | 5 | 6 | 7 |
No. of students | 6 | 4 | 10 | 9 |
Solution
X | F | f/x |
4 | 6 | 1.50 |
5 | 4 | 0.80 |
6 | 10 | 1.67 |
7 | 9 | 1.29 |
| 29.00 | 5.25 |
HM = 29/5.25 = 5.5
Example 5 – calculate harmonic mean
class | frequency |
2-4 | 3 |
4-6 | 4 |
6-8 | 2 |
8-10 | 1 |
Solution
class | frequency | x | f/x |
2-4 | 3 | 3 | 1 |
4-6 | 4 | 5 | 0.8 |
6-8 | 2 | 7 | 0.28 |
8-10 | 1 | 9 | 0.11 |
| 10 |
| 2.19 |
Harmonic mean = 10/2.19 = 4.55
Merits of mean
Demerits of mean
Key Takeaways:
Mode
The mode is denoted Mo, is the value which occurs most frequently in a set of values.
Croxton and Cowden defined it as “the mode of a distribution is the value at the point armed with the item tends to most heavily concentrated. It may be regarded as the most typical of a series of value”
Mode for ungrouped data
Example 1- Find the mode of scores of section A
Scores = 25, 24, 24, 20, 17, 18, 10, 18, 9, 7
Solution – Mode is 24, 18 as both have occurred twice.
Mode for grouped data
Mode = L1 + (L2 – L1) d1
d1 +d2
L1= lower limit of the modal class,
L2= upper limit of the modal class‟
d1 =fm-f0 and d2=fm-f1
Where fm= frequency of the modal class,
f0 = frequency of the class preceding to the modal class,
f1= frequency of the class succeeding to the modal class.
Example 2 – Find the mode
Seconds | Frequency |
51 - 55 | 2 |
56 - 60 | 7 |
61 - 65 | 8 |
66 - 70 | 4 |
The group with the highest frequency is the modal group: - 61-65
D1 = 8-7 = 1
D2 = 8-4 = 4
Mode = L1 + (L2 – L1) d1
d1 +d2
mode = 61 + (65-61) 1 = 61+4 (1/5) = 61.8
1+4
Mode = 61.8
Example 3 - In a class of 30 students marks obtained by students in science out of 50 is tabulated below. Calculate the mode of the given data.
Marks obtained | No. of students |
10 -20 | 5 |
20 – 30 | 12 |
30 – 40 | 8 |
40 - 50 | 5 |
Solution:
The group with the highest frequency is the modal group: - 20 -30
D1 = 12 - 5 = 7
D2 = 12 - 8 = 4
Mode = L1 + (L2 – L1) d1
d1 +d2
mode = 20 + (30-20) 7 = 20+10 (7/11) = 26.36
7+4
Mode = 61.8
Example 4- Based on the group data below, find the mode
Time to travel to work | frequency |
1 – 10 | 8 |
11 -20 | 14 |
21 – 30 | 12 |
31 – 40 | 9 |
41 - 50 | 7 |
Solution:
The group with the highest frequency is the modal group: - 11 - 20
D1 = 14 - 8 = 6
D2 = 14 - 12 = 2
Mode = L1 + (L2 – L1) d1
d1 +d2
mode = 11 + (20-11) 6 = 11+9 (6/8) = 17.75
6+2
Example 5 –
Compute the mode from the following frequency distribution
CI | F |
70-71 | 2 |
68-69 | 2 |
66-67 | 3 |
64-65 | 4 |
62-63 | 6 |
60-61 | 7 |
58-59 | 5 |
Solution:
The group with the highest frequency is the modal group: - 60 - 61
D1 = 7 - 6 = 1
D2 = 7 - 5 = 2
Mode = L1 + (L2 – L1) d1
d1 +d2
mode = 60 + (61-60) 1 = 60+1 (1/3) 60.85
1+2
Merits of mode
Demerits of mode
Median
For ungrouped data median is calculated by (n+1)th value
2
Example 1 – find the median score of 7 students in science class
Score = 19, 17, 16, 15, 12, 11, 10
Median = (7+1)/2 = 4th value
Median = 15
Find the median score of 8 students in science class
Score = 19, 17, 16, 15, 12, 11, 10, 9
Median = (8+1)/2 = 4.5th value
Median = (15+12)/2 = 13.5
Example 2 – find the median of the table given below
Marks obtained | No. of students |
20 | 6 |
25 | 20 |
28 | 24 |
29 | 28 |
33 | 15 |
38 | 4 |
42 | 2 |
43 | 1 |
Solution:
Marks obtained | No. of students | cf |
20 | 6 | 6 |
25 | 20 | 26 (20+6) |
28 | 24 | 50 (26+24) |
29 | 28 | 78 |
33 | 15 | 93 |
38 | 4 | 97 |
42 | 2 | 99 |
43 | 1 | 100 |
Median = (n+1)/2 = 100+1/2 = 50.5
Median = (28+29)/2 = 28.5
Median of grouped data
Formula
MC = median class is a category containing the n/2
Lb = lower boundary of the median class
Cfp = cumulative frequency before the median class if the scores are arranged from lowest to highest value
Fm = frequency of the median class
c.i = size of the class interval
Ex- calculate the median
Example 3-
Calculate the median
Marks | No. of students |
0-4 | 2 |
5-9 | 8 |
10-14 | 14 |
15-19 | 17 |
20-24 | 9 |
Solution:
Marks | No. of students | CF |
0-4 | 2 | 2 |
5-9 | 8 | 10 |
10-14 | 14 | 24 |
15-19 | 17 | 41 |
20-24 | 9 | 50 |
| 50 |
|
n = 50
n = 50/2= 25
2
The category containing n/2 is 15 -19
Lb = 15
Cfp = 24
f = 17
ci = 4
Median = 15 + 25-24 *4 = 15.23
17
Example 4 - Given the below frequency table calculate median
X | 60 – 70 | 70 – 80 | 80- 90 | 90-100 |
F | 4 | 5 | 6 | 7 |
Solution:
X | F | CF |
60 - 70 | 4 | 4 |
70 - 80 | 5 | 9 |
80 - 90 | 6 | 15 |
90 - 100 | 7 | 22 |
n = 22
n = 22/2= 11
2
The category containing n+1/2 is 80 - 90
Lb = 80
Cfp = 9
f = 6
ci = 10
Median = 80 + 11-9 *10 = 83.33
6
Example 5– calculate the median of grouped data
Class interval | 1-3 | 3-5 | 5-7 | 7-9 | 9-11 | 11-13 |
Frequency | 4 | 12 | 13 | 19 | 7 | 5 |
Solution:
CI | F | CF |
1-3 | 4 | 4 |
3-5 | 12 | 16 |
5-7 | 13 | 29 |
7-9 | 19 | 48 |
9-11 | 7 | 55 |
11-13 | 5 | 60 |
n = 60
n = 60/2= 30
2
The category containing n+1/2 is 7-9
Lb = 7
Cfp = 29
f = 19
ci = 2
Median = 7 + 30-29 *2 = 7.105
19
Merits of median
Demerits of median
Key Takeaways:
Quartiles
There are three quartiles, i.e., Q1, Q2 and Q3 which divide the total data into four equal parts when it has been orderly arranged. Q1, Q2 and Q3 are termed as first quartile, second quartile and third quartile or lower quartile, middle quartile and upper quartile, respectively. The first quartile, Q1, separates the first one-fourth of the data from the upper three fourths and is equal to the 25th percentile. The second quartile, Q2, divides the data into two equal parts (like median) and is equal to the 50th percentile. The third quartile, Q3, separates the first three-quarters of the data from the last quarter and is equal to 75th percentile.
Calculation of Quartiles:
The calculation of quartiles is done exactly in the same manner as it is in case of the calculation of median.
The different quartiles can be found using the formula given below:
Qi = l1 + i= 1,2,3
Where,
L1 = lower limit of ith quartile class
L2 = upper limit of ith quartile class
c = cumulative frequency of the class preceding the ith quartile class
f = frequency of ith quartile class.
Deciles
Deciles are the partition values which divide the arranged data into ten equal parts. There are nine deciles i.e. D1, D2, D3……. D9 and 5th decile is same as median or Q2, because it divides the data in two equal parts.
Calculation of Deciles:
The calculation of deciles is done exactly in the same manner as it is in case of calculation of median.
The different deciles can be found using the formula given below:
Di = l1 + i= 1,2,3….9
Where,
l1 = lower limit of ith quartile class
l2 = upper limit of ithquartile class
c = cumulative frequency of the class preceding the ithquartile class
f = frequency of ith quartile class.
Percentile
Percentiles are the values which divide the arranged data into hundred equal parts. There are 99 percentiles i.e. P1, P2, P3, ……. P99.
The 50th percentile divides the series into two equal parts and P50 = D5 = Median.
Similarly, the value of Q1 = P25 and value of Q3 = P75
Calculation of Percentiles:
The different percentiles can be found using the formula given below:
pi = l1 + i= 1,2,3…………….99
Where,
l1 = lower limit of ith quartile class
l2 = upper limit of ithquartile class
c = cumulative frequency of the class preceding the ithquartile class
f = frequency of ith quartile class.
Day | Frequency |
1 | 20 |
2 | 35 |
3 | 25 |
4 | 12 |
5 | 10 |
6 | 23 |
7 | 18 |
8 | 14 |
9 | 30 |
10 | 40 |
ANS
Arrange the frequency data in ascending order
Day | Frequency |
1 | 10 |
2 | 12 |
3 | 14 |
4 | 18 |
5 | 20 |
6 | 23 |
7 | 25 |
8 | 30 |
9 | 35 |
10 | 40 |
First quartile (Q1)
Qi= [i * (n + 1) /4] th observation
Q1= [1 * (10 + 1) /4] th observation
Q1 = 2.75 th observation
Thus, 2.75 th observation lies between the 2nd and 3rd value in the ordered group, between frequency 12 and 14
First quartile (Q1) is calculated as
Q1 = 2nd observation +0.75 * (3rd observation - 2nd observation)
Q1 = 12 + 0.75 * (14 – 12) = 13.50
Third quartile (Q3)
Qi= [i * (n + 1) /4] th observation
Q3= [3 * (10 + 1) /4] th observation
Q3 = 8.25 th observation
So, 8.25 th observation lies between the 8th and 9th value in the ordered group, between frequency 30 and 35
Third quartile (Q3) is calculated as
Q3 = 8th observation +0.25 * (9th observation – 8th observation)
Q3 = 30 + 0.25 * (35 – 30) = 31.25
2. Calculate Q1, D7 and P20 from the following data:
3, 13, 11, 11, 5, 4, 2
ANS
Arranging observations in the ascending order we get
2, 3, 4, 5, 11, 11, 13
Here, n = 7
Q1 = ()th value of the observation
= ()th Value of the observation
= 2nd Value of the observation
= 3
D3 = ()th value of the observation
= ()th value of the observation
= (2.4)th Value of the observation
= 2nd observation + 0.4 (3rd – 2nd)
= 3 + 0.4(4 – 3)
= 3 + 0.4
= 3.4
P20 = ()th value of the observation
= ()th value of the observation
= (1.6)th value of the observation
= 1st observation + 0.6 (2nd – 1st )
= 2 + 0.6(3 – 2)
= 2 + 0.6
= 2.6
3. Calculate P20 from the following data:
Class | 2 - 4 | 4 - 6 | 6 - 8 | 8 - 10 |
Frequency | 3 | 4 | 2 | 1 |
ANS
In the case of Frequency Distribution, Percentiles can be calculated by using the formula:
pi = l1 +
Class interval | F | CF |
2 - 4 | 3 | 3 |
4 - 6 | 4 | 7 |
6 - 8 | 2 | 9 |
8 - 10 | 1 | 10 |
Total | n = 10 |
|
Here n = 10
Class with th value of the observation in CF column
= th value of the observation in CF column
= 2th value of the observation in CF column and it lies in the class 2 - 4
Therefore, P20 class is 2 – 4
The lower boundary point of 2 – 4 is 2.
Therefore, L = 2
P20 = L +
= 2 + x 2
= 2 + 1.3333
= 3.3333
4. Calculate D7 from the following data:
Class | 2 - 4 | 4 - 6 | 6 - 8 | 8 - 10 |
Frequency | 3 | 4 | 2 | 1 |
ANS
In the case of Frequency Distribution, Deciles can be calculated by using the formula:
Di = l1 +
Class interval | F | CF |
2 - 4 | 3 | 3 |
4 - 6 | 4 | 7 |
6 - 8 | 2 | 9 |
8 - 10 | 1 | 10 |
Total | n = 10 |
|
Here n = 10
Class with th value of the observation in CF column
= th value of the observation in CF column
= 7th value of the observation in CF column and it lies in the class 6 – 8
Therefore, D7 class is 6 – 8
The lower boundary point of 6 – 8 is 6.
Therefore, L = 6
D7 = L +
= 6 + x 2
= 6 + 0
= 6
Weighted average
Weighted average is a means of determining the average of a set of values by assigning weightage to each value in relation to their relative importance/significance.
The formula of weighted average can be expressed as follows:
Weighted average = (Total of x1w1+ x2w2+x3w3…..+xnwn)/(Total of w1 +w2+w3….+wn)
where;
For example, let’s continue the same example as above. While ABC Inc has 5 different items in its stock, they are present in different quantities in stock. The quantities in which they are present will become the ‘weights’ and thus a weighted average calculation would be more accurate to calculate the average price of its stock. The weighted average can be calculated as below:
Sr No | Stock item name | Cost price (in USD) | Quantity in stock (in kg) | Weighted cost |
1 | Copper | 10 | 9 | 90 |
2 | Brass | 12 | 22 | 264 |
3 | Iron | 9 | 31 | 279 |
4 | Aluminium | 20 | 11 | 220 |
5 | Plastic | 8 | 27 | 216 |
| Total |
| 100 | 1069 |
| Weighted average (Weighted cost/total of weights) |
|
| 10.69 |
As can be seen, the weighted average cost arrived at differs from the simple average cost due to the introduction of quantities in the data set.
Weighted average method is opted for when values in a set are all not of equal importance. Thus, by taking into consideration the relative importance of each value, the weighted average method seeks to equate all the values comprised in the set. This makes weighted average method a more complex but more accurate calculation method than simple average.
Key Takeaways:
REFERENCE