2.1 Arithmetic Geometric and harmonic means | unit 2 statistical average

Back to Study material

UNIT – 2

Statistical Average

2.1 Arithmetic, Geometric and harmonic means

Mean

The mean is the arithmetic average, also called as arithmetic mean.

Mean is very simple to calculate and is most commonly used measure of the center of data.

Means is calculated by adding up all the values and divided by the number of observations.

Merits of Mean

1) Arithmetic mean rigidly defined by Algebraic Formula.

2) It is easy to calculate and simple to understand.

3) It is based on all observations of the given data.

4) It is capable of being treated mathematically hence it is widely used in statistical analysis.

5) Arithmetic mean can be computed even if the derailed distribution is not known but some of the observation and number of the observation are known.

6) It is least affected by the fluctuation of sampling.

7) For every kind of data mean can be calculated.

Demerits of Arithmetic mean :

It can neither be determined by inspection or by graphical location.

Arithmetic mean cannot be computed for qualitative data like data on intelligence honesty and smoking habit etc.

It is too much affected by extreme observations and hence it is not adequately represent data consisting of some extreme point.

Arithmetic mean cannot be computed when class intervals have open ends.

If any one of the data is missing then mean cannot be calculated.

Computation of sample mean -

If X1, X2, ………………Xn are data values then arithmetic mean is given by

Computation of the mean for ungrouped data

Example 1 – The marks obtained in 10 class test are 25, 10, 15, 30, 35

The mean = X = 25+10+15+30+35 = 115 =23

5 5

Analysis – The average performance of 5 students is 23. The implication is that students who got below 23 did not perform well. The students who got above 23 performed well in exam.

Example 2 – Find the mean

Xi	9	10	11	12	13	14	15
Freq (Fi)	2	5	12	17	14	6	3

Xi	Freq (Fi)	XiFi
9	2	18
10	5	50
11	12	132
12	17	204
13	14	182
14	6	84
15	3	45
	Fi = 59	XiFi= 715

Then, N = ∑ fi = 59, and ∑fi Xi=715

X = 715/59 = 12.11

Mean for grouped data/ Weighted Arithmetic Mean

Grouped data are the data that are arranged in a frequency distribution

Frequency distribution is the arrangement of scores according to category of classes including the frequency.

Frequency is the number of observations falling in a category

The formula in solving the mean for grouped data is called midpoint method. The formula is

Where, X = Mean

Xm = midpoint of each class or category

f = frequency in each class or category

∑f Xm = summation of the product of fXm

Example 3 – the following data represent the income distribution of 100 families. Calculate mean income of 100 families?

Income	30-40	40-50	50-60	60-70	70-80	80-90	90-100
No. of families	8	12	25	22	16	11	6

Solution:

Income	No. of families	Xm (Mid point)	fXm
30-40	8	35	280
40-50	12	34	408
50-60	25	55	1375
60-70	22	65	1430
70-80	16	75	1200
80-90	11	85	935
90-100	6	95	570
	n = 100		∑f Xm = 6198

X = ∑f Xm/n = 6330/100 = 63.30

Mean = 63.30

Example 4 – Calculate the mean number of hours per week spent by each student in texting message.

Time per week	0 - 5	5 - 10	10 - 15	15 - 20	20 - 25	25 – 30
No. of students	8	11	15	12	9	5

Solution:

Time per week (X)	No. of students (F)	Mid point X	XF
0 - 5	8	2.5	20
5 – 10	11	7.5	82.5
10 - 15	15	12.5	187.5
15 - 20	12	17.5	210
20 - 25	9	22.5	202.5
25 – 30	5	27.5	137.5
	60		840

Mean = 840/60 = 14

Example 5 –

The following table of grouped data represents the weights (in pounds) of all 100 babies born at a local hospital last year.

Weight (pounds)	Number of Babies
[3−5)	8
[5−7)	25
[7−9)	45
[9−11)	18
[11−13)	4

Solution:

Weight (pounds)	Number of Babies	Mid point X	XF
[3−5)	8	4	32
[5−7)	25	6	150
[7−9)	45	8	360
[9−11)	18	10	180
[11−13)	4	12	48
	100		770

Mean = 770/100 = 7.7

Geometric mean

Geometric mean is a type of mean or average, which indicates the central tendency of a set of numbers by using the product of their values.

Definition

The Geometric Mean (G.M) of a series containing n observations is the nth root of the product of the values.

For ungrouped data

Geometric Mean, GM = Antilog ∑logxi

Merits of Geometric Mean

It is rigidly defined.

It is based on all the items.

It is capable of further algebraic treatment.

It gives less weight to large items and more to small items.

Demerits of Geometric Mean

It is difficult to compute.

It is not easy to understand.

If there are negative values in the series, it cannot be computed.

Example 1 – find the G.M of the values

X	Log X
45	1.653
60	1.778
48	1.681
65	1.813
Total	6.925

GM = Antilog ∑logxi

= Antilog 6.925/4

= Antilog 1.73

= 53.82

For grouped data

Geometric Mean, GM = Antilog ∑ f logxi

Example 2 – calculate the geometric mean

X	f
60 – 80	22
80 – 100	38
100 – 120	45
120 – 140	35

Solution

X	f	Mid X	Log X	f log X
60 – 80	22	70	1.845	40.59
80 – 100	38	90	1.954	74.25
100 – 120	45	110	2.041	91.85
120 – 140	35	130	2.114	73.99
Total	140			280.68

GM = Antilog ∑ f logxi

= antilog 280.68/140

= antilog 2.00

GM = 100

Example 3 – calculate geometric mean

class	frequency
2-4	3
4-6	4
6-8	2
8-10	1

Solution

class	frequency	x	Log x	flogx
2-4	3	3	1.0986	3.2958
4-6	4	5	1.2875	6.4378
6-8	2	7	0.5559	3.8918
8-10	1	9	0.2441	2.1972
	10			15.8226

GM = Antilog ∑ f logxi

= antilog 15.8226/10

= antilog 1.5823

GM = 4.866

Harmonic mean

Harmonic mean is quotient of “number of the given values” and “sum of the reciprocals of the given values

For ungrouped data

Merits of Harmonic Mean

It is rigidly defined.

It is based on all the observations of a series i.e.

It cannot be calculated ignoring any item of a series.

It is capable of further algebraic treatment.

Demerits of Harmonic Mean

It is not easy to understand by a man of ordinary prudence.

Its calculation is cumbersome as it involves finding out of the reciprocals of the numbers.

It does not give better and accurate results when the means adopted are the same for the different ends achieved.

Its algebraic treatment is very much limited and not far and wide as that of the arithmetic mean.

It is greatly affected by the values of the extreme items.

It cannot be calculated, if any, of the items is zero.

Example 1 - Calculate the harmonic mean of the numbers 13.2, 14.2, 14.8, 15.2 and 16.1

Solution

X	1/X
13.2	0.0758
14.2	0.0704
14.8	0.0676
15.2	0.0658
16.1	0.0621
Total	0.3147

H.M of X = 5/0.3147 = 15.88

Example 2 - Find the harmonic mean of the following data {8, 9, 6, 11, 10, 5} ?

X	1/X
8	0.125
9	0.111
6	0.167
11	0.091
10	0.100
5	0.200
Total	0.794

H.M of X = 6/0.794 = 7.560

For grouped data

Example 3 - Calculate the harmonic mean for the below data

Marks	30-39	40-49	50-59	60-69	70-79	80-89	90-99
F	2	3	11	20	32	25	7

Solution

Marks	X	F	F/X
30-39	34.5	2	0.0580
40-49	44.5	3	0.0674
50-59	54.4	11	0.2018
60-69	64.5	20	0.3101
70-79	74.5	32	0.4295
80-89	84.5	25	0.2959
90-99	94.5	7	0.0741
Total		100	1.4368

HM = 100/1.4368 = 69.59

Example 4 – find the harmonic mean of the given class

Ages	4	5	6	7
No. of students	6	4	10	9

Solution

X	F	f/x
4	6	1.50
5	4	0.80
6	10	1.67
7	9	1.29
	29.00	5.25

HM = 29/5.25 = 5.5

Example 5 – calculate harmonic mean

class	frequency
2-4	3
4-6	4
6-8	2
8-10	1

Solution

class	frequency	x	f/x
2-4	3	3	1
4-6	4	5	0.8
6-8	2	7	0.28
8-10	1	9	0.11
	10		2.19

Harmonic mean = 10/2.19 = 4.55

Merits of mean

It is rigidly defined

It is easy to understand and easy to calculate

It is based upon all values of the given data

It is capable of future mathematical treatment

It is not much affected by sampling fluctuation

Demerits of mean

It cannot be calculated if any observation are missing

It cannot be calculated for open end classes

It is effected by extreme values

It cannot be located graphically

It may be number which is not present in the data

Key Takeaways:

The mean is the arithmetic average, also called as arithmetic mean.

The Geometric Mean (G.M) of a series containing n observations is the nth root of the product of the values.

Harmonic mean is quotient of “number of the given values” and “sum of the reciprocals.

2.2 Mode median

Mode

The mode is denoted Mo, is the value which occurs most frequently in a set of values.

Croxton and Cowden defined it as “the mode of a distribution is the value at the point armed with the item tends to most heavily concentrated. It may be regarded as the most typical of a series of value”

Mode for ungrouped data

Example 1- Find the mode of scores of section A

Scores = 25, 24, 24, 20, 17, 18, 10, 18, 9, 7

Solution – Mode is 24, 18 as both have occurred twice.

Mode for grouped data

Mode = L1 + (L2 – L1) d1

d1 +d2

L1= lower limit of the modal class,

L2= upper limit of the modal class‟

d1 =fm-f0 and d2=fm-f1

Where fm= frequency of the modal class,

f0 = frequency of the class preceding to the modal class,

f1= frequency of the class succeeding to the modal class.

Example 2 – Find the mode

Seconds	Frequency
51 - 55	2
56 - 60	7
61 - 65	8
66 - 70	4

The group with the highest frequency is the modal group: - 61-65

D1 = 8-7 = 1

D2 = 8-4 = 4

Mode = L1 + (L2 – L1) d1

d1 +d2

mode = 61 + (65-61) 1 = 61+4 (1/5) = 61.8

1+4

Mode = 61.8

Example 3 - In a class of 30 students marks obtained by students in science out of 50 is tabulated below. Calculate the mode of the given data.

Marks obtained	No. of students
10 -20	5
20 – 30	12
30 – 40	8
40 - 50	5

Solution:

The group with the highest frequency is the modal group: - 20 -30

D1 = 12 - 5 = 7

D2 = 12 - 8 = 4

Mode = L1 + (L2 – L1) d1

d1 +d2

mode = 20 + (30-20) 7 = 20+10 (7/11) = 26.36

7+4

Mode = 61.8

Example 4- Based on the group data below, find the mode

Time to travel to work	frequency
1 – 10	8
11 -20	14
21 – 30	12
31 – 40	9
41 - 50	7

Solution:

The group with the highest frequency is the modal group: - 11 - 20

D1 = 14 - 8 = 6

D2 = 14 - 12 = 2

Mode = L1 + (L2 – L1) d1

d1 +d2

mode = 11 + (20-11) 6 = 11+9 (6/8) = 17.75

6+2

Example 5 –

Compute the mode from the following frequency distribution

CI	F
70-71	2
68-69	2
66-67	3
64-65	4
62-63	6
60-61	7
58-59	5

Solution:

The group with the highest frequency is the modal group: - 60 - 61

D1 = 7 - 6 = 1

D2 = 7 - 5 = 2

Mode = L1 + (L2 – L1) d1

d1 +d2

mode = 60 + (61-60) 1 = 60+1 (1/3) 60.85

1+2

Merits of mode

It is easy to understand & easy to calculate

It is not affected by extreme values or sampling fluctuations.

Even if extreme values are not known mode can be calculated.

It can be located just by inspection in many cases.

It is always present within the data.

Demerits of mode

It is not rigidly defined.

It is not based upon all values of the given data.

It is not capable of further mathematical treatment.

Median

The points or value that divides the data into two equal parts

Firstly , the data are arranged in ascending or descending order .

The median is the middle number depending on the data size.

When the data size is odd, the median is the middle value

When the data size is even, median is the average of the middle two values

It is also known as middle score or 50th percentile

For ungrouped data median is calculated by (n+1)th value

Example 1 – find the median score of 7 students in science class

Score = 19, 17, 16, 15, 12, 11, 10

Median = (7+1)/2 = 4th value

Median = 15

Find the median score of 8 students in science class

Score = 19, 17, 16, 15, 12, 11, 10, 9

Median = (8+1)/2 = 4.5th value

Median = (15+12)/2 = 13.5

Example 2 – find the median of the table given below

Marks obtained	No. of students
20	6
25	20
28	24
29	28
33	15
38	4
42	2
43	1

Solution:

Marks obtained	No. of students	cf
20	6	6
25	20	26 (20+6)
28	24	50 (26+24)
29	28	78
33	15	93
38	4	97
42	2	99
43	1	100

Median = (n+1)/2 = 100+1/2 = 50.5

Median = (28+29)/2 = 28.5

Median of grouped data

Formula

MC = median class is a category containing the n/2

Lb = lower boundary of the median class

Cfp = cumulative frequency before the median class if the scores are arranged from lowest to highest value

Fm = frequency of the median class

c.i = size of the class interval

Ex- calculate the median

Example 3-

Calculate the median

Marks	No. of students
0-4	2
5-9	8
10-14	14
15-19	17
20-24	9

Solution:

Marks	No. of students	CF
0-4	2	2
5-9	8	10
10-14	14	24
15-19	17	41
20-24	9	50
	50

n = 50

n = 50/2= 25

The category containing n/2 is 15 -19

Lb = 15

Cfp = 24

f = 17

ci = 4

Median = 15 + 25-24 *4 = 15.23

Example 4 - Given the below frequency table calculate median

X	60 – 70	70 – 80	80- 90	90-100
F	4	5	6	7

Solution:

X	F	CF
60 - 70	4	4
70 - 80	5	9
80 - 90	6	15
90 - 100	7	22

n = 22

n = 22/2= 11

The category containing n+1/2 is 80 - 90

Lb = 80

Cfp = 9

f = 6

ci = 10

Median = 80 + 11-9 *10 = 83.33

Example 5– calculate the median of grouped data

Class interval	1-3	3-5	5-7	7-9	9-11	11-13
Frequency	4	12	13	19	7	5

Solution:

CI	F	CF
1-3	4	4
3-5	12	16
5-7	13	29
7-9	19	48
9-11	7	55
11-13	5	60

n = 60

n = 60/2= 30

The category containing n+1/2 is 7-9

Lb = 7

Cfp = 29

f = 19

ci = 2

Median = 7 + 30-29 *2 = 7.105

Merits of median

It is rigidly defined

It is easy to understand and easy to calculate

It is not affected by extreme values

It is not much affected by sampling fluctuation

It can be located graphically

Demerits of median

It is not based upon all values of the given data

It is difficult to calculate increasing order data size

It is not capable of further mathematical treatment.

Key Takeaways:

The mode is denoted Mo, is the value which occurs most frequently in a set of values.

The median is the middle number depending on the data size.

2.3 Quartiles and percentiles

Quartiles

There are three quartiles, i.e., Q1, Q2 and Q3 which divide the total data into four equal parts when it has been orderly arranged. Q1, Q2 and Q3 are termed as first quartile, second quartile and third quartile or lower quartile, middle quartile and upper quartile, respectively. The first quartile, Q1, separates the first one-fourth of the data from the upper three fourths and is equal to the 25th percentile. The second quartile, Q2, divides the data into two equal parts (like median) and is equal to the 50th percentile. The third quartile, Q3, separates the first three-quarters of the data from the last quarter and is equal to 75th percentile.

Calculation of Quartiles:

The calculation of quartiles is done exactly in the same manner as it is in case of the calculation of median.

The different quartiles can be found using the formula given below:

Qi = l1 + i= 1,2,3

Where,

L1 = lower limit of ith quartile class

L2 = upper limit of ith quartile class

c = cumulative frequency of the class preceding the ith quartile class

f = frequency of ith quartile class.

Deciles

Deciles are the partition values which divide the arranged data into ten equal parts. There are nine deciles i.e. D1, D2, D3……. D9 and 5th decile is same as median or Q2, because it divides the data in two equal parts.

Calculation of Deciles:

The calculation of deciles is done exactly in the same manner as it is in case of calculation of median.

The different deciles can be found using the formula given below:

Di = l1 + i= 1,2,3….9

Where,

l1 = lower limit of ith quartile class

l2 = upper limit of ithquartile class

c = cumulative frequency of the class preceding the ithquartile class

f = frequency of ith quartile class.

Percentile

Percentiles are the values which divide the arranged data into hundred equal parts. There are 99 percentiles i.e. P1, P2, P3, ……. P99.

The 50th percentile divides the series into two equal parts and P50 = D5 = Median.

Similarly, the value of Q1 = P25 and value of Q3 = P75

Calculation of Percentiles:

The different percentiles can be found using the formula given below:

pi = l1 + i= 1,2,3…………….99

Where,

l1 = lower limit of ith quartile class

l2 = upper limit of ithquartile class

c = cumulative frequency of the class preceding the ithquartile class

f = frequency of ith quartile class.

Calculate Q1, Q2 and Q3 from the following data given below:

Day	Frequency
1	20
2	35
3	25
4	12
5	10
6	23
7	18
8	14
9	30
10	40

ANS

Arrange the frequency data in ascending order

Day	Frequency
1	10
2	12
3	14
4	18
5	20
6	23
7	25
8	30
9	35
10	40

First quartile (Q1)

Qi= [i * (n + 1) /4] th observation

Q1= [1 * (10 + 1) /4] th observation

Q1 = 2.75 th observation

Thus, 2.75 th observation lies between the 2nd and 3rd value in the ordered group, between frequency 12 and 14

First quartile (Q1) is calculated as

Q1 = 2nd observation +0.75 * (3rd observation - 2nd observation)

Q1 = 12 + 0.75 * (14 – 12) = 13.50

Third quartile (Q3)

Qi= [i * (n + 1) /4] th observation

Q3= [3 * (10 + 1) /4] th observation

Q3 = 8.25 th observation

So, 8.25 th observation lies between the 8th and 9th value in the ordered group, between frequency 30 and 35

Third quartile (Q3) is calculated as

Q3 = 8th observation +0.25 * (9th observation – 8th observation)

Q3 = 30 + 0.25 * (35 – 30) = 31.25

2. Calculate Q1, D7 and P20 from the following data:

3, 13, 11, 11, 5, 4, 2

ANS

Arranging observations in the ascending order we get

2, 3, 4, 5, 11, 11, 13

Here, n = 7

Q1 = ()th value of the observation

= ()th Value of the observation

= 2nd Value of the observation

= 3

D3 = ()th value of the observation

= ()th value of the observation

= (2.4)th Value of the observation

= 2nd observation + 0.4 (3rd – 2nd)

= 3 + 0.4(4 – 3)

= 3 + 0.4

= 3.4

P20 = ()th value of the observation

= ()th value of the observation

= (1.6)th value of the observation

= 1st observation + 0.6 (2nd – 1st )

= 2 + 0.6(3 – 2)

= 2 + 0.6

= 2.6

3. Calculate P20 from the following data:

Class	2 - 4	4 - 6	6 - 8	8 - 10
Frequency	3	4	2	1

ANS

In the case of Frequency Distribution, Percentiles can be calculated by using the formula:

pi = l1 +

Class interval	F	CF
2 - 4	3	3
4 - 6	4	7
6 - 8	2	9
8 - 10	1	10
Total	n = 10

Here n = 10

Class with th value of the observation in CF column

= th value of the observation in CF column

= 2th value of the observation in CF column and it lies in the class 2 - 4

Therefore, P20 class is 2 – 4

The lower boundary point of 2 – 4 is 2.

Therefore, L = 2

P20 = L +

= 2 + x 2

= 2 + 1.3333

= 3.3333

4. Calculate D7 from the following data:

Class	2 - 4	4 - 6	6 - 8	8 - 10
Frequency	3	4	2	1

ANS

In the case of Frequency Distribution, Deciles can be calculated by using the formula:

Di = l1 +

Class interval	F	CF
2 - 4	3	3
4 - 6	4	7
6 - 8	2	9
8 - 10	1	10
Total	n = 10

Here n = 10

Class with th value of the observation in CF column

= th value of the observation in CF column

= 7th value of the observation in CF column and it lies in the class 6 – 8

Therefore, D7 class is 6 – 8

The lower boundary point of 6 – 8 is 6.

Therefore, L = 6

D7 = L +

= 6 + x 2

= 6 + 0

= 6

2.4 Simple and weighted averages

Weighted average

Weighted average is a means of determining the average of a set of values by assigning weightage to each value in relation to their relative importance/significance.

The formula of weighted average can be expressed as follows:

Weighted average = (Total of x1w1+ x2w2+x3w3…..+xnwn)/(Total of w1 +w2+w3….+wn)

where;

x = values in the set

w = weightage of each value in the set

n = number of values in the set

For example, let’s continue the same example as above. While ABC Inc has 5 different items in its stock, they are present in different quantities in stock. The quantities in which they are present will become the ‘weights’ and thus a weighted average calculation would be more accurate to calculate the average price of its stock. The weighted average can be calculated as below:

Sr No	Stock item name	Cost price (in USD) [A]	Quantity in stock (in kg) [B]	Weighted cost [A * B]
1	Copper	10	9	90
2	Brass	12	22	264
3	Iron	9	31	279
4	Aluminium	20	11	220
5	Plastic	8	27	216
	Total		100	1069
	Weighted average (Weighted cost/total of weights)			10.69

As can be seen, the weighted average cost arrived at differs from the simple average cost due to the introduction of quantities in the data set.

Weighted average method is opted for when values in a set are all not of equal importance. Thus, by taking into consideration the relative importance of each value, the weighted average method seeks to equate all the values comprised in the set. This makes weighted average method a more complex but more accurate calculation method than simple average.

Key Takeaways:

There are three quartiles, i.e., Q1, Q2 and Q3 which divide the total data into four equal parts when it has been orderly arranged.

Percentiles are the values which divide the arranged data into hundred equal parts. There are 99 percentiles i.e., P1, P2, P3, ……. P99.

REFERENCE

B.N Gupta – Statistics

S.P Singh – statistics

Gupta and Kapoor – Statistics

Yule and Kendall – Statistics method

Sign Up

Index

Notes

Highlighted

Underlined

Browse by Topics

Notes

Highlighted

Underlined