UNIT 3

Correlation

3.1 Correlation – Meaning, Application, types and degree of correlation, methods – Karl pearson’s coefficient of correlation, spearman’s rank coefficient of correlation

Correlation describes the linear relationship between two continuous variables (e.g., height and weight). In general, correlation is used when neither variable has been identified as the response variable. The correlation coefficient measures the strength and direction of the linear relationship between the variables.

Definition

“Correlation analysis deals with the association between two or more variables.” —Simpson and Kafka

“Correlation is an analysis of the co-variation between two variables.” —A.M. Tuttle

Uses and types

Uses –

1. Prognosis: The coefficient of correlation is widely used in prediction. For example, it can be used to predict how successful a student is likely to be in later stages of education on the basis of performance in earlier studies.

2. Reliability: The coefficient of correlation is often used to test reliability, that is, to check whether a test measures the same thing on two successive occasions.

3. Validity: A test’s validity can be assessed through correlation. Whenever a test is constructed, it must be checked that the test measures what it claims to measure.

4. Test Construction: The coefficient of correlation is also used in test construction. Whenever a new test is constructed, the questions arise whether each element of the test is related to the other elements or to the test as a whole, and whether each element is related to the chosen criteria. These relationships are all examined through correlation.

Types

Correlation measures the nature and strength of the relationship between two variables. The correlation coefficient lies between −1 and +1. A correlation of +1 indicates a perfect positive correlation between two variables. A zero correlation indicates that there is no linear relationship between the variables. A correlation of −1 indicates a perfect negative correlation.

Degree of Correlation and its Nature

1. Perfect correlation: If two variables change in the same proportion (increase or decrease), then the correlation between them is perfect. Perfect correlation can be positive or negative.

a) Coefficient of correlation (r) = 1: If there is perfect positive relationship between two variables, then the value of correlation will be +1.

b) Coefficient of correlation (r) = −1: If there is perfect negative relationship between two variables, then the value of correlation will be −1.

2. Zero correlation: The correlation is said to be zero when two variables have no linear relationship between them. It implies that a change in the value of one variable is not accompanied by any systematic change in the value of the other variable.

a) Coefficient of correlation (r) = 0: If there is no relationship between the two variables, then the value of correlation will be zero. However, it does not imply that these two variables are independent. It only indicates non-existence of linear relation between the two variables.

3. Limited degree of correlation: A limited degree of correlation exists between perfect correlation and zero correlation, i.e. the value of the coefficient of correlation lies between +1 and −1. This limited degree of correlation may be high, moderate or low.

a) High degree of correlation: Correlation of two series of data is closer to one.

b) Medium degree of correlation: Correlation of two series of data is neither large nor small.

c) Low degree of correlation: Correlation of two series of data is small.

Key takeaways –

The degrees of correlation are perfect, zero and limited.

Karl Pearson’s Coefficient of Correlation

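It is a widely used mathematical method for calculating the degree and direction of the relationship between linearly related variables. The coefficient of correlation is denoted by “r”.

Direct method –

r = Σ(X − X̄)(Y − Ȳ) / [√Σ(X − X̄)² × √Σ(Y − Ȳ)²]

Where X̄ and Ȳ are the means of the X and Y series.

Shortcut method –

r = [N Σdxdy − (Σdx)(Σdy)] / [√(N Σdx² − (Σdx)²) × √(N Σdy² − (Σdy)²)]

Where dx and dy are the deviations of X and Y from their assumed means and N is the number of pairs of observations.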

The value of the coefficient of correlation (r) always lies between −1 and +1. For example:

a) r=+1, perfect positive correlation.

b) r=-1, perfect negative correlation.

c) r=0, no correlation.

Example 1 - Compute Pearson’s coefficient of correlation between advertisement cost and sales as per the data given below:

Advertisement cost	39	65	62	90	82	75	25	98	36	78
sales	47	53	58	86	62	68	60	91	51	84

Solution

X	Y	X − X̄	(X − X̄)²	Y − Ȳ	(Y − Ȳ)²	(X − X̄)(Y − Ȳ)
39	47	-26	676	-19	361	494
65	53	0	0	-13	169	0
62	58	-3	9	-8	64	24
90	86	25	625	20	400	500
82	62	17	289	-4	16	-68
75	68	10	100	2	4	20
25	60	-40	1600	-6	36	240
98	91	33	1089	25	625	825
36	51	-29	841	-15	225	435
78	84	13	169	18	324	234
650	660		5398		2224	2704

X̄ = 650/10 = 65, Ȳ = 660/10 = 66

r = 2704/(√5398 × √2224) = 2704/(73.47 × 47.16) = 0.78

Thus advertisement cost and sales are positively correlated.
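The direct method can also be checked with a few lines of Python. This is only a minimal sketch: the function name pearson_r and the variable names are illustrative choices, not part of the original notes.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's r by the direct (deviation-from-mean) method."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]          # X - mean of X
    dy = [y - mean_y for y in ys]          # Y - mean of Y
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    return sum_dxdy / (sqrt(sum_dx2) * sqrt(sum_dy2))

# Data of Example 1 (advertisement cost and sales)
advertisement_cost = [39, 65, 62, 90, 82, 75, 25, 98, 36, 78]
sales = [47, 53, 58, 86, 62, 68, 60, 91, 51, 84]
print(round(pearson_r(advertisement_cost, sales), 2))  # 0.78
```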

Example 2

Compute correlation coefficient from the following data

Hours of sleep (X)	Test scores (Y)
8	81
8	80
6	75
5	65
7	91
6	80

X	Y	X − X̄	(X − X̄)²	Y − Ȳ	(Y − Ȳ)²	(X − X̄)(Y − Ȳ)
8	81	1.3	1.8	2.3	5.4	3.1
8	80	1.3	1.8	1.3	1.8	1.8
6	75	-0.7	0.4	-3.7	13.4	2.4
5	65	-1.7	2.8	-13.7	186.8	22.8
7	91	0.3	0.1	12.3	152.1	4.1
6	80	-0.7	0.4	1.3	1.8	-0.9
40	472		7.33		361.33	33.33

(The totals are computed from the unrounded deviations.)

X̄ = 40/6 = 6.7

Ȳ = 472/6 = 78.7

r = 33.33/(√7.33 × √361.33) = 33.33/(2.71 × 19.01) = 0.65

Thus hours of sleep and test scores are positively correlated.

Example 3

Calculate coefficient of correlation between X and Y series using Karl pearson shortcut method

X	14	12	14	16	16	17	16	15
Y	13	11	10	15	15	9	14	17

Solution

Let assumed mean for X = 15, assumed mean for Y = 14

X	Y	dx	dx2	dy	dy2	dxdy
14	13	-1.0	1.0	-1.0	1.0	1.0
12	11	-3.0	9.0	-3.0	9.0	9.0
14	10	-1.0	1.0	-4.0	16.0	4.0
16	15	1.0	1.0	1.0	1.0	1.0
16	15	1.0	1.0	1.0	1.0	1.0
17	9	2.0	4.0	-5.0	25.0	-10.0
16	14	1	1	0	0	0
15	17	0	0	3	9	0
120	104	0	18	-8	62	6

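r = [N Σdxdy − (Σdx)(Σdy)] / [√(N Σdx² − (Σdx)²) × √(N Σdy² − (Σdy)²)]

r = [8 × 6 − (0)(−8)] / [√(8 × 18 − (0)²) × √(8 × 62 − (−8)²)]

r = 48/(√144 × √432) = 48/(12 × 20.78) = 0.19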
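The shortcut method can be sketched in Python as follows. The function name pearson_r_shortcut and its parameters are hypothetical; the point of the sketch is that the result does not depend on which assumed means are chosen.

```python
from math import sqrt

def pearson_r_shortcut(xs, ys, assumed_x, assumed_y):
    """Karl Pearson's r from deviations taken about assumed means."""
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dx2 = sum(d * d for d in dx)
    sum_dy2 = sum(d * d for d in dy)
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    numerator = n * sum_dxdy - sum_dx * sum_dy
    denominator = sqrt(n * sum_dx2 - sum_dx ** 2) * sqrt(n * sum_dy2 - sum_dy ** 2)
    return numerator / denominator

# Data of Example 3 with assumed means 15 and 14
x_series = [14, 12, 14, 16, 16, 17, 16, 15]
y_series = [13, 11, 10, 15, 15, 9, 14, 17]
print(round(pearson_r_shortcut(x_series, y_series, 15, 14), 2))  # 0.19
```

Any other pair of assumed means (for example 0 and 0) gives the same value of r, which is why the assumed means can be picked purely for convenience.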

Example 4 –

Calculate the coefficient of correlation between the X and Y series using Karl Pearson’s shortcut method

X	1800	1900	2000	2100	2200	2300	2400	2500	2600
Y	5	5	6	9	7	8	6	8	9

Solution

Assumed mean of X and Y is 2200, 6

X	Y	dx	dx (i=100)	dx2	dy	dy2	dxdy
1800	5	-400	-4	16	-1.0	1.0	4.0
1900	5	-300	-3	9	-1.0	1.0	3.0
2000	6	-200	-2	4	0.0	0.0	0.0
2100	9	-100	-1	1	3.0	9.0	-3.0
2200	7	0	0	0	1.0	1.0	0.0
2300	8	100	1	1	2.0	4.0	2.0
2400	6	200	2	4	0	0	0.0
2500	8	300	3	9	2	4	6.0
2600	9	400	4	16	3	9	12.0

			0	60	9	29	24

Note – we can also proceed dividing x/100

r = [9 × 24 − (0)(9)] / [√(9 × 60 − (0)²) × √(9 × 29 − (9)²)]

r = 216/(√540 × √180) = 216/(23.24 × 13.42) = 0.69

Example 5 –

X	28	45	40	38	35	33	40	32	36	33
Y	23	34	33	34	30	26	28	31	36	35

Solution

X	Y	X − X̄	(X − X̄)²	Y − Ȳ	(Y − Ȳ)²	(X − X̄)(Y − Ȳ)
28	23	-8	64	-8.0	64.0	64.0
45	34	9	81	3.0	9.0	27.0
40	33	4	16	2.0	4.0	8.0
38	34	2	4	3.0	9.0	6.0
35	30	-1	1	-1.0	1.0	1.0
33	26	-3	9	-5.0	25.0	15.0
40	28	4	16	-3	9	-12.0
32	31	-4	16	0	0	0.0
36	36	0	0	5	25	0.0
33	35	-3	9	4	16	-12
360	310	0	216	0	162	97

X̄ = 360/10 = 36

Ȳ = 310/10 = 31

r = 97/(√216 × √162) = 97/(14.70 × 12.73) = 0.52

Spearman’s Rank Correlation Coefficient –

Spearman’s Rank Correlation Coefficient is a non-parametric statistical measure used to study the strength of association between two ranked variables. This method is used for ordinal data, that is, values which can be arranged in order.

R = 1 − (6 ΣD²) / [N(N² − 1)]

Where, R = Rank coefficient of correlation

D = Difference of ranks

N = Number of Observations

The Spearman’s Rank Correlation coefficient lies between −1 and +1.

a) +1 indicates perfect positive association between the ranks

b) 0 indicates no association between the ranks

c) −1 indicates perfect negative association between the ranks

When ranks are not given – rank the values by taking either the highest value or the lowest value as 1.

Equal ranks or tie in ranks – in this case ranks are assigned on an average basis. For example, if three students obtain the same score and occupy the 5th, 6th and 7th ranks, each one of them is assigned the rank (5 + 6 + 7)/3 = 6.

If two individuals are ranked equal at the third position, then the rank is calculated as (3 + 4)/2 = 3.5.

Example 1 –

Test 1	8	7	9	5	1
Test 2	10	8	7	4	5

Solution

Here, highest value is taken as 1

Test 1	Test 2	Rank T1	Rank T2	d	d2
8	10	2	1	1	1
7	8	3	2	1	1
9	7	1	3	-2	4
5	4	4	5	-1	1
1	5	5	4	1	1
					8

R = 1 − (6 × 8)/[5(5² − 1)] = 1 − 48/120 = 0.60

Example 2 -

Calculate Spearman rank-order correlation

English	56	75	45	71	62	64	58	80	76	61
Maths	66	70	40	60	65	56	59	77	67	63

Solution

Rank by taking the highest value or the lowest value as 1.

Here, highest value is taken as 1

English	Maths	Rank (English)	Rank (Math)	d	d2
56	66	9	4	5	25
75	70	3	2	1	1
45	40	10	10	0	0
71	60	4	7	-3	9
62	65	6	5	1	1
64	56	5	9	-4	16
58	59	8	8	0	0
80	77	1	1	0	0
76	67	2	3	-1	1
61	63	7	6	1	1
					54

R = 1 − (6 × 54)/[10(10² − 1)] = 1 − 324/990

R = 0.67

Therefore, this indicates a strong positive relationship between the ranks individuals obtained in the Maths and English exams.

Example 3 –

Find Spearman's rank correlation coefficient between X and Y for this set of data:

X	13	20	22	18	19	11	10	15
Y	17	19	23	16	20	10	11	18

Solution

Here, lowest value is taken as 1

X	Y	Rank X	Rank Y	d	d2
13	17	3	4	-1	1
20	19	7	6	1	1
22	23	8	8	0	0
18	16	5	3	2	4
19	20	6	7	-1	1
11	10	2	1	1	1
10	11	1	2	-1	1
15	18	4	5	-1	1
					10

R = 1 − (6 × 10)/[8(8² − 1)] = 1 − 60/504 = 0.88

Example 4 – Calculation of equal ranks or tie ranks

Find Spearman's rank correlation coefficient:

Commerce	15	20	28	12	40	60	20	80
Science	40	30	50	30	20	10	30	60

Solution

C	S	Rank C	Rank S	d	d2
15	40	2	6	-4	16
20	30	3.5	4	-0.5	0.25
28	50	5	7	-2	4
12	30	1	4	-3	9
40	20	6	2	4	16
60	10	7	1	6	36
20	30	3.5	4	-0.5	0.25
80	60	8	8	0	0
					81.5

R = 1 − (6 × 81.5)/[8(8² − 1)] = 1 − 489/504 = 0.03

Example 5 –

X	10	15	11	14	16	20	10	8	7	9
Y	16	16	24	18	22	24	14	10	12	14

Solution

X	Y	Rank X	Rank Y	d	d2
10	16	6.5	5.5	1	1
15	16	3	5.5	-2.5	6.25
11	24	5	1.5	3.5	12.25
14	18	4	4	0	0
16	22	2	3	-1	1
20	24	1	1.5	-0.5	0.25
10	14	6.5	7.5	-1	1
8	10	9	10	-1	1
7	12	10	9	1	1
9	14	8	7.5	0.5	0.25
					24

R = 1 − (6 × 24)/[10(10² − 1)] = 1 − 144/990 = 0.85

The correlation between X and Y is positive and very high.
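The ranking rule for ties and the rank formula can be sketched in Python as below. The helper names average_ranks and spearman_r, and the highest_first parameter, are assumptions made for illustration. The sketch applies the simple formula R = 1 − 6Σd²/[N(N² − 1)] used in these examples, without any further correction for ties.

```python
def average_ranks(values, highest_first=True):
    """Rank the values; tied values share the average of the ranks they occupy."""
    ordered = sorted(values, reverse=highest_first)
    ranks = []
    for v in values:
        first_position = ordered.index(v) + 1   # first rank held by this value
        tie_count = ordered.count(v)            # number of equal values
        ranks.append(first_position + (tie_count - 1) / 2)
    return ranks

def spearman_r(xs, ys, highest_first=True):
    """Spearman's rank correlation from the sum of squared rank differences."""
    n = len(xs)
    rank_x = average_ranks(xs, highest_first)
    rank_y = average_ranks(ys, highest_first)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

# Data of Example 5 (contains tied values)
x_values = [10, 15, 11, 14, 16, 20, 10, 8, 7, 9]
y_values = [16, 16, 24, 18, 22, 24, 14, 10, 12, 14]
print(average_ranks(x_values))                    # tied 10s both get rank 6.5
print(round(spearman_r(x_values, y_values), 2))   # 0.85
```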

Key takeaways - Correlation is used to describe the linear relationship between two continuous variables

3.2 Regression analysis – meaning, importance, simple regression equation

Regression analysis is a technique for studying the dependence of one variable, called the dependent variable, on one or more other variables, called explanatory (independent) variables, with a view to estimating or predicting the average value of the dependent variable in terms of the known or fixed values of the independent variables.

Regression analysis includes several variations, such as linear, multiple linear, and nonlinear. The most common models are simple linear and multiple linear.

Nonlinear regression analysis is commonly used for more complicated data sets in which the dependent and independent variables show a nonlinear relationship.

Linear model assumption -

The dependent and independent variables show a linear relationship, described by a slope and an intercept.

The independent variable is not random.

The expected value of the residual (error) is zero.

The variance of the residual (error) is constant across all observations.

The residual (error) values are not correlated across observations.

The residual (error) values follow the normal distribution.

Importance

Regression analysis is a statistical technique used to evaluate the relationship between two or more variables. It helps an organisation understand what its data points represent and use them, together with business analytical techniques, to make better decisions. The analysis shows how the typical value of the dependent variable changes when one of the independent variables is varied while the other independent variables are held fixed. Business analysts and data professionals use this statistical tool to remove unwanted variables and select the important ones.

Simple linear regression

Simple linear regression is a model that assesses the relationship between a dependent variable and an independent variable.

Y = a + bX + ϵ

Where:

Y – Dependent variable

X – Independent (explanatory) variable

a – Intercept

b – Slope

ϵ – Residual (error)

With the help of the simple linear regression model we have the following two regression lines:

1. Regression line of Y on X: This line gives the probable value of Y (Dependent variable) for any given value of X (Independent variable).

Regression line of Y on X : Y – Ȳ = byx (X – X̄)

OR : Y = a + bX

2. Regression line of X on Y: This line gives the probable value of X (Dependent variable) for any given value of Y (Independent variable).

Regression line of X on Y : X – X̄ = bxy (Y – Ȳ)

OR : X = a + bY

Multiple linear regression –

Multiple linear regression analysis is essentially similar to the simple linear model, with the exception that multiple independent variables are used in the model.

Y = a + bX1 + cX2 + dX3 + ϵ

Where:

Y – Dependent variable

X1, X2, X3 – Independent (explanatory) variables

a – Intercept

b, c, d – Slopes

ϵ – Residual (error)

Example

How to find a linear regression equation

Subject	X	Y
1	43	99
2	21	65
3	25	79
4	42	75
5	57	87
6	59	81

Solution

Subject	X	Y	Xy	X2	Y2
1	43	99	4257	1849	9801
2	21	65	1365	441	4225
3	25	79	1975	625	6241
4	42	75	3150	1764	5625
5	57	87	4959	3249	7569
6	59	81	4779	3481	6561
Total	247	486	20485	11409	40022

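To find a and b, use the following equations:

a = [(ΣY)(ΣX²) − (ΣX)(ΣXY)] / [n(ΣX²) − (ΣX)²]

b = [n(ΣXY) − (ΣX)(ΣY)] / [n(ΣX²) − (ΣX)²]

Find a:

a = [(486 × 11,409) − (247 × 20,485)] / [6(11,409) − 247²]

= (5,544,774 − 5,059,795) / (68,454 − 61,009)

= 484,979 / 7,445

= 65.14

Find b:

b = [6(20,485) − (247 × 486)] / [6(11,409) − 247²]

= (122,910 − 120,042) / (68,454 − 61,009)

= 2,868 / 7,445

= 0.385225

y’ = a + bx

y’ = 65.14 + 0.385225x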
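The same calculation can be sketched in Python. The function name fit_line is a hypothetical choice; it computes the intercept and slope from the column totals exactly as in the worked solution.

```python
def fit_line(xs, ys):
    """Return (a, b) of the least-squares line y' = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denominator = n * sum_x2 - sum_x ** 2
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator   # intercept
    b = (n * sum_xy - sum_x * sum_y) / denominator        # slope
    return a, b

# Data of the first regression example
x_data = [43, 21, 25, 42, 57, 59]
y_data = [99, 65, 79, 75, 87, 81]
a, b = fit_line(x_data, y_data)
print(round(a, 2), round(b, 4))  # 65.14 0.3852
```

Once a and b are known, a predicted value for any x is simply a + b * x.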

Example

Calculate linear regression analysis

students	X	Y
1	95	85
2	85	95
3	80	70
4	70	65
5	60	70

Solution

students	X	Y	X2	y2	xy
1	95	85	9025	7225	8075
2	85	95	7225	9025	8075
3	80	70	6400	4900	5600
4	70	65	4900	4225	4550
5	60	70	3600	4900	4200
total	390	385	31150	30275	30500

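To find a and b, use the following equations:

Find a:

a = [(ΣY)(ΣX²) − (ΣX)(ΣXY)] / [n(ΣX²) − (ΣX)²]

= [(385 × 31,150) − (390 × 30,500)] / [5(31,150) − 390²]

= (11,992,750 − 11,895,000) / (155,750 − 152,100)

= 97,750 / 3,650

= 26.78

Find b:

b = [n(ΣXY) − (ΣX)(ΣY)] / [n(ΣX²) − (ΣX)²]

= [5(30,500) − (390 × 385)] / [5(31,150) − 390²]

= 2,350 / 3,650

= 0.64

y’ = a + bx

y’ = 26.78 + 0.64x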

Key takeaways - Regression analysis includes several variations, such as linear, multiple linear, and nonlinear. The most common models are simple linear and multiple linear

3.3 Standard error estimates

The standard error is one of the mathematical tools used in statistics to estimate the variability. It is abbreviated as SE. The standard error of a statistic or an estimate of a parameter is the standard deviation of its sampling distribution. We can define it as an estimate of that standard deviation.

Standard Error Formula

The SE formula indicates how accurately a sample describes a population. The sample mean deviates from the population mean, and the size of that deviation is estimated as:

SE = S/√n

Where S is the standard deviation and n is the number of observations.

The standard error of the estimate measures the accuracy of predictions. It is denoted as SEE. The regression line minimizes the sum of squared deviations of prediction, which is also known as the sum of squares error. SEE is the square root of the average squared deviation of the observed values from the predicted values.

How to calculate Standard Error

Step 1: Note the number of measurements (n) and determine the sample mean (μ). It is the average of all the measurements.

Step 2: Determine how much each measurement varies from the mean.

Step 3: Square all the deviations determined in step 2 and add altogether: Σ(xi – μ)²

Step 4: Divide the sum from step 3 by one less than the total number of measurements (n-1).

Step 5: Take the square root of the obtained number, which is the standard deviation (σ).

Step 6: Finally, divide the standard deviation obtained by the square root of the number of measurements (n) to get the standard error of your estimate.

Calculate the standard error of the given data:

y: 5, 10, 12, 15, 20

Solution: First we have to find the mean of the given data;

Mean = (5+10+12+15+20)/5 = 62/5 = 12.4

Now, the standard deviation can be calculated as;

S = √[Σ(yi – mean)² / (n – 1)]

Hence,

Σ(yi – mean)² = (–7.4)² + (–2.4)² + (–0.4)² + (2.6)² + (7.6)² = 54.76 + 5.76 + 0.16 + 6.76 + 57.76 = 125.2

After solving the above equation, we get;

S = √(125.2/4) = √31.3 = 5.59

Therefore, SE can be estimated with the formula;

SE = S/√n

SE = 5.59/√5 = 2.50
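Steps 1 to 6 listed under “How to calculate Standard Error” can be followed literally in Python. This is a minimal sketch; the function name standard_error is illustrative, and it uses the sample standard deviation (division by n − 1) as stated in Step 4.

```python
from math import sqrt

def standard_error(values):
    """Standard error of the mean, following Steps 1 to 6."""
    n = len(values)                                   # Step 1: number of measurements
    mean = sum(values) / n                            # Step 1: sample mean
    deviations = [v - mean for v in values]           # Step 2: deviation of each value
    sum_squares = sum(d * d for d in deviations)      # Step 3: sum of squared deviations
    variance = sum_squares / (n - 1)                  # Step 4: divide by n - 1
    std_dev = sqrt(variance)                          # Step 5: standard deviation
    return std_dev / sqrt(n)                          # Step 6: divide by sqrt(n)

y = [5, 10, 12, 15, 20]
print(round(standard_error(y), 2))  # 2.5
```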

Key takeaways - The standard error of the estimate is the estimation of the accuracy of any predictions. It is denoted as SEE.

3.4 Index numbers- meaning, types and uses, methods of constructing price index number, fixed base method, chain base method, fishers’s ideal index number, reversibility test – time and factor

The value of money does not remain same over the time. A rise in the price levels means a fall in the value of money and a fall in the price level means a rise in the value of money. Thus index number is a statistical device that measures the relative change in the level of price from one time period to another.

Definition

“Index numbers are quantitative measures of growth of prices, production, inventory and other quantities of economic interest” – Ronold

An index number measures how much a variable changes over the time. Index number is calculated by finding the ratio of current value to a base value.

Uses of index number

a) Index numbers are specialized averages.

b) Index numbers measure the change in one variable or a group of variables.

c) Index numbers measure the effect of changes over a period of time.

d) Index numbers are used to study changes in the effects of factors which cannot be measured directly.

Types of index number

1. Wholesale Price Index Numbers:

Wholesale price index numbers are constructed on the basis of the wholesale prices of certain important commodities. The commodities included in preparing these index numbers are mainly raw-materials and semi-finished goods. Only the most important and most price-sensitive and semi- finished goods which are bought and sold in the wholesale market are selected and weights are assigned in accordance with their relative importance.

2. Retail Price Index Numbers:

These index numbers are prepared to measure the changes in the value of money on the basis of the retail prices of final consumption goods. The main difficulty with this index number is that retail prices for the same goods over continuous periods are often not available. Retail prices show larger and more frequent fluctuations than wholesale prices.

3. Cost-of-Living Index Numbers:

These index numbers are constructed with reference to the important goods and services which are consumed by common people. Since the number of these goods and services is very large, only representative items which form the consumption pattern of the people are included. These index numbers are used to measure changes in the cost of living of the general public.

4. Working Class Cost-of-Living Index Numbers:

The working class cost-of-living index numbers aim at measuring changes in the cost of living of workers. These index numbers are constructed on the basis of only those goods and services which are generally consumed by the working class. They are of great importance to the workers because their wages are adjusted according to these indices.

5. Wage Index Numbers:

The purpose of these index numbers is to measure time to time changes in money wages. These index numbers, when compared with the working class cost-of-living index numbers, provide information regarding the changes in the real wages of the workers.

6. Industrial Index Numbers:

Industrial index numbers are constructed with an objective of measuring changes in the industrial production. The production data of various industries are included in preparing these index numbers.

Methods of constructing price index number, fixed base method, chain base method, fishers’s ideal index number

Chain index numbers – In the fixed base method the base remains constant throughout, i.e. the relatives for all the years are based on the price of a single year. In the chain base method, on the other hand, the relative for each year is found from the price of the immediately preceding year, so the base changes from year to year. Such index numbers are useful for comparing current year figures with the preceding year figures. The relatives found by this method are called link relatives.

Thus link relative for current year = current years figure/previous year figure *100

And by using these link relatives we can find the chain indices for each year by using the below formula

Chain index for current year = Link relative of current year * Chain index of previous year/ 100

Note: The fixed base index number computed from the original data and chain index number computed from link relatives give the same value of the index provided that there is only one commodity, whose indices are being constructed.

Example 1

From the following data of wholesale prices of wheat for ten years construct index number taking a) 1998 as base and b) by chain base method

Example 2

From the following data calculate the index numbers using the Chain Index Numbers method.

Year 2011 2012 2013 2014 2015 2016 2017 2018

Prices 120 124 130 144 150 160 164 170

Solution

Construction of Chain Index Numbers

Year	Price	Link Relatives	Chain indices
2011	120	100	100
2012	124	124/120 × 100 = 103.33	103.33 × 100/100 = 103.33
2013	130	130/124 × 100 = 104.84	104.84 × 103.33/100 = 108.33
2014	144	144/130 × 100 = 110.77	110.77 × 108.33/100 = 120.00
2015	150	150/144 × 100 = 104.17	104.17 × 120.00/100 = 125.00
2016	160	160/150 × 100 = 106.67	106.67 × 125.00/100 = 133.34
2017	164	164/160 × 100 = 102.50	102.50 × 133.34/100 = 136.67
2018	170	170/164 × 100 = 103.66	103.66 × 136.67/100 = 141.67
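The link relative and chain index formulas can be sketched in Python as follows. The function name chain_indices is hypothetical. The sketch carries full precision between steps, so its values can differ in the second decimal place from a hand calculation that rounds at every step.

```python
def chain_indices(prices):
    """Return (link_relatives, chain_indices) with the first year set to 100."""
    link_relatives = [100.0]
    chain = [100.0]
    for previous, current in zip(prices, prices[1:]):
        link = current / previous * 100          # link relative of the current year
        link_relatives.append(link)
        chain.append(link * chain[-1] / 100)     # chain index of the current year
    return link_relatives, chain

# Data of Example 2 (prices for 2011 to 2018)
prices = [120, 124, 130, 144, 150, 160, 164, 170]
links, chain = chain_indices(prices)
print(round(links[1], 2))    # 103.33
print(round(chain[-1], 2))   # 141.67
```

With a single commodity, each chain index equals the fixed base index price/120 × 100, which agrees with the note above the examples.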

Example 3

Compute the chain base index numbers

Solution

Fixed base method – under this method index number is calculated with a fixed base year. By this method the index number of a given year is not influenced by the variation of prices of any other year.

Price relative of current year = (price of current year / price of base year) × 100

Example 1

Find index numbers for the following data taking 1980 as the base year.

Year	1980	1981	1982	1983	1984	1985	1986	1987
Price	40	50	60	70	80	100	90	110

Solution

Construction of price index numbers

1. Simple aggregative method – in this method, the index number is equal to the sum of prices in the current year as a percentage of the sum of prices in the base year.

P01 = (ΣP1 / ΣP0) × 100

Where, P01 = Index number

ΣP1 = Total of the current year’s prices of all commodities

ΣP0 = Total of the base year’s prices of all commodities

Example 1 –

Commodity	Price in base year 2005	Price in current year 2010
A	10	20
B	15	25
C	40	60
D	25	40

Solution

Commodity	Price in base year 2005	Price in current year 2010
A	10	20
B	15	25
C	40	60
D	25	40
Total	90	145

Index number (P01) = (ΣP1 / ΣP0) × 100

P01 = (145/90) × 100 = 161.11

It means that prices in 2010 were 61.11% higher than prices in 2005.

Example 2

Find the index number from the data given below

Commodities	Units	Price in 2007	Price in 2008
Sugar	Quintal	2200	3200
Milk	Quintal	18	20
Oil	Liter	68	71
Wheat	Quintal	900	1000
Clothing	Meter	50	60

Solution

Commodities	Units	Price in 2007	Price in 2008
Sugar	Quintal	2200	3200
Milk	Quintal	18	20
Oil	Liter	68	71
Wheat	Quintal	900	1000
Clothing	Meter	50	60
Total		3236	4351

Index number (P01) = (ΣP1 / ΣP0) × 100

P01 = (4351/3236) × 100 = 134.46

It means that prices in 2008 were 34.46% higher than prices in 2007.

Example 3 –

Construct the price index for 2003, taking the year 2000 as base year

Commodities	Price in 2000	Price in 2003
A	60	80
B	50	60
C	70	100
D	120	160
E	100	150

Solution

Commodities	Price in 2000 - P 0	Price in 2003 - P 1
A	60	80
B	50	60
C	70	100
D	120	160
E	100	150
Total	400	550

Index number (P01) = (ΣP1 / ΣP0) × 100

P01 = (550/400) × 100 = 137.5

Therefore there is an increase of 37.5% in the prices in 2003 as against 2000.

Example 4-

Compute the price index for the years 2001, 2002, 2003, 2004 taking 2000 as base year

Year	2000	2001	2002	2003	2004
Price	120	144	168	204	216

Solution

Price index for different years

2000	(120/120)*100 = 100
2001	(144/120)*100 = 120
2002	(168/120)*100 = 140
2003	(204/120)*100 = 170
2004	(216/120)*100 = 180
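The fixed base calculation can be sketched in a few lines of Python. The function name fixed_base_indices and the base_position parameter are illustrative assumptions.

```python
def fixed_base_indices(prices, base_position=0):
    """Price index of every year relative to one fixed base year."""
    base_price = prices[base_position]
    return [price / base_price * 100 for price in prices]

# Data of Example 4 (2000 to 2004, base year 2000)
yearly_prices = [120, 144, 168, 204, 216]
print([round(v, 2) for v in fixed_base_indices(yearly_prices)])
# [100.0, 120.0, 140.0, 170.0, 180.0]
```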

Example 5 –

Prepare simple aggregative price index

Commodities	Price in 1995 - P 0	Price in 2003 - P 1
Wheat	100	140
Rice	200	250
Pulses	250	350
Sugar	14	20
Oil	40	50

Solution

Commodities	Price in 1995 - P 0	Price in 2003 - P 1
Wheat	100	140
Rice	200	250
Pulses	250	350
Sugar	14	20
Oil	40	50
	= 604	= 810

Simple aggregative index number = (810/604)*100 = 134.1
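As a quick check, the simple aggregative method reduces to one line of Python. The function name simple_aggregative_index is a hypothetical choice.

```python
def simple_aggregative_index(base_prices, current_prices):
    """Sum of current year prices as a percentage of the sum of base year prices."""
    return sum(current_prices) / sum(base_prices) * 100

# Data of Example 5 (wheat, rice, pulses, sugar, oil)
prices_1995 = [100, 200, 250, 14, 40]
prices_2003 = [140, 250, 350, 20, 50]
print(round(simple_aggregative_index(prices_1995, prices_2003), 1))  # 134.1
```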

2. Simple average of relatives method – in this method, the index number is equal to the sum of price relatives divided by the number of items.

P01 = Σ[(P1 / P0) × 100] / N

Where, N = number of items

Example 1 –

Commodity	Base year	Current year
A	10	20
B	15	25
C	40	60
D	25	40

Solution

Commodity	Base year	Current year	Price relatives
A	10	20	(20/10)*100 = 200
B	15	25	(25/15)*100 =166.7
C	40	60	(60/40)*100 =150
D	25	40	(40/25)*100 =160
N = 4			= 676.7

Index number = 676.7/4 = 169.2

Example 2 –

Construct the index number for the year 2010

Commodities	Price (2009)	Price(2010)
P	6	10
Q	12	2
R	4	6
S	10	12
T	8	12

Solution

Commodities	Price (2009)	Price(2010)	Price relative
P	6	10	166.67
Q	12	2	16.67
R	4	6	150
S	10	12	120
T	8	12	150
N = 5			603.34

Index number = 603.34/5 = 120.67

Example 3 –

Using simple average of price relative method find price index for 2001, taking 1996 as base year for the following data

Commodity	Wheat	Rice	Sugar	Ghee	Tea
Price in 1996	12	20	12	40	80
Price in 2001	16	25	16	60	96

Solution

Commodities	Price (1996)	Price (2001)	Price relative
Wheat	12	16	(16/12)*100 = 133.33
Rice	20	25	(25/20)*100 = 125
Sugar	12	16	133.33
Ghee	40	60	150
Tea	80	96	120
N = 5			661.66

Index number = 661.66/5 = 132.33

Therefore, the price index for 2001, taking 1996 as base year, is 132.33.

Example 4 –

Using simple average of price relative method find price index for 2010, taking 2009 as base year for the following data

Commodities	Price (2009)	Price(2010)
A	60	80
B	50	60
C	60	72
D	50	75
E	25	37.5
F	20	30

Solution

Commodities	Price (2009)	Price(2010)	Price relatives
A	60	80	133.33
B	50	60	120
C	60	72	120
D	50	75	150
E	25	37.5	150
F	20	30	150
N = 6			823.33

Index number = 823.33/6 = 137.22
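The simple average of relatives method can be sketched in Python like this. The function name simple_average_of_relatives is illustrative.

```python
def simple_average_of_relatives(base_prices, current_prices):
    """Average of the price relatives (current / base * 100) over all items."""
    relatives = [p1 / p0 * 100 for p0, p1 in zip(base_prices, current_prices)]
    return sum(relatives) / len(relatives)

# Data of Example 4 (commodities A to F)
prices_2009 = [60, 50, 60, 50, 25, 20]
prices_2010 = [80, 60, 72, 75, 37.5, 30]
print(round(simple_average_of_relatives(prices_2009, prices_2010), 2))  # 137.22
```

Unlike the simple aggregative method, this index does not change when the price of one commodity is quoted in a different unit, because each relative is a pure ratio.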

3. Weighted aggregative method – in this method, different weights are assigned to the items according to their relative importance. Many formulas have been developed to estimate index numbers on the basis of weights.

Some of the formulas are given below, where p0 and q0 are the base year price and quantity and p1 and q1 are the current year price and quantity.

Laspeyre’s formula – in this method, the quantities of the base year are accepted as weights

P01 = (Σp1q0 / Σp0q0) × 100

Paasche’s formula – in this method, the quantities of the current year are accepted as weights

P01 = (Σp1q1 / Σp0q1) × 100

Dorbish and Bowley’s formula – this method is the arithmetic mean of Laspeyre’s formula and Paasche’s formula

P01 = [(Σp1q0 / Σp0q0) + (Σp1q1 / Σp0q1)] / 2 × 100

Fisher’s ideal formula – this method is the geometric mean of Laspeyre’s formula and Paasche’s formula

P01 = √[(Σp1q0 / Σp0q0) × (Σp1q1 / Σp0q1)] × 100

Marshall – Edgeworth method – in this method also both the current year as well as base year prices and quantities are considered

P01 = [(Σp1q0 + Σp1q1) / (Σp0q0 + Σp0q1)] × 100

Kelly’s method –

P01 = (Σp1q / Σp0q) × 100

Where q refers to the quantity of some chosen period, which need not be the base year or the current year; for example, it may be the mean of the base year and current year quantities.

Example 1 –

Commodity	Base year		Current year
Commodity	PO	QO	P1	Q1
A	10	5	20	2
B	15	4	25	8
C	40	2	60	6
D	25	3	40	4

Solution

Commodity	Base year		Current year
Commodity	PO	QO	P1	Q1	Poqo	P1qo	Poq1	P1q1
A	10	5	20	2	50	100	20	40
B	15	4	25	8	60	100	120	200
C	40	2	60	6	80	120	240	360
D	25	3	40	4	75	120	100	160
					265	440	480	760

Laspeyre’s formula

P 01 = (440/265) × 100 = 166.04

Paasche’s formula

P 01 = (760/480) × 100 = 158.33

Dorbish and Bowley’s formula

P 01 = [(440/265) + (760/480)] / 2 × 100 = 162.19

Fisher’s ideal formula

P 01 = √[(440/265) × (760/480)] × 100 = 162.14

Example 2

Commodity	Base year		Current year
Commodity	PO	QO	P1	Q1
A	15	500	20	600
B	18	590	23	640
C	22	450	24	500

Solution

Commodity	Base year		Current year
Commodity	PO	QO	P1	Q1	Poqo	P1qo	Poq1	P1q1
A	15	500	20	600	7500	10000	9000	12000
B	18	590	23	640	10620	13570	11520	14720
C	22	450	24	500	9900	10800	11000	12000

					28020	34370	31520	38720

Laspeyre’s formula

P 01 = (34370/28020) × 100 = 122.66

Paasche’s formula

P 01 = (38720/31520) × 100 = 122.84

Dorbish and Bowley’s formula

P 01 = [(34370/28020) + (38720/31520)] / 2 × 100 = 122.75

Fisher’s ideal formula

P 01 = √[(34370/28020) × (38720/31520)] × 100 = 122.75

Example 3

Commodity	Base year		Current year
Commodity	PO	QO	P1	Q1
A	2	8	4	6
B	5	10	6	5
C	4	14	5	10
D	2	19	2	13

Solution

Commodity	Base year		Current year
Commodity	PO	QO	P1	Q1	Poqo	P1qo	Poq1	P1q1
A	2	8	4	6	16	32	12	24
B	5	10	6	5	50	60	25	30
C	4	14	5	10	56	70	40	50
D	2	19	2	13	38	38	26	26
					160	200	103	130

Laspeyre’s formula

P 01 = (200/160) × 100 = 125

Paasche’s formula

P 01 = (130/103) × 100 = 126.21

Dorbish and Bowley’s formula

P 01 = [(200/160) + (130/103)] / 2 × 100 = 125.61

Fisher’s ideal formula

P 01 = √[(200/160) × (130/103)] × 100 = 125.61

Marshall-Edgeworth method

P 01 = (200 + 130)/(160 + 103) × 100 = 125.48
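The five weighted aggregative formulas share the same four column totals, so they can be computed together. The following Python sketch assumes a hypothetical helper named weighted_indices that returns all five indices in a dictionary.

```python
from math import sqrt

def weighted_indices(p0, q0, p1, q1):
    """Weighted aggregative price indices from base (p0, q0) and current (p1, q1) data."""
    p0q0 = sum(p * q for p, q in zip(p0, q0))
    p1q0 = sum(p * q for p, q in zip(p1, q0))
    p0q1 = sum(p * q for p, q in zip(p0, q1))
    p1q1 = sum(p * q for p, q in zip(p1, q1))
    laspeyres = p1q0 / p0q0          # base year quantities as weights
    paasche = p1q1 / p0q1            # current year quantities as weights
    return {
        "laspeyres": laspeyres * 100,
        "paasche": paasche * 100,
        "dorbish_bowley": (laspeyres + paasche) / 2 * 100,     # arithmetic mean
        "fisher": sqrt(laspeyres * paasche) * 100,             # geometric mean
        "marshall_edgeworth": (p1q0 + p1q1) / (p0q0 + p0q1) * 100,
    }

# Data of Example 3 (commodities A to D)
result = weighted_indices(p0=[2, 5, 4, 2], q0=[8, 10, 14, 19],
                          p1=[4, 6, 5, 2], q1=[6, 5, 10, 13])
for name, value in result.items():
    print(name, round(value, 2))
```

Fisher’s index always lies between the Laspeyre’s and Paasche’s values, because a geometric mean lies between the two numbers being averaged.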

Example 4 –

Calculate the price indices from the following data by applying (1) Laspeyre’s method (2) Paasche’s method and (3) Fisher ideal number by taking 2010 as the base year.

Commodity	2010		2011
Commodity	PO	QO	P1	Q1
A	20	10	25	13
B	50	8	60	7
C	35	7	40	6
D	25	5	35	4

Solution

Commodity	2010		2011
Commodity	PO	QO	P1	Q1	Poqo	P1qo	Poq1	P1q1
A	20	10	25	13	200	250	260	325
B	50	8	60	7	400	480	350	420
C	35	7	40	6	245	280	210	240
D	25	5	35	4	125	175	100	140
					970	1185	920	1125

Laspeyre’s formula

P 01 = (1185/970) × 100 = 122.16

Paasche’s formula

P 01 = (1125/920) × 100 = 122.28

Fisher’s ideal formula

P 01 = √[(1185/970) × (1125/920)] × 100 = 122.22

Example 5 –

Calculate the Dorbish and Bowley’s price index number for the following data taking 2010 as base year.

Item	2010		2011
Item	PO	QO	P1	Q1
Oil	80	3	100	4
Pulses	35	2	45	3
Sugar	25	2	30	3
Rice	50	30	54	35

Solution

Item	2010		2011
Item	PO	QO	P1	Q1	Poqo	P1qo	Poq1	P1q1
Oil	80	3	100	4	240	300	320	400
Pulses	35	2	45	3	70	90	105	135
Sugar	25	2	30	3	50	60	75	90
Rice	50	30	54	35	1500	1620	1750	1890
					1860	2070	2250	2515

Dorbish and Bowley’s formula

P 01 = [(2070/1860) + (2515/2250)] / 2 × 100 = 111.53

Example 6 –

Calculate a suitable price index from the following data

Commodity	Quantity	Price in 2007	Price in 2010
X	25	3	4
Y	12	5	7
Z	10	6	5

Solution

commodity	Q	P0	P1	P0Q	P1Q
X	25	3	4	75	100
Y	12	5	7	60	84
Z	10	6	5	60	50
				195	234

Kelly’s price index

= (ΣP1Q / ΣP0Q) × 100 = (234/195) × 100 = 120

4. Weighted average of relatives method – in this method different weights are used for the items according to their relative importance. If P = (p1/p0) × 100 is the price relative and W = p0q0 is the weight attached to the commodity, then

P01 = ΣPW / ΣW

Where, ΣW = sum of the weights of the different commodities

ΣPW = sum of the products of price relatives and weights (the RW column in the examples below)

Example 1 –

Commodity	Weight	Base price year	current price year
A	5	10	20
B	4	15	25
C	2	40	60
D	3	25	40

Solution

Commodity	Weight	Base price year	current price year	price relatives	RW
A	5	10	20	20/10*100 = 200	1000
B	4	15	25	25/15*100 =166.7	666.8
C	2	40	60	60/40*100 = 150	300
D	3	25	40	40/25*100 = 160	480
	14				2446.8

P01 = 2446.8/14 = 174.8

Example 2 – compute price index by applying weighted average of relative method

Commodity	Quantity	Base price year	current price year
Wheat	20	3	4
Flour	40	1.5	1.6
Milk	10	1	1.5

Solution

Commodity	Quantity	Base price year	current price year	Weight	price relatives	RW
Wheat	20	3	4	60	133.3	8000
Flour	40	1.5	1.6	60	106.7	6400
Milk	10	1	1.5	10	150.0	1500

				130		15900

P01 = 15900/130 = 122.31

Example 3 – Calculate weighted average of relative method

Commodity	Base price year	current price year	Weight
x	3	4	7
y	1.5	1.6	8
z	1	1.5	9

Solution

Commodity	Base price year	current price year	Weight	price relatives	RW
x	3	4	7	133.3	933.33
y	1.5	1.6	8	106.7	853.33
z	1	1.5	9	150.0	1350
			24		3136.66

P01 = 3136.66/24 = 130.69
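The weighted average of relatives method can be sketched in Python as follows. The function name weighted_average_of_relatives is illustrative. The sketch keeps full precision for the price relatives, so the last decimal place can differ slightly from a hand calculation that rounds each relative first.

```python
def weighted_average_of_relatives(base_prices, current_prices, weights):
    """Sum of (price relative * weight) divided by the sum of weights."""
    relatives = [p1 / p0 * 100 for p0, p1 in zip(base_prices, current_prices)]
    weighted_total = sum(r * w for r, w in zip(relatives, weights))
    return weighted_total / sum(weights)

# Data of Example 3 (commodities x, y, z with weights 7, 8, 9)
index_example_3 = weighted_average_of_relatives([3, 1.5, 1], [4, 1.6, 1.5], [7, 8, 9])
print(round(index_example_3, 2))  # 130.69
```

When the weights are value weights W = p0q0, as in Example 2, pass the products of base price and quantity as the weights argument.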

Reversibility test – time and factor

Index numbers are studied to know the relative changes in price and quantity for any two years compared. There are two tests which are used to test the adequacy for an index number. The two tests are as follows,

(i) Time Reversal Test

(ii) Factor Reversal Test

The criterion for a good index number is to satisfy the above two tests.

Time Reversal Test

It is an important test for testing the consistency of a good index number. This test maintains time consistency by working both forward and backward with respect to time (here time refers to base year and current year). Symbolically the following relationship should be satisfied, P01 × P10 = 1, where the index numbers are expressed as ratios (not multiplied by 100).

Fisher’s index number formula satisfies the above relationship:

P01 = √[(Σp1q0 / Σp0q0) × (Σp1q1 / Σp0q1)]

When the base year and current year are interchanged, we get

P10 = √[(Σp0q1 / Σp1q1) × (Σp0q0 / Σp1q0)]

so that P01 × P10 = 1.

Factor Reversal Test

This is another test for testing the consistency of a good index number. The product of the price index number and the quantity index number from the base year to the current year should be equal to the true value ratio, that is, the ratio between the total value of the current period and the total value of the base period. Factor Reversal Test is given by,

P01 × Q01 = Σp1q1 / Σp0q0
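Both tests can be checked numerically for Fisher’s ideal index. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the indices are kept as ratios (not multiplied by 100), and the data are those of Example 3 of the weighted aggregative method.

```python
from math import sqrt

def sum_products(a, b):
    return sum(x * y for x, y in zip(a, b))

def fisher_price_ratio(p0, q0, p1, q1):
    """Fisher's ideal price index as a ratio."""
    laspeyres = sum_products(p1, q0) / sum_products(p0, q0)
    paasche = sum_products(p1, q1) / sum_products(p0, q1)
    return sqrt(laspeyres * paasche)

def fisher_quantity_ratio(p0, q0, p1, q1):
    """Fisher's ideal quantity index: the same formula with p and q interchanged."""
    return fisher_price_ratio(q0, p0, q1, p1)

p0, q0 = [2, 5, 4, 2], [8, 10, 14, 19]
p1, q1 = [4, 6, 5, 2], [6, 5, 10, 13]

P01 = fisher_price_ratio(p0, q0, p1, q1)
P10 = fisher_price_ratio(p1, q1, p0, q0)    # base and current year interchanged
Q01 = fisher_quantity_ratio(p0, q0, p1, q1)
value_ratio = sum_products(p1, q1) / sum_products(p0, q0)

print(round(P01 * P10, 6))                        # time reversal test: 1.0
print(round(P01 * Q01, 4), round(value_ratio, 4)) # factor reversal test: 0.8125 0.8125
```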