Tuesday, September 22, 2020

variance basic


The discussion below may appear unrelated to stock markets, but it will help you relate better to the term ‘Volatility’.

Consider 2 batsmen and the number of runs they have scored over 6 consecutive IPL matches:

| Match | RAINA | RAYADU |
| --- | --- | --- |
| 1 | 20 | 45 |
| 2 | 23 | 13 |
| 3 | 21 | 18 |
| 4 | 24 | 12 |
| 5 | 19 | 26 |
| 6 | 23 | 19 |

You are the captain of the team, and you need to choose either RAINA or RAYADU for the 7th match. The batsman should be dependable, in the sense that the batsman you choose should be in a position to score at least 20 runs. Whom would you choose? People tend to approach this problem in one of two ways:

1.     Calculate the total score (also called ‘Sigma’) of each batsman and pick the batsman with the higher total for the next game. Or,

2.     Calculate the average (also called ‘Mean’) number of runs per game and pick the batsman with the better average.

Let us calculate the same and see what numbers we get –

o    RAINA’s Sigma = 20 + 23 + 21 + 24 + 19 + 23 = 130

o    RAYADU’s Sigma = 45 + 13 + 18 + 12 + 26 + 19 = 133

So based on the sigma you are likely to select RAYADU. Let us calculate the mean or average for both the players and figure out who stands better –

o    RAINA = 130/6 = 21.67

o    RAYADU = 133/6 = 22.16
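The two calculations above can be reproduced in a few lines of Python. This is only a sketch: the `scores` dictionary and the `total_and_mean` helper are names chosen here for illustration, and the runs are copied from the table above. Note that 133/6 is 22.1666..., which rounds to 22.17; the post truncates it to 22.16.

```python
# Runs scored in the six matches, copied from the table above.
scores = {
    "RAINA": [20, 23, 21, 24, 19, 23],
    "RAYADU": [45, 13, 18, 12, 26, 19],
}


def total_and_mean(runs):
    """Return (total, mean) for a list of runs."""
    total = sum(runs)
    return total, total / len(runs)


for player, runs in scores.items():
    total, mean = total_and_mean(runs)
    print(f"{player}: total = {total}, mean = {mean:.2f}")
```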

So it seems from both the mean and sigma perspective, RAYADU deserves to be selected. But let us not conclude that yet. Remember the idea is to select a player who can score at least 20 runs and with the information that we have now (mean and sigma) there is no way we can conclude who can score at least 20 runs. Therefore, let’s do some further investigation.

To begin with, for each match played we will calculate the deviation from the mean. For example, we know RAINA’s mean is 21.67 and in his first match RAINA scored 20 runs. Therefore the deviation from the mean for the 1st match is 20 – 21.67 = –1.67. In other words, he scored 1.67 runs fewer than his average score. For the 2nd match it was 23 – 21.67 = +1.33, meaning he scored 1.33 runs more than his average score.
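The same deviations can be listed for all six of RAINA's matches with a short script. This is a minimal sketch; the `deviations` helper is a name made up for this example.

```python
# RAINA's runs in matches 1 to 6, from the table above.
raina = [20, 23, 21, 24, 19, 23]


def deviations(runs):
    """Return each score minus the mean of all scores."""
    mean = sum(runs) / len(runs)
    return [r - mean for r in runs]


for match, dev in enumerate(deviations(raina), start=1):
    print(f"Match {match}: {dev:+.2f}")
```

The positive and negative deviations cancel each other out, so their sum is always zero (up to floating-point error). That is why the deviations are squared before they are averaged.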

Picture RAINA’s scores plotted match by match: a horizontal black line through the middle represents his average score, and a double-arrowed vertical line from each score to that line represents the deviation from the mean for that match. We will now go ahead and calculate another variable called ‘Variance’.

Variance is simply the ‘sum of the squares of the deviations divided by the total number of observations’. This may sound scary, but it’s not. The total number of observations in this case is the total number of matches played, hence 6.

So variance can be calculated as –

Variance = [(-1.67) ^2 + (1.33) ^2 + (-0.67) ^2 + (+2.33) ^2 + (-2.67) ^2 + (1.33) ^2] / 6
= 19.33 / 6
= 3.22
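The calculation above can be checked with a small function. This is a sketch that follows the definition in the text and divides by the number of observations (the "population" variance); the function name `population_variance` is chosen here for illustration. Many statistics tools divide by one less than the number of observations by default (the "sample" variance), which would give a slightly larger number.

```python
def population_variance(runs):
    """Sum of squared deviations from the mean, divided by the number of observations."""
    mean = sum(runs) / len(runs)
    squared_deviations = [(r - mean) ** 2 for r in runs]
    return sum(squared_deviations) / len(runs)


raina = [20, 23, 21, 24, 19, 23]
print(round(population_variance(raina), 2))  # 3.22
```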

Further we will define another variable called ‘Standard Deviation’ (SD) which is calculated as –

std deviation = √ variance 

So standard deviation for RAINA is –
= SQRT (3.22)
= 1.79

Likewise RAYADU’s standard deviation works out to be 11.18.
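Both standard deviations can be computed with Python's standard library. `statistics.pstdev` divides by the number of observations, matching the definition used above. This is a sketch, and the `scores` dictionary is a name chosen for this example. Computed at full precision, the results come out at roughly 1.80 and 11.19; the figures in this post (1.79 and 11.18) differ in the last digit because the intermediate steps were rounded by hand.

```python
import statistics

# Runs scored in the six matches, copied from the table above.
scores = {
    "RAINA": [20, 23, 21, 24, 19, 23],
    "RAYADU": [45, 13, 18, 12, 26, 19],
}

for player, runs in scores.items():
    # pstdev = square root of the population variance (divide by n, not n - 1).
    sd = statistics.pstdev(runs)
    print(f"{player}: SD = {sd:.2f}")
```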

Let’s stack up all the numbers (or statistics) here:

| Statistics | RAINA | RAYADU |
| --- | --- | --- |
| Sigma | 130 | 133 |
| Mean | 21.67 | 22.16 |
| SD | 1.79 | 11.18 |

We know what ‘Mean’ and ‘Sigma’ signify, but what about the SD? Standard Deviation summarizes, in a single number, how far the individual scores typically deviate from the average.

Here is the textbook definition of SD: “In statistics, the standard deviation (SD, also represented by the Greek letter sigma, σ) is a measure that is used to quantify the amount of variation or dispersion of a set of data values”.

Please don’t get confused between the two sigmas: the total is called sigma and is represented by the uppercase Greek letter Σ, while standard deviation is also sometimes referred to as sigma and is represented by the lowercase Greek letter σ.
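Written as formulas, the two sigmas look like this. The symbols N (number of matches), x_i (runs scored in match i) and μ (the mean) are notation introduced here for illustration and do not appear elsewhere in this post.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} \left(x_i - \mu\right)^2,
\qquad
\sigma = \sqrt{\sigma^2}
```

Here the uppercase Σ is the instruction to add things up, while the lowercase σ is the standard deviation that results.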

One way to use SD is to make a projection of how many runs RAINA and RAYADU are likely to score in the next match. To get this projected range, you simply add the SD to, and subtract it from, their average.

| Player | Lower Estimate | Upper Estimate |
| --- | --- | --- |
| RAINA | 21.67 – 1.79 = 19.88 | 21.67 + 1.79 = 23.46 |
| RAYADU | 22.16 – 11.18 = 10.98 | 22.16 + 11.18 = 33.34 |

These numbers suggest that in the upcoming 7th match RAINA is likely to score anywhere between 19.88 and 23.46, while RAYADU could score anywhere between 10.98 and 33.34. Because RAYADU has such a wide range, it is difficult to figure out whether he is going to score at least 20 runs. He could score 10, or 34, or anything in between.
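The projected ranges can be sketched in code as the mean minus and plus one standard deviation. The helper name `one_sd_range` is made up for this example. Because it works from unrounded values, the last digits can differ slightly from the hand-rounded figures in the table. The range is a rough guide to where scores have tended to fall, not a guarantee.

```python
import statistics


def one_sd_range(runs):
    """Return (mean - SD, mean + SD) using the population standard deviation."""
    mean = statistics.fmean(runs)
    sd = statistics.pstdev(runs)
    return mean - sd, mean + sd


scores = {
    "RAINA": [20, 23, 21, 24, 19, 23],
    "RAYADU": [45, 13, 18, 12, 26, 19],
}

for player, runs in scores.items():
    low, high = one_sd_range(runs)
    print(f"{player}: {low:.2f} to {high:.2f} (width {high - low:.2f})")
```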

However, RAINA seems to be more consistent. His range is smaller, which means he will be neither a big hitter nor a lousy player. He is expected to be consistent and is likely to score roughly between 20 and 23. In other words, selecting RAYADU over RAINA for the 7th match can be risky.

Going back to our original question, which player do you think is more likely to score at least 20 runs? By now, the answer must be clear; it has to be RAINA. RAINA is consistent and less risky compared to RAYADU.
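As a quick cross-check on this conclusion, one can simply count how often each batsman actually reached the 20-run target in the six matches. This is a sketch added for illustration; the helper `matches_at_or_above` is a made-up name, and the count is a sanity check rather than part of the standard deviation method.

```python
def matches_at_or_above(runs, target=20):
    """Count the matches in which the batsman scored at least `target` runs."""
    return sum(1 for r in runs if r >= target)


raina = [20, 23, 21, 24, 19, 23]
rayadu = [45, 13, 18, 12, 26, 19]

print("RAINA:", matches_at_or_above(raina), "of", len(raina))     # 5 of 6
print("RAYADU:", matches_at_or_above(rayadu), "of", len(rayadu))  # 2 of 6
```

RAINA reached 20 in five of his six matches and RAYADU in only two, which agrees with what the standard deviations suggest.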

So in principle, we assessed the riskiness of these players by using ‘Standard Deviation’. Hence ‘Standard Deviation’ must represent ‘Risk’. In the stock market world, we define ‘Volatility’ as the riskiness of a stock or an index. Volatility is expressed as a percentage and is measured by standard deviation.

Going by the above definition, if Infosys and TCS have volatility of 25% and 45% respectively, then clearly Infosys has less risky price movements when compared to TCS.
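To see how the same idea carries over to prices, here is a minimal sketch that measures volatility as the standard deviation of day-over-day percentage changes. Both the choice of daily percentage changes and the two price series are assumptions made purely for illustration; the prices are invented and are not real Infosys or TCS data.

```python
import statistics


def daily_volatility_pct(closes):
    """Standard deviation of day-over-day percentage changes in closing price."""
    returns = [(today / prev - 1) * 100 for prev, today in zip(closes, closes[1:])]
    return statistics.pstdev(returns)


# Hypothetical closing prices, invented for this example only.
steady_stock = [100, 101, 100, 101, 100, 101]
jumpy_stock = [100, 110, 95, 108, 92, 105]

print(f"steady: {daily_volatility_pct(steady_stock):.2f}%")
print(f"jumpy:  {daily_volatility_pct(jumpy_stock):.2f}%")
```

Just as with the two batsmen, the series that swings further from its average change gets the larger standard deviation, and is therefore the riskier of the two.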

 




 
