Probabilities: the birthday paradox

1.    Introduction

An interesting probability problem with an intriguing result: in a group of n persons chosen at random, what is the minimum value of n for which the probability that at least 2 persons in the group share the same birthday is 50% or more?

 

FIRST TRY TO CALCULATE THAT FOR YOURSELF WITHOUT LOOKING AT THE SOLUTION BELOW. IF YOU ARE SUCCESSFUL, CONGRATULATIONS!

2.    Solution

It is much simpler to calculate the probability that all n persons have different birthdays. This is a common trick in probability: calculate the probability of the complement of an event instead of the probability of the event itself.

If the probability of the original event of interest is p, then the probability of the complementary event is q = 1 – p.

 

Assume that a year has 365 days (ignoring February 29) and that every day is equally likely to be a birthday. Consider the n persons in the group one after the other:

  • Person # 1: has a birthday on one specific day, whatever it is.
  • Person # 2: the probability that this person’s birthday is different from that of person 1 is (365 – 1)/365 = 364/365, because 1 day is excluded.
  • Person # 3: assuming persons 1 and 2 have different birthdays, the probability that this person’s birthday is different from both is (365 – 2)/365 = 363/365, because 2 days are excluded.
  • And so on …
  • Person # n: assuming the first n-1 persons all have different birthdays, the probability that this person’s birthday is different from all of theirs is (365 – [n-1])/365 = (366 – n)/365, because n-1 days are excluded.

Then, the probability that the n persons have all different birthdays is the product of the above individual probabilities:

q = (364/365) * (363/365) * … * ((366 – n)/365)
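
The product can be checked numerically. What follows is a minimal Python sketch, assuming 365 equally likely birthdays; the function name prob_all_different and its days parameter are chosen here for illustration only.

```python
def prob_all_different(n, days=365):
    """Probability q that n persons all have different birthdays."""
    q = 1.0
    # Person k (k = 2..n) must avoid the k-1 days already taken.
    for k in range(2, n + 1):
        q *= (days - (k - 1)) / days
    return q


if __name__ == "__main__":
    for n in (2, 3, 10, 23):
        print(n, round(prob_all_different(n), 4))
```

For n = 1 the loop does not run and the function returns 1, which matches the fact that a single person cannot share a birthday with anyone.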

 

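This is because all the events Ek: “the kth person’s birthday is different from those of the k-1 preceding persons” must hold simultaneously. These events are not independent of each other. Rather, each factor in the product is the conditional probability of Ek given that the k-1 preceding persons already have k-1 different birthdays, that is, given E2, E3, …, Ek-1.

Therefore, by the multiplication rule for conditional probabilities:

Probability (E2 and E3 and … and En) = Probability (E2) * Probability (E3 given E2) * … * Probability (En given E2 and … and En-1)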

 

In summary:

Probability that at least 2 persons have the same birthday:

p = 1 – q = 1 – (364/365) * (363/365) * … * ((366 – n)/365)               (A)

 

More compact formula:

p = 1 – 364! / (365^(n-1) * (365 – n)!)                                        (B)

 

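These probabilities can easily be calculated as a function of n in a spreadsheet. Formula (A) is preferable for this, because it multiplies factors smaller than 1 one at a time, whereas formula (B) involves extremely large intermediate numbers: 364! alone has several hundred digits, far more than ordinary floating-point arithmetic can represent.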
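
To illustrate the difference between the two formulas, here is a small Python sketch that evaluates both. It is only an illustration: the function names are invented for this example, and formula (B) is evaluated with Python's exact integers and the standard fractions module, which sidesteps the overflow that a spreadsheet would run into.

```python
import math
from fractions import Fraction


def p_formula_a(n):
    """Formula (A): multiply the factors (365 - k)/365 one at a time."""
    q = 1.0
    for k in range(1, n):
        q *= (365 - k) / 365
    return 1.0 - q


def p_formula_b(n):
    """Formula (B): 1 - 364! / (365^(n-1) * (365 - n)!), for 1 <= n <= 365.

    Uses exact integer arithmetic; the intermediate numbers are huge.
    """
    numerator = math.factorial(364)
    denominator = 365 ** (n - 1) * math.factorial(365 - n)
    return float(1 - Fraction(numerator, denominator))


if __name__ == "__main__":
    n = 23
    print(p_formula_a(n), p_formula_b(n))
    # Number of decimal digits in 364!, to show the size of the numbers in (B).
    print(len(str(math.factorial(364))))
```

Both functions should agree up to floating-point rounding; the point of the comparison is that (A) never leaves the range between 0 and 1, while (B) only works here because Python integers have arbitrary precision.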

 

FINAL RESULT

The probability that at least 2 persons in a group of n have the same birthday first exceeds 50% at n = 23, where it is 0.5073. For n = 22 it is still only 0.4757.
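
The threshold can also be found by a direct search instead of scanning a spreadsheet column. The Python sketch below uses formula (A) and assumes 365 equally likely birthdays; the names prob_shared_birthday and smallest_group, and the threshold parameter, are introduced here for illustration.

```python
def prob_shared_birthday(n, days=365):
    """Formula (A): probability that at least 2 of n persons share a birthday."""
    q = 1.0
    for k in range(1, n):
        q *= (days - k) / days
    return 1.0 - q


def smallest_group(threshold=0.5, days=365):
    """Smallest n whose shared-birthday probability reaches the threshold.

    Assumes 0 <= threshold <= 1; the loop stops at n = days + 1 at the
    latest, because the probability is exactly 1 from that point on.
    """
    n = 1
    while prob_shared_birthday(n, days) < threshold:
        n += 1
    return n


if __name__ == "__main__":
    n = smallest_group(0.5)
    print(n, round(prob_shared_birthday(n), 4))
```

The search recomputes the product for every candidate n, which is wasteful but harmless at this size; a running product would avoid the repetition.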

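The paradox is that intuition suggests a much bigger group would be needed for two persons to have a good chance of sharing a birthday. What intuition overlooks is that any pair of persons can produce a match, and a group of 23 persons already contains 23 * 22 / 2 = 253 different pairs.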