Learn MATLAB Episode #25: Birthday Paradox

One interesting and popular problem in probability is called the birthday problem, or the birthday paradox, and the problem goes something like this. So, given a classroom of n students, what is the probability that at least one pair of students shares a birthday? Now you might be surprised that at N equals 23 the probability is about fifty percent, which is why this problem is called a paradox. So that means in an average sized classroom there’s probably a pretty good chance that two people in the class share a birthday. This is counterintuitive since there are 365 days in a year. In this lecture I’ll show you the theory behind the solution, and how to visualize it in MATLAB. So the first thing is the problem of at least one pair of people sharing a birthday is difficult, but remember that the probability of all distinct events have to add up to one. So what is the opposite of at least one pair of people sharing a birthday? It’s not two people sharing a birthday, or three people sharing a birthday, or two pairs of two people sharing a birthday, these events all fit into at least one pair sharing a birthday. So the two disjoint events that we want to talk about are the probability that at least one pair shares a birthday, and nobody shares a birthday. So these two events are disjoint and therefore they have to add up to one. So we can calculate then the probability that two people or at least one pair of people shares a birthday as 1 minus the probability that nobody shares a birthday, and so in mathematics we would call this a counting problem. So now let’s think about how do we calculate the probability that nobody shares a birthday. So there are 365 days in a year. Now if you think of each day as a bucket we have one person and they have 365 buckets to choose from. The probability that this one person will collide with another is 0. The probability that one person shares a birthday with somebody else when there’s only that one person is 0, so the probability that nobody shares a birthday in this case is 1, or 365.365. Now what about two people? So with one person already having chosen a birthday or a bucket, the second person has a 1/365 chance of colliding with that person. So the probability of at least one pair having a common birthday is 364/365. So this is the case with two people. Now if we have three people the third person only has 363 buckets to choose from. So we have 364/365 times 363/365, and we can multiply these probabilities because they are independent. So this is the probability that three people, and a group of three people, at least one pair would share a birthday. So we can continue this pattern but it would probably be easier to write a matlab function to do this. So my birthday function is going to return all the probabilities up to the value n. So I’m going to initialize a to be an array of zeros, and I’m going to count up to n, and fill in the values of a. Actually, I’m going to say one is because we’re subtracting from one. Actually, it doesn’t matter what I initialize date to because i’m going to say 1 minus over here. Okay, so, we know that we have to multiply by something over 365 each time and subtract that from one, and so we can use a for loop to iterate over the thing that has to be subtracted and multiply the new value iteratively. Now that we have our function let’s test it, so let’s say a equals birthday, and let’s set n equal to 100, and let’s plot. Okay, so, you can see here when n is about 23 you get the probability around 0.5. When n is equal to 50 you’re right above 90%, so there’s a pretty good chance in a group of 50 people that at least one of those pairs of people shares a birthday. So let’s check A(23), right it’s about 50%, and that is the solution to the birthday paradox.