I have been looking at the birthday problem and I am trying to figure out the probability that 3 people (instead of 2) share a birthday in a room of 30 people.
I thought I understood the problem but I guess not since I have no idea how to do it with 3.
$\endgroup$ 39 Answers
$\begingroup$The birthday problem with 2 people is quite easy because finding the probability of the complementary event "all birthdays distinct" is straightforward. For 3 people, the complementary event includes "all birthdays distinct", "one pair and the rest distinct", "two pairs and the rest distinct", etc. To find the exact value is pretty complicated.
The Poisson approximation is pretty good, though. Imagine checking every triple and calling it a "success" if all three people have the same birthday. The total number of successes is approximately Poisson with mean value ${30 \choose 3}/365^2$. Here $30\choose 3$ is the number of triples, and $1/365^2$ is the chance that any particular triple is a success. The probability of getting at least one success is obtained from the Poisson distribution: $$ P(\mbox{ at least one triple birthday with 30 people})\approx 1-\exp(-{30 \choose 3}/365^2)=.0300. $$
You can modify this formula for other values, changing either 30 or 3. For instance, $$ P(\mbox{ at least one triple birthday with 100 people})\approx 1-\exp(-{100 \choose 3}/365^2)=.7029,$$ $$ P(\mbox{ at least one double birthday with 25 people })\approx 1-\exp(-{25 \choose 2}/365)=.5604.$$
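The three figures above can be reproduced with a few lines of Python. This is only a sketch of the approximation as described here; the function name `poisson_match_probability` and its parameter names are my own, not part of the original answer.

```python
from math import comb, exp


def poisson_match_probability(people: int, group: int, days: int = 365) -> float:
    """Approximate P(at least one set of `group` people shares a birthday)."""
    # Expected number of "successes": the number of groups times the chance
    # that one particular group shares a single birthday.
    mean = comb(people, group) / days ** (group - 1)
    # A Poisson variable with this mean equals zero with probability exp(-mean).
    return 1.0 - exp(-mean)


if __name__ == "__main__":
    for people, group in [(30, 3), (100, 3), (25, 2)]:
        print(people, group, round(poisson_match_probability(people, group), 4))
```

Changing `people` or `group` corresponds to changing the 30 or the 3 in the formula.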
Poisson approximation is very useful in probability, not only for birthday problems!
$\endgroup$ 14 $\begingroup$An exact formula can be found in Anirban DasGupta, The matching, birthday and the strong birthday problem: a contemporary review, Journal of Statistical Planning and Inference 130 (2005), 377-389. This paper claims that if $W$ is the number of triplets of people having the same birthday, $m$ is the number of days in the year, and $n$ is the number of people, then
$$ P(W \ge 1) = 1 - \sum_{i=0}^{\lfloor n/2 \rfloor} {m! n! \over i! (n-2i)! (m-n+i)! 2^i m^n} $$
No derivation or source is given; I think the idea is that the term corresponding to $i$ is the probability that there are $i$ birthdays shared by 2 people each and $n-2i$ birthdays with one person each.
In particular, if $m = 365, n = 30$ this formula gives $0.0285$, not far from Byron's approximation.
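For anyone who wants to evaluate the sum, here is a minimal Python sketch using exact rational arithmetic. The function name `exact_triple_probability` is an assumption of mine; the variables `n`, `m`, and `i` follow the formula above, and terms that would need more than `m` distinct days are skipped.

```python
from fractions import Fraction
from math import factorial


def exact_triple_probability(n: int, m: int = 365) -> float:
    """P(W >= 1): some birthday is shared by three or more of n people."""
    no_triple = 0  # number of assignments in which every day has at most 2 people
    for i in range(n // 2 + 1):
        unused_days = m - n + i  # i paired days and n - 2i single days are in use
        if unused_days < 0:
            continue  # this pattern needs more than m distinct days
        numerator = factorial(m) * factorial(n)
        denominator = (factorial(i) * factorial(n - 2 * i)
                       * factorial(unused_days) * 2 ** i)
        no_triple += Fraction(numerator, denominator)
    return float(1 - Fraction(no_triple) / m ** n)


if __name__ == "__main__":
    print(round(exact_triple_probability(30), 4))
```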
$\endgroup$ 4 $\begingroup$As pointed out by Michael Lugo, the formula given by Anirban DasGupta is an exact answer to this problem; however, a formal proof is needed. I have found and verified a solution by Doctor Rick from Math Forum; below is the link
His approach is to partition the sample space as follows:
1. none share a birthday
2. one pair shares a birthday
3. two pairs share different birthdays
4. three pairs share different birthdays
:
1+N/2. N/2 pairs share different birthdays
2+N/2. three or more share a birthday

Then he points out a clever way to count each part by picking a different birthday for each pair of people. I tried it and arrived at the same formula as Anirban DasGupta's. For more detail, please take a look at the link above!
$\endgroup$ 0 $\begingroup$I'm a bit skeptical of this answer. Here is a formula that works.
It probably helps to explain Dasgupta's formula from Michael Lugo's answer first.
Say that a map $f : [m]\to [n]$ is $k$-almost injective if $|f^{-1}(j)|\le k$ for all $j\in [n]$. Counting injective maps is easy, there are
$$ I(1,m,n) := m!\binom{n}{m} $$
of them. You just pick the range and then a bijection to it. This gives right away the standard birthday collision probability for $m$ people and years of length $n$
$$ 1 - n^{-m}I(1,m,n) $$
One gets the generalized birthday probability from $I(k,m,n)$ in the same way, so we can just think about $I(k,m,n)$.
How would we go about counting $2$-almost injective maps? The same idea as before works. This time, we pick $c$ disjoint pairs that will have colliding images, injectively map these pairs into $[n]$, then injectively map the remaining $m-2c$ elements into the remaining set of size $n-c$. So we get
$$ I(2,m,n) = \sum_{c=0}^{\lfloor m/2\rfloor}\frac{1}{c!} \left(\prod_{j=0}^{c-1}\binom{m - 2j}{2}\right) I(1,c,n)I(1,m-2c,n-c) $$
This is equivalent to Dasgupta's formula, but it is easier to see the induction.
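One way to check the equivalence is to write each factor of the summand in closed form (assuming $m - c \le n$, so that all factorials are defined):

$$ \frac{1}{c!}\prod_{j=0}^{c-1}\binom{m-2j}{2} = \frac{m!}{2^c\, c!\, (m-2c)!}, \qquad I(1,c,n) = \frac{n!}{(n-c)!}, \qquad I(1,m-2c,n-c) = \frac{(n-c)!}{(n-m+c)!} $$

Multiplying these, the $(n-c)!$ cancels and the term for $c$ becomes

$$ \frac{m!\, n!}{c!\, (m-2c)!\, (n-m+c)!\, 2^c} $$

After dividing by $n^m$, this is the summand of Dasgupta's formula with $i = c$, keeping in mind that the roles of the letters are swapped: here $m$ counts people and $n$ counts days, while in his formula $n$ counts people and $m$ counts days.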
If we want to get $I(k,m,n)$ in general, we have
$$ I(k,m,n) = \sum_{c=0}^{\lfloor m/k\rfloor}\frac{1}{c!} \left(\prod_{j=0}^{c-1}\binom{m - kj}{k}\right) I(1,c,n)I(k-1,m-kc,n-c) $$
$\endgroup$ $\begingroup$My own research led to the following result...
Knowing that there are $A$ days in the year (typically $A=365$), the probability $P(A, M, n)$ that at least $n$ children have their birthday the same day within a class of $M$ children is:
$$\boxed{ P(A, M,n) = 1 - \dfrac{ K_n(A, M) }{ A^M } }$$
where $K_n(A, M)$ represents the number of configurations in which one cannot find $n$ children (or more) having their birthday the same day, and can be computed by recurrence as follows:
$$\forall n\ge 2, \quad \boxed{ K_{n+1}(A, M) = \sum_{0\le k\le \left\lfloor{\frac{M}{n}}\right\rfloor} \dfrac{ \binom{A}{k} \; (M)_{nk} \; K_n(A-k, M-nk)}{ (n!)^k} }$$
with the following initialization: $\boxed{ K_2(A,M)=(A)_M }$
and where $(n)_k$ stands for the falling factorial: $(n)_k = n(n-1)\cdots(n-k+1)$
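The recurrence translates almost directly into code. The following Python sketch is one possible implementation, not a reference one; the names `count_no_match` and `match_probability` are mine, and `math.perm` (Python 3.8+) plays the role of the falling factorial $(n)_k$.

```python
from functools import lru_cache
from math import comb, factorial, perm


@lru_cache(maxsize=None)
def count_no_match(n: int, days: int, children: int) -> int:
    """K_n(A, M): assignments in which no day is shared by n or more children."""
    if n == 2:
        return perm(days, children)  # (A)_M; zero when M > A
    size = n - 1  # days holding exactly `size` children are placed first
    total = 0
    for k in range(children // size + 1):
        if k > days:
            break  # cannot pick more such days than exist
        # Choose the k days, then fill them with ordered groups of `size` children.
        groups = comb(days, k) * perm(children, size * k) // factorial(size) ** k
        total += groups * count_no_match(n - 1, days - k, children - size * k)
    return total


def match_probability(days: int, children: int, n: int) -> float:
    """P(A, M, n): at least n children share a birthday."""
    return 1 - count_no_match(n, days, children) / days ** children


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, match_probability(365, 30, n))
```

The counts are kept as exact integers and only converted to a float in the final division, which avoids rounding problems with numbers as large as $365^{30}$.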
Numerically:
Within a class of $M=30$ children, knowing that the year counts $A=365$ days...
The probability that at least $n=2$ children have their birthday the same day is:
$$P(365,30,2)\simeq 70.6\%$$
The probability that at least $n=3$ children have their birthday the same day is:
$$P(365,30,3)\simeq 2.85\%$$
The probability that at least $n=4$ children have their birthday the same day is:
$$P(365,30,4)\simeq 0.0532\%$$
Nicolas
$\endgroup$ $\begingroup$I'd just like to point out that Trazom's answer is incorrect for the general case: the sets being counted in the outer sum overlap. I don't have enough reputation to comment. I wrote a blog post about the general case here:
$\endgroup$ $\begingroup$There is another approximation based on a Poisson distribution. I was using this method in the 1990s (I have found my implementation in Java), but unfortunately do not remember where I first read about it. It is also described in .
Let $b$ be the number of days in the year, $n$ the number of people in the room, and suppose you want to know the probability that at least $k + 1$ people share a birthday. (In the original question above, $b = 365$ and $k = 2.$)
Let $X_i$ be the number of people who have birthdays on day number $i$ for $1 \leq i \leq b.$ We approximate the joint distribution of $(X_1, \ldots, X_b)$ by assuming that each $X_i$ is a Poisson variable with expected value $n/b,$ and assuming that these variables are all independent.
The probability that no $k + 1$ people all have their birthdays on day $i$ is $P(X_i \leq k) = F(k),$ where $F$ is the CDF of $\mathrm{Poisson}(n/b).$ The probability that this happens on all $b$ days of the year, that is, there is no day on which more than $k$ people have a birthday, is $(F(k))^b.$
The estimated probability that there is a day when $k+1$ or more people have a birthday is therefore $1 - (F(k))^b.$
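A minimal Python sketch of this estimate follows. The function names `poisson_cdf` and `independent_days_estimate` are my own choices, and the Poisson CDF is summed by hand so that nothing beyond the standard library is needed.

```python
from math import exp


def poisson_cdf(k: int, mean: float) -> float:
    """F(k) = P(X <= k) for X ~ Poisson(mean)."""
    term = exp(-mean)  # P(X = 0)
    total = term
    for j in range(1, k + 1):
        term *= mean / j  # P(X = j) obtained from P(X = j - 1)
        total += term
    return total


def independent_days_estimate(n: int, k: int, b: int = 365) -> float:
    """Estimate P(some day has at least k + 1 birthdays among n people)."""
    # Treat the b daily counts as independent Poisson(n / b) variables.
    return 1.0 - poisson_cdf(k, n / b) ** b


if __name__ == "__main__":
    for n, k in [(23, 1), (88, 2), (100, 2), (30, 2)]:
        print(n, k, round(independent_days_estimate(n, k), 4))
```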
Setting $b=365$ and trying this out on a few known cases from :
\begin{array}{ccccc}
k & n & n/b & F(k) & (F(k))^b \\
\hline
1 & 22 & 0.0602740 & 0.99825489 & 0.5286 \\
1 & 23 & 0.0630137 & 0.99809610 & 0.4988 \\
2 & 87 & 0.2383562 & 0.99811045 & 0.5014 \\
2 & 88 & 0.2410959 & 0.99804850 & 0.4902 \\
3 & 186 & 0.5095890 & 0.99812428 & 0.5039 \\
3 & 187 & 0.5123288 & 0.99808774 & 0.4973 \\
4 & 312 & 0.8547945 & 0.99812038 & 0.5032 \\
4 & 313 & 0.8575342 & 0.99809433 & 0.4985
\end{array}
This predicts (correctly) that you need $23$ people to have at least a $50\%$ chance of at least one set of two people with the same birthday, $88$ people to have at least a $50\%$ chance of at least one set of three people with the same birthday, $187$ people to have at least a $50\%$ chance of at least one set of four people with the same birthday, $313$ people to have at least a $50\%$ chance of at least one set of five people with the same birthday.
As another example, suppose you have $100$ people in the room. Then this approximation gives $(F(2))^{365} \approx 0.3600$, and therefore the probability of three or more people all with the same birthday is approximately $0.6400.$ Wolfram Alpha gives the probability as $0.6459$. Contrast this with the accepted answer, which estimates the probability at $0.7029.$
On the other hand, for small numbers of people in the room, this method overestimates the chance of $k+1$ people with the same birthday. For example, for the probability of three or more people with the same birthday out of $30$ people, this method gives $0.0313$, the accepted answer gives $0.0300,$ and Wolfram Alpha gives $0.0285.$
The individual $X_i$ are not actually Poisson and of course they are not actually independent. That is what makes this an approximation rather than an exact method.
$\endgroup$ $\begingroup$For anyone looking for the generalized birthday problem, i.e. how many people are required so that $M$ of them share the same birthday with a given probability:
This link explains various methods for calculating the probability in the generalized birthday problem.
This paper also discusses the various kinds of coincidences we face in life; it is an interesting read.
$\endgroup$ 1 $\begingroup$I am looking at this question and the complicated answers, and they are confusing me. Suppose I want to find, for a group of 100 people, the probability that at least 3 people share a birthday. I start from the very basics: if there are 3 people, the probability that they all share a birthday is $$\frac{1}{365} \cdot \frac{1}{365} \cdot \frac{1}{365} \cdot 365=\frac{1}{365^2}$$ Here $1/365$ is the probability that one person has a birthday on a particular day; this is multiplied across the three people and then by the total number of days.
Similarly, if there are 4 people, the probability of at least 3 of them sharing a birthday would be $\frac{1}{365^2}\binom{4}{3}$, and similarly, for $x$ people, $\frac{1}{365^2}\binom{x}{3}$.
I am trying to find the fault in this logic.
$\endgroup$ 3