This article is from the source 'guardian' and was first published or seen on . It last changed over 40 days ago and won't be checked again for changes.
You can find the current article at its original source at http://www.theguardian.com/science/alexs-adventures-in-numberland/2014/jun/10/world-cup-birthday-paradox-footballers-born-on-the-same-day
The article has changed 4 times.
Version 0 | Version 1 |
---|---|
World Cup birthday paradox: footballers born on the same day | World Cup birthday paradox: footballers born on the same day |
(2 days later) | |
The birthday paradox is the surprising mathematical result | The birthday paradox is the surprising mathematical result |
that you only need 23 people in order for it to be more likely than not that | that you only need 23 people in order for it to be more likely than not that |
two of them share the same birthday. | two of them share the same birthday. |
We can prove the result using probability, but I won’t do | We can prove the result using probability, but I won’t do |
that here since it is done very well in many places on the web. (And, since | that here since it is done very well in many places on the web. (And, since |
you’re asking, there’s a full explanation in my book Alex’s Adventures in | you’re asking, there’s a full explanation in my book Alex’s Adventures in |
Numberland). | Numberland). |
To be clear: the maths says that the chance of a shared | To be clear: the maths says that the chance of a shared |
birthday in a group of 23 people is 50.7%. Just over half. | birthday in a group of 23 people is 50.7%. Just over half. |
The result is surprising because 23 is an awfully small | The result is surprising because 23 is an awfully small |
group when the total number of possible birthdays is 365. | group when the total number of possible birthdays is 365. |
One of the many wonderful things about a World Cup is that | One of the many wonderful things about a World Cup is that |
it gives us a fantastic data set in which to test the birthday paradox. | it gives us a fantastic data set in which to test the birthday paradox. |
Each nation has a squad of 23 players, and there are 32 | Each nation has a squad of 23 players, and there are 32 |
nations. We would expect a shared birthday in 50.7% of the squads, | nations. We would expect a shared birthday in 50.7% of the squads, |
which works out at about 16 of the teams taking part. | which works out at about 16 of the teams taking part. |
But, in fact, 19 teams have a shared birthday – about 60% of the total. | But, in fact, 19 teams have a shared birthday – about 60% of the total. |
They are Brazil (Hulk, Paulinho, both born 25 July), as well | They are Brazil (Hulk, Paulinho, both born 25 July), as well |
as Algeria, Argentina, Australia, Bosnia, Cameroon, Chile, Colombia, France, Germany, | as Algeria, Argentina, Australia, Bosnia, Cameroon, Chile, Colombia, France, Germany, |
Iran, Holland, Honduras, Nigeria, Russia, South Korea, Spain, Switzerland and the USA. | Iran, Holland, Honduras, Nigeria, Russia, South Korea, Spain, Switzerland and the USA. |
(Argentina, Iran, Nigeria, South Korea and | (Argentina, Iran, Nigeria, South Korea and |
Switzerland have two pairs of shared birthdays each.) | Switzerland have two pairs of shared birthdays each.) |
Why is it the case that 60% of teams have a shared | Why is it the case that 60% of teams have a shared |
birthday, almost 10 percentage points more than we would expect? | birthday, almost 10 percentage points more than we would expect? |
It could be luck. Maybe if we took a group of 23 players | It could be luck. Maybe if we took a group of 23 players |
from every country in the world, we would get closer to the expected percentage | from every country in the world, we would get closer to the expected percentage |
of 50.7. | of 50.7. |
Yet I doubt it. We can see patterns in the data that help to explain | Yet I doubt it. We can see patterns in the data that help to explain |
why we get so many teams with shared birthdays: the distribution of | why we get so many teams with shared birthdays: the distribution of |
footballers’ birthdays is not uniform throughout the year. | footballers’ birthdays is not uniform throughout the year. |
Footballers are more likely to be born at the beginning of | Footballers are more likely to be born at the beginning of |
the year than at the end. If an equal number of players were born each month, then each month would have, on average, 61 birthdays. | the year than at the end. If an equal number of players were born each month, then each month would have, on average, 61 birthdays. |
But the monthly totals are: January 72, February 79, March | But the monthly totals are: January 72, February 79, March |
64, April 63, May 73, June 61, July 54, August 57, September 65, October 52, November 46, December 47. | 64, April 63, May 73, June 61, July 54, August 57, September 65, October 52, November 46, December 47. |
The first five months of the year are all above average, and | The first five months of the year are all above average, and |
five of the last six are below average. | five of the last six are below average. |
There is only one day in January, and one day in February | There is only one day in January, and one day in February |
when there are no birthdays, but there are eight birthdayless dates in November and eight in | when there are no birthdays, but there are eight birthdayless dates in November and eight in |
December. | December. |
February, the shortest month, has the most birthdays, and | February, the shortest month, has the most birthdays, and |
the days of the year with most birthdays – 7 – all fall in this month: seven players were born on each of February 5, | the days of the year with most birthdays – 7 – all fall in this month: seven players were born on each of February 5, |
13 and 14. | 13 and 14. |
One explanation for the skewed spread of birthdays is that | One explanation for the skewed spread of birthdays is that |
sportsmen are more likely to be born just after the school cut-off date, since | sportsmen are more likely to be born just after the school cut-off date, since |
they will be the biggest children in their school years and dominate sports | they will be the biggest children in their school years and dominate sports |
lessons. | lessons. |
If this argument is correct, then it would appear that the school | If this argument is correct, then it would appear that the school |
cut-off date in most of the countries at the World Cup is 1 January. | cut-off date in most of the countries at the World Cup is 1 January. |
(Although England provides a counterexample. The cut-off | (Although England provides a counterexample. The cut-off |
date is 1 September, and the most popular months for England team birthdays are May, August | date is 1 September, and the most popular months for England team birthdays are May, August |
and December.) | and December.) |
Whatever the reasons for the distribution of birthdays, the | Whatever the reasons for the distribution of birthdays, the |
fact that it is not uniform means that there is less randomness in when a | fact that it is not uniform means that there is less randomness in when a |
birthday may fall and hence the chance of a shared birthday goes up. | birthday may fall and hence the chance of a shared birthday goes up. |
Update, 12 June 2014: Now this is embarrassing. After being alerted by a friend, I have discovered that my source data was wrong. I got all the dates from the squad lists on Wikipedia, which it appears have errors compared with Fifa's official list. (Fifa hadn't made their lists available at that time.) If we go by Fifa's list, then Spain, Chile and Algeria do not have shared birthdays, meaning that 16 teams have shared birthdays and 16 teams don't. While it is disappointing to have made a mistake, the maths turns out much nicer this way: 16 teams is the closest we could get to the predicted 50.7%. And my point about the uneven distribution still stands. January to May all have above-average numbers of birthdays overall, and October, November and December have the fewest. The lesson of the day is that we must always treat Wikipedia entries with a dose of scepticism. |
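
For readers who would like to check the arithmetic, here is a minimal Python sketch of the calculations the article leans on. It is not the author's own working. The function names are invented for illustration, and the skewed calculation rests on two assumptions that are not in the article: a 365-day year, and birthdays spread evenly within each month. The monthly totals are the ones quoted in the original version of the article, which the update says came from Wikipedia's squad lists.

```python
from math import comb, factorial

# Monthly birthday totals quoted in the article (January to December).
MONTHLY_BIRTHDAYS = [72, 79, 64, 63, 73, 61, 54, 57, 65, 52, 46, 47]
# Assumption: a non-leap year.
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def p_shared_uniform(n, days=365):
    """Chance that at least two of n people share a birthday,
    when every day of the year is equally likely."""
    p_distinct = 1.0
    for k in range(n):
        p_distinct *= (days - k) / days
    return 1.0 - p_distinct


def p_shared(day_probs, n):
    """Same chance, but for an arbitrary probability per day.

    P(all distinct) = n! * e_n(p_1, ..., p_365), where e_n is the
    elementary symmetric polynomial, built up one day at a time.
    """
    e = [1.0] + [0.0] * n
    for p in day_probs:
        for k in range(n, 0, -1):
            e[k] += p * e[k - 1]
    return 1.0 - factorial(n) * e[n]


def p_at_least(k, trials, p):
    """P(X >= k) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, i) * p**i * (1 - p) ** (trials - i)
        for i in range(k, trials + 1)
    )


def skewed_day_probs():
    """Per-day probabilities from the monthly totals, assuming
    birthdays are spread evenly within each month."""
    total = sum(MONTHLY_BIRTHDAYS)
    return [
        count / total / days
        for count, days in zip(MONTHLY_BIRTHDAYS, DAYS_IN_MONTH)
        for _ in range(days)
    ]


if __name__ == "__main__":
    p23 = p_shared_uniform(23)
    print(f"Shared birthday among 23 (uniform): {p23:.3f}")
    print(f"Expected squads out of 32: {32 * p23:.1f}")
    print(f"Chance of 19 or more squads by luck: {p_at_least(19, 32, p23):.3f}")
    print(f"Shared birthday among 23 (skewed): {p_shared(skewed_day_probs(), 23):.3f}")
```

The last two printed figures speak to the question of luck versus skew: the first is the chance of seeing 19 or more squads with a shared birthday if birthdays were uniform, and the second shows how far the monthly skew alone moves the 50.7% figure under the assumptions above.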