Girl Named Florida solutions
In The Drunkard's Walk, Leonard Mlodinow presents "The Girl Named Florida Problem":
"In a family with two children, what are the chances, if one of the children is a girl named Florida, that both children are girls?"
I like this problem, and I use it on the first day of my class to introduce the topic of conditional probability. But I've decided that it's too easy. To give it a little more punch, I've decided to combine it with the Red-Haired Problem from last week:
In a family with two children, what are the chances, if at least one of the children is a girl with red hair, that both children are girls?
Just like last week, you can make some simplifying assumptions:
About 2% of the world population has red hair. You can assume that the alleles for red hair are purely recessive. Also, you can assume that the Red Hair Extinction theory is false, so you can apply the Hardy–Weinberg principle. And you can ignore the effect of identical twins.
Before I present my solution, I want to sneak up on it with a series of warm-up problems.
1) P[GG | two children]: if a family has two children, what is the chance that they have two girls?
2) P[GG | two children, at least one girl]: if we know they have at least one girl, what is the chance that they have two girls?
3) P[GG | two children, older child is a girl]: if the older child is a girl, what is the chance that they have two girls?
4) P[GG | two children, at least one is a girl named Florida].
5) P[GG | two children, at least one is a girl with red hair, and the parents have brown hair].
6) P[GG | two children, at least one is a girl with red hair].
Problem 1: P[GG | two children]
If we assume that the probability that each child is a girl is 50%, then P[GG | two children] = 1/4.
Problem 2: P[GG | two children, at least one girl]
There are four equally likely kinds of two-child families: BB, BG, GB and GG. We know that BB is out, and GG is one of the three that remain, so the conditional probability is
P[GG | at least one girl] = P[GG and at least one girl] / P[at least one girl] = (1/4) / (3/4) = 1/3.
Problem 3: P[GG | two children, older child is a girl]
Now there are only two possible families, GB and GG, so the conditional probability is 1/2. Informally we can argue that once we know about the older child we can treat the younger child as independent. But if there's one thing we learn from this problem, it's that our intuition for independence is not reliable.
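Since intuition is unreliable here, the first three warm-ups can be checked by brute-force counting. This is a minimal sketch, assuming (as in Problem 1) that boys and girls are equally likely; the helper name prob_two_girls is made up for illustration.

```python
from fractions import Fraction
from itertools import product

# All two-child families as (older, younger); each is equally likely
# under the 50% assumption from Problem 1.
families = list(product("BG", repeat=2))

def prob_two_girls(condition):
    """P[GG | condition], by counting equally likely families."""
    kept = [f for f in families if condition(f)]
    both_girls = [f for f in kept if f == ("G", "G")]
    return Fraction(len(both_girls), len(kept))

p1 = prob_two_girls(lambda f: True)          # Problem 1: no extra information
p2 = prob_two_girls(lambda f: "G" in f)      # Problem 2: at least one girl
p3 = prob_two_girls(lambda f: f[0] == "G")   # Problem 3: older child is a girl

print(p1, p2, p3)  # prints 1/4 1/3 1/2
```

The only thing that changes between the three problems is the condition used to filter the families, which is the whole point of conditional probability.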
Problem 4: P[GG | two children, at least one girl named Florida]
Here's the one that makes people's heads hurt. For each child, there are three possibilities: boy, girl not named Florida, and girl named Florida, with these probabilities:
B: 1/2
G: 1/2 - x
GF: x
where x is the unknown fraction of children who are girls named Florida. Among families with at least one girl named Florida, there are these possible combinations, with these probabilities:
B GF: (1/2) x
GF B: (1/2) x
G GF: x (1/2 - x)
GF G: x (1/2 - x)
GF GF: x^2
The last three cases (G GF, GF G, and GF GF) have two girls, so the probability we want is the sum of those three cases over the sum of all five. With a little algebra, we get:
P(GG | at least one girl named Florida) = (1 - x) / (2 - x)
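For anyone who wants the algebra spelled out: the numerator is the sum of the three two-girl cases (G GF, GF G, GF GF), the denominator is the sum of all five cases, and a factor of x cancels.

```latex
\begin{aligned}
P(GG \mid \text{at least one girl named Florida})
  &= \frac{2x\left(\tfrac{1}{2} - x\right) + x^2}
          {2 \cdot \tfrac{1}{2}x + 2x\left(\tfrac{1}{2} - x\right) + x^2} \\
  &= \frac{x - x^2}{2x - x^2} \\
  &= \frac{1 - x}{2 - x}
\end{aligned}
```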
Assuming that Florida is not a common name, x approaches 0 and the answer approaches 1/2. So it turns out, surprisingly, that the name of the girl is relevant information.
As x approaches 1/2, the answer converges on 1/3. For example, if we know that at least one child is a girl with two X chromosomes, x is close to 1/2 and the problem reduces to Problem 2.
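Both limits are easy to check with exact arithmetic. The sketch below rebuilds the probability from the five family types rather than from the closed form, so the two can be compared; the function name p_two_girls is just an illustrative choice.

```python
from fractions import Fraction

def p_two_girls(x):
    """P(GG | at least one girl with a property of per-child probability x).

    Computed from the five family types, not from the closed form.
    """
    half = Fraction(1, 2)
    boy_and_gf = 2 * half * x        # B GF and GF B
    girl_and_gf = 2 * x * (half - x) # G GF and GF G
    two_gf = x * x                   # GF GF
    return (girl_and_gf + two_gf) / (boy_and_gf + girl_and_gf + two_gf)

for x in [Fraction(1, 1000), Fraction(1, 8), Fraction(1, 2)]:
    assert p_two_girls(x) == (1 - x) / (2 - x)  # agrees with the closed form
    print(x, p_two_girls(x), float(p_two_girls(x)))
```

With x = 1/1000 the result is just under 1/2, and with x = 1/2 it is exactly 1/3, which is the answer to Problem 2.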
If this problem is still making your head hurt, this figure might help:

Here B is a boy, Gx is a girl with some property X, and G is a girl who doesn't have that property. If we select all families with at least one Gx, we get the five blue squares (light and dark). The families with two girls are the three dark blue squares.
If property X is common, the ratio of dark blue to all blue approaches 1/3. If X is rare, the same ratio approaches 1/2.
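If you trust simulation more than algebra, here is a rough Monte Carlo sketch of the same selection: generate many two-child families, keep the ones with at least one Gx, and count how many of those have two girls. The function name, sample size, and seed are arbitrary choices of mine, and the estimates are only approximate.

```python
import random

def simulate(x, n_families=200_000, seed=1):
    """Estimate P(two girls | at least one Gx) by simulation.

    x is the per-child probability of being a girl with property X,
    so it must lie between 0 and 1/2.
    """
    rng = random.Random(seed)

    def child():
        u = rng.random()
        if u < 0.5:
            return "B"
        return "Gx" if u < 0.5 + x else "G"

    selected = two_girls = 0
    for _ in range(n_families):
        family = (child(), child())
        if "Gx" in family:                   # the blue squares
            selected += 1
            two_girls += "B" not in family   # the dark blue squares
    return two_girls / selected

# Compare a common property (x near 1/2) with a rare one (x near 0).
for x in (0.45, 0.05):
    print(x, round(simulate(x), 3), round((1 - x) / (2 - x), 3))
```

The second column is the simulated ratio of dark blue to all blue, and the third is the value from the formula; they should agree to within sampling noise.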
Problem 5: P[GG | two children, at least one girl with red hair, parents have brown hair]
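If the parents have brown hair and one of their children has red hair, we know that both parents are heterozygous (Aa). Each child then has a 1/4 chance of inheriting two red-hair alleles and a 1/2 chance of being a girl, so the chance that any given child is a red-haired girl is 1/8.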
Using the girl-named-Florida formula, we get
P[GG | two children, at least one girl with red hair, parents have brown hair] = (1 - 1/8) / (2 - 1/8) = 7/15.
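The 1/8 can be checked by enumerating the crosses between the two parents. This is a small sketch under the assumptions stated earlier (red hair purely recessive, each allele passed on with probability 1/2, sex independent of hair color); the function name and the two-letter genotype strings are my own notation.

```python
from fractions import Fraction
from itertools import product

def p_red_haired_girl(parent1, parent2):
    """Chance that one child is a red-haired girl, given parental genotypes.

    Genotypes are two-letter strings such as "Aa"; red hair requires
    inheriting "a" from both parents.
    """
    crosses = list(product(parent1, parent2))  # one allele from each parent
    red = sum(1 for a, b in crosses if a == "a" and b == "a")
    return Fraction(red, len(crosses)) * Fraction(1, 2)  # times P(girl)

x = p_red_haired_girl("Aa", "Aa")
print(x, (1 - x) / (2 - x))  # prints 1/8 7/15
```

The same function gives 1/4 for an Aa parent with an aa parent and 1/2 for two aa parents, which are the values needed in Problem 6.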
And finally:
Problem 6: P[GG | two children, at least one girl with red hair]
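In this case we don't know the genotype of the parents. Because a red-haired child inherits a red-hair allele (a) from each parent, neither parent can be AA, which leaves three possibilities: Aa Aa, Aa aa, and aa aa.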
We follow these steps:
1) Use the prevalence of red hair to compute the prior probabilities of each parental genotype.
2) Use the evidence (at least one girl with red hair) to compute the posteriors.
3) For each combination, compute the conditional probability. We have already computed
P(GG | two children, at least one with red hair, Aa Aa) = 7/15
The others are
P(GG | two children, at least one with red hair, Aa aa) = 3/7
P(GG | two children, at least one with red hair, aa aa) = 1/3
4) Apply the law of total probability to get the answer.
I'm too lazy to do the algebra, so I got Mathematica to do it for me. Here is the notebook with the answer.
In general, if p is the frequency of the red-hair allele in the population,
P(GG | two children, at least one with red hair) = (p^2 + 2p - 7) / (p^2 + 2p - 15)
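Here is one way to reconstruct the algebra by hand. It is a sketch of the four steps, not necessarily what the notebook does, and it adds the assumption that parents are paired at random. Under Hardy–Weinberg each parent is Aa with probability 2p(1-p) and aa with probability p^2. If x is the chance that a given child is a red-haired girl, then from Problem 4 the chance of at least one red-haired girl is x(2-x), and the chance of two girls with at least one red-haired is x(1-x).

```latex
\begin{aligned}
\text{Aa Aa:}\quad & \text{prior } 4p^2(1-p)^2, & x &= \tfrac{1}{8}, & x(2-x) &= \tfrac{15}{64}, & x(1-x) &= \tfrac{7}{64} \\
\text{Aa aa:}\quad & \text{prior } 4p^3(1-p),   & x &= \tfrac{1}{4}, & x(2-x) &= \tfrac{7}{16},  & x(1-x) &= \tfrac{3}{16} \\
\text{aa aa:}\quad & \text{prior } p^4,         & x &= \tfrac{1}{2}, & x(2-x) &= \tfrac{3}{4},   & x(1-x) &= \tfrac{1}{4}
\end{aligned}

\begin{aligned}
P(GG \mid \text{at least one girl with red hair})
  &= \frac{4p^2(1-p)^2 \cdot \tfrac{7}{64} + 4p^3(1-p) \cdot \tfrac{3}{16} + p^4 \cdot \tfrac{1}{4}}
          {4p^2(1-p)^2 \cdot \tfrac{15}{64} + 4p^3(1-p) \cdot \tfrac{7}{16} + p^4 \cdot \tfrac{3}{4}} \\
  &= \frac{7(1-p)^2 + 12p(1-p) + 4p^2}{15(1-p)^2 + 28p(1-p) + 12p^2} \\
  &= \frac{7 - 2p - p^2}{15 - 2p - p^2}
   = \frac{p^2 + 2p - 7}{p^2 + 2p - 15}
\end{aligned}
```

The second line comes from multiplying the numerator and denominator by 16/p^2. The priors do not sum to 1 because pairs with an AA parent are left out, but those pairs cannot produce a red-haired child, so they drop out of both sums.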
If the prevalence of red hair is 0.02, then p^2 = 0.02 under Hardy–Weinberg, so p = sqrt(0.02) = 0.141, and
P(GG | two children, at least one with red hair) = 45.6%
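The four steps can also be carried out numerically instead of symbolically. The sketch below is my own Python version, not the Mathematica notebook; it assumes Hardy–Weinberg genotype frequencies and randomly paired parents, and the function and variable names are illustrative.

```python
from math import sqrt, isclose

def p_two_girls_red_hair(prevalence):
    """P(GG | two children, at least one girl with red hair)."""
    p = sqrt(prevalence)              # allele frequency, since P(aa) = p**2
    aa, Aa = p**2, 2 * p * (1 - p)    # Hardy-Weinberg genotype frequencies

    # Step 1: parental pair -> (prior, chance a given child is a red-haired girl).
    # Pairs with an AA parent are omitted: they cannot have a red-haired child.
    pairs = {
        "Aa Aa": (Aa * Aa, 1 / 8),
        "Aa aa": (2 * Aa * aa, 1 / 4),  # either parent can be the aa one
        "aa aa": (aa * aa, 1 / 2),
    }

    # Step 2: prior times likelihood of the evidence, then normalize.
    unnorm = {k: prior * x * (2 - x) for k, (prior, x) in pairs.items()}
    total = sum(unnorm.values())
    posterior = {k: v / total for k, v in unnorm.items()}

    # Steps 3 and 4: conditional probability per pair, then total probability.
    return sum(posterior[k] * (1 - x) / (2 - x) for k, (_, x) in pairs.items())

p = sqrt(0.02)
answer = p_two_girls_red_hair(0.02)
closed_form = (p**2 + 2 * p - 7) / (p**2 + 2 * p - 15)
print(round(answer, 3), isclose(answer, closed_form))  # prints 0.456 True
```

The numerical result agrees with the closed-form expression, which is at least some evidence that the formula is right.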
Congratulations to Professor Ted Bunn at the University of Richmond, the only person who submitted a correct answer before the deadline!
At least, I think it's the right answer. Maybe we both made the same mistake.
For more fun with probability, see Chapter 5 of my book, Think Stats, which you can read here, or buy here.