BQ is unfair to women

Mar 02, 2011

Last week I wrote about the changes the BAA is making in the qualifying times for the Boston Marathon. Based on a sample of qualifiers from the Chicago Marathon, I predicted that the proportion of women in the open division (ages 18-34) will drop in the next two years.

This raises an obvious question: are the new standards fair? BAA executive director Tom Grilk explained, “Looking back at the data that we have... we found that the fairest way to deal with this is to have a uniform reduction in qualifying standards across the board.”

But he didn’t explain what he means by “fair.” There are several possibilities.

1) E-fairness (E for elite): By this definition, a standard is fair if the gender gap for qualifiers is the same as for elite runners. I discussed this standard in the previous post, and showed two problems: (a) elite women are farther from the pack than elite men, so qualifying times would be determined by a small number of outliers; and as a result, (b) this standard would disqualify 47% of the women in the open division.

A variation of E-fairness uses the relative difference in speeds rather than the absolute difference in times. This option reduces the impact, but doesn’t address what I think is the basic problem: it doesn’t make sense to base qualifying times on the performance of elite runners.

2) R-fairness (R for representative): By this definition, a standard is fair if the qualifiers are a representative sample of the population of marathoners. I don’t have good data to evaluate R-fairness for the open division, but for the field as a whole the current standard is close to R-fair: according to Running USA, 41% of marathoners are female, and in 2010, 42% of Boston Marathon finishers were female.

[Note: this article in the Wall Street Journal claims that 42% is "higher than the percentage of all U.S. marathoners who are women," but I don't know what they are basing that on.]

A problem with R-fairness is that the population of marathoners includes some people who are competitive racers and others who are... not competitive racers. I don’t think it makes sense for the middle and the back of the pack to affect the standard.

3) C-fairness (C for contenders): Qualifying times should be determined by the most relevant population, runners who finish close to the standard. Specifically, I define a “contender” as someone who finishes within X minutes of the standard, where X is something like 20 minutes (we’ll look at some different values for X and see that it doesn’t matter very much).

And here’s what I propose: a standard is C-fair if the percentage of contenders who qualify is the same in each group. As an example, I'll compute a fair standard for men and women in the open division. Here’s how:

1) Like last week, I use data from the last three Chicago marathons as a sample of the population of marathoners. [Note: If this sample is not representative, that will affect my results, so I would like to get a more comprehensive dataset. I contacted marathonguide.com, but have not heard back.]

2) For the current male standard, 3:10, I select runners who finish within X minutes of the standard, and compute the percentage of these contenders who qualify.

3) Then I search for the female standard that yields the same percentage of qualifiers.
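The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the figure: the function names `qualify_rate` and `c_fair_gap` are hypothetical, finish times are assumed to be in minutes, "within X minutes" is assumed to mean on either side of the standard, and the search simply tries whole-minute gaps and keeps the one whose qualifying rate is closest to the men's.

```python
def qualify_rate(times, standard, x):
    """Fraction of contenders who meet the standard.

    A contender is anyone who finishes within x minutes of the
    standard, faster or slower (an assumption about the definition).
    All times are in minutes.
    """
    contenders = [t for t in times if abs(t - standard) <= x]
    if not contenders:
        return float("nan")  # no contenders, so the rate is undefined
    qualifiers = [t for t in contenders if t <= standard]
    return len(qualifiers) / len(contenders)


def c_fair_gap(male_times, female_times, male_standard, x=20, max_gap=60):
    """Find the whole-minute gender gap closest to C-fair.

    Tries each gap from 0 to max_gap and returns the one where the
    female qualifying rate is nearest the male rate. Returns None if
    no gap yields a defined rate.
    """
    target = qualify_rate(male_times, male_standard, x)
    best_gap, best_diff = None, float("inf")
    for gap in range(max_gap + 1):
        rate = qualify_rate(female_times, male_standard + gap, x)
        diff = abs(rate - target)
        # Comparisons with NaN are False, so undefined rates are skipped.
        if diff < best_diff:
            best_gap, best_diff = gap, diff
    return best_gap
```

With real data, `male_times` and `female_times` would be the open-division finish times from the sample, and `male_standard` would be 190 (3:10 in minutes). Ties go to the smallest gap, which matters only for very small samples.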

This figure shows the results:

The x-axis is the gender gap: the difference in minutes between male and female qualifying times. The y-axis is the difference in the percentage of contenders who qualify. The lines show results for values of X from 20 to 40 minutes. For smaller X, the results are noisier.

Where the lines cross through 0 is the gap that is C-fair. By that definition, the gap should be about 38 minutes. So if the male standard is 3:10, the female standard should be 3:48.

In 2013 the male standard will be 3:05. In that case, based on the same analysis, the gap should be 34 minutes, so the female standard should be 3:39.
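To check the arithmetic, here is a small helper that adds a gap in minutes to a standard written as h:mm. The name `add_minutes` is just for illustration.

```python
def add_minutes(hmm, gap):
    """Add gap minutes to a time given as an 'h:mm' string."""
    hours, minutes = map(int, hmm.split(":"))
    total = hours * 60 + minutes + gap
    return f"{total // 60}:{total % 60:02d}"


print(add_minutes("3:10", 38))  # 3:48
print(add_minutes("3:05", 34))  # 3:39
```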

C-fairness also has the property of “equal marginal impact,” which means that if we tighten the standard by 1 minute, we disqualify the same percentage of runners in all groups, which leaves the demographics of the field unchanged. Last week we saw that the current standard does not have this property -- tightening the qualifying times has a disproportionate impact on women.

In summary:

1) I think qualifying times should be based on the population of contenders -- runners near the standard -- not on the elites or the back of the pack.

2) A standard is fair if it qualifies the same proportion of contenders from each group.

3) By that definition, the gender gap in the open division should be 38 minutes in 2012 and 34 minutes in 2013.

4) The common belief that the standard for women is too easy is mistaken; by the definition of fair that I think is most appropriate, the standard for women is relatively hard.

-----

If you find this sort of thing interesting, you might like my free statistics textbook, Think Stats. You can download it or read it at thinkstats.com.

Probably Overthinking It
