The Lady Tasting Tea by David Salsburg
This is a super niche book for statistics nerds, all the more so because it’s not even technical. Rather, it’s about the people behind all the theorems and proofs – the Pythagorases of 20th-century statistics.
1) Karl Pearson’s key contribution is the idea that math doesn’t have to be deterministic. Instead of one true “thing,” there exists a distribution defined by parameters.
AP Stats is lowkey the most important class of the 21st century. That said, AP Stats and college stats are not at all the same thing.
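To make the “distribution defined by parameters” idea concrete, here is a minimal sketch in Python. It assumes Pearson’s four descriptive parameters (mean, standard deviation, skewness, kurtosis) and uses simulated measurements; the data, the `four_moments` helper, and the numbers 170 and 8 are all made up for illustration.

```python
import random
import statistics


def four_moments(xs):
    """Return (mean, standard deviation, skewness, kurtosis) of a sample."""
    n = len(xs)
    mean = statistics.fmean(xs)
    sd = statistics.pstdev(xs, mu=mean)
    # Standardized third and fourth central moments.
    skew = sum((x - mean) ** 3 for x in xs) / n / sd ** 3
    kurt = sum((x - mean) ** 4 for x in xs) / n / sd ** 4
    return mean, sd, skew, kurt


# Hypothetical measurements: no single "true" value, just scatter
# around 170 with a spread of 8.
rng = random.Random(0)
sample = [rng.gauss(170.0, 8.0) for _ in range(10_000)]

mean, sd, skew, kurt = four_moments(sample)
print(f"mean={mean:.2f} sd={sd:.2f} skew={skew:.2f} kurtosis={kurt:.2f}")
```

No individual measurement is “the” answer; the four numbers summarizing the scatter are the thing being estimated. For a normal distribution the skewness should land near 0 and the kurtosis near 3.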
2) Gosset, publishing under the pen name of Student, worked at Guinness.
Good quality beer calls for good statistical distributions.
3) Ronald Fisher, who would surpass Pearson in his contributions to the field of statistics, had severely impaired vision, which pushed him to develop strong geometric reasoning.
One day, I hope we have VR education. I’d be able to ask any question I want whenever I want and step into a 3D interactive explanation. Knowledge sharing is just too inefficient.
4) Fisher was a proponent of eugenics and later on would dismiss claims that smoking caused cancer.
The author painted an overall positive portrait of Fisher, but I’m sure he would not have been fun to work with.
5) LD-50 is defined as the dose required to kill 50% of the population.
LD-50 served as the parameter in Bliss’s probit model.
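Here is a rough sketch of how an LD-50 falls out of a probit-style fit. It assumes the probability of death follows a normal CDF in log-dose, and it fits a straight line to probit-transformed kill fractions by ordinary least squares, which is a simplification rather than Bliss’s actual procedure or a maximum-likelihood fit. The doses and kill fractions are invented for illustration.

```python
import math
from statistics import NormalDist

# Hypothetical experiment: dose given and fraction of subjects killed.
doses = [1.0, 2.0, 4.0, 8.0, 16.0]
killed_fraction = [0.1, 0.3, 0.5, 0.7, 0.9]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return my - slope * mx, slope


std_normal = NormalDist()
log_doses = [math.log10(d) for d in doses]
# The probit transform maps a proportion to the matching standard normal quantile.
probits = [std_normal.inv_cdf(p) for p in killed_fraction]

intercept, slope = fit_line(log_doses, probits)

# A 50% kill rate corresponds to probit 0, so solve intercept + slope * log10(d) = 0.
ld50 = 10 ** (-intercept / slope)
print(f"estimated LD-50: {ld50:.2f}")
```

The point is that LD-50 is not observed directly. It is a parameter of the fitted dose-response curve, recovered by asking where that curve crosses 50%.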
6) Fisher did not believe that a failure to find significance meant the hypothesis was true.
The meat of this book was the exploration of the competing interpretations of the significance test held by Fisher on one side and by Jerzy Neyman and Egon Pearson (Karl Pearson’s son) on the other. Fisher believed that a non-significant p-value simply meant we had to conduct another experiment.
7) The Neyman-Pearson formulation takes the frequentist definition of probability and says that a significance test must pit a null hypothesis against an alternative hypothesis.
This is the widely accepted interpretation of significance testing now. It’s interesting to consider how it’s not necessarily the “correct” interpretation. With all the p-hacking going on, it’s clear that there are problems with how practitioners take the methodology for granted.
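To see why p-hacking is a problem, here is a small simulation sketch. It assumes a two-sided z-test with known standard deviation, a 0.05 threshold, and 20 independent outcomes per study, all chosen purely for illustration. Every null hypothesis below is true, so any “significant” result is a false positive.

```python
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()


def z_test_p_value(sample, sigma=1.0):
    """Two-sided p-value for H0: mean = 0, with known standard deviation."""
    z = fmean(sample) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - STD_NORMAL.cdf(abs(z)))


def any_significant(rng, n_outcomes, n=30, alpha=0.05):
    """Test n_outcomes pure-noise outcomes; report whether any is 'significant'."""
    return any(
        z_test_p_value([rng.gauss(0.0, 1.0) for _ in range(n)]) < alpha
        for _ in range(n_outcomes)
    )


rng = random.Random(1)
trials = 2000

# One pre-specified outcome per study versus picking the best of 20.
honest = sum(any_significant(rng, 1) for _ in range(trials)) / trials
hacked = sum(any_significant(rng, 20) for _ in range(trials)) / trials

print(f"false positive rate, 1 outcome:   {honest:.3f}")
print(f"false positive rate, 20 outcomes: {hacked:.3f}")
```

With one pre-specified outcome, the false positive rate should sit near the nominal 5%. With 20 independent outcomes and freedom to report whichever one “worked,” theory puts it at 1 - 0.95^20, or roughly 64%.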
8) In the Serene Republic of Venice, the head of state, the doge, was elected by a randomly selected set of electors.
This story is hard to believe, but it’s good enough for me that the Doge of Venice is real.
9) Case-control, prospective cohort, and retrospective cohort are three types of observational study.
Observational studies are hard. Being able to run an experiment is a luxury.
10) “‘It seems to me that one of statistician’s jobs is to look at figures, to query why they look like they do…. I am being very simpleminded tonight, but I think it is our job to suggest that figures are interesting – and, if the person to whom we say this looks bored, then we have either put it across badly or the figures are not interesting. I suggest that my statistics in the Home Office are not boring.’” – Stella Cunliffe
I wholeheartedly agree. One picture is worth a thousand words. A table or chart should be worth at least that.
As a behind-the-scenes addendum to a supposedly dry subject, this book was very easy to read and digest. I don’t think I’ll remember many details of who discovered what and who didn’t like whom, but the core idea that statistics is a relatively young and ever-evolving field definitely resonated. Fast forwarding to the present, machine learning and large-scale online experimentation are very much the next step in the evolution of statistics.