Mastering ’Metrics by Joshua Angrist and Jörn-Steffen Pischke
I always tell myself I won’t read economics books, yet here I am again. That said, I consider econometrics to be its own field. What I don’t want to read anymore are the theoretical models that overfit past observations and stop holding water once the real world no longer fits the model (see: interest rates, economic growth, etc.). There’s nothing inherently bad about wrong models. All of science is basically a model that we keep updating as we learn more about how the world works. The problem with economic models is that they are ultimately about human behavior, and there is a weird feedback loop where humans behave a certain way because we think that we behave/should behave/have behaved/will behave a certain way. This agency is in contrast to other sciences, where our models are (we think) exogenous to the entity we want to predict or understand. To me, econometrics fits in this latter category despite having the “econ” root in its name. It is more about data science than economic theory, and a lot of the material in this book is eerily relevant to my work. Also eerie is how much of this I’ve forgotten. I’m pretty sure I learned everything in this book at some point in my life, so it’s great that there’s such a digestible book on econometrics. I wish there were more books like this on other topics that traditionally live in clunky textbooks. The kung fu master theme throughout was a bit much and seemed forced, but it gave the book a lighthearted tone that made it a much livelier read.
1) Robert Frost wrote “The Road Not Taken.”
I loved how the authors used this poem to illustrate that one observation is one observation and you can’t know what would have happened otherwise. Establishing apples-to-apples comparisons is the core challenge in measuring causal effects. It’s also a concept that I think most people should be more mindful of in daily life. For example, there are no real answers to questions like “Is Harvard better than Yale?” A Harvard student only experiences Harvard. Even if they transferred to Yale, they would still be biased by having gone to Harvard. Econometrics is about finding ways to answer these questions, whether by running experiments or by leveraging other randomness and assumptions.
2) Results from the RAND HIE (Health Insurance Experiment) and a subsequent Oregon study suggest that insurance coverage increases use of health services but does not improve health.
I’m pretty sure I worked on this Oregon study during my freshman UROP. It’s really crazy that I know some key players in the national healthcare insurance arena. Reminder: healthcare is completely broken in America.
3) Estimated standard error of the sample mean
Find you someone who can explain this correctly.
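For the record, the formula itself is short: the estimated standard error of the sample mean is the sample standard deviation divided by the square root of the sample size. Here is a minimal sketch in Python with made-up numbers. I use the n - 1 version of the standard deviation; some texts divide by n instead, and the two agree in large samples.

```python
import math
import statistics

def standard_error(sample):
    """Estimated standard error of the sample mean: S / sqrt(n)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)

# Made-up data, for illustration only.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
se = standard_error(sample)
print(f"mean = {statistics.mean(sample):.2f}, standard error = {se:.3f}")
```

The explaining part is harder: the standard error describes how much the sample mean would bounce around across repeated samples, not how spread out the individual data points are.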
4) The building block of regression analysis is finding pairs of people and comparing them.
Again, the key is to find data points that are truly comparable. Regression (and ML) Python functions are abstracted so far away from the core concept that anyone can run a regression and have no idea what it’s actually doing.
5) Omitted variable bias is the difference between short and long regression coefficients.
I had omitted OVB from my memory.
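To jog my own memory, here is a small simulation of the OVB formula: short coefficient minus long coefficient equals (the coefficient from regressing the omitted variable on the included one) times (the omitted variable's coefficient in the long regression). Everything here is a sketch under my own assumptions: the data-generating process, the variable names, and the helper functions are made up for illustration, and I get the long-regression coefficients by partialling out one regressor at a time rather than by calling a regression library.

```python
import random

def mean(values):
    return sum(values) / len(values)

def slope(x, y):
    """Bivariate OLS slope of y on x (with intercept): cov(x, y) / var(x)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

def residuals(dep, reg):
    """Residuals from regressing dep on reg with an intercept."""
    b = slope(reg, dep)
    a = mean(dep) - b * mean(reg)
    return [d - a - b * r for d, r in zip(dep, reg)]

# Hypothetical data: w is the omitted variable, x the regressor of interest.
random.seed(0)
n = 5000
w = [random.gauss(0, 1) for _ in range(n)]
x = [0.5 * wi + random.gauss(0, 1) for wi in w]
y = [1.0 * xi + 2.0 * wi + random.gauss(0, 1) for xi, wi in zip(x, w)]

beta_short = slope(x, y)                # short regression: y on x only
beta_long = slope(residuals(x, w), y)   # coefficient on x, controlling for w
gamma = slope(residuals(w, x), y)       # coefficient on w in the long regression
pi_wx = slope(x, w)                     # regress the omitted w on the included x

ovb = beta_short - beta_long
print(f"short = {beta_short:.3f}, long = {beta_long:.3f}")
print(f"OVB = {ovb:.3f}, pi * gamma = {pi_wx * gamma:.3f}")
```

The last two printed numbers match up to floating-point error, because the OVB formula is an algebraic identity in the sample and not just an approximation.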
6) Using ln(Y) is useful because then you can interpret coefficients as a percentage change in Y.
Via calculus magic, this approximation works when the percentage change is small.
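The magic is that ln(1 + r) is approximately r when r is small. The exact percentage change in Y implied by a coefficient b on ln(Y) is exp(b) - 1, so the shortcut of reading b as a percentage drifts as b grows. A quick sketch (the coefficient values are arbitrary examples I picked):

```python
import math

def exact_pct_change(beta):
    """Exact percentage change in Y when ln(Y) changes by beta."""
    return (math.exp(beta) - 1) * 100

# Compare the "read it as a percent" shortcut with the exact value.
for beta in (0.01, 0.05, 0.10, 0.50):
    approx = beta * 100
    exact = exact_pct_change(beta)
    print(f"beta = {beta:.2f}: shortcut = {approx:.1f}%, exact = {exact:.2f}%")
```

For coefficients around 0.05 the gap is a rounding error; by 0.5 the shortcut understates the change by roughly 15 percentage points.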
7) KIPP (Knowledge is Power Program) is the largest network of charter schools in America.
I’ve never really paid much attention to the charter/public school debate, so this is the first time I’ve heard of KIPP.
8) Instrumental variable analysis requires 3 main assumptions.
First stage: the instrument affects the causal channel of interest (the treatment). Independence assumption: the instrument is as good as randomly assigned. Exclusion restriction: a single causal channel connects the instrument with the outcome.
9) If there’s no effect in the reduced form, then the IV estimate is also 0.
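This is because the IV estimate is (A on C) / (A on B), where A is the instrument, B is the treatment, and C is the outcome. The numerator is the reduced form and the denominator is the first stage. If (A on C) is 0, then the effect is 0.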
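Here is a rough simulation of that ratio, with A as a coin-flip offer (the instrument), B as whether someone takes the treatment, and C as the outcome. The setup is entirely hypothetical: the coefficients, the confounder, and the function names are my own choices, and the treatment effect is the same for everyone, which is a simplification.

```python
import random

def mean(values):
    return sum(values) / len(values)

def diff_in_means(outcome, group):
    """Mean outcome where group == 1 minus mean outcome where group == 0."""
    ones = [o for o, g in zip(outcome, group) if g == 1]
    zeros = [o for o, g in zip(outcome, group) if g == 0]
    return mean(ones) - mean(zeros)

def simulate(true_effect, n=20000, seed=1):
    """Hypothetical data: random offer A, treatment take-up B, outcome C."""
    rng = random.Random(seed)
    a, b, c = [], [], []
    for _ in range(n):
        offer = rng.randint(0, 1)   # instrument: coin-flip offer
        u = rng.gauss(0, 1)         # unobserved confounder
        # The offer raises take-up (first stage); so does the confounder.
        treated = 1 if 0.8 * offer + 0.5 * u + rng.gauss(0, 1) > 0.5 else 0
        # The offer reaches the outcome only through treatment (exclusion).
        outcome = true_effect * treated + u + rng.gauss(0, 1)
        a.append(offer)
        b.append(treated)
        c.append(outcome)
    return a, b, c

def wald_iv(a, b, c):
    reduced_form = diff_in_means(c, a)   # A on C
    first_stage = diff_in_means(b, a)    # A on B
    return reduced_form / first_stage

a, b, c = simulate(true_effect=2.0)
iv_estimate = wald_iv(a, b, c)           # should land near the true effect
naive_estimate = diff_in_means(c, b)     # biased upward by the confounder

a0, b0, c0 = simulate(true_effect=0.0)
iv_null = wald_iv(a0, b0, c0)            # no reduced form, so close to zero

print(f"IV = {iv_estimate:.2f}, naive = {naive_estimate:.2f}, IV (no effect) = {iv_null:.2f}")
```

The naive treated-versus-untreated comparison overshoots because the confounder drives both take-up and the outcome, while the ratio recovers something close to the true effect. When the true effect is zero, the numerator is just noise around zero and so is the IV estimate.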
10) Errors in the measurement or reporting of independent variables lead to attenuation bias.
Errors bias the regression estimates toward 0.
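A quick way to convince yourself: simulate a clean regressor, add noise to it, and compare slopes. This is a sketch with made-up numbers, assuming classical measurement error (noise that is independent of everything else). With equal variance in the true regressor and in the noise, the slope should shrink by about half.

```python
import random

def mean(values):
    return sum(values) / len(values)

def slope(x, y):
    """Bivariate OLS slope of y on x (with intercept): cov(x, y) / var(x)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

rng = random.Random(42)
n = 20000
true_beta = 2.0

x_true = [rng.gauss(0, 1) for _ in range(n)]
y = [true_beta * xi + rng.gauss(0, 1) for xi in x_true]
# Mismeasured regressor: the truth plus independent noise of the same variance.
x_noisy = [xi + rng.gauss(0, 1) for xi in x_true]

beta_clean = slope(x_true, y)    # close to true_beta
beta_noisy = slope(x_noisy, y)   # pulled toward zero

# Expected shrinkage factor: var(x) / (var(x) + var(noise)) = 1 / (1 + 1)
expected_noisy = true_beta * 0.5
print(f"clean = {beta_clean:.2f}, noisy = {beta_noisy:.2f}, expected = {expected_noisy:.2f}")
```

Note that this only applies to noise in the regressor. Independent noise in the outcome makes estimates less precise but does not bias the slope.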
These ten points don’t do the book justice, since the book really shines in its long explanations of econometric methodologies. It walks through examples and math in a clear way, without being overly technical or repetitive. Reading this book was like going to school again but having the professor explain every little detail so that you really get it. It was extra interesting because I could think of examples from work to match each topic. The standard at work is to run experiments, but the real challenge is usually in teasing out effects when there is no experiment or when the experiment is wrong. For example, if I only see an effect on iOS, is that because of a specific OS version? This type of question maps to instrumental variable analysis. Even though I’m not calculating estimates with standard errors, the goal is similar: find an apples-to-apples comparison and determine whether there is a real difference.