ScPoEconometrics: Advanced
Instrumental Variables - Applications
Bluebery Planterose
SciencesPo Paris 
 2023-03-07
Status

What Did we Do Last Time?

We learned about John Snow's grand experiment in 1850s London.
We used his story to motivate the IV estimator.
You took a quiz about some IV aspects.

Today

We'll look at further IV applications.
We introduce an extension called Two Stage Least Squares.
We will use R to compute the estimates.
Finally we'll talk about weak instruments.

Back to school!

Returns To Schooling

What's the causal impact of schooling on earnings?
Jacob Mincer was interested in this important question.
Here's his model:

$\log Y_{i} = α + ρ S_{i} + β_{1} X_{i} + β_{2} X_{i}^{2} + e_{i}$

He found an estimate for $ρ$ of about 0.11,
i.e. roughly an 11% earnings advantage for each additional year of education.
Look at the DAG. Is that a good model? Well, why would it not be?
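
A quick worked conversion, as a sketch: reading $ρ$ directly as a percentage relies on the usual approximation for small changes in logs. Holding $X_{i}$ fixed, one more year of schooling multiplies earnings by $e^{ρ}$ , so with the estimate of 0.11 quoted above:

$\frac{Y(S + 1)}{Y(S)} = e^{ρ} = e^{0.11} \approx 1.116$

The exact implied advantage is therefore about 11.6%, close to the 11% obtained by reading the coefficient directly.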

Ability Bias

We compare earnings of men with certain schooling and work experience
Is all else equal, after controlling for those?
Given $X$ ,
- Can we find differently diligent workers out there?
- Can we find differently able workers?
- Do family connections of workers vary?

Yes, of course. So, all else is not equal at all.
That's an issue, because for OLS consistency we require the orthogonality assumption $E [e_{i} | S_{i}, X_{i}] = 0$ , and that fails when such factors sit in the error term and are related to schooling.
Let's introduce ability $A_{i}$ explicitly.

Mincer with Unobserved Ability

In fact we have two unobservables: $e$ and $A$ .
Of course we can't tell them apart.
So we define a new unobservable factor $u_{i} = e_{i} + A_{i}$

In terms of an equation: $\log Y_{i} = α + ρ S_{i} + β_{1} X_{i} + β_{2} X_{i}^{2} + \underbrace{u_{i}}_{A_{i} + e_{i}}$
Sometimes, this does not matter, and the OLS bias is small.
But sometimes it does, and we get it totally wrong! Example.
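
To see how large the bias can get, here is a minimal simulation sketch. Everything in it is made up for illustration: the true return `rho = 0.10`, the strength with which ability drives schooling, and the size of the ability term in earnings are assumptions, and experience is left out to keep it short.

```r
# Simulated illustration of ability bias (all parameter values are invented).
set.seed(42)
n   <- 10000
rho <- 0.10                               # assumed true return to schooling

A    <- rnorm(n)                          # unobserved ability
S    <- 12 + 2 * A + rnorm(n)             # abler people get more schooling
e    <- rnorm(n, sd = 0.5)                # idiosyncratic noise
logY <- 5 + rho * S + 0.3 * A + e         # ability also raises earnings directly

# Feasible regression: ability is hidden in the error term u
rho_short <- unname(coef(lm(logY ~ S))["S"])
# Infeasible benchmark: pretend we could observe and control for ability
rho_long  <- unname(coef(lm(logY ~ S + A))["S"])

round(c(true = rho, ols_without_A = rho_short, ols_with_A = rho_long), 3)
```

With these assumed numbers the regression that omits ability should land at roughly twice the true return, while the infeasible regression that controls for ability recovers it.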

Angrist and Krueger (1991): Birthdate is as good as Random

Angrist and Krueger (AK91) is an influential study addressing ability bias.
Idea:
1. Construct an IV that encodes the birth date of the student.
2. A child born just after the cutoff date will start school later!
Suppose all children who reach the age of 6 by the 31st of December 2021 are required to enroll in the first grade of school in September 2021.

If born in December 2015 (i.e. 6 years prior), a child will be about 5 and 3/4 years old by the time they start school.
If born on the 1st of January 2016, a child will be about 6 and 2/3 years old when they enter school in September 2022.
However, people can legally drop out of school on their 16th birthday!
So, among people who drop out, some got more schooling than others.
AK91 construct a quarter-of-birth dummy as IV: it affects schooling, but is not related to $A$ !
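
The age arithmetic behind this argument can be sketched in a few lines. The enrollment rule coded below is the hypothetical one from the text (start first grade in September of the calendar year in which you turn 6); the two birth dates, the helper name `age_at_entry`, and the assumption of uninterrupted schooling until the 16th birthday are illustrative choices.

```r
# Hypothetical rule from the text: a child starts first grade on 1 September
# of the calendar year in which they turn 6.
age_at_entry <- function(birth) {
  birth      <- as.Date(birth)
  entry_year <- as.integer(format(birth, "%Y")) + 6L
  entry      <- as.Date(paste0(entry_year, "-09-01"))
  as.numeric(entry - birth) / 365.25      # age in years at school entry
}

kids      <- as.Date(c("2015-12-15", "2016-01-01"))
entry_age <- age_at_entry(kids)
names(entry_age) <- c("december_born", "january_born")

# Years spent in school on the 16th birthday, assuming no interruptions
schooling_at_16 <- 16 - entry_age
round(rbind(entry_age, schooling_at_16), 2)
```

Two children born about two weeks apart end up with almost a full year's difference in schooling at the legal dropout age, which is the variation the instrument exploits.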

AK91 IV setup

Quarter-of-birth dummy $z$ : affects schooling, but is not related to $A$ !
In particular: whether someone was born in the 4th quarter or not.

AK91 Estimation: Two Stage Least Squares (2SLS)

AK91 allow us to introduce a widely used variation of our simple IV estimator: 2SLS

We estimate a first stage model which uses only exogenous variables (like $z$ ) to explain our endogenous regressor $s$ .
We then use the first stage model to predict values of $s$ , and use those predictions $\hat{s}$ in place of $s$ in what is called the second stage. This procedure is supposed to take out any impact of $A$ from the correlation we observe in our data between $s$ and $y$ .

$\begin{aligned} \text{1. Stage: } s_{i} & = α_{0} + α_{1} z_{i} + η_{i} \\ \text{2. Stage: } y_{i} & = β_{0} + β_{1} {\hat{s}}_{i} + u_{i} \end{aligned}$

Conditions:

Relevance of the IV: $α_{1} \neq 0$
Independence (IV assignment as good as random): $E [η | z] = 0$
Exogeneity (our exclusion restriction): $E [u | z] = 0$
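
The two stages can be sketched on simulated data, where we know the truth. This is an illustration under invented assumptions, not the AK91 data: the true coefficient on schooling is set to 0.10, `z` is a binary instrument that satisfies all three conditions by construction, and ability `A` is what makes OLS go wrong.

```r
# 2SLS by hand on simulated data (all parameter values are invented).
set.seed(123)
n <- 50000
A <- rnorm(n)                                   # unobserved ability
z <- rbinom(n, 1, 0.25)                         # binary instrument, independent of A
s <- 12 + 0.5 * z + 1.5 * A + rnorm(n)          # schooling: shifted by z and by A
y <- 5 + 0.10 * s + 0.3 * A + rnorm(n, sd = 0.5)

# 1. Stage: explain s with the exogenous instrument only
first_stage <- lm(s ~ z)
s_hat       <- fitted(first_stage)

# 2. Stage: replace s by its prediction from the first stage
second_stage <- lm(y ~ s_hat)

b_ols  <- unname(coef(lm(y ~ s))["s"])          # contaminated by A
b_2sls <- unname(coef(second_stage)["s_hat"])   # uses only variation coming from z
round(c(true = 0.10, ols = b_ols, tsls = b_2sls), 3)
```

Because `s_hat` varies only with `z`, and `z` is unrelated to `A` here, the second-stage slope should be close to the assumed 0.10, while OLS is pulled upward by ability.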

Let's do Angrist and Krueger (1991)!

Data on birth quarter and wages

Let's load the data and look at a quick summary

library(modelsummary)  # provides datasummary_skim()
data("ak91", package = "masteringmetrics")
# summary statistics for every variable, with inline histograms
datasummary_skim(data.frame(ak91), histogram = TRUE)

	Unique (#)	Mean	SD	Min	Median	Max
lnw	26732	5.9	0.7	−2.3	6.0	10.5
s	21	12.8	3.3	0.0	12.0	20.0
yob	10	1934.6	2.9	1930.0	1935.0	1939.0
qob	4	2.5	1.1	1.0	3.0	4.0
sob	51	30.7	14.2	1.0	34.0	56.0
age	40	45.0	2.9	40.2	45.0	50.0

AK91 Data Transformations

We want to create the q4 dummy, which is 1 if you are born in the 4th quarter and 0 otherwise.
We also create factor versions of quarter and year of birth.

library(dplyr)
ak91 <- mutate(ak91,
               qob_fct = factor(qob),
               q4 = as.integer(qob == 4),
               yob_fct = factor(yob))
# get mean wage and mean schooling by year/quarter of birth
ak91_age <- ak91 %>%
  group_by(qob, yob) %>%
  summarise(lnw = mean(lnw), s = mean(s), .groups = "drop") %>%
  mutate(q4 = (qob == 4))

AK91 Figure 1: First Stage!

Let's reproduce AK91's first figure now on education as a function of quarter of birth!

library(ggplot2)
ggplot(ak91_age, aes(x = yob + (qob - 1) / 4, y = s)) +
  geom_line() +
  geom_label(mapping = aes(label = qob, color = q4)) +
  guides(color = "none") +  # hide the legend for the q4 highlight
  scale_x_continuous("Year of birth", breaks = 1930:1940) +
  scale_y_continuous("Years of Education", breaks = seq(12.2, 13.2, by = 0.2),
                     limits = c(12.2, 13.2)) +
  theme_bw()

The numbered labels show mean education for each quarter-of-birth group.
Those born in the 4th quarter did get more education in most years!
There is also a general time trend.

AK91 Figure 2: Impact of IV on outcome

What about earnings for those groups?

ggplot(ak91_age, aes(x = yob + (qob - 1) / 4, y = lnw)) +
  geom_line() +
  geom_label(mapping = aes(label = qob, color = q4)) +
  scale_x_continuous("Year of birth", breaks = 1930:1940) +
  scale_y_continuous("Log weekly wages") +
  guides(color = "none") +  # hide the legend for the q4 highlight
  theme_bw()

Those born in the 4th quarter are among the high earners within their birth year.
In general, weekly wages seem to decline somewhat across later birth cohorts.

Running IV estimation in `R`

Several options (like always with R! 😉)
We will use the iv_robust function from the estimatr package.
Robust? It computes standard errors that correct for heteroskedasticity.

library(estimatr)
# create a list of models
mod <- list()
# standard (biased!) OLS
mod$ols <- lm(lnw ~ s, data = ak91)
# IV: born in q4 is TRUE?
# doing IV manually in 2 stages.
mod[["1. stage"]] <- lm(s ~ q4, data = ak91)
ak91$shat         <- predict(mod[["1. stage"]])  
mod[["2. stage"]] <- lm(lnw ~ shat, data = ak91)
# run 2SLS
# doing IV all in one go
# notice the formula!
# formula = y ~ x | z
mod$`2SLS`  <- iv_robust(lnw ~ s | q4,
                         data = ak91,
                         diagnostics = TRUE)

Notice the call to predict to get $\hat{s}$ .


AK91 Results Table

	ols	1. stage	2. stage	2SLS
(Intercept)	4.995***	12.747***	4.955***	4.955***
	(0.004)	(0.007)	(0.381)	(0.358)
s	0.071***			0.074**
	(0.000)			(0.028)
q4		0.092***
		(0.013)
shat			0.074*
			(0.030)
R2	0.117	0.000	0.000	0.117
RMSE	0.64	3.28	0.68	0.64
1. Stage F:				48.990
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001

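OLS is likely downward biased (measurement error in schooling).
First stage: the IV q4 is statistically significant, but the effect is small: being born in Q4 is associated with 0.092 more years of education. $R^{2}$ is 0%! But the F-stat is large. 😅
The manual second stage has the same point estimate as 2SLS but a different standard error (the one from the manual second stage is wrong, because lm treats $\hat{s}$ as if it were observed data).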
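
Why the manual standard error is off can be sketched as follows. The residuals that `lm` uses in the manual second stage are computed with $\hat{s}$ , whereas the 2SLS residuals plug the observed $s$ into the fitted equation. The sketch below uses simulated data with invented parameters and the classical (homoskedastic) variance formula, so it will not reproduce the robust standard errors that `iv_robust` reports.

```r
# Correcting the manual second-stage standard error (classical formula, simulated data).
set.seed(7)
n <- 5000
A <- rnorm(n)
z <- rbinom(n, 1, 0.25)
s <- 12 + 0.5 * z + 1.5 * A + rnorm(n)
y <- 5 + 0.10 * s + 0.3 * A + rnorm(n, sd = 0.5)

s_hat  <- fitted(lm(s ~ z))
stage2 <- lm(y ~ s_hat)
b      <- unname(coef(stage2))

# Standard error as printed by lm() for the manual second stage
se_naive <- coef(summary(stage2))["s_hat", "Std. Error"]

# lm() builds its residuals from s_hat; 2SLS residuals use the observed s
res_naive <- unname(y - (b[1] + b[2] * s_hat))
res_2sls  <- y - (b[1] + b[2] * s)

# Same (X_hat'X_hat)^{-1}, but the residual variance is re-estimated with res_2sls
se_2sls <- se_naive * sqrt(sum(res_2sls^2) / sum(res_naive^2))
c(naive = se_naive, corrected = se_2sls)
```

Dedicated IV routines do this bookkeeping internally, which is one reason to prefer them over running the two stages by hand.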

Remember the F-Statistic?

We encountered this before: it's useful to test restricted vs unrestricted models against each other.

Here, we are interested in whether our instruments are jointly significant. Of course, with only one IV, that's not more informative than the t-stat of that IV.

This F-Stat compares the predictive power of the first stage with and without the IVs. If they have very similar predictive power, the F-stat will be low, and we will not be able to reject the H0 that our IVs are jointly insignificant in the first stage model. 😞
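
The comparison of the first stage with and without the instrument can be sketched with base R on simulated data (invented parameters, classical rather than robust test statistics). With a single instrument, the F-statistic is just the square of that instrument's t-statistic.

```r
# First-stage F-test as a restricted vs. unrestricted comparison (simulated data).
set.seed(99)
n <- 2000
z <- rbinom(n, 1, 0.25)                 # a single binary instrument
s <- 12 + 0.3 * z + rnorm(n, sd = 3)    # endogenous regressor

unrestricted <- lm(s ~ z)               # first stage with the instrument
restricted   <- lm(s ~ 1)               # first stage without it

f_test <- anova(restricted, unrestricted)
f_stat <- f_test$F[2]
t_stat <- coef(summary(unrestricted))["z", "t value"]

c(F = f_stat, t_squared = t_stat^2)     # identical with one instrument
```

With several instruments, the restricted model would drop all of them at once and the F-statistic would no longer reduce to a single t-statistic.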

Additional Control Variables

We saw a clear time trend in education earlier.
There are also business-cycle fluctuations in earnings
We should somehow control for different time periods.
Also, we can use more than one IV! Here is how:

# we keep adding to our `mod` list:
mod$ols_yr  <- update(mod$ols, . ~ . + yob_fct)  #  previous OLS model
# add exogenous vars on both sides of the `|` !
mod[["2SLS_yr"]] <- estimatr::iv_robust(lnw ~ s  + yob_fct | q4 + yob_fct, data = ak91, diagnostics = TRUE )  
# use all quarters as IVs
mod[["2SLS_all"]] <- estimatr::iv_robust(lnw ~ s  + yob_fct | qob_fct + yob_fct, data = ak91, diagnostics = TRUE  )

	ols	2SLS	ols_yr	2SLS_yr	2SLS_all
(Intercept)	4.995	4.955	5.017	4.966	4.592
	(0.004)	(0.358)	(0.005)	(0.354)	(0.251)
s	0.071	0.074	0.071	0.075	0.105
	(0.000)	(0.028)	(0.000)	(0.028)	(0.020)
R2	0.117	0.117	0.118	0.117	0.091
RMSE	0.64	0.64	0.64	0.64	0.65
1. Stage F:		48.990		47.731	32.323
Instruments	none	Q4	none	Q4	All Quarters
Year of birth	no	no	yes	yes	yes

Adding year controls...

leaves OLS mostly unchanged
slight increase in 2SLS estimate

Using all quarters as IV...

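Increases the precision of the 2SLS estimate: the standard error falls from 0.028 to 0.020.
The point estimate is now 0.105, i.e. about 10.5% per additional year of schooling!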

AK91: Taking Stock - The Quarter of Birth (QOB) IV

This will produce consistent estimates if
1. The IV predicts the endogenous regressor well.
2. The IV is as good as random / independent of omitted variables (OVs).
3. The IV can only impact the outcome through schooling.
How does the QOB perform along those lines?

The plot of the first stage and the high F-stat offer compelling evidence for relevance. ✅
Is QOB independent of, say, maternal characteristics? Birthdays are not really random: there are birth seasons for certain socioeconomic backgrounds. Mothers with the highest schooling tend to give birth in the second quarter (not in the 4th! ✅).
Exclusion: What if the youngest kids (born in Q4!) are the disadvantaged ones early on, which has long-term negative impacts? That would mean $E [u | z] \neq 0$ ! Well, with QOB the youngest ones actually do better (more schooling and higher wage)! ✅

Mechanics of IV
Identification and Inference

IV Identification

Let's go back to our simple linear model:

$y = β_{0} + β_{1} x + u$

where we fear that $\operatorname{Cov}(x, u) \neq 0$ , i.e. that $x$ is endogenous.

Conditions for IV

1. First stage or relevance: $\operatorname{Cov}(z, x) \neq 0$
2. IV exogeneity: $\operatorname{Cov}(z, u) = 0$ : the IV is exogenous in the outcome equation.

Valid Model (A) vs Invalid Model (B) for IV `z`

How does this identify $β_{1}$ ?
(How can we express $β_{1}$ in terms of population moments to pin its value down?)

IV Identification

$\begin{aligned} \operatorname{Cov}(z, y) & = \operatorname{Cov}(z, β_{0} + β_{1} x + u) \\ & = β_{1} \operatorname{Cov}(z, x) + \operatorname{Cov}(z, u) \end{aligned}$

Under condition 2. above (IV exogeneity), we have $\operatorname{Cov}(z, u) = 0$ , hence

$\operatorname{Cov}(z, y) = β_{1} \operatorname{Cov}(z, x)$

and under condition 1. (relevance), we have $\operatorname{Cov}(z, x) \neq 0$ , so that we can divide the equation through to obtain

$β_{1} = \frac{\operatorname{Cov}(z, y)}{\operatorname{Cov}(z, x)} .$

$β_{1}$ is identified via population moments $\operatorname{Cov}(z, y)$ and $\operatorname{Cov}(z, x)$ .
We can estimate those moments via their sample analogs.

IV Estimator

Just plugging in for the population moments:

${\hat{β}}_{1} = \frac{\sum_{i = 1}^{n} (z_{i} - \bar{z}) (y_{i} - \bar{y})}{\sum_{i = 1}^{n} (z_{i} - \bar{z}) (x_{i} - \bar{x})}$

The intercept estimate is ${\hat{β}}_{0} = \bar{y} - {\hat{β}}_{1} \bar{x}$

Given both assumptions 1. and 2. are satisfied, we say that the IV estimator is consistent for $β_{1}$ . We write

$\operatorname{plim} ({\hat{β}}_{1}) = β_{1}$

in words: the probability limit of ${\hat{β}}_{1}$ is the true $β_{1}$ .
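
The plug-in estimator is short enough to compute directly. The sketch below uses simulated data with assumed true values $β_{0} = 2$ and $β_{1} = 1.5$ and a binary instrument. It also shows that with a binary $z$ the ratio of sample covariances equals a ratio of differences in group means (often called the Wald estimator).

```r
# IV estimator from sample moments (simulated data, invented parameters).
set.seed(2023)
n <- 50000
u <- rnorm(n)                              # unobserved error
z <- rbinom(n, 1, 0.5)                     # binary instrument, independent of u
x <- 1 + 0.8 * z + 0.7 * u + rnorm(n)      # endogenous: x depends on u
y <- 2 + 1.5 * x + u                       # assumed truth: beta_0 = 2, beta_1 = 1.5

# Sample analogs of Cov(z, y) / Cov(z, x), then the intercept
b1_iv <- cov(z, y) / cov(z, x)
b0_iv <- mean(y) - b1_iv * mean(x)

# With a binary z the same number is a ratio of differences in group means
wald <- (mean(y[z == 1]) - mean(y[z == 0])) /
        (mean(x[z == 1]) - mean(x[z == 0]))

# OLS slope for comparison: inconsistent because Cov(x, u) != 0
b1_ols <- cov(x, y) / var(x)

round(c(b0_iv = b0_iv, b1_iv = b1_iv, wald = wald, b1_ols = b1_ols), 3)
```

Rerunning this with larger `n` should move `b1_iv` ever closer to 1.5, while `b1_ols` stays away from it, which is what consistency and inconsistency mean in practice.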

IV Inference

Assuming $E (u^{2} | z) = σ^{2}$ , the (asymptotic) variance of the IV slope estimator is

$\operatorname{Var} ({\hat{β}}_{1, IV}) = \frac{σ^{2}}{n σ_{x}^{2} ρ_{x, z}^{2}}$

$σ_{x}^{2}$ is the population variance of $x$ ,
$σ^{2}$ is the population variance of $u$ , and
$ρ_{x, z}$ is the population correlation between $x$ and $z$ .

You can see 2 important things here:

Without the term $ρ_{x, z}^{2}$ , this is like OLS variance.
As sample size $n$ increases, the variance decreases.

IV Variance is Always Larger than OLS Variance

Replace $ρ_{x, z}^{2}$ with $R_{x, z}^{2}$ , i.e. the R-squared of a regression of $x$ on $z$ :

$\operatorname{Var} ({\hat{β}}_{1, IV}) = \frac{σ^{2}}{n σ_{x}^{2} R_{x, z}^{2}}$

Given $R_{x, z}^{2} < 1$ in most real life situations, we have that $\operatorname{Var} ({\hat{β}}_{1, IV}) > \operatorname{Var} ({\hat{β}}_{1, OLS})$ almost certainly.

The higher the correlation between $z$ and $x$ , the closer their $R_{x, z}^{2}$ is to 1. With $R_{x, z}^{2} = 1$ we get back to the OLS variance. This is no surprise, because that implies that $z$ is an exact linear function of $x$ , so the instrument carries the same information as $x$ itself.

So, if you have a valid, exogenous regressor $x$ , you should not perform IV estimation using $z$ to obtain $\hat{β}$ , since your variance will be unnecessarily large.

Returns to Education for Married Women

Consider the following model for married women's wages:

$\log (\text{wage}) = β_{0} + β_{1} \text{educ} + u$

Let's run an OLS on this, and then compare it to an IV estimate using father's education. Keep in mind that this is a valid IV $z$ if

fatheduc and educ are correlated
fatheduc and $u$ are not correlated.

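data(mroz, package = "wooldridge")
mods <- list()
# OLS on the women with observed wages (lm drops rows with missing lwage)
mods$OLS <- lm(lwage ~ educ, data = mroz)
# first stage on the same sample: women in the labor force
mods[['First Stage']] <- lm(educ ~ fatheduc, data = subset(mroz, inlf == 1))
# IV: instrument educ with father's education
mods$IV  <- estimatr::iv_robust(lwage ~ educ | fatheduc, data = mroz)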

	OLS	First Stage	IV
(Intercept)	-0.185	10.237	0.441
	(0.185)	(0.276)	(0.467)
educ	0.109		0.059
	(0.014)		(0.037)
fatheduc		0.269
		(0.029)
Num.Obs.	428	428	428
R2	0.118	0.173	0.093
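
As a rough plausibility check of the variance formula against this table, here is a back-of-the-envelope sketch. It treats both reported standard errors as if they came from the homoskedastic formulas with the same $σ^{2}$ , which does not hold exactly (the IV column uses robust standard errors). Under that simplification the two variances differ only by the factor $R_{x, z}^{2}$ , here the first-stage $R^{2}$ of 0.173:

$\frac{se ({\hat{β}}_{1, IV})}{se ({\hat{β}}_{1, OLS})} \approx \frac{1}{\sqrt{R_{x, z}^{2}}} = \frac{1}{\sqrt{0.173}} \approx 2.4 , \qquad 0.014 \times 2.4 \approx 0.034$

That is of the same order as the reported IV standard error of 0.037.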

IV Standard Errors

IV with a Weak Instrument

IV is consistent under the given assumptions.
However, even if we have only a very small $\operatorname{Cor}(z, u)$ , we can get wrong-footed.
A small correlation between $x$ and $z$ can then produce inconsistent estimates.

$\operatorname{plim} ({\hat{β}}_{1, IV}) = β_{1} + \frac{\operatorname{Cor}(z, u)}{\operatorname{Cor}(z, x)} \cdot \frac{σ_{u}}{σ_{x}}$

Suppose $\operatorname{Cor}(z, u)$ is very small, but not exactly zero.
A weak instrument is one with only a small absolute value for $\operatorname{Cor}(z, x)$ .
This will blow up the second term in the probability limit.
Even with a very big sample size $n$ , our estimator would not converge to the true population parameter $β_{1}$ , because we are using a weak instrument.
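
A small simulation sketch makes the point concrete. All numbers are invented: the true $β_{1}$ is 1, the instrument violates exogeneity only slightly (a coefficient of 0.02 on `z` in the error), and the helper `iv_bias` is a hypothetical function that compares a strong and a weak first stage at a very large sample size.

```r
# Weak vs. strong instrument with a tiny exogeneity violation (simulated data).
set.seed(11)
n <- 500000

iv_bias <- function(first_stage_coef) {
  z <- rnorm(n)
  u <- 0.02 * z + rnorm(n)                        # Cor(z, u) is small but not zero
  x <- first_stage_coef * z + 0.5 * u + rnorm(n)  # endogenous regressor
  y <- 1 + 1 * x + u                              # assumed truth: beta_1 = 1
  estimate <- cov(z, y) / cov(z, x)
  # second term of the plim formula, evaluated at sample moments
  formula_term <- cor(z, u) / cor(z, x) * sd(u) / sd(x)
  c(estimate = estimate, bias = estimate - 1, formula_term = formula_term)
}

strong <- iv_bias(1)       # z moves x a lot
weak   <- iv_bias(0.03)    # z barely moves x
round(rbind(strong, weak), 3)
```

In both rows the deviation from 1 coincides with the second term of the formula. With the strong instrument that term is negligible; with the weak one the same tiny violation is inflated into a large error, even with half a million observations.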

Weak Stuff

To illustrate this point, let's assume we want to look at the impact of number of packs of cigarettes smoked per day by pregnant women (packs) on the birthweight of their child (bwght):

$\log (\text{bwght}) = β_{0} + β_{1} \text{packs} + u$

We are worried that smoking behavior is correlated with a range of other health-related variables which are in $u$ and which could impact the birthweight of the child. So we look for an IV. Suppose we use the price of cigarettes (cigprice), assuming that the price of cigarettes is uncorrelated with factors in $u$ . Let's run the first stage of packs on cigprice and then let's show the 2SLS estimates:

data(bwght, package = "wooldridge")
mods <- list()
mods[["First Stage"]] <- lm(packs ~ cigprice, data = bwght)
mods[["IV"]] <- estimatr::iv_robust(log(bwght) ~  packs | cigprice, data = bwght, diagnostics = TRUE)

	First Stage	IV
(Intercept)	0.067	4.448
	(0.103)	(0.940)
cigprice	0.000
	(0.001)
packs		2.989
		(8.996)
R2	0.000	-23.230
RMSE	0.30
1. Stage F:		0.121

Weak Stuff

The first column shows a very weak first stage: cigprice has zero impact on packs, it seems!
$R^{2}$ is zero.
What if we use this IV nevertheless?

In the second column: a very large, positive(!) impact of packs smoked on birthweight. 🤔
Huge standard error though.
An $R^{2}$ of -23?!
F-stat of the first stage: 0.121. This corresponds to a p-value of 0.728: we cannot reject the H0 of an insignificant first stage here at all.
So: invalid approach. ❌
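
The p-value quoted above can be recovered from the F-statistic with base R. The degrees of freedom are an assumption here: one restriction (a single instrument), and a residual degrees of freedom based on 1388 observations, which is what the bwght data is assumed to contain.

```r
# p-value of the first-stage F-statistic reported in the table
f_stat <- 0.121
n_obs  <- 1388                      # assumption: number of rows in wooldridge::bwght
p_val  <- pf(f_stat, df1 = 1, df2 = n_obs - 2, lower.tail = FALSE)
round(p_val, 3)
```

With this many observations the exact residual degrees of freedom barely matter; the result should be very close to the 0.728 reported above.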

END


	bluebery.planterose@sciencespo.fr
	Original Slides from Florian Oswald