Last time, we refreshed our basic OLS knowledge.
Today we continue and look at more than one explanatory variable, and the associated problems.
But why more than one variable?
Like, how many other variables?
And, above all: which ones? 🤔
We will also revisit what we mean by a model.
Remember what we learned about the STAR Experiment
What is the causal impact of class size on test scores?
$$score_i = \beta_0 + \beta_1 classize_i + u_i$$ ?
We use a model to order our thoughts about how a causal impact is determined.
Let's augment our model with more variables:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u$$
Omitted-variable bias (OVB) arises when we omit a variable that
affects our outcome variable $y$
correlates with an explanatory variable $x_j$
As its name suggests, this situation leads to bias in our estimate of $\beta_j$.
Note: OVB is not exclusive to multiple linear regression, but it does require that multiple variables affect $y$.
Example
Let's imagine a simple model for the amount individual $i$ gets paid
$$Pay_i = \beta_0 + \beta_1 School_i + \beta_2 Male_i + u_i$$
where $School_i$ gives years of schooling and $Male_i$ is a binary indicator for whether individual $i$ is male;
thus $\beta_2$ gives the average difference in pay between men and women with the same amount of schooling.
Example, continued
From our population model
$$Pay_i = \beta_0 + \beta_1 School_i + \beta_2 Male_i + u_i$$
a study that focuses only on the relationship between pay and schooling effectively estimates
$$Pay_i = \beta_0 + \beta_1 School_i + (\beta_2 Male_i + u_i) = \beta_0 + \beta_1 School_i + \varepsilon_i$$
where $\varepsilon_i = \beta_2 Male_i + u_i$.
We used our exogeneity assumption to derive OLS's unbiasedness. But even if $E[u|X] = 0$, it is not true that $E[\varepsilon|X] = 0$ so long as $\beta_2 \neq 0$.
Specifically, $E[\varepsilon|Male = 1] = \beta_2 + E[u|Male = 1] \neq 0$. Now OLS is biased.
Example, continued
Let's try to see this result graphically.
The population model:
$$Pay_i = 20 + 0.5 \times School_i + 10 \times Male_i + u_i$$
Our regression model that suffers from omitted-variable bias:
$$Pay_i = \hat\beta_0 + \hat\beta_1 \times School_i + e_i$$
Finally, imagine that women, on average, receive more schooling than men.
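Before turning to the plots, here is a minimal simulation sketch of this setup (the variable names and the exact size of the schooling gap are illustrative assumptions, not taken from the slides):

```r
library(tidyverse)
library(broom)

set.seed(123)
n <- 1e4

# Population: women receive more schooling on average; pay follows the model above
sim_df <- tibble(
  male   = rbinom(n, size = 1, prob = 0.5),
  school = rnorm(n, mean = 14 - 2 * male, sd = 2),
  pay    = 20 + 0.5 * school + 10 * male + rnorm(n, sd = 3)
)

# Short regression (omits male): the schooling coefficient is biased
lm(pay ~ school, data = sim_df) %>% tidy()

# Long regression (includes male): recovers the population parameters
lm(pay ~ school + male, data = sim_df) %>% tidy()
```

With this data-generating process, the short regression's schooling coefficient is pulled well below 0.5 because schooling and the omitted male indicator are negatively correlated.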
Example, continued: $Pay_i = 20 + 0.5 \times School_i + 10 \times Male_i + u_i$
[Figure: the relationship between pay and schooling.]
Biased regression estimate: $\widehat{Pay}_i = 31.3 - 0.9 \times School_i$
Recalling the omitted variable: gender (female and male)
Unbiased regression estimate: $\widehat{Pay}_i = 20.9 + 0.4 \times School_i + 9.1 \times Male_i$
Don't omit variables 😜
Instrumental variables and two-stage least squares (coming soon): if we can find something that affects $x_1$ but not the omitted variable, we can make progress!
Use multiple observations for the same unit $i$: panel data.
Warning: There are situations in which neither solution is possible.
Proceed with caution (sometimes you can sign the bias).
The key is to have a mental map of what should belong in the model.
Consider the relationship
$$Pay_i = \beta_0 + \beta_1 School_i + u_i$$
where $School_i$ gives individual $i$'s years of schooling.
Interpretations: $\beta_0$ gives expected pay with zero years of schooling; $\beta_1$ gives the expected change in pay from one additional year of schooling.
Consider the model
$$y = \beta_0 + \beta_1 x + u$$
Differentiate the model:
$$\frac{dy}{dx} = \beta_1$$
Load the wage1 dataset from the wooldridge package. You may have to install this package first.
Run skimr::skim on the dataset to get an overview. What is the fraction of nonwhite individuals in the data?
Regressing wage on education and tenure, what is the interpretation of the tenure coefficient? You may need to consult ?wage1 here.
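A sketch of how one might tackle this exercise in R (the variable names educ, tenure, and nonwhite follow the wage1 documentation):

```r
# install.packages(c("wooldridge", "skimr"))  # if not yet installed
library(tidyverse)
library(broom)

data("wage1", package = "wooldridge")

# Overview of the data; nonwhite is a 0/1 dummy, so its mean is the fraction nonwhite
skimr::skim(wage1)
mean(wage1$nonwhite)

# Wage regressed on education and tenure
lm(wage ~ educ + tenure, data = wage1) %>% tidy()
```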
Consider the relationship
$$Pay_i = \beta_0 + \beta_1 Female_i + u_i$$
where $Female_i$ is a binary indicator equal to 1 if individual $i$ is female and 0 otherwise.
Interpretations: $\beta_0$ gives expected pay for men; $\beta_1$ gives the difference in expected pay between women and men.
Derivations
$$E[Pay|Male] = E[\beta_0 + \beta_1 \times 0 + u_i] = E[\beta_0 + 0 + u_i] = \beta_0$$
$$E[Pay|Female] = E[\beta_0 + \beta_1 \times 1 + u_i] = E[\beta_0 + \beta_1 + u_i] = \beta_0 + \beta_1$$
Note: If there are no other variables to condition on, then $\hat\beta_1$ equals the difference in group means, e.g., $\bar{x}_{Female} - \bar{x}_{Male}$.
$y_i = \beta_0 + \beta_1 x_i + u_i$ for binary variable $x_i \in \{0, 1\}$
Continue with the wage1 dataset.
Now regress wage on female. What is $E[wage \mid male]$?
Add married to the regression. Now what is $E[wage \mid female, \text{not married}]$?
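A sketch of this exercise in R; it also illustrates the group-means result from the previous slide (female and married are 0/1 dummies in wage1):

```r
library(tidyverse)
library(broom)

data("wage1", package = "wooldridge")

# With only the dummy: intercept = mean wage for men, coefficient = difference in group means
lm(wage ~ female, data = wage1) %>% tidy()
wage1 %>% group_by(female) %>% summarise(mean_wage = mean(wage))

# Adding married: E[wage | female, not married] = intercept + coefficient on female
lm(wage ~ female + married, data = wage1) %>% tidy()
```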
Interactions allow the effect of one variable to change based upon the level of another variable.
Examples
Does the effect of schooling on pay change by gender?
Does the effect of gender on pay change by race?
Does the effect of schooling on pay change by experience?
Previously, we considered a model that allowed women and men to have different wages, but the model assumed the effect of school on pay was the same for everyone:
$$Pay_i = \beta_0 + \beta_1 School_i + \beta_2 Female_i + u_i$$
but we can also allow the effect of school to vary by gender:
$$Pay_i = \beta_0 + \beta_1 School_i + \beta_2 Female_i + \beta_3 School_i \times Female_i + u_i$$
The model where schooling has the same effect for everyone (F and M):
The model where schooling's effect can differ by gender (F and M):
Interpreting coefficients can be a little tricky with interactions, but the key† is to carefully work through the math.
†: As is often the case with econometrics.
$$Pay_i = \beta_0 + \beta_1 School_i + \beta_2 Female_i + \beta_3 School_i \times Female_i + u_i$$
Expected returns for an additional year of schooling for women:
$$E[Pay_i \mid Female \wedge School = \ell + 1] - E[Pay_i \mid Female \wedge School = \ell] = E[\beta_0 + \beta_1(\ell + 1) + \beta_2 + \beta_3(\ell + 1) + u_i] - E[\beta_0 + \beta_1 \ell + \beta_2 + \beta_3 \ell + u_i] = \beta_1 + \beta_3$$
Similarly, $\beta_1$ gives the expected return to an additional year of schooling for men. Thus, $\beta_3$ gives the difference in the returns to schooling for women and men.
Same dataset!
Regress wage on experience, female indicator and their interaction. What is the interpretation of all the coefficients here? Can you distinguish them from zero?
What is the expected wage for a male with 5 years of experience?
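One possible way to run this in R (exper and female are the wage1 variable names; the prediction at the end answers the last question):

```r
library(tidyverse)
library(broom)

data("wage1", package = "wooldridge")

# Allow the effect of experience to differ by gender via an interaction term
int_reg <- lm(wage ~ exper * female, data = wage1)
tidy(int_reg)

# Expected wage for a male (female = 0) with 5 years of experience
predict(int_reg, newdata = data.frame(exper = 5, female = 0))
```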
In economics, you will frequently see logged outcome variables with linear (non-logged) explanatory variables, e.g.,
$$\log(price_i) = \beta_0 + \beta_1 bdrms_i + u_i$$
This specification changes our interpretation of the slope coefficients.
data(hprice1, package = "wooldridge")
lm(log(price) ~ bdrms, data = hprice1) %>% tidy()
#> # A tibble: 2 × 5
#>   term        estimate std.error statistic  p.value
#>   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
#> 1 (Intercept)    5.04     0.126      39.9  3.13e-57
#> 2 bdrms          0.167    0.0345      4.85 5.43e- 6
Interpretation
A one-unit increase in our explanatory variable increases the outcome variable by approximately $\beta_1 \times 100$ percent.
Example: An additional bedroom increases the sales price of a house by approximately 16 percent (for $\beta_1 = 0.16$).
Consider the log-linear model
$$\log(y) = \beta_0 + \beta_1 x + u$$
and differentiate
$$\frac{dy}{y} = \beta_1 \, dx$$
So a marginal change in $x$ (i.e., $dx$) leads to a $\beta_1 \, dx$ percentage change in $y$.
What about that approximation part?
An additional bedroom increases the sales price of a house by approximately 16 percent (for $\beta_1 = 0.16$):
$$\%\Delta y \approx 0.16 \times 100 = 16\%$$
This is a good approximation as long as the relative change $\Delta y / y_0$ is not too big, since we approximate $\log\left(\frac{\Delta y}{y_0} + 1\right) \approx \frac{\Delta y}{y_0}$.
The exact formula is $\%\Delta y = 100 \times (\exp(\Delta x \, \beta) - 1)$.
In our case: $\%\Delta y = 100 \times (\exp(0.16) - 1) = 17.3$.
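The approximate and exact calculations are easy to reproduce directly in R, using the slide's $\beta_1 = 0.16$ and the estimated 0.167 from the regression above:

```r
# Approximate percentage change: 100 * beta1
100 * 0.16    # slide example: 16 percent
100 * 0.167   # estimated bdrms coefficient: about 16.7 percent

# Exact percentage change: 100 * (exp(beta1) - 1)
100 * (exp(0.16) - 1)    # about 17.35 percent (17.3 on the slide)
100 * (exp(0.167) - 1)   # about 18.2 percent
```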
Same dataset!
Now regress log wage on education and tenure. How does the interpretation of the coefficient on education change?
Similarly, econometricians frequently employ log-log models, in which the outcome variable is logged and at least one explanatory variable is logged
$$\log(price_i) = \beta_0 + \beta_1 \log(sqrft_i) + u_i$$
Interpretation: $\beta_1$ is an elasticity; a one-percent increase in $x$ is associated with a $\beta_1$-percent increase in $y$.
Consider the log-log model
$$\log(y) = \beta_0 + \beta_1 \log(x) + u$$
and differentiate
$$\frac{dy}{y} = \beta_1 \frac{dx}{x}$$
which says that for a one-percent increase in $x$, we will see a $\beta_1$ percent increase in $y$. As an elasticity:
$$\frac{dy}{dx} \cdot \frac{x}{y} = \beta_1$$
Load the hprice1 dataset from the wooldridge package.
Regress log price on log sqrft. What is the interpretation of the coefficient on log(sqrft)?
What is $E[price \mid sqrft = 115]$? (Caution! Not log price!)
lm(log(price) ~ log(sqrft), data = hprice1) %>% tidy()
#> # A tibble: 2 × 5
#>   term        estimate std.error statistic  p.value
#>   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
#> 1 (Intercept)   -0.975    0.641      -1.52 1.32e- 1
#> 2 log(sqrft)     0.873    0.0846     10.3  1.05e-16
A 1% increase in the square footage of the house leads to a 0.873% increase in the sales price.
Notice the absence of units here: both variables are expressed in percentage terms.
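For the last part of the exercise above, here is a sketch of a naive level-scale prediction. Note that simply exponentiating the fitted log price understates $E[price \mid sqrft]$ because $E[\exp(u)] > 1$; a full treatment would adjust for this (e.g., with a smearing-type correction):

```r
library(tidyverse)
library(broom)

data("hprice1", package = "wooldridge")

loglog_reg <- lm(log(price) ~ log(sqrft), data = hprice1)

# Naive retransformation: exponentiate the fitted value of log(price) at sqrft = 115
exp(predict(loglog_reg, newdata = data.frame(sqrft = 115)))
```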
Note: If you have a log-linear model with a binary indicator variable, the interpretation for the coefficient on that variable changes.
Consider again
$$\log(y_i) = \beta_0 + \beta_1 x_{1i} + u_i$$
for binary variable $x_{1i}$.
The approximate interpretation of $\beta_1$ is as before:
when $x_1$ changes from 0 to 1, $y$ will change by $100 \times \beta_1$ percent.
#> 
#> Call:
#> lm(formula = log(price) ~ log(lotsize) + log(sqrft) + bdrms + 
#>     colonial, data = hprice1)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -0.69479 -0.09750 -0.01619  0.09151  0.70228 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  -1.34959    0.65104  -2.073   0.0413 *  
#> log(lotsize)  0.16782    0.03818   4.395 3.25e-05 ***
#> log(sqrft)    0.70719    0.09280   7.620 3.69e-11 ***
#> bdrms         0.02683    0.02872   0.934   0.3530    
#> colonial      0.05380    0.04477   1.202   0.2330    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.1841 on 83 degrees of freedom
#> Multiple R-squared:  0.6491, Adjusted R-squared:  0.6322 
#> F-statistic: 38.38 on 4 and 83 DF,  p-value: < 2.2e-16
Approximate: a colonial-style house sells for about $100 \times 0.054 \approx 5.4$ percent more, holding the other regressors fixed.
Exact: $100 \times (\exp(0.0538) - 1) \approx 5.5$ percent.
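Using the colonial coefficient reported in the output above:

```r
b_colonial <- 0.05380  # coefficient on the colonial dummy from the regression above

100 * b_colonial              # approximate effect: about 5.4 percent
100 * (exp(b_colonial) - 1)   # exact effect: about 5.5 percent
```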
Up to this point, we know OLS has some nice properties, and we know how to estimate an intercept and slope coefficient via OLS.
Our current workflow:
But how do we actually learn something from this exercise?
This is related to Intro Course material.
We need to be able to deal with uncertainty. Enter: Inference.
As our previous simulation pointed out, our problem with uncertainty is that we don't know whether our sample estimate is close or far from the unknown population parameter.†
However, all is not lost. We can use the errors ($e_i = y_i - \hat{y}_i$) to get a sense of how well our model explains the observed variation in $y$.
When our model appears to be doing a "nice" job, we might be a little more confident in using it to learn about the relationship between y and x.
Now we just need to formalize what a "nice job" actually means.
†: Except when we run the simulation ourselves—which is why we like simulations.
First off, we will estimate the variance of $u_i$ (recall: $Var(u_i) = \sigma^2$) using our squared errors, i.e.,
$$s^2 = \frac{\sum_i e_i^2}{n - k}$$
where $k$ gives the number of slope terms and intercepts that we estimate (e.g., $\beta_0$ and $\beta_1$ would give $k = 2$).
$s^2$ is an unbiased estimator of $\sigma^2$.
We know that the variance of $\hat\beta_1$ (for simple linear regression) is
$$Var(\hat\beta_1) = \frac{s^2}{\sum_i (x_i - \bar{x})^2}$$
which shows that the variance of our slope estimator shrinks as the variation in $x$ grows and grows with the variance of the disturbance.
More common: the standard error of $\hat\beta_1$,
$$\widehat{SE}(\hat\beta_1) = \sqrt{\frac{s^2}{\sum_i (x_i - \bar{x})^2}}$$
Recall: The standard error of an estimator is the standard deviation of the estimator's distribution.
Standard error output is standard in R's lm():
#> # A tibble: 2 × 5
#>   term        estimate std.error statistic  p.value
#>   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
#> 1 (Intercept)    2.53     0.422       6.00 3.38e- 8
#> 2 x              0.567    0.0793      7.15 1.59e-10
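A sketch of computing $s^2$ and $\widehat{SE}(\hat\beta_1)$ by hand and checking them against lm(). The data set pop_df used on these slides is not reproduced here, so this example simulates its own data:

```r
library(tidyverse)
library(broom)

set.seed(12345)
sim_df <- tibble(
  x = runif(100, 0, 10),
  y = 2.5 + 0.5 * x + rnorm(100, sd = 2)
)

fit <- lm(y ~ x, data = sim_df)

# s^2 = sum of squared residuals / (n - k), with k = 2 (intercept and slope)
n  <- nrow(sim_df)
s2 <- sum(residuals(fit)^2) / (n - 2)

# SE(beta1_hat) = sqrt(s^2 / sum((x - xbar)^2))
se_b1 <- sqrt(s2 / sum((sim_df$x - mean(sim_df$x))^2))

se_b1
tidy(fit)  # the std.error reported for x should match se_b1
```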
We use the standard error of $\hat\beta_1$, along with $\hat\beta_1$ itself, to learn about the parameter $\beta_1$.
After deriving the distribution of $\hat\beta_1$,† we have two (related) options for formal statistical inference (learning) about our unknown parameter $\beta_1$:
Confidence intervals: Use the estimate and its standard error to create an interval that, when repeated, will generally†† contain the true parameter.
Hypothesis tests: Determine whether there is statistically significant evidence to reject a hypothesized value or range of values.
†: Hint: it's normal, with the mean and variance we've derived/discussed above.
††: E.g., Similarly constructed 95% confidence intervals will contain the true parameter 95% of the time.
We construct $(1-\alpha)$-level confidence intervals for $\beta_1$:
$$\hat\beta_1 \pm t_{\alpha/2,\,df} \, \widehat{SE}(\hat\beta_1)$$
$t_{\alpha/2,\,df}$ denotes the $\alpha/2$ quantile of a $t$ distribution with $n - k$ degrees of freedom.
For example, 100 observations and two coefficients (i.e., $\hat\beta_0$ and $\hat\beta_1$, so $k = 2$) with $\alpha = 0.05$ (for a 95% confidence interval) give us $t_{0.025,\,98} = -1.98$.
Example:
lm(y ~ x, data = pop_df) %>% tidy(conf.int = TRUE)
#> # A tibble: 2 × 7
#>   term        estimate std.error statistic  p.value conf.low conf.high
#>   <chr>          <dbl>     <dbl>     <dbl>    <dbl>    <dbl>     <dbl>
#> 1 (Intercept)    2.53     0.422       6.00 3.38e- 8    1.69      3.37 
#> 2 x              0.567    0.0793      7.15 1.59e-10    0.410     0.724
Our 95% confidence interval is thus $0.567 \pm 1.98 \times 0.0793 = [0.410,\, 0.724]$.
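The same interval can be computed by hand from the reported estimate, its standard error, and the $t$ quantile (using the 100 observations and $k = 2$ from the worked example above):

```r
b1  <- 0.567    # estimate from the output above
se1 <- 0.0793   # its standard error
dof <- 98       # 100 observations minus k = 2 estimated coefficients

t_crit <- qt(0.975, df = dof)  # roughly 1.98
c(lower = b1 - t_crit * se1, upper = b1 + t_crit * se1)
# matches the conf.low and conf.high columns: about 0.410 and 0.724
```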
So we have a confidence interval for $\beta_1$: $[0.410,\, 0.724]$.
What does it mean?
Informally: The confidence interval gives us a region (interval) in which we can place some trust (confidence) for containing the parameter.
More formally: If we repeatedly sample from our population and construct confidence intervals for each of these samples, $(1-\alpha) \times 100$ percent of our intervals (e.g., 95%) will contain the population parameter somewhere in the interval.
Now back to our simulation...
We drew 10,000 samples (each of size $n = 30$) from our population and estimated our regression model for each of these samples:
$$y_i = \hat\beta_0 + \hat\beta_1 x_i + e_i$$
Now, let's estimate 95% confidence intervals for each of these samples...
From our previous simulation: 97.8% of our 95% confidence intervals contain the true parameter value of $\beta_1$.
That's a probabilistic statement: any single interval either contains the true parameter or it does not; the 95% refers to how often the procedure succeeds across repeated samples.
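A sketch of what such a coverage simulation might look like (the population parameters and number of replications here are illustrative; the slides' own simulation used 10,000 samples):

```r
library(tidyverse)
library(broom)

set.seed(2022)

# Draw one sample of size n, fit OLS, and check whether the 95% CI covers the true slope
one_draw <- function(n = 30, b0 = 2.5, b1 = 0.5) {
  sample_df <- tibble(
    x = runif(n, 0, 10),
    y = b0 + b1 * x + rnorm(n, sd = 2)
  )
  ci <- lm(y ~ x, data = sample_df) %>%
    tidy(conf.int = TRUE) %>%
    filter(term == "x")
  ci$conf.low <= b1 & b1 <= ci$conf.high
}

covered <- map_lgl(1:1000, ~ one_draw())
mean(covered)  # coverage rate: should be close to 0.95
```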
In many applications, we want to know more than a point estimate or a range of values. We want to know what our statistical evidence says about existing theories.
We want to test hypotheses posed by officials, politicians, economists, scientists, friends, weird neighbors, etc.
Examples
Hypothesis testing relies upon very similar results and intuition.
While uncertainty certainly exists, we can still build reliable statistical tests (rejecting or failing to reject a posited hypothesis).
OLS t test
Our (null) hypothesis states that $\beta_1$ equals a value $c$, i.e., $H_0: \beta_1 = c$.
From OLS's properties, we can show that the test statistic
$$t_{stat} = \frac{\hat\beta_1 - c}{\widehat{SE}(\hat\beta_1)}$$
follows the $t$ distribution with $n - k$ degrees of freedom.
For an $\alpha$-level, two-sided test, we reject the null hypothesis (and conclude in favor of the alternative hypothesis) when
$$|t_{stat}| > |t_{1-\alpha/2,\,df}|$$
meaning that our test statistic is more extreme than the critical value.
Alternatively, we can calculate the p-value that accompanies our test statistic, which effectively gives us the probability of seeing our test statistic or a more extreme test statistic if the null hypothesis were true.
Very small p-values (generally < 0.05) mean that it would be unlikely to see our results if the null hypothesis were really true; we tend to reject the null for p-values below 0.05.
R's and Stata's defaults test hypotheses against the value zero.
lm(y ~ x, data = pop_df) %>% tidy()
#> # A tibble: 2 × 5
#>   term        estimate std.error statistic  p.value
#>   <chr>          <dbl>     <dbl>     <dbl>    <dbl>
#> 1 (Intercept)    2.53     0.422       6.00 3.38e- 8
#> 2 x              0.567    0.0793      7.15 1.59e-10
$H_0$: $\beta_1 = 0$ vs. $H_a$: $\beta_1 \neq 0$
$t_{stat} = 7.15$ and $t_{0.975,\,28} = 2.05$, which implies a p-value $< 0.05$.
Therefore, we reject $H_0$.
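The critical value and p-value quoted above are easy to reproduce by hand:

```r
t_stat <- 7.15  # test statistic from the output above
dof    <- 28    # degrees of freedom used on the slide

qt(0.975, df = dof)             # critical value: about 2.05
2 * pt(-abs(t_stat), df = dof)  # two-sided p-value: far below 0.05
```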
You will sometimes see F tests in econometrics.
We use F tests to test hypotheses that involve multiple parameters
(e.g., $\beta_1 = \beta_2$ or $\beta_3 + \beta_4 = 1$),
rather than a single simple hypothesis
(e.g., $\beta_1 = 0$, for which we would just use a t test).
Example
Economists love to say "Money is fungible."
Imagine that we might want to test whether money received as income actually has the same effect on consumption as money received from tax rebates/returns.
$$Consumption_i = \beta_0 + \beta_1 Income_i + \beta_2 Rebate_i + u_i$$
Example, continued
We can write our null hypothesis as
$$H_0: \beta_1 = \beta_2 \iff H_0: \beta_1 - \beta_2 = 0$$
Imposing this null hypothesis gives us the restricted model
$$Consumption_i = \beta_0 + \beta_1 Income_i + \beta_1 Rebate_i + u_i$$
$$Consumption_i = \beta_0 + \beta_1 (Income_i + Rebate_i) + u_i$$
Example, continued
To test the null hypothesis $H_0: \beta_1 = \beta_2$ against $H_a: \beta_1 \neq \beta_2$,
we use the F statistic
$$F_{q,\,n-k-1} = \frac{(SSE_r - SSE_u)/q}{SSE_u/(n-k-1)}$$
which (as its name suggests) follows the F distribution with $q$ numerator degrees of freedom and $n - k - 1$ denominator degrees of freedom.
Here, $q$ is the number of restrictions we impose via $H_0$.
Example, continued
The term $SSE_r$ is the sum of squared errors (SSE) from our restricted model:
$$Consumption_i = \beta_0 + \beta_1 (Income_i + Rebate_i) + u_i$$
and $SSE_u$ is the sum of squared errors (SSE) from our unrestricted model:
$$Consumption_i = \beta_0 + \beta_1 Income_i + \beta_2 Rebate_i + u_i$$
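A sketch of how one might carry out this F test in R. The consumption data here are simulated for illustration (with $\beta_1 = \beta_2$ true by construction); with real data, anova() on the restricted and unrestricted models performs the same computation:

```r
library(tidyverse)
library(broom)

set.seed(101)
n <- 500
consumption_df <- tibble(
  income      = runif(n, 20, 100),
  rebate      = runif(n, 0, 5),
  consumption = 5 + 0.8 * income + 0.8 * rebate + rnorm(n, sd = 4)
)

# Unrestricted model: separate coefficients on income and rebate
unrestricted <- lm(consumption ~ income + rebate, data = consumption_df)
# Restricted model: imposes beta1 = beta2 by combining the two regressors
restricted   <- lm(consumption ~ I(income + rebate), data = consumption_df)

# F statistic by hand (q = 1 restriction)
sse_u  <- sum(residuals(unrestricted)^2)
sse_r  <- sum(residuals(restricted)^2)
q      <- 1
df_u   <- unrestricted$df.residual  # n - k - 1
f_stat <- ((sse_r - sse_u) / q) / (sse_u / df_u)
f_stat
pf(f_stat, df1 = q, df2 = df_u, lower.tail = FALSE)  # p-value

# The same test via anova()
anova(restricted, unrestricted)
```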