
ScPoEconometrics Advanced

Recap 2

Bluebery Planterose

SciencesPo Paris
2023-01-31

1 / 68

Recap 2

  • Last time, we refreshed our basic OLS knowledge

  • Today we continue and look at more than one explanatory variable, and associated problems

  • But, why more than one variable?

  • Like, how many other variables?

  • And, above all: which ones ? 🤔



We will recall what we mean by a model.

2 / 68

Back to the STAR Experiment

  • Remember what we learned about the STAR Experiment

  • What is the causal impact of class size on test scores?

  • score_i = β0 + β1 classize_i + u_i ?

  • We use a model to order our thoughts about how a causal impact is determined.

3 / 68

Multiple Variables

Let's augment our model with more variables:

y = β0 + β1 x1 + β2 x2 + β3 x3 + u

4 / 68

Spot the Difference 🕵️

5 / 68

Omitted-variable bias

6 / 68

Omitted-variable bias

Omitted-variable bias (OVB) arises when we omit a variable that

  1. affects our outcome variable y

  2. correlates with an explanatory variable xj

As its name suggests, this situation leads to bias in our estimate of βj.

Note: OVB is not exclusive to multiple linear regression, but it does require that multiple variables affect y.

7 / 68

Omitted-variable bias

Example

Let's imagine a simple model for the amount individual i gets paid

Pay_i = β0 + β1 School_i + β2 Male_i + u_i

where

  • Schooli gives i's years of schooling
  • Malei denotes an indicator variable for whether individual i is male.

thus

  • β1: the returns to an additional year of schooling (ceteris paribus)
  • β2: the premium for being male (ceteris paribus)
    If β2>0, then there is discrimination against women—receiving less pay based upon gender.
8 / 68

Omitted-variable bias

Example, continued

From our population model

Pay_i = β0 + β1 School_i + β2 Male_i + u_i

If a study focuses on the relationship between pay and schooling, i.e.,

Pay_i = β0 + β1 School_i + (β2 Male_i + u_i)
Pay_i = β0 + β1 School_i + ε_i

where ε_i = β2 Male_i + u_i.

We used our exogeneity assumption to derive OLS' unbiasedness. But even if E[u|X] = 0, it is not true that E[ε|X] = 0 so long as β2 ≠ 0.

Specifically, E[ε|Male=1] = β2 + E[u|Male=1] ≠ 0. Now OLS is biased.

9 / 68

Omitted-variable bias

Example, continued

Let's try to see this result graphically.

The population model:

Pay_i = 20 + 0.5 × School_i + 10 × Male_i + u_i

Our regression model that suffers from omitted-variable bias:

Pay_i = β^0 + β^1 × School_i + e_i

Finally, imagine that women, on average, receive more schooling than men.
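To see the bias in action, here is a minimal simulation sketch of this population model; the sample size, schooling distribution, and noise level are illustrative assumptions, not the slides' exact code.

set.seed(1)
n      <- 1000
male   <- rbinom(n, size = 1, prob = 0.5)
school <- 12 + 3 * (1 - male) + rnorm(n)        # women get more schooling on average
pay    <- 20 + 0.5 * school + 10 * male + rnorm(n)
coef(lm(pay ~ school))          # omits male: the slope no longer recovers 0.5
coef(lm(pay ~ school + male))   # includes male: roughly 0.5 and 10 again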

10 / 68

Omitted-variable bias

Example, continued: Pay_i = 20 + 0.5 × School_i + 10 × Male_i + u_i

The figure on this slide builds up in steps:

  • The relationship between pay and schooling.
  • Biased regression estimate: Pay^i = 31.3 + 0.9 × School_i
  • Recalling the omitted variable: gender (female and male).
  • Unbiased regression estimate: Pay^i = 20.9 + 0.4 × School_i + 9.1 × Male_i

11 / 68

Omitted-variable bias

Solutions

  1. Don't omit variables 😜

  2. Instrumental variables and two-stage least squares (coming soon): If we could find something that only affects x1 but not the omitted variable, we can make progress!

  3. Use multiple observations for the same unit i: panel data.

Warning: There are situations in which none of these solutions is possible.

  1. Proceed with caution (sometimes you can sign the bias).

  2. The key is to have a mental map of what should belong to the model.

12 / 68

Interpreting coefficients

14 / 68

Interpreting coefficients

Continuous variables

Consider the relationship

Pay_i = β0 + β1 School_i + u_i

where

  • Payi is a continuous variable measuring an individual's pay
  • Schooli is a continuous variable that measures years of education
15 / 68

Interpreting coefficients

Interpretations

  • β0: the y-intercept, i.e., Pay when School=0
  • β1: the expected increase in Pay for a one-unit increase in School
16 / 68

Interpreting coefficients

Continuous variables

Consider the model

y=β0+β1x+u

Differentiate the model:

dy/dx = β1

17 / 68

Task 1: Interpretation (4 minutes)

  1. Load the wage1 dataset from the wooldridge package. You may have to install this package first.

  2. Run skimr::skim on the dataset to get an overview. What is the fraction of nonwhite individuals in the data?

  3. Regressing wage on education and tenure, what is the interpretation of the tenure coefficient? You may need to consult ?wage1 here.
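
One possible way to tackle this task, shown here as a sketch (run install.packages() first if needed):

library(wooldridge)
data("wage1")
skimr::skim(wage1)                        # overview; `nonwhite` is a 0/1 dummy
mean(wage1$nonwhite)                      # fraction of nonwhite individuals
lm(wage ~ educ + tenure, data = wage1)    # tenure: expected change in hourly wage for
                                          # one more year with the current employer,
                                          # holding education fixed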

18 / 68

Interpreting coefficients

Categorical variables

Consider the relationship

Payi=β0+β1Femalei+ui

where

  • Payi is a continuous variable measuring an individual's pay
  • Femalei is a binary/indicator variable taking 1 when i is female
19 / 68

Interpreting coefficients

Interpretations

  • β0: the expected Pay for males (i.e., when Female=0)
  • β1: the expected difference in Pay between females and males
  • β0+β1: the expected Pay for females
20 / 68

Interpreting coefficients

Categorical variables

Derivations

E[Pay|Male] = E[β0 + β1 × 0 + u_i] = E[β0 + 0 + u_i] = β0

E[Pay|Female] = E[β0 + β1 × 1 + u_i] = E[β0 + β1 + u_i] = β0 + β1

Note: If there are no other variables to condition on, then β^1 equals the difference in group means, e.g., x¯_Female − x¯_Male.


Note 2: The "holding all other variables constant" interpretation also applies to categorical variables in multiple regression settings.
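
A quick check of the "difference in group means" claim, using the wage1 data from the tasks purely as an illustration:

library(wooldridge)
data("wage1")
coef(lm(wage ~ female, data = wage1))["female"]
mean(wage1$wage[wage1$female == 1]) - mean(wage1$wage[wage1$female == 0])
# both lines return the same number: xbar_Female − xbar_Male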
21 / 68

Interpreting coefficients

Categorical variables

y_i = β0 + β1 x_i + u_i for binary variable x_i ∈ {0, 1}

22 / 68

Interpreting coefficients

Categorical variables

y_i = β0 + β1 x_i + u_i for binary variable x_i ∈ {0, 1}

23 / 68

Task 2: Categorical Variables (3 Minutes)

  • Continue with the wage1 dataset.

  • Now regress wage on female. What is E[wage|male]?

  • Add married to the regression. Now what is E[wage|female,not married]?
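
A sketch of one way to answer these questions, continuing with wage1 from the wooldridge package:

m1 <- lm(wage ~ female, data = wage1)
coef(m1)["(Intercept)"]                        # E[wage | male]: set female = 0

m2 <- lm(wage ~ female + married, data = wage1)
coef(m2)["(Intercept)"] + coef(m2)["female"]   # E[wage | female, not married]: married = 0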

24 / 68

Interpreting coefficients

Interactions

Interactions allow the effect of one variable to change based upon the level of another variable.

Examples

  1. Does the effect of schooling on pay change by gender?

  2. Does the effect of gender on pay change by race?

  3. Does the effect of schooling on pay change by experience?

25 / 68

Interpreting coefficients

Interactions

Previously, we considered a model that allowed women and men to have different wages, but the model assumed the effect of school on pay was the same for everyone:

Pay_i = β0 + β1 School_i + β2 Female_i + u_i

but we can also allow the effect of school to vary by gender:

Pay_i = β0 + β1 School_i + β2 Female_i + β3 School_i × Female_i + u_i

26 / 68

Interpreting coefficients

Interactions

The model where schooling has the same effect for everyone (F and M):

27 / 68

Interpreting coefficients

Interactions

The model where schooling's effect can differ by gender (F and M):

28 / 68

Interpreting coefficients

Interactions

Interpreting coefficients can be a little tricky with interactions, but the key is to carefully work through the math.

As is often the case with econometrics.

Pay_i = β0 + β1 School_i + β2 Female_i + β3 School_i × Female_i + u_i

Expected returns for an additional year of schooling for women (at some schooling level s):

E[Pay_i | Female, School = s+1] − E[Pay_i | Female, School = s]
= E[β0 + β1(s+1) + β2 + β3(s+1) + u_i] − E[β0 + β1 s + β2 + β3 s + u_i]
= β1 + β3

Similarly, β1 gives the expected return to an additional year of schooling for men. Thus, β3 gives the difference in the returns to schooling for women and men.

29 / 68

Task 3: Interactions (4 minutes)

  • Same dataset!

  • Regress wage on experience, female indicator and their interaction. What is the interpretation of all the coefficients here? Can you distinguish them from zero?

  • What is the expected wage for a male with 5 years of experience?
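
A sketch of one possible solution (exper is the experience variable in wage1):

m3 <- lm(wage ~ exper * female, data = wage1)
summary(m3)
# (Intercept):  expected wage for a man with 0 years of experience
# exper:        return to one more year of experience for men
# female:       female/male wage gap at 0 years of experience
# exper:female: how the experience slope differs for women
predict(m3, newdata = data.frame(exper = 5, female = 0))   # expected wage: male, 5 years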

30 / 68

Interpreting coefficients

Log-linear specification

In economics, you will frequently see logged outcome variables with linear (non-logged) explanatory variables, e.g.,

log(price_i) = β0 + β1 bdrms_i + u_i

This specification changes our interpretation of the slope coefficients.

data(hprice1,package = "wooldridge")
lm(log(price) ~ bdrms, data = hprice1) %>% tidy()
#> # A tibble: 2 × 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) 5.04 0.126 39.9 3.13e-57
#> 2 bdrms 0.167 0.0345 4.85 5.43e- 6
31 / 68

Interpreting coefficients

Log-linear specification

Interpretation

  • A one-unit increase in our explanatory variable increases the outcome variable by approximately β1×100 percent.

  • Example: An additional bedroom increases sales prices of a house by approximately 16 percent (for β1=0.16).

32 / 68

Interpreting coefficients

Log-linear specification

Consider the log-linear model

log(y)=β0+β1x+u

and differentiate

dy/y = β1 dx

So a marginal change in x (i.e., dx) leads to a β1 × dx proportional change in y, i.e., approximately a 100 × β1 × dx percent change.

33 / 68

Interpreting coefficients

Log-linear specification

What about that approximation part?

An additional bedroom increases sales prices of a house by approximately 16 percent (for β1=0.16).

  • %Δy ≈ 0.16 × 100 = 16%.

  • Good approximation as long as Δy/y0 is not too big.

  • We approximate log(Δy/y0 + 1) ≈ Δy/y0

34 / 68

Interpreting coefficients

Log-linear specification

What about that approximation part?

An additional bedroom increases sales prices of a house by approximately 16 percent (for β1=0.16).

  • %Δy ≈ 0.16 × 100 = 16%.

  • Good approximation as long as Δy/y0 is not too big.

  • We approximate log(Δy/y0 + 1) ≈ Δy/y0

  • The exact formula is %Δy = 100 × (exp(β × Δx) − 1)

  • In our case: %Δy = 100 × (exp(0.16) − 1) ≈ 17.3
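
A quick check of the approximate vs. exact numbers in R:

b <- 0.16
100 * b              # approximate effect: 16 percent
100 * (exp(b) - 1)   # exact effect: about 17.35 percent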

35 / 68

Task 4

  • Same dataset!

  • Now regress log wage on education and tenure. How does the interpretation of the coefficient on education change?
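
A sketch for this task:

m4 <- lm(log(wage) ~ educ + tenure, data = wage1)
coef(m4)["educ"]   # one more year of education is associated with roughly
                   # 100 × this coefficient percent higher wages, holding tenure fixed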

36 / 68

Interpreting coefficients

Log-log specification

Similarly, econometricians frequently employ log-log models, in which the outcome variable is logged and at least one explanatory variable is logged

log(price_i) = β0 + β1 log(sqrft_i) + u_i

Interpretation:

  • A one-percent increase in x will lead to a β1 percent change in y.
  • Often interpreted as an elasticity.
37 / 68

Interpreting coefficients

Log-log specification

Consider the log-log model

log(y)=β0+β1log(x)+u

and differentiate

dy/y = β1 × dx/x

which says that for a one-percent increase in x, we will see a β1 percent increase in y. As an elasticity:

(dy/dx) × (x/y) = β1

38 / 68

Task 5

  • Load the hprice1 dataset from the wooldridge package.

  • Regress log price on log sqrft. What is the interpretation of the coefficient on log(sqrft)?

  • What is E[price | sqrft = 115]? (Caution: not log price!)
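
A sketch of one way through this task. In hprice1, price is in thousands of dollars and sqrft in square feet; sqrft = 115 is taken from the task as stated. Note that simply exponentiating a fitted log price is itself only an approximation to E[price].

data("hprice1", package = "wooldridge")
m5 <- lm(log(price) ~ log(sqrft), data = hprice1)
coef(m5)["log(sqrft)"]                                # elasticity of price w.r.t. sqrft
exp(predict(m5, newdata = data.frame(sqrft = 115)))   # back on the price scale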

39 / 68

Interpreting coefficients

Log-log specification

lm(log(price) ~ log(sqrft), data = hprice1) %>% tidy()
#> # A tibble: 2 × 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) -0.975 0.641 -1.52 1.32e- 1
#> 2 log(sqrft) 0.873 0.0846 10.3 1.05e-16
  • A 1% increase in the square footage of the house leads to a 0.873% increase in the sales price.

  • Notice the absence of units here (both variables enter in percentage terms).

40 / 68

Interpreting coefficients

Log-linear with a binary variable

Note: If you have a log-linear model with a binary indicator variable, the interpretation for the coefficient on that variable changes.

Consider again

log(y_i) = β0 + β1 x1 + u_i

for binary variable x1.

The approximate interpretation of β1 is as before:

When x1 changes from 0 to 1, y will change by 100×β1 percent.

41 / 68
#>
#> Call:
#> lm(formula = log(price) ~ log(lotsize) + log(sqrft) + bdrms +
#> colonial, data = hprice1)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -0.69479 -0.09750 -0.01619 0.09151 0.70228
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -1.34959 0.65104 -2.073 0.0413 *
#> log(lotsize) 0.16782 0.03818 4.395 3.25e-05 ***
#> log(sqrft) 0.70719 0.09280 7.620 3.69e-11 ***
#> bdrms 0.02683 0.02872 0.934 0.3530
#> colonial 0.05380 0.04477 1.202 0.2330
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.1841 on 83 degrees of freedom
#> Multiple R-squared: 0.6491, Adjusted R-squared: 0.6322
#> F-statistic: 38.38 on 4 and 83 DF, p-value: < 2.2e-16

Approximate

  • When colonial changes from 0 to 1 (i.e., the house becomes colonial), y will change by 100 × β1 = 5.37 percent.

Exact

  • When colonial changes from 0 to 1, y will change by 100 × (e^β1 − 1) = 5.52 percent.
42 / 68

Uncertainty and inference

43 / 68

Uncertainty and inference

Is there more?

Up to this point, we know OLS has some nice properties, and we know how to estimate an intercept and slope coefficient via OLS.

Our current workflow:

  • Get data (points with x and y values)
  • Regress y on x
  • Plot the OLS line (i.e., y^ = β^0 + β^1 x)
  • Done?

But how do we actually learn something from this exercise?

44 / 68

Uncertainty and inference

Linkup with Intro Course

This is related to Intro Course material:

  1. Sampling

  2. Hypothesis Testing

  3. Regression Inference
45 / 68

Uncertainty and inference

There is more

But how do we actually learn something from this exercise?

  • Based upon our value of β^1, can we rule out previously hypothesized values?
  • How confident should we be in the precision of our estimates?
  • How well does our model explain the variation we observe in y?

We need to be able to deal with uncertainty. Enter: Inference.

46 / 68

Uncertainty and inference

Learning from our errors

As our previous simulation pointed out, our problem with uncertainty is that we don't know whether our sample estimate is close or far from the unknown population parameter.

However, all is not lost. We can use the errors (ei=yiy^i) to get a sense of how well our model explains the observed variation in y.

When our model appears to be doing a "nice" job, we might be a little more confident in using it to learn about the relationship between y and x.

Now we just need to formalize what a "nice job" actually means.

†: Except when we run the simulation ourselves, which is why we like simulations.

47 / 68

Uncertainty and inference

Learning from our errors

First off, we will estimate the variance of u_i (recall: Var(u_i) = σ²) using our squared errors, i.e.,

s² = Σ_i e_i² / (n − k)

where k gives the number of intercept and slope terms that we estimate (e.g., β0 and β1 would give k = 2).

s² is an unbiased estimator of σ².
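
A quick sanity check of this formula against R's built-in residual standard error, using wage1 purely as an illustrative dataset:

fit <- lm(wage ~ educ, data = wooldridge::wage1)
n   <- nobs(fit)
k   <- length(coef(fit))               # intercept + slope, so k = 2
s2  <- sum(resid(fit)^2) / (n - k)
c(manual = sqrt(s2), from_lm = sigma(fit))   # identical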

48 / 68

Uncertainty and inference

Learning from our errors

We know that the variance of β^1 (for simple linear regression) is

Var(β^1) = s² / Σ_i (x_i − x¯)²

which shows that the variance of our slope estimator

  1. increases as our disturbances become noisier
  2. decreases as the variance of x increases
49 / 68

Uncertainty and inference

Learning from our errors

More common: The standard error of β^1

SE^(β^1) = √( s² / Σ_i (x_i − x¯)² )

Recall: The standard error of an estimator is the standard deviation of the estimator's distribution.

50 / 68

Uncertainty and inference

Learning from our errors

Standard error output is standard in R's lm:

#> # A tibble: 2 × 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) 2.53 0.422 6.00 3.38e- 8
#> 2 x 0.567 0.0793 7.15 1.59e-10
51 / 68

Uncertainty and inference

Learning from our errors

We use the standard error of β^1, along with β^1 itself, to learn about the parameter β1.

After deriving the distribution of β^1†, we have two (related) options for formal statistical inference (learning) about our unknown parameter β1:

  • Confidence intervals: Use the estimate and its standard error to create an interval that, when repeated, will generally†† contain the true parameter.

  • Hypothesis tests: Determine whether there is statistically significant evidence to reject a hypothesized value or range of values.

†: Hint: it's normal, with the mean and variance we've derived/discussed above.
††: E.g., Similarly constructed 95% confidence intervals will contain the true parameter 95% of the time.

52 / 68

Uncertainty and inference

Confidence intervals

We construct (1 − α)-level confidence intervals for β1: β^1 ± t_{α/2, df} × SE^(β^1)

t_{α/2, df} denotes the α/2 quantile of a t distribution with n − k degrees of freedom.

53 / 68

Uncertainty and inference

Confidence intervals

We construct (1 − α)-level confidence intervals for β1: β^1 ± t_{α/2, df} × SE^(β^1)

For example, 100 obs., two coefficients (i.e., β^0 and β^1, so k = 2), and α = 0.05 (for a 95% confidence interval) gives us t_{0.025, 98} = 1.98
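
Rebuilding that interval by hand in R with the numbers from this example:

b1  <- 0.567
se1 <- 0.0793
df  <- 100 - 2
b1 + c(-1, 1) * qt(0.975, df) * se1    # roughly [0.410, 0.724]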

54 / 68

Uncertainty and inference

Confidence intervals

We construct (1 − α)-level confidence intervals for β1: β^1 ± t_{α/2, df} × SE^(β^1)

Example:

lm(y ~ x, data = pop_df) %>% tidy(conf.int = TRUE)
#> # A tibble: 2 × 7
#> term estimate std.error statistic p.value conf.low conf.high
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) 2.53 0.422 6.00 3.38e- 8 1.69 3.37
#> 2 x 0.567 0.0793 7.15 1.59e-10 0.410 0.724

Our 95% confidence interval is thus 0.567±1.98×0.0793=[0.410,0.724]

55 / 68

Uncertainty and inference

Confidence intervals

So we have a confidence interval for β1, i.e., [0.410,0.724].

What does it mean?

Informally: The confidence interval gives us a region (interval) in which we can place some trust (confidence) for containing the parameter.

More formally: If we repeatedly sample from our population and construct confidence intervals for each of these samples, (1 − α) percent of our intervals (e.g., 95%) will contain the population parameter somewhere in the interval.

Now back to our simulation...

56 / 68

Uncertainty and inference

Confidence intervals

We drew 10,000 samples (each of size n=30) from our population and estimated our regression model for each of these simulations:

y_i = β^0 + β^1 x_i + e_i

(repeated 10,000 times)

Now, let's estimate 95% confidence intervals for each of these samples...
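
A minimal coverage simulation in the same spirit (the data-generating process below is an illustrative assumption, not the course's pop_df):

set.seed(1)
beta1 <- 0.5
covered <- replicate(10000, {
  x  <- rnorm(30)
  y  <- 2 + beta1 * x + rnorm(30)
  ci <- confint(lm(y ~ x))["x", ]
  ci[1] <= beta1 && beta1 <= ci[2]
})
mean(covered)   # close to 0.95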

57 / 68

Uncertainty and inference

Confidence intervals

From our previous simulation: 97.8% of our 95% confidence intervals contain the true parameter value of β1.

That's a probabilistic statement:

  • Could be more.
  • Could be less.
58 / 68

Uncertainty and inference

Hypothesis testing

In many applications, we want to know more than a point estimate or a range of values. We want to know what our statistical evidence says about existing theories.

We want to test hypotheses posed by officials, politicians, economists, scientists, friends, weird neighbors, etc.

Examples

  • Does increasing police presence reduce crime?
  • Does building a giant wall reduce crime?
  • Does shutting down a government adversely affect the economy?
  • Does legal cannabis reduce drunk driving or reduce opioid use?
  • Do air quality standards increase health and/or reduce jobs?
59 / 68

Uncertainty and inference

Hypothesis testing

Hypothesis testing relies upon very similar results and intuition.

While uncertainty certainly exists, we can still build reliable statistical tests (rejecting or failing to reject a posited hypothesis).

OLS t test: Our (null) hypothesis states that β1 equals a value c, i.e., H0: β1 = c

From OLS's properties, we can show that the test statistic

t_stat = (β^1 − c) / SE^(β^1)

follows the t distribution with n − k degrees of freedom.
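
Computing the test statistic by hand from the regression output shown earlier; the degrees of freedom below assume n = 30 and k = 2, as in the simulation samples:

b1     <- 0.567
se1    <- 0.0793
t_stat <- (b1 - 0) / se1              # test against c = 0; about 7.15
2 * pt(-abs(t_stat), df = 30 - 2)     # two-sided p-value: tiny, so we reject H0 at 5%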

60 / 68

Uncertainty and inference

Hypothesis testing

For an α-level, two-sided test, we reject the null hypothesis (and conclude with the alternative hypothesis) when

|t_stat| > |t_{1−α/2, df}|

meaning that our test statistic is more extreme than the critical value.

Alternatively, we can calculate the p-value that accompanies our test statistic, which effectively gives us the probability of seeing our test statistic or a more extreme test statistic if the null hypothesis were true.

Very small p-values (generally < 0.05) mean that it would be unlikely to see our results if the null hypothesis were really true, so we tend to reject the null for p-values below 0.05.

61 / 68

Uncertainty and inference

Hypothesis testing

R and Stata default to testing hypotheses against the value zero.

lm(y ~ x, data = pop_df) %>% tidy()
#> # A tibble: 2 × 5
#> term estimate std.error statistic p.value
#> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 (Intercept) 2.53 0.422 6.00 3.38e- 8
#> 2 x 0.567 0.0793 7.15 1.59e-10

H0: β1 = 0 vs. Ha: β1 ≠ 0

t_stat = 7.15 and t_{0.975, 28} = 2.05, which implies a p-value < 0.05

Therefore, we reject H0.

62 / 68

Uncertainty and inference

F tests

You will sometimes see F tests in econometrics.

We use F tests to test hypotheses that involve multiple parameters
 (e.g., β1=β2 or β3+β4=1),

rather than a single simple hypothesis
 (e.g., β1=0, for which we would just use a t test).

63 / 68

Uncertainty and inference

F tests

Example

Economists love to say "Money is fungible."

Imagine we want to test whether money received as income actually has the same effect on consumption as money received from tax rebates/returns.

Consumption_i = β0 + β1 Income_i + β2 Rebate_i + u_i

64 / 68

Uncertainty and inference

F tests

Example, continued

We can write our null hypothesis as

H0: β1 = β2  ⟺  H0: β1 − β2 = 0

Imposing this null hypothesis gives us the restricted model

Consumption_i = β0 + β1 Income_i + β1 Rebate_i + u_i
Consumption_i = β0 + β1 (Income_i + Rebate_i) + u_i

65 / 68

Uncertainty and inference

F tests

Example, continued

To test the null hypothesis H0: β1 = β2 against Ha: β1 ≠ β2,
we use the F statistic F_{q, n−k−1} = [(SSE_r − SSE_u) / q] / [SSE_u / (n − k − 1)], which (as its name suggests) follows the F distribution with q numerator degrees of freedom and n − k − 1 denominator degrees of freedom.

Here, q is the number of restrictions we impose via Ho.

66 / 68

Uncertainty and inference

F tests

Example, continued

The term SSE_r is the sum of squared errors (SSE) from our restricted model: Consumption_i = β0 + β1 (Income_i + Rebate_i) + u_i

and SSE_u is the sum of squared errors (SSE) from our unrestricted model: Consumption_i = β0 + β1 Income_i + β2 Rebate_i + u_i
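
A sketch of how this comparison looks in R; the consumption data below are simulated purely to illustrate the mechanics (there is no real dataset behind this example):

set.seed(1)
n       <- 200
income  <- rnorm(n, mean = 50, sd = 10)
rebate  <- rnorm(n, mean = 5,  sd = 2)
consumption <- 10 + 0.8 * income + 0.8 * rebate + rnorm(n)   # H0 is true here

unrestricted <- lm(consumption ~ income + rebate)
restricted   <- lm(consumption ~ I(income + rebate))   # imposes beta1 = beta2
anova(restricted, unrestricted)                         # F test with q = 1 restriction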

67 / 68

END

bluebery.planterose@sciencespo.fr
Original Slides from Florian Oswald
Book
@ScPoEcon
@ScPoEcon
68 / 68
