ScPoEconometrics: Advanced

Intro and Recap 1

Bluebery Planterose

SciencesPo Paris
2023-01-24

1 / 42

Welcome to ScPoEconometrics: Advanced!

Today

  1. Who Am I

  2. This Course

  3. Recap 1 of topics from intro course

Next time

  • Quiz 1 (before next time)

  • Recap 2

2 / 42

Who Am I

  • I'm a PhD candidate at the Paris School of Economics. Check out my website!

  • I work on tax evasion, climate policies, and macro topics:

    1. Acceptability of climate policies: who supports or opposes climate policies, and why?

    2. Offshore real-estate in Dubai using leaked data: how large is it, who owns it, and what does it tell us about global offshore real-estate?

    3. Excess Profit Tax: how to tax excess profits from energy firms that benefited from the war in Ukraine?

3 / 42

This Course

Prerequisites

  • This course is the follow-up to Introduction to Econometrics with R which is taught to 2nd years.

  • You are supposed to be familiar with all the econometrics material from the slides of that course and/or chapters 1-9 in our textbook.

  • We also assume you have basic R working knowledge at the level of the intro course!

    • basic data.frame manipulation with dplyr
    • simple linear models with lm
    • basic plotting with ggplot2
    • Quiz 1 will test for this 😉, so be on top of this material
4 / 42

This Course

Grading

  1. There will be four quizzes on Moodle, roughly every two weeks => 40%

  2. There will be two take-home exams / case studies => 60%

  3. There will be no final exam 😅.

Course Materials

  1. Book chapter 10 onwards

  2. The Slides

  3. The interactive shiny apps

  4. Quizzes on Moodle

5 / 42

Syllabus

  1. Intro, Recap 1 (Quiz 1)

  2. Recap 2 (Quiz 2)

  3. Intro, Difference-in-Differences

  4. Tools: Rmarkdown and data.table

  5. Instrumental Variables 1 (Quiz 3)

  6. Instrumental Variables 2 (Midterm exam)

  7. Panel Data 1

  8. Panel Data 2 (Quiz 4)

  9. Discrete Outcomes

  10. Intro to Machine Learning 1

  11. Intro to Machine Learning 2

  12. Recap / Buffer (Final Project)

6 / 42

Course Organization

7 / 42

Recap 1

Let's get cracking! 💪

8 / 42

Population vs. sample

Models and notation

We write our (simple) population model

$$y_i = \beta_0 + \beta_1 x_i + u_i$$

and our sample-based estimated regression model as

$$y_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + e_i$$

An estimated regression model produces estimates for each observation:

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$$

which gives us the best-fit line through our dataset.

(A lot of this slide set, in particular the pictures, has been taken from Ed Rubin's outstanding material. Thanks Ed 🙏)

9 / 42

Task 1: Run Simple OLS (4 minutes)

  1. Load the data from here, in .dta format. (Hint: use haven::read_dta("filename") to read this format.)

  2. Obtain common summary statistics for the variables classize, avgmath and avgverb. Hint: use the skimr package.

  3. Estimate the linear model $\text{avgmath}_i = \beta_0 + \beta_1 \, \text{classize}_i + u_i$

10 / 42

Task 1: Solution

  1. Load the data

    grades = haven::read_dta(file = "https://www.dropbox.com/s/wwp2cs9f0dubmhr/grade5.dta?dl=1")
  2. Describe the dataset:

    library(dplyr)
    grades %>%
      select(classize, avgmath, avgverb) %>%
      skimr::skim()
  3. Run OLS to estimate the relationship between class size and student achievement:

    summary(lm(formula = avgmath ~ classize, data = grades))
11 / 42

Question: Why do we care about population vs. sample?

Population

Population relationship

$$y_i = 2.53 + 0.57 x_i + u_i$$

$$y_i = \beta_0 + \beta_1 x_i + u_i$$

12 / 42

Question: Why do we care about population vs. sample?

Sample 1: 30 random individuals

Population relationship
$$y_i = 2.53 + 0.57 x_i + u_i$$

Sample relationship
$$\hat{y}_i = 2.36 + 0.61 x_i$$

13 / 42

Question: Why do we care about population vs. sample?

Sample 2: 30 random individuals

Population relationship
$$y_i = 2.53 + 0.57 x_i + u_i$$

Sample relationship
$$\hat{y}_i = 2.79 + 0.56 x_i$$

13 / 42

Question: Why do we care about population vs. sample?

Sample 3: 30 random individuals

Population relationship
$$y_i = 2.53 + 0.57 x_i + u_i$$

Sample relationship
$$\hat{y}_i = 3.21 + 0.45 x_i$$

13 / 42

Let's repeat this 10,000 times.

(This exercise is called a (Monte Carlo) simulation.)
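Here is a minimal sketch of such a simulation in R. The population parameters ($\beta_0 = 2.53$, $\beta_1 = 0.57$) and the sample size of 30 come from the slides; the population size, the range of $x$, and the error variance are assumptions made for illustration.

    library(dplyr)

    set.seed(1234)  # make the simulation reproducible

    # Assumed population: 100,000 individuals from y = 2.53 + 0.57x + u
    population = tibble(
      x = runif(1e5, min = 0, max = 10),
      y = 2.53 + 0.57 * x + rnorm(1e5, sd = 1.5)
    )

    # Draw 10,000 samples of 30 individuals; estimate the slope in each
    estimates = sapply(1:1e4, function(s) {
      s_data = slice_sample(population, n = 30)
      coef(lm(y ~ x, data = s_data))["x"]
    })

    mean(estimates)  # close to 0.57: on average we recover the population slope
    sd(estimates)    # but any single sample can miss the mark
    hist(estimates)  # the sampling distribution of the slope estimator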

14 / 42

Population vs. sample

Question: Why do we care about population vs. sample?

  • On average, our regression lines match the population line very nicely.

  • However, individual lines (samples) can really miss the mark.

  • Differences between individual samples and the population lead to uncertainty for the econometrician.

15 / 42

Population vs. sample

Question: Why do we care about population vs. sample?

Answer: Uncertainty matters.

  • Every random sample of data is different.

  • Our (OLS) estimators are computed from those samples of data.

  • If there is sampling variation, there is variation in our estimates.

  • OLS inference depends on certain assumptions.

  • If violated, our estimates will be biased or imprecise.

  • Or both. 😧

16 / 42

Linear regression

The estimator

We can estimate a regression line in R (lm(y ~ x, my_data)) and Stata (reg y x). But where do these estimates come from?

A few slides back:

$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$, which gives us the best-fit line through our dataset.

But what do we mean by "best-fit line"?

17 / 42

Being the "best"

Question: What do we mean by best-fit line?

Answers:

  • In general (econometrics), best-fit line means the line that minimizes the sum of squared errors (SSE):

$$\text{SSE} = \sum_{i=1}^{n} e_i^2 \quad \text{where} \quad e_i = y_i - \hat{y}_i$$

  • Ordinary least squares (OLS) minimizes the sum of the squared errors (as the sketch below illustrates).
  • Based upon a set of (mostly palatable) assumptions, OLS
    • Is unbiased (and consistent)
    • Is the best (minimum variance) linear unbiased estimator (BLUE)
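As a quick illustration (a sketch with made-up data; all names and numbers are illustrative), we can compute the SSE of the OLS line and of any other candidate line, and check that the OLS line wins:

    set.seed(42)

    # made-up data for illustration
    x = runif(50, min = 0, max = 10)
    y = 2.53 + 0.57 * x + rnorm(50)

    # SSE of the candidate line yhat = b0 + b1 * x
    sse = function(b0, b1) sum((y - (b0 + b1 * x))^2)

    ols = coef(lm(y ~ x))
    sse(ols[1], ols[2])  # SSE at the OLS estimates
    sse(2, 0.8)          # any other line yields a larger SSE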
18 / 42

OLS vs. other lines/estimators

Let's consider the dataset we previously generated.

For any line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$, we can calculate the errors: $e_i = y_i - \hat{y}_i$

SSE squares the errors ($e_i^2$): bigger errors get bigger penalties.

The OLS estimate is the combination of $\hat{\beta}_0$ and $\hat{\beta}_1$ that minimizes SSE.

19 / 42
ScPoApps::launchApp("reg_simple")
20 / 42

OLS

Formally

In simple linear regression, the OLS estimator comes from choosing the $\hat{\beta}_0$ and $\hat{\beta}_1$ that minimize the sum of squared errors (SSE), i.e.,

$$\min_{\hat{\beta}_0, \hat{\beta}_1} \text{SSE}$$

but we already know $\text{SSE} = \sum_i e_i^2$. Now use the definitions of $e_i$ and $\hat{y}$.

$$e_i^2 = (y_i - \hat{y}_i)^2 = (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2 = y_i^2 - 2 y_i \hat{\beta}_0 - 2 y_i \hat{\beta}_1 x_i + \hat{\beta}_0^2 + 2 \hat{\beta}_0 \hat{\beta}_1 x_i + \hat{\beta}_1^2 x_i^2$$

Recall: Minimizing a multivariate function requires (1) that all first derivatives equal zero (the first-order conditions) and (2) second-order conditions (convexity, for a minimum).
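For reference, these are the standard OLS first-order conditions (filling in the step the slides skip), obtained by differentiating the expanded SSE above with respect to each parameter:

$$\frac{\partial \text{SSE}}{\partial \hat{\beta}_0} = \sum_i \left( 2 \hat{\beta}_0 + 2 \hat{\beta}_1 x_i - 2 y_i \right) = 0$$

$$\frac{\partial \text{SSE}}{\partial \hat{\beta}_1} = \sum_i \left( 2 \hat{\beta}_0 x_i + 2 \hat{\beta}_1 x_i^2 - 2 y_i x_i \right) = 0$$

Solving this two-equation system for $\hat{\beta}_0$ and $\hat{\beta}_1$ gives exactly the slope and intercept formulae on the next slide.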

21 / 42

OLS

Interactively

ScPoApps::launchApp("SSR_cone")

22 / 42

OLS

Interactively

We skipped the maths.

We now have the OLS estimators for the slope

$$\hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$$

and the intercept

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$

Remember that these two formulae are among the very few from the intro course that you should know by heart! ❤️
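A quick way to convince yourself in R (a sketch with simulated data; any dataset would do):

    set.seed(1)
    x = rnorm(100)
    y = 1 + 2 * x + rnorm(100)

    # slope and intercept computed by hand from the formulae above
    b1 = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
    b0 = mean(y) - b1 * mean(x)

    c(b0, b1)        # by-hand OLS estimates
    coef(lm(y ~ x))  # identical to lm()'s estimates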

We now turn to the assumptions and (implied) properties of OLS.

23 / 42

OLS: Assumptions and properties

Question: What properties might we care about for an estimator?

Tangent: Let's review statistical properties first.

24 / 42

OLS: Assumptions and properties

Refresher: Density functions

Recall that we use probability density functions (PDFs) to describe the probability a continuous random variable takes on a range of values. (The total area = 1.)

These PDFs characterize probability distributions, and the most common/famous/popular distributions get names (e.g., normal, t, Gamma).

Here is the defining property of a PDF $f_X$ for a continuous RV $X$:

$$\Pr[a \leq X \leq b] = \int_a^b f_X(x) \, dx$$

25 / 42

OLS: Assumptions and properties

Refresher: Density functions

The probability a standard normal random variable takes on a value between -2 and 0: $P(-2 \leq X \leq 0) = 0.48$

26 / 42

OLS: Assumptions and properties

Refresher: Density functions

The probability a standard normal random variable takes on a value between -1.96 and 1.96: $P(-1.96 \leq X \leq 1.96) = 0.95$

27 / 42

OLS: Assumptions and properties

Refresher: Density functions

The probability a standard normal random variable takes on a value beyond 2: $P(X > 2) = 0.023$
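You can check all three probabilities in R with pnorm(), the standard normal CDF:

    pnorm(0) - pnorm(-2)        # P(-2 <= X <= 0)       = 0.477
    pnorm(1.96) - pnorm(-1.96)  # P(-1.96 <= X <= 1.96) = 0.95
    1 - pnorm(2)                # P(X > 2)              = 0.023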

28 / 42

OLS: Assumptions and properties

Imagine we are trying to estimate an unknown parameter β, and we know the distributions of three competing estimators. Which one would we want? How would we decide?

29 / 42

OLS: Assumptions and properties

Question: What properties might we care about for an estimator?

Answer one: Bias.

On average (after many samples), does the estimator tend toward the correct value?

More formally: Does the mean of the estimator's distribution equal the parameter it estimates?

$$\text{Bias}_\beta(\hat{\beta}) = \mathbb{E}[\hat{\beta}] - \beta$$

30 / 42

OLS: Assumptions and properties

Answer one: Bias.

Unbiased estimator: $\mathbb{E}[\hat{\beta}] = \beta$

Biased estimator: $\mathbb{E}[\hat{\beta}] \neq \beta$

31 / 42

OLS: Assumptions and properties

Answer two: Variance.

The central tendencies (means) of competing distributions are not the only things that matter. We also care about the variance of an estimator.

$$\text{Var}(\hat{\beta}) = \mathbb{E}\left[ (\hat{\beta} - \mathbb{E}[\hat{\beta}])^2 \right]$$

Lower variance estimators mean we get estimates closer to the mean in each sample.

32 / 42

OLS: Assumptions and properties

Answer one: Bias.

Answer two: Variance.

Subtlety: The bias-variance tradeoff.

Should we be willing to take a bit of bias to reduce the variance?

In econometrics, we generally stick with unbiased (or consistent) estimators. But other disciplines (especially computer science) think a bit more about this tradeoff.

33 / 42

The bias-variance tradeoff.

34 / 42

OLS: Assumptions and properties

Properties

As you might have guessed by now,

  • OLS is unbiased.
  • OLS has the minimum variance of all unbiased linear estimators.
35 / 42

OLS: Assumptions and properties

Properties

But... these (very nice) properties depend upon a set of assumptions:

  1. The population relationship is linear in parameters with an additive disturbance.

  2. Our $X$ variable is exogenous, i.e., $\mathbb{E}[u \mid X] = 0$.

  3. The $X$ variable has variation. And if there are multiple explanatory variables, they are not perfectly collinear.

  4. The population disturbances $u_i$ are independently and identically distributed as normal random variables with mean zero ($\mathbb{E}[u] = 0$) and variance $\sigma^2$ (i.e., $\mathbb{E}[u^2] = \sigma^2$). Independently distributed and mean zero jointly imply $\mathbb{E}[u_i u_j] = 0$ for any $i \neq j$.

36 / 42

OLS: Assumptions and properties

Assumptions

Different assumptions guarantee different properties:

  • Assumptions (1), (2), and (3) make OLS unbiased.
  • Assumption (4) gives us an unbiased estimator for the variance of our OLS estimator.

We will discuss solutions to violations of these assumptions. See also our discussion in the book.

  • Non-linear relationships in our parameters/disturbances (or misspecification).
  • Disturbances that are not identically distributed and/or not independent.
  • Violations of exogeneity (especially omitted-variable bias).
37 / 42

OLS: Assumptions and properties

Conditional expectation

For many applications, our most important assumption is exogeneity, i.e., $\mathbb{E}[u \mid X] = 0$. But what does it actually mean?

One way to think about this definition:

For any value of $X$, the mean of the disturbance $u$ must be zero.

  • E.g., $\mathbb{E}[u \mid X = 1] = 0$ and $\mathbb{E}[u \mid X = 100] = 0$

  • E.g., $\mathbb{E}[u \mid X_2 = \text{Female}] = 0$ and $\mathbb{E}[u \mid X_2 = \text{Male}] = 0$

  • Notice: $\mathbb{E}[u \mid X] = 0$ is more restrictive than $\mathbb{E}[u] = 0$ (see the sketch below)
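To see what a violation looks like, here is a small simulated sketch (all variable names and numbers are made up): an omitted variable drives both $x$ and $y$, so it ends up in $u$, $\mathbb{E}[u \mid X] \neq 0$, and the OLS slope is biased.

    set.seed(5)

    n       = 1e4
    ability = rnorm(n)                          # omitted variable, part of u
    x       = 0.5 * ability + rnorm(n)          # x is correlated with ability
    y       = 1 + 2 * x + ability + rnorm(n)    # true slope on x is 2

    coef(lm(y ~ x))["x"]  # around 2.4, not 2: exogeneity fails, OLS is biased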

38 / 42

Graphically...

39 / 42

Valid exogeneity, i.e., $\mathbb{E}[u \mid X] = 0$

40 / 42

Invalid exogeneity, i.e., $\mathbb{E}[u \mid X] \neq 0$

41 / 42

END

bluebery.planterose@sciencespo.fr
Original Slides from Florian Oswald
Book
@ScPoEcon
42 / 42
