Anyone know why the error term switches from u to epsilon?

In the population model section we’re shown a linear equation that uses u as its error term, but once we get to the conditional expectation function the error term changes to \epsilon_i, or at least I think it’s the error term. I can’t seem to find the definition of \epsilon.

Anyone have any insight into the difference, other than that \epsilon_i is per X_i and u seems global?

Hi, I think \epsilon refers to the error term of the CEF. I found it mentioned here in Theorem 3.1.1 via this StackExchange thread.

2nding @bkktimner

Also I think that \epsilon comes from the definition of the CEF, vs u is coming from one (of many) possible models under consideration.

Thanks @bkktimner and @nfultz

From a data perspective, what’s the practical difference then? Both terms are just capturing the unmodeled error, whether that comes from the linear model or from the variation not captured by a mean estimate. Is that simplification a good general understanding?

To expand a little:

E(\epsilon | x) = 0 vs E(u | x) = 0

The \epsilon version follows from how the CEF is defined and is thus always true for any pair of RVs.

The u version is an extra assumption on u for identifying regression models.

We would prefer the former but are usually stuck with the latter.
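
For reference, a short sketch of why the \epsilon version holds automatically. It assumes E|Y| is finite and takes \epsilon to be defined as Y minus the CEF; the Y, X notation is an assumption here and may not match the book's.

```latex
\begin{align*}
\epsilon &:= Y - E(Y \mid X) \\
E(\epsilon \mid X) &= E(Y \mid X) - E\big(E(Y \mid X) \mid X\big) \\
                   &= E(Y \mid X) - E(Y \mid X) = 0
\end{align*}
```

No modeling assumption is used in those steps. For u = Y - (a + bX), with a and b as hypothetical linear-model coefficients, the same cancellation only goes through if a + bX happens to equal E(Y | X).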

When you say “we’d prefer the former, but are stuck with the latter” why is it the case we’re stuck with the latter?

And for all intents and purposes they’re both modeling the same thing, right? Unaccounted-for noise that isn’t captured in the conditional.

They’re only equivalent when your (assumed) model is actually true - functional form is correct, not missing anything, etc. There’s no way to be absolutely sure you set up the model correctly. And there’s no way to directly observe the \epsilon either.

Even in cases where you have a strong reason to believe in the model, things can go awry.

E.g., imagine testing Hooke’s law (F = kx + u) for the length of springs, but over time, after several trials, you stretch the spring too much, and later trials are a little different because the “springiness” has worn out a little. The “True” model that “Nature” “uses” to “make springs a certain length”, E(F|x) + \epsilon, has a bunch of higher-order crap in it that we’ve ignored at our own peril.

It could be “bad”. Or it could be “fine”. Depends if you are making Slinkies or rocket parts.
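
A rough simulation sketch of the spring example, in case it helps. Everything numeric here is made up for illustration: the cubic term standing in for the "higher order" stuff, the values of `k`, `c`, and `noise_sd`, and the helper `binned_means` are all assumptions, not real spring physics. It fits the assumed model F = kx + u by least squares and compares the average of \epsilon and of u within bins of x.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
k, c, noise_sd = 3.0, 0.5, 0.1  # hypothetical values, chosen only for illustration

x = rng.uniform(0.0, 2.0, size=n)

# "Nature's" CEF: linear part plus an assumed cubic term we will ignore.
cef = k * x + c * x**3
eps = rng.normal(0.0, noise_sd, size=n)  # CEF error: mean zero given x by construction
F = cef + eps

# Assumed (misspecified) model F = k*x + u, fit by least squares through the origin.
k_hat = np.sum(x * F) / np.sum(x * x)
u = F - k_hat * x  # residual of the assumed linear model


def binned_means(values, x, n_bins=5):
    """Average `values` within equal-width bins of x on [0, 2]."""
    edges = np.linspace(0.0, 2.0, n_bins + 1)
    idx = np.digitize(x, edges[1:-1])  # bin index 0 .. n_bins-1
    return np.array([values[idx == b].mean() for b in range(n_bins)])


eps_by_bin = binned_means(eps, x)
u_by_bin = binned_means(u, x)

print("k_hat:", k_hat)
print("mean of eps by x-bin:", eps_by_bin)
print("mean of u by x-bin:  ", u_by_bin)
```

In this setup the binned means of \epsilon should hover near zero in every bin, while the binned means of u drift systematically with x, which is E(u | x) = 0 failing because the linear form is wrong. Note that in practice you only get to look at u; \epsilon is visible here only because the simulation knows the true CEF.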


got it, thank you @nfultz