This is all super-simple; still, it might be useful. In class today a student asked for some intuition as to why, when you’re regressing y on x, measurement error on x biases the coefficient estimate but measurement error on y does not.
I gave the following quick explanation:
– You’re already starting with the model, y_i = a + bx_i + e_i. If you add measurement error to y, call it y*_i = y_i + eta_i, and then you regress y* on x, you can write y*_i = a + bx_i + e_i + eta_i, and as long as eta is independent of e, you can just combine them into a single error term.
– When you have measurement error in x, two things happen to attenuate b, that is, to pull the regression coefficient toward zero. First, if you spread out x but keep y unchanged, this will reduce the slope of y on x. Second, when you add noise to x you’re changing the ordering of the data, which will reduce the strength of the relationship.
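The two bullets above can be written out as a short derivation. This is a sketch under the standard classical-measurement-error assumptions, which go a bit beyond what is stated above: the noise terms have mean zero and are independent of x, of e, and of each other. The symbol delta_i for the noise added to x, and x*_i for the noisy predictor, are notation introduced here for illustration.

```latex
% Assumes \eta_i and \delta_i have mean zero and are independent of
% x_i, e_i, and each other; \delta_i and x^*_i are notation added here.
\begin{align*}
y_i   &= a + b x_i + e_i \\
y^*_i &= y_i + \eta_i = a + b x_i + (e_i + \eta_i) \\
x^*_i &= x_i + \delta_i \\[4pt]
\text{slope of } y^* \text{ on } x:\quad
  & \frac{\operatorname{cov}(x, y^*)}{\operatorname{var}(x)}
    = \frac{b\,\sigma_x^2}{\sigma_x^2} = b \\[4pt]
\text{slope of } y \text{ on } x^*:\quad
  & \frac{\operatorname{cov}(x^*, y)}{\operatorname{var}(x^*)}
    = b\,\frac{\sigma_x^2}{\sigma_x^2 + \sigma_\delta^2}
\end{align*}
```

Noise in y leaves the covariance with x alone and only inflates the residual variance from sigma_e^2 to sigma_e^2 + sigma_eta^2. Noise in x leaves the covariance with y alone but inflates the denominator, so the slope shrinks toward zero by the factor sigma_x^2 / (sigma_x^2 + sigma_delta^2).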
But that’s all words (and some math). It’s simpler and clearer to do a live simulation, which I did right then and there in class!
Here’s the R code:
# simulation for measurement error
library("arm")
set.seed(123)
n

The resulting plot is at the top of this post.
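For anyone who wants to try this at home, here is a minimal sketch of a simulation along these lines. It is not the code from class: the sample size, the values of a, b, and sigma, the range of x, and the two noise scales are all assumptions chosen for illustration, and it uses base R's lm() rather than anything from the arm package.

```r
# Sketch of a measurement-error simulation.
# All numeric settings below are assumed values, not taken from the post.
set.seed(123)
n <- 1000
a <- 0.2
b <- 0.3
sigma <- 0.5

x <- runif(n, 0, 10)
y <- rnorm(n, a + b * x, sigma)

# Add independent noise to x and to y (noise scales are assumptions)
x_star <- x + rnorm(n, 0, 4)
y_star <- y + rnorm(n, 0, 2)

# Least-squares slope of yy on xx
slope <- function(yy, xx) unname(coef(lm(yy ~ xx))[2])

slopes <- c(
  "no error"         = slope(y, x),
  "error in y"       = slope(y_star, x),
  "error in x"       = slope(y, x_star),
  "error in x and y" = slope(y_star, x_star)
)
print(round(slopes, 2))

# Four panels with a common axis range, small dots, red regression lines
x_range <- range(x, x_star)
y_range <- range(y, y_star)
par(mfrow = c(2, 2), mar = c(3, 3, 2, 1), mgp = c(1.7, 0.5, 0))
panel <- function(xx, yy, title) {
  plot(xx, yy, xlim = x_range, ylim = y_range, pch = 20, cex = 0.3,
       xlab = "x", ylab = "y", main = title)
  abline(lm(yy ~ xx), col = "red")
}
panel(x, y, "No measurement error")
panel(x, y_star, "Measurement error on y")
panel(x_star, y, "Measurement error on x")
panel(x_star, y_star, "Measurement error on x and y")
```

With these assumed settings, x has variance 100/12 and the noise added to x has variance 16, so the theoretical attenuation factor is about 0.34: the two fits that use the noisy x should come out well below b, while the fit with noise only in y should stay close to b with a noisier scatter.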
I like this simulation for three reasons:
1. You can look at the graph and see how the slope changes with measurement error in x but not in y.
2. This exercise shows the benefits of clear graphics, including little things like making the dots small, adding the regression lines in red, labeling the individual plots, and using a common axis range for all four graphs.
3. It was fast! I did it live in class, and this is an example of how students, or anyone, can answer this sort of statistical question directly, with a lot more confidence and understanding than would come from a textbook and some formulas.
P.S. As Eric Loken and I discuss in this 2017 article, everything gets more complicated if you condition on "statistical significance."
P.P.S. Yes, I know my R code is ugly. Think of this as an inspiration: even if, like me, you’re a sloppy coder, you can still code up these examples for teaching and learning.