January 4th, 2015
In Technology
If you enjoy this article, see the other most popular articles
What does statistical over-fitting look like?
(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: lawrence@krubner.com, or follow me on Twitter.
I like how clear this makes the mistake of over-fitting:
The model explains over 99% of the variance in the data. Like I said, not a typical data set.
View the estimates of the coefficients, and the p-values of their t-tests
(:coefs lm)
(:t-probs lm)
The values for coefficients b0, …, b10 are (0.878 0.065 -0.066 -0.016 0.037 0.003 -0.009 -2.8273E-4 9.895E-4 1.050E-5 -4.029E-5), and the p-values are (0 0 0 1.28E-5 0 0.083 1.35E-12 0.379 3.74E-8 0.614 2.651E-5). All the coefficients are significant except b5, b7, and b9.
Finally, overlay the fitted values on the original scatter-plot
(add-lines plot x (:fitted lm))
That’s the kind of fit rarely seen on real data! In fact, on real data this would be an example of over-fitting. This model likely wouldn’t generalize to new data from the same process that created this sample.
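To make that last claim concrete, here is a minimal sketch in Python with NumPy (not the Incanter code from the quoted post). Everything in it is an assumption chosen for illustration: the data come from a made-up linear process with Gaussian noise, the training sample has 15 points, and the names `sample`, `train_test_mse`, and `results` are my own. It fits a degree-1 and a degree-10 polynomial to the same training sample, then scores both on fresh data from the same process.

```python
import numpy as np
from numpy.polynomial import polynomial as P

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable


def sample(n):
    """Draw n points from an assumed process: y = 1 + 2x + Gaussian noise."""
    x = rng.uniform(-1.0, 1.0, n)
    y = 1.0 + 2.0 * x + rng.normal(0.0, 0.3, n)
    return x, y


def train_test_mse(degree, x_train, y_train, x_test, y_test):
    """Least-squares polynomial fit; return (training MSE, test MSE)."""
    coefs = P.polyfit(x_train, y_train, degree)
    mse_train = np.mean((y_train - P.polyval(x_train, coefs)) ** 2)
    mse_test = np.mean((y_test - P.polyval(x_test, coefs)) ** 2)
    return float(mse_train), float(mse_test)


# Small training sample, larger held-out sample from the same process.
x_train, y_train = sample(15)
x_test, y_test = sample(200)

# Compare a simple model with a high-order one on both samples.
results = {d: train_test_mse(d, x_train, y_train, x_test, y_test) for d in (1, 10)}
for d, (mse_train, mse_test) in results.items():
    print(f"degree {d:2d}: train MSE = {mse_train:.4f}, test MSE = {mse_test:.4f}")
```

The degree-10 model should report the lower training error, because it has enough freedom to chase the noise, and the higher test error, because that noise does not repeat in new data. That gap between training and test error is what over-fitting looks like in numbers.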
Post external references
- 1
http://data-sorcery.org/2009/06/04/linear-regression-with-higher-order-terms/