Some technical recommendations for the post-p < 0.05 world

Stop calling it “significance”
- It seems that even professionals take that word in its everyday sense rather than its (very limited) statistical sense.
Don’t use a probability scale
- A probability is like a loaded gun. If you hand one to a scientist, who knows what they are going to point it at?

Possible fixes for Stat 101 …

- Push confidence intervals. Relegate p to a historical footnote in the course.
- Reduce the emphasis on statistical inference. Put more on evaluating, in every problem, the full process of statistical work, starting with data collection.

A (proposed) general procedure

Italics indicate things we don’t regularly teach now.

Identify your goal: prediction or intervention/experiment.
Draw out causal diagram and identify confounders/covariates.
Create sampling and data collection plan to deal with confounders/covariates. Then collect your data, including measurable covariates (or their proxies).
Plot out data, draw in model. (We can easily handle two explanatory variables: axis and color)
Compare spread of model values and raw response. Calculate R.
Measure effect size, e.g. difference in means or slope of regression line. We’ll call it \(\Delta\).
Calculate \(F\). For methods in intro stats, \(F = (n-2) \frac{R^2}{1-R^2}\). With one covariate, it’s roughly half this. (Well, \(F = \frac{n-3}{2}\frac{R^2}{1-R^2}\).)
95% confidence interval on \(\Delta\) is \(\Delta \left( 1 \pm \sqrt{4/F}\right)\)*.
Calculate the confounding interval.
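
The numerical steps of the procedure (R, \(\Delta\), \(F\), and the interval on \(\Delta\)) can be sketched in a few lines of Python for the simplest case of one quantitative explanatory variable and no covariates. This is only an illustration under stated assumptions: the function name `simple_regression_summary` is hypothetical, the data are made up, and \(R^2\) is taken to be the variance of the model values divided by the variance of the raw response.

```python
import math

def simple_regression_summary(x, y):
    """Slope, R^2, F, and the approximate 95% interval for a one-variable regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)

    delta = sxy / sxx              # effect size: slope of the regression line
    r2 = sxy ** 2 / (sxx * syy)    # R^2: spread of model values relative to raw response
    f = (n - 2) * r2 / (1 - r2)    # F = (n - 2) R^2 / (1 - R^2); undefined if R^2 == 1
    margin = math.sqrt(4 / f)      # relative margin of error, sqrt(4 / F)
    # sorted() keeps (lower, upper) in order even when the slope is negative
    lower, upper = sorted((delta * (1 - margin), delta * (1 + margin)))
    return {"delta": delta, "R2": r2, "F": f, "ci": (lower, upper)}

# Made-up illustrative data, not taken from any real study
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8, 8.3, 8.9]
summary = simple_regression_summary(x, y)
print(summary)
```

The causal diagram, the sampling plan, and the confounding interval are not covered by this sketch; it handles only the arithmetic from R through the confidence interval.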

F in its raw form is our measure of significance.

Why F? Why not t?

F is, of course, \(t^2\), so there’s hardly a difference. (That’s why the multiplier on the 95% confidence interval is 4 = 2².)
Using F avoids temptation to look at one-tailed versus two-tailed tests.
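
The link between \(t^2\) and the 4 in the interval formula can be spelled out in two lines. This assumes the usual rule of thumb that a 95% interval is the estimate plus or minus 2 standard errors, and it writes \(\mathrm{SE}\) for the standard error of \(\Delta\), a symbol introduced here only for the derivation:

\[
t = \frac{\Delta}{\mathrm{SE}}, \qquad F = t^2 \quad\Longrightarrow\quad \mathrm{SE} = \frac{|\Delta|}{\sqrt{F}}
\]

\[
\Delta \pm 2\,\mathrm{SE} \;=\; \Delta \pm \frac{2\,|\Delta|}{\sqrt{F}} \;=\; \Delta \left( 1 \pm \sqrt{4/F} \right)
\]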

Distinction between

Formulas are simpler and can be replaced by graphs.

SHOW GRAPHS FOR \(\frac{R^2}{1-R^2}\) and \(1 \pm \sqrt{4/F}\).
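
Until the graphs are drawn, the two functions can be tabulated to see their shape. The sketch below is a stand-in, not the intended figures; the helper names `f_ratio` and `ci_multipliers` and the grid of values are arbitrary choices made for illustration.

```python
import math

def f_ratio(r2):
    """R^2 / (1 - R^2): the part of F that does not depend on n."""
    return r2 / (1 - r2)

def ci_multipliers(f):
    """Lower and upper multipliers 1 -/+ sqrt(4/F) applied to the effect size."""
    margin = math.sqrt(4 / f)
    return 1 - margin, 1 + margin

print("R^2    R^2/(1-R^2)")
for r2 in (0.01, 0.05, 0.1, 0.25, 0.5, 0.75, 0.9):
    print(f"{r2:<6} {f_ratio(r2):.3f}")

print()
print("F      lower   upper")
for f in (1, 2, 4, 10, 25, 100, 400):
    lower, upper = ci_multipliers(f)
    print(f"{f:<6} {lower:+.2f}   {upper:+.2f}")
```

At \(F = 4\) the lower multiplier is exactly 0, so the interval just touches zero. For any smaller \(F\) the interval on \(\Delta\) includes zero.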

Formulas from Triola

Get applications to the three settings from classical-inference.Rmd.