Chapter 6: Process of investigation

GAISE 2016 Recommendation #1:

Teach statistical thinking.

  • Teach statistics as an investigative process of problem-solving and decision-making.
  • Give students experience with multivariable thinking.

I don’t believe that “decision-making” should be taken to refer to “select the test that’s appropriate for [my color of data].” Nor should it be “Is p < 0.05?”

Instead, let’s demonstrate a meaningful statistical process that involves genuine decision making.

Ideally this would be a consensus statement, but the first step in building consensus is to have at least one proposal.

Proposed process

1. Identify your goal, know your purpose

GAISE 2016 Recommendation #3:

Integrate real data with a context and a purpose.

This can include the relative value or cost of correct and incorrect predictions, expressed, for example, as a loss function.
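
As a minimal sketch of what a loss function might look like in practice, the Python below scores two prediction rules on the same outcomes. The labels, the cost values, and the names LOSS and average_loss are all assumptions made up for illustration; in a real investigation the costs come from the purpose of the study.

```python
# Hypothetical costs, keyed by (truth, prediction). The numbers are
# illustrative assumptions, not taken from any real study.
LOSS = {
    ("positive", "positive"): 0.0,
    ("negative", "negative"): 0.0,
    ("positive", "negative"): 10.0,  # missed case, assumed 10x as costly
    ("negative", "positive"): 1.0,   # false alarm
}

def average_loss(truths, predictions, loss=LOSS):
    """Mean loss over paired (truth, prediction) labels."""
    pairs = list(zip(truths, predictions))
    return sum(loss[pair] for pair in pairs) / len(pairs)

def accuracy(truths, predictions):
    """Fraction of predictions that match the truth."""
    return sum(t == p for t, p in zip(truths, predictions)) / len(truths)

truths     = ["positive", "negative", "negative", "positive", "negative"]
cautious   = ["positive", "positive", "negative", "positive", "positive"]
permissive = ["positive", "negative", "negative", "negative", "negative"]

for name, preds in [("cautious", cautious), ("permissive", permissive)]:
    print(name, accuracy(truths, preds), average_loss(truths, preds))
```

With these made-up costs, the permissive rule has the higher accuracy but also the higher average loss, which is the kind of contrast a stated purpose makes visible.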

2. Assemble existing knowledge, including data that might be relevant

This might include population demographics.

3. Draw out causal diagrams

Identify the covariates that can be measured, those that can’t be measured, and those that are not even known (the unknown unknowns).
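
One way to make a causal diagram concrete is to write it down as a directed graph and ask which covariates influence both the explanatory variable and the response. The sketch below is a rough first pass under stated assumptions: the variables, the arrows, and the helper names reachable and common_causes are hypothetical, and a fuller analysis would apply the backdoor criterion rather than this simple check. The unknown unknowns, by definition, cannot appear in the graph at all.

```python
# Hypothetical causal diagram: each key points to the variables it affects.
graph = {
    "age": ["exercise", "blood_pressure"],
    "motivation": ["exercise", "diet"],
    "diet": ["blood_pressure"],
    "coupon": ["exercise"],
    "exercise": ["blood_pressure"],
}
# Assumed for illustration: everything except motivation can be measured.
measured = {"age", "diet", "coupon", "exercise", "blood_pressure"}

def reachable(graph, start, blocked=None):
    """Nodes reachable from start along arrows, never entering `blocked`."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for child in graph.get(node, []):
            if child != blocked and child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def common_causes(graph, explanatory, response):
    """Covariates that affect the explanatory variable and also affect
    the response by some path that does not go through it."""
    nodes = set(graph) | {c for kids in graph.values() for c in kids}
    found = set()
    for node in nodes - {explanatory, response}:
        hits_explanatory = explanatory in reachable(graph, node)
        hits_response = response in reachable(graph, node, blocked=explanatory)
        if hits_explanatory and hits_response:
            found.add(node)
    return found

candidates = common_causes(graph, "exercise", "blood_pressure")
unmeasured = candidates - measured
print("possible confounders:", sorted(candidates))
print("of which unmeasured:", sorted(unmeasured))
```

Here coupon is not flagged, because in this made-up diagram it reaches the response only through the explanatory variable.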

4. Plan data collection

  1. What study plan?
    • experiment
    • snapshot
    • longitudinal
    • stratified sampling?
  2. How to collect response and explanatory variables and covariates?
  3. Pick a sample size.
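
One simple way to pick a sample size is to work backward from the precision you want. The sketch below assumes a two-group comparison of means, equal group sizes, a known common standard deviation, and a normal approximation; the function name n_per_group and the example numbers are hypothetical. The formula is n = 2 (z * sigma / margin)^2 per group, where margin is the desired half-width of the interval on the difference in means.

```python
import math
from statistics import NormalDist

def n_per_group(sigma, margin, level=0.95):
    """Smallest n per group so that the normal-approximation confidence
    interval on a difference in means has half-width <= margin."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return math.ceil(2 * (z * sigma / margin) ** 2)

# Assumed inputs: sd of 1 unit, want the difference known to within 0.25.
print(n_per_group(sigma=1.0, margin=0.25))  # 123
```

Halving the margin roughly quadruples the required sample size, which is worth knowing before settling on a study plan.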

5. Create model representations

6. Evaluate technical performance of models

  • Prediction error
  • Bootstrapping
  • Cross validation to compare models

Often the results will cause you to revisit previous steps.
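
To illustrate comparing models by cross-validated prediction error, here is a minimal sketch using only the Python standard library. The data are simulated (an assumption for illustration), and the two candidate models, a constant and a straight line, along with the names fit_mean, fit_line, and cv_error, are hypothetical stand-ins for whatever models step 5 produced.

```python
import random
from statistics import mean

def fit_mean(xs, ys):
    """Baseline model: predict the training mean regardless of x."""
    m = mean(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Least-squares straight line y = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return lambda x: a + b * x

def cv_error(fit, xs, ys, k=5, seed=0):
    """k-fold cross-validated mean squared prediction error."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    sq_errors = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        # Score only on cases the model did not see during fitting.
        sq_errors += [(ys[i] - model(xs[i])) ** 2 for i in fold]
    return mean(sq_errors)

# Simulated data, assumed for illustration: y = 2x + noise.
rng = random.Random(1)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [2 * x + rng.gauss(0, 1) for x in xs]

err_mean = cv_error(fit_mean, xs, ys)
err_line = cv_error(fit_line, xs, ys)
print("constant model:", err_mean)
print("straight line: ", err_line)
```

Both models are scored on the same folds, so the difference in error reflects the models rather than the luck of the split.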

7. Interpret and communicate

  • Effect size. (Confidence interval?)
  • Apply loss functions.
  • Express risk sensibly; attribute risk (causality) responsibly.
  • Express your uncertainty, and standardize your results (adjustment) to help decision-makers see meaningful contrasts.
  • Be attentive to false discovery.
  • Don’t be afraid to frame things in terms of causation, but do so only if you have handled the possibility of confounding in a responsible way.
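
As one possible way to report an effect size together with its uncertainty, the sketch below computes a difference in group means and a percentile bootstrap interval around it. The two small data sets are made up for illustration, and the function name bootstrap_interval is hypothetical; other bootstrap variants or a model-based interval could serve equally well.

```python
import random
from statistics import mean

def bootstrap_interval(treated, control, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the difference in group means."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        t = rng.choices(treated, k=len(treated))
        c = rng.choices(control, k=len(control))
        diffs.append(mean(t) - mean(c))
    diffs.sort()
    tail = (1 - level) / 2
    lower = diffs[round(tail * n_resamples)]
    upper = diffs[round((1 - tail) * n_resamples) - 1]
    return lower, upper

# Made-up measurements for two groups (illustrative only).
treated = [5.1, 6.3, 5.8, 7.0, 6.1, 5.5, 6.7, 6.0]
control = [4.8, 5.2, 5.0, 5.9, 4.5, 5.4, 5.1, 4.9]

estimate = mean(treated) - mean(control)
lower, upper = bootstrap_interval(treated, control)
print(f"effect size {estimate:.2f}, interval {lower:.2f} to {upper:.2f}")
```

Reporting the interval in the units of the response lets a decision-maker judge whether the whole plausible range, not just the point estimate, is large enough to matter.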