There is a problem in science today. I've written a lot about how to cure it, but in this post I want to outline the nature of the disease as I see it. The problem goes by many names:
researcher degrees of freedom
undisclosed analytic flexibility
the file drawer
"Why most published research findings are false."
So I'm going to call it the f problem for short. I like to visualize f as a forking path. Given any particular set of raw data, a researcher faces a series of choices about how to turn it into a 'result'. There are choices over which statistical tests to run, on which variables, after excluding which outliers, and applying which preprocessing, and so on.
The f problem is that researchers can try multiple approaches in private, and select for publication the most desirable ones. Most often, it's statistically ...
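To make the forking concrete, here is a rough simulation sketch, not anything from a real study. It generates pure-noise data for two groups with no true difference, walks every path through a small, made-up set of analysis choices (two outcome variables, three outlier rules), and compares the false-positive rate of one path fixed in advance with the rate you get by reporting whichever path gives the smallest p-value. All names (`welch_p`, `drop_outliers`, `OUTCOMES`, `CUTOFFS`, `false_positive_rates`) and all parameter values are illustrative assumptions, and the p-value uses a normal approximation rather than an exact t-test.

```python
import math
import random
import statistics
from itertools import product


def welch_p(a, b):
    """Two-sided p-value for a difference in means.

    Uses the Welch statistic with a normal approximation (an assumption
    made to keep this sketch dependency-free; reasonable for n around 50).
    """
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = (statistics.fmean(a) - statistics.fmean(b)) / se
    return math.erfc(abs(z) / math.sqrt(2))


def drop_outliers(xs, cutoff):
    """Drop points more than `cutoff` SDs from the mean (None keeps everything)."""
    if cutoff is None:
        return xs
    m, s = statistics.fmean(xs), statistics.stdev(xs)
    return [x for x in xs if abs(x - m) <= cutoff * s]


# Hypothetical forks: which outcome variable to test, which outlier rule to use.
OUTCOMES = (0, 1)
CUTOFFS = (None, 2.5, 2.0)


def all_paths_p(rng, n=50):
    """Simulate one null 'study' and return the p-value from every forking path."""
    # groups[g][o] holds outcome o for group g; there is no true group difference.
    groups = [[[rng.gauss(0, 1) for _ in range(n)] for _ in OUTCOMES] for _ in range(2)]
    return [
        welch_p(drop_outliers(groups[0][o], c), drop_outliers(groups[1][o], c))
        for o, c in product(OUTCOMES, CUTOFFS)
    ]


def false_positive_rates(n_sims=2000, alpha=0.05, seed=0):
    """Return (rate for one path fixed in advance, rate when picking the best path)."""
    rng = random.Random(seed)
    fixed = forked = 0
    for _ in range(n_sims):
        ps = all_paths_p(rng)
        fixed += ps[0] < alpha      # first path: outcome 0, no exclusions
        forked += min(ps) < alpha   # report whichever path "worked"
    return fixed / n_sims, forked / n_sims


if __name__ == "__main__":
    fixed_rate, forked_rate = false_positive_rates()
    print(f"fixed path: {fixed_rate:.3f}   best of {len(OUTCOMES) * len(CUTOFFS)} paths: {forked_rate:.3f}")
```

With only six paths the selected-path rate should already sit well above the nominal 5%, even though every individual test is valid on its own. Real analyses usually offer far more forks than this toy does.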