Should you calculate a p-value when there isn’t randomisation?

This question was prompted by reading [zotpressInText item=”{TIBTBKWD}” format=”%a% (%d%, %p%)”], which argues against frequentist inferential statistics. One of the arguments concerns an assumption underlying the computation of p-values: they require random sampling. Without it, a p-value is meaningless. But random sampling is rare in social science research [zotpressInText item=”{VRZPC486}” format=”%a% (%d%, %p%)”]. Is it therefore meaningful to calculate a p-value, given that one of its assumptions is often violated? To answer this, I did what I often do when faced with a complex problem: I asked Twitter! I’m grateful to those who answered my poll, and especially to those who gave more detail or linked to papers in the comments. I’m going to highlight some of those responses, presented chronologically, though I recommend you read the whole discussion. My hope is that this will be a useful collection of arguments (for, against, and mixed) on the topic.

The poll shows that a clear majority believe it is acceptable to calculate a p-value without randomised sampling or group allocation, though some believe there are conditions to this. But a sizeable minority answered that it shouldn’t be done.
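To make concrete what randomisation contributes, here is a minimal sketch of a permutation (randomisation) test in Python. It is not taken from any of the sources discussed here; the function name `permutation_p_value` and the two small data sets are made up for illustration. The p-value is the share of random re-allocations of the scores to groups that give a difference in means at least as large as the one observed, so its reference distribution is built directly from the assumption that group allocation was random.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided Monte Carlo permutation p-value for a difference in means.

    Hypothetical helper for illustration only. It assumes the scores were
    allocated to the two groups at random, so every re-allocation is equally
    likely under the null hypothesis of no group difference.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # one random re-allocation of scores to groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1

    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (extreme + 1) / (n_permutations + 1)


# Made-up scores, purely for illustration.
treatment = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5]
control = [11.0, 12.2, 11.8, 12.9, 11.5, 12.4]

p = permutation_p_value(treatment, control)
print(f"Permutation p-value: {p:.4f}")
```

If the groups were not formed by random allocation, the shuffling step no longer mirrors how the data came about, which is one way of seeing why the interpretation of the resulting p-value becomes contested.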

Richard Morey made a valuable set of points about understanding what the p-value tells us:

This was expanded on by Oscar Olvera (I highly recommend reading the whole thread):

Zad Chow linked to [zotpressInText item=”{UCYJGVV5}” format=”%a% (%d%, %p%)”], which argues researchers should deemphasise inferential statistics and use more descriptive statistics[note]A point which was also made by Valentin Amrhein.[/note] and statistical techniques that are more closely aligned with the distributions seen in real data. Berna Devezer provided some very helpful links and summaries for sources on this topic, including a blog post by [zotpressInText item=”{UUV6GMQL}” format=”%a% (%d%, %p%)”]. This blog post argues it is acceptable to calculate a p-value without randomisation, as that’s almost all that’s available to social science researchers! Because of the violated assumptions, your data need to be correspondingly stronger and/or your conclusions of generalisation more tentative (as the statistical population may not be the one you were planning on looking at[note]Thanks to David Disabato for clarifying this point[/note]).

Whilst this topic is far from settled (I’m not even fully sure where I stand), being presented with these points has refined my thinking. As more arguments come my way, I’ll update this blog post to further this process.


[zotpressInTextBib style=”apa” sort=”ASC”]