Don’t you just love being wrong? Of course you don’t; no one does. But there is a grim satisfaction in no longer believing something for which there isn’t good enough evidence. This is what I experienced after examining the phenomenon known as ‘stereotype threat’. In short, it’s the idea that members of groups with negative stereotypes about them (e.g. women being inferior to men at maths) feel anxiety when these stereotypes are made salient, and are therefore more likely to confirm them.
I believed in it (as evidenced by the fact I’ve written about it before) because there appeared to be a lot of evidence in favour of it. But there have been some significant failed replications (Ganley et al., 2013; Stricker & Ward, 2004; Stafford, 2016; Finnigan & Corker, 2016; Wei, 2012). These were large-scale replications and, in the case of Stricker & Ward and Wei, field experiments: they used data from actual tests that included stereotype threat manipulations. This is in contrast with the initial studies and positive replications, which typically had small numbers of participants sitting a zero-consequence test in a laboratory. The fact that the largest studies found no effect of stereotype threat is significant. There have also been suggestions of publication bias, with evidence from Flore & Wicherts (2015). This was also suggested by Ganley et al. (2013), who found that 80% of published articles reported at least one instance of stereotype threat but none of the unpublished articles did (though that doesn’t necessarily mean this result was due to publication bias).
The original conclusion by Steele & Aronson (1995) has been called into question by Sackett, Hardison, & Cullen (2004). They point out that Steele & Aronson statistically adjusted the students’ test results to control for differences in the students’ prior SAT performance. So the (small) differences seen between the baseline test and the “threat” condition may have been due to the differences in prior SAT performance.
There have also been issues raised with some of the positive replications. Stoet & Geary (2012) analysed 20 attempted replications of stereotype threat (for women in mathematics). They found that only 55% of the studies (11 of 20) replicated it, and most of those (8 of 11) had adjusted the scores of participants to control for differences in maths ability (using prior test performance, e.g. SAT scores, as a covariate).
The two results above are a problem because the variable being examined is the difference in mathematical scores. The experimental hypothesis is that this difference is due to stereotype threat (with the assumption that the groups don’t differ on the covariate of prior mathematical ability). But if the groups do differ on the covariate, then that might explain the difference, making stereotype threat irrelevant (see Jussim, Crawford, Anglin, Stevens, & Duarte, 2016, for an elaboration).
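To make the covariate point concrete, here is a minimal sketch with invented numbers (they are not taken from any of the studies above). Both groups' test scores are built from the same rule relating prior score to test score, with no threat effect at all, and the only difference between the groups is their prior scores. The function names (`pooled_slope`, `raw_gap`, `adjusted_gap`) are my own, and the adjustment is a simple ANCOVA-style correction using the pooled within-group slope, which may differ from what any particular paper did.

```python
# Illustrative sketch only: the data are made up, not taken from any study.

def mean(values):
    return sum(values) / len(values)

def pooled_slope(groups):
    """Pooled within-group slope of test score on prior score."""
    cov = 0.0
    var = 0.0
    for prior, score in groups:
        prior_mean, score_mean = mean(prior), mean(score)
        for p, s in zip(prior, score):
            cov += (p - prior_mean) * (s - score_mean)
            var += (p - prior_mean) ** 2
    return cov / var

def raw_gap(group_a, group_b):
    """Difference in unadjusted mean test scores (A minus B)."""
    return mean(group_a[1]) - mean(group_b[1])

def adjusted_gap(group_a, group_b):
    """Gap in test scores after removing what the prior-score gap predicts."""
    slope = pooled_slope([group_a, group_b])
    prior_gap = mean(group_a[0]) - mean(group_b[0])
    return raw_gap(group_a, group_b) - slope * prior_gap

# Each group is (prior scores, test scores). Group A simply starts with
# prior scores 100 points higher; no threat manipulation is simulated.
group_a = ([500, 550, 600, 650, 700], [11, 9, 14, 11, 15])
group_b = ([400, 450, 500, 550, 600], [9, 7, 12, 9, 13])

print(f"raw gap:      {raw_gap(group_a, group_b):.2f}")
print(f"adjusted gap: {adjusted_gap(group_a, group_b):.2f}")
```

In this toy data the raw means differ by two points, while the adjusted gap is zero because the difference in prior scores accounts for all of it. The raw and adjusted comparisons answer different questions, so a result reported on adjusted scores can't be read as a statement about the raw gap unless the groups really are equal on the covariate.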
More recently, Flore, Mulder, & Wicherts (2019) conducted a registered report of the phenomenon in Dutch high schools. They collected data from 2067 students aged 13 or 14. Using Bayesian analysis, they found much stronger evidence for the null hypothesis (that stereotype threat has no meaningful impact on performance) than for an effect.
I’m not saying stereotype threat doesn’t exist at all, but it seems to be a lot more complicated than simply “making stereotypes salient will reduce performance”. I think its relevance in the real world is hugely over-played and that there are a lot of interacting variables at work which affect its impact. Whilst even small effect sizes (as found in most of the studies) can have impacts in the real world, I am unconvinced that spending time and resources on reducing stereotype threat is worthwhile when other factors are known to negatively affect mathematics performance (Ceci & Williams, 2010). I am willing to be convinced of its existence but there needs to be much stronger evidence for it.
For a nuanced discussion of the possible reasons for positive effects in the lab but null effects in the field, I recommend Stricker (2008). For an excellent in-depth statistical analysis as to why it’s difficult to replicate the original stereotype threat finding, I suggest you read this post by Uli Schimmack (2015).
Ceci, S. J., & Williams, W. M. (2010). Sex differences in math-intensive fields. Current Directions in Psychological Science, 19, 275–279. doi: 10.1177/0963721410383241
Dictionary.com (2015). Salient. [Online] Available at: http://dictionary.reference.com/browse/salient [Accessed 21/12/2015]
Finnigan, K.M. & Corker, K.S. (2016). Do performance avoidance goals moderate the effect of different types of stereotype threat on women’s math performance? Journal of Research in Personality, 63, 36-43.
Flore, P.C. & Wicherts, J.M. (2015). Does stereotype threat influence performance of girls in stereotyped domains? A meta-analysis. Journal of School Psychology, 53 (1), 25-44.
Ganley, C. M., Mingle, L. A., Ryan, A. M., Ryan, K., Vasilyeva, M., & Perry, M. (2013). An Examination of Stereotype Threat Effects on Girls’ Mathematics Performance. Developmental Psychology. Advance online publication. doi: 10.1037/a0031412
Jussim, L., Crawford, J. T., Anglin, S. M., Stevens, S. T., & Duarte, J. L. (2016). Interpretations and methods: Towards a more effectively self-correcting social psychology. Journal of Experimental Social Psychology, 66, 116-133.
PsychBrief (2014). Overcoming stereotype threat. [Online] Available from: http://psychbrief.blogspot.co.uk/2014/05/overcoming-stereotype-threat_30.html [Accessed: 21/12/2015]
Replication-Index (2015). Why are Stereotype-Threat Effects on Women’s Math Performance Difficult to Replicate? [Online] Available at: https://replicationindex.wordpress.com/2015/01/06/why-are-stereotype-threat-effects-on-womens-math-performance-difficult-to-replicate/ [Accessed: 21/12/2015]
Sackett, P.R., Hardison, C.M., & Cullen, M.J. (2004). On Interpreting Stereotype Threat as Accounting for African American–White Differences on Cognitive Tests. American Psychologist, 59 (1), 7-13.
Stafford, T. (2016). No stereotype threat effect in international chess, Annual Conference of the Cognitive Science Society, 10-13th August 2016, Philadelphia, USA.
Steele, C.M. & Aronson, J. (1995). Stereotype threat and the intellectual test performance of African Americans. Journal of Personality and Social Psychology, 69 (5), 797-811.
Stoet, G. & Geary, D.C. (2012). Can Stereotype Threat Explain the Gender Gap in Mathematics Performance and Achievement? Review of General Psychology, 16 (1), 93-102.
Stricker, L.J. (2008). The Challenge of Stereotype Threat for the Testing Community. [Online] Available from: http://www.ets.org/Media/Research/pdf/RM-08-12.pdf [Accessed: 21/12/2015]
Stricker, L.J. & Ward, W.C. (2004). Stereotype Threat, Inquiring About Test Takers’ Ethnicity and Gender, and Standardized Test Performance. Journal of Applied Social Psychology, 34 (4), 665–693.
Wei, T.E. (2012). Sticks, Stones, Words, and Broken Bones; New Field and Lab Evidence on Stereotype Threat. Educational Evaluation and Policy Analysis, 34 (4), 465-488.