Ability grouping of students doesn’t work

Academic achievement in England is strongly affected by class, with those of a higher socioeconomic status (S.E.S.) more likely to achieve than those of a lower S.E.S. (Clifton & Cook, 2012). These gaps can be seen in children as young as three years old (Feinstein, 2003) and continue to widen as the children age (Feinstein, 2004). One of the historical measures to reduce these inequalities is ability grouping: students are placed into groups based on their test scores so they can be taught with peers of similar ability. ‘Streaming’ (called ‘tracking’ in the U.S.) divides students into groups based on their test scores across all or most of their subjects, meaning they stay with the same students across those subjects. This is similar to ‘banding’. ‘Setting’ occurs when students are put into ability groups for specific subjects, so placements are not necessarily consistent across subjects, e.g. a student could be placed in the top set for maths but the middle set for English (Francis et al., 2017). Data on the prevalence of ability grouping are inconsistent, but the evidence suggests it is prevalent in secondary schools and, to a lesser extent, primary schools in the U.K. (Dracup, 2014). It is becoming more common in the U.S. after a drop in popularity during the 1990s (Steenbergen-Hu, Makel, & Olszewski-Kubilius, 2016).

Why are students grouped according to test scores?

Setting students is popular with parents, especially middle-class parents (Boaler, Wiliam, & Brown, 2000), so schools are incentivised to structure their classrooms this way. But popularity doesn’t mean it is beneficial to the students, or even that it should be used. The putative benefits of ability grouping include: allowing teachers to go at a pace suitable for students of different abilities; giving more able students greater opportunities to push themselves; establishing smaller classes for lower attaining pupils (Ireson, 1999); and children in the lower sets receiving greater support (Francis et al., 2017). There is some evidence to suggest it works, e.g. Steenbergen-Hu, Makel, & Olszewski-Kubilius (2016), but there are methodological concerns with the supporting evidence which undermine the authors’ conclusions.

Steenbergen-Hu, Makel, & Olszewski-Kubilius (2016) conducted a meta-analysis of meta-analyses studying the effects of different types of ability grouping. They identified four types, of which I will focus on two: between-class (same-age students are placed into high, average, or low classes based on prior achievement across subjects) and within-class (teachers assign students to sub-groups within the classroom based on ability). Of the meta-analyses included, seven were rated as having low methodological quality (characterised as having “major weaknesses”) and six as having moderate methodological quality. None were rated as high quality. Most of the low-quality meta-analyses lacked fundamental details like effect sizes and research design, and all the others were missing important information. The results for between-class grouping suggested a weak to nil effect (all confidence intervals included 0), with a small positive impact of within-class grouping. But the spectre of publication bias hangs over these results: although the authors assessed for it, they used a sub-optimal method. The trim and fill method has a false positive rate of close to 100% given likely values of publication bias (Carter, Schönbrodt, Hilgard, & Gervais, 2017). Publication bias also inflates effect sizes: when typical sample sizes are small and measures are noisy, the statistical significance filter means that only estimates likely to be larger than the true effect size are published (Gelman, 2011). We can therefore be even less confident in the results. Even for the mini meta-analysis of 12 randomised controlled trials, the highest quality studies, the effect size of between-class grouping couldn’t break free of 0 and the evidence for within-class grouping was mixed (with a meta-analytic result of a small positive effect size).
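To make the statistical significance filter concrete, here is a minimal simulation sketch. Everything in it is an illustrative assumption rather than a value taken from the studies above: a small true effect of 0.1 standard deviations, 20 students per group, a known standard deviation of 1, and a rule that only results significant at the two-sided 5% level get "published".

```python
import math
import random
import statistics


def simulate_significance_filter(true_effect=0.1, n_per_group=20,
                                 n_studies=5000, seed=0):
    """Simulate many small two-group studies and 'publish' only significant ones.

    All parameter values are hypothetical and chosen for illustration.
    """
    rng = random.Random(seed)
    # Standard error of a difference between two group means (unit variance assumed)
    se = math.sqrt(2 / n_per_group)
    all_estimates, published = [], []
    for _ in range(n_studies):
        treated = [rng.gauss(true_effect, 1) for _ in range(n_per_group)]
        control = [rng.gauss(0, 1) for _ in range(n_per_group)]
        estimate = statistics.fmean(treated) - statistics.fmean(control)
        all_estimates.append(estimate)
        # Keep only estimates that pass a two-sided z-test at the 5% level
        if abs(estimate) / se > 1.96:
            published.append(estimate)
    return all_estimates, published


all_estimates, published = simulate_significance_filter()
mean_all = statistics.fmean(all_estimates)
mean_abs_published = statistics.fmean([abs(e) for e in published])
print(f"mean of all estimates: {mean_all:.2f}")
print(f"mean absolute size of published estimates: {mean_abs_published:.2f}")
print(f"share of studies published: {len(published) / len(all_estimates):.1%}")
```

Under these assumptions an estimate has to exceed roughly 0.62 in absolute size to reach significance, so every "published" estimate overstates a true effect of 0.1 several times over, even though the average across all simulated studies sits close to the truth.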

An example of streaming

The evidence against

Whilst ability grouping is supposed to reduce disparities between students of different S.E.S., it can widen them (Higgins et al., 2012). It can also promote social segregation (OECD, 2014), with working class pupils – and students from some minority ethnic groups – disproportionately represented in low sets and streams (Kutnick et al., 2005). Pupils in some mathematics sets are taught as if they were identical in ability, given the same tasks at the same pace. Pupils in lower mathematics sets report being, and are observed to be, insufficiently challenged, and are expected to spend more time copying off the board than pupils in higher sets. Consequently (although this isn’t the only reason for this finding), children placed in lower sets show lower motivation (Sukhnandan & Lee, 1998).

For primary and secondary school students, there is a paucity of strong evidence showing a positive effect of ability grouping (either setting or streaming) on academic outcomes on average (Kutnick et al., 2005). Results from the PISA 2009 survey (OECD, 2010) show ability grouping within schools is related to lower performance at the system level. Looking at individual subjects, Ireson, Hallam, & Hurley (2005) investigated the effect of setting in Maths, Science, and English at G.C.S.E. and found no significant effect for any of the subjects. However, this isn’t to say there is no evidence of its benefit for certain sub-groups. Pupils in higher sets achieve more than children in schools that did not stream their pupils, even after controlling for variations between children (Parsons & Hallam, 2014), not only because teachers can move at a faster pace but because there is a greater chance they will be taught by a more experienced and qualified teacher [1]. The corollary of this is that children in lower sets are taught by less experienced teachers who are less likely to be subject specialists. They are also more likely to have a higher number of teachers, as their educators are more likely to leave (Kutnick et al., 2005).

As Boaler & Wiliam (2001) summarise, “streaming [appears to have] no academic benefits whatsoever, while setting confers small academic benefits on some high-attaining students, at the expense of large disadvantages for lower attainers”.

Where do these negative effects come from?

There are multiple interrelated factors which negatively affect students’ outcomes when they are grouped according to test scores. One of the main reasons is that a student’s test score is not the only variable that determines which set they will be placed in; class and Special Educational Needs diagnoses are also significant predictors (Muijs & Dunne, 2010). Because there is a lack of fluidity between groups, they are unable to account for changes in children’s results or for misallocation (Dunne et al., 2007). Lower set classes are taught by poorer quality teachers (Slavin, 1990), in addition to the other deficiencies mentioned earlier. Placing students in sets can create an artificial ceiling, where students are excluded from higher tier study and therefore higher grades (e.g. being limited to a ‘C’ grade by taking the Foundation level paper for a subject). Pupils in lower sets often express dissatisfaction with their set and often with the school as a whole (Ball, 1981). This “anti-school” attitude has been shown to have a negative impact on outcomes (Blatchford & Baines, 2010). Students’ self-perception of ability and their confidence in being able to succeed in a subject can also be reduced by being placed in a lower set. All these factors combine to create self-fulfilling prophecies for the students: placement in lower sets due to a multitude of factors (of which previous attainment is only one) leads to reduced motivation and self-efficacy (Bandura, 1994). Coupled with poorer quality teaching and artificial ceilings, these conspire to limit student achievement and are an example of the Matthew Effect: the rich get richer and the poor get poorer (Kerckhoff & Glennie, 1999).

What’s an alternative?

With numerous studies pointing to limited positive effects and wide-ranging negative effects, the evidence leads one to conclude ability grouping should no longer be practised in schools. But what should replace it? Mixed grouping appears to be the fairest option given the evidence, as it benefits the most students without punishing others as strongly as ability grouping does (Taylor et al., 2017). However, there are a number of factors deterring schools from adopting mixed ability grouping, e.g. the difficulty of changing teachers’ minds about mixed ability teaching and the lack of exemplars to follow (Taylor et al., 2017).


Ainscow, M.; Dyson, A.; Goldrick, S.; & West, M. (2012). Developing Equitable Education Systems. London: Routledge.

Ball, S. (1981). Beachside Comprehensive: a case-study of secondary schooling. Cambridge: Cambridge University Press.

Bandura, A. (1994). Self-efficacy. In V. S. Ramachaudran (Ed.), Encyclopedia of human behavior (Vol. 4, pp. 71-81). New York: Academic Press. (Reprinted in H. Friedman [Ed.], Encyclopedia of mental health. San Diego: Academic Press, 1998).

Blatchford, P. & Baines, E. (2010) Peer relations in school. In K. Littleton, C. Wood and Kleine Staarman (Eds) Elsevier Handbook of Educational Psychology: New Perspectives on Learning and Teaching. Emerald: Bingley, UK (ISBN 978-1-84855-232-6).

Boaler, J. (1997) When even the winners are losers: Evaluating the experiences of ‘top set’ students. Journal of Curriculum Studies, 29 (2).

Boaler, J. & Wiliam, D. (2001) “‘We’ve still got to learn!’ Students’ perspectives on ability grouping and mathematics achievement” in Gates ed. (2001: 77-92)

Boaler, J.; Wiliam, D.; & Brown, M. (2000). Students’ experiences of ability grouping—disaffection, polarisation and the construction of failure. British Educational Research Journal, 26 (5), 631–648.

Carter, E.C.; Schönbrodt, F.D.; Hilgard, J.; & Gervais, W.M. (2017). Correcting for bias in psychology: A comparison of meta-analytic methods. Retrieved from: psyarxiv.com/9h3nu

Clifton, J. & Cook, W. (2012). A Long Division: Closing the attainment gap in England’s secondary schools. London: IPPR. Retrieved from: https://www.ippr.org/files/images/media/files/publication/2012/09/long%20division%20FINAL%20version_9585.pdf

Dracup, T. (2014) The Politics of Setting. Available at: https://giftedphoenix.wordpress.com/2014/11/12/the-politics-of-setting/

Dunne, M., Humphreys, J., Sebba, J., Dyson, A., Gallannaugh, F. and Muijs, D. (2007). Effective teaching and learning for pupils in low attaining groups. London: DCSF.

Feinstein, L. (2003). Inequality in the Early Cognitive Development of British Children in the 1970 Cohort. Economica, 70: 73–97

Feinstein, L. (2004). Mobility in Pupils’ Cognitive Attainment During School Life. Oxford Review of Economic Policy 20 (2), 213–229.

Francis, B.; Archer, L.; Hodgen, J.; Pepper, D.; Taylor, B.; & Travers, M. (2017). Exploring the relative lack of impact of research on ‘ability grouping’ in England: a discourse analytic account. Cambridge Journal of Education, 47 (1), 1-17.

Gelman, A. (2011). The statistical significance filter. Retrieved from: http://andrewgelman.com/2011/09/10/the-statistical-significance-filter/

Higgins, S.; Kokotsaki, D.; Coe, R. (2012). The Teaching and Learning Toolkit. London: Education Endowment Foundation and Sutton Trust. Retrieved from: h

Ireson, J. (1999). Innovative Grouping Practices In Secondary Schools. Research Report No 166. Available at: http://dera.ioe.ac.uk/4460/1/RR166.pdf

Ireson, J., Hallam, S. and Hurley, C. (2005). What are the effects of ability grouping on GCSE attainment? British Educational Research Journal, 31, 313-328.

Kerckhoff, A.C. & Glennie, E. (1999). The Matthew Effect in Education. Research in Sociology of Education and Socialisation, 12, 35-66. Retrieved from: https://www.researchgate.net/profile/Elizabeth_Glennie/publication/257936416_The_Matthew_Effect_in_American_Education/links/00b7d52655454183f4000000/The-Matthew-Effect-in-American-Education.pdf

Kutnick, P.; Sebba, J.; Blatchford, P.; Galton, M.; & Thorp, J. (2005). The Effects of Pupil Grouping: Literature Review. Research Report No 688. Retrieved from: http://citeseerx.ist.psu.edu/viewdoc/download?doi=

Muijs, D. and M. Dunne (2010). Setting by ability–or is it? A quantitative study of determinants of set placement in English secondary schools. Educational Research 52 (4), 391-407.

OECD (2010). PISA 2009 results: What Makes a School Successful? Resources, Policies, and Practices (Vol. IV) http://dx.doi.org/10.1787/9789264091559-en

OECD (2014), “Are Grouping and Selecting Students for Different Schools Related to Students’ Motivation to Learn?”, PISA in Focus, No. 39, OECD Publishing, Paris. http://dx.doi.org/10.1787/5jz5hlpb6nxw-en

Parsons, S. & Hallam, S. (2014). The impact of streaming on attainment at age seven: evidence from the Millennium Cohort Study. The Oxford Review of Education, 40 (5), 567-589.

Slavin, R. (1990). Achievement effects of ability grouping in secondary schools: a best evidence synthesis. Review of Educational Research, 60, 471-499.

Steenbergen-Hu, S.; Makel, M.C.; & Olszewski-Kubilius, P. (2016). What One Hundred Years of Research Says About the Effects of Ability Grouping and Acceleration on K–12 Students’ Academic Achievement. Review of Educational Research, 86 (4), 849–899.

Sukhnandan, L. & Lee, B. (1998) Streaming, Setting and Grouping by Ability: a Review of the Literature. Slough: NFER.

Taylor, B.; Francis, B.; Archer, L.; Hodgen, J.; Pepper, D.; Tereshchenko, A.; & Travers, M. (2016). Factors deterring schools from mixed attainment teaching practice. Pedagogy, Culture, and Society, 25 (3). https://doi.org/10.1080/14681366.2016.1256908

  1. However, there is evidence that even students in the top set don’t benefit from this placement, as they can be disadvantaged by the fast pace and high expectations (Boaler, 1997).
