One of the challenges an instructor faces when developing any collaborative project is how to assess learning. When multiple students work together to create a product, the final product does not necessarily reflect the learning of every student in the group. In fact, there is evidence that the product created by a group is not a good indicator of individual group members' learning. For example, in a study by Michael Delucchi (2007), students' grades on two group projects were only weakly related to their performance on subsequent individual exams intended to assess the same learning outcomes, and in one case the group project grade was unrelated to exam performance when controlling for a different measure of individual ability.

Noreen Webb (1993) has also shown that the performance of students learning math skills in small mixed-ability groups can be “uniformly high in the group setting but quite variable on the individual test” (138). Other research by Webb and colleagues (1998) found that unless a group was made up of uniformly high-achieving students, the group's performance on an assignment was predicted by the initial ability of its highest-achieving member, rather than by the average ability of the group or the ability of its lowest-achieving member.

One possible reason for this might have to do with the nature of the work students are asked to do when they are in groups. Collaborative projects tend to be disjunctive in nature (see Steiner's taxonomy of tasks). This means that the goal of the task is to come up with the best or highest-quality product, and the task itself does not involve the group dividing up subtasks and distributing them among members. (Admittedly, students may first take a divide-and-conquer approach to a large project, but by assigning the group a single grade, the instructor signals that every student is responsible for every aspect of the project.) Performance on disjunctive tasks is most influenced by the highest-achieving individual. Consider team-based testing as an example. The team's goal is to submit the best (i.e., correct) answers, and team members are unlikely to simply divide the questions among themselves; instead, they are instructed to discuss each question together. Thus, in a group with high-, medium-, and low-achieving students, we might expect the team's performance to be most affected by the highest-achieving student.

So, given the practical and even ethical concerns associated with group assessment (Kagan 1995), what's an instructor to do? Getting rid of the group assessment component seems like the easy solution, but there is reason to believe that would just create different problems. One of the prevailing theories of collaborative learning suggests that motivation to achieve a common goal (i.e., social interdependence) is necessary for collaborative learning to occur (Johnson and Johnson 2002), or at the very least facilitates it (Meijer et al. 2020), and the graded group product may be necessary to foster that interdependence. While I think there is a lot that could be unpacked and questioned in that assumption (e.g., does rewarding a good final product actually create a shared learning goal or just a performance goal?), I am willing to concede that having no graded group component would seem to eliminate any external motivation to collaborate – in a sense, making collaboration optional.

Unfortunately, the research on how best to assess group work has largely focused on motivation rather than on increasing the validity of the assessment. In particular, much research has examined how intragroup peer assessment might supplement the group grade, but that research tends to focus on how such techniques might reduce freeloading and social loafing by incorporating an element of individual accountability (Meijer et al. 2020; Slavin, Hurley, and Chamberlain 2003). Reducing problematic behaviors in groups is certainly a good goal – and peer assessment like this is something I will explore more in a future post – but evidence that peer assessment increases engagement among students who might otherwise not fully participate doesn't necessarily mean that it increases the validity of a group assessment.

I think the challenge of coming up with a solution to the problem of group assessment is that there may not be one. If all you see as the instructor is the final product, then you need to consider carefully what that product can and can't tell you about the individual learning of the group's members. In some cases, you might be able to observe some individual assessment criteria. For example, if the final product is a presentation, you could require each student to speak for a roughly equal amount of time in order to get an individual measure of oral communication skills.

Alternatively, the project could be structured so that individual work on elements that contribute to the final product can be assessed at earlier points in the semester. For example, if the final product is a group research paper or presentation, an early step in the process could require each student to submit an annotated bibliography. This work would allow an instructor to assess each student's ability to select high-quality sources and communicate about them effectively. This kind of scaffolded individual work also has the benefit of encouraging better collaboration by holding all group members accountable for contributing to the background research needed to develop the project.

Both of these recommendations, though, are attempts to circumvent the problem rather than solve it. They don't make the final group assessment more valid; they create opportunities for individual assessment. That's not a bad thing. It just may not always be possible, depending on the nature of the project and the time and resources available to the instructor. There is also the possibility that a greater emphasis on individual assessment could have the unintended effect of reducing the sense of social interdependence that a group grade is supposed to foster.

In the end, if the rationale behind assigning a collaborative project is that you want to take advantage of the potential learning benefits associated with cooperative learning, then the assignment is to some extent a means, not an end. Thus, the assessment of group work might be better thought of as formative rather than summative (Dixson and Worrell 2016). Don't give it too much weight toward the final grade; design the project in a way that allows for observing and assessing individual student learning; consider treating the final product grade as more of a measure of whether the group members engaged with the process (which intragroup peer assessments might actually help you assess); and come up with other methods that can serve as summative assessments of individual learning. After all, nobody is arguing that collaborative projects are high-impact practices because they help us accurately assess students' learning, so maybe the solution is to just not pretend they do.

References

Delucchi, Michael. 2007. “Assessing the Impact of Group Projects on Examination Performance in Social Statistics.” Teaching in Higher Education 12 (4): 447–460. https://doi.org/10.1080/13562510701415383.

Dixson, Dante D., and Frank C. Worrell. 2016. “Formative and Summative Assessment in the Classroom.” Theory into Practice 55 (2): 153–159. https://doi.org/10.1080/00405841.2016.1148989.

Johnson, David W., and Roger T. Johnson. 2002. “Cooperative Learning and Social Interdependence Theory.” In Theory and Research on Small Groups. Social Psychological Applications to Social Issues, vol. 4, edited by R. Scott Tindale et al. Boston, MA: Springer. https://doi.org/10.1007/0-306-47144-2_2.

Kagan, Spencer. 1995. “Group Grades Miss the Mark.” Educational Leadership 52 (8): 68–71. http://www.ascd.org/publications/educational-leadership/may95/vol52/num08/Group-Grades-Miss-the-Mark.aspx.

Meijer, Hajo, Rink Hoekstra, Jasperina Brouwer, and Jan-Willem Strijbos. 2020. “Unfolding Collaborative Learning Assessment Literacy: A Reflection on Current Assessment Methods in Higher Education.” Assessment & Evaluation in Higher Education 45 (8): 1222–1240. https://doi.org/10.1080/02602938.2020.1729696.

Slavin, Robert E., Eric A. Hurley, and Anne Chamberlain. 2003. “Cooperative Learning and Achievement: Theory and Research.” In Handbook of Psychology, 177–198. https://doi.org/10.1002/0471264385.wei0709.

Webb, Noreen M. 1993. “Collaborative Group Versus Individual Assessment in Mathematics: Processes and Outcomes.” Educational Assessment 1 (2): 131–152. https://doi.org/10.1207/s15326977ea0102_3.

Webb, Noreen M., Kariane M. Nemer, Alexander W. Chizhik, and Brenda Sugrue. 1998. “Equity Issues in Collaborative Group Assessment: Group Composition and Performance.” American Educational Research Journal 35 (4): 607–651. https://doi.org/10.3102/00028312035004607.

David Buck, associate professor of psychology, is the 2020-2022 Center for Engaged Learning Scholar. Dr. Buck’s CEL Scholar project focuses on collaborative projects and assignments as a high-impact practice.

How to Cite this Post

Buck, David. (2021, July). The Problem with Assessing Groups [Blog Post]. Retrieved from https://www.centerforengagedlearning.org/the-problem-with-assessing-groups