In previous posts, I’ve discussed how collaborative assignments pose a challenge for valid assessment because the resulting product typically reflects pooled ability and effort (for a good review, see Webb et al. 1998). One way instructors have attempted to overcome this is by incorporating intragroup peer assessment into the project grade. However, this “solution” may bring additional problems with it: peer assessment may not actually be a valid assessment tool, and it may be perceived as unfair by students. These two problems are not mutually exclusive, but they aren’t the same either. For example, a student who argues a grade is “unfair” may in some cases be correct, but it may also be that the claim is the result of the grading policy being miscommunicated by the teacher or misunderstood by the student. This distinction is important for researchers studying the impact of peer assessment because the validity of a grade and perceptions of its fairness are two different outcome variables.

For example, if we wanted to see if a peer assessment activity increased students’ perceptions of the fairness of a project grade, our outcome would be students’ self-reported perceptions of the fairness of that grade or the process by which it was calculated. This would ideally be done in an experimental context that compares students working on collaborative projects who engaged in peer assessment to students who did not engage in peer assessment. Unfortunately, despite calls for it, there is not a lot of experimental or quasi-experimental research on peer assessment (Topping 2010).

Less helpful, but still perhaps informative, is descriptive research in which students are asked to assess the fairness of their grade after peer assessment (with no control group to compare against). Descriptive research like this indicates that although the majority of students seem to have a favorable attitude toward using peer assessment as a means of individuating group grades, a large minority remains either ambivalent about or negative toward this kind of grading (e.g., Carvalho 2013; Jin 2012). In other words, the effect of peer assessment on perceptions of grade fairness is mixed.

Student Concerns About Peer Assessment

Students’ skepticism about the fairness of peer assessment seems to center on two concerns: possible friendship marking (i.e., assessments based on personal relationships rather than the intended criteria), and the possibility that peers are not motivated or competent enough to serve as accurate assessors. If these concerns are warranted, then grades based on peer assessment may also have weak validity. So, are these concerns warranted?

When it comes to the concern over friendship marking, there’s certainly ample evidence that we are biased when engaging in self-evaluation (Haas et al. 1998; Johnston and Miles 2004), and it seems reasonable to believe that we might unconsciously evaluate close friends through similarly rose-colored lenses. Evaluations of friends’ contributions may also be influenced by interpersonal motives. For example, group members may assign high scores to their peers or engage in reciprocal grading (wherein each group member assigns identical scores for everyone, so that no individual receives low marks; e.g., Edgerton and McKechnie 2002) in order to maintain positive social relations with peers. If this is the case, we might expect peer assessment scores to be impacted by the degree of anonymity of the assessor, since an anonymous assessment is less likely to incur social rewards or punishments than an identifiable one.

As it turns out, research does show that anonymous peer assessments tend to be more negative than identifiable peer assessments (Peterson and Peterson 2011), and confidential peer assessments completed outside of class are less positive than those completed inside of class—possibly because the greater privacy outside of class increases the perception of confidentiality (Dommeyer 2006). There’s also evidence that students are less willing to give each other honest critical feedback if it could have a punitive effect on the peer, such as when that feedback gets incorporated into the actual project or class grade (Sridharan, Tai, and Boud 2019). Collectively, this suggests that social motives do seem to impact the validity of peer assessments.

Though I haven’t seen it brought up frequently in the peer assessment literature, there is also a concern that group- or identity-based prejudices could influence students’ assessments. Although some research suggests gender bias may not be common in peer assessment (Tucker 2014), there is a dearth of research on the potential effects of other types of bias (e.g., racial, ethnic, or nationality-based prejudices). However, general research on how prejudice and stereotypes affect a person’s perception and evaluations of behavior would certainly suggest that these are points of concern that need to be more fully explored in the context of peer assessment.

Students’ concerns that peers may not be able or motivated to provide accurate assessments also have some basis in research. There is evidence that peer assessments are moderately correlated with teacher assessments (Li et al. 2016), but the strength of this relationship can differ depending on the nature of the assignment and individual differences in the students doing the assessing. For example, high-ability students give more varied peer assessment scores to their teammates than lower-ability students do (Davison et al. 2014; Saavedra and Kwun 1993). The explanation researchers propose for this finding is that higher-ability students are more motivated and (perhaps more importantly) better able to distinguish between the contributions of others.

Given all the evidence presented here, it would seem that the concerns expressed by some students may in fact be warranted. So, where does that leave us? Is there a way to conduct peer assessment that is both valid and perceived as fair? It’s hard to say. If students’ concerns about the unfairness of peer assessment are grounded in truth, then addressing those issues directly would be a reasonable first step. For example, because anonymous peer assessments and those completed outside of class tend to be lower, we might consider anonymity and out-of-class completion to be important features of valid peer assessment (this assumes, of course, that lower scores are more valid, which is not a given but does seem like a theoretically reasonable assumption in this case).

It also might make sense to frame the use of peer assessments in such a way that students don’t perceive giving negative evaluations as punitive. This can involve spending some time explaining the importance of these assessments as sources of feedback, training students on the assessment tool, and structuring the assessment process so that it requires students to include both qualitative feedback and quantitative marks (Miller 2003). Even though qualitative feedback might not be useful as part of a summative assessment, writing specific justifications for their ratings may force students to think more about why they are giving different numeric ratings. Some researchers also suggest involving students in the development of the peer assessment criteria (Falchikov 2003) and having groups discuss their contributions to the project with each other before completing the assessments in private (Lejk and Wyvill 2001).

References

Carvalho, Ana. 2013. “Students’ Perceptions of Fairness in Peer Assessment: Evidence from a Problem-based Learning Course.” Teaching in Higher Education 18 (5): 491–505. https://doi.org/10.1080/13562517.2012.753051.

Davison, H. Kristl, Vipanchi Mishra, Mark N. Bing, and Dwight D. Frink. 2014. “How Individual Performance Affects Variability of Peer Evaluations in Classroom Teams: A Distributive Justice Perspective.” Journal of Management Education 38 (1): 43–85. https://doi.org/10.1177/1052562912475286.

Dommeyer, Curt J. 2006. “The Effect of Evaluation Location on Peer Evaluations.” Journal of Education for Business 82 (1): 21–26. https://doi.org/10.3200/JOEB.82.1.21-26.

Edgerton, Edward, and Jim McKechnie. 2002. “Students’ Views of Group-based Work and the Issue of Peer Assessment.” Psychology Learning & Teaching 2 (2): 76–81. https://doi.org/10.2304/plat.2002.2.2.76.

Falchikov, Nancy. 2003. “Involving Students in Assessment.” Psychology Learning & Teaching 3 (2): 102–108. https://doi.org/10.2304/plat.2003.3.2.102.

Haas, Amie L., Robert W. Haas, and Thomas R. Wotruba. 1998. “The Use of Self-Ratings and Peer Ratings to Evaluate Performances of Student Group Members.” Journal of Marketing Education 20 (3): 200–209.

Jin, Xiao-Hua. 2012. “A Comparative Study of Effectiveness of Peer Assessment of Individuals’ Contributions to Group Projects in Undergraduate Construction Management Core Units.” Assessment & Evaluation in Higher Education 37 (5): 577–589. https://doi.org/10.1080/02602938.2011.557147.

Johnston, Lucy, and Lynden Miles. 2004. “Assessing Contributions to Group Assignments.” Assessment & Evaluation in Higher Education 29 (6): 751–768. https://doi.org/10.1080/0260293042000227272.

Lejk, Mark, and Michael Wyvill. 2001. “The Effect of the Inclusion of Self-Assessment with Peer Assessment of Contributions to a Group Project: A Quantitative Study of Secret and Agreed Assessments.” Assessment & Evaluation in Higher Education 26 (6): 551–561. https://doi.org/10.1080/02602930120093887.

Li, Hongli, Yao Xiong, Xiaojiao Zang, Mindy L. Kornhaber, Youngsun Lyu, Kyung Sun Chung, and Hoi K. Suen. 2016. “Peer Assessment in the Digital Age: A Meta-Analysis Comparing Peer and Teacher Ratings.” Assessment & Evaluation in Higher Education 41 (2): 245–264. https://doi.org/10.1080/02602938.2014.999746.

Miller, Peter J. 2003. “The Effect of Scoring Criteria Specificity on Peer and Self-Assessment.” Assessment & Evaluation in Higher Education 28 (4): 383–393. https://doi.org/10.1080/0260293032000066218.

Peterson, Christina Hamme, and Andrew N. Peterson. 2011. “Impact of Peer Evaluation Confidentiality on Student Marks.” International Journal for the Scholarship of Teaching and Learning 5 (2): Article 13. https://doi.org/10.20429/ijsotl.2011.050213.

Saavedra, Richard, and Seog K. Kwun. 1993. “Peer Evaluation in Self-Managing Work Groups.” Journal of Applied Psychology 78 (3): 450–462.

Sridharan, Bhavani, Joanna Tai, and David Boud. 2019. “Does the Use of Summative Peer Assessment in Collaborative Group Work Inhibit Good Judgement?” Higher Education 77: 853–870. https://doi.org/10.1007/s10734-018-0305-7.

Topping, Keith J. 2010. “Methodological Quandaries in Studying Process and Outcomes in Peer Assessment.” Learning and Instruction 20 (4): 339–343. https://doi.org/10.1016/j.learninstruc.2009.08.003.

Tucker, Richard. 2014. “Sex Does Not Matter: Gender Bias and Gender Differences in Peer Assessments of Contributions to Group Work.” Assessment & Evaluation in Higher Education 39 (3): 293–309. https://doi.org/10.1080/02602938.2013.830282.

Webb, Noreen M., Kariane M. Nemer, Alexander W. Chizhik, and Brenda Sugrue. 1998. “Equity Issues in Collaborative Group Assessment: Group Composition and Performance.” American Educational Research Journal 35 (4): 607–651. https://doi.org/10.3102/00028312035004607.

David Buck, associate professor of psychology, is the 2020-2022 Center for Engaged Learning Scholar. Dr. Buck’s CEL Scholar project focuses on collaborative projects and assignments as a high-impact practice.

How to Cite this Post

Buck, David. 2022. “I’ve Got It! What If They Grade Each Other?” Center for Engaged Learning (blog), Elon University. March 1, 2022. https://www.centerforengagedlearning.org/ive-got-it-what-if-they-grade-each-other/.