Should AI be Involved in Assessing Student Work?

by Amanda Sturgill
June 24, 2025

I was at a campus workshop this week, and we discussed this recent article about a student requesting a tuition refund after discovering a piece of course content was generated by ChatGPT (Hill 2025). I thought the use of the word “catching” was an interesting choice in the headline I saw; it brings to mind the way some treat AI use in academia: as a dirty secret. Students and instructors alike are exploring the limits and boundaries of using the tools as part of instruction.

In that same workshop, there were occasional mentions of how great it would be if it were possible to see conclusive evidence that students had used AI to complete coursework. It’s a wish I’ve heard from a lot of people. Unfortunately, easy, automatic detection tools have issues. One of the biggest flaws is the risk of false positives (Fisk 2025). AI detection tools are quicker to accuse authors who have a distinctive writing style and non-native English speakers (Giray 2024), which can then mean uncomfortable confrontations or unfair penalties. Perhaps a combination of automated tools and faculty inspection of writing improves detection (Sağlam et al. 2024), but sophisticated prompting and word substitutions can make it very difficult to tell whether AI support was used (Peng et al. 2024). In any case, it’s important to consider what an instructor might do with that knowledge.

Mixed Reviews

If AI is of limited value in detecting AI, does it have value on its own in assessing or providing feedback on student writing? Research suggests maybe. Impey and colleagues (2024) found that GPT-4 was better than peer assessment and about as good as instructor assessment at grading short writing assignments about astronomy, whereas Kooli and Yusuf (2024) found a less complete overlap: results similar to instructor grading on average, but with differences on specific aspects of the papers. Wan and Chen (2024) found that students were pleased with the quality of feedback from even earlier free ChatGPT models. Mendonça, Quintal, and Mendonça (2025) tested multiple types of large language models, finding that premium systems provided assessments more in line with instructor assessment than open-source systems did, which suggests that the type of system you use could make a difference in how well it works for this task.

Just because you can do something doesn’t mean it’s necessarily a good idea. As the student complaint at Northeastern indicates, learners also have strong opinions about the appropriate use of AI by instructors.
In Wan and Chen’s work (2024), student researchers rated AI-generated feedback as more useful than instructor-generated feedback on writing, but the raters had difficulty determining which was which. Roe, Perkins, and Ruelle (2024) found that students generally did not support AI-generated feedback, although a combination of AI and instructor feedback was better tolerated. In the session I attended at my own university, faculty descriptions of how to use AI in assessing student work were met with murmurs about whether doing so was pushing oneself out of a job.

Much of the work to date on students’ attitudes toward AI adoption suggests that learners are concerned about exactly how their work might be assessed. Although a lot of early work has focused on student creation of work and a need for transparency about what is acceptable, we may need to have these transparent conversations about AI in assessment as well.

References

Fisk, Gary D. 2025. “AI or Human? Finding and Responding to Artificial Intelligence in Student Work.” Teaching of Psychology 52 (3): 314–18. https://doi.org/10.1177/00986283241251855.

Giray, Louie. 2024. “The Problem with False Positives: AI Detection Unfairly Accuses Scholars of AI Plagiarism.” The Serials Librarian 85 (5–6): 181–89. https://doi.org/10.1080/0361526X.2024.2433256.

Hill, Kashmir. 2025. “The Professors Are Using ChatGPT, and Some Students Aren’t Happy About It.” New York Times, May 14, 2025. https://www.nytimes.com/2025/05/14/technology/chatgpt-college-professors.html.

Impey, Chris, Matthew Wenger, Nikhil Garuda, Shahriar Golchin, and Sarah Stamer. 2024. “Using Large Language Models for Automated Grading of Student Writing about Science.” International Journal of Artificial Intelligence in Education. https://doi.org/10.1007/s40593-024-00453-7.

Kooli, Chokri, and Nadia Yusuf. 2024. “Transforming Educational Assessment: Insights Into the Use of ChatGPT and Large Language Models in Grading.” International Journal of Human–Computer Interaction 41 (5): 3388–99. https://doi.org/10.1080/10447318.2024.2338330.

Mendonça, Pedro C., Filipe Quintal, and Fábio Mendonça. 2025. “Evaluating LLMs for Automated Scoring in Formative Assessments.” Applied Sciences 15 (5): 2787. https://doi.org/10.3390/app15052787.

Peng, Xinlin, Ying Zhou, Ben He, Le Sun, and Yingfei Sun. 2024. “Hidding the Ghostwriters: An Adversarial Evaluation of AI-Generated Student Essay Detection.” arXiv. https://doi.org/10.48550/arXiv.2402.00412.

Roe, Jasper, Mike Perkins, and Daniel Ruelle. 2024. “Understanding Student and Academic Staff Perceptions of AI Use in Assessment and Feedback.” arXiv. https://doi.org/10.48550/arXiv.2406.15808.

Sağlam, Timur, Sebastian Hahner, Larissa Schmid, and Erik Burger. 2024. “Automated Detection of AI-Obfuscated Plagiarism in Modeling Assignments.” In 2024 IEEE/ACM 46th International Conference on Software Engineering: Software Engineering Education and Training (ICSE-SEET), 297–308. https://doi.org/10.1145/3639474.3640084.

Wan, Tong, and Zhongzhou Chen. 2024. “Exploring Generative AI Assisted Feedback Writing for Students’ Written Responses to a Physics Conceptual Question with Prompt Engineering and Few-Shot Learning.” Physical Review Physics Education Research 20 (1). https://doi.org/10.1103/PhysRevPhysEducRes.20.010152.

About the Author

Amanda Sturgill, associate professor of journalism, is a 2024–2026 CEL Scholar. Her work focuses on the intersection of artificial intelligence (AI) and engaged learning in higher education. Dr. Sturgill also previously contributed posts on global learning as a seminar leader for the 2015–2017 research seminar on Integrating Global Learning with the University Experience.
How to Cite This Post

Sturgill, Amanda. 2025. “Should AI be Involved in Assessing Student Work?” Center for Engaged Learning (blog). June 24, 2025. https://www.centerforengagedlearning.org/should-ai-be-involved-in-assessing-student-work.