In my previous blog post, I introduced a framework developed to assist faculty with incorporating generative AI, such as ChatGPT, into their assessments. I also shared an artificial intelligence-supported assessment (AI-SA) and some of the thinking underlying its structure and the related expectations of students in a Calculus I course. In that post, readers were encouraged to rank the AI-SA with the framework and to consider its usefulness for their own instructional planning. Since then, I have graded students’ submissions on that AI-SA and considered the degree to which my original framework rankings aligned with students’ work and the feedback they provided. In this pilot study, students’ feedback included responses to a brief class survey about their perceptions of learning with the AI-SA and an audio-recorded and transcribed focus group interview with a subset of seven students.

When writing my first AI-SA, Using ChatGPT to Explore Calculus Applications, I used the framework’s ten guiding questions to rank the quality of the assessment. Recall the definition of an AI-SA: an assessment that includes student use of artificial intelligence technology, such as a chatbot, and provides the teacher with information about student progress toward achieving learning goals. AI-SAs can take a variety of forms and serve a variety of purposes, such as homework, written reports, and presentations, and they can be formative or summative. My AI-SA was a writing assignment that included AI-active and AI-inactive components to help me assess student progress toward the calculus learning goal of establishing the connection between rates of change and accumulation.
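For readers outside mathematics, the connection between rates of change and accumulation named in this learning goal is the one made precise by the Fundamental Theorem of Calculus. The display below is only a reminder of the standard statement, not part of the assignment itself:

```latex
% Standard statement of the Fundamental Theorem of Calculus (illustrative only):
% accumulating a rate of change recovers total change, and differentiating an
% accumulation function recovers the original rate.
\[
  \int_a^b f'(x)\,dx = f(b) - f(a),
  \qquad
  \frac{d}{dx}\int_a^x f(t)\,dt = f(x).
\]
```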

Artificial Intelligence-Supported Assessment Framework & Rankings

Guiding Questions (response scale: 1 = very low; 2 = low; 3 = middle; 4 = high; 5 = very high)

1) To what degree does the artificial intelligence-supported assessment (AI-SA) align with student learning objectives? Ranking: 5
2) To what degree does the AI-SA give every student equitable access and opportunity to actively engage in learning? Ranking: 4
3) To what degree does the AI-SA align with students’ current levels of AI literacy: the ability to understand, use, monitor, and critically reflect on AI applications (e.g., Long, Blunt, and Magerko 2021)? Ranking: 3
4) To what degree does the AI-SA encourage students to achieve the teacher’s purpose and/or learning goals in efficient and powerful ways, which may not be feasible without the use of AI? Ranking: 4
5) To what degree does the AI-SA appropriately assign assessment components the designation of AI-active or AI-inactive? Ranking: 4
6) To what degree does the AI-SA encourage students to evaluate the accuracy and usability of AI output? Ranking: 3
7) To what degree does the AI-SA promote critical thinking about the benefits and drawbacks of using AI? Ranking: 3
8) To what degree does the AI-SA prepare students for using AI outside of academia? Ranking: 4
9) To what degree does the AI-SA align with the institution’s honor code? Ranking: 5
10) To what degree does the AI-SA align with the institution’s position on AI? Ranking: 4

The rankings above are those I assigned before my students completed the AI-SA. They average 3.9, and I hoped that student work and feedback would support this indication of above-average quality.

Students completed a post-assessment survey during the class session in which the AI-SA was due. The results revealed students’ perceptions of the quality of this AI-SA, and I summarize responses to some of the survey questions here.

[Figure: Bar graph of students’ agreement with the statement “This project was more enjoyable than traditional math assignments,” on a scale from one (very low) to five (very high). Responses: 1, 0%; 2, 10%; 3, 10%; 4, 50%; 5, 30%.]

A follow-up survey prompt asked students to briefly explain their answer choice. One student reported, “I didn’t feel the pressure to get everything right, and I was able to learn through the process of completing this assignment. It helped having ChatGPT back up my reasoning, and it gave me a base line for how to do the problem.” Another student shared, “I believe the project made us learn and understand the material, especially within our own specific majors/passions.” Not all students found the project enjoyable; one responded, “I would have preferred more practice problems to this assignment.” While results on these two survey questions are not conclusive, they give some evidence that my rankings on framework item 2 (equitable access – ranking of 4) and item 3 (AI literacy – ranking of 3) align with students’ perspectives. My ranking of 3 on item 3 may actually have been too low, as students seemed to have higher levels of AI literacy than I originally suspected.

[Figure: Bar graph of students’ agreement with the statement “ChatGPT helped me understand the mathematics involved in this writing project,” on a scale from one (very low) to five (very high). Responses: 1, 0%; 2, 0%; 3, 10%; 4, 52.6%; 5, 36.8%.]

In response to the follow-up question, which asked students to explain their answer choice, one student provided, “I believe ChatGPT did help me understand this math topic we are learning better because it created examples using a topic of my choice and a step by step on how to solve.” Another student explained, “I felt as if the program did a decent job in helping me understand, it gave more examples than what I would have been able to come up with, but it did feel as if it was contradicting itself at times.” Results on these two survey questions give some evidence of the appropriateness of my framework rankings on item 4 (teacher’s purpose – ranking of 4); item 6 (accuracy and usability – ranking of 3); and item 7 (critical thinking – ranking of 3).

[Figure: Bar graph of students’ agreement with the statement “ChatGPT helped me complete the writing portion of this writing project,” on a scale from one (very low) to five (very high). Responses: 1, 10%; 2, 5%; 3, 30%; 4, 45%; 5, 10%.]

In response to the follow-up question, one student provided, “ChatGPT definitely contributed, but the inactive part of the assignment was just using the information it generated and being able to understand it and formulate it into paragraph form.” Another explained, “If I didn’t ask ChatGPT the questions I did during the active section, then I would not have been able to give as good of an explanation as I feel I did with it.” Yet another shared, “[It] gave me more ideas to think of and write on with accordance to my topic of choice.” Student responses on these two survey questions lend some evidence of the appropriateness of my rankings for framework item 5 (AI-active or AI-inactive – ranking of 4); item 9 (honor code – ranking of 5); and item 10 (position statement on AI – ranking of 4). Students seemed to adhere to the stipulations of the AI-active and AI-inactive components of this AI-SA.

[Figure: Bar graph of students’ agreement with the statement “This project helped me develop a deeper understanding of the course material,” on a scale from one (very low) to five (very high). Responses: 1, 0%; 2, 10%; 3, 15%; 4, 55%; 5, 20%.]

Student responses to the follow-up question included, “Learning the way that the AI goes about finding the answer and comparing the similarities and differences to the way that we were taught in class is fascinating, as sometimes even the AI does make mistakes, and it can be up to your own discretion to try and catch the mistakes.” Another explained, “It didn’t help, but it certainly didn’t hurt.” A third student provided, “This project actually did [help me deeply understand course material] because I was actually able to better understand the connection between rates of change and accumulation.” Responses to these two survey questions give some support to my original framework rankings on item 1 (learning objectives – ranking of 5) and item 8 (AI outside of academia – ranking of 4). In general, students gave evidence of achieving the learning goal of establishing the connection between rates of change and accumulation along with considering how to appropriately use AI outside of academia (e.g., critical assessment of benefits and drawbacks).

The case I have made for the match between my AI-SA framework rankings and the student survey response data is limited for a number of reasons. The small sample size in this pilot study, drawn from one class, may not be representative of other Calculus I classes. The nature of mathematics as a discipline may also influence how framework item rankings do (or do not) match what students report. However, the preliminary data on using this framework and the related AI-SA show some promise for their effectiveness in encouraging faculty and students to use generative AI in productive and educative ways.

I encourage you to use the AI-SA framework to write a new assessment or adapt an existing one for your students. Recall that one benefit of generative AI is its ability to efficiently generate multiple examples of concepts and applications. Try it out and email me at atrocki@elon.edu with your feedback. In my next blog post, I will further unpack student submissions on this AI-SA, provide a summary list of the areas of interest students chose for this assessment, and share insights gained from the focus group interview.

References

Long, Duri, Takeria Blunt, and Brian Magerko. 2021. “Co-designing AI Literacy Exhibits for Informal Learning Spaces.” Proceedings of the ACM on Human-Computer Interaction 5, no. CSCW2: 1-35. https://doi.org/10.1145/3476034.

Aaron Trocki is an Associate Professor of Mathematics at Elon University. He is the CEL Scholar for 2023-2024 and is focusing on models of assessment and feedback outside of traditional grading assumptions and approaches.

How to Cite this Post

Trocki, Aaron. 2024. “Utilizing a Framework for Artificial Intelligence-Supported Assessments: Part 1.” Center for Engaged Learning (blog), Elon University. February 20, 2024. https://www.centerforengagedlearning.org/utilizing-a-framework-for-artificial-intelligence-supported-assessments-part-1.