April 6th, 2023

As higher education begins to understand the implications of generative AI, I find myself fascinated by two hypothetical scenarios and my contrasting reactions to them.

Two hypothetical cases:

Dr. Algon’s Feedback

Dr. Carla Algon asks her students to submit rough drafts of a 1-page essay so she can give them some formative feedback. She has given feedback on this kind of writing hundreds of times, and she keeps a document of “stock language”: comments and suggestions she has reused over the years. Trying to give useful feedback as efficiently as she can, Dr. Algon copies and pastes the relevant comments, tweaks those stock responses to fit the specifics of the draft, and sends the feedback to the students.

This time, as a starting point, Dr. Algon asks ChatGPT to give “feedback on the writing, style and content” of the essay. The generative AI tool gives some not-very-useful spelling and grammar advice, but it also offers some useful suggestions to the student about including more analysis and making the personal connections stronger. Parts of the feedback are well written and clearly explained. Dr. Algon copies two paragraphs of ChatGPT’s feedback, tweaks a few words, and sends it.

Ramon’s Essay

Now, a different situation: a student named Ramon has had a difficult few weeks, with a small family emergency, a car that wouldn’t start, and unexpected shifts in the schedule of his full-time job. Late on Friday night, Ramon realizes that an assignment is due in his Principles of Management course that very night. He is supposed to write two pages about Fayol’s “General and Industrial Administration.” As a starting point, he submits the essay prompt to ChatGPT. He tweaks the results a few times, asking it to sound more human, and gives the generative AI a few tidbits from his real life to incorporate. He copies the results into his own file, changes a handful of sentences, and submits it.

Why do these feel different?

In both of these situations, one person is sending words to another. In both, the recipient is led to believe that the words were generated by the sender. But I expect that you feel differently about these two cases, and I think it is interesting to dive into the reasons for that.

Does it matter if we are transparent about using ChatGPT?

Imagine that at the top of each piece of writing, each author had included “The following was first written by ChatGPT and then collaboratively edited and changed by myself and ChatGPT.” Would the recipient react the same way in each case? I suspect both recipients might start with an intuitive resistance, thinking “hey, that’s cheating.” However, once each of them reads the content, I think the reactions will diverge. If the student gets good, useful feedback, they may not care as much that generative AI was used in its creation. Ramon’s professor, however, will continue to care deeply about how, exactly, ChatGPT was involved.

Does it matter that one is for a grade?

Ramon’s essay is going to be graded, but Dr. Algon’s feedback is not. Does that matter? Maybe a little. Ramon is expected to adhere to a student code of conduct, which may include language that deems his use of ChatGPT a breach of academic integrity. There is no similar policy dictating that Dr. Algon’s feedback must be her original work. But then again, very few faculty would say that the standards of academic integrity are somehow one-sided. We would never say that while students must avoid plagiarism, faculty are free to claim others’ work as their own.

Is it the words that matter, or are they a proxy?

When Dr. Algon’s student reads the essay feedback, the content of the feedback is what matters. For both the author, Dr. Algon, and her student, what matters is that the feedback is appropriate, useful, actionable, etc. The feedback itself is the only goal here. The student doesn’t need (or want) to know what was going on in Dr. Algon’s mind when writing the feedback; they just want good feedback! If there is a tool that makes this more efficient and effective, whether that is stock text to be copied and pasted or ChatGPT, why would the student care if Dr. Algon uses it?

On the other hand, Ramon’s Principles of Management professor does not actually care about the words Ramon writes (aside from mild curiosity about the personalized parts). The professor knows the topic of the essay very well and isn’t seeking new insights from their students. Instead, the content of the writing is a proxy. The professor, like so many of their colleagues, assigns an essay like this believing that if a student can write a two-page analysis of Fayol’s “General and Industrial Administration,” it implies the student has actually learned useful management concepts. That learning is what the professor actually cares about. The content of the writing is treated as indirect evidence of student understanding, or even skill. This is one reason plagiarism is considered academic misconduct. If Ramon submits someone else’s writing, then the writing is no longer a proxy for Ramon’s understanding. The premise of the essay is broken, making the assessment invalid.

Similar Situations

This divide between writing in which the content is the purpose and writing in which the content is a proxy certainly shows up elsewhere. If you read instructions on how to submit your taxes, the writing is the point. You want clear, accurate, easy-to-follow instructions, and you don’t care at all who wrote them, or how. On the other hand, if you are writing a letter of apology to a loved one, the writing is a proxy for your state of mind and intentions.

[The day after I noted that a letter of apology is a good example here, news broke about a university using ChatGPT to write a message about a recent shooting at Michigan State University.]

Perhaps writing as a proxy is dead?

What does assessment in education look like if we assume that “writing as proxy” is dead to us? In other words, let us state up front that instructors cannot know whether the words they read were written by the student who submitted them. This would instantly call into question (or outright invalidate) an enormous number of assessments in a wide array of courses at every university.

I think that is close to the truth of this moment, and I hope we will grapple with the implications. More specifically, I hope we will take this crisis in assessment as an opportunity to change teaching and learning for the better, finally putting into practice the irrefutable evidence that has accumulated in 50+ years of educational research.