Posted on 2025-02-11

That Study About AI is Making People Dumber

The problem with reading the study that everyone is talking about is realizing that everybody else is an idiot.
—Sun Tzu, probably

Everybody[1] has been talking about this Microsoft study titled "The Impact of Generative AI on Critical Thinking: Self-Reported Reductions in Cognitive Effort and Confidence Effects From a Survey of Knowledge Workers", or "AI makes you dumb" for short. At least, that's what people are taking away from it.

So does it? Does using Generative AI tools make the user dumber? Maybe. Or maybe not. The study doesn't say. So what does it say? And is that enough for you and me to point and laugh at all the users of GenAI[2]?

Let's take a look.

The Study

What did they actually do in this study? Well, they asked 319 knowledge workers[3] about their use of GenAI on the job. In total, these people described 936 cases, ranging from generating an image for a presentation, through summarizing texts, to generating "recommendations for new resources and strategies to explore to hone my trading skills."

The participants self-reported data about themselves and their use of GenAI tools. The questions are provided in the appendix of the survey, so I grabbed those that were responsible for the most interesting findings:

  1. To what extent do you agree with the following statements, regarding your daily work? (1: Strongly disagree; 5: Strongly agree)
    1. I sometimes question the way others (e.g., your colleagues) do something and try to think of a better way.
    2. I like to think over what I have been doing and consider alternative ways of doing it.
  2. [Rate 1 (Strongly disagree) to 5 (Strongly agree)]
    1. Generally, I trust GenAI.
    2. GenAI helps me solve many problems.
  3. Have you ever done any reflective/critical thinking (e.g., reflect on your use and the outputs you got from LLM tools) when doing this task with GenAI tool?
  4. How confident are you in your ability to do this task without GenAI?
  5. How confident are you in the ability of GenAI to do this task?

Questions (1) and (2) are about the user in general, while (3) to (5) are about a specific task. The answers to these questions get turned into cold hard numbers:

  1. the user's general tendency to reflect on their work
  2. the user's general trust in GenAI
  3. the user's perceived enaction of critical thinking
  4. the user's confidence in themselves for the specific task
  5. the user's confidence in GenAI for the specific task

In the study, they mapped the (3) perceived enaction of critical thinking against the others and looked for correlations.
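To make the kind of analysis concrete, here is a minimal, purely hypothetical sketch of correlating two of those per-task variables. The ratings below are invented for illustration; they are not the study's data, and the paper's actual analysis is more involved than this single coefficient.

```python
# Hypothetical illustration, NOT the study's data or method: compute the
# Pearson correlation between two survey variables on 1-5 scales, e.g.
# confidence in the AI for a task vs. perceived critical thinking.
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up ratings for six tasks: higher confidence in the AI tends to
# go with lower self-reported critical thinking.
confidence_in_ai = [5, 4, 5, 2, 1, 2]
critical_thinking = [1, 2, 2, 4, 5, 4]

r = pearson(confidence_in_ai, critical_thinking)
print(round(r, 2))  # a value near -1: a strong negative correlation
```

A negative coefficient here is exactly the shape of the headline finding discussed below: it says the two ratings move in opposite directions across tasks, and nothing about what causes what.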

The Findings

So what did they find? First, users who scored high on a tendency to reflect also did a lot of perceived critical thinking. Okay. The thinkers be thinking. I guess we wouldn't really expect anything else, but it's good to see that people don't lose their tendency to think (or not think) just because there's a chatbot present.

Next, users' general trust in GenAI does not correlate with perceived critical thinking at all. That's actually kind of interesting. Whether you're an AI super fan or a skeptic[4] doesn't really play a role in how much critical thinking you do. But only kind of. That's because of the next correlation.

A user's confidence in the AI for the specific task is negatively correlated with the user's perceived enaction of critical thinking. This means that in cases where the user was confident in the AI, they did little critical thinking themselves, while in cases where they were not confident in the AI, they did more thinking.

This is the finding that everybody has been misinterpreting or misrepresenting, claiming that trusting AI or merely using it causes you to stop thinking. All that this is actually saying is that when a person approaches the AI with some task and they are confident that the AI can do the task well, then they will not double-check what the AI has done. It starts and ends with the specific task. The only variable that could be interpreted as saying anything about a user's intelligence beyond the specific AI task is the tendency to reflect. And we already saw that "the thinkers be thinking." AI didn't stop these people from thinking.

Let's do some thinking of our own. The study asked people to rate how confident they were in AI for a specific task. In other words: How well do you think the AI will do the job? It then asked the same people for the same task to rate their own perceived enaction of critical thinking, or did you independently verify what the AI did? And—lo and behold—when people thought the AI would do a good job, they did not expend extra effort in checking the result[5].

Let me spell it out with some other examples. Let's say a person needs to know the birthday of Ada Lovelace. That person is confident that Wikipedia will provide reliable information on the topic. They check Wikipedia and then they do nothing more. They do not consult another page or book.

Or let's take a parent who's asking their child whether they have cleaned up all the toys. The child says yes, but the parent is not confident in the child, so they go and check for themselves.

I hope that spelling it out like this reveals the banality of this finding: People will check what they aren't confident in and won't check what they are confident in[6].

Whew. Okay, let's quickly close this out with the last correlation the study found. People's confidence in themselves for the task is correlated with perceived enaction of critical thinking. Put differently: A person who believes they themselves could do a bang-up job at a task will think about it when they get some AI output. On the flip side, a person who does not think that they could perform the task well won't even try to check the AI's work. This is common sense. It's not newsworthy.

To wrap this section up, let's recap the findings:

  1. Thinkers be thinking. People who think about everything also think about GenAI output. Those who don't, don't.
  2. General trust in AI doesn't matter. Whether people generally have high or low trust in AI doesn't tell us anything about how much critical thinking they'll do on a specific task. I'll come back to that in a hot sec.
  3. People will double-check what they aren't confident in. They won't double-check what they are confident in.
  4. When people don't think they can check something they won't. And vice versa.

A Small Detour

I've said most of what I wanted to say. But there's one thing I've been ignoring. And that's to do with how the 319 participants and their 936 cases were selected. Participation hinged upon people "using GenAI tools at work at least once per week." That means we don't have hardcore skeptics, or even softcore skeptics.

And these people are asked about their own uses of GenAI. We can imagine a person with generally low trust in AI, because they are concerned about factual accuracy. But this person might still use it for generating an image for a presentation.

Here it becomes clear why general trust in AI did not show strong correlation either way, but confidence in the AI for the specific task did. If all users had been asked to complete the same task using AI then general trust in AI would probably have shown stronger correlation[7].

Wrapping Up

So why am I even writing this? It seems the paper was just a long way to say some really banal things.

Well, as I briefly touched on already, people have been misrepresenting or misinterpreting this study. Or they simply projected their own ideas onto it to dunk on GenAI users whom they perceive to be intellectually lesser. And I'm not gonna lie, I wanted that too. I've been fed up with people showing me the "art" they "made," or telling me how easy it is to "code" an "app," or telling me to "look something up on chatGPT." I'd love to be able to point at a paper and say "there it is: You're dumb and I'm not!"

But this is not that paper.

It does not measure how much thinking people do or do not do.
It does not measure short-, medium-, or long-term effects of using GenAI on critical thinking.
It does not measure short-, medium-, or long-term effects of anything—full stop.
It does not show that thinking less about a task is causing higher confidence in AI[8].
It does not show that AI is making people dumber.

They can do that all by themselves.

Have you ever done any reflective/critical thinking (e.g., reflect on the headlines you got from social media) when encountering this paper?
—Adapted from Appendix A.1: Survey Questions


  1. Pivot to AI and 404media, that's two places!

  2. I'm using the term "GenAI," because that's the term they use in their study. It refers to text and image generation tools, such as chatGPT and DALL-E. This is not about protein folding or whatever.

  3. Knowledge worker here refers to a variety of jobs such as nurses, teachers, coders, or sales people. People that could conceivably use GenAI for work-related tasks.

  4. As much a skeptic as you can be and still participate. We'll get back to that.

  5. "Perceived enaction of critical thinking" does not only refer to "checking the results," as I call it here. It could also be prompt engineering or editing and adapting AI output.

  6. If you really need it, let's do another example that's not about "checking the result." A person needs to be driven to the hospital. They enter a cab and are confident in the driver. They only provide the address and do not "enact more critical thinking" by describing the complete route.

  7. Truth be told, there was some correlation with other variables I haven't talked about here, but I'm not gonna get into that.

  8. I've seen people say this. It does not make any sense to me. If anything, I'd see the causation going the other way: High confidence causing less thinking. The study itself only shows correlation.

Thoughts? Reach out via Mastodon @Optional@dice.camp or shoot me an email.