Introduction
In January 2025, a researcher named Michael Gerlich published a paper that ricocheted across the internet within days. The finding was blunt: among 666 participants in the United Kingdom, frequent use of AI tools correlated with significantly lower critical thinking scores [1]. The correlation coefficient was -0.68. In behavioral science, that number is unusually large. It suggested that the more people relied on AI for answers, the less capable they became at evaluating those answers independently.
Within weeks, headlines simplified the finding into alarm. "AI is eroding our brains." "ChatGPT is killing critical thinking." But the actual science told a more complicated and more interesting story. Three other major studies followed in rapid succession. Microsoft and Carnegie Mellon surveyed 319 knowledge workers and found that confidence in AI predicted less critical thinking, while self-confidence predicted more [2]. MIT Media Lab strapped EEG electrodes to 54 participants and watched their brain activity drop when they wrote essays with ChatGPT [3]. And RAND Corporation tracked 1,214 American students through 2025, finding that AI homework use jumped from 48% to 62% in seven months, while 67% of students themselves said they believed AI was harming their critical thinking [4].
These four studies, taken together, paint a picture that is neither simple doom nor simple progress. The question is not whether AI hurts thinking. The question is how, when, and why. And to answer that, we need to go deeper, into the brain itself.

The Paper That Launched a Thousand Headlines
Michael Gerlich is a professor at SBS Swiss Business School in Zurich. His study, published in the journal Societies in January 2025, used a mixed-methods design: a 23-item survey distributed to 666 participants across three age groups (17-25, 26-45, and 46+), combined with 50 in-depth qualitative interviews [1].
The quantitative backbone was the Halpern Critical Thinking Assessment, a validated instrument that measures five dimensions of critical thinking: verbal reasoning, argument analysis, hypothesis testing, likelihood and uncertainty, and decision-making [5]. Participants also reported how frequently they used AI tools like ChatGPT, Google Gemini, and Microsoft Copilot.
Three numbers defined the paper. First: r = -0.68 between AI tool usage and critical thinking scores. Second: r = +0.72 between AI tool usage and cognitive offloading, the tendency to delegate mental work to an external tool. Third: r = -0.75 between cognitive offloading and critical thinking. The pattern was clear. More AI use predicted more offloading. More offloading predicted weaker thinking.
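A little arithmetic on those three coefficients shows how much of the AI-use association could run through offloading. The sketch below is not an analysis from the paper. It assumes the three correlations come from the same sample and describe linear relationships, and it applies the standard first-order partial correlation formula. The function and variable names are illustrative.

```python
import math

# Correlations as reported in the text for the Gerlich study.
r_ai_ct = -0.68   # AI tool usage vs. critical thinking
r_ai_off = 0.72   # AI tool usage vs. cognitive offloading
r_off_ct = -0.75  # cognitive offloading vs. critical thinking


def shared_variance(r: float) -> float:
    """Proportion of variance two variables share under a linear model (r squared)."""
    return r * r


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


variance_ai_ct = shared_variance(r_ai_ct)
partial = partial_correlation(r_ai_ct, r_ai_off, r_off_ct)

print(f"Shared variance, AI use vs. critical thinking: {variance_ai_ct:.4f}")
print(f"AI use vs. critical thinking, offloading held constant: {partial:.3f}")
```

Under these assumptions, AI use and critical thinking share a little under half of their variance, and the association shrinks from -0.68 to roughly -0.30 once offloading is held constant. That is consistent with offloading carrying much of the relationship, but it remains correlational arithmetic and says nothing about causation.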
But the most striking finding was the age gap. Participants aged 17 to 25 showed the highest AI dependence and the lowest critical thinking scores. Those over 46 showed the opposite pattern. Education level mattered too: higher educational attainment correlated with better critical thinking regardless of AI usage [1].
The study had real limitations. It was correlational, not causal. It was cross-sectional, meaning it captured a snapshot, not a trajectory. Its sample was entirely British. And its measures of AI usage were self-reported. A correction notice was published in September 2025 addressing errors in one of the ANOVA tables, though the primary correlations remained unchanged [6].
Still, the effect size was hard to ignore. And it aligned with a growing body of evidence from very different directions.

What Happens Inside the Brain When AI Does the Thinking
The most direct evidence about AI's effect on the brain came from an unlikely place: a basement laboratory at MIT, where neuroscientist Nataliya Kosmyna and her team fitted 54 volunteers with 32-channel EEG caps and asked them to write persuasive essays.
The participants were split into three conditions. One group used ChatGPT. Another used a search engine. A third wrote entirely from their own knowledge, with no external tools at all. After three sessions, a subset of 18 participants crossed over to a different condition, allowing the researchers to see what happened when a ChatGPT user was forced to write alone, and vice versa [3].
The EEG results were striking. The team measured dynamic Directed Transfer Function (dDTF) connectivity in the alpha band, a marker of information flow between brain regions involved in executive control, working memory, and sustained attention. The ChatGPT group showed significantly lower fronto-parietal connectivity compared to both the search engine and brain-only groups. Their brains were, quite literally, doing less coordinated cognitive work during the writing process.
But here is the finding that made the study go viral. When ChatGPT users were reassigned to write without AI assistance, they continued to show reduced prefrontal and occipito-parietal activation. Kosmyna called this "cognitive debt", a carry-over effect suggesting that even short-term AI reliance left a measurable neural trace [7]. The brain-only-first group, by contrast, retained higher activation even when they later switched to using ChatGPT.
One more detail mattered. When participants were asked to recall and quote their own essays afterward, the ChatGPT group performed worst. They had difficulty remembering what they had supposedly written. The essays had their names on them, but the thinking behind the words belonged to the machine.
A critical caveat: this study was a preprint, not yet peer-reviewed. The sample was small, especially for the crossover phase. The MIT team's official FAQ explicitly asked journalists not to use terms like "brain damage" or "brain rot." The study measured task-time activation, not structural change. Still, it offered the first neuroimaging evidence of what cognitive offloading to AI looks like in real time.

The Neural Architecture of Thinking Critically
To understand why AI might weaken critical thinking, it helps to know where critical thinking happens in the brain. Not in one place. In a network.
The dorsolateral prefrontal cortex (DLPFC) sits just behind your forehead, slightly to each side. Think of it as the brain's project manager. It holds information in working memory, sequences steps in a plan, and monitors for conflicts between competing responses. When you read a persuasive argument and notice a logical flaw, your DLPFC is active. When you resist the urge to accept a plausible-sounding answer and instead check the evidence, your DLPFC is doing the heavy lifting [8].
Neuroscientist John Kroger and colleagues showed in 2002, using fMRI, that the anterior portion of the left DLPFC is selectively recruited when reasoning problems become relationally complex, that is, when you have to juggle multiple variables and their relationships simultaneously [9]. More recently, Fresnoza and colleagues used high-definition transcranial direct current stimulation to demonstrate a dissociation: the left DLPFC handles inductive reasoning (drawing general conclusions from specific cases), while the left inferior frontal gyrus handles deductive reasoning (applying general rules to specific cases) [10].
The right DLPFC has its own specialty. A 2018 study showed that stimulating the right DLPFC with anodal tDCS improved performance on the Cognitive Reflection Test, a measure of the ability to override intuitive but wrong answers with analytical reasoning [11]. This is the brain region that helps you pause when something feels right but actually is not.
Deeper in the brain sits the anterior cingulate cortex, or ACC. Matthew Botvinick and colleagues at Princeton described the ACC as a conflict monitor: it fires when the brain detects that two response pathways are competing. See a Stroop test (the word "RED" printed in blue ink) and your ACC lights up. It signals the DLPFC to increase top-down control, to override the automatic response and select the correct one [12].
Then there is the ventromedial prefrontal cortex, the VMPFC, sitting on the underside of the frontal lobes. Vinod Goel and Raymond Dolan showed in 2003 that the VMPFC activates when belief-laden content overrides logical evaluation, that is, when people accept an argument not because it is logically valid but because its conclusion sounds believable [13]. The VMPFC is, in a sense, the brain region where bias wins.
A 2025 fNIRS study confirmed this architecture for critical thinking specifically. During belief-bias syllogistic reasoning, the right inferior frontal cortex and left DLPFC showed increased activation when participants successfully overrode belief-driven errors. When they failed to override, and belief trumped logic, the VMPFC was more active [13].
What does all this mean for AI? When you ask ChatGPT a question and accept its answer without evaluating it, you skip the DLPFC-ACC loop entirely. There is no conflict to monitor because you have not generated your own answer to compare against. There is no relational complexity to manage because the AI has already done the managing. The neural circuits that would normally fire during evaluation simply do not get recruited. And circuits that do not get recruited do not get strengthened. This is the neuroplasticity argument: use it or lose it.

Cognitive Offloading: The Invisible Bargain
The concept behind the AI-thinking problem is not new. Psychologists have a name for it that predates ChatGPT by decades: cognitive offloading.
In 2016, Evan Risko at the University of Waterloo and Sam Gilbert at University College London published a foundational review in Trends in Cognitive Sciences defining cognitive offloading as "the use of physical action to alter the information processing requirements of a task so as to reduce cognitive demand" [5]. You tilt your head to align a rotated image with your visual system's preferred orientation. You write a shopping list instead of memorizing it. You set a phone alarm instead of holding a time in working memory. Each of these acts shifts cognitive labor from brain to world.
Risko and Gilbert proposed a metacognitive model: people decide to offload based on their subjective assessment of internal task difficulty and their confidence in their own abilities. When confidence is low and the task feels hard, offloading increases. When an external tool is trusted and easy to use, offloading increases further [14].
This model maps perfectly onto AI use. ChatGPT is maximally trusted (its outputs sound confident and polished), maximally easy (type a question, get an answer), and the tasks people give it (writing, analyzing, summarizing) feel genuinely difficult. All three triggers for offloading fire simultaneously. A 2025 study by Lopez and colleagues confirmed that trust in the tool is a significant moderator of offloading behavior, even when the tool's accuracy does not justify that trust [15].
But there is a distinction that matters enormously. Not all offloading is harmful.
When you use a calculator to do long division, you free your working memory for the higher-order mathematical thinking that the division was serving. The calculator handles computation; your brain handles conceptualization. A meta-analysis by Aimee Ellington in 2003 reviewed dozens of studies and found that calculator use actually improved both operational and conceptual math skills when used as a supplement to instruction, not a replacement for it [16]. Similarly, spell-checkers free attentional resources for argumentation. These are cases of cognitive augmentation: the tool handles lower-order processing so the brain can focus on higher-order processing.
The problem arises when the offloaded process is itself the target skill. When AI does the evaluating, the analyzing, the synthesizing, and the judging, when it does the critical thinking itself, the brain is not augmented. It is bypassed. The practice deficit accumulates. And as any musician, athlete, or chess player knows, skills that are not practiced decay. The RAND Corporation's 2026 report explicitly distinguished between cognitive augmentation and cognitive offloading as the policy-relevant distinction schools need to address [4].

From Socrates to Smartphones
Fear that a new technology will destroy thinking is as old as technology itself.
Around 370 BCE, Plato wrote the Phaedrus, in which Socrates tells the myth of the Egyptian god Theuth, who presents the gift of writing to King Thamus. Thamus refuses. Writing, he says, "will create forgetfulness in the learners' souls, because they will not use their memories; they will trust to the external written characters and not remember of themselves" [17]. Socrates was making the original cognitive offloading argument, twenty-four centuries before Risko and Gilbert gave it a name.
The irony, of course, is that Plato's argument against writing was itself written down. And writing did not destroy memory. It changed memory. It freed the mind from rote storage and allowed human civilizations to accumulate knowledge across generations in ways that purely oral cultures could not.
The same cycle repeated with the printing press. In 1477, Hieronimus Squarciafico, a Venetian editor, complained that "abundance of books makes men less studious; it destroys memory and enfeebles the mind." Books did not enfeeble the mind. They redistributed cognitive labor.
In the 1970s through 1990s, calculators triggered the same anxiety. Would students lose the ability to do arithmetic? The meta-analyses tell a clear story. When calculators augmented instruction (used after students understood the underlying concepts), math performance improved. When calculators replaced instruction (used before students understood the concepts), performance did not improve, but it did not decline either [16]. The direction of the effect depended entirely on the relationship between the tool and the skill.
Search engines and Google brought the debate into the digital age. Betsy Sparrow, Jenny Liu, and Daniel Wegner at Columbia and Harvard published their landmark "Google Effects on Memory" study in Science in 2011. Four experiments demonstrated that people who expected to have later access to information encoded the information itself less well, but encoded where to find the information more effectively [18]. The internet was becoming what Wegner had previously theorized as a transactive memory partner, an external storage system that the brain treats as an extension of itself.
A 2024 meta-analysis of the Google effect literature confirmed the basic pattern: digital information access changes encoding strategy, shifting from content to source. But it found no evidence of decline in fluid intelligence or general cognitive ability [19].
The lesson from history is not that technology is harmless. The lesson is that the direction of its effect depends on one thing: whether the tool augments the target skill or replaces it. Calculators that augment conceptual math are beneficial. Spell-checkers that free attention for argument construction are beneficial. But AI that replaces the entire reasoning process? That is something genuinely new. And this is where history stops providing reassurance, because no previous technology could substitute for the act of thinking itself.

The Confidence Trap
The Microsoft Research and Carnegie Mellon University study, presented at CHI 2025, added a psychological twist that the Gerlich paper missed.
Hao-Ping Lee and colleagues surveyed 319 knowledge workers about their real-world experiences using generative AI at work. They collected 936 concrete examples of AI use and measured both the confidence participants placed in AI and the confidence they placed in themselves [2].
The result was an asymmetry. Higher confidence in GenAI predicted less critical thinking enacted: participants who trusted the AI more spent less effort evaluating its outputs, checking its reasoning, and considering alternatives. But higher self-confidence (confidence in one's own abilities) predicted more critical thinking. People who believed in their own judgment were more likely to verify, question, and push back on AI outputs [20].
This finding has a direct parallel in psychology. The Dunning-Kruger effect describes how people with low skill in a domain tend to overestimate their competence because they lack the knowledge to recognize their own gaps. Lee et al. identified an analogous AI pattern: users with low domain expertise but high trust in AI were the least likely to verify AI outputs, precisely because they lacked the knowledge base to recognize when the AI was wrong. Meanwhile, experts with high self-confidence verified more aggressively, widening the skill gap between thoughtful users and passive consumers.
The study also documented a shift in the nature of cognitive work. When AI entered the workflow, the researchers observed a transformation: from information gathering to information verification, from problem-solving to AI response integration, from doing tasks to supervising tasks. The cognitive demand did not disappear. It changed shape. But many workers had not adapted their habits to match the new shape.
The practical implication cuts both ways. For someone with a strong knowledge base and the metacognitive habit of self-checking, AI becomes a genuine thinking partner, a tool that generates first drafts of analysis that the human then sharpens, challenges, and improves. This is desirable difficulty in action: the effort of evaluation strengthens understanding. For someone without that foundation, AI becomes an intellectual crutch that feels efficient while silently eroding the very capacities it seems to serve.

A Generation That Already Knows
Perhaps the most unsettling data in the entire AI-critical-thinking literature comes not from neuroscientists or psychologists, but from the students themselves.
The RAND Corporation's American Youth Panel tracked 1,214 students aged 12 to 29 through the 2025-2026 academic year. In May 2025, 48% of students reported using AI to help with homework. By December 2025, that figure had risen to 62%. The increase was driven almost entirely by middle school and high school students; college-level use remained relatively stable [4].
But here is the paradox. At the same time that usage was surging, so was worry. In December 2025, 67% of students endorsed the statement "The more students use AI for their schoolwork, the more it will harm their critical thinking skills." That was up from 54% just seven months earlier [21]. Female students were more likely to express this concern than male students.
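Both shifts can be read two ways, as a change in percentage points or as a relative change, and the two are easy to confuse. The short sketch below computes each from the figures quoted above; the helper name is an illustrative assumption, not something taken from the RAND report.

```python
def change_summary(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (percentage-point change, relative change in percent)."""
    point_change = after_pct - before_pct
    relative_change = point_change / before_pct * 100
    return point_change, relative_change


# Figures quoted in the text: May 2025 vs. December 2025.
homework_use = change_summary(48, 62)  # share using AI for homework
harm_belief = change_summary(54, 67)   # share agreeing AI harms critical thinking

print(f"Homework use: +{homework_use[0]:.0f} points, +{homework_use[1]:.1f}% relative")
print(f"Belief in harm: +{harm_belief[0]:.0f} points, +{harm_belief[1]:.1f}% relative")
```

On these numbers, usage and worry each rose by a similar margin over the same seven months: 14 and 13 percentage points respectively.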
A separate RAND survey of adults and educators found that 61% of parents agreed AI harms critical thinking, while only 22% of school district leaders shared that concern. More than 80% of students said their teachers had never taught them how to use AI effectively [22].
Read those numbers again. Students are using AI more. Students believe AI harms their thinking. Teachers are not teaching them how to use AI well. This is not a technology problem. It is an education policy failure. Students have accurate metacognitive awareness (they can feel the dependency forming), but they lack the scaffolding to do anything about it.
In the UK, the Higher Education Policy Institute's 2026 survey of 1,054 undergraduates found that 95% were using AI in some form, and 94% had used it for assessed work [23]. Separately, a 2024 Elsevier survey of roughly 3,000 researchers and clinicians across 123 countries found that 81% worried AI would erode critical thinking in their fields [24].

Beyond the Classroom: Where Critical Thinking Meets Real Consequences
The conversation about AI and critical thinking has largely focused on education. But the stakes are highest where critical thinking failures have real-world consequences: medicine, law, finance, and civic life.
In healthcare, a phenomenon called automation bias has been documented for decades in radiology and aviation. A 2024 randomized controlled trial in computational pathology (n = 28 pathologists) measured what happens when AI gives incorrect advice. Seven percent of pathologists who had initially made correct diagnoses reversed their correct assessments after receiving erroneous AI recommendations. Time pressure made the effect worse [25]. A 2025 preprint on LLM-assisted diagnostic reasoning found that LLM hallucination rates reached 50 to 82 percent when clinical vignettes contained even a single incorrect detail [26].
In legal practice, the case of Mata v. Avianca in 2023 became a cautionary tale. Lawyers submitted a court brief containing fabricated case citations generated by ChatGPT. The cases did not exist. The lawyers had not verified them. The court sanctioned the attorneys and the incident became a textbook example of what happens when professionals offload judgment to AI without applying critical evaluation [27].
In the workplace more broadly, the pattern is accelerating. ChatGPT reached 800 million weekly active users by October 2025, up from 200 million just fourteen months earlier [28]. A PNAS study of 18,000 Danish workers found that adoption varies dramatically by profession: marketing and journalism professionals lead with roughly 64% adoption, while financial advisors and accountants lag at about 18%, partly because of data sensitivity concerns [29]. The same study found a striking gender gap: women were 16 percentage points less likely than men to use ChatGPT for work in the same occupation.
The democratic implications are harder to quantify but potentially larger. If a significant share of future voters, jurors, policymakers, and journalists have weakened habits of critical evaluation, if the DLPFC-ACC circuit that helps them detect logical errors, evaluate evidence, and resist persuasive but unsound arguments has been under-exercised for years, the downstream effects on civic reasoning could be substantial. UNESCO's 2023 guidance on generative AI in education explicitly flagged preservation of critical thinking as a condition for responsible AI integration [30].

When AI Actually Makes Thinking Better
The story so far has been mostly cautionary. But the evidence is not one-sided. Under the right conditions, AI can enhance critical thinking rather than erode it.
A systematic review by Helal and colleagues, published in Information Discovery and Delivery in 2025, synthesized 68 peer-reviewed studies on generative AI and critical thinking. Their conclusion was not that AI harms thinking, but that the effect depends entirely on how AI is integrated into the task [31]. When AI was used as a "thinking tool" (for generating counterarguments, surfacing assumptions, or providing scaffolded feedback), critical thinking scores improved. When AI was used as a "content tool" (generating finished answers or complete essays), critical thinking scores declined.
A quasi-experiment with 163 sixth-graders in China tested this distinction directly. Students were divided into three groups: a control group receiving lecture-based instruction, a group using AI as a cognitive tool (generating study materials), and a group using AI as a thinking tool (analyzing and challenging ideas). Both AI groups outperformed the control in knowledge transfer, but the thinking-tool group showed the strongest gains in analytical reasoning [32].
Advait Sarkar at Microsoft Research took the concept further. In a 2024 paper titled "When Copilot Becomes Autopilot," he proposed designing AI interfaces that act as critics and provocateurs rather than answer-generators [33]. A follow-up between-subjects study (n = 24) in early 2025 tested whether AI-generated provocations (challenges to the user's reasoning) could restore critical engagement. They could. Participants who received provocations from AI showed metacognitive effort comparable to those working entirely without AI [34].
Stanford's Hari Subramonyam developed a tool called Script & Shift that embodied this philosophy for student writing. Instead of generating text, the tool provided "buttons and interfaces that allow students to engage in idea formation", forcing students to make creative and critical decisions at each step rather than delegating them to the model [35].
The principle that emerges from this counter-evidence is simple. AI enhances thinking when it increases the cognitive demand on the user. AI erodes thinking when it decreases the cognitive demand. This is perfectly consistent with what research on active recall and desirable difficulties has shown for decades: learning happens when retrieval is effortful. Ease is the enemy of encoding.

The Metacognitive Solution
If the problem is cognitive offloading, the solution is metacognitive awareness, the ability to think about your own thinking and regulate it deliberately.
Zimmerman's three-phase model of self-regulated learning maps directly onto responsible AI use. The forethought phase is where you set goals before prompting: what am I trying to learn, decide, or produce? The performance phase is where you monitor AI outputs against those goals: does this answer actually address my question, or does it just sound like it does? The self-reflection phase is where you evaluate the quality of both the AI's contribution and your own engagement: did I learn something, or did I just consume an answer [31]?
Lee et al.'s finding from the Microsoft/CMU study translates directly into this framework. Self-confidence (the metacognitive belief that you can handle the task) predicted higher critical thinking effort. This aligns with Bandura's self-efficacy theory: mastery experiences are the primary source of self-efficacy, and AI substitution erodes mastery experiences by removing the productive struggle that builds them [2].
A 2025 paper on passive versus collaborative AI use in higher education introduced the concept of "metacognitive laziness" to describe the failure of the self-regulation loop in unstructured AI use. When students received no guidance on how to use AI, they defaulted to passive consumption. When they received explicit metacognitive scaffolding (prompts to evaluate, question, and compare AI outputs), their critical thinking scores matched or exceeded those of students working without AI [36].
What might a practical protocol look like? Based on the converging evidence from Risko and Gilbert's metacognitive model, Lee et al.'s confidence findings, and the counter-evidence from scaffolded studies, a four-step approach emerges.
Step one: Think first. Before prompting AI, generate your own answer, analysis, or outline. Even a rough draft engages the DLPFC-ACC circuit.
Step two: Prompt deliberately. Frame your prompt as a question for a thinking partner, not as a request for a finished product. Ask for counterarguments, alternative perspectives, or evidence that contradicts your position.
Step three: Verify systematically. Check AI claims against primary sources. Challenge logical steps. Look for omissions.
Step four: Reflect explicitly. After the task, ask yourself what you learned, what surprised you, and where your initial thinking was wrong.
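For readers who like to track habits concretely, the four steps can be turned into a simple self-check. The sketch below is one hypothetical way to do that; the class and field names are illustrative assumptions and do not come from any of the studies cited.

```python
from dataclasses import dataclass, fields


@dataclass
class AiSessionChecklist:
    """Hypothetical self-check for one AI-assisted task (names are illustrative)."""

    thought_first: bool = False            # Step one: drafted own answer before prompting
    prompted_deliberately: bool = False    # Step two: asked for challenges, not a finished product
    verified_systematically: bool = False  # Step three: checked claims against primary sources
    reflected_explicitly: bool = False     # Step four: noted what was learned or corrected


def skipped_steps(checklist: AiSessionChecklist) -> list[str]:
    """Return the names of the steps not completed, in protocol order."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


# Example: the user drafted first and verified, but skipped the other two steps.
session = AiSessionChecklist(thought_first=True, verified_systematically=True)
print(skipped_steps(session))
```

A log like this measures nothing about thinking quality by itself. Its only purpose is to make a skipped step visible, which is the kind of explicit scaffolding the studies above associate with better outcomes.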
This is not a theoretical framework. It is what expert users already do naturally, the same users who, in Lee et al.'s data, showed higher self-confidence and more critical thinking despite heavy AI use. The challenge is teaching this protocol to the majority who have not developed it independently.

The Road Ahead
The relationship between AI tools and critical thinking is not a simple story of machines destroying minds. It is a story about what happens when a species that evolved to think discovers tools that can think for it.
The evidence from four major 2025-2026 studies converges on three conclusions. First, unreflective AI use correlates with weaker critical thinking, and the neural evidence from MIT suggests this is not just a behavioral change but a measurable shift in brain activation patterns [3]. Second, the mechanism is cognitive offloading: specifically, the offloading of evaluation, analysis, and judgment rather than mere computation or storage [1]. Third, the effect is contingent, not inevitable. Metacognitive awareness, self-confidence, and the design of AI interfaces all moderate whether AI enhances or erodes critical thinking [2].
History offers cautious comfort. Every major cognitive technology (writing, print, calculators, search engines) triggered the same fears. And in every case, the outcome depended not on the technology itself but on how it was integrated into practice. Writing did not destroy memory. It changed what we memorize and freed the mind for higher-order work. Calculators did not destroy arithmetic skill when used properly. They augmented it.
But AI is qualitatively different from every previous tool in one respect. It is the first technology that can simulate the act of reasoning itself. Calculators did computation. Search engines did retrieval. AI does evaluation, synthesis, and judgment, the core operations of critical thinking. When those operations are offloaded, there is no higher-order function left to be freed for.
The question for the next decade is not whether AI will be used. Eight hundred million people already use ChatGPT every week [28]. The question is whether educational systems, workplace cultures, and individual habits will evolve fast enough to distinguish between augmentation and offloading, between AI as a thinking partner and AI as a thinking replacement.
The brain's capacity for critical evaluation is not fixed. It can be strengthened with practice and weakened with neglect. The prefrontal cortex is not a read-only device. It behaves more like a muscle, and like any muscle, it responds to the demands placed on it.
The choice is not between using AI and thinking critically. The choice is between using AI while thinking critically, and letting AI do the thinking for you.

Frequently Asked Questions
Does AI make you a worse critical thinker?
Not automatically. Research shows the effect depends on how AI is used. Passive acceptance of AI outputs correlates with lower critical thinking scores, while active verification and questioning of AI can maintain or even strengthen analytical skills. The key variable is whether the user engages in evaluation or skips it entirely.
What is cognitive offloading and why does it matter for AI?
Cognitive offloading means using external tools to reduce mental effort. When AI handles reasoning tasks that you would otherwise do internally, the brain regions responsible for analysis and evaluation get less practice. EEG research from MIT Media Lab found reduced connectivity during AI-assisted writing, and the concern is that, over time, this reduced engagement could weaken the skills that support independent critical thinking. The MIT study itself measured task-time activation, not structural change.
Which brain regions are involved in critical thinking?
Three main regions form the core network. The dorsolateral prefrontal cortex manages working memory and analytical reasoning. The anterior cingulate cortex detects conflicts between competing responses and triggers deeper evaluation. The ventromedial prefrontal cortex is involved when beliefs override logical analysis. Together they form the evaluation circuit that AI use can bypass.
Can AI actually improve critical thinking?
Yes, when designed correctly. Studies show that AI used as a thinking tool (generating counterarguments, challenging assumptions, or providing scaffolded prompts) can improve critical thinking scores. The key is that the AI must increase cognitive demand on the user rather than decrease it. Tools designed to provoke reflection outperform tools designed to deliver finished answers.
What can students do to protect their critical thinking while using AI?
Research suggests a four-step approach: think first by generating your own answer before prompting AI, prompt deliberately by asking for challenges rather than finished answers, verify systematically by checking AI claims against primary sources, and reflect explicitly on what you learned. Studies show that users with strong metacognitive habits maintain critical thinking skills even with heavy AI use.





