The Safety Gap: Why Helpful AI Might Be Hindering Vocational Learning

If you spend any time reading the relentless stream of announcements from major technology companies, you could be forgiven for thinking that the question of Artificial Intelligence in education has already been solved. The narrative is invariably one of frictionless assistance: AI tutors that never sleep, instantly clarifying complex concepts and guiding students effortlessly toward mastery. However, a recent, sobering review of the actual evidence base suggests a much more complicated reality, one that should prompt serious reflection among teachers and trainers across European education, including in vocational education and training (VET).
The Stanford SCALE Initiative recently undertook a sweeping review of the research landscape concerning AI in K-12 education [1]. They analysed over 800 academic papers published on the topic. Their findings are striking: out of that repository, they identified only 20 high-quality causal studies that examine how AI tools actually affect students or educators. Even more tellingly, none of these high-quality causal studies were conducted in US K-12 classrooms. We are currently engaged in a massive, uncontrolled experiment in educational technology, and the evidence base underpinning it is remarkably thin.
What little robust evidence we do have points to a profound pedagogical tension. The Stanford report notes that while student performance often improves when they have access to AI tools during a task - such as writing code or practicing maths - the results are decidedly mixed once the AI support is removed. In some cases, performance actually declines. This raises a fundamental question for VET practitioners: are these tools helping students develop durable, transferable skills, or are they simply helping them complete tasks?
This tension is at the heart of what recent educational research terms the "Safety Gap" [2]. The Safety Gap describes the widening chasm between the surface-level competence a student displays when supported by a helpful AI tool and their actual, internal capability to verify that output. In vocational education, where the ultimate goal is autonomous professional competence in the workplace, this gap is not just an academic concern; it is a critical vulnerability. If an apprentice electrician or trainee nurse relies on an AI "Oracle" to generate immediate answers without engaging in the cognitive friction required to build their own mental models, they risk an epistemic collapse when faced with a novel problem on the job.
The pedagogical problem lies in the design of the tools themselves. Most commercial generative AI models are engineered to be maximally helpful, which in practice means minimizing user effort and delivering answers as quickly as possible. Yet decades of learning science, from Cognitive Load Theory to the concept of "desirable difficulties", tell us that learning requires productive struggle. When an AI tool immediately provides the correct diagnostic procedure or the flawless block of code, it bypasses the very cognitive processes necessary for schema construction. It reduces the germane cognitive load - the productive struggle essential for deep learning - to zero.
For European VET systems, which pride themselves on the integration of theoretical knowledge and practical application, the lessons from this emerging evidence base are clear. We cannot simply drop generic, Oracle-style AI chatbots into training environments and expect them to act as effective tutors. Instead, we need pedagogically aligned AI architectures. As researchers suggest, this means deploying "Socratic" or even "Adversarial" AI agents [2]. These are tools deliberately designed to withhold direct solutions, introduce constructive cognitive friction, and force the learner into a defensive posture where they must critique the AI's output rather than passively consume it.
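To make this concrete, the sketch below shows one simple way such a constraint could be imposed on an off-the-shelf model: a system prompt that forbids direct answers and steers the model toward guiding questions. It is a minimal illustration only, not the architecture evaluated in [2]; the prompt wording, the socratic_reply helper, and the model name are all assumptions made for the example, and the OpenAI Python client is used purely as a familiar stand-in for whichever model a provider might deploy.

```python
from openai import OpenAI

# A "Socratic" constraint imposed at the system-prompt level.
# Illustrative only; this wording is an assumption, not the design from [2].
SOCRATIC_SYSTEM_PROMPT = """You are a tutor for vocational trainees.
Never give the final answer, a finished procedure, or working code.
Instead: ask one guiding question at a time, point the learner to what
they should check or look up, and ask them to justify each step they
propose. If their reasoning contains an error, do not correct it
directly; ask a question that exposes the contradiction."""

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set


def socratic_reply(history: list[dict], learner_message: str) -> str:
    """Return a tutoring response that withholds the direct solution.

    history: prior turns as [{"role": "user"/"assistant", "content": ...}]
    """
    messages = (
        [{"role": "system", "content": SOCRATIC_SYSTEM_PROMPT}]
        + history
        + [{"role": "user", "content": learner_message}]
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; any capable chat model would do
        messages=messages,
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    print(socratic_reply([], "What size breaker do I need for this circuit?"))
```

Even a thin constraint like this shifts the interaction from answer delivery toward guided questioning; the "Adversarial" variant described above would go further, presenting output the learner must actively critique.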
Furthermore, the Stanford review highlights that the most promising current applications of AI in education might not be student-facing at all. Early causal studies suggest that educator-facing AI tools can meaningfully reduce time spent on lesson preparation while maintaining instructional quality [1]. For VET trainers, who often juggle complex practical demonstrations with theoretical instruction, using AI to streamline administrative and preparatory burdens may be the most evidence-based approach available today.
Ultimately, the integration of AI into vocational education must be driven by pedagogy, not product releases. We must resist frictionless task completion and actively design for the productive struggle that transforms novices into competent professionals.
References
[1] Stanford SCALE Initiative, Understanding the Evidence Base on AI in K-12 Education: A 2026 Review, March 2026. https://scale.stanford.edu/research-in-action/understanding-evidence-base-ai-k12-education
[2] Wang, H., & Shan, W., The safety gap: restoring productive struggle through pedagogically aligned generative AI, Frontiers in Education, April 2026. https://doi.org/10.3389/feduc.2026.1757622
About the Image
"Seeing More — Seeing Less" features the Atomium, Brussels' iconic monument built for the 1958 World Expo, when science promised a radiant future. The giant atom also evokes neural networks. Behind the seductive humanoid appearance of these "neural networks" lie vast databases that are, ultimately, just numbers. The title reflects a double movement: AI systems processing ever more data while narrowing what we actually perceive through their tendency to normalize thought. The image also visualises ideas expressed in Ananny and Crawford's paper, 'Seeing without Knowing' (2016) where they argue that being able to see a system is sometimes equated with being able to know how it works and govern it. Simply being transparent about code or datasets does not ensure accountability, enforce corrective actions, or enable us to understand real-world implications of technologies accurately. / Paper collage digitally recomposed. Created from Brussels archival and heritage materials during a workshop organized by FARI – AI for the Common Good Institute Brussels.
