The distinction
AI Leakage vs Shadow AI
Shadow AI is the behaviour; AI leakage is the outcome. Why approved tools can leak too, and what to do about each.
Most people use “shadow AI” and “AI leakage” as if they mean the same thing. They do not, and treating them as one creates a blind spot that leaves a business exposed even after it believes it has fixed the problem. This page defines both, shows how they relate, and explains why the difference is the single most useful thing to understand about AI risk in a small business.
Defining shadow AI
Shadow AI is a governance problem. It is staff using AI tools — chatbots, assistants, code helpers, image generators — without the knowledge or approval of whoever is responsible for risk in the business. The term borrows from “shadow IT,” the older problem of employees quietly adopting unapproved software like personal Dropbox or messaging apps for work.
What defines shadow AI is visibility, or the lack of it. The question it asks is: does your business actually know which AI tools your people are using? Examples: a salesperson drafting proposals in a personal ChatGPT account, a developer using an unreviewed code assistant, an admin pasting figures into a summarising tool nobody signed off on.
Defining AI leakage
AI leakage is a data problem. It is sensitive, confidential, or proprietary information leaving your control through an AI system — stored on someone else’s servers, used to train a model, exposed in a breach, or reachable by a third party.
What defines AI leakage is control, or the loss of it. The question it asks is: once your information went into an AI tool, where did it go? Examples: a confidential contract pasted into a tool that trains on inputs, client personal information sent to an AI with no data-processing terms, source code retained on an external server.
The critical difference
Shadow AI describes how data reaches an AI system. AI leakage describes what happens to it once it gets there.
Shadow AI is the path. AI leakage is the outcome. They overlap — unapproved tool use is one of the main causes of leakage — but the relationship is not one-to-one, and that gap is exactly where the danger lives.
The blind spot: approved tools leak too
Here is the insight most businesses miss: you can eliminate shadow AI entirely and still suffer serious AI leakage. Suppose you roll out one approved AI assistant for everyone and shadow AI drops to zero. If that tool retains inputs for 30 days or trains on what staff type, or if its vendor is breached, leakage is still happening at scale, through a fully sanctioned channel.
| Scenario | Shadow AI? | AI leakage? |
|---|---|---|
| Staff use an unapproved AI tool and paste client data | Yes | Yes |
| Staff use an unapproved tool with zero retention and no training | Yes | Minimal |
| Business rolls out an approved AI tool with weak retention terms | No | Yes |
| Business rolls out an approved tool with a no-training, zero-retention agreement | No | No |
The bottom row is the goal. Most businesses only ever try to move from the top half of the table to the bottom half, by stamping out unapproved tools, without ever asking whether their approved tools actually prevent leakage.
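The table's logic can be sketched in a few lines of Python. This is an illustration only: `ToolUse`, its fields, and `classify` are hypothetical names, and the sketch assumes that an approved tool with zero retention and no training has those terms in a written agreement, as in the bottom row.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolUse:
    """One way an AI tool is used in the business (hypothetical fields)."""
    approved: bool          # signed off by whoever owns risk
    retains_inputs: bool    # vendor stores what staff type
    trains_on_inputs: bool  # vendor uses inputs to train models


def classify(use: ToolUse) -> tuple[str, str]:
    """Return (shadow_ai, ai_leakage) labels matching the table's rows."""
    # Shadow AI depends only on approval, never on the vendor's terms.
    shadow_ai = "No" if use.approved else "Yes"

    # Leakage depends on the vendor's terms, whether or not the tool is approved.
    if use.retains_inputs or use.trains_on_inputs:
        leakage = "Yes"
    elif use.approved:
        # Assumption: approval came with a no-training, zero-retention agreement.
        leakage = "No"
    else:
        # Good terms on paper, but nobody in the business has verified them.
        leakage = "Minimal"
    return shadow_ai, leakage
```

The two labels are computed independently, which is the point of the table: changing `approved` alone never turns a leaking tool into a safe one.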
Why treating them as the same is dangerous
When shadow AI and AI leakage are treated as one problem, businesses make four predictable mistakes:
- They write policies that do not address data risk. “Only use approved tools” controls shadow AI and says nothing about whether those approved tools are safe for sensitive data.
- They measure the wrong thing. Shadow AI can be found through monitoring and audits; leakage requires understanding vendor data terms and where data flows — a different exercise entirely.
- They create false confidence. “We have dealt with our shadow AI” can be completely true and still say nothing about data exposure.
- They miss vendor risk. Shadow AI is mostly about staff behaviour; leakage includes the policies and security of the approved vendors themselves, which usually get far less scrutiny.
What to actually do about each
Because they are different problems, they need different responses.
To address shadow AI: find out which AI tools your team is actually using, set a simple approval process for new tools, write a clear acceptable-use policy, and watch for unapproved tool use.
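The discovery step above comes down to comparing what is actually in use against what has been approved. A minimal sketch, assuming you already have both lists from a staff survey, expense records, or network logs; the function name and the tool names are hypothetical:

```python
def normalise(name: str) -> str:
    """Trim whitespace and lowercase so 'ChatBot ' and 'chatbot' match."""
    return name.strip().lower()


def find_shadow_ai(observed: set[str], approved: set[str]) -> list[str]:
    """Return tools seen in use that were never approved, sorted for reporting."""
    approved_names = {normalise(n) for n in approved}
    observed_names = {normalise(n) for n in observed}
    return sorted(observed_names - approved_names)


# Hypothetical example data, not real products.
approved_tools = {"Approved Assistant"}
observed_tools = {"approved assistant", "Personal Chatbot", "Unreviewed Code Helper"}

print(find_shadow_ai(observed_tools, approved_tools))
# → ['personal chatbot', 'unreviewed code helper']
```

This only answers the visibility question. It says nothing about what any of these tools, approved or not, do with the data they receive.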
To address AI leakage: check the retention and training terms of every AI tool in use — approved or not — decide which data should never go into an external AI tool, prefer tools that offer no-training and zero-retention options, and train staff to recognise sensitive data before it goes into a prompt. This second list applies to your approved tools too, not just the unapproved ones.
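The vendor-terms check above can be kept as a simple inventory. The sketch below is one possible shape, not a standard: `VendorTerms`, its fields, and `leakage_findings` are hypothetical, and the values have to come from reading each vendor's actual terms. Note that the `approved` flag is recorded but deliberately never consulted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VendorTerms:
    """What is known about one AI tool's data terms (hypothetical fields)."""
    tool: str
    approved: bool
    retention_days: Optional[int]     # None means nobody has checked yet
    trains_on_inputs: Optional[bool]  # None means nobody has checked yet


def leakage_findings(terms: VendorTerms) -> list[str]:
    """List leakage concerns for a tool, ignoring whether it is approved."""
    findings = []
    if terms.retention_days is None:
        findings.append("retention terms not checked")
    elif terms.retention_days > 0:
        findings.append(f"retains inputs for {terms.retention_days} days")
    if terms.trains_on_inputs is None:
        findings.append("training terms not checked")
    elif terms.trains_on_inputs:
        findings.append("trains on inputs")
    return findings


# Hypothetical inventory: one approved tool, one unapproved tool.
inventory = [
    VendorTerms("Approved Assistant", True, 30, False),
    VendorTerms("Personal Chatbot", False, None, None),
]
for entry in inventory:
    print(entry.tool, leakage_findings(entry))
```

In this example the approved tool still produces a finding, because it retains inputs for 30 days. An empty list is the target for every tool that handles sensitive data.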
The regulatory angle (New Zealand, Australia and beyond)
Under New Zealand’s Privacy Act 2020 — which applies to every business operating in New Zealand, with no small-business exemption — personal information fed into an AI tool outside proper terms can be a compliance problem regardless of whether the tool was approved. Australia’s Privacy Act and Australian Privacy Principles, and the EU’s GDPR, point the same way. The shadow-AI question — approved or not — is irrelevant to those obligations. What matters is what happened to the data. A business that has only addressed shadow AI, and not leakage, has a weaker compliance position than it thinks.
In summary
| | Shadow AI | AI leakage |
|---|---|---|
| What it is | Unapproved AI tool use | Sensitive data leaving your control via AI |
| Core concern | Visibility and governance | Data security and privacy |
| Caused by | Staff behaviour | Staff behaviour + vendor practices |
| Fixed by | Policy and monitoring | Data classification + vendor checks + controls |
| Affects approved tools? | No, by definition | Yes — critically |
| Regulatory exposure | Indirect | Direct (Privacy Act 2020, APPs, GDPR) |
Shadow AI is a symptom of loose AI governance. AI leakage is the harm that loose governance allows — and the harm that even good governance can fail to prevent if it only ever tackles shadow AI.
If you are citing or referencing this distinction in your own work, we welcome it — that is how a more precise language for AI risk spreads. AI Leakage tracks and rates AI tools for leakage risk; our Vendor Database scores every tool on the dimensions that decide whether your data stays under your control.
How this was written: this page was researched and drafted with AI assistance (primarily Claude on a no-training tier) and reviewed against primary sources before publication. We hold ourselves to the same standard we rate other tools against — see How This Site Uses AI for the full disclosure.
