AI watermarking is supposed to be the definitive answer to the detection problem — invisible signals embedded at the moment of generation that prove a machine wrote the text. Google's SynthID is already live. OpenAI built one and then shelved it. Here's how the technology actually works, where it falls short, and whether watermarks can be stripped.
What AI Text Watermarking Actually Is
Traditional AI detection works by analyzing text after it's been written — looking for statistical patterns that suggest machine authorship. Watermarking takes a completely different approach. It embeds a signal during generation, before the text even reaches you.
Think of it like this. When a large language model generates text, it predicts one token (roughly one word) at a time. For each position in the text, the model calculates a probability distribution across its entire vocabulary — how likely each possible next word is. It then samples from that distribution to pick the actual word.
Watermarking intervenes at this sampling step. Instead of picking words purely based on the model's probability distribution, the system subtly biases the selection toward specific tokens using a secret pattern. The output still reads naturally — the biases are small enough that you can't notice them — but a detector that knows the secret pattern can scan the text and identify the statistical fingerprint.
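The biased-selection idea is easy to demonstrate with a toy "green list" scheme from the research literature (simpler than SynthID's tournament approach): the secret key plus the previous token deterministically selects a favored half of the vocabulary, and those tokens get a small probability boost before the next word is picked. Everything in this sketch, from the hash to the bias size, is illustrative rather than any production system's code.

```python
import hashlib
import random

def green_list(prev_token: str, key: str, vocab: list) -> set:
    # Deterministically pick a "green" half of the vocabulary from the
    # previous token plus a secret key. Same inputs, same green list,
    # which is what lets a detector reconstruct it later.
    seed = int(hashlib.sha256((key + "|" + prev_token).encode()).hexdigest(), 16)
    rng = random.Random(seed)
    shuffled = sorted(vocab)
    rng.shuffle(shuffled)
    return set(shuffled[: len(vocab) // 2])

def biased_sample(probs: dict, prev_token: str, key: str,
                  bias: float = 0.3) -> str:
    # Boost green-listed tokens by a small factor, then pick the most
    # likely token. (Renormalizing wouldn't change the argmax, so we
    # skip it; a greedy pick keeps this toy example deterministic.)
    green = green_list(prev_token, key, list(probs))
    boosted = {t: p * (1 + bias) if t in green else p for t, p in probs.items()}
    return max(boosted, key=boosted.get)
```

A real system applies this at every generation step and samples stochastically rather than greedily; the point is only that the bias is small, secret-keyed, and reproducible.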
The advantage over post-hoc detection is significant. Traditional detectors are guessing whether text looks like AI output. Watermark detectors are checking for a specific, deliberate signal that was placed there on purpose. In theory, this makes watermarking far more accurate and far harder to fool.
How Does Google's SynthID Actually Work?
SynthID is the only text watermarking system deployed at scale as of 2026. Google DeepMind developed it, and it's active on text generated through Gemini (both the app and the API). The research behind it has been published, and the text watermarking component has been open-sourced via Hugging Face. Here's what's happening under the hood.
Step 1: Context Hashing
For each token the model is about to generate, SynthID creates a “seed” derived from the preceding tokens. This seed is deterministic — given the same context, you always get the same seed. This means the watermark is recoverable later without needing to store any additional data about the text.
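As a minimal sketch, assuming a SHA-256 hash over a short context window (SynthID's actual construction is not public at this level of detail), the seeding step might look like:

```python
import hashlib

def context_seed(preceding_tokens: list, key: str, window: int = 4) -> int:
    # Hash the last `window` tokens together with a secret key into an
    # integer seed. Deterministic: a detector can rebuild the exact same
    # seed later from nothing but the text and the key.
    context = "\x1f".join(preceding_tokens[-window:])
    digest = hashlib.sha256(f"{key}\x1f{context}".encode()).hexdigest()
    return int(digest, 16)
```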
Step 2: G-Function Scoring
Using the seed and a secret developer-provided key, SynthID applies a pseudo-random function (called a “g-function”) to assign a hidden score to every possible next token. These scores are invisible to the user and don't change the model's understanding of language — they're an additional layer of information overlaid on the generation process.
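Continuing the same toy construction, a stand-in g-function can be as simple as hashing the seed together with each candidate token. The real g-function differs, but the property that matters, reproducibility from the secret key, is the same:

```python
import hashlib

def g_score(seed: int, token: str) -> float:
    # Map (seed, candidate token) to a pseudo-random score in [0, 1).
    # The score looks like noise to anyone else, but anyone holding the
    # same seed (and therefore the same secret key) can recompute it.
    digest = hashlib.sha256(f"{seed}\x1f{token}".encode()).hexdigest()
    return int(digest[:8], 16) / 2 ** 32
```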
Step 3: Tournament Sampling
This is where it gets clever. Instead of simply adding the g-scores to the model's token probabilities (which would be detectable and could degrade output quality), SynthID uses a tournament system. Tokens “compete” in multi-layer elimination rounds. A token advances if its combined likelihood (model probability + g-score) beats other candidates. The final chosen token stays within the model's natural probability distribution but reflects the watermark's bias.
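Here is a minimal sketch of the tournament idea, with a single shared scoring function `g` standing in for SynthID's per-layer g-functions (an assumption made to keep the example short):

```python
import random

def tournament_sample(probs: dict, g, layers: int = 3, rng=None) -> str:
    # Draw 2**layers candidates from the model's own distribution, then
    # run pairwise knockout rounds in which the higher g-score wins.
    # Because every entrant was sampled from the model's distribution,
    # the winner still is too; the tournament only tilts the choice
    # toward high-g tokens.
    rng = rng or random.Random()
    tokens, weights = zip(*probs.items())
    pool = rng.choices(tokens, weights=weights, k=2 ** layers)
    while len(pool) > 1:
        pool = [max(pair, key=g) for pair in zip(pool[::2], pool[1::2])]
    return pool[0]
```

Tokens the model considers very unlikely rarely make it into the candidate pool at all, which is how the output stays within the model's natural distribution.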
Step 4: Detection
To check for a watermark, the detector takes the text, reconstructs the seeds from context, recalculates the g-scores using the same secret key, and checks whether the actual tokens chosen align with the watermark pattern more than random chance would predict. If the alignment is strong enough, the text is flagged as watermarked.
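Putting the pieces together, a toy detector recomputes scores from context and checks how far their average sits above chance. This is illustrative only (the real detector's scoring and thresholding are more sophisticated), but it shows why the secret key is essential:

```python
import hashlib

def g_score(key: str, context: tuple, token: str) -> float:
    # Toy g-function: reproducible pseudo-random score in [0, 1) from
    # the secret key, the recent context, and the candidate token.
    material = "\x1f".join((key, *context, token))
    return int(hashlib.sha256(material.encode()).hexdigest()[:8], 16) / 2 ** 32

def detect(tokens: list, key: str, window: int = 4) -> float:
    # Recompute each token's g-score from its own context and the key.
    # Unwatermarked text averages about 0.5; text whose generation
    # favored high-g tokens averages visibly higher. A production
    # detector turns this gap into a p-value rather than eyeballing it.
    scores = [
        g_score(key, tuple(tokens[max(0, i - window): i]), tokens[i])
        for i in range(1, len(tokens))
    ]
    return sum(scores) / len(scores)
```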
The Quality Tradeoff
Google designed SynthID specifically to avoid degrading text quality. Their research shows no measurable impact on the creativity, accuracy, or speed of text generation. The biases introduced are small enough that the output remains within the model's natural distribution. That said, SynthID is less effective on factual responses because there are fewer opportunities to adjust token probabilities without affecting accuracy. A question like “What is the capital of France?” leaves almost no room for token-level variation.
Why Did OpenAI Build a Watermark and Then Kill It?
OpenAI developed a text watermarking system that achieved over 99% accuracy in controlled testing. By most technical measures, it worked. They could embed and detect watermarks reliably. And then they decided not to ship it.
The official reason: an internal survey found that nearly 30% of ChatGPT users said they would use the service less if watermarking were implemented. For a company whose revenue depends on user engagement, that was a dealbreaker.
There were also “concerns about stigmatization and user impact.” OpenAI worried that watermarking would unfairly target users in contexts where AI assistance is legitimate — drafting emails, brainstorming, writing code. The watermark can't distinguish between “I used AI to cheat on an exam” and “I used AI to help draft a marketing email.”
Instead, OpenAI pivoted to C2PA metadata — a content provenance standard that attaches visible metadata to AI-generated images and videos, rather than embedding invisible signals in text. C2PA is backed by a coalition including Microsoft, Adobe, and Intel. It works well for visual media but doesn't solve the text detection problem because metadata is easily stripped when text is copied and pasted.
Can AI Watermarks Be Removed?
This is the question everyone asks. The short answer is yes, but with caveats.
What Works Against SynthID
SynthID watermarks are robust to minor modifications — cropping sections of text, changing a few words, or light paraphrasing won't remove them. The watermark is distributed across the entire text, so partial edits leave enough signal for detection.
However, more aggressive transformations do degrade the watermark signal:
- Heavy paraphrasing: Rewriting sentences at the structural level (not just swapping synonyms) disrupts the token-level pattern that SynthID relies on. Detector confidence drops significantly when text has been “thoroughly rewritten.”
- Translation round-tripping: Translating text to another language and back effectively regenerates the token sequence, destroying the original watermark pattern.
- Semantic reconstruction: Tools that parse the meaning of text and rebuild it from scratch — like HumanizeThisAI — generate entirely new token sequences. The original watermark doesn't survive because the tokens themselves are different.
- Re-generation through a different model: Running watermarked text through a non-watermarking model (feeding it as a prompt to Claude, for example) produces new output that carries no SynthID signal.
Research from ETH Zurich confirmed that SynthID is “easier to scrub than other state-of-the-art schemes even for naive adversaries.” They also found that the presence of SynthID can be detected using black-box queries — meaning you can test whether text is watermarked without access to the secret key.
The Fundamental Vulnerability
Text watermarking has a weakness that image and audio watermarking don't share. Text is discrete: it's made of individual words that can be completely replaced while preserving meaning. An image watermark can be spread across millions of pixels, each nudged imperceptibly, and survive edits that leave most of those pixels alone. A text watermark has only the token sequence to live in, and swapping words rewrites that sequence outright.
This means any tool that rewrites at the semantic level — understanding what the text means and expressing it differently — inherently defeats token-level watermarks. The watermark lives in the specific sequence of tokens. Change those tokens while keeping the meaning, and the watermark is gone.
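A tiny simulation makes the fragility concrete. Using a toy stand-in for the g-function and a mean-score detector (illustrative code, not any real system's), we can watermark a token sequence, swap out half the tokens, and watch the detector statistic collapse toward the 0.5 chance level. Note that each swap also corrupts the context hashes of the tokens that follow it, so the signal degrades faster than the replacement rate alone suggests.

```python
import hashlib
import random

def g_score(key, context, token):
    # Toy g-function: key + context + token -> pseudo-random [0, 1).
    material = "\x1f".join((key, *context, token))
    return int(hashlib.sha256(material.encode()).hexdigest()[:8], 16) / 2 ** 32

def detector_mean(tokens, key, window=4):
    # Average g-score of the observed tokens; ~0.5 means "no watermark".
    scores = [
        g_score(key, tuple(tokens[max(0, i - window): i]), tokens[i])
        for i in range(1, len(tokens))
    ]
    return sum(scores) / len(scores)

def reword(tokens, fraction, substitutes, rng):
    # Swap out a fraction of tokens, as synonym-level rewriting would.
    # Each swap also changes the context hash seen by the *following*
    # tokens, so intact signal survives only where both a token and its
    # whole context window are untouched.
    out = tokens[:]
    positions = rng.sample(range(1, len(out)), int(fraction * (len(out) - 1)))
    for i in positions:
        out[i] = rng.choice(substitutes)
    return out
```

Semantic reconstruction is the limiting case: every token is replaced, so no intact signal survives at all.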
The Spoofing Problem: Can Watermarks Be Faked?
There's a darker side to watermarking that doesn't get enough attention. If watermarks can identify AI text, could someone add a fake watermark to human-written text to frame someone?
The ETH Zurich research found that SynthID is more resistant to spoofing than other watermarking schemes, and that attempts to spoof it leave “discoverable clues.” But “more resistant” isn't “immune.” As watermarking technology becomes more widespread, the incentive to develop spoofing tools increases.
The spoofing risk creates an asymmetry in the stakes. If a watermark detector flags text as AI-generated, is that because the person used AI, or because someone planted a fake watermark? In academic settings where false accusations can derail careers, this uncertainty undermines the entire premise of watermarking as definitive proof.
Where Does AI Watermarking Stand in 2026?
As of early 2026, here's the state of play.
| Company | Text Watermarking Status | Approach |
|---|---|---|
| Google (SynthID) | Live on Gemini | Token-level tournament sampling; open-sourced via Hugging Face |
| OpenAI | Shelved | Pivoted to C2PA metadata for images/video; no text watermarking |
| Anthropic (Claude) | No watermarking | No public plans for text watermarking |
| Meta (Llama) | No watermarking | Open-source models; watermarking would be easily removed |
| Mistral | No watermarking | Open-weight models; no watermarking infrastructure |
The critical observation: only Google is actively watermarking text. Every other major AI provider has either abandoned the idea or never started. This means watermarking only “solves” detection for Gemini output. ChatGPT, Claude, Llama, Mistral, and every other model produce unwatermarked text that must be detected through traditional statistical analysis.
There is no industry standard for AI text watermarking. The EU AI Act includes provisions that may eventually require watermarking, but enforcement details are still being worked out. Until there's regulatory pressure or a competitive reason to adopt it, most companies appear content to let Google be the only player in this space.
What This Means for Writers and Students
If you're worried about AI detection, watermarking changes less than you'd think. Here's the practical reality.
If you use Gemini: Your text carries SynthID watermarks. These survive minor edits but are removed by thorough rewriting, translation, or semantic reconstruction. If you run Gemini output through a humanization tool that rebuilds the text structurally, the watermark is effectively gone.
If you use ChatGPT, Claude, or other models: No watermarks to worry about. Detection relies entirely on statistical analysis, which has well-documented limitations and workarounds.
Watermarking isn't a silver bullet. Even Google acknowledges SynthID is less effective on short, factual text. And the fact that only one company has deployed it means watermark detection can't be relied on as a universal solution. Turnitin, GPTZero, and other academic tools still primarily use statistical detection.
The future is uncertain. Regulation could force all AI providers to watermark. But even if that happens, the fundamental vulnerability of text watermarking — that semantic rewriting destroys the signal — means watermarks alone will never be a complete detection solution. The cat-and-mouse game between detection and evasion will continue with or without watermarks.
TL;DR
- AI watermarking embeds a hidden signal during text generation, not after — making it fundamentally different from traditional AI detection.
- Google's SynthID is the only text watermark deployed at scale, but ETH Zurich research shows it can be scrubbed by paraphrasing, translation, or semantic rewriting.
- OpenAI built a 99%+ accurate watermark but shelved it after ~30% of users said they'd use ChatGPT less if it shipped.
- Text watermarks are inherently fragile because text is discrete — replacing words while keeping meaning destroys the token-level signal the watermark depends on.
- Only Google watermarks text today. ChatGPT, Claude, Llama, and Mistral produce unwatermarked output, so traditional detection remains the primary approach.
Watermarks or not, the detection problem remains. Whether your AI text carries SynthID signals or just has statistical patterns that detectors look for, HumanizeThisAI strips both. Semantic reconstruction replaces the original token sequence entirely — no watermark survives, no statistical fingerprint remains. Try it with 1,000 words free.
Try HumanizeThisAI Free