
Open access to Google’s AI text watermarking tool may help advertisers win audience trust

By Kendra Barnett, Associate Editor

October 23, 2024 | 12 min read


As concerns about AI content misuse abound, Google is opening up access to its AI text watermarking technology. It will be a welcome change for advertisers and publishers, who stand to benefit from improved transparency.

Google will now allow all developers to access its SynthID Text tool for identifying and flagging AI-generated text / Adobe Stock

In an effort to address growing concerns about the authenticity and proliferation of AI-generated content, Google on Wednesday announced that it will make its tech for watermarking AI-created text available to all developers.

For advertisers and publishers – whose lifeblood in many ways depends on consumer trust – the change could prove particularly valuable.

With the tool, SynthID Text, developers can embed invisible watermarks into AI-generated text – watermarks that remain detectable even after the content has been modified or paraphrased.

The technology may help differentiate between human-written and AI-generated text, marking a significant step toward improving transparency in generative AI.

“SynthID Text is designed to embed imperceptible watermarks directly into the text generation process. It does this by introducing additional information at the point of generation by modulating the likelihood of tokens being generated – all without compromising the quality, accuracy, creativity or speed of the text generation,” Pushmeet Kohli, vice-president of research at Google DeepMind, said in a statement shared with The Drum. “Now, other GenAI developers will be able to use this technology to help them detect whether text outputs have come from their own LLMs (large language models), making it easier for more developers to build AI responsibly.”

SynthID was originally unveiled in August 2023, when the watermarking technology was made available exclusively for images created with Google’s AI image generator Imagen, part of Google Cloud’s machine learning platform Vertex AI. Now, the text version, SynthID Text, will be accessible to all developers via Hugging Face and Google’s Responsible GenAI Toolkit.
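In practice, the Hugging Face route means watermarking can be switched on at generation time. The sketch below shows roughly what that looks like via the Transformers integration announced at launch; the class and parameter names follow that integration as documented at release and should be verified against current documentation, while the model checkpoint and key values are illustrative only.

```python
# Minimal sketch of enabling SynthID Text via the Hugging Face Transformers
# integration announced with the release. Class and parameter names follow
# the launch documentation; the checkpoint and keys are example values.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    SynthIDTextWatermarkingConfig,
)

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")  # illustrative model
model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")

# The watermark is parameterized by a private set of integer keys; whoever
# holds the keys can later run the matching detector over suspect text.
watermarking_config = SynthIDTextWatermarkingConfig(
    keys=[654, 400, 836, 123, 340, 443, 597, 160, 57, 29],  # example keys only
    ngram_len=5,  # how much preceding context seeds each probability adjustment
)

inputs = tokenizer("My favorite tropical fruits are", return_tensors="pt")
outputs = model.generate(
    **inputs,
    watermarking_config=watermarking_config,  # embeds the watermark while sampling
    do_sample=True,
    max_new_tokens=50,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Detection runs separately: only a party holding the keys can score a passage for the watermark, without needing access to the original model run.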

The improved accessibility of the tool arrives at a critical moment, as AI-generated and AI-modified content sweeps the internet, heightening ethical and legal debates around issues like misinformation, deepfakes, fraud and other nefarious uses of AI.


The issues have become increasingly pressing this year, as nearly half the world’s population heads to the polls in a range of high-stakes elections. And AI-fueled misinformation has already reared its ugly head in a handful of these elections. In mid-August, former US president and Republican nominee Donald Trump shared manipulated images that falsely signaled campaign support from Taylor Swift fans. Elections in India and Slovakia faced deceitful AI robocalls and deepfake speeches.

Digital watermarking has been proposed as one way to mitigate the harmful spread of AI-generated misinformation. By explicitly flagging when a piece of visual, text or audio content has been machine-generated or tampered with, watermarks curb the risk of disseminating falsehoods – at least in theory.

The newly widespread availability of Google’s SynthID Text is, for this reason, a step in the right direction, according to Andrew Frank, vice-president distinguished analyst at Gartner. “Enabling consumers and citizens – and their search and social tools – to distinguish between content that’s been generated by machines and humans gives them critical context to judge its trustworthiness and authenticity,” he says.

How SynthID Text works

The mechanism behind SynthID Text is fairly intricate. When an LLM generates text, it does so by predicting the next word or part of a word, known as a token, based on what came before it. These predictions are guided by probability scores, which help the model choose the most likely word to keep the sentence flowing smoothly. As Google explains on a landing page for SynthID, “For example, with the phrase ‘My favorite tropical fruits are __.’ The LLM might start completing the sentence with the tokens ‘mango,’ ‘lychee,’ ‘papaya,’ or ‘durian,’ and each token is given a probability score...”

SynthID Text adds a layer of complexity to this process by subtly adjusting these probabilities as the model generates text. The tool tweaks the probability scores just enough to embed a hidden watermark in the text, without affecting the quality or meaning of the output. The watermark can later be detected by analyzing the patterns in how tokens are selected.

Adjustments occur throughout a given text, with each sentence containing several modified probability scores. As the length of the text grows, so does the effectiveness of the watermark, making it easier to distinguish from non-watermarked content.
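To make those statistics concrete, here is a deliberately simplified Python sketch of keyed probability modulation. It is not Google’s algorithm – SynthID Text uses a more sophisticated tournament-style sampling scheme – but a green-list-style toy in the spirit of academic watermarking research, with every name and value in it hypothetical. It illustrates why the watermark is invisible in any single word choice and why the detection signal strengthens with length.

```python
# Toy illustration of keyed probability modulation and detection. This is
# NOT Google's algorithm (SynthID Text uses a tournament-based scheme);
# it is a simplified, hypothetical green-list sketch of the same idea.
import hashlib
import random

SECRET_KEY = b"hypothetical-watermark-key"
VOCAB = ["mango", "lychee", "papaya", "durian", "banana", "kiwi", "guava", "fig"]

def green_set(context):
    """Split the vocabulary pseudo-randomly per context, keyed by a secret,
    so only the key holder can recompute which tokens were favored."""
    seed = SECRET_KEY + " ".join(context).encode()
    return {t for t in VOCAB
            if hashlib.sha256(seed + t.encode()).digest()[0] % 2 == 0}

def sample_watermarked(context, probs, bias=2.0):
    """Generation side: scale up the probability of 'green' tokens slightly,
    then sample. Output quality barely changes; the tilt is the watermark."""
    greens = green_set(context)
    weights = [p * (bias if t in greens else 1.0) for t, p in zip(VOCAB, probs)]
    return random.choices(VOCAB, weights=weights, k=1)[0]

def detection_score(tokens, window=4):
    """Detection side: count how often each token lands in its context's
    green set. Unwatermarked text hovers near 0.5; watermarked text scores
    higher, and the gap widens as the text grows longer."""
    hits = sum(
        1 for i, t in enumerate(tokens)
        if t in green_set(tokens[max(0, i - window):i])
    )
    return hits / len(tokens)
```

Over a sentence or two, detection_score barely separates watermarked from unwatermarked text; over several paragraphs, the gap becomes statistically unmistakable – which mirrors the short-text limitation noted below.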

SynthID Text is integrated into Google’s Gemini models.

Despite the positive potential of the tool, SynthID Text has its limits. For instance, it may be less reliable with shorter pieces of text or when AI-generated content is translated into different languages.

Additionally, “Google’s own ad products increasingly rely on AI-generated content, which many advertisers welcome for its time-saving benefits,” says Xavier Litt, cofounder of adtech platform Ad360. The “real test,” he says, “will be whether Google holds itself to the same standards it imposes on others by watermarking its own AI-generated content.”

Other versions of SynthID, designed to detect and watermark AI-generated images, audio and video, may be opened up in a similar fashion, depending on developer feedback, a company spokesperson tells The Drum.

Creating greater transparency & trust in adland

For advertisers and publishers, SynthID Text could provide much-needed transparency – which may help secure audiences’ trust.

“This is a significant development for advertisers and publishers because it introduces an essential layer of transparency and authenticity at a time when AI-generated content is becoming more prevalent,” says Dr. A.K. Pradeep, founder and CEO at generative AI firm Sensori.ai.

For advertisers, “this tech safeguards campaign integrity by distinguishing human- from AI-generated content, protecting brand trust and aligning with ethical standards,” Pradeep says.

Further, he posits that a technology like SynthID could provide useful insights into the ways in which consumers engage – or don’t engage – with AI-generated ads. In short, it could help inform brands’ AI content strategies.

Others agree. Jorge Argota, an independent marketing consultant, suggests that the tool will embolden advertisers to experiment more with AI-generated copy without fear of negative repercussions. “Brands can now try AI content with confidence and protect their integrity and avoid PR headaches,” he says. “It’s not a perfect solution, but it’s a big step towards responsible AI. [It helps move us closer toward] trust, transparency and authenticity in content creation and distribution across all platforms.”


What’s more, an advanced AI detection tool like SynthID could equip advertisers with knowledge that helps them sidestep brand safety pitfalls. In the words of Greg Kahn, an emerging tech expert and the chief executive officer at GK Digital Ventures: “This is the first scalable solution that allows brands to verify they aren’t inadvertently placing ads next to AI-generated content farms or what some are calling ‘AI slop.’” In brief, tech that can reliably identify AI-generated content – like SynthID – may help advertisers navigate the complex waters of brand safety and avoid serving ads alongside low-quality, machine-generated material.

Publishers, too, could benefit, Pradeep argues, as sophisticated digital watermarking can help maintain high editorial standards, ensure accuracy and mitigate the likelihood of misinformation on webpages, apps or other channels.

This fact in itself could have a positive ripple effect for publishers. “Major publishers now have a technical tool to distinguish their human-generated content, which could justify higher ad rates,” explains Kahn. “News outlets known for original reporting, for example, can now show that distinction clearly to readers and advertisers.”

Hurdles on the road ahead

While Google has made SynthID Text open source to encourage adoption, achieving widespread uptake across the industry will still be a challenge, especially as developers like OpenAI work to establish their own watermarking mechanisms.

Meanwhile, lawmakers are circling. A new law in California requires AI systems to disclose when content is AI-generated by embedding provenance information in the content’s metadata. A similar draft regulation was proposed in China last month.

Rising regulatory pressure is driven by growing urgency to combat the spread of dangerous and misleading AI-generated content. A 2021 draft of the EU’s AI Act estimated that up to 90% of online content could be created by AI by 2026, amplifying the need for effective detection mechanisms.

As governments around the world look to implement new legal guardrails for the production and dissemination of AI-made content, the industry is likely to coalesce around a standardized method of watermarking.

And some experts anticipate that Google’s SynthID will be embraced as a potential standard by lawmakers. “I believe legislators will gladly support this initiative hoping to get a grip of Gen AI anarchy that might have some unwanted social impact,” says Dmitri Lisitski, the CEO and cofounder of digital advertising platform Influ2.

Nonetheless, it’s worth noting that watermarking tech – no matter how sophisticated – is unlikely to be foolproof. Last year, a team of researchers at the University of Maryland, led by computer science professor Soheil Feizi, published a study demonstrating just how easy it is to evade watermarking efforts. And while the technology for detecting AI-generated or AI-modified content may have advanced since then, bad actors will surely continue to seek out ways to circumvent or erase digital watermarks.

Google DeepMind’s Kohli acknowledged that SynthID “isn’t a silver bullet for identifying AI-generated content,” but expressed optimism that it is a step in the right direction.

Ultimately, the expanded access to SynthID Text is a sign of progress toward a potential industry-wide standard for identifying and flagging AI-generated content. “By making SynthID open source,” Kahn says, “Google is essentially offering a universal standard. It’s not a full solution, but it’s a step toward clearer boundaries between AI-generated and human-created content.”

With other AI players likely to follow in Google’s footsteps and roll out their own detection tools, only time will tell whether watermarking becomes a universal safeguard in the online content ecosystem – or just a minor weapon in the fight against AI misuse.

For more, sign up for The Drum’s daily newsletter here.
