Why AI doesn't sound like you

What we found when we stopped guessing about tone of voice and AI, and went looking for the evidence.

Everyone working with AI eventually hits the same wall. You give it a prompt, and what comes back is smooth, competent, and average. It isn’t wrong. It just isn’t you.

We hear it constantly, usually from clients, usually about their brand: “I can’t make AI sound like us.” Or worse – they get it right once, and then it stops working, or it works for nobody else on the team. It is one of the most common frustrations in marketing’s first real years with generative AI, and most of the advice about fixing it is confident, tidy and unevidenced.

So we stopped guessing. We put some parameters around the problem and went looking for what actually works. The answer turned out to be more interesting than the problem – and it points at something much bigger than tone of voice.

First, what tone of voice actually is

Most people know tone of voice matters and would struggle to say what it is made of. It is worth being precise, because the confusion is part of why AI gets it wrong.

Your voice is who you are. It stays constant. Your tone flexes by audience and moment – you don’t write to the board the way you write to customers, or to your boss the way you talk to your best friend. And style is the mechanical layer: sentence case, “who” not “whom”, en dashes not em dashes. Small choices, made on purpose, that add up to something recognisable.

You have one voice and a hundred tones, most of which have never been written down. It looks simple from the outside and turns out to be all taste and tacit knowledge on the inside. That’s the first reason it’s hard to hand to a machine.

Why AI pulls everything to the average

A language model wants to be plausible, and its safest bet for plausible is the average – the most likely version of whatever you asked for, drawn from everything it has read. That is fine for most tasks. It is fatal for voice, because a distinctive voice sits a long way from average. That distance is the entire point of having one.

Now look at how most teams try to fix it: they stack adjectives. “Be professional, punchy and wry.” But to a model, a word isn’t a feeling – it is a coordinate in a vast mathematical space, and the model has no fixed definition of “punchy”. It reinterprets the word on every run, and where the instruction is vague it falls back to the statistical centre. Words mean different things to different people, too: ask for “wry” and you are picturing Margaret Atwood while the model reaches for Jeremy Clarkson. The result is the register everyone now recognises on sight – competent, smooth, interchangeable, and decidedly average. This is not a small problem for a brand.

It’s worth saying that sounding average isn’t always a failure. Santander, for example, sounds like your average bank, and that works well because it isn’t trying to compete on brand. It is still making a conscious choice to sound that way and to maintain it. But where brand is part of how you win – a Monzo, an Innocent Drinks, a BrewDog – average is exactly what you can’t afford. Spend a year letting AI regress your voice to the mean and you haven’t saved money. You have diluted an asset, and paid to do it.

[Figure: distinctive descriptors like "punchy", "professional" and "wry" all collapsing toward the same point – the pull to the average]

What we did

We wanted to find a way to evidence this theory, so we looked at how people who write for a living get a consistent voice out of AI – across journalism, PR, fiction (including the romance and fantasy writers who produce at real volume), technical writing and brand copywriting. The tools they sell, the methods they swear by, the workflows they describe.

Then we graded the evidence behind each one, because almost nobody in this field measures whether the voice is actually right. They measure how much editing they had to do afterwards – effort, not quality. Quality, it turns out, is easy to feel and hard to measure, which tells you something important: this is closer to an art than a formula.
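To make the effort-versus-quality gap concrete, here is a minimal sketch of the kind of edit-effort score people tend to track. It is an illustration only: the function name `edit_effort` and the word-level comparison are our assumptions, not a method taken from any of the tools we reviewed.

```python
import difflib


def edit_effort(draft: str, published: str) -> float:
    """Rough 0.0 to 1.0 score: 0.0 means nothing changed, higher means more rewriting.

    Hypothetical helper for illustration. It compares the two texts word by word
    and says nothing about whether the published version sounds right.
    """
    draft_words = draft.split()
    published_words = published.split()
    matcher = difflib.SequenceMatcher(a=draft_words, b=published_words, autojunk=False)
    # ratio() is a similarity score between 0 and 1, so flip it to get effort.
    return 1.0 - matcher.ratio()


unchanged = edit_effort("a b c", "a b c")
one_word_swapped = edit_effort("a b c d", "a b x d")
```

A low score here only tells you the editor changed little. A draft can need very few edits and still sound like nobody in particular, which is why a number like this cannot stand in for a judgement about voice.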

Finally, we tested the strongest techniques on our own published articles, scored blind, and kept the failures in alongside the wins. The full method and results are in the whitepaper.

[Figure: the seven approaches writing professionals use to get voice consistency from AI, sorted by tech layer (prompt, context, weights, output gate) and by how strong the evidence is]

What works

Three findings, and the first is the one to take away.

Show, don’t tell. Give the model a strong on-brand example, then an explicitly off-brand one. That contrast fences off the right territory and blocks the pull to the bland centre. We do this instinctively in design – the mood board that says “like this, not like that”. It’s the single highest-return move available to you, because it defines the style as much by what it rules out as by what it includes.

[Figure: example from the Monzo tone of voice guide]
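As a rough sketch of the “like this, not like that” contrast described above, the prompt can be assembled from one on-brand and one off-brand passage. The function name `build_contrast_prompt`, the section labels and the placeholder passages are all our assumptions for illustration. Swap in real copy of your own, and send the resulting text to whichever model you use.

```python
def build_contrast_prompt(task: str, on_brand: str, off_brand: str) -> str:
    """Assemble a prompt that shows the voice instead of describing it.

    Hypothetical helper: the labels below are one possible layout, not a standard.
    """
    return "\n\n".join([
        "Write in the voice of the first example. Avoid the voice of the second.",
        "LIKE THIS (on-brand):\n" + on_brand.strip(),
        "NOT LIKE THIS (off-brand):\n" + off_brand.strip(),
        "TASK:\n" + task.strip(),
    ])


prompt = build_contrast_prompt(
    task="Tell customers their card is on its way.",
    # Placeholders only: paste real passages from your own published copy.
    on_brand="<paste a real on-brand passage here>",
    off_brand="<paste a real off-brand passage here>",
)
print(prompt)
```

Notice there are no adjectives in the prompt at all. The two examples carry the instruction.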

Then teach it over time. Take those examples and your rules and load them as a reusable pack – a project, a Skill, even just a block of plain text you bring to every prompt. When it gets something wrong, show it the edit and let it learn. (Big enterprises can go further and train a model on their own voice. A few claim to have cracked it – we looked hard for evidence that it works reliably for their clients and couldn’t find it. For almost everyone, a well-fed system is the smarter bet.)
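One way to picture the reusable pack is as a small structure that holds your rules, your examples and every correction you make, and renders them as plain text to bring to each prompt. This is a sketch under our own assumptions: the `VoicePack` class, its field names and its labels are hypothetical, and “learning” here simply means the correction travels with the next prompt, not that the model itself is retrained.

```python
from dataclasses import dataclass, field


@dataclass
class VoicePack:
    """Hypothetical container for rules, examples and corrections."""

    rules: list[str] = field(default_factory=list)
    on_brand: list[str] = field(default_factory=list)
    off_brand: list[str] = field(default_factory=list)
    corrections: list[tuple[str, str]] = field(default_factory=list)

    def add_correction(self, model_draft: str, human_edit: str) -> None:
        # Keep the before and after together so the model sees the actual edit.
        self.corrections.append((model_draft, human_edit))

    def render(self) -> str:
        # Build one block of plain text to paste ahead of any task.
        parts = []
        if self.rules:
            parts.append("STYLE RULES:\n" + "\n".join(f"- {r}" for r in self.rules))
        for text in self.on_brand:
            parts.append("LIKE THIS:\n" + text)
        for text in self.off_brand:
            parts.append("NOT LIKE THIS:\n" + text)
        for draft, edit in self.corrections:
            parts.append("YOU WROTE:\n" + draft + "\nWE CHANGED IT TO:\n" + edit)
        return "\n\n".join(parts)


pack = VoicePack(rules=["Use sentence case.", "Use en dashes, not em dashes."])
pack.add_correction("<model draft>", "<your edited version>")
pack_text = pack.render()
```

The same rendered text can live in a project, a Skill or a shared document, so the voice does not depend on one person’s prompt history.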

Show it enough. Give it too few examples and it overfits – grabs the most distinctive thing in your sample and won’t let go. Antony recently fed Claude a newsletter draft built around a metaphor about a cave, and everything afterwards was caves and journeys into the unknown. A couple of examples isn’t a voice, it’s a caricature.

Why this matters beyond marketing

Here is the part that outlasts the topic. Tone of voice is just an unusually clear example of every hard, taste-based problem organisations are now trying to hand to AI. It can’t be reduced to a single instruction. It needs context, examples, a system, and a human at the point where taste is applied.

Get that right and the job changes shape. The machine takes on the structural heavy lifting and hands you a draft that’s close. You spend your time where it counts – on judgement, on taste, on the decisions only you can make. In other words, you shrink the decision space: not so the machine decides for you, but so you make fewer, better decisions instead of wading through baggy copy you might as well have written yourself.

[Figure: a scatter of infinite possibilities funnelling through successive filters down to a single narrowed decision space]

That pattern – AI doing the lower-order work so people can do the higher-order work – is the one worth carrying from a marketing problem into every other corner of the business. As we’ve written about what AI is doing to marketing, the capabilities compounding in value are the judgement ones: briefing, editorial standards, taste. Tone of voice is where that shows up first and clearest.

Where this leaves leaders

A few things hold across everything we found.

There is no perfect prompt and no silver bullet, and any vendor promising a hands-off voice engine is selling snake oil. What works is a system you build and keep improving. The leadership decision is what to resource: fund the system and the work of correcting it, and protect the human at the point where taste is applied.

And the one practical thing to take away is simple. Stop describing how you sound, and start showing the model the difference. Do this, not that. Then keep teaching it.

This started as a Brilliant Noise webinar. Watch the 30-minute recording and get the full research and slides here.