Enrichment before personalization: the order matters in cold email

There’s a step in every cold email pipeline that decides how much the AI personalization is worth, and it happens before the AI runs. Enrichment first, personalization second. Get the order backwards and you pay for it twice.

Personalizing a dead address is burning money twice

In a scraped list of a thousand leads, typically 15 to 25 percent of the addresses are dead, disposable, or catch-alls that will never reach a real inbox. Personalize first and you’ve paid AI costs on hundreds of leads that were never reachable. That part is merely wasteful.

The expensive part comes next. Mail those addresses and the bounces land on your sender reputation. Warmed mailboxes take weeks to build and a few bad sending days to ruin, and once reputation dips, every future email in the campaign delivers worse. The best opener ever written is worth nothing in a spam folder. Verification costs a fraction of a cent per address, which makes it the highest-leverage line item in the whole system.
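A minimal sketch of what such a verification gate could look like. It assumes each lead already carries a verdict from whatever verification service you use; the `Lead` class, the status labels, and `split_by_reachability` are hypothetical names for illustration, not any particular vendor's API.

```python
from dataclasses import dataclass


@dataclass
class Lead:
    email: str
    # Hypothetical verifier verdict, e.g. "valid", "invalid",
    # "disposable", "catch_all", or "unknown".
    status: str


# Assumption: only addresses the verifier marks "valid" are worth mailing.
SENDABLE = {"valid"}


def split_by_reachability(leads):
    """Return (keep, drop) so only reachable leads move downstream."""
    keep = [lead for lead in leads if lead.status in SENDABLE]
    drop = [lead for lead in leads if lead.status not in SENDABLE]
    return keep, drop
```

Only the `keep` list goes on to personalization and sending. The `drop` list is worth logging so you can see how dirty each source is.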

Dirty data makes AI write like a robot

The second half of enrichment is cleanup, and it’s what makes the personalization believable. Raw scraped data shows up in legal-entity dress: “ACME MARKETING SOLUTIONS LLC,” “Jonathan R. Smith.” Feed that to a language model and it’ll faithfully write “I came across Acme Marketing Solutions LLC” in the opener, and the email is dead on arrival, because no human has ever typed a company’s legal suffix into a friendly sentence.

So before any opener gets generated, company names lose their suffixes and their SEO padding, and first names get casualized to what a colleague would actually write. “Acme” and “Jon.” Now the model composes with the same ingredients a person would use, and the output reads like a person wrote it. In my pipeline this cleanup is deterministic code with zero AI involved, which means it’s fast, free, and never hallucinates.
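To make the cleanup step concrete, here is a rough sketch of deterministic name cleaning. The suffix list, the filler-word list, and the nickname map are tiny illustrative assumptions; a real pipeline would need much longer ones, and the function names are hypothetical.

```python
import re

# Illustrative, deliberately short lists (assumptions, not a complete set).
LEGAL_SUFFIXES = {"llc", "inc", "ltd", "corp", "co", "gmbh", "plc"}
FILLER_WORDS = {"marketing", "solutions", "services", "group", "agency"}
NICKNAMES = {"jonathan": "Jon", "michael": "Mike", "katherine": "Kate"}


def _fix_case(word: str) -> str:
    # Re-case shouting or all-lowercase words; leave short acronyms
    # ("IBM"), digit names ("3M"), and mixed case ("HubSpot") alone.
    if word.isalpha() and len(word) > 3 and (word.isupper() or word.islower()):
        return word.capitalize()
    return word


def clean_company(raw: str) -> str:
    # Drop SEO padding after a spaced separator: "Acme | Best Agency in Austin".
    head = re.split(r"\s[|\-:]\s", raw.strip(), maxsplit=1)[0]
    words = re.findall(r"[A-Za-z0-9&']+", head)
    # Peel trailing suffixes and filler, but always keep the first word.
    while len(words) > 1 and words[-1].lower() in LEGAL_SUFFIXES | FILLER_WORDS:
        words.pop()
    return " ".join(_fix_case(w) for w in words)


def casual_first_name(raw: str) -> str:
    tokens = raw.strip().split()
    if not tokens:
        return ""
    first = tokens[0].strip(".,")
    # Use a known nickname if there is one, otherwise just tidy the case.
    return NICKNAMES.get(first.lower(), _fix_case(first))
```

With these lists, `clean_company("ACME MARKETING SOLUTIONS LLC")` gives `"Acme"` and `casual_first_name("Jonathan R. Smith")` gives `"Jon"`. Nothing here calls a model, so the same input always produces the same output.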

The economics of the ordering

The whole argument compresses into one asymmetry. Enrichment is cheap and mechanical. Personalization is where the value concentrates. Run the cheap mechanical step first and the expensive creative step only ever touches leads that are reachable and cleanly described.

Taekwondo drills this exact lesson from day one: stance before kick. Power thrown off a bad base goes nowhere, no matter how fast the leg is, and no amount of flair up top fixes it. Enrichment is the stance. Personalization is the kick.

In my stack, verification and cleanup cost close to nothing, and AI openers land around forty cents per thousand leads with a small model. Flip the order and part of those forty cents gets spent on unreachable addresses, while the rest produces stiff, suffix-laden openers from dirty inputs. Same tools, same spend, far less value.
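The arithmetic behind the wasted share is easy to check. This back-of-envelope sketch uses only the figures quoted in this post (forty cents per thousand leads, 15 to 25 percent unreachable); the function name is made up for illustration, and it counts AI spend only, not the deliverability damage from bounces.

```python
COST_PER_THOUSAND = 0.40  # dollars per 1,000 AI openers, figure from this post


def wasted_spend(leads, dead_share, cost_per_thousand=COST_PER_THOUSAND):
    """Return (unreachable lead count, AI dollars spent on them)."""
    unreachable = round(leads * dead_share)
    return unreachable, unreachable * cost_per_thousand / 1000


for share in (0.15, 0.25):
    count, dollars = wasted_spend(1000, share)
    print(f"{share:.0%} dead: {count} leads, ${dollars:.2f} of AI spend wasted")
```

On a thousand leads that comes to roughly six to ten cents of AI spend thrown away. The direct waste is small; the larger cost of the wrong order is the bounce damage and the weaker openers described above.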

The general rule

This ordering shows up way beyond cold email. Every AI step in a pipeline is a multiplier on the quality of what enters it, so the deterministic cleanup always goes upstream of the model. Whenever an AI output looks mediocre, I check what it was fed before I touch the prompt. Most “AI quality” problems I get called in on turn out to be data quality problems wearing a costume.