I’ve fixed the same mess four times now. A design system starts clean — five core colours, three type scales, a handful of spacing values. Then the first product team needs “just one more” semantic token. Then the mobile team needs a different scale. Then someone on the marketing site hard-codes a hex because they couldn’t find the right variable.

Within six months, the token set has doubled and nobody trusts it. The system that was supposed to bring consistency has become a source of confusion.

Here’s why it happens and how to build tokens that actually survive.

The problem: naming, not numbers

Most token systems fail because of naming conventions, not the values themselves. Teams invent categories as they go — color-primary, color-brand-blue, color-cta-bg — until the namespace is a swamp.

The fix is a strict three-level hierarchy:

Level      Purpose               Example
Base       Raw values            blue-500: #2563eb
Primitive  Intent-neutral roles  color-accent: blue-500
Semantic   Context-specific      btn-primary-bg: color-accent

Base tokens never change. Primitive tokens change when the brand evolves. Semantic tokens change when a component needs to behave differently in a new context — without cascading chaos.
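To make the hierarchy concrete, here is a minimal TypeScript sketch of the three levels as plain lookup tables. The token names come from the table above; the TokenMap type and the resolveSemantic helper are names I made up for illustration, not part of any particular token library.

```typescript
// Hypothetical token tables mirroring the three levels.
type TokenMap = Record<string, string>;

// Base: raw values only.
const base: TokenMap = {
  "blue-500": "#2563eb",
};

// Primitive: intent-neutral roles, each pointing at a base token.
const primitive: TokenMap = {
  "color-accent": "blue-500",
};

// Semantic: context-specific names, each pointing at a primitive token.
const semantic: TokenMap = {
  "btn-primary-bg": "color-accent",
};

// Resolve a semantic token by walking semantic -> primitive -> base.
// Throws on any missing link so a broken reference fails loudly.
function resolveSemantic(name: string): string {
  const primitiveName = semantic[name];
  if (primitiveName === undefined) {
    throw new Error(`Unknown semantic token: ${name}`);
  }
  const baseName = primitive[primitiveName];
  if (baseName === undefined) {
    throw new Error(`Unknown primitive token: ${primitiveName}`);
  }
  const value = base[baseName];
  if (value === undefined) {
    throw new Error(`Unknown base token: ${baseName}`);
  }
  return value;
}

const buttonBackground = resolveSemantic("btn-primary-bg"); // "#2563eb"
```

Because a semantic token can only point at a primitive, and a primitive only at a base value, a brand change is a one-line edit in the primitive table and nothing else moves.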

If every token is a semantic token, no token is semantic.

When the second product arrives

This is the moment most systems break. Product A uses spacing-lg for card padding. Product B needs a tighter grid. The “obvious” fix is to change spacing-lg — except now Product A’s cards look wrong.

The rule: never change a base or primitive token to satisfy a single product. Instead, create a new semantic layer for Product B:

  • Product A: card-padding: spacing-lg
  • Product B: card-padding: spacing-md

Same semantic token, different primitive behind it. The shared primitives stay untouched, so there are no regressions and no confusion.
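One way this could look in TypeScript, as a rough sketch: the primitives are shared and each product owns its own semantic mapping. The pixel values, the productA/productB keys and the cardPadding helper are assumptions for the example only.

```typescript
// Shared primitive spacing scale. The pixel values are invented for illustration.
const spacing = {
  "spacing-md": "16px",
  "spacing-lg": "24px",
} as const;

type SpacingToken = keyof typeof spacing;
type Product = "productA" | "productB";

// Each product owns its semantic layer; the primitives above are never edited.
const semanticByProduct: Record<Product, { "card-padding": SpacingToken }> = {
  productA: { "card-padding": "spacing-lg" },
  productB: { "card-padding": "spacing-md" },
};

// Look up the concrete card padding for a given product.
function cardPadding(product: Product): string {
  return spacing[semanticByProduct[product]["card-padding"]];
}
```

Typing the mapping as SpacingToken means a product can only point at a primitive that exists, so a typo in a token name becomes a compile error rather than a visual bug.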

Audit rhythm

Even a good token system drifts. I’ve learned to schedule a token audit every three months:

  1. Export all tokens in use across all products
  2. Find every token used fewer than three times
  3. Ask: “Is this a genuine need or a workaround?”
  4. Consolidate or delete

A token used once is a hack. A token used twice is a coincidence. A token used three times is a pattern worth naming.
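Steps 1 and 2 of the audit are easy to automate. Here is a small sketch, assuming you can already export usage as a flat list of token references; the TokenUsage shape and the findAuditCandidates name are hypothetical. It only sees tokens that are referenced at least once, so tokens that are defined but never used need a separate check against the token file.

```typescript
// Hypothetical usage record: one entry per reference to a token.
interface TokenUsage {
  token: string;
  product: string;
}

// Return tokens referenced fewer than `minUses` times across all products,
// with their counts, sorted by token name.
function findAuditCandidates(
  usages: TokenUsage[],
  minUses = 3
): { token: string; count: number }[] {
  const counts = new Map<string, number>();
  for (const { token } of usages) {
    counts.set(token, (counts.get(token) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .filter(([, count]) => count < minUses)
    .map(([token, count]) => ({ token, count }))
    .sort((a, b) => a.token.localeCompare(b.token));
}
```

The output is the list to take into step 3. The script cannot tell a genuine need from a workaround; that part stays a conversation.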

The pipeline

None of the above holds unless the tokens live in a single source of truth (JSON or TypeScript) and every product consumes them through a build step. No manual syncing, no design tool exports that drift from production. Figma writes to the same JSON that Tailwind reads.
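The build step itself can be very small. As a sketch, assuming a flat token file already parsed into an object, this turns it into CSS custom properties; the toCssVariables name and the sample values are mine, and a real pipeline would also read the file and write the output somewhere.

```typescript
// Hypothetical contents of a flat token file, as parsed from JSON.
const tokens: Record<string, string> = {
  "spacing-lg": "24px",
  "blue-500": "#2563eb",
};

// Emit a :root block with one CSS custom property per token.
// Keys are sorted so the generated file is stable between builds.
function toCssVariables(source: Record<string, string>): string {
  const lines = Object.keys(source)
    .sort()
    .map((name) => `  --${name}: ${source[name]};`);
  return [":root {", ...lines, "}"].join("\n");
}

const css = toCssVariables(tokens);
```

The same object could just as easily be mapped into a Tailwind theme or a mobile constants file. The point is that every output is generated from one input.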

It takes a day to set up. It saves weeks over the lifetime of the system.

Stephen.