I’ve fixed the same mess four times now. A design system starts clean — five core colours, three type scales, a handful of spacing values. Then the first product team needs “just one more” semantic token. Then the mobile team needs a different scale. Then someone on the marketing site hard-codes a hex because they couldn’t find the right variable.
Within six months, the token set has doubled and nobody trusts it. The system that was supposed to bring consistency has become a source of confusion.
Here’s why it happens and how to build tokens that actually survive.
The problem: naming, not numbers
Most token systems fail because of naming conventions, not the values themselves. Teams invent categories as they go — color-primary, color-brand-blue, color-cta-bg — until the namespace is a swamp.
The fix is a strict three-level hierarchy:
| Level | Purpose | Example |
|---|---|---|
| Base | Raw values | blue-500: #2563eb |
| Primitive | Intent-neutral roles | color-accent: blue-500 |
| Semantic | Context-specific | btn-primary-bg: color-accent |
Base tokens never change. Primitive tokens change when the brand evolves. Semantic tokens change when a component needs to behave differently in a new context — without cascading chaos.
If every token is a semantic token, no token is semantic.
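A minimal sketch of the three levels in TypeScript, assuming each level is a flat name-to-value map. The `TokenMap` type and the `resolveToken` helper are illustrative names of mine, not part of any particular library; the token names and the hex value come from the table above.

```typescript
// Illustrative type: each level is a flat map from token name to a value or another token name.
type TokenMap = Record<string, string>;

const base: TokenMap = { "blue-500": "#2563eb" };
const primitive: TokenMap = { "color-accent": "blue-500" };
const semantic: TokenMap = { "btn-primary-bg": "color-accent" };

// Hypothetical helper: follow references through the layers until a raw value is reached.
function resolveToken(name: string, layers: TokenMap[]): string {
  const seen = new Set<string>();
  let current = name;
  while (true) {
    if (seen.has(current)) {
      throw new Error(`Circular token reference: ${current}`);
    }
    seen.add(current);
    const layer = layers.find((l) =>
      Object.prototype.hasOwnProperty.call(l, current)
    );
    // No layer defines this name, so it is treated as a raw value.
    if (!layer) return current;
    current = layer[current];
  }
}

const resolved = resolveToken("btn-primary-bg", [semantic, primitive, base]);
console.log(resolved); // "#2563eb"
```

Because the semantic token only knows about `color-accent`, a brand change touches one line in the primitive map and nothing else.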
When the second product arrives
This is the moment most systems break. Product A uses spacing-lg for card padding. Product B needs a tighter grid. The “obvious” fix is to change spacing-lg — except now Product A’s cards look wrong.
The rule: never change a base or primitive token to satisfy a single product. Instead, create a new semantic layer for Product B:
- Product A: card-padding: spacing-lg
- Product B: card-padding: spacing-md
Same semantic name, different primitive underneath. Product A's mapping is untouched, so there are no regressions and no confusion.
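One way this could be wired up, sketched under some assumptions: the pixel values for `spacing-md` and `spacing-lg` are placeholders I made up, and `productSemantics` and `resolveFor` are hypothetical names. The point is only that the shared map never changes while each product carries its own semantic mapping.

```typescript
type TokenMap = Record<string, string>;

// Shared across products and never edited for a single product.
// The pixel values are placeholders, not taken from a real system.
const primitives: TokenMap = {
  "spacing-md": "16px",
  "spacing-lg": "24px",
};

// Each product owns its own semantic layer.
const productSemantics: Record<string, TokenMap> = {
  productA: { "card-padding": "spacing-lg" },
  productB: { "card-padding": "spacing-md" },
};

// Hypothetical helper: look up a semantic token for one product.
function resolveFor(product: string, token: string): string {
  const semantic = productSemantics[product];
  if (semantic === undefined) {
    throw new Error(`Unknown product: ${product}`);
  }
  const primitiveName = semantic[token];
  if (primitiveName === undefined) {
    throw new Error(`Product ${product} does not define ${token}`);
  }
  const value = primitives[primitiveName];
  if (value === undefined) {
    throw new Error(`Unknown primitive: ${primitiveName}`);
  }
  return value;
}

console.log(resolveFor("productA", "card-padding")); // "24px"
console.log(resolveFor("productB", "card-padding")); // "16px"
```

Components in both products ask for `card-padding` and never need to know which spacing step sits behind it.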
Audit rhythm
Even a good token system drifts. I’ve learned to schedule a token audit every three months:
- Export all tokens in use across all products
- Find every token used fewer than three times
- Ask: “Is this a genuine need or a workaround?”
- Consolidate or delete
A token used once is a hack. A token used twice is a coincidence. A token used three times is a pattern worth naming.
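The counting step of the audit is easy to automate. Here is a rough sketch, assuming you can already export a list of defined tokens and one record per usage; the `TokenUsage` shape, the `findRareTokens` name, and the sample data are all invented for illustration.

```typescript
// Hypothetical shape for one exported usage record.
interface TokenUsage {
  token: string;
  product: string;
}

interface AuditFinding {
  token: string;
  count: number;
}

// Return every token used fewer than `threshold` times, least used first.
function findRareTokens(
  definedTokens: string[],
  usages: TokenUsage[],
  threshold = 3
): AuditFinding[] {
  const counts: Record<string, number> = {};
  // Start defined tokens at zero so unused ones are flagged too.
  for (const name of definedTokens) counts[name] = 0;
  for (const usage of usages) {
    counts[usage.token] = (counts[usage.token] ?? 0) + 1;
  }
  return Object.keys(counts)
    .filter((token) => counts[token] < threshold)
    .map((token) => ({ token, count: counts[token] }))
    .sort((a, b) => a.count - b.count);
}

// Made-up sample data.
const definedTokens = ["card-padding", "btn-primary-bg", "promo-banner-bg"];
const usages: TokenUsage[] = [
  { token: "card-padding", product: "A" },
  { token: "card-padding", product: "A" },
  { token: "card-padding", product: "B" },
  { token: "btn-primary-bg", product: "A" },
  { token: "btn-primary-bg", product: "B" },
];

const findings = findRareTokens(definedTokens, usages);
console.log(findings);
// Flags promo-banner-bg (0 uses) and btn-primary-bg (2 uses); card-padding (3 uses) passes.
```

The script only produces the shortlist. Deciding whether each entry is a genuine need or a workaround is still a conversation with the team that added it.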
The pipeline
In practice, tokens should live in a single source of truth, either JSON or TypeScript, and be consumed by every product via a build step. No manual syncing, no design tool exports that drift from production. Figma writes to the same JSON that Tailwind reads.
It takes a day to set up. It saves weeks over the lifetime of the system.
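To give a feel for the build step, here is a deliberately small sketch that turns a token map into CSS custom properties. It assumes a flat map in which any value matching another token's name is a reference; the `tokens` object stands in for the parsed JSON file, and `toCssVariables` is a name I made up. A real pipeline would also emit whatever format each consumer needs, and would more likely lean on an existing token build tool than on a hand-rolled function.

```typescript
type TokenMap = Record<string, string>;

// Stand-in for the parsed contents of the single source-of-truth JSON file.
const tokens: TokenMap = {
  "blue-500": "#2563eb",
  "color-accent": "blue-500",
  "btn-primary-bg": "color-accent",
};

// Hypothetical build function: emit one CSS custom property per token.
// A value that matches another token's name becomes a var() reference.
function toCssVariables(source: TokenMap): string {
  const lines = Object.keys(source).map((name) => {
    const value = source[name];
    const isReference = Object.prototype.hasOwnProperty.call(source, value);
    const cssValue = isReference ? `var(--${value})` : value;
    return `  --${name}: ${cssValue};`;
  });
  return `:root {\n${lines.join("\n")}\n}`;
}

const css = toCssVariables(tokens);
console.log(css);
// :root {
//   --blue-500: #2563eb;
//   --color-accent: var(--blue-500);
//   --btn-primary-bg: var(--color-accent);
// }
```

Keeping references as `var()` calls rather than flattening them means the hierarchy survives into the browser, so overriding `--color-accent` in one place still cascades.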
— Stephen.