Reliability: When the Big Number isn’t the Whole Story

‘0.90 Reliability’ – Sounds Impressive, doesn’t it?

If you’ve ever looked at a personality tool’s manual, you’ve probably seen a statistic proudly displayed:

“Internal reliability = 0.90”

It sounds reassuring. Scientific. Solid.

Most people see a big number and think: Great. That must mean it’s good.

But here’s the question few people ask: Good at what?

Because a high reliability score does not automatically mean a model is useful, predictive, or even particularly insightful. It simply means the items hang together consistently. And consistency, on its own, is not the same as value.

Consistent… but doing what?

Imagine asking someone the same question ten times in slightly different ways.

You would probably get very consistent answers. In psychometric terms, your reliability statistic would look excellent.

But would you have learned anything new?

In psychometrics, it is possible to increase reliability by making items very similar to one another. That creates what we call internal consistency, but it also narrows the model and what it is actually able to measure.

Facet5’s own technical documentation discusses how reliability can be inflated by increasing item similarity or by adding more items to a scale.

That doesn’t make a tool wrong. But it does mean the headline number doesn’t tell the whole story.
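
To make the inflation effect concrete, here is a minimal sketch using the standardised form of Cronbach’s alpha, which depends only on the number of items and the average correlation between them. The function name and the example figures are illustrative assumptions, not values taken from any Facet5 documentation.

```python
def standardised_alpha(n_items: int, mean_inter_item_r: float) -> float:
    """Standardised Cronbach's alpha: k * r / (1 + (k - 1) * r).

    n_items           -- number of items in the scale (k)
    mean_inter_item_r -- average correlation between items (r)
    """
    if n_items < 2:
        raise ValueError("a scale needs at least two items")
    k, r = n_items, mean_inter_item_r
    return (k * r) / (1 + (k - 1) * r)


# Hypothetical scale: 10 items with a modest average inter-item correlation.
baseline = standardised_alpha(10, 0.30)

# Route 1: double the number of items, leaving item similarity unchanged.
more_items = standardised_alpha(20, 0.30)

# Route 2: keep 10 items but make them more alike (higher average correlation).
more_similar = standardised_alpha(10, 0.47)

print(f"10 items, r = 0.30: {baseline:.2f}")      # 0.81
print(f"20 items, r = 0.30: {more_items:.2f}")    # 0.90
print(f"10 items, r = 0.47: {more_similar:.2f}")  # 0.90
```

Both routes reach the same headline figure of roughly 0.90, yet neither says anything about whether the scale measures something broader or more useful than it did before.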

The Real Question Leaders Care About

Not many leaders wake up in the morning and ask: “What is the average inter-item correlation of our personality instrument?”

They are asking:

Can I trust this insight?
Will this hold up over time?
Does it help me make better decisions?
Does it translate into performance?
Does it help me lead better?!

That’s where reliability becomes meaningful. If a personality model fluctuates wildly every time someone completes it, you can’t build development plans on it. If it is so narrow that it reduces people to one simplified dimension, you may get clarity, but not depth. And over-simplification creates excuses.

Reliability is valuable when it supports stability without oversimplifying complexity.

Stability that Supports Growth

Personality traits tend to stabilise in adulthood. That’s what allows us to recognise patterns in ourselves over time.

Facet5 distinguishes between internal consistency and stability over time. Stability matters because it allows organisations to use personality insight as a long-term reference point, not just a moment-in-time snapshot.

When you are investing in leadership development, succession planning or team calibration, you need confidence that the underlying pattern is steady enough to build on.

Otherwise, you are building scaffolding on shifting sand.

Reliability is only Half the Equation

There is another distinction that matters even more than reliability:

Validity.

A tool can be highly reliable, very consistent, and still not tell you anything useful.

Reliability answers: “Is it consistent?”

Validity answers: “Is it measuring what it claims to measure?”

Facet5 has been examined against established models and studied in relation to real workplace outcomes. That means the model is not only internally coherent – it is aligned with recognised personality science and linked to practical performance contexts.

And that’s where the “so what” lives. Because leaders don’t buy reliability. They buy impact.

When Years of Research isn’t the Point

You will also often see phrases like: “Backed by decades of research.”

Longevity is reassuring. But again, the deeper question is:

What has that research actually demonstrated?
Has the model been continually reviewed and refined?
Has it been tested across different cultures?
Are norms updated and stability reviewed?

Facet5 norms, for example, are regularly reviewed and tested using effect size calculations to determine whether changes materially affect interpretation.

That doesn’t make for flashy marketing copy. But it does mean the model is maintained with care.
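
As a rough illustration of what an effect size check on norms can look like, the sketch below compares an older and a newer norm group using Cohen’s d. This is an assumption made for the sake of the example: the source does not say which effect size statistic Facet5 uses, and the means, standard deviations and sample sizes are invented.

```python
import math


def cohens_d(mean_a: float, sd_a: float, n_a: int,
             mean_b: float, sd_b: float, n_b: int) -> float:
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_b - mean_a) / math.sqrt(pooled_var)


# Hypothetical figures for one scale: old norm group versus new norm group.
d = cohens_d(mean_a=5.0, sd_a=2.0, n_a=1000,
             mean_b=5.2, sd_b=2.0, n_b=1000)

print(f"Cohen's d = {d:.2f}")  # 0.10
```

By Cohen’s widely used convention, a d of around 0.2 counts as small, so a shift of 0.10 like this one would be unlikely to change how an individual score is interpreted. With groups this large, a significance test could still flag such a shift as statistically significant, which is why an effect size is the more informative check.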

So, what should you look for?

When evaluating a personality tool, instead of focusing on the biggest number in the brochure, consider asking:

Is the model stable enough to support long-term development?
Is it broad enough to capture real behavioural nuance?
Is there evidence it connects to workplace performance?
Are norms transparent and responsibly updated?
Does the insight translate into better conversations, better relationships, better decisions, better performance?

Because in the end, the value of personality insight isn’t found in a coefficient.

It’s found in whether leaders say:

“This helps me understand how I show up.”
“This helps my team work better together.”
“This gives me something I can act on.”

Reliability is part of the scaffolding. It ensures the structure holds. But scaffolding exists for a reason: to support something being built. And if a personality model doesn’t help you build better leadership, stronger teams, and clearer development pathways, then the size of the reliability coefficient doesn’t really matter.
