The Dangers of Personality Profiling (AKA the fallacy of applying complicated models to complex domains)

I find that some of my ideas take a few weeks, months or even years to form. This one took almost exactly a year before coalescing (coagulating?) in my mind. I’ve been thinking about personality tests in the context of efficacy, equity and neurodiversity recently, and it troubles me.

I’ve always found personality testing problematic – indeed, I find any Jungian approach to sorting people into type categories highly distasteful and potentially harmful.

Critical literacy is sorely lacking in the business and management world. This is largely, I suspect, because it’s not rewarded: we reward confidence, sticking by decisions, bullishness and simple answers to complex problems.

With respect to DEI, I just can’t square the desire to categorise people and their personalities with the very real need for inclusion and diversity of ways of thinking. It seems simply antithetical.

To summarise the flaws in personality testing:

  • There is very little evidential basis behind personality profiling, and significant evidence against it.
  • The models are often based on false dichotomies such as “big picture vs detail-oriented”, when there is no evidence that such binaries exist.
  • The models are usually based on WEIRD societies, and fail to recognise collectivist, holistic strengths. They rarely address context and inter-relational behaviours, but make assumptions about behaviour from individualistic measures.
  • These tools can lead to false assumptions about other people and the way they behave.
  • The tools may be used for unethical (and illegal) practices such as recruitment, selection for promotion, or other decisions made about someone.
  • They are one of the most highly weaponised management tools ever created.
  • Because they lead people to believe that they can understand someone based upon a profile, they can prevent further examination and effort to understand people and their uniqueness.
  • The algorithms used are rarely open. Algorithms inherit the biases of those people that created them, so if we are making ourselves subject to analysis by algorithm, I want to know what it’s doing.
  • Many tests are biased (see above) – for example, the Big Five was shown to be biased against women, categorising them as more aggressive than men who answered identically, because the original data model was flawed.
  • When assigned a profile, we are not allowed to dispute it. Even though we have each spent decades inside our own minds, a five-minute test is assumed to know more about us than we do.

Even scientists who are most concerned with assessing individual differences in personality would concede that our ability to predict how particular people will respond in particular situations is very limited.

Personality, strength, or psychometric models such as Myers-Briggs, DISC, Belbin, Predictive Index, Tilt and the myriad others available, attempt to codify people and their preferences, personalities, behaviours and values into archetypes, using fixed (usually proprietary and opaque) algorithms. There is usually a good (commercial) reason these tests are closed-source, since it prevents detailed analysis and evaluation of the algorithm.

Repeatability is not the same as validity

These archetypes (such as “Maverick” or “Inventor”) are then categorised and collated into larger group types, and in many organisations used to inform everything from role selection and management approach to hiring decisions (which is illegal in many cases).

In 20 years of management, I have never seen a psychometric analysis tool generate a constructive outcome, particularly from a diversity, equity and inclusion (DEI) perspective. I also find it interesting that personality testing *only* exists in the business world, not in the academic world of actual psychological study. Do business managers actually think they know something psychologists don’t?

In my opinion (somewhat backed up by many years of experience and study), categorising people and simplifying the complexities of our nature, in an attempt to make others and ourselves more predictable, is certainly a seductive proposition. But it is also error-prone and dangerous.

Psychometric analyses don’t work. Indeed, they are often damaging.

The reason they will never work is that they try to map a complicated framework onto a complex problem. You may be familiar with Carl Jung and his “12 Archetypes” – “Ruler, Sage, Explorer” and so on – which are frequently criticised as mystical or metaphysical essentialism. Because the archetypes are defined so vaguely, and because Jungians have observed archetypal images in an essentially infinite variety of everyday phenomena, they are neither generalisable nor specific enough to be researched or demarcated with any rigour. Hence they elude systematic study – as is true of many other attempts to reduce complex problems and systems to simple, archetypal models and solutions.

As Cynefin shows us, complicated systems can be really big, and appear complex, but the laws of cause and effect don’t change. When you press the A/C button in your modern car (which is “complicated”), the A/C comes on, and the same thing happens every subsequent time you do it. This is rather obviously not the case with people.

In a complex system such as a human being, asking a teammate to help you out with a task one day results in them helping you, but on another day, they might tell you to stick it; maybe they’re too busy, maybe they’re tired, or maybe they just don’t feel like helping. Cause and effect change in complex systems, and humans are complex. Really complex. Which is why “the soft stuff is the hard stuff”.

Complicated systems can seem messy, but an action produces the same result each time. People are not like that. They are complex, and groups of people even more so. Cause and effect change constantly – pressing the equivalent of that A/C button on a complex human has one effect today and a different effect tomorrow.
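The distinction can be sketched in code. This is a deliberately toy illustration, not a formal model: the function names, the car state, and the “mood” values are all invented for the example, with the hidden mood standing in for all the context we can never observe.

```python
import random

random.seed(0)  # seeded so the sketch is repeatable

def press_ac_button(car_state):
    """A complicated system: the same input yields the same
    output, every single time."""
    car_state["ac_on"] = True
    return car_state["ac_on"]

def ask_for_help(colleague_name):
    """A complex system: the 'same' request lands on hidden,
    shifting state (mood, workload, history) that no fixed
    profile can capture."""
    mood = random.choice(["rested", "tired", "swamped"])
    return mood == "rested"

car = {"ac_on": False}
print(all(press_ac_button(car) for _ in range(100)))
# True: in the complicated system, cause and effect are fixed.

print({ask_for_help("Sam") for _ in range(100)})
# Both True and False appear: same cause, different effects.
```

The point of the sketch is only that a fixed algorithm (which is what every psychometric tool is) can model the first kind of system, but not the second.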

And that is why personality, psychometric, “strength” tests etc will never work in the way people desire them to.

All models are wrong. Some are useful.

The problem comes when you apply a model to a complex problem on the assumption that it’s right.

“It ain’t what you don’t know that hurts you, it’s what you know for sure that just ain’t so.”

And the people selling these systems either know this, in which case they’re selling snake oil, or they don’t, in which case they’re simply being optimistically gullible, looking for simple answers to complex problems. To be fair, we humans are almost infinitely susceptible to the seductive simplicity of personality archetypes, even more so when they’re about us. This is known as the Barnum effect: it’s possible to give everyone the same description, and people will nevertheless rate it as very accurate.

Flawed evidence of personality test reliability

MBTI fails on both validity and reliability tests, as do most other personality and psychometric tools. Proponents (usually people selling them) are keen to point out reliability measures showing that, within a degree of error, the same person taking the same test at a different time often obtains a similar result. This only serves to highlight the problem, however. If you ask me my favourite colour today I’ll say yellow, and I’d usually give the same answer a month later – but it doesn’t follow that my favourite colour has anything to do with my personality, nor that my personality is stable over time. Equally, I may be lying. My favourite colour is actually blue.
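The reliability-without-validity trap can be made concrete with a deliberately absurd sketch. The archetype names and the hashing “test” below are invented for the illustration; no real tool works this way, which is exactly the point.

```python
import hashlib

# Invented archetype labels, echoing the kind these tools use.
ARCHETYPES = ["Inventor", "Maverick", "Ruler", "Sage"]

def personality_test(name: str) -> str:
    """A perfectly 'reliable' test: the same person always gets
    the same archetype, because the result is nothing more than
    a hash of their name."""
    digest = int(hashlib.sha256(name.encode()).hexdigest(), 16)
    return ARCHETYPES[digest % len(ARCHETYPES)]

# Test-retest reliability is 100% - repeat it as often as you like.
assert personality_test("Alex") == personality_test("Alex")
# Yet the archetype measures nothing whatsoever about Alex:
# perfect reliability, zero validity.
```

A repeatable result proves only that the algorithm is deterministic, not that it measures anything real.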

Most of these systems assume dichotomies, or even force them – you are either X or Y; you cannot be both, and you cannot change from one to the other. This has been disproven too.

We should all be suspicious of closed-source algorithms that describe us or make decisions about us, and psychometric tests are no different. Predictive Index have repeatedly declined to open source their algorithm, ostensibly to protect their intellectual property.

Even the most “trusted” test of all in academia, the “Big Five”, has been found to be systematically sexist. Criticism of MBTI and others extends further, due to their highly westernised, English-language-first approach.

Dangerous tools?

Evidence shows that, far from being a “short-cut” to more insightful leadership, tools such as these can be harmful – they may convince managers that they’re doing “good management”, and discourage further effort to improve management and leadership behaviours. At worst, they’re actively discriminatory and detrimental to individual and team performance, reducing the quality of human interactions and decreasing levels of psychological safety.

Finally, I’ve never come across a strongly competent leader who used personality testing and categorisation. It seems to me (and I’m conscious that I’m biased) that these tests are a form of plastic empathy. A way to feel like you’re understanding people, without actually putting in the effort to do so.

What do you think? Are they a useful tool, or a dangerous over-simplification?

Resilience Engineering, DevOps, and Psychological Safety – resources

With thanks to Liam Gulliver and the folks at DevOps Notts, I gave a talk recently on Resilience Engineering, DevOps, and Psychological Safety.

It’s pretty content-rich, and here are all the resources I referenced in the talk, along with the talk itself, and the slide deck. Please get in touch if you would like to discuss anything mentioned, or you have a meetup or conference that you’d like me to contribute to!

Red Hat Open Innovation Labs

Open Practice Library

Resilience Engineering and DevOps slide deck

Resilience engineering – Where do I start?


Turn the Ship Around! by David Marquet

Lorin Hochstein and Resilience Engineering fundamentals


Scott Sagan, The Limits of Safety:
“The Limits of Safety: Organizations, Accidents, and Nuclear Weapons”, Scott D. Sagan, Princeton University Press, 1993.


Sidney Dekker: “The Field Guide to Understanding Human Error”, 2014


John Allspaw: “Resilience Engineering: The What and How”, DevOpsDays 2019.


Erik Hollnagel: Resilience Engineering


Jabe Bloom, The Three Economies

The Three Economies: An Introduction


Resilience vs Efficiency

Efficiency vs. Resiliency: Who Won The Bout?


Tarcisio Abreu Saurin – Resilience requires Slack

Slack: a key enabler of resilient performance


Resilience engineering and DevOps – a deeper dive



Symposium with John Willis, Gene Kim, Dr Sidney Dekker, Dr Steven Spear, and Dr Richard Cook: Safety Culture, Lean, and DevOps


Approaches for resilience and antifragility in collaborative business ecosystems: Javaneh Ramezani, Luis M. Camarinha-Matos


Learning organisations:
Garvin, D.A., Edmondson, A.C. and Gino, F., 2008. Is yours a learning organization? Harvard Business Review, 86(3), p.109.


Psychological safety: Edmondson, A., 1999. Psychological safety and learning behavior in work teams. Administrative science quarterly, 44(2), pp.350-383.

The Four Stages of Psychological Safety, Timothy R. Clark (2020)

Measuring psychological safety:


And of course the YouTube video of the talk:

Please get in touch if you’d like to find out more.