Resilience Engineering, DevOps, and Psychological Safety – resources

With thanks to Liam Gulliver and the folks at DevOps Notts, I gave a talk recently on Resilience Engineering, DevOps, and Psychological Safety.

It’s pretty content-rich, and here are all the resources I referenced in the talk, along with the talk itself, and the slide deck. Please get in touch if you would like to discuss anything mentioned, or you have a meetup or conference that you’d like me to contribute to!

Red Hat Open Innovation Labs

Open Practice Library

Resilience Engineering and DevOps slide deck

Resilience engineering – Where do I start?

Resilience engineering: Where do I start?

Turn the Ship Around! by L. David Marquet

Lorin Hochstein and Resilience Engineering fundamentals


Scott Sagan, The Limits of Safety:
“The Limits of Safety: Organizations, Accidents, and Nuclear Weapons”, Scott D. Sagan, Princeton University Press, 1993.


Sidney Dekker: “The Field Guide to Understanding Human Error”, Sidney Dekker, 2014.


John Allspaw: “Resilience Engineering: The What and How”, DevOpsDays 2019.


Erik Hollnagel: Resilience Engineering





Jabe Bloom, The Three Economies

The Three Economies an Introduction


Resilience vs Efficiency

Efficiency vs. Resiliency: Who Won The Bout?


Tarcisio Abreu Saurin – Resilience requires Slack

Slack: a key enabler of resilient performance


Resilience engineering and DevOps – a deeper dive

Resilience Engineering and DevOps – A Deeper Dive


Symposium with John Willis, Gene Kim, Dr Sidney Dekker, Dr Steven Spear, and Dr Richard Cook: Safety Culture, Lean, and DevOps


Approaches for resilience and antifragility in collaborative business ecosystems: Javaneh Ramezani, Luis M. Camarinha-Matos:


Learning organisations:
Garvin, D.A., Edmondson, A.C. and Gino, F., 2008. Is yours a learning organization? Harvard Business Review, 86(3), p.109.


Psychological safety: Edmondson, A., 1999. Psychological safety and learning behavior in work teams. Administrative Science Quarterly, 44(2), pp.350-383.

The four stages of psychological safety, Timothy R. Clark (2020)

Measuring psychological safety:


And of course the YouTube video of the talk:

Please get in touch if you’d like to find out more.

A Short Critique of SAFe – The Scaled Agile Framework


This is a critique of the Scaled Agile Framework (SAFe).

It’s a critique, so it’s pretty negative! There are benefits to using SAFe, and some very good use cases for full or partial adoption, as long as your eyes are open to the problems with SAFe, and your reasons for adopting it are sound.

However, here I’m describing ten key reasons why it might not be the magic bullet for an organisation looking to scale technology delivery. I’m really interested in your opinion, so please do get in touch if you wish to make a comment or suggestion.

Problems and issues of SAFe:

  1. Encourages normalisation of batch sizing across teams, incentivises increasing task sizes, and fundamentally misappropriates what story points are for.
  2. Causes increased localised technical debt.
  3. Creates a conflict with support, operational and SRE functions.
  4. Decreases inter-team collaboration.
  5. Uses fallacies in estimation.
  6. Decreases the agile focus on value in favour of “what management wants”.
  7. Decreases the utility of, and focus on, retrospectives.
  8. Is not Agile – it encourages top-down, large-batch planning rather than small, iterative, feedback loops.
  9. Is framed as a solution, rather than a stage in a journey.
  10. Scales up the solution rather than scaling down the problem.

1 – Nothing in agile suggests that we need to, or even *should*, measure work units (i.e. story points) in a uniform manner across teams. Story points exist to help the people *doing* the work break things down into an optimum “batch size”, which makes deliverables achievable and less complex, and facilitates flow. Indeed, SAFe actually encourages larger batch sizes through front-loaded planning, not smaller sizes planned through more iterative methods.

SAFe tries to normalise story points across teams for various reasons, but behind it there is often a strong desire to measure and compare the delivery of teams and people. This is not what story points are for. Story points do not exist to measure how “productive” developers are.

2 – Technical debt tends to increase in SAFe organisations because the prioritisation of dealing with it is raised to management level rather than team level. This is counter-productive for technical debt that originates at the team level (which most of it does). Management will tend to prioritise features and functions, delaying the pay-back of localised technical debt, and resulting in slower, higher-risk, more brittle systems.

3 – If SAFe is applied to more operational functions, such as technology support, operations, or SRE, conflicts between delivery and support functions arise, because supporting teams typically need to work either responsively, dealing with issues as they arise, or on very short cycles – not the Programme Increment cycle time imposed by SAFe.

4 – Due to the focus on deliverables and accountability through project or product managers, teams may be discouraged from assisting each other, as they are measured by their own delivery; how much they assist other teams is rarely valued.

5 – The concept of “ideal dev days” is often used for estimating in SAFe. Everyone else knows that ideal dev days are a fallacy. Instead, look at past similar deliverables, and see how long they took. This is a much better predictor, and is less susceptible to optimism bias or wanting to please the boss.
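As a rough illustration of estimating from past deliverables rather than from “ideal dev days”, here is a minimal Python sketch. It assumes you have recorded how many calendar days a handful of similar past deliverables actually took; the function name `forecast_days`, the 85th percentile default, and the sample numbers are all made up for the example, not taken from SAFe or from any particular tool.

```python
from statistics import quantiles


def forecast_days(past_durations, percentile=85):
    """Return the number of days within which `percentile`% of past,
    similar deliverables were finished (linear interpolation)."""
    if len(past_durations) < 2:
        raise ValueError("need at least two past deliverables to forecast")
    if not 1 <= percentile <= 99:
        raise ValueError("percentile must be between 1 and 99")
    # quantiles(..., n=100) returns 99 cut points: index 0 is the 1st
    # percentile, index 98 is the 99th.
    cut_points = quantiles(past_durations, n=100, method="inclusive")
    return cut_points[percentile - 1]


# Hypothetical history: actual elapsed days for eight similar deliverables.
history = [3, 4, 5, 5, 6, 8, 13, 21]

typical = forecast_days(history, percentile=50)   # a "coin flip" estimate
cautious = forecast_days(history, percentile=85)  # a safer commitment
print(f"Typical: {typical} days, cautious: {cautious} days")
```

Quoting a percentile instead of a single “ideal” number keeps the long tail of past overruns visible, which is exactly what optimism bias tends to hide.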

6 – The concept of “value” often breaks down in SAFe, through a focus on volume of delivery and meeting the (often arbitrary) deadlines imposed by management in PI planning. As a result, what end-users actually want is often ignored in favour of what management wants.

7 – PI planning includes a small element of retrospective activity, but it’s too little, too late. The retrospective feedback loops need to be short and light, not tagged on to PI planning as an afterthought.

8 – Agile was created as a response to frustrations felt across the industry with the heavyweight, top-down project management methodology that was killing the sector. Trying to scale Agile up by applying heavyweight, top-down methodologies is antithetical to that.

9 – Some SAFe practitioners describe it as a transition stage, a process through which organisations can achieve increased capability at scale. I would agree: if an organisation feels the need to adopt SAFe, it should be as training wheels, a structure through which great capabilities can be built, before throwing off the shackles of a rigid, top-down framework. But if SAFe really were a transitional framework, why does the SAFe model not include anything about the transition away from it?

10 – In reality, most organisations don’t need SAFe. They’re not so big that they need such a big solution. SAFe is a comfort blanket for organisations used to traditional, slow, heavyweight, command-control structures. Your projects and products actually aren’t that big – and if they are, then that’s the problem, not the management process.

Fundamentally, SAFe tends to ignore, or encourages management to ignore, the possibility that those closest to the work might be the best equipped to make decisions about it. Scale the work down, not the process up. SAFe fits the delivery model to the organisational structure, rather than forcing the organisation to adopt new ways.