DevOps was, and to a degree still is, a “ground-up” phenomenon. It came to be adopted, adapted and evolved by engineering teams before “management” really understood what it was.
The openness and flexibility that DevOps espoused meant that it could be interpreted in different ways by different teams in different contexts. This was a key strength because, unlike with rigid frameworks such as ITIL, the people responsible for doing the work were able to modify DevOps and apply it to their own work in the ways that best suited them.
But this loose definition also proved to be a weakness. Because there were no limits to how DevOps could be interpreted and applied, it was often (and still is) interpreted as a technology solution rather than a cultural change. This resulted in “DevOps engineers” or “DevOps teams” whose remit is focussed on cloud technology, CI/CD pipelines, or automation.
As a result, we’re still far behind where we could be as an industry. Despite everyone in technology knowing the term “DevOps” and almost every firm adopting DevOps practices to some degree, these transformations have often stuttered or even failed, in part because it’s unclear to many what DevOps really is and how to “do” DevOps.
Resilience Engineering (RE) is a field of applied research that considers the organisational-scale capability to anticipate, detect, respond to and adapt to change. The principle of socio-technicality is core to RE: the premise that you can’t separate people from technology. If you change the technology, it will affect people, and if you change the way people work or communicate, or the way teams are structured, it will affect the technology created or consumed by those people.
RE as a field has been around for almost two decades, but only now (for various reasons, including the Covid pandemic) is it beginning to reach mainstream discussion and to be mentioned in the same conversations as digital and organisational transformation.
Researchers and practitioners of RE are quick to clarify what RE is, and is not, during these discussions. Whilst it may seem dogmatic to be so strict about what is within the remit of the field, I think this could be a valuable lesson learned from one of the weaknesses of DevOps. For organisations to successfully adopt and adapt to a new operating model and set of principles such as RE, it’s essential to understand very clearly what it is.
Resilience Engineering, despite being nearly twenty years old as a field, is still somewhat embryonic in its adoption outside a narrow circle of specialist researchers and practitioners. As such, it’s crucial that we define accurately what it is and what it is not, and that we resist attempts (intentional or unintentional) to co-opt the term to mean something more akin to chaos engineering, automation, or system hardening efforts.
A balance must be struck between defining accurately what RE is, and tolerating (indeed, encouraging) flexibility of interpretation and adoption in different contexts. DevOps was maybe too loose in this respect; other paradigms such as ITIL or SAFe were maybe too strict and dogmatic. Maybe with Resilience Engineering, the sweet spot will be found.