Chaos management during a major incident

Aish Raj Dahal at dotScale 2017

No software system today is fully failure-resistant, so software teams need to be able to handle major production incidents nimbly. Yet responding to a major outage is a painful operational exercise that can at times require multiple stakeholders to work together. In this talk, Aish discusses how to deal efficiently with the human element when complex systems fail.