A whole-railway reliability approach to planning for things that will probably never happen


Author: Andrew Love
Day: Aspect Day One
Session: Systems

Many railways understandably focus their efforts to improve safety and reliability on eliminating the causes of the most commonly experienced incidents. However, this approach neglects the mitigation of low-probability, high-impact incidents, such as wide-scale power or communication failures and earthquakes, that might be experienced only once in the life of a transit network but would be significant (potentially catastrophic) should they occur. Although such risks exist throughout (and beyond) the railway system, the role of communications, telemetry and operational control in mitigating or managing their impact means that addressing these issues falls squarely within the remit of the IRSE's members.

In this paper, I will discuss:
• The need for a structured, quantitative approach to identifying and assessing potential threats, so that an appropriate level of attention is given to mitigating low-probability events and the (sometimes hidden) dependencies between systems can be identified.
• The use of a whole-railway resilience model to identify the risks and mitigations arising from the human components of the railway system and the dependencies on interfaces outside the railway system (e.g. utilities), as well as the more obvious risks and mitigations arising from its technological components. This methodology allows a wider range of mitigations to be considered, and moves the approach to resilience from being asset-focused (e.g. "Do we need a backup control room?") to being enterprise-focused (e.g. "How will we cope if the control room becomes unavailable?"), enabling a more holistic approach to planning and investment.
• The importance of assessing the environment in which the railway operates; this covers the criticality of the railway to the wider environment as well as the evolving threats in the global environment, including geopolitical, meteorological, technological, medical, commercial and social issues.
• Practical steps by which the impact of low-probability failures can be designed out or mitigated through procedure or monitoring systems. This will include measures to ensure that critical resources are available to work under scenarios that also affect their out-of-work environment.
• The informal mitigations that are already in place in many railway enterprises, and how these can be formalised and protected during organisational change.
• How railways can test their resilience to ensure that mitigations are effectively implemented.