Operability

Good operability requires insight into the solution’s current activity, along with mechanisms for controlling its behavior on the fly.

Many design techniques contribute to good operability.

Measurability generates metrics during activity, which can be live streamed to a dashboard, showing operators what’s happening.

Auditability generates logs, which can be searched to investigate the system’s behavior.

Testability may include a heartbeat feature that helps determine the health of the system.
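A heartbeat check can be sketched in a few lines of Python. This is only an illustration: the class name `Heartbeat`, the `max_silence_seconds` parameter, and the injectable `clock` are assumptions made for the example, not part of any particular framework.

```python
import time


class Heartbeat:
    """Reports a component as healthy while it keeps sending beats."""

    def __init__(self, max_silence_seconds, clock=time.monotonic):
        # clock is injectable (hypothetical design choice) so the check can
        # be exercised without waiting in real time.
        self._max_silence = max_silence_seconds
        self._clock = clock
        self._last_beat = None

    def beat(self):
        """Called by the monitored component on every cycle of its work."""
        self._last_beat = self._clock()

    def is_healthy(self):
        """True if a beat arrived within the allowed silence window."""
        if self._last_beat is None:
            return False
        return self._clock() - self._last_beat <= self._max_silence
```

The monitored component calls `beat()` from its main loop, and an operations view or monitoring probe calls `is_healthy()`. A component that has never sent a beat is treated as unhealthy, which is one possible choice among several.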

Added functionality can further improve operability.

Being able to change configuration on the fly can be a powerful tool, but it comes at a cost: either the code must be written to regularly check the underlying configuration files for changes, or the user interface needs extra operations views for changing various settings.
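The file-checking variant might look like the following minimal sketch. It assumes a JSON configuration file and detects changes by comparing modification times; the class name `ReloadableConfig` and its methods are hypothetical names chosen for the example.

```python
import json
import os


class ReloadableConfig:
    """Re-reads a JSON config file whenever its modification time changes."""

    def __init__(self, path):
        self.path = path
        self._mtime_ns = None
        self._values = {}
        self.refresh()

    def refresh(self):
        """Reload the file if it changed; return True when a reload happened."""
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns == self._mtime_ns:
            return False
        with open(self.path, encoding="utf-8") as handle:
            self._values = json.load(handle)
        self._mtime_ns = mtime_ns
        return True

    def get(self, key, default=None):
        """Return the current value of a setting."""
        return self._values.get(key, default)
```

A service would call `refresh()` on a timer or at the start of each request and read settings through `get()`. A production version would probably also need to validate the new values and to cope with a file that is only half written at the moment it is read.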

The ability to initiate a controlled shutdown is also something I often build into my designs. It usually goes hand in hand with a design principle stating that all services must reject calls once a shutdown has been initiated.

A controlled shutdown can also be designed to allow already received transactions to finish before completely shutting down all subsystems.
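The two shutdown rules above, rejecting new calls and letting received transactions finish, can be combined in one small component. The sketch below is one possible shape, not a prescribed design; `ShutdownGate`, `try_enter`, and `leave` are hypothetical names.

```python
import threading


class ShutdownGate:
    """Tracks in-flight transactions and rejects new ones once shutdown starts."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0
        self._shutting_down = False

    def try_enter(self):
        """Admit a new transaction unless a shutdown has been initiated."""
        with self._cond:
            if self._shutting_down:
                return False
            self._in_flight += 1
            return True

    def leave(self):
        """Mark one admitted transaction as finished."""
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def shutdown(self, timeout=None):
        """Stop admitting work, then wait for in-flight work to finish.

        Returns True if everything drained, False if the timeout expired.
        """
        with self._cond:
            self._shutting_down = True
            return self._cond.wait_for(lambda: self._in_flight == 0, timeout)
```

Each service wraps its work in `try_enter()` and `leave()`, rejecting the call when `try_enter()` returns False. The operator-facing shutdown command calls `shutdown()` and stops the subsystems only after it returns True, or after deciding that the timeout justifies a forced stop.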

Similarly, a controlled startup can be designed to hold continuously incoming service requests in the queues between the synchronous services and their asynchronous backend counterparts until all critical inner subsystems have started. Again, this may require a design principle dictating that backend services reject their incoming calls (from the queues) until the other backend services are ready.
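As a rough, single-threaded sketch of that startup rule, the backend below leaves requests in the queue until every required subsystem has reported in. The names `StartupGate` and `drain`, and the subsystem labels in the usage example, are hypothetical.

```python
import queue


class StartupGate:
    """Records which critical subsystems have reported that they are ready."""

    def __init__(self, required):
        self._required = set(required)
        self._ready = set()

    def mark_ready(self, name):
        """Called by a subsystem once it has finished starting up."""
        self._ready.add(name)

    def all_ready(self):
        """True once every required subsystem has reported in."""
        return self._required <= self._ready


def drain(requests, gate, handler):
    """Process queued requests only if every required subsystem is ready.

    Returns the number of requests handled; while the gate is closed the
    requests simply stay in the queue.
    """
    if not gate.all_ready():
        return 0
    handled = 0
    while True:
        try:
            request = requests.get_nowait()
        except queue.Empty:
            return handled
        handler(request)
        handled += 1
```

A usage example, with made-up subsystem names:

```python
gate = StartupGate(["database", "ledger"])
requests = queue.Queue()
requests.put("payment-1")

drain(requests, gate, print)   # handles nothing, the request stays queued
gate.mark_ready("database")
gate.mark_ready("ledger")
drain(requests, gate, print)   # now the queued request is processed
```

Because the synchronous front end only ever writes to the queue, it can keep accepting requests during startup without knowing whether the backend is ready yet.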