Observable systems

One of the biggest challenges with software is that you can’t really see it run. Developers can while they’re writing the code: advanced debuggers let them step through it line by line and study its state as it runs. But once the software is deployed, it can no longer be inspected that way.

The traditional approach is to enable logging, which can emit enormous amounts of deeply technical, detailed information to log files that can be read as they are written. The problem with this approach is that writing so much log data slows down the running software and quickly fills up disk space. Operators therefore often switch logging off in production, or raise the log level so that only the more severe messages are written. The result is logging that isn’t detailed enough to diagnose a problem when it suddenly appears. Log levels can be changed, but you then have to wait for the problem to occur again to see what is logged.
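To make the log-level trade-off concrete, here is a minimal sketch using Python's standard logging module. The function run_order and the order messages are made up for illustration. The same code runs twice, once at a detailed level and once at the kind of level operators tend to choose in production.

```python
import io
import logging


def run_order(logger, order_id):
    # Hypothetical business operation that logs at two levels of detail.
    logger.debug("loading order %s", order_id)
    logger.debug("validating order %s", order_id)
    logger.warning("order %s took longer than expected", order_id)


def capture(level):
    # Run the operation with the given log level and return the lines written.
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    logger = logging.getLogger("demo." + logging.getLevelName(level))
    logger.setLevel(level)
    logger.propagate = False  # keep output out of the root logger
    logger.addHandler(handler)
    try:
        run_order(logger, 42)
    finally:
        logger.removeHandler(handler)
    return stream.getvalue().splitlines()


development = capture(logging.DEBUG)    # every message is written
production = capture(logging.WARNING)   # the debug detail is dropped

print(len(development), "lines at DEBUG")
print(len(production), "line at WARNING:", production)
```

At the higher level only the warning survives, so the steps that led up to it are gone by the time anyone looks.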

I recommend that you build in observability independently of any standard logging mechanism. Software should send meaningful information about what it is doing to a monitoring service, where it can be viewed in real time. And the software should send this information from all environments, test and pre-production included.
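As a rough sketch of what this could look like in code, the Python below builds a small event and posts it as JSON to a monitoring service. Everything specific here is an assumption made for illustration: the endpoint URL, the event fields, and the names build_event, http_post and report are hypothetical and do not describe the API of any particular monitoring service.

```python
import json
import time
import urllib.request

# Hypothetical endpoint; replace with the address of your own monitoring service.
MONITOR_URL = "https://monitoring.example.com/events"


def build_event(environment, source, message, **details):
    # Describe what the software is doing in business terms, not log jargon.
    return {
        "timestamp": time.time(),
        "environment": environment,  # e.g. "test", "pre-production", "production"
        "source": source,
        "message": message,
        "details": details,
    }


def http_post(event, url=MONITOR_URL, timeout=2.0):
    # Post the event as JSON; the short timeout keeps a slow monitor from stalling us.
    request = urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def report(event, send=http_post):
    # A failing monitoring service must never break the operation being observed.
    try:
        send(event)
        return True
    except Exception:
        return False


# Usage without a network: collect events in a list instead of posting them.
sent = []
ok = report(build_event("test", "order-service", "order accepted", order_id=42),
            send=sent.append)
print(ok, sent[0]["environment"], sent[0]["message"])
```

The environment travels with every event, so the same monitoring view can show test, pre-production and production side by side. Passing the transport in as a parameter is a design choice of this sketch; it lets the reporting code be exercised without a running service.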

My own sample monitoring service demonstrates the concept: mon.asp.ugleberg.dk.