
Ops Tools for Real World

From Our Ops Team to Yours


Ops Tools - they are everywhere, helping drive DevOps, Cloud Adoption, and trying to improve the sanity of overworked Ops teams.

But we're struck lately that many Ops Tools companies live in a fantasy land that few real-world ops teams or customers really inhabit. This fantasy land is where everything is 100% cloud, everyone is 100% DevOps, everything is new and shiny, code is changed as needed to emit metrics and logs, most stuff runs in Docker and/or is immutable, and presumably everything is well-understood and documented.

The only problem is that few, if any, of these things are true in the vast majority of ops shops. Of course most would like to inhabit such a world. But most have limited resources, lots of things to do, lots and lots of legacy, and a mountain of technical debt. The best that most can hope for is to start proper DevOps and/or cloud use on new projects, Dev/Test, and so on.

This means that Infrastructure-as-Code, hourly code pushes, Blue/Green plus Canaries, Docker schedulers, etc. are just not how real teams are managing their infrastructure day-to-day. This also means they need tools that help them with what they have, to play the ball where it lies, you might say.

Thus ops teams need tools that can reverse engineer what they have, auto-discover as much as possible, and provide both manual and automated ways to change and improve it. Ideally they would have audit tools that can find any major issues and recommend incremental improvements in security, reliability, performance, and so on.

Most Ops Tools Live in a Fantasy Land - Not Where Ops Lives

Such tools would include deep config discovery to tell users what they really have configured out in the field, what’s running where, how it’s connected, and the differences between supposedly identical machines and services, because as we all know, config drift happens. Great tools would also provide ways to change and update configurations of all the various services, such as turning off SSL v2 or adjusting MySQL buffers.
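To make the drift idea concrete, here is a minimal sketch of comparing settings across machines that are supposed to be identical. It assumes the settings have already been discovered and flattened into one dictionary per host; the host names, setting keys, and the `find_drift` helper are all hypothetical illustrations, not part of any particular product.

```python
def find_drift(configs):
    """Return {key: {host: value}} for every key whose value differs across hosts.

    `configs` maps a host name to a dict of setting -> string value.
    A host that lacks a key is reported with the value None.
    """
    all_keys = set()
    for settings in configs.values():
        all_keys.update(settings)

    drift = {}
    for key in sorted(all_keys):
        values = {host: settings.get(key) for host, settings in configs.items()}
        # More than one distinct value means the hosts disagree on this key.
        if len(set(values.values())) > 1:
            drift[key] = values
    return drift


# Hypothetical discovered settings for three "identical" web servers.
web_servers = {
    "web-01": {"ssl_protocols": "TLSv1.2 TLSv1.3", "worker_processes": "4"},
    "web-02": {"ssl_protocols": "TLSv1.2 TLSv1.3", "worker_processes": "8"},
    "web-03": {"ssl_protocols": "SSLv2 TLSv1.2", "worker_processes": "4", "gzip": "on"},
}

for key, per_host in find_drift(web_servers).items():
    print(key, per_host)
```

A real discovery tool would also have to collect and normalize the settings in the first place, which is the hard part; the comparison itself stays this simple.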

For clouds, they would do the same, including making changes and updates to things like AWS without using CloudFormation. Today, if a system was not built with CloudFormation or similar tools, it’s usually impossible to make any automated changes, yet all the tools just assume you built your system yesterday from scratch with their tool. Foolish assumption, this is.

Teams need real tools, for existing systems, that help solve real and varied problems, in complex & hybrid systems

Monitoring is everywhere and everyone is doing something, but real teams need help just dealing with floods of false alerts, plus adding better monitoring of key resources, then adding basic service-level monitoring, prediction, and anomaly detection. Extra points for any troubleshooting help at all, including automated graph configuration, rule-based advisors, event correlation, and, at the very advanced end, simple expert systems and some auto-healing to restart something at 3am.

Almost all real users need help with logging, especially collecting and searching logs, but ideally also alerting on them and analyzing them in ways engineers can use. They’d love it if the logs included OS and Cloud logs plus Java, PHP, etc., and Docker, HTTP, MySQL Slow/Error and more, as all these are needed by real teams trying to troubleshoot real systems at 3am when things go upside down.
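As a rough illustration of basic anomaly detection, the sketch below flags a metric sample when it sits far outside the recent history, instead of alerting on a fixed threshold. The rolling z-score approach, the `find_anomalies` helper, the window size, and the sample data are all assumptions chosen for illustration; production systems typically need something more robust to seasonality and noise.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=5, threshold=3.0):
    """Return indexes of samples more than `threshold` standard deviations
    away from the mean of the preceding `window` samples."""
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change at all as anomalous.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical CPU utilisation samples (percent), one per minute.
cpu_samples = [10, 11, 10, 12, 11, 10, 50, 11, 10, 12]
print(find_anomalies(cpu_samples))  # → [6]
```

Note that the spike itself inflates the standard deviation for the next few windows, so follow-on samples are not flagged; that is one reason simple schemes like this are only a starting point.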

Do such tools exist?

Yes, they do. In fact, they all exist on one platform, Siglos, the new Unified Ops Platform which does all the above and much more, for on-line systems of all types and sizes. Born in the crucible of a decade of real-world ops on hundreds of systems, Siglos is the right tool for the real Ops world.

More information at
