Enterprise IT has been living in the dark for too long. It’s time to step out into the light.
When I have a conversation with IT professionals about their monitoring strategies, they usually relax, sit back in their chair, and assure me it’s all taken care of. When we dig into the details together, they might get up to the whiteboard and start drawing diagrams about how their infrastructure generates events, how these are gathered, how they are filtered and correlated, and finally how they are presented to human operators. In some cases, people even whip out their laptops and start going through actual presentations.
Monitoring is complex, and people have developed and honed their strategies over years or even decades to address the specific needs of their organisation.
There is one question, however, that I can rely on to derail the discussion: “How often do you get a ticket for something that you had no idea people were using?”
The response I get is pretty uniform. Only once was there vehement denial – and it later turned out that the users were indeed doing plenty of things outside the view of IT. Most competent sysadmins are fully aware that there are all sorts of services being used that they were not consulted about. This has long been called “shadow IT”, but to my mind, this wording reveals more about how out of touch with their users many enterprise IT teams are.
The term shadow IT is used to refer to some sort of service, usually cloud-based, that users adopt directly without going through corporate IT approval processes. The most common early example of shadow IT was developers obtaining virtual servers from IaaS providers such as AWS, paying with a corporate credit card instead of dealing with long and complex procurement processes. These days, it’s just as likely to be a SaaS collaboration tool like Slack, and users are as likely to be in Marketing as in Development.
Until a decade ago, none of this was really possible. The only equivalent was a developer hiding a test machine under their desk – I’ve got a few stories along those lines from my own days as a sysadmin! This was something that only pretty technical users did. The worst damage anyone else could do was to install some godawful animated cursor or yet another browser toolbar.
These days, useful tools can be accessed for free or for a low monthly charge, and they can enable people to do their jobs better and more efficiently.
From whose point of view can this sort of unofficial IT be thought of as being dark or nefarious?
Central IT teams do have legitimate reasons to know about what services are being used in the company. This lets them save money by consolidating billing, forecast capacity and requirements accurately, and avoid unexpected security exposure. What can sometimes get lost is the fact that all of this is supposed to be in service of the users. If IT processes and procedures become bottlenecks between users and the support they need to get their jobs done, guess what? Users will interpret them as damage, and route around them.
There is a fundamental change in thinking required. From the point of view of the users, it’s not shadow IT that’s hidden; that’s easy to access and friendly to use. It’s the official corporate services that are difficult to discover, hard to use, and impossible to get support for. IT Operations is the Department of No, because that’s the answer users get all too often when they ask for something. Meanwhile, in their personal lives they are used to instantly-accessible services that are high-quality and easy to use. It’s no wonder they try to use those services for their jobs as well.
Sysadmins are charged with ensuring the uptime and availability of business services, not just the IT infrastructure. They also need to deliver new services and changes to existing services in a timely manner, so that IT does not act as a speed bump for the business. This means re-evaluating some assumptions about How Things Are Done.
So what if you can’t put your preferred monitoring agent inside the cloud service? There’s sure to be an API that will let you know the status. So what if your application architecture now sprawls far beyond the walls of your own data center? Look at the end-to-end user transaction, either real or simulated – or even better, both. So what if developers want to do a release every week instead of every six months? If it’s that routine, you can automate your part of the process, and then it doesn’t matter how often they want to release.
One problem – this has now multiplied the number of data sources for you to look at when you need to figure out whether everything is working properly. Again, the solution is a change in thinking.
Shadow IT – a result of IT Bottlenecks
Part of the reason IT is a bottleneck is this reliance on eyes on glass and hands on keyboard. If any new service requires lots of manual steps, those inevitably take time. If the business is asking IT to speed up delivery, those manual steps need to be automated.
The same goes for monitoring. If the preparation for go-live of a new service involves laboriously working out which metrics need to be monitored, what the baseline is, what the thresholds are for warning or minor or major alerts, and what the logic is for putting those alerts together into a coherent picture of the state of the service, that’s going to be a major drag on IT’s ability to deliver that new service.
The new approach to monitoring – what some are starting to call visibility, to emphasise the difference from the old ways – is to monitor everything, and use automated tools to figure out what’s important. No more building out massive rulesets, beautiful service maps with no relation to reality, or filters so complex they are about to attain sentience on their own.
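To make "use automated tools to figure out what's important" a little more concrete, here is a minimal sketch in Python of one very simple way to do it: instead of a human setting a threshold per metric, the baseline is derived from the metric's own recent history. The function name `find_anomalies`, the window size, and the three-sigma cutoff are all my own illustrative assumptions, not a description of how any particular product works.

```python
from statistics import mean, stdev


def find_anomalies(samples, window=20, sigmas=3.0):
    """Return the indices of samples that deviate sharply from the
    rolling baseline formed by the `window` samples before them.

    Hypothetical helper for illustration: window and sigmas are
    arbitrary defaults, not recommended values.
    """
    anomalies = []
    for i in range(window, len(samples)):
        recent = samples[i - window:i]
        baseline = mean(recent)
        spread = stdev(recent)
        if spread == 0:
            # Perfectly flat history: any change at all is notable.
            if samples[i] != baseline:
                anomalies.append(i)
            continue
        if abs(samples[i] - baseline) > sigmas * spread:
            anomalies.append(i)
    return anomalies


# Example: steady response times around 100 ms, then one spike.
response_times_ms = [100, 102, 98, 101, 99] * 4 + [250, 100, 101]
spikes = find_anomalies(response_times_ms)
```

Nobody had to decide in advance that 250 ms was "too slow" for this service; the spike stands out against what the metric normally does. Real tooling is considerably more sophisticated, but the principle of learning the baseline rather than configuring it is the same.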
This approach to service visibility reverses the old pattern of IT. Instead of starting with the infrastructure and working forwards, it starts with the user and works backwards. If the user will be best served by doing something different, then we as IT professionals need to work out how best to support that need – including by monitoring it effectively.
This means building up a picture of the entire service from end to end, including in-house infrastructure, public cloud, middleware, application metrics, and end-user experience. All of these data sources need to be brought together, and not just so that operators can see them on one screen. The real benefit comes from letting automated processes look at all of these different metrics in context.
Maybe an apparently insignificant network slowdown goes together with an internal application error and is showing up as user transaction failures. Once upon a time, it might have been acceptable to look at these symptoms in isolation – because most of the time, they were isolated! These days, application architectures are complex and interconnected, and so are their failure modes.
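As a rough illustration of looking at symptoms in context, the sketch below groups events from different sources when they occur close together in time, and keeps only the groups that span more than one source. Everything here is an assumption made for the example: the `Event` fields, the `correlate` function, and the 60-second window are hypothetical, and time proximity alone is a deliberately crude stand-in for what real correlation tooling does.

```python
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: float  # seconds, on whatever clock the sources share
    source: str       # e.g. "network", "application", "user"
    message: str


def correlate(events, window_seconds=60.0):
    """Cluster events so that each event in a cluster occurs within
    `window_seconds` of the previous one, then keep only clusters
    that involve more than one source.

    Hypothetical helper for illustration only.
    """
    clusters = []
    current = []
    for event in sorted(events, key=lambda e: e.timestamp):
        # A gap longer than the window closes the current cluster.
        if current and event.timestamp - current[-1].timestamp > window_seconds:
            clusters.append(current)
            current = []
        current.append(event)
    if current:
        clusters.append(current)
    # Single-source clusters are the old "symptom in isolation" case.
    return [c for c in clusters if len({e.source for e in c}) > 1]


events = [
    Event(0.0, "network", "latency slightly elevated"),
    Event(20.0, "application", "internal error in order service"),
    Event(45.0, "user", "checkout transaction failed"),
    Event(1000.0, "storage", "routine snapshot completed"),
]
related = correlate(events)
```

With this sample data, the network slowdown, the application error, and the failed user transaction end up in one cluster, while the unrelated storage event is left out.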
The new approach of Algorithmic IT Operations, or AIOps to its growing number of friends, is all about detecting those correlations and directing the operators’ attention to what’s important, so that they can focus on supporting the business – not just fixing technical problems.
It’s not the users who are living in the shadows; it’s IT (yes yes, many sysadmins like it that way!). It’s time to shine a light in the data center and see what is actually going on, because that’s where the real shadow IT is. If it’s not visible to users, does it even exist?