Problem management may be your problem

When Problem Management is the Problem

Humans have been solving problems for millions of years

Now of course by saying that I am including versions of humanity extending back further than the recognized emergence of our current incarnation, Homo sapiens. But even if we restrict ourselves to just that period, we are still talking about what most estimates peg at about 200,000 years of history – which most of us (excepting perhaps geologists and Big Bang theorists) would agree is still a pretty long time to practice something. So surely by now we should have evolved to the point that problem solving is easy for us, even instinctual, right?

But of course, it is not – we all encounter situations in which the reason for an issue eludes our ability to pin it down, or a solution for an issue seems out of reach. I have no doubt that forensic anthropology, psychiatry and other such sciences are hard at work examining the ‘why’ behind this seeming failure of the evolutionary process, and someday may identify just what seems to be holding us back. In the end, though, I suspect it will likely come down to something akin to “Because we’re human”.

Having established that we all share a common bond in the form of not being omniscient, let’s narrow our focus a bit from all of humanity to just those of us engaged in making IT Service Management a happier experience for our customers.

Having an Umbrella is Not Enough

The dominant ITSM umbrella that covers solving problems is of course Problem Management – unfortunately, it is treating Problem Management like an umbrella that leads some organizations to conclude that it does not deliver what is expected of it. What they are not accepting, however, is that deploying an umbrella has never been enough to keep one from getting wet, because even the most effective umbrella can only do so much.

How effective an umbrella is depends on a lot of factors. How big is your umbrella (cocktail-drink-size, golf-size or something in between)? How well made is it (cheap collapsible, or high-end “brolly”)? What features does it have (wind resistance, ergonomic design, sword in the handle)? And of course, what are the weather and related conditions?

So you look out the window, and it is just barely raining – what I was brought up calling a ‘sprinkle’. You decide it is fine for walking to work, grab your umbrella and go out, ignoring the rain and hardly noticing that your shoes are getting wet. Soon though the rain is coming down steadily, you can’t see as far down the road, and puddles are forming – your feet are getting wetter, and you realize you should have worn boots. The rain grows heavier still and the wind picks up, making it harder to hold on to your umbrella and pushing the rain more and more sideways – you lean into it but your clothes are getting wet now, and that rain suit you thought about buying comes to mind. Suddenly a flash and a rumble tell you lightning and thunder have joined the party; winds begin gusting to 50 miles per hour, the first hailstones appear, and your umbrella (with a groan) turns inside out. The last thing you recall is seeing the funnel cloud approaching and your umbrella flying upward into it….

It’s More than a Process

Problems – and hence incidents – will tend to present a similar scenario for you and your customer unless you are prepared to embrace the idea that to truly deliver the kind of value ITIL outlines, problem management needs to be more than an afterthought of incident management. Unless your organization enables problem management not merely as a documented process but as a practice and a cultural norm, you should not be surprised when the consultant you eventually bring in to assess why you have so many issues tells you that your problem management process is actually part of the problem. Unfortunately, many organizations do in fact find themselves surprised by that finding.

If you suspect problem management is not what it could be and want to see if your organization is outfitted for whatever (problem) storm comes along, here are some things you should consider reviewing.

  1. Have you covered both reactive and proactive problem management? If your first response is “What do you mean, ‘proactive’?” the answer is pretty obvious. Most every effort covers reactive, which is the response to an incident. Proactive is meant to provide your organization with activities and abilities to discover and mitigate problems you are not really aware that you have. These issues are a main source of ‘recurring’ incidents, often low priority events that nonetheless represent considerable service interruption over time. If you enable proactive problem management, you can identify and mitigate these issues, ending the need to address the incidents they cause and providing more stable services while freeing support personnel from repeatedly correcting the same symptoms.
  2. Have you built a wall where a door should be? This one is about the question of “When does problem management begin?” Some organizations take verbiage from ITIL out of context and conclude that problem management cannot begin until you restore service for the incident; this is neither the case nor the intent. In fact, what most people fail to realize is that they quite often perform problem management during an incident. You do so any time you look for what caused the incident (which happens a lot as we try to fix things), because cause investigation is a problem management activity. If this assertion generates a feeling of righteous indignation for you, I urge you to consult the ITIL Service Operation book and look up Incidents versus Problems.
  3. Have you trained your people to investigate? Many of the organizations I have assessed for problem management weaknesses trip on this one, by assuming that someone who has knowledge of a technology automatically knows how to investigate issues with it. Cause investigation, however, requires a quite separate set of skills. Anyone who has watched an episode of CSI, Bones, Law and Order or similar already knows this, but few realize that anyone expected to execute problem management should also be trained in, or already possess, these investigative skills. There are many structured techniques and methods that can help with this, but frankly it also helps to have some innate ability for observation; in this I agree strongly with Sherlock Holmes. Speaking of specialists, make sure you enable problem investigations to include the right ones for a given problem! Your dentist is a specialist, but not if the problem is with your car…
  4. About ‘known errors’ and ‘workarounds’: The term ‘known error’ never rolled off my tongue quite right. Based on what I have seen in some problem management efforts, I am sure I am not alone, but honestly it is simply an identified defect (or cause, if you will). Organizations get twisted trying to keep them separate from problems, which is a bit like separating a PB&J sandwich. They are not separate – a known error is still a problem, it’s just a problem you found a cause for. The ITIL concept was to keep a separate KE record, for the purpose of populating a knowledge base designed to allow faster diagnosis and recovery for future incidents. You should think of them as a detailed form of knowledge record that is tied to a problem that is not eliminated yet. A ‘workaround’ – how to address an incident if the problem causes another one – is also part of a known error, BUT organizations again tend to take the verbiage of the definition too literally. You do not need to have a ‘formal’ workaround to create a KE, because (think about it) there is pretty much always a workaround; if there weren’t, you would be saying you cannot resolve the incident even temporarily – hopefully a very rare situation. It might be ugly, it might even be embarrassing, but if you have a way to restore service, you have a workaround – document the KE!
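
For those who like to see an idea in code, here is one way the proactive review from item 1 might begin: group past incidents by what they have in common and look for the clusters that keep coming back. This is only a minimal sketch under stated assumptions, not a prescribed method. The `Incident` fields, the `recurring_candidates` function, the threshold of three, and the sample data are all hypothetical; a real service desk tool will have its own export format and categories.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    # All fields are hypothetical; map them to whatever your tool exports.
    service: str       # affected service
    symptom: str       # short symptom category
    priority: int      # 1 = highest, 4 = lowest
    minutes_down: int  # service interruption caused by this incident


def recurring_candidates(incidents, min_count=3):
    """Group incidents by (service, symptom) and return the groups seen at
    least min_count times, sorted by total interruption (largest first)."""
    groups = defaultdict(list)
    for inc in incidents:
        groups[(inc.service, inc.symptom)].append(inc)

    candidates = []
    for key, items in groups.items():
        if len(items) >= min_count:
            total = sum(i.minutes_down for i in items)
            candidates.append((key, len(items), total))

    candidates.sort(key=lambda c: c[2], reverse=True)
    return candidates


# Made-up sample data for illustration only.
sample = [
    Incident("email", "queue stalled", 4, 20),
    Incident("email", "queue stalled", 4, 25),
    Incident("email", "queue stalled", 4, 15),
    Incident("vpn", "login timeout", 3, 10),
    Incident("payroll", "outage", 1, 45),
]

result = recurring_candidates(sample)
for (service, symptom), count, total in result:
    print(f"{service} / {symptom}: {count} incidents, {total} minutes")
```

In this made-up sample, the three lowest-priority email incidents add up to 60 minutes of interruption, more than the single priority 1 outage at 45 minutes. A list like this is only a starting point for investigation: it tells you where to look, not what the cause is.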

Now comes the really fun part, because I get to tell you that the way to identify the reasons at the heart of a problem management effort that is not adding the value expected is to perform problem management. Actually though, if the reasons are not already evident, you should likely consider pointing a Six Sigma person (Six Sigma certification includes investigation training) at your process to help you get things on the right path.

Don’t accept the idea that problem management does not work simply because it is not working as expected. Properly designed and enabled, problem management will help your organization improve and stabilize. Gather the right investigators, diagnose the issues, and redesign / enable as needed – I promise it is worth the effort.

Michael Keeling

Michael has been providing consulting and guidance in IT Operations, ITSM and SIAM to enterprise level organizations in many industries for more than 20 years, and has extensive background in data center and service desk operations, technical writing, mentoring, cause analysis and workflow improvement. He is known for bringing the view of a detective to these efforts, a perspective he credits to education in crime scene investigation and over 10 years designing processes and performing risk management in the private security sector prior to his career in IT. A confirmed realist who believes no project can be truly successful unless all involved parties are grounded in reality, Michael is always prepared to paint ‘the elephant in the room’ bright yellow when appropriate….