An IT operations strategy is only as good as the quality of its data. I’ve addressed the importance of good data before, but it bears repeating: clean, accurate data is the key to making clean, accurate decisions. That means taking care of the pieces and programs that take care of data and that interact with each other in a logical process.
It’s Time to Get Back to Basics
The foundation starts with a configuration management database (CMDB) that is populated with the data necessary to give you a complete, precise, and real-time picture of all the physical and ephemeral pieces in your IT environment. A rock-solid CMDB is used to manage vital IT processes, including IT operations and your asset management program. In turn, that allows you to implement incident management, change management, and problem management strategies with speed and confidence.
Accurate data begets accurate decisions. A CMDB informs an Asset Management program that informs an Incident Management strategy. CMDB, Asset Management, Incident Management: These are simple concepts, but that simplicity requires understanding what each is, what each piece entails, and how they interact.
What is a CMDB?
A CMDB is a data warehouse where all the information about the configuration items (CIs) that comprise your IT operation lives. It is more than just an accounting of what is in your datacenter; a CMDB must include hardware, software, applications, and the relationships between them.
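To make the "items plus relationships" idea concrete, here is a minimal sketch in Python. It is not how any particular CMDB product stores data; the class names, the `runs_on` and `depends_on` relation labels, and the sample CIs are all hypothetical, chosen only to show that relationships are first-class records alongside the CIs themselves.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigurationItem:
    ci_id: str
    ci_type: str  # e.g. "hardware", "software", "application"
    attributes: dict = field(default_factory=dict)


@dataclass
class Relationship:
    source: str    # ci_id of the CI that has the dependency
    relation: str  # hypothetical labels such as "runs_on" or "depends_on"
    target: str    # ci_id of the CI being relied on


class CMDB:
    """Toy in-memory store: CIs keyed by id, plus a list of relationships."""

    def __init__(self):
        self.items = {}
        self.relationships = []

    def add_item(self, ci):
        self.items[ci.ci_id] = ci

    def relate(self, source, relation, target):
        # Refuse relationships that point at CIs the store does not know about.
        if source not in self.items or target not in self.items:
            raise KeyError("both CIs must exist before relating them")
        self.relationships.append(Relationship(source, relation, target))

    def related_to(self, ci_id):
        # Outgoing relationships of one CI as (relation, target) pairs.
        return [(r.relation, r.target)
                for r in self.relationships if r.source == ci_id]


cmdb = CMDB()
cmdb.add_item(ConfigurationItem("srv-01", "hardware", {"rack": "A3"}))
cmdb.add_item(ConfigurationItem("db-01", "software", {"version": "14.2"}))
cmdb.add_item(ConfigurationItem("billing", "application"))
cmdb.relate("db-01", "runs_on", "srv-01")
cmdb.relate("billing", "depends_on", "db-01")
```

An inventory spreadsheet would capture only the three `add_item` calls. The two `relate` calls are what turn the list into a CMDB.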
One common misconception I encounter regarding CMDBs is that it’s enough to build one and then task your staff with keeping it updated. That’s a big mistake. A CMDB is not a static inventory of components. It must be dynamic, updating as often as today’s ephemeral IT environments do.
That means data inputs must be automated, providing a real-time, moment-to-moment account that keeps pace with the rate of change inherent in today’s cloud-forward IT environments. If your CMDB is incomplete or out of date, or if the data that populates it is inaccurate, operations management and the health of your IT enterprise will suffer.
CIs are described in a CMDB by the data related to their configuration and operation. Because CIs are typically associated with assets within the IT environment, there may be confusion between configuration management and asset management. But it’s important to understand the differences between a CMDB and Asset Management, which takes us to the next point.
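One simple way to notice that automated inputs have stopped flowing is to flag CIs that no discovery feed has reported on recently. The sketch below assumes each CI carries a "last seen" timestamp and uses a 24-hour threshold. Both are assumptions for illustration, and the right threshold depends entirely on how fast your environment changes.

```python
from datetime import datetime, timedelta, timezone


def find_stale(last_seen, now, max_age):
    """Return ids of CIs whose last automated update is older than max_age."""
    return sorted(ci for ci, seen in last_seen.items() if now - seen > max_age)


# Hypothetical timestamps, as an automated discovery feed might record them.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "srv-01": now - timedelta(minutes=5),
    "db-01": now - timedelta(hours=30),
    "billing": now - timedelta(minutes=50),
}

stale = find_stale(last_seen, now, max_age=timedelta(hours=24))
```

Here only `db-01` is flagged, because its last update is 30 hours old.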
What is Asset Management?
Assets comprise not only the full array of CIs your enterprise relies on to function but also the terms and settings, contractual and otherwise, associated with each component. Typically, there is a dollar value and a terms-of-service agreement associated with each asset. Assets must therefore be managed according to those terms, as well as financially, as each asset depreciates according to a lifecycle management program. There may also be departmental accounting requirements or partner dependencies that require monitoring.
Even though assets are inventoried in a CMDB, asset management occurs apart from operations, according to a lifecycle management program. How an asset is functioning is relevant to your CMDB. When its software license expires, by contrast, is relevant to asset management. Failure to follow a proper asset management plan can result in sub-optimal IT operations and in conditions like a breach of contract or a degradation of security.
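The license example is the kind of check that lives on the asset management side rather than in operations monitoring. A minimal sketch, assuming each asset record carries a `license_expires` date and that a 30-day warning window is appropriate (the field names, asset ids, and window are all hypothetical):

```python
from datetime import date, timedelta


def expiring_licenses(assets, today, warn_days=30):
    """Return ids of assets whose license has expired or expires within warn_days."""
    cutoff = today + timedelta(days=warn_days)
    return [a["asset_id"] for a in assets if a["license_expires"] <= cutoff]


# Hypothetical asset records.
assets = [
    {"asset_id": "crm-suite", "license_expires": date(2024, 6, 15)},
    {"asset_id": "db-engine", "license_expires": date(2025, 1, 1)},
    {"asset_id": "backup-tool", "license_expires": date(2024, 5, 20)},
]

needs_attention = expiring_licenses(assets, today=date(2024, 6, 1))
```

Note that `backup-tool` is caught even though it has already lapsed. An expired license is exactly the situation that can turn into a breach of contract if nobody is watching.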
What is Incident Management?
When something unexpected happens that affects proper IT operations, it is an incident. How your IT operations team responds to an incident is dictated by an Incident Management plan. Incident management involves identifying the source of a problem (mean time to identification, or MTTI), diagnosing and fixing the cause, and returning to normal operations as quickly as possible (mean time to resolution, or MTTR).
There’s also mean time between service incidents (MTBSI) and mean time between failures (MTBF). MTBSI tracks the span between recurrences of an incident that takes down the same piece of infrastructure or business service, which makes it a key metric for one of the biggest culprits of lost IT service time. A short MTBSI is often an indicator that IT staff are using a “workaround” to restore service rather than making a real repair. MTBF measures the average time that a CI or IT service performs its function without interruption. Both MTBSI and MTBF are often-overlooked indicators of the health of your IT operations.
Determining the cause of an incident depends heavily on the data available. An accurate CMDB is vital to isolating conditions that point to a CI failure or to a problem affecting services, such as an unexpected spike in server demand or even physical conditions like high temperature in a server room. There may be cases where a failure to properly manage the asset lifecycle results in an asset going out of service, disrupting associated operations. Accurate, real-time data can save precious time on the way to resolution.
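A small worked example may help keep these metrics apart. The sketch below assumes an incident log for a single CI, with each incident recorded as a (start, end) pair in hours since monitoring began. It computes MTTR as the mean outage length, MTBF as the mean uptime between the end of one incident and the start of the next, and MTBSI as the mean span from the start of one incident to the start of the next. These are common working definitions, not the only ones in use, and the numbers are invented.

```python
from statistics import mean


def mttr(incidents):
    """Mean time to resolution: average outage duration."""
    return mean(end - start for start, end in incidents)


def mtbf(incidents):
    """Mean uptime between the end of one incident and the start of the next.

    Needs at least two incidents; otherwise statistics.mean raises an error.
    """
    ordered = sorted(incidents)
    return mean(nxt[0] - prev[1] for prev, nxt in zip(ordered, ordered[1:]))


def mtbsi(incidents):
    """Mean span from the start of one incident to the start of the next."""
    ordered = sorted(incidents)
    return mean(nxt[0] - prev[0] for prev, nxt in zip(ordered, ordered[1:]))


# Hypothetical incident log for one CI: (start_hour, end_hour).
incidents = [(100, 102), (300, 304), (500, 503)]

print(mttr(incidents), mtbf(incidents), mtbsi(incidents))
```

With this log the outages last 2, 4, and 3 hours (MTTR of 3), the uptime gaps are 198 and 196 hours (MTBF of 197), and incidents begin 200 hours apart (MTBSI of 200). If a workaround were masking the real fault, you would expect that last number to shrink over time.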
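This is where the relationships stored in a CMDB earn their keep. Given a failed CI, walking the dependency data in reverse tells you which services are likely affected. The sketch below is a plain breadth-first traversal over a hypothetical `depends_on` mapping. The CI names are invented, and a real CMDB would expose this through its own query interface.

```python
from collections import deque


def impacted_by(failed_ci, depends_on):
    """Return every CI that directly or indirectly depends on failed_ci."""
    # Invert the mapping: for each CI, which CIs rely on it?
    dependents = {}
    for ci, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(ci)

    seen = set()
    queue = deque([failed_ci])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(seen)


# Hypothetical dependency data: each key depends on the CIs in its list.
depends_on = {
    "billing": ["db-01"],
    "reports": ["db-01", "cache-01"],
    "db-01": ["srv-01"],
    "cache-01": ["srv-02"],
}

affected = impacted_by("srv-01", depends_on)
```

If `srv-01` overheats, the traversal reports `billing`, `db-01`, and `reports` as affected. The answer is only as good as the relationship data, though. A missing or stale edge silently hides an impacted service.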
Putting Knowledge to Practice
Understanding the basics of CMDB, Asset Management, and Incident Management is one thing. Putting that knowledge to use in an operational environment is another. Budgets, time, executive mandates, and other kinds of sand can grind at the gears of IT operations, but if these are the issues you face, you need to be prepared to fight for whatever changes are needed. That fight will be difficult, and change is hard when it involves organizational culture, but the results will be worth it.
I’ve seen it from both sides. The situation I encounter most often is an organization that thinks CMDB and Asset Management are a “people” problem: just allocate more time for staff to fix what’s broken or incorrect and everything will be okay. These days, that is an impossibility. Human beings are incapable of keeping up with the speed and volume of change in any IT environment, let alone those of large enterprises. Not only that, but we are prone to making assumptions and mistakes that are detrimental, even catastrophic, to proper IT operations.
Nothing is ever configured to standard or remains that way for long. And inconsistency anywhere in your CMDB means you have inaccuracy everywhere in the environment. That’s why you need to establish a virtuous cycle of inputs that are automated, continuously feeding complete and accurate data to your CMDB. Asset management can then be tracked and informed by that data. And when things go wrong, incident management can happen quickly thanks to rapid identification of the issue, repair decisions made—or even automated—by good data, and recurring problems tracked and solved. Without an accurate, up-to-date CMDB, Asset Management and Incident Management cannot happen.
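The first turn of that cycle is usually a comparison between what the CMDB says and what automated discovery actually finds. Here is a minimal sketch of such a comparison, assuming both sides can be reduced to a mapping of CI id to attributes. The function name, result keys, and sample data are hypothetical.

```python
def diff_inventory(recorded, discovered):
    """Compare CMDB records against what automated discovery reports."""
    recorded_ids, discovered_ids = set(recorded), set(discovered)
    return {
        # Running in the environment but unknown to the CMDB.
        "missing_from_cmdb": sorted(discovered_ids - recorded_ids),
        # In the CMDB but not found by discovery (possibly retired).
        "not_discovered": sorted(recorded_ids - discovered_ids),
        # Present on both sides but with differing attributes.
        "drifted": sorted(ci for ci in recorded_ids & discovered_ids
                          if recorded[ci] != discovered[ci]),
    }


# Hypothetical data: CMDB contents versus a fresh discovery scan.
recorded = {
    "srv-01": {"os": "ubuntu-20.04"},
    "db-01": {"version": "14.2"},
    "old-vm": {"os": "centos-7"},
}
discovered = {
    "srv-01": {"os": "ubuntu-22.04"},
    "db-01": {"version": "14.2"},
    "k8s-node-7": {"os": "ubuntu-22.04"},
}

report = diff_inventory(recorded, discovered)
```

In this sample, one node was never recorded, one VM has vanished, and one server was upgraded without the CMDB hearing about it. Three of four CIs are wrong in some way, which is the kind of result manual upkeep cannot realistically prevent.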
The irony is that an investment in proper CMDB, Asset Management, and Incident Management will save your organization time and resources in the long term and may even result in better customer retention and revenue enhancement thanks to reliable, efficient operations.
Fight the good fight for accurate, efficient IT operations. Get your data house in order, and you’ll see immediate results. The good news is, you aren’t breaking new ground, and you don’t have to do it alone. We’ve helped a lot of organizations do it before—and probably with situations that were a lot more daunting than yours.