At some point in our lives, most of us were told by a parent or co-worker not just to work HARDER, but to work SMARTER. In the age of modern AIOps, I believe this adage should evolve to: Don’t just work with data, work with GOOD data. With good operational data, businesses can make more intelligent decisions, maximize their resources, drive efficiencies, and focus on more ways to innovate. To understand what good operational data looks like, we should first examine bad data and find ways to avoid its impacts.
It’s all about quality, not volume
Having too much operational data isn’t necessarily a bad thing. IT Operations needs detailed data on how applications and infrastructure are running at any given moment to keep things running smoothly, lower costs, and ensure secure and compliant operations. Service delivery and line of business managers require IT to give them visibility into service management and to measure the effectiveness of service delivery. To date, the problem of sifting through large volumes of IT data has been addressed primarily by advances in NoSQL datastores and by robust ETL processes built around Hadoop-oriented technologies. The log management industry has also matured over time, led by big data platforms such as Splunk.
So where is the issue?
As our digital systems have become more complicated and workloads more ephemeral, businesses are increasingly reliant on our operational management systems to provide accurate, complete and timely data.
But buyer beware: as we all know, not all data is created equal.
The characteristics of bad data
In conversations with enterprise and large service provider customers, we see consistent patterns of bad or poor operational data in their existing IT environments.
I’ve classified them into four key categories:
1. Inaccurate data – this data is misleading because it is simply wrong. It may carry the wrong name, wrong metric, wrong calculation, wrong time stamp, wrong location, wrong label, or wrong owner.
2. Inconsistent data – this data may contain duplicates, use varying naming conventions or descriptors, or change over time, and so becomes unreliable.
3. Out-of-date data – this includes historical data that describes a state that no longer exists. Data that is collected or processed too late in its lifecycle may no longer be useful or, worse, may be misleading.
4. Missing data – this data includes partial or incomplete data sets. This could consist of unknown attributes, configurations, state, relationships, or dependencies tied to a device, asset, Configuration Item (CI), or IT Service.
Unfortunately, bad data can be hard to spot at first, and it adversely affects your team’s ability to automate. But it doesn’t have to be a corporate death sentence.
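To make the four categories above more concrete, here is a minimal Python sketch of how each one might be flagged in a batch of metric records. Everything in it is an assumption made for illustration: the record fields (device, metric, value, timestamp, owner), the 15-minute freshness threshold, the 0 to 100 range for a metric called cpu_percent, and the find_issues function itself are hypothetical and do not describe any particular product or schema.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical schema and thresholds, chosen only for this illustration.
REQUIRED_FIELDS = ("device", "metric", "value", "timestamp", "owner")
MAX_AGE = timedelta(minutes=15)


def find_issues(records, now):
    """Return {record index: [issue labels]} for records that fail a check."""
    issues = {}
    seen = {}  # normalized (device, metric, timestamp) -> index of first occurrence
    for i, rec in enumerate(records):
        found = []

        # Missing data: a required attribute is absent or empty.
        missing = [f for f in REQUIRED_FIELDS if rec.get(f) in (None, "")]
        if missing:
            found.append("missing: " + ", ".join(missing))

        ts = rec.get("timestamp")
        if ts is not None:
            if ts > now:
                # Inaccurate data: a timestamp in the future cannot be right.
                found.append("inaccurate: timestamp in the future")
            elif now - ts > MAX_AGE:
                # Out-of-date data: collected too long ago to act on.
                found.append("out of date")

        # Inaccurate data: a percentage outside 0..100 is a wrong value.
        value = rec.get("value")
        if rec.get("metric") == "cpu_percent" and value is not None:
            if not 0 <= value <= 100:
                found.append("inaccurate: value out of range")

        # Inconsistent data: the same reading reported under a different
        # spelling of the device name (case or stray whitespace).
        device = rec.get("device")
        if device:
            key = (device.strip().lower(), rec.get("metric"), ts)
            if key in seen:
                found.append(f"inconsistent: duplicate of record {seen[key]}")
            else:
                seen[key] = i

        if found:
            issues[i] = found
    return issues


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
recent = now - timedelta(minutes=1)
records = [
    {"device": "web-01", "metric": "cpu_percent", "value": 42.0, "timestamp": recent, "owner": "ops"},
    {"device": "WEB-01 ", "metric": "cpu_percent", "value": 42.0, "timestamp": recent, "owner": "ops"},
    {"device": "db-01", "metric": "cpu_percent", "value": 250.0, "timestamp": recent, "owner": "dba"},
    {"device": "db-02", "metric": "cpu_percent", "value": 10.0, "timestamp": now - timedelta(hours=3), "owner": "dba"},
    {"device": "sw-01", "metric": "cpu_percent", "value": 5.0, "timestamp": recent, "owner": ""},
]

report = find_issues(records, now)
for index, problems in report.items():
    print(index, problems)
```

In this sample, the first record passes every check and each of the other four trips exactly one category. Real operational data is rarely this tidy: a single record can be wrong in several ways at once, and checks like these only catch what someone thought to look for.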
AI and ML engines need our help
Many new neural nets and machine learning algorithms are designed to sift through tremendous amounts of data to find meaningful insights. The challenge vexing many data scientists is how to identify, locate, and verify that operational training data meets the minimum standards needed to be considered ‘clean’ data. Besides containing bad data, many current data sets lack the context needed to derive meaningful outcomes from the analysis of operational data. That context includes metadata such as relationships, dependencies, association to business services, technology type, and organizational context. The better the data, the more likely it is that you can take advantage of AI and ML technologies.
How do I eliminate bad data?
By now, you have likely seen many examples of bad data and how it can negatively impact your business. Our goal at ScienceLogic has been, and continues to be, to provide good, clean operational data that helps improve the delivery of IT operations. To realize the benefits of AIOps, you need to give big data context so that it can yield meaningful insights. We have spent over a decade creating a common data model for over 5000 technology signatures that offer operational data around correlated events, configuration attributes, topology relationships, relevant performance information, and automation enrichments/remediations. This is backed up by multiple patents in discovery, data management, and mapping technology. By combining all of this into a single platform, we are transforming the way companies gather and share meaningful operational data.
Please contact us directly or schedule a 1:1 session to discuss how we can help you prevent the common perils of bad data.