Hope you enjoyed the festivities, along with the latest in AIOps, ITOps, and IT monitoring.
- 1. I&O leaders leverage AI techniques to make data-driven decisions and automate actions that ensure business agility and stability.
According to an article by The Start-Up, organizations increasingly need to analyze vast volumes of data to enable rapid application delivery, which makes manual decision-making a critical bottleneck in DevOps. I&O leaders will need to leverage AIOps to make data-driven decisions and automate actions.
Gartner estimates that only 5% of all large enterprises are currently combining big data and AIOps platforms to support and replace monitoring, service desk, and automation processes and tasks. Gartner also expects that number to jump to 40% of all large enterprises by 2022.
AIOps is poised to create a massive shift in IT operations methodology and spending. It benefits everyone to understand what vendors, products, and services make up the AIOps marketplace. And the question that will be asked is, “Will there be a single pane of glass to view it all?”
The goals of a single-pane-of-glass construct are accurate telemetry, large-scale data aggregation, optimal (ideally minimal) human input in the act/react loop, and noise reduction using algorithmic clustering. Meanwhile, the existing silos between network, infrastructure, apps, servers, security, and end-user computing are hard to change.
- 2. Having a resilient CMDB helps drive automated failover processes.
According to an article by TechTarget, a resilient CMDB helps drive automated failover processes. However, ITOps teams must consider the domino effect a CMDB failure might unleash if a failover process uses automation and relies on the information contained in the CMDB.
While there is no single approach to CMDB high availability, there are several approaches to optimizing a CMDB and ensuring that it’s functioning well should a crisis hit. Here are four approaches to ensure your CMDB will be resilient in the face of a crisis:
- • A load-balanced approach. Balance multiple front-end and back-end servers, for example in a two-tier or three-tier infrastructure, so that a host failure does not become a single point of failure. While this infrastructure can be virtualized, the same availability rules still apply, such as not putting VMs on the same host.
- • A redundancy approach. Maintain a redundant copy of configuration management data at the recovery site. It could be a read-only replicated copy but should have the information required for failover, such as unique identifiers and system dependencies.
- • A data and system integrity approach. It’s a complicated task to ensure the accuracy and quality of the CMDB data. Focus attention on the security of the servers involved. Disable unused services because disabled services can’t be exploited.
- • A resource and performance approach. Ensure resource availability and performance that can also support large-scale failovers that use the CMDB. Plan capacity for worst-case scenarios.
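As a minimal sketch of the redundancy approach above, the following Python shows a failover routine that reads from the primary CMDB and falls back to a read-only replica at the recovery site. All names here (`ConfigItem`, `InMemoryCmdb`, `lookup_for_failover`) are hypothetical and exist only for illustration; a real CMDB would be queried through its own API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ConfigItem:
    ci_id: str                                           # unique identifier
    depends_on: List[str] = field(default_factory=list)  # system dependencies


class CmdbUnavailable(Exception):
    """Raised when a CMDB copy cannot be reached."""


class InMemoryCmdb:
    """Hypothetical stand-in for a CMDB endpoint."""

    def __init__(self, items, available=True):
        self._items = {item.ci_id: item for item in items}
        self.available = available

    def get(self, ci_id):
        if not self.available:
            raise CmdbUnavailable(ci_id)
        return self._items[ci_id]


def lookup_for_failover(ci_id, primary, replica):
    """Return (item, source), preferring the primary over the replica."""
    try:
        return primary.get(ci_id), "primary"
    except CmdbUnavailable:
        # The replica only needs to be readable to drive the failover.
        return replica.get(ci_id), "replica"


items = [ConfigItem("app-01", ["db-01"]), ConfigItem("db-01")]
primary = InMemoryCmdb(items, available=False)  # simulate a primary outage
replica = InMemoryCmdb(items)                   # copy kept at the recovery site

item, source = lookup_for_failover("app-01", primary, replica)
print(source, item.depends_on)
```

The point of the sketch is that the automated failover keeps working only if the replica carries the identifiers and dependencies the process needs.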
- 3. CIOs adopt AIOps to rise above chaos.
According to an article in Forbes, hybrid infrastructure composed of on-premises and cloud elements can create a chaotic environment for IT operations teams. Amid this chaos, the failure of even the smallest component can cause a complete business service outage. For a business service to operate flawlessly, all the elements in multiple cloud/data center locations need to be monitored, managed, alerted, maintained, and acted upon in real time, but that’s going to generate and rely upon a tremendous amount of data.
An infrastructure event can create tens of thousands of alerts, signals, events, and triggers across multiple infrastructure domains. Unless an enterprise has a tool to auto-discover and auto-correlate across the layers of its digital business, the ITOps team will often be clueless and end up chasing thousands of alerts triggered by a single event.
To make faster, better decisions and reduce the MTTR of an event, the root cause needs to be identified and isolated quickly. But this cannot happen when a single event can produce thousands of alerts that require hundreds of Ops hours to trace back to the root cause.
So, how can AIOps help?
- • Anomaly detection. In a hybrid model, the infrastructure layer is spread across multiple locations. The normal IT monitoring and alerting systems, which are typically rule/threshold-based, can be confused when they encounter a previously unseen problem. Dynamic thresholding can adjust for seasonal, weekly, and daily patterns and alert a human ITOps analyst to look closer into a suspected anomaly in real time. Because the identification is quick and the data is correlated, the ITOps team can work to figure out the root cause in near real time.
- • Noise reduction/Event consolidation. AI can help you reduce a large stream of low-level system events to a smaller number of local incidents, cutting the volume of event streams by up to 95%. This white noise reduction allows ITOps teams to look at a few specific, important events instead of an overwhelming number of logs and alerts.
- • Capacity planning. Using time series forecasting, AI can predict future values of metrics such as CPU, memory, server size, network throughput, help desk ticket count, and mean time to resolution (MTTR) of incidents. By accurately forecasting usage ahead of time, even if only hours ahead, an enterprise could purchase reserved instances at reduced cost to cope with the demand increase in a cloud-based usage model.
- • Service ticket analytics. Managing a reduced budget and an increased number of service tickets in a hybrid, multi-location environment is an arduous task. Based on historical data combined with machine learning, AI can forecast the expected number of service tickets with high accuracy (up to 95%).
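To make the idea of dynamic thresholding more concrete, here is a minimal Python sketch that flags a data point when it deviates from a trailing-window baseline by more than `k` standard deviations. This is a simplified illustration, not the method used by any particular AIOps product: the function name, the `window` and `k` parameters, and the sample CPU data are all assumptions, and the sketch does not model weekly or seasonal patterns explicitly.

```python
from statistics import mean, stdev


def dynamic_threshold_anomalies(values, window=24, k=3.0):
    """Return indexes of points more than k standard deviations
    away from the mean of the preceding `window` points."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        # Skip a perfectly flat history, where sigma is zero.
        if sigma > 0 and abs(values[i] - mu) > k * sigma:
            anomalies.append(i)
    return anomalies


# Hypothetical CPU utilization (%) with a repeating pattern and one spike.
cpu = [40, 42, 45, 43, 41, 44] * 8
cpu[40] = 95
print(dynamic_threshold_anomalies(cpu, window=24, k=3.0))  # prints [40]
```

Unlike a fixed rule such as "alert above 80%", the threshold here moves with the recent behavior of the metric, so the same code adapts when the normal level shifts.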
- 4. Increased cloud use will increase business agility.
Cloud computing technology steadily grows in acceptance and deployment and continues to prove beneficial to enterprises. But can the increased use of these technologies increase business agility?
In a recent article, TechTarget examines how cloud use will affect business agility. Its focus is the comparison of a single cloud-based infrastructure to a multiple cloud-based services infrastructure, including the many “as-a-service” offerings now available. By comparing them, TechTarget presents two possible configurations businesses may choose when adopting cloud services.
Single cloud-based infrastructure configuration. The use of a cloud service suggests that the organization employs some sort of managed service for a specialized application. That alone doesn’t necessarily mean a state of agility has been achieved. In addition, the method for data backup and recovery may be implemented solely in local storage or in a combination of local and cloud storage.
Multiple cloud-based services infrastructure configuration. While this kind of configuration isn’t common yet, it could be an interesting trend. In this configuration, an enterprise will use less local storage and rely more heavily on cloud-based resources, including the “as-a-service” resources, as well as cloud-based, mission-critical applications.
In each of the two infrastructure configurations, sufficient processing power and network bandwidth can be available to provide agility. However, in the single cloud-based configuration, IT continuously determines each user department’s requirements for accessing resources and their expectations for response times to user inquiries and processing speed of the resources. IT also manages the performance of servers, applications, and network bandwidth to make incremental adjustments to achieve the best overall performance possible.
We look to the past for inspiration, search today for illumination, and reach for tomorrow with aspiration. Along that journey, we find transformation. All business is digital business—and digital business thrives with AIOps.
Just getting started with AIOps and want to learn more? Read the eBook, “Your Guide to Getting Started with AIOps.”