AIOps for Financial Services Companies
ScienceLogic’s Brian Amaro, Sr. Director of Global Solutions, and CMO Murali Nemani sat down with Kalpesh Sharma, Director, Principal Enterprise Architect, at technology services firm Capgemini.
During the course of the two-part conversation, which includes examples from the field, Brian, Murali, and Kalpesh address challenges common to the industry and how AIOps helps forward-thinking organizations overcome them. Here are highlights from the conversation. Watch the complete video podcast for deeper insights.
What is AIOps? And what does it mean to the financial services industry?
Gartner coined the term artificial intelligence for IT operations (AIOps) in 2016 as a category of machine learning-based solutions in the IT operations area. AIOps was deemed necessary for organizations to take the next step in their digital transformation. AIOps isn’t a panacea for all that ails IT operations, but the problems that it solves, and the agility that it enables, deliver results that are needed to keep up with today’s market demands. It does this by making the data that inundates IT operations valuable and enabling IT operations teams to put that data to work.
This was reflected in a recent informal poll of data scientists conducted by ScienceLogic, which found that data scientists spend too much time on data preparation because their data lakes more closely resemble data swamps, subject to duplication, misalignment, and misclassification that cost their organizations time and money. This is the foundational problem AIOps solves.
Kalpesh, with your background in working with the financial services industry, how are they interpreting this problem? And what does AIOps mean to them?
I think that in a consumer-centric, hyper-connected, and hyper-available world, both banking and insurance organizations are forced to change the way they have been interacting with customers.
Four examples of this change are:
- Going from reactive to proactive customer interactions. We have been working with the insurance and banking industries, and we see them moving from a reactive stance to proactively sending notifications.
- Going from paper-heavy communication to digital communication.
- Moving from agent-assisted or call-center-assisted channels to self-service channels.
- Going from being available only during business hours to being available 24x7x365.
This has forced major changes in financial services IT operations, and has resulted in two major problem areas:
- Increased logging and monitoring data generated across systems and applications; and,
- Increased demands for IT systems to be highly available, scalable, reliable, and resilient.
What is the solution? Deciphering this heightened amount of data and analyzing it in real time, 24/7, to proactively detect or reduce incidents would be next to impossible for human operators. And that’s where AIOps comes to the rescue for Ops and SRE teams within financial institutions.
Pain Points for the Financial Services Industry
I think the complexity you describe is exponentially growing with the advent of some of the latest technologies and the reliance on digital, the way we work, the way we live, and the way we play. Can you translate that into specific pain points that the financial services industries are dealing with and how that impacts the services that they offer?
CIOs of leading financial institutions are grappling with the use of multiple monitoring tools that still do not provide full data visibility or a unified view of their data. And while they are drowning in alerts, they fear they are missing the most important alerts because there is not enough contextual data to prioritize events appropriately.
And the alerts generated by the systems do not carry actionable information, leading to a higher mean time to detect. CIOs are not confident in the reliability of their systems. Very few CIOs will admit that, but that is the truth they are sharing with us. And I think this is primarily due to a lower mean time between failures.
Also, after priority-one incidents, CIOs are losing precious time gathering the right team members, which increasingly leads to a higher mean time to acknowledge. Lastly, more time is lost handing off incidents between teams in different departments, leading to a higher mean time to resolve.
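The four measures named above can be made concrete with a little arithmetic. The sketch below is illustrative only: the `Incident` fields and the exact definitions (for example, measuring time to resolve from the start of the failure, and time between failures from one resolution to the next failure) are assumptions chosen for this example, and organizations define these metrics in slightly different ways.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    # All four timestamps are hypothetical fields for this sketch.
    started: datetime       # when the failure actually began
    detected: datetime      # when monitoring raised the alert
    acknowledged: datetime  # when a responder took ownership
    resolved: datetime      # when service was restored


def _mean_minutes(deltas):
    """Average a list of timedeltas and return the result in minutes."""
    return mean(d.total_seconds() for d in deltas) / 60


def incident_metrics(incidents):
    """Return MTTD, MTTA, MTTR and MTBF in minutes for a list of incidents."""
    ordered = sorted(incidents, key=lambda i: i.started)
    metrics = {
        "mttd": _mean_minutes([i.detected - i.started for i in ordered]),
        "mtta": _mean_minutes([i.acknowledged - i.detected for i in ordered]),
        "mttr": _mean_minutes([i.resolved - i.started for i in ordered]),
    }
    # MTBF (assumed definition): average healthy time between one
    # resolution and the start of the next failure.
    gaps = [nxt.started - prev.resolved for prev, nxt in zip(ordered, ordered[1:])]
    metrics["mtbf"] = _mean_minutes(gaps) if gaps else None
    return metrics


t0 = datetime(2021, 5, 11, 9, 0)
sample = [
    Incident(t0, t0 + timedelta(minutes=10),
             t0 + timedelta(minutes=25), t0 + timedelta(minutes=70)),
    Incident(t0 + timedelta(minutes=190), t0 + timedelta(minutes=210),
             t0 + timedelta(minutes=215), t0 + timedelta(minutes=240)),
]
result = incident_metrics(sample)
print(result)
```

Tracking these numbers before and after an AIOps rollout is one simple way to check whether detection, acknowledgment, and handoffs are actually getting faster.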
How does ITOps address these pain points?
What you described ultimately impacts customer experience. Given that framework, what approach should these financial services firms take to address these pain points?
Based on our experience implementing AIOps solutions, we suggest a three-step approach:
- Revamp operational architecture.
- Re-skill and up-skill team members on the ground with new ways of working.
- Streamline the operational process and improve knowledge management.
Looking at the academic approach to AIOps, there are several opportunities to establish a solid foundation. In general, we always start by aligning an authoritative, accurate data set to establish the foundational elements. Could you walk us through that process in a little bit more detail?
First, revamping operational architecture is very important. We need to build the foundation, and the foundation needs to be built in terms of data collection, data transformation, and data storage before forwarding all this content to the AIOps engine.
I suggest following these six tenets in building the target operational architecture:
- Use a domain-agnostic AIOps engine like ScienceLogic.
- Integrate the AIOps engine with an incident management tool.
- Keep the CMDB complete, accurate, and up to date.
- Enable CI/CD and platform provisioning integrations with AIOps to bring in observability for DevOps and SRE teams.
- Enable self-healing to increase durability and mean time between failures.
- Establish an architecture that is pluggable, extensible, cloud-ready, and scalable to any number of monitoring tools.
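To make the tenets above a little more tangible, here is a minimal sketch of a pluggable pipeline. Every name in it (`Event`, `Collector`, `StaticCollector`, `Pipeline`, the CMDB dictionary, and the ticket callback) is hypothetical and invented for illustration; it does not represent the ScienceLogic SL1 API or any particular incident management tool. It only shows the shape of the idea: monitoring tools plug in behind one interface, events are enriched with CMDB ownership, known problems are self-healed, and everything else becomes a ticket.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Protocol


@dataclass
class Event:
    source: str    # monitoring tool that produced the event
    ci: str        # configuration item, expected to exist in the CMDB
    severity: int  # 1 = most severe
    message: str


class Collector(Protocol):
    """Anything that can yield normalized events counts as a plug-in."""
    def collect(self) -> Iterable[Event]: ...


class StaticCollector:
    """Stand-in for a real monitoring tool adapter; returns canned events."""
    def __init__(self, events: List[Event]):
        self._events = events

    def collect(self) -> Iterable[Event]:
        return list(self._events)


class Pipeline:
    def __init__(self, cmdb: Dict[str, str],
                 open_ticket: Callable[[Event, str], None]):
        self._collectors: List[Collector] = []
        self._cmdb = cmdb                # CI name -> owning team
        self._open_ticket = open_ticket  # hook into the incident tool
        self._remediations: Dict[str, Callable[[Event], bool]] = {}

    def register(self, collector: Collector) -> None:
        """Add another monitoring tool without changing the pipeline."""
        self._collectors.append(collector)

    def add_remediation(self, keyword: str,
                        action: Callable[[Event], bool]) -> None:
        """Register a self-healing action for events mentioning keyword."""
        self._remediations[keyword] = action

    def run(self) -> None:
        for collector in self._collectors:
            for event in collector.collect():
                owner = self._cmdb.get(event.ci, "unassigned")
                # Try self-healing first; open a ticket only if nothing fixed it.
                healed = any(
                    action(event)
                    for keyword, action in self._remediations.items()
                    if keyword in event.message
                )
                if not healed:
                    self._open_ticket(event, owner)


tickets = []
pipeline = Pipeline(
    cmdb={"payments-db": "dba-team"},
    open_ticket=lambda event, owner: tickets.append((event.ci, owner)),
)
pipeline.add_remediation("disk full", lambda event: True)  # pretend cleanup worked
pipeline.register(StaticCollector([
    Event("tool-a", "payments-db", 1, "replication lag high"),
    Event("tool-b", "web-01", 3, "disk full on /var"),
]))
pipeline.run()
print(tickets)
```

In this toy run, the disk-full event is handled by the registered remediation, so only the replication-lag event reaches the ticket queue, routed to the team the CMDB lists as owner. An incomplete CMDB would show up here as tickets assigned to "unassigned", which is one reason the third tenet matters.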
You made a very, very good point there because having a solid architecture is essential to establishing a solid AIOps foundation, but we must also take into consideration the human element of AIOps. As we all know, IT has been around for over half a century, and in that time, IT professionals have built tribal knowledge of their environments. Unfortunately, we’re seeing that tribal knowledge is now becoming a detractor in modern-day hybrid ecosystems. As we move forward enabling AIOps, Kalpesh, how do you see the human element being able to compete and stay relevant?
Let me tell you a story about when we were implementing the SL1 AIOps solution for one of the major insurance companies. Someone reached out to me and asked, “Will I be losing my job after this AIOps implementation?”
And I said absolutely not. We need your experience. We need your tribal knowledge. That’s because AIOps isn’t just about technology. It’s a tool that people can use to streamline operational processes and to better apply the experience, skills, and tribal knowledge that good IT people have.
We are not just coming in and fixing the architecture and the operational process. We will also, along the way, help you up-skill and re-skill in the new ways of working.
For Symposium 2021 (taking place May 11 & 12 online), we’re having a few of our customers talk about that journey: how you cut the operational overhead of, let’s say, the way tickets are managed and the way you gather their information. If you can automate a lot of those things, then you can focus the energy of those resources on the service creation part. And that’s real value.
Absolutely. And to be very honest, I firmly believe re-skilling and up-skilling team members working on the ground is a must. It is not an option. We need to bring them into confidence. We need to instill confidence in them. We need to explain to them that this will help them stay relevant in the current climate and also prepare them for the future.
That brings up a very, very important question that many people don’t think about, and I think our industry experts completely ignore. Is AIOps all about the technology?
Absolutely not. With any new technology, technology is just one aspect of it. I believe for a successful AIOps implementation, along with technology and re-skilling and up-skilling people, we also need to work on other pieces of the puzzle. One is streamlining the operational process, and the second one is the improvement of knowledge management.
How do we do that? To streamline the operational process, I suggest examining your current-state process to find the areas with multiple handoffs and the areas where automation can be achieved. After gathering all this information, streamline the process and formulate the solution. Second, to improve knowledge management, we need to understand how information is being managed today:
- Does knowledge of a critical system reside only with a few key resources?
- Are we recording the findings of the incidents?
- Are we doing blameless post-mortems and recording them?
Once you have answered these questions, I would suggest creating a knowledge repository and integrating this knowledge repository with the AIOps engine so that it can tap into it and suggest fast resolutions for the incident. This should ultimately help you reduce mean time to detect and mean time to resolve.
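As a rough illustration of what "tapping into" a knowledge repository could look like, the sketch below ranks recorded post-mortem articles against a new incident description using simple word overlap (Jaccard similarity). This is a deliberately naive stand-in, not how any particular AIOps engine works; `KnowledgeArticle`, `suggest_resolutions`, and the sample articles are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class KnowledgeArticle:
    title: str
    symptoms: str    # symptoms recorded in the post-mortem
    resolution: str  # what fixed it last time


def _tokens(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def suggest_resolutions(incident_text, articles, top_n=1):
    """Rank articles by Jaccard overlap between incident and recorded symptoms."""
    incident_tokens = _tokens(incident_text)
    scored = []
    for article in articles:
        symptom_tokens = _tokens(article.symptoms)
        union = incident_tokens | symptom_tokens
        score = len(incident_tokens & symptom_tokens) / len(union) if union else 0.0
        if score > 0:  # ignore articles with no words in common
            scored.append((score, article))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [article for _, article in scored[:top_n]]


repository = [
    KnowledgeArticle(
        "Payment gateway timeouts",
        "payment gateway timeout connection pool exhausted",
        "Recycle the connection pool and raise the pool limit.",
    ),
    KnowledgeArticle(
        "Batch job overrun",
        "nightly batch job overrun disk space low",
        "Purge staging files, then rerun the job.",
    ),
]
matches = suggest_resolutions("Customers report payment timeout at gateway", repository)
print(matches[0].resolution)
```

The quality of the suggestion depends entirely on what was recorded: if incident findings and post-mortems are not captured, there is nothing for the matching step to find.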
That’s part of that actionable data lake, right? Without the knowledge, there’s no way we would know what to do with our algorithms.
To learn more about AIOps and the financial industry and much, much more, register for Symposium 2021, taking place May 11 & 12 online.