Sr. Director, App Management Strategy, ScienceLogic
While the concept of application dependency mapping (ADM) is self-evident, implementing it is a little more complicated. There are different approaches and levels that can produce different results. Depending on which approach you use for automated discovery and mapping, you may risk not seeing your IT ecosystem in its entirety, because some older systems are engineered to operate in a specific environment or for specific management platforms.
Within the realm of agent and agentless discovery, there are three well-established techniques for accomplishing ADM, all of which have their pros and cons:
1. Sweep and Poll
Sweep and poll is the oldest and lightest-weight approach. It sweeps a range of IP addresses with pings, interrogates each device that responds to determine its type (e.g., a server), and gathers information such as what's running on that server and which application components it's using. This method often relies on "blueprints" (fingerprints) of known applications: it searches the environment for clues and identifies pieces that, when put together, frame out the larger structure of the business application.
PROS
- From a singular location, you can remotely sweep an entire network structure or data center.
- From an instrumentation perspective, its light footprint makes it incredibly attractive: there are no agents or appliances to deploy.
CONS
- The dynamic nature of today’s IT environment (VMs, containers, autoscalers) complicates sweep and poll’s ability to accurately survey and capture what’s taking place and changing within an ecosystem.
- Sweeping a datacenter takes a long time, leading some organizations to do this nightly or even weekly, creating a “strobe light” effect: you can see when the light is flashing, but much can change in the subsequent period of darkness.
- Fingerprints work fairly well for off-the-shelf applications but are not helpful for custom applications or applications not in the fingerprint library.
- Sweep and poll is poor at learning the dependencies between the different application components.
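As a minimal sketch of the technique, a sweep boils down to probing addresses and matching whatever answers against a library of known signatures. The port-to-service fingerprints below are illustrative stand-ins, not a real blueprint library:

```python
import socket

# Hypothetical fingerprint library: map well-known ports to a guessed role.
# Real ADM tools carry far richer blueprints; this mapping is illustrative.
FINGERPRINTS = {22: "ssh", 80: "http", 443: "https", 3306: "mysql", 5432: "postgres"}

def scan_host(host: str, ports, timeout: float = 0.25):
    """Return {port: guessed_service} for ports that accept a TCP connection."""
    found = {}
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            if s.connect_ex((host, port)) == 0:  # 0 means the port is open
                found[port] = FINGERPRINTS.get(port, "unknown")
    return found

def sweep(hosts, ports):
    """Sweep a list of hosts serially; doing this across a whole datacenter
    is what makes real sweeps slow enough to run nightly or weekly."""
    return {h: scan_host(h, ports) for h in hosts}
```

Note what the sketch cannot do: it labels open ports, but nothing in the result says which components talk to each other, which is exactly the dependency gap described above.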
2. Network Monitoring
Network monitoring looks at network traffic patterns. It can be done either at the packet level through packet capture (which requires an appliance that monitors the packets) or at the flow level through NetFlow (where the routers themselves act as the probes and export NetFlow records of the traffic).
PROS (NetFlow & Packet)
- Sees network traffic in real-time, so dependencies and changes can be detected immediately.
- The actual truth: operational deployments often differ from the initial design, and the gap between a developer's whiteboard sketch and what is really happening can be dramatic.
- Not dependent on pre-built blueprints: it requires no foreknowledge of what the application should look like.
CONS (NetFlow & Packet)
- Scale: NetFlow is great for WANs, but not so good in datacenters (where applications live). Modern link speeds (10, 40, and 100 Gbps) can produce billions of flow records from every interface. Since traffic typically crosses multiple devices, many flow records are duplicates (though not identical, for interesting reasons). Most network devices cannot export NetFlow on all interfaces at datacenter speeds due to the processing burden, and few monitoring tools can process and analyze the massive volume of raw flow data.
- Flow records show only IP addresses and TCP ports, so they cannot differentiate application-level dependencies. For example, if two web applications (A and B) are hosted on the same server and NetFlow shows an outbound flow to a backend database, there is no way to tell whether the database traffic belongs to application A or application B.
- Technologies like NAT, load-balancers, firewalls, proxies, and tunnels complicate the process of piecing back the flows into the broader application.
- Cost and placement: packet-capture appliances connect to specific points of your network/datacenter and capture packets. This data can be used to map application flows (similar to NetFlow). However, they only provide visibility where the probes are placed.
- It can be expensive—and sometimes impossible—to put packet-capture everywhere you need (or want) to see every application flow in a datacenter. This leaves islands of visibility, and not the holistic view that you were initially seeking.
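The port-ambiguity problem above can be shown with a toy model of a flow record. The record layout and addresses are invented for the example; real NetFlow v5/v9 records carry more counters, but still no process identity:

```python
from collections import namedtuple

# A NetFlow-style record carries only the layer-3/4 tuple plus counters;
# no field identifies which application process generated the traffic.
Flow = namedtuple("Flow", "src_ip src_port dst_ip dst_port proto bytes")

# Two different web apps (A and B) on the same host, both querying one DB.
flow_from_app_a = Flow("10.0.0.5", 51012, "10.0.0.9", 5432, "tcp", 8400)
flow_from_app_b = Flow("10.0.0.5", 51344, "10.0.0.9", 5432, "tcp", 12800)

def dependency_key(flow):
    """Collapse a flow to the (client, service) edge a mapper would draw."""
    return (flow.src_ip, flow.dst_ip, flow.dst_port)

# Both apps collapse to the same edge (10.0.0.5 -> 10.0.0.9:5432); the
# ephemeral source ports differ per connection and don't identify the app.
assert dependency_key(flow_from_app_a) == dependency_key(flow_from_app_b)
```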
3. Agent on Server
Agents monitor both incoming and outgoing traffic in real time, letting them find and understand every component and immediately recognize changes as the topology changes.
PROS
- Agents perform in real-time, so if something spins up or down, they will capture and report what’s taking place, offering immediacy benefits that are increasingly relevant with today’s use of ephemeral technologies.
- It’s easier to push out agents than expensive pieces of hardware, which makes this approach less expensive than packet-capture appliances.
- Agents have internal resolution that allows them to differentiate between applications running on the same IP address, which takes on increasing significance when multiple applications share a server.
CONS
- You need to put agents everywhere, or else you run the risk of not having complete visibility.
- You have to know what you’re trying to monitor and remember to put an agent on it. Agents’ ability to see one hop away partially mitigates this, but coverage still relies on human memory.
- Cost can come into play: the price of putting an agent on every server quickly adds up.
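A sketch of why process-level visibility resolves the ambiguity that defeats flow records: an agent reads the host's socket table (e.g., /proc/net/tcp on Linux) along with the owning process. The connection records and process names below are invented stand-ins for that view:

```python
# What a server-side agent sees: each connection is attributed to a process,
# which is what lets it split traffic from two apps sharing one IP address.
agent_view = [
    {"pid": 3110, "process": "app-a", "raddr": ("10.0.0.9", 5432)},
    {"pid": 3214, "process": "app-b", "raddr": ("10.0.0.9", 5432)},
]

def edges_by_process(conns):
    """Group remote endpoints by owning process: a per-app dependency map."""
    edges = {}
    for c in conns:
        edges.setdefault(c["process"], set()).add(c["raddr"])
    return edges

# Unlike a flow record, the agent attributes each DB connection to its app:
# {"app-a": {("10.0.0.9", 5432)}, "app-b": {("10.0.0.9", 5432)}}
```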
There’s also a fourth, emerging source of ADM, which leverages orchestration platforms themselves. Platforms like Kubernetes, Cisco CloudCenter, or ACI deploy and maintain all of the underlying application components. As a result, the orchestrator knows at any given point which individual components are part of a specific application.
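A minimal sketch of the orchestration-based idea, using Kubernetes-style labels: the scheduler already tags every pod it places, so grouping pods by label reproduces the application map without sweeping or sniffing. The pod names and labels are invented; a real implementation would query the Kubernetes API rather than a hard-coded list:

```python
# A tiny stand-in for the pod inventory an orchestrator maintains.
pods = [
    {"name": "web-7f6d", "labels": {"app": "storefront", "tier": "frontend"}},
    {"name": "api-19ac", "labels": {"app": "storefront", "tier": "backend"}},
    {"name": "db-0", "labels": {"app": "storefront", "tier": "database"}},
    {"name": "cron-55", "labels": {"app": "billing"}},
]

def components_by_app(pod_list):
    """Group pod names under the application each belongs to, by label."""
    apps = {}
    for pod in pod_list:
        app = pod["labels"].get("app", "unlabeled")
        apps.setdefault(app, []).append(pod["name"])
    return apps
```

Because the orchestrator records this mapping at deploy time, the map stays current even as ephemeral pods spin up and down.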
Since today’s IT ecosystem is highly ephemeral and you constantly need to monitor, measure, and manage what’s taking place within the environment, a hybrid strategy that combines the best of multiple practices is required. For example, the ScienceLogic SL1 Platform can combine application maps from AppDynamics and augment them with maps from our agents to provide a real-time, thorough view of what’s taking place within the IT ecosystem.
In our next and final application dependency mapping blog, we will examine a number of common misconceptions regarding ADM implementation. In the meantime, to learn more about application dependency mapping, visit the SL1 application dependency mapping webpage.