In a competitive and challenging economic landscape, business productivity takes precedence over other priorities. Without service assurance, poor service performance degrades the customer experience, reduces business productivity and eventually results in lost revenue. Since both lost productivity and lost revenue are quantifiable and can be tied back to quality of service, IT infrastructure and operations teams need to explore best practices for resolving performance issues.
However, this is a complex process because problems can originate anywhere across the service delivery infrastructure, including networks, servers, service enablers and applications. In a recent survey, Forrester found that 91 percent of senior IT decision makers at large North American firms responsible for application, network and/or service monitoring technology cited problem identification as the primary area needing improvement. The survey also found that one hour of service downtime costs $29,162 on average, which puts a single 24-hour outage at nearly $700,000. Since half of these respondents reported that 90 percent of their IT issues take more than 24 hours to resolve, the annual cost of brownouts and service downtime quickly escalates into the millions.
Downtime is not the only expense: nearly three quarters of respondents are grappling with more than 10 monitoring tools, spanning Network Performance Management (NPM), Application Performance Management (APM) and log data analytics, just to discover issues. This excess of silo-specific tools inhibits service triage, prevents cohesive integration and drives up IT operational costs.
In order to rein in costs and drive proactive, effective service assurance that increases employee productivity, IT decision makers should consider the following five recommendations:
- Assess the cost of your firm’s service outages. Gathering data is crucial for organizations that wish to streamline costs and simplify day-to-day IT workflow by adopting a more top-down approach to service assurance. Assessing the cost of service outages lets you quantify lost employee productivity, lost revenue, eroded customer confidence and the burden placed on IT staff (a simple cost model is sketched after this list). By adding these numbers up, CIOs and other decision makers are armed with strong metrics that demonstrate the ROI of improving monitoring tools, processes and infrastructure.
- Get a holistic view of your service delivery infrastructure. IT plays a critical role in assuring the quality of service that ultimately affects the company’s bottom line. Problem identification and alerting capabilities are crucial, and time is of the essence. In-depth visibility into your service delivery infrastructure is the key to resolving problems that affect an organization’s performance and availability management. However, the current siloed model makes it difficult to monitor performance end to end, proactively detect service degradations and discover their root cause. Some 93 percent of survey respondents see value in re-prioritizing their tools and shifting to a holistic, top-down approach.
- Consider different approaches to performance and availability problems. Enterprises often operate in specialized silos that not only stifle collaboration and innovation across teams, but also cause inefficiency and directly impact the bottom line. Organizations need to move beyond specialized APM, NPM and log analysis tools because such component-based tools lack a holistic view of the service delivery environment, which complicates triage and extends mean time to resolution (MTTR). A bottom-up approach built on these siloed tools is ineffective because it provides no visibility into the dependencies among individual service components. These point tools become barriers to collaboration among IT staff, prolong MTTR and ultimately increase operational costs. As a result, revenue and customer satisfaction suffer as well.
- A global framework approach may not be the right solution. Organizations should resist the temptation to adopt a global framework approach: automating the aggregation, normalization, correlation and contextual analysis of large volumes of disparate data sets in real time, from multiple sources, is complex and difficult to implement and maintain across an entire service delivery infrastructure. A better approach is to leverage products that give IT staff holistic visibility into service delivery based on deep, packet-level insights, so they can pinpoint where service is degrading and ensure enterprise networks are always on.
- A simple, global and effective approach comes from the network. Forrester Research found that problems can be caused by both hardware and software, so monitoring tools that focus only on single elements are ineffective. Organizations should instead follow a simple, global, top-down service assurance approach that relies on one consistent and cohesive set of metrics generated through deep analysis of network traffic. Because complex services are increasingly delivered on abstracted, opaque infrastructure, service degradations and outages can originate anywhere. Ideally, this approach would be based on a specialized technology that offers contextual, top-down workflows from a single pane of glass to proactively detect service degradations and quickly identify the failed service delivery component.
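To make the first recommendation concrete, the sketch below shows one way to add these numbers up. Only the $29,162-per-hour downtime figure comes from the survey cited above; the head counts, rates and the outage_cost() helper are illustrative assumptions, not figures from the article.

```python
# Illustrative sketch: quantifying the cost of one service outage.
# HOURLY_DOWNTIME_COST is the Forrester survey average cited above;
# every other number is a hypothetical placeholder to adapt.

HOURLY_DOWNTIME_COST = 29_162  # USD per hour of downtime (survey figure)

def outage_cost(hours_down: float,
                employees_idled: int = 200,        # assumption
                loaded_hourly_wage: float = 55.0,  # assumption, USD
                it_staff_hours: float = 40.0,      # assumption
                it_hourly_rate: float = 85.0) -> dict:
    """Break an outage into the cost categories named in the article:
    lost revenue, lost employee productivity and the burden on IT staff."""
    lost_revenue = hours_down * HOURLY_DOWNTIME_COST
    lost_productivity = hours_down * employees_idled * loaded_hourly_wage
    it_burden = it_staff_hours * it_hourly_rate
    return {
        "lost_revenue": lost_revenue,
        "lost_productivity": lost_productivity,
        "it_burden": it_burden,
        "total": lost_revenue + lost_productivity + it_burden,
    }

# One 24-hour outage: lost revenue alone is 24 * 29,162 = $699,888,
# so a handful of such incidents per year runs into the millions.
print(outage_cost(hours_down=24))
```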
Rather than taking the aforementioned recommendations into consideration, many enterprises acquire and implement networking tools in an ad-hoc fashion, which only helps identify a problem after it has happened. With traditional IT service degradation and outage monitoring and resolution processes, enterprises waste a great deal of time identifying where a problem originated.
Since modern IT involves several elements beyond the simple network layer, the best way to reduce that time is by adopting a holistic approach to service assurance and application-oriented network performance monitoring.
That holistic view should leverage network traffic analysis combined with a consistent and cohesive set of metrics, derived from real-time deep analysis of the traffic traversing the service delivery infrastructure.
These metrics ultimately save time and money, and should include key performance indicators such as application traffic volumes, server response times and throughput, aggregate error counts, and error counts specific to individual application servers and domains.
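As a minimal sketch of what aggregating those indicators might look like, assume transaction records have already been derived from observed traffic; the Transaction fields and the summarize() helper below are hypothetical stand-ins for the richer data a packet-based monitoring product would provide.

```python
# Minimal sketch: computing the KPIs named above from hypothetical
# per-transaction records derived from network traffic.
from collections import Counter
from dataclasses import dataclass

@dataclass
class Transaction:          # one observed application transaction (assumed schema)
    server: str             # responding application server
    domain: str             # service domain the request belongs to
    bytes_transferred: int  # request + response payload size
    response_ms: float      # server response time in milliseconds
    error: bool             # True if the transaction failed

def summarize(txns: list[Transaction], window_s: float) -> dict:
    """Aggregate the article's KPIs over one monitoring time window."""
    volume = sum(t.bytes_transferred for t in txns)   # traffic volume
    avg_resp = sum(t.response_ms for t in txns) / len(txns)
    errors = [t for t in txns if t.error]
    return {
        "traffic_volume_bytes": volume,
        "avg_server_response_ms": avg_resp,
        "throughput_bps": volume * 8 / window_s,      # bits per second
        "aggregate_error_count": len(errors),
        "errors_by_server": Counter(t.server for t in errors),
        "errors_by_domain": Counter(t.domain for t in errors),
    }

sample = [Transaction("app-01", "payments", 48_000, 120.0, False),
          Transaction("app-02", "payments", 51_000, 410.0, True),
          Transaction("app-01", "search",   12_000,  35.0, False)]
print(summarize(sample, window_s=60.0))
```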
A true, holistic view across the entire service delivery infrastructure should be available through a single pane of glass that provides contextual, top-down workflows, so IT staff can proactively find service degradations before they become major problems that hurt customer satisfaction and drain revenue through high operational expenses and repeated losses.
Michael Segal is director of solutions marketing at NetScout. He is a seasoned product management, product marketing and business development professional with experience in all aspects of product and solution marketing.
Published under license from ITProPortal.com, a Net Communities Ltd Publication. All rights reserved.