At an application level with Vityl Capacity Management
In this guide, John Miecielica of Metavante provides a step-by-step example showing how he uses Vityl Capacity Management to analyze IT resource consumption at the application level. This capability is especially important in today’s environments, where multiple applications may run on a single server or multiple servers may be required to implement one application. It allows Metavante to solve complex performance and capacity management issues that are crucial to keeping its business running smoothly and efficiently.
Managing a Consolidated Environment: Issues and Challenges
Metavante has a server farm of more than 1,500 Windows, Linux, and Unix servers supporting a vast array of Java, .NET, and legacy applications. When Metavante first hosted applications on Windows and Unix servers in the 1990s, the standard mode of operation was one application per server. In the current era of virtualization, grid computing, and server reduction initiatives, the situation is more complex: one physical server might host multiple applications, or one application might span many servers.
One of the core activities for any capacity planning and performance management team is to track physical and logical resource consumption. Physical resources include CPU, memory, and I/O; logical resources include number of application threads and number of available database connections. In a shared resource environment, it becomes vitally important to correlate the resources consumed with the application or system process consuming them for a variety of reasons:
- Tracking resource consumption over time provides input to capacity planning activities at an application level.
- Tracking resource consumption is important not only after an application is in production, but also during the pre-deployment testing phase to ensure that it meets design specifications.
- Correlating resource utilization with other metrics on a given server allows us to troubleshoot performance issues. For example, we can identify the busiest I/O device during a short period when an application stops consuming user CPU.
Tracking Resource Usage with Workloads
Vityl workloads allow me to correlate the system resources consumed with the applications,
departments, or processes consuming them.
A workload is a logical classification of work on a computer system defined around a common characteristic, such as by application or department.
A workload set is a group of workloads that together represent the resources used by all activities on the systems being analyzed. Each workload set defines a different way of looking at performance.
Workloads and workload sets allow me the flexibility to view resource consumption in a variety of ways. I can track resource consumption by application, for example, to measure how well we’re meeting service levels and to uncover potential performance or capacity concerns before users are affected.
Defining Workloads: A Drill-Down Approach
Due to the nature and complexity of the environment being managed, a drill-down approach to defining workloads is appropriate. For example, a high-level view of a given environment would consist of the following defined workloads:
- Operating Environment
- Application Component 1
- Application Component 2
- Application Component 3
However, if we encounter either a capacity concern or a performance issue, we may need to quickly re-characterize work into the following categories:
- Operating Environment
- Operating System
- Disk and Backup
- Network
- Security
- Application Component 1
- Application Component 2
- Application Component 3
Or perhaps:
- Operating Environment
- Application Component 1
- Application Component 2
- Application Component 2a
- Application Component 2b
- Application Component 2c
- Application Component 3
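The drill-down idea in the lists above can be sketched in a few lines of code. This is purely illustrative: the data structures and the `expand` helper are my own, not Vityl's configuration format; only the workload names come from the example.

```python
# Sketch of three ways to characterize the same environment.
# The nesting structure and helper are illustrative, not Vityl syntax.
high_level = ["Operating Environment", "Application Component 1",
              "Application Component 2", "Application Component 3"]

# Drill into the operating environment when infrastructure is suspect:
infrastructure_detail = {
    "Operating Environment": ["Operating System", "Disk and Backup",
                              "Network", "Security"],
}

# Or drill into a single application component:
application_detail = {
    "Application Component 2": ["Application Component 2a",
                                "Application Component 2b",
                                "Application Component 2c"],
}

def expand(base: list[str], detail: dict[str, list[str]]) -> list[str]:
    """Replace each top-level workload with its finer-grained
    sub-workloads wherever a drill-down is defined."""
    out: list[str] = []
    for workload in base:
        out.extend(detail.get(workload, [workload]))
    return out

print(expand(high_level, infrastructure_detail))
```

The point of the structure is that the high-level view stays the everyday default, and a finer-grained characterization is swapped in only when a capacity or performance question demands it.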
I set out to see if we could meet these business objectives with the components of Vityl Capacity Management. Here is what I found.
Finding Answers with Vityl Capacity Management
Step 1: Set-up
My research indicates that we need access to the fullcmd field. By default, Vityl collects only the first 72 characters of this field. Some of our applications have fullcmd values of more than 1,000 characters, so we modify the collection agent, via the Vityl administrative interface, to collect the last 128 characters of fullcmd instead.
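To see why the trailing characters matter, here is a minimal sketch. The command string, property names, and the `truncate_fullcmd` helper are hypothetical; the 72- and 128-character figures are the ones from our configuration.

```python
def truncate_fullcmd(fullcmd: str, keep_last: int = 128) -> str:
    """Keep only the trailing characters of a long command line,
    mimicking our agent configuration (illustrative, not Vityl code)."""
    return fullcmd[-keep_last:]

# A Java launch command can exceed 1,000 characters; the distinguishing
# token sits deep inside the string, so the default first-72-character
# collection would lose it. Everything below is made up for illustration.
cmd = "/usr/java/bin/java " + "-Dprop=value " * 80 + "com.example.Main CEBNG_D"

print(len(cmd) > 1000)                      # the field is very long
print("CEBNG_D" in cmd[:72])                # lost under the 72-char default
print("CEBNG_D" in truncate_fullcmd(cmd))   # preserved in the last 128 chars
```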
Step 2: Understanding our environment and practicing workload definitions
The next step is to get a list of the most active processes and start classifying them into the categories that we want to monitor. One of the easiest ways to do this is with Vityl KPIs and the process table.
Start with a CPU chart and then drill down into the process table.
Once in the process table, enable the “fullcmd” field and move it toward the left side of the display. This is especially important in the Unix environment, where the only differentiators between processes are often found in the fullcmd field. To accomplish this, I simply use the standard GUI to change and reorder the display fields.
The report shows that several of the top CPU-consuming processes are Java processes that all look similar to one another. However, there is one differentiator: in the middle of the fullcmd string, some contain CEBNG_D whereas others contain CEBNG_E. I asked our application team contact about this, and he explained that the application is organized into “containers” and that the containers on this server are labeled “D” and “E.” I further noted that, when troubleshooting issues, it is important to distinguish between the containers. We agreed that this distinction would be very important in capacity planning as well.
Before defining workloads, I find it best to test my selection criteria again using the process table in Vityl. The resultant process table view confirms that we capture just the processes we want.
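The kind of selection criterion being tested can be sketched as a simple pattern match. The process records and the `classify` function below are hypothetical stand-ins, not Vityl's selection syntax; only the CEBNG_D/CEBNG_E tokens come from the example.

```python
import re

# Hypothetical process-table rows; only fullcmd matters for classification.
processes = [
    {"pid": 101, "fullcmd": "java -server -Xmx2g ... CEBNG_D ... com.app.Main"},
    {"pid": 102, "fullcmd": "java -server -Xmx2g ... CEBNG_E ... com.app.Main"},
    {"pid": 103, "fullcmd": "/usr/sbin/sshd -D"},
]

def classify(fullcmd: str) -> str:
    """Assign a workload based on the container label found in fullcmd.
    The patterns are illustrative, not Vityl's workload definition syntax."""
    if re.search(r"CEBNG_D", fullcmd):
        return "Container D"
    if re.search(r"CEBNG_E", fullcmd):
        return "Container E"
    return "Operating Environment"

for p in processes:
    print(p["pid"], classify(p["fullcmd"]))
```

Dry-running the criteria against the live process table like this, before committing the workload definitions, is what confirms we capture exactly the processes we intend.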
Step 3: Set up workloads
Once I’ve decided how to define the workloads, setting them up inside Vityl is trivial. To start, I go to the administrative interface on the target server and select “Workload Policy.”
I can select standard features such as adding a new workload set, or I can click on an existing workload to modify it.
Once in the workload set, I can add various workload definitions or modify existing ones.
Once I have all the workloads defined, I simply activate my changes and Vityl applies the new workload definitions to newly collected performance data.
Step 4: Using the workload definitions
Now I can open predefined workload reports in Vityl Capacity Management.
The report shows the CPU utilization broken down into the workloads we defined.
I can then drill back down to the process table and include “workload: Application Workloads” as a field. Next, I click on the new column header and click “Set.” I get the following report.
Now I can quickly identify the top CPU workload, the top I/O workload, and so on. Another useful graph is lioch, a pre-defined workload report that allows me to easily identify which workload is responsible for high I/O waits.
Step 5: Capacity Planning and Modeling
These workloads are carried over from one component of Vityl Capacity Management to another for seamless modeling, what-if analysis, and capacity planning. Since these workloads are now discrete objects, I can change the rate of growth on each one independently.
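The value of workloads as discrete objects is that each can carry its own growth assumption in a what-if projection. The sketch below illustrates the arithmetic only; the utilization numbers and growth rates are invented, and real modeling in Vityl accounts for far more than this.

```python
# What-if sketch: project CPU demand with an independent growth rate per
# workload. All figures are made up for illustration.
current_cpu = {          # percent CPU consumed today, per workload
    "Container D": 20.0,
    "Container E": 35.0,
    "Operating Environment": 10.0,
}
annual_growth = {        # assumed compound annual growth, per workload
    "Container D": 0.30,
    "Container E": 0.05,
    "Operating Environment": 0.0,
}

def project(years: int) -> dict[str, float]:
    """Compound each workload's utilization forward independently."""
    return {w: round(u * (1 + annual_growth[w]) ** years, 1)
            for w, u in current_cpu.items()}

print(project(2))
```

A single aggregate growth rate would hide the fact that, in this made-up scenario, Container D dominates the capacity picture within two years.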
Summary
- Moving Forward: Vityl workloads allow me to reprocess previously collected data against newly defined workloads. With this capability, we will be creating a drill-down process for workload definitions. Right now, most of our workload definitions divide work generically between infrastructure components and major application components. However, if a noticeable change occurs or a problem develops in the infrastructure workload, we reprocess the data using different workload definitions to show which technical discipline (system, storage, security, etc.) needs to review the issue.
- Hidden Gem — Input to Chargeback Systems: In order to apply this workload methodology across our server farm, a strong naming convention is needed to identify processes and the workload to which they belong. This can be derived from either the command or fullcmd fields. Once the naming convention is in place, we can use the performance data as input to our chargeback system because we will be able to identify which application is driving resource utilization on a given server where multiple applications share a common server.
- Conclusions: Strong workload analysis is key to solving business issues in performance analysis, capacity planning, and chargeback. The ease of use and flexibility of TeamQuest software help solve these key issues. The key differentiators I have found with TeamQuest are:
- Ease of use – All of the steps outlined above worked the first time I tried them, and I did not have to call TeamQuest Support with so much as a clarification question. The software is extremely easy to implement and very intuitive.
- Flexibility – The ability to post-process data so that multiple workload sets can be applied against the same data is key, because you never know which workload you may need to subdivide to resolve a business issue.
I have tried other performance data collection software and found the workload classification process in those products so cumbersome and inflexible that we abandoned their use.
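The chargeback idea described in the summary can be sketched briefly. A naming convention embedded in the workload name identifies the owning application, so per-server workload measurements can be rolled up into per-application totals. Everything below is hypothetical: the workload names, the underscore-prefix convention, and the measurement figures are mine, not Metavante's or Vityl's.

```python
from collections import defaultdict

# Hypothetical per-server workload measurements (CPU-seconds per interval).
measurements = [
    ("server01", "APP1_web",   1200.0),
    ("server01", "APP2_batch",  800.0),
    ("server02", "APP1_db",    3000.0),
]

def application_of(workload: str) -> str:
    """Derive the owning application from a naming convention: here,
    the prefix before the first underscore (an assumed convention)."""
    return workload.split("_", 1)[0]

def chargeback(rows) -> dict[str, float]:
    """Roll workload measurements up into per-application totals,
    even where multiple applications share a common server."""
    totals: dict[str, float] = defaultdict(float)
    for _server, workload, cpu in rows:
        totals[application_of(workload)] += cpu
    return dict(totals)

print(chargeback(measurements))
```

The aggregation works across servers precisely because the convention travels with the workload name rather than with the machine.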