Just like other parts of your organization, IT is tasked with doing more with less and stewarding existing resources to the fullest. At its core, this is an effort to improve the way the business runs and to manage cost. A capacity management practice plays a major role in helping your organization make the best use of IT resources.
Accurately identifying unused resources that can be repurposed is one of the key benefits of capacity management and a great way to improve efficiency. In this webinar we show how the latest version of Vityl Capacity Management makes identifying those resources really simple for you.
In this 30-minute webinar, you’ll see how to:
- Use the Key Performance Indicators component to identify underutilized resources
- Proactively reduce IT waste and improve efficiency
- Implement other best practices for IT cost management
Watch the on-demand webinar to learn how to improve IT efficiency at your organization.
Per Bauer: 00:15 Hi, welcome everyone to today's webinar. It's about our latest release of Vityl Capacity Management, and we're going to focus on how to improve IT efficiency and how the new release helps you achieve that.
Per Bauer: 00:33 For those who don't know me, my name is Per Bauer. I'm director of technical services at HelpSystems, responsible for working primarily with the Vityl Capacity Management suite. I was with TeamQuest prior to HelpSystems, and I've worked with this solution for many years. I'm going to take you through today's session and take any questions that you may have at the end.
Per Bauer: 01:08 So if we look at what we're going to cover today... We have a new version of Vityl Capacity Management coming out in July, version 2.4. I assume that some of the people on this call are new to our solution, so I'm going to spend a few slides at the beginning talking about Vityl Capacity Management at large: what the solution is capable of and what it looks like. Then we're going to focus on the new features, or new capabilities, that we've added in version 2.4.
Per Bauer: 01:41 And primarily, we're going to focus on the cost aspect, so how to drive efficiency in your environment, and then at the end, we'll do a summary. Previously, we've done one of these sessions for each new release. This time, since we've been able to get a lot of features into this release, we've decided to split it across multiple sessions. So this one is only going to be 30 minutes, and we're going to talk about Key Performance Indicators and efficiency. In subsequent webinars, one later in July and then one in the early fall, we'll talk about some of the other features and capabilities of this new version.
Per Bauer: 02:25 So since we only have half an hour to spend on this, I'm going to move ahead and cover this as quickly as possible. Vityl Capacity Management is a software suite or solution that addresses all the different aspects of capacity management. It covers everything from performance monitoring, so acquiring data about how the infrastructure or the systems and the applications have been performing based on observations or monitoring of those... To analyzing that data to find the reason or the root cause for different performance issues, over to doing forecasting and planning based on that data, so doing trend analysis or capacity modeling. The solution covers all this. It's divided into a number of components, but they all interact, and together they address the capacity management discipline as a whole.
Per Bauer: 03:37 When we designed Vityl Capacity Management some two years ago, we did a major refactoring of our previous solution and launched it as Vityl Capacity Management. We had a couple of key objectives that we wanted to make sure we addressed. First of all, we wanted it to have a simple and intuitive user interface. We believe that more people should do capacity management or be involved in capacity management, and one of the ways to achieve that is to make it simple, to make the learning curve a bit more attractive, and to invite people from the business side, not only the diehard capacity management people, to work with this solution, and actually push it out to a wider group of users.
Per Bauer: 04:28 Number two was to deliver useful insights out-of-the-box. That goes along the same vein as the first one. For less experienced people to use and benefit from the solution, we need to provide general recommendations and advice out-of-the-box. It should be simple to set up and be up and running quickly. You shouldn't have to spend weeks or months configuring and calibrating the solution to get results from it, so it should be out-of-the-box simple to use.
Per Bauer: 04:58 It needs to be scalable and extensible. You want to be able to start small, perhaps, and then as you grow add more resources to the architecture and by that scale it out, rather than having to provision for the end goal when you start using it.
Per Bauer: 05:18 And then we wanted to provide this in a single package. It's hard at the time of buying the solution or designing the solution to know exactly which parts you're going to use and which parts you're going to benefit from, so it's easier if we have this as one single package that does all of the things that we described in the previous slide in a simple way, so that it fits together, and it makes it easy to use.
Per Bauer: 05:48 So what we came up with was a solution that is made up of a number of different components. We have a component called Key Performance Indicators. That's the one we're going to focus on today. It's a high-level tool that allows you to quickly assess the health, risk, and efficiency of your systems. It's not meant for root cause analysis or to provide all the details. It's more around providing awareness to the users about certain problems or certain situations that they should deal with.
Per Bauer: 06:29 We have a troubleshooting component called Performance Monitor, which is the view to the real-time data that we collect all the way down to process level, so it allows you to find the root cause behind different behaviors, exactly which process, or which workload, or which container, or which VM was causing a specific problem and what resource they were running out of, et cetera, et cetera.
Per Bauer: 06:54 We have a component called Capacity Plans which allows you to do forecasting and planning, so predictive analytics based on the data. We build capacity models using the empirical data that we've collected or acquired from third party sources, and that allows you to answer what-if questions. Like, what if I grow this workload by a certain percentage? What if I migrate from this existing platform to a brand new one with slightly different characteristics? What will happen to my workloads, or my application, or my response times, if I do that?
Per Bauer: 07:37 And then, in addition to those, we also have a reporting component called Automated Analytics which allows you to create dashboards and reports of any of the data that we acquire, any of the data that we've analyzed and created, or all the insights that we've created through our analytics: create custom content and build your own type of reports. You can automate that and schedule those to be available to users, either proactively pushing them out, or publishing them into a portal framework or wherever you want to put that data.
Per Bauer: 08:12 So those are the components of Vityl Capacity Management, and this is the solution. What we've done in the last two years, as I referred to, is that we built the framework or the base components of Vityl Capacity Management over two releases in 2018, and then in March we released an update where we added quite a lot of new features and capabilities. Now in July we're releasing the second set of new features and capabilities, and this is going to go on. Each quarter we're going to release an update where we add more and more features.
Per Bauer: 08:59 So the two first releases allowed us to build this architectural runway that gives us a lot of momentum now. So in the quarterly releases we have, there's quite a lot of content in those, and it's quite easy for us to add new features to those, as you will see in the future.
Per Bauer: 09:16 In this new release, version 2.4, these are the main themes, or main capabilities, that we've added. We've added out-of-the-box support for Azure. You were able to bring in data from Microsoft's Azure public cloud before, but now it's out-of-the-box, so we have a ready-made integration that you can use. You can then use that data across the whole solution and analyze those types of systems the same way as all the other platforms that we support.
Per Bauer: 09:48 We've improved our high-resolution data collection and how you can analyze that data. We've added some new capabilities that allow you to keep that data separate from the baseline monitoring data that you collect. We have a slightly different approach to it than we've had in the first couple of releases of VCM. We'll cover that in a later webinar, but it's a very interesting feature that we will continue to develop, and in subsequent releases it will be an even more potent way of looking at high-resolution data.
Per Bauer: 10:26 We've created workload mechanisms. This is something we used to have back in the TeamQuest days. Now we've caught up with that, so we now add workload and process data reduction to our data collection mechanism. We're also adding an efficiency indicator to KPI. Key Performance Indicators was this high-level view that allowed you to look at a large population of systems at once and assess whether there were any problems, so we've added an efficiency view to that. I'll describe that in more detail later in this session.
Per Bauer: 11:05 And then we also added a demand calendar object, so that you can record data about future demand in your environment, record that somewhere, and make everyone using the Capacity Plans or Key Performance Indicators components aware of those specific demands that have been forecasted. There's also a group of other features that we've added that I'm not going to bring up here.
Per Bauer: 11:36 So we're going to focus on the efficiency aspect of KPIs today. As part of that, we're also going to talk a bit in general about how to do IT cost optimization, and what the best practices and the best ways of doing that are, and then we'll end by showing you a demo of what the efficiency scores in KPI look like.
Per Bauer: 12:03 So IT cost optimization as a discipline. It's basically about squeezing the cost of IT without jeopardizing the quality of the service or putting it at risk. One part of it is to avoid duplication and redundancy in your environment. That's normally the responsibility of portfolio management in an organization: making sure that you don't have multiple services or multiple applications providing identical or similar capabilities. Over time, the goal must be to consolidate those to fewer applications and by that avoid duplicating capabilities and features and using more resources than necessary.
Per Bauer: 12:50 The other one is making sure that you don't have dormant or outdated assets in your environment. That could be part of capacity management in a way, but primarily it would be the responsibility of lifecycle management: understanding which assets are outdated and should be end-of-lifed and put to sleep.
Per Bauer: 13:09 And then, you have those circumstances or those occasions where the provisioned resources, or the requested resources, don't really match the real demand, and in those cases, capacity management should be there to safeguard those and make sure that you're actually using your resources in the best possible way.
Per Bauer: 13:36 If your forecasted behavior doesn't turn out to be true, or reality doesn't match the forecast, you need to recognize that. You need to identify that and reclaim or repurpose those resources. So capacity management is involved in overall IT cost optimization. It plays a big part in achieving that cost optimization.
Per Bauer: 14:04 So what does IT cost optimization entail? There are a number of proactive measures that you can take before the cost is actually incurred. You can have policies and guidelines around how to provision resources, what types of resources are allowed, and what the different checkpoints are that you need to go through before you hand out or provision your resources.
Per Bauer: 14:29 And you should probably enforce some sort of standard configurations: small, medium, large, T-shirt-size type of configurations. How those look in detail will vary from company to company and organization to organization, and they may actually change over time. Based on your experience, you may modify them over time. But it's always a good idea to have a standard set of configurations that you allow, rather than asking users to define for themselves what type of resources they would like to have.
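The T-shirt-size idea Per describes can be sketched as a simple policy check at provisioning time. The size names, resource values, and function name below are invented for illustration; they are not Vityl's or any product's actual defaults:

```python
# Hypothetical sketch of enforcing standard "T-shirt size" configurations.
# A request for anything outside the approved catalog is rejected up front.

STANDARD_SIZES = {
    "small":  {"vcpus": 2, "ram_gb": 8,  "disk_gb": 100},
    "medium": {"vcpus": 4, "ram_gb": 16, "disk_gb": 250},
    "large":  {"vcpus": 8, "ram_gb": 32, "disk_gb": 500},
}

def request_vm(size: str) -> dict:
    """Return the approved configuration, or refuse a non-standard request."""
    if size not in STANDARD_SIZES:
        raise ValueError(
            f"Non-standard size '{size}'; choose from {sorted(STANDARD_SIZES)}"
        )
    return STANDARD_SIZES[size]

print(request_vm("medium"))  # {'vcpus': 4, 'ram_gb': 16, 'disk_gb': 250}
```

The point is that users pick from the catalog rather than specifying arbitrary resources, and the catalog itself can be revised over time as experience accumulates.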
Per Bauer: 15:07 And then, having activity-based cost allocation, so allocating cost in relationship to actual consumption will always make the consumers a bit more thoughtful about how they request resources. So long-term, that's really probably the best and the most efficient way of controlling how much resources are being provisioned and that those provisioned resources are in line with the actual demand. So implementing activity-based cost allocation is always a good idea.
Per Bauer: 15:44 But sometimes a chargeback where you charge the different business units or the different consumers for what they actually use is too hard to sell, or it's too much of a cultural change. Going from having nothing to a full chargeback system is not realistic, especially for traditional on-prem IT.
Per Bauer: 16:10 In public cloud, all this becomes different because in public cloud you pay for what you use, and it's much easier to allocate that cost across different business units. But for these on-prem data center resources that are traditionally maybe owned by someone, you don't necessarily charge out exactly based on activity to your different customers.
Per Bauer: 16:34 So when you do this, you normally go through these maturity steps where you start by just observing how much resources are being used by the different tenants, or the different customers, or different business units in your organization. So simple usage metering and then you can present back on some sort of percentage of how they're being used and at least get some level of awareness around this.
Per Bauer: 16:59 The next step is doing showback, where you take the running costs. You actually analyze the running cost of your data center, and then you multiply that by usage to get actual numbers for how much IT for this specific business unit, or for this specific service, is costing us. And then you put that in perspective against the benefit of that service, or the overall benefit of that business unit.
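The showback arithmetic described here, running cost multiplied by each consumer's share of measured usage, can be sketched in a few lines. All figures and names are made up for illustration:

```python
# Hedged sketch of a showback calculation: allocate a data center's total
# running cost to business units in proportion to their measured usage.

def showback(total_monthly_cost: float, usage_by_unit: dict) -> dict:
    """Split the running cost by each unit's fraction of total usage."""
    total_usage = sum(usage_by_unit.values())
    return {
        unit: round(total_monthly_cost * used / total_usage, 2)
        for unit, used in usage_by_unit.items()
    }

# e.g. a 100,000/month data center split by CPU-hours consumed per unit
costs = showback(100_000, {"retail": 500, "logistics": 300, "hr": 200})
print(costs)  # {'retail': 50000.0, 'logistics': 30000.0, 'hr': 20000.0}
```

These numbers are reported back to the units rather than invoiced; moving from this to actual chargeback is the cultural step discussed next.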
Per Bauer: 17:33 And then ultimately, you get to a level where you have a practice around chargeback, where you send out invoices and have your different tenants or customers pay for what they actually use. When you implement this, it's important that it's fair and consistent, so that it actually gives you a platform to evaluate your services in the proper way. It's not a trivial task, and it normally takes some time to get to this point, so it's this maturity curve that you have to work your way through.
Per Bauer: 18:15 In terms of corrective actions, so what you can do after the cost has already been incurred: you're faced with a situation where you have infrastructure in your data center or in the cloud that is being used by your services. You need to constantly run a discovery for wasted resources, to analyze where there are wasted resources, and address that by reclaiming or repurposing those resources.
Per Bauer: 18:46 You have to have policies for that reclamation and right-sizing, because if you've already given it away to the business units or the application owners, it's not a trivial task to reclaim it. Up front, you need to have a policy for who owns what, and who is allowed to do what in your operation, so that you can actually correct your mistakes afterwards. Otherwise, you're bound to those mistakes, and you can never turn things around.
Per Bauer: 19:16 And then, this needs to be a continual improvement process. We talked about this before. Over time, you'll probably find better ways of offering those standard configurations you have on the proactive side, or make corrections to how they are offered, so you have to have this continual improvement process.
Per Bauer: 19:34 You also need to understand your forecasts. When you get a forecast about growth for a time period, and you provision based on that forecast, you need to follow up and compare the actual outcome with the forecast, understand if there are any systematic errors in those forecasts, and compensate for those. You need to assign probability measures based on timelines, et cetera.
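The follow-up step described here, checking forecasts against actuals for a systematic error, amounts to computing a bias. A minimal sketch with invented data (the function name and figures are not from the product):

```python
# Sketch of forecast follow-up: mean signed error across periods.
# A persistently positive bias means you systematically over-forecast
# (and over-provision); a negative one means you under-forecast.

def forecast_bias(forecast, actual):
    """Mean signed error as a fraction of the actual value."""
    errors = [(f - a) / a for f, a in zip(forecast, actual)]
    return sum(errors) / len(errors)

# Quarterly CPU-demand forecasts vs. what was actually consumed
forecast = [120, 150, 180, 210]
actual   = [100, 130, 150, 175]
bias = forecast_bias(forecast, actual)
print(f"{bias:+.1%}")  # a consistently positive bias suggests trimming future estimates
```

Saving this history, as the next paragraph argues, is what makes such an analysis possible at all.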
Per Bauer: 19:59 So you're never really finished with this. It's a continual improvement process that needs to be worked on all the time, and you need to save as much data about this as possible, so you can do the full-blown analysis afterwards and compare the outcome with the prognosis, to understand and better learn about your systems.
Per Bauer: 20:20 So this is IT cost optimization in a nutshell. Let's focus on the discovery of wasted resources, because that's really one of the areas where you need a tool that can do this for you in a safe way, in an automated way, in a scalable way. What we've created is a view in Key Performance Indicators that does this for you: it analyzes and reports on the efficiency of the infrastructure being used.
Per Bauer: 20:48 So Key Performance Indicators, for those of you who aren't aware or haven't seen it before, is basically a view or a window into all the systems in your environment, so you can analyze thousands of systems to identify the few that need attention. All the systems are analyzed and then ordered by severity, so you can focus on the ones where it's going to hurt the most or where you're wasting the most resources.
Per Bauer: 21:16 It's focused on providing some sort of RAG metrics or simplified metrics: red, amber, green, good or bad, whatever you call them. It's more of an indicator of where deeper analysis is required; it's not the full-blown analysis. You can also aggregate these individual systems or infrastructure items up to group levels to symbolize things like services or business lines, et cetera, to make it more useful.
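The RAG indicators and group roll-up Per mentions can be illustrated with a toy version. This is not the product's actual scoring logic; the thresholds and the worst-member aggregation rule are assumptions chosen for the sketch:

```python
# Illustrative RAG scoring: classify each system by a simple utilization
# threshold, then roll a group (service, business line) up to the worst
# status among its members.

def rag_status(utilization_pct: float) -> str:
    """Classify one system; the 75/90 thresholds are invented for this sketch."""
    if utilization_pct >= 90:
        return "red"
    if utilization_pct >= 75:
        return "amber"
    return "green"

def group_status(utilizations: list) -> str:
    """A group inherits the worst color among its member systems."""
    severity = {"green": 0, "amber": 1, "red": 2}
    return max((rag_status(u) for u in utilizations), key=severity.get)

print(group_status([40.0, 78.5, 62.0]))  # amber
```

The value of the simplification is triage: a sea of green with a few ambers and reds tells you where deeper analysis is worth the time.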
Per Bauer: 21:52 So we've had the health indicator for quite a while. It looks at how a service, or its components, has been performing over the last 24 hours. Then there's the risk indicator, which looks at the same thing as health but into the future: forecasting six months ahead, are there any performance issues or capacity issues in the six coming months?
Per Bauer: 22:17 And then we've added an efficiency view to this, looking for any unused resources that can be reconfigured or repurposed to lower the cost of your [inaudible 00:22:28]. The way this looks is a view like this, where you have the systems, one system per line, grouped. For each one, we analyze the activity levels for CPU, Disk IO, and Network IO with thresholds that can be tweaked if you want to, but out-of-the-box, it should give you a fairly good idea about how those systems are being used.
Per Bauer: 22:55 We use percentile filters to reduce the impact of outliers. If you have a single outlier, or a few, that can completely throw you off track, so we've implemented those to cleanse the data and make sure that we're analyzing the data in the right way.
Per Bauer: 23:17 You have multiple time ranges, which provide you context for how long a system has been unused: not just focusing on the here and now but going back as long as one year, so you can understand the business cycle, the cyclicality of events. Making sure that a system has actually been unused for a long enough time is probably a good idea before you turn it off or decide to downsize it.
Per Bauer: 23:44 We're also counting the number of unused or used days when we aggregate the data, so we have an understanding of how many days this system has been unused, according to the rules that we've set up, or the rules that you tweak.
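The two ideas from the last few paragraphs, percentile filtering to dampen outliers and counting unused days against a threshold, can be sketched together. The percentile level, the 5% activity threshold, and the function names are assumptions for illustration, not Vityl's actual rules:

```python
# Sketch: judge each day by a high percentile of its samples (so one spike
# doesn't mask an otherwise idle day), then count the days below a threshold.

def percentile(values, pct):
    """Nearest-rank percentile; sufficient for this sketch."""
    ordered = sorted(values)
    idx = max(0, int(round(pct / 100.0 * len(ordered))) - 1)
    return ordered[idx]

def unused_days(daily_cpu_samples, pct=90, threshold=5.0):
    """Count days whose pct-th percentile CPU utilization stays under threshold %."""
    return sum(
        1 for day in daily_cpu_samples if percentile(day, pct) < threshold
    )

# Three days of sampled CPU%; day two is idle apart from one 95% spike.
days = [
    [1, 2, 1, 3, 2, 1, 2, 2, 1, 2],
    [2, 95, 2, 1, 3, 2, 1, 2, 2, 1],
    [40, 55, 60, 35, 48, 52, 44, 61, 39, 50],
]
print(unused_days(days))  # 2 (the single spike on day two is filtered out)
```

Using the mean or the maximum instead would let that one spike make day two look busy; the percentile keeps the idle verdict while a genuinely busy day still registers.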
Per Bauer: 24:03 You have a shortcut to Performance Monitor and Capacity Plans from here, so if you want to drill down and look at the system in detail, or understand exactly why this is, you can do that directly from here without going through the start screen and having to look up the system again. We do this for all the different "platforms": Linux, Windows, VMware, AWS, Azure, AIX, and Solaris, so all the platforms that we cover with our data collectors or our federation of other data sources.
Per Bauer: 24:43 You can also use search and filters to manage if you have a larger scope. So if you have all your different systems in here, and it's thousands of systems, you can limit the scope based on just groups or systems. You can filter out on names, et cetera, et cetera. This is not just for efficiency. This also applies to the health and risk scores, so you can search and filter those as well.
Per Bauer: 25:09 And then you can do time-based drill down to see which days, or months if that is the scope, the system was unused. If you're looking at the last 30 days, you'll get one notification per day. If you look at the last seven days, you'll get more granular indicators, et cetera.
Per Bauer: 25:32 If there is an indicator of an unused system, you can drill down and see exactly which days that was happening and whether there is a systematic or seasonal behavior that keeps repeating across these different systems. So it's a very powerful way of analyzing systems and finding those that are not being used efficiently.
Per Bauer: 25:58 So to summarize: Vityl Capacity Management, as we talked about, is this integrated solution that consists of four different components, all connected to a data management and analytics framework that allows you to acquire data from quite a lot of different sources, so we can use our own native data collectors to bring in the data.
Per Bauer: 26:21 We can also integrate with third party data sources. We can analyze everything from servers, to storage, to the network those servers are on. It could be virtual machines, it could be containers, it could be public cloud services, and it could also be metrics from things like the data center itself, et cetera.
Per Bauer: 26:42 So all that is available across all the different components. You can do logical groupings of this data. We saw that in Key Performance Indicators. You can extract definitions from a service catalog or a CMDB, et cetera, and apply that to the data, not just in KPI but in Performance Monitor or Capacity Plans. When you make an analysis, you can use those groupings to report on the data, to symbolize things like services or applications.
Per Bauer: 27:10 On the predictive analytics side, we can do these advanced forecasts or predictions of future behavior based on the what-if scenarios you define. In this release, we're introducing a demand calendar, which allows you to do more business-aware capacity management: you can add business events in a calendar format to make the users of the application, or the default behavior of some of the components, take those things into account when a forecast is being made. So you record data about a future business demand, and if that is covered by the analysis you're doing, it will be suggested or included in that analysis, so it's a way of collecting all that data in one place.
Per Bauer: 28:05 So this is what I had for you today. So main focus was on Key Performance Indicators. We will have a couple of more sessions for this release talking about... The next one is going to be about Azure, and how we get data from Azure, and how you can use that data for analysis. And then in a subsequent webinar, we will talk a bit more about our new data collection capabilities, the high-resolution data, and the workload mechanism, et cetera that we're introducing in version 2.4.
Per Bauer: 28:40 We're running out of time here, so we don't really have time for any questions. If you've posted any questions in the chat window, we'll get back to you on those. If you have other questions for me, or you want a follow-up on this session, you'll see my email address here at the bottom of the screen, so you can just shoot me an email, and I promise to get back to you as soon as possible. So thank you everyone very much for your attention, and have a great rest of the day.