How to Use This Guide to IBM i Message Management
IBM i generates a lot of messages. These messages—which communicate information about the operating system, applications, hardware, and more—should help administrators and operators gauge system performance, uphold SLAs, and solve problems.
Unfortunately, without the right tools and processes, monitoring IBM i messages is like trying to count drops in a downpour: You might get one or two, but you’ll never keep up with what seems like an infinite amount. And unlike getting a little wet, missing an IBM i message can have consequences, like taking down your entire system, that go far beyond IT.
Your company’s most business-critical applications and data are hosted on the i. Do you want to explain to your CIO or CFO why their critical application is unavailable because of problems on your system?
So, you need help—help to pick out only those messages that you need. This guide teaches you how to handle message management on your IBM i: The process of monitoring for messages, filtering the critical ones, and assigning them to members of your team as fast as possible.
We’ve broken down each step and provided you with a range of options (Okay, Better, and Best) along the way. It’s up to you to decide which solutions are going to work best for your system, needs, and budget.
IBM i Message Monitoring
Manual Break Messages or DSPMSG QSYSOPR
Before we get to manual break messages, let’s address the “use” of DSPMSG QSYSOPR, which is the practice of only looking at messages when a user reports a problem. If you are doing this: Stop it! There’s a better way! (We’ll come back to this later.)
Implementing manual break messages is as simple and straightforward as having someone sit in front of a screen. And this method is fool-proof—if the person watching the screen never eats, never sleeps, and intimately knows every application and how to respond to every single message, without a run book. There are so many messages, you won’t have time to look away from the latest message, even though it’s probably a routine completed process...or is it? Did you just look away again? Oh, there’s another one!
Many shops across the country waste talented staff on this mind-numbing method. The truth is it isn’t a realistic way to deal with the quantity of messages that IBM i produces. Nobody knows every application, and people sometimes answer messages incorrectly or miss them altogether. And then there’s the problem of informational messages.
In this manual method, we’ve only considered break messages that require a response; this leaves out loads of informational messages. Most are not important, but some are crucial for the operator to know. Storage threshold exceeded? User profile disabled? Disk drive failure? If you miss those, your application could stop working. Or worse, your system could experience some unplanned downtime!
If you don’t have another option, you can set up break messages based on the severity level of an informational message (with 70 being the typical threshold to receive break messages). When you do, ask yourself and your colleagues how high to set the severity threshold. Set it too high and you might miss critical messages; set it too low and you’ll be inundated.
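The trade-off is easier to see in a small sketch. The Python below is only an illustration of threshold-based filtering, not IBM i code: the Message class, the should_break function, and the DEMO message IDs are hypothetical, and it assumes a message interrupts the operator when its severity is at or above the threshold.

```python
from dataclasses import dataclass


@dataclass
class Message:
    msg_id: str    # hypothetical ID, not a real IBM i message ID
    severity: int  # IBM i severities run from 00 to 99
    text: str


def should_break(message: Message, threshold: int = 70) -> bool:
    """Return True when the severity is at or above the threshold."""
    return message.severity >= threshold


messages = [
    Message("DEMO001", 30, "Routine job completed"),
    Message("DEMO002", 80, "Storage threshold exceeded"),
]

# Only messages at or above the threshold interrupt the operator.
interrupting = [m.msg_id for m in messages if should_break(m)]
```

Raising the threshold to 90 would silence DEMO002 as well, which is exactly the kind of miss to weigh before you pick a number.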
Set Up System Reply List Entries
Nothing can take the place of dedicated monitoring software, but in a pinch, you can try setting up System Reply List entries. Use the command WRKRPYLE to see the reply list entries already set up on your system. Command ADDRPYLE is used to add new reply instructions to the list. These provide partial automation of your monitoring, plus limited capabilities to interrogate and respond to messages.
This method can help you with automating inquiry messages if the response is the same regardless of the message details; or if you need to pick out some small bit of text from the message to refine your automatic response.
Unfortunately, the benefits stop there. System Reply Lists do not allow flexibility—if your system needs to answer a message differently depending on the time of day or week, you can’t. If you need to respond differently based on the occurrence of other system events, you can’t. Without the ability to set limits on jobs, problem scenarios could loop and chew up CPU and disk. And the fact is, every environment is different and requires different things depending on the time. A one-size-fits-all approach to message response is destined to cause you problems down the road.
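To make the limitation concrete, here is a rough Python model of how a reply list behaves. It is a simplification under stated assumptions, not the actual IBM i implementation: ReplyEntry and find_reply are hypothetical names, the message IDs and reply letters are made up, and the compare data is treated as a plain substring of the message text. Entries are checked in sequence order and the first match wins.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReplyEntry:
    sequence: int                # entries are searched in ascending order
    msg_id: str                  # "*ANY" matches every message ID
    compare_text: Optional[str]  # simplified: substring of the message text
    reply: str                   # illustrative reply value


def find_reply(entries: List[ReplyEntry], msg_id: str, msg_text: str) -> Optional[str]:
    """Return the reply of the first matching entry, or None if nothing matches."""
    for entry in sorted(entries, key=lambda e: e.sequence):
        id_matches = entry.msg_id in ("*ANY", msg_id)
        text_matches = entry.compare_text is None or entry.compare_text in msg_text
        if id_matches and text_matches:
            return entry.reply
    return None


entries = [
    ReplyEntry(10, "DEMO001", "tape", "R"),
    ReplyEntry(20, "DEMO001", None, "C"),
    ReplyEntry(9999, "*ANY", None, "D"),
]
```

Notice what the lookup takes as input: a message ID and its text. There is no clock and no view of other system events, so the same message always gets the same answer.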
Graphical Interface with "At-A-Glance" Capabilities
You have more important things to do than try to parse the flood of messages that come in throughout the day. Graphical interfaces take information collected for critical metrics on your system and display them visually, which allows you to check your system with a quick glance at a graph, chart, or table. Ideally, a tool should have a robust history function, so you can also quickly compare your current performance to recent trends and historical averages.
A talented in-house or contract programmer will likely be able to put together an application that delivers many of these features. Unfortunately, ad hoc applications, even at $100 an hour in development, don’t have (or keep up with) APIs, are hard to update, and lack a security layer to address compliance issues. When it comes to such a specialized function, your best bet is a commercial tool.
IBM i Message Filtering
Run Books, Cheat Sheets, and Tribal Knowledge
The world of IBM i messages is a big one. Informational, application, or system-related messages come in a variety of types, and they come from an even greater variety of places throughout your system. Because the source of your message is the most important part—that’s how you know what it is about—and because different members of your team are responsible for different areas of operation, the ability to get a message from its source to the right person is crucial. Codifying this in a run book or cheat sheet is not ideal—especially for those forgettable, once-a-month messages—and if different operators handle messages differently, it can be a huge headache.
Set Up QSYSMSG
To funnel system messages related to security and hardware, IBM i allows you to set up a narrow message queue to complement the primary channel. QSYSMSG will help you monitor system power-down messages, alerts that storage thresholds have been exceeded, weak-battery notices, and others. Visibility into these specific kinds of problems is useful, and is a good start for an environment that currently doesn’t filter messages at all. But this method has limitations. Namely, it doesn’t solve the underlying issues of determining which messages are important, what and how to escalate, and whether you’re missing important messages. The difference is that you now have two places to look for them! If you run a smaller environment, this will put even more strain on your limited resources.
Get Rid of the Noise
The key to message management is finding the signal through the noise. That means suppressing the messages you don’t need, highlighting the ones that need a response, and immediately escalating the ones that require urgent attention. To get there, consider a commercial tool that lets you assign rules to filter incoming messages automatically.
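As a rough picture of what rules-based filtering means, the Python sketch below sorts messages into the three outcomes described above. It does not show any particular product’s rule engine: the Action, Rule, and classify names, the dictionary keys, and the severity cutoff of 80 are all assumptions made for illustration. Rules are tried in order, the first match wins, and anything unmatched is highlighted so it isn’t silently lost.

```python
from enum import Enum
from typing import Callable, NamedTuple, Sequence


class Action(Enum):
    SUPPRESS = "suppress"    # hide routine noise
    HIGHLIGHT = "highlight"  # show to the operator for a response
    ESCALATE = "escalate"    # notify someone right away


class Rule(NamedTuple):
    matches: Callable[[dict], bool]
    action: Action


def classify(message: dict, rules: Sequence[Rule],
             default: Action = Action.HIGHLIGHT) -> Action:
    """Apply the first matching rule; fall back to the default action."""
    for rule in rules:
        if rule.matches(message):
            return rule.action
    return default


# Hypothetical rules; the keys "severity" and "type" are assumptions.
rules = [
    Rule(lambda m: m["severity"] >= 80, Action.ESCALATE),
    Rule(lambda m: m["type"] == "inquiry", Action.HIGHLIGHT),
    Rule(lambda m: m["type"] == "completion", Action.SUPPRESS),
]
```

Because the rules live in one ordered list, changing how a message is handled means editing one entry rather than retraining every operator.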
IBM i Message Escalating & Notifying
Be the Messenger
This is as simple as it sounds: When a message comes through that needs escalation, pick up the phone and consult the programmer about how to handle it. This old-fashioned way is time-consuming, and not always the most convenient—especially if it requires waking that programmer up at 3 a.m.! Of course, you also have to be there to catch the right message in the first place—as unlikely to happen as your programmer being happy about that 3 a.m. phone call.
Building a list of names to manually email when a problem occurs is a good homemade remedy for escalation. You can also set up a triggered SMTP email to the same list, which takes the manual work of sending the email off of your plate but still requires the recipients to figure out—likely in a long email chain—who should handle the resolution.
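For a sense of what a triggered email involves, here is a minimal Python sketch that builds one alert for a whole distribution list. It is an illustration only: build_alert_email, the sender address, the recipient addresses, and the DEMO002 message ID are hypothetical, and the block only constructs the email rather than sending it.

```python
from email.message import EmailMessage
from typing import List


def build_alert_email(msg_id: str, text: str, recipients: List[str],
                      sender: str = "ibmi-alerts@example.com") -> EmailMessage:
    """Build (but do not send) one alert email addressed to the whole list."""
    email = EmailMessage()
    email["From"] = sender
    email["To"] = ", ".join(recipients)
    email["Subject"] = f"IBM i message {msg_id} needs attention"
    email.set_content(text)
    return email


alert = build_alert_email(
    "DEMO002",
    "Storage threshold exceeded",
    ["operator@example.com", "programmer@example.com"],
)
```

Sending it would take one more step, such as smtplib.SMTP(host).send_message(alert) against your own mail server. Either way, everyone on the list receives the same message, and nothing in it says who owns the fix.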
Immediate escalation is key for timely resolution of issues, and for this there is no substitute for an automation tool. When events are barreling down QSYSOPR, there’s no way a human can parse, sift, and pass on the important messages to the team. That’s why automation tools are set up for operators to assign rules that escalate different messages to different staff. This approach takes the old “tribal knowledge” and replicates it—but also, by being transparent and codified, is open to editing when your staff or your environment changes!
It is also flexible: Want a message to go different places based on the day of the week or time of day? Want an unanswered message to re-escalate to someone else after, say, 10 minutes? A professional tool worth its salt will give you that flexibility.
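Both kinds of flexibility come down to simple logic, sketched here in Python. This is not how any specific tool implements escalation: pick_recipient, current_owner, the recipient names, and the weekday 8-to-18 business hours are assumptions chosen for illustration. The 10-minute wait comes from the example above.

```python
from datetime import datetime, timedelta
from typing import Sequence


def pick_recipient(now: datetime) -> str:
    """Route to the day operator during weekday business hours, else to on-call."""
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    in_hours = 8 <= now.hour < 18
    return "day_operator" if is_weekday and in_hours else "on_call"


def current_owner(sent_at: datetime, now: datetime, chain: Sequence[str],
                  wait: timedelta = timedelta(minutes=10)) -> str:
    """Return who holds an unanswered message, moving one step along the
    chain for each full wait period and stopping at the last person."""
    steps = max(0, int((now - sent_at) / wait))
    return chain[min(steps, len(chain) - 1)]


chain = ["operator", "programmer", "it_manager"]
sent = datetime(2024, 1, 1, 9, 0)  # a Monday morning
```

With this chain, an unanswered message sits with the operator for the first 10 minutes, moves to the programmer for the next 10, and then stays with the IT manager.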
Automation tools also bring an important element to escalation: Accountability and responsibility. Automatically escalated messages take the hot potato from QSYSOPR and pass it to the responsible member. If the issue isn’t getting resolved, that signals a problem either with staffing or with staff. With a transparent and automated escalation process, your team will be set up to handle messages as soon as possible.
The available technology also affects your staff's ability to respond. Sure, VPN is common enough, but nobody can always be near their laptop. So, what happens when a critical message comes through and the relevant operator is in the checkout line? The right tool brings you into the 21st century with two-way response for mobile phones and tablets.
How to Get "Best" for IBM i Message Management
If professional tools didn’t cost anything, every IT department would have one. But it isn’t a perfect world, and we have to make choices with limited means. So how should you decide whether a professional solution is worth it? Or, more likely, how do you justify the purchase to your boss? Here are a few things to consider.
Professional Tools Are an Insurance Policy
Big problems are just little problems that were ignored long enough. And in the case of IBM i messages, it is incredibly easy to “ignore” a problem—you could be working overtime and still end up “ignoring” the message that takes down your system! The likelihood of that happening is small, thankfully, but the risk is so great that it is worth investing in resources to prevent it. The insurance and peace of mind that a professional tool gives you not only makes it easier for your team to do its job, it pays for itself in operational and financial stability.
Professional Tools Save Resources and Are Cheaper Than Outsourcing
While monitoring takes a lot of man-hours, in some countries man-hours are quite inexpensive. As a result, the low-activity overnight shifts at many shops have been used to justify shipping an entire team’s jobs to another continent. Fortunately, automating this incredibly important aspect of your operations is far cheaper than outsourcing. Even better, adopting a professional tool not only saves your team time from tedious and unrewarding work—it frees you up to make a dent in the proactive, innovative work you’ve been itching to start.
Professional Tools Are Necessary to Maintain an Enterprise Environment
IBM i is the core system for thousands of leading logistics, manufacturing, technology, healthcare, and financial organizations worldwide. To run a sophisticated, cross-platform environment, you need to make sure processes can move cross-platform from IBM i to the warehouse, CFO’s office, or another critical department. In the case of consumer-facing operations, this need is just as crucial: Your i is likely the platform supplying the data customers need.
Of course, smooth cross-platform processing isn’t the only hallmark of a well-run environment: You are judged on how your system satisfies audits, helps end users, meets SLAs, and contributes to a healthier bottom line. For each of these, a professional solution for message management will improve your department’s performance and ease its workload.
Summing It Up
As you evaluate tools for managing messages, you’ll want to consider a few things. First, a solution needs to automate your IBM i message monitoring, filtering, escalating, and notifying. Second, it needs to address end-goals like those mentioned above: Does it have features to help you satisfy audits? Will the tools stabilize your environment and take some end-user and SLA pressure off your back? And then there’s configuration: Will this tool have smart, preset configurations that work out of the box? Or will you have to spend months implementing a minimalist tool with spotty support?
There’s one other item to consider. While any tool is an improvement over doing nothing, that answer won’t satisfy the person who controls the purse strings. Your solution needs to improve operations in a way that can be quantified. For that, you need dashboards, documentation, and rules-based controls that show everyone from your boss to your auditors how you’ve taken your work to the next level.
Management will be impressed when you pitch a solution with concrete, quantifiable benefits. But they’ll be even more impressed if you show that its products sync as a broader solution for systems management. If the products were cobbled together, their benefits will be too narrow to build on. If, however, they were designed to work with each other, you won’t need to do much persuading: Cohesive, complementary tools that lay the groundwork for operations are the holy grail in this industry.
Choosing a Solution
Once you make the decision to work with a professional tool, it’s time to go shopping. Because your choice impacts everything from cross-platform operations to your budget to SLAs, it’s a good idea to break down each option by how it helps you monitor, filter, and escalate your crucial IBM i messages. You’ll also want to consider how well a solution’s products integrate with each other and with your environment. Finally, you’ll want to make a wish list of some results you want to get out of your solution.
Here are a few to start:
- Automate message, system resource, and log monitoring across IBM i
- Mobile and desktop response
- Help us prevent small problems from becoming big crises
- Free up our team's operations and cut down on tedious work
- Keep better track of critical jobs, subsystems, devices, job queues, IBM MQ, output queues, and more
- Speed up our response time to critical issues
- Spend less time "fighting fires" and troubleshooting problems
- Save money on overtime hours
- Make meeting SLAs easier
- Make cross-platform operations easier
- Make business processes more stable
- Stop stress
- Get more efficient
- Stay better informed on system issues
- Save time, effort, and money
So, what gives Fortra the ability to do all that?
Developed in-house to work seamlessly with each other, Robot Console, Robot Alert, and Robot Network are backed by decades of expertise in automating systems management for Power Systems running IBM i. Powering environments from the largest, most complex IBM i and IBM i-connected systems to the medium and small shops that rely on IBM i to store their critical data and applications, our message management solution leverages the stable and feature-rich IBM i platform with custom, flexible scripting, intuitive graphic interfaces, mobile capability, plus a security layer to satisfy your corporate policies and IT audits. Combined, these tools bring you comprehensive feature sets that automate identifying, sorting, and assigning messages.
Robot Console was built to handle even the most oppressive loads of IBM i messages. Automatically and 24/7, it handles your job queues, subsystems, devices, objects, TCP/IP services, printers, and more. And instead of checking each queue and each virtual machine individually, Robot Network brings them all into one console for central, intuitive viewing in charts, graphs, and tables that update in real time.
Robot Network also lets you monitor the status of your entire IBM i network from a single PC display. Its Status Center gives you flexibility for escalation, assignment, and displaying properties throughout the network. Notification options can also be combined to escalate statuses, which can be triggered in succession or sent at the same time.
Robot Console automatically diverts the routine messages that flood your queues today. But when something important does happen, Robot Alert will automatically notify the most appropriate person, according to custom rules that fit your environment, via email, text, pager, or browser interface. From their email client, the expert can respond to the message like they would to a message from anyone: Type the response, then send it off. Robot Alert’s two-way messaging feature routes the message back to Robot Console and Robot Network, which execute your response.
In addition to suppressing the messages you don’t need, the solution highlights the messages that do need a response and immediately escalates the important ones that require your or a colleague’s immediate attention. The power behind this and other intuitive features is called OPAL, or OPerator Assistance Language, a custom scripting language that gives you sophisticated options for processing messages. With the power to check the contents of message variables, OPAL gives the message management solution the ability to respond by executing commands, calling programs, canceling jobs, and forwarding to staff experts for help.
Escalating and Notifying
The Fortra message management solution is like your staff expert—except it will not one day retire and take its knowledge with it out the door. Instead of dreading the impending retirement or worrying about when an employee leaves, you can codify their expertise in the sophisticated rules provided by Robot Console—in effect, guaranteeing that each message will automatically receive the best possible response and with it, the stability of your environment.
Whether a particular message should be funneled to one person or a team, escalated after 20 minutes or after 10, or given personal attention or a canned response, Fortra message management is like replicating the best judgment of your best employee indefinitely.
Critical Resources and Logs
On top of message automation, filtering, and escalation, two other crucial tasks are performed manually far too often. The first is checking the status of critical system resources such as jobs, subsystems, job queues, output queues, IBM MQ, and more. The second is checking log files such as QHST, QAUDJRN, FTP activity, and log files in the IFS. These log files are typically never checked until there is a problem. So, why not manage them proactively?
Robot Console can read these log files for you, automatically escalate critical issues, and even automate possible recovery processes. You know what needs to be done when a critical event occurs and is posted to one of these log files. Point Robot Console to the log, and it will automatically examine the log, escalate issues, and execute the instructions (commands or programs) that you’d normally run manually. The difference is that Robot Console does it automatically, 24 hours a day, 7 days a week. No vacations, no holidays, never calls in sick, and never has a bad day.
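The general idea of proactive log checking can be sketched in a few lines of Python. This is not Robot Console’s implementation or configuration format: scan_log, the patterns, the action names, and the sample log lines are all hypothetical, and a real log such as QHST or QAUDJRN would need to be read through the appropriate IBM i interfaces first. Each line is tested against a list of watch patterns, and the first match decides which follow-up action to record.

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical watch list: (pattern, name of the follow-up action).
WATCH_PATTERNS = [
    (re.compile(r"profile \S+ disabled", re.IGNORECASE), "notify_security"),
    (re.compile(r"disk unit \d+ failure", re.IGNORECASE), "page_hardware_team"),
]


def scan_log(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, action) for every line that matches a pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for pattern, action in WATCH_PATTERNS:
            if pattern.search(line):
                hits.append((number, action))
                break  # first matching pattern wins for this line
    return hits


sample_log = [
    "Backup completed normally",
    "User profile DEMOUSER disabled",
    "Disk unit 3 failure detected",
]
hits = scan_log(sample_log)
```

In practice, each action name would map to a command or program, which is where the automated recovery described above comes in.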
Don’t let firefighting dictate your workday.
Imagine this. You arrive at your desk only to find that the system has been waiting on a message from last night’s backup. While fixing that, your CIO stops by and wants to know why the subsystem and associated jobs that serve your website aren’t available. The minute she leaves, someone calls to complain that their report hasn’t arrived...and you resort to DSPMSG QSYSOPR. Sound familiar?
With the message management solution from Fortra, it’s finally okay to step away from your desk and take your eyes off the system. No need to worry. You’ll receive an alert if anything urgent needs your attention. Robot Console will handle the rest.
Why should I choose Fortra?
- Automated message management. Indicate which messages should be ignored, responded to, seen by an operator, or require operator action.
- System resource monitoring. Regularly check the statuses of lines, subsystems, controllers, devices, writers, servers, jobs, job queues, output queues, and more.
- IFS directory monitoring. Watch for errors or events posted to logs in an IFS directory from IBM MQ, SAP, EnterpriseOne, and your other applications.
- Two-way message response. Receive and respond to notifications from any mobile device. If a message is not answered quickly, automatically route it elsewhere, even to a different IBM i.