Nippon Express USA Improves System Awareness with Crucial Monitoring Tools

When I buy a computer, piece of furniture or appliance, I just go to the store, pick what I want and take it home with me. It’s as simple as that—for me. When it comes to manufacturers and retailers, there’s much more to it.

Most likely, that item—or its component parts—came from overseas, built in a manufacturing plant in Asia or Central or South America. How manufacturers transport merchandise to its final destination is a matter of complex logistics involving arranging for shipment on a jet or ship, temporary storage in a massive warehouse and placement on a semi trailer so items can end up at the retailer.

Logistics companies such as Nippon Express USA exist for this reason. They do the work so manufacturers don’t have to. Of course, their efforts are only as good as the IT that underpins them. If something were to go wrong and a system were to go down, logistics companies’ customers might lose track of their products in the transportation pipeline.

That’s why Nippon Express USA has put many safeguards in place to avoid such situations. Using two IBM iSeries* 825s, soon to be upgraded to two new Power* Systems 550s, it has set up a nearly bulletproof, always-available IT environment. A high-availability mirroring solution and several monitoring and reporting tools from CCSS help ensure that if something goes awry, it can be quickly found and fixed, with little or no impact on customers.

Thankfully Unimportant

Headquartered in New York, Nippon Express USA, a subsidiary of the Japan-based Nippon Express group, is one of the largest freight-forwarding and logistics warehousing companies in the world. It works with global companies, including IBM, to handle “the movement of their products from source to destination,” according to Paul Cree, integrated security manager with Nippon Express USA.

Although Nippon Express USA has ongoing relationships with several companies, it also must bid for new business. If it lands an entire bid, it contracts with shipping lines, airlines or airfreight carriers to get the products to its warehouse and then ships the items via contracted over-the-road trucking companies to their final destinations. If it doesn’t land the entire bid, from inception to completion, it may get partial jobs, such as warehousing alone.

The company doesn’t own the ships, planes or trucks that facilitate the freight deliveries. (It does, however, have 400 million square feet of warehouse space worldwide.) Instead, it acts as a middleman of sorts, working with contracted companies to move freight for its customers. In some cases, it may bundle several customers’ freight to reduce prices.

Nippon Express USA offers all of its customers important details, including tracking information, based on the data produced by its IBM i-based homegrown applications. This is reported via the company’s Web site, using data coming into WebSphere* MQ, EDI software or from many other more traditional means, such as telephone calls and manual data-entry updates. Data is often sent directly to customers’ systems as well. It’s critical that the company’s systems are up and running all of the time, except for the occasional routine maintenance.

“Officially, our operations run Monday through Friday between 5 a.m. and 11 p.m. But we actually work seven days a week, and our systems have to be pretty much up and running during all hours,” Cree says. “Obviously, though, like any company’s IT department, we do have occasional maintenance issues to deal with, but those are sporadic and typically take place on Saturdays and Sundays.”

Because of its availability demands, the company deployed Vision Solutions’ ORION high-availability solution. Using this tool, Nippon Express USA mirrors its production 825 to a backup 825 in real time. (These two systems and their 12 partitions are hosted but not managed at an IBM facility that also hosts and manages both of the company’s IBM System p* servers.) If the production IBM i server were to fail, Nippon Express USA’s IT department could fail over to the backup box and be up and running again, according to Cree, in about 30 minutes.

But Nippon Express USA decided it needed monitoring and reporting solutions to help avoid any failures that might force a switchover. This was because its previous systems monitoring had involved manual processes where IT administrators had to go out and look for any system-related messages.

On one occasion, a disk drive had failed on what Cree characterizes as an unimportant test partition on the company’s high-availability box, but it took about 30 days for someone to find out about it and then have it replaced. Even though this wasn’t a critical failure, Nippon Express USA decided that no failure was acceptable. It also decided that manually checking message queues wasn’t the best way to avoid failures.

Given that the company’s IT personnel are always working on new and innovative ways to support its customers, that time- and effort-intensive manual administration often fell by the wayside, with more pressing matters taking precedence. “Because we have so many partitions, it would take a while to log into each one to check on them. And we wouldn’t necessarily check each system each day,” Cree says. “So you might not know that a disk drive had failed for quite a while—or until it became critically important to have that drive and then find out that the system failed after another drive failure.”

Additionally, because the company must give its customers the capability to place bid requests and check on tracking information at any time, its core applications have to be available around the clock. This is especially true given the company’s broad user community, which includes more than 72 warehouses and branch offices servicing some 50 cities in the U.S., Canada and Mexico. As Cree points out, “Our customers need us to respond to everything. If we don’t, we face possible loss of business. We simply can’t afford to have anything catastrophic happen at any time.”

Building on a Foundation

Although the ORION high-availability solution, which Cree says was deployed four or five years ago with the help of the IBM business partner Essex Technology, was a step in the right direction, Nippon Express USA felt it needed to build on that foundation to further its around-the-clock uptime goals.

To that end, it began looking for automated system monitoring and messaging tools. Using these tools, the company’s IT department could focus on its core business rather than manually checking each system partition or application for error messages. This would reduce time and effort as well as alert administrators and programmers in a more timely and detailed manner to issues that could critically impact the business.

Using a deliberate approach, Nippon Express USA began exploring possible solutions, speaking with several vendors and testing what they had to offer. Many of these tools, Cree says, “were typically green-screen based, weren’t geared toward the multipartition environment we work within and didn’t fit in with the way we operate.”

One vendor stood out from the others. CCSS had the tools Nippon Express sought, including QSystem Monitor, QMessage Monitor and QRemote Control. All of them come with a GUI (although a green-screen interface is also available), work in multipartition environments and, specific to Nippon Express USA, fit into the company’s operational environment.

Working with CCSS, and after undergoing some tool training, the company began rapidly deploying these solutions, knowing that, according to Cree, they “would improve everybody’s work and allow us to be able to respond to problems on a much quicker basis.” This deployment took place over the spring of 2007, and the company went live with the tools shortly thereafter.

“It was a pleasant surprise to see how easy it was to see how much was going on with our systems,” Cree says. “But that brought up another issue. Because of the volume of inquiry messages we were receiving via QMessage Monitor from each partition, we needed to find a way to decide what was critical and what was trivial. Thankfully, this tool allowed us to decide which inquiry messages we wanted to see and which we wanted to ignore. Otherwise, we would have been swamped with messages and not necessarily know the ones we should respond to.”

Additionally, QMessage Monitor users can set up the solution to send, for example, any EDI-related messages to the EDI programming group. This type of message-level parsing helps ensure that critical messages are sent to the proper personnel so they can take immediate action. “What we did was use QMessage Monitor to create groups of people to whom we should route specific types of messages,” Cree explains.
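The filtering and routing Cree describes can be pictured with a short sketch. This is not CCSS code and does not use any QMessage Monitor interface; the message fields, category names, group names and rules below are all hypothetical, chosen only to illustrate deciding which inquiry messages to ignore and which group should receive the rest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InquiryMessage:
    partition: str  # hypothetical partition name
    category: str   # hypothetical category, e.g. "EDI" or "HARDWARE"
    text: str


# Hypothetical rules: categories to suppress, and the group for each category.
IGNORED_CATEGORIES = {"INFO"}
ROUTES = {"EDI": "edi-programmers", "HARDWARE": "system-operators"}
DEFAULT_GROUP = "system-operators"


def route(message):
    """Return the group that should receive the message, or None if ignored."""
    if message.category in IGNORED_CATEGORIES:
        return None
    return ROUTES.get(message.category, DEFAULT_GROUP)


def dispatch(messages):
    """Collect message texts per recipient group, dropping ignored messages."""
    outbox = {}
    for message in messages:
        group = route(message)
        if group is not None:
            line = f"[{message.partition}] {message.text}"
            outbox.setdefault(group, []).append(line)
    return outbox
```

Under these assumed rules, an EDI message goes only to the EDI programming group, a disk failure goes to the operators, and purely informational messages never reach anyone.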

To illustrate that point, he recalls a recent hard-drive failure that resulted in an inquiry message. That message was routed to appropriate personnel and, on the same day—a Saturday—the drive was quickly replaced. “We’ve programmed the QMessage Monitor to turn issues like hard-drive failures into inquiry messages,” he says. “We’ve come a long way since that 30-day failure we had in the past.”

While QMessage Monitor is focused on system-related messages, QSystem Monitor indicates how the systems are performing. For example, the tool can notify the appropriate IT employees of excessive CPU use so they can take action to reduce it. Similarly, it can generate messages if disk space is nearing peak capacity. “We can even monitor controllers that go down or services that aren’t operating correctly,” Cree says. “For instance, we can monitor our Domino* server ports to make sure that our Domino servers are operating correctly.”
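The kind of threshold check described here can be sketched in a few lines. Again, this is only an illustration and not how QSystem Monitor is implemented; the sample fields and the limit values are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    partition: str       # hypothetical partition name
    cpu_percent: float   # CPU utilization at sampling time
    disk_percent: float  # share of disk space in use


# Hypothetical limits; real values would depend on the site.
CPU_LIMIT = 90.0
DISK_LIMIT = 85.0


def check(sample):
    """Return one alert string for each limit the sample exceeds."""
    alerts = []
    if sample.cpu_percent > CPU_LIMIT:
        alerts.append(
            f"{sample.partition}: CPU at {sample.cpu_percent:.0f}% "
            f"exceeds {CPU_LIMIT:.0f}%"
        )
    if sample.disk_percent > DISK_LIMIT:
        alerts.append(
            f"{sample.partition}: disk at {sample.disk_percent:.0f}% "
            f"exceeds {DISK_LIMIT:.0f}%"
        )
    return alerts
```

Each alert string could then be handed to whatever routing rules decide who should be notified.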

As with QMessage Monitor, QSystem Monitor messages can be routed to the appropriate personnel, who can take prompt action should anything occur. This type of messaging is enhanced with QRemote Control, which can route messages to onsite terminals, offsite PCs and portable devices. For example, if something goes awry, a message can be sent to a smartphone. If the problem is relatively simple to resolve, the message recipient can answer the message with a “Retry or Cancel,” as Cree puts it, to correct the issue.

If the problem is more involved than that, the recipient can log onto the system from a remote workstation and respond to the situation. As a result, IT workers are no longer tethered to the company’s systems, waiting for messages to appear on their consoles or finding a message on a Monday related to a problem that occurred on a Saturday. Now, they can take near-immediate action. “In the past, we checked on Saturday and Sunday mornings for messages, but that solution wasn’t always reliable and people often didn’t know what to do when they got a message. With QRemote Control, we don’t have to worry about that anymore,” Cree says.

Better Control

Before deploying the CCSS solutions, Nippon Express USA’s IT personnel had to manually check for messages. Now, much of that work is automatic. If something goes wrong with the company’s systems, IT personnel can quickly respond, whether they’re in the office or not. And thanks to this automation, the company estimates that system operators are spending 87 percent less time monitoring systems. This means programmers can now spend more time innovating and helping customers.

Cree indicates that the CCSS tools have already paid for themselves through improved productivity, the prevention of unplanned downtime and better allocation of system resources. As he notes, “QSystem Monitor allows us, for example, to view disk-space usage on a very detailed level, and if disk space is being overused, we can identify how and get it remedied. Overall, we simply have better control of our systems.”

