An up-to-date, comprehensive runbook won't win you any awards, especially when staff members know the schedule better than their kids' birthdays.
But when you get the green light to implement a workload automation tool, documentation is your lifeline. Unfortunately, most runbooks aren’t up to snuff.
We’ll show you when and where documentation gets neglected, how to do it the right way, and why properly documented job streams may give you negotiating power at your next budget meeting.
The Problem with Tribal Knowledge
No matter how small, every data center has tribal knowledge that isn’t documented. Do the same operators run month-end processes every time? If so, they probably have specific ways of doing things that aren’t written on your run sheet, such as:
- Checking for active processes before submitting a job.
- Creating a file before the job runs.
- Moving files to correct directories for a file transfer downstream.
When I was an Operations Manager, I would work an afternoon or night shift periodically, and I was always surprised by the undefined procedures that were followed. It gave me a chance to ask "Why are we doing it that way?"
Pounding the (IT) Pavement
You can interview operators, admins, and other staff about their daily processes, but that won’t give you the whole story.
Working next to someone is a great way to discover what really goes on during a shift. Some people have been doing their job for so many years that they don’t realize they’re performing specific steps, or they take it for granted that everyone does it that way.
You’ll also want to talk to staff in accounting or other departments that run batch-type jobs on your systems. Accounting, specifically, may have requirements such as balancing accounts before month-end jobs are run. Again, this may be something that is not written down, just assumed.
From Spreadsheets to Job Flows
Runbooks are often kept in a spreadsheet format and are easy to update. Make sure the documentation that you’re using is the most recent and includes all the necessary details:
- Job name and schedule
- Predecessors and successors
- Commands used or the script called when the job runs
It’s important to capture information about the job and its schedule now so that nothing gets left out when it’s time to implement your new workload automation tool. Other information, like the user login and directory path, will also be needed.
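If your runbook lives in a spreadsheet, it can help to treat each row as a structured record so that gaps are easy to spot. The sketch below is only one possible layout: the field names, the sample job, and the script path are all hypothetical and would need to match your own run sheet.

```python
from dataclasses import dataclass, field, fields


@dataclass
class JobRecord:
    """One runbook row. Field names are illustrative, not taken from any tool."""
    name: str
    schedule: str            # schedule text as written on the run sheet
    command: str             # command or script called when the job runs
    run_as_user: str = ""    # user login the job runs under
    working_dir: str = ""    # directory path the job expects
    predecessors: list[str] = field(default_factory=list)
    successors: list[str] = field(default_factory=list)


def missing_details(job: JobRecord) -> list[str]:
    """Return the names of text fields that were left blank."""
    blanks = []
    for f in fields(job):
        value = getattr(job, f.name)
        if isinstance(value, str) and not value.strip():
            blanks.append(f.name)
    return blanks


# Hypothetical job used only for illustration.
job = JobRecord(
    name="MONTH_END_GL",
    schedule="last business day 22:00",
    command="/opt/batch/gl_close.sh",
)
print(missing_details(job))  # ['run_as_user', 'working_dir']
```

Running a check like this across every row gives you a quick list of the jobs that still need a conversation with the operator who runs them.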
Once your job and schedule information is updated, create flow charts of those job streams. Displaying each job stream or process in a flow chart helps you identify the true dependencies. Simplify where you can, eliminating redundancy and inefficiencies.
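Before drawing the flow chart by hand, you can let a few lines of code tell you what order the jobs must run in and which dependency links are redundant. This is a minimal sketch using Python's standard `graphlib` module (Python 3.9 or later); the job names and the `predecessors` mapping are made up for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical job stream: each job maps to the predecessors listed in the runbook.
predecessors = {
    "EXTRACT": set(),
    "BALANCE": {"EXTRACT"},
    "POST": {"BALANCE", "EXTRACT"},  # EXTRACT is already implied through BALANCE
    "REPORT": {"POST"},
}


def run_order(preds):
    """Return one valid execution order; raises graphlib.CycleError on a loop."""
    return list(TopologicalSorter(preds).static_order())


def ancestors(job, preds):
    """All jobs that must finish before `job`, directly or indirectly."""
    seen = set()
    stack = list(preds.get(job, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(preds.get(current, ()))
    return seen


def redundant_links(preds):
    """(predecessor, job) pairs that another direct predecessor already implies."""
    redundant = []
    for job, direct in preds.items():
        for p in direct:
            if any(p in ancestors(other, preds) for other in direct if other != p):
                redundant.append((p, job))
    return sorted(redundant)


print(run_order(predecessors))        # ['EXTRACT', 'BALANCE', 'POST', 'REPORT']
print(redundant_links(predecessors))  # [('EXTRACT', 'POST')]
```

Links flagged as redundant are candidates for removal from the chart, though it is worth confirming with the job owner that the extra dependency was not added for a reason nobody wrote down.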
Making Your Case
Your research will probably surprise you by turning up all kinds of hidden processes, many of which are manual. Often you’ll find jobs that are included in a production job stream as a placeholder for a message or notification.
Because these “jobs” can be eliminated with the implementation of a workload automation tool, your new job flows will help you justify the purchase of one. Automation also eliminates the "retry" loops that result from waiting for the required resources to be made available.
Pointing out the manual steps or delays in a visual way helps your managers understand the exact time and resource savings that can be achieved.
Beyond saving time and resources, automation frees up the developers and system administrators who currently spend precious time coding around schedule exceptions.
When you finally get your workload automation software, your existing documentation will dramatically speed up the implementation process. Your updated flow charts will help you create new, efficient, and streamlined job flows rather than just mirroring your current schedule.
If you’ve done your due diligence on this first step, automating your processes with a workload automation tool will be a piece of cake.