Moving data into and out of web portals can be complicated, especially when there is no way to interact with them programmatically, such as an API, and data must be manipulated through the web GUI itself. This leaves employees spending valuable time on manual data entry tasks that an automation tool like Fortra’s Automate could handle more efficiently.
Automate uses robotic process automation (RPA) to streamline manual tasks in the web browser and beyond. The Automate Recorder uses live screen recordings to quickly build automation tasks for any website or desktop-based application. In this introduction, you’ll get a closer look at using Automate for advanced web UI automation.
If you’d like to follow along and try out the Automate Recorder, download a free Automate trial.
Web Automation Best Practices
First, let's talk about some general best practices when attempting to automate any sort of GUI, whether on a website or desktop application.
1. Set Screen Resolution
Screen resolution is an often-overlooked element of creating automation tasks. It’s important to make sure the computer executing the task uses the same screen resolution every time the task runs. This keeps the size of the screen constant for each execution, so the various elements of the site or app stay in the same general area of the screen. This is especially important if your automation has to use X,Y coordinates to locate items. If the resolution at execution differs from the resolution used when the task was built, the task may move the mouse or click in the wrong area of the screen.
2. Maximize Windows
In addition to keeping a consistent screen resolution, the other key to consistent results is to always maximize the window of your web browser or desktop application. Maximizing the window not only keeps things in the same location (especially if you’ve already set the screen resolution), but also makes sure that the window itself is front and center. When you are automating a website task, the size of the window can also affect where objects are placed within the browser window. For example, if your window occupies only half the screen, elements could stack vertically instead of horizontally, or not appear at all.
In this example, we have Fortra’s homepage. You’ll notice as the window is resized horizontally that certain elements move around, such as the buttons above the dashboard image. And once the page gets small enough, the menu changes from a tabular menu to a hamburger menu in the top right corner.
Automate comes with built-in native actions that perform both of these functions and can be included in your GUI automations. You can use the Computer – Display action to set the display adapter to use, along with the color depth, frequency, and resolution of the screen. Place this action at the beginning of your task so that the screen resolution is set properly for each execution. The Window – Maximize action lets you maximize a window on the screen, which can be identified by attributes such as window title, class, handle, or even content.
Using these native actions in your web automations will help ensure consistency within the UI across each task execution, which will ultimately lead to better outcomes.
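To see why a fixed resolution matters for coordinate-based steps, here is a rough Python sketch of the arithmetic. It is not part of Automate. The `Resolution` type, the `scaled_point` helper, and the sample click position are all hypothetical, and the sketch assumes the page layout scales proportionally with the screen, which real pages do not guarantee.

```python
from typing import NamedTuple, Tuple


class Resolution(NamedTuple):
    width: int
    height: int


def scaled_point(x: int, y: int, recorded: Resolution, actual: Resolution) -> Tuple[int, int]:
    """Return where a point captured at (x, y) would sit on a screen of a
    different resolution, assuming the layout scales proportionally."""
    return (
        round(x * actual.width / recorded.width),
        round(y * actual.height / recorded.height),
    )


recorded = Resolution(1920, 1080)  # resolution when the task was built
actual = Resolution(1280, 720)     # resolution when the task runs
button = (960, 540)                # hypothetical click position captured at build time

moved_to = scaled_point(button[0], button[1], recorded, actual)
drift = (button[0] - moved_to[0], button[1] - moved_to[1])

print(moved_to)  # (640, 360)
print(drift)     # (320, 180)
```

Under these assumptions, a click replayed at the original coordinates would land 320 pixels to the right of and 180 pixels below the element it was meant for, which is why setting the resolution at the start of the task helps.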
How Automate Uses Web Elements
Now that you know how to set up a web UI task for success by setting the resolution and maximizing the window, let's go through how you can build web automation tasks within Automate. As you get started, it’s important to understand how Automate identifies and works with the various HTML elements that exist on a webpage, as well as what those elements are.
HTML Basics
HTML (HyperText Markup Language) is the standard language used to create and structure content on the web. All the content that makes up a webpage is stored in the HTML document of that page. This includes elements like text, images, links, and other content that you see when the page is displayed in your browser. There are three primary HTML components that you need to understand when automating web browser tasks in Automate or other RPA tools.
1. Tag
HTML elements are defined by a start tag, some content, and an end tag. The tag defines what type of element it is, such as a paragraph, div, or frame. Here is a basic example of a paragraph tag with some text.
<p>Fortra’s Automate</p>
As you can see, we have the start tag <p>, content (Fortra’s Automate), and end tag </p> that make up this basic element. We can use tags like this to identify certain elements on the page to interact with by using either the Recorder or the Web Browser actions in Automate.
2. Attributes
Attributes contain extra information about the element that you don't want to appear in the actual content. This could be something like a class, which gives the elements of that class style information for how they will be formatted or displayed, or an href, which provides a link to some other content. For example, a paragraph element with a class value of “small” has the start tag <p class="small">.
3. Identifier
An HTML identifier is a special type of attribute used to specify a unique id for an HTML element. The value of the id must be unique across the entire HTML document. It can be used to point to a specific style declaration in a style sheet, or it can be used by JavaScript to access and manipulate the element with that id. For example, a div element with an id of “footer-center” has the start tag <div id="footer-center">.
Identifiers are the best way to identify and interact with web elements within Automate. Because an id must be unique across the HTML document, an element located by its id can be found reliably: it is the only element on the page with that id.
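The three ways of identifying an element described above can be sketched with Python's standard `html.parser` module. This is only an illustration of the concepts, not how Automate works internally. The sample markup, the `ElementCollector` class, and the `find_by_*` helpers are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser

# Hypothetical sample page reusing the tag, class, and id from the text above.
SAMPLE_HTML = """
<p>Fortra's Automate</p>
<p class="small">Fine print</p>
<div id="footer-center">Footer</div>
"""


class ElementCollector(HTMLParser):
    """Record every start tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names; attrs is a list of (name, value) pairs.
        self.elements.append((tag, dict(attrs)))


def collect(html):
    parser = ElementCollector()
    parser.feed(html)
    parser.close()
    return parser.elements


def find_by_tag(elements, tag):
    return [e for e in elements if e[0] == tag.lower()]


def find_by_class(elements, class_name):
    # A class attribute may hold several space-separated class names.
    return [e for e in elements if class_name in (e[1].get("class") or "").split()]


def find_by_id(elements, element_id):
    return [e for e in elements if e[1].get("id") == element_id]


elements = collect(SAMPLE_HTML)
print(len(find_by_tag(elements, "p")))        # 2
print(find_by_class(elements, "small"))       # [('p', {'class': 'small'})]
print(find_by_id(elements, "footer-center"))  # [('div', {'id': 'footer-center'})]
```

Searching by tag returns both paragraphs, while the id lookup returns exactly one element, which illustrates why an id is the most dependable handle when one is available.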
Using Automate's Recorder
When you use the Automate Recorder to "record" your automation steps on a webpage, first click the web button in the recorder interface and then hover over the element you want on the page. The Automate Recorder will attempt to identify that element by traversing the HTML source code of the page itself. If the item can be identified from the loaded HTML, it will be highlighted and labeled with its corresponding HTML identifier, tag, and/or attribute.
Here is an example showing the Username field of the Fortra Support Portal. You can see the field is identified by the recorder successfully, and the recorder is able to read its Identifier ("Name"), tag ("INPUT"), and class ("mdc-text-field_input.pf-c-form-control") information.
We can verify this is correct by inspecting the HTML of the page by opening the developer tools in the browser (typically CTRL+SHIFT+I) and using the selector to pick that item. It should highlight the section of code for that element.
Automate then uses this information to identify that HTML element on the page, and either select it, click it, get information from it, or set a value in it. Using the recorder is typically the preferred approach to building web UI automations, as it is faster and less prone to error than manually creating the steps, but there are cases (which we’ll discuss later) where the recorder is not the optimal choice. In general, though, it should be the first choice when working with websites.
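When verifying an element in the developer tools, it can help to combine the tag, identifier, and class into a single CSS selector and pass it to `document.querySelector` in the browser console. The Python sketch below builds such a selector. The `build_selector` helper is hypothetical, and it assumes the class value shown by the recorder is a list of class names separated by dots or spaces.

```python
import re


def build_selector(tag, element_id="", classes=""):
    """Combine a tag, an optional id, and optional class names into a CSS selector.

    Assumes the id and class names contain no characters that need CSS escaping.
    """
    selector = tag.lower()
    if element_id:
        selector += "#" + element_id
    # Treat dots and whitespace as separators between class names (an assumption).
    for name in re.split(r"[.\s]+", classes):
        if name:
            selector += "." + name
    return selector


selector = build_selector("INPUT", "Name", "mdc-text-field_input.pf-c-form-control")
print(selector)
# input#Name.mdc-text-field_input.pf-c-form-control

# Expression to paste into the browser console:
print(f"document.querySelector({selector!r})")
```

If the console expression returns the same element that the selector tool highlights, the recorded tag, identifier, and class agree with the page source.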
Using Automate’s Web Browser Native Actions
In addition to using the Automate Recorder to build the steps for web automation, you can use the Web Browser Native Actions directly within Automate’s Task Builder. This is useful when you know the HTML element you want to work with but the recorder is unable to identify it properly. It is also useful for fine-tuning the element information, such as adding attributes, to help the task identify the element faster or more accurately.
For example, perhaps we have an item that the recorder cannot handle, but we still want to try to use the web browser actions to interact with it. Within the Action Properties panel of most of the web browser actions, you will find an icon that lets you select an HTML element from the page, much as the recorder does. It looks like a hand with a finger pointing to something.
If you click on that icon, and then hover over the webpage and click the element, it will automatically extract the needed element data and fill in the fields in the action.
This usually works the same as the recorder but can occasionally work a little better, because the cursor is the typical arrow instead of the pointing finger that you get with the recorder. The arrow helps you be more precise when trying to locate the small boundaries between HTML elements.
You can also manually enter the HTML element data by locating it in the HTML source of the page using the browser developer tools and entering it into the Automate action.
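Before entering an id by hand, it can be worth confirming that the id really is unique in the page source, since not every page follows that rule. Here is a minimal Python sketch of such a check using only the standard library. The `IdCounter` class, the `duplicate_ids` helper, and the sample markup are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCounter(HTMLParser):
    """Count how many elements use each id value."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids[value] += 1


def duplicate_ids(html):
    """Return the sorted id values that appear on more than one element."""
    parser = IdCounter()
    parser.feed(html)
    parser.close()
    return sorted(value for value, count in parser.ids.items() if count > 1)


# Hypothetical page source where the id "Name" is used twice.
PAGE = '<div id="footer-center"></div><input id="Name"><span id="Name"></span>'
print(duplicate_ids(PAGE))  # ['Name']
```

An id that shows up in this list is not a safe sole identifier, so in that case it makes sense to add the tag or other attributes to the action as well.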
Next Steps: Troubleshooting Advanced Web UI Issues
Web automation is one of the most common places where RPA can benefit an organization, as many tasks involve websites and portals. By using Automate’s Recorder and Web Browser actions, you can easily build out automations for your web processes. These tools work well for many of the webpages you may want to automate. However, not every web page is coded the same way.
You may encounter pages where the recorder can't identify something, or where it identifies an item successfully but can't populate or select it. When that happens, what can we do? What other tools and tricks do we have for working with more advanced pages, or pages that are not coded to modern standards?
In the next part of this series, we’ll take a look at some examples of these types of issues and discuss how we can work around them.