
In a recent blog post I spoke of how AI and RPA are now working together better than ever. Two of the tools I mentioned are the cloud-based AI OCR engines by Google and Microsoft. While OCR is certainly valuable for getting characters off an image, whether it be a PDF or a traditional image, it can also be a useful tool when automating a virtual desktop. Citrix and VMware Horizon are both becoming more popular platforms for delivering applications to the enterprise.

Gone are the days when IT needed to install a Windows application on every desktop in an organization. More and more companies are now using the virtual desktop architecture.

It makes perfect sense from a delivery and security standpoint, since the applications that are published to the user reside on one or more servers. This method is much easier to manage than touching thousands of desktops.

A majority of our RPA Partners are using the Virtual Desktop Model. In the past, automating a Citrix-like environment was extremely unreliable compared to automating a Windows, web or character-based application. With those application types, the automation tool has access to the object either through an API or an exposed extension. Applications delivered through a virtual desktop don’t allow this level of native control. In some cases, there is an extension available through Citrix using the XenApp utility.

That is great, but getting it installed isn’t always a possibility. Besides, I like the approach where you need as little intervention as possible and, as I tell our prospective and existing RPA partners, “if you can see it, you can automate it.”

Therefore, in the absence of those techniques we are left with image recognition and OCR (Optical Character Recognition) as the primary techniques to automate these virtual desktops. Image recognition can be very slow and, in fact, unreliable if not tuned correctly. With the new set of tools that are now available, reliability is far less of a concern.

We have recently started using the new Computer Vision activities written by UiPath. We are very happy with the results thus far and look forward to these activities getting more robust in upcoming releases.

These activities are innovative in that they use the “best in breed” approaches for image recognition and OCR and combine them with machine learning/AI. They are remarkable in that they can identify what type of control you are interacting with even though there really isn’t a control, but rather a painted-on image.

In addition to these techniques, our development staff is constantly coming up with creative ways to select items on a dynamic list and match the specified data.
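To make the idea of matching data in a dynamic list concrete, here is a minimal Python sketch. It is not our production code and it does not use the UiPath activities. It assumes some OCR engine has already returned each visible list row as text plus a bounding box; the OcrRow type, the find_best_row and click_point helpers, and the 0.8 threshold are all hypothetical names and values chosen for illustration. The matching is fuzzy so that a common OCR misread (a lowercase "l" for a capital "I", for example) does not break the lookup.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class OcrRow:
    """One list row as an OCR engine might report it (hypothetical shape)."""
    text: str
    left: int
    top: int
    width: int
    height: int


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so cosmetic differences are ignored."""
    return " ".join(text.lower().split())


def find_best_row(rows: Sequence[OcrRow], target: str,
                  min_score: float = 0.8) -> Optional[OcrRow]:
    """Return the row whose text is most similar to target.

    Returns None when no row reaches min_score. On ties, the first row wins.
    The 0.8 default is an assumption and would need tuning per application.
    """
    wanted = normalize(target)
    best: Optional[OcrRow] = None
    best_score = 0.0
    for row in rows:
        score = SequenceMatcher(None, normalize(row.text), wanted).ratio()
        if score > best_score:
            best, best_score = row, score
    return best if best_score >= min_score else None


def click_point(row: OcrRow) -> Tuple[int, int]:
    """Center of the row's bounding box, where a robot could click."""
    return (row.left + row.width // 2, row.top + row.height // 2)


# Sample rows, as if read from a list that changes between runs.
rows = [
    OcrRow("Invoice 10423", 40, 100, 200, 20),
    OcrRow("Invoice 10424", 40, 124, 200, 20),
    OcrRow("Credit Memo 77", 40, 148, 200, 20),
]

match = find_best_row(rows, "Invoice 10424")
if match is not None:
    print(click_point(match))
```

Because the lookup is driven by the text rather than by a fixed screen position, the same logic keeps working when the list is re-sorted or grows. Returning None when nothing scores high enough lets the robot stop cleanly instead of clicking the wrong row.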

The unreliability of virtual desktop automation has killed many RPA projects in the past. In fact, we had a client who came to us after they had spent hundreds of thousands of dollars and a large amount of time using a legacy RPA product that could never seem to get it right. We have had a solution in place for months that runs on a scheduled basis. The only time it does not run is when there aren’t any transactions to process.

In conclusion, relying on screen scraping and heavy, hand-tuned OCR and image recognition alone is a thing of the past. It’s time to embrace the new set of tools to automate today’s virtual desktops.


Peter S. Camp is the CTO and Founder of CampTek Software. He has been developing RPA Applications for over 15 years. For further questions, discussion or inquiry about CampTek Software Services, contact