When I wrote my first blog post, What is RPA (really?), I spoke about how “AI Robots (that) will make business so intelligent it will free people up to focus on more important tasks instead of the time-consuming, mundane tasks that they perform today”. That statement was written a little bit in jest. At the time, I was skeptical of how AI could make RPA more powerful in the near term. Like many other thought leaders in RPA, I was of the opinion that it was going to happen, but that it was years away. Well, I was wrong.

In the past few months, I have been blown away by the capabilities of both the Computer Vision activities and the cloud-based AI API engines.

I am now convinced that RPA with AI is here. Computer Vision is an absolute game changer. I understand it’s the first release of these activities, but it’s still very powerful.

The activities attempt to solve the issue of automating Citrix, VMware Horizon and Remote Desktop applications. In the past, the only way to interact with them was through image matching, OCR (Optical Character Recognition), hotkeys and the clipboard, amongst other techniques. While we have created many successful automations using these techniques, it can be a challenging environment to work in.

Last October at the UiPath Forward conference in Miami, I met a product manager. He was leading a development team to create capabilities using AI to recognize common elements (buttons, text boxes, user controls, text etc.) within a user interface, regardless of size, shape, color, resolution and a host of other properties. Speaking to him at the time I fell in love with the concept. I didn’t realize they were going to release something within 4 months of that conversation! Let’s just say I am mildly delighted.

Having spent a large amount of my life developing RPA applications, I can say the holy grail has always been to create an application that was smart and reliable enough to interact with ANY element in ANY user interface with 100% reliability.

The idea is that the RPA application would know exactly the best method to automate the appropriate function. Moreover, it should also provide the best means to do so. While this makes an RPA application more of a black box, it’s the direction this is all headed. This doesn’t mean that RPA developers won’t be needed because automations will be so simple to create. On the contrary, these capabilities will allow developers to do more with RPA and spend less time working on mundane tasks.

Computer Vision, together with the cloud-based APIs from Google, Microsoft and IBM, is taking RPA to new levels of robustness. In my view, using a cloud-based API is the most powerful way to harness AI.

When making a request to read an image and get text from it, compiled libraries that run locally in the runtime environment are limited by design and by space. They are effective in the jobs they are given. For some tasks, though, there simply isn’t enough computing power or a broad enough decision engine to attain reliability in all cases. Take the age-old OCR dilemma. OCR can be “reliable” if it’s “tuned” properly. An example is teaching an engine (e.g. Tesseract) to recognize certain fonts, words, shapes and characters using image matching. This approach is limited and very brittle. One can only teach the engine so many things before it inevitably sees a character it has never seen before.

A cloud-based AI API engine solves this problem. Using AI, it can apply every approach it has available to reach a logical decision. In this case it can draw on every variation of every character it has ever recognized, apply a probability criterion and POOF!!, you have an OCR engine! Moreover, it doesn’t require teaching by the developer and provides the best possible choice for any character it has to read.
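To make the contrast concrete, here is a toy sketch in Python. It is only an illustration under my own assumptions: the 3x3 glyphs, the function names and the 0.5 threshold are all hypothetical, and it is not how Tesseract or any cloud OCR service works internally. It simply shows why exact image matching breaks on a glyph it has never seen, while scoring every candidate and applying a probability criterion still reaches a sensible answer.

```python
import math

# Hypothetical 3x3 bitmaps for three characters (1 = ink, 0 = blank).
TEMPLATES = {
    "I": (0, 1, 0, 0, 1, 0, 0, 1, 0),
    "L": (1, 0, 0, 1, 0, 0, 1, 1, 1),
    "T": (1, 1, 1, 0, 1, 0, 0, 1, 0),
}

def match_exact(glyph):
    """Brittle approach: succeed only if the glyph equals a taught template."""
    for char, template in TEMPLATES.items():
        if glyph == template:
            return char
    return None

def match_probabilistic(glyph, threshold=0.5):
    """Score every known character, turn scores into probabilities, pick the best."""
    # Similarity score: how many pixels agree with each template.
    scores = {
        char: sum(a == b for a, b in zip(glyph, template))
        for char, template in TEMPLATES.items()
    }
    # Softmax turns raw scores into probabilities that sum to 1.
    weights = {char: math.exp(score) for char, score in scores.items()}
    total = sum(weights.values())
    probs = {char: weight / total for char, weight in weights.items()}
    best = max(probs, key=probs.get)
    # Probability criterion: only accept the guess if it is confident enough.
    if probs[best] >= threshold:
        return best, probs[best]
    return None, probs[best]

# An "L" with one pixel missing, i.e. a glyph the engine was never taught.
noisy_l = (1, 0, 0, 1, 0, 0, 1, 1, 0)
print(match_exact(noisy_l))          # no exact template matches
print(match_probabilistic(noisy_l))  # best candidate plus its probability
```

The exact matcher gives up on the damaged glyph, while the probabilistic one still lands on "L" with high confidence. A real engine does this across vastly more data and character variations than any locally tuned setup.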

This is a breakthrough for reading invoices, documents and PDFs. In addition, other common AI APIs out there cover speech recognition, handwriting recognition, object recognition in images and a host of other tasks.

RPA with AI is here. To get the most out of RPA’s core capabilities, we can not only leverage the existing activities that are available within a tool but also leverage the capabilities brought on by the advent of AI. RPA, with the advent of these new tools, is replacing more expensive off-the-shelf products that only do “one or two” things well. RPA as a platform can do many things well, at lower cost and with more choices.

I am very excited about the present and the very near future of RPA and think you should be too. Buckle up!

 

Peter S Camp is the CTO and Founder of CampTek Software. He has been developing RPA Applications for over 15 years. For further questions, discussion or inquiry about CampTek Software Services, contact info@campteksoftware.com.