Fireside chat recap: exploring the evolution of intelligent document processing with GigaOm

In this Fireside Chat, two industry experts lead an in-depth discussion on the current IDP landscape and the evolution of intelligent document processing beyond traditional OCR.

Oct 26, 2022 by Craig Woolard


If you are exploring the best automation technologies for unstructured data, here is a summary of the discussion between these two thought leaders. The fireside chat begins with some background on optical character recognition (OCR) and the introduction of Robotic Process Automation (RPA), and then they each weigh in on the future of intelligent automation, unstructured data, and the technologies that will get us there. Finally, Stefan Groschupf and the GigaOm analyst discuss essential IDP benchmarks.

Who are the industry insiders?

Stefan Groschupf, CEO, Automation Hero

When he started his previous company, Datameer, Stefan Groschupf saw first-hand the challenges financial institutions confronted as they navigated “unstructured” data. As founder and CEO of Automation Hero, Stefan focuses his data analytics experience on processing the diversity of data hiding in unstructured documents faster than ever using the most advanced and complete IDP platform.

Saurabh Sharma, Lead Analyst, GigaOm

Saurabh Sharma, lead analyst at GigaOm, has over 15 years of experience in software development, consulting, and product go-to-market strategy. GigaOm is a top analyst firm democratizing access to strategic engineering-led technology research. GigaOm enables innovation at the speed of the market by helping business leaders grasp technologies and upskill teams, and by providing strategic sales training and advisory services.

How intelligent document processing goes beyond OCR

Stefan and Saurabh discuss how machine learning, natural language processing, and deep learning evolved into intelligent document processing. Both agree that OCR and RPA are decades-old technologies and are fundamentally limited. Saurabh explains the limits of OCR's accuracy as follows:

“Even with the best quality scanners and document quality…you might get 50 to 60 percent accuracy with OCR. But if you look at the last five years, a lot of automation leaders are not interested because they’ll spend more time trying to solve those errors and doing the rest of the processing, manually.”

Saurabh continues, noting how the value of automation diminishes, “…and that’s when some vendors started experimenting with intelligent OCR, which is nothing but a bunch of python code with OCR, so accuracy increased by five to 10 percent…still not good enough.”

How intelligent document processing goes beyond RPA

On some of the limitations affecting traditional RPA, Saurabh says, “there was a lot of hype around RPA and how it would automate everything, and then we realized RPA is only for structured data and structured processes…RPA hit the wall when faced with semi-structured and unstructured data.”

Commenting on the “hype” surrounding traditional RPA, Saurabh explains how data extraction accuracy goes up to 60-70% when it’s combined with intelligent OCR, but notes how this combo eventually hits a wall:

“If you have automated processes, or if you are an enterprise architect or a CIO, you understand that that 60 to 70 percent accuracy is not good enough…If you want to go to the next level…that number goes down to 30-40 percent, so that does not deliver strategic benefits.”

As the volume and complexity of unstructured data hidden in documents continue to grow globally, Saurabh shares his insight on combining current RPA solutions with IDP:

“With the combination of RPA and IDP…you can look at 70 to 80 percent straight-through processing with 95 plus or minus ‘X’ percent accuracy. That’s the end game!”
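The two figures in that quote measure different things: the straight-through processing rate is the share of documents handled with no human touch, while accuracy describes how often those automatically handled documents are correct. A minimal sketch of how the two might be computed from a labeled evaluation set follows. Routing documents by a confidence threshold, along with the `Extraction` type, the `stp_metrics` function, and the sample values, are assumptions made for illustration and were not described in the chat.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    """One processed document (hypothetical record for illustration)."""
    confidence: float  # assumed model confidence score in [0, 1]
    correct: bool      # ground truth from a labeled evaluation set


def stp_metrics(results, threshold):
    """Return (straight-through rate, accuracy of the auto-processed share)."""
    if not results:
        return 0.0, 0.0
    # Documents at or above the threshold skip manual review.
    auto = [r for r in results if r.confidence >= threshold]
    stp_rate = len(auto) / len(results)
    accuracy = sum(r.correct for r in auto) / len(auto) if auto else 0.0
    return stp_rate, accuracy


# Made-up sample: five documents, four of them above the threshold.
sample = [
    Extraction(0.99, True),
    Extraction(0.97, True),
    Extraction(0.95, True),
    Extraction(0.92, False),
    Extraction(0.60, False),
]
rate, acc = stp_metrics(sample, threshold=0.90)
print(f"straight-through rate: {rate:.0%}, accuracy: {acc:.0%}")
```

With this made-up sample the script prints a straight-through rate of 80% and an accuracy of 75%. Raising the threshold would typically trade a lower straight-through rate for higher accuracy, which is why the two numbers are usually quoted together.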

Since 80% of global enterprise data is unstructured, Stefan and the independent analyst agree that organizations need more than OCR and legacy RPA to navigate the complexity of diverse, real-world data:

“We need an easy-to-use technology that hides the complexity but is ready to do automatic underwriting, automatic claims processing, and automatic contract management and allows us to integrate that into our IT systems seamlessly.”

The competitive IDP landscape

After deep diving into the significant differences in accuracy between OCR, RPA, and IDP, Saurabh and Stefan discuss the current landscape of intelligent document processing. On the topic of competitive IDP vendors, Saurabh says:

“I will be a bit frank — there are about one hundred vendors who claim that they have IDP…most of them, I would say, 80 percent offer intelligent OCR, and even worse than that…so there are only ten to twenty serious players who really offer IDP…so the way we see what is next going forward would be how much semi-structured and unstructured processing you can support.”

Saurabh offers another valuable point on the capabilities of current IDP vendors and the truth about accuracy ratings, citing his own research from the GigaOm IDP Radar report:

“The other thing I see is everyone tells us that it’s a ‘template-free’ approach, but that’s not the case. In most cases, if the document layout changes or the logo moves somewhere else, the product fails to understand…so, I think there’s a lot of not-so-good marketing happening around accuracy rates like 99.9 percent for cursive and handwriting. That’s not going to be possible.”

Finally, Saurabh makes a powerful statement about the current IDP market when he says:

“…and Stefan, to your point, that 80 percent [of global data] is unstructured, I think it’s now more than 80 percent actually…if vendors are not going to solve that problem, I don’t think they can make progress on automation.”

The future: GigaOm weighs in

On the future of intelligent document processing, Stefan agrees with the independent analyst’s assessment that unstructured data already makes up more than 80% of global data, saying:

“The next generation is growing up with FaceTime and WhatsApp…so the customer experience and expectations are going this way…it will only grow more as we continue to interact with our customers digitally.”

On the landscape of intelligent document processing going forward, the duo share exciting insights on what the future will hold. Their conclusion? Even though OCR is an architectural component of intelligent document processing, it will be the vendors who rethink how they implement OCR into the IDP stack that will bring the most value to the growing challenges of unstructured data.

How Automation Hero’s IDP achieves over 90% accuracy

Finally, the two discuss the competitive landscape outlined in Saurabh’s GigaOm Benchmark report and highlight Automation Hero’s custom context-aware OCR post-processing AI model, which, according to GigaOm’s independent evaluation, delivered 281% more accurate handwriting recognition than the leading competitor. Stefan then explains how Automation Hero layers neural networks, treating the output of OCR extraction as a signal input to intelligent document processing.

For example, Stefan explains how one neural network pre-trained on languages is layered on top of another network that uses domain knowledge, alongside machine learning, deep learning, and natural language AI models that add further layers to the IDP stack.
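To make the idea of treating OCR output as an input signal for a later, domain-aware stage more concrete, here is a deliberately simple sketch. It is not Automation Hero’s model: it swaps the neural networks described above for plain fuzzy string matching from Python’s standard `difflib` module, and the `DOMAIN_TERMS` vocabulary, the `correct_tokens` function, and the sample text are all hypothetical.

```python
import difflib

# Hypothetical domain vocabulary for an insurance-claims form (illustrative only).
DOMAIN_TERMS = ["policy", "claimant", "deductible", "premium", "coverage"]


def correct_tokens(ocr_text, vocabulary, cutoff=0.75):
    """Replace each token with its closest vocabulary term when one is similar enough.

    Tokens with no sufficiently similar vocabulary term are kept unchanged.
    """
    corrected = []
    for token in ocr_text.split():
        match = difflib.get_close_matches(token.lower(), vocabulary, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else token)
    return " ".join(corrected)


# Simulated raw OCR output with typical character confusions ("m" -> "rn", "l" -> "1").
raw = "clairnant paid the deductib1e"
print(correct_tokens(raw, DOMAIN_TERMS))
```

Run as written, this prints `claimant paid the deductible`. The raw OCR text is never trusted as final; it is passed to a second stage that knows which words are plausible in the domain. A production system would presumably replace the fuzzy matcher with learned language and domain models, as Stefan describes.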

Regarding the Hero Platform’s better performance metrics, Stefan says:

“Eventually, you begin stacking up these next-level intelligent document understanding capabilities, which is what you have with an end-to-end IDP platform like Automation Hero. If you look under the hood, we are training all sorts of things with two things in mind. Number one, how can we get you the highest accuracy? The highest accuracy means the highest automation rate with the highest straight-through processing rate. Number two, how can we deliver the highest accuracy using the smallest amount of data possible? Automation Hero is at the point where we can achieve 80-90 percent accuracy with as little as 50 sample documents!”

On the future landscape of IDP, Saurabh offers his insight:

“If you look at the hype of RPA, it has plateaued now. IDP is going through that time, and you will see serious adoption. Over a period of time, I think we will see a convergence…AI will become automation.”

Take your next step into digital transformation

Get Saurabh Sharma’s GigaOm IDP Radar report now and learn everything your organization needs to navigate the current IDP landscape and achieve true digital transformation.

