Does OCR work well for reading IDs?

By Ihar Kliashchou, Chief Technology Officer at Regula

Waiting may kill many things, and user experience is one of the first victims. Take banking, an industry where the UX bar is set high. As a regulated industry, it must verify plenty of personal data for even basic operations, such as opening a new account. Yet competition dictates that the process also be fast and smooth, without unnecessary bottlenecks. No surprise that banking is among the industries that rely on Optical Character Recognition (OCR) tools the most.

The tricky thing with OCR is that it’s just not enough when it comes to processing ID documents, as they contain much more than just text. In this article, we’ll dive into the difference between OCR and ID data parsing, and how the latter can benefit high-load institutions’ workflows.

“Reading” the image

Optical character recognition (OCR) is a technology that turns an image of text into actual editable text. A scanned passport, for example, is an image. If you need to copy its details for use elsewhere, you can apply OCR, which does exactly this: it distinguishes text characters within images and converts them into text format. This is convenient, as it spares you tedious and time-consuming manual data entry.

Some OCR tools are smart enough to additionally help you with structuring extracted data. This comes in handy when it’s not a one-time operation. Such capabilities are called “OCR templating” and allow you to manually create document templates, using a set of your most common documents as a foundation. These templates let the computer know where important elements are located on the page, so you can automate some repetitive processes at scale.

All is good when working solely with text. The very definition of OCR implies that the technology works with characters. Modern IDs, however, can include as many as four different types of data sources: the visual inspection area, MRZs (machine-readable zones), RFID (radio frequency identification) chips, and barcodes. An OCR tool can neither fetch encoded data (in QR codes, for instance) nor validate and cross-check it. Here’s where identity data parsing comes into play.
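To make "validate" concrete: an MRZ carries check digits that plain character recognition reads but does not interpret. The sketch below computes such a check digit using the weighting scheme from ICAO Doc 9303 (weights 7, 3, 1 repeating, sum modulo 10). The function name is illustrative, and this is not a description of how any particular product implements the check.

```python
def mrz_check_digit(field: str) -> int:
    """Compute the check digit for one MRZ field (ICAO Doc 9303 scheme)."""
    weights = (7, 3, 1)  # repeating weights applied left to right
    total = 0
    for i, ch in enumerate(field):
        if "0" <= ch <= "9":
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10  # A=10 ... Z=35
        elif ch == "<":
            value = 0  # filler character counts as zero
        else:
            raise ValueError(f"invalid MRZ character: {ch!r}")
        total += value * weights[i % 3]
    return total % 10


# Sample field values: a document number, a birth date and an expiry date.
print(mrz_check_digit("L898902C3"))  # 6
print(mrz_check_digit("740812"))     # 2
print(mrz_check_digit("120415"))     # 9
```

If the digit printed in the MRZ differs from the computed one, either the field was misread or it was altered, which is exactly the kind of signal a character-only reader does not produce.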

How does data parsing from identity documents work?

The point of data parsing is that you get structured and additionally analyzed data as an outcome. Generally, the process of document parsing consists of five steps:
Scanning a document;
Automatically identifying its type by comparing the document against a database of document templates;
Reading and validating the fields that are defined by the template;
Structuring the output;
Document verification.

While the first three steps of the ID document parsing process resemble the principles of OCR templating, there can be major differences, depending on who created the document templates, how many templates there are, and how well they are made. To illustrate the point, we’ll use the data parsing capabilities of Regula solutions, which are purpose-built for reading identity documents.

When using an OCR solution, the number of templates is usually limited to the few most common ones. In contrast, Regula’s solution for data parsing leverages the world’s largest document template database, which currently includes over 12,000 templates of passports, ID cards, visas, driver’s licenses, and other documents from all over the world. It saves you a tremendous amount of time, as you don’t need to create any ID document templates. But it’s not only about saving time.

To create a reliable template, you need to have information about all possible variations for each of the fields in the document. This isn’t something you can do having a couple of samples at hand. For example, on ID cards, the expiration date is usually written as a date. In some countries, like Bulgaria or Vietnam, for people over a certain age, there are the words “No expiration date” (or words to that effect in the respective language). If you don’t know these peculiarities, the template becomes useless.
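A minimal sketch of what "knowing the variations" means for a single field. It assumes a hypothetical template rule in which the expiry field is either a dd.mm.yyyy date or one of a known set of "no expiry" phrases; the names `NO_EXPIRY_PHRASES` and `parse_expiry` are invented for illustration and do not reflect any vendor's template format.

```python
import re
from datetime import date, datetime

# Hypothetical template rule for one field. A real template would list
# the exact local-language phrases used by each issuing country.
DATE_PATTERN = re.compile(r"\d{2}\.\d{2}\.\d{4}")
NO_EXPIRY_PHRASES = {"NO EXPIRATION DATE"}


def parse_expiry(raw: str):
    """Return (kind, value): kind is 'date', 'no_expiry' or 'invalid'."""
    text = raw.strip().upper()
    if text in NO_EXPIRY_PHRASES:
        return ("no_expiry", None)
    if DATE_PATTERN.fullmatch(text):
        try:
            return ("date", datetime.strptime(text, "%d.%m.%Y").date())
        except ValueError:
            # Looks like a date but is not a real calendar day.
            return ("invalid", None)
    return ("invalid", None)


print(parse_expiry("31.12.2030"))
print(parse_expiry("No expiration date"))
print(parse_expiry("31.02.2030"))
```

A template built from only a couple of samples would contain the date branch alone and would reject every valid card that carries the phrase instead.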

Can a data parsing solution really verify documents?

It depends on the depth of analysis you need, but the short answer is: yes, it can. Even if your customer submits a document you’ve never seen, a solution exists that can recognize it in a moment and tell you what it is and what its characteristics are.

For example, the Regula data parser starts with lexical analysis, validating that every field in the document says exactly what it should say. It checks whether the expiration dates are valid and flags the document if it has expired. The lexical analysis also detects mask violations (say we expect a field to contain a date, but it’s empty or has another value). There is also an analysis for stop words: the provided documents shouldn’t contain words such as “sample,” “specimen” or “test.” All this happens automatically and is indicated in the field statuses.
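The checks described above can be sketched in a few lines. This is an illustrative approximation, not Regula's implementation: the status strings, the ISO date mask and the function names are assumptions, and a document expiring today is treated as still valid.

```python
import re
from datetime import date

STOP_WORDS = ("SAMPLE", "SPECIMEN", "TEST")
DATE_MASK = re.compile(r"\d{4}-\d{2}-\d{2}")  # assumed mask: yyyy-mm-dd


def check_expiry_field(value: str, today: date) -> str:
    """Return 'mask_violation', 'expired' or 'ok' for an expiry-date field."""
    if not DATE_MASK.fullmatch(value):
        return "mask_violation"  # empty, or not shaped like a date
    try:
        parsed = date.fromisoformat(value)
    except ValueError:
        return "mask_violation"  # shaped like a date, but not a real day
    return "expired" if parsed < today else "ok"


def find_stop_words(text: str) -> list[str]:
    """Return the stop words that appear as whole words in the text."""
    words = set(re.findall(r"[A-Z]+", text.upper()))
    return [w for w in STOP_WORDS if w in words]


today = date(2024, 1, 1)
print(check_expiry_field("", today))
print(check_expiry_field("2020-05-01", today))
print(find_stop_words("Specimen passport"))
```

Matching whole words rather than substrings avoids flagging a surname or place name that merely contains "test".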

Also, as noted above, identity documents can have four types of data sources: the visual inspection zone, MRZ, RFID chip, and barcodes. The data in different sources is often duplicated. Unlike an OCR solution, Regula reads all the sources and automatically compares all similar fields. For example, it can take a person’s last name from the RFID chip and compare it to the last name written in the MRZ and the one in the visual inspection zone. If anything doesn’t match, the solution will mark this field as invalid. So, if someone altered their name in the visual inspection zone (relatively easy to do) but failed to update the chip (much harder to do), it’ll be detected.
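As a rough sketch of the cross-check idea, the function below compares one field across whichever sources carry it. The source names, status strings and the simple upper-case normalization are assumptions made for illustration; a production comparison would also have to handle transliteration and truncation differences between the MRZ and the printed text.

```python
def cross_check(field_name: str, sources: dict[str, dict[str, str]]) -> str:
    """Compare one field across every source that contains it."""
    values = {
        source: data[field_name].strip().upper()
        for source, data in sources.items()
        if field_name in data
    }
    if len(values) < 2:
        return "not_compared"  # nothing to compare against
    return "valid" if len(set(values.values())) == 1 else "invalid"


document = {
    "visual": {"surname": "Smith"},       # altered on the printed page
    "mrz": {"surname": "ERIKSSON"},
    "rfid": {"surname": "ERIKSSON"},
}
print(cross_check("surname", document))   # invalid
```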

Structuring data makes it actionable

Data can hardly be used in its raw state. Once it’s collected, it needs to be broken down and analyzed to have value and, eventually, turn into decisions. While OCR is a great technology that has revolutionized data collection, it’s no longer enough to effectively deal with identity documents. The highly structured output is one of the biggest pros of applying data parsing for processing ID documents.

With data parsing, everything the solution reads and analyzes is divided into groups, fields, and types. You can scan a document and instantly pull out the specific information you need, such as the full name or date of birth. You can also have relevant data automatically converted into a unified format: for example, measurement systems (metric/imperial) and date formats (yyyy/dd/mm, dd.mm.yyyy). This lets you present values that users are familiar with and compare like with like during verification checks, without workarounds.
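A small sketch of the normalization step, assuming the two date layouts mentioned above and ISO 8601 (yyyy-mm-dd) as the unified target. The helper names and the choice of centimeters as the common unit are illustrative assumptions.

```python
from datetime import datetime

# The two layouts named in the text: dd.mm.yyyy and yyyy/dd/mm.
DATE_FORMATS = ("%d.%m.%Y", "%Y/%d/%m")


def normalize_date(raw: str) -> str:
    """Convert a date in any known layout to ISO 8601 (yyyy-mm-dd)."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw, fmt).date().isoformat()
        except ValueError:
            continue  # try the next known layout
    raise ValueError(f"unrecognized date format: {raw!r}")


def height_to_cm(feet: int, inches: int) -> float:
    """Convert an imperial height to centimeters (1 inch = 2.54 cm)."""
    return round((feet * 12 + inches) * 2.54, 1)


print(normalize_date("25.12.1990"))  # 1990-12-25
print(normalize_date("1990/25/12"))  # 1990-12-25
print(height_to_cm(5, 11))           # 180.3
```

Once both documents report the same date string and the same unit, comparing them is a plain equality check.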

The main idea behind data parsing is to deliver ready-to-use results quickly. In one pass you get the analysis, confirm that the document is authentic, fetch information from specific fields, and digitize the document to fill out a form in your internal system. When backed by solid expertise in protected document forensics, data parsing solutions help you effectively tackle most challenges in identity document processing.

About the author

Ihar Kliashchou is the Chief Technology Officer at Regula.

DISCLAIMER: Biometric Update’s Industry Insights are submitted content. The views expressed in this post are that of the author, and don’t necessarily reflect the views of Biometric Update.
