Tuesday, June 16, 2020

Extract Information From Invoices



It is very common nowadays to pull information out of invoices and pass it on for further processing, but doing this reliably is a challenging task.

Most of us think of OCR when it comes to extracting information from invoices, but OCR on its own is not enough to extract that information.

So what do we need alongside OCR?

We need deep learning and ML techniques to identify what each piece of text means and to turn it into information that can be processed further.


Why is OCR not sufficient?

With OCR alone we can extract the text from an image or invoice, but we cannot tell what each piece of text means. For example, if we take a communication bill, all of the data comes out as one unstructured blob of text, and related items do not necessarily end up on the same line. Suppose an invoice contains the following information:
Name    Blog
Number  123
After OCR processing, the result might be "Name Number" as a single line and then "Blog 123" as another line. This is where we need rule-based validation or ML to identify the template and then match each label with its value.
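
To make the problem concrete, here is a minimal sketch of one rule-based way to put labels and values back together. It assumes the OCR engine also returns a position for every word, which many engines do; the word list, the coordinates, and the function names below are made up for illustration and do not come from any particular OCR library.

```python
# Hypothetical OCR output: (text, x, y) of the top-left corner of each word.
# The values mimic the Name/Number example; the coordinates are invented.
words = [
    ("Name", 10, 100),
    ("Number", 10, 130),
    ("Blog", 120, 102),
    ("123", 120, 131),
]


def group_into_rows(words, y_tolerance=10):
    """Group words whose y positions are close together into one row."""
    rows = []
    for text, x, y in sorted(words, key=lambda w: w[2]):
        if rows and abs(rows[-1]["y"] - y) <= y_tolerance:
            rows[-1]["words"].append((x, text))
        else:
            rows.append({"y": y, "words": [(x, text)]})
    # Inside each row, read the words from left to right.
    return [[text for _, text in sorted(row["words"])] for row in rows]


def rows_to_fields(rows):
    """Treat the first word of a row as the label and the rest as its value."""
    return {row[0]: " ".join(row[1:]) for row in rows if len(row) >= 2}


fields = rows_to_fields(group_into_rows(words))
print(fields)
```

A fixed tolerance like this only works for simple, well-aligned layouts. Skewed scans, multi-word labels, or multi-column invoices are where template detection or an ML model becomes necessary.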


So how can we achieve information extraction?


We have to combine OCR with deep learning to identify the patterns, templates, and rules in a document. Based on those, we can extract the information into a meaningful, structured format.
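
As a rough illustration of the "template and rules" part, the sketch below uses plain regular expressions in place of a trained model: it first decides which template a piece of OCR text belongs to and then applies that template's field rules. The template name, the marker phrase, and the field patterns are all assumptions made up for this example; a real system would learn or configure them per invoice type.

```python
import re

# Hypothetical template definitions: a marker that identifies the template
# and one pattern per field to extract. None of this comes from a real system.
TEMPLATES = {
    "telecom_bill": {
        "marker": re.compile(r"communication bill", re.IGNORECASE),
        "fields": {
            "name": re.compile(r"Name\s+(\w+)"),
            "number": re.compile(r"Number\s+(\d+)"),
        },
    },
}


def extract(text):
    """Return the fields of the first matching template, or None."""
    for template_name, template in TEMPLATES.items():
        if template["marker"].search(text):
            result = {"template": template_name}
            for field, pattern in template["fields"].items():
                match = pattern.search(text)
                result[field] = match.group(1) if match else None
            return result
    return None


# Assumed OCR text after labels and values have been put on the same line.
sample = "Communication Bill\nName Blog\nNumber 123\n"
print(extract(sample))
```

Hand-written rules like these are easy to start with but break when a vendor changes the layout, which is the gap a deep learning model for template and field detection is meant to fill.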



  
