Data extraction from PDF
Customer Challenges:
- Enable a leading test automation platform to use semi-structured PDF data as a queryable, addressable data source for reading text, tables, and images
Result:
- ~90% accuracy by the end of the POC period
- Seamless PDF data extraction in the production rollout for the first customer
- Easy, automated training on new PDF formats
CoreView Solution:
- NLP- and ML-based data extraction solution built on pre-trained Google models
- Confidence score for the PDF data detection and extraction process
- UI-driven, automated training on new PDF formats
- Future extensibility to scanned PDFs
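As an illustration of how a per-field confidence score might be consumed downstream, here is a minimal Python sketch. It is not the CoreView implementation: the `ExtractedField` type, the `split_by_confidence` helper, the sample values, and the 0.9 threshold are all hypothetical, chosen only to show the idea of accepting high-confidence fields and routing the rest to review.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ExtractedField:
    """Hypothetical record for one value pulled out of a PDF."""
    name: str
    value: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def split_by_confidence(
    fields: List[ExtractedField], threshold: float = 0.9
) -> Tuple[List[ExtractedField], List[ExtractedField]]:
    """Return (accepted, needs_review) based on an assumed threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


# Made-up sample data, for illustration only.
sample_fields = [
    ExtractedField("invoice_number", "INV-001", 0.98),
    ExtractedField("due_date", "2024-01-31", 0.62),
    ExtractedField("total", "149.00", 0.91),
]

accepted, needs_review = split_by_confidence(sample_fields, threshold=0.9)
```

In a setup like this, fields in `needs_review` could be shown to a user for correction, and those corrections could feed the training of new PDF formats.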
Other Considerations:
- Work with any PDF layout
- Handle both computer-generated and scanned PDFs
- Easily learn layout variations and new layouts