PDF Section Detection & Classification

Customer Challenges:

  • Identify coordinates of different PDF sections
  • Section types: paragraphs, images, titles, headers, equations, footers, etc.
  • Classify different PDF sections
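
To make the two tasks concrete, here is a minimal Python sketch of how a detected section could be represented: a bounding box (the coordinates) plus a label (the classification). The `Section` class, the `SECTION_TYPES` tuple, the coordinate convention, and the `iou` helper are illustrative assumptions and are not taken from the CoreView implementation.

```python
from dataclasses import dataclass

# Hypothetical label set, based on the section types listed above.
SECTION_TYPES = ("paragraph", "image", "title", "header", "equation", "footer")


@dataclass(frozen=True)
class Section:
    """One detected region of a PDF page (illustrative structure only)."""
    page: int    # zero-based page index (assumed convention)
    x0: float    # left edge
    y0: float    # top edge
    x1: float    # right edge
    y1: float    # bottom edge
    label: str   # one of SECTION_TYPES

    def __post_init__(self):
        # Reject unknown labels and empty or inverted boxes early.
        if self.label not in SECTION_TYPES:
            raise ValueError(f"unknown section type: {self.label!r}")
        if self.x1 <= self.x0 or self.y1 <= self.y0:
            raise ValueError("box must have positive width and height")

    @property
    def area(self) -> float:
        return (self.x1 - self.x0) * (self.y1 - self.y0)


def iou(a: Section, b: Section) -> float:
    """Intersection over union of two boxes; 0.0 if on different pages."""
    if a.page != b.page:
        return 0.0
    width = min(a.x1, b.x1) - max(a.x0, b.x0)
    height = min(a.y1, b.y1) - max(a.y0, b.y0)
    if width <= 0 or height <= 0:
        return 0.0
    intersection = width * height
    return intersection / (a.area + b.area - intersection)
```

Intersection over union is a common way to decide whether a predicted box matches a labeled one; whether this project used it is not stated in the source.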

Result:

  • High-accuracy section detection and classification model (>90% accuracy)
  • Prediction time < 2 seconds
  • False positive rate < 5%
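
As a rough illustration of how these three targets could be checked, the Python sketch below computes accuracy and false positive rate from label lists and times a single prediction call. The source does not say how the metrics were defined, so the definitions here are assumptions: accuracy is the fraction of regions labeled correctly, and a false positive is a region whose true label is a hypothetical `"background"` class but which the model assigns to a section type. All function names are illustrative.

```python
import time


def accuracy(y_true, y_pred):
    """Fraction of regions whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def false_positive_rate(y_true, y_pred, negative="background"):
    """FP / (FP + TN), treating `negative` as the 'no section' class (assumed)."""
    negatives = [p for t, p in zip(y_true, y_pred) if t == negative]
    if not negatives:
        return 0.0
    false_positives = sum(p != negative for p in negatives)
    return false_positives / len(negatives)


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def meets_targets(acc, fpr, seconds):
    """Check the stated targets: >90% accuracy, <5% FPR, <2 s prediction."""
    return acc > 0.90 and fpr < 0.05 and seconds < 2.0
```

In practice `timed` would wrap the model's prediction call for one PDF, and the resulting duration would be passed to `meets_targets` together with metrics computed on a held-out set.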

CoreView Solution:

  • Different PDFs from multiple sources
  • Deep learning-based computer vision model for section detection
  • Advanced classification models for section classification

Other Considerations:

  • Poor PDF data quality
  • Skewed data
  • Prediction must complete within 2 seconds
