Eugene Yan 9/4/2020

Mailbag: Parsing Fields from PDFs—When to Use Machine Learning?

A developer seeks guidance on whether to implement machine learning for parsing quote numbers from PDFs, where occasional typos cause errors. The article advises that a 99% success rate is already good and suggests using techniques like Levenshtein distance for text matching and flagging ambiguous cases for human review, rather than immediately jumping to a full ML solution.
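As a rough illustration of the non-ML approach summarized above, the sketch below matches a parsed quote number against a list of known quote numbers using Levenshtein distance, and flags anything ambiguous for human review. Everything specific here is an assumption made for the example and not taken from the original article: the function names `levenshtein` and `match_quote_number`, the `Q-12345` quote number format, and the threshold of one edit.

```python
from typing import Iterable, Optional, Tuple


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `a` into `b`."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def match_quote_number(
    parsed: str,
    known: Iterable[str],
    max_distance: int = 1,  # hypothetical threshold; tune on real data
) -> Tuple[Optional[str], str]:
    """Return (matched_value, status), where status is
    'exact', 'corrected', or 'review'."""
    candidates = sorted(set(known))
    if parsed in candidates:
        return parsed, "exact"
    if not candidates:
        return None, "review"

    distances = [(levenshtein(parsed, c), c) for c in candidates]
    best = min(d for d, _ in distances)
    closest = [c for d, c in distances if d == best]

    # Auto-correct only when exactly one known value is within the threshold.
    if best <= max_distance and len(closest) == 1:
        return closest[0], "corrected"
    # Too far from everything, or tied between several: send to a human.
    return None, "review"


if __name__ == "__main__":
    known_quotes = ["Q-10482", "Q-10483", "Q-20951"]  # made-up sample values
    for raw in ["Q-10482", "Q-2O951", "Q-10484", "Q-99999"]:
        print(raw, match_quote_number(raw, known_quotes))
```

In this sketch a typo such as a letter O in place of a zero is corrected automatically, while a value that is equally close to two known quote numbers is routed to review instead of being guessed. Whether a threshold of one edit is appropriate depends on how similar the real quote numbers are to each other.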
