Demo that combines Form Recognizer (part of Azure Cognitive Services) with Azure Storage, Databricks, and Power BI to extract data from PDFs.
Check out the associated blog post and embedded demo video: https://www.bluegranite.com/blog/extract-data-from-pdfs-at-scale-with-form-recognizer
- Architecture of Form Recognizer Demo.jpg -- High-level overview of the demo
- sample_pdfs.zip -- Zip archive containing 13 PDF invoices
- form_recognizer_demo.dbx -- Executed Databricks notebook that orchestrates the demo
- form_recognizer_demo.ipynb -- Executed Jupyter copy of the orchestration notebook (Note: cannot be run locally)
- form_recognizer_demo.pbix -- Mock-up Power BI Desktop report of output data
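
For readers who want a feel for the first step of the pipeline without opening the notebooks, here is a minimal sketch of reading the invoices out of sample_pdfs.zip and preparing one analyze request per PDF. It is not the code from the demo notebook. The helper names (`iter_pdfs`, `build_analyze_request`) are hypothetical, and the v2.1 prebuilt invoice route is an assumption; the demo may target a different API version or a custom-trained model. The request is only built here, never sent.

```python
import urllib.request
import zipfile


def iter_pdfs(zip_source):
    """Yield (name, bytes) for every PDF entry in a zip archive, sorted by name."""
    with zipfile.ZipFile(zip_source) as archive:
        for name in sorted(archive.namelist()):
            if name.lower().endswith(".pdf"):
                yield name, archive.read(name)


def build_analyze_request(endpoint, key, pdf_bytes):
    """Prepare (but do not send) a POST that submits one PDF for analysis."""
    # Assumption: the v2.1 prebuilt invoice route. The demo notebook may use
    # another API version or a custom model instead.
    url = endpoint.rstrip("/") + "/formrecognizer/v2.1/prebuilt/invoice/analyze"
    return urllib.request.Request(
        url,
        data=pdf_bytes,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,  # key of your own resource
            "Content-Type": "application/pdf",
        },
    )
```

Typical use would be to loop over `iter_pdfs("sample_pdfs.zip")`, pass each request to `urllib.request.urlopen`, and then poll the URL returned in the `Operation-Location` response header until the analysis result is ready. The endpoint and key come from your own Form Recognizer resource.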