This repository contains the source code for the blog post series "A Practical Take On Processing Price Transparency Data".
In Part A, we cover downloading the source machine-readable data files with a Python script running on AWS Lambda.
In Part B, we cover using a Polars script to pre-process the source data and store the result in Parquet format in S3.
In Part C, we cover using both PySpark and Polars scripts to produce denormalized data partitioned by billing_code and store the final data in Parquet format in S3.
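
As a rough illustration of the Part A download step, the sketch below streams a remote file into S3 from a Lambda handler. It is not the code from this repository: the event fields `url` and `bucket`, the `raw` key prefix, and the helper name `s3_key_for_url` are all hypothetical, and it assumes the Lambda role is allowed to write to the target bucket.

```python
import posixpath
from urllib.parse import urlparse
from urllib.request import urlopen


def s3_key_for_url(url, prefix="raw"):
    """Build an S3 key from the file name in the URL path (hypothetical layout)."""
    # Query strings (for example signed-URL parameters) are not part of the path,
    # so they are dropped here.
    name = posixpath.basename(urlparse(url).path)
    return f"{prefix}/{name}"


def lambda_handler(event, context):
    """Stream the file at event["url"] into event["bucket"] (assumed event shape)."""
    # boto3 ships with the Lambda Python runtime; it is imported here so the
    # helper above can be used without AWS dependencies installed.
    import boto3

    url = event["url"]
    bucket = event["bucket"]
    key = s3_key_for_url(url)

    # upload_fileobj reads the response in chunks, so the file is not held
    # in memory all at once.
    with urlopen(url) as response:
        boto3.client("s3").upload_fileobj(response, bucket, key)

    return {"bucket": bucket, "key": key}
```

Streaming the response keeps memory use low, but very large files may still run into the Lambda execution time limit, so this is only a starting point.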