Please read the README first.
The processing downloads the OpenStreetMap (OSM) data, filters and processes it into a PostgreSQL/PostGIS database, which is then made available as vector tiles with martin.
The data is selected and optimized to make planning of bicycle infrastructure easier.
We use Bun and Bun Shell to orchestrate the commands needed to fetch, filter, process and post-process the data and trigger post-processing hooks.
See index.ts for more.
We use the public Germany export from Geofabrik, which includes OSM data up until ~20:00 h of the previous day. All processing is done on this dataset.
- Data is processed every day (cron job definition)
- Data is processed on every deploy/release
- Data can be processed manually via GitHub Actions ("Run workflow > from Branch: main").
See https://github.com/FixMyBerlin/atlas-app/blob/develop/processing/run-5-process.sh#L45-L50 for a list of URLs to see the data that martin provides.
- Install bun
- Run bun install in ./processing
The workflow is…
- Edit the files locally
- Rebuild and restart everything. First, make sure you are in the root folder of this repo.
  docker compose build && docker compose up
- Inspect the new results, see "Inspect changes"
Note: Our development Docker Compose setup adds two volumes, which means that in most cases we don't need to run docker compose build.
Note: Learn more about the file/folder structure and coding patterns in processing/topics/README.md.
With SKIP_UNCHANGED=1 we compare the hashes of all .lua and .sql files to the last run per topic.
During run-5-process.sh we only run code if the respective hash has changed.
If any helper in processing/topics/helper or the OSM file has changed, we rerun everything.
Whenever we talk about hashes in this code, this feature is referenced.
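The decision described above can be sketched roughly as follows. This is a minimal illustration, not the code from index.ts: the `Hashes` shape, the function name `topicsToRun`, and the idea of one combined hash per topic are assumptions made for the example.

```typescript
// Hypothetical bookkeeping; the real data structures in processing/ may differ.
type Hashes = {
  osmFile: string; // hash of the downloaded OSM file
  helpers: string; // combined hash of everything in topics/helper
  topics: Record<string, string>; // topic name -> combined hash of its .lua and .sql files
};

// Returns the names of the topics that need to be processed again.
function topicsToRun(
  previous: Hashes | null,
  current: Hashes,
  skipUnchanged: boolean,
): string[] {
  const all = Object.keys(current.topics);
  // SKIP_UNCHANGED=0 or no previous run: process everything.
  if (!skipUnchanged || previous === null) return all;
  // A changed helper or a changed OSM file invalidates every topic.
  if (previous.osmFile !== current.osmFile || previous.helpers !== current.helpers) {
    return all;
  }
  // Otherwise only topics whose own hash changed (or that are new) are rerun.
  return all.filter((topic) => previous.topics[topic] !== current.topics[topic]);
}
```

Keeping the helper and OSM file hashes separate from the per-topic hashes makes the "rerun everything" case a single comparison up front.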
With COMPUTE_DIFFS=1 the system will create <tablename>_diff tables that contain only changed entries.
It will compare the tags column to the previous run.
Whenever we talk about diffs in this code, this feature is referenced.
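To illustrate the idea of comparing the tags column between two runs, here is a small in-memory sketch. The real comparison happens in the database; the `Row` type, the function names, and the choice to treat newly added rows as changed are assumptions for this example only.

```typescript
// Hypothetical row shape: an object ID plus its tags.
type Row = { id: string; tags: Record<string, string> };

// Order-independent comparison of two tag objects.
function sameTags(a: Record<string, string>, b: Record<string, string>): boolean {
  const keysA = Object.keys(a);
  if (keysA.length !== Object.keys(b).length) return false;
  return keysA.every((key) => key in b && a[key] === b[key]);
}

// Rows of the current run whose tags differ from the previous run.
// Assumption: rows that did not exist in the previous run also count as changed.
function changedRows(previous: Row[], current: Row[]): Row[] {
  const previousById = new Map(previous.map((row) => [row.id, row] as const));
  return current.filter((row) => {
    const before = previousById.get(row.id);
    return before === undefined || !sameTags(before.tags, row.tags);
  });
}
```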
- With FREEZE_DATA=0 you see the changes to the last run on every run.
- With FREEZE_DATA=1 you see the changes to the last reference run, allowing you to compare your changes to a certain version of your data. The reference will be the last time you ran with FREEZE_DATA=0. In this case the system will not update the backup.<tablename> tables. This flag will be ignored if COMPUTE_DIFFS=0.
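The interplay of the two flags can be summarized as a small decision function. This is only a sketch of the rules stated above; the `DiffMode` type and the `diffMode` function are hypothetical names and not part of the code base.

```typescript
// Hypothetical names for the three situations described above.
type DiffMode = "off" | "against-last-run" | "against-reference-run";

function diffMode(env: Record<string, string | undefined>): DiffMode {
  // COMPUTE_DIFFS=0: no diff tables are created and FREEZE_DATA is ignored.
  if (env["COMPUTE_DIFFS"] !== "1") return "off";
  // FREEZE_DATA=1: compare to the last run that used FREEZE_DATA=0;
  // the backup.<tablename> tables are not updated in this mode.
  if (env["FREEZE_DATA"] === "1") return "against-reference-run";
  // FREEZE_DATA=0: compare to the previous run.
  return "against-last-run";
}
```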
To run everything without code caching and diffing, set SKIP_UNCHANGED=0 and COMPUTE_DIFFS=0.
For the development process, it's often useful to run the processing on a single object.
To do that, specify an ID (or a list of IDs) as ID_FILTER in processing/run-3-filter.sh.
See the osmium docs for more information.
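As an illustration of what such an ID list may look like: osmium's getid command identifies objects by a type letter (n, w or r) followed by the numeric ID, for example w42. The following sketch validates a whitespace-separated list in that form. Whether ID_FILTER is passed on in exactly this format is an assumption, and `parseIdFilter` is a hypothetical helper that does not exist in the repo.

```typescript
// Hypothetical helper: splits a whitespace-separated ID list and checks each entry.
// This sketch only accepts the prefixed form "n<id>", "w<id>" or "r<id>".
function parseIdFilter(idFilter: string): string[] {
  const ids = idFilter.split(/\s+/).filter((id) => id.length > 0);
  for (const id of ids) {
    if (!/^[nwr]\d+$/.test(id)) {
      throw new Error(`Invalid OSM object ID "${id}", expected e.g. "w42"`);
    }
  }
  return ids;
}
```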
We use the luarocks package busted as our testing framework.
To run the tests manually:
./processing/run-tests.sh
Additionally, all tests are run in the husky pre-push hook.
- Create one test file per helper
- Filename has to be *.test.lua
- Place it in a __tests__ folder next to the file
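The naming convention above can be expressed as a small path mapping. This is a sketch for illustration; `testFileFor` is a hypothetical function and the helper file name in the usage example is made up.

```typescript
// Maps a helper file to the test file location required by the convention above.
function testFileFor(helperPath: string): string {
  const lastSlash = helperPath.lastIndexOf("/");
  const dir = helperPath.slice(0, lastSlash + 1); // "" when there is no folder part
  const file = helperPath.slice(lastSlash + 1);
  if (!file.endsWith(".lua")) {
    throw new Error(`Expected a .lua file, got "${file}"`);
  }
  const base = file.slice(0, -".lua".length);
  return `${dir}__tests__/${base}.test.lua`;
}
```

For a hypothetical helper processing/topics/helper/Example.lua, the test file would live at processing/topics/helper/__tests__/Example.test.lua.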
- First, GitHub Actions runs (https://github.com/FixMyBerlin/atlas-app/actions).
- Server (IONOS) runs the processing one table at a time.
The whole processing takes about 1.5 h.
See index.ts for details.
The first iteration of the processing pipeline was inspired by gislars/osm-parking-processing