Some DB tables are not dumped when using db.to_csv()
#383
Comments
Update on the encoding: it seems that only the additional tables have utf-16le:
darth@vader:~/.open-MaStR/data$ for f in dataversion-2022-11-17_OLD/*.csv; do file -i $f; done
dataversion-2022-11-17_OLD/bnetza_mastr_balancing_area_raw.csv: application/csv; charset=utf-16le
dataversion-2022-11-17_OLD/bnetza_mastr_biomass_raw.csv: application/csv; charset=utf-8
dataversion-2022-11-17_OLD/bnetza_mastr_combustion_raw.csv: application/csv; charset=utf-8
dataversion-2022-11-17_OLD/bnetza_mastr_electricity_consumer_raw.csv: application/csv; charset=utf-16le
dataversion-2022-11-17_OLD/bnetza_mastr_gsgk_raw.csv: application/csv; charset=utf-8
dataversion-2022-11-17_OLD/bnetza_mastr_hydro_raw.csv: application/csv; charset=utf-8
dataversion-2022-11-17_OLD/bnetza_mastr_nuclear_raw.csv: application/csv; charset=utf-8
dataversion-2022-11-17_OLD/bnetza_mastr_solar_raw.csv: application/csv; charset=utf-8
dataversion-2022-11-17_OLD/bnetza_mastr_storage_raw.csv: application/csv; charset=utf-8
dataversion-2022-11-17_OLD/bnetza_mastr_wind_raw.csv: application/csv; charset=utf-8
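For anyone without the `file` utility, the same check can be approximated in Python. This is only a rough sketch: `guess_encoding` is a hypothetical helper (not part of open-MaStR), and its heuristic assumes mostly-ASCII CSV content, so treat its answer as a hint rather than a definitive result.

```python
import codecs
from pathlib import Path


def guess_encoding(path, sample_size=4096):
    """Roughly guess a text file's encoding from its first bytes (heuristic)."""
    raw = Path(path).read_bytes()[:sample_size]

    # A byte order mark is the most reliable hint, if one is present.
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if raw.startswith(codecs.BOM_UTF16_LE):
        return "utf-16le"
    if raw.startswith(codecs.BOM_UTF16_BE):
        return "utf-16be"

    # Without a BOM: mostly-ASCII text stored as UTF-16LE has a NUL byte
    # in (almost) every second position.
    odd_bytes = raw[1::2]
    if odd_bytes and odd_bytes.count(0) > len(odd_bytes) // 2:
        return "utf-16le"

    # Otherwise test whether the sample is valid UTF-8. The incremental
    # decoder tolerates a multi-byte character cut off at the sample end.
    try:
        codecs.getincrementaldecoder("utf-8")().decode(raw, final=False)
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown"
```

Looping over `Path("dataversion-2022-11-17_OLD").glob("*.csv")` and printing `guess_encoding(f)` for each file would mirror the shell loop above.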
I opened a separate issue for this problem: #385
* add tqdm for additional table export
* rename second table variable
* Move method to helpers.py
* Add 'source' parameter to choose mapping
* Add NotImplementedException
The problem with the encoding was simply fba5bb1.
open-MaStR/open_mastr/mastr.py
Lines 316 to 320 in a826d15
table was drawn from the set of data options in BULK_DATA and compared to ADDITIONAL_TABLES whenever the tables to export were left unspecified in db.to_csv(). See below, as method='bulk' is the default.
open-MaStR/open_mastr/utils/helpers.py
Lines 273 to 274 in a826d15
This resulted in balancing_area and electricity_consumer being the only matches between BULK_DATA and ADDITIONAL_TABLES. Thus, only these two were added to the list of exported additional tables.
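The effect can be reproduced in isolation. In the sketch below the two lists are abbreviated, illustrative stand-ins for the real constants (the entries beyond those named in this thread are assumptions), and `select_additional_tables` is a hypothetical helper, not the package's code.

```python
# Abbreviated, illustrative values only; the real constants in the
# package are longer and may differ.
BULK_DATA = [
    "wind",
    "solar",
    "gas",
    "location",
    "balancing_area",
    "electricity_consumer",
]
ADDITIONAL_TABLES = [
    "balancing_area",
    "electricity_consumer",
    "locations_extended",  # note: not "location"
    "gas_storage",         # assumed name, for illustration; note: not "gas"
]


def select_additional_tables(requested):
    """Keep only requested names that literally appear in ADDITIONAL_TABLES."""
    return [name for name in requested if name in ADDITIONAL_TABLES]


# With no tables specified, the bulk option names are used as the request.
exported = select_additional_tables(BULK_DATA)
print(exported)  # ['balancing_area', 'electricity_consumer']
```

Because "location" and "gas" never appear under those exact names in the second list, they are dropped without any warning.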
As a solution, I wrote a mapping (64c306e) from the possible
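A mapping of this kind might look roughly like the following. This is a sketch under assumptions, not the code from 64c306e: the dictionary name, the function name, and the concrete table names are hypothetical; only the 'source' parameter and the NotImplementedError come from the commit messages in this thread.

```python
# Hypothetical mapping from bulk data option names to the names of the
# additional tables they correspond to. Table names are illustrative.
BULK_TO_ADDITIONAL = {
    "location": ["locations_extended"],
    "gas": ["gas_storage", "gas_producer"],
    "balancing_area": ["balancing_area"],
    "electricity_consumer": ["electricity_consumer"],
}


def to_additional_tables(data, source="bulk"):
    """Translate data option names into additional table names."""
    if source != "bulk":
        # Only the bulk naming scheme is covered in this sketch.
        raise NotImplementedError(f"No table mapping for source {source!r}")
    tables = []
    for option in data:
        # Options without additional tables (e.g. "wind") map to nothing.
        tables.extend(BULK_TO_ADDITIONAL.get(option, []))
    return tables


print(to_additional_tables(["wind", "location", "gas"]))
# ['locations_extended', 'gas_storage', 'gas_producer']
```

Translating names before the comparison means a request for "location" reaches "locations_extended" instead of being silently dropped.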
Thanks @chrwm for these quick fixes!
* raise NotImplementedError successfully
* run black
Fixed by #384
Hey!
It's great to see the progress you've made this year 👍. Following your instructions in the docs I was able to bulk-download and access the MaStR data. But:
Expected behavior
db.to_csv(None) exports all tables.
Problem
db.to_csv(None) successfully creates a dump but some tables seem to be omitted, e.g. no files are created for locations and gas. So then I tried afterwards, but no location file is created.
So I thought it might be a problem if the dir already exists, removed it and explicitly listed all tables:
Same result, the additional tables are missing again.
Investigation
Ok, as the requested table "location" is not included in the "Additional tables: ..." above, I checked these lines in the package to find out where they get lost:
open-MaStR/open_mastr/mastr.py
Lines 313 to 325 in a826d15
It turns out that it is included in data before the loop starts in L316 but not in the constant ADDITIONAL_TABLES taken from constants.py - in there only "locations_extended" is listed. However, db.to_csv("locations_extended") is not allowed and raises this message.
I guess this is a bug?
My dirty workaround: Comment out the validation here and run db.to_csv("locations_extended"), this works out.
Problem 2: CSV encoding
But, haha, the resulting CSV has a wrong encoding (utf-16le):
But you explicitly defined the charset:
open-MaStR/open_mastr/soap_api/download.py
Lines 411 to 428 in a826d15
That might be a different issue?
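Until the encoding is sorted out in the package, an affected file can be converted after the export. The helper below is a hypothetical stopgap, not the project's fix, and it assumes the file really is UTF-16LE as reported by `file -i`.

```python
def reencode_csv(src, dst, src_encoding="utf-16-le", dst_encoding="utf-8"):
    """Copy a text file line by line while changing its encoding."""
    # newline="" keeps the original line endings untouched on both sides.
    with open(src, "r", encoding=src_encoding, newline="") as fin, \
            open(dst, "w", encoding=dst_encoding, newline="") as fout:
        for index, line in enumerate(fin):
            if index == 0:
                # Drop a leading byte order mark if the source has one.
                line = line.lstrip("\ufeff")
            fout.write(line)
```

A call such as `reencode_csv("bnetza_mastr_balancing_area_raw.csv", "balancing_area_utf8.csv")` writes a UTF-8 copy and leaves the original file unchanged; the output file name here is just an example.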