Shorten forced fit measurement names #734

ddobie · 2024-07-23T07:04:14Z

ddobie · 2024-10-17T02:34:13Z

These changes implicitly limit the number of forced-fit measurements per source to 1000. That's more than sufficient for the VAST survey (at absolute most 400 measurements per source), but it could come into play in the future. I've documented the limitation in the changelog, but I feel it should also be mentioned somewhere else in the docs, just in case. Any thoughts, @mauch?
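
To make the limit concrete: a zero-padded decimal field of width 3 holds the values 0 to 999, so 1000 distinct values fit before the field has to grow. A minimal sketch, assuming the index is formatted with a Python format spec such as `03d`; `padded_suffix` and `capacity` are hypothetical helpers for illustration, not pipeline functions:

```python
def padded_suffix(index: int, width: int = 3) -> str:
    """Zero-pad a non-negative integer to at least `width` digits."""
    return f"{index:0{width}d}"


def capacity(width: int) -> int:
    """Number of distinct non-negative integers that fit in `width` digits."""
    return 10 ** width


print(padded_suffix(7))     # 007
print(padded_suffix(999))   # 999
print(padded_suffix(1000))  # 1000 (width is a minimum, so the field grows)
print(capacity(3))          # 1000
```

The width in a Python format spec is a minimum: a value of 1000 is rendered with four digits rather than being truncated or raising an error, so going past the limit makes names longer instead of failing at the formatting step.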

mauch · 2024-10-18T04:02:30Z

vast_pipeline/pipeline/forced_extraction.py

@@ -614,7 +613,7 @@ def forced_extraction(
    )

    # make measurement names unique for db constraint
-    extr_df['name'] = extr_df['name'] + f'_f_run{p_run.id:06d}'
+    extr_df['name'] = extr_df['name'] + f'_f_run{p_run.id:03d}'


If we add the UUID changes in future, this will come from a safely chosen subset of characters of the run's UUID and should provide for a much larger number of runs (depending on how many characters we use).
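
One possible shape for such a UUID-based suffix, as a sketch only: it assumes the run carries a version 4 UUID and that the suffix is simply the first `n_chars` hexadecimal characters, which is an assumption since the choice of characters is not settled in this thread. `run_suffix` and `n_distinct` are hypothetical names.

```python
import uuid


def run_suffix(run_uuid: uuid.UUID, n_chars: int = 8) -> str:
    """Build a run suffix from the first `n_chars` hex characters of a UUID.

    Hypothetical helper: the leading 8 hex characters of a version 4 UUID
    are random bits, so they avoid the fixed version and variant fields.
    """
    return f"_f_run{run_uuid.hex[:n_chars]}"


def n_distinct(n_chars: int) -> int:
    """Number of distinct suffixes available with `n_chars` hex characters."""
    return 16 ** n_chars


# Fixed example UUID so the output is reproducible.
example = uuid.UUID("12345678-1234-4678-9234-567812345678")
print(run_suffix(example))     # _f_run12345678
print(run_suffix(example, 4))  # _f_run1234
print(n_distinct(8))           # 4294967296
```

A truncated UUID is not guaranteed to be unique, so a scheme like this would still rely on the database constraint to catch the rare collision; more characters make a collision less likely at the cost of longer names.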

* Organise v1.1.1-dev
* Fix changelog formatting and update changelog instructions (#772)
* Initial changelog formatting issues
* Update changelog + instructions
* Updated changelog
* Updated Code of conduct (#773)
* Updated Code of conduct
* Updated changelog
* Fixed grammar
* Fix zenodo DOI
* Fixed typo in README
* Shorten forced fit measurement names (#734)
* Shorten names
* Updated changelog
* Update clearpiperun to use raw SQL (#775)
* timing and memory benchmark
* delete raw initial
* adding profiler
* optimisation handling exceptions
* Added logging
* Updated delete_run
* Fix syntax errors
* Disable triggers to see if that fixes speed issues
* Remove memory profiling
* Reenabled logging
* Add end of loop logging, remove tqdm
* Remove all tqdm, improve logging slightly
* Added timing
* Fixed tqdm missing
* Fix logging
* Added units to logging
* specify source id in logging
* Toggle triggers
* clean up clearpiperun
* Other minor updates
* Fix variable name
* Correctly handle images and skyregions that are associated with multiple runs
* PEP8
* Updated changelog
* Remove commented code
* Remove whitespace - don't know why the linter didn't pick this up
* Update vast_pipeline/management/commands/clearpiperun.py Co-authored-by: Tom Mauch <[email protected]>
* Update vast_pipeline/utils/delete_run.py Co-authored-by: Tom Mauch <[email protected]>
* Update vast_pipeline/utils/delete_run.py Co-authored-by: Tom Mauch <[email protected]>
* Update vast_pipeline/utils/delete_run.py Co-authored-by: Tom Mauch <[email protected]>
* Update vast_pipeline/utils/delete_run.py Co-authored-by: Tom Mauch <[email protected]>
* Update vast_pipeline/utils/delete_run.py Co-authored-by: Tom Mauch <[email protected]>
* Update vast_pipeline/utils/delete_run.py Co-authored-by: Tom Mauch <[email protected]>
* Update vast_pipeline/management/commands/clearpiperun.py Co-authored-by: Tom Mauch <[email protected]>
* Update vast_pipeline/management/commands/clearpiperun.py Co-authored-by: Tom Mauch <[email protected]>
* Update vast_pipeline/management/commands/clearpiperun.py Co-authored-by: Tom Mauch <[email protected]>
* Update vast_pipeline/utils/delete_run.py Co-authored-by: Tom Mauch <[email protected]>
* Update vast_pipeline/utils/delete_run.py Co-authored-by: Tom Mauch <[email protected]>
* Update vast_pipeline/utils/delete_run.py Co-authored-by: Tom Mauch <[email protected]>
* Update vast_pipeline/utils/delete_run.py Co-authored-by: Tom Mauch <[email protected]>
* Update vast_pipeline/utils/delete_run.py Co-authored-by: Tom Mauch <[email protected]>
* Update vast_pipeline/utils/delete_run.py Co-authored-by: Tom Mauch <[email protected]>
* Fix logging count
* Clean up logging statements
---------
Co-authored-by: Shibli Saleheen <[email protected]>
Co-authored-by: Tom Mauch <[email protected]>
* Quick memory optimisations (#776)
* Use itertuples over iterrows since iterrows is an enormous memory hog.
* Drop sources_df columns before renaming id column to avoid a copy of the while dataframe in memory.
* Decrease default partition size to 15MB
* Dont split (large-in-memory) list of DataFrames into dask bags (No performance hit).
* Don't write forced parquets in parallel (No perfomance hit for this).
* Dont overwrite input DataFrame when writing parquets.
* Update CHANGELOG.md
* Address review comments.
* Copy YAML objects before revalidation so the can be garbage collected.
* Appease flake8
* 750 configure workers (#777)
* Use itertuples over iterrows since iterrows is an enormous memory hog.
* Drop sources_df columns before renaming id column to avoid a copy of the while dataframe in memory.
* Decrease default partition size to 15MB
* Dont split (large-in-memory) list of DataFrames into dask bags (No performance hit).
* Don't write forced parquets in parallel (No perfomance hit for this).
* Initial configuration updates for processing options.
* Dont overwrite input DataFrame when writing parquets.
* Update CHANGELOG.md
* Address review comments.
* Copy YAML objects before revalidation so the can be garbage collected.
* Appease flake8
* Add processing options as optional with defaults.
* filter processing config to parallel association.
* Add a funtion to determine the number of workers and partitions for Dask.
* Use config values for num_workers and max_partition_size throughout pipeline.
* Correct working in config template.
* Update CHANGELOG.md
* Remove unused imports.
* Bump strictyaml to 1.6.2
* Use YAML 'null' to create Python None for all cores option.
* Make None the default in `calculate_workers_and_partitions` instead of 0
* Updated run config docs
* Allow null for num_workers_io and improve validation of processing parameters.
* Update num_workers_io default in docs.
---------
Co-authored-by: Dougal Dobie <[email protected]>
* Prepare v1.2.0 release
---------
Co-authored-by: Shibli Saleheen <[email protected]>
Co-authored-by: Tom Mauch <[email protected]>

Shorten names

9f3a78e

ddobie added the bug and enhancement labels Jul 23, 2024

ddobie added the do not merge label Oct 14, 2024

ddobie added 2 commits October 17, 2024 11:22

Merge branch 'dev' into fix-iss-733

12709fb

Updated changelog

e69e308

ddobie marked this pull request as ready for review October 17, 2024 02:31

ddobie requested a review from mauch October 17, 2024 02:32

mauch approved these changes Oct 18, 2024

ddobie removed the do not merge label Oct 18, 2024

Merge branch 'dev' into fix-iss-733

65dd314

ddobie merged commit a86ca43 into dev Oct 21, 2024
5 checks passed

ddobie deleted the fix-iss-733 branch October 21, 2024 00:12
