feat: user error for hetpar start and end core #1262

Open
wants to merge 5 commits into
base: release

Conversation

pgierz
Member

@pgierz pgierz commented Jan 6, 2025

No description provided.

@pgierz pgierz linked an issue Jan 6, 2025 that may be closed by this pull request
@mandresm
Contributor

mandresm commented Jan 6, 2025

Hey @pgierz! I might be wrong, but I think start_core and end_core are computed on the backend from the nproc values of all components. What we need to catch is a missing nproc in any component that requires one, because without it the start_core and end_core values cannot be computed.
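
To make the point above concrete, here is a minimal sketch of how core ranges could be derived from nproc, with a user-facing error when a component that needs cores has none defined. This is not the ESM-Tools implementation: `assign_core_ranges`, `MissingNprocError`, the `needs_cores` argument, and the contiguous, inclusive range layout are all assumptions made for illustration.

```python
class MissingNprocError(ValueError):
    """Hypothetical user-facing error; not an ESM-Tools class."""


def assign_core_ranges(config, needs_cores):
    """Assign contiguous, inclusive core ranges in the order given.

    config:      dict mapping component name -> component settings
    needs_cores: names of the components that must define 'nproc'
                 (assumption: the caller knows which ones these are)
    """
    ranges = {}
    next_core = 0
    for model in needs_cores:
        nproc = config.get(model, {}).get("nproc")
        if nproc is None:
            # Fail early with a message that names the real problem,
            # instead of a later KeyError on 'start_core'.
            raise MissingNprocError(
                f"Component '{model}' needs cores but defines no 'nproc'."
            )
        start_core = next_core
        end_core = start_core + int(nproc) - 1
        ranges[model] = (start_core, end_core)
        next_core = end_core + 1
    return ranges
```

With made-up values `{"echam": {"nproc": 4}, "fesom": {"nproc": 8}}` and `needs_cores=["echam", "fesom"]`, this sketch assigns `(0, 3)` to echam and `(4, 11)` to fesom. If fesom had no nproc, it would raise `MissingNprocError` before any range is computed for it.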

@pgierz
Member Author

pgierz commented Jan 7, 2025

whoops, yeah...I'll take care of that, sorry.

@@ -220,6 +250,7 @@ def calculate_requirements(config, cluster=None):
)

else:
# FIXME(PG): ...what? Just continue???
Contributor

Yes. Think of oasis, or recom for fesom2: ESM-Tools treats them as components, but they are libraries, so they don't need cores of their own.

Member Author

Alright. Then I think we should write something to that effect:

Suggested change
- # FIXME(PG): ...what? Just continue???
+ # NOTE: For components like oasis or recom, no "own" cores are needed

Comment on lines -263 to -265
start_core = config[model]["start_core"]
end_core = config[model]["end_core"]

Contributor

I'm almost certain that removing this will kill the het_par_wrappers functionality.

Member Author

One would think so. However, the start_core and end_core variables are not used anywhere in that function. Note that other functions further down might need those keys, but they aren't needed here specifically. To avoid getting errors in the wrong place, I removed them.
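
A small sketch of the failure mode described above, assuming plain dict lookups as in the removed lines. Both function names are hypothetical and the bodies are reduced to the relevant part; the real `calculate_requirements` does more than this.

```python
def requirements_with_unused_lookup(config, model):
    # These values are never used below, yet the lookups raise KeyError
    # when 'start_core' / 'end_core' have not been computed yet.
    start_core = config[model]["start_core"]  # noqa: F841
    end_core = config[model]["end_core"]  # noqa: F841
    return int(config[model]["nproc"])


def requirements_without_lookup(config, model):
    # Same result, but only touches the key that is actually needed.
    return int(config[model]["nproc"])
```

For a component that has nproc but no start_core yet, the first variant fails with a KeyError on a key this function does not need, while the second returns the nproc value. Any function further down that does use start_core and end_core would still have to look them up itself.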

Successfully merging this pull request may close these issues.

prepexp returned non-zero exit status 1
2 participants