HPCC-33137 Add supports for executing the cost optimizers in Thor #19423

shamser · 2025-01-16T16:37:29Z

The cost optimizer will execute in Thor for containerized deployments. It will continue to execute in EclAgent for non-containerized deployments.

The reason for this change is that in containerized deployments EclAgent cannot calculate the cost of the Thor cluster, and so cannot assign a cost to the issues reported by the cost optimizer. It cannot calculate the cost of the Thor cluster because Thor's cost parameters and resource configuration are not available to EclAgent; this information is only available in Thor's configuration.

Note that, as the Thor manager executes on a per-job basis, it executes the analyzer after every graph executes. The concept of "end of workunit" does not exist at present in Thor, so analyzeWhenComplete==true has not been implemented for Thor.

The changes:

Support for executing the cost optimizer in Thor. This is the default in containerized.
New 'analyzeInEclAgent' option: if true, the analyzer executes in EclAgent; otherwise it executes in Thor Manager.
By default, 'analyzeInEclAgent' is true in bare-metal and false in containerized.
The default for 'analyzeWhenComplete' has changed for containerized. It is now false by default in containerized. It remains true in bare-metal.
New 'disabled' option is available within analyzerOptions to disable the cost optimizer. 'disabled' is false by default.
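
The defaults listed above can be summarised in a small sketch. This is only an illustration of the described behaviour, not code from the PR: `AnalyzerSettings`, `AnalyzerOverrides`, `resolveAnalyzerSettings`, `AnalyzerHost` and `analyzerHost` are hypothetical names, and the way overrides are supplied is an assumption.

```cpp
#include <optional>

// Hypothetical model of the three settings described in the change list.
struct AnalyzerSettings
{
    bool analyzeInEclAgent;
    bool analyzeWhenComplete;
    bool disabled;
};

// Hypothetical explicit overrides; an empty optional means "use the default".
struct AnalyzerOverrides
{
    std::optional<bool> analyzeInEclAgent;
    std::optional<bool> analyzeWhenComplete;
    std::optional<bool> disabled;
};

// Apply the defaults from the description:
//   analyzeInEclAgent   - true in bare-metal, false in containerized
//   analyzeWhenComplete - true in bare-metal, false in containerized
//   disabled            - false everywhere
AnalyzerSettings resolveAnalyzerSettings(bool containerized,
                                         const AnalyzerOverrides &overrides = AnalyzerOverrides{})
{
    AnalyzerSettings settings;
    settings.analyzeInEclAgent = overrides.analyzeInEclAgent.value_or(!containerized);
    settings.analyzeWhenComplete = overrides.analyzeWhenComplete.value_or(!containerized);
    settings.disabled = overrides.disabled.value_or(false);
    return settings;
}

enum class AnalyzerHost { None, EclAgent, ThorManager };

// Decide which component runs the analyzer, if any.
AnalyzerHost analyzerHost(const AnalyzerSettings &settings)
{
    if (settings.disabled)
        return AnalyzerHost::None;
    return settings.analyzeInEclAgent ? AnalyzerHost::EclAgent : AnalyzerHost::ThorManager;
}
```

With no overrides, `analyzerHost(resolveAnalyzerSettings(true))` selects the Thor manager and `analyzerHost(resolveAnalyzerSettings(false))` selects EclAgent, matching the defaults described above.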

Type of change:

This change is a bug fix (non-breaking change which fixes an issue).
This change is a new feature (non-breaking change which adds functionality).
This change improves the code (refactor or other change that does not change the functionality)
This change fixes warnings (the fix does not alter the functionality or the generated code)
This change is a breaking change (fix or feature that will cause existing behavior to change).
This change alters the query API (existing queries will have to be recompiled)

Checklist:

Smoketest:

Send notifications about my Pull Request position in Smoketest queue.
Test my draft Pull Request.

Testing:

Signed-off-by: Shamser Ahmed <[email protected]>

github-actions · 2025-01-16T16:51:04Z

Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-33137

Jirabot Action Result:
Workflow Transition To: Merge Pending
Updated PR

ghalliday

@shamser Looks ok. How long does it take to run the analyser on a 2,000 activity graph?

ghalliday · 2025-01-20T11:18:01Z

ecl/eclagent/eclagent.cpp

-    if (w->getDebugValueBool("analyzeWorkunit", agentTopology->getPropBool("@analyzeWorkunit", true)))
+    if (w->hasDebugValue("analyzeWorkunit") && !w->getDebugValueBool("analyzeWorkunit", true))
+        return;
+    if (!getBoolWUOption(nullptr, nullptr, "analyzerOptions/@disabled", false))


Why not pass w and "analyzeWorkunit" to the function?

shamser · 2025-01-21

Because if w and "analyzeWorkunit" were passed into the function and 'analyzeWorkunit' was true, then getBoolWUOption would also return true. The result would be treated as "disabled", so the workunit would not be analyzed, which is the opposite of what it should do:

If analyzerOptions/@disabled == true, then it shouldn't execute the analysis. However, if analyzeWorkunit == true, it should analyze the workunit.
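
To make the inversion concrete, here is a minimal sketch that models the two variants with plain values. It does not use the real HPCC API: `lookupBool` is a hypothetical stand-in for getBoolWUOption, and the assumption that a workunit value takes precedence over the configuration value is inferred from the reply above, not confirmed by the diff.

```cpp
#include <optional>

// Hypothetical stand-in for getBoolWUOption: prefer the workunit value if set,
// then the configuration value, then the default. The precedence is an assumption.
bool lookupBool(std::optional<bool> workunitValue, std::optional<bool> configValue, bool defaultValue)
{
    if (workunitValue)
        return *workunitValue;
    return configValue.value_or(defaultValue);
}

// Mirrors the logic in the diff: 'analyzeWorkunit' can only switch analysis off,
// and 'analyzerOptions/@disabled' is read without consulting the workunit.
bool shouldAnalyze(std::optional<bool> analyzeWorkunit, std::optional<bool> configDisabled)
{
    if (analyzeWorkunit && !*analyzeWorkunit)
        return false;
    return !lookupBool(std::nullopt, configDisabled, false);
}

// The alternative raised in review: feed 'analyzeWorkunit' into the same lookup.
// A value of true is then read as "disabled", which inverts its meaning.
bool shouldAnalyzeMerged(std::optional<bool> analyzeWorkunit, std::optional<bool> configDisabled)
{
    return !lookupBool(analyzeWorkunit, configDisabled, false);
}
```

Under these assumptions, `shouldAnalyze(true, std::nullopt)` analyzes the workunit, while `shouldAnalyzeMerged(true, std::nullopt)` skips it because the two options have opposite polarity.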

jakesmith · 2025-01-21T12:26:23Z

How long does it take to run the analyser on a 2,000 activity graph?

I am not sure how to produce such a graph. Probably by using ECL macros and #LOOP to produce long sequences of chained attributes(?), or a macro that produces a smaller subgraph variant that can be repeated many times.
May need @ghalliday to produce this test, or reach out to ECL coders for some tips.

As discussed in my meeting with Shamser, if speed (and the incurred additional Thor cost) is an issue we could:

Spin the analysis off onto an asynchronous thread, allowing the Thor to become available to run other jobs.
Downside is that if the Thor hits a hard failure in a subsequent job before the analysis is complete, the analysis will be lost.
Farm out the analysis to a service (e.g. a sasha service), i.e. notify (queue) an analysis job, and have a separate service perform the analysis asynchronously. The service could also be load balanced trivially this way, though that would probably be overkill.
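
The first option (an asynchronous thread) could look roughly like the sketch below. It uses only the C++ standard library; `analyzeGraph` and `startAnalysisAsync` are hypothetical placeholders, and the sleep and the returned value merely simulate analyzer work rather than reflect how the real analyzer behaves.

```cpp
#include <chrono>
#include <future>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for the analyzer: simulates some work and returns a
// placeholder "issue count" (here simply the length of the graph name).
int analyzeGraph(const std::string &graphName)
{
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    return static_cast<int>(graphName.size());
}

// Start the analysis on its own thread and return immediately, so the caller
// is free to pick up the next job while the analysis runs.
std::future<int> startAnalysisAsync(std::string graphName)
{
    return std::async(std::launch::async, analyzeGraph, std::move(graphName));
}
```

A caller would keep the returned future and call `get()` on it later to collect the result. The downside noted above is visible here: the result exists only inside the process, so if the process fails before `get()` is reached, the analysis is lost. Queuing the job to a separate service avoids that at the cost of more moving parts.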

shamser requested a review from ghalliday January 16, 2025 16:37

shamser force-pushed the issue33137 branch from 1a314f4 to cc0353c Compare January 16, 2025 16:43

shamser changed the title ~~HPCC-33137 Add supports for executing the cost optimizers execution in Thor~~ HPCC-33137 Add supports for executing the cost optimizers in Thor Jan 16, 2025

shamser force-pushed the issue33137 branch from cc0353c to 5f1b976 Compare January 16, 2025 16:46

ghalliday reviewed Jan 20, 2025
