Add ability for parallel Jenkins multi-arch builds #99

Merged 1 commit on Dec 2, 2024

yosifkit
Member

Trigger all Jenkins builds (adding them to the Jenkins queue) and then wait on them in parallel. We control the concurrency through the number of executors per arch (or by adding multiple machines that match the build label).

@@ -1,15 +1,18 @@
// one job per arch (for now) that just builds "the top thing" (triggered by the meta-update job)
properties([
disableConcurrentBuilds(),
Member

Oof, hmm, something to think about here:

https://github.com/docker-library/meta/blob/f595ec2375e11540c4e26352274b05247953d763/.github/workflows/build.yml#L28-L30

I wish Jenkins had something like GitHub's concurrency groups, which prevent the same buildId from even attempting to build twice at the same time. With this PR, the only thing preventing that is that these builds are canonically triggered by the "trigger" job. That is fine for the normal case, but when things go wrong and we're running "build" by hand to debug, nothing will stop "trigger" from firing and potentially clobbering the thing we're testing. 🤔

I think the closest thing Jenkins has is "Lockable Resources" (https://plugins.jenkins.io/lockable-resources/), but they're a really awful experience and IIRC nothing cleans them up, so we can't reasonably put an arbitrary number of those into the system. 😭

Member Author

I think the "Throttle Concurrent Builds" plugin can at least keep builds with the same parameters from running at the same time: https://www.gusi.me/2022/05/06/Disable-concurrent-builds-based-on-parameters.html. They will still be quasi-queued, though.

Member Author

	throttleJobProperty(
		limitOneJobWithMatchingParams: true,
		paramsToUseForLimit: 'identifier, buildId',
		throttleEnabled: true,
		throttleOption: 'project',
	),

OK, I did a local test using the "Throttle Concurrent Builds" plugin with this job config and found that it can stop concurrent jobs from being added to the Jenkins queue when the specified parameters match those of an already queued or running job.

If the build job is triggered by something like the trigger job, which uses waitForStart: true, then the trigger job will be stuck waiting to add it to the queue and will not start anything else. This should be fine: trigger is the only job that should be starting builds, so this would only happen if we had started a build manually.

tianon (Member) left a comment

Mostly cosmetics! 😄 ❤️

string(name: 'identifier', trim: true),
string(name: 'buildId', trim: true),
Member

Let's put the "required" parameter first and give the identifier a description so we remember what it is while triggering the job manually (also doubles as a "load-bearing" code comment 😄):

Suggested change
- string(name: 'identifier', trim: true),
- string(name: 'buildId', trim: true),
+ string(name: 'buildId', trim: true),
+ string(name: 'identifier', trim: true, description: '(optional) used to set <code>currentBuild.displayName</code> to a meaningful value earlier'),

//echo(json) // for debugging/data purposes
// list of closures that we can use to wait for the jobs on.
def waitQueue = [:]
def addToWait(identifier, buildId, externalizableId) {
Member

I recall that your original version of this function actually added the given parameters to the queue. Given the complications with that, IMO we should rename it to make clear that it only returns the closure and that adding it to the queue is still the caller's responsibility. Maybe something like this?

Suggested change
- def addToWait(identifier, buildId, externalizableId) {
+ def waitQueueClosure(identifier, buildId, externalizableId) {

Comment on lines 111 to 113
// "catchError" to set "stageResult" :(
catchError(message: 'Build of "' + identifier + '" failed', buildResult: 'UNSTABLE', stageResult: 'FAILURE') {
stage(identifier) {
Member

I know you tested and it should work in this order, but I think it's slightly clearer if we swap these two (both so we don't catch stage() itself somehow failing, and so it's clear that our stage runs and we then set its state from within rather than from outside):

Suggested change
- // "catchError" to set "stageResult" :(
- catchError(message: 'Build of "' + identifier + '" failed', buildResult: 'UNSTABLE', stageResult: 'FAILURE') {
- stage(identifier) {
+ stage(identifier) {
+ // "catchError" to set "stageResult" :(
+ catchError(message: 'Build of "' + identifier + '" failed', buildResult: 'UNSTABLE', stageResult: 'FAILURE') {

yosifkit (Member Author), Nov 26, 2024

Seems sane. I should also move the other one inside the stage(buildObj.identifier) on https://github.com/docker-library/meta-scripts/pull/99/files#diff-a1eafb3a911bf01d23222ee7808d97d5f227a9608c46843ae4908cdca3408234R139. That means I'll need to duplicate it for the else block, but that seems fine.


// stage to wrap up all the build job triggers that get waited on later
stage('trigger') {
for (int i = 0; i < queue.size(); i++) {
Member

Do you remember what the error was when you left this with the previous syntax, so we could note it in a comment?

Member Author

I got it from reading https://www.jenkins.io/doc/pipeline/examples/#jobs-in-parallel and https://www.jenkins.io/doc/pipeline/examples/#parallel-from-grep, but it seems the function is now enough to scope it correctly (tested locally), so I'll switch back to "for-in".
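The scoping point can be illustrated outside Jenkins. Below is a minimal sketch in plain Groovy, not code from the PR: the helper name `makeWaiter` and the identifiers in `queue` are made up for illustration. Because each call to the helper gets its own parameter binding, every closure keeps the identifier it was created with. Whether the same holds under Jenkins' pipeline interpreter is what was "tested locally" above; this sketch does not prove it.

```groovy
// Hypothetical helper (not from the PR): each call receives its own
// "identifier" binding, so the returned closure cannot see later loop values.
def makeWaiter(identifier) {
	return { -> 'waiting on ' + identifier }
}

// Illustrative identifiers only.
def queue = ['amd64', 'arm64v8', 's390x']

// Map of name -> closure, the same shape that "parallel" expects.
def waiters = [:]
for (identifier in queue) {
	waiters[identifier] = makeWaiter(identifier)
}

// Every closure still reports the identifier it was created with.
waiters.each { name, waiter ->
	assert waiter() == 'waiting on ' + name
}
```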

Comment on lines 177 to 179
string(name: 'identifier', value: buildObj.identifier),
string(name: 'buildId', value: buildObj.buildId),
Member

(same order swap)

Suggested change
- string(name: 'identifier', value: buildObj.identifier),
- string(name: 'buildId', value: buildObj.buildId),
+ string(name: 'buildId', value: buildObj.buildId),
+ string(name: 'identifier', value: buildObj.identifier),

Comment on lines 181 to 186
// trigger these quickly so they all get added to Jenkins queue in "queue" order
quietPeriod: 0, // seconds
// we'll wait on them after they are all queued
waitForStart: true,
Member

We can't have a quietPeriod on these (even though that would be nice) because it messes with waitForStart (and we would then be "captive" for the entire quiet period) -- I think we should probably note that here. 🤔

Maybe something like this?

Suggested change
- // trigger these quickly so they all get added to Jenkins queue in "queue" order
- quietPeriod: 0, // seconds
- // we'll wait on them after they are all queued
- waitForStart: true,
+ // trigger these quickly so they all get added to Jenkins queue in "queue" order (also using "waitForStart" means we have to wait for the entire "quietPeriod" before we get to move on and schedule more)
+ quietPeriod: 0, // seconds
+ // we'll wait on the builds in parallel after they are all queued (so our sorted order is the queue order)
+ waitForStart: true,

tianon (Member) left a comment

I think this is great 👍

I do think we should wait to merge it until after Thanksgiving at this point, though 👀 🦃

} else {
// "catchError" to set "stageResult" :(
catchError(message: 'Build of "' + buildObj.identifier + '" failed', buildResult: 'UNSTABLE', stageResult: 'FAILURE') {

Member

This blank line is extraneous, but it doesn't bother me to leave it in. 👍

Suggested change

}
}
}
}

// wait on all the 'build' jobs that were queued
if (waitQueue.size() > 0) {
parallel waitQueue
Member

I usually prefer to be explicit about the () in single-argument invocations like this (although I don't usually go as far as "naming" the parameter), but I don't think it really matters much either way. 👍

Suggested change
- parallel waitQueue
+ parallel(waitQueue)
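Putting the review suggestions together, the trigger-then-wait flow might look roughly like the scripted-pipeline sketch below. This is not the merged code. The job name `'build'`, the contents of `queue`, and the use of `waitForBuild` with `res.externalizableId` are assumptions made for illustration. The `quietPeriod` and `waitForStart` options, the `catchError` arguments, and the `waitQueueClosure` signature are taken from the diff excerpts in this thread. It assumes the Pipeline: Build Step plugin is installed.

```groovy
// Sketch under assumptions; not the merged implementation.
// Returns a closure only: the caller is responsible for adding it to waitQueue.
def waitQueueClosure(identifier, buildId, externalizableId) {
	return {
		stage(identifier) {
			// "catchError" to set "stageResult"
			catchError(message: 'Build of "' + identifier + '" failed', buildResult: 'UNSTABLE', stageResult: 'FAILURE') {
				// assumption: wait on the already-triggered build by its run ID,
				// and fail this stage if that build did not succeed
				waitForBuild(runId: externalizableId, propagate: true)
			}
		}
	}
}

// hypothetical queue entries; the real list is computed elsewhere
def queue = [
	[identifier: 'example-a', buildId: 'aaaa'],
	[identifier: 'example-b', buildId: 'bbbb'],
]

// closures that we can wait on later, keyed by identifier
def waitQueue = [:]

// stage to wrap up all the build job triggers that get waited on later
stage('trigger') {
	for (buildObj in queue) {
		def res = build(
			job: 'build', // hypothetical job name
			parameters: [
				string(name: 'buildId', value: buildObj.buildId),
				string(name: 'identifier', value: buildObj.identifier),
			],
			// trigger quickly so builds enter the Jenkins queue in "queue" order
			quietPeriod: 0, // seconds
			// we'll wait on the builds in parallel after they are all queued
			waitForStart: true,
		)
		waitQueue[buildObj.identifier] = waitQueueClosure(buildObj.identifier, buildObj.buildId, res.externalizableId)
	}
}

// wait on all the 'build' jobs that were queued
if (waitQueue.size() > 0) {
	parallel(waitQueue)
}
```

Keeping the trigger loop sequential and the waiting parallel means the order of the Jenkins queue follows the order of `queue`, while actual concurrency is left to the number of executors available per arch, as the PR description says.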

tianon merged commit 5978683 into docker-library:main on Dec 2, 2024
1 check passed
tianon deleted the parallel branch on December 2, 2024 19:48