Here you can find examples of using Katib with Tekton.
To deploy Tekton Pipelines v0.26.0, run the following command:
kubectl apply -f https://storage.googleapis.com/tekton-releases/pipeline/previous/v0.26.0/release.yaml
Check that Tekton Pipelines components are running:
$ kubectl get pods -n tekton-pipelines
NAME READY STATUS RESTARTS AGE
tekton-pipelines-controller-799cdc78fc-sm4vl 1/1 Running 0 50s
tekton-pipelines-webhook-79d8f4f9bc-qmk97 1/1 Running 0 50s
Note: You must modify the Tekton nop image to run Tekton Pipelines in Katib Trials. The nop image is used to stop sidecar containers after the main container has completed. Because Katib uses a Metrics Collector sidecar container, and the Tekton Pipelines Controller must not kill sidecar containers, you have to set the nop image to the Metrics Collector image.
For example, if you are using the StdOut Metrics Collector, the nop image must be equal to docker.io/kubeflowkatib/file-metrics-collector.
Run the following command to modify the nop image:
kubectl patch deploy tekton-pipelines-controller -n tekton-pipelines --type='json' \
-p='[{"op": "replace", "path": "/spec/template/spec/containers/0/args/9", "value": "docker.io/kubeflowkatib/file-metrics-collector"}]'
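The patch above addresses the nop image by its position (index 9) in the controller's args, which may differ between Tekton releases. As a minimal sketch, the helpers below look up that position instead of hard-coding it. The helper names `arg_value_index` and `nop_image_patch` are hypothetical, and the flag name `-nop-image` is an assumption that you should confirm against your controller's args.

```shell
# Print the zero-based index of the value that follows a flag in a
# newline-separated argument list read from stdin. The flag sits on
# 1-based line NR, so its value has zero-based index NR.
# Prints nothing if the flag is not found.
arg_value_index() {
  awk -v flag="$1" '$0 == flag { print NR; exit }'
}

# Build the JSON Patch that replaces the argument at index $1 with image $2.
nop_image_patch() {
  printf '[{"op": "replace", "path": "/spec/template/spec/containers/0/args/%s", "value": "%s"}]' "$1" "$2"
}

# Usage against a live cluster (not executed here; "-nop-image" is assumed):
#   idx=$(kubectl get deploy tekton-pipelines-controller -n tekton-pipelines \
#     -o jsonpath='{range .spec.template.spec.containers[0].args[*]}{@}{"\n"}{end}' \
#     | arg_value_index "-nop-image")
#   kubectl patch deploy tekton-pipelines-controller -n tekton-pipelines --type='json' \
#     -p="$(nop_image_patch "$idx" docker.io/kubeflowkatib/file-metrics-collector)"
```

If the lookup prints nothing, the flag was not found and the patch should not be applied.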
Check that Tekton Pipelines Controller's pod was restarted:
$ kubectl get pods -n tekton-pipelines
NAME READY STATUS RESTARTS AGE
tekton-pipelines-controller-7fcb6c6cd4-p8zf2 1/1 Running 0 2m2s
tekton-pipelines-webhook-7f9888f9b-7d6mr 1/1 Running 0 3m
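Instead of polling the pod list by hand, you could block until the Deployment finishes rolling out. This is a small sketch using `kubectl rollout status`; the wrapper name `wait_for_rollout` and the 120s default timeout are assumptions chosen for illustration.

```shell
# Wait until the named Deployment has finished rolling out.
# $1: Deployment name, $2: namespace, $3: optional timeout (default 120s, an
# arbitrary value picked for this sketch).
wait_for_rollout() {
  kubectl rollout status "deployment/$1" -n "$2" --timeout="${3:-120s}"
}

# Usage against a live cluster (not executed here):
#   wait_for_rollout tekton-pipelines-controller tekton-pipelines
```

The command exits with a non-zero status if the rollout does not complete within the timeout, which makes it convenient in scripts.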
Verify that the nop image was modified:
$ kubectl get $(kubectl get pods -o name -n tekton-pipelines | grep tekton-pipelines-controller) -n tekton-pipelines -o yaml | grep katib
- docker.io/kubeflowkatib/file-metrics-collector
To run Tekton Pipelines within Katib Trials, you have to update the Katib ClusterRole's rules with the appropriate permissions:
- apiGroups:
    - tekton.dev
  resources:
    - pipelineruns
    - taskruns
  verbs:
    - "*"
Run the following command to update Katib ClusterRole:
kubectl patch ClusterRole katib-controller -n kubeflow --type=json \
-p='[{"op": "add", "path": "/rules/-", "value": {"apiGroups":["tekton.dev"],"resources":["pipelineruns", "taskruns"],"verbs":["*"]}}]'
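To confirm that the new rule took effect, one option is to ask the API server whether the Katib controller may act on the Tekton resources. The sketch below assumes the controller runs as the service account `katib-controller` in the `kubeflow` namespace; the wrapper name `katib_can_i` is hypothetical.

```shell
# Ask the API server whether the (assumed) Katib controller service account
# may perform verb $1 on resource $2 in the current namespace.
katib_can_i() {
  kubectl auth can-i "$1" "$2" \
    --as="system:serviceaccount:kubeflow:katib-controller"
}

# Usage against a live cluster (not executed here):
#   katib_can_i create pipelineruns.tekton.dev
#   katib_can_i get taskruns.tekton.dev
```

Impersonating a service account with `--as` requires that your own user is allowed to impersonate it.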
In addition, you have to add the new flag --trial-resources to the Katib Controller args.
Run the following command to update Katib Controller args:
kubectl patch Deployment katib-controller -n kubeflow --type=json \
-p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--trial-resources=PipelineRun.v1beta1.tekton.dev"}]'
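The flag value in the command above has the form Kind.version.group. As an illustrative sketch, the hypothetical helpers below compose that value and the matching JSON Patch from their parts, which may help when adapting the command to another resource. Both helper names are assumptions and are not part of Katib.

```shell
# Compose the flag in the Kind.version.group form used above.
# $1: Kind, $2: API version, $3: API group.
trial_resources_flag() {
  printf '%s=%s.%s.%s' "--trial-resources" "$1" "$2" "$3"
}

# Wrap the flag in a JSON Patch that appends it to the first container's args.
trial_resources_patch() {
  printf '[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "%s"}]' \
    "$(trial_resources_flag "$1" "$2" "$3")"
}

# Usage against a live cluster (not executed here):
#   kubectl patch Deployment katib-controller -n kubeflow --type=json \
#     -p="$(trial_resources_patch PipelineRun v1beta1 tekton.dev)"
```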
Check that Katib Controller's pod was restarted:
$ kubectl get pods -n kubeflow
NAME READY STATUS RESTARTS AGE
katib-cert-generator-hnv6q 0/1 Completed 0 6m12s
katib-controller-784994d449-9bgj9 1/1 Running 0 28s
katib-db-manager-78697c7bd4-ck7l8 1/1 Running 0 6m13s
katib-mysql-854cdb87c4-krcm9 1/1 Running 0 6m13s
katib-ui-57b9d7f6dd-cv6gn 1/1 Running 0 6m13s
Check logs from Katib Controller to verify Tekton Pipelines integration:
$ kubectl logs $(kubectl get pods -n kubeflow -o name | grep katib-controller) -n kubeflow | grep '"CRD Kind":"PipelineRun"'
{"level":"info","ts":1628032648.6285546,"logger":"trial-controller","msg":"Job watch added successfully","CRD Group":"tekton.dev","CRD Version":"v1beta1","CRD Kind":"PipelineRun"}
If you ran the above steps successfully, you should be able to run Tekton Pipelines examples.
Learn more about using a custom Kubernetes resource as a Trial template in the official Kubeflow guides.