Prometheus adapter not able to extract GPU metrics, getting "apiserver was unable to write a JSON response: http2: stream closed" #677

Open
Vijaygawate opened this issue Aug 30, 2024 · 1 comment
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.

@Vijaygawate

I am trying to extract a custom GPU metric using the Prometheus adapter, but when I run the command below I get an error:

kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq -r . | grep DCGM_FI_DEV_GPU_UTIL

Error from server (NotFound): the server could not find the metric DCGM_FI_DEV_GPU_UTIL pods
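
For reference, the discovery endpoint queried above and the per-pod path that appears in the adapter log (`.../namespaces/default/pods/%2A/DCGM_FI_DEV_GPU_UTIL`, where `%2A` is an encoded `*`) are two different requests. The following is a minimal sketch for checking each one separately. The helper names `custom_metric_path` and `metric_is_advertised` are hypothetical and are not part of kubectl or the adapter. The sketch also assumes the discovery response lists pod metrics under names of the form `pods/<metric>`.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of kubectl or the adapter.

API_ROOT="/apis/custom.metrics.k8s.io/v1beta1"

# Build the query path for a metric across all pods in a namespace.
# The "*" wildcard is what shows up URL-encoded as %2A in the adapter log.
custom_metric_path() {
  local namespace="$1" metric="$2"
  printf '%s/namespaces/%s/pods/*/%s\n' "$API_ROOT" "$namespace" "$metric"
}

# Return 0 if the discovery JSON read from stdin lists the metric for pods.
# Assumption: entries look like "name":"pods/<metric>".
metric_is_advertised() {
  local metric="$1"
  grep -Eq "\"name\":[[:space:]]*\"pods/${metric}\""
}

# Usage against a live cluster (not executed here):
#   kubectl get --raw "$API_ROOT" | metric_is_advertised DCGM_FI_DEV_GPU_UTIL && echo advertised
#   kubectl get --raw "$(custom_metric_path default DCGM_FI_DEV_GPU_UTIL)"
```

If the first check fails, the adapter is not advertising the metric at all. If it passes but the second request still returns NotFound, the metric is advertised but the adapter found no data for pods in that namespace.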

I then checked the Prometheus adapter logs and found the following entries:

I0830 06:59:41.308791 1 httplog.go:132] "HTTP" verb="GET" URI="/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%2A/DCGM_FI_DEV_GPU_UTIL" latency="6.829796ms" userAgent="kubectl/v1.27.3 (linux/amd64) kubernetes/25b4e43" audit-ID="868c4e60-baee-4213-af4a-eab17b882e46" srcIP="10.1.108.126:53018" resp=404
E0830 06:59:43.515770 1 writers.go:122] apiserver was unable to write a JSON response: http2: stream closed
E0830 06:59:43.515798 1 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"http2: stream closed"}: http2: stream closed

EKS version: 1.30
Prometheus adapter version: v0.12.0

Any help would be appreciated.
Thanks!

@Vijaygawate Vijaygawate added the kind/bug Categorizes issue or PR as related to a bug. label Aug 30, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Aug 30, 2024
@dashpole

dashpole commented Sep 5, 2024

/triage accepted
This repo doesn't currently have many people with the bandwidth to investigate issues. Sorry if the response is slow.

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 5, 2024