This repository has been archived by the owner on Apr 24, 2023. It is now read-only.
We are facing an issue in our environment where Spark pods intermittently get stuck in the Pending state. We have to restart the Spark scheduler pods to fix it.
We are seeing the errors below in the spark-scheduler-extender logs, but we are not sure whether they are related to the issue.
Looking for some pointers to explain this odd behaviour.
"stacktrace": "error when looking for already bound reservations\nfailed to get resource reservations podName:agg-spark-350zvn28en0u-b29f74875b02ba23-exec-1, podNamespace:prod01\n\ngithub.com/palantir/k8s-spark-scheduler/internal/extender.(*ResourceReservationManager).FindAlreadyBoundReservationNode\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/internal/extender/resourcereservations.go:141\ngithub.com/palantir/k8s-spark-scheduler/internal/extender.(*SparkSchedulerExtender).selectExecutorNode\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/internal/extender/resource.go:382\ngithub.com/palantir/k8s-spark-scheduler/internal/extender.(*SparkSchedulerExtender).selectNode\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/internal/extender/resource.go:210\ngithub.com/palantir/k8s-spark-scheduler/internal/extender.(*SparkSchedulerExtender).Predicate\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/internal/extender/resource.go:151\ngithub.com/palantir/k8s-spark-scheduler/cmd.registerExtenderEndpoints.func1\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/cmd/endpoints.go:36\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2109\ngithub.com/palantir/witchcraft-go-server/wrouter.(*rootRouter).Register.func1.1\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/palantir/witchcraft-go-server/wrouter/router_root.go:136\ngithub.com/palantir/witchcraft-go-server/witchcraft/internal/middleware.NewRouteLogTraceSpan.func1\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/palantir/witchcraft-go-server/witchcraft/internal/middleware/route.go:107\ngithub.com/palantir/witchcraft-go-server/wrouter.(*routeRequestHandlerWithNext).HandleRequest\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/palantir/witchcraft-go-server/wrouter/router_root.go:150\ngithub.com/palantir/witchcraft-go-server/wi
tchcraft/internal/middleware.NewRouteRequestLog.func1\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/palantir/witchcraft-go-server/witchcraft/internal/middleware/route.go:32\ngithub.com/palantir/witchcraft-go-server/wrouter.(*routeRequestHandlerWithNext).HandleRequest\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/palantir/witchcraft-go-server/wrouter/router_root.go:150\ngithub.com/palantir/witchcraft-go-server/witchcraft/internal/middleware.NewRequestMetricRequestMeter.func1\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/palantir/witchcraft-go-server/witchcraft/internal/middleware/request.go:168\ngithub.com/palantir/witchcraft-go-server/wrouter.(*routeRequestHandlerWithNext).HandleRequest\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/palantir/witchcraft-go-server/wrouter/router_root.go:150\ngithub.com/palantir/witchcraft-go-server/wrouter.(*rootRouter).Register.func1\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/palantir/witchcraft-go-server/wrouter/router_root.go:139\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2109\ngithub.com/julienschmidt/httprouter.(*Router).Handler.func1\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/julienschmidt/httprouter/router.go:275\ngithub.com/julienschmidt/httprouter.(*Router).ServeHTTP\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/julienschmidt/httprouter/router.go:387\ngithub.com/palantir/witchcraft-go-server/wrouter/whttprouter.(*router).ServeHTTP\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/palantir/witchcraft-go-server/wrouter/whttprouter/routerimpl.go:71\ngithub.com/palantir/witchcraft-go-server/witchcraft/internal/middleware.NewRequestExtractIDs.func1\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-schedule
r/vendor/github.com/palantir/witchcraft-go-server/witchcraft/internal/middleware/request.go:139\ngithub.com/palantir/witchcraft-go-server/wrouter.(*requestHandlerWithNext).ServeHTTP\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/palantir/witchcraft-go-server/wrouter/router_root.go:250\ngithub.com/palantir/witchcraft-go-server/witchcraft/internal/middleware.NewRequestContextLoggers.func1\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/palantir/witchcraft-go-server/witchcraft/internal/middleware/request.go:73\ngithub.com/palantir/witchcraft-go-server/wrouter.(*requestHandlerWithNext).ServeHTTP\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/palantir/witchcraft-go-server/wrouter/router_root.go:250\ngithub.com/palantir/witchcraft-go-server/witchcraft/internal/middleware.NewRequestContextMetricsRegistry.func1\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/palantir/witchcraft-go-server/witchcraft/internal/middleware/request.go:84\ngithub.com/palantir/witchcraft-go-server/wrouter.(*requestHandlerWithNext).ServeHTTP\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/palantir/witchcraft-go-server/wrouter/router_root.go:250\ngithub.com/palantir/witchcraft-go-server/witchcraft/internal/middleware.NewRequestPanicRecovery.func1.1\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/palantir/witchcraft-go-server/witchcraft/internal/middleware/request.go:42\ngithub.com/palantir/witchcraft-go-server/witchcraft/internal/negroni.(*Recovery).ServeHTTP\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/palantir/witchcraft-go-server/witchcraft/internal/negroni/recovery.go:193\ngithub.com/palantir/witchcraft-go-server/witchcraft/internal/middleware.NewRequestPanicRecovery.func1\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.
com/palantir/witchcraft-go-server/witchcraft/internal/middleware/request.go:41\ngithub.com/palantir/witchcraft-go-server/wrouter.(*requestHandlerWithNext).ServeHTTP\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/palantir/witchcraft-go-server/wrouter/router_root.go:250\ngithub.com/palantir/witchcraft-go-server/wrouter.(*rootRouter).ServeHTTP\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/vendor/github.com/palantir/witchcraft-go-server/wrouter/router_root.go:103\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2947\nnet/http.initALPNRequest.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:3556\nnet/http.(*http2serverConn).runHandler\n\t/usr/local/go/src/net/http/h2_bundle.go:5910",
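For anyone trying to read the trace above: it is a JSON-escaped string, so the frames are hard to follow as pasted. Here is a minimal sketch for unescaping it into one frame per line. The helper name `decode_stacktrace` is hypothetical, and the sketch assumes the `"stacktrace": "..."` fragment has first been joined back onto a single line; the sample is just the first few frames from the log above.

```python
import json

def decode_stacktrace(fragment: str) -> list[str]:
    """Turn a `"stacktrace": "..."` log fragment into readable lines."""
    # Wrap the fragment in braces so it parses as a JSON object, and
    # strip the trailing comma left over from the surrounding log entry.
    obj = json.loads("{" + fragment.strip().rstrip(",") + "}")
    # json.loads has already turned the \n and \t escapes into real
    # characters; split on newlines and expand tabs for display.
    return [line.replace("\t", "    ") for line in obj["stacktrace"].split("\n")]

# Shortened sample copied from the log entry above (raw string keeps the escapes).
sample = r'"stacktrace": "error when looking for already bound reservations\nfailed to get resource reservations podName:agg-spark-350zvn28en0u-b29f74875b02ba23-exec-1, podNamespace:prod01\n\ngithub.com/palantir/k8s-spark-scheduler/internal/extender.(*ResourceReservationManager).FindAlreadyBoundReservationNode\n\t/home/circleci/go/src/github.com/palantir/k8s-spark-scheduler/internal/extender/resourcereservations.go:141",'

lines = decode_stacktrace(sample)
print("\n".join(lines))
```

Decoded this way, the first two lines carry the actual error message and the pod name and namespace; everything after the blank line is the call stack, starting at `FindAlreadyBoundReservationNode`.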