Summary
Developing an app that runs in Kubernetes is not easy. Managing a Kubernetes cluster is even harder. It takes years of experience to understand how Kubernetes works, how to read logs from its different components, and where to start when some part of your cluster is not working.
This project aims to create a simple tool that runs diagnostics and advises you on where to direct your troubleshooting in ops scenarios.
Goals
A handy ops tool for troubleshooting Kubernetes and apps in it
Non-Goals
Deep integration with the app development flow
Debugging Kubernetes itself
User Experience
At an early stage, it should be a CLI tool with minimal dependencies.
Check subcommand
The check subcommand runs specific check suites.
For example, the following command runs the DNS, HTTP, Kubernetes, and app check suites:
kdebug check -s dns,http,kube,app
It generates a report after checks complete.
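The suite selection could be parsed along the following lines. This is only a minimal sketch in Python, not the project's implementation: the `check` subcommand, the `-s` flag, and the suite names `dns`, `http`, `kube`, and `app` come from the command above, while the `--suites` long form, the `parse_suites` and `build_parser` helpers, the rejection of unknown names, and the default of running every suite are assumptions made for illustration.

```python
import argparse

# Suite names taken from the example command above.
KNOWN_SUITES = ("dns", "http", "kube", "app")


def parse_suites(value):
    """Split a comma-separated suite list and reject unknown names (assumed behavior)."""
    names = [name.strip() for name in value.split(",") if name.strip()]
    unknown = [name for name in names if name not in KNOWN_SUITES]
    if unknown:
        raise argparse.ArgumentTypeError(
            "unknown check suite(s): " + ", ".join(unknown)
        )
    # Drop duplicates while keeping the order the user asked for.
    return list(dict.fromkeys(names))


def build_parser():
    """Build a hypothetical `kdebug` parser with a single `check` subcommand."""
    parser = argparse.ArgumentParser(prog="kdebug")
    subcommands = parser.add_subparsers(dest="command", required=True)
    check = subcommands.add_parser("check", help="run check suites")
    check.add_argument(
        "-s",
        "--suites",  # long form is an assumption, only -s appears in the proposal
        type=parse_suites,
        default=list(KNOWN_SUITES),  # assumption: run everything when -s is omitted
        help="comma-separated check suites to run",
    )
    return parser


# Equivalent to running: kdebug check -s dns,http
args = build_parser().parse_args(["check", "-s", "dns,http"])
print("suites to run:", ", ".join(args.suites))
```

Validating names up front means a typo such as `-s dsn` fails immediately instead of silently running nothing.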
An example of a healthy report:
* DNS
=> [OK] System DNS
=> [OK] In-cluster CoreDNS
=> [OK] Azure DNS
=> [OK] Google DNS
* HTTP
=> [OK] Connectivity to kube-apiserver
=> [OK] Connectivity to google.com
* Kubernetes
=> [OK] Kubelet is running
* Apps
=> [OK] All pods are running
All OK.
An example of an unhealthy report:
* DNS
=> [OK] System DNS
=> [Fail] In-cluster CoreDNS
=> [OK] Azure DNS
=> [OK] Google DNS
* HTTP
=> [OK] Connectivity to kube-apiserver
=> [OK] Connectivity to google.com
* Kubernetes
=> [Fail] Kubelet liveness
* Apps
=> [Fail] Pods CrashLoopBackOff
kdebug has detected these problems for you:
----------
Checker: In-cluster CoreDNS
Error: Time-out
Description: In-cluster CoreDNS query failed. Check if CoreDNS pods are running.
Recommendations:
Check CoreDNS pods using the command `kubectl get pods -o wide -n kube-system | grep coredns`
Help links:
https://example.com
----------
Checker: Kubelet
Error: systemd service kubelet is not running
Description: Systemd service kubelet is not running. It has crashed 300 times in the last 1h.
Logs:
[I] xxx
[I] yyy
[F] cgroup is invalid.
...
Recommendations:
Use `systemctl status kubelet` to check its status.
Use `journalctl -r -u kubelet` to see full logs.
Reboot machine.
Help links:
https://foo.com
https://bar.com
----------
Checker: App
Error: Pod default/xxx is in CrashLoopBackOff state
Description: Pod is crashing. Last exit reason is OOM.
Recommendations:
Increase the pod memory limit. Current limit is 100MB.
Check for a potential memory leak in your app.
Help links:
https://foo.com
https://bar.com
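The report layout shown above suggests a small record per checker plus a renderer. The following Python sketch is one possible shape, under stated assumptions: the field labels (Checker, Error, Description, Recommendations, Help links), the `[OK]`/`[Fail]` markers, the `----------` separator, and the two summary sentences are taken from the example reports, while the `CheckResult` class, the `render_report` function, and the choice to reuse one checker name in both the summary and the detail block are hypothetical. The optional Logs field is left out to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Dict, List

SEPARATOR = "-" * 10  # matches the "----------" lines in the example report


@dataclass
class CheckResult:
    """Hypothetical record for the outcome of one checker."""
    checker: str
    ok: bool = True
    error: str = ""
    description: str = ""
    recommendations: List[str] = field(default_factory=list)
    help_links: List[str] = field(default_factory=list)


def render_report(suites: Dict[str, List[CheckResult]]) -> str:
    """Render a per-suite summary, then a detail block for each failed checker."""
    lines: List[str] = []
    failed: List[CheckResult] = []
    for suite_name, results in suites.items():
        lines.append(f"* {suite_name}")
        for result in results:
            status = "OK" if result.ok else "Fail"
            lines.append(f"=> [{status}] {result.checker}")
            if not result.ok:
                failed.append(result)
    if not failed:
        lines.append("All OK.")
        return "\n".join(lines)
    lines.append("kdebug has detected these problems for you:")
    for result in failed:
        lines.append(SEPARATOR)
        lines.append(f"Checker: {result.checker}")
        lines.append(f"Error: {result.error}")
        lines.append(f"Description: {result.description}")
        if result.recommendations:
            lines.append("Recommendations:")
            lines.extend(result.recommendations)
        if result.help_links:
            lines.append("Help links:")
            lines.extend(result.help_links)
    return "\n".join(lines)


report = render_report({
    "DNS": [
        CheckResult("System DNS"),
        CheckResult(
            "In-cluster CoreDNS",
            ok=False,
            error="Time-out",
            description="In-cluster CoreDNS query failed. Check if CoreDNS pods are running.",
            recommendations=["Check CoreDNS pods using kubectl."],
            help_links=["https://example.com"],
        ),
    ],
})
print(report)
```

Keeping the result record separate from the renderer would let the same data feed other output formats later, though the proposal itself only describes the plain-text report.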