medusa backup-cluster fails in k8s environment #842

Open
sabbir-hossain70 opened this issue Jan 7, 2025 · 0 comments

I tried to take a full cluster backup using cassandra-medusa, but it failed with the following errors:

[2025-01-07 09:07:10,392] ERROR: Error connecting to host 'cassandra-medusa-1:22' - retry 1/3
[2025-01-07 09:07:15,397] ERROR: Error connecting to host 'cassandra-medusa-1:22' - retry 2/3
[2025-01-07 09:07:16,399] ERROR: Could not resolve host 'cassandra-medusa-0' - retry 1/3
[2025-01-07 09:07:20,399] ERROR: Error connecting to host 'cassandra-medusa-1:22' - retry 3/3
[2025-01-07 09:07:20,400] ERROR: Failed to run on host cassandra-medusa-1 - ("Error connecting to host '%s:%s' - %s - retry %s/%s", 'cassandra-medusa-1', 22, 'Connection refused', 3, 3)
[2025-01-07 09:07:27,406] ERROR: Could not resolve host 'cassandra-medusa-0' - retry 2/3
[2025-01-07 09:07:38,416] ERROR: Could not resolve host 'cassandra-medusa-0' - retry 3/3
[2025-01-07 09:07:38,416] ERROR: Failed to run on host cassandra-medusa-0 - ('Unknown host %s - %s - retry %s/%s', 'cassandra-medusa-0', 'Temporary failure in name resolution', 3, 3)
[2025-01-07 09:07:38,418] ERROR: This error happened during the cluster backup: ("Error connecting to host '%s:%s' - %s - retry %s/%s", 'cassandra-medusa-1', 22, 'Connection refused', 3, 3)
Traceback (most recent call last):
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/single.py", line 141, in _connect
self.sock.connect((host, port))
File "/home/cassandra/.venv/lib/python3.10/site-packages/gevent/_socketcommon.py", line 590, in connect
self._internal_connect(address)
File "/home/cassandra/.venv/lib/python3.10/site-packages/gevent/_socketcommon.py", line 634, in _internal_connect
raise _SocketError(err, strerror(err))
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/single.py", line 141, in _connect
self.sock.connect((host, port))
File "/home/cassandra/.venv/lib/python3.10/site-packages/gevent/_socketcommon.py", line 590, in connect
self._internal_connect(address)
File "/home/cassandra/.venv/lib/python3.10/site-packages/gevent/_socketcommon.py", line 634, in _internal_connect
raise _SocketError(err, strerror(err))
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/single.py", line 141, in _connect
self.sock.connect((host, port))
File "/home/cassandra/.venv/lib/python3.10/site-packages/gevent/_socketcommon.py", line 590, in connect
self._internal_connect(address)
File "/home/cassandra/.venv/lib/python3.10/site-packages/gevent/_socketcommon.py", line 634, in _internal_connect
raise _SocketError(err, strerror(err))
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/cassandra/medusa/backup_cluster.py", line 71, in orchestrate
backup.execute(cql_session_provider)
File "/home/cassandra/medusa/backup_cluster.py", line 153, in execute
self._create_snapshots()
File "/home/cassandra/medusa/backup_cluster.py", line 162, in _create_snapshots
pssh_run_success = self.orchestration_snapshots.
File "/home/cassandra/medusa/orchestration.py", line 90, in pssh_run
output = client.run_command(command, host_args=hosts_variables, use_pty=use_pty, shell=shell,
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/native/parallel.py", line 216, in run_command
return BaseParallelSSHClient.run_command(
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/parallel.py", line 106, in run_command
return self._get_output_from_cmds(cmds, raise_error=stop_on_errors,
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/parallel.py", line 113, in _get_output_from_cmds
finished = joinall(_cmds, raise_error=True)
File "src/gevent/greenlet.py", line 1065, in gevent._gevent_cgreenlet.joinall
File "src/gevent/greenlet.py", line 1081, in gevent._gevent_cgreenlet.joinall
File "src/gevent/greenlet.py", line 373, in gevent._gevent_cgreenlet.Greenlet._raise_exception
File "/home/cassandra/.venv/lib/python3.10/site-packages/gevent/_compat.py", line 49, in reraise
raise value.with_traceback(tb)
File "src/gevent/greenlet.py", line 908, in gevent._gevent_cgreenlet.Greenlet.run
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/parallel.py", line 125, in _get_output_from_greenlet
raise ex
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/parallel.py", line 118, in _get_output_from_greenlet
host_out = cmd.get()
File "src/gevent/greenlet.py", line 805, in gevent._gevent_cgreenlet.Greenlet.get
File "src/gevent/greenlet.py", line 373, in gevent._gevent_cgreenlet.Greenlet._raise_exception
File "/home/cassandra/.venv/lib/python3.10/site-packages/gevent/_compat.py", line 49, in reraise
raise value.with_traceback(tb)
File "src/gevent/greenlet.py", line 908, in gevent._gevent_cgreenlet.Greenlet.run
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/parallel.py", line 228, in _run_command
raise ex
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/parallel.py", line 220, in _run_command
_client = self._make_ssh_client(host_i, host)
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/native/parallel.py", line 242, in _make_ssh_client
_client = SSHClient(
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/native/single.py", line 128, in __init__
super(SSHClient, self).__init__(
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/single.py", line 84, in __init__
self._init()
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/single.py", line 87, in _init
self._connect(self._host, self._port)
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/single.py", line 159, in _connect
return self._connect(host, port, retries=retries+1)
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/single.py", line 159, in _connect
return self._connect(host, port, retries=retries+1)
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/single.py", line 167, in _connect
raise ex
pssh.exceptions.ConnectionError: ("Error connecting to host '%s:%s' - %s - retry %s/%s", 'cassandra-medusa-1', 22, 'Connection refused', 3, 3)
[2025-01-07 09:07:38,420] ERROR: Something went wrong! Attempting to clean snapshots and exit.
[2025-01-07 09:07:38,420] INFO: Executing "nodetool -Dcom.sun.jndi.rmiURLParsing=legacy clearsnapshot -t medusa-demo" on following nodes ['cassandra-medusa-1', 'cassandra-medusa-0'] with a parallelism/pool size of 1
[2025-01-07 09:07:38,421] ERROR: Error connecting to host 'cassandra-medusa-1:22' - retry 1/3
[2025-01-07 09:07:43,426] ERROR: Error connecting to host 'cassandra-medusa-1:22' - retry 2/3
[2025-01-07 09:07:48,427] ERROR: Error connecting to host 'cassandra-medusa-1:22' - retry 3/3
[2025-01-07 09:07:48,427] ERROR: Failed to run on host cassandra-medusa-1 - ("Error connecting to host '%s:%s' - %s - retry %s/%s", 'cassandra-medusa-1', 22, 'Connection refused', 3, 3)
Traceback (most recent call last):
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/single.py", line 141, in _connect
self.sock.connect((host, port))
File "/home/cassandra/.venv/lib/python3.10/site-packages/gevent/_socketcommon.py", line 590, in connect
self._internal_connect(address)
File "/home/cassandra/.venv/lib/python3.10/site-packages/gevent/_socketcommon.py", line 634, in _internal_connect
raise _SocketError(err, strerror(err))
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/single.py", line 141, in _connect
self.sock.connect((host, port))
File "/home/cassandra/.venv/lib/python3.10/site-packages/gevent/_socketcommon.py", line 590, in connect
self._internal_connect(address)
File "/home/cassandra/.venv/lib/python3.10/site-packages/gevent/_socketcommon.py", line 634, in _internal_connect
raise _SocketError(err, strerror(err))
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/single.py", line 141, in _connect
self.sock.connect((host, port))
File "/home/cassandra/.venv/lib/python3.10/site-packages/gevent/_socketcommon.py", line 590, in connect
self._internal_connect(address)
File "/home/cassandra/.venv/lib/python3.10/site-packages/gevent/_socketcommon.py", line 634, in _internal_connect
raise _SocketError(err, strerror(err))
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/cassandra/medusa/backup_cluster.py", line 71, in orchestrate
backup.execute(cql_session_provider)
File "/home/cassandra/medusa/backup_cluster.py", line 153, in execute
self._create_snapshots()
File "/home/cassandra/medusa/backup_cluster.py", line 162, in _create_snapshots
pssh_run_success = self.orchestration_snapshots.
File "/home/cassandra/medusa/orchestration.py", line 90, in pssh_run
output = client.run_command(command, host_args=hosts_variables, use_pty=use_pty, shell=shell,
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/native/parallel.py", line 216, in run_command
return BaseParallelSSHClient.run_command(
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/parallel.py", line 106, in run_command
return self._get_output_from_cmds(cmds, raise_error=stop_on_errors,
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/parallel.py", line 113, in _get_output_from_cmds
finished = joinall(_cmds, raise_error=True)
File "src/gevent/greenlet.py", line 1065, in gevent._gevent_cgreenlet.joinall
File "src/gevent/greenlet.py", line 1081, in gevent._gevent_cgreenlet.joinall
File "src/gevent/greenlet.py", line 373, in gevent._gevent_cgreenlet.Greenlet._raise_exception
File "/home/cassandra/.venv/lib/python3.10/site-packages/gevent/_compat.py", line 49, in reraise
raise value.with_traceback(tb)
File "src/gevent/greenlet.py", line 908, in gevent._gevent_cgreenlet.Greenlet.run
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/parallel.py", line 125, in _get_output_from_greenlet
raise ex
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/parallel.py", line 118, in _get_output_from_greenlet
host_out = cmd.get()
File "src/gevent/greenlet.py", line 805, in gevent._gevent_cgreenlet.Greenlet.get
File "src/gevent/greenlet.py", line 373, in gevent._gevent_cgreenlet.Greenlet._raise_exception
File "/home/cassandra/.venv/lib/python3.10/site-packages/gevent/_compat.py", line 49, in reraise
raise value.with_traceback(tb)
File "src/gevent/greenlet.py", line 908, in gevent._gevent_cgreenlet.Greenlet.run
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/parallel.py", line 228, in _run_command
raise ex
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/parallel.py", line 220, in _run_command
_client = self._make_ssh_client(host_i, host)
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/native/parallel.py", line 242, in _make_ssh_client
_client = SSHClient(
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/native/single.py", line 128, in __init__
super(SSHClient, self).__init__(
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/single.py", line 84, in __init__
self._init()
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/single.py", line 87, in _init
self._connect(self._host, self._port)
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/single.py", line 159, in _connect
return self._connect(host, port, retries=retries+1)
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/single.py", line 159, in _connect
return self._connect(host, port, retries=retries+1)
File "/home/cassandra/.venv/lib/python3.10/site-packages/pssh/clients/base/single.py", line 167, in _connect
raise ex
pssh.exceptions.ConnectionError: ("Error connecting to host '%s:%s' - %s - retry %s/%s", 'cassandra-medusa-1', 22, 'Connection refused', 3, 3)

Backing up a single node and restoring it works properly. I only get this error with the cluster backup.
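The log shows two distinct failure modes when the cluster backup reaches the other nodes over SSH (via pssh): 'Connection refused' on port 22 for cassandra-medusa-1, and 'Temporary failure in name resolution' for cassandra-medusa-0. To tell these apart independently of Medusa, something like the following could be run from inside the pod that issues the backup. This is only a rough sketch using the Python standard library; `probe_host` and its return labels are hypothetical names made up for this example and are not part of Medusa or pssh.

```python
import socket


def probe_host(host, port=22, timeout=3.0):
    """Classify a host:port as 'unresolved', 'refused', 'unreachable' or 'open'.

    Hypothetical helper for illustration only; not part of Medusa or pssh.
    """
    try:
        # Same kind of lookup the SSH client needs before it can connect.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # Corresponds to "Could not resolve host ..." in the log.
        return "unresolved"

    family, socktype, proto, _canonname, sockaddr = infos[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(sockaddr)
        except ConnectionRefusedError:
            # Corresponds to "[Errno 111] Connection refused" in the log:
            # the name resolves, but nothing is listening on that port.
            return "refused"
        except OSError:
            # Timeouts, no route to host, and similar errors.
            return "unreachable"
    return "open"


# Example use, with the host names taken from the log above:
# for host in ("cassandra-medusa-0", "cassandra-medusa-1"):
#     print(host, probe_host(host))
```

A result of "refused" would suggest that the name resolves but no SSH server is listening on port 22 of that pod, while "unresolved" would point at DNS for the bare pod name rather than at SSH.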

┆Issue is synchronized with this Jira Story by Unito
┆Issue Number: MED-119
