Ozone 1.4.0 will not connect to OM on Kubernetes: Failure in resolving OM host address #6615
-
Thanks @RonaldHolex for bringing this up. cc @sadanand48
-
Thanks @RonaldHolex for the detailed description. HDDS-8041 was done to prevent endless retries when an invalid hostname is provided. The intention was to check the connection to the host, which is why ICMP looked like a good choice.
-
Hi,
When using Apache Ozone 1.4.0 or 1.3.0.7.2.18.0-273 from CDP (which I think is patched with HDDS-8041 [1]), the cluster starts fine with a single SCM and OM. However, clients are unable to connect, which was confusing because a curl call to the OM port returns an HTTP response.
It turns out that the PR [1] introduced a call to isReachable() [2] to test connectivity as a way of validating the settings.
However, it appears that this call is generally implemented using ICMP [3], which matches what I'm seeing with tcpdump, and the check fails with the error in [4] when starting Impala 4.3.0.
Is this a known issue or a configuration issue?
I think ICMP is never going to work in Kubernetes and might even fail in some strictly firewalled environments. The check could be improved by opening a TCP connection to the proper port.
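To make the suggestion concrete, here is a minimal sketch of a TCP-based check. It is not the Ozone code, just an illustration of the idea. The class and method names, the hostname `om-0.om`, and the port 9862 (which I believe is the default OM RPC port) are my assumptions; the 5000 ms timeout mirrors the value passed to isReachable() in [2].

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Ozone: checks reachability over TCP instead of ICMP.
final class TcpReachabilityCheck {

    // Returns true if a TCP connection to the given address can be opened within timeoutMs.
    public static boolean isTcpReachable(InetSocketAddress address, int timeoutMs) {
        if (address.isUnresolved()) {
            // The hostname could not be resolved, so there is nothing to connect to.
            return false;
        }
        try (Socket socket = new Socket()) {
            // connect() throws an IOException on refusal, timeout, or unreachable host.
            socket.connect(address, timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed values for illustration only: adjust host and port to your OM service.
        String host = args.length > 0 ? args[0] : "om-0.om";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9862;
        InetSocketAddress om = new InetSocketAddress(host, port);
        System.out.println("OM reachable over TCP: " + isTcpReachable(om, 5000));
    }
}
```

A check like this only needs the OM port to be open, so it should behave the same as the real client connection in Kubernetes or behind a firewall that drops ICMP. It still fails fast for an invalid hostname, which I understand was the goal of the original change.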
Kind regards,
[2]:
if (!omHostAddress.getAddress().isReachable(5000)) {
[3]:
A typical implementation will use ICMP ECHO REQUESTs if the privilege can be obtained, otherwise it will try to establish a TCP connection on port 7 (Echo) of the destination host.
[4]:
E0501 08:55:15.000583 1 OmUtils.java:851] Failure in resolving OM host address
Java exception follows: