Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can not get response from LwM2M clients when using CoAP+TCP (java-coap) #1675

Open
max-pv opened this issue Nov 20, 2024 · 5 comments
Open
Labels
bug Dysfunctionnal behavior

Comments

@max-pv
Copy link

max-pv commented Nov 20, 2024

Version(s)

2.0.0-M16

Which components

leshan-lwm2m-server

Tested With

Leshan Client Demo, Anjay Demo

What happened

When running a Leshan demo a server with a RegistrationListener from Wiki and using Leshan Client (or another LwM2M client) with CoAP+TCP server address, the server timeouts and throws the error from the "relevant output" section.

It works fine when using plain CoAP provider (just coap:// scheme)

How to reproduce

  1. Copy the changes from this diff to LeshanDemoServer.java
  2. Run mvn clean install
  3. Run the Leshan server: java -jar ./leshan-demo-server/target/leshan-demo-server-*-SNAPSHOT-jar-with-dependencies.jar -vvv
  4. Run the Leshan client with CoAP+TCP scheme (in a separate terminal session): java -jar leshan-demo-client/target/leshan-demo-client-*-SNAPSHOT-jar-with-dependencies.jar -u coap+tcp://localhost:5683 -n tcp-device-1
  5. Wait for some time, the server will not get a response from client
  6. After around 2-3 minutes, the server terminal will show the error from "relevant output"

Relevant Output

2024-11-19 15:32:51,172 LeshanServerDemo     [ERROR] Unexpected error while reading device time  
java.lang.NullPointerException: Cannot invoke "org.eclipse.leshan.core.response.ReadResponse.isSuccess()" because "response" is null
	at org.eclipse.leshan.demo.server.LeshanServerDemo$1.registered(LeshanServerDemo.java:150)
	at org.eclipse.leshan.server.registration.RegistrationServiceImpl.fireRegistered(RegistrationServiceImpl.java:74)
	at org.eclipse.leshan.server.registration.RegistrationHandler$1.run(RegistrationHandler.java:126)
	at org.eclipse.leshan.core.response.SendableResponse.sent(SendableResponse.java:47)
	at org.eclipse.leshan.transport.javacoap.server.resource.RegistrationResource.handleRegister(RegistrationResource.java:171)
	at org.eclipse.leshan.transport.javacoap.server.resource.RegistrationResource.handlePOST(RegistrationResource.java:89)
	at org.eclipse.leshan.transport.javacoap.resource.LwM2mCoapResource.handleRequest(LwM2mCoapResource.java:78)
	at org.eclipse.leshan.transport.javacoap.resource.LwM2mCoapResource.apply(LwM2mCoapResource.java:63)
	at org.eclipse.leshan.transport.javacoap.resource.LwM2mCoapResource.apply(LwM2mCoapResource.java:42)
	at com.mbed.coap.server.RouterService.apply(RouterService.java:68)
	at com.mbed.coap.server.RouterService.apply(RouterService.java:33)
	at com.mbed.coap.utils.Filter$1.apply(Filter.java:69)
	at com.mbed.coap.utils.Filter.lambda$null$0(Filter.java:33)
	at com.mbed.coap.server.block.BlockWiseIncomingFilter.apply(BlockWiseIncomingFilter.java:55)
	at com.mbed.coap.server.block.BlockWiseIncomingFilter.apply(BlockWiseIncomingFilter.java:38)
	at com.mbed.coap.utils.Filter.lambda$null$0(Filter.java:33)
	at com.mbed.coap.server.CriticalOptionVerifier.apply(CriticalOptionVerifier.java:34)
	at com.mbed.coap.server.CriticalOptionVerifier.apply(CriticalOptionVerifier.java:27)
	at com.mbed.coap.utils.Filter.lambda$null$0(Filter.java:33)
	at com.mbed.coap.server.RescueFilter.apply(RescueFilter.java:35)
	at com.mbed.coap.server.RescueFilter.apply(RescueFilter.java:29)
	at com.mbed.coap.utils.Filter.lambda$andThen$1(Filter.java:34)
	at com.mbed.coap.utils.Filter.lambda$andThen$1(Filter.java:34)
	at com.mbed.coap.utils.Filter.lambda$andThen$1(Filter.java:34)
	at com.mbed.coap.utils.Filter.lambda$then$3(Filter.java:53)
	at com.mbed.coap.server.messaging.CoapTcpDispatcher.handle(CoapTcpDispatcher.java:73)
	at com.mbed.coap.server.CoapServer.handle(CoapServer.java:76)
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
	at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
	at org.eclipse.leshan.transport.javacoap.server.coaptcp.transport.NettyCoapTcpTransport$DispatchHandler.channelRead(NettyCoapTcpTransport.java:167)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:289)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:868)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:840)
@max-pv max-pv added the bug Dysfunctionnal behavior label Nov 20, 2024
@sbernard31
Copy link
Contributor

sbernard31 commented Nov 20, 2024

Thx for taking time to reporting that. 🙏

Glad to see that you are testing coap+tcp experimental feature.

I will try to look at this but maybe not today 😬

Some remarks waiting I take time to look at this deeper :

"org.eclipse.leshan.core.response.ReadResponse.isSuccess()" because "response" is null

a null response with synchronous send API means that the request timeout.
Why it times-out ? I don't know for now.

ReadResponse response = lwm2mServer.send(registration, new ReadRequest(3, 0, 13));

I see you are using synchronous send API in registered event.
This is not advised, see javadoc :

/**
 * Listen for client registration events.
 * <p>
 * Those methods are called by the protocol stage thread pool, this means that execution MUST be done in a short delay,
 * if you need to do long time processing use a dedicated thread pool.
 */

Could you try with async send API ?

@max-pv
Copy link
Author

max-pv commented Nov 23, 2024

Thank you! I will try that sometime later. Instead of doing CoAP over TCP, for now I've decided to try if WebSocket proxy would work, as defined in Section 4 of RFC 8323.

@sbernard31
Copy link
Contributor

I'm able to reproduce that but using async way to send request, it works for me, with :

lwm2mServer.getRegistrationService().addListener(new RegistrationListener() {
    @Override
    public void registered(Registration registration, Registration previousReg,
            Collection<org.eclipse.leshan.core.observation.Observation> previousObservations) {
        LOG.info("new device: {}", registration.getEndpoint());
        lwm2mServer.send(registration, new ReadRequest(3, 0, 13), response -> {
            if (response.isSuccess()) { // <-- error: response is null when using CoAP over TCP
                LOG.info("Device time: {}", ((LwM2mResource) response.getContent()).getValue());
            } else {
                LOG.info("Failed to read: {} {}", response.getCode(), response.getErrorMessage());
            }
        }, e -> LOG.error("Unexpected error while reading device time", e));
    }

    @Override
    public void updated(RegistrationUpdate update, Registration updatedReg, Registration previousReg) {
        LOG.info("device is still here: {}", updatedReg.getEndpoint());
    }

    @Override
    public void unregistered(Registration registration,
            Collection<org.eclipse.leshan.core.observation.Observation> observations, boolean expired,
            Registration newReg) {
        LOG.info("device left: {}", registration.getEndpoint());
    }
});

I get

2025-01-09 14:41:48,749 LeshanServerDemo     [INFO] new device: tcp-device-1  
2025-01-09 14:41:48,835 LeshanServerDemo     [INFO] Device time: Thu Jan 09 14:41:48 CET 2025  
2025-01-09 14:42:08,664 LeshanServerDemo     [INFO] device left: tcp-device-1  

@sbernard31
Copy link
Contributor

I looked at your problem a bit deeper and try to investigate about the sync way you used.

I confirm the NPE is because you don't handle the timeout correctly. Code should looks like :

 lwm2mServer.getRegistrationService().addListener(new RegistrationListener() {
    @Override
    public void registered(Registration registration, Registration previousReg,
            Collection<org.eclipse.leshan.core.observation.Observation> previousObservations) {
        LOG.info("new device: {}", registration.getEndpoint());
        try {
            ReadResponse response = lwm2mServer.send(registration, new ReadRequest(3, 0, 13), TIMEOUT_IN_MS);
            if (response == null) {
                LOG.info("Read 'Device Time' timeout");
            } else if (response.isSuccess()) { // <-- error: response is null when using CoAP over TCP
                LOG.info("Device time: {}", ((LwM2mResource) response.getContent()).getValue());
            } else {
                LOG.info("Failed to read: {} {}", response.getCode(), response.getErrorMessage());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            LOG.error("Interrupted while reading device time", e);
        } catch (Exception e) {
            LOG.error("Unexpected error while reading device time", e);
        }
    }

    @Override
    public void updated(RegistrationUpdate update, Registration updatedReg, Registration previousReg) {
        LOG.info("device is still here: {}", updatedReg.getEndpoint());
    }

    @Override
    public void unregistered(Registration registration,
            Collection<org.eclipse.leshan.core.observation.Observation> observations, boolean expired,
            Registration newReg) {
        LOG.info("device left: {}", registration.getEndpoint());
    }
});

But even with a cleaner code like ☝️, Sync send will not work because this code is executed in Netty NioEventLoopGroup and it should not contain blocking code.

I'm not 100% sure but, my understanding :
In Netty, 1 connection is representing by a Channel and each Channel IO operation are done by same thread.
In your case, you block the thread by executing a Sync Read Request but you will never get the answer because you actually blocked the IO thread responsible to read the response. (as this is same channel so same thread)
I'm not sure I'm clear ?

Anyway, I think the right way to do is to use Async Send API.

By the way the RegistrationListner javadoc says :

/**
 * Listen for client registration events.
 * <p>
 * Those methods are called by the protocol stage thread pool, this means that execution MUST be done in a short delay,
 * if you need to do long time processing use a dedicated thread pool.
 */

HTH

@sbernard31
Copy link
Contributor

@max-pv please let us know if you move forward on that and also If we should close this issue ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Dysfunctionnal behavior
Projects
None yet
Development

No branches or pull requests

2 participants