OTA with MQTT TLS not working #648
Comments
The problem seems to be in ESPAsyncTCP. I created an issue on that project to see if I get some pointers. Here it is : me-no-dev/ESPAsyncTCP#133 |
Ok, I got no answers on the ESPAsyncTCP issue. The good news is that I have Homie working on ESP32 with TLS, and it starts doing OTA over TLS (I say "starts" because it eventually crashes). It's a work in progress, but here is what I have accomplished so far. Using a PR branch of AsyncTCP that merges in mbed-tls, and propagating some changes to AsyncMqttClient and Homie, I have my ESP32 connecting to mosquitto using TLS with pre-shared keys. In fact, you can try it out yourselves with this platformio.ini:
[env:nodemcu32]
platform = espressif32
board = esp32doit-devkit-v1
framework = arduino
build_flags =
-D ASYNC_TCP_SSL_ENABLED=1
-D PIO_FRAMEWORK_ARDUINO_LWIP2_HIGHER_BANDWIDTH
upload_speed = 115200
lib_ldf_mode = deep
lib_deps =
[email protected]
https://github.com/nemidiy/AsyncTCP.git#mbed-tls-try2
https://github.com/nemidiy/async-mqtt-client.git#mbed-tls
https://github.com/nemidiy/homie-esp8266.git#feature/mbed-tls-esp32
The PSK and PSK identity need to be added to your config:
{
"name": "Test Box 32",
"device_id": "testbox32",
"device_stats_interval": 10,
"wifi": {
"ssid": "YYYYYYYYYYYYY",
"password": "XXXXXXXXXX",
"ip":"192.168.0.202",
"gw":"192.168.0.1",
"mask":"255.255.255.0",
"dns1":"192.168.0.232"
},
"mqtt": {
"host": "mqtt.dc-iot.com",
"port": 9883,
"base_topic": "devices/",
"ssl":true,
"psk_identity":"test", -> new setting
"psk":"XXX" -> new setting
},
"ota": {
"enabled": true
}
}
This is how the mosquitto listener has been set up:
listener 9883
allow_anonymous true
psk_hint somecrazyhint
psk_file /etc/mosquitto/conf.d/psk.txt
Here is the serial output:
💡 Firmware test (1.0.0)
🔌 Booting into normal mode 🔌
SSL is: 1
{} Stored configuration
• Hardware device ID: 30aea41c0018
• Device ID: testbox32
• Name: Test Box 32
• Device Stats Interval: 10 sec
• Wi-Fi:
◦ SSID: dc-iot
◦ Password not shown
◦ IP: 192.168.0.202
◦ Mask: 255.255.255.0
◦ Gateway: 192.168.0.1
• MQTT:
◦ Host: mqtt.dc-iot.com
◦ Port: 9883
◦ SSL enabled: true
◦ PSK identity not shown
◦ PSK not shown
◦ Base topic: devices/
◦ Auth? no
• OTA:
◦ Enabled? yes
↕ Attempting to connect to Wi-Fi...
✔ Wi-Fi connected, IP: 192.168.0.202
Triggering WIFI_CONNECTED event...
↕ Attempting to connect to MQTT...
✔ Wi-Fi connected, IP: 192.168.0.202
Triggering WIFI_CONNECTED event...
↕ Attempting to connect to MQTT...
↕ Attempting to connect to MQTT...
Sending initial information...
✔ MQTT ready
Triggering MQTT_READY event...
Calling setup function...
〽 Sending statistics...
• Interval: 15s (10s including 5s grace time)
• Wi-Fi signal quality: 100%
• Uptime: 3s
Fake value: 3444
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
〽 Sending statistics...
• Interval: 15s (10s including 5s grace time)
• Wi-Fi signal quality: 100%
• Uptime: 13s
Fake value: 13450
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
[E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c
Here I need to figure out why I get the [E][AsyncTCP.cpp:1130] _poll(): 0x3ffcc148 != 0x3ffd293c errors. And then, when I do an OTA update:
Receiving OTA payload
↕ OTA started
Triggering OTA_STARTED event...
Firmware is binary
Receiving OTA firmware (939/1281392)...
Receiving OTA firmware (1963/1281392)...
Receiving OTA firmware (2987/1281392)...
Receiving OTA firmware (4011/1281392)...
Receiving OTA firmware (5035/1281392)...
Receiving OTA firmware (6059/1281392)...
Receiving OTA firmware (7083/1281392)...
Receiving OTA firmware (8107/1281392)...
Receiving OTA firmware (9131/1281392)...
Receiving OTA firmware (10155/1281392)...
Receiving OTA firmware (11179/1281392)...
Receiving OTA firmware (12203/1281392)...
Receiving OTA firmware (13227/1281392)...
Receiving OTA firmware (14251/1281392)...
Receiving OTA firmware (15275/1281392)...
Receiving OTA firmware (16299/1281392)...
.
.
.
Receiving OTA firmware (816043/1281392)...
Receiving OTA firmware (817067/1281392)...
Receiving OTA firmware (818091/1281392)...
Receiving OTA firmware (819115/1281392)...
Receiving OTA firmware (820139/1281392)...
〽 Sending statistics...
• Interval: 15s (10s including 5s grace time)
Receiving OTA firmware (821163/1281392)...
Receiving OTA firmware (822187/1281392)...
Receiving OTA firmware (823211/1281392)...
Receiving OTA firmware (824235/1281392)...
Receiving OTA firmware (825259/1281392)...
Receiving OTA firmware (826283/1281392)...
Receiving OTA firmware (827307/1281392)...
Receiving OTA firmware (828331/1281392)...
Receiving OTA firmware (829355/1281392)...
Receiving OTA firmware (830379/1281392)...
Receiving OTA firmware (831403/1281392)...
Receiving OTA firmware (832427/1281392)...
Receiving OTA firmware (833451/1281392)...
Receiving OTA firmware (834475/1281392)...
Receiving OTA firmware (835499/1281392)...
se r� tn • Wi-Fi signi.o nn n "�n de dw_ n rncv n _n a� nnn n_ n wn n�d <d= d0 dxf n fn ff�" n fn an in ln en dn :n n f�il de n" n /n Un sn e�r sn /f�i dc en tn on /nD ne n sn kn tn o� pn/ dES Pd 3n2 /dE dS Pn 3n2 /ne dsp �- in dS fn- pdu dib ln in cn/ dco n mn pn on nne nn t�s /�l nw ni dp /n ln wi�p /e sn r�c /d c nor� en /n tn c pn. nc n" d, l ni dn � en d77 9d ,n dfu dn cn tn in oen d: nt nc np �_ un p�d ant de d _n r�c vn _a�n nd_ dw d nn dn
Sbort( ) was calsled at PC 0x400e3063non co1e 0
Backtrace: 0x4008c800:0x3ffb5370 0x4008ca31:0x3ffb5390 0x400e3063:0x3ffb53b0 0x400feebe:0x3ffb53e0 0x400fef4d:0x3ffb5400 0x4018bb75:0x3ffb5420 0x400fcbe0:0x3ffb5440 0x4008b015:0x3ffb5470
Rebooting...
ets Jun 8 2016 00:22:57
rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0018,len:4
load:0x3fff001c,len:1100
load:0x40078000,len:9232
load:0x40080400,len:6412
entry 0x400806a8
I have to figure out why it crashes, but given that we had no TLS support on ESP32 and that OTA was not even beginning to work with TLS, I am quite happy so far. I'll keep working on it and get back. |
-> If using ESP32 and ssl is enabled, then these JSON config parameters need to be set: * mqtt.psk: the pre-shared key, up to 32 chars * mqtt.psk_identity: the pre-shared key identity, up to 32 chars -> Don't publish stats or invoke the user-defined loop function while OTA is ongoing. This helps alleviate the TCP window running out of space given all the extra TLS computation.
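For anyone skimming, the "don't publish stats nor invoke the user loop while OTA is ongoing" rule from the commit message can be pictured roughly like this. This is only a minimal, self-contained sketch and not the actual Homie code: the `DeviceLoop` struct, its `otaOngoing` flag and the `statsSent` counter are hypothetical names made up for illustration.

```cpp
#include <functional>

// Hypothetical stand-in for the device main loop; not the Homie API.
struct DeviceLoop {
  bool otaOngoing = false;         // assumed flag, set while firmware chunks arrive
  int statsSent = 0;               // counts simulated stats publications
  std::function<void()> userLoop;  // user-defined loop function, may be empty

  void tick() {
    // While OTA is ongoing, skip everything that would generate extra
    // traffic so the TLS connection only has to carry the firmware.
    if (otaOngoing) {
      return;
    }
    ++statsSent;  // stand-in for publishing the periodic statistics
    if (userLoop) {
      userLoop();
    }
  }
};
```

With this shape, a `tick()` outside OTA publishes stats and calls the user loop, while a `tick()` during OTA does neither.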
Will test as soon as possible on ESP8266 with SSL encrypted MQTT connection. |
hey @kleini thanks for jumping in. Nothing I have done so far applies to ESP8266, just ESP32 |
-> If using ESP32 and ssl is enabled, then these JSON config parameters need to be set: * mqtt.psk: the pre-shared key, up to 32 chars * mqtt.psk_identity: the pre-shared key identity, up to 32 chars -> Don't publish stats or invoke the user-defined loop function while OTA is ongoing. This helps alleviate the TCP window running out of space. -> Send out only 30% of the 206 messages while doing OTA updates
-> If using ESP32 and ssl is enabled, then these JSON config parameters need to be set: * mqtt.psk: the pre-shared key, up to 32 chars * mqtt.psk_identity: the pre-shared key identity, up to 32 chars -> Don't publish stats or invoke the user-defined loop function while OTA is ongoing. This helps alleviate the TCP window running out of space. -> Send out 1 of every 100 of the 206 messages while doing OTA updates
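The "1 of every 100" throttling of the 206 progress messages could look something like the sketch below. It is an illustration under assumptions, not the code from the branch: `ProgressThrottle` and `shouldPublish()` are hypothetical names, and publishing the very first update is a choice made here, not something the commit message states.

```cpp
#include <cstdint>

// Hypothetical helper: lets through one progress update out of every N.
class ProgressThrottle {
 public:
  // A value of 0 is treated as 1 (publish everything) to avoid a modulo by zero.
  explicit ProgressThrottle(std::uint32_t every)
      : every_(every == 0 ? 1 : every), seen_(0) {}

  // Call once per received firmware chunk; returns true when the
  // corresponding progress message should actually be sent.
  bool shouldPublish() {
    const bool publish = (seen_ % every_ == 0);
    ++seen_;
    return publish;
  }

  // Call when a new OTA transfer starts.
  void reset() { seen_ = 0; }

 private:
  std::uint32_t every_;
  std::uint32_t seen_;
};
```

The OTA handler would then wrap the progress publication in `if (throttle.shouldPublish()) { ... }`, which keeps most of the outgoing traffic off the connection while the firmware is coming in.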
[UPDATE] I found why OTA crashes, though to be honest I don't quite get the underlying reason. Regardless I Contex when doing OTA :
That puts too much pressure on the AsyncTCP lib and it gets to a state where the TCP window https://github.com/marvinroger/async-mqtt-client/blob/master/src/AsyncMqttClient.cpp#L862 and after doing so, it just crashes. I took the following approach and got it working... while the OTA update is ongoing:
Still, if I do a trace I can see the ESP32 window going down to zero, but at least it works. I pushed these fixes to the branches in the platformio.ini, so if anyone has an ESP32 lying around I
Here is the main I am using:
#include "Arduino.h"
#include <Homie.h>

HomieNode sensorNode("sensor", "Sensor", "sensor");
unsigned long lastSent = 0;  // millis() timestamp of the last published value

// Publish a fake sensor value every 10 seconds (and once right after boot).
void loopHandler() {
  unsigned long m = millis();
  if (m - lastSent >= 10 * 1000UL || lastSent == 0) {
    Homie.getLogger() << "Fake value: " << m << endl;
    sensorNode.setProperty("value").send(String(m));
    lastSent = millis();
  }
}

void setup() {
  Serial.begin(115200);
  Homie_setFirmware("test", "1.0.0");
  Homie.setLoopFunction(loopHandler);
  sensorNode.advertise("value").setName("Value").setDatatype("double");
  Homie.setup();
}

void loop() {
  Homie.loop();
}
|
Great work @nemidiy !!! This is the first time OTA has worked for me over an SSL connection. It still seems to run into some problems, but at some point the OTA works.
and ota_updater logs
|
@kleini that's great news! So if what I did on homie + async-mqtt improved ESP8266 as a side effect (and it was pure luck, since I was targeting ESP32), it probably means that the ESPAsyncTCP lib, which uses the axtls ssl lib, is also having issues with buffers, and those probably cause the MQTT disconnections due to timeouts. I am currently trying to build the binary using esp-idf, since the sdkconfig.h allows specifying the underlying TCP window size. Thanks for the feedback. Hopefully I'll have more news soon. |
Oh, I missed your changes on AsyncMqttClient. It worked without that. After having a look at your changes, those are only for ESP32; I don't see any improvements for ESP8266. Furthermore, I don't see where axtls is replaced with mbedtls. But even without that, it is a big step forward for me. |
@kleini yes, you are right about AsyncMqttClient. axtls is replaced by mbedtls only if you use espressif32 as the platform, since the dependencies in AsyncMqttClient are :
The branch for AsyncTCP in the ini file is the one that replaces axtls with mbedtls. Anyway, I am still struggling with platformio using the arduino core as an esp-idf component in order to play with window |
Fiddling around with TLS implementations is currently too much for me. My C++ skills are too unexercised for that. But at least your changes make OTA possible for my devices. If I can further support you in getting more things tested and done, just tell me, how I can help. |
Ok, so I built homie and all its dependencies using esp-idf for the ESP32 platform to see whether playing around with the TCP window size would stop the window from dropping down to 0 from time to time when doing OTA updates over TLS. Short story, it did not work ...
I just concluded that the window dropping down to zero is simply the producer (broker) being much faster than the consumer (MCU) when it has to do as much decryption as the OTA needs. Anyway, and regardless of the window dropping to zero:
For ESP8266, since the changes are just in the homie repo, I can separate them from the ESP32 ones and create a PR. For ESP32 I first need :
Thus ... stay tuned for a "TLS tweaks" PR during the week. Thanks |
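To illustrate the conclusion above (a fast broker against a slow, decrypting MCU), here is a toy model of a receive buffer. It is a rough sketch under simplifying assumptions, not a model of lwIP or AsyncTCP: the function name, the tick-based timing and the byte rates are all made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>

struct WindowStats {
  std::size_t zeroWindowTicks;  // ticks in which the buffer filled up completely
  std::size_t bytesDelivered;   // total bytes the consumer managed to process
};

// Toy model: each tick the producer sends as much as the free window allows
// (at most producerRate), then the consumer drains at most consumerRate.
WindowStats simulateReceiveWindow(std::size_t capacity,
                                  std::size_t producerRate,
                                  std::size_t consumerRate,
                                  std::size_t ticks) {
  std::size_t buffered = 0;
  WindowStats stats{0, 0};
  for (std::size_t t = 0; t < ticks; ++t) {
    const std::size_t window = capacity - buffered;  // advertised free space
    buffered += std::min(producerRate, window);
    if (buffered == capacity) {
      ++stats.zeroWindowTicks;  // sender is now blocked on a zero window
    }
    const std::size_t drained = std::min(consumerRate, buffered);
    buffered -= drained;
    stats.bytesDelivered += drained;
  }
  return stats;
}
```

In this model a larger `capacity` only delays the first zero-window tick when `producerRate` exceeds `consumerRate`; it does not remove it, which is in line with the window-size experiment not helping. Throughput stays bounded by the consumer either way.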
Do you have any issues with closing the AsyncTCP connection or do you simply auto-reboot upon completion (or failure) and avoid the bug? I am not using mqtt, but am using AsyncTCP with mbedTLS on ESP32 pulling a payload from GCP. All is great, no issues, except closing the connection. |
hey @robert-alfaro, correct: after the OTA update the device just reboots into the new firmware. Unless there is a problem with MQTT, the socket does not get closed, and so far I have not had any MQTT disconnections, but I can try it out and see what happens. Could you tell me about your use case so I can try to reproduce it? Meaning, is the socket getting closed by your code because you are done doing whatever you need, or is it closing due to a problem in the channel (i.e. latency or whatever)? BTW, all my tests are in an internal network, and if I make any connections to the outside world I do so through an MQTT bridge (mosquitto), since I am paranoid about exposing devices and I can run the bridge on a more powerful and secured device (e.g. a Raspberry Pi). Also, are you not getting this? I can't figure this log out.
|
+ During OTA do not publish any messages that are not OTA related + Publish 1 of every 100 206 status messages
Wow, 4/4 approvals for the PR. I guess, we can close this issue. |
Hi everyone, happy new year!
I finally got to the point where I could test OTA updates under TLS. As we know it does not work.
Doc gives a hint :
What was rather painful to find is that on esp32 AsyncTCP has no support for TLS. There is a PR, though, that adds client-side TLS using an mbed lib. Since all this adds an extra level of complexity, I'll start by debugging on ESP8266, and then if I can make that work I'll see what I can do on esp32 (my board of choice).
If anyone has any ideas on what the problem is for OTA not working on TLS please let me know :)
Here is what I am doing :
mqtt.dc-iot.com resolves in my private DNS to my mqtt server; that's not even a real domain.
If I subscribe to the base topic (using mosquitto_sub) I can see that the FW is pushed just fine, but then when the board receives the message it just disconnects from the server.
here is my platformio ini file
and the deps :