'No Response' after reboot/ssid_off WiFi router #139

Open
dlangamer opened this issue Jul 19, 2021 · 24 comments · May be fixed by #184

@dlangamer

Everything works fine until the network's WiFi router goes down or restarts. The ESP serial output reports a successful reconnection, but the devices can no longer communicate with the Home app (red "No Response" message). The only way to get it working again is to restart the ESP. Is anyone else experiencing this problem?

Below, the result displayed in the serial:

SketchSize: 476624 B
FreeSketchSpace: 1617920 B
FlashChipSize: 4194304 B
FlashChipRealSize: 4194304 B
FlashChipSpeed: 40000000
SdkVersion: 2.2.2-dev(38a443e)
FullVersion: SDK:2.2.2-dev(38a443e)/Core:3.0.1=30001000/lwIP:STABLE-2_1_2_RELEASE/glue:1.2-48-g7421258/BearSSL:c0b69df
CpuFreq: 160MHz
FreeHeap: 46656 B
ResetInfo: External System
ResetReason: External System
OFF
>>> [    100] HomeKit: Starting server
>>> [    104] HomeKit: Using existing accessory ID: EF:B9:5C:77:D4:96
>>> [    110] HomeKit: Found admin pairing with 19052CD2-AA12-4409-A7C0-26A07A1E5030, disabling pair setup
>>> [    119] HomeKit: Configuring MDNS
>>> [    122] HomeKit: Init server over
>>> [   1327] HomeKit: heap: 46552, sockets: 0
>>> [   4841] HomeKit: WiFi connected, ip: 192.168.0.23, mask: 255.255.255.0, gw: 192.168.0.1
>>> [   4850] HomeKit: Configuring MDNS
>>> [   4854] HomeKit: MDNS begin: ESP8266_LED_06EB3A, IP: 192.168.0.23
>>> [   6401] HomeKit: heap: 45120, sockets: 0
>>> [   7611] HomeKit: Got new client: local 192.168.0.23:5556, remote 192.168.0.18:49238
>>> [   7620] HomeKit: [Client 1073680492] Pair Verify Step 1/2
>>> [   7939] HomeKit: Free heap: 42600
>>> [   8144] HomeKit: [Client 1073680492] Pair Verify Step 2/2
>>> [   8151] HomeKit: [Client 1073680492] Found pairing with 19052CD2-AA12-4409-A7C0-26A07A1E5030
>>> [   8174] HomeKit: Call ge_double_scalarmult_vartime_lowmem in ge_low_mem.c
>>> [   8905] HomeKit: [Client 1073680492] Verification successful, secure session established
>>> [   8913] HomeKit: Free heap: 42712
>>> [   9122] HomeKit: [Client 1073680492] Get Accessories
>>> [   9419] HomeKit: [Client 1073680492] Update Characteristics
>>> [  11440] HomeKit: heap: 43280, sockets: 1
>>> [  12851] HomeKit: [Client 1073680492] Get Characteristics
>>> [  13266] HomeKit: [Client 1073680492] Get Characteristics
>>> [  16305] HomeKit: [Client 1073680492] Update Characteristics
ON
>>> [  16516] HomeKit: heap: 42776, sockets: 1
>>> [  17326] HomeKit: [Client 1073680492] Update Characteristics
OFF
>>> [  18342] HomeKit: [Client 1073680492] Update Characteristics
ON
>>> [  18755] HomeKit: [Client 1073680492] Update Characteristics
OFF
>>> [  21577] HomeKit: heap: 42776, sockets: 1
>>> [  26610] HomeKit: heap: 42776, sockets: 1
>>> [  31628] HomeKit: heap: 42776, sockets: 1
>>> [  36646] HomeKit: heap: 42776, sockets: 1
>>> [  41662] HomeKit: [Client 1073680492] Disconnected!      <     WIFI DISCONNECT - NO RESPONSE AFTER THIS POINT
>>> [  41667] HomeKit: [Client 1073680492] Closing client connection
>>> [  41673] HomeKit: heap: 45256, sockets: 0
>>> [  46692] HomeKit: heap: 45280, sockets: 0
>>> [  51713] HomeKit: heap: 45280, sockets: 0
>>> [  56733] HomeKit: heap: 45280, sockets: 0
>>> [  61750] HomeKit: heap: 45280, sockets: 0
>>> [  66766] HomeKit: heap: 44896, sockets: 0
>>> [  69017] HomeKit: WiFi connected, ip: 192.168.0.23, mask: 255.255.255.0, gw: 192.168.0.1
>>> [  69025] HomeKit: Configuring MDNS
>>> [  69032] HomeKit: MDNS restart: ESP8266_LED_06EB3A, IP: 192.168.0.23
>>> [  71788] HomeKit: heap: 45000, sockets: 0
>>> [  76819] HomeKit: heap: 44496, sockets: 0
>>> [  81842] HomeKit: heap: 44704, sockets: 0
>>> [  86863] HomeKit: heap: 44704, sockets: 0
>>> [  91881] HomeKit: heap: 44704, sockets: 0
>>> [  96898] HomeKit: heap: 44704, sockets: 0
@dsbaha
Contributor

dsbaha commented Jul 20, 2021

I, too, have recently experienced this problem. Originally I thought it was a problem with my code, but it seems the problem may be elsewhere. I've been thinking about writing a public server_free function for the following scenarios:

  1. Unload on OTA start
  2. Unload/reload on WiFi disconnect?
  3. Unload/reload on request (maybe via HTTP, though then maybe just ESP.restart() would suffice).

I also want to mention that I have Arduino statsd instrumented, and the statsd reporting keeps working after a reconnect. So the core loop is operating as intended; only the HomeKit server stops responding after a reconnect.

@dlangamer
Author

dlangamer commented Jul 21, 2021

> I, too, have recently experienced this problem. Originally I thought it was a problem with my code, but it seems the problem may be elsewhere. I've been thinking about writing a public server_free function for the following scenarios:
>
>   1. Unload on OTA start
>   2. Unload/reload on WiFi disconnect?
>   3. Unload/reload on request (maybe via HTTP, though then maybe just ESP.restart() would suffice).
>
> I also want to mention that I have Arduino statsd instrumented, and the statsd reporting keeps working after a reconnect. So the core loop is operating as intended; only the HomeKit server stops responding after a reconnect.

The problem with using ESP.restart() is that the device would lose its active state. For example, a light bulb driven by a relay would turn off and the user would need to switch it on again. Yesterday I tested unloading the entire WiFi stack and reconnecting, but that didn't work either. I believe the problem lies in how the session/socket teardown is handled.

I'm studying all the code to try to find a solution, but I'm not an advanced programmer. If anyone has any ideas, please help.

@dlangamer
Author

dlangamer commented Jul 26, 2021

> I, too, have recently experienced this problem. Originally I thought it was a problem with my code, but it seems the problem may be elsewhere. I've been thinking about writing a public server_free function for the following scenarios:
>
>   1. Unload on OTA start
>   2. Unload/reload on WiFi disconnect?
>   3. Unload/reload on request (maybe via HTTP, though then maybe just ESP.restart() would suffice).
>
> I also want to mention that I have Arduino statsd instrumented, and the statsd reporting keeps working after a reconnect. So the core loop is operating as intended; only the HomeKit server stops responding after a reconnect.

I found a workaround for the reconnection issue. In "arduino_homekit_server.cpp", in the function "void arduino_homekit_setup", I added a check for whether the mDNS service is active. If it is not, the function calls homekit_server_init, which sets up accessory pairing and, as a result, also starts the mDNS service. It works well for now. I'm still checking for side effects.

void arduino_homekit_setup(homekit_server_config_t *config) {
	/* ESP32 use FreeRTOS-task
	xTaskCreate(esp32_homekit_task, "HomeKit Server",
	SERVER_TASK_STACK, config, 1, NULL);
	*/

	/* if (system_get_cpu_freq() != SYS_CPU_160MHZ) {
	system_update_cpu_freq(SYS_CPU_160MHZ);
	INFO("Update the CPU to run at 160MHz");
	}*/

	// Only initialize the server when mDNS has not been started yet.
	// Compare with ==; a single = would assign false and the branch would never run.
	if (homekit_mdns_started == false) {
		homekit_server_init(config);
	}
}

@tomdmt

tomdmt commented Sep 5, 2021

I'm also trying to find a workaround and would like to try your solution. I'm a bit of a newbie and confused: why did you add this snippet of code

if (homekit_mdns_started == false) { homekit_server_init(config); }

to "void arduino_homekit_setup" rather than, say, "void arduino_homekit_loop"? Isn't setup run only once, when the ESP restarts? Another thing that confuses me is that "void arduino_homekit_setup" already runs "homekit_server_init(config)" (at least in the 8266 library) regardless of mDNS status, and the issue is still there. Thank you for your time and answers.
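
Since setup runs only once, any reconnect handling would have to be driven from the loop. Below is a minimal sketch of that idea in portable C++, not code from the library: ReconnectWatcher and its poll method are hypothetical names, and the `connected` flag is assumed to come from something like `WiFi.status() == WL_CONNECTED` on real hardware.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical helper: runs a callback once each time the link comes back
// after having been down. It does not fire on the very first connect.
class ReconnectWatcher {
public:
    explicit ReconnectWatcher(std::function<void()> on_reconnect)
        : on_reconnect_(std::move(on_reconnect)) {
        assert(on_reconnect_ && "callback must be callable");
    }

    // Call once per loop() iteration with the current link state.
    // Returns true on the iteration where a reconnect was detected.
    bool poll(bool connected) {
        bool fired = false;
        if (connected && seen_drop_) {
            on_reconnect_();       // e.g. re-announce the mDNS service here
            fired = true;
            seen_drop_ = false;
        } else if (!connected && ever_connected_) {
            seen_drop_ = true;     // remember that the link went down
        }
        if (connected) ever_connected_ = true;
        return fired;
    }

private:
    std::function<void()> on_reconnect_;
    bool ever_connected_ = false;
    bool seen_drop_ = false;
};
```

In a sketch, a single watcher would be created once and polled from loop(); what the callback should do (re-announce, re-init, or restart) is exactly the open question in this thread.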

@ruleechen

+1, I have this problem too. I have tested two WiFi routers with the simplest code.

@ruleechen

I am linking another relevant issue
#103

@ruleechen

ruleechen commented Oct 11, 2021

Guys, I have got the magic! Just remove this line.

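[image: screenshot of the line to remove; later comments in this thread identify it as the MDNS.close() call in arduino_homekit_server.cpp]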

@tomdmt

tomdmt commented Oct 11, 2021

huh, do you have a working theory as to why this works?

Edit: I made two identical devices, one with this change and one without and I'm glad to say that the one with the change did not disconnect a single time over several weeks, whereas the one without the change disconnected multiple times.
I'm so happy, I've been looking for a fix to this issue for months and this is the only thing that works.

@seenve

seenve commented Oct 20, 2021

I have this problem on iOS 15.1. The change did not help.

@jenspr

jenspr commented Jan 10, 2022

Is this the best solution so far?
Any feedback yet? -- I am going to rebuild my devices these days and I had run into that problem in the past.
I am really looking for a robust fix too :-)

@mateusmsantin

I changed the code to //MDNS.close(), but the error is the same. I then set up monitoring of the ESP's reboots on ThingSpeak:
https://thingspeak.com/channels/1286808/charts/3?bgcolor=%23ffffff&color=%23d62020&days=1&dynamic=false&results=100&title=Reinicialização+do+sistema&type=line&yaxis=boot
The problem persists...

@ruleechen

@mateusmsantin

Try my version: https://github.com/ruleechen/home-switch/blob/main/extras/arduino_homekit_server.cpp It has a few small customizations and will probably help. Good luck.

@mateusmsantin

Hi @ruleechen! I tried replacing arduino_homekit_server.cpp, but I'm getting an error.
Can this update work with ArduinoOTA?

Multiple libraries were found for "ArduinoOTA.h"
 Used: /Users/test/Library/Arduino15/packages/esp8266/hardware/esp8266/3.0.2/libraries/ArduinoOTA
 Not used: /Users/test/Documents/Arduino/libraries/ArduinoOTA
Multiple libraries were found for "arduino_homekit_server.h"
 Used: /Users/test/Documents/Arduino/libraries/HomeKit_ESP8266-1.2.0
 Not used: /Users/test/Documents/Arduino/libraries/HomeKit-ESP8266
 Not used: /Users/test/Documents/Arduino/libraries/ESPHap
/Users/test/Documents/Arduino/libraries/HomeKit_ESP8266-1.2.0/src/arduino_homekit_server.cpp-OLD.cpp:17:10: fatal error: homekit_base64.h: No such file or directory
   17 | #include "homekit_base64.h"
      |          ^~~~~~~~~~~~~~~~~~
compilation terminated.

@ruleechen

@mateusmsantin
This does not seem to be related to ArduinoOTA. Please change 'homekit_base64.h' in the file to 'base64.h'. That name is a customization for my environment.

@mateusmsantin

@ruleechen
Great work! It is running very well: about 4 hours with no reboot.

@Sanjayc1806

@ruleechen Thanks! It's working, but it does not restore the previous relay state after a power off and on.
Is there any solution?
For example, when power is lost and comes back, it should restore the previous on/off state.
Please help!

@ruleechen

ruleechen commented Feb 23, 2022

@Sanjayc1806
Keeping state is beyond the scope of this library. I'm afraid we have to do it ourselves by saving the on/off state and restoring it from the saved value after a reboot.
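
As a rough illustration of the save-and-restore idea, here is a minimal sketch in portable C++. The byte array stands in for persistent storage; on an ESP8266 it would typically be backed by the EEPROM library (EEPROM.begin, read, write, commit). The record layout, the kMagic value, and the function names are all assumptions made for this example, and the base offset would need to be chosen so it does not overlap whatever region the HomeKit library uses for its own data.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for a small persistent region (hypothetical size).
using Storage = std::array<std::uint8_t, 16>;

constexpr std::uint8_t kMagic = 0xA5;   // hypothetical "record present" marker
constexpr std::size_t kRecordSize = 3;  // [magic][state][~state]

// Write the relay state at `base`, with a complement byte as a cheap check.
void save_relay_state(Storage& s, std::size_t base, bool on) {
    assert(base + kRecordSize <= s.size());
    const std::uint8_t v = on ? 1 : 0;
    s[base] = kMagic;
    s[base + 1] = v;
    s[base + 2] = static_cast<std::uint8_t>(~v);
}

// Read the relay state back. Returns false (and leaves `on` untouched)
// when no valid record is found, so the caller can fall back to a default.
bool load_relay_state(const Storage& s, std::size_t base, bool& on) {
    assert(base + kRecordSize <= s.size());
    if (s[base] != kMagic) return false;
    const std::uint8_t v = s[base + 1];
    if (v > 1) return false;
    if (s[base + 2] != static_cast<std::uint8_t>(~v)) return false;
    on = (v == 1);
    return true;
}
```

The intent is to call the save function whenever the HomeKit characteristic changes and the load function once at boot, before the relay pin is driven. Flash-backed storage has limited write cycles, so saving only on an actual change is the safer choice.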

@Sanjayc1806

@ruleechen Yeah, we can do that using EEPROM, and it works fine with web server localhost buttons.
I should try that with the HomeKit code.

Thanks for your version of the code. It's working great now.

@ruleechen

@Sanjayc1806 Welcome. Mine is up for over 10 days now.

@ayush9upta

Hi @ruleechen, Can you provide me the modified code for ESP32? Thank you in advance!!

@kendling

> huh, do you have a working theory as to why this works?
>
> Edit: I made two identical devices, one with this change and one without and I'm glad to say that the one with the change did not disconnect a single time over several weeks, whereas the one without the change disconnected multiple times. I'm so happy, I've been looking for a fix to this issue for months and this is the only thing that works.

I'm facing the same issue.

After finding this page, I reviewed homekit_mdns_init() in arduino_homekit_server.cpp and MDNSResponder::close() in LEAmDNS.cpp.

The theory may be:

  1. MDNS.close() releases all services that were added by addService.
  2. On boot/reboot, homekit_mdns_started is always false, so addService runs after the // MDNS.close(); line.
  3. After a WiFi reconnect, homekit_mdns_started is always true, so addService is skipped and no service is active if MDNS.close() has been called.
  4. I guess this service is used to wait for HA to connect.
  5. With no service active, HA can't connect to the homekit_server.
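
The theory above can be modelled with a toy responder. This is a sketch under stated assumptions, not the LEAmDNS API or the library's actual homekit_mdns_init: ToyResponder, ToyServer, and the close_on_restart flag are hypothetical, and the only behaviour modelled is that close() forgets every registered service while a "started" flag prevents them from being added again.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy stand-in for an mDNS responder: close() drops all services.
struct ToyResponder {
    std::set<std::string> services;
    void addService(const std::string& name) { services.insert(name); }
    void close() { services.clear(); }
};

struct ToyServer {
    ToyResponder mdns;
    bool mdns_started = false;  // plays the role of homekit_mdns_started

    // close_on_restart == true models the original code path;
    // false models the workaround of commenting out MDNS.close().
    void mdns_init(bool close_on_restart) {
        if (mdns_started) {
            if (close_on_restart) mdns.close();  // services are dropped here
            return;                              // and never added again
        }
        mdns.addService("_hap._tcp");            // HomeKit's service type
        mdns_started = true;
    }

    bool discoverable() const { return mdns.services.count("_hap._tcp") > 0; }
};
```

Calling mdns_init(true) twice leaves the toy server undiscoverable, while calling mdns_init(false) twice keeps the service registered. Under this model, the alternative to skipping close() would be to re-add the service after every close().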

mrthiti linked a pull request on Sep 1, 2022 that will close this issue
@paullj1

paullj1 commented Feb 25, 2023

> The theory may be:
>
>   1. MDNS.close() releases all services that were added by addService.
>   2. On boot/reboot, homekit_mdns_started is always false, so addService runs after the // MDNS.close(); line.
>   3. After a WiFi reconnect, homekit_mdns_started is always true, so addService is skipped and no service is active if MDNS.close() has been called.
>   4. I guess this service is used to wait for HA to connect.
>   5. With no service active, HA can't connect to the homekit_server.

I think you might be on to something here. Looks like if WiFi reconnects, we need to re-advertise those services, and we’re simply not. Going to play with just removing the condition, and clearing/adding the service every time mdns_init is called.

Further evidence:

  // The MDNS needs to be restarted when WiFi is connected to confirm the
  // MDNS runs at the IPAddress of STA
  // otherwise the iOS will not show the Accessory

But if, as you point out, the services are actually gone, then they won’t be re-advertised, so HomeKit isn’t aware they exist. My guess is that if we reboot soon enough (due to some watchdog condition or memory leak), we just reinit, but if we don’t reboot for some time, HomeKit assumes we’re gone.

@paullj1

paullj1 commented Feb 25, 2023

Okay, so digging deeper into the actual mDNS announcements, I noticed that the device IDs are changing, which explains everything. The attempts thus far have been to optimize the mDNS code itself, but that doesn’t solve anything if the accessory ID changes (which appears to happen randomly).

The problem I’ve been having isn’t that the device gets disconnected, it’s that when the device comes back online, it’s failing to find its ID, and generating a new random one. To the network, it’s a brand new device.

It looks like my problem is that something is colliding with the address space in the EEPROM that this library is using for the ID.

Hope this helps someone else!

@paullj1

paullj1 commented Feb 25, 2023

Okay, so it doesn’t look like I was overlapping. It seems to be an error-handling issue in the “compact_data” function. Any time there’s a topology change, HomeKit notifies the device to update its pairing info, and this happens often now with the new HomeKit topology changes. The bad news is that if the device reboots, or fails to read from the flash for any reason (preemption, watchdog, etc.), it blanks out the “magic” bytes, completely invalidating the config/pairing data, and doesn’t restore them. The boot process then thinks the device ID is invalid and generates a new one. HomeKit looks for the old device ID and can’t find it, and the device advertises itself as a brand new device.
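
To make the failure mode concrete, here is a toy model in portable C++. It is a sketch of the behaviour described above, not the library's compact_data and not the fix from any PR: ToyStore, kMagic, and the restore_on_error flag are hypothetical, and the payload is assumed to stay intact when the rewrite fails, which a real fix would have to verify before marking the record valid again.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

constexpr std::uint32_t kMagic = 0x48415021;  // hypothetical marker value

struct ToyStore {
    std::uint32_t magic = kMagic;
    std::string accessory_id = "EF:B9:5C:77:D4:96";  // ID from the serial log
};

// Models a rewrite that invalidates the record while it works.
// fail_midway simulates a read/write error partway through.
// restore_on_error is the hypothetical remedy: put the marker back on failure.
bool compact(ToyStore& s, bool fail_midway, bool restore_on_error) {
    s.magic = 0;                        // record is invalid during the rewrite
    if (fail_midway) {
        if (restore_on_error) s.magic = kMagic;
        return false;
    }
    s.magic = kMagic;                   // rewrite finished, record valid again
    return true;
}

// Boot-time lookup: a missing marker is treated as "no stored ID",
// so a fresh ID is generated and the accessory looks like a new device.
std::string accessory_id_at_boot(const ToyStore& s) {
    return s.magic == kMagic ? s.accessory_id : std::string("<new random ID>");
}
```

With restore_on_error set to false, a failed rewrite leaves the marker blank and the next boot produces a new ID, which matches the symptom reported here. A more robust design would write the new data first and update the marker last, so an interruption never leaves a blank marker behind.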

Working on a fix. Will post it as a PR when done.

paullj1 added a commit to paullj1/Arduino-HomeKit-ESP8266 that referenced this issue Feb 25, 2023
Izumiko pushed a commit to Izumiko/Arduino-HomeKit-ESP8266 that referenced this issue May 10, 2024
sebasanblas added a commit to sebasanblas/Arduino-HomeKit-ESP8266 that referenced this issue Jan 1, 2025
* fix: It base64 error if use with WifiManager

* fix: the device is not work after reconnect wifi

* Remove logging

* Add debug back in, but make default "no log"

* Fix complete annihilation of storage on new pair

- Might fix Mixiaoxiao#103, Mixiaoxiao#139, Mixiaoxiao#147, Mixiaoxiao#184, Mixiaoxiao#198
- Also adds changes by @ruleechen
- Also adds changes by @thiti-y

* Fix incorrect access of global var

---------

Co-authored-by: thiti yamsung <[email protected]>
Co-authored-by: Paul Jordan <[email protected]>