Door shows as "No Response" in HomeKit until rebooted #94
Comments
I'm on 0.9 and am having a similar issue where at some point it goes “no response” and I have to manually unplug and re-plug. |
Me three, it seems to hang and will occasionally come back on its own after a couple of minutes, other times it requires restart. Running version 0.9 with a Chamberlain CS105MYQ with an obstruction sensor installed. |
I thought it was just me or maybe my RATGDO's hardware had a problem. I'm planning to try to debug at some point, but not sure where to begin with that yet so any tips are appreciated. |
@egrim I'm in no way qualified to offer much technical guidance here, but I documented how I was able to log from the device in #24 (comment) |
@thenewwazoo is working on replacing the Arduino-based HomeKit stack with the ESP-HomeKit stack. We are hopeful that this will bring more stability to the connections. |
I read in another issue that with UniFi, the ratgdo roaming between APs was
causing problems. Neither of my ratgdos has gone “no response” since I
forced each one to connect to a single AP in the UniFi settings. I will report
back if it happens again, but after a few days of it working, it looks like
that was the issue.
On Wed, Jan 17, 2024 at 3:50 PM David Kerr wrote:
@thenewwazoo is working on replacing the Arduino-based HomeKit stack with the
ESP-HomeKit stack. We are hopeful that this will bring more stability to the
connections.
|
Oh! I'll give that a try as I am indeed on UniFi. |
I'm on UniFi too and have been fortunate in having really good uptime. My configuration is to have a separate (2.4Ghz) SSID broadcasting only from the closest AP. Not sure if there's a material difference between that and forcing a client to a given AP, but I wanted to offer another (working) solution. |
Ah, I'm also on UniFi, I'll try the roaming lock |
Crazy how many of us with these home automation devices also have UniFi
setups!! :-)
On Wed, Jan 17, 2024 at 5:50 PM Lance Parker wrote:
Ah, I'm also on UniFi, I'll try the roaming lock
|
I'm on unifi and I locked my device to the closest AP. I have twice hit this issue where I had to power cycle the ratgdo to get it back alive. And it was several weeks apart. So, this helps, but don't expect it to be a magic bullet. I've also seen where if I power cycle my AP, it may not reconnect. Which seems like it might be similar to moving between APs. I think once it's connected it doesn't like getting disconnected. |
Also on UniFi, and I have also run into this issue twice. However, I can see in UniFi's console that the ratgdo has not connected to another AP, so that wasn't the issue. I've had no AP resets either. According to the connection log, the ratgdo was also still connected to the AP when it was reset. |
I did the AP lock yesterday, woke up this morning to "no response" :( |
Same issue here - UniFi and locked to AP. Still no response. |
Been having the same issue with my RATGDO on 0.9 and also running UniFi. Going to lock to a single AP just to be safe. Looking forward to the next firmware update. Thanks. |
In case it matters, I'm on 0.6.0. It doesn't seem like anything was changed in the interim that would affect connectivity, though. |
I'm on 0.8 and am seeing this problem as well. In the most recent occurrence, it happened after just a few hours: HomeKit couldn't see the garage door, the web interface wouldn't load, and the device wasn't reachable via ping. I'm also using UniFi network gear, and the ratgdo has line of sight to the nearest AP, which is about 15 feet away. If there's anything I can do to provide data to help troubleshoot this issue, I'd love to know. |
I appreciate everyone continuing to submit bug reports. Thank you! And I'm not kidding - it helps me get a sense of the scope of the problem. I do have good news, though! I have made significant progress in re-implementing the firmware to avoid the problem, which is instability in the Arduino core. I've replaced Arduino with the HomeKit SDK written by Espressif (who makes the ESP8266 chips used in the ratgdo), and my prototype is working well. Stay tuned, I expect to be able to publish a version for intrepid users to test in the next few days. |
Thanks @thenewwazoo! Same problem here, so we'll be waiting for the new firmware. Keep up the great work! Excluding this bug, the HomeKit firmware is by far the easiest and most responsive of all three firmwares for me. |
I'll also confirm issue with Unifi and 0.9.0 firmware. Three other devices using ESPHome work fine via HASS. Only this device fails to update, even when connecting to known good AP. Appreciate acknowledgement and I, too, am looking forward to f/w update. |
UniFi user here who did a good bit of troubleshooting before. Three APs, three ratgdos. v0.8 has definitely corrected the "No Response" issue that I was very frequently getting from hopping between APs. What v0.8 did not (and could not) correct was the WiFi signal issues that only my ESPs seemed to have. Despite "No Response" being corrected (when WiFi was working), I would still randomly and frequently get dropped from WiFi entirely, some ratgdos more than others. At least two of them would not reliably stay connected for even an hour.
About two weeks ago I made two changes in UniFi. Since then only one device has reconnected, one time (and only a 5 sec disconnect); the other two devices haven't lost connection a single time.
Change 1: Stopped using "Minimum Data Rate Control" on my 2.4 GHz network. It was previously set to "5.5 Mbps" as a minimum, and I think this was forcing hopping as well as disconnects from not meeting the minimum.
Change 2 (and probably more impactful): I increased AP 2.4 GHz signal strength. Yes, I know just increasing AP signal strength doesn't mean the devices can respond stronger, but it has helped massively. Before this, each ratgdo was typically under 10 Mbps on rx & tx. After this, right now, they are all 52-54 Mbps on rx & tx. "AP/Client Signal Balance" are all "Good". "WiFi Experience" are all "Excellent".
Note: Locking to an AP did not help me. Prior to these changes I tried a few different combinations of locking with no reliable improvement. I currently do not lock to an AP for any of my ratgdos.
For everyone else on UniFi, if you're experiencing high latency / low data rates / full WiFi disconnects, check the settings I listed above. Looking forward to @thenewwazoo's HomeKit rewrite, but I suspect it will have no impact if you can't get WiFi stable first, which was my experience.
I'm using v0.9, and since making the listed Unifi changes above, all three ratgdo's have been flawless - instant responses and never showing "No Response". |
Thanks @blakej. I’ll try #1 to see if that makes a difference. I have four devices: one is using HomeKit-RatGDO and the other three are using ESPHome firmware. All four devices are on UniFi APs. All four maintain strong WiFi connections with no drops, so that doesn’t seem to be the problem, but perhaps bandwidth sifting could be causing some issues. I can successfully add the HomeKit-RatGDO firmware device, control door, and all looks good for a day or two. Then, despite strong and consistent WiFi connection, Apple Home app shows “No Response” only for the HomeKit device. The non-responsive device is still connected to same WiFi AP and I can access the web page for it. Rebooting also doesn’t help - device will reestablish WiFi connection successfully but Home still shows “No response.” Using 0.9.0 firmware. Next step will be to check the recommendation you offer above, remove device from Home, power cycle, reinstall firmware, and re-add to Home. |
I do not have a UniFi system but am using a Velop system. One setting on the Velop which I have found made a difference for HomeKit device connectivity was to turn off "Device Steering". This lets the device connect to its preferred node rather than the Velop system steering which can result in a disconnect. Not sure if similar setting on UniFi but one to also try if available. |
Went ahead and reloaded f/w, reconnected to closest UniFi WiFi and all is good in Home at the moment. Will keep an eye on it in days to come. Looking forward to next release. Impressive to see how quickly this project went from concept to reality, so understand growing pains will be a part of this process. |
I had the same problem with Unifi (UAP-AC-HD). I downgraded WPA2/WPA3 (combination) to WPA2 (only) and it seems to be working well. I do use Band Steering and Fast Roaming. I don't use QOS. It doesn't seem to be causing an issue at the moment. |
Also recommend having your 2.4GHz network configured for WPA2 and not WPA2/WPA3 (combination). |
Are you guys resolving this issue by rebooting from the web interface or removing power? Are you needing to re-pair to HomeKit? |
I've been able to resolve by rebooting in the web interface, but it sounds like others weren't able to reach it so were forced to power-cycle it. |
@thetvguru For me, power cycling the box has done the trick. |
Mine is working again. For myself, I had to remove and reinstall to HomeKit. |
eero networks seem to be a bit of a common thread with regard to these issues. I don't think the eeros are to blame, but there is something about these networks that the ratgdo doesn't seem to like. |
I have a Unifi Dream Machine and had connectivity troubles on 5G. Switching
to a (non-Unifi) access point with a stronger 2.4 Ghz signal seems to have
fixed the problem for me.
I think @DMBlakeley's question about Homekit Hub is a good one since I've heard about some instability there, too. I have a wireless HomePod Mini as my hub.
Firmware: 0.9.0
Up Time: (dd:hh:mm:ss) 9:04:44:18
On Mon, Feb 12, 2024 at 2:08 PM jgstroud wrote:
I've been having similar issues and I'm beginning to suspect it's because of
my eero network. I'm also getting other symptoms including false
notifications (details: #105).
eero networks seem to be a bit of a common thread with regard to these
issues. I don't think the eeros are to blame, but there is something about
these networks that the ratgdo doesn't seem to like.
|
The ESP8266 does not support 5 GHz; it can only connect on 2.4 GHz. |
I used to have an Eero setup and the switching was very frequent. If there's any way to lock to one Eero, that would probably make it a lot more stable for you. I haven't had to do anything with the Unifi setup outside of lock it to an AP (which it was doing outside of resets).
My hub has switched a few times, and I think it could have been one of my drops, but it was only once (the drop). I've done multiple upgrades. Luckily I'm on a wired Apple TV right now, but you never know when that'll flip. Hopefully we get control of that one day. |
I use a wireless HomePod mini for my hub and mine has been very solid. At this point I wouldn't rule anything out though. I am also on a Unifi AP and I locked my device to the nearest AP. I have a spare ratgdo with the serial console being logged. I might try forcing it to roam and see if I can trigger this issue. Based on mrthiti/Arduino-HomeKit-ESP8266@3825ef4 I'm wondering if there is still something bad happening in the mdns stack when the ratgdo roams. |
Like many here, I have a couple of ratgdo devices and a UniFi network. The devices go non-responsive all the time. Nothing I do with the network helps at all. Even when they are non-responsive in HomeKit, I can reach the web management page and use it responsively. I have written my own HomeKit appliances a few times and have always had trouble when trying to use the Arduino-HomeKit-ESP8266 library. Same kind of behavior. So I am convinced that this is something in the timing of the protocol between HomeKit and the device. While I hate to do it, I have a Homebridge instance running already for some other stuff, so I loaded the mqtt-ratgdo firmware, used the homebridge-ratgdo plugin, and everything runs well and is responsive. I have had much better luck with the Espressif HomeKit SDK in terms of performance and stability, but it is a lot harder to get going with it. So I hope that @thenewwazoo has better luck with that attempt. I hereby volunteer for any testing you want to run when you are ready. |
I've been running v0.11.0 for 22+ days uptime without any of these issues. I use it just about everyday and working consistently (haven't noticed a failure/issue). Some speculation:
Not sure if this helps anyone else but some datapoints. |
For what it is worth, I have a bit of data to add to the discussion. When my ratgdo devices were flashed with ratgdo-homekit, ping times to the devices were all over the place, ranging from 10ms to 60ms, with bursts of long latency. Now that they are flashed with mqtt-ratgdo, the ping times are consistently under 5ms. Same location, same network settings, etc. This suggests that the chip does not have enough juice to run the TCP stack and HomeKit in the Arduino framework. Note that I have 50+ accessories in this particular home and there are 5 potential hubs to talk to. I have a number of devices with 8266s that work well in this environment, but all are running code based on the HomeKit SDK rather than the Arduino framework. These are nice little boards and there is no reason they shouldn't be able to handle this with a good framework. |
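For anyone who wants to make the same before/after comparison, here is a minimal sketch of how a batch of ping round-trip times could be summarized. It assumes you have already collected the times (in milliseconds) yourself; the function name `summarize_rtts` and the output keys are made up for illustration and are not part of any ratgdo tooling.

```python
import statistics


def summarize_rtts(rtts_ms):
    """Summarize ping round-trip times (milliseconds) collected by hand.

    Hypothetical helper for illustration only; not part of the firmware.
    """
    if not rtts_ms:
        raise ValueError("need at least one round-trip time")
    ordered = sorted(rtts_ms)
    return {
        "min_ms": ordered[0],
        "median_ms": statistics.median(ordered),
        "max_ms": ordered[-1],
        # A large spread between min and max points to bursty latency.
        "spread_ms": ordered[-1] - ordered[0],
    }
```

Running it once on samples taken with each firmware, from the same location and network, gives two summaries that can be compared directly. The spread is probably the number to watch, since the report above is about bursts rather than a uniformly slow link.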
@billy27607 I believe you're basically correct, as evidenced by my work porting this codebase to use the Espressif HomeKit SDK. The ESP8266 is (in my opinion) simply not powerful enough to meet the use cases of this firmware. I wrote a blog post about it, announcing that I'd be retiring from supporting this firmware, and recommending alternatives: "On Abandoning my RATGDO Native HomeKit Users".
Not every maintainer of this repo shares my pessimism! @jgstroud deserves crazy applause for his work to improve reliability, and he's previously stated that he's going to keep working on it. Choosing to stick it out is a reasonable option, so long as it's done after having expectations set accordingly. |
I certainly understand and appreciate the position that you are in. Sometimes when the juice ain't worth the squeeze, enough is enough. There is a working solution for those of us who have purchased the boards. I, for one, appreciate all of the effort that you put into this. You should not worry about anything at all. |
The no-response error occurs for me "reliably" if the ratgdo is left to run for ~3 days, at which point it spontaneously reboots and on startup believes that no HomeKit device is paired. If I do not pair HomeKit and leave the ratgdo running, it runs for many more days (I don't know how many, I reset it after 6 days). To me this smells awfully like a memory leak within the HomeKit code. It will likely take time to find the cause of the problem, or may never be found. But what works reliably for me is to reboot the ratgdo before the error occurs. We could do this automatically. In the PR #117 that is awaiting merge, I have added an automatic reboot every 24 hours (by default, configurable between 1 and 72 hours, with 0 meaning never). This is of course a nasty hack / work-around... but you would be surprised at the number of devices I have had to set to automatically reboot on a regular interval... not necessarily daily, but maybe weekly... commercial supported devices like cable modems, routers, etc. that would hang for no apparent reason, but work great as soon as rebooted. |
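To make the work-around concrete, here is a minimal sketch of the interval check described above: a default of 24 hours, configurable between 1 and 72, with 0 meaning never. This is not the code from PR #117. The names are hypothetical, and clamping out-of-range values (including treating negative values as "never") is an assumption made for the example.

```python
DEFAULT_REBOOT_HOURS = 24   # default stated in the comment above
MAX_REBOOT_HOURS = 72       # upper bound stated in the comment above
MS_PER_HOUR = 60 * 60 * 1000


def normalize_interval(hours: int) -> int:
    """Map a user-supplied value to 0 (never) or to the range 1..72 hours.

    Assumption for this sketch: negative values are treated as 0 and
    values above 72 are clamped to 72.
    """
    if hours <= 0:
        return 0
    return min(hours, MAX_REBOOT_HOURS)


def should_reboot(uptime_ms: int, interval_hours: int = DEFAULT_REBOOT_HOURS) -> bool:
    """Return True once uptime has reached the configured interval."""
    interval_hours = normalize_interval(interval_hours)
    if interval_hours == 0:
        return False  # 0 means "never reboot automatically"
    return uptime_ms >= interval_hours * MS_PER_HOUR
```

On the device, the equivalent check would presumably run in the main loop against the uptime counter. Comparing against uptime rather than wall-clock time means it does not depend on the device knowing the time of day.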
Completely understandable, and a big thank you for all the hard work; this clearly is what a lot of people were interested in and you definitely got it quite far. If there are any more expensive options on the market, it would be helpful to list those in the blog post for those who prefer not to trade one buggy tool for a time-consuming homelab setup. Likewise, a fancier high-end RatGDO with a more powerful SoC, more memory, etc. would also be nice if it exists. There is some curiosity/merit in trying to conclusively pin down what produces a "working" unit vs a crashy/buggy unit and/or the root cause. It irks my brain to think there are, for lack of a better word, 'golden' units vs general units, like a math formula that shouldn't work but does, and consistently. Does the flasher/flashing method matter? This might be a common denominator that would explain the large majority of people having issues vs the handful of people without this issue. I don't personally know if the initial flash on this platform matters, but my understanding is that the initial firmware flash is different from a firmware upgrade.
Probably nothing but I think that covers all the possible variations my consistently working build/setup could have from majority of everyone else's builds. |
I’ve been reading tonight, and this seems like something to consider given that it’s a further optimization (with a focus on ram) vs what thenewwazoo attempted in the other branch: https://www.blessuisse.net/?_=%2FJKoss2%2FArduino-HomeKit-ESP8266%23KJWqMdlUlBnsIvkdRR%2BuhIT4 Some interesting reading there as well within the readme… |
It may be that the appearance of reliability depends on how much HomeKit network traffic there is, or how many hubs you have. As I mentioned in an earlier post, I have over 50 appliances and five devices that can act as a hub. My ratgdo devices would consistently go non-responsive in two or three minutes. That is about the same time it would take a hub to notice that the device had gone away. It is certainly possible that the flashing method adds enough memory usage to make the problem worse. It may also be that the number of hubs it has to talk to and register with taxes memory or makes a memory leak worse. I have at least 15 devices that run on ESP8266 hardware that work fine. The firmware running on them uses the SDK framework rather than the Arduino framework. |
Following up to some of the earlier messages on this issue: |
This bug report is worth reading... esp8266/Arduino#8722 |
I mention above bug report because I have seen the same in ratgdo...
|
Some interesting tidbits from referenced discussion:
Some thoughts:
|
@iyerusad new h/w isn't going to solve for existing boards in the field. But I do like idea of reporting available memory in the web page. What is the right function to use to find that? I see the HomeKit code uses Thanks. |
@iyerusad I have added reporting of free heap size to the user interface in #127, so if that gets merged then it will display. In a few hours of observation I am consistently seeing around 25KB of free heap, which feels okay. It displays at the bottom of the web page. I also looked at tracking stack usage; the ESP8266 apparently has a 4KB stack. When I did this, however, the stack usage (which, remember, is at the point I measure, and not the maximum any other function could have used) was in the hundreds of bytes, not KB. It never changed. I don't know any way to find out what the highest stack usage has been at any point in time. Given that the info was not useful, I did not add it to the user interface. |
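If the free heap figure does end up on the web page, one way to test the memory-leak theory is to note the value at a few different uptimes and check whether it trends downward. Below is a minimal sketch of that calculation using an ordinary least-squares slope. The function name and the (uptime, free heap) sample format are assumptions for illustration; nothing here comes from #127.

```python
def heap_trend_bytes_per_hour(samples):
    """Least-squares slope of free heap over uptime, in bytes per hour.

    samples: list of (uptime_seconds, free_heap_bytes) pairs noted by hand.
    A clearly negative result suggests memory is being lost over time.
    Hypothetical helper for illustration only.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_h = sum(h for _, h in samples) / n
    numerator = sum((t - mean_t) * (h - mean_h) for t, h in samples)
    denominator = sum((t - mean_t) ** 2 for t, _ in samples)
    if denominator == 0:
        raise ValueError("samples must span more than one uptime value")
    return numerator / denominator * 3600  # per-second slope scaled to per-hour
```

A slope close to zero over several days would argue against a steady leak. A steady loss would give a rough estimate of how long the device can run before the heap is exhausted.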
Now, having said what I said above about monitoring stack size... looking at the crash dump I posted... 0x3fffffa0 - 0x3fffef00 = 0x10A0, which is 4256 bytes, and that exceeds 4KB. So deep in |
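The subtraction above is easy to double-check. A minimal sketch, using only the two addresses quoted from the crash dump and the 4KB stack size mentioned earlier in the thread (the variable names are just labels for this example):

```python
HIGH_ADDR = 0x3FFFFFA0    # higher address quoted from the crash dump
LOW_ADDR = 0x3FFFEF00     # lower address quoted from the crash dump
STACK_LIMIT = 4 * 1024    # the 4KB stack size mentioned in the thread

span = HIGH_ADDR - LOW_ADDR       # bytes between the two addresses
overrun = span - STACK_LIMIT      # how far the span exceeds 4KB

print(f"span: {span} bytes (0x{span:X}), over 4KB by {overrun} bytes")
# prints: span: 4256 bytes (0x10A0), over 4KB by 160 bytes
```

So the span between the two addresses is 160 bytes larger than a 4KB stack, which matches the figure in the comment.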
Anything is possible, but I am not touching the code right now while I run a long-term test of #127 to see if I can last more than a few days without it crashing. |
Fixed in #129 |
I thought I had this same issue, but it was because I had three HomeKit hubs (two HomePod minis and an AppleTV). The one closest to my ratgdo was on a different network. If you have trouble with "no response" be sure to check all your hubs and ensure they are on the same network! |
I can successfully add the ratgdo to HomeKit, and everything works great, but after a day or so, it shows as "No Response" until I reboot the device.