-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
How to debug issues? #24
Comments
Unfortunately, the only way to get any feedback from the device right now is serial logs. I have a sneaking suspicion that the ESP8266 devices are running out of RAM when they get a whole bunch of connections from various HomeKit hubs and clients while it's also trying to do pairing. I've seen this fail a couple of times but haven't been able to reproduce it reliably. |
Ok, FWIW I only have one hub and 3 clients. |
@PaulWieland what're you using to power your homekit device? edit: more specifically, I'm asking what the output power rating is. |
Running into the same thing--times out when adding the accessory. I'm about 0/10 so far. I do have LOTS of stuff in my HomeKit...which might be relevant to the hypothesis its lots of connections during the pairing. Edit: try 11 worked! |
Wondering if this is related to the wifi instability that was mentioned in Discord. When I run a continuous ping on the HomeKit firmware I see significant variance in latency and packet loss / timeouts. When I reflash with the ESPHome firmware and run the same ping, there is minimal latency and no packet loss / timeouts. |
Connecting a USB cable to collect serial logs is not ideal (especially when the log view is flaky for unknown reasons). And trying to collect logs after-the-fact is probably going to be less successful. What I'd like is a rolling log file in flash that can be read back out after the fact. @PaulWieland, the ESP8266 part has 1 MB flash, right? What's the default LittleFS partition space, do you know? Even just alternately writing to a pair of log files and clobbering the older one, or something similarly simplistic, would suffice to give a bit of persistence without making log truncation a hard problem to solve, but perhaps there's something off-the-shelf that's available? |
I'm having the same issues as reported in #30, specifically after setting up the device in HomeKit it will work properly for about 1-2 hours (need to confirm exactly how long this takes) but then the Home app will show that the ratgdo accessory is not responding. Adding the accessory to my home has been painless so far (though one time I did have to reboot ratgdo first). Even after it stops responding in HomeKit, I can still access the ratgdo web interface and use that to reset it. After that it seems to be responsive again with HomeKit, but only for a short time (<2 hrs). |
@usaplanb if you could capture serial output from the device for an entire "cycle" that would be hugely useful. the web-based log viewer is flaky, so it'd require using |
@thenewwazoo I wouldn't say it's within my abilities but after enough searching I think I figured it out. My first attempt I used the default baud rate which meant I was staring at random characters for 5 minutes wondering if that was correct, and of course while that was happening it went unresponsive in my Home app, so I missed the thing I was trying to see! Then I rebooted it and used the correct baud rate for logging and I'm getting what appear to be correct logs (i.e. readable status updates). Of course since the reboot it's been running for a couple hours trouble-free with my laptop on top of the GDO 🙄. Hopefully the issue reoccurs soon and I can grab the file and share it here. I'm running version 0.2.2 FWIW, but I don't think any of the changes since then would be likely to resolve this issue. In case it helps anyone else who has virtually no programming ability, here's what I did to log on my Windows machine:
And that should be enough to get it to start saving the logs. Now I'm just waiting for something to break, but apparently it'll work reliably as long as you're watching it closely. |
Might be able to set up something similar to what exists for serial (PC-hosted log receiver) but use UDP instead. Seems browsers can't receive UDP yet, so would need a different approach on PC end. Easy enough to do on the ESP though... https://arduino-esp8266.readthedocs.io/en/latest/esp8266wifi/udp-examples.html#complete-sketch Alternatively, have a generic function that will use MQTT, UDP, or whatever alternative is configured. MQTT seems to rarely be the area needing debugging. |
+1 …same here., though I haven’t tracked the amount of time it takes for it to go unresponsive. Running v0.6.0, HomeKit shows no response. Can connect via WiFi to reboot ratGDO and it comes back on HomeKit within seconds. update: it’s been over a week and I haven’t experienced the unresponsive thing again. I did realize I had a couple of UniFi AP FW updates which might have coincided with my couple of previous scenarios where ratGDO went unresponsive in HK (while all my other HK accessories remained responsive & properly functional) update2: as mentioned after reading blakej's comment further below, #24 (comment) - seems like maybe the reason why I never experienced the unresponsive thing again was that I also locked my ratGDO to one AP and prevented it from roaming. |
I am also having this issue. |
Okay, adding my little +1 here. Running food for some time and then, no answer. Nevertheless, I can't even ping or connect to the webpage to reboot. But I can see that my device is well connected to my WiFi network in my Unifi console. I have a mesh network, and I wonder if the ratGDO only try to connect to the same channel it has been set on, not searching only for the network name. Not sure if it is related, but could be a clue. |
I'd love to help test, but as I have a sec v1 + 889LM and am hoping/wishing/waiting for sec v1 support :) I followed Rachio's attempts to fix their sprinkler system's homekit going into no response state, and a couple of other projects. From what I had seen, this is generally caused by mDNS, although that service is not usually the root cause. I tried looking at the homkit-ratgdo code, but the mdns / arduino portion being used isn't published unless I missed it somehow. Mixiaoxiao/Arduino-HomeKit-ESP8266#139
I'm not sure if this is exactly our issue here (if it is, there's a patch for it in the above thread), but the root cause could be quite similar. i.e. Mesh/Multi-AP/Wifi Reconnect triggered. |
Original developer here. I picked the Arduino-HomeKit-ESP8266 library basically because it was fast to get to market, but I'm quite unhappy with it (and the Arduino stack generally). I'm already basing this firmware off a fork because the developer has moved on to other projects, and it's looking more and more like I'd need to maintain it myself if I want to end up with a device that's suitably stable. Add to that the fact that the platformio project's support for ESP8266 is similarly lagging for whatever reasons. I don't regret basing these initial versions on the Arduino ecosystem, but this experience is confirming my belief that Arduino is great for prototyping but less great for a polished final product. As such, I'm currently (literally this moment) working on a version based on Espressif's esp-homekit-sdk library. Espressif has also focused their attention away from the ESP8266 and on to the ESP32 series, but I have hopes that the result will be a bit more stable. I just think that's going to be a better use of my (limited, volunteer) time than trying to beat into shape an abandoned library built on an outdated dependency stack. |
Count me as another multiple Unifi AP user, and also have three ratgdos. After repeatedly encountering the "No response" issues for each device, I isolated it to occurring anytime one of the ratgdos would roam between APs. From there, I could easily repeat the issue on demand by telling an AP to reconnect the client device. The device stays "no response" until rebooted (ratgdo web interface continues to works normal and responds to pings). Did some additional digging, including to the Arduino-HomeKit-ESP8266#139 issue @edrikk linked to above, and this issue is fixed in the main mrthiti/Arduino-HomeKit-ESP8266 master branch (merged from "fix-unable-reconnect-after-lost-connection" branch). The current platformio.ini file is referencing mrthiti/Arduino-HomeKit-ESP8266.git#fix-conflict-base64-file-name, but if the branch name is removed and the master branch used instead, everything works. The "fix-conflict-base64-file-name" branch as well as "fix-unable-reconnect-after-lost-connection" branch are both merged into master. After recompiling using the master branch, all of my ratgdos are staying online in HomeKit regardless of AP roaming / reconnects. @thenewwazoo unless there's an intentional reason for using the "fix-conflict-base64-file-name" branch specifically, I'd suggest swapping to master for now until the time intensive esp-homekit-sdk rewrite is completed. |
Ah - no wonder I stopped seeing the issue - I forgot that one thing I did was that I locked my ratGDO to one specific AP in my Unifi settings. Nice find @blakej ! |
Hah.... I had submitted a PR last night to pull in that commit. I closed it without merging though as there are still issues we need to debug. I guess it doesn't hurt to just go ahead and move to that for now as it will help |
locking to a specific access point is a great workaround. |
Here's my datapoint, though it doesn't add much to the above. I have persistent "No response" with ping response times/loss all over the place. I'm using a Eero mesh network and I cannot pin a client to an AP or satellite, but I can see that they aren't moving around the mesh. I also disabled "client steering" to make sure the network isn't pushing them around unnecessarily. Rebooting doesn't seem to help the situation. I'll wait for the release with the fix @blakej suggests or maybe try it myself. Unfortunately I discarded an old AP last week otherwise I would have set that up as a dedicated dumb network for them 😮💨 Congrats on this project, looking forward to getting my doors back up and running! |
Same issue and symptoms w/ Eero mesh and my new ratgdo flashed with Homekit firmware. Frustrating. |
I'm seeing a similar issue with my single AP Ruckus network. Ping is steady and web interface is always available. After rebooting it works a little while before becoming non responsive again. I disabled direct multicast on the AP with no change |
A lot of the issues reported here are either now fixed or tracked under more specific issues. So closing this for now. |
v1.2.0 now stores crash dumps to flash for debug purposes. |
Having an issue pairing with HomeKit and wonder the best way to trouble shoot / debug what's going on.
Then an error appears saying that the device could not be added.
I then started over/erased/repeated entire procedure. Failed a second time. But the third time it worked...
The text was updated successfully, but these errors were encountered: