Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESP32 reboot #44

Closed
caveman2003 opened this issue Dec 27, 2023 · 11 comments
Closed

ESP32 reboot #44

caveman2003 opened this issue Dec 27, 2023 · 11 comments

Comments

@caveman2003
Copy link

Hi all,

finaly got a tool that work with my lamps :-) ... THANKS !! i have 84 EGLOSpot 120 (product_id: 0x96)

unfortunately, the ESP32 (tested with 2 boards: wemos_d1_mini32 and az-delivery-devkit-v4) and framework: esp-idf reboot few times then it can happen that it stops... when most of the LED are on it can happen faster ... with a few it may take longer.
Do i have too many devices ? is there any limit? or do i need an ESP with more memory?

I see the following:

[15:52:13][D][mesh_device:123]: Request info again for 13381
[15:52:13][D][mesh_device:123]: Request info again for 4996
[15:52:18]E (233064) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
[15:52:18]E (233064) task_wdt: - loopTask (CPU 1)
[15:52:18]E (233064) task_wdt: Tasks currently running:
[15:52:18]E (233064) task_wdt: CPU 0: IDLE
[15:52:18]E (233064) task_wdt: CPU 1: IDLE
[15:52:18]E (233064) task_wdt: Aborting.
[15:52:18]
[15:52:18]abort() was called at PC 0x4012a7e8 on core 0
[15:52:18]
[15:52:18]
[15:52:18]Backtrace: 0x40082ace:0x3ffbe3c0 0x40091611:0x3ffbe3e0 0x4009754e:0x3ffbe400 0x4012a7e8:0x3ffbe470 0x400834a1:0x3ffbe490 0x401b2227:0x3ffbc6a0 0x4012af39:0x3ffbc6c0 0x400929e8:0x3ffbc6e0
WARNING Found stack trace! Trying to decode it
WARNING Decoded 0x40082ace: panic_abort at /config/.esphome/platformio/packages/framework-espidf/components/esp_system/panic.c:408
WARNING Decoded 0x40091611: esp_system_abort at /config/.esphome/platformio/packages/framework-espidf/components/esp_system/esp_system.c:137
WARNING Decoded 0x4009754e: abort at /config/.esphome/platformio/packages/framework-espidf/components/newlib/abort.c:46
WARNING Decoded 0x4012a7e8: task_wdt_isr at /config/.esphome/platformio/packages/framework-espidf/components/esp_system/task_wdt.c:176 (discriminator 3)
WARNING Decoded 0x400834a1: _xt_lowint1 at /config/.esphome/platformio/packages/framework-espidf/components/freertos/port/xtensa/xtensa_vectors.S:1114
WARNING Decoded 0x401b2227: cpu_ll_waiti at /config/.esphome/platformio/packages/framework-espidf/components/hal/esp32/include/hal/cpu_ll.h:183
(inlined by) esp_pm_impl_waiti at /config/.esphome/platformio/packages/framework-espidf/components/esp_pm/pm_impl.c:830
WARNING Decoded 0x4012af39: esp_vApplicationIdleHook at /config/.esphome/platformio/packages/framework-espidf/components/esp_system/freertos_hooks.c:63
WARNING Decoded 0x400929e8: prvIdleTask at /config/.esphome/platformio/packages/framework-espidf/components/freertos/tasks.c:3987
[15:52:18]
[15:52:18]
[15:52:18]
[15:52:18]
[15:52:18]ELF file SHA256: 1a2a3a8426cab745
[15:52:18]
[15:52:18]Rebooting...
[15:52:18]ets Jul 29 2019 12:21:46
[15:52:18]
[15:52:18]rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
[15:52:18]configsip: 0, SPIWP:0xee
[15:52:18]clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
[15:52:18]mode:DIO, clock div:2
[15:52:18]load:0x3fff0030,len:6608
[15:52:18]load:0x40078000,len:15060
[15:52:18]ho 0 tail 12 room 4
[15:52:18]load:0x40080400,len:3816
[15:52:18]entry 0x40080698
[15:52:18]I (29) boot: ESP-IDF 4.4.5 2nd stage bootloader
[15:52:18]I (29) boot: compile time 14:43:44
[15:52:18]I (29) boot: chip revision: v3.0
[15:52:18]I (32) boot.esp32: SPI Speed : 40MHz
[15:52:18]I (37) boot.esp32: SPI Mode : DIO
[15:52:18]I (41) boot.esp32: SPI Flash Size : 4MB
[15:52:18]I (46) boot: Enabling RNG early entropy source...
[15:52:18]I (51) boot: Partition Table:
[15:52:18]I (55) boot: ## Label Usage Type ST Offset Length
[15:52:18]I (62) boot: 0 otadata OTA data 01 00 00009000 00002000
[15:52:18]I (69) boot: 1 phy_init RF data 01 01 0000b000 00001000
[15:52:18]I (77) boot: 2 app0 OTA app 00 10 00010000 001c0000
[15:52:18]I (84) boot: 3 app1 OTA app 00 11 001d0000 001c0000
[15:52:18]I (92) boot: 4 nvs WiFi data 01 02 00390000 0006d000
[15:52:18]I (99) boot: End of partition table
[15:52:18]I (104) esp_image: segment 0: paddr=00010020 vaddr=3f400020 size=32b10h (207632) map
[15:52:18]I (187) esp_image: segment 1: paddr=00042b38 vaddr=3ffbdb60 size=04bd4h ( 19412) load
[15:52:18]I (195) esp_image: segment 2: paddr=00047714 vaddr=40080000 size=08904h ( 35076) load
[15:52:18]I (210) esp_image: segment 3: paddr=00050020 vaddr=400d0020 size=e55c0h (939456) map
[15:52:19]I (550) esp_image: segment 4: paddr=001355e8 vaddr=40088904 size=148dch ( 84188) load
[15:52:19]I (599) boot: Loaded app from partition at offset 0x10000
[15:52:19]I (599) boot: Disabling RNG early entropy source...
[15:52:19]I (611) cpu_start: Pro cpu up.
[15:52:19]I (611) cpu_start: Starting app cpu, entry point is 0x40082180
[15:52:19]I (597) cpu_start: App cpu up.
[15:52:19]I (627) cpu_start: Pro cpu start user code
[15:52:19]I (627) cpu_start: cpu freq: 160000000
[15:52:19]I (627) cpu_start: Application information:
[15:52:19]I (632) cpu_start: Project name: awox-ble-mesh-hub
[15:52:19]I (638) cpu_start: App version: 2023.12.5
[15:52:19]I (643) cpu_start: Compile time: Dec 27 2023 14:42:49
[15:52:19]I (649) cpu_start: ELF file SHA256: 1a2a3a8426cab745...
[15:52:19]I (655) cpu_start: ESP-IDF: 4.4.5
[15:52:19]I (660) cpu_start: Min chip rev: v0.0
[15:52:19]I (664) cpu_start: Max chip rev: v3.99
[15:52:19]I (669) cpu_start: Chip rev: v3.0
[15:52:19]I (674) heap_init: Initializing. RAM available for dynamic allocation:
[15:52:19]I (681) heap_init: At 3FFAFF10 len 000000F0 (0 KiB): DRAM
[15:52:19]I (687) heap_init: At 3FFB6388 len 00001C78 (7 KiB): DRAM
[15:52:19]I (693) heap_init: At 3FFB9A20 len 00004108 (16 KiB): DRAM
[15:52:19]I (699) heap_init: At 3FFCAF80 len 00015080 (84 KiB): DRAM
[15:52:19]I (705) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
[15:52:19]I (712) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
[15:52:19]I (718) heap_init: At 4009D1E0 len 00002E20 (11 KiB): IRAM
[15:52:19]I (726) spi_flash: detected chip: generic
[15:52:19]I (729) spi_flash: flash io: dio
[15:52:19]I (735) cpu_start: Starting scheduler on PRO CPU.
[15:52:19]I (0) cpu_start: Starting scheduler on APP CPU.
[15:52:19][I][logger:355]: Log initialized
[15:52:19][C][ota:473]: There have been 10 suspected unsuccessful boot attempts.
[15:52:19][D][esp32.preferences:114]: Saving 1 preferences to flash...
[15:52:19][D][esp32.preferences:143]: Saving 1 preferences to flash: 0 cached, 1 written, 0 failed
[15:52:19][E][ota:480]: Boot loop detected. Proceeding to safe mode.
[15:52:19][I][app:029]: Running through setup()...

config file used (here with az-delivery board):

esphome: name: awox-ble-mesh-hub friendly_name: "AwoX BLE mesh hub" esp32: board: az-delivery-devkit-v4 framework: type: esp-idf # status_led: # Match this with the board your using # pin: GPIO2 # Enable logging logger: #level: VERY_VERBOSE # debug: # update_interval: 5s # text_sensor: # - platform: debug # reset_reason: # name: "Reset Reason" # sensor: # - platform: debug # free: # name: "Heap Free" # loop_time: # name: "Loop Time" mqtt: broker: "192.168.10.235" username: "XXXXXXXX" password: "XXXXXXXX" # Disable Home Assistant API to prevent frequent MQTT disconnections #api: ota: password: "XXXXXXXX" wifi: ssid: "XXXXXXXX" password: "XXXXXXXX" external_components: - source: type: git url: https://github.com/fsaris/EspHome-AwoX-BLE-mesh-hub # ref: dev # - source: # type: local # path: components esp32_ble_tracker: scan_parameters: active: false awox_mesh: mesh_name: XXXXXXXX mesh_password: XXXXXXXX address_prefix: A4:C1 device_info: # Example device type, see device info in HomeAssistant or MQTT message to find the 'product_id' - product_id: 0x32 device_type: RGB manufacturer: EGLO name: Spot 120 model: EGLOSpot 120/w icon: mdi:wall-sconce-flat - product_id: 0x96 device_type: RGB manufacturer: EGLO name: Spot 120 model: EGLOSpot 120/w icon: mdi:wall-sconce-flat
@fsaris
Copy link
Owner

fsaris commented Dec 28, 2023

It probably indeed a memory issue. Can you see when it reboots?

Maybe you can provide some more logging?

@caveman2003
Copy link
Author

Hi Frans,
thanks for your replay :-)

i increased the logging to:

logger:
level: VERY_VERBOSE

also i started to use picocom to log to file ... but i need to wait.

Last night seems that the ESP did not stop, but i do not know if it have rebooted or not ...

My use case is not to access all 80+ LEDs but only a few of them is this possible with:

address_prefix: A4:C1

by example:

address_prefix: A4:C1:38:54

or even:

address_prefix: A4:C1:38:3

?

@fsaris
Copy link
Owner

fsaris commented Dec 28, 2023

The address_prefix option is only there to limit what addresses to connect. Because the lights form a mesh only connecting to 1 device is enough to control (and get the status) of all lights in the mesh (of nodes that are in reach of each other).

So it will not limit the number of lights that are "controlled".

In general the system shouldn't be needing that much memory, but maybe when it's building a lot of mqtt messages when sending out the first device info message for each device that gets recognized.

@caveman2003
Copy link
Author

Hi Frans

before sending many kB of logfile i have an question regarding "status" and "connected" (published in mqtt). I suppose "status" means the device status and "connected" means that the device is connected to ble:

grafik

red = offline, green = online .... :-)

but if i am disconnected from ble why do i get informations about the lamps ?

Jan 7 13:50:02 nasdlna ESC[0;38m[VV][scheduler:032]: set_timeout(name='scan', timeout=600000)ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;37m[V][esp32_ble:314]: (BLE) gap_event_handler - 2ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;37m[V][esp32_ble:314]: (BLE) gap_event_handler - 7ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;37m[V][esp32_ble:314]: (BLE) gap_event_handler - 3ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;37m[V][esp32_ble:314]: (BLE) gap_event_handler - 3ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;38m[VV][esp32_ble_tracker:423]: Parse Result:ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;38m[VV][esp32_ble_tracker:440]: Address: A4:C1:38:10:82:83 (PUBLIC)ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;38m[VV][esp32_ble_tracker:442]: RSSI: -88ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;38m[VV][esp32_ble_tracker:443]: Name: 'xwXAPIwy'ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;38m[VV][esp32_ble_tracker:451]: Ad Flag: 5ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;38m[VV][esp32_ble_tracker:454]: Service UUID: 0x1800ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;38m[VV][esp32_ble_tracker:457]: Manufacturer data: 96.10.83.82.10.38 (6)ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;38m[VV][esp32_ble_tracker:473]: Adv data: 02.01.05.03.02.00.18.09.09.78.77.58.41.50.49.77.79.09.FF.60.01.96.10.83.82.10.38 (27)ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;37m[V][AwoxMesh:043]: Found Awox device xwXAPIwy - A4:C1:38:10:82:83. RSSI: -88 dB (total devices: 1)ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;38m[VV][esp32_ble_tracker:423]: Parse Result:ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;38m[VV][esp32_ble_tracker:440]: Address: 60:CC:1E:96:4D:A8 (RANDOM)ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;38m[VV][esp32_ble_tracker:442]: RSSI: -97ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;38m[VV][esp32_ble_tracker:443]: Name: ''ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;38m[VV][esp32_ble_tracker:454]: Service UUID: 0xFEF3ESC[0m
Jan 7 13:50:02 nasdlna ESC[0;38m[VV][esp32_ble_tracker:473]: Adv data: 03.03.F3.FEESC[0m

also actual if i try to turn on a lamp using mqtt (awox-ble-mesh-hub/10523/command, {"state":"ON"}) it do not work ... and this explain the disconnect ...

i am pretty sure if i reboot the ESP32 then i can turn the lamp on ... is there a way to force connecting ?

regards Gerhard

@caveman2003
Copy link
Author

caveman2003 commented Jan 7, 2024

btw... i see:

Jan 7 13:55:46 nasdlna [W][esp32_ble_client:139]: [0] [A4:C1:38:5E:2D:38] Connection failed, status=133 Jan 7 13:56:53 nasdlna [VV][esp-idf:000]: E (52280145) BT_APPL: Connection open failure Jan 7 13:56:53 nasdlna [W][esp32_ble_client:139]: [0] [A4:C1:38:5E:2D:38] Connection failed, status=133 Jan 7 13:56:53 nasdlna [W][esp32_ble_client:139]: [0] [A4:C1:38:5E:2D:38] Connection failed, status=133 Jan 7 13:56:53 nasdlna [VV][esp-idf:000]: E (52283525) BT_APPL: Connection open failure Jan 7 13:56:53 nasdlna [W][esp32_ble_client:139]: [0] [A4:C1:38:5E:2D:38] Connection failed, status=133 Jan 7 13:56:54 nasdlna [W][esp32_ble_client:139]: [0] [A4:C1:38:5E:2D:38] Connection failed, status=133

... not sure it can be that this device is out of range..

the device i try to turn on/off is in range and i do not see errors except that it do not turn on :-) ...

Jan 7 14:01:59 nasdlna [VV][esp32_ble_tracker:440]: Address: A4:C1:38:58:51:A4 (PUBLIC)
Jan 7 14:01:59 nasdlna [V][AwoxMesh:043]: Found Awox device xwXAPIwy - A4:C1:38:58:51:A4. RSSI: -68 dB (total devices: 8)
Jan 7 14:01:59 nasdlna [VV][esp32_ble_tracker:440]: Address: A4:C1:38:58:51:A4 (PUBLIC)
Jan 7 14:01:59 nasdlna [V][AwoxMesh:043]: Found Awox device xwXAPIwy - A4:C1:38:58:51:A4. RSSI: -82 dB (total devices: 8)
Jan 7 14:01:59 nasdlna [D][AwoxMesh:058]: Available device A4:C1:38:58:51:A4 => rssi: -82
Jan 7 14:02:44 nasdlna [VV][esp32_ble_tracker:440]: Address: A4:C1:38:58:51:A4 (PUBLIC)
Jan 7 14:02:44 nasdlna [V][AwoxMesh:043]: Found Awox device xwXAPIwy - A4:C1:38:58:51:A4. RSSI: -68 dB (total devices: 1)
Jan 7 14:02:44 nasdlna [D][AwoxMesh:058]: Available device A4:C1:38:58:51:A4 => rssi: -68
Jan 7 14:02:44 nasdlna [I][AwoxMesh:062]: Try to connect A4:C1:38:58:51:A4 => rssi: -68
Jan 7 14:02:44 nasdlna [D][esp32_ble_client:053]: [0] [A4:C1:38:58:51:A4] Found device
Jan 7 14:02:44 nasdlna [I][esp32_ble_client:069]: [0] [A4:C1:38:58:51:A4] 0x00 Attempting BLE connection
Jan 7 14:03:06 nasdlna [I][AwoxMesh:070]: Failed to connect A4:C1:38:58:51:A4 => rssi: -68
Jan 7 14:03:06 nasdlna [I][esp32_ble_client:086]: [0] [A4:C1:38:58:51:A4] Disconnecting.
Jan 7 14:04:02 nasdlna [VV][esp32_ble_tracker:440]: Address: A4:C1:38:58:51:A4 (PUBLIC)
Jan 7 14:04:02 nasdlna [V][AwoxMesh:043]: Found Awox device xwXAPIwy - A4:C1:38:58:51:A4. RSSI: -67 dB (total devices: 1)
Jan 7 14:04:02 nasdlna [D][AwoxMesh:058]: Available device A4:C1:38:58:51:A4 => rssi: -67
Jan 7 14:04:02 nasdlna [I][AwoxMesh:062]: Try to connect A4:C1:38:58:51:A4 => rssi: -67
Jan 7 14:04:02 nasdlna [D][esp32_ble_client:053]: [0] [A4:C1:38:58:51:A4] Found device
Jan 7 14:04:02 nasdlna [I][esp32_ble_client:069]: [0] [A4:C1:38:58:51:A4] 0x00 Attempting BLE connection
Jan 7 14:04:24 nasdlna [I][AwoxMesh:070]: Failed to connect A4:C1:38:58:51:A4 => rssi: -67
Jan 7 14:04:24 nasdlna [I][esp32_ble_client:086]: [0] [A4:C1:38:58:51:A4] Disconnecting.
Jan 7 14:04:24 nasdlna [VV][esp32_ble_tracker:440]: Address: A4:C1:38:58:51:A4 (PUBLIC)
Jan 7 14:04:24 nasdlna [V][AwoxMesh:043]: Found Awox device xwXAPIwy - A4:C1:38:58:51:A4. RSSI: -80 dB (total devices: 4)
Jan 7 14:04:25 nasdlna [VV][esp32_ble_tracker:440]: Address: A4:C1:38:58:51:A4 (PUBLIC)
Jan 7 14:04:25 nasdlna [V][AwoxMesh:043]: Found Awox device xwXAPIwy - A4:C1:38:58:51:A4. RSSI: -68 dB (total devices: 8)
Jan 7 14:04:27 nasdlna [VV][esp32_ble_tracker:440]: Address: A4:C1:38:58:51:A4 (PUBLIC)
Jan 7 14:04:27 nasdlna [V][AwoxMesh:043]: Found Awox device xwXAPIwy - A4:C1:38:58:51:A4. RSSI: -80 dB (total devices: 8)
Jan 7 14:04:29 nasdlna [VV][esp32_ble_tracker:440]: Address: A4:C1:38:58:51:A4 (PUBLIC)
Jan 7 14:04:29 nasdlna [V][AwoxMesh:043]: Found Awox device xwXAPIwy - A4:C1:38:58:51:A4. RSSI: -68 dB (total devices: 8)

@fsaris
Copy link
Owner

fsaris commented Jan 7, 2024

before sending many kB of logfile i have an question regarding "status" and "connected" (published in mqtt). I suppose "status" means the device status and "connected" means that the device is connected to ble:

Correct and both need to be online to be able to control your light/device.

Connection failed, status=133

Generally, it happens when the device is out of range.

@caveman2003
Copy link
Author

Hi Frans,

after reboot i see:

grafik

and i wonder about the fact that "status" is 5min online followed by 5min offline then again 5 min online and so on ... this is very strange... any ideea what that can be?

the "connected" is more irregular..... the ESP32 is 2m from the LED i need to manage... ok this can be ...

@caveman2003
Copy link
Author

mqtt.log
last 1,5h log

@fsaris
Copy link
Owner

fsaris commented Jan 8, 2024

The device is able to connect but reboots after a while, probably due to memory issues.

After finding +/- 45 devices (even when not online the messages about offline devices are published by the online devices) it looks to reboot.

Maybe disable the verbose logging and turn on some of the debug sensors to check memory use etc.

logger:
  #level: VERY_VERBOSE
debug:
   update_interval: 5s

text_sensor:
   - platform: debug
     reset_reason:
      name: "Reset Reason"

sensor:
   - platform: debug
     free:
       name: "Heap Free"
     loop_time:
       name: "Loop Time"

@fsaris
Copy link
Owner

fsaris commented Jan 10, 2024

@caveman2003 maybe you can try the dev branch. That version had some memory improvements.

external_components:
  - source:
      type: git
      url: https://github.com/fsaris/EspHome-AwoX-BLE-mesh-hub
      ref: dev

awox_mesh:
  mesh_name: !secret mesh_name
  mesh_password: !secret mesh_password

  # Optional options
  # address_prefix: A4:C1
  # min_rssi: -90 # to prevent that the mesh tries to connect to devices that are out of range
  max_connections: 1 # max 3 connections

  device_info:
    - product_id: 0x32
      device_type: RGB
      manufacturer: EGLO
      name: Spot 120
      model: EGLOSpot 120/w
      icon: mdi:wall-sconce-flat
    - product_id: 0x96
      device_type: RGB
      manufacturer: EGLO
      name: Spot 120
      model: EGLOSpot 120/w
      icon: mdi:wall-sconce-flat
  

When switching to the dev branch it could be you need to clear the build files (click "Clear Build Files" on the dashboard) due to file name changes.

@fsaris
Copy link
Owner

fsaris commented Jan 27, 2024

It looks that you are running into the memory limits of a ESP32.

Maybe that it will help once HA API is supported instead (#59) of only MQTT. And you could also limit the devices it tries to "process" using the allowed_mesh_ids: and allowed_mac_addresses: config options.

@fsaris fsaris closed this as completed Jan 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants