Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug]: Why is listpayments so slow? #9069

Open
Impa10r opened this issue Sep 6, 2024 · 10 comments
Open

[bug]: Why is listpayments so slow? #9069

Impa10r opened this issue Sep 6, 2024 · 10 comments
Labels
bug Unintended code behaviour needs triage performance

Comments

@Impa10r
Copy link

Impa10r commented Sep 6, 2024

Background

I have only 209 payments, but it takes 43 seconds to list them:

$ time lncli listpayments | jq '.payments | length'
209

real    0m43.422s
user    0m0.188s
sys     0m0.142s

For comparison, pulling 349,464 of forwards takes less than a second:

$ time lncli forwardinghistory > nul

real    0m0.394s
user    0m0.082s
sys     0m0.058s

Your environment

  • version of lnd 0.18.0
  • which operating system (uname -a on *Nix) Umbrel 0.54 debian
  • version of btcd, bitcoind, or other backend bitcoind 27.0.0
  • any other relevant environment details
    image
@Impa10r Impa10r added bug Unintended code behaviour needs triage labels Sep 6, 2024
@Impa10r
Copy link
Author

Impa10r commented Sep 6, 2024

Update: excluding incomplete made my node unresponsive for 20 minutes and restarted LND!!!

$ time lncli listpayments --include_incomplete false | jq '.payments | length'
parse error: Invalid numeric literal at EOF at line 1, column 4

real    20m53.613s
user    0m5.974s
sys     0m21.513s

Nothing in LND log, just a restart after 16 minutes:

2024-09-06 21:40:17.332 [INF] WTCL: (taproot) Client stats: tasks(received=0 accepted=0 ineligible=0) sessions(acquired=0 exhausted=0)
2024-09-06 21:40:51.491 [INF] PEER: Peer(035e4ff418fc8b5554c5d9eea66396c227bd429a3251c8cbc711002ba215bfc226): unable to read message from peer: read next body: read tcp 10.9.9.9:9735->170.75.163.209:55350: i/o timeout
2024-09-06 21:40:57.571 [WRN] LNWL: Web API does not have a fee rate for target=3, using the fee rate for target=1 instead
2024-09-06 21:56:39.916 [WRN] LTND: Config 'feeurl' is deprecated, please remove it
2024-09-06 21:56:39.917 [WRN] LTND: Config 'bitcoin.active' is deprecated, please remove it
2024-09-06 21:56:39.924 [WRN] LTND: Config 'sweeper.batchwindowduration' is deprecated, please remove it
2024-09-06 21:56:39.927 [INF] LTND: Version: 0.18.0-beta commit=v0.18.0-beta, build=production, logging=default, debuglevel=info
2024-09-06 21:56:39.928 [INF] LTND: Active chain: Bitcoin (network=mainnet)

Syslog:

Sep  6 21:42:05 umbrel kernel: [3899170.755454] INFO: task kworker/2:1:16773 blocked for more than 120 seconds.
Sep  6 21:42:06 umbrel kernel: [3899170.755467]       Tainted: G         C        5.10.103-v8+ #1529
Sep  6 21:42:07 umbrel kernel: [3899170.755472] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep  6 21:42:09 umbrel kernel: [3899170.755477] task:kworker/2:1     state:D stack:    0 pid:16773 ppid:     2 flags:0x00000028
Sep  6 21:42:11 umbrel kernel: [3899170.755506] Workqueue: events_freezable mmc_rescan
Sep  6 21:42:12 umbrel kernel: [3899170.755515] Call trace:
Sep  6 21:42:14 umbrel kernel: [3899170.755524]  __switch_to+0xd4/0x128
Sep  6 21:42:15 umbrel kernel: [3899170.755532]  __schedule+0x2d0/0x818
Sep  6 21:42:15 umbrel kernel: [3899170.755537]  schedule+0x48/0x100
Sep  6 21:42:15 umbrel kernel: [3899170.755543]  __mmc_claim_host+0xbc/0x208
Sep  6 21:42:15 umbrel kernel: [3899170.755549]  mmc_get_card+0x3c/0x50
Sep  6 21:42:15 umbrel kernel: [3899170.755554]  mmc_sd_detect+0x28/0x98
Sep  6 21:42:15 umbrel kernel: [3899170.755560]  mmc_rescan+0xcc/0x3b8
Sep  6 21:42:15 umbrel kernel: [3899170.755566]  process_one_work+0x1f4/0x4d8
Sep  6 21:42:15 umbrel kernel: [3899170.755572]  worker_thread+0x50/0x480
Sep  6 21:42:15 umbrel kernel: [3899170.755579]  kthread+0x148/0x158
Sep  6 21:42:15 umbrel kernel: [3899170.755585]  ret_from_fork+0x10/0x30
Sep  6 21:46:20 umbrel CRON[9820]: (umbrel) CMD (/bin/sh /home/umbrel/ping.sh >> /home/umbrel/logs/ping.log)
Sep  6 21:48:08 umbrel kernel: [3899533.259244] INFO: task kworker/2:1:16773 blocked for more than 120 seconds.
Sep  6 21:48:12 umbrel kernel: [3899533.259257]       Tainted: G         C        5.10.103-v8+ #1529
Sep  6 21:48:14 umbrel kernel: [3899533.259262] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep  6 21:48:15 umbrel kernel: [3899533.259268] task:kworker/2:1     state:D stack:    0 pid:16773 ppid:     2 flags:0x00000028
Sep  6 21:48:15 umbrel kernel: [3899533.259295] Workqueue: events_freezable mmc_rescan
Sep  6 21:48:15 umbrel kernel: [3899533.259304] Call trace:
Sep  6 21:48:15 umbrel kernel: [3899533.259313]  __switch_to+0xd4/0x128
Sep  6 21:48:15 umbrel kernel: [3899533.259320]  __schedule+0x2d0/0x818
Sep  6 21:48:15 umbrel kernel: [3899533.259326]  schedule+0x48/0x100
Sep  6 21:48:15 umbrel kernel: [3899533.259332]  __mmc_claim_host+0xbc/0x208
Sep  6 21:48:15 umbrel kernel: [3899533.259337]  mmc_get_card+0x3c/0x50
Sep  6 21:48:15 umbrel kernel: [3899533.259343]  mmc_sd_detect+0x28/0x98
Sep  6 21:48:15 umbrel kernel: [3899533.259349]  mmc_rescan+0xcc/0x3b8
Sep  6 21:48:15 umbrel kernel: [3899533.259355]  process_one_work+0x1f4/0x4d8
Sep  6 21:48:15 umbrel kernel: [3899533.259360]  worker_thread+0x50/0x480
Sep  6 21:48:15 umbrel kernel: [3899533.259367]  kthread+0x148/0x158
Sep  6 21:48:15 umbrel kernel: [3899533.259372]  ret_from_fork+0x10/0x30
Sep  6 21:50:08 umbrel kernel: [3899654.088593] INFO: task kworker/2:1:16773 blocked for more than 241 seconds.
Sep  6 21:50:11 umbrel kernel: [3899654.088606]       Tainted: G         C        5.10.103-v8+ #1529
Sep  6 21:50:14 umbrel kernel: [3899654.088610] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep  6 21:50:16 umbrel kernel: [3899654.088616] task:kworker/2:1     state:D stack:    0 pid:16773 ppid:     2 flags:0x00000028
Sep  6 21:50:19 umbrel kernel: [3899654.088641] Workqueue: events_freezable mmc_rescan
Sep  6 21:50:20 umbrel kernel: [3899654.088650] Call trace:
Sep  6 21:50:22 umbrel kernel: [3899654.088659]  __switch_to+0xd4/0x128
Sep  6 21:50:24 umbrel kernel: [3899654.088666]  __schedule+0x2d0/0x818
Sep  6 21:50:27 umbrel kernel: [3899654.088671]  schedule+0x48/0x100
Sep  6 21:50:28 umbrel kernel: [3899654.088677]  __mmc_claim_host+0xbc/0x208
Sep  6 21:50:31 umbrel kernel: [3899654.088683]  mmc_get_card+0x3c/0x50
Sep  6 21:50:33 umbrel kernel: [3899654.088689]  mmc_sd_detect+0x28/0x98
Sep  6 21:50:35 umbrel kernel: [3899654.088694]  mmc_rescan+0xcc/0x3b8
Sep  6 21:50:38 umbrel kernel: [3899654.088700]  process_one_work+0x1f4/0x4d8
Sep  6 21:50:40 umbrel kernel: [3899654.088705]  worker_thread+0x50/0x480
Sep  6 21:50:41 umbrel kernel: [3899654.088712]  kthread+0x148/0x158
Sep  6 21:50:43 umbrel kernel: [3899654.088718]  ret_from_fork+0x10/0x30

@saubyk
Copy link
Collaborator

saubyk commented Sep 6, 2024

Hi @Impa10r what database do you have for LND? Is it regular bbolt (which is the default) or are you running postgres/sqlite?

@Impa10r
Copy link
Author

Impa10r commented Sep 6, 2024

Hi @Impa10r what database do you have for LND? Is it regular bbolt (which is the default) or are you running postgres/sqlite?

regular bolt, DB Size: 1.834 GB Total States Updates: 7,712,708

@Roasbeef
Copy link
Member

Roasbeef commented Sep 6, 2024

Will close #9015 in favor of this, as it's a bit more detailed.

@Impa10r
Copy link
Author

Impa10r commented Sep 6, 2024

Update 2: my testnet node with LND 0.18.0 does not suffer from this:

$ time lncli listpayments --include_incomplete false | jq '.payments | length'
2371

real    0m0,418s
user    0m0,168s
sys     0m0,488s

channel.db size 70mb

@Roasbeef
Copy link
Member

Roasbeef commented Sep 6, 2024

If you can obtain a CPU profile of a sample run and post the SVG or raw file here, that would be super helpful.

@Impa10r
Copy link
Author

Impa10r commented Sep 6, 2024

If you can obtain a CPU profile of a sample run and post the SVG or raw file here, that would be super helpful.

Mine runs in docker, not sure how to start it like that ((

@calvinrzachman
Copy link
Contributor

How do you spin up your docker container? You can forward a port on your host machine to a port inside the container using something like:

docker run -p 9736:9736 my-lnd-container

Then the curl command to localhost will hit the container.

Ahh, I've never used Umbrel, but looks like there's some information on how to modify lnd config here:

@Impa10r
Copy link
Author

Impa10r commented Sep 7, 2024

If you can obtain a CPU profile of a sample run and post the SVG or raw file here, that would be super helpful.

@Roasbeef taken while $ time lncli listpayments --include_incomplete false | jq '.payments | length'
was running in another terminal. After a minute my node became unresponsive again. In 22m LND restarted...

profile.txt

@Impa10r
Copy link
Author

Impa10r commented Sep 7, 2024

How do you spin up your docker container? You can forward a port on your host machine to a port inside the container using something like:

docker run -p 9736:9736 my-lnd-container

Then the curl command to localhost will hit the container.

Ahh, I've never used Umbrel, but looks like there's some information on how to modify lnd config here:

I used umbrel@umbrel:~ $ docker exec -it lightning_lnd_1 /bin/bash, then curl

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Unintended code behaviour needs triage performance
Projects
None yet
Development

No branches or pull requests

4 participants