Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move salsa cancellation into Drop for Server #74

Closed
wants to merge 1 commit into from

Conversation

the-mikedavis
Copy link
Contributor

@the-mikedavis the-mikedavis commented Jan 29, 2025

I'm not super familiar with the salsa internals but it seems as though if the database is modified between the receipt of request::Shutdown and notification::Exit then the Drop for Server deadlocks. This is fairly easy to trigger especially on a debug build by quitting an editor like neovim soon after starting the language server. Moving the cancellation request into Drop matches rust-analyzer and seems to avoid the deadlock - I believe because it closes the window where the database could be edited between cancel and drop.

Fixes #36

This fixes a potential deadlock if the language server is quit at a bad
time.
@the-mikedavis the-mikedavis changed the title server: Move salsa cancellation into Drop for Server Move salsa cancellation into Drop for Server Jan 29, 2025
@the-mikedavis
Copy link
Contributor Author

the-mikedavis commented Jan 29, 2025

Bah actually this is incorrect, drop isn't what's hanging. This change could be considered just to be closer to rust-analyzer but it doesn't solve #36 - I had other changes locally that should solve that.

@alanz
Copy link
Member

alanz commented Feb 3, 2025

FYI we will be migrating to the updated salsa later this quarter, when it lands in RA. See rust-lang/rust-analyzer#18964

@alanz
Copy link
Member

alanz commented Feb 3, 2025

And I think the way to deal with it is to forbid database updates after a shutdown is received. Let me take a look.

@alanz
Copy link
Member

alanz commented Feb 3, 2025

@the-mikedavis the way I see it is that we do not update the salsa database after we receive the shutdown request

if self.status == Status::ShuttingDown {
return Ok(());
}
let changed = self.process_changes_to_vfs_store();

self.process_changes_to_vfs_store(); is the only place we update the salsa database.

We do update the vfs store, based on document DidChange notifications, but that does not affect salsa.

@the-mikedavis
Copy link
Contributor Author

It looks like rust-analyzer is keeping its call to request_cancellation in Drop (for GlobalState which I believe is similar to Server) in that change. I'm not familiar with Salsa enough to know what that's intended to do. I tried tracking back the history of that change in the rust-analyzer codebase but I couldn't find an explanation of why it's there.

I should note that the deadlock doesn't appear to be in salsa itself but in the Drop for TaskPool for the cache_pool handle which blocks on an analysis. The backtrace for that thread is confusing as it seems to be stuck doing something that shouldn't be able to block.

Backtraces...

The main thread:

Thread 1 (Thread 0x7f1cfffa4340 (LWP 4011279) "elp"):
#0  0x00007f1d000b613d in syscall () from /nix/store/3bvxjkkmwlymr0fssczhgi39c3aj1l7i-glibc-2.40-36/lib/libc.so.6
#1  0x0000555fd5433e15 in std::sys::pal::unix::futex::futex_wait () at std/src/sys/pal/unix/futex.rs:67
#2  std::sys::sync::condvar::futex::Condvar::wait_optional_timeout () at std/src/sys/sync/condvar/futex.rs:50
#3  std::sys::sync::condvar::futex::Condvar::wait () at std/src/sys/sync/condvar/futex.rs:34
#4  0x0000555fd3c19e72 in std::sync::condvar::Condvar::wait<()> (self=0x7f1cffa4e260, guard=...) at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/sync/condvar.rs:192
#5  0x0000555fd3c1dbef in threadpool::ThreadPool::join (self=0x7ffd7d19e9f8) at /home/michael/.cargo/registry/src/index.crates.io-6f17d22bba15001f/threadpool-1.8.1/src/lib.rs:627
#6  0x0000555fd36c1cdf in elp::task_pool::{impl#1}::drop<elp::server::Task> (self=0x7ffd7d19e9e8) at crates/elp/src/task_pool.rs:75
#7  0x0000555fd36b0937 in core::ptr::drop_in_place<elp::task_pool::TaskPool<elp::server::Task>> () at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/ptr/mod.rs:574
#8  0x0000555fd36a3db7 in core::ptr::drop_in_place<elp::server::Handle<elp::task_pool::TaskPool<elp::server::Task>, crossbeam_channel::channel::Receiver<elp::server::Task>>> () at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/ptr/mod.rs:574
#9  0x0000555fd36a8183 in core::ptr::drop_in_place<elp::server::Server> () at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/ptr/mod.rs:574
#10 0x0000555fd36f6ff5 in elp::server::Server::main_loop (self=...) at crates/elp/src/server.rs:399
#11 0x0000555fd3216745 in elp::run_server (logger=..., query_config=elp_project_model::buck::BuckQueryConfig::Original) at crates/elp/src/bin/main.rs:179
#12 0x0000555fd3213e91 in elp::try_main (cli=..., args=...) at crates/elp/src/bin/main.rs:108
#13 0x0000555fd3212b45 in elp::main () at crates/elp/src/bin/main.rs:68

Most other threads are idle but thread 68:

#0  0x00007f1d00110789 in __memset_avx2_unaligned_erms () from /nix/store/3bvxjkkmwlymr0fssczhgi39c3aj1l7i-glibc-2.40-36/lib/libc.so.6
#1  0x0000555fd5088893 in core::intrinsics::write_bytes<u8> (dst=0x7f18511e7000, count=80) at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/intrinsics.rs:3499
#2  core::ptr::mut_ptr::{impl#0}::write_bytes<u8> (self=0x7f18511e7000, count=80) at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/ptr/mut_ptr.rs:1405
#3  hashbrown::raw::RawTableInner::fallible_with_capacity<alloc::alloc::Global> (alloc=0x7f1c4c6b75b8, table_layout=..., capacity=29, 
    fallibility=hashbrown::raw::Fallibility::Infallible) at /rust/deps/hashbrown-0.14.5/src/raw/mod.rs:1792
#4  0x0000555fd433edf9 in hashbrown::raw::RawTableInner::prepare_resize<alloc::alloc::Global> (self=0x7f1c4c6b7598, alloc=0x7f1c4c6b75b8, table_layout=..., capacity=29, 
    fallibility=hashbrown::raw::Fallibility::Infallible) at /rust/deps/hashbrown-0.14.5/src/raw/mod.rs:2864
#5  hashbrown::raw::RawTableInner::resize_inner<alloc::alloc::Global> (self=0x7f1c4c6b7598, alloc=0x7f1c4c6b75b8, capacity=29, hasher=..., 
    fallibility=hashbrown::raw::Fallibility::Infallible, layout=...) at /rust/deps/hashbrown-0.14.5/src/raw/mod.rs:3060
#6  hashbrown::raw::RawTableInner::reserve_rehash_inner<alloc::alloc::Global> (self=0x7f1c4c6b7598, alloc=0x7f1c4c6b75b8, additional=1, hasher=..., 
    fallibility=hashbrown::raw::Fallibility::Infallible, layout=..., drop=...) at /rust/deps/hashbrown-0.14.5/src/raw/mod.rs:2950
#7  hashbrown::raw::RawTable<(la_arena::Idx<hir::form_list::FunctionClause>, hir::module_data::FunctionClauseDef), alloc::alloc::Global>::reserve_rehash<(la_arena::Idx<hir::form_list::FunctionClause>, hir::module_data::FunctionClauseDef), alloc::alloc::Global, hashbrown::map::make_hasher::{closure_env#0}<la_arena::Idx<hir::form_list::FunctionClause>, hir::module_data::FunctionClauseDef, core::hash::BuildHasherDefault<fxhash::FxHasher>>> (self=0x7f1c4c6b7598, additional=1, hasher=..., fallibility=hashbrown::raw::Fallibility::Infallible)
    at /rust/deps/hashbrown-0.14.5/src/raw/mod.rs:1231
#8  0x0000555fd43810e5 in hashbrown::raw::RawTable<(la_arena::Idx<hir::form_list::FunctionClause>, hir::module_data::FunctionClauseDef), alloc::alloc::Global>::reserve<(la_arena::Idx<hir::form_list::FunctionClause>, hir::module_data::FunctionClauseDef), alloc::alloc::Global, hashbrown::map::make_hasher::{closure_env#0}<la_arena::Idx<hir::form_list::FunctionClause>, hir::module_data::FunctionClauseDef, core::hash::BuildHasherDefault<fxhash::FxHasher>>> (self=0x7f1c4c6b7598, additional=1, hasher=...)
    at /rust/deps/hashbrown-0.14.5/src/raw/mod.rs:1179
#9  0x0000555fd435f48e in hashbrown::raw::RawTable<(la_arena::Idx<hir::form_list::FunctionClause>, hir::module_data::FunctionClauseDef), alloc::alloc::Global>::find_or_find_insert_slot<(la_arena::Idx<hir::form_list::FunctionClause>, hir::module_data::FunctionClauseDef), alloc::alloc::Global, hashbrown::map::equivalent_key::{closure_env#0}<la_arena::Idx<hir::form_list::FunctionClause>, la_arena::Idx<hir::form_list::FunctionClause>, hir::module_data::FunctionClauseDef>, hashbrown::map::make_hasher::{closure_env#0}<la_arena::Idx<hir::form_list::FunctionClause>, hir::module_data::FunctionClauseDef, core::hash::BuildHasherDefault<fxhash::FxHasher>>> (self=0x7f1c4c6b7598, hash=16835915594115655756, eq=..., hasher=...)
    at /rust/deps/hashbrown-0.14.5/src/raw/mod.rs:1413
#10 0x0000555fd426da5b in hashbrown::map::HashMap<la_arena::Idx<hir::form_list::FunctionClause>, hir::module_data::FunctionClauseDef, core::hash::BuildHasherDefault<fxhash::FxHasher>, alloc::alloc::Global>::insert<la_arena::Idx<hir::form_list::FunctionClause>, hir::module_data::FunctionClauseDef, core::hash::BuildHasherDefault<fxhash::FxHasher>, alloc::alloc::Global> (self=0x7f1c4c6b7598, k=..., v=...) at /rust/deps/hashbrown-0.14.5/src/map.rs:1754
#11 0x0000555fd44aa8fa in std::collections::hash::map::HashMap<la_arena::Idx<hir::form_list::FunctionClause>, hir::module_data::FunctionClauseDef, core::hash::BuildHasherDefault<fxhash::FxHasher>>::insert<la_arena::Idx<hir::form_list::FunctionClause>, hir::module_data::FunctionClauseDef, core::hash::BuildHasherDefault<fxhash::FxHasher>> (self=0x7f1c4c6b7598, 
    k=..., v=<error reading variable: Cannot access memory at address 0x59>) at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/collections/hash/map.rs:1104
#12 0x0000555fd42824a6 in hir::def_map::DefMap::def_map_local_query (db=..., file_id=...) at crates/hir/src/def_map.rs:161
#13 0x0000555fd42cea46 in hir::db::{impl#136}::execute (db=..., file_id=...) at crates/hir/src/db.rs:160
#14 0x0000555fd4554ab7 in salsa::derived::slot::{impl#0}::read_upgrade::{closure#0}<hir::db::DefMapLocalQuery, salsa::derived::AlwaysMemoizeValue> ()
    at /home/michael/.cargo/registry/src/index.crates.io-6f17d22bba15001f/salsa-0.17.0-pre.2/src/derived/slot.rs:216
#15 0x0000555fd46235a0 in salsa::runtime::Runtime::execute_query_implementation<dyn hir::db::DefDatabase, alloc::sync::Arc<hir::def_map::DefMap, alloc::alloc::Global>, salsa::derived::slot::{impl#0}::read_upgrade::{closure_env#0}<hir::db::DefMapLocalQuery, salsa::derived::AlwaysMemoizeValue>> (self=0x7f1c4c6bbdd8, db=..., database_key_index=..., execute=...)
    at /home/michael/.cargo/registry/src/index.crates.io-6f17d22bba15001f/salsa-0.17.0-pre.2/src/runtime.rs:232
#16 0x0000555fd453a93f in salsa::derived::slot::Slot<hir::db::DefMapLocalQuery, salsa::derived::AlwaysMemoizeValue>::read_upgrade<hir::db::DefMapLocalQuery, salsa::derived::AlwaysMemoizeValue> (self=0x7f18511441e0, db=..., revision_now=...) at /home/michael/.cargo/registry/src/index.crates.io-6f17d22bba15001f/salsa-0.17.0-pre.2/src/derived/slot.rs:213
#17 0x0000555fd458f083 in salsa::derived::slot::Slot<hir::db::DefMapLocalQuery, salsa::derived::AlwaysMemoizeValue>::read<hir::db::DefMapLocalQuery, salsa::derived::AlwaysMemoizeValue> (self=0x7f18511441e0, db=...) at /home/michael/.cargo/registry/src/index.crates.io-6f17d22bba15001f/salsa-0.17.0-pre.2/src/derived/slot.rs:146
#18 0x0000555fd42e60e1 in salsa::derived::{impl#4}::try_fetch<hir::db::DefMapLocalQuery, salsa::derived::AlwaysMemoizeValue> (self=0x7f1cffa6ee30, db=..., key=0x7f1c4c6b9a2c)
    at /home/michael/.cargo/registry/src/index.crates.io-6f17d22bba15001f/salsa-0.17.0-pre.2/src/derived.rs:172
#19 0x0000555fd4601200 in salsa::QueryTable<hir::db::DefMapLocalQuery>::try_get<hir::db::DefMapLocalQuery> (self=0x7f1c4c6b9a98, key=...)
    at /home/michael/.cargo/registry/src/index.crates.io-6f17d22bba15001f/salsa-0.17.0-pre.2/src/lib.rs:426
#20 0x0000555fd45fe13d in salsa::QueryTable<hir::db::DefMapLocalQuery>::get<hir::db::DefMapLocalQuery> (self=0x7f1c4c6b9a98, key=...)
    at /home/michael/.cargo/registry/src/index.crates.io-6f17d22bba15001f/salsa-0.17.0-pre.2/src/lib.rs:421
#21 0x0000555fd42ce1f8 in hir::db::{impl#1}::def_map_local::__shim (db=..., file_id=...) at crates/hir/src/db.rs:54
#22 0x0000555fd408d19c in hir::db::{impl#1}::def_map_local<elp_ide_db::RootDatabase> (self=0x7f1c4c6bbdd0, file_id=...) at crates/hir/src/db.rs:54
#23 0x0000555fd4283cda in hir::def_map::DefMap::def_map_query (db=..., file_id=...) at crates/hir/src/def_map.rs:318
#24 0x0000555fd42ce8e6 in hir::db::{impl#131}::execute (db=..., file_id=...) at crates/hir/src/db.rs:155
#25 0x0000555fd4555647 in salsa::derived::slot::{impl#0}::read_upgrade::{closure#0}<hir::db::DefMapQuery, salsa::derived::AlwaysMemoizeValue> ()
    at /home/michael/.cargo/registry/src/index.crates.io-6f17d22bba15001f/salsa-0.17.0-pre.2/src/derived/slot.rs:216
#26 0x0000555fd4626d10 in salsa::runtime::Runtime::execute_query_implementation<dyn hir::db::DefDatabase, alloc::sync::Arc<hir::def_map::DefMap, alloc::alloc::Global>, salsa::derived::slot::{impl#0}::read_upgrade::{closure_env#0}<hir::db::DefMapQuery, salsa::derived::AlwaysMemoizeValue>> (self=0x7f1c4c6bbdd8, db=..., database_key_index=..., execute=...)
    at /home/michael/.cargo/registry/src/index.crates.io-6f17d22bba15001f/salsa-0.17.0-pre.2/src/runtime.rs:232
#27 0x0000555fd453ecef in salsa::derived::slot::Slot<hir::db::DefMapQuery, salsa::derived::AlwaysMemoizeValue>::read_upgrade<hir::db::DefMapQuery, salsa::derived::AlwaysMemoizeValue> (self=0x7f1851144170, db=..., revision_now=...) at /home/michael/.cargo/registry/src/index.crates.io-6f17d22bba15001f/salsa-0.17.0-pre.2/src/derived/slot.rs:213
#28 0x0000555fd458bb63 in salsa::derived::slot::Slot<hir::db::DefMapQuery, salsa::derived::AlwaysMemoizeValue>::read<hir::db::DefMapQuery, salsa::derived::AlwaysMemoizeValue> (
    self=0x7f1851144170, db=...) at /home/michael/.cargo/registry/src/index.crates.io-6f17d22bba15001f/salsa-0.17.0-pre.2/src/derived/slot.rs:146
#29 0x0000555fd42e6cb1 in salsa::derived::{impl#4}::try_fetch<hir::db::DefMapQuery, salsa::derived::AlwaysMemoizeValue> (self=0x7f1cffa6ed50, db=..., key=0x7f1c4c6bb79c)
    at /home/michael/.cargo/registry/src/index.crates.io-6f17d22bba15001f/salsa-0.17.0-pre.2/src/derived.rs:172
#30 0x0000555fd4600e20 in salsa::QueryTable<hir::db::DefMapQuery>::try_get<hir::db::DefMapQuery> (self=0x7f1c4c6bb808, key=...)
    at /home/michael/.cargo/registry/src/index.crates.io-6f17d22bba15001f/salsa-0.17.0-pre.2/src/lib.rs:426
#31 0x0000555fd45fdf4d in salsa::QueryTable<hir::db::DefMapQuery>::get<hir::db::DefMapQuery> (self=0x7f1c4c6bb808, key=...)
    at /home/michael/.cargo/registry/src/index.crates.io-6f17d22bba15001f/salsa-0.17.0-pre.2/src/lib.rs:421
#32 0x0000555fd42ce1b8 in hir::db::{impl#1}::def_map::__shim (db=..., file_id=...) at crates/hir/src/db.rs:54
#33 0x0000555fd408d56c in hir::db::{impl#1}::def_map<elp_ide_db::RootDatabase> (self=0x7f1c4c6bbdd0, file_id=...) at crates/hir/src/db.rs:54
#34 0x0000555fd3eceb56 in elp_ide::{impl#2}::def_map::{closure#0} (db=0x7f1c4c6bbdd0) at crates/ide/src/lib.rs:762
#35 0x0000555fd3ed07f4 in elp_ide::{impl#2}::with_db::{closure#0}<elp_ide::{impl#2}::def_map::{closure_env#0}, alloc::sync::Arc<hir::def_map::DefMap, alloc::alloc::Global>> ()
    at crates/ide/src/lib.rs:786
#36 0x0000555fd3ee61e7 in std::panicking::try::do_call<elp_ide::{impl#2}::with_db::{closure_env#0}<elp_ide::{impl#2}::def_map::{closure_env#0}, alloc::sync::Arc<hir::def_map::DefMap, alloc::alloc::Global>>, alloc::sync::Arc<hir::def_map::DefMap, alloc::alloc::Global>> (data=0x7f1c4c6bb950)
    at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panicking.rs:554
#37 0x0000555fd3e480cb in __rust_try ()
#38 0x0000555fd3e469a3 in std::panicking::try<alloc::sync::Arc<hir::def_map::DefMap, alloc::alloc::Global>, elp_ide::{impl#2}::with_db::{closure_env#0}<elp_ide::{impl#2}::def_map::{closure_env#0}, alloc::sync::Arc<hir::def_map::DefMap, alloc::alloc::Global>>> (f=...) at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panicking.rs:518
#39 std::panic::catch_unwind<elp_ide::{impl#2}::with_db::{closure_env#0}<elp_ide::{impl#2}::def_map::{closure_env#0}, alloc::sync::Arc<hir::def_map::DefMap, alloc::alloc::Global>>, alloc::sync::Arc<hir::def_map::DefMap, alloc::alloc::Global>> (f=...) at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panic.rs:345
#40 0x0000555fd3d98faf in salsa::Cancelled::catch<elp_ide::{impl#2}::with_db::{closure_env#0}<elp_ide::{impl#2}::def_map::{closure_env#0}, alloc::sync::Arc<hir::def_map::DefMap, alloc::alloc::Global>>, alloc::sync::Arc<hir::def_map::DefMap, alloc::alloc::Global>> (f=...)
    at /home/michael/.cargo/registry/src/index.crates.io-6f17d22bba15001f/salsa-0.17.0-pre.2/src/lib.rs:586
#41 0x0000555fd3ecee5f in elp_ide::Analysis::with_db<elp_ide::{impl#2}::def_map::{closure_env#0}, alloc::sync::Arc<hir::def_map::DefMap, alloc::alloc::Global>> (
    self=0x7f1c4c6bbdd0, f=...) at crates/ide/src/lib.rs:786
#42 0x0000555fd3dbdea8 in elp_ide::Analysis::def_map (self=0x7f1c4c6bbdd0, file_id=...) at crates/ide/src/lib.rs:762
#43 0x0000555fd37ac0a0 in elp::snapshot::Snapshot::update_cache_for_file (self=0x7f1c4c6bbd88, file_id=..., include_generated=elp_types_db::IncludeGenerated::No, 
    optimize_for_eqwalizer=false) at crates/elp/src/snapshot.rs:172
#44 0x0000555fd385de6e in elp::server::{impl#4}::update_cache::{closure#0} (sender=...) at crates/elp/src/server.rs:1637
#45 0x0000555fd393bc18 in elp::task_pool::{impl#0}::spawn_with_sender::{closure#0}<elp::server::Task, elp::server::{impl#4}::update_cache::{closure_env#0}> ()
    at crates/elp/src/task_pool.rs:64
#46 0x0000555fd39367eb in threadpool::{impl#0}::call_box<elp::task_pool::{impl#0}::spawn_with_sender::{closure_env#0}<elp::server::Task, elp::server::{impl#4}::update_cache::{closure_env#0}>> (self=0x7f1cad7a1a60) at /home/michael/.cargo/registry/src/index.crates.io-6f17d22bba15001f/threadpool-1.8.1/src/lib.rs:95
#47 0x0000555fd3c1e53f in threadpool::spawn_in_pool::{closure#0} () at /home/michael/.cargo/registry/src/index.crates.io-6f17d22bba15001f/threadpool-1.8.1/src/lib.rs:769
#48 0x0000555fd3c197ce in std::sys::backtrace::__rust_begin_short_backtrace<threadpool::spawn_in_pool::{closure_env#0}, ()> (f=...)
    at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/sys/backtrace.rs:154
#49 0x0000555fd3c2451b in std::thread::{impl#0}::spawn_unchecked_::{closure#1}::{closure#0}<threadpool::spawn_in_pool::{closure_env#0}, ()> ()
    at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/thread/mod.rs:522
#50 0x0000555fd3c1973f in core::panic::unwind_safe::{impl#23}::call_once<(), std::thread::{impl#0}::spawn_unchecked_::{closure#1}::{closure_env#0}<threadpool::spawn_in_pool::{closure_env#0}, ()>> (self=...) at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/panic/unwind_safe.rs:272
#51 0x0000555fd3c20ad6 in std::panicking::try::do_call<core::panic::unwind_safe::AssertUnwindSafe<std::thread::{impl#0}::spawn_unchecked_::{closure#1}::{closure_env#0}<threadpool::spawn_in_pool::{closure_env#0}, ()>>, ()> (data=0x7f1c4c6bc1c0) at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panicking.rs:554
#52 0x0000555fd3c261cb in __rust_try ()
#53 0x0000555fd3c241b3 in std::panicking::try<(), core::panic::unwind_safe::AssertUnwindSafe<std::thread::{impl#0}::spawn_unchecked_::{closure#1}::{closure_env#0}<threadpool::spawn_in_pool::{closure_env#0}, ()>>> (f=...) at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panicking.rs:518
#54 std::panic::catch_unwind<core::panic::unwind_safe::AssertUnwindSafe<std::thread::{impl#0}::spawn_unchecked_::{closure#1}::{closure_env#0}<threadpool::spawn_in_pool::{closure_env#0}, ()>>, ()> (f=...) at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/panic.rs:345
#55 std::thread::{impl#0}::spawn_unchecked_::{closure#1}<threadpool::spawn_in_pool::{closure_env#0}, ()> ()
    at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/std/src/thread/mod.rs:521
#56 0x0000555fd3c247af in core::ops::function::FnOnce::call_once<std::thread::{impl#0}::spawn_unchecked_::{closure_env#1}<threadpool::spawn_in_pool::{closure_env#0}, ()>, ()> ()
    at /rustc/f6e511eec7342f59a25f7c0534f1dbea00d01b14/library/core/src/ops/function.rs:250
#57 0x0000555fd543012b in alloc::boxed::{impl#48}::call_once<(), dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global> () at alloc/src/boxed.rs:2231
#58 alloc::boxed::{impl#48}::call_once<(), alloc::boxed::Box<dyn core::ops::function::FnOnce<(), Output=()>, alloc::alloc::Global>, alloc::alloc::Global> ()
    at alloc/src/boxed.rs:2231
#59 std::sys::pal::unix::thread::{impl#2}::new::thread_start () at std/src/sys/pal/unix/thread.rs:105
#60 0x00007f1d00038d02 in start_thread () from /nix/store/3bvxjkkmwlymr0fssczhgi39c3aj1l7i-glibc-2.40-36/lib/libc.so.6
#61 0x00007f1d000b83ac in __clone3 () from /nix/store/3bvxjkkmwlymr0fssczhgi39c3aj1l7i-glibc-2.40-36/lib/libc.so.6

The exact task that the thread is doing seems to change between reproductions of the hang.

@the-mikedavis
Copy link
Contributor Author

I think the deadlock happens only when a task is kicked off before the shutdown request but doesn't complete before Server is dropped.

Also this deadlock only happens when the logger deadlock is fixed - you can comment out the body of ServerSetup::set_up_logger for a simpler/temporary fix for that.

facebook-github-bot pushed a commit that referenced this pull request Feb 5, 2025
Summary:
This fixes two possible deadlocks that cause `elp server` to hang during exit. I was on the right track in #74 but there are actually two deadlocks:

* The first happens because the logger backend has a clone of the `Sender` of the `Connection` (because the `LspLogger` clones it one). This can cause a deadlock when a message is _not_ logged following the drop of `Server` on `io_threads.join()` because the `Connection`'s thread for receiving messages from the `Connection`'s `Receiver` awaits messages indefinitely. The solution for this deadlock is to remove the `LspLogger` from the backend on `Drop` of the server (or any time earlier). That decrements the reference count on the `Sender` and causes the `Receiver`'s iterator to return `None` resulting in the termination of its thread.
* The second I don't really understand fully. The `Drop for Server` hangs because the cache task pool's drop joins its threads (by design) and there is an outstanding task that seems to be frozen, I think related to Salsa's cancellation mechanism. Explicitly requesting cancellation during drop eliminates this problem. Plus it matches what rust-analyzer does.

Fixes #36

Pull Request resolved: #75

Reviewed By: michalmuskala

Differential Revision: D69115301

Pulled By: alanz

fbshipit-source-id: 3b0a5b39f49d668199a2370f1b23bc78f0ae6d17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

elp server process stays up after a window reload
3 participants