Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When crossterm may encounter "The cursor position could not be read within a normal duration" #963

Open
tisonkun opened this issue Jan 3, 2025 · 3 comments

Comments

@tisonkun
Copy link

tisonkun commented Jan 3, 2025

Background: nushell/reedline#870

I use reedline to write my CLI and it transitively depends on crossterm. Today I meet the error:

thread 'main' panicked at cmd/scopeql/src/main.rs:50:78:
called `Result::unwrap()` on an `Err` value: failed to read next line
|- at cmd/scopeql/src/repl/entrypoint.rs:42:14
|
| -> The cursor position could not be read within a normal duration
    | - at cmd/scopeql/src/repl/entrypoint.rs:42:14

And I have no idea how "asking for cursor position" can hang or fail.

The related code and discussion can be found at the issue linked above. And I'd like to ask for crossterm's maintainers' help for insights to investigate deeper.

@joshka
Copy link
Collaborator

joshka commented Jan 4, 2025

Just a pure guess. Your code is reading from two threads. Don't do that if so.

If the cli is multi-threaded, make sure that the read thread and the write thread are the same. If you have code which reads crossterm events on one thread, but also code which attempts to read the cursor position on another, then a race condition can cause this problem.

@tisonkun
Copy link
Author

tisonkun commented Jan 4, 2025

@joshka My code is running at the same thread while spawn async task onto another runtime:

pub fn entrypoint(config: Config, rt: Runtime) -> Result<(), Error> {
    let mut state = Reedline::create()
        .with_validator(Box::new(ScopeQLValidator))
        .with_highlighter(Box::new(ScopeQLHighlighter));

    loop {
        let input = state
            .read_line(&prompt)
            .change_context_lazy(|| Error::Other("failed to read next line".to_string()))?;
        let input = match input {
            Signal::CtrlC | Signal::CtrlD => {
                println!("exit");
                break;
            }
            Signal::Success(input) => input,
        };
        let input = input.trim();

        // other biz code

        for range in stmts_range {
            // Statement should be executed regardless of whether it is valid or not, because it
            // may be valid on the server, but invalid on an outdated client.
            let stmt = &input[range];

            let stmt_formatted = ast::parser::parse_stmts(stmt)
                .map(|stmts| stmts.0[0].to_string())
                .unwrap_or_else(|_| stmt.to_string());
            let stmt_highlighted = ScopeQLHighlighter
                .highlight(&stmt_formatted, 0)
                .render_simple();
            println!("{stmt_highlighted}");

            rt.block_on(async move {
                let output = tokio::select! {
                    _ = tokio::signal::ctrl_c() => {
                        println!("interrupted");
                        return;
                    }
                    output = client.execute(stmt.to_string()) => output,
                };
                println!("{}", output.unwrap_or_else(format_error));
            });
        }

        state.run_edit_commands(&[EditCommand::InsertString(
            outstanding.trim_start().to_string(),
        )]);
    }

    Ok(())
}

Hopefully the read thread and the write thread are the same, or I should factor out the println!("interrupted") outside the async block?

Besides, the runtime (rt) is single-threaded.

@joshka
Copy link
Collaborator

joshka commented Jan 4, 2025

Does your code have anything which could be reading events during the call to read_position (where the error message you're seeing originates):

fn read_position_raw() -> io::Result<(u16, u16)> {
// Use `ESC [ 6 n` to and retrieve the cursor position.
let mut stdout = io::stdout();
stdout.write_all(b"\x1B[6n")?;
stdout.flush()?;
loop {
match poll_internal(Some(Duration::from_millis(2000)), &CursorPositionFilter) {
Ok(true) => {
if let Ok(InternalEvent::CursorPosition(x, y)) =
read_internal(&CursorPositionFilter)
{
return Ok((x, y));
}
}
Ok(false) => {
return Err(Error::new(
ErrorKind::Other,
"The cursor position could not be read within a normal duration",
));
}
Err(_) => {}
}
}
}

In case the source code is not clear, the process of working out the window size is write an ANSI escape code, then read an event (possibly via either the stdin or tty depending on feature flags / OS)

I'm not certain whether writing would also trigger the problematic behavior (I've only personally encountered input side problems), but something to check:

  • To make it deterministically bugged, grab a copy of crossterm and compile it with a thread::sleep of a few seconds between the stdout.flush() call and the read checks.
  • To make it deterministically not bugged, add the same sleep call before your prints in the async code to ensure that it happens after the gap between the write and read.

This is just an idea - there could be other things wrong with this. Perhaps there are other pending events that are not processed by this point that prevent the window size code being seen. Try putting a loop with an even::poll call with a zero duration gating an event::read call after the point where you see the error to drain the event queue to see what you're seeing. If that's the problem then you need to make sure the event queue is empty before requesting the window size like this.

Regardless, my advice would be to try to arrange your UI related stuff on a single (non-async) thread to avoid this sort of thing, and to avoid blocking your async code from being scheduled. The amount of time spent waiting on UI things (i.e..100s+) can often be longer by an order of magnitude or two compared with the amount of time you'd want to spend blocking async schedulers (i.e.. 10µs). Of course this might not matter for your purposes, but Its still going to lead to fewer problems. The only reports I've seen people running into problems with this sort of code is when things happen on different threads. (That's not to say it's the only reason, but it's common enough to be ruled out first).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants