Add functions to interrupt a single thread #761

dbaston · 2022-12-07T14:57:14Z

This PR changes GEOS_interruptRegisterCallback to operate on a per-thread basis; the callback will only be registered on the thread from which GEOS_interruptRegisterCallback is called. This would break existing expectations of a global callback, so maybe a better solution is to add a new function that registers a thread-local callback.

I'm fuzzy on how this gets used outside of PostGIS which (I think?) would not be affected by this change.

cc @nyalldawson

jorisvandenbossche · 2022-12-07T23:25:30Z

I'm fuzzy on how this gets used outside of PostGIS which (I think?) would not be affected by this change.

In Shapely we don't use this callback. @caspervdw experimented with this callback earlier this year (shapely/shapely#1370), but we ended up implementing a way to interrupt on the shapely side (so in between calls to GEOS).

nyalldawson · 2022-12-07T23:50:14Z

@jorisvandenbossche

(so in between calls to GEOS).

The big benefit of this call is that it can potentially interrupt stupidly long geos internal methods -- ie. when a user tries to intersection a >million vertex polygon with another >million vertex polygon 🤣 . Right now there's no clean way to interrupt those operations without completely killing the whole process.

nyalldawson · 2022-12-08T00:27:33Z

@dbaston

I'm fuzzy on how this gets used outside of PostGIS which (I think?) would not be affected by this change.

QGIS currently doesn't use it, because we can't isolate to individual threads. I'd love to use it to allow users to cancel long running geos operations though (such as the previously mentioned ridiculous intersection case)

The api for the interruption handling is a little tricky to understand. I gather I register the callback, and then from the callback I call GEOS_interruptRequest() when I want to interrupt a process. Is this correct?

(If so, this may be a nice opportunity to clean up that api and make the callback return bool for when the interruption is desired)

dbaston · 2022-12-08T02:55:38Z

The api for the interruption handling is a little tricky to understand. I gather I register the callback, and then from the callback I call GEOS_interruptRequest() when I want to interrupt a process. Is this correct?

Yes, this is the case. (It took me a while to figure it out too!) The API sort of makes sense for a system where GEOS_interruptRequest interrupts the thread that calls it. Right now it seems that GEOS_interruptRequest will interrupt only whichever single thread happens to check for an interrupt next.

In your case, would you want to register a different callback for each thread? (Does the callback need to be thread-local?) Or would you use a single callback like I did in InterruptTest.cpp to decide if a given thread should be interrupted?

nyalldawson · 2022-12-08T02:58:52Z

In your case, would you want to register a different callback for each thread? (Does the callback need to be thread-local?) Or would you use a single callback like I did in InterruptTest.cpp to decide if a given thread should be interrupted?

I could work with either. The per-thread/context callback would be cleaner though.

jorisvandenbossche · 2022-12-08T07:48:43Z

(so in between calls to GEOS).

The big benefit of this call is that it can potentially interrupt stupidly long geos internal methods

Yes, and so ideally we should certainly do both (and thus also implement the GEOS interrupt callback).
The initial motivation of the PR I referenced was to be able to interrupt a call that would create too many geometries (and blow up memory and eventually crash), for example by accidentally creating billions of points (so many calls to a cheap GEOS function). For that we needed to use another mechanism, since (I think) the interrupt callback isn't checked in such cheap GEOS operations.

- Add Interrupt::requestForCurrentThread (C API: GEOS_interruptThread) to request interruption of the current thread only. - Add Interrupt::registerThreadCallback (C API: GEOS_interruptRegisterThreadCallback) to register an interruption callback for the current thread only.

nyalldawson · 2022-12-09T01:02:42Z

@dbaston I can work with this api 👍 many thanks!!

dbaston · 2022-12-09T01:08:45Z

I could work with either. The per-thread/context callback would be cleaner though.

OK, with the current iteration you can interrupt the current thread using GEOS_interruptThread called from either a callback that executes on all threads (registered by the existing GEOS_interruptRegisterCallback) or a callback that executes on the current thread only (registered by a new GEOS_interruptRegisterThreadCallback). There are a few examples in InterruptTest.cpp and GEOSInterruptTest.cpp.

I used a new callback signature for GEOS_interruptRegisterThreadCallback, adding a void* so that an interrupt signal can be passed from a coordinating thread.

include/geos/util/Interrupt.h

Co-authored-by: Even Rouault <[email protected]>

macdrevx · 2023-01-02T01:51:13Z

@dbaston I'm glad to see this work taking shape as it will help to address GEOSwift/GEOSwift#190.

The proposed API lets you register a per-thread callback. I'm curious whether you've considered making it per-context rather than per-thread. My understanding is that GEOS contexts are meant to be confined to a single thread anyway, so tying the interrupt callback to the context seems like it could also solve the issue. It might also fit nicely alongside other existing context-specific APIs like the ones to register error and message handlers.

It could also be easier to use from systems where you don't select which thread to use directly (e.g. Grand Central Dispatch on Apple platforms, where you schedule work to be performed and the work is executed by one of several threads in a pool). In GEOSwift, we actually use short-lived GEOS contexts that never out-live a single call into the GEOSwift API, so if we were going to use the interrupt API in the future, tying the lifetime of the callback to the lifetime of the context would be a natural design.

That said, I think we could use the per-thread design as well by simply un-registering the handler (which I see is possible by passing NULL for the new handler) before returning from each GEOSwift API call that uses this feature.

dbaston · 2023-01-02T14:22:56Z

I'm curious whether you've considered making it per-context rather than per-thread.

Yes, and that would certainly make for a cleaner API. The problem is that the "context" is a C API construct to which the C++ library (the code actually being interrupted) has no access, so we would need to introduce the notion of a context into the C++ library in such a way that any interruptible code can access its current context. I'm not sure of a simple way to do that, but I'm open to ideas.

dbaston · 2023-01-02T14:25:21Z

Maybe we could have a thread-local global variable in the C++ library that stores the context, and every C API call sets that global variable before calling code in the C++ library?

rouault · 2023-01-02T14:26:17Z

The problem is that the "context" is a C API construct to which the C++ library (the code actually being interrupted) has no access,

That might have perhaps performance implications, but couldn't that interrupt handler be registered at the C context level, and each time a C _r function is invoked it would automatically install the per-thread interrupt handler before calling the C++ method, and uninstalls it at return / when an exception is caught ?

dbaston · 2023-01-02T14:58:15Z

I think that would also work. Pushing the context object into the C++ library might reduce the penalty because we would be setting/checking one thread-local variable instead of two. I would need to set up a benchmark to see if that actually matters.

dbaston · 2023-01-20T15:59:47Z

See #803 for an alternative API that is based on context handles.

robe2 · 2024-08-21T19:58:55Z

same here. I'm assuming this should go to 3.14 so switching, @dbaston @pramsey switch back if you disagree.

dbaston force-pushed the interrupt-thread branch from d6413ef to b6597c9 Compare December 8, 2022 02:36

dbaston force-pushed the interrupt-thread branch from b6597c9 to 14f2a62 Compare December 9, 2022 01:00

dbaston changed the title ~~Isolate interrupt requests to single thread~~ Add functions to interrupt a single thread Dec 9, 2022

Interrupt doc clarifications

26a87ea

rouault reviewed Dec 9, 2022

View reviewed changes

include/geos/util/Interrupt.h Outdated Show resolved Hide resolved

dbaston and others added 7 commits December 8, 2022 21:11

Add default 30s timeout to unit tests

cd37242

Update include/geos/util/Interrupt.h

a941096

Co-authored-by: Even Rouault <[email protected]>

Avoid potential lost interrupt request in test

6be3f03

Add more comments

6b2bce8

Add more comments

70f891c

Fix hanging interrupt test

dbd1263

Doc clarification

5081f27

dbaston mentioned this pull request Jan 20, 2023

Add functions to interrupt processing in a specific thread/context #803

Open

dbaston added the Feedback Requested label May 4, 2023

dbaston added this to the 3.13.0 milestone Jun 7, 2023

robe2 modified the milestones: 3.13.0, 3.14.0 Aug 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add functions to interrupt a single thread #761

Add functions to interrupt a single thread #761

dbaston commented Dec 7, 2022

jorisvandenbossche commented Dec 7, 2022

nyalldawson commented Dec 7, 2022

nyalldawson commented Dec 8, 2022

dbaston commented Dec 8, 2022

nyalldawson commented Dec 8, 2022

jorisvandenbossche commented Dec 8, 2022

nyalldawson commented Dec 9, 2022

dbaston commented Dec 9, 2022

macdrevx commented Jan 2, 2023 •

edited

Loading

dbaston commented Jan 2, 2023

dbaston commented Jan 2, 2023

rouault commented Jan 2, 2023

dbaston commented Jan 2, 2023

dbaston commented Jan 20, 2023

robe2 commented Aug 21, 2024

Add functions to interrupt a single thread #761

Are you sure you want to change the base?

Add functions to interrupt a single thread #761

Conversation

dbaston commented Dec 7, 2022

jorisvandenbossche commented Dec 7, 2022

nyalldawson commented Dec 7, 2022

nyalldawson commented Dec 8, 2022

dbaston commented Dec 8, 2022

nyalldawson commented Dec 8, 2022

jorisvandenbossche commented Dec 8, 2022

nyalldawson commented Dec 9, 2022

dbaston commented Dec 9, 2022

macdrevx commented Jan 2, 2023 • edited Loading

dbaston commented Jan 2, 2023

dbaston commented Jan 2, 2023

rouault commented Jan 2, 2023

dbaston commented Jan 2, 2023

dbaston commented Jan 20, 2023

robe2 commented Aug 21, 2024

macdrevx commented Jan 2, 2023 •

edited

Loading