Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implements TimerTasks using Promises. #425

Closed
wants to merge 3 commits into from
Closed

Conversation

hms
Copy link
Contributor

@hms hms commented Nov 24, 2024

Replaces the OG TimerTask with a Promises based implementation.

In combination with the previous commit, all of SQ has been moved
to the Promises infrastructure and makes using any of the new
concurrency functionality (Actors, Channels, etc.) more easily available
for future SQ enhancements.

This PR is a continuation of the not yet accepted PR #417. If desired / required,
I am happy to pull the 2 commits into a clean PR based on main.

@rosa
Copy link
Member

rosa commented Nov 30, 2024

Hi @hms, so sorry for the delay here! I didn't have the chance to look at this and #417 this past week as I focused on something else, but this is top of my list for next week.

@hms
Copy link
Contributor Author

hms commented Dec 2, 2024

@rosa

Rebased this PR too.

@rosa
Copy link
Member

rosa commented Dec 4, 2024

Hey @hms, would you mind rebasing this one once more, now that #417 has been merged?

end
result.just? ? finished : failed_with(result.reason)

result
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the advantage of using Concurrent::Maybe here vs. a regular struct?

Copy link
Contributor Author

@hms hms Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning: Rationalization alert.

I dislike Struct. I would have reached for Data, but when I started this PR, SQ was on Ruby 3.3.1.

The Result struct was nothing more than a simple Maybe. It was also only used in 1 place, for 1 if and made me question if it was really needed at all (returning [success, exception] is pretty much the same thing minus the success method). But with the change from Futures to Promises::Futures, I also pushed the error reporting out of the execute method back to the Future. With some of the logic moving back to the Future, why not keep the logic all within the Concurrent library?

If all of this is bothersome, you're not buying the rationalization for the change, of you simply feel that this doesn't make SQ better I'm happy to refactor the Maybe out and return the Result object (but I would recommend using Data over Struct).

One question on Result: Given it's only used in one place and in a sequence of a return value, immediately in an IF, and then GC'ed, is there a reason for the (very little) extra code Result Vs. an alternative implementation?

hms added 3 commits December 4, 2024 11:21
SolidQueue has excellent built-in error reporting. While this is
fantastic for SQ users, it is less than ideal for testing
SolidQueue because any test that deliberately uses or triggers an
exception produces voluminous error reporting.

This error reporting is hugely valuable when the exception is
not expected, but distracting and of limited value for expected
use-cases, especially when the test confirms the correct outcomes
via assertions.

This commit adds:
* A generic test-specific Exception class: ExpectedTestError
  This allows testing for specific exceptions while retaining
  all error reporting infrastructure for unexpected exceptions.

* Two helper methods for silencing on_thread_error output
  These methods accept an Exception or Array(Exception)
  and simply does not call the output mechanism if the
  exception passed to on_thread_error matches.
  This way, any unexpected error during test still
  reports in a highly visible manner while the exceptions
  being tested are validated via assertions.

* Replaces the stock on_thread_error with one that ignores
  ExpectedTextError.  Updated several tests from using
  the ruby stock RuntimeError to ExpectedTestError.

* Configures tests to run with YJIT enabled
  This is to test under likely production deployment configuration,
  not for performance reasons.

  Note: With the very recent reporting on M4's crashing on Mac's
  with YJIT enabled, we might want to either defer this change
  or add a conditional to opt in until the problem is resolved.
* Upgrade Worker.pool from Future to Promises.future

  Promises.future leverages an improved, non-blocking, and lock-free
  implementation of Concurrent Ruby's Async runtime, enhancing performance
  and future feature compatibility.  Promises.future was moved from
  edge to main in V1.1 (2018).

* Replace ClaimedExecution::Results with Concurrent::Maybe

  `Results` was underutilized and essentially mirrored `Maybe`'s
  functionality. This change simplifies code and reduces redundancy
  by leveraging the `Concurrent::Maybe` class.

* Centralize error reporting in AppExecutor.handle_thread_error

  This change was necessitated by the change to Promises.future.
  Concurrent Ruby has some very strong ideas about exceptions
  within a future with a little code rearranging, this change
  pulls the error / exception reporting responsibilities out
  of the Future execution code path and pushes it to
  AppExecutor#handle_thread_error.

  This change ensures that `Rails.error` is called exactly once
  per `handle_thread_error` invocation regardless of
  on_thread_error calling `Rails.error` or not.

* Update tests to accommodate these changes
Promises are the recommended infrastructure, replacing
several OG APIs, including TimerTasks.

SQ only uses TimerTasks in 3 places (currently) and
a very small subset of their overall functionality.
SolidQueue::TimerTask is a drop-in replacement.

This PR uses AtomicBoolean instead of the recommended
Concurrent::Cancellation to avoid a dependency on,
and the potential API stability issues of, edge features.

This completes the move from the old APIs to Promises and
makes all of the new concurrency features (Actors,
Channels, etc.) available for future SQ features and
enahancements.
@hms
Copy link
Contributor Author

hms commented Dec 4, 2024

rebased

@hms
Copy link
Contributor Author

hms commented Dec 4, 2024

@rosa

One other question should this PR be accepted. I avoided using Concurrent::Cancellation because it's in Edge. But it's been in edge since 2018, is the SQ team open my cleaning up the TimerTask implementation using Cancellation? If yes, the code would be a little cleaner and it would allow the last bit of the older concurrency implementation used in Scheduler to be reimplemented via the Promises infrastructure.

assert_raises RuntimeError do
claimed_execution.perform
end
claimed_execution.perform
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was added in #373, and it's not clear to me that the new behaviour is correct 🤔 If the error reporting library doesn't rely on Rails error reporting, I think this might not work correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rosa

I'm not sure I fully understand your issue / question or possibly the impact of #373.

Is your issue with the actual error handling and possibly missed / lost errors, this test is not valid with my changes and hence I'm no longer testing an important behavior, or what happens if a SQ user is depending on something other than Rails.error for their system wide Error manage mechanism?

hms

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rosa

I've done some digging into ActiveSupport::ExecutorWrapper and SolidQueue initialization and I think I understand your issue/concern.

When an exception is raised and not rescued within a ExecutionWrapper.wrap block, it's caught and reported to the ActiveSupport.error_reporter. Under normal circumstances, ActiveSupport.error_reporter and Rails.error_reporter are one and the same. So, the code in the PR is fine.... under normal circumstances.

However, if directly or indirectly, the app_executor has been overridden and the overridden app_executor handles exception reporting without calling Rails.error_reporter, then the current implementation of the PR is effectively swallowing the exception.

If this summary of your concern is correct, I'll see what I have to do to address it.

@hms
Copy link
Contributor Author

hms commented Dec 8, 2024

@rosa

Based on your feedback and a clear misunderstanding of a Concurrent Ruby behavior, I'm going to pull this PR and re-submit it. I'll put back the Results, since with the required rework, Maybe is even harder to justify. Thanks for your patience and catching the significant issue with the code as submitted.

@hms hms closed this Dec 8, 2024
@hms hms deleted the promises branch December 18, 2024 20:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants