-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add failed jobs support #103
Conversation
cef5041
to
dbe3b8a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work 👏 Thank you for putting this together. I left a few comments on naming conventions and documentation improvements.
*/ | ||
public function change() | ||
{ | ||
$table = $this->table('failed_jobs'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$table = $this->table('failed_jobs'); | |
$table = $this->table('queue_failed_jobs'); |
Namespacing the table seems prudent when you want to use a name this generic.
docs/en/index.rst
Outdated
@@ -55,13 +55,23 @@ The following configuration should be present in the config array of your **conf | |||
|
|||
// The amount of time in milliseconds to sleep if no jobs are currently available. default: 10000 | |||
'receiveTimeout' => 10000, | |||
|
|||
// Whether to store failed jobs in the failed_jobs table. default: false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Whether to store failed jobs in the failed_jobs table. default: false | |
// Whether to store failed jobs in the queue_failed_jobs table. default: false |
docs/en/index.rst
Outdated
] | ||
], | ||
// ... | ||
|
||
The ``Queue`` config key can contain one or more queue configurations. Each of | ||
these is used for interacting with a different queuing backend. | ||
|
||
If ``storeFailedJobs`` is set to ``true``, make sure to run the plugin migrations to create the ``failed_jobs`` table. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If ``storeFailedJobs`` is set to ``true``, make sure to run the plugin migrations to create the ``failed_jobs`` table. | |
If ``storeFailedJobs`` is set to ``true``, make sure to run the plugin migrations to create the ``queue_failed_jobs`` table. |
docs/en/index.rst
Outdated
``maxAttempts`` is configured on the job class or via a command line argument, a | ||
job will be considered failed if an exception is thrown while processing the job | ||
and there are no remaining attempts. The job will then be added to the | ||
``failed_jobs`` table and can be requeued manually. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
``failed_jobs`` table and can be requeued manually. | |
``queue_failed_jobs`` table and can be requeued manually. |
@@ -266,6 +276,53 @@ This shell can take a few different options: | |||
- Max Runtime | |||
- Runtime: Time since the worker started, the worker will finish when Runtime is over Max Runtime value | |||
|
|||
Failed Jobs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we call this a 'dead letter queue' as that is another term that is used for this behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if that will get confusing because some queue backends implement their own dead letter queues (for example, sqs, rabbitt mq). And it seems in those examples a dead letter queue is defined more broadly where here only jobs that are requeued too many times will be failed/dead lettered. Also, when a job is failed, it's not going to be queued on the original queue infrastructure, but a database table instead. I think this does fall within the concept of a dead letter queue though. Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't disagree that the terminology would overlap with native message brokers. Using 'failed jobs' is a good term for the abstraction we're trying to provide. I think we should mention the similarities with dead letter queues when we explain the behavior.
Let's keep 'failed jobs' until we hear feedback otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kept it as failed jobs naming and added some clarification about the similarities/differences in the documentation.
src/Command/DeleteCommand.php
Outdated
|
||
if ($args->hasOption('class')) { | ||
$jobsToDelete->where([ | ||
$failedJobsTable->aliasField('class') => $args->getOption('class'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You shouldn't need to manually alias all of these fields. Using where(['class' => $args->getOption('class')])
should be enough.
src/Command/DeleteCommand.php
Outdated
return; | ||
} | ||
|
||
if (!$args->getOption('assume-yes')) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually call these kinds of parameters --force
src/Command/RequeueCommand.php
Outdated
$parser->addOption('config', [ | ||
'help' => 'Requeue jobs by the config used to queue the job.', | ||
]); | ||
$parser->addOption('assume-yes', [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be force
as well.
src/Command/RequeueCommand.php
Outdated
|
||
if ($args->hasOption('config')) { | ||
$jobsToRequeue->where([ | ||
$failedJobsTable->aliasField('config') => $args->getOption('config'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same feedback here about not needing to use aliasField()
.
] | ||
); | ||
|
||
$failedJobsTable->deleteOrFail($failedJob); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll only be able to do at-least once delivery with this. We should mention that in the usage docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean because if deleteOrFail
fails, the job would remain in the failed_jobs
table and could be requeued, right? I can note that in the docs. I suppose the only solution to that would be to remove the job from the database, queue it, then insert the record again if there was an exception when queueing. But then a failed job could be "dropped" if it could not be inserted again, which seems worse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that is the scenario. I agree that the solution is overly complicated and that we can document the delivery guarantees we currently offer. If it is a problem we can always add complexity and improve the delivery guarantees in the future.
] | ||
); | ||
|
||
$failedJobsTable->deleteOrFail($failedJob); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that is the scenario. I agree that the solution is overly complicated and that we can document the delivery guarantees we currently offer. If it is a problem we can always add complexity and improve the delivery guarantees in the future.
['exception' => $exception, 'logger' => $context->getLogger()] | ||
); | ||
|
||
EventManager::instance()->dispatch($event); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this extension use EventDispatcherTrait
instead of firing the event on the global event manager directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it can actually. The listener can be attached when instantiating the extension in WorkerCommand
. That's the only place it's instantiated so there isn't really any need for a global event. I'll make that change.
c4c8360
to
fe82f67
Compare
composer.json
Outdated
@@ -20,6 +20,7 @@ | |||
"require": { | |||
"php": ">=7.2.0", | |||
"cakephp/cakephp": "^4.1", | |||
"cakephp/migrations": "^3.1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make this an optional dependency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made it optional and added a note in the docs to composer require it if using failed jobs.
fe82f67
to
2271a55
Compare
phpstan v1.9.1 returns an error with message "Dead catch - Cake\ORM\Exception\PersistenceFailedException is never thrown in the try block.". this error was not returned on v1.8.11.
The second commit I added was to address phpstan failing. I was getting a phpstan error |
Thank you 🎉 |
This adds support for storing and manually requeuing failed jobs (defined as jobs that have exceeded
maxAttempts
). This would lay some groundwork for #11.This is a somewhat significant set of changes so I'm very open to any feedback.
Many thanks to @amayer5125 for his suggestions and review.