RFC: [SQS Batch Processing] Ability to move Message from Batch to DLQ when certain types of exception occurs #29
Labels: all_runtimes (changes that should be applied to all runtimes), Batch (batch processing utility), Pending/Triage, RFC
Key information
Summary
When processing a batch of SQS messages, some messages can fail for reasons that retrying will never fix. For those, the user wants the message either moved to the DLQ associated with the SQS queue or deleted outright, not retried. For example, a message that fails business validation gains nothing from being retried until it expires; it can be moved straight to a DLQ.
We could enhance the SQS batch processing utility to accept a list of exceptions/errors. If one of those exceptions occurs during message processing via SqsMessageHandler for Java or via record_handler in Python, the utility can move the message directly to the DLQ or delete it entirely, based on a config parameter, instead of returning it to the queue.
Motivation
This is a fairly common use case. The feature removes the custom logic that users otherwise need to build themselves and lets them focus on the business logic of processing the SQS message.
Proposal
The proposal is the change outlined in the summary: enhance SQS batch processing to accept a list of exceptions/errors. If one of those exceptions occurs during message processing via SqsMessageHandler for Java or via record_handler in Python, the utility moves the message directly to the DLQ or deletes it entirely, based on a config parameter, instead of returning it to the queue.
In terms of the API contract or the annotation/decorator, this means accepting a list of exceptions/errors plus a new flag that says whether such a message should be deleted or moved to the DLQ. The default could be to move it to the DLQ if one exists for the SQS queue.
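The routing rule described above can be sketched as a small pure function. This is only an illustration of the decision, not the utility's implementation: the names `Action`, `resolve_action`, `delete_non_retryable`, and `dlq_configured` are hypothetical, and the fallback to a normal retry when no DLQ exists is an assumption, since the RFC does not say what should happen in that case.

```python
from enum import Enum
from typing import Sequence, Type


class Action(Enum):
    RETRY = "retry"              # leave the message on the source queue
    MOVE_TO_DLQ = "move_to_dlq"  # send it straight to the queue's DLQ
    DELETE = "delete"            # drop it without retrying


def resolve_action(
    error: Exception,
    non_retryable: Sequence[Type[Exception]],
    delete_non_retryable: bool = False,
    dlq_configured: bool = True,
) -> Action:
    """Decide what to do with a message whose processing raised `error`."""
    # Exceptions outside the configured list keep today's behaviour.
    if not isinstance(error, tuple(non_retryable)):
        return Action.RETRY
    # The flag switches non-retryable messages from "move to DLQ" to "delete".
    if delete_non_retryable:
        return Action.DELETE
    # Assumption: with no DLQ on the queue, fall back to a normal retry.
    return Action.MOVE_TO_DLQ if dlq_configured else Action.RETRY
```

Keeping the decision separate from the SQS calls would make the default ("move to DLQ if one exists") easy to test without an AWS client.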
If this feature should be available in other runtimes (e.g. Python), how would this look like to ensure consistency?
For the Python version, since it offers a similar batch processing utility with a similar UX via decorator and APIs, the same capability could be added to it as well.
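As a rough picture of how the decorator-based UX might look in Python, here is a self-contained sketch. It does not use the Powertools API: `sqs_batch_sketch`, `non_retryable_exceptions`, `delete_non_retryable`, and the `outcome` argument are hypothetical names, and the SQS calls are replaced by in-memory buckets so the example runs without AWS.

```python
from functools import wraps
from typing import Callable, Dict, List, Tuple, Type


def sqs_batch_sketch(
    record_handler: Callable[[dict], None],
    non_retryable_exceptions: Tuple[Type[Exception], ...] = (),
    delete_non_retryable: bool = False,
):
    """Hypothetical decorator; not the existing Powertools signature."""

    def decorator(lambda_handler):
        @wraps(lambda_handler)
        def wrapper(event, context):
            outcome: Dict[str, List[str]] = {"ok": [], "retry": [], "dlq": [], "deleted": []}
            for record in event["Records"]:
                try:
                    record_handler(record)
                    outcome["ok"].append(record["messageId"])
                except non_retryable_exceptions:
                    # A real utility would call SQS here to move or delete the message.
                    bucket = "deleted" if delete_non_retryable else "dlq"
                    outcome[bucket].append(record["messageId"])
                except Exception:
                    # Any other failure goes back to the queue, as today.
                    outcome["retry"].append(record["messageId"])
            return lambda_handler(event, context, outcome)

        return wrapper

    return decorator


class ValidationError(Exception):
    """Business validation failure that retrying cannot fix."""


def record_handler(record: dict) -> None:
    if not record["body"]:
        raise ValidationError("empty body")
    if record["body"] == "flaky":
        raise TimeoutError("downstream timeout")


@sqs_batch_sketch(record_handler, non_retryable_exceptions=(ValidationError,))
def lambda_handler(event, context, outcome):
    return outcome
```

In this sketch, a record that raises `ValidationError` lands in the `dlq` bucket by default, while a `TimeoutError` is treated as retryable.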
User Experience
How would customers use it?
For Java:
Similar support can be added to the higher-level API supported by the utility:
So in the above examples, if IllegalStateException is thrown from the handler while processing a message, that message will automatically be moved to a DLQ or deleted, based on the value of the deleteNonRetryableMessageFromQueue flag. By default, it will attempt to move the message to the DLQ if one exists.
Any configuration or corner cases you'd expect?
NA
Demonstration of before and after on how the experience will be better
Refer to the summary above. Today, all the logic for deciding whether to move such a message to the DLQ or delete it has to be written by users, which takes a lot of custom code.
Drawbacks
Increases complexity of the utility and more code to maintain?
No, since we already depend on the SQS client today. It's just additional functionality within the utility.
Rationale and alternatives
Unresolved questions