Batch dispatcher stuck in infinite retry loop when encountering conflict error #1594

I've noticed that if an error (timeout, disconnect, or connection loss) occurs during a `BatchPin` submission to the blockchain connector (EVMConnect in my case), EVMConnect can still process the transaction successfully while the FireFly operation is marked `Failed`. The batch processor then resubmits the batch, EVMConnect returns a 409, and the processor retries indefinitely, which prevents any new batches from being dispatched.


Example error message:
```
FF10458: Conflict from blockchain connector: FF21065: ID 'default:f5296ba7-1b23-4c7f-8612-620dafc0e40a' is not unique d=pinned_broadcast ns=default opcache=1UGYM3mn p=did:firefly:org/org_5452a6| pid=61421 role=batchmgr
```

Currently, the only workaround is to restart FireFly.

From my understanding, this should be handled by FireFly's idempotent retry logic: https://github.com/hyperledger/firefly/blob/1939b67fa0135f59eee7452dd7b73054915eea4d/internal/operations/manager.go#L178

I _think_ the bug is that `RunOperation` still returns an error, which causes the batch processor to retry forever: https://github.com/hyperledger/firefly/blob/1939b67fa0135f59eee7452dd7b73054915eea4d/internal/batch/batch_processor.go#L622

I'll link my naive fix that unblocks the batch processor, but I'm not sure it's fully correct because the operation still gets marked as `Failed`. 