This repository has been archived by the owner on Dec 27, 2022. It is now read-only.

[all] Error Refactor #464

Open
LayneHaber opened this issue Mar 16, 2021 · 0 comments
Labels
chore Refactoring or ops work needs spec p3 Longer-term change

Comments

@LayneHaber
Contributor

Describe the problem
Our errors have a few issues:

  • They often have several nested contexts and can be very difficult to parse
  • It is not clear which ones are retryable (message dropped, timeout, etc.) and which ones should not be retried
  • They can blow up a console, and it often takes one of us to sort through them and extract the useful information

The goal

Errors should:

  • Be informative, without swamping you with a deluge of information that may or may not be helpful
  • Not have nested contexts; all of the context can go in one place
  • Have clear codes and, if possible, resolution steps
  • Be unique and clearly traceable to the part of the stack they originate in (Inbound vs. Outbound in the protocol is a bit of an oof to deal with)
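
As a rough sketch of what the goals above could look like in code (every name here, including `FlatError`, `ErrorOrigin`, the code format `PROTO_OUT_001`, and `shouldRetry`, is hypothetical and not taken from the codebase): one flat error type carrying a unique code, its origin in the stack, a retry flag, an optional resolution step, and a single context object.

```typescript
// Hypothetical sketch only: none of these names exist in the codebase.
type ErrorOrigin = "protocol:inbound" | "protocol:outbound" | "engine" | "node";

// Context is a single flat map of primitives, so nothing can nest.
type FlatContext = Record<string, string | number | boolean>;

interface FlatErrorParams {
  code: string; // unique, stable identifier (format is an assumption)
  origin: ErrorOrigin; // which part of the stack raised the error
  message: string;
  retryable: boolean; // true for dropped messages, timeouts, etc.
  resolution?: string; // optional hint on how to resolve the error
  context?: FlatContext;
}

class FlatError extends Error {
  readonly code: string;
  readonly origin: ErrorOrigin;
  readonly retryable: boolean;
  readonly resolution?: string;
  readonly context: FlatContext;

  constructor(params: FlatErrorParams) {
    super(params.message);
    // Keep `instanceof` working when compiling to older JS targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "FlatError";
    this.code = params.code;
    this.origin = params.origin;
    this.retryable = params.retryable;
    this.resolution = params.resolution;
    this.context = { ...(params.context ?? {}) };
  }

  // One compact object for logging, instead of nested error contexts.
  toJSON(): FlatErrorParams {
    return {
      code: this.code,
      origin: this.origin,
      message: this.message,
      retryable: this.retryable,
      resolution: this.resolution,
      context: this.context,
    };
  }
}

// Callers decide whether to retry from the flag alone.
function shouldRetry(e: unknown): boolean {
  return e instanceof FlatError && e.retryable;
}
```

A caller could then write `if (shouldRetry(err)) { /* resend */ }` without inspecting message strings, and logging `err.toJSON()` yields one flat record per error.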

Some inspiration

We've started using error codes and linking to a GitHub page in the indexer logs, to give people a more in-depth explanation of every error they might see: https://github.com/graphprotocol/indexer/blob/master/docs/errors.md

Every log message includes an "errorUrl": ... kind of field pointing to the specific error explanation.
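
A minimal sketch of how such a field might be attached to a log entry. The base URL is the errors page linked above; the anchor scheme (`#` plus the lowercased code), the code `E001`, and the names `errorUrlFor` and `toLogEntry` are assumptions for illustration, not the indexer's actual implementation.

```typescript
// Base URL from the docs page linked above; the anchor scheme is an assumption.
const ERROR_DOCS_URL =
  "https://github.com/graphprotocol/indexer/blob/master/docs/errors.md";

interface ErrorLogEntry {
  level: "error";
  code: string;
  msg: string;
  errorUrl: string; // points at the explanation for this specific code
}

// Hypothetical helper: derive the documentation link from an error code.
function errorUrlFor(code: string): string {
  return `${ERROR_DOCS_URL}#${code.toLowerCase()}`;
}

// Hypothetical helper: build a structured log entry that carries the link.
function toLogEntry(code: string, msg: string): ErrorLogEntry {
  return { level: "error", code, msg, errorUrl: errorUrlFor(code) };
}

// "E001" is a made-up code used only for this example.
console.log(JSON.stringify(toLogEntry("E001", "Example failure")));
```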

However, as you can see, we've hardly managed to fill out the descriptions so far. 😉
The error definitions are here: https://github.com/graphprotocol/indexer/blob/ab34268709111ed013cac9f0a65d2cc9745428cd/packages/indexer-common/src/errors.ts

The way we wrap errors in indexer errors is: https://github.com/graphprotocol/indexer/blob/deedd11a42a7d7b02bec3e1b9fe5b8844a3c0ac8/packages/indexer-agent/src/network.ts#L342-L346

One thing that's been useful to us and our indexers is automatically tracked error metrics: https://github.com/graphprotocol/indexer/blob/ab34268709111ed013cac9f0a65d2cc9745428cd/packages/indexer-common/src/errors.ts#L104-L106

That way everyone can track how often they are seeing which errors. In the testnet we could see every individual's error numbers and help specific people out with their issues.
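
The idea of counting errors per code at the point where they are wrapped can be sketched as follows. This is an in-memory stand-in, not the linked implementation: `ErrorCounter`, `CodedError`, `wrapAndCount`, and the code `E001` are hypothetical names, and a real setup would likely export the counts through a metrics library rather than a plain object.

```typescript
// Hypothetical in-memory counter keyed by error code.
class ErrorCounter {
  private readonly counts: Record<string, number> = {};

  record(code: string): void {
    this.counts[code] = (this.counts[code] ?? 0) + 1;
  }

  count(code: string): number {
    return this.counts[code] ?? 0;
  }

  // Copy of all counts, e.g. for a dashboard or a periodic log line.
  snapshot(): Record<string, number> {
    return { ...this.counts };
  }
}

// Hypothetical coded error that keeps the original cause's message in one place.
class CodedError extends Error {
  readonly code: string;
  readonly causeMessage: string;

  constructor(code: string, cause: unknown) {
    const causeMessage = cause instanceof Error ? cause.message : String(cause);
    super(`${code}: ${causeMessage}`);
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = "CodedError";
    this.code = code;
    this.causeMessage = causeMessage;
  }
}

// Wrapping and counting happen together, so every wrapped error is tracked.
function wrapAndCount(counter: ErrorCounter, code: string, cause: unknown): CodedError {
  counter.record(code);
  return new CodedError(code, cause);
}
```

Tying the counter to the wrapping step means no call site can raise a coded error without it showing up in the metrics.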

@LayneHaber LayneHaber added p3 Longer-term change needs spec chore Refactoring or ops work labels Mar 16, 2021