This document outlines the incident response plan for our smart contract system, addressing both ChainOps and SmartContract-related incidents. It provides a list of potential incidents and instructions on how to handle them effectively.
Description: A congested network can lead to (slow transaction confirmations, higher gas fees, slashing provers) impacting the performance of the rollup.
Response:
- Check Grafana Alerts: Monitor the Grafana dashboard at Grafana Dashboard for alerts related to network congestion.
- Engineer on Duty: The engineer on duty should be alerted automatically through the monitoring system.
- Mitigation: If network congestion is detected, consider adjusting gas prices or scheduling transactions during off-peak times.
Description: When the chain head stops, it indicates a potential issue with the operation of the network.
Response:
- Grafana Alerts: Monitor Grafana for alerts regarding the chain head number.
- Engineer on Duty: The engineer on duty should receive automatic alerts.
- Investigation: Investigate the root cause by analyzing blockchain data and logs.
- Collaboration: Collaborate with blockchain network administrators if necessary for a solution.
Description: A halt in the increase of the latest verified block number may indicate a problem with the operation of the network.
Response:
- Grafana Alerts: Keep an eye on Grafana alerts regarding the latest verified block number.
- Engineer on Duty: The engineer on duty should be automatically notified.
- Troubleshooting: Investigate the node's syncing process and take corrective actions to ensure it resumes.
Description: Unforeseeable issues with the smart contracts may arise, which were not identified during the audit.
Response:
- Incident Report: Create a detailed incident report, including the symptoms, affected contracts, and any relevant transaction or event data.
- Escalation: Notify the development and audit teams for immediate attention.
- Isolation: If necessary, isolate the affected smart contracts or functions to prevent further damage.
- Analysis: Collaborate with the audit team to analyze and diagnose the issue.
- Resolution: Implement necessary fixes, upgrades, or rollbacks as per the audit team's recommendations.
- Communication: Keep stakeholders informed throughout the incident resolution process.
This incident response plan ensures that potential incidents, whether related to ChainOps or SmartContracts, are promptly detected and addressed. The plan relies on monitoring tools like Grafana and the availability of an engineer on duty. In the case of unforeseeable smart contract issues, a systematic incident resolution process is in place to minimize the impact on the system's functionality and security.
Regular testing and review of this plan are recommended to ensure its effectiveness in responding to incidents as the system evolves.