- Make sure the client knows from the start that critical incidents are a normal part of running software.
- Disclose your backup plan and recovery process.
- What happened, who was affected and what's the impact of the issue?
- Do the impact and complexity of the issue require a team effort?
- Who will communicate with the client?
- Who will fix the issue?
- Who will work on the restoration?
- Notify the client and take responsibility as a team.
- Identify the bug causing the incident and issue a hotfix.
- Identify the latest backup with valid data.
- Define the time-frame not covered by the backup.
- Retrace the state of the system during the time-frame not covered by the backup.
- Write data restoration scripts.
- Specify all commands and steps required for the restoration.
- Test the restoration locally.
- Back up the data.
- Restore the lost data.
- Identify the data that could not be recovered.
- Disclose the lost data to the client.
- What happened?
- Why did the incident occur?
- What was the resolution, and how effective was it?
- What would the team do differently?
- What problems did the team encounter?
- What actions will be taken to make sure the incident doesn’t happen again?
- Take time after the incident to read the postmortem and update whatever is necessary.
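
Two of the restoration steps above, identifying the latest backup with valid data and defining the time-frame it does not cover, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Backup` record, its `is_valid` flag, and both function names are hypothetical, and it assumes each backup's validity has already been checked separately (for example by a test restore).

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Backup:
    """Hypothetical record describing one backup."""
    taken_at: datetime
    is_valid: bool  # assumption: validity was verified elsewhere


def latest_valid_backup(
    backups: list[Backup], incident_start: datetime
) -> Optional[Backup]:
    """Return the most recent valid backup taken before the incident began."""
    candidates = [
        b for b in backups if b.is_valid and b.taken_at < incident_start
    ]
    return max(candidates, key=lambda b: b.taken_at, default=None)


def uncovered_window(
    backup: Backup, incident_resolved: datetime
) -> tuple[datetime, datetime]:
    """Return the (start, end) of the time-frame the backup does not cover."""
    return backup.taken_at, incident_resolved


if __name__ == "__main__":
    # Made-up timestamps, for illustration only.
    backups = [
        Backup(datetime(2024, 1, 1, 2, 0), is_valid=True),
        Backup(datetime(2024, 1, 2, 2, 0), is_valid=True),
        Backup(datetime(2024, 1, 3, 2, 0), is_valid=False),
    ]
    chosen = latest_valid_backup(backups, incident_start=datetime(2024, 1, 3, 1, 0))
    if chosen is None:
        print("No valid backup predates the incident.")
    else:
        start, end = uncovered_window(chosen, datetime(2024, 1, 3, 9, 0))
        print(f"Retrace system state from {start} to {end} ({end - start}).")
```

The window starts at the backup timestamp rather than at the start of the incident, because anything written after the backup was taken is missing from it, including legitimate changes made before the incident began.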