This page describes examples for ecctool. If you're looking for general ecctool documentation, refer to ECCTOOL.md.
In this example, we will use ecctool repairs to check the status of manual repairs.
The output shows all manual repairs for all ecChronos instances.
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Host Id | Keyspace | Table | Status | Repaired(%) | Completed at | Repair type |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
| f4fb2b38-b9d0-4390-97ca-eeb284391f80 | ee32d9c7-1a4e-40c2-9b28-1000544011ae | test | table1 | IN_QUEUE | 0.00 | - | VNODE |
| f4fb2b38-b9d0-4390-97ca-eeb284391f80 | ba7665b2-5a7b-42b5-9f38-037f2da1e80a | test | table1 | COMPLETED | 100.00 | 2022-09-22 13:40:07 | VNODE |
| 9c86aa0e-b4af-4f3c-a89c-1596f8d1dd2a | ba7665b2-5a7b-42b5-9f38-037f2da1e80a | test | table2 | COMPLETED | 100.00 | 2023-09-21 15:26:58 | INCREMENTAL |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Summary: 1 completed, 1 in queue, 0 blocked, 0 warning, 0 error
Looking at the example output above, the columns are:
Id
- the manual repair ID. A manual repair triggered on several hosts has the same ID on every host.
Host Id
- the host ID of the Cassandra instance that the ecChronos instance responsible for the manual repair is connected to.
Keyspace
- the keyspace the manual repair is run on.
Table
- the table the manual repair is run on.
Status
- the status of the manual repair.
The possible statuses are:
IN_QUEUE
- the manual repair is awaiting execution or is currently running.
ERROR
- the manual repair failed, some ranges might've failed or the ranges have changed.
COMPLETED
- the manual repair is completed, all ranges have been repaired.
Repaired(%)
- the number of ranges repaired vs total ranges.
For manual repairs this value should never go down.
Completed at
- the time when the manual repair has finished.
Repair type
- the type of the repair, can be VNODE, PARALLEL_VNODE or INCREMENTAL. All repairs before ecChronos 5.0 were VNODE.
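Because the same manual repair ID appears once per host, it can help to group the rows by ID when reading the output programmatically. The following is a minimal sketch, assuming the output of ecctool repairs has been captured as text in the pipe-delimited layout shown above. The helper names parse_table and statuses_by_repair are hypothetical and not part of ecctool.

```python
from collections import defaultdict

# Sample rows copied from the example output above.
SAMPLE = """\
| Id | Host Id | Keyspace | Table | Status | Repaired(%) | Completed at | Repair type |
| f4fb2b38-b9d0-4390-97ca-eeb284391f80 | ee32d9c7-1a4e-40c2-9b28-1000544011ae | test | table1 | IN_QUEUE | 0.00 | - | VNODE |
| f4fb2b38-b9d0-4390-97ca-eeb284391f80 | ba7665b2-5a7b-42b5-9f38-037f2da1e80a | test | table1 | COMPLETED | 100.00 | 2022-09-22 13:40:07 | VNODE |
| 9c86aa0e-b4af-4f3c-a89c-1596f8d1dd2a | ba7665b2-5a7b-42b5-9f38-037f2da1e80a | test | table2 | COMPLETED | 100.00 | 2023-09-21 15:26:58 | INCREMENTAL |
"""


def parse_table(text):
    """Turn pipe-delimited rows into a list of dicts keyed by column header.

    Separator lines made of dashes are skipped because they do not start with '|'.
    """
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.splitlines()
        if line.strip().startswith("|")
    ]
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def statuses_by_repair(repairs):
    """Group the per-host status of each manual repair under its shared Id."""
    grouped = defaultdict(dict)
    for repair in repairs:
        grouped[repair["Id"]][repair["Host Id"]] = repair["Status"]
    return dict(grouped)


repairs = parse_table(SAMPLE)
for repair_id, hosts in statuses_by_repair(repairs).items():
    print(repair_id, hosts)
```

In this sample, the repair f4fb2b38-b9d0-4390-97ca-eeb284391f80 is reported twice, once as IN_QUEUE and once as COMPLETED, which shows that the status is tracked per host.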
In this example we will use ecctool schedules to check the status of schedules.
The output shows all schedules the local ecChronos instance has. For new tables a completed time will be calculated using
an initial delay that is configurable.
Snapshot as of 2023-06-12 15:33:25
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Keyspace | Table | Status | Repaired(%) | Completed at | Next repair | Repair type |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 51a64d70-0924-11ee-9173-1f4a33dd583b | test | table2 | COMPLETED | 100.00 | 2023-06-06 15:33:19 | 2023-06-13 15:33:19 | VNODE |
| 552d5150-0924-11ee-9173-1f4a33dd583b | keyspaceWithCamelCase | tableWithCamelCase | COMPLETED | 100.00 | 2023-06-06 15:33:19 | 2023-06-13 15:33:19 | VNODE |
| 5365b0b0-0924-11ee-9173-1f4a33dd583b | test2 | table1 | COMPLETED | 100.00 | 2023-06-06 15:33:19 | 2023-06-13 15:33:19 | VNODE |
| 54079600-0924-11ee-9173-1f4a33dd583b | test2 | table2 | COMPLETED | 100.00 | 2023-06-06 15:33:19 | 2023-06-13 15:33:19 | VNODE |
| 51021e30-0924-11ee-9173-1f4a33dd583b | test | table1 | COMPLETED | 100.00 | 2023-06-12 15:26:27 | 2023-06-19 15:26:27 | VNODE |
| 24280063-034f-4f63-9e40-2f8a86ca569b | test | table2 | COMPLETED | 100.00 | 2023-06-12 15:26:53 | 2023-06-13 15:26:53 | INCREMENTAL |
| d4c0fdb8-8183-4bbe-b714-53f8aff892f0 | test | table1 | COMPLETED | 100.00 | 2023-06-12 15:33:19 | 2023-06-13 15:33:19 | INCREMENTAL |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Summary: 7 completed, 0 on time, 0 blocked, 0 late, 0 overdue
Looking at the example output above, the first line is Snapshot as of 2023-06-12 15:33:25.
This means that the output is tied to a point in time,
so the output might change if the subcommand is run at a different time.
Looking at the table, the columns are:
Id
- the schedule ID, which corresponds to the table ID.
Keyspace
- the keyspace the repair is run on.
Table
- the table the repair is run on.
Status
- the status of the repair.
The possible statuses are:
ON_TIME
- the schedule is awaiting execution or is currently running.
LATE
- the schedule is late, warning time specified in the configuration has passed.
OVERDUE
- the schedule is overdue, error time specified in the configuration has passed.
COMPLETED
- the schedule is completed, all ranges have been repaired within the interval.
BLOCKED
- the schedule is blocked, occurs if a schedule should be executing but is blocked by a run-policy or if a repair task has failed and triggered a backoff (30 minutes).
Repaired(%)
- the number of ranges repaired within the interval vs total ranges.
For schedules this value can go up and down as ranges become unrepaired.
Completed at
- the time when all ranges for the schedule were repaired.
ecChronos assumes all ranges are repaired if there's no repair history.
Next repair
- the time when the schedule will be made ready for execution.
This is based on the (oldest range repair time + interval) - repair time taken for the ranges.
This is updated each time a repair group is completed.
Repair type
- the type of the schedule, can be VNODE, PARALLEL_VNODE or INCREMENTAL. All schedules before ecChronos 5.0 were VNODE.
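The Next repair calculation described above can be illustrated with a short worked example. This is only a sketch of the stated formula, not the ecChronos implementation. The 7-day interval and the repair durations are assumed values chosen for illustration, and next_repair is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def next_repair(oldest_range_repaired_at, interval, repair_time_taken):
    """(oldest range repair time + interval) - repair time taken for the ranges."""
    return oldest_range_repaired_at + interval - repair_time_taken


# Oldest range repair time taken from the first row of the example output.
oldest = datetime(2023, 6, 6, 15, 33, 19)

# Assumed interval of 7 days and a negligible repair time.
print(next_repair(oldest, timedelta(days=7), timedelta(0)))
# 2023-06-13 15:33:19

# Same interval, but assuming the previous repair took 30 minutes:
# the schedule becomes ready 30 minutes earlier.
print(next_repair(oldest, timedelta(days=7), timedelta(minutes=30)))
# 2023-06-13 15:03:19
```

Subtracting the repair time taken moves the start earlier, so that a repair of similar length can finish before the interval has passed.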
In this example we will use ecctool run-repair to run a manual repair for all ecChronos instances,
for all keyspaces and tables.
The output shows created manual repairs for all ecChronos instances.
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Host Id | Keyspace | Table | Status | Repaired(%) | Completed at | Repair type |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 497eb4cf-9275-4216-9cca-12958bde28af | 6424a5fa-69ea-49a3-a542-4751d0283c9a | test2 | table2 | IN_QUEUE | 0.00 | - | VNODE |
| 497eb4cf-9275-4216-9cca-12958bde28af | 045c01c1-ff50-4de2-8da3-1b9270c382b5 | test2 | table2 | IN_QUEUE | 0.00 | - | VNODE |
| ea32c8b5-3e8a-466a-b5f5-f9248e73774c | 6424a5fa-69ea-49a3-a542-4751d0283c9a | test | table2 | IN_QUEUE | 0.00 | - | VNODE |
| ea32c8b5-3e8a-466a-b5f5-f9248e73774c | 045c01c1-ff50-4de2-8da3-1b9270c382b5 | test | table2 | IN_QUEUE | 0.00 | - | VNODE |
| 0d8845fd-84dc-435c-8b8b-83b701dd2cbd | 045c01c1-ff50-4de2-8da3-1b9270c382b5 | keyspaceWithCamelCase | tableWithCamelCase | IN_QUEUE | 0.00 | - | VNODE |
| 0d8845fd-84dc-435c-8b8b-83b701dd2cbd | 6424a5fa-69ea-49a3-a542-4751d0283c9a | keyspaceWithCamelCase | tableWithCamelCase | IN_QUEUE | 0.00 | - | VNODE |
| c5f830b0-533a-464f-80ed-aa8b90248ba3 | 6424a5fa-69ea-49a3-a542-4751d0283c9a | test2 | table1 | IN_QUEUE | 0.00 | - | VNODE |
| c5f830b0-533a-464f-80ed-aa8b90248ba3 | 045c01c1-ff50-4de2-8da3-1b9270c382b5 | test2 | table1 | IN_QUEUE | 0.00 | - | VNODE |
| 73c27554-58a5-47b9-a2ab-01b9fcfad4f0 | 6424a5fa-69ea-49a3-a542-4751d0283c9a | test | table1 | IN_QUEUE | 0.00 | - | VNODE |
| 73c27554-58a5-47b9-a2ab-01b9fcfad4f0 | 045c01c1-ff50-4de2-8da3-1b9270c382b5 | test | table1 | IN_QUEUE | 0.00 | - | VNODE |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Summary: 0 completed, 10 in queue, 0 blocked, 0 warning, 0 error
Looking at the example output above, the columns are:
Id
- the manual repair ID. A manual repair triggered on several hosts has the same ID on every host.
Host Id
- the host ID of the Cassandra instance that the ecChronos instance responsible for the manual repair is connected to.
Keyspace
- the keyspace the manual repair is run on.
Table
- the table the manual repair is run on.
Status
- the status of the manual repair. This will always be IN_QUEUE for newly run manual repairs.
Repaired(%)
- the number of ranges repaired vs total ranges.
For manual repairs this value should never go down.
This will always be 0 for newly run manual repairs.
Completed at
- the time when the manual repair has finished. This will always be - for newly run manual repairs.
After running this subcommand, use ecctool repairs to check the progress of running manual repairs.
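When following up on newly created manual repairs, the Summary line is the quickest thing to inspect. Below is a minimal sketch, assuming the Summary line has been captured as text in the format shown in the example output. The helper name parse_summary is hypothetical and not part of ecctool.

```python
import re


def parse_summary(line):
    """Parse 'Summary: 0 completed, 10 in queue, ...' into a dict of counts."""
    if not line.startswith("Summary:"):
        raise ValueError(f"not a summary line: {line!r}")
    counts = {}
    # Each entry is a number followed by a lowercase label such as 'in queue'.
    for count, label in re.findall(r"(\d+) ([a-z ]+)", line[len("Summary:"):]):
        counts[label.strip()] = int(count)
    return counts


summary = parse_summary(
    "Summary: 0 completed, 10 in queue, 0 blocked, 0 warning, 0 error"
)
print(summary["in queue"])  # 10
print(summary["in queue"] == 0 and summary["error"] == 0)  # False
```

Since IN_QUEUE covers both waiting and currently running repairs, a non-zero in queue count means that work is still outstanding.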
In this example we will use ecctool repair-info --duration 5m to check how much of each table has been repaired.
The output shows the cluster-wide repair information for all tables in the past 5 minutes.
Time window between '2022-09-23 13:12:54' and '2022-09-23 13:17:54'
---------------------------------------------------------------------------------
| Keyspace | Table | Repaired (%) | Repair time taken |
---------------------------------------------------------------------------------
| keyspaceWithCamelCase | tableWithCamelCase | 73.73 | 3 seconds |
| test | table1 | 73.73 | 3 seconds |
| test | table2 | 73.73 | 4 seconds |
| test2 | table1 | 73.73 | 3 seconds |
| test2 | table2 | 73.73 | 4 seconds |
---------------------------------------------------------------------------------
Looking at the example output above, the columns are:
Keyspace
- the keyspace the repair information corresponds to.
Table
- the table the repair information corresponds to.
Repaired (%)
- the repaired ranges vs total ranges of the table in %.
Repair time taken
- the time taken for Cassandra to finish the repairs.
By default, repair-info fetches the information on a cluster level.
To check the repair information for the local node, use the --local flag.
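The first line of the output states the time window that the figures cover. As a small sketch, assuming that line has been captured as text in the format shown above, the window can be parsed to confirm that it matches the requested duration. The helper name parse_time_window is hypothetical.

```python
import re
from datetime import datetime

WINDOW = re.compile(r"Time window between '(.+?)' and '(.+?)'")


def parse_time_window(line):
    """Return the (start, end) datetimes from the 'Time window between' line."""
    match = WINDOW.search(line)
    if match is None:
        raise ValueError(f"not a time window line: {line!r}")
    start, end = (
        datetime.strptime(value, "%Y-%m-%d %H:%M:%S") for value in match.groups()
    )
    return start, end


start, end = parse_time_window(
    "Time window between '2022-09-23 13:12:54' and '2022-09-23 13:17:54'"
)
print(end - start)  # 0:05:00
```

The difference of five minutes corresponds to the --duration 5m argument used in this example.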
In this example we will use ecctool running-job to check if any job is currently running.
It gives one of these two responses:
No job is currently running
or
Job ID: x-x-x-x-x, Status: Running
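A script that waits for ecChronos to become idle only needs to tell these two responses apart. The following is a minimal sketch, assuming the response text matches the two forms shown above exactly. The helper name parse_running_job is hypothetical and not part of ecctool.

```python
import re


def parse_running_job(response):
    """Return the running job's ID, or None when no job is currently running."""
    response = response.strip()
    if response == "No job is currently running":
        return None
    match = re.fullmatch(r"Job ID: (\S+), Status: Running", response)
    if match is None:
        raise ValueError(f"unexpected response: {response!r}")
    return match.group(1)


print(parse_running_job("No job is currently running"))  # None
# The job ID below is taken from the run-repair example output for illustration.
print(parse_running_job("Job ID: 497eb4cf-9275-4216-9cca-12958bde28af, Status: Running"))
# 497eb4cf-9275-4216-9cca-12958bde28af
```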