UPGRADE.md

Upgrade to 5.X

Caffeine cache

ecChronos has changed its caches to use Caffeine. Because of this, logback.xml has been updated to log only errors from Caffeine. If you override logback.xml, make sure to include the latest changes. Otherwise, Caffeine will spam the logs whenever values in the cache cannot be reloaded (the same behaviour as with the Guava cache).

Incremental repairs

Version 5.X adds support for incremental repairs and for multiple schedules on the same table. To enable multiple schedules for a table, specify each of them in schedule.yaml. If schedule.yaml contains only a single entry, it overrides the default schedule (as in versions before 5.X).

On-demand repairs also support incremental repair. Because of this, the ecchronos.on_demand_repair_status table has a new column, repair_type. The column must be added to the table before upgrading to version 5.X.

The command to add the column is shown below:

ALTER TABLE ecchronos.on_demand_repair_status ADD repair_type text;

Lock Refresh and failure cache

Version 5.X adds support for calculating the lock refresh rate dynamically from the Time-To-Live (TTL) of the ecchronos.lock table. The refresh rate is calculated with the formula TTL/10, which keeps it consistent with the actual TTL of the ecchronos.lock table.

Additionally, a feature has been introduced that allows users to configure the expiry time of the lock failure cache directly within the ecc.yaml file.

These changes do not alter the default behavior.
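As a worked example of the TTL/10 rule above, the sketch below computes the refresh interval for a given TTL. The function name and the TTL value of 600 seconds are assumptions chosen for illustration; they are not taken from ecChronos, and the real implementation may round differently.

```python
def lock_refresh_interval_seconds(ttl_seconds: int) -> float:
    """Return the refresh interval implied by the TTL/10 rule.

    Hypothetical helper for illustration only; not part of ecChronos.
    """
    if ttl_seconds <= 0:
        raise ValueError("TTL must be positive")
    return ttl_seconds / 10


# Assumed TTL of 600 seconds, used only as an example value.
print(lock_refresh_interval_seconds(600))  # 60.0
```

With this rule, a lock is refreshed roughly ten times within its lifetime, so a few missed refreshes do not immediately cause the lock to expire.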

Priority Calculation

Version 5.X makes the unit of time granularity used for priority calculation configurable. Possible values are HOURS, MINUTES, or SECONDS. This unit influences how quickly the priority of a job increases. The default is HOURS for backward compatibility. IMPORTANT: Pause repair operations before changing the granularity. Otherwise, some ecChronos instances may calculate a different priority than others for the same repair, which can lead to inconsistencies.
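To illustrate why instances with different granularities can disagree, the sketch below counts how many whole units a repair is overdue. This is a simplified stand-in for a priority value, not the formula ecChronos actually uses; the function name and the 90-minute example are assumptions.

```python
from datetime import timedelta

# Seconds per unit for the three documented granularity values.
UNIT_SECONDS = {"HOURS": 3600, "MINUTES": 60, "SECONDS": 1}


def overdue_units(overdue: timedelta, granularity: str = "HOURS") -> int:
    """Count whole units by which a repair is overdue.

    Hypothetical priority proxy for illustration only.
    """
    return int(overdue.total_seconds() // UNIT_SECONDS[granularity])


overdue = timedelta(minutes=90)
for unit in UNIT_SECONDS:
    print(unit, overdue_units(overdue, unit))
# HOURS 1
# MINUTES 90
# SECONDS 5400
```

The same 90-minute delay yields 1, 90, or 5400 depending on the unit, so two instances configured with different granularities would rank the same job very differently.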

Upgrade to 4.x

Metrics

Version 4.x has revamped metrics produced by ecChronos. The following major changes have been made:

  • Metric names no longer contain keyspace and table, keyspace and table are used as tags instead.
  • Metrics that were split based on success/failure are now merged into one metric, the success/failure is indicated by a tag.

The following table metrics are available:

| Metric pre 4.x | Metric in 4.x |
| --- | --- |
| RepairSuccessTime | repair.sessions |
| RepairFailedTime | repair.sessions |
| LastRepairedAt | time.since.last.repaired |
| RepairState | repaired.ratio |
| RemainingRepairTime | remaining.repair.time |

The following aggregated metrics are available:

| Metric pre 4.x | Metric in 4.x |
| --- | --- |
| RemainingRepairTime | node.remaining.repair.time |
| TableRepairState | node.repaired.ratio |
| RepairSuccessTime | node.repair.sessions |
| RepairFailedTime | node.repair.sessions |

For more information about the new metrics, see the metrics documentation.
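When migrating dashboards or alerts, the two mappings above can be kept as lookup tables. The sketch below is one possible way to do that; the dictionary and function names are hypothetical, and the tag names that carry keyspace, table, and success/failure are not shown because they are not listed here.

```python
# Table-level metrics: name before 4.x -> name in 4.x.
TABLE_METRICS = {
    "RepairSuccessTime": "repair.sessions",
    "RepairFailedTime": "repair.sessions",
    "LastRepairedAt": "time.since.last.repaired",
    "RepairState": "repaired.ratio",
    "RemainingRepairTime": "remaining.repair.time",
}

# Aggregated (node-level) metrics: name before 4.x -> name in 4.x.
NODE_METRICS = {
    "RemainingRepairTime": "node.remaining.repair.time",
    "TableRepairState": "node.repaired.ratio",
    "RepairSuccessTime": "node.repair.sessions",
    "RepairFailedTime": "node.repair.sessions",
}


def new_metric_name(old_name: str, aggregated: bool = False) -> str:
    """Look up the 4.x metric name for a pre-4.x metric (hypothetical helper)."""
    mapping = NODE_METRICS if aggregated else TABLE_METRICS
    return mapping[old_name]


print(new_metric_name("RepairFailedTime"))                      # repair.sessions
print(new_metric_name("RemainingRepairTime", aggregated=True))  # node.remaining.repair.time
```

Note that the success and failure metrics map to the same 4.x name, so queries that previously used two metric names need to filter on the success/failure tag instead.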

V1 REST API

The v1 REST API, deprecated in version 3.x of ecChronos, has been removed in 4.x.

The following REST API endpoints have been removed:

  • /repair-management/v1/status
  • /repair-management/v1/status/ids
  • /repair-management/v1/status/keyspaces/<keyspace>/tables/<table>
  • /repair-management/v1/config
  • /repair-management/v1/schedule/keyspaces/<keyspace>

For more information about the current REST interface, refer to the REST documentation.

ecctool

The ecctool subcommands deprecated in version 3.x of ecChronos have been removed in 4.x.

The following ecctool subcommands have been removed:

  • repair-status
  • repair-config
  • trigger-repair

For more information about the current ecctool subcommands, refer to the ecctool documentation.

Upgrade to 3.x

From versions 2.x

The REST interface has been significantly reworked. Schedules and repairs are now split into two separate resources. Config is now part of full Schedules. Query parameters are used for filtering instead of path parameters.

| Old | New | Description |
| --- | --- | --- |
| /repair-management/v1/status | /repair-management/v2/[repairs,schedules] | Status has been split into repairs for on-demand repairs and schedules for schedules |
| /repair-management/v1/status/ids | /repair-management/v2/[repairs,schedules]/<id> | Id can now be searched for on repairs or schedules specifically |
| /repair-management/v1/status/keyspaces/<keyspace>/tables/<table> | /repair-management/v2/[repairs,schedules]?keyspace=<keyspace>&table=<table> | keyspace and table are now query parameters |
| /repair-management/v1/config | - | Config has been removed and is part of schedules |
| /repair-management/v1/schedule/keyspaces/<keyspace> | /repair-management/v2/repairs?keyspace=<keyspace>&table=<table> | Triggering can be done by using POST to repairs with query parameters |

For more information about the REST interface, refer to the REST documentation.
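Client scripts that built v1 paths from path parameters need to switch to query parameters. The sketch below shows one way to build the v2 paths listed above; the function name and the example keyspace and table names are hypothetical, and only the path layout is taken from the table.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_PATH = "/repair-management/v2"


def v2_path(resource: str, keyspace: Optional[str] = None,
            table: Optional[str] = None) -> str:
    """Build a v2 path for 'repairs' or 'schedules' (hypothetical helper)."""
    if resource not in ("repairs", "schedules"):
        raise ValueError("resource must be 'repairs' or 'schedules'")
    # Only include the filters that were actually supplied.
    params = {}
    if keyspace is not None:
        params["keyspace"] = keyspace
    if table is not None:
        params["table"] = table
    query = urlencode(params)
    return f"{BASE_PATH}/{resource}" + (f"?{query}" if query else "")


print(v2_path("schedules"))
# /repair-management/v2/schedules
print(v2_path("repairs", keyspace="ks1", table="tbl1"))
# /repair-management/v2/repairs?keyspace=ks1&table=tbl1
```

The same repairs path serves both purposes: a GET lists on-demand repairs, while a POST with the keyspace and table query parameters triggers one.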

Upgrade to 2.x

From 2.0.0

A new column has been added to the table ecchronos.on_demand_repair_status. It must be added before upgrading.

The command to add the column is shown below:

ALTER TABLE ecchronos.on_demand_repair_status ADD completed_time timestamp;

Note: Make sure that you create the column with the cql_type timestamp, since it is not possible to change the cql_type of an existing column.

From versions before 2.0.0

A new table has been introduced and must be present before upgrading.

The required table is shown below:

CREATE TABLE IF NOT EXISTS ecchronos.on_demand_repair_status (
    host_id uuid,
    job_id uuid,
    table_reference frozen<table_reference>,
    token_map_hash int,
    repaired_tokens frozen<set<frozen<token_range>>>,
    status text,
    completed_time timestamp,
    PRIMARY KEY(host_id, job_id))
    WITH default_time_to_live = 2592000
    AND gc_grace_seconds = 0;

An optional configuration parameter for remote routing has been introduced. The default value is true.

This can be configured in conf/ecc.yml:

cql:
  remoteRouting: false