Skip to content

fix(crates): Detect stale upload caches and retry #98

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jun 18, 2020
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
57 changes: 51 additions & 6 deletions src/targets/crates.ts
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,11 @@ import { GithubGlobalConfig, TargetConfig } from '../schemas/project_config';
import { forEachChained } from '../utils/async';
import { ConfigurationError } from '../utils/errors';
import { withTempDir } from '../utils/files';
import { checkExecutableIsPresent, spawnProcess } from '../utils/system';
import {
checkExecutableIsPresent,
sleepAsync,
spawnProcess,
} from '../utils/system';
import { BaseTarget } from './base';
import { BaseArtifactProvider } from '../artifact_providers/base';

Expand All @@ -24,6 +28,30 @@ const DEFAULT_CARGO_BIN = 'cargo';
*/
const CARGO_BIN = process.env.CARGO_BIN || DEFAULT_CARGO_BIN;

/**
* A message fragment emitted by cargo when publishing fails due to a missing
* dependency. This sometimes indicates a false positive if the cache has not
* been updated.
*/
const VERSION_ERROR = 'failed to select a version for the requirement';

/**
* Maximum number of attempts including the initial one when publishing fails
* due to a stale cache. After this number of retries, publishing fails.
*/
const MAX_ATTEMPTS = 5;

/**
* Initial delay to wait between publish retries in seconds. Exponential backoff
* is applied to this delay on retries.
*/
const RETRY_DELAY_SECS = 2;

/**
* Exponential backoff that is applied to the initial retry delay.
*/
const RETRY_EXP_FACTOR = 2;

/** Options for "crates" target */
export interface CratesTargetOptions extends TargetConfig {
/** Crates API token */
Expand Down Expand Up @@ -203,9 +231,9 @@ export class CratesTarget extends BaseTarget {

const crates = this.getPublishOrder(unorderedCrates);
logger.debug(
`Publishing packages in the following order: [${crates
`Publishing packages in the following order: ${crates
.map(c => c.name)
.toString()}]`
.join(', ')}`
);
return forEachChained(crates, async crate => this.publishPackage(crate));
}
Expand All @@ -227,9 +255,26 @@ export class CratesTarget extends BaseTarget {
crate.manifest_path
);

return spawnProcess(CARGO_BIN, args, {
env: { ...process.env, CARGO_REGISTRY_TOKEN: this.cratesConfig.apiToken },
});
const env = {
...process.env,
CARGO_REGISTRY_TOKEN: this.cratesConfig.apiToken,
};

let delay = RETRY_DELAY_SECS;
logger.info(`Publishing ${crate.name}`);
for (let i = 0; i <= MAX_ATTEMPTS; i++) {
try {
return await spawnProcess(CARGO_BIN, args, { env });
} catch (e) {
if (i < MAX_ATTEMPTS && e.message.includes(VERSION_ERROR)) {
Copy link
Member

@BYK BYK Jun 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This differs from the for loop's condition, intentional? I find having two checks a bit confusing btw. Feels like this should be a while loop. How about a slightly different approach?

const MAX_WAIT_SECS = 60;
let totalWait = 0;
let delay = RETRY_DELAY_SECS;
let error;

do {
  if (error) {
    if (error.message.includes(VERSION_ERROR)) {
      logger.warn(`Publish failed due to potentially stale cache. Trying again in ${delay}s...`);
      await sleepAsync(delay * 1000);
      totalWait += delay;
      delay *= RETRY_EXP_FACTOR;
    } else {
      break;
    }
  }
  try {
    return await spawnProcess(CARGO_BIN, args, { env });
  } catch (err) {
    error = err;
  }
} while (totalWait <= MAX_WAIT_SECS)

throw error;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this is intentional. The loop needs to to take another turn so that we get into the else branch. In your implementation, you do not throw the error in the last iteration.

I thought about a total wait time (which could be written as a for loop, too), but rather wanted a fixed number of retries. And then doing RETRY_DELAY_SECS * Math.pow(RETRY_EXP_FACTOR, MAX_ATTEMPTS) seemed to cumbersome.

Copy link
Member

@BYK BYK Jun 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In your implementation, you do not throw the error in the last iteration.

Yes I do as I do not check the count or total wait in the if separately.

I need more sleep/coffee.

logger.warn(`Publish failed, trying again in ${delay}s...`);
await sleepAsync(delay * 1000);
delay *= RETRY_EXP_FACTOR;
} else {
throw e;
}
}
}
}

/**
Expand Down