Table of Contents

author

Represents the author of a change

authoring

The authors mapping between an origin and a destination

authoring.overwrite

Use the default author for all the submits in the destination. Note that some destinations might choose to ignore this author and use the current user running the tool (in other words, they don't allow impersonation).

authoring_class authoring.overwrite(default)

Parameters:

Parameter Description
default string

The default author for commits in the destination

Example:

Overwrite usage example:

Create an authoring object that will overwrite any origin author with [email protected] mail.

authoring.overwrite("Foo Bar <[email protected]>")

new_author

Create a new author from a string with the form 'name [email protected]'

author new_author(author_string)

Parameters:

Parameter Description
author_string string

A string representation of the author with the form 'name [email protected]'

Example:

Create a new author:
new_author('Foo Bar <[email protected]>')

authoring.pass_thru

Use the origin author as the author in the destination, no whitelisting.

authoring_class authoring.pass_thru(default)

Parameters:

Parameter Description
default string

The default author for commits in the destination. This is used in squash mode workflows or if author cannot be determined.

Example:

Pass thru usage example:
authoring.pass_thru(default = "Foo Bar <[email protected]>")

authoring.whitelisted

Use the origin author as the author in the destination only if the author is whitelisted. Otherwise, use the default author.

authoring_class authoring.whitelisted(default, whitelist)

Parameters:

Parameter Description
default string

The default author for commits in the destination. This is used in squash mode workflows or when users are not whitelisted.

whitelist sequence of string

List of whitelisted authors in the origin. The authors must be unique.

Examples:

Only pass thru whitelisted users:
authoring.whitelisted(
    default = "Foo Bar <[email protected]>",
    whitelist = [
       "[email protected]",
       "[email protected]",
       "[email protected]",
    ],
)
Only pass thru whitelisted LDAPs/usernames:

Some repositories are not based on email but use LDAPs/usernames. This is also supported since it is up to the origin how to check whether two authors are the same.

authoring.whitelisted(
    default = "Foo Bar <[email protected]>",
    whitelist = [
       "someuser",
       "other",
       "another",
    ],
)
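The selection rule described above can be sketched in plain Python. This is only an illustration, not Copybara's implementation: `resolve_author`, its parameters, and the example addresses are hypothetical, and the real comparison between authors is delegated to the origin.

```python
def resolve_author(origin_author, origin_id, default, whitelist, squash=False):
    """Pick the author to use in the destination.

    origin_author: full author string from the origin change.
    origin_id: the identifier the origin compares on (email or username).
    Assumption: squash workflows always use the default author.
    """
    if not squash and origin_id in whitelist:
        return origin_author
    return default


# Placeholder identities, not taken from the reference.
DEFAULT = "Foo Bar <noreply@example.com>"
WHITELIST = ["someuser", "other", "another"]

kept = resolve_author("Some User <someuser>", "someuser", DEFAULT, WHITELIST)
replaced = resolve_author("Stranger <stranger>", "stranger", DEFAULT, WHITELIST)
```

Here `kept` is the origin author, while `replaced` falls back to the default because `stranger` is not in the whitelist.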

authoring_class

The authors mapping between an origin and a destination

Console

A console that can be used in Skylark transformations to print info, warning, or error messages.

metadata

Core transformations for the change metadata

metadata.squash_notes

Generate a message that includes a constant prefix text and a list of changes included in the squash change.

transformation metadata.squash_notes(prefix='Copybara import of the project:\n\n', max=100, compact=True, show_ref=True, show_author=True, show_description=True, oldest_first=False, use_merge=True)

Parameters:

Parameter Description
prefix string

A prefix to be printed before the list of commits.

max integer

Max number of commits to include in the message. For the rest a comment like (and x more) will be included. By default 100 commits are included.

compact boolean

If compact is set, each change will be shown in just one line

show_ref boolean

If each change reference should be present in the notes

show_author boolean

If each change author should be present in the notes

show_description boolean

If each change description should be present in the notes

oldest_first boolean

If set to true, the list shows the oldest changes first. Otherwise it shows the newest changes first.

use_merge boolean

If true then merge changes are included in the squash notes

Examples:

Simple usage:

'Squash notes' default is to print one line per change with information about the author

metadata.squash_notes("Changes for Project Foo:\n")

This transform will generate changes like:

Changes for Project Foo:

  - 1234abcde second commit description by Foo Bar <[email protected]>
  - a4321bcde first commit description by Foo Bar <[email protected]>
Removing authors and reversing the order:
metadata.squash_notes("Changes for Project Foo:\n",
    oldest_first = True,
    show_author = False,
)

This transform will generate changes like:

Changes for Project Foo:

  - a4321bcde first commit description
  - 1234abcde second commit description
Removing description:
metadata.squash_notes("Changes for Project Foo:\n",
    show_description = False,
)

This transform will generate changes like:

Changes for Project Foo:

  - a4321bcde by Foo Bar <[email protected]>
  - 1234abcde by Foo Bar <[email protected]>
Showing the full message:
metadata.squash_notes(
  prefix = 'Changes for Project Foo:',
  compact = False
)

This transform will generate changes like:

Changes for Project Foo:
--
2 by Foo Baz <[email protected]>:

second commit

Extended text
--
1 by Foo Bar <[email protected]>:

first commit

Extended text
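To make the compact-mode options concrete, here is a minimal Python sketch that builds a similar list. It is an approximation under stated assumptions, not the tool's code: the function name, the `changes` data shape (dicts, newest first), the `max_changes` name (standing in for `max`), and the exact whitespace and placement of the "(and x more)" line are all hypothetical.

```python
def squash_notes(changes, prefix="Copybara import of the project:\n\n",
                 max_changes=100, show_ref=True, show_author=True,
                 show_description=True, oldest_first=False):
    """Build compact squash notes from `changes`, a newest-first list of dicts."""
    ordered = list(reversed(changes)) if oldest_first else list(changes)
    lines = []
    for change in ordered[:max_changes]:
        parts = []
        if show_ref:
            parts.append(change["ref"])
        if show_description:
            parts.append(change["description"])
        if show_author:
            parts.append("by " + change["author"])
        lines.append("  - " + " ".join(parts))
    hidden = len(ordered) - max_changes
    if hidden > 0:
        # Summarize the changes that did not fit under the limit.
        lines.append("  (and %d more)" % hidden)
    return prefix + "\n".join(lines) + "\n"


# Placeholder author address, not taken from the reference.
CHANGES = [
    {"ref": "1234abcde", "description": "second commit description",
     "author": "Foo Bar <foo@example.com>"},
    {"ref": "a4321bcde", "description": "first commit description",
     "author": "Foo Bar <foo@example.com>"},
]

notes = squash_notes(CHANGES, prefix="Changes for Project Foo:\n",
                     oldest_first=True, show_author=False)
```

With `oldest_first=True` and `show_author=False`, `notes` lists the first commit before the second and omits the "by ..." suffix, mirroring the second example above.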

metadata.save_author

For a given change, store a copy of the author as a label with the name ORIGINAL_AUTHOR.

transformation metadata.save_author(label='ORIGINAL_AUTHOR')

Parameters:

Parameter Description
label string

The label to use for storing the author

metadata.map_author

Map the author name and mail to another author. The mapping can be done by both name and mail, or by either one alone.

transformation metadata.map_author(authors, reversible=False, noop_reverse=False, fail_if_not_found=False, reverse_fail_if_not_found=False, map_all_changes=False)

Parameters:

Parameter Description
authors dict

The author mapping. Keys can be in the form of 'Your Name', 'some@mail' or 'Your Name some@mail'. The mapping applies heuristics to know which field to use in the mapping. The value has to be always in the form of 'Your Name some@mail'

reversible boolean

If the transform is automatically reversible. Workflows using the reverse of this transform will be able to automatically map values to keys.

noop_reverse boolean

If true, the reversal of the transformation doesn't do anything. This is useful to avoid having to write core.transform(metadata.map_author(...), reversal = []).

fail_if_not_found boolean

Fail if a mapping cannot be found. Helps to discover early the authors that should be in the map.

reverse_fail_if_not_found boolean

Same as fail_if_not_found but when the transform is used in an inverse workflow.

map_all_changes boolean

If all changes being migrated should be mapped. Useful for getting a mapped metadata.squash_notes. By default we only map the current author.

Example:

Map some names, emails and complete authors:

Here we show how to map authors using different options:

metadata.map_author({
    'john' : 'Some Person <[email protected]>',
    '[email protected]' : 'Other Person <[email protected]>',
    'John Example <[email protected]>' : 'Another Person <[email protected]>',
})

metadata.use_last_change

Use metadata (message and/or author) from the last change being migrated. Useful when using 'SQUASH' mode but the user only cares about the last change.

transformation metadata.use_last_change(author=True, message=True, default_message=None, use_merge=True)

Parameters:

Parameter Description
author boolean

Replace author with the last change author (could still be the default author if not whitelisted or when using authoring.overwrite).

message boolean

Replace message with last change message.

default_message string

Replace message with last change message.

use_merge boolean

If true then merge changes are taken into account for looking for the last change.

metadata.expose_label

Certain labels are present in the internal metadata but are not exposed in the message by default. This transformation finds a label in the internal metadata and exposes it in the message. If the label is already present in the message, it is updated to use the new name and separator.

transformation metadata.expose_label(name, new_name=label, separator="=", ignore_label_not_found=True)

Parameters:

Parameter Description
name string

The label to search

new_name string

The name to use in the message

separator string

The separator to use when adding the label to the message

ignore_label_not_found boolean

If a label is not found, ignore the error and continue.

Examples:

Simple usage:

Expose a hidden label called 'REVIEW_URL':

metadata.expose_label('REVIEW_URL')

This would add it as REVIEW_URL=the_value.

New label name:

Expose a hidden label called 'REVIEW_URL' as GIT_REVIEW_URL:

metadata.expose_label('REVIEW_URL', 'GIT_REVIEW_URL')

This would add it as GIT_REVIEW_URL=the_value.

Custom separator:

Expose the label with a custom separator

metadata.expose_label('REVIEW_URL', separator = ': ')

This would add it as REVIEW_URL: the_value.

metadata.restore_author

For a given change, restore the author present in the ORIGINAL_AUTHOR label as the author of the change.

transformation metadata.restore_author(label='ORIGINAL_AUTHOR', search_all_changes=False)

Parameters:

Parameter Description
label string

The label to use for restoring the author

search_all_changes boolean

By default Copybara only looks in the last current change for the author label. Setting this to true searches all current changes (only makes sense for SQUASH/CHANGE_REQUEST).

metadata.add_header

Adds a header line to the commit message. Any variable present in the message in the form of ${LABEL_NAME} will be replaced by the corresponding label in the message. Note that this requires that the label is already in the message or in any of the changes being imported. The label in the message takes priority over the ones in the list of original messages of changes imported.

transformation metadata.add_header(text, ignore_label_not_found=False, new_line=True)

Parameters:

Parameter Description
text string

The header text to include in the message. For example '[Import of foo ${LABEL}]'. This would construct a message resolving ${LABEL} to the corresponding label.

ignore_label_not_found boolean

If a label used in the template is not found, ignore the error and don't add the header. By default it will stop the migration and fail.

new_line boolean

Whether a new line should be added between the header and the original message. Setting this to false allows creating messages like HEADER: ORIGINAL_MESSAGE.

Examples:

Add a header always:

Adds a header to any message

metadata.add_header("COPYBARA CHANGE")

Messages like:

A change

Example description for
documentation

Will be transformed into:

COPYBARA CHANGE
A change

Example description for
documentation
Add a header that uses a label:

Adds a header to messages that contain a label. Otherwise it skips the message manipulation.

metadata.add_header("COPYBARA CHANGE FOR https://github.com/myproject/foo/pull/${GITHUB_PR_NUMBER}",
    ignore_label_not_found = True,
)

A change message, imported using git.github_pr_origin, like:

A change

Example description for
documentation

Will be transformed into:

COPYBARA CHANGE FOR https://github.com/myproject/foo/pull/1234
Example description for
documentation

GIT_URL=http://foo.com/1234

This assumes the PR number is 1234. Any change without that label will not be transformed.

Add a header without new line:

Adds a header without adding a new line before the original message:

metadata.add_header("COPYBARA CHANGE: ", new_line = False)

Messages like:

A change

Example description for
documentation

Will be transformed into:

COPYBARA CHANGE: A change

Example description for
documentation
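The label substitution and the two flags can be mimicked in a few lines of Python. This is a sketch under assumptions, not the tool's implementation: `add_header` here is a hypothetical standalone function, labels are passed in as a plain dict, and the error type is invented for the example.

```python
import re

LABEL = re.compile(r"\$\{(\w+)\}")


class LabelNotFound(Exception):
    """Raised when a ${LABEL} in the template has no known value."""


def add_header(message, text, labels, ignore_label_not_found=False, new_line=True):
    """Prepend `text` to `message`, resolving ${LABEL} tokens from `labels`."""
    missing = [name for name in LABEL.findall(text) if name not in labels]
    if missing:
        if ignore_label_not_found:
            # Skip the manipulation and leave the message as it was.
            return message
        raise LabelNotFound(", ".join(missing))
    header = LABEL.sub(lambda m: labels[m.group(1)], text)
    return header + ("\n" if new_line else "") + message


MESSAGE = "A change\n\nExample description for\ndocumentation\n"

with_label = add_header(
    MESSAGE,
    "COPYBARA CHANGE FOR https://github.com/myproject/foo/pull/${GITHUB_PR_NUMBER}",
    {"GITHUB_PR_NUMBER": "1234"},
)
inline = add_header(MESSAGE, "COPYBARA CHANGE: ", {}, new_line=False)
```

`with_label` starts with the resolved URL on its own line, and `inline` begins with `COPYBARA CHANGE: A change`.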

metadata.replace_message

Replace the change message with a template text. Any variable present in the message in the form of ${LABEL_NAME} will be replaced by the corresponding label in the message. Note that this requires that the label is already in the message or in any of the changes being imported. The label in the message takes priority over the ones in the list of original messages of changes imported.

transformation metadata.replace_message(text, ignore_label_not_found=False)

Parameters:

Parameter Description
text string

The template text to use for the message. For example '[Import of foo ${LABEL}]'. This would construct a message resolving ${LABEL} to the corresponding label.

ignore_label_not_found boolean

If a label used in the template is not found, ignore the error and don't add the header. By default it will stop the migration and fail.

Example:

Replace the message:

Replace the original message with a text:

metadata.replace_message("COPYBARA CHANGE: Import of ${GITHUB_PR_NUMBER}\n\n${GITHUB_PR_BODY}\n")

Will transform the message to:

COPYBARA CHANGE: Import of 12345
Body from Github Pull Request

metadata.scrubber

Removes part of the change message using a regex

transformation metadata.scrubber(regex, replacement='')

Parameters:

Parameter Description
regex string

Any text matching the regex will be removed. Note that the regex is run in multiline mode.

replacement string

Text replacement for the matching substrings. References to regex group numbers can be used in the form of $1, $2, etc.

Examples:

Remove from a keyword to the end of the message:

When change messages are in the following format:

Public change description

This is a public description for a commit

CONFIDENTIAL:
This fixes internal project foo-bar

Using the following transformation:

metadata.scrubber('(^|\n)CONFIDENTIAL:(.|\n)*')

Will remove the confidential part, leaving the message as:

Public change description

This is a public description for a commit

Keep only message enclosed in <public></public> tags:

The previous example is prone to leak confidential information since a developer could easily forget to include the CONFIDENTIAL label. A different approach for this is to scrub everything by default except what is explicitly allowed. For example, the following scrubber would remove anything not enclosed in <public></public> tags:

metadata.scrubber('^(?:\n|.)*<public>((?:\n|.)*)</public>(?:\n|.)*$', replacement = '$1')

So a message like:

this
is
very confidential<public>but this is public
very public
</public>
and this is a secret too

would be transformed into:

but this is public
very public
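Both scrubber patterns can be tried out locally before putting them in a config. The sketch below uses Python's `re` module as a stand-in for the tool's regex engine, which is an assumption: the two engines differ in places, although these particular patterns behave the same way. The `scrub` helper is hypothetical.

```python
import re


def scrub(message, regex, replacement=""):
    """Apply `regex` in multiline mode, accepting $1-style group references."""
    # Convert $1, $2, ... into Python's \g<1>, \g<2>, ... replacement syntax.
    py_replacement = re.sub(r"\$(\d+)", lambda m: r"\g<%s>" % m.group(1), replacement)
    return re.sub(regex, py_replacement, message, flags=re.MULTILINE)


confidential_message = (
    "Public change description\n"
    "\n"
    "This is a public description for a commit\n"
    "\n"
    "CONFIDENTIAL:\n"
    "This fixes internal project foo-bar\n"
)
tagged_message = (
    "this\n"
    "is\n"
    "very confidential<public>but this is public\n"
    "very public\n"
    "</public>\n"
    "and this is a secret too"
)

scrubbed = scrub(confidential_message, r"(^|\n)CONFIDENTIAL:(.|\n)*")
public_only = scrub(
    tagged_message,
    r"^(?:\n|.)*<public>((?:\n|.)*)</public>(?:\n|.)*$",
    replacement="$1",
)
```

`scrubbed` keeps only the two public paragraphs, and `public_only` keeps only the text between the tags.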

metadata.verify_match

Verifies that a RegEx matches (or does not match) the change message. Does not transform anything, but will stop the workflow if it fails.

transformation metadata.verify_match(regex, verify_no_match=False)

Parameters:

Parameter Description
regex string

The regex pattern to verify. The re2j pattern will be applied in multiline mode, i.e. '^' refers to the beginning of a file and '$' to its end.

verify_no_match boolean

If true, the transformation will verify that the RegEx does not match.

Example:

Check that a text is present in the change description:

Check that the change message contains a text enclosed in <public></public>:

metadata.verify_match("<public>(.|\n)*</public>")

metadata.map_references

Allows updating links to references in commit messages to match the destination's format. Note that this will only consider the 5000 latest commits.

referenceMigrator metadata.map_references(before, after, regex_groups={}, additional_import_labels=[])

Parameters:

Parameter Description
before string

Template for origin references in the change message. Use a '${reference}' token to capture the actual references. E.g. if the origin uses links like 'http://changes?1234', the template would be 'http://internalReviews.com/${reference}', with reference_regex = '[0-9]+'.

after string

Format for references in the destination, use the token '${reference}' to represent the destination reference. E.g. 'http://changes(${reference})'.

regex_groups dict

Regexes for the ${reference} token's content. Requires one 'before_ref' entry matching the ${reference} token's content on the before side. Optionally accepts one 'after_ref' used for validation. Copybara uses re2 syntax.

additional_import_labels sequence of string

Meant to be used when migrating from another tool: by default, Copybara will only recognize the labels defined in the workflow's endpoints. The tool will use these additional labels to find labels created by other invocations and tools.

Example:

Map references, origin source of truth:

Finds links to commits in change messages and searches the destination for the equivalent reference. Then replaces matches of 'before' with 'after', replacing the subgroup matched with the destination reference. Assume a message like 'Fixes bug introduced in origin/abcdef', where the origin change 'abcdef' was migrated as '123456' to the destination.

metadata.map_references(
    before = "origin/${reference}",
    after = "destination/${reference}",
    regex_groups = {
        "before_ref": "[0-9a-f]+",
        "after_ref": "[0-9]+",
    },
),

This would be translated into 'Fixes bug introduced in destination/123456', provided that a change with the proper label was found - the message remains unchanged otherwise.
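The rewrite step can be approximated in Python as follows. This is a sketch, not the tool's logic: the `lookup` dict stands in for the search through destination history, the function name is hypothetical, and leaving the text unchanged when the destination reference fails the 'after_ref' check is an assumption.

```python
import re


def map_references(message, before, after, regex_groups, lookup):
    """Rewrite origin references in `message` to their destination form.

    lookup: dict of origin reference -> destination reference (assumed input).
    Assumes `before` contains exactly one ${reference} token.
    """
    prefix, suffix = before.split("${reference}")
    pattern = re.compile(
        re.escape(prefix) + "(" + regex_groups["before_ref"] + ")" + re.escape(suffix)
    )
    after_ref = regex_groups.get("after_ref")

    def substitute(match):
        destination_ref = lookup.get(match.group(1))
        if destination_ref is None:
            return match.group(0)  # no migrated change found: keep the text
        if after_ref and not re.fullmatch(after_ref, destination_ref):
            return match.group(0)  # failed validation: keep the text
        return after.replace("${reference}", destination_ref)

    return pattern.sub(substitute, message)


GROUPS = {"before_ref": "[0-9a-f]+", "after_ref": "[0-9]+"}

mapped = map_references(
    "Fixes bug introduced in origin/abcdef",
    before="origin/${reference}",
    after="destination/${reference}",
    regex_groups=GROUPS,
    lookup={"abcdef": "123456"},
)
```

`mapped` reads 'Fixes bug introduced in destination/123456'. With an empty `lookup`, the message comes back unchanged.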

core

Core functionality for creating migrations, and basic transformations.

glob

Glob returns a list of every file in the workdir that matches at least one pattern in include and does not match any of the patterns in exclude.

glob glob(include, exclude=[])

Parameters:

Parameter Description
include sequence of string

The list of glob patterns to include

exclude sequence of string

The list of glob patterns to exclude

Examples:

Simple usage:

Include all the files under a folder except for internal folder files:

glob(["foo/**"], exclude = ["foo/internal/**"])
Multiple folders:

Globs can have multiple inclusive rules:

glob(["foo/**", "bar/**", "baz/**.java"])

This will include all files inside foo and bar folders and Java files inside baz folder.

Multiple excludes:

Globs can have multiple exclusive rules:

glob(["foo/**"], exclude = ["foo/internal/**", "foo/confidential/**" ])

Include all the files of foo except the ones in internal and confidential folders

All BUILD files recursively:

Copybara uses Java globbing, which is very similar to Bash globbing. This means that recursive globbing for a filename is a bit more tricky:

glob(["BUILD", "**/BUILD"])

This is the correct way of matching all BUILD files recursively, including the one in the root. **/BUILD would only match BUILD files in subdirectories.

Matching multiple strings with one expression:

While two globs can be used for matching two directories, there is a more compact approach:

glob(["{java,javatests}/**"])

This matches any file in java and javatests folders.
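The include/exclude semantics and the `**/BUILD` subtlety can be checked with a small Python approximation of the glob rules. This is a simplified sketch, not Java's glob implementation: it only handles `**`, `*`, `?`, and flat `{a,b}` groups, and `glob_to_regex` and `glob` are hypothetical helpers.

```python
import re


def glob_to_regex(pattern):
    """Translate a glob into a regex: ** crosses '/', * and ? do not."""
    out = []
    i = 0
    in_braces = False
    while i < len(pattern):
        if pattern.startswith("**", i):
            out.append(".*")
            i += 2
            continue
        c = pattern[i]
        if c == "*":
            out.append("[^/]*")
        elif c == "?":
            out.append("[^/]")
        elif c == "{":
            out.append("(?:")
            in_braces = True
        elif c == "}" and in_braces:
            out.append(")")
            in_braces = False
        elif c == "," and in_braces:
            out.append("|")
        else:
            out.append(re.escape(c))
        i += 1
    return re.compile("".join(out))


def glob(include, exclude=()):
    """Return a predicate: matches an include pattern and no exclude pattern."""
    included = [glob_to_regex(p) for p in include]
    excluded = [glob_to_regex(p) for p in exclude]

    def matches(path):
        return (any(r.fullmatch(path) for r in included)
                and not any(r.fullmatch(path) for r in excluded))

    return matches


all_builds = glob(["BUILD", "**/BUILD"])
nested_builds_only = glob(["**/BUILD"])
public_foo = glob(["foo/**"], exclude=["foo/internal/**"])
```

`nested_builds_only("BUILD")` is false because `**/BUILD` requires at least one directory separator, which is why the root `BUILD` needs its own pattern.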

core.reverse

Given a list of transformations, returns the list of transformations equivalent to undoing all the transformations

sequence core.reverse(transformations)

Parameters:

Parameter Description
transformations sequence of transformation

The transformations to reverse

core.workflow

Defines a migration pipeline which can be invoked via the Copybara command.

Implicit labels that can be used/exposed:

  • COPYBARA_CONTEXT_REFERENCE: Requested reference. For example if copybara is invoked as copybara copy.bara.sky workflow master, the value would be master.
  • COPYBARA_LAST_REV: Last reference that was migrated
  • COPYBARA_CURRENT_REV: The current reference being migrated
  • COPYBARA_CURRENT_MESSAGE: The current message at this point of the transformations
  • COPYBARA_CURRENT_MESSAGE_TITLE: The current message title (first line) at this point of the transformations

core.workflow(name, origin, destination, authoring, transformations=[], origin_files=glob(['**']), destination_files=glob(['**']), mode="SQUASH", reversible_check=True for 'CHANGE_REQUEST' mode. False otherwise, check_last_rev_state=False, ask_for_confirmation=False, dry_run=False, after_migration=[])

Parameters:

Parameter Description
name string

The name of the workflow.

origin origin

Where to read the code to be migrated from, before applying the transformations. This is usually a VCS like Git, but can also be a local folder or even a pending change in a code review system like Gerrit.

destination destination

Where to write to the code being migrated, after applying the transformations. This is usually a VCS like Git, but can also be a local folder or even a pending change in a code review system like Gerrit.

authoring authoring_class

The author mapping configuration from origin to destination.

transformations sequence

The transformations to be run for this workflow. They will run in sequence.

origin_files glob

A glob relative to the workdir that will be read from the origin during the import. For example glob(["**.java"]), all java files, recursively, which excludes all other file types.

destination_files glob

A glob relative to the root of the destination repository that matches files that are part of the migration. Files NOT matching this glob will never be removed, even if the file does not exist in the source. For example glob(['**'], exclude = ['**/BUILD']) keeps all BUILD files in destination when the origin does not have any BUILD files. You can also use this to limit the migration to a subdirectory of the destination, e.g. glob(['java/src/**'], exclude = ['**/BUILD']) to only affect non-BUILD files in java/src.

mode string

Workflow mode. Currently we support four modes:

  • 'SQUASH': Create a single commit in the destination with new tree state.
  • 'ITERATIVE': Import each origin change individually.
  • 'CHANGE_REQUEST': Import a pending change to the Source-of-Truth. This could be a GH Pull Request, a Gerrit Change, etc. The final intention should be to submit the change.
  • 'CHANGE_REQUEST_FROM_SOT': Import a pending change from the Source-of-Truth. This mode is useful when, despite the pending change being already in the SoT, the users want to review the code on a different system. The final intention should never be to submit in the destination, but just review or test

reversible_check boolean

Indicates if the tool should try to reverse all the transformations at the end to check that they are reversible.
The default value is True for 'CHANGE_REQUEST' mode. False otherwise

check_last_rev_state boolean

If set to true, Copybara will validate that the destination didn't change since last-rev import for destination_files. Note that this flag doesn't work for CHANGE_REQUEST mode.

ask_for_confirmation boolean

Indicates that the tool should show the diff and require user's confirmation before making a change in the destination.

dry_run boolean

Run the migration in dry-run mode. Some destination implementations might have some side effects (like creating a code review), but never submit to a main branch.

after_migration sequence

Run a feedback workflow after one migration happens. STILL WIP

Command line flags:

| Name | Type | Description |
| --- | --- | --- |
| --change_request_parent | string | Commit revision to be used as parent when importing a commit using CHANGE_REQUEST workflow mode. This shouldn't be needed in general as Copybara is able to detect the parent commit message. |
| --last-rev | string | Last revision that was migrated to the destination |
| --init-history | boolean | Import all the changes from the beginning of the history up to the resolved ref. For 'ITERATIVE' workflows this will import individual changes since the first one. For 'SQUASH' it will import the squashed change up to the resolved ref. WARNING: Use with care, this flag should be used only for the very first run of Copybara for a workflow. |
| --iterative-limit-changes | int | Import just a number of changes instead of all the pending ones |
| --ignore-noop | boolean | Only warn about operations/transforms that didn't have any effect. For example: A transform that didn't modify any file, non-existent origin directories, etc. |
| --squash-skip-history | boolean | Avoid exposing the history of changes that are being migrated. This is useful when we want to migrate a new repository but we don't want to expose all the change history to metadata.squash_notes. |
| --import-noop-changes | boolean | By default Copybara will only try to migrate changes that could affect the destination, ignoring changes that only affect excluded files in origin_files. This flag disables that behavior and runs for all the changes. |
| --workflow-identity-user | string | Use a custom string as a user for computing change identity |
| --check-last-rev-state | boolean | If enabled, Copybara will validate that the destination didn't change since last-rev import for destination_files. Note that this flag doesn't work for CHANGE_REQUEST mode. |
| --dry-run | boolean | Run the migration in dry-run mode. Some destination implementations might have some side effects (like creating a code review), but never submit to a main branch. |
| --threads | int | Number of threads to use when running transformations that change a lot of files |
| --threads-min-size | int | Minimum size of the lists to process to run them in parallel |
| --notransformation-join | boolean | By default Copybara tries to join certain transformations in one so that it is more efficient. This disables the feature. |
| --read-config-from-change | boolean | For each imported origin change, load the configuration from that change. |

core.move

Moves files between directories and renames files

transformation core.move(before, after, paths=glob(["**"]), overwrite=False)

Parameters:

Parameter Description
before string

The name of the file or directory before moving. If this is the empty string and 'after' is a directory, then all files in the workdir will be moved to the sub directory specified by 'after', maintaining the directory tree.

after string

The name of the file or directory after moving. If this is the empty string and 'before' is a directory, then all files in 'before' will be moved to the repo root, maintaining the directory tree inside 'before'.

paths glob

A glob expression relative to 'before' if it represents a directory. Only files matching the expression will be moved. For example, glob(["**.java"]), matches all java files recursively inside 'before' folder. Defaults to match all the files recursively.

overwrite boolean

Overwrite destination files if they already exist. Note that this makes the transformation non-reversible, since there is no way to know if the file was overwritten or not in the reverse workflow.

Examples:

Move a directory:

Move all the files in a directory to another directory:

core.move("foo/bar_internal", "bar")

In this example, foo/bar_internal/one will be moved to bar/one.

Move all the files to a subfolder:

Move all the files in the checkout dir into a directory called foo:

core.move("", "foo")

In this example, one and two/bar will be moved to foo/one and foo/two/bar.

Move a subfolder's content to the root:

Move the contents of a folder to the checkout root directory:

core.move("foo", "")

In this example, foo/bar would be moved to bar.
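The path arithmetic behind the three examples can be written out as a short Python sketch. It only models how a single path is renamed and ignores the `paths` glob and the `overwrite` flag. `move_path` is a hypothetical helper, not part of the tool.

```python
def move_path(path, before, after):
    """Return where `path` ends up after a move from `before` to `after`.

    Paths outside `before` are returned unchanged. An empty `before` means
    the workdir root; an empty `after` means the repo root.
    """
    if before == "":
        relative = path
    elif path == before:
        relative = ""  # `before` names this exact file
    elif path.startswith(before + "/"):
        relative = path[len(before) + 1:]
    else:
        return path
    if relative == "":
        return after
    return after + "/" + relative if after else relative


moved_dir = move_path("foo/bar_internal/one", "foo/bar_internal", "bar")
into_subfolder = move_path("two/bar", "", "foo")
to_root = move_path("foo/bar", "foo", "")
```

These three calls reproduce the results stated in the examples: `bar/one`, `foo/two/bar`, and `bar`.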

core.copy

Copies files between directories and renames files

transformation core.copy(before, after, paths=glob(["**"]), overwrite=False)

Parameters:

Parameter Description
before string

The name of the file or directory to copy. If this is the empty string and 'after' is a directory, then all files in the workdir will be copied to the sub directory specified by 'after', maintaining the directory tree.

after string

The name of the file or directory destination. If this is the empty string and 'before' is a directory, then all files in 'before' will be copied to the repo root, maintaining the directory tree inside 'before'.

paths glob

A glob expression relative to 'before' if it represents a directory. Only files matching the expression will be copied. For example, glob(["**.java"]), matches all java files recursively inside 'before' folder. Defaults to match all the files recursively.

overwrite boolean

Overwrite destination files if they already exist. Note that this makes the transformation non-reversible, since there is no way to know if the file was overwritten or not in the reverse workflow.

Examples:

Copy a directory:

Copy all the files in a directory to another directory:

core.copy("foo/bar_internal", "bar")

In this example, foo/bar_internal/one will be copied to bar/one.

Copy with reversal:

Copy all static files to a 'static' folder and use remove for reverting the change

core.transform(
    [core.copy("foo", "foo/static", paths = glob(["**.css","**.html", ]))],
    reversal = [core.remove(glob(['foo/static/**.css', 'foo/static/**.html']))]
)

core.remove

Remove files from the workdir. This transformation is only meant to be used inside core.transform for reversing core.copy-like transforms. For regular file filtering, use the origin_files exclude mechanism.

remove core.remove(paths)

Parameters:

Parameter Description
paths glob

The files to be deleted

Examples:

Reverse a file copy:

Copy the files in a directory to another directory and remove the copies in the reverse workflow:

core.transform(
    [core.copy("foo", "foo/public")],
    reversal = [core.remove(glob(["foo/public/**"]))])

In this example, the files in foo are copied to foo/public, and the reverse workflow removes everything under foo/public.

Copy with reversal:

Copy all static files to a 'static' folder and use remove for reverting the change

core.transform(
    [core.copy("foo", "foo/static", paths = glob(["**.css","**.html", ]))],
    reversal = [core.remove(glob(['foo/static/**.css', 'foo/static/**.html']))]
)

core.replace

Replace a text with another text using optional regex groups. This transformation can be automatically reversed.

replace core.replace(before, after, regex_groups={}, paths=glob(["**"]), first_only=False, multiline=False, repeated_groups=False, ignore=[])

Parameters:

Parameter Description
before string

The text before the transformation. Can contain references to regex groups. For example "foo${x}text".

If '$' literal character needs to be matched, '$$' should be used. For example '$$FOO' would match the literal '$FOO'.

after string

The text after the transformation. It can also contain references to regex groups, like 'before' field.

regex_groups dict

A set of named regexes that can be used to match part of the replaced text. Copybara uses re2 syntax. For example {"x": "[A-Za-z]+"}

paths glob

A glob expression relative to the workdir representing the files to apply the transformation. For example, glob(["**.java"]), matches all java files recursively. Defaults to match all the files recursively.

first_only boolean

If true, only replaces the first instance rather than all. In single line mode, replaces the first instance on each line. In multiline mode, replaces the first instance in each file.

multiline boolean

Whether to replace text that spans more than one line.

repeated_groups boolean

Allows a group to be used multiple times. For example foo${repeated}/${repeated}. Note that this mechanism doesn't use backtracking. In other words, the group instances are treated as different groups in regex construction and then a validation is done after that.

ignore sequence

A set of regexes. Any text that matches any expression in this set, which might otherwise be transformed, will be ignored.

Examples:

Simple replacement:

Replaces the text "internal" with "external" in all java files

core.replace(
    before = "internal",
    after = "external",
    paths = glob(["**.java"]),
)
Replace using regex groups:

In this example we map some urls from the internal to the external version in all the files of the project.

core.replace(
        before = "https://some_internal/url/${pkg}.html",
        after = "https://example.com/${pkg}.html",
        regex_groups = {
            "pkg": ".*",
        },
    )

So a url like https://some_internal/url/foo/bar.html will be transformed to https://example.com/foo/bar.html.

Remove confidential blocks:

This example removes blocks of text/code that are confidential and thus shouldn't be exported to a public repository.

core.replace(
        before = "${x}",
        after = "",
        multiline = True,
        regex_groups = {
            "x": "(?m)^.*BEGIN-INTERNAL[\\w\\W]*?END-INTERNAL.*$\\n",
        },
    )

This replace would transform a text file like:

This is
public
 // BEGIN-INTERNAL
 confidential
 information
 // END-INTERNAL
more public code
 // BEGIN-INTERNAL
 more confidential
 information
 // END-INTERNAL

Into:

This is
public
more public code
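
Repeated groups:

A minimal sketch of the repeated_groups parameter described in the parameter list. The 'import' prefix and the group name 'name' are hypothetical. Under the documented behaviour, text like import foo/foo would be expected to match, while import foo/bar would be rejected by the validation step because the two instances of the group differ.

```starlark
core.replace(
    before = "import ${name}/${name}",
    after = "import ${name}",
    regex_groups = {
        "name": "[a-z]+",
    },
    # Needed because ${name} appears twice in 'before'.
    repeated_groups = True,
)
```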

core.todo_replace

Replace Google style TODOs. For example TODO(username, othername).

todoReplace core.todo_replace(tags=['TODO', 'NOTE'], mapping={}, mode='MAP_OR_IGNORE', paths=glob(["**"]), default=None)

Parameters:

Parameter Description
tags sequence of string

Prefix tag to look for

mapping dict

Mapping of users/strings

mode string

Mode for the replace:

  • 'MAP_OR_FAIL': Try to use the mapping and if not found fail.
  • 'MAP_OR_IGNORE': Try to use the mapping but ignore if no mapping found.
  • 'MAP_OR_DEFAULT': Try to use the mapping and use the default if not found.
  • 'SCRUB_NAMES': Scrub all names from TODOs. Transforms 'TODO(foo)' to 'TODO'
  • 'USE_DEFAULT': Replace any TODO(foo, bar) with TODO(default_string)

paths glob

A glob expression relative to the workdir representing the files to apply the transformation. For example, glob(["**.java"]), matches all java files recursively. Defaults to match all the files recursively.

default string

Default value if mapping not found. Only valid for 'MAP_OR_DEFAULT' or 'USE_DEFAULT' modes

Examples:

Simple update:

Replace TODOs and NOTES for users in the mapping:

core.todo_replace(
  mapping = {
    'test1' : 'external1',
    'test2' : 'external2'
  }
)

Would replace texts like TODO(test1) or NOTE(test1, test2) with TODO(external1) or NOTE(external1, external2)

Scrubbing:

Remove text from inside TODOs

core.todo_replace(
  mode = 'SCRUB_NAMES'
)

Would replace texts like TODO(test1): foo or NOTE(test1, test2):foo with TODO: foo and NOTE:foo.
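
Mapping with a default:

A minimal sketch of the 'MAP_OR_DEFAULT' mode, assuming a single known user and a hypothetical fallback name 'external_team'. Under the mode descriptions above, a text like TODO(test1, unknown) would be expected to become TODO(external1, external_team).

```starlark
core.todo_replace(
    tags = ["TODO"],
    mapping = {
        "test1": "external1",
    },
    mode = "MAP_OR_DEFAULT",
    # Hypothetical fallback, used for any name that is missing from 'mapping'.
    default = "external_team",
    paths = glob(["**.java"]),
)
```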

core.verify_match

Verifies that a RegEx matches (or does not match) the specified files. Does not transform anything, but will stop the workflow if it fails.

verifyMatch core.verify_match(regex, paths=glob(["**"]), verify_no_match=False)

Parameters:

Parameter Description
regex string

The regex pattern to verify. To satisfy the validation, there has to be at least one match (or no matches, if verify_no_match is set) in each of the files included in paths. The re2j pattern will be applied in multiline mode, i.e. '^' refers to the beginning of a file and '$' to its end. Copybara uses re2 syntax.

paths glob

A glob expression relative to the workdir representing the files to apply the transformation. For example, glob(["**.java"]), matches all java files recursively. Defaults to match all the files recursively.

verify_no_match boolean

If true, the transformation will verify that the RegEx does not match.
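
Example:

A minimal sketch, assuming every Java file should carry a copyright line and no file should contain a 'DO NOT SUBMIT' marker. Both regexes and the variable name checks are hypothetical; the list would be used as (part of) the transformations of a workflow.

```starlark
checks = [
    # Stops the workflow if any Java file lacks text such as "Copyright 2017".
    core.verify_match(
        regex = "Copyright \\d{4}",
        paths = glob(["**.java"]),
    ),
    # Stops the workflow if any file contains the marker.
    core.verify_match(
        regex = "DO NOT SUBMIT",
        verify_no_match = True,
    ),
]
```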

core.transform

Groups several transformations into a single transformation that can carry a manually specified reversal. The forward and reversed versions of the transform are each represented as a list of transforms. This is useful if a transformation does not reverse automatically, or if the automatic reversal does not work for some reason.
If reversal is not provided, the transform will try to compute the reverse of the transformations list.

transformation core.transform(transformations, reversal=The reverse of 'transformations', ignore_noop=False)

Parameters:

Parameter Description
transformations sequence of transformation

The list of transformations to run as a result of running this transformation.

reversal sequence of transformation

The list of transformations to run as a result of running this transformation in reverse.

ignore_noop boolean

If a noop error happens in the group of transformations (both forward and reverse), it will be ignored and the rest of the transformations in the group will not be executed. In general this is a bad idea, as it prevents Copybara from detecting important transformation errors.
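
Example:

A minimal sketch of an explicit reversal, assuming the forward step deletes confidential blocks (reusing the regex from the core.replace example) and therefore has nothing to restore when run in reverse. Using core.verify_match as the reverse step is one plausible choice for illustration, not a recommendation from the Copybara authors.

```starlark
core.transform(
    transformations = [
        # Forward: strip confidential blocks. The deleted text cannot be recreated later.
        core.replace(
            before = "${x}",
            after = "",
            multiline = True,
            regex_groups = {
                "x": "(?m)^.*BEGIN-INTERNAL[\\w\\W]*?END-INTERNAL.*$\\n",
            },
        ),
    ],
    reversal = [
        # Reverse: nothing to restore, so only check that no marker is present.
        core.verify_match(
            regex = "BEGIN-INTERNAL",
            verify_no_match = True,
        ),
    ],
)
```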

core.dynamic_transform

Create a dynamic Skylark transformation. This should only be used by library developers.

transformation core.dynamic_transform(impl, params={})

Parameters:

Parameter Description
impl baseFunction

The Skylark function to call

params dict

The parameters to the function. Will be available under ctx.params

Example:

Create a dynamic transformation with parameter:

If you want to create a library that uses dynamic transformations, you probably want to make them customizable. In order to do that, in your library.bara.sky, you need to hide the dynamic transformation (prefix it with '_') and instead expose a function that creates the dynamic transformation with the param:

def _test_impl(ctx):
  ctx.set_message(ctx.message + ctx.params['name'] + str(ctx.params['number']) + '\n')

def test(name, number = 2):
  return core.dynamic_transform(
      impl = _test_impl,
      params = {'name': name, 'number': number},
  )

After defining this function, you can use test('example', 42) as a transformation in core.workflow.

folder

Module for dealing with local filesystem folders

folder.destination

A folder destination is a destination that puts the output in a folder. It can be used both for testing and for real production migrations. Given that a folder destination does not support many of the features of a real VCS, there are some limitations on how to use it:

  • It requires passing a ref as an argument, as there is no way of calculating previously migrated changes. Alternatively, --last-rev can be used, which could migrate N changes.
  • Most likely, the workflow should use 'SQUASH' mode, as history is not supported.
  • If 'ITERATIVE' mode is used, a new temp directory will be created for each change migrated.

destination folder.destination()

Command line flags:

Name Type Description
--folder-dir string Local directory to write the output of the migration to. If the directory exists, all files will be deleted. By default Copybara will generate a temporary directory, so you shouldn't need this.

folder.origin

A folder origin is an origin that uses a folder as input

folderOrigin folder.origin(materialize_outside_symlinks=False)

Parameters:

Parameter Description
materialize_outside_symlinks boolean

By default folder.origin will refuse any symlink in the migration folder that is an absolute symlink or that refers to a file outside of the folder. If this flag is set, it will materialize those symlinks as regular files in the checkout directory.

Command line flags:

Name Type Description
--folder-origin-author string Author of the change being migrated from folder.origin()
--folder-origin-message string Message of the change being migrated from folder.origin()

git

Set of functions to define Git origins and destinations.

Command line flags:

Name Type Description
--git-credential-helper-store-file string Credentials store file to be used. See https://git-scm.com/docs/git-credential-store
--nogit-credential-helper-store boolean Disable using credentials store. See https://git-scm.com/docs/git-credential-store

git.origin

Defines a standard Git origin. For Git specific origins use: github_origin or gerrit_origin.

All the origins in this module accept several string formats as reference (When copybara is called in the form of copybara config workflow reference):

  • Branch name: For example master
  • An arbitrary reference: refs/changes/20/50820/1
  • A SHA-1: Note that it has to be reachable from the default refspec
  • A Git repository URL and reference: http://github.com/foo master
  • A GitHub pull request URL: https://github.com/some_project/pull/1784

So for example, Copybara can be invoked for a git.origin in the CLI as:
copybara copy.bara.sky my_workflow https://github.com/some_project/pull/1784
This will use the pull request as the origin URL and reference.

gitOrigin git.origin(url, ref=None, submodules='NO', include_branch_commit_logs=False, first_parent=True)

Parameters:

Parameter Description
url string

Indicates the URL of the git repository

ref string

Represents the default reference that will be used for reading the revision from the git repository. For example: 'master'

submodules string

Download submodules. Valid values: NO, YES, RECURSIVE.

include_branch_commit_logs boolean

Whether to include raw logs of branch commits in the migrated change message. WARNING: This field is deprecated in favor of 'first_parent'. This setting only affects merge commits.

first_parent boolean

If true, it only uses the first parent when looking for changes. Note that when disabled in ITERATIVE mode, it will try to do a migration for each change of the merged branch.
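
Example:

A minimal sketch of a git.origin definition that spells out the documented defaults. The repository URL is only an illustration.

```starlark
git.origin(
    # Illustrative repository URL.
    url = "https://github.com/google/copybara",
    # Default reference, used when none is given on the command line.
    ref = "master",
    submodules = "NO",
    # Only follow the first parent of merge commits when looking for changes.
    first_parent = True,
)
```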

git.integrate

Integrate changes from a url present in the migrated change label.

git_integrate git.integrate(label="COPYBARA_INTEGRATE_REVIEW", strategy="FAKE_MERGE_AND_INCLUDE_FILES", ignore_errors=True)

Parameters:

Parameter Description
label string

The migration label that will contain the url to the change to integrate.

strategy string

How to integrate the change:

  • 'FAKE_MERGE': Add the url revision/reference as parent of the migration change but ignore all the files from the url. The commit message will be a standard merge one but will include the corresponding RevId label
  • 'FAKE_MERGE_AND_INCLUDE_FILES': Same as 'FAKE_MERGE' but any change to files that doesn't match destination_files will be included as part of the merge commit. So it will be a semi fake merge: Fake for destination_files but merge for non destination files.
  • 'INCLUDE_FILES': Same as 'FAKE_MERGE_AND_INCLUDE_FILES' but it doesn't create a merge; it only includes changes not matching destination_files

ignore_errors boolean

Whether to ignore integrate errors and continue the migration without the integrate

Example:

Integrate changes from a review url:

Assuming we have a git.destination defined like this:

git.destination(
    url = "https://example.com/some_git_repo",
    integrates = [git.integrate()],
)

It will look for the COPYBARA_INTEGRATE_REVIEW label during the workflow migration. If the label is found, it will fetch the git url and add that change as an additional parent to the migration commit (merge). It will fake-merge any change from the url that matches destination_files but it will include changes not matching it.

git.mirror

Mirror git references between repositories

git.mirror(name, origin, destination, refspecs=['refs/heads/*'], prune=False)

Parameters:

Parameter Description
name string

Migration name

origin string

Indicates the URL of the origin git repository

destination string

Indicates the URL of the destination git repository

refspecs sequence of string

Represents a list of git refspecs to mirror between origin and destination. For example 'refs/heads/*:refs/remotes/origin/*' will mirror any reference inside refs/heads to refs/remotes/origin.

prune boolean

Remove remote refs that don't have an origin counterpart

Command line flags:

Name Type Description
--git-mirror-force boolean Force push even if it is not fast-forward
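
Example:

A minimal sketch of a mirror migration, assuming two hypothetical repositories under example.com. The migration name and the second refspec are illustrative; the first refspec is the documented default.

```starlark
git.mirror(
    # Hypothetical migration name.
    name = "mirror_branches_and_tags",
    origin = "https://example.com/some_git_repo",
    destination = "https://example.com/some_git_mirror",
    refspecs = [
        "refs/heads/*",
        "refs/tags/*",
    ],
    # Keep destination refs even if they no longer exist in the origin.
    prune = False,
)
```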

git.gerrit_origin

Defines a Git origin for Gerrit reviews.

Implicit labels that can be used/exposed:

  • GERRIT_CHANGE_NUMBER: The change number for the Gerrit review.
  • GERRIT_CHANGE_ID: The change id for the Gerrit review.
  • GERRIT_CHANGE_DESCRIPTION: The description of the Gerrit review.
  • COPYBARA_INTEGRATE_REVIEW: A label that when exposed, can be used to integrate automatically in the reverse workflow.

gitOrigin git.gerrit_origin(url, ref=None, submodules='NO', first_parent=True)

Parameters:

Parameter Description
url string

Indicates the URL of the git repository

ref string

DEPRECATED. Use git.origin for submitted branches.

submodules string

Download submodules. Valid values: NO, YES, RECURSIVE.

first_parent boolean

If true, it only uses the first parent when looking for changes. Note that when disabled in ITERATIVE mode, it will try to do a migration for each change of the merged branch.

git.github_origin

Defines a Git origin for a GitHub repository. This origin should be used for public branches. Use github_pr_origin for importing Pull Requests.

gitOrigin git.github_origin(url, ref=None, submodules='NO', first_parent=True)

Parameters:

Parameter Description
url string

Indicates the URL of the git repository

ref string

Represents the default reference that will be used for reading the revision from the git repository. For example: 'master'

submodules string

Download submodules. Valid values: NO, YES, RECURSIVE.

first_parent boolean

If true, it only uses the first parent when looking for changes. Note that when disabled in ITERATIVE mode, it will try to do a migration for each change of the merged branch.

git.github_pr_origin

Defines a Git origin for GitHub pull requests.

Implicit labels that can be used/exposed:

  • GITHUB_PR_NUMBER: The pull request number if the reference passed was in the form of https://github.com/project/pull/123, refs/pull/123/head or refs/pull/123/master.
  • COPYBARA_INTEGRATE_REVIEW: A label that when exposed, can be used to integrate automatically in the reverse workflow.
  • GITHUB_BASE_BRANCH: The base branch name used for the Pull Request.
  • GITHUB_BASE_BRANCH_SHA1: The base branch SHA-1 used as baseline.
  • GITHUB_PR_TITLE: Title of the Pull Request.
  • GITHUB_PR_BODY: Body of the Pull Request.

githubPROrigin git.github_pr_origin(url, use_merge=False, required_labels=[], retryable_labels=[], submodules='NO', baseline_from_branch=False, first_parent=True, state='OPEN')

Parameters:

Parameter Description
url string

Indicates the URL of the GitHub repository

use_merge boolean

If the content for refs/pull/<ID>/merge should be used instead of the PR head. The GitOrigin-RevId will still be the one from the refs/pull/<ID>/head revision.

required_labels sequence of string

Required labels to import the PR. All the labels need to be present in order to migrate the Pull Request.

retryable_labels sequence of string

Required labels to import the PR that should be retried. This parameter must be a subset of required_labels.

submodules string

Download submodules. Valid values: NO, YES, RECURSIVE.

baseline_from_branch boolean

WARNING: Use this field only for github -> git CHANGE_REQUEST workflows.
When the field is set to true for CHANGE_REQUEST workflows it will find the baseline comparing the Pull Request with the base branch instead of looking for the *-RevId label in the commit message.

first_parent boolean

If true, it only uses the first parent when looking for changes. Note that when disabled in ITERATIVE mode, it will try to do a migration for each change of the merged branch.

state string

Only migrate Pull Requests in the given state. Possible values: 'OPEN', 'CLOSED' or 'ALL'. Default 'OPEN'

Command line flags:

Name Type Description
--github-required-label list<string> Required labels in the Pull Request to be imported by github_pr_origin
--github-retryable-label list<string> Required labels in the Pull Request that should be retried to be imported by github_pr_origin
--github-skip-required-labels boolean Skip checking labels for importing Pull Requests. Note that this is dangerous as it might import an unsafe PR.
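
Example:

A minimal sketch of a pull request origin that only imports labelled, open pull requests. The label names are hypothetical. Note that retryable_labels is a subset of required_labels, as the parameter description requires.

```starlark
git.github_pr_origin(
    # Illustrative repository URL.
    url = "https://github.com/google/copybara",
    # Hypothetical labels: both must be present before the PR is imported.
    required_labels = ["cla: yes", "ready to import"],
    # Hypothetical choice: retry only while waiting for this label.
    retryable_labels = ["ready to import"],
    state = "OPEN",
)
```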

git.destination

Creates a commit in a git repository using the transformed worktree.

Given that Copybara doesn't ask for user/password in the console when doing the push to remote repos, you have to use ssh protocol, have the credentials cached or use a credential manager.

gitDestination git.destination(url, push=master, fetch=push reference, skip_push=False, integrates=[])

Parameters:

Parameter Description
url string

Indicates the URL to push to as well as the URL from which to get the parent commit

push string

Reference to use for pushing the change, for example 'master'

fetch string

Indicates the ref from which to get the parent commit

skip_push boolean

If set, copybara will not actually push the result to the destination. This is meant for testing workflows and dry runs.

integrates sequence of git_integrate

(NOT IMPLEMENTED) Integrate changes from a url present in the migrated change label.

Command line flags:

Name Type Description
--git-committer-name string If set, overrides the committer name for the generated commits in git destination.
--git-committer-email string If set, overrides the committer e-mail for the generated commits in git destination.
--git-destination-url string If set, overrides the git destination URL.
--git-destination-fetch string If set, overrides the git destination fetch reference.
--git-destination-push string If set, overrides the git destination push reference.
--git-destination-path string If set, the tool will use this directory for the local repository. Note that if the directory exists it needs to be a git repository. Copybara will revert any staged/unstaged changes.
--git-destination-skip-push boolean If set, the tool will not push to the remote destination
--git-destination-last-rev-first-parent boolean Use git --first-parent flag when looking for last-rev in previous commits
--git-destination-non-fast-forward boolean Allow non-fast-forward pushes to the destination. We only allow this when used with different push != fetch references.
--git-destination-ignore-integration-errors boolean If an integration error occurs, ignore it and continue without the integrate
--nogit-destination-rebase boolean Don't rebase the change automatically for workflows CHANGE_REQUEST mode
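
Example:

A minimal sketch of a destination used for a dry run, assuming the example.com repository URL from the git.integrate example. With skip_push set, the commit is created locally but not pushed.

```starlark
git.destination(
    url = "https://example.com/some_git_repo",
    # Read the parent commit from 'master' and push the result back to it.
    fetch = "master",
    push = "master",
    # Dry run: do not push to the remote repository.
    skip_push = True,
)
```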

git.github_pr_destination

Creates changes in a new pull request in the destination.

githubPrDestination git.github_pr_destination(url, destination_ref=master, skip_push=False, title=None, body=None)

Parameters:

Parameter Description
url string

Url of the GitHub project. For example "https://github.com/google/copybara"

destination_ref string

Destination reference for the change. By default 'master'

skip_push boolean

If set, copybara will not actually push the result to the destination. This is meant for testing workflows and dry runs.

title string

When creating a pull request, use this title. By default it uses the change first line.

body string

When creating a pull request, use this body. By default it uses the change summary.

Command line flags:

Name Type Description
--git-committer-name string If set, overrides the committer name for the generated commits in git destination.
--git-committer-email string If set, overrides the committer e-mail for the generated commits in git destination.
--git-destination-url string If set, overrides the git destination URL.
--git-destination-fetch string If set, overrides the git destination fetch reference.
--git-destination-push string If set, overrides the git destination push reference.
--git-destination-path string If set, the tool will use this directory for the local repository. Note that if the directory exists it needs to be a git repository. Copybara will revert any staged/unstaged changes.
--git-destination-skip-push boolean If set, the tool will not push to the remote destination
--git-destination-last-rev-first-parent boolean Use git --first-parent flag when looking for last-rev in previous commits
--git-destination-non-fast-forward boolean Allow non-fast-forward pushes to the destination. We only allow this when used with different push != fetch references.
--git-destination-ignore-integration-errors boolean If an integration error occurs, ignore it and continue without the integrate
--nogit-destination-rebase boolean Don't rebase the change automatically for workflows CHANGE_REQUEST mode
--github-destination-pr-branch string If set, uses this branch for creating the pull request instead of using a generated one
--github-destination-pr-create boolean If the pull request should be created
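
Example:

A minimal sketch of a pull request destination with an explicit title and body. The title and body strings are hypothetical; if omitted, the documented defaults (first line and summary of the change) apply.

```starlark
git.github_pr_destination(
    url = "https://github.com/google/copybara",
    destination_ref = "master",
    # Hypothetical fixed title and body for every generated pull request.
    title = "Import internal changes",
    body = "Automated import. Please do not edit this pull request by hand.",
)
```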

git.gerrit_destination

Creates a change in Gerrit using the transformed worktree. If this is used in iterative mode, then each commit pushed in a single Copybara invocation will have the correct commit parent. The reviews generated can then be easily done in the correct order without rebasing.

gerritDestination git.gerrit_destination(url, fetch, push_to_refs_for='', change_id_policy='FAIL_IF_PRESENT')

Parameters:

Parameter Description
url string

Indicates the URL to push to as well as the URL from which to get the parent commit

fetch string

Indicates the ref from which to get the parent commit

push_to_refs_for string

Review branch to push the change to, for example setting this to 'feature_x' causes the destination to push to 'refs/for/feature_x'. It defaults to 'fetch' value.

change_id_policy string

What to do in the presence or absence of Change-Id in message:

  • 'REQUIRE': Require that the change_id is present in the message as a valid label
  • 'FAIL_IF_PRESENT': Fail if found in message
  • 'REUSE': Reuse if present. Otherwise generate a new one
  • 'REPLACE': Replace with a new one if found

Command line flags:

Name Type Description
--git-committer-name string If set, overrides the committer name for the generated commits in git destination.
--git-committer-email string If set, overrides the committer e-mail for the generated commits in git destination.
--git-destination-url string If set, overrides the git destination URL.
--git-destination-fetch string If set, overrides the git destination fetch reference.
--git-destination-push string If set, overrides the git destination push reference.
--git-destination-path string If set, the tool will use this directory for the local repository. Note that if the directory exists it needs to be a git repository. Copybara will revert any staged/unstaged changes.
--git-destination-skip-push boolean If set, the tool will not push to the remote destination
--git-destination-last-rev-first-parent boolean Use git --first-parent flag when looking for last-rev in previous commits
--git-destination-non-fast-forward boolean Allow non-fast-forward pushes to the destination. We only allow this when used with different push != fetch references.
--git-destination-ignore-integration-errors boolean If an integration error occurs, ignore it and continue without the integrate
--nogit-destination-rebase boolean Don't rebase the change automatically for workflows CHANGE_REQUEST mode
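
Example:

A minimal sketch of a Gerrit destination, assuming a hypothetical Gerrit host and the 'feature_x' review branch mentioned in the parameter description. With these values the change would be pushed to 'refs/for/feature_x'.

```starlark
git.gerrit_destination(
    # Hypothetical Gerrit repository URL.
    url = "https://example.com/some_gerrit_repo",
    fetch = "master",
    push_to_refs_for = "feature_x",
    # Keep an existing Change-Id from the message, otherwise generate a new one.
    change_id_policy = "REUSE",
)
```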

github_api_status_obj

Information about a commit status as defined in https://developer.github.com/v3/repos/statuses. This is a subset of the available fields in GitHub

github_endpoint_obj

GitHub specific class used in feedback mechanism and migration event hooks to access GitHub

patch

Module for applying patches.

patch.apply

A transformation that applies the given patch files. If a path does not exist in a patch, it will be ignored.

patchTransformation patch.apply(patches=[], excluded_patch_paths=[], series=None)

Parameters:

Parameter Description
patches sequence of string

The list of patch files to apply, relative to the current config file. The files will be applied relative to the checkout dir and the leading path component will be stripped (-p1).

excluded_patch_paths sequence of string

The list of paths to exclude from each of the patches. Each of the paths will be excluded from all the patches. Note that these are not workdir paths, but paths relative to the patch itself.

series string

The config file that contains a list of patches to apply. The series file contains names of the patch files one per line. The names of the patch files are relative to the series config file. The files will be applied relative to the checkout dir and the leading path component will be stripped (-p1).
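
Example:

A minimal sketch of the two ways of listing patches described above. All file names and both variable names are hypothetical. The first form lists patch files directly and excludes one path from them; the second reads the patch names from a series file, one per line.

```starlark
# Explicit list of patch files, relative to the current config file.
apply_listed_patches = patch.apply(
    patches = ["patches/fix_build.patch"],
    # Path as it appears inside the patch, not a workdir path.
    excluded_patch_paths = ["docs/internal.md"],
)

# Patch names read from a series file.
apply_series = patch.apply(
    series = "patches/series",
)
```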