Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

legacy: merge remaining lando-api commits (bug 1944562) #226

Closed
wants to merge 65 commits into from

Conversation

zzzeid
Copy link
Collaborator

@zzzeid zzzeid commented Feb 19, 2025

  • merge remaining lando-api commits (legacy)
  • delete all files that were delete by us/them
  • checkout all files that were modified by both to api-merge-without-fixes

cgsheeh and others added 30 commits December 14, 2023 10:35
…it` diff (Bug 1709608) (#364)

Extend `get_uplift_conduit_state` to also query `differential.querydiffs`
and return all known diffs associated with the selected uplift
revisions. When creating the uplift revisions, add a function to
parse through the diffs and return the latest diff that does not
have a `creationMethod` of `commit`. Then find the associated diff
response from `differential.diff.search` and pass both along to the
uplift revision creation function. When parsing binary from the
`getrawdiff` API response, re-use the metadata from the `querydiffs`
response so the pre-uploaded binary file is re-used in the uplift revision.
Update `ruff` and `black` to the latest versions, and apply
some fixes that `ruff` finds to the codebase.
While debugging bug 1871425 we orignally had a suspicion that
the issue was resulting from Lando erroneously applying later
patches in the stack when it should only have applied the
requested patches. I noticed we don't have a test for this
case, so I added one myself, and it passed as expected. We
might as well keep this simple test around just to provide
extra test coverage of this use case.
…es (Bug 1843041) (#369)

Currently a Lando based uplift request populates the required fields
in Phabricator by pulling down the raw diff from `differential.getrawdiff`,
passing the diff through `rs-parsepatch` and populating a dict in
the format Phabricator expects. This has worked decently well to date,
however this method doesn't include the full file context, leading
to diff display issues on Phabricator.

The `differential.querydiffs` endpoint returns data about a diff in
the expected Phabricator format. We have already started using this
data to populate the binary files metadata for uplift requests.
Taking a closer look at the response we can see that the full file
context is included in this format, and the diff itself is the
exact same diff we expect to be uploaded as the uplift request on
Phabricator.

Remove the code which parses the `differential.getrawdiff` response
into the expected Phabricator format and instead simply re-use the
data from the `differential.querydiffs` response for the uplift
revision. This will provide a more accurate view of the diff that
will land on the uplift train. The content of the file itself may
be different than what exists on the target train, however this
problem already exists in the existing Lando uplift implementation.
Add the `--verbose` flag to the `mach lint` call so more information
is displayed in log output when linters fail.
…ug 1856590) (#373)

Some patches sent with Git will include whitespace at the end of the
patch, meaning the Git version info is the last 3 lines instead of
2. Add a function which scans the lines in reverse and looks for
the first line beginning with `--` instead of using a naive slice
that removes the last two lines of output.

Update the Git parsing test to include extra lines of whitespace
to account for this change in the main tests for `GitPatchParser`.
Also add a simple test for the new parsing function.
…(#375)

Also rename the `r` variable to `reviewers` while we are here.
…451) (#376)

In bug 1864463 we added a snippet so pushlog operational failures in
hgmo resulted in a retry instead of a permanent failure of the landing
job. The snippet we check for was the very generic `abort: push failed
on remote`, which is actually an error message seen for many other
failures, including problems with the patch that will always cause
the push to be rejected.

Remove the generic snippet and replace with a more specific error about
pushlog operational failures. Also add a snippet about closed connections.
`transplants.py` has a function for comparing the equality of two
confirmation tokens that is a simple wrapper around `==` to allow
for easy replication of unequal tokens in test conditions. However,
the pytext fixture which is hooked into this shim is unused. Remove
the shim and replace the function with a simple `==` comparison.
…25) (#381)

Due to the WPT linter breaking in a way that is difficult to debug,
we have had autoformatting disabled in Lando for a short while. There
are still some blockers with bug 1807712 that mean we can't directly
move to `mach format`. In the meantime, to re-enable autoformatting,
add an explicit allow list for formatters which will run in Lando
and update the call to `mach lint` to use them.
…landing (Bug 1880861) (#382)

Add a blocking check that enforces revisions with the `needs-data-classification`
tag cannot land. The check is similar to the testing policy project tag checks
except it simply looks for the `needs-data-classification` tag and
blocks landing if it is present on the revision. The tag requires the
PHID of the project tag so it is generated with a factory function in a
similar manner to the release manager approval blocking check.
In the list comprehension where we set `job.revision_order` in
`create_landing_job_with_no_linked_revisions`, we iterate over
`revisions` and set the iteration variable as `r`, but the
value that ends up in the list uses the leftover `revision`
variable from the loop iteration above. Rename `r` to `revision`
so the correct variable is used.
Add the existing `large-repo` Phabricator repo to Lando so patches can
be landed there for testing.
…472) (#283)

This commit implements the Treestatus API as a feature of Lando.
The Treestatus OpenAPI spec is added to Lando's API spec and most
endpoints are re-implemented here. The following Treestatus endpoints
are removed in the Lando implementation:

- `/v0/*` endpoints.
- `DELETE /trees2`

The following features are also removed in the Lando implementation:

- StatusPage updating.
- Pulse message publishing.

Treestatus' database, caching and logging is changed to integrate with
Lando's Flask platform.

A new `group` argument is added to the Auth0 decorators to specify the
user must be a member of the specified groups to be permitted to
access certain API endpoints, and two new `TREESTATUS_ADMIN` and
`TREESTATUS_USER` group name constants are added. A `user_id()`
function is added to `A0User` to retrieve the current user ID string
for usage in the `who` field of status updates.

Integration tests for the endpoints are added to show typical usage
patterns and ensure the endpoints return output in a format consistent
with the existing Treestatus API.

A few database adjustments are made in this migration. First, the `tree`
field of `Tree` is now a unique index instead of a primary key, as Lando
models all have an `id` field as the primary key. This also involves
updating some `.get()` calls in SQLALchemy to instead by `filter_by`.
Second, the custom `UTCDatetime` object used by the existing Treestatus
implementation is replaced with a normal `DateTime` SQLAlchemy object.

Since Treestatus is now a part of Lando after this PR, the existing
Treestatus client is removed, and replaced with an `is_open` function
that directly queries the DB to retrieve the status of the tree. The
TreestatusDouble test mocker is removed and replaced with a simple
fixture to create a tree and set it to a specific status in the
tests. The `TreestatusSubsystem` is also removed as it is no longer
necessary.

After the above changes, many more code cleanups and refactorings
were completed to make the code easier to reason about and more
maintainable.
This commit begins the process of unifying and refactoring the
landing checks system, which blocks stacks and revisions from
landing and issues warnings about various issues in the landing
that are not blockers. The state of this revision gives us a
base to work on for extending the lints system to add Mercurial
commit hooks, clean up some unnecessary data fields, and improve
the user experience around presentation of landing lints.

The existing checks are split up into three categories. First are
stack-level blocker checks, which inspect the entire state of the
stack and block landings entirely. The second are revision-level
blocker checks which inspect the state of individual revisions and
block landing of each revision and it's successor nodes. Finally
are the revision-level warnings, which do not block landings but
will present information about sub-optimal landing conditions that
must be acknowledged before landing.

Each piece of state used by individual checks is added to a new
TransplantAssessmentState class. An instance of this class is
passed to each lint to keep the API consistent and discoverable.
Some fields on this class have been moved directly from existing
checks, and future refactors should reduce the number of fields on
this class.

The transplant assessment API and the calculate_landable_subgraphs
API are combined into a single API as part of this commit. Since
the previous setup had two different use cases, some state is not
required to run landing checks on non-transplant assessments. All
pieces of state required to check for a valid landing request are
encapsulated in LandingAssessmentState, which is an optional field
on TransplantAssessmentState. Checks which should only run on
landing assessments can look for the existance of a LandingAssessmentState
on the TransplantAssessmentState.

The calculate_landable_subgraphs API and the custom graph walking
code is replaced by the new APIs. When building the TransplantAssessmentState
we create a deep copy of the RevisionStack object and name it the
landing_stack, where blocked revisions are removed from this stack.
Then all landable paths in the landing_stack from the root revisions
to the leaf revisions are considered to be landable subgraphs. A new
leaf_revisions method is added which returns all nodes that have
no successors in the graph. Each node in the original stack object has a
blocked attribute which is initialized to an empty list. When a
blocking check fails on a revision the reason is appended to the
list in the full revision stack.
…e_state` (Bug 1895933) (#400)

Make `landing_state` a separate argument in `create_state`, so callers
who want to create a custom landing state must do so with
`create_landing_state` directly instead of indirectly via
`create_state`.
This commit adds a blocking check to reject the introduction of
symlinks into the repo. It is a port of the `prevent_symlinks.py`
Mercurial server-side hook. First we re-introduce the `rs-parsepatch`
library as a dependency. We add a `get_parsed_diffs` function to
query Conduit's `differential.getrawdiff` API endpoint to retrieve
the raw diff content for each Phabricator diff, pass the contents
through `rs_parsepatch.get_diffs` and add the response to the
`StackAssessmentState` object. This will allow us to share the
parsed diff contents between this check and future diff-inspecting
checks that will be introduced shortly. We then add a revision-wise
check which inspects the `new` file modes for each file in the diff and
asserts the mode does not match the known symlink decimal notation.
This command was a one-off to import data into Lando from
RelEng Treestatus. It should be removed to avoid being run
again.
…1896696) (#406)

Our current snippet parsing for `hg push` output looks for `is CLOSED`
to determine if a tree is closed. When the Treestatus API returns an
error, it will instead print that is it "treating as if CLOSED". Since
we don't check for that specific string we can sometimes treat closures
as permanent failures. Add a snippet to account for this tree-closed
state.
…) (#405)

There is a test repo `uplift-target` used in some tests that is
added to the `local-dev` repo config. In reality this repo should
be present in the mocked out testing-only `test` repo config. Move
this test repo to the correct location and update tests with fixtures
to make the repo available.
cgsheeh and others added 22 commits June 20, 2024 12:29
…3462) (#426)

Add a check to the `DiffAssessor` API that prevents introduction of
the `.gitmodules` file, which tracks Git submodules in the repository.
Due to a regression in newer Python build tools, our
currently installed version of `celery` fails to install
in the Docker image build as a result of improperly formatted
version specifiers. Install the latest version of `celery`
to resolve the issue, which also requires an update to
the `kombu` dependency. Updating `kombu` then causes a
regression in `redis`, so update `redis` to the latest
`3.x` version.

In newer versions of `celery` the pytest fixtures are disabled
by default, so add a `pytest_plugins` variable to the top-level
`conftest.py` to enable it.

Celery commands now fail when `sys.argv[0]` is passed as part
of `argv`, so remove them instead. The worker also requires the
`worker` command as the first argument, so update the call to
add it. We also add the broker URL as the `results_backend` in
the worker config.
Since `hg id <repo_url>` hits the network to connect to hgmo,
it can fail due to internal server errors and other connectivity
issues. In this case, we want the landing job to be retried in
the same fashion as when failing to pull/push. Map the exception
raised by `run_hg` to an `HgException` so the failure will be
considered a retry in the landing worker.
…t is found (Bug 1921023) (#433)

Previously we had just removed the last two lines of output,
however if the patch has extra newlines this could fail and
cause the patch to be deformed. Use the helper function we
added to avoid issues with parsing the commit message.
… 1893976) (#434)

When parsing the Git patch sent through `git format-patch`, the
email parser returns escaped unicode. Explicitly set the charset
of the email to `utf-8` so the unescaped unicode string is returned.

Add a test which uses a real-life example of this issue, which fails
without specifying the charset but passes afterwards.
…(Bug 1915671) (#432)

Adds an API for push-wide checks and implements the bug references check
using it. The API defines a `PushCheck` interface which initializes the
check with a repo, defines a `next_diff()` function which accepts each diff 
parsed as a `PatchHelper`, and a `result()` function which returns the error
message for the check, or `None` if the check passed.

The `BugReferencesCheck` is the first check to be implemented using this
new API. The check is only relevant on `try` repositories. Each diff is
passed via `next_diff`, and if `SKIP_BMO_CHECK` is found in the commit
message of any diff, a flag is set which indicates the check will be
skipped, otherwise the bug numbers found in the commit message are
stored. On `result()` the `skip_check` flag will cause an unconditional
pass, otherwise BMO is pinged to check the bug numbers in the commit
message are all referencing public bugs.

Since the `author` field was always assumed to be an email address, it
was renamed to the more accurate `email` name.
…91) (#435)

`eslint` can make more than just formatting fixes, so we should disable
it.
…20) (#437)

`headers` includes headers passed to the callers, which we add to the
`common_headers` dict to include things like the UA string and
API key. When we create the actual request, we are instead passing
the `headers` object and ignoring our common headers. Pass
`common_headers` to `requests.request` instead.
… 1926431) (#439)

Change `parse_bugs` to use `BUG_CONSERVATIVE_RE` to avoid incorrect parsing
of bug numbers during the `BugReferencesCheck`. Other code paths use the
`parse_bugs` function as well, but their tests pass after this change, so
we fully remove `BUG_RE` since it is no longer used.

Also remove the accidental double-definition of `BUG_RE` and
`BUG_CONSERVATIVE_RE` from `commit_message.py`.
…927013) (#440)

The `PhabricatorDouble` mock API calls take the `limit` parameter,
but do not actually limit the number of entries in the response data.
Add a check if `limit` is set and slice the `items `list at that index.
… (#438)

Add a warning that checks when a revision has diffs from multiple
authors associated with it. First we add support to specify a diffs
author in the test suite. Then we update `request_extended_revision_data`
to pull diffs based on the `revisionPHID` instead of the diff `id`,
which returns all diffs for a given revision. We update
`gather_involved_phids` to take the list of diffs for a revision and
gather all diff authors as involved in the revision. We then add
the warning which gathers all the PHIDs that have authored diffs
on a revision, and emit a warning when there are multiple authors,
including the Phabricator username of each author.
- ci-admin was moved into ci-configuration.
- ci-configuration was moved to Github.
- taskgraph was moved to Github.
- fluent-migration was moved to Github.
- build-tools is deprecated.
…15) (#441)

`parse_git_author_information` is using `email.utils.parseaddr` to
parse the authorship information from Git and Mercurial patches.
In Mercurial, the standard Git format is not required, so users can
use values like `ffxbld` as their author header. When the parser
encounters fields such as this, the improperly formatted name
is returned as the email.

Update `parse_git_author_information` to return the raw header
as the username when the email is parsed as empty.
…CollectionCheck` (Bug 1932507) (#442)

This commit refactors the `DiffAssessor` API to more closely
match the `PatchCollectionCheck` API. A `PatchCheck` abstract
class is introduced which has a similar interface as the
`PatchCollectionCheck`, where each check can store state
as each part of the patch is iterated over using `next_diff`,
and `result()` is called to determine the final result of
the check. Each check in the `DiffAssessor` class is moved
to a class which subclasses `PatchCheck`.

To make the patch checks more closely resemble the patch
collection checks, each check to be run is passed as an
argument to `run_patch_collection_checks`, which allows
us to run the relevant checks based on the code path
or known relevant repository instead of unconditionally
running all checks. As a result, all logic around inspecting
the repo and determining if a check is relevant is removed
from the checks, simplifying the APIs.
… (#443)

Currently when an uplift without a bug number hits the
Bugzilla update code path, an exception is thrown which
is turned into a failure email indicating Bugzilla could
not be updated. This commit changes the behaviour to
avoid trying to update Bugzilla if no bug numbers are
detected.
…rs` calls (Bug 1936373) (#446)

Catch any exceptions thrown while setting reviewers from `mots.yaml`, so
they do not take down the landing worker and cause lag in the queue.
…936373) (#447)

I used `logging.info` instead of `logger.info` in my haste, so this
logging message is not being sent to the proper location.
@zzzeid zzzeid marked this pull request as ready for review February 19, 2025 20:22
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file should be reverted to its state in main or that in #227.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be reverted to main or #227.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this also needs to be reverted to its state in main or #227.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I follow the "revert to main or 227" comments. Are you saying these files need to be reverted to main so #227 is the full diff of main->api-merge?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think basically these files were outdated in your merge PR, and then the fixup PR applied those changes (probably in another merge). I think once the PRs are cleaned up these will disappear. The issue here is that the merge PR (#225) is outdated.

Comment on lines -1340 to -1343
@pytest.mark.django_db(transaction=True)
def test_transplant_on_linked_legacy_repo(
app,
proxy_client,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was accidentally deleted, it seems, and then re-added in #227.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was deleted because it didn't exist in the repo when I merged the LandoAPI code in. It was "re-added" in #227 because that commit includes another merge of the code that was landed after I created this merge commit.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok the merge PR should be updated then so this PR (or #224) can be squashed and merged.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few changes here that seem to rely on the treestatus changes. I assume these changes are OK as-is, but flagging this as it was part of the conflicting files.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something about your PRs seems wrong. This file was git mv-ed to src/lando/treestatus/ in my version.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file did not have any conflicts in the merge PR, hence why it should be left as-is and moved in the fixup PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(this is to keep this PR easy to review, i.e., only conflicting files are to be reviewed)

@cgsheeh
Copy link
Member

cgsheeh commented Feb 24, 2025

Closing in favor of #224

@cgsheeh cgsheeh closed this Feb 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants