Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(sdkv3): start migration to sdkv3 #6706

Open
wants to merge 264 commits into
base: master
Choose a base branch
from
Open

feat(sdkv3): start migration to sdkv3 #6706

wants to merge 264 commits into from

Conversation

Hweinstock
Copy link
Contributor

@Hweinstock Hweinstock commented Mar 3, 2025

Problem

The feature/sdkv3 branch has a lot of progress on the sdkv3 migration and we want to merge these changes in to avoid further divergence.

Included Changes

Testing

  • Tried some features on the vsix, and verified behavior is normal.
  • Tested some edge cases with EC2 Remote Connect as well.

Do not squash merge


  • Treat all work as PUBLIC. Private feature/x branches will not be squash-merged at release time.
  • Your code changes must meet the guidelines in CONTRIBUTING.md.
  • License: I confirm that my contribution is made under the terms of the Apache 2.0 license.

aws-toolkit-automation and others added 24 commits February 19, 2025 22:09
## Problem
The AWS SDK now supports built-in pagination:
[https://aws.amazon.com/blogs/developer/pagination-using-async-iterators-in-modular-aws-sdk-for-javascript/
](https://aws.amazon.com/blogs/developer/pagination-using-async-iterators-in-modular-aws-sdk-for-javascript/)
Currently, have our own logic build on top of calls to do this logic for
us. However, the SDK paginators are likely more reliable, and avoids
reimplementing logic. They are also part of the SDK making them easier
to integrate with (specifically for types).

Doing this early in the migration process has the potential to reduce a
lot of unnecessary code.

## Solution
- Migrate existing use cases of sdkv3 to use these paginators.
- Create general support for them in the Wrapper class.

---

- Treat all work as PUBLIC. Private `feature/x` branches will not be
squash-merged at release time.
- Your code changes must meet the guidelines in
[CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines).
- License: I confirm that my contribution is made under the terms of the
Apache 2.0 license.
…er. (#6664)

## Problem
Inspecting the logs, and http requests themselves, it appears that the
sdkv3 clients are somehow adding a `connection: close` header to the
http requests. This is not done by the SDK (according to SDK team), and
inspecting the headers throughout the SDK packages at the recommended
points suggests this is being overwritten somewhere else.

Creating a new vscode extension with this code snippet also fails, but a
new project without vscode succeeds. This suggests its VS Code.

I believe this is caused by
microsoft/vscode#173861, which was resolved,
but is now reopened. The comments and techdebt tests suggested this is
no longer an issue, but it now is again due to fix being reverted.

## Solution
- Remove outdated notes/test that suggest this is fixed in newer
versions of VS Code.
- add e2e test verifying expected `keepAlive` behavior. 
- Add custom "build" step middleware that adds the keepAlive to the HTTP
request, unless explicitly told not to.
- Improve some typing in the middleware by making use of type
assertions.

### Notes
Related work done to completely overwrite the proxyAgent in vscode to
fix this done here:
aws/aws-toolkit-vscode-staging#1214. However,
this code is outdated and doesn't work anymore (its also unused). There
is a functioning implementation of this logic at
https://github.com/sourcegraph/cody/blob/62d73f78c432036d1f99bc9631ed534cc2ed846b/vscode/src/net/net.patch.ts.

However, adding the `keepAlive` header manually appears to be working
based on the e2e test. My understanding is that if the client was
closing the connection in between each request, it must send a FIN
message to the server to let it know according to TCP. The server would
then close its side of the connection as well, but we know from the e2e
test this isn't happening since it maintains the same connection across
two requests. Therefore, the client must also be keeping its connection
open, making use of persistent connections on both sides.

### Experimental Results
Redoing the experiment of timing repeated requests when caching,
not-caching, and caching with the change in this PR (connection
persistence):
<img width="485" alt="image"
src="https://github.com/user-attachments/assets/786130ea-6e6c-4029-8e1d-df5cc89edb66"
/>
Note: experiment separate into 3 trails of 10 requests to limit the
effect of throttling.

Therefore, this change is saving almost 40 ms per request!

## Future Work
Delete the dead code at
aws/aws-toolkit-vscode-staging#1214.

---

- Treat all work as PUBLIC. Private `feature/x` branches will not be
squash-merged at release time.
- Your code changes must meet the guidelines in
[CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines).
- License: I confirm that my contribution is made under the terms of the
Apache 2.0 license.
## Problem
The toolkit recreates each SDK client before each request is made. As an
example see:
https://github.com/aws/aws-toolkit-vscode/blob/b4d7417ece1eebc56a5284a9cdd643e4b2308bf4/packages/core/src/shared/clients/ec2Client.ts#L31-L52

This is done as an unnecessary guard against the client's internal state
interfering with higher level business logic. Creating clients for each
request is not cheap, and there is likely a better pattern to be had
here.

See internal docs expanding more on this issue. 

## Solution
- Create a map that caches the clients.
- Key the function arguments such that it detects config changes. 
- Cache clients with new method `getAwsService` that wraps
`createAwsService`.
- Allow consumers to pass a `ignoreCache` flag to force creation of a
new client.
- On deactivation, call `destroy` on all the clients.

### Testing Strategy
- Create clients with methods that make it easy to determine equality.  

---

- Treat all work as PUBLIC. Private `feature/x` branches will not be
squash-merged at release time.
- Your code changes must meet the guidelines in
[CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines).
- License: I confirm that my contribution is made under the terms of the
Apache 2.0 license.
## Problem
EC2 currently uses the old sdkv2, we can start the migration to v3. 

## Solution
- migrate existing use cases and provide small clean ups along the way. 
- Re-work pagination to use `AsyncCollection` abstraction. 

## Testing
- Manually tested Ec2Connect, Explorer (polling functionality), Command
Palette, and some edge cases including instances missing a name,
duplicate names, and windows instance (for remote terminal).

## Future Work
- Allow quickpicks to support `AsyncIterator` of individual items, to
avoid loading entire collection. Alternatively, add a `batchIterator` to
`AsyncCollection` that returns an `AsyncIterator` with batches of size
`k`.
- built-in explorer support for pagination. 
- automated E2E testing for EC2?

---

- Treat all work as PUBLIC. Private `feature/x` branches will not be
squash-merged at release time.
- Your code changes must meet the guidelines in
[CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines).
- License: I confirm that my contribution is made under the terms of the
Apache 2.0 license.
## Problem
Follow up to discussion:
#6672 (comment)

Currently the ec2 module does make paginated requests to the SDK
endpoint. However, in order to show these results into the UI, we
resolve them in entirety.
For the quick picks this is done here:
[https://github.com/aws/aws-toolkit-vscode/blob/4e568970e8d8646cf8e590d39d8891d68cebde84/packages/core/src/awsService/ec2/prompter.ts#L65
](https://github.com/aws/aws-toolkit-vscode/blob/4e568970e8d8646cf8e590d39d8891d68cebde84/packages/core/src/awsService/ec2/prompter.ts#L65)
For the explorer this is done here:
[https://github.com/aws/aws-toolkit-vscode/blob/4e568970e8d8646cf8e590d39d8891d68cebde84/packages/core/src/awsService/ec2/explorer/ec2ParentNode.ts#L52
](https://github.com/aws/aws-toolkit-vscode/blob/4e568970e8d8646cf8e590d39d8891d68cebde84/packages/core/src/awsService/ec2/explorer/ec2ParentNode.ts#L52)

## Solution
- Refactor the EC2 client to process the instances within the pages they
come in (i.e. not flattening them). This allows passing an
`AsyncCollection` of pages to the quickPick.
- Avoid flattening results in `makePaginatedRequest` by default.  
- Refactor client builder to be non-async, this causes some trickle-down
refactoring, but allows `makePaginatedRequest` to not return a promise.
- Simplify instance processing to make use of type predicates. 
- Remove mocks from `Ec2Prompter` tests, and refactor to test pagination
behavior.

## Alternative Solution 
- Repack the instances into artificial pages (i.e. a batched iterator).
This is a little hacky since it involves us unpacking and repacking the
instances. However, it does involve significantly less code changes.

## Verification
In addition to updating the tests, we can add a logging statement when
the quickPick loads an item from the `AsyncIterable`. This screenshot
shows that loading two items was successful and triggered the
`AsyncIterable` twice, as expected.
<img width="1045" alt="image"
src="https://github.com/user-attachments/assets/8303f909-17e9-4fc3-a9fa-b0ec311eb1bc"
/>


## Future Work
- Extend this work to the explorer. There is already some general work
done in
https://github.com/aws/aws-toolkit-vscode/blob/df810aa4264ba14dac488697d539011a732c7eff/packages/core/src/awsexplorer/childNodeLoader.ts,
but it doesn't support working with `AsyncCollection`, only making the
requests directly. We could do something hacky, like wrap the
`AsyncCollection` in a fake request method, or try to refactor the
utility itself.
---

- Treat all work as PUBLIC. Private `feature/x` branches will not be
squash-merged at release time.
- Your code changes must meet the guidelines in
[CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines).
- License: I confirm that my contribution is made under the terms of the
Apache 2.0 license.
## Problem
From #6664, we have
persistent connections.
 
However, each SDK client creates its own http handler. If we have N
distinct service clients we maintain, then we could have up to N http
handlers, with N distinct connection pools. This does not affect
persistent connections since each service maintains its own endpoints,
however, there is a small overhead in initiating each connection pool.
Additionally, there is no guarantee for consistent behavior across
handlers with regards to configuration options (Ex. requestTimeout).

## Solution
- inject the same HTTP handler into each SDK client, unless explicitly
given a different one.
- use fetch-http-handler on web and `node-http-handler` on node. 
- We don't want to use `fetch-http-handler` in node because it still has
experimental support and is not recommended.
[docs](https://github.com/aws/aws-sdk-js-v3/blob/main/supplemental-docs/CLIENTS.md#request-handler-requesthandler)
and
[comment](aws/aws-sdk-js-v3#4619 (comment))
from SDK team. When trying to, there were issues with persistent
connections.
- install `http2` to resolve web deps issue. This is part of nodes
standard library, but needed as explicit dependency for web.

### Trying `fetch-http-handler` in node. 
- confirmed `fetch-http-handler` with `keep-alive: true` option is
sending `keepAlive` headers, but closing the connection after doing so
in node, both in VSCode environment, and outside of it in a pure node
environment. This implies it is not related to
microsoft/vscode#173861.

## Verification 
The request times seemed unaffected by this change, but there was a
noticeable impact on sdk client initialization speed. The results below
are from creating 1000 SSM clients with and without the same HTTP
Handler.
<img width="304" alt="image"
src="https://github.com/user-attachments/assets/9b28af43-795c-4dcb-9bb1-752c118a3247"
/>

Because we usually cache the SDK clients under each service, the
important statistic is that this speeds up 0.131 ms per SDK client
creation. If we always use the cache and only create a client once per
service, then this also suggests a 0.131 ms per service speedup. We
interact with at least 20 services, and 16 in the explorer alone, so
this could result in 2-2.5 ms improvement in initialization time for all
these SDK clients depending on how they are created.

Could be interesting to revisit after the migration to see if this
reduces start-up time.

---

- Treat all work as PUBLIC. Private `feature/x` branches will not be
squash-merged at release time.
- Your code changes must meet the guidelines in
[CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines).
- License: I confirm that my contribution is made under the terms of the
Apache 2.0 license.
## Problem
Iam uses v2.

## Solution
- migrate it. 
- rename from `DefaultIamClient` -> `IamClient`. 
- add stronger type assertions on request to avoid weak type assertions
later on. (replace `as` with proper type checks when possible)
- Continue pattern of wrapping sdk types in "safe" types to avoid
assertions.

---

- Treat all work as PUBLIC. Private `feature/x` branches will not be
squash-merged at release time.
- Your code changes must meet the guidelines in
[CONTRIBUTING.md](https://github.com/aws/aws-toolkit-vscode/blob/master/CONTRIBUTING.md#guidelines).
- License: I confirm that my contribution is made under the terms of the
Apache 2.0 license.
Copy link

github-actions bot commented Mar 3, 2025

  • This pull request implements a feat or fix, so it must include a changelog entry (unless the fix is for an unreleased feature). Review the changelog guidelines.
    • Note: beta or "experiment" features that have active users should announce fixes in the changelog.
    • If this is not a feature or fix, use an appropriate type from the title guidelines. For example, telemetry-only changes should use the telemetry type.

@Hweinstock
Copy link
Contributor Author

/runIntegrationTests

@Hweinstock Hweinstock marked this pull request as ready for review March 3, 2025 20:56
@Hweinstock Hweinstock requested a review from a team as a code owner March 3, 2025 20:56
@Hweinstock Hweinstock changed the title feat(sdkv3): start migration to sdkv3 (WIP) feat(sdkv3): start migration to sdkv3 Mar 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants