Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Requesting IdToken causes TooManyRequestsException. #13973

Open
3 tasks done
al-mcnichol opened this issue Oct 29, 2024 · 10 comments
Open
3 tasks done

Requesting IdToken causes TooManyRequestsException. #13973

al-mcnichol opened this issue Oct 29, 2024 · 10 comments
Assignees
Labels
Auth Related to Auth components/category question General question

Comments

@al-mcnichol
Copy link

al-mcnichol commented Oct 29, 2024

Before opening, please confirm:

JavaScript Framework

Web Components

Amplify APIs

Authentication

Amplify Version

Older than v5

Amplify Categories

auth

Backend

None

Environment information

# Put output below this line
System:
    OS: Linux 5.15 Ubuntu 22.04.2 LTS 22.04.2 LTS (Jammy Jellyfish)
    CPU: (16) x64 11th Gen Intel(R) Core(TM) i9-11950H @ 2.60GHz
    Memory: 11.75 GB / 15.23 GB
    Container: Yes
    Shell: 5.1.16 - /bin/bash
  Binaries:
    Node: 22.8.0 - ~/.nvm/versions/node/v22.8.0/bin/node
    npm: 10.8.2 - ~/.nvm/versions/node/v22.8.0/bin/npm
  npmPackages:
    @aws-amplify/auth: ^4.6.17 => 4.6.17
    @aws-amplify/core: ^4.7.15 => 4.7.15
    @optimizely/optimizely-sdk: ^5.3.4 => 5.3.4
    @stencil/core: ^4.22.0 => 4.22.0
    @stencil/core/cli:  4.22.0
    @stencil/core/compiler:  4.22.0
    @stencil/core/dev-server:  4.22.0
    @stencil/core/dev-server/client:  4.22.0
    @stencil/core/internal:  4.22.0
    @stencil/core/internal/app-data:  4.22.0
    @stencil/core/internal/client:  4.22.0
    @stencil/core/internal/hydrate:  4.22.0
    @stencil/core/internal/testing:  4.22.0
    @stencil/core/mock-doc:  4.22.0
    @stencil/core/screenshot:  4.22.0
    @stencil/core/sys/node:  4.22.0
    @stencil/core/testing:  4.22.0
    @stencil/sass: ^3.0.12 => 3.0.12
    @stencil/store: ^2.0.16 => 2.0.16
    @types/jest: ^29.5.13 => 29.5.13
    @types/node: ^22.7.4 => 22.7.4
    autoprefixer: ^10.4.20 => 10.4.20
    http-proxy: ^1.18.1 => 1.18.1
    http-server: ^14.1.1 => 14.1.1
    husky: ^9.1.6 => 9.1.6
    jest: ^29.7.0 => 29.7.0
    jest-cli: ^29.7.0 => 29.7.0
    jwt-decode: ^4.0.0 => 4.0.0
    mitt: ^3.0.0 => 3.0.1
    npm-check: ^6.0.1 => 6.0.1
    puppeteer: ^23.5.0 => 23.5.0
    rollup-plugin-node-polyfills: ^0.2.1 => 0.2.1
    stencil-tailwind-plugin: ^1.7.0 => 1.8.0
    tailwindcss: ^3.4.13 => 3.4.13
    typescript: ^5.6.2 => 5.6.2
    uuid: ^10.0.0 => 10.0.0 (3.4.0, 9.0.1)
  npmGlobalPackages:
    @aws-amplify/cli: 12.12.6
    corepack: 0.29.3
    npm: 10.8.2

Describe the bug

We upgraded v4 to v6 and within an hour started getting reports of users not being able to login. It turns out we had an overwhelming number of requests, surpassing our rate limit (120 request/sec), to Cognito, which in turn blocked the requests causing TooManyRequestsException.

We rolled back to v4 and through testing recognized the issue was there all along and somehow was exacerbated by the upgrade?

We can reproduce the issue in v4. It occurs when the refresh token expires and the library continually attempts to refresh the IdToken token from the server and we are therefore hitting the Cognito server hundreds of times, which causes us to be hit the rate limit.

For more context on reproducing the error – we login to the App and used AWS CLI (aws cognito-idp revoke-token) to revoke the refresh token. After some time any idToken request caused a network request to Cognito. The interesting part is that we’re listening for tokenRefresh_failure event from the HUB and calling Auth.signOut() , which triggers other code to remove any related cookies but the network requests continues.

NOTE: The repeated requests for the idToken is because it's being saved to a cookie to support an old app that authenticates on the server and not through client technologies. The app is using our new Authentication web component, which is client-side.

image

Expected behavior

Once the user is logged out due to token refresh expiration or refresh error any request to refresh to ID or Access token should not cause a network request to Cognito

Reproduction steps

  1. Login
  2. Copy refresh token from local storage (this is needed for AWS CLI command)
  3. Run AWS CLI commands to revoke the refresh token.
  4. Clear the network tab and filter on "cognito"
  5. Wait approximately xxx minutes to allow the current token to expire.
  6. Repeated calls to refresh the ID/Access tokens causes a network request

Code Snippet

import { Amplify, Hub } from '@aws-amplify/core';
import { Auth as AmplifyAuth, CognitoHostedUIIdentityProvider } from '@aws-amplify/auth';
import { Build } from '@stencil/core';
import config from './config';
import {
  FederatedAuthProviders,
  removeLocalUser,
  setCurrentAuthProvider,
  usernameToFederatedProvider,
  setAuthConumserIDCookies,
} from './core';

import { getConsumerId } from './index';
import { getFederatedAuthBroadcastChannel } from '../utils/federatedAuthBroadcastChannel';
import { emit } from '../utils/events';

import {
  loadFraudCollectorScript,
} from '../utils/fraudCollector';
import newrelicUtil from '../utils/newrelicUtil';
import BrowserStorage from '../utils/BrowserStorage';
import { noop } from '../utils/noop';
import { analytics } from '../utils/analytics';

loadFraudCollectorScript().catch(noop);

/**
 * If the opener has been disconnected from the new tab, the user will see a blank page if they remain
 * on /account-api/federated-auth-page. This will navigate them back to the same URL as the opener.
 */
const callingPageURLStorage = new BrowserStorage('auth-calling-page-url', { persist: true });
const avoidBlankPage = (federatedProvider?: FederatedAuthProviders) => {
  const currentURL = new URL(window.location.href);
  const nextURL = callingPageURLStorage.getItem() as string;
  if (!globalThis.opener && nextURL && currentURL.pathname === '/account-api/federated-auth-page') {
    callingPageURLStorage.removeItem();
    newrelicUtil.FEDERATED_AUTH(`Forced navigation to avoid blank page - ${federatedProvider}`);
    globalThis.location.href = nextURL;
  }
};

// WARNING: Amplify actually passes null instead of undefined for the payload.data in some cases, so we can't
// destructure payload.data in the function signature.
Hub.listen('auth', async ({ payload: { event, data } }) => {
  try {
    const federatedAuthBroadcastChannel = await getFederatedAuthBroadcastChannel();
    if (event === 'tokenRefresh_failure') {
      AmplifyAuth.signOut();
      newrelicUtil.AMPLIFY_AUTH_HANDLER('tokenRefresh_failure');
    } else if (['oAuthSignOut', 'signOut'].includes(event)) {
      analytics.SIGN_OUT_SUCCESS();
      removeLocalUser();
    } else if (event === 'autoSignIn') {
      await setCurrentAuthProvider('cognito');
      globalThis.dispatchEvent(new Event('auth.signin'));
      emit('auth.signin');
    } else if (['customOAuthState', 'signIn'].includes(event)) {
      // Process fraud collector data once sign-in event occurs
      const federatedProvider: FederatedAuthProviders = usernameToFederatedProvider(data?.username);
      const consumerId = await getConsumerId();
      setAuthConumserIDCookies(consumerId);
      if (federatedProvider) {
        const { waitForIdTokenWithConsumerId } = await import('./waitForIdTokenWithConsumerId');
        await waitForIdTokenWithConsumerId();
        federatedAuthBroadcastChannel.postMessage({ signIn: true, provider: federatedProvider });
        avoidBlankPage(federatedProvider);
      }
    } else if (['signIn_failure'].includes(event)) {
      if (data?.message) {
        if (data.message.startsWith('PreSignUp+failed+with+error+An+account+already+exists+with+a+different+spelling')) {
          federatedAuthBroadcastChannel.postMessage({ signIn: false, err: 'sameBaseEmail' });
          avoidBlankPage();
          return;
        }
        if (/^PreSignUp.fail/.test(data.message)) {
          federatedAuthBroadcastChannel.postMessage({ signIn: false, err: 'federatedEmailNotProvided' });
          avoidBlankPage();
          return;
        }
        if (data.message.startsWith('User+is+not+enabled')) {
          federatedAuthBroadcastChannel.postMessage({ signIn: false, err: 'accountDisabled' });
          avoidBlankPage();
          return;
        }
      }
      federatedAuthBroadcastChannel.postMessage({ signIn: false, err: 'genericErrorTryAgain' });
      avoidBlankPage();
    }
  } catch (err) {
    if (globalThis.location.pathname === '/account-api/federated-auth-page') {
      newrelicUtil.FEDERATED_AUTH(err);
    } else {
      newrelicUtil.AMPLIFY_AUTH_HANDLER(err);
    }
  }
});

Amplify.configure(config);

export const useCustomAuthFlow = () => {
  Amplify.configure({ authenticationFlowType: 'CUSTOM_AUTH' });
};
export const useUserSRPAuthFlow = () => {
  Amplify.configure({ authenticationFlowType: 'USER_SRP_AUTH' });
};

// Reexporting Auth this way instead of `export {Auth}` to prevent IDEs suggesting that importing Auth from this module
// is unnecessary.
export const Auth = AmplifyAuth;

// underlying apps are calling this utility functions.
export const getCognitoIdToken = async (): Promise<string> =>
  (await Auth.currentSession()).getIdToken().getJwtToken();
  
  
  

Log output

// Put your logs below this line


aws-exports.js

No response

Manual configuration

No response

Additional configuration

No response

Mobile Device

No response

Mobile Operating System

No response

Mobile Browser

No response

Mobile Browser Version

No response

Additional information and screenshots

No response

@github-actions github-actions bot added pending-triage Issue is pending triage pending-maintainer-response Issue is pending a response from the Amplify team. labels Oct 29, 2024
@cwomack cwomack self-assigned this Oct 30, 2024
@cwomack cwomack added Auth Related to Auth components/category question General question and removed pending-triage Issue is pending triage labels Oct 30, 2024
@cwomack
Copy link
Member

cwomack commented Oct 30, 2024

Hello, @al-mcnichol 👋. If you've upgraded from v4 to v6, then there was likely a massive amount of changes that would be needed to get the API's functioning properly and avoid errors on the user side. Can you help me understand what documentation was followed, if you went literally from v4 straight to v6, etc.?

While I see some issues with the v4/v5 API's in the code provided, I'm wondering if a sample repo would be more helpful here to ensure that we're updated and migrating the code properly across multiple files.

@github-actions github-actions bot removed the pending-maintainer-response Issue is pending a response from the Amplify team. label Oct 30, 2024
@cwomack cwomack added the pending-community-response Issue is pending a response from the author or community. label Oct 30, 2024
@al-mcnichol
Copy link
Author

@cwomack Thanks for responding. While we plan to do the upgrade, it was postponed due to the massive outage and business priorities. We're looking to try again at a later time. I'd like to focus on fixing the current version and not sure what else we can do to rectify it. You mind share the issues you see in the code and are they related to cause?

We followed the migration document and to your point, there were quite a bit of changes. However, the upgrade was deployed to our lower environment and went through rigorous testing, both regression and manual. Of course we didn't catch the issue until we encounter traffic in PROD

Also, I'm noticing the installed package version, namely @aws-amplify/auth: ^4.6.17 and `@aws-amplify/core: ^4.7.15', but seeing Amplify v5 being used under the hood. Is that expected?

@github-actions github-actions bot added pending-maintainer-response Issue is pending a response from the Amplify team. and removed pending-community-response Issue is pending a response from the author or community. labels Oct 30, 2024
@cwomack
Copy link
Member

cwomack commented Nov 1, 2024

@al-mcnichol, sorry to hear that you're experiencing these issues. We definitely recommend upgrading to v6 whenever you can, since that's the latest major version where we'll be providing bug fixes, feature improvements, etc.

To help with that, the migration docs you linked would be the best place to start for getting from v5 (upgrading from v4 to v5 should be pretty quick/easy) to v6. Are you able to share your code within the v6, lower environment to see if we can identify any areas for improvement? And can you also resubmit the image or details of your network logs when this happens? We can't see the image in the initial description.

For the final comment about v5 API's being used under the hood, that shouldn't be the case if you're only seeing v4.X in your package.json for dependencies. Can you make sure you've cleared your cache as well as do the following:

  • Delete you package-lock.json file
  • Delete your node_modules folder
  • Reinstall your dependencies with npm i

The above steps should be done any time you upgrade to a new major version as well. Thanks!

@cwomack cwomack added pending-community-response Issue is pending a response from the author or community. and removed pending-maintainer-response Issue is pending a response from the Amplify team. labels Nov 1, 2024
@al-mcnichol
Copy link
Author

al-mcnichol commented Nov 2, 2024

@cwomack Are you saying this particular issue is fixed in v6 and recommending we upgrade to address it? If so, I'm not 100% comfortable, given our experience, as mentioned in the original description. As you said, it could be something we're uniquely doing wrong and I'd like to focus first on understanding the root cause and address it in our current version.

Regarding the package-lock, this is our only time upgrading since v4. However, we discovered underlying applications that uses our web component are also loading Amplify, where we see occurrence of v5, even though their package version is v4 (same as ours). We're working to have Amplify installs removed on Apps that uses our web component.

What is the best/easiest way to expose our code for review? Our internal gitHub is not accessible externally and wondering if there are particular snippets you'd like to review, if that make sense.

Here is a zoomed version of the image I posted earlier. I didn't capture an image of the "TooManyRequestException", I was in panic mode when PROD was blowing up :)

image

@github-actions github-actions bot added pending-maintainer-response Issue is pending a response from the Amplify team. and removed pending-community-response Issue is pending a response from the author or community. labels Nov 2, 2024
@cwomack
Copy link
Member

cwomack commented Nov 5, 2024

@al-mcnichol, thanks for the additional context. Can you provide us some clarity on how your application gets used when you say, "underlying applications that uses our web component are also loading Amplify"? Is it embedded in other applications then and you're running multiple instances of Amplify?

Also, it sounds like you aren't on v6 then any longer and reverted back. But if you could share the code for v4/v5/v6 in a minimal sample repo (if not only share the frontend code that is causing issues), that might be needed.

Just to be clear, we'd recommend anyone using Amplify to upgrade to v6 if possible... but anyone on v4 especially to upgrade to v5 at minimum to take advantages of the bug fixes, features, etc. Going from v4 to v5 should be an easy upgrade since the API's, imports, etc. won't change like they do in v6.

@cwomack cwomack added pending-community-response Issue is pending a response from the author or community. and removed pending-maintainer-response Issue is pending a response from the Amplify team. labels Nov 5, 2024
@al-mcnichol
Copy link
Author

al-mcnichol commented Nov 11, 2024

@cwomack Thanks again for staying with me on this. Yes, we built our own custom UI Web component. The component is agnostic and used by several applications. We learned that one or more applications has as an instance of Amplify v4 installed. We figured it was the root cause when we upgraded the component to Amplify v6 but been able to verify/replicate the mentioned issue on v4.

Yes, our goal is to upgrade to v6 but kind of stuck without knowing the root cause.

I created a new repo and added you as a collaborator and you'll have access to the code once you accept the invite. Also, see branch "AUTHWEB-amplify-upgrade", which has V6 upgraded code

@github-actions github-actions bot added pending-maintainer-response Issue is pending a response from the Amplify team. and removed pending-community-response Issue is pending a response from the author or community. labels Nov 11, 2024
@cwomack
Copy link
Member

cwomack commented Nov 12, 2024

@al-mcnichol, can you clarify what the need or intent is for the loop here? I'm wondering if you're hitting the quota limit for Cognito for GetID as seen here.

@github-actions github-actions bot removed the pending-maintainer-response Issue is pending a response from the Amplify team. label Nov 12, 2024
@cwomack cwomack added the pending-community-response Issue is pending a response from the author or community. label Nov 12, 2024
@al-mcnichol
Copy link
Author

al-mcnichol commented Nov 14, 2024

@cwomack Great call out! We added a custom:user_id property to Cognito, which is how we identify a user once logged-in. However, for Federated sign-in/up the token does not consistently include the property and therefore the need to forceRefresh.

Also, the link you posted to the code is on v6, which is not yet in production. see the difference but think the end result is the same based on the reason stated above.

v4-code
v6-code

The quotas we are hitting are Rate of UserAuthentication requests and Rate of UserRead requests, which I'm not sure is tied to that.

@github-actions github-actions bot added pending-maintainer-response Issue is pending a response from the Amplify team. and removed pending-community-response Issue is pending a response from the author or community. labels Nov 14, 2024
@cwomack
Copy link
Member

cwomack commented Nov 15, 2024

@al-mcnichol, if you're still experiencing these quot limit errors then it's likely tied to how the loop is being used along with the fetchAuthSession() API in the v6 version of the code (as well as where you're calling Auth.currentUserPoolUser() in the v4 app).

While forcing the refresh of tokens is OK in many situations, you don't want to be recursively calling those API's to the point where the rate limits with Cognito are hit. Can you help me understand why there's a need to do the loop like this? Can the custom:user_id property that you added in Cognito be obtained with a single call? And can you potentially check to see if this attribute is present when you make any call to Cognito, and if not, attempt the call again when it's not present?

@github-actions github-actions bot removed the pending-maintainer-response Issue is pending a response from the Amplify team. label Nov 15, 2024
@cwomack
Copy link
Member

cwomack commented Nov 15, 2024

Just to be clear, if you're looking to keep this logic then you can also reach out to Cognito to request a rate limit increase for the quotas that you're exceeding as an alternate way to work around this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Auth Related to Auth components/category question General question
Projects
None yet
Development

No branches or pull requests

2 participants