Skip to content
This repository has been archived by the owner on Apr 11, 2024. It is now read-only.

Adds check for Google Crawler attempting to crawl Auth Routes #824

Merged
merged 15 commits into from
Apr 25, 2023
Merged

Conversation

abharvey
Copy link
Contributor

@abharvey abharvey commented Apr 14, 2023

WHY are these changes introduced?

Fixes Related to #686

Users were seeing logs with multiple auth requests coming in failing on missing cookies. We determined that in Chrome google's crawler was sending additional requests to the auth and auth/callback endpoints for our test app with the user-agent "GoogleOther" which would not have cookies set nor shop and other parameters available.

WHAT is this pull request doing?

Adds a check against the google crawler user agent and for missing shop params, then throws an error (based on needed discussion)

Type of change

  • Patch: Bug (non-breaking change which fixes an issue)
  • Minor: New feature (non-breaking change which adds functionality)
  • Major: Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist

  • I have used yarn changeset to create a draft changelog entry (do NOT update the CHANGELOG.md file manually)
  • I have added/updated tests for this change
  • I have documented new APIs/updated the documentation for modified APIs (for public APIs)

@abharvey abharvey requested a review from a team as a code owner April 14, 2023 15:58
Copy link
Contributor

@cquemin cquemin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing the same code copy/pasted twice in the same file make me think we should create a method for it.
It would also facilitate the implementation of the final solution (throwing vs 418) since it would be implemented only once.

With that in mind, I would argue that throwing should be used for something unexpected that we can't recover from. Here it feels to me that since we can respond with 418 (? I am a teapot ???) and we know it will happen I would vote for the response code.
However did you try to see what is the behaviour from the bot? Does it "understand" 418 ?

@abharvey
Copy link
Contributor Author

Seeing the same code copy/pasted twice in the same file make me think we should create a method for it. It would also facilitate the implementation of the final solution (throwing vs 418) since it would be implemented only once.

With that in mind, I would argue that throwing should be used for something unexpected that we can't recover from. Here it feels to me that since we can respond with 418 (? I am a teapot ???) and we know it will happen I would vote for the response code. However did you try to see what is the behaviour from the bot? Does it "understand" 418 ?

I personally love the respond with the 418 most. And I agree with avoid a throw, I normally subscribe to the "never write exceptions" camp. Ironically it was a Goggle that had convinced me.

I think whether or not the bot understands the response isn't of importance to us. The main reason for the throw was to eject from the function and stop the auth process. Ending the request with a response I think sounds better based on your comment as well as the chat with Kevin.

Regarding the duplication, I'm on the fence. Trying to be more comfortable with duplication with AHA

@abharvey
Copy link
Contributor Author

@cquemin I changed my mind and extracted it. I'll have to add the response to it but now we can compare

@cquemin
Copy link
Contributor

cquemin commented Apr 17, 2023

@cquemin I changed my mind and extracted it. I'll have to add the response to it but now we can compare

I think it looks a bit better now :D

lib/auth/oauth/oauth.ts Outdated Show resolved Hide resolved
package.json Outdated Show resolved Hide resolved
lib/auth/oauth/oauth.ts Outdated Show resolved Hide resolved
Copy link
Collaborator

@teddyhwang teddyhwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of 418, we should return a 410 status code so Google stops coming back to these endpoints: https://www.searchenginejournal.com/google-404-status/254429/

lib/auth/oauth/__tests__/oauth.test.ts Show resolved Hide resolved
lib/auth/oauth/oauth.ts Outdated Show resolved Hide resolved
@abharvey abharvey requested a review from teddyhwang April 25, 2023 18:27
Comment on lines +140 to +145
if (isbot(request.headers['User-Agent'])) {
logForBot({request, log, func: 'callback'});
throw new ShopifyErrors.BotActivityDetected(
'Invalid OAuth callback initiated by bot',
);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@teddyhwang New callback bot handling.
Your last comment was on an out dated diff. Lots of back and forth on this with @mkevinosullivan and @paulomarg to decide the best course of action.
This is where we've landed.

Next step is to update the express app middleware to handle this error.

@abharvey abharvey changed the title [Bug]**[WIP = RFC]** Adds check for Google Crawler attempting to crawl Auth Routes [Bug] Adds check for Google Crawler attempting to crawl Auth Routes Apr 25, 2023
@abharvey abharvey changed the title [Bug] Adds check for Google Crawler attempting to crawl Auth Routes Adds check for Google Crawler attempting to crawl Auth Routes Apr 25, 2023
@abharvey abharvey merged commit 447af77 into main Apr 25, 2023
@abharvey abharvey deleted the abh/oauth branch April 25, 2023 19:55
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants