-
Notifications
You must be signed in to change notification settings - Fork 458
Remove ssr.paths
configuration and replace with ssr.excludePathPatterns
which excludes specific paths from SSR
#4227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thanks @FrancescoMolinaro. I tested this on DSpace 7.6.4-SNAPSHOT (patch applies cleanly) and saw that the 403, 404, and 500 error pages were returning the correct codes. I also tested a few other paths for existing and non-existing items (for example) to make sure the codes didn't change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@FrancescoMolinaro : The code looks good overall now, but my tests are not all successful. Here's what I'm seeing:
- Running this PR in production mode (
npm run build:prod; npm run serve:ssr
) - Test http://localhost:4000/500 -> Returns 500 error code (SUCCESS)
- Test http://localhost:4000/403 -> Returns 403 error code (SUCCESS)
- Test http://localhost:4000/404 -> Returns 404 error code (SUCCESS)
- Test http://localhost:4000/invalid-url -> Shows the
/404
page, but returns 200 OK (FAILURE)
So, this works for the pages that it's checking for. But, it doesn't seem to work in scenarios where a different path results in a 404 error. It also therefore might not work if a different path resulted in a 500 or 403 error.
Overall, this PR provides better behavior than the current main
. But, it doesn't seem to fully fix the problem of returning a proper error code when an error occurs from the backend. Instead, it just makes sure that if you access these specific error pages directly in your browser that you'll get a proper error code.
Hi @tdonohue , thanks for the feedback! I agree that there are some scenario where this solution might not work. I don't think there is an easy solution to this, what comes to mind to make a bit more robust the handling of the requests' status while keeping some pages out of SSR, could be importing the APP_ROUTES in the server.ts and checking the starting of the request's path so we can set a 404 in case none of the app's paths matches the starting of the request's one. This would still leave a lot of options uncovered, for instance home-invalid, would still result as a known route with a 200 code. |
Assigning @artlowel as a secondary reviewer since he offered to help take a look at this today. |
@tdonohue, @artlowel, the same situation (using a 200 error code for something that is not explicitly - eg. /404, /500 - a redirect/error) can be seen here: #4042. |
@AndreaBarbasso and @FrancescoMolinaro : If we can find a way to solve this by changing our settings to exclude specific paths from SSR, instead of including only specific paths in SSR, then I'd be OK with that approach. That does seem like it may provide better behavior for unexpected errors / redirects. If we want to go that route, I'd recommend renaming the existing
The challenge may be that we may need to also support regex values if we also want to exclude browse pages for communities & collections which appear under Whatever solution is created, we'll have to make sure to also backport it to 7.6.x and 8.x. |
Hi @tdonohue , thanks for the feedback! I agree that a blacklist approach would make the solution more stable for error codes and redirects, so I have refactored the implementation following your suggestion. I will look forward to your feedback and thanks again! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@FrancescoMolinaro : I haven't had a chance to test this yet, but I had some feedback on how we could clean up the new code. It may make since to just have one excludePaths
setting, if you agree. See more inline below.
Hi @tdonohue , thanks for the feedback and sorry for the poor naming choice, I appreciated a lot the article you shared. I have refactored the solution so to have a single parameter based on regexes as you proposed, I agree that this give a better flexibility and power in handling the various paths compositiions. I will look forward to your next feedback and thanks again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@FrancescoMolinaro : Thanks for the updates. Overall, this now looks good to me. I only have minor suggestions inline below.
I'd also recommend we update the Title & Description of this PR, as it's changed completely from its original purpose. If possible, we may also want to remove the earlier commits in this PR (by rebasing the PR & squashing commits). But, if you aren't able to easily do that, I can always merge squash this (reducing the entire PR to a single commit).
(UPDATE: I've changed the PR title. But, I haven't yet updated the description)
Overall, though, I'm +1. I also tested this today and verified that this approach works. I can verify that the excluded paths are not triggering SSR. I can also verify that the error pages & invalid URLs are returning correct error codes again.
ssr.paths
configuration and replace with ssr.excludePathRegexes
which excludes specific paths from SSR
4af4306
to
c399976
Compare
c399976
to
b1c5460
Compare
Hi @tdonohue , thanks again for the feedback. I did implement the last changes you requested and I have updated the description and squashed all the commits so that we now have a single one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@FrancescoMolinaro : Apologies for not getting back to this PR for a while. I re-tested it today, and it's mostly looking good & working well. However, I've run in a few more small bugs described inline below. Basically, we need to make a small update to the "/statistics" regex, and I'm also having issues with overriding the default settings in my config.prod.yml
.
Beyond that, I'm still basically a +1. I just want to ensure we get these final issues fixed before merging.
@FrancescoMolinaro
|
Hi @tdonohue , I checked and actually YAML doesn't support RegEx so they are treated as strings. |
ssr.paths
configuration and replace with ssr.excludePathRegexes
which excludes specific paths from SSRssr.paths
configuration and replace with ssr.excludePathPatterns
which excludes specific paths from SSR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@FrancescoMolinaro : Thanks for the updates. The basics appear to work, but I'm still not able to properly override settings using my config.*.yml
.
I'm testing in Production Mode (npm run build:prod; npm run serve:ssr
), and I'm turning off Javascript in my browser and attempting to access various paths which should be excluded.
Here's what I'm trying:
Add the following YAML settings (this is just a very basic example of overriding the defaults and only using one pattern)
ssr:
excludePathPatterns:
- /^\/community-list$/
Now, attempt to access http://localhost:4000/community-list and it wrongly is served via SSR.
However, if I comment out those YAML settings and use the defaults, then http://localhost:4000/community-list is correctly blocked from SSR.
So, it appears overrides in YAML are not working properly still. The default settings do work, but I cannot figure out how to override defaults.
I've also noticed a few of the new patterns appear to be incorrect. I believe these all need to be corrected
# There's no "/admin" path, but lots of paths like "/admin/[subpath]"
/^\/admin\//
# This needs to match both "/processes" and "/processes/[id]"
/^\/processes\/?/
# There's no "/notifications" path, but a few paths like "/notifications/[subpath]"
/^\/notifications\//
# I gave the wrong pattern for this one, as we also need to match "/statistics"
/^\/statistics\/?/
# For completion, we may also want to add this as other admin tools use "/access-control/[subpath]".
/^\/access-control\//
(Please double check my regex patterns are correct. I think these are correct, but I also might have made a mistake. I wasn't able to easily test these because the YAML overrides are not working.)
Finally, one note inline below.
Some more details about #4227 (comment): The example config.yml contains this:
I think this pattern doesn't work, which means we would incorrectly attempt SSR for this path. We're in a yml file, and yml does not do anything special for regexp syntax: the entire string is treated as an unquoted yml string and written to config.json as:
The string is then passed to isExcludedFromSsr:
This calls the RegExp constructor, effectively, as:
This results in the RegExp object printed as:
This is an impossible RegExp which does not match any string, so we'll incorrectly attempt to use SSR for this page. Unfortunately, the fix isn't quite trivial: JavaScript does not provide a good way of parsing a RegExp specified as "/.../i". Traditionally, one might use
This seems simple enough, and it would make things work for regular expressions with or without slashes. |
@philipprumpf : Thanks for digging in here & figuring out why this PR is no longer working. I see similar behavior. @FrancescoMolinaro : Could you update this PR to no longer use the |
Hi @tdonohue, @philipprumpf , thanks for the feedback and sorry for the delay, I wasn't working the past few days. I have refactored the solution following the suggestions of @philipprumpf to prevent the slashes from the strings in the YAML and adapted a bit the logic so that we can pass the flag as an additional parameter, therefore adding a bit more flexibility in building the RegExp. After running some tests it seem working to me, I tried accessing every path that should be covered but it would be great if you could double check them. Regarding the issue with YAML override I believe it was connected with the fact that the patterns would have a different value when passed from the YAML, I tried the patterns you suggested and they seem correct and working when applied from the YAML. I will look forward to your next feedback and thanks again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Thanks @FrancescoMolinaro ! This looks good to me now. I've retested today, and all the prior issues are fixed. The error codes are working properly now for the /404
, /403
and /500
error pages (and any invalid URLs). I've also verified that overriding the default settings in your config.*.yml
now works properly.
Backport failed for Please cherry-pick the changes locally and resolve any conflicts. git fetch origin dspace-7_x
git worktree add -d .worktree/backport-4227-to-dspace-7_x origin/dspace-7_x
cd .worktree/backport-4227-to-dspace-7_x
git switch --create backport-4227-to-dspace-7_x
git cherry-pick -x b1c5460bbba131d4f430f7d415000bc9a9809546 4c9638150ab994834ec6e732dc0045fad5166127 c442d355057adc6b6bcf27dde9138019c9e15d55 |
Successfully created backport PR for |
@FrancescoMolinaro : Could you manually backport this to |
References
Fixes #4132
Fixes #4042
Description
Adapted filtering of SSR pages blocking the rendering on specific pages instead of allowing it only on some pages.
Instructions for Reviewers
Navigating on pages 500, 404 or 403, or any invalid URL I should see the correct error status in the request.
In general the response status should be always correct for pages rendered in SSR.
List of changes in this PR:
Removed
ssr.paths
configuration and replaced withssr.excludePathPatterns
.Add patterns to exclude community and collection browse, global browse, global search, community list, and statistics and various administrative tools.
For example:
Checklist
This checklist provides a reminder of what we are going to look for when reviewing your PR. You do not need to complete this checklist prior creating your PR (draft PRs are always welcome).
However, reviewers may request that you complete any actions in this list if you have not done so. If you are unsure about an item in the checklist, don't hesitate to ask. We're here to help!
main
branch of code (unless it is a backport or is fixing an issue specific to an older branch).npm run lint
npm run check-circ-deps
)package.json
), I've made sure their licenses align with the DSpace BSD License based on the Licensing of Contributions documentation.