
Error pages return HTTP 200 instead of correct status codes (403, 404, 500) #4132


Closed
kerojohan opened this issue Apr 7, 2025 · 8 comments · Fixed by #4227
Labels: affects: main, affects: 7.x, affects: 8.x, bug, component: SEO, high priority, testathon

Comments

@kerojohan

Describe the bug

In DSpace, pages that should return proper HTTP error codes (e.g., 500, 404, or 403) are instead returning a 200 OK status. This misbehavior affects search engine indexing and SEO, as search engines may incorrectly interpret error pages as valid content.

I’ve tested this issue on the DSpace demo site using Firefox and Chrome, and the following URLs all return a 200 OK status instead of the appropriate error code:

This issue has been observed in both DSpace 7.x and DSpace 8.x.

To Reproduce

Steps to reproduce the behavior:

  1. Open a browser (Chrome, Firefox)
  2. Navigate to one of the following URLs:
  3. Inspect the HTTP response headers (e.g., using browser developer tools or curl -I)
  4. Notice that the status code is 200 OK, even though the page clearly shows an error message
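
The manual check in the steps above can also be scripted. The following is only a minimal sketch: `checkStatus`, `Fetcher`, and `StatusCheck` are hypothetical names, not part of DSpace. It sends a HEAD request (the same thing `curl -I` does) and compares the returned status with the expected one. The fetch function is passed in as a parameter so the logic can be tried with a stub instead of a live site.

```typescript
// Hypothetical helper, not DSpace code: compares the HTTP status a URL
// returns with the status we expect for it.

// Minimal shape of a fetch-like function (the global fetch is assumed to be compatible).
type Fetcher = (
  url: string,
  init?: { method?: string }
) => Promise<{ status: number }>;

interface StatusCheck {
  url: string;
  expected: number;
  actual: number;
  ok: boolean;
}

async function checkStatus(
  url: string,
  expected: number,
  fetcher: Fetcher
): Promise<StatusCheck> {
  // HEAD asks only for the status line and headers, like `curl -I`.
  const response = await fetcher(url, { method: "HEAD" });
  return {
    url,
    expected,
    actual: response.status,
    ok: response.status === expected,
  };
}
```

On a runtime with a global `fetch` (for example Node 18 or later), calling `checkStatus("https://demo.dspace.org/test", 404, fetch)` should report `ok: false` while the bug is present, because the server answers 200.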

Expected behavior

The HTTP response should return the correct error status code:

  • 404 Not Found for missing pages
  • 403 Forbidden for unauthorized access
  • 500 Internal Server Error for internal failures

This ensures correct behavior for clients and search engines, and improves SEO by preventing incorrect indexing of error pages.

Related work

N/A – please let me know if there's an existing issue or PR related to this.

@kerojohan added the "bug" and "needs triage" labels on Apr 7, 2025
@github-project-automation (bot) moved this to 🆕 Triage in DSpace Backlog on Apr 7, 2025
@alanorth (Contributor) commented Apr 7, 2025

Hi @kerojohan. HTTP 404 errors for missing identifiers should be working correctly as of #2816. I don't know about the other errors, and I'm not sure if navigating directly to the error page is a good test?

@kerojohan (Author)

Hi @alanorth,

Looking at the issue you referenced, it might actually be related to the same problem I'm experiencing. Direct access to those pages, or to any non-existent path, should return a 404, 403, or 500 status code.

I was testing with this URL: https://demo.dspace.org/test — since the page doesn't exist, I expected to receive a 404, but instead I got a 200 status.

Best regards,
Joan

@alanorth (Contributor) commented Apr 7, 2025

> I was testing with this URL: demo.dspace.org/test. Since the page doesn't exist, I expected to receive a 404, but instead I got a 200 status.

Yes, that may be the case. #2816 seems to have fixed 404 responses only for missing DSpace objects.

@tdonohue added the "help wanted", "component: SEO", and "high priority" labels and removed the "needs triage" label on Apr 7, 2025
@tdonohue removed this from DSpace Backlog on Apr 7, 2025
@tdonohue added the "testathon" label on Apr 7, 2025
@jlipka commented Apr 7, 2025

(I've had a look at this, but I can't find a solution yet.)

#2816 might also be related to the status code problem as a whole, but in this case I think this change might be more relevant: #3682

Because server-side rendering is restricted to a few paths, the "ServerResponseService" can't set the response status code on the server for any other route, which seems to be necessary to get a real 404 status code back to the client.

If you call any route that isn't registered, like "/this-route-does-not-exist", it'll never be rendered on the server side (check out the "ssr.paths" config to see which paths are rendered on the server side).

Just for testing purposes, I reverted "server.ts" to this line:

  if (environment.ssr.enabled) {

With that change, unknown routes get a 404 status code from the backend again.

@tdonohue (Member) commented Apr 7, 2025

If this is a side effect of #3682, then it might be (temporarily) worked around by disabling that feature. You can disable that feature by using this configuration (in your config.*.yml for the User Interface).

ssr:
  paths: [ '/' ]

That setting says to use SSR for all paths.

Alternatively, it could be that we need to add all the 403/404/500 error pages into the default setting like this:

ssr:
  paths: [ '/home', '/items/', '/entities/', '/collections/', '/communities/', '/bitstream/', '/bitstreams/', '/handle/', '/reload/', '/403', '/404', '/500' ]

If someone has a chance to try these settings, please report back to let us know if they have any impact on the behavior.

@tdonohue (Member) commented Apr 7, 2025

I've been able to verify this bug is caused by #3682.

If you disable the default settings in #3682, then the bug no longer exists.

Use this configuration as a workaround for this bug. It disables the SSR improvements added by #3682:

ssr:
  paths: [ '/' ]

Listing the individual error pages for 403, 404, and 500 will fix the problem for those specific pages. But, a 200 OK is still returned for an invalid URL like http://localhost:4000/invalid-url, even though the 404 page will be shown.

It appears the issue is that the page must go through SSR for the error code to be properly returned.
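
The findings above can be illustrated with a small model. This is a sketch under one stated assumption: each `ssr.paths` entry is treated as a path prefix, which is consistent with `[ '/' ]` meaning "all paths" but is not confirmed here as the exact matching rule. `usesSsr` is a hypothetical name, not a DSpace function.

```typescript
// Hypothetical model of the ssr.paths check, assuming prefix matching.
function usesSsr(requestPath: string, ssrPaths: string[]): boolean {
  return ssrPaths.some((prefix) => requestPath.startsWith(prefix));
}

// The workaround: every path starts with '/', so everything is rendered on the server.
const allPaths = ["/"];

// The alternative setting from this thread, with the error pages listed individually.
const withErrorPages = [
  "/home", "/items/", "/entities/", "/collections/", "/communities/",
  "/bitstream/", "/bitstreams/", "/handle/", "/reload/",
  "/403", "/404", "/500",
];

// An invalid URL is covered by the workaround but not by the extended list.
const invalidUrlWithWorkaround = usesSsr("/invalid-url", allPaths);
const invalidUrlWithErrorPages = usesSsr("/invalid-url", withErrorPages);
const errorPageWithErrorPages = usesSsr("/404", withErrorPages);
```

Under this model, `/404` goes through SSR with the extended list, while `/invalid-url` does not, which matches the 200 OK observed for invalid URLs.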

@jesielviana (Contributor)

I think that’s exactly right, @tdonohue

From what I’ve observed, in the current setup, only the routes defined in ssr.paths are processed by the server during SSR. All other routes — including error pages like 404 or 500 — are rendered on the client side. This means it's not possible to set a specific HTTP status code for them.

The Angular client-side router operates entirely in the user's browser. It intercepts URL changes and renders different components of your Angular application without triggering a full page reload.

Since the browser doesn’t allow JavaScript to change the HTTP status code after the page has already loaded, these error pages end up returning a 200 OK, even though they display the correct error page.
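
To make the explanation above concrete, here is a simplified model of the two cases. It is not DSpace or Angular code: `resolveRoute`, `handleRequest`, and the route table are hypothetical stand-ins. The point it shows is that the status code is fixed when the server sends the response, so only a request that runs the router on the server can carry a 403, 404, or 500.

```typescript
// Hypothetical model: what the server can know at response time.
interface RouteOutcome {
  component: string;
  status: number;
}

interface HttpResponse {
  status: number;
  body: string;
}

// Stand-in for the application's route table (names are made up).
const knownRoutes = new Map<string, RouteOutcome>([
  ["/home", { component: "HomePage", status: 200 }],
  ["/403", { component: "ForbiddenPage", status: 403 }],
  ["/500", { component: "ServerErrorPage", status: 500 }],
]);

function resolveRoute(path: string): RouteOutcome {
  // Unknown paths fall through to the "not found" page.
  return knownRoutes.get(path) ?? { component: "PageNotFound", status: 404 };
}

function handleRequest(path: string, renderOnServer: boolean): HttpResponse {
  if (renderOnServer) {
    // SSR: the router runs before the response is sent, so the status can be set.
    const outcome = resolveRoute(path);
    return { status: outcome.status, body: `<html>${outcome.component}</html>` };
  }
  // CSR: the server sends the same application shell for every path with 200 OK.
  // The router runs later in the browser, too late to change the status.
  return { status: 200, body: "<html><app-root></app-root></html>" };
}
```

In this model, `handleRequest("/invalid-url", true)` yields a 404, while `handleRequest("/invalid-url", false)` yields a 200 even though the browser will go on to display the "not found" page.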

@tdonohue (Member)

Fixed by #4227.

If anyone is encountering this error locally, you have two possible fixes/workarounds:

  1. You can enable SSR (server side rendering) on all paths by using this configuration in your config.prod.yml (on frontend)
    ssr:
      paths: [ '/' ]
    
  2. Or, alternatively, you can merge in the changes of #4227, "Remove ssr.paths configuration and replace with ssr.excludePathPatterns which excludes specific paths from SSR" (or upgrade to 7.6.4, 8.2 or 9.0 once they are released). This PR removes the paths configuration in favor of an excludePathPatterns configuration, which excludes only specific paths from SSR, like this:
    ssr:
      excludePathPatterns:
        - pattern: "^/communities/[a-f0-9-]{36}/browse(/.*)?$"
          flag: "i"
        - pattern: "^/collections/[a-f0-9-]{36}/browse(/.*)?$"
          flag: "i"
        - pattern: "^/browse/"
        - pattern: "^/search$"
        - pattern: "^/community-list$"
        - pattern: "^/admin/"
        - pattern: "^/processes/?"
        - pattern: "^/notifications/"
        - pattern: "^/statistics/?"
        - pattern: "^/access-control/"
        - pattern: "^/health$"
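
As a rough illustration of how an exclusion list like the one above behaves, the sketch below tests request paths against the patterns. It assumes each entry is compiled with `new RegExp(pattern, flag)` and tested against the request path; that is an assumption for illustration, and `SsrExcludePattern` and `isExcludedFromSsr` are hypothetical names rather than the actual code from #4227.

```typescript
// Hypothetical model of ssr.excludePathPatterns, assuming plain RegExp matching.
interface SsrExcludePattern {
  pattern: string;
  flag?: string;
}

// Same entries as the YAML configuration above.
const excludePathPatterns: SsrExcludePattern[] = [
  { pattern: "^/communities/[a-f0-9-]{36}/browse(/.*)?$", flag: "i" },
  { pattern: "^/collections/[a-f0-9-]{36}/browse(/.*)?$", flag: "i" },
  { pattern: "^/browse/" },
  { pattern: "^/search$" },
  { pattern: "^/community-list$" },
  { pattern: "^/admin/" },
  { pattern: "^/processes/?" },
  { pattern: "^/notifications/" },
  { pattern: "^/statistics/?" },
  { pattern: "^/access-control/" },
  { pattern: "^/health$" },
];

function isExcludedFromSsr(path: string, patterns: SsrExcludePattern[]): boolean {
  // A path skips SSR if any pattern matches it.
  return patterns.some(({ pattern, flag }) => new RegExp(pattern, flag).test(path));
}

// An unknown path matches no exclusion, so it would still be rendered on the server.
const invalidUrlExcluded = isExcludedFromSsr("/invalid-url", excludePathPatterns);
const searchExcluded = isExcludedFromSsr("/search", excludePathPatterns);
```

The design difference from `ssr.paths` is that unknown routes are no longer left out by default: anything not explicitly excluded goes through SSR, so a path like `/invalid-url` can receive a real 404.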
    

6 participants