Installer Download / URL checker problems / Strange server behavior with partial requests #165

Open
drojf opened this issue Sep 19, 2021 · 4 comments

Comments

@drojf
Collaborator

drojf commented Sep 19, 2021

When investigating why the URL checker didn't work, @TellowKrinkle discovered that the server won't always send the correct response if you only request part of a file.

The responses appear to alternate from one request to the next between HTTP 200 (the whole file) and HTTP 206 (partial content):

C:\Users\drojf>curl --head "https://07th-mod.com/rikachama/graphics/Watanagashi-Graphics.7z" -H "Range: bytes=0-1023"
HTTP/1.1 200 OK
Date: Sun, 19 Sep 2021 01:41:37 GMT
Content-Type: application/x-7z-compressed
Content-Length: 1006748838
Connection: keep-alive
last-modified: Mon, 12 Oct 2020 23:09:17 GMT
etag: "5f84e21d-3c01c4a6"
Cache-Control: max-age=3600
CF-Cache-Status: MISS
Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Report-To: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=mzUHRUmntFmMAdOY%2FT9om2cIKrUpjUPKU%2FNdqCofBmZ0ex1wMttuDkfcONIyBCdTUQJ%2FoipAQlY1aaNiDPwqxjJSl7%2BP1213rdqqMwO5PtPJD%2B1KnQeSkVecs2jLbeY%3D"}],"group":"cf-nel","max_age":604800}
NEL: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
Server: cloudflare
CF-RAY: 690f1cd88c5c16c5-SYD
alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400, h3-28=":443"; ma=86400, h3-27=":443"; ma=86400


C:\Users\drojf>curl --head "https://07th-mod.com/rikachama/graphics/Watanagashi-Graphics.7z" -H "Range: bytes=0-1023"
HTTP/1.1 206 Partial Content
Date: Sun, 19 Sep 2021 01:41:39 GMT
Content-Type: application/x-7z-compressed
Content-Length: 1024
Connection: keep-alive
last-modified: Mon, 12 Oct 2020 23:09:17 GMT
etag: "5f84e21d-3c01c4a6"
content-range: bytes 0-1023/1006748838
Cache-Control: max-age=3600
CF-Cache-Status: MISS
Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Report-To: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=PCimvFfIu0rza0%2FAeyxog9j3bGk2q6r7Tlm7u44XnZcBlKmcL6HOS6lwCstnij1E7isipUtlHQYiQi5rSmoph%2BoTVm7UCu5Bo280PMFUWzlVZr120rQEsCnu1lpm0x8%3D"}],"group":"cf-nel","max_age":604800}
NEL: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
Server: cloudflare
CF-RAY: 690f1ce28a1a16b9-SYD
alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400, h3-28=":443"; ma=86400, h3-27=":443"; ma=86400

I tested doing range requests on another website (downloading the Ubuntu ISO), and that website doesn't have this problem - it always returns 206.
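To make the difference between the two responses above concrete, here is a minimal Python sketch of how a checker could decide whether a server honored a range request. The function names `parse_content_range` and `range_was_honored` are hypothetical and are not taken from the installer or the URL checker; the header values are the ones shown in the curl output.

```python
import re
from typing import Mapping, Optional, Tuple

# Matches values such as "bytes 0-1023/1006748838" or "bytes 0-1023/*".
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def parse_content_range(value: str) -> Optional[Tuple[int, int, Optional[int]]]:
    """Return (first, last, total) from a Content-Range value, or None if malformed."""
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        return None
    first, last, total = match.groups()
    return int(first), int(last), (None if total == "*" else int(total))


def range_was_honored(status: int, headers: Mapping[str, str],
                      first: int, last: int) -> bool:
    """True only for a 206 response whose Content-Range matches the requested bytes."""
    if status != 206:
        # A 200 response means the server ignored the Range header
        # and is offering the whole file.
        return False
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    parsed = parse_content_range(lowered.get("content-range", ""))
    if parsed is None:
        return False
    return parsed[0] == first and parsed[1] == last
```

With the two responses above, the first (200, `Content-Length: 1006748838`) would be reported as not honored and the second (206, `content-range: bytes 0-1023/1006748838`) as honored.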

This may look like it would break the installer's "resume downloads" capability, but aria2c always makes a normal request first, followed by a range request. When the requests are made in that order, the server always gives a correct range response to the second request.

09/19 13:16:25 [INFO] CUID#7 - Requesting:
GET /rikachama/graphics/Watanagashi-Graphics.7z HTTP/1.1
User-Agent: aria2/1.35.0
Accept: */*,application/metalink4+xml,application/metalink+xml
Host: 07th-mod.com
Want-Digest: SHA-512;q=1, SHA-256;q=1, SHA;q=0.1


[#9392da 0B/0B CN:1 DL:0B]
09/19 13:16:26 [INFO] CUID#7 - Response received:
HTTP/1.1 200 OK
Date: Sun, 19 Sep 2021 03:16:24 GMT
Content-Type: application/x-7z-compressed
Content-Length: 1006748838
Connection: keep-alive
last-modified: Mon, 12 Oct 2020 23:09:17 GMT
etag: "5f84e21d-3c01c4a6"
Cache-Control: max-age=3600
CF-Cache-Status: MISS
Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Report-To: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=e13MFXOE%2FDL0pPprAmHZ%2BvK2cexqyTqMxP5j7X29KHXDlP6CnCWCXGLM25HYUQ57tGbLnhhyYQ64F0pq6%2B0ScohBZJLoufugicKtDdmvSM%2BTw1yOGf6vUXiqQsWzYcc%3D"}],"group":"cf-nel","max_age":604800}
NEL: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
Server: cloudflare
CF-RAY: 690fa7ad9e1b6a1a-SYD
alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400, h3-28=":443"; ma=86400, h3-27=":443"; ma=86400

<omitted some log text>

09/19 13:16:26 [INFO] CUID#8 - Requesting:
GET /rikachama/graphics/Watanagashi-Graphics.7z HTTP/1.1
User-Agent: aria2/1.35.0
Accept: */*
Host: 07th-mod.com
Range: bytes=241631232-1006748837
Want-Digest: SHA-512;q=1, SHA-256;q=1, SHA;q=0.1


[#9392da 230MiB/0.9GiB(24%) CN:1 DL:0B]
09/19 13:16:27 [INFO] CUID#8 - Response received:
HTTP/1.1 206 Partial Content
Date: Sun, 19 Sep 2021 03:16:25 GMT
Content-Type: application/x-7z-compressed
Content-Length: 765117606
Connection: keep-alive
last-modified: Mon, 12 Oct 2020 23:09:17 GMT
etag: "5f84e21d-3c01c4a6"
content-range: bytes 241631232-1006748837/1006748838
Cache-Control: max-age=3600
CF-Cache-Status: MISS
Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Report-To: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=pse79IiOe%2BsMrwDhjQICTw4v0c8bmSBkby2xudgF7bC9AgpUaT7orsMmTi8n6%2BxfVuuvQ0N0O%2F%2BavyfBTCvIIVDhhHjs8lKr6dtPFHy17iw3fAendzRlQsBs5auBnNQ%3D"}],"group":"cf-nel","max_age":604800}
NEL: {"success_fraction":0,"report_to":"cf-nel","max_age":604800}
Server: cloudflare
CF-RAY: 690fa7b64fa516e1-SYD
alt-svc: h3=":443"; ma=86400, h3-29=":443"; ma=86400, h3-28=":443"; ma=86400, h3-27=":443"; ma=86400

This means that our installer's resume functionality is unaffected by this behavior. Therefore, I don't think we need to investigate/fix this issue.

@TellowKrinkle has put a workaround for the URL checker (along with quite a few other changes) on the fix_validator_timeout branch, which should be ready to merge. I'll close this issue once that is merged.
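The workaround on that branch is not shown in this thread. Purely as an illustration of one way a checker could cope with the alternating responses, the sketch below repeats the range request when the server answers 200 instead of 206. The names `head_request` and `probe_range` and the default of two attempts are assumptions, not the code on the branch.

```python
import urllib.request
from typing import Callable, Dict, Tuple

Response = Tuple[int, Dict[str, str]]
Fetcher = Callable[[str, Dict[str, str]], Response]


def head_request(url: str, headers: Dict[str, str]) -> Response:
    """Send a HEAD request with urllib and return (status, lowercased headers)."""
    request = urllib.request.Request(url, headers=headers, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status, {k.lower(): v for k, v in response.headers.items()}


def probe_range(url: str, fetch: Fetcher = head_request, attempts: int = 2) -> Response:
    """Request the first 1024 bytes, retrying while the server ignores the range.

    Returns the first 206 response seen, or the last response if none was 206.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    range_header = {"Range": "bytes=0-1023"}
    result = fetch(url, range_header)
    for _ in range(attempts - 1):
        if result[0] == 206:
            break
        # Assumption based on the observation above: the next identical
        # request tends to get the partial response.
        result = fetch(url, range_header)
    return result
```

The `fetch` parameter is injectable so the retry logic can be exercised without touching the real server.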

@drojf
Collaborator Author

drojf commented Sep 19, 2021

@TellowKrinkle I see you're still making changes - you can merge the fix_validator_timeout branch whenever you're ready. You're not blocking anything in the installer so you can take your time.

I won't merge it myself in case you want to make more changes.

@TellowKrinkle
Member

It's good to be merged, though it seems to have 10 other commits that aren't on master so I'm not sure what it was supposed to be merged onto

@drojf
Collaborator Author

drojf commented Sep 19, 2021

ah sorry that's my fault, I thought I already pushed those to master, but I didn't...

in that case, let me handle merging it

@drojf
Collaborator Author

drojf commented Oct 17, 2021

I just had a report where the installer would fail with an "Invalid range header" error:

[Screenshot: Screen_Shot_2021-10-17_at_4 13 25_PM (cropped)]

I have actually seen this issue before once or twice. I'm pretty sure it's the same issue that we were seeing with the URL checker where the server wouldn't always return the correct range.

I thought this issue wouldn't affect aria2c because aria2c always first requests the whole file (which gives it the file size), then makes a second request for just part of the file. This usually ensures the second request gets the correct range.

But I'm not sure what happens if the connection count reaches 0 and then it requests a new block - it might do a request for a partial range immediately which the server would then give back incorrectly.

We currently do 8 connections when we download using aria2c.

This only seems to happen with the really large Umineko file (and possibly only when using metalinks).

If the user retries the install once or twice it seems to fix the problem, however.

Edit: Some more info on this issue. The user had the issue again. It appears that each time it happens, the installer restarts the download, which starts again OK; however, after 3 restarts on a single file the installer gives up. Also notable was that the user's download speed was very slow: 0.53 MBPS (in the logs, a maximum of 1MiB).

Possibly just changing the number of attempts on metafiles to 10 would fix this issue, since it's only metafiles which have this problem, and metafiles have checksum checks, so we probably wouldn't be accidentally downloading the same file 10 times in the worst case.

Changing the connection count to 1 might also fix the issue, as having multiple connections to the server might be what causes the invalid range header error.
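The two ideas above could be sketched roughly as follows. This is not the installer's actual code: the function names and the `run` callable are hypothetical, the value 3 assumes that "3 restarts" corresponds to three attempts, and whether limiting connections avoids the error is untested. `--max-connection-per-server` and `--split` are aria2c options that limit the number of connections used for a download.

```python
from typing import Callable, List

DEFAULT_ATTEMPTS = 3    # assumed to match the "gives up after 3 restarts" behavior
METALINK_ATTEMPTS = 10  # value proposed in this comment for checksummed metafiles


def attempts_for(is_metalink: bool) -> int:
    """More attempts for metalink downloads, since checksums guard against bad data."""
    return METALINK_ATTEMPTS if is_metalink else DEFAULT_ATTEMPTS


def aria2c_args(url: str, connections: int) -> List[str]:
    """Build an aria2c command line limited to `connections` connections."""
    return [
        "aria2c",
        f"--max-connection-per-server={connections}",
        f"--split={connections}",
        url,
    ]


def download_with_retries(run: Callable[[List[str]], int],
                          args: List[str], max_attempts: int) -> bool:
    """Call `run(args)` until it returns exit code 0 or the attempts run out."""
    for _ in range(max_attempts):
        if run(args) == 0:
            return True
    return False
```

A caller might pass `lambda args: subprocess.call(args)` as `run`, with `attempts_for(is_metalink=True)` as the limit and `connections=1` to try the single-connection idea.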

@drojf drojf reopened this Oct 17, 2021
@drojf drojf changed the title URL checker problems / Strange server behavior with partial requests Installer Download / URL checker problems / Strange server behavior with partial requests Oct 17, 2021