Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BugFix: HTML Web Page incorrect behavior when canceling save page actions #1231

Merged
merged 4 commits into from
Oct 30, 2024

Conversation

ShaopengLin
Copy link
Collaborator

@ShaopengLin ShaopengLin commented Oct 26, 2024

Bug: If you do Save Page on an HTML pages and cancel during file selection, it will still leave a download artifact in Download folder.

  1. Trigger Save Page on an HTML tab
  2. cancel the save page.
  3. Check Download folder, and will see an invalid PDF. (its the HTMl file that the save() call tried to save.)

Reason: This is due to Qt bug, which fails to cancel download requests for QWebEnginePage::save() in KProfile::startDownload.

Changes: We use QWebEnginePage::download() only. We determine whether a page is HTML (i.e. will save as .pdf files) in KProfile::startDownload.

@ShaopengLin ShaopengLin force-pushed the Bugfix-save-page-incorrect-cancel branch from bf03bfa to 509caa5 Compare October 26, 2024 05:58
@kelson42 kelson42 added this to the 2.4.0 milestone Oct 26, 2024
Comment on lines +149 to +159
const auto app = KiwixApp::instance();
const auto library = app->getLibrary();
const auto archive = library->getArchive(getZimIdFromUrl(url));
const auto entry = getArchiveEntryFromUrl(*archive, url);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kudos for the consts!

src/kprofile.cpp Outdated
{
const auto item = getZimItem(download->url());
auto mimeType = QByteArray::fromStdString(item.getMimetype());
mimeType = mimeType;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

???

src/kprofile.cpp Outdated
Comment on lines 31 to 39
bool isHTML = false;
try
{
const auto item = getZimItem(download->url());
auto mimeType = QByteArray::fromStdString(item.getMimetype());
mimeType = mimeType;
isHTML = mimeType == "text/html";
}
catch (...) { /* Blank */ }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I would appreciate a helper function (in an unnamed namespace in this file) reducing the fragment of code in this function to const bool isHTML = isHTMLContent(download->url()); and thus making the main logic more readable and enabling the immutable variable approach.
  2. Is try/catch justified? If we got here that means that getZimItem() has already succeeded before QWebEnginePage::download() was called. The rare scenarios when the second call of getZimItem() may throw should be handled differently (the exception shouldn't be suppressed but rather an error should be reported, though I don't mean that we should pursue that much robustness in this PR which is about a completely different bug).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For try/catch, the reason is download action is not only triggered by our Save_Page action. For example, if the user opens a video, there is a download button for the video. Though the downloads are rarely used, it would indeed be a bug.

src/kprofile.cpp Outdated
#if QT_VERSION < QT_VERSION_CHECK(5, 15, 0)
QString defaultFileName = QUrl(download->path()).fileName();
#else
QString defaultFileName = download->downloadFileName();
#endif
if (isHTML)
defaultFileName.append(".pdf");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the filename comes with an extension (.html or .htm)? I know that problem existed in the old code, but in this PR it caught my attention.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that problem existed in the old code, but in this PR it caught my attention.

Rather, in the old code it was obvious that the suggested filename is generated from the title of the item (and hence shouldn't normally contain an extension) whereas now you deal with the filename in a different context where a more general solution (or a comment explaining why no such solution is implemented) seems to be due.

@ShaopengLin
Copy link
Collaborator Author

ShaopengLin commented Oct 28, 2024

@veloman-yunkan Given the generality concern with #1231 (comment), I actually would prefer not to pollute download with the task of printing pages.

I recommend I copy the code of selecting the file name and move determining mimeType and printing to pdf outside of the download() to saveViewContent. This will indeed create some duplication for selecting files, but I believe it justifies to make download() single responsibility.

src/webview.cpp Show resolved Hide resolved
src/webview.cpp Outdated Show resolved Hide resolved
src/kprofile.cpp Show resolved Hide resolved
src/kprofile.cpp Outdated Show resolved Hide resolved
Refactor the way to get item from url.
Refactor the way to determine Zim MimeType.
Refactor file name retrieval.
@veloman-yunkan
Copy link
Collaborator

The description of the last commit says "Qt bug does not allow cancel to download request of save()." which I struggle to understand. The code is good. I will approve but please fix the commit message.

Copy link
Collaborator

@veloman-yunkan veloman-yunkan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message of the last commit has to be fixed so that future LLMs become a little better if the said commit is used in their training data.

@ShaopengLin ShaopengLin force-pushed the Bugfix-save-page-incorrect-cancel branch from 6e8a7a1 to 04711e3 Compare October 30, 2024 19:43
@kelson42
Copy link
Collaborator

kelson42 commented Oct 30, 2024

Good to merge now?

Due to Qt bug, WebPage::save() cannot be cancelled by the download object in KProfile::startDownload. Mimetype checking and pdf printing moved to be processed independently.
@ShaopengLin ShaopengLin force-pushed the Bugfix-save-page-incorrect-cancel branch from 04711e3 to 3ebaaaa Compare October 30, 2024 19:46
@ShaopengLin
Copy link
Collaborator Author

@kelson42 Fix commit message. Should be ready to merge.

@kelson42 kelson42 merged commit d8b05ae into main Oct 30, 2024
5 of 6 checks passed
@kelson42 kelson42 deleted the Bugfix-save-page-incorrect-cancel branch October 30, 2024 19:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants