Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix image loading issues by switching from MHTML to HTML format #71 #87

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Sebb955
Copy link

@Sebb955 Sebb955 commented Jan 21, 2025

Images were not displayed correctly when importing in MHTML. Sometimes, only the first image was displayed correctly, while the others remained grayed out.

To solve this problem, we switched to importing in HTML, which allows for better loading and display of images. This method ensures that all images are loaded and displayed correctly, improving the user experience.

Fixes #71

Images hidden behind buttons were not displayed correctly when importing in MHTML. Sometimes, only the first image was displayed correctly, while the others remained white or were broken.

To solve this problem, we switched to importing in HTML, which allows for better loading and display of images. This method ensures that all images are loaded and displayed correctly, improving the user experience.
@shihabmridha shihabmridha added the bug Something isn't working label Jan 22, 2025
Copy link
Owner

@shihabmridha shihabmridha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not tested the PR yet. Added a couple of comments. My main question is, do we really need the two packages (node-fetch and @types/node-fetch) you've added? Because you are calling fetch inside evaluate function's callback. Everything inside evaluate gets executed in the browser where we have access to fetch.

package.json Outdated
@@ -4,7 +4,9 @@
"description": "Download course from educative.io",
"main": "src/app.ts",
"dependencies": {
"puppeteer": "^22.6.3"
"@types/node-fetch": "^2.6.12",
Copy link
Owner

@shihabmridha shihabmridha Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@types/node-fetch should be in devDependencies. I know it's a CLI tool and it does not really matter but, lets follow the standard.

package.json Outdated
@@ -4,7 +4,9 @@
"description": "Download course from educative.io",
"main": "src/app.ts",
"dependencies": {
"puppeteer": "^22.6.3"
"@types/node-fetch": "^2.6.12",
"node-fetch": "^3.3.2",
Copy link
Owner

@shihabmridha shihabmridha Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need node-fetch? We are using Bun, and it has fetch API implemented to it.

_Note: NodeJS now has fetch API built into it.

src/browser.ts Outdated
if (srcset) {
const urls = srcset.split(',').map(url => url.trim().split(' ')[0]);
// Processes an array of URLs, fetching and converting those that start with '/api/' to data URLs.
const newSrcset = await Promise.all(urls.map(async url => {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use Promise.allSettled instead of Promise.all.

@Sebb955
Copy link
Author

Sebb955 commented Jan 23, 2025

Thank you for pointing this out! You are absolutely right that the evaluate function runs in the browser context, where the native fetch API is already available. I have removed node-fetch and its type definitions (@types/node-fetch) from the dependencies, as they are unnecessary in this case.

Here are the changes I made:

  • Removed node-fetch and @types/node-fetch from the package.json.
  • Verified that the fetch API works as expected in the browser context without any additional dependencies.

Copy link
Owner

@shihabmridha shihabmridha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a couple of comments that I missed before.

});

// Wait a bit more for the new image URLs to load
await new Promise(resolve => setTimeout(resolve, 2000));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the waitFor function that we have in the utils file.

for (const img of Array.from(images)) {
const src = img.getAttribute('src');
if (src?.startsWith('/api/')) {
const response = await fetch('https://www.educative.io' + src);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the configuration.ts file, we build urls in there. Can we introduce a new function over there and use the function to build the URL instead of hard-coding the root URL here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

images hidden behind the buttons aren't showing properly.
2 participants