Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compression: Huffman Coding #1513

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

aladin002dz
Copy link
Contributor

Open in Gitpod know more

Describe your change:

  • Add an algorithm?
  • Fix a bug or typo in an existing algorithm?
  • Documentation change?

Checklist:

  • I have read CONTRIBUTING.md.
  • This pull request is all my own work -- I have not plagiarized.
  • I know that pull requests will not be merged if they fail the automated tests.
  • This PR only changes one algorithm file. To ease review, please open separate PRs for separate algorithms.
  • All new JavaScript files are placed inside an existing directory.
  • All filenames should use the UpperCamelCase (PascalCase) style. There should be no spaces in filenames.
    Example:UserProfile.js is allowed but userprofile.js,Userprofile.js,user-Profile.js,userProfile.js are not
  • All new algorithms have a URL in their comments that points to Wikipedia or another similar explanation.
  • If this pull request resolves one or more open issues then the commit message contains Fixes: #{$ISSUE_NO}.

Copy link
Collaborator

@appgurueu appgurueu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other main point is string handling, which pretty consistently uses repeated concatenation here (e.g. for building prefixes (that one is mostly a non-concern though, since even with slow string concat, it'll be linearithmic time), for encoding / decoding). At least the encoding could be rewritten to use an array without making it any less readable.

Arguably all of this is fine in practice, as JS engines optimize string concatenation. I think it needs some comments alluding to this, though (I'm thinking something along the lines of "the string concatenation in a loop here is fine despite strings being immutable, as modern JS engines optimize it to be amortized constant time by delaying it (using ropes) and similar techniques").

I'm sorry about the unrelated changes due to formatting, that's due to a mistake on our part (see #1515).

Keep up the good work!

)

while (nodes.length > 1) {
nodes.sort((a, b) => a.freq - b.freq)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you sort in every iteration?

Copy link
Contributor Author

@aladin002dz aladin002dz Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because when pushing a new node, the order may change.

Copy link
Collaborator

@appgurueu appgurueu Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be more appropriate to use a max heap here (our existing implementations should work, you just need to import and use them), given that you always extract the most frequent ones and only push new ones?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would add a lot more complexity, wouldn't it?

Copy link
Collaborator

@appgurueu appgurueu Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the API of our heap should be pretty straightforward to use. A heap is pretty much the data structure for this use case; it would significantly help the time complexity. If you want me to, I can make the necessary changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to implement it myself, then I'll need your valuable feedback and guidance.

Compression/Huffman.js Outdated Show resolved Hide resolved
Compression/Huffman.js Outdated Show resolved Hide resolved
Copy link
Collaborator

@appgurueu appgurueu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have you duplicated the code? I think there should only be a heap-based implementation; it is strictly better (more efficient) than the array-based one and of similar code complexity.

@aladin002dz
Copy link
Contributor Author

aladin002dz commented Oct 29, 2023

Why have you duplicated the code? I think there should only be a heap-based implementation; it is strictly better (more efficient) than the array-based one and of similar code complexity.
In progress to remove duplicates.

@appgurueu appgurueu added the hacktoberfest-accepted Accepted to be counted towards Hacktoberfest label Oct 29, 2023
@appgurueu
Copy link
Collaborator

Adding the hacktoberfest-accepted label since we're close to the end of October, and this PR is pretty close to being accepted and merged later on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hacktoberfest-accepted Accepted to be counted towards Hacktoberfest
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants