Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize unnecessary require operation in write_ascii_slow #835

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

pyb1993
Copy link

@pyb1993 pyb1993 commented Jun 14, 2021

you can see the discussion here
This PR is to reduce the require operation when serialize huge ascii string
You can use

mvn -f benchmarks/pom.xml compile exec:java -Dexec.args="-f 4 -wi 5 -i 3 -t 2 -w 2s -r 2 HugeStringBenchmark.writeAsciiHuge"
to test the performance (with and without this optimization), I have show a short result in the discussion

@theigl
Copy link
Collaborator

theigl commented Jun 22, 2021

@pyb1993: I have not forgotten about this PR. I'm currently on vacation and will take a look when I'm back in 2 weeks.

@theigl
Copy link
Collaborator

theigl commented Jul 5, 2021

@pyb1993: I just looked into your PR and can reproduce the performance gains. However, I'm not sure about applying your changes, since they only affect a rare corner case. As you wrote on the mailing list:

But If there is such a case(and I do a benchmark for this rare case), the performance will have huge improvement in writeAscii.

  1. the length of string is larger than the initial output buffer size. (which will cause the unecessary require operation(allocate and copy))
  2. the output is not reused, new output each time (I think the best practice is reused the output if possible)

If performance is important, the output should be pooled and re-used. Otherwise you will see a lot of allocation and a lot of GC pressure. Your change is simple enough, but it still makes the code slightly harder to understand and I'm not sure if any real user would benefit from the it.

@pyb1993
Copy link
Author

pyb1993 commented Jul 6, 2021 via email

@pyb1993
Copy link
Author

pyb1993 commented Jul 11, 2021

@pyb1993: I just looked into your PR and can reproduce the performance gains. However, I'm not sure about applying your changes, since they only affect a rare corner case. As you wrote on the mailing list:

But If there is such a case(and I do a benchmark for this rare case), the performance will have huge improvement in writeAscii.

  1. the length of string is larger than the initial output buffer size. (which will cause the unecessary require operation(allocate and copy))
  2. the output is not reused, new output each time (I think the best practice is reused the output if possible)

If performance is important, the output should be pooled and re-used. Otherwise you will see a lot of allocation and a lot of GC pressure. Your change is simple enough, but it still makes the code slightly harder to understand and I'm not sure if any real user would benefit from the it.

Hi, how do you think about his PR? as what I talked above, I think there are a lot similar situation that
capacity >= count << k in writeLongs, writeInts, and writeDoubles

@theigl
Copy link
Collaborator

theigl commented Jul 14, 2021

@pyb1993: I'm still +0 on this. It can improve things for edge cases, but I'm not sure the change is worth it. I'll try to get some more feedback on this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

2 participants