Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#132 allow users to optionally provide an output buffer when calling transcode #136

Merged
merged 26 commits into from
Apr 9, 2023
Merged

Conversation

baumgold
Copy link
Contributor

An alternative to #134 to solve #132 without code-duplication.

@Moelf
Copy link

Moelf commented Mar 15, 2023

I like this better

@svilupp
Copy link

svilupp commented Mar 15, 2023

Yeah, this is great. I was solving for minimum changes to the original code path, but this is much better/cleaner!

I’ll close my PR

@baumgold
Copy link
Contributor Author

Is anyone with write-access available to review this PR so we can get it approved/merged/released? Thanks!

@baumgold
Copy link
Contributor Author

@mkitti / @quinnj - are either of you able to review this? Thanks.

Copy link
Member

@mkitti mkitti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add tests for all the new signatures. See if you can consolidate the docstring into one.

src/transcode.jl Show resolved Hide resolved
src/transcode.jl Show resolved Hide resolved
src/transcode.jl Outdated Show resolved Hide resolved
src/transcode.jl Outdated Show resolved Hide resolved
src/transcode.jl Outdated Show resolved Hide resolved
@quinnj quinnj requested a review from Drvi March 23, 2023 18:19
@quinnj
Copy link
Member

quinnj commented Mar 23, 2023

I'm going to request a review from @Drvi as well; will try to review myself a little later today

@baumgold
Copy link
Contributor Author

@mkitti - Thanks for your review. I believe I addressed all of your suggestions. Could you please review again? Thanks.

@baumgold baumgold requested review from mkitti and removed request for Drvi March 23, 2023 20:00
src/transcode.jl Outdated Show resolved Hide resolved
Co-authored-by: Mark Kittisopikul <[email protected]>
@baumgold baumgold requested a review from mkitti March 25, 2023 10:53
Copy link
Member

@mkitti mkitti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Let's give the others a chance to review. Ping me in a week.

@baumgold
Copy link
Contributor Author

baumgold commented Apr 3, 2023

@quinnj / @Drvi / @mkitti - any new thoughts/comments? If not, shall we merge? Thanks.

Copy link
Member

@mkitti mkitti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thought is that documenting the argument as ByteData might not be very useful. Rather we should just spell it out as Union{Vector{UInt8},Base.CodeUnits{UInt8}}.

src/transcode.jl Outdated Show resolved Hide resolved
src/transcode.jl Outdated Show resolved Hide resolved
src/transcode.jl Outdated Show resolved Hide resolved
baumgold and others added 5 commits April 3, 2023 09:42
Co-authored-by: Mark Kittisopikul <[email protected]>
Co-authored-by: Mark Kittisopikul <[email protected]>
Co-authored-by: Mark Kittisopikul <[email protected]>
@mkitti
Copy link
Member

mkitti commented Apr 3, 2023

Looks good to me. Will merge soon unless someone else speaks up.

@JoaoAparicio
Copy link
Contributor

JoaoAparicio commented Apr 3, 2023

May I make a comment...

I note that as it is, there's no method with signature

transcode(::Type{TranscodingStreams.Codec}, ::Vector{UInt8}, ::Vector{UInt8})

There's

transcode(::Type{TranscodingStreams.Codec}, ::Vector{UInt8})

and there's

transcode(::Codec, ::Vector{UInt8}, ::Vector{UInt8})

As you can see the number of possibilities for manually defined methods grows combinatorially.

I would suggest that we fully resolve first the first argument, then the second etc. Then the combinations grow linearly. In this way we need only 3 manually defined methods to cover all combinations (right now there are 4 and not all combinations are covered)

src/transcode.jl Outdated Show resolved Hide resolved
src/transcode.jl Outdated Show resolved Hide resolved
src/transcode.jl Outdated Show resolved Hide resolved
src/transcode.jl Outdated Show resolved Hide resolved
@baumgold baumgold requested a review from mkitti April 8, 2023 02:02
src/buffer.jl Outdated Show resolved Hide resolved
@mkitti
Copy link
Member

mkitti commented Apr 8, 2023

I'm feeling pretty good about where we are on this. I'm aiming to merge within the next 24 hours.

@mkitti mkitti self-requested a review April 8, 2023 02:24
@mkitti mkitti self-assigned this Apr 8, 2023
@mkitti
Copy link
Member

mkitti commented Apr 9, 2023

I see no further comments or objections. Merging.

Please submit another pull request or issue to iterate further.

I will arrange for a release within the next five days. I'm currently targeting Tuesday.

I believe this to be a non-breaking change, but I will review further later.

@mkitti mkitti merged commit 3ece750 into JuliaIO:master Apr 9, 2023
@JoaoAparicio
Copy link
Contributor

@mkitti this is where this will be used: apache/arrow-julia#422

@mkitti
Copy link
Member

mkitti commented Apr 11, 2023

Released as 0.9.12

@quinnj
Copy link
Member

quinnj commented Apr 11, 2023

Sorry for not being able to help review earlier. I think this is awesome. One note I want to follow up on: it seems that currently "bad things" happen when you try to pass in the same buffer as the input and output buffers to transcode. Can we add a validation check and throw an error in that case? (I just tried it on a 20mb stream and it blew up to 17.5gb before erroring!)

Or even better: can that be made to work? i.e. I have a use case in CloudStore.jl where people want to provide pre-allocated buffers in which to download a compressed file and then be able to decompress the data "in place". So they'd provide a big enough buffer for the final decompressed data, download the compressed data into the first part of the buffer, then do a decompress in place; is that possible?

@baumgold
Copy link
Contributor Author

Can we add a validation check and throw an error in that case? (I just tried it on a 20mb stream and it blew up to 17.5gb before erroring!)

#140

Or even better: can that be made to work? i.e. I have a use case in CloudStore.jl where people want to provide pre-allocated buffers in which to download a compressed file and then be able to decompress the data "in place". So they'd provide a big enough buffer for the final decompressed data, download the compressed data into the first part of the buffer, then do a decompress in place; is that possible?

I don't know if this is possible. Is there data to show that this is a performance bottleneck? Can you just allocate a separate buffer?

bauglir added a commit to bauglir/TranscodingStreams.jl that referenced this pull request Apr 14, 2023
The examples in many of the codec packages rely on being able to call
`transcode` providing a `Codec` type and a string[^1][^2]. JuliaIO#136 removed
type annotations from the trailing arguments for `transcode(::Type{C},
...) where {C<:Codec}` causing a method ambiguity with `transcode(T,
src::String)` in Julia `Base`[^3]. This adds an additional method to
`Base.transcode` to resolve this ambiguity.

Fixes JuliaIO#139.

[^1]: https://github.com/JuliaIO/CodecZlib.jl/tree/f9fddaa28c093c590a7a93358709df2945306bc7#usage
[^2]: https://github.com/JuliaIO/CodecZstd.jl/tree/6327ffa9a3a12fc46d465dcfc8b30bed91cf284b#usage
[^3]: https://github.com/JuliaLang/julia/blob/ff7b8eb00bf887f20bf57fb7e53be0070a242c07/base/c.jl#L306
bauglir added a commit to bauglir/TranscodingStreams.jl that referenced this pull request Apr 15, 2023
The examples in many of the codec packages rely on being able to call
`transcode` providing a `Codec` type and a string[^1][^2]. JuliaIO#136 removed
type annotations from the trailing arguments for `transcode(::Type{C},
...) where {C<:Codec}` causing a method ambiguity with `transcode(T,
src::String)` in Julia `Base`[^3]. This adds an additional method to
`Base.transcode` to resolve this ambiguity.

Fixes JuliaIO#139.

[^1]: https://github.com/JuliaIO/CodecZlib.jl/tree/f9fddaa28c093c590a7a93358709df2945306bc7#usage
[^2]: https://github.com/JuliaIO/CodecZstd.jl/tree/6327ffa9a3a12fc46d465dcfc8b30bed91cf284b#usage
[^3]: https://github.com/JuliaLang/julia/blob/ff7b8eb00bf887f20bf57fb7e53be0070a242c07/base/c.jl#L306
mkitti pushed a commit that referenced this pull request Apr 20, 2023
The examples in many of the codec packages rely on being able to call
`transcode` providing a `Codec` type and a string[^1][^2]. #136 removed
type annotations from the trailing arguments for `transcode(::Type{C},
...) where {C<:Codec}` causing a method ambiguity with `transcode(T,
src::String)` in Julia `Base`[^3]. This adds an additional method to
`Base.transcode` to resolve this ambiguity.

Fixes #139.

[^1]: https://github.com/JuliaIO/CodecZlib.jl/tree/f9fddaa28c093c590a7a93358709df2945306bc7#usage
[^2]: https://github.com/JuliaIO/CodecZstd.jl/tree/6327ffa9a3a12fc46d465dcfc8b30bed91cf284b#usage
[^3]: https://github.com/JuliaLang/julia/blob/ff7b8eb00bf887f20bf57fb7e53be0070a242c07/base/c.jl#L306
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants