[IO-427] Add TrailerInputStream #497

Open
wants to merge 4 commits into
base: master

@ax-lothas ax-lothas commented Oct 12, 2023

Add a new TrailerInputStream which holds back the last x bytes of the underlying input stream while reading it.

This is useful, e.g., when a payload is followed by a fixed-length checksum.

https://issues.apache.org/jira/browse/IO-427
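
To make the idea concrete, here is a minimal sketch of a stream that holds back the last few bytes. It is not the class from this pull request: the name `HoldBackInputStream`, the ring-buffer layout, and the decision to throw `EOFException` for a too-short source are all assumptions made for illustration, and only the single-byte `read()` is implemented.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch, not the class from this pull request. */
class HoldBackInputStream extends InputStream {
    private final InputStream source;
    private final byte[] trailer; // ring buffer of the most recently read bytes
    private int pos;              // index of the oldest byte in the ring

    HoldBackInputStream(final InputStream source, final int trailerLength) throws IOException {
        if (trailerLength < 0) {
            throw new IllegalArgumentException("trailerLength must be >= 0");
        }
        this.source = source;
        this.trailer = new byte[trailerLength];
        // Pre-fill the buffer so that every later read can release one byte.
        int filled = 0;
        while (filled < trailerLength) {
            final int n = source.read(trailer, filled, trailerLength - filled);
            if (n < 0) {
                // Assumption: a source shorter than the trailer is an error.
                throw new EOFException("source is shorter than the trailer");
            }
            filled += n;
        }
    }

    @Override
    public int read() throws IOException {
        final int next = source.read();
        if (next < 0) {
            return -1; // the buffer now holds the real trailer
        }
        if (trailer.length == 0) {
            return next; // nothing is held back
        }
        final int released = trailer[pos] & 0xFF; // oldest byte leaves the buffer
        trailer[pos] = (byte) next;               // newest byte takes its slot
        pos = (pos + 1) % trailer.length;
        return released;
    }

    /** Returns a copy of the held-back bytes in stream order. */
    byte[] copyTrailer() {
        final byte[] copy = new byte[trailer.length];
        for (int i = 0; i < copy.length; i++) {
            copy[i] = trailer[(pos + i) % trailer.length];
        }
        return copy;
    }
}
```

With a source containing `helloABCD` and a trailer length of 4, draining this sketch yields `hello`, and `copyTrailer()` then returns `ABCD`.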

Member

@garydgregory garydgregory left a comment

Some notes below. I'll comment in the ticket on a separate issue.

* @param trailerLength the length of the trailer which is hold back (must be >= 0).
* @throws IOException initializing the trailer buffer failed.
*/
public TrailerInputStream(final InputStream source, final int trailerLength)
Member

Use the Builder pattern to avoid constructor creep. For example org.apache.commons.io.input.BOMInputStream.Builder.

Author

What should be the default value of trailerLength? 0 is rather useless, so both arguments have to be provided anyway.

Contributor

Personally, I prefer constructors, especially when, as here, there's exactly one constructor.

Member

There does not need to be a default if none makes sense; an exception can be thrown at build time, or in the private constructor, for nonsensical values.

If experience has taught us anything here, it's that there will be constructor creep in the future, so please use a builder.
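
A rough sketch of what such a builder could look like, with no usable default and validation at build time. Everything here is hypothetical: `TrailerConfig` stands in for the stream class under review, and the setter and `get()` names are illustrative choices rather than the API of any existing Commons IO builder.

```java
import java.io.InputStream;
import java.util.Objects;

/** Hypothetical stand-in for the stream class under review. */
final class TrailerConfig {
    final InputStream source;
    final int trailerLength;

    // Private constructor: instances are only created through the builder.
    private TrailerConfig(final Builder builder) {
        this.source = builder.source;
        this.trailerLength = builder.trailerLength;
    }

    static Builder builder() {
        return new Builder();
    }

    static final class Builder {
        private InputStream source;
        private int trailerLength = -1; // deliberately not a usable default

        Builder setInputStream(final InputStream source) {
            this.source = source;
            return this;
        }

        Builder setTrailerLength(final int trailerLength) {
            this.trailerLength = trailerLength;
            return this;
        }

        /** Validates the settings and fails at build time if they make no sense. */
        TrailerConfig get() {
            Objects.requireNonNull(source, "source");
            if (trailerLength <= 0) {
                throw new IllegalStateException("trailerLength must be set to a positive value");
            }
            return new TrailerConfig(this);
        }
    }
}
```

Under these assumptions, a call such as `TrailerConfig.builder().setInputStream(in).setTrailerLength(4).get()` succeeds, while omitting `setTrailerLength` fails with an `IllegalStateException`.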

Contributor

IMHO builders are vastly overused. They're not native to Java's design, and constructors are more natural. There's a place for builders, but they shouldn't be the default.


@Override
public int available() throws IOException {
return this.source.available();
Contributor

Shouldn't this subtract the bytes in the trailer?

Author

The trailer is filled in the constructor, so every byte read from the source after the constructor has finished causes exactly one byte to be returned to the caller of the read methods. Think of it as a shift register.
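
The shift-register argument can be checked with a small experiment: once the trailer buffer has been pre-filled, whatever the source still reports as available corresponds one-to-one to bytes the wrapper can hand out. The helper below is a hypothetical illustration (the class and method names are not from the pull request), and it relies on `available()` being exact, which holds for in-memory streams such as `ByteArrayInputStream` but is only an estimate in general.

```java
import java.io.IOException;
import java.io.InputStream;

final class AvailableDemo {
    /**
     * Pre-fills a trailer buffer the way the constructor is described as doing,
     * then returns what the source reports as still available.
     */
    static int availableAfterPrefill(final InputStream source, final int trailerLength)
            throws IOException {
        final byte[] trailer = new byte[trailerLength];
        int filled = 0;
        while (filled < trailerLength) {
            final int n = source.read(trailer, filled, trailerLength - filled);
            if (n < 0) {
                break; // source shorter than the trailer
            }
            filled += n;
        }
        // Each remaining source byte pushes exactly one buffered byte out to the caller.
        return source.available();
    }
}
```

For a 9-byte `ByteArrayInputStream` and a trailer length of 4, this returns 5, which is exactly the payload length, so no further subtraction is needed.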

Contributor

@sebbASF sebbASF left a comment

Please update code with design info

Contributor

@sebbASF sebbASF left a comment

Reword comment please

@garydgregory
Member

If this class is indeed generally useful, can it be reused from our Tailer classes?

Contributor

@sebbASF sebbASF left a comment

See review

*/
public TrailerInputStream(final InputStream source, final int trailerLength)
throws IOException {
if (trailerLength < 0) {
Contributor

I don't see the point of allowing a zero-length trailer.

Also, it probably makes sense to have an upper limit.
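
One way to read this suggestion is as a range check on the argument. The sketch below is only an illustration: the helper name and the cap of 1 MiB are hypothetical, and the thread does not say what the upper limit should be.

```java
final class TrailerLengthCheck {
    /** Hypothetical cap; the review does not propose a specific value. */
    static final int MAX_TRAILER_LENGTH = 1 << 20;

    /** Returns the length unchanged if it lies in [1, MAX_TRAILER_LENGTH]. */
    static int check(final int trailerLength) {
        if (trailerLength < 1 || trailerLength > MAX_TRAILER_LENGTH) {
            throw new IllegalArgumentException(
                    "trailerLength must be between 1 and " + MAX_TRAILER_LENGTH
                            + " but was " + trailerLength);
        }
        return trailerLength;
    }
}
```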

Contributor

@elharo elharo left a comment

Getting the trailer before the stream is fully read should probably throw an illegal state exception.
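
A sketch of the stricter behaviour proposed here, assuming a variant that tracks end of stream and refuses to expose the trailer before then. The class name `StrictTrailerStream` is hypothetical, and unlike the constructor-filled design discussed in this thread, this variant fills its buffer lazily on the first read, which keeps the example short.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayDeque;

/** Hypothetical variant that only exposes the trailer after end of stream. */
class StrictTrailerStream extends InputStream {
    private final InputStream source;
    private final int trailerLength;
    private final ArrayDeque<Integer> held = new ArrayDeque<>();
    private boolean eof;

    StrictTrailerStream(final InputStream source, final int trailerLength) {
        this.source = source;
        this.trailerLength = trailerLength;
    }

    @Override
    public int read() throws IOException {
        // Buffer one byte more than the trailer so that the oldest one can be released.
        while (!eof && held.size() <= trailerLength) {
            final int b = source.read();
            if (b < 0) {
                eof = true;
            } else {
                held.addLast(b);
            }
        }
        return held.size() > trailerLength ? held.removeFirst() : -1;
    }

    /** Returns the trailer, or fails if the stream has not been read to the end. */
    byte[] getTrailer() {
        if (!eof) {
            throw new IllegalStateException("stream has not been fully read yet");
        }
        // If the source was shorter than trailerLength, the result is shorter too.
        final byte[] out = new byte[held.size()];
        int i = 0;
        for (final int b : held) {
            out[i++] = (byte) b;
        }
        return out;
    }
}
```

The trade-off is that callers can no longer peek at intermediate buffer contents, but they also cannot mistake payload bytes for the real trailer.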

*
* <p>
* "Normal" read calls read the underlying stream except the last few bytes (the trailer). The
* trailer is updated with each read call. The trailer can be gotten by one of the copyTrailer
Contributor

getTrailer

* </p>
*
* <p>
* It is safe to fetch the trailer at any time but the trailer will change with each read call
Contributor

That seems wonky. I'd almost prefer it to throw IllegalStateException until it has the real trailer.


private final InputStream source;
/**
* Invariant: After every method call which exited without exception, the trailer has to be
Contributor

has to be --> is

* </p>
*
* @param source underlying stream from which is read.
* @param trailerLength the length of the trailer which is hold back (must be &gt;= 0).
Contributor

held back


return this.trailer.length;
}

public byte[] copyTrailer() {
Contributor

copyTrailer --> getTrailer

@sebbASF
Contributor

sebbASF commented Oct 23, 2023

Whilst the class seems to work well, I'm not convinced that there is a sufficient need for it.

Every extra class adds to the maintenance load, so there needs to be a demonstrated need to make it worth the extra effort.

@garydgregory
Member

Whilst the class seems to work well, I'm not convinced that there is a sufficient need for it.

Every extra class adds to the maintenance load, so there needs to be a demonstrated need to make it worth the extra effort.

Right, I agree, and as I mentioned earlier: Could this be reused as is within IO itself (Tailer class) and from Tika which provides a similar class?
