[Improve] Binary transmission statistics #6946

Open: wants to merge 4 commits into base: dev


@gitjxm gitjxm commented Jun 5, 2024

Count

Counts the number of files (excluding folders).

Bytes

Counts the total size of the files in bytes (excluding folders).
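
The two definitions above can be illustrated with a minimal sketch that walks a directory tree, skips folders, and tallies the remaining files. `FileStats` is a hypothetical helper written for this illustration with plain `java.nio.file` calls; it is not the PR's implementation and not part of SeaTunnel.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper (not SeaTunnel code): tallies regular files under a root path. */
public final class FileStats {
    public final long count; // number of regular files; folders are excluded
    public final long bytes; // combined size of those files, in bytes

    private FileStats(long count, long bytes) {
        this.count = count;
        this.bytes = bytes;
    }

    public static FileStats of(Path root) throws IOException {
        List<Path> files;
        try (Stream<Path> paths = Files.walk(root)) {
            // Files.walk also yields directories, so keep regular files only.
            files = paths.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        long bytes = 0;
        for (Path file : files) {
            bytes += Files.size(file);
        }
        return new FileStats(files.size(), bytes);
    }
}
```

Under this reading, a job that copies the tree unchanged would be expected to report the same Count and Bytes on the read side and the write side.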

Example Results


       Job Statistic Information

Start Time : 2024-06-05 15:22:21
End Time : 2024-06-05 15:22:24
Total Time(s) : 2
Total Read Count : 4
Total Read Bytes : 3153492
Total Write Count : 4
Total Write Bytes : 3153492
Total Failed Count : 0


@Hisoka-X
Member

Hisoka-X commented Jun 5, 2024

Please enable CI on your fork repository.

@gitjxm
Author

gitjxm commented Jun 5, 2024

Do I need to resubmit?

@Hisoka-X
Member

Hisoka-X commented Jun 5, 2024

It's fine for now, but I think you should add some test cases.
ignore it.

Hisoka-X added the First-time contributor label Jun 5, 2024
Member

@Hisoka-X Hisoka-X left a comment

I suggest introducing new metrics in the file connector, such as readFileCount and readFileBytes, rather than reusing the metrics of the Zeta engine. You can refer
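
A connector-local pair of metrics along the lines suggested here might look like the following minimal sketch. Only the names `readFileCount` and `readFileBytes` come from the comment; the class `FileReadMetrics` and its methods are assumptions made for illustration, and how such counters would be registered with SeaTunnel's metrics system is deliberately left out.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical connector-local metrics holder; not an existing SeaTunnel class. */
public final class FileReadMetrics {
    // LongAdder keeps the counters safe to update from several reader threads.
    private final LongAdder readFileCount = new LongAdder();
    private final LongAdder readFileBytes = new LongAdder();

    /** Records one file that was read, together with its size in bytes. */
    public void recordFile(long sizeInBytes) {
        if (sizeInBytes < 0) {
            throw new IllegalArgumentException("sizeInBytes must be >= 0");
        }
        readFileCount.increment();
        readFileBytes.add(sizeInBytes);
    }

    public long getReadFileCount() {
        return readFileCount.sum();
    }

    public long getReadFileBytes() {
        return readFileBytes.sum();
    }
}
```

Keeping file-level counters separate in this way would leave the engine's existing row-oriented read and write counters with their original meaning.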

Comment on lines +18 to +41
package org.apache.seatunnel.api.table.type;

public class BinaryRowType implements SeaTunnelDataType<BinaryRow> {

    public static final BinaryRowType INSTANCE = new BinaryRowType();

    @Override
    public Class<BinaryRow> getTypeClass() {
        return BinaryRow.class;
    }

    @Override
    public SqlType getSqlType() {
        return SqlType.BINARY;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) {
            return true;
        }
        return obj instanceof PrimitiveByteArrayType;
    }
}
Member

@Hisoka-X Hisoka-X Jun 5, 2024

When we add a new type, we have to be careful because it affects how every Sink/Transform has to handle it. Since the only purpose of adding this new type in this PR is to make the metrics more accurate, I think we need to discuss it in depth. cc @hailin0 @EricJoy2048
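
The concern about downstream handling can be sketched with a toy example. Everything below is hypothetical and simplified: `ColumnKind` stands in for a type tag, and `describeLegacy` stands in for a sink or transform written before the new kind existed. It is not SeaTunnel code.

```java
/** Hypothetical, simplified stand-ins; not SeaTunnel's real type classes. */
public final class TypeDispatchSketch {

    /** Toy type tag; BINARY_ROW plays the role of a newly introduced type. */
    public enum ColumnKind { STRING, BYTES, BINARY_ROW }

    /** A consumer written before BINARY_ROW existed; it knows only two kinds. */
    public static String describeLegacy(ColumnKind kind) {
        switch (kind) {
            case STRING:
                return "write as text";
            case BYTES:
                return "write as raw bytes";
            default:
                // Any kind added later lands here until this consumer is updated.
                throw new UnsupportedOperationException("Unsupported kind: " + kind);
        }
    }
}
```

Every consumer that dispatches on the type in this style would need its own new branch, which is why a type added only for metric accuracy has a cost well beyond the file connector.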

@gitjxm
Author

gitjxm commented Jun 6, 2024

> I suggest introducing new metrics in the file connector, such as readFileCount and readFileBytes, rather than reusing the metrics of the Zeta engine. You can refer

I think using the metrics of the Zeta engine is feasible; it just adds an extra meaning to them.
