Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-44491: [C++] StatusConstant- cheaply copied const Status #44493

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

bkietz
Copy link
Member

@bkietz bkietz commented Oct 21, 2024

Rationale for this change

It'd be convenient to construct placeholder error Status cheaply.

What changes are included in this PR?

A constant error Status can now be constructed wrapped in a function like

Status UninitializedResult() {
  static StatusConstant uninitialized_result{StatusCode::UnknownError,
                                             "Uninitialized Result<T>"};
  return uninitialized_result;
}

Copying the constant status is relatively cheap (no heap interaction), so it's suitable for use as a placeholder error status.

Added bool Status::State::is_constant which causes copies to be shallow and skips destruction. Also Status::state_ is a const pointer now; this helps to ensure that it is obvious when we mutate state_ (as in AddContextLine).

Are these changes tested?

Minimal unit test added. The main consideration is probably benchmarks to make sure hot paths don't get much slower.

Are there any user-facing changes?

This API is not currently public; no user-facing changes.

Copy link

⚠️ GitHub issue #44491 has been automatically assigned in GitHub to PR creator.

cpp/src/arrow/status.h Outdated Show resolved Hide resolved
cpp/src/arrow/status.h Outdated Show resolved Hide resolved
@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting committer review Awaiting committer review labels Oct 21, 2024
@bkietz bkietz marked this pull request as ready for review October 25, 2024 15:12
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Oct 25, 2024
@bkietz bkietz changed the title GH-44491: [C++] Static Status draft GH-44491: [C++] StatusConstant- cheaply copied const Status Oct 25, 2024
ARROW_CHECK_NE(code, StatusCode::OK) << "Cannot construct ok status constant";
}

operator Status() const { // NOLINT(runtime/explicit)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could alternatively make StatusConstant a subclass of Status

@github-actions github-actions bot added awaiting changes Awaiting changes awaiting change review Awaiting change review and removed awaiting change review Awaiting change review awaiting changes Awaiting changes labels Oct 25, 2024
@bkietz
Copy link
Member Author

bkietz commented Oct 25, 2024

@pitrou @felipecrv which benchmark would be convincing as a demonstration this doesn't slow us down? I don't think we have an explicit status or result benchmark

For now (since it seems to me to have the deepest Status bubbling stack),

@ursabot please benchmark command=cpp-micro --suite-filter=grouper

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting change review Awaiting change review labels Oct 25, 2024
@pitrou
Copy link
Member

pitrou commented Oct 25, 2024

I don't think we have an explicit status or result benchmark

We do, but for some reason I had stuffed it in type_benchmark.cc

BENCHMARK(ErrorSchemeNoError);
BENCHMARK(ErrorSchemeBool);
BENCHMARK(ErrorSchemeStatus);
BENCHMARK(ErrorSchemeResult);
BENCHMARK(ErrorSchemeException);
BENCHMARK(ErrorSchemeNoErrorNoInline);
BENCHMARK(ErrorSchemeBoolNoInline);
BENCHMARK(ErrorSchemeStatusNoInline);
BENCHMARK(ErrorSchemeResultNoInline);
BENCHMARK(ErrorSchemeExceptionNoInline);

@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Oct 25, 2024
@bkietz
Copy link
Member Author

bkietz commented Oct 25, 2024

@ursabot please benchmark command=cpp-micro --suite-filter=type

@ursabot
Copy link

ursabot commented Oct 25, 2024

Benchmark runs are scheduled for commit f7d3fff. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be posted here when the runs are complete.

Copy link

Thanks for your patience. Conbench analyzed the 3 benchmarking runs that have been run so far on PR commit f7d3fff.

There were 4 benchmark results indicating a performance regression:

The full Conbench report has more details.

@github-actions github-actions bot added awaiting changes Awaiting changes and removed awaiting change review Awaiting change review labels Oct 30, 2024
@felipecrv
Copy link
Contributor

@pitrou @felipecrv which benchmark would be convincing as a demonstration this doesn't slow us down? I don't think we have an explicit status or result benchmark

Can you check the binary size impact with bloaty?

The performance impact is very tiny but everywhere a Status gets destroyed (all over the place). I think checking binary size is a good proxy to keeping the shape of the code the same.

cpp/src/arrow/status.cc Outdated Show resolved Hide resolved
cpp/src/arrow/status_internal.h Outdated Show resolved Hide resolved
@github-actions github-actions bot added awaiting change review Awaiting change review and removed awaiting changes Awaiting changes labels Nov 4, 2024
StatusConstant(StatusCode code, std::string msg,
std::shared_ptr<StatusDetail> detail = nullptr)
: state_{code, std::move(msg), std::move(detail), /*is_constant=*/true} {
ARROW_CHECK_NE(code, StatusCode::OK)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that ARROW_CHECK_NE is always executed even in non-debug builds, do you want that here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably unnecessary for constants; I was copying from Status::Status

cpp/src/arrow/status.h Outdated Show resolved Hide resolved
// We can't add context lines to a StatusConstant's state, so copy it now
state_ = new State{code(), message(), detail()};
}
const_cast<State*>(state_)->msg += ss.str();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did we make the state const if we need the const_cast here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this to satisfy myself that this line was the only place we ever mutate state_. I can remove the const now that I'm sure, but... since this line is the only place we ever mutate state_, I thought it'd be better to leave it so that in the future if we add another mutation we won't forget to CopyOnWrite any constants

@bkietz
Copy link
Member Author

bkietz commented Nov 4, 2024

Can you check the binary size impact with bloaty?

$ $ bloaty build/relwithdebinfo/libarrow.so -- main-build/relwithdebinfo/libarrow.so
    FILE SIZE        VM SIZE
 --------------  --------------
  +0.1% +14.2Ki  [ = ]       0    .debug_str
   +14% +1.17Ki  [ = ]       0    [Unmapped]
  +0.0% +1.14Ki  [ = ]       0    .strtab
  +0.0%    +250  +0.0%    +250    .dynstr
  [ = ]       0  +0.0%     +64    .bss
 -100.0%     +50 -100.0%     +30    [7 Others]
  -0.1%     -40  -0.1%     -40    .got.plt
  -0.0%     -72  -0.0%     -72    .dynsym
  -0.1%     -80  -0.1%     -80    .plt
  -0.1%    -120  -0.1%    -120    .rela.plt
  -0.0%    -288  [ = ]       0    .symtab
  -0.1%    -533  [ = ]       0    .debug_abbrev
  -1.2% -3.74Ki  -1.2% -3.74Ki    .gcc_except_table
  -0.4% -10.00Ki  [ = ]       0    .debug_str_offsets
  -1.3% -13.4Ki  -1.3% -13.4Ki    .eh_frame

  # This seems the most relevant data point; no significant change
  -0.6% -48.0Ki  -0.6% -48.0Ki    .text

  -2.2% -86.1Ki  [ = ]       0    .debug_addr
  -1.7%  -146Ki  [ = ]       0    .debug_line
  -3.4%  -164Ki  [ = ]       0    .debug_rnglists
  -2.1%  -347Ki  [ = ]       0    .debug_loclists
  -1.3%  -539Ki  [ = ]       0    .debug_info
  -1.2% -1.31Mi  -0.4% -65.1Ki    TOTAL```

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on the principle, see comments

@github-actions github-actions bot added awaiting changes Awaiting changes awaiting change review Awaiting change review and removed awaiting change review Awaiting change review awaiting changes Awaiting changes labels Nov 4, 2024
@bkietz
Copy link
Member Author

bkietz commented Nov 4, 2024

@ursabot please benchmark command=cpp-micro --suite-filter=type --benchmark-filter=Error

@ursabot
Copy link

ursabot commented Nov 4, 2024

Benchmark runs are scheduled for commit 86741d6. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be posted here when the runs are complete.

Copy link

Thanks for your patience. Conbench analyzed the 3 benchmarking runs that have been run so far on PR commit 86741d6.

There were 2 benchmark results indicating a performance regression:

The full Conbench report has more details.

@bkietz
Copy link
Member Author

bkietz commented Nov 4, 2024

Those regressions have got to be flukes; they don't even touch Status

@pitrou
Copy link
Member

pitrou commented Nov 4, 2024

Yes, the code changes probably trigger random variation in code placement in these benchmarks, that could explain the false regressions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants