Skip to content

BufWriter remove Option #59759

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed

Conversation

czipperz
Copy link
Contributor

@czipperz czipperz commented Apr 6, 2019

Closes #36319

See #36336 for previous discussion

@rust-highfive
Copy link
Contributor

r? @dtolnay

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 6, 2019
@czipperz
Copy link
Contributor Author

czipperz commented Apr 6, 2019

@frewsxcv

@rust-highfive
Copy link
Contributor

The job x86_64-gnu-llvm-6.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:0abad691:start=1554583822586491811,finish=1554583901692540650,duration=79106048839
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
$ export GCP_CACHE_BUCKET=rust-lang-ci-cache
Setting environment variables from .travis.yml
---
[00:04:12]    Compiling panic_unwind v0.0.0 (/checkout/src/libpanic_unwind)
[00:04:13] error[E0433]: failed to resolve: use of undeclared type or module `std`
[00:04:13]    --> src/libstd/io/buffered.rs:592:21
[00:04:13]     |
[00:04:13] 592 |                 use std::mem::{replace, uninitialized, forget};
[00:04:13] 
[00:04:14] error[E0425]: cannot find function `replace` in this scope
[00:04:14]    --> src/libstd/io/buffered.rs:593:29
[00:04:14]     |
[00:04:14]     |
[00:04:14] 593 |                 let inner = replace(&mut self.inner, unsafe { uninitialized() });
[00:04:14] help: possible candidates are found in other modules, you can import them into scope
[00:04:14]     |
[00:04:14] 3   | use core::mem::replace;
[00:04:14]     |
---
[00:04:14] 
[00:04:14] error[E0425]: cannot find function `uninitialized` in this scope
[00:04:14]    --> src/libstd/io/buffered.rs:593:63
[00:04:14]     |
[00:04:14] 593 |                 let inner = replace(&mut self.inner, unsafe { uninitialized() });
[00:04:14] help: a tuple struct with a similar name exists
[00:04:14]     |
[00:04:14]     |
[00:04:14] 593 |                 let inner = replace(&mut self.inner, unsafe { Initializer() });
[00:04:14] help: possible candidates are found in other modules, you can import them into scope
[00:04:14]     |
[00:04:14] 3   | use core::mem::uninitialized;
[00:04:14]     |
[00:04:14]     |
[00:04:14] 3   | use core::mem::uninitialized;
[00:04:14]     |
[00:04:14] 
[00:04:14] error[E0425]: cannot find function `replace` in this scope
[00:04:14]    --> src/libstd/io/buffered.rs:594:17
[00:04:14]     |
[00:04:14] 594 |                 replace(&mut self.buf, unsafe { uninitialized() });
[00:04:14] help: possible candidates are found in other modules, you can import them into scope
[00:04:14]     |
[00:04:14] 3   | use core::mem::replace;
[00:04:14]     |
---
[00:04:14] 
[00:04:14] error[E0425]: cannot find function `uninitialized` in this scope
[00:04:14]    --> src/libstd/io/buffered.rs:594:49
[00:04:14]     |
[00:04:14] 594 |                 replace(&mut self.buf, unsafe { uninitialized() });
[00:04:14] help: a tuple struct with a similar name exists
[00:04:14]     |
[00:04:14]     |
[00:04:14] 594 |                 replace(&mut self.buf, unsafe { Initializer() });
[00:04:14] help: possible candidates are found in other modules, you can import them into scope
[00:04:14]     |
[00:04:14] 3   | use core::mem::uninitialized;
[00:04:14]     |
---
[00:04:14]     |
[00:04:14] 3   | use core::mem::forget;
[00:04:14]     |
[00:04:14] 
[00:04:14] error: unused imports: `forget`, `replace`, `uninitialized`
[00:04:14]     |
[00:04:14]     |
[00:04:14] 592 |                 use std::mem::{replace, uninitialized, forget};
[00:04:14]     |
[00:04:14]     = note: `-D unused-imports` implied by `-D warnings`
[00:04:14] 
[00:04:15] error[E0308]: mismatched types
[00:04:15] error[E0308]: mismatched types
[00:04:15]    --> src/libstd/io/buffered.rs:529:35
[00:04:15]     |
[00:04:15] 529 |     pub fn get_ref(&self) -> &W { self.inner }
[00:04:15]     |                                   |
[00:04:15]     |                                   |
[00:04:15]     |                                   expected &W, found type parameter
[00:04:15]     |                                   help: consider borrowing here: `&self.inner`
[00:04:15]     = note: expected type `&W`
[00:04:15]                found type `W`
[00:04:15] 
[00:04:15] error[E0308]: mismatched types
[00:04:15] error[E0308]: mismatched types
[00:04:15]    --> src/libstd/io/buffered.rs:547:43
[00:04:15]     |
[00:04:15] 547 |     pub fn get_mut(&mut self) -> &mut W { self.inner }
[00:04:15]     |                                           |
[00:04:15]     |                                           |
[00:04:15]     |                                           expected &mut W, found type parameter
[00:04:15]     |                                           help: consider mutably borrowing here: `&mut self.inner`
[00:04:15]     = note: expected type `&mut W`
[00:04:15]                found type `W`
[00:04:15] 
[00:04:16] error: aborting due to 9 previous errors
---
travis_time:end:27f6e2cc:start=1554584169550828654,finish=1554584169556561933,duration=5733279
travis_fold:end:after_failure.3
travis_fold:start:after_failure.4
travis_time:start:215a192c
$ ln -s . checkout && for CORE in obj/cores/core.*; do EXE=$(echo $CORE | sed 's|obj/cores/core\.[0-9]*\.!checkout!\(.*\)|\1|;y|!|/|'); if [ -f "$EXE" ]; then printf travis_fold":start:crashlog\n\033[31;1m%s\033[0m\n" "$CORE"; gdb --batch -q -c "$CORE" "$EXE" -iex 'set auto-load off' -iex 'dir src/' -iex 'set sysroot .' -ex bt -ex q; echo travis_fold":"end:crashlog; fi; done || true
travis_fold:end:after_failure.4
travis_fold:start:after_failure.5
travis_time:start:04ec41fa
travis_time:start:04ec41fa
$ cat ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers || true
cat: ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers: No such file or directory
travis_fold:end:after_failure.5
travis_fold:start:after_failure.6
travis_time:start:034a844

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@czipperz
Copy link
Contributor Author

czipperz commented Apr 6, 2019

Current bench results:

With option

|   Trial | construct | large write | small | multiple small |
|---------+-----------+-------------+-------+----------------|
|       1 |        81 |          66 |    83 |            134 |
|       2 |        81 |          67 |    83 |            134 |
|       3 |        82 |          67 |    82 |            134 |
|       4 |        82 |          68 |    82 |            134 |
|       5 |        82 |          68 |    86 |            135 |
| Average |      81.6 |        67.2 |  83.2 |          134.2 |

Without option

replace / uninitialized

|   Trial | construct | large write | small | multiple small |
|---------+-----------+-------------+-------+----------------|
|       1 |        80 |          65 |    80 |            131 |
|       2 |        79 |          65 |    79 |            132 |
|       3 |        80 |          66 |    80 |            133 |
|       4 |        80 |          66 |    80 |            133 |
|       5 |        81 |          65 |    81 |            134 |
| Average |      80.0 |        65.6 |  80.0 |          132.6 |

read / drop_in_place

|   Trial | construct | large write | small | multiple small |
|---------+-----------+-------------+-------+----------------|
|       1 |        80 |          66 |    81 |            134 |
|       2 |        81 |          66 |    80 |            134 |
|       3 |        82 |          67 |    80 |            132 |
|       4 |        80 |          67 |    85 |            134 |
|       5 |        80 |          66 |    81 |            132 |
| Average |      80.6 |        66.4 |  81.4 |          132.2 |

@rust-highfive
Copy link
Contributor

The job x86_64-gnu-llvm-6.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:1425fc38:start=1554584592395938786,finish=1554584672686721936,duration=80290783150
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
$ export GCP_CACHE_BUCKET=rust-lang-ci-cache
Setting environment variables from .travis.yml
---
[00:03:54]    Compiling panic_unwind v0.0.0 (/checkout/src/libpanic_unwind)
[00:03:57] error[E0308]: mismatched types
[00:03:57]    --> src/libstd/io/buffered.rs:529:35
[00:03:57]     |
[00:03:57] 529 |     pub fn get_ref(&self) -> &W { self.inner }
[00:03:57]     |                                   |
[00:03:57]     |                                   |
[00:03:57]     |                                   expected &W, found type parameter
[00:03:57]     |                                   help: consider borrowing here: `&self.inner`
[00:03:57]     = note: expected type `&W`
[00:03:57]                found type `W`
[00:03:57] 
[00:03:57] error[E0308]: mismatched types
[00:03:57] error[E0308]: mismatched types
[00:03:57]    --> src/libstd/io/buffered.rs:547:43
[00:03:57]     |
[00:03:57] 547 |     pub fn get_mut(&mut self) -> &mut W { self.inner }
[00:03:57]     |                                           |
[00:03:57]     |                                           |
[00:03:57]     |                                           expected &mut W, found type parameter
[00:03:57]     |                                           help: consider mutably borrowing here: `&mut self.inner`
[00:03:57]     = note: expected type `&mut W`
[00:03:57]                found type `W`
[00:03:57] 
[00:03:58] error: aborting due to 2 previous errors
---
travis_time:end:03b7b6c4:start=1554584920996306550,finish=1554584921001820522,duration=5513972
travis_fold:end:after_failure.3
travis_fold:start:after_failure.4
travis_time:start:27a0a502
$ ln -s . checkout && for CORE in obj/cores/core.*; do EXE=$(echo $CORE | sed 's|obj/cores/core\.[0-9]*\.!checkout!\(.*\)|\1|;y|!|/|'); if [ -f "$EXE" ]; then printf travis_fold":start:crashlog\n\033[31;1m%s\033[0m\n" "$CORE"; gdb --batch -q -c "$CORE" "$EXE" -iex 'set auto-load off' -iex 'dir src/' -iex 'set sysroot .' -ex bt -ex q; echo travis_fold":"end:crashlog; fi; done || true
travis_fold:end:after_failure.4
travis_fold:start:after_failure.5
travis_time:start:1cd10ffc
travis_time:start:1cd10ffc
$ cat ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers || true
cat: ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers: No such file or directory
travis_fold:end:after_failure.5
travis_fold:start:after_failure.6
travis_time:start:12f70974

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

use crate::mem::forget;
use crate::ptr::{drop_in_place, read};
let inner = unsafe { read(&mut self.inner) };
unsafe { drop_in_place(&mut self.buf) };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice that if you have a global memory allocator that can panic in dealloc, a panic here would turn into a double-free of inner and self.inner. The GlobalAlloc trait is currently documented as:

  • It's undefined behavior if global allocators unwind. This restriction may be lifted in the future, but currently a panic from any of these functions may lead to memory unsafety.

so this seems fine for now but needs at least a comment why the code is safe despite appearances.

Ok(()) => {
use crate::mem::forget;
use crate::ptr::{drop_in_place, read};
let inner = unsafe { read(&mut self.inner) };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach of ~manually running the drop glue for all of the other parts of the struct seems a bit dangerous and error prone. Using ManuallyDrop::take seems like it'd be cleaner: https://doc.rust-lang.org/std/mem/struct.ManuallyDrop.html?search=#method.take

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are you suggesting applying ManuallyDrop::take in this context?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you would need the struct to contain:

    inner: ManuallyDrop<W>,
    buf: ManuallyDrop<Vec<u8>>,

where you call ManuallyDrop::drop(&mut self.buf); ManuallyDrop::drop(&mut self.inner) inside Drop and ManuallyDrop::drop(&mut self.buf); ManuallyDrop::take(&mut self.inner) inside into_inner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like that implementation at all. The entire point of this change is to isolate this edge case into one function rather than spreading it throughout the entire struct.

Copy link
Contributor Author

@czipperz czipperz Apr 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps he meant putting self into a ManuallyDrop to prevent these double free issues? That seems reasonable:

use crate::mem::ManuallyDrop;
use crate::ptr::{drop_in_place, read};
let mut s = ManuallyDrop::new(self);
let inner = unsafe { read(&mut s.inner) };
unsafe { drop_in_place(&mut s.buf) };
Ok(inner)

@dtolnay
Copy link
Member

dtolnay commented Apr 6, 2019

I don't think that the 1-2% difference in microbenchmark (when measuring with io::sink, not even doing real I/O) is worth the added complexity of this change. What do you think @czipperz? Were you counting on a greater improvement?

@czipperz
Copy link
Contributor Author

czipperz commented Apr 6, 2019

I think micro optimizations such as this are generally good so long as they don't break correctness.

This is a complex problem and must have a complex solution. The existing solution is to special case into_inner by use of Option to mark whether it has been called or not. This change simply shifts that complexity into the into_inner function rather than the rest of the code base. I don't think the new implementation of into_inner is very complex, either. Do you disagree?

@sfackler
Copy link
Member

sfackler commented Apr 6, 2019

I think micro optimizations such as this are generally good so long as they don't break correctness.

New micro-optimizations impose an increased maintenance burden, correctness risk, and even soundness risk when they contain unsafe code. Because of that, I prefer to limit them to solve problems that actually exist. Has anyone actually traced a concrete performance issue to the use of Option in BufReader?

@czipperz
Copy link
Contributor Author

czipperz commented Apr 6, 2019

I used the word generally because I agree, there definitely are optimizations that abuse unsafe code and add complexity. However, in this specific case, I don't believe this is the case.

Suppose that the current implementation didn't have the into_inner method. It can still have the get_ref and get_mut methods to access the underlying writer, but didn't specifically implement this functionality. Would the rest of the code be as complex? Would we be using an Option<Write> instead of straight Write? If we were at this state and want to add this method, I don't think it would be incredibly contentious to add a bit of unsafe code to handle this edge case. I think it makes much more sense than redoing the entire BufWriter to ease this one concern.

If I am reading through BufWriter, it is not intuitive why we are using an optional inner writer. It isn't documented anywhere and isn't clear from the code. It seems like every method just unwraps it. I would like to see the complexity of the option eliminated from other functions. I'm going to submit another PR that uses self.get_mut to remove the redundant unwraps.

@sfackler
Copy link
Member

sfackler commented Apr 6, 2019

I don't think you've answered my question.

@czipperz
Copy link
Contributor Author

czipperz commented Apr 6, 2019

Has anyone actually traced a concrete performance issue to the use of Option in BufReader?

I highly doubt it. And I agree, making a change to unsafe code to purely increase performance is bad. Especially when there is such a small gain. However, I also think the added complexity of smearing the Option around the rest of the structure is bad. It's really a question of which is worse.

@sfackler
Copy link
Member

sfackler commented Apr 6, 2019

The Option layer here was added over 5 years ago. It's not "added complexity", it's the status quo!

There are reasons to add unsafe code to the standard library. A reader of the source code being confused for a couple minutes over why there's an Option<W> rather than a W is not one of them.

@czipperz
Copy link
Contributor Author

czipperz commented Apr 6, 2019

Just to be clear, I'm coming to rust from c/c++ where manual memory management like this is normalized. So I'm not very scared of it personally.

@czipperz
Copy link
Contributor Author

czipperz commented Apr 6, 2019

The Option layer here was added over 5 years ago. It's not "added complexity", it's the status quo!

Irregardless of whether it is common or not, I would still argue it is more complex than needs be. I also think that the panicked flag adds complexity. This is a complex problem and must have a complex solution.

@sfackler
Copy link
Member

sfackler commented Apr 6, 2019

Just to be clear, I'm coming to rust from c/c++ where manual memory management like this is normalized. So I'm not very scared of it personally.

Rust exists because the past 30 years have demonstrated that human beings are not capable of writing large, complex codebases in C and C++ that manage memory correctly.

@czipperz
Copy link
Contributor Author

czipperz commented Apr 7, 2019

I agree. Its one of the reasons I moved to rust. Thanks for the feedback @sfackler .
Are there other places around Rust where we use this sort of pattern?

@czipperz czipperz closed this Apr 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants