-
Notifications
You must be signed in to change notification settings - Fork 419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Interleave to prevent eager MoveNext calls #696
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private extension method is useless here
Remove useless extension method signature on a private method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The TestInterleaveDoNoCallMoveNextEagerly
doesn't prove what it intends to. Both your sample sequences are one element long and you just throw for the retrieval of the element in the second sequence. That doesn't test for MoveNext
being called eagerly. That would have looked as follows:
[Test]
public void TestInterleaveDoNoCallMoveNextEagerly()
{
void Code()
{
var sequenceA = Enumerable.Range(1, 1);
var sequenceB = MoreEnumerable.From(() => 2, () => throw new TestException());
sequenceA.Interleave(sequenceB).Take(1).Consume();
}
Assert.DoesNotThrow(Code);
}
And it succeeds with the current implementation (without the changes in this PR).
I fixed your test in 61e4921, used the origin
|
Exactly, the code asks for one element and this element came from the first sequence: MoreLINQ/MoreLinq.Test/InterleaveTest.cs Line 49 in c9aa710
It throw for the retrieval of the first element in the second sequence because |
Right, I've reverted your originally proposed test and fix with 8a972a9 and will review. |
MoreLinq/Interleave.cs
Outdated
try | ||
{ | ||
// First pass. create enumerators. | ||
foreach (var sequence in sequences) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have Acquire
for this so I would take advantage of it here rather than roll out a duplicate implementation that needs separate testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIU, Acquire
do not return a IDisposable
so Interleave
keep the responsibility of disposing the enumerators when the result sequence enumerator is disposed.
Achieve this with a using
for each sequence is not feasible since the number of using
is dynamic.
The only way I know to achieve this in a yield
context is to use some try {} finally {}
blocks. Because the finally
block is called when the enumerator is disposed.
We indeed can put this behavior in a specialized IDisposable
class. When disposed, this object dispose a bunch of elements.
DisposableGroup<T>
(declared in SortedMerge and only used here) looks to achieve this, but DisposableGroup<T>
internally use a list where I use a LinkedList
.
(Rem: DisposableGroup<T>
constructor doesn't dispose the already acquired object in case of error in the input sequence, like Acquire
do. But anyway, the only usage of DisposableGroup<T>
use Acquire
).
I have chosen to use a LinkedList
since Remove
operation is O(1) there.
If we use a list or an array, where the the Remove
operation is O(N), enumerating N sequences of 1 element will cost O(N²).
I know that it is an uncommon use case (it may be less uncommon with #697), anyway this implementation manage it.
Rem: Because of DisposableGroup<T>
implementation SortedMerge
is O(N²) in the case described above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIU,
Acquire
do not return aIDisposable
It would return an array of enumerators that are disposable. Question is whether there is something to be gained out of not calling GetEnumerator()
on other sequences until you get to them, but that will only optimize the case of someone eventually taking fewer elements than sequences!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Acquire
will not remove the necessity of a try {} finally {}
(with disposal of the enumerators in the finally block), and then is just an alias of ToArray()
.
- An array is not optimal here.
- If
Acquire
is used and if an error occur while acquiring (the only purpose ofAcquire
is to manage this case), elements will be disposed twice, which is useless.
So Acquire
here is not optimal and useless.
A ToDisposableLinkedList()
method that have the functionality of Acquire
(on error, dispose already acquired elements) and provide a DisposableLinkedList
that have the functionality of DisposableGroup<T>
(being disposable and dispose its content when disposed) is the way to go for Interleave
, SortedMerge
and Transpose
.
It will be used like that:
using var enumerators = source.Select(e => e.GetEnumerator()).ToDisposableLinkedList();
The using
here allow to remove the try {} finally {}
from here:
MoreLINQ/MoreLinq/Transpose.cs
Lines 92 to 96 in 9fc8486
finally | |
{ | |
foreach (var e in enumerators) | |
e?.Dispose(); | |
} |
and here:
MoreLINQ/MoreLinq/Interleave.cs
Lines 141 to 146 in 9fc8486
finally | |
{ | |
Debug.Assert(iteratorList != null || iterators != null); | |
foreach (var iter in (iteratorList ?? (IList<IEnumerator<T>>) iterators)) | |
iter.Dispose(); | |
} |
and improve the DisposableGroup/Acquire
combo from here:
MoreLINQ/MoreLinq/SortedMerge.cs
Line 101 in 9fc8486
using var disposables = new DisposableGroup<TSource>(sequences.Select(e => e.GetEnumerator()).Acquire()); |
I opened #724 to refactor Interleave
, SortedMerge
and Transpose
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Acquire
will not remove the necessity of atry {} finally {}
(with disposal of the enumerators in the finally block), and then is just an alias ofToArray()
.
Acquire
itself does not need to be inside a try-finally block and so it's not the same as ToArray()
. It doesn't mean it prevents you from having one. It is, for example, perfectly safe to return the array from Acquire
as the return value of a function:
static Stream[] OpenFiles(params string[] paths) =>
paths.Select(p => File.OpenRead(p)).Acquire();
Of course, the disposal becomes the responsibility of the caller and will need a try-catch. It does prevent that.
- If
Acquire
is used and if an error occur while acquiring (the only purpose ofAcquire
is to manage this case), elements will be disposed twice, which is useless.
It is useless but it's not a problem since it is a requirement for IDisposable.Dispose
to be idempotent (per docs):
If an object's Dispose method is called more than once, the object must ignore all calls after the first one.
I have chosen to use a
LinkedList
sinceRemove
operation is O(1) there.
That's a fine choice but I am not debating the most optimal solution first. My point with Acquire
was that it's carefully designed and tested and so might be best to keep using it instead of putting everything into question. This PR was about fixing one problem but by trying to be cleverly optimal at the same time, it may introduce others bugs (and I believe it does)! This means it takes longer to review as the scope of change expands. Instead, I find that it helps to make it work, make it right and then make it fast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@atifaziz with the implementation using Acquire
in this PR (currently at a7f2c6f), calls to GetEnumerator
are made eagerly and the code below throw:
var sequenceA = Enumerable.Range(1, 1);
var sequenceB = new BreakingSequence<int>();
sequenceA.Interleave(sequenceB).Take(1).Consume();
I don't know it it's acceptable
It doesn't throw in #726.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking much better. Thanks!
PS I've been busy on another project and needed to put this aside for a bit to have focus.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome and thank you!
This PR addresses #694