Skip to content

Array.Parallel - search functions (tryFindIndex,tryFind,tryPick) #14827

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Mar 14, 2023

Conversation

T-Gro
Copy link
Member

@T-Gro T-Gro commented Feb 28, 2023

This is second part of the RFC to expand the Array.Parallel module, based on approved suggestion

This PR adds single-item search functions tryFindIndex,tryFind,tryPick, with the fulfilled property of returning first element (by overall index, not just time-wise) found.
This is accomplished using the Parallel.For function, and it's ability to terminate early (ParallelLoopState.Break()) and retrieve the first index where this occurred via LowestBreakIteration.
Calling .Break() (as opposed to immediate .Stop()) ensures that all indexes lower than the terminating one are still processed, but higher ones will be skipped if not started already.

This was done to ensure the behavior is consistent with the same functions in Array (non-parallel) module.

@vzarytovskii
Copy link
Member

Wondering if the APIs should be marked as Experimental, in case if we'll want to change implementation details later.

@T-Gro
Copy link
Member Author

T-Gro commented Mar 1, 2023

As of now, the usage of [<Experimental>] within Fsharp.Core seems to be related only to language features (e.g. reverse index methods bound to this feature), I do not see a precedence for using it on plain API additions.

That being sad, I have no problem of putting it in.

@vzarytovskii
Copy link
Member

As of now, the usage of [<Experimental>] within Fsharp.Core seems to be related only to language features (e.g. reverse index methods bound to this feature), I do not see a precedence for using it on plain API additions.

That being sad, I have no problem of putting it in.

I think we may do it for more complex functions? Like sorting, so we can easily change implementation details of it? Wdyt?

@T-Gro
Copy link
Member Author

T-Gro commented Mar 2, 2023

I would for sure use it if the public API itself is potentially subject to change (not the case here, yet). That was actually my current view of it - `[] for the existence and shape of the API itself.

For implementation details (e.g. how much threads it uses, in what scenarios it can be harmful compared to non-parallel code) - maybe this attribute is not the best fit, but I guess it's a good indication of something new and to raise awareness.

Last but not least, it's good to have it if we want to have some space of removing it altogether for whatever reason.
(e.g. it will not get used and only increase code size)

=> I am now more inclined of putting it in place, especially for the last reason.

@T-Gro T-Gro marked this pull request as ready for review March 9, 2023 14:00
@T-Gro T-Gro requested a review from a team as a code owner March 9, 2023 14:00
@T-Gro T-Gro enabled auto-merge (squash) March 9, 2023 14:01
@T-Gro T-Gro merged commit b87ca33 into dotnet:main Mar 14, 2023
@T-Gro T-Gro deleted the 14217-arrayparallel-searchfunctions branch March 14, 2023 16:33
vzarytovskii pushed a commit that referenced this pull request Mar 14, 2023
* Cache parsing results in incremental build (#14852)

* First take on the F# telemetry (#14889)

* Array.Parallel - search functions (tryFindIndex,tryFind,tryPick) (#14827)

* Searching functions for Array.Parallel added

* with [<Experimental("Experimental library feature, requires '--langversion:preview'")>] annotation to give us space to change/remove this API in the future if needed

* Add `GraphNode.FromResult` (#14894)

* Add GraphNode.FromResult

* fantomas

* Fix missing reference (#14892)

* Fix missing reference

* undo whitespace change

---------

Co-authored-by: Jakub Majocha <[email protected]>
Co-authored-by: Petr <[email protected]>
Co-authored-by: Tomas Grosup <[email protected]>
Co-authored-by: Petr Pokorny <[email protected]>
Co-authored-by: Kevin Ransom (msft) <[email protected]>
Comment on lines +1937 to +1954
[<CompiledName("TryFindIndex")>]
let tryFindIndex predicate (array: _[]) =
checkNonNull "array" array

let pResult =
Parallel.For(
0,
array.Length,
(fun i pState ->
if predicate array[i] then
pState.Break())
)

pResult.LowestBreakIteration |> Option.ofNullable |> Option.map int

[<CompiledName("TryFind")>]
let tryFind predicate (array: _[]) =
array |> tryFindIndex predicate |> Option.map (fun i -> array[i])
Copy link
Contributor

@brianrourkeboll brianrourkeboll Mar 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that it would matter most of the time (especially since this is Array.Parallel and not regular Array, and allocations are probably gonna happen anyway), but given that this is standard library code that probably won't change very often, it might not hurt to avoid the extra allocations here123:

Suggested change
[<CompiledName("TryFindIndex")>]
let tryFindIndex predicate (array: _[]) =
checkNonNull "array" array
let pResult =
Parallel.For(
0,
array.Length,
(fun i pState ->
if predicate array[i] then
pState.Break())
)
pResult.LowestBreakIteration |> Option.ofNullable |> Option.map int
[<CompiledName("TryFind")>]
let tryFind predicate (array: _[]) =
array |> tryFindIndex predicate |> Option.map (fun i -> array[i])
let inline tryFindIndexAux predicate (array: _ array) =
checkNonNull (nameof array) array
let pResult =
Parallel.For(
0,
array.Length,
(fun i pState ->
if predicate array[i] then
pState.Break())
)
pResult.LowestBreakIteration
[<CompiledName("TryFindIndex")>]
let tryFindIndex predicate (array: _ array) =
let i = tryFindIndexAux predicate array
if i.HasValue then Some (int (i.GetValueOrDefault()))
else None
[<CompiledName("TryFind")>]
let tryFind predicate (array: _ array) =
let i = tryFindIndexAux predicate array
if i.HasValue then Some array[int (i.GetValueOrDefault()]
else None

Footnotes

  1. 2 for tryFindIndex: one Some from Option.ofNullable and another from Option.map.

  2. 3 for tryFind: one Some from Option.ofNullable, another from the first Option.map, and a third from the outer Option.map.

  3. Or maybe I've just read too many Stephen Toub PRs 🙃.


[<CompiledName("TryPick")>]
let tryPick chooser (array: _[]) =
checkNonNull "array" array
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Someday, we should add a nullArg (nameof arg) codefix :)

Suggested change
checkNonNull "array" array
checkNonNull (nameof array) array

kant2002 pushed a commit to kant2002/fsharp that referenced this pull request Apr 1, 2023
…net#14827)

* Searching functions for Array.Parallel added

* with [<Experimental("Experimental library feature, requires '--langversion:preview'")>] annotation to give us space to change/remove this API in the future if needed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants