`Enumerable` method to find the first truthy block result and return that result #14879

jgaskins · 2024-08-07T18:35:01Z

Feature Request

Is your feature request related to a problem? Please describe clearly and concisely what is it.

I have a frequent need for something that is similar to Enumerable#find, but rather than returning the element for which the block is truthy, it returns that truthy block result itself.

This comes up when I have an array of objects that may or may not have a property set. The most recent need for this came up when implementing tool use for LLM APIs. The API response has an array of choices, each with a message, and those messages may or may not have a tool_calls array.

record Response, choices : Array(Choice)
record Choice, message : Message
record Message, tool_calls : Array(ToolCall)?
record ToolCall

response = Response.new([Choice.new(Message.new([ToolCall.new]))])

Describe the feature you would like, optionally illustrated by examples, and how it will solve the above problem.

What I've been using in my code to make this more expressive is a monkeypatch on Enumerable called find_first:

if tool_calls = response.choices.find_first &.message.tool_calls
  # find the tools for these calls, execute them, and send the results back to the API
end

Describe considered alternative solutions, and the reasons why you have not proposed them as a solution here.

Currently, as far as I can tell, at least, the only ways to do this involve either executing that same criteria again:

if choice_with_tool_call = response.choices.find &.message.tool_calls
  # the method chain for choice.message.tool_calls has to be repeated
  tool_calls = choice_with_tool_call
    .message
    .tool_calls
end

if tool_calls
  # find the tools for these calls, execute them, and send the results back to the API
end

Or this format with the block expanded

# or this
response.choices.each do |choice|
  if tool_calls = choice.message.tool_calls
    # find the tools for these calls, execute them, and send the results back to the API
    break
  end
end

These are both a lot more verbose, especially when there are multiple layers of these arrays. Having a method that returns the truthy value lets us use the shorthand block syntax and have a single expression that does all of these things smoothly.

Does it break backward compatibility, if yes then what's the migration path?

Nope. Purely additive.

The text was updated successfully, but these errors were encountered:

jgaskins · 2024-08-07T18:36:36Z

To be clear, I'm not attached to the find_first name. It's the name I've been using only because I haven't come up with anything better.

Blacksmoke16 · 2024-08-07T18:52:05Z

response = Response.new([Choice.new(Message.new(nil)), Choice.new(Message.new([ToolCall.new, ToolCall.new]))])

if tool_calls = response.choices.each { |c| tc = c.message.tool_calls; break tc if tc }
  pp tool_calls # => [ToolCall(), ToolCall()]
end

This is best I could come up with without introducing something new. As reference only.

yxhuvud · 2024-08-07T18:54:19Z

It is still slightly clunky, but at least it doesn't involve break:

[1, 2, 3].each.map { |z| evaluate(z) }.find &.itself

If it is fine to find the first not-nil-value, then
[1, 2, 3].each.compact_map { |z| evaluate(z) }.first
is even better

HertzDevil · 2024-08-07T19:53:52Z

Ruby doesn't have it, but in Elixir it is called Enum.find_value/3

straight-shoota · 2024-08-13T06:39:13Z

Sounds good to me. I'd prefer the name find_value over find_first.

jgaskins · 2024-08-13T19:18:40Z

In this comment, @Sija brought up an interesting point I hadn't considered. I normally use this for nil checks (and even then only use shorthand blocks), but are there any use cases for false triggering an early return?

Part of me wants to push back against it because I don't know of any other Enumerable methods where that is the case, but also I don't have any stake at all in how false is treated, so I thought I'd bring it here for discussion.

Sija · 2024-08-13T19:23:16Z

Elixir's find_value skips over both falsy values: nil and false, just FTR.

yxhuvud · 2024-08-14T06:06:46Z

I would argue this method should have the same semantics as find wrt nil vs falsey.

oprypin · 2024-11-02T14:25:50Z

I came here to share this observation, but @yxhuvud already did:

Yes, this is the same as:

[1, 2, 3].each.map { |x| evaluate(x) }.find &.itself

~~But the issue is that map materializes the whole array! If it didn't, then we could push back against adding this method.~~

And it is made even more clunky by the fact that .each. is necessary there. If you forget to add it then map would materialize the whole array.

oprypin · 2024-11-02T14:29:12Z

Also fun fact:
These methods map and to_a are exact aliases of each other

crystal/src/enumerable.cr

Line 2007 in 4aac6f2

def to_a(& : T -> U) : Array(U) forall U
crystal/src/enumerable.cr

Line 1016 in 4aac6f2

def map(& : T -> U) : Array(U) forall U

(which makes me wish that map was lazy even more)

jgaskins · 2024-11-02T18:04:21Z

Not only is the method chain clunkier, but since it also involves a heap allocation (Iterators have state), it's also slower:

With tool calls
                find_value   2.09M (477.64ns) (± 1.45%)    0.0B/op        fastest
each.compact_map(&).first? 134.17k (  7.45µs) (± 0.85%)  31.2kB/op  15.60× slower
each.map(&).find(&.itself) 128.57k (  7.78µs) (± 1.45%)  31.2kB/op  16.28× slower

Without tool calls
                find_value   1.57M (638.94ns) (± 5.47%)    0.0B/op        fastest
each.compact_map(&).first? 132.08k (  7.57µs) (± 1.62%)  31.2kB/op  11.85× slower
each.map(&).find(&.itself) 111.00k (  9.01µs) (± 0.92%)  31.2kB/op  14.10× slower

Benchmark code

require "benchmark"

response_with_tool_calls = Response.new([Choice.new(Message.new([ToolCall.new]))])
response_without_tool_calls = Response.new([Choice.new(Message.new(nil))])

# Need to run the block multiple times because each iteration for find_value
# takes less time than the monotonic clock can measure
iterations = 1_000
[response_with_tool_calls, response_without_tool_calls].each do |response|
  tool_calls = response.choices.find_value &.message.tool_calls
  puts "#{tool_calls ? "With" : "Without"} tool calls"
  Benchmark.ips do |x|
    x.report "find_value" { iterations.times { tool_calls = response.choices.find_value &.message.tool_calls } }
    x.report "each.compact_map(&).first?" { iterations.times { tool_calls = response.choices.each.compact_map(&.message.tool_calls).first? } }
    x.report "each.map(&).find(&.itself)" { iterations.times { tool_calls = response.choices.each.map(&.message.tool_calls).find &.itself } }
  end
  puts

  # Side effect so LLVM won't optimize out the benchmark
  pp tool_calls unless tool_calls
end

record Response, choices : Array(Choice)
record Choice, message : Message
record Message, tool_calls : Array(ToolCall)?
record ToolCall

module Enumerable(T)
  # Same implementation as https://github.com/crystal-lang/crystal/pull/14893
  def find_value(if_none = nil, & : T ->)
    each do |i|
      if result = yield i
        return result
      end
    end
    if_none
  end
end

jgaskins added the kind:feature label Aug 7, 2024

Blacksmoke16 added the topic:stdlib:collection label Aug 7, 2024

jgaskins mentioned this issue Aug 13, 2024

Add Enumerable#find_value #14893

Merged

straight-shoota closed this as completed in #14893 Nov 5, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`Enumerable` method to find the first truthy block result and return that result #14879

`Enumerable` method to find the first truthy block result and return that result #14879

jgaskins commented Aug 7, 2024

jgaskins commented Aug 7, 2024

Blacksmoke16 commented Aug 7, 2024

yxhuvud commented Aug 7, 2024 •

edited

Loading

HertzDevil commented Aug 7, 2024

straight-shoota commented Aug 13, 2024 •

edited

Loading

jgaskins commented Aug 13, 2024

Sija commented Aug 13, 2024

yxhuvud commented Aug 14, 2024

oprypin commented Nov 2, 2024 •

edited

Loading

oprypin commented Nov 2, 2024

jgaskins commented Nov 2, 2024

Enumerable method to find the first truthy block result and return that result #14879

Enumerable method to find the first truthy block result and return that result #14879

Comments

jgaskins commented Aug 7, 2024

Feature Request

Is your feature request related to a problem? Please describe clearly and concisely what is it.

Describe the feature you would like, optionally illustrated by examples, and how it will solve the above problem.

Describe considered alternative solutions, and the reasons why you have not proposed them as a solution here.

Does it break backward compatibility, if yes then what's the migration path?

jgaskins commented Aug 7, 2024

Blacksmoke16 commented Aug 7, 2024

yxhuvud commented Aug 7, 2024 • edited Loading

HertzDevil commented Aug 7, 2024

straight-shoota commented Aug 13, 2024 • edited Loading

jgaskins commented Aug 13, 2024

Sija commented Aug 13, 2024

yxhuvud commented Aug 14, 2024

oprypin commented Nov 2, 2024 • edited Loading

oprypin commented Nov 2, 2024

jgaskins commented Nov 2, 2024

`Enumerable` method to find the first truthy block result and return that result #14879

`Enumerable` method to find the first truthy block result and return that result #14879

yxhuvud commented Aug 7, 2024 •

edited

Loading

straight-shoota commented Aug 13, 2024 •

edited

Loading

oprypin commented Nov 2, 2024 •

edited

Loading