Skip to content

improve construction of <:Tuple types from iterators #53265

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

nsajko
Copy link
Contributor

@nsajko nsajko commented Feb 9, 2024

Make tuple construction from overlong iterators throw instead of truncating the input iterator.

Trying to construct a tuple from an infinite iterator now throws an ArgumentError instead of succeeding or throwing a MethodError.

Fixes #40495

Fixes #52657

Make tuple construction from known constant-length iterators inferrable and prevent allocation.

Fixes #52993

Compute more accurate tuple element types in many cases using typeintersect, with a fallback to the old behavior. This prevents some cases of convert throwing due to ambiguity.

Instead of just calling typeintersect on the tuple types, do typeintersect again once fieldtype computes the element type.

Fixes #53181

Separate the tuple construction from tuple element conversion, allowing dropping the tuple_type_tail dependency.

Fixes #53182

@nsajko nsajko added types and dispatch Types, subtyping and method dispatch bugfix This change fixes an existing bug collections Data structures holding multiple items, e.g. sets performance Must go faster labels Feb 9, 2024
@nsajko nsajko force-pushed the tuple_from_iterator branch 4 times, most recently from aaf9ebd to d4127a7 Compare February 9, 2024 20:40
Make tuple construction from overlong iterators throw instead of
truncating the input iterator.

Trying to construct a tuple from an infinite iterator now throws an
`ArgumentError` instead of succeeding or throwing a `MethodError`.

Fixes JuliaLang#40495

Fixes JuliaLang#52657

Make tuple construction from known constant-length iterators inferrable
and prevent allocation.

Fixes JuliaLang#52993

Compute more accurate tuple element types in many cases using
`typeintersect`, with a fallback to the old behavior. This prevents
some cases of `convert` throwing due to ambiguity.

Instead of just calling `typeintersect` on the tuple types, do
`typeintersect` again once `fieldtype` computes the element type.

Fixes JuliaLang#53181

Separate the tuple construction from tuple element conversion, allowing
dropping the `tuple_type_tail` dependency.

Fixes JuliaLang#53182
@nsajko nsajko force-pushed the tuple_from_iterator branch from d4127a7 to 039fc1b Compare February 9, 2024 21:12
@@ -196,6 +197,7 @@ end

# core operations & types
include("promotion.jl")
include("tuple_from_iterator.jl")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this really need to exist this early? The code would be a lot cleaner if it could be moved to after int (so you can use +)


function _totuple_err(@nospecialize T)
@noinline
throw(ArgumentError("too few or too many elements for tuple type $T"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this use a LazyString?

end

function iterator_to_ntuple(::Type{T}, iter) where {len,T<:NTuple{len,Any}}
if len < 100
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100 seems pretty high for this. Does the recursive version really infer that well?

Comment on lines +65 to +83
AB = typeintersect(A, B)
BA = typeintersect(B, A)
AC = typeintersect(A, C)
CA = typeintersect(C, A)
BC = typeintersect(B, C)
CB = typeintersect(C, B)

AB_C = typeintersect(AB, C)
C_AB = typeintersect(C, AB)
BA_C = typeintersect(BA, C)
C_BA = typeintersect(C, BA)
AC_B = typeintersect(AC, B)
B_AC = typeintersect(B, AC)
CA_B = typeintersect(CA, B)
B_CA = typeintersect(B, CA)
BC_A = typeintersect(BC, A)
A_BC = typeintersect(A, BC)
CB_A = typeintersect(CB, A)
A_CB = typeintersect(A, CB)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't you simplify this to something like

type_intersect_exact(C, type_intersect_exact(A,B))
type_intersect_exact(B, type_intersect_exact(A,C))
type_intersect_exact(A, type_intersect_exact(B,C))

A_CB = typeintersect(A, CB)

(
A, B, C,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure A, B, C do not need to be considered because if one of them was valid, it would be successfully found as the intersection.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're saying that typeintersect(A, B) can be relied on to return the "smaller" type of the two when they're ordered? Maybe you're right, but that's not documented, and I didn't read the C implementation, so I don't know if we can rely on that. If we can rely on that behavior of typeintersect, perhaps we should document it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're saying that typeintersect(A, B) can be relied on to return the "smaller" type of the two when they're ordered?

yes. If it didn't, it would be doing a really bad job of returning an intersection.


(
A, B, C,
AB, BA, AC, CA, BC, CB,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here.

When `OK <: ok`, `R` is the exact type intersection. When `ok <: Union{}`, the
exact type intersection wasn't found and `R` has no significance.
"""
struct Result{R, ok<:OK} end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that this is implementing an Option type seems pretty sketchy to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is it sketchy? It just seems like an elegant encoding to me?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is to let the caller know if the exact intersection couldn't be found.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is, the idea is to enable the implementation of get_result.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's relatively sensible, the weird parts are that

  1. the OK value is a type? Why not a bool?
  2. Implimenting a custom option type specifically for this functionality seems somewhat weird. Why is an option type inherently embedded in the implementation of a typeintersect wrapper?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it may be elegant, but something like that should be a separate PR, and not live under TypeIntersectExact namespace.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I think about this again, OK isn't necessary at all, and thus Result is also unnecessary: I think it would be better to just return a Type when the search was successful and nothing when the search failed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adienes what do you mean by "not live under TypeIntersectExact namespace"? Do you object to the name, or to the addition of a namespace?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a suggestion for a better name?

is_exact = (T <: A) && (T <: B) && (T <: C)
is_exact && (return result(T)::Result)
end
failure()::Result
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a breaking change. We cannot merge the type_intersect_exact.jl file into Base in any form, as it promises something that it cannot provide

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not at all sure what you mean? It's just an internal helper function, given that it's not a change it can't be a breaking change??

Do you just don't like the name? The documentation is clear about the search for the exact intersection not always succeeding, but even so it's safe given that the result says whether the intersection was found or not.

Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just at a high level review comment, it sounds like this is trying to change too many things at once, several of which are likely breaking changes that therefore must be handled carefully and likely will need to be in separate PRs

nsajko added a commit to nsajko/julia that referenced this pull request Feb 11, 2024
fingolfin pushed a commit that referenced this pull request Feb 11, 2024
@nsajko nsajko marked this pull request as draft February 11, 2024 19:02
@nsajko
Copy link
Contributor Author

nsajko commented Feb 11, 2024

I realized that, while this is an improvement for constant-length iterators, it's a (runtime) performance regression for non-constant-length iterators when the tuple type is known-length, so I need to rethink this a bit, apart from the pointed out issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix This change fixes an existing bug collections Data structures holding multiple items, e.g. sets performance Must go faster types and dispatch Types, subtyping and method dispatch
Projects
None yet
4 participants