proposal: add a set data type #264
Comments
The standard recommendation thus far has been to use a dict with None as the values. I've used that in practice and while it's not quite as nice, it's perfectly workable.
@dieortin @ndmitchell It's already half-done in Go: google/starlark-go#492
@suganoi that's nice! I have also just seen #20, but it doesn't seem to include the existence of IMO the
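A minimal sketch of the dict-with-`None`-values idiom mentioned above, in Python syntax. The helper names (`make_set`, `union`, `intersection`, `difference`) are hypothetical and exist only for illustration; they are not part of Starlark or any library.

```python
# Hypothetical helpers emulating a set with a dict whose values are all None.
# Dicts preserve insertion order, so the first occurrence of each key is kept.

def make_set(items):
    return {item: None for item in items}

def union(a, b):
    result = dict(a)  # copy so that the inputs are not mutated
    result.update(b)
    return result

def intersection(a, b):
    return {k: None for k in a if k in b}

def difference(a, b):
    return {k: None for k in a if k not in b}

names = make_set(["Thomas", "David", "Michelle", "David"])
others = make_set(["David", "Anna"])

print(list(names.keys()))                        # ['Thomas', 'David', 'Michelle']
print("David" in names)                          # True
print(list(union(names, others).keys()))         # ['Thomas', 'David', 'Michelle', 'Anna']
print(list(intersection(names, others).keys()))  # ['David']
print(list(difference(names, others).keys()))    # ['Thomas', 'Michelle']
```

Membership tests and deduplication come for free; the other set operations have to be written by hand, which is the "not quite as nice" part.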
For the record, the example above can be written:

    names = ["Thomas", "David", "Michelle", "David"]  # contains duplicate names
    unique_names = {n: None for n in names}.keys()

In many cases, people want sets just to remove duplicates; this workaround may be sufficient for that use case.

In Bazel BUILD files, there's a tradition of using lists in most places. For example,

As mentioned previously, Starlark-Go has an implementation of sets (https://github.com/google/starlark-go/blob/master/doc/spec.md#set - https://github.com/google/starlark-go/blob/90ade8b19d09805d1b91a9687198869add6dfaa1/starlark/library.go#L990). It can be enabled with a flag.
I'm not opposed to a set data type, although if we do add it, I think we ought to also support the literal |
This is not an easy decision: people constantly make mistakes by writing I suggest having separate discussions/decisions for
Given the discussions ... is there any status on the set data type feature? |
@tomaspipes - given the positive reaction from the community here and elsewhere, I want to take a few days this summer to add a |
Didn't have time to look at this in summer - but I finally wrote an experimental implementation for Java / Bazel: https://bazel.googlesource.com/bazel/+/0651a84f3b1a8d8ad06eb30d96ed298e14dde74f

In light of this, let's hammer out a language spec. Current feature matrix:
For the spec, I propose the following:
Modeled on the Python 3 set type and the existing implementations in Go, Rust, and the proposed one for Java, with the following differences:

* set literals and set comprehensions *not* supported (unlike python3)
* `copy()` method *not* supported (unlike python3) because we do not have it on lists or dictionaries.
* comparison operators *not* supported (unlike starlark-go and python3)
* `update()` method supported (unlike starlark-go)
* `isdisjoint()`, `intersection_update()`, `difference_update()`, `symmetric_difference_update()` method supported (unlike starlark-go and starlark-rust)
* multiple-argument form of `union()`, `intersection()`, `difference()` and corresponding _update methods supported (unlike starlark-go and starlark-rust).

Fixes bazelbuild#264
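Since the proposal is modeled on the Python 3 set type, the methods listed above can be illustrated with plain Python 3. This is only a sketch of the Python behavior the proposal refers to, not output from any Starlark implementation; sets are built with `set(...)` rather than literals because the proposal excludes set literals and comprehensions.

```python
# Plain Python 3 semantics for the methods named in the proposal.
a = set([1, 2, 3])
b = set([3, 4])
c = set([5])

# Multiple-argument forms: each argument may be any iterable.
print(sorted(a.union(b, c)))              # [1, 2, 3, 4, 5]
print(sorted(a.intersection(b, [3, 9])))  # [3]
print(sorted(a.difference(b, c)))         # [1, 2]

# isdisjoint() is True only when the two collections share no element.
print(a.isdisjoint(c))  # True
print(a.isdisjoint(b))  # False

# In-place variants mutate the receiver.
s = set([1, 2, 3])
s.update([4], [5, 6])                        # s is now {1, 2, 3, 4, 5, 6}
s.intersection_update([2, 3, 4, 7], [3, 4, 5])  # keeps only elements in all arguments
print(sorted(s))                             # [3, 4]
s.difference_update([4])                     # removes 4
s.symmetric_difference_update([3, 8])        # drops the shared 3, adds 8
print(sorted(s))                             # [8]
```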
Proposed spec (with multiple-argument form of |
Experimental feature guarded by --experimental_enable_starlark_set.

This is the Bazel implementation of bazelbuild/starlark#264

Replicates the Python 3 set API and (in almost all respects) the starlark-go implementation, with the notable exception of explicitly not supporting (partial) ordering of sets.

Note that there are no set-valued attributes (nor plans to add any), and set-valued select() expressions are not supported.

RELNOTES: Add a set data type to Starlark, guarded by the --experimental_enable_starlark_set flag.

PiperOrigin-RevId: 695886977
Change-Id: Id1e178bd3dd354619f188c4375d8a1256bd55f75
A `set` data structure, similar to the one in Python, would be a useful addition. For example, it is sometimes required to ensure no duplicate values exist in a `list`, and a `set` is the most ergonomic solution.

Other common use cases are:
Thoughts? Should I make an attempt at writing a proposal?
Use case example
As an example, I have run into this need while adding automatic dependency exploration in my fork of kklochkov/rules_qt, a ruleset for building Qt applications with Bazel. When generating a rule instantiation, the `deps` attribute cannot contain duplicate values. My feature analyzes Qt shared libraries in the system, and the same library can appear under different names (for example, `libQt6Quick.so` and `libQt6Quick.so.6.4.3`). This introduces the need to ensure each dependency is only added once. Of course, there are ways to work around this problem, but it would be easily solved with a `set`.
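A rough sketch of how a set could handle this, written in Python. The function names (`canonical_name`, `unique_deps`) and the rule for matching library names (dropping everything after `.so`) are assumptions made for illustration; the issue does not say how rules_qt or the fork actually matches duplicates.

```python
def canonical_name(filename):
    """Hypothetical rule: 'libQt6Quick.so.6.4.3' -> 'libQt6Quick.so'."""
    base, sep, _version = filename.partition(".so.")
    if sep:
        return base + ".so"
    return filename  # no version suffix, keep the name unchanged

def unique_deps(filenames):
    """Return canonical names in first-seen order, each exactly once."""
    seen = set()
    deps = []
    for filename in filenames:
        name = canonical_name(filename)
        if name not in seen:
            seen.add(name)
            deps.append(name)
    return deps

found = [
    "libQt6Quick.so",
    "libQt6Quick.so.6.4.3",
    "libQt6Core.so.6",
    "libQt6Core.so",
]
print(unique_deps(found))  # ['libQt6Quick.so', 'libQt6Core.so']
```

The set only tracks what has been seen; the result stays a list, which matches the list-valued `deps` attribute.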