fud2 Tracker #1878

Open
3 of 22 tasks
sampsyo opened this issue Jan 28, 2024 · 12 comments
Labels
C: fud2 experimental driver Type: Tracker Track various tasks

Comments

@sampsyo
Contributor

sampsyo commented Jan 28, 2024

Here is a tracking issue to catalog all the stuff to do to fud2, after the initial import in #1877. As I turn these bullet points into issues, I'll edit this issue to link to them.

Near-term infrastructure completion:

  • Tests! Maybe unit tests, or maybe snapshot tests of the emitted Ninja stuff. (Actual functionality tests of real builds will happen elsewhere, i.e., when our ecosystem-wide tests eventually use fud2 as a fud alternative.)
  • Document all the currently supported ops.
  • Document the design (and the scope/goals of both fake and fud2), and how to add functionality to fud2. [fud2] Write some design docs #1943
  • Add a CLI option to list all the states & ops. [fud2] Add list subcommand #1937
  • Better error messages w/r/t configuration problems.
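
The snapshot-test idea above could look roughly like this. This is a minimal sketch, not fud2's actual test setup: `emit_ninja`, `check_snapshot`, and the `calyx -b verilog` command line are illustrative assumptions, and since fud2 is written in Rust, real tests would more likely live in a Rust test harness. Python is used here only to keep the sketch short.

```python
import difflib

def emit_ninja(rule: str, command: str, output: str, source: str) -> str:
    """Hypothetical stand-in for an emitter: one rule plus one build statement."""
    return (
        f"rule {rule}\n"
        f"  command = {command}\n"
        f"\n"
        f"build {output}: {rule} {source}\n"
    )

def check_snapshot(actual: str, expected: str) -> list[str]:
    """Return a unified diff; an empty list means the snapshot matches."""
    return list(difflib.unified_diff(
        expected.splitlines(), actual.splitlines(),
        fromfile="snapshot", tofile="emitted", lineterm="",
    ))

# The stored snapshot would normally live in a file next to the test.
SNAPSHOT = """\
rule calyx-to-verilog
  command = calyx -b verilog $in -o $out

build foo.sv: calyx-to-verilog foo.futil
"""

emitted = emit_ninja(
    "calyx-to-verilog", "calyx -b verilog $in -o $out", "foo.sv", "foo.futil"
)
diff = check_snapshot(emitted, SNAPSHOT)
```

A non-empty `diff` can be printed line by line to show exactly which emitted Ninja lines changed.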

Missing functionality from OG fud (an incomplete list):

  • XRT: Make trace mode (VCD emission) work.
  • XRT: Support real-FPGA execution mode.
  • Systolic array generator.
  • Vivado HLS.
  • Vivado report parsing?

Big features to explore:

  • Extensibility. Perhaps with a scripting language like Starlark, as outlined elsewhere.
  • Ops with multiple inputs. The point is that the --set sim.data=foo.json route, where the data input is a "second-class citizen" w/r/t the actual code for simulation, is a hack. We should instead treat foo.json as a proper input, just like the Verilog program, that goes through its own op-based transformation, discovered with a BFS traversal of the op graph. This would let you, for example, do fud2 foo.json -o foo.hex or whatever to run the data conversion alone, and it would allow formats other than the "blessed" JSON format to work transparently. [fud2] Hypergraph #1958
  • Ops with multiple outputs. This is relevant mostly to the simulation stages… currently, we have separate icarus and icarus-trace ops because one produces a JSON data file and the other produces a VCD file. Maybe we can make this one op? Then you wouldn't have to do --through icarus-trace --to vcd, which seems redundant; it would suffice to just do --through icarus --to vcd to specify which output you want. [fud2] Hypergraph #1958
  • Remote execution. We had this in OG fud for the Xilinx ops, but I think this should be a generic thing that any op can use.
  • Priority. This is an OG fud feature we don't have. See Using fud priority causes warning #1873, and also [Fud2] Cider is the default for --to dat computations #2208 for a specific current problem with prioritization.
  • "Check mode." OG fud has fud check that can automatically diagnose lots of installation and versioning problems.
  • "Install mode." Like check mode but even more helpful. See Proposal: fud install sampsyo/fake#2 and Self-contained fud2 package #1899.
  • Rethink how "support resource" files work. Lots of ops need auxiliary files, and we have to deliver those somehow. Currently, those go in a directory alongside the actual code for fud2. This is inelegant because it means fud2 is not a self-contained executable, and fud2 isn't usable until you set a config option to point to that directory. Maybe we should include the files in the binary itself. See Self-contained fud2 package #1899. fud2: Embed resources in executable, in release mode #1910
  • (Optionally) embed Turtle instead of relying on Ninja?
  • Several ops currently depend on chunks of Python functionality from OG fud. We should figure out what to do about this in general… maybe just document it? Or create a separate Python package you have to install alongside our Rust business? Or just replace those bits of functionality altogether, which sidesteps the headache.
  • An extensible command line, perhaps? It would be nice for ops to be able to provide custom CLI options to parse, which would make things somewhat shorter than -s key=value. We'd need to collect examples of how this might be useful.
  • Internal ops: fud2 Tracker #1878 (comment)
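
To illustrate the "ops with multiple inputs" item, here is a minimal sketch of a BFS over an op graph in which states (file types) are nodes and ops are edges. The op names, state names, and `find_plan` are hypothetical and do not reflect fud2's real registry or planner. The point is only that the same search could serve both the program and the data file.

```python
from collections import deque

# Hypothetical op graph: each op converts one state (file type) into another.
# These names are illustrative, not fud2's actual ops.
OPS = [
    ("calyx-to-verilog", "calyx", "verilog"),
    ("icarus", "verilog", "dat"),
    ("icarus-trace", "verilog", "vcd"),
    ("json-to-hex", "json", "hex"),
]

def find_plan(ops, start, goal):
    """Breadth-first search for the shortest chain of ops from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, plan = queue.popleft()
        if state == goal:
            return plan
        for name, src, dst in ops:
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append((dst, plan + [name]))
    return None  # no chain of ops connects the two states

# The program and the data file are planned independently, by the same search.
program_plan = find_plan(OPS, "calyx", "dat")
data_plan = find_plan(OPS, "json", "hex")
```

Under this framing, running the data conversion alone is just a plan whose start state is the data format.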
@sampsyo sampsyo added Type: Tracker Track various tasks C: fud2 experimental driver labels Jan 28, 2024
@rachitnigam
Contributor

Thanks for getting started on this @sampsyo! What do you think about the following milestone: getting all of the tests in the monorepo working with fud2? We can maintain a separate runt.toml file and slowly chip away at all the failures/missing features. Once we have that done, we can consider recommending fud2 as the default build tool and see what other things keel over?

@sampsyo
Contributor Author

sampsyo commented Jan 29, 2024

Yes! That seems like the milestone to shoot for. I think we are a few steps away from that being feasible (mostly the documentation/QoL stuff in the first stanza of checkboxes above, and of course the extra ops in the second stanza), but that is the right medium-term goal.

@rachitnigam
Contributor

Ops with multiple inputs

I came across this when playing with fud2 as well; there is some feeling about treating inputs more declaratively in the same way we think about files. The idea that some secondary inputs need to be transformed themselves is already true for things like the symbolic evaluator, where the input spec might need to be compiled through its own fud flow.

@rachitnigam
Contributor

Can we add #1899 to the tracker @sampsyo?

@sampsyo
Contributor Author

sampsyo commented Feb 10, 2024

Yes, having "secondary inputs" be treated as first-class citizens would be really really satisfying…

@rachitnigam
Contributor

Proposal: fud2 package to create zip/tarballs containing all of the source files needed to execute a particular flow. This could be useful for remote workflows and particularly powerful for benchmarking infrastructures that want to generate one ninja file to generate results for a bunch of different input programs.

@rachitnigam
Contributor

Several ops currently depend on chunks of Python functionality from OG fud. We should figure out what to do about this in general… maybe just document it? Or create a separate Python package you have to install alongside our Rust business? Or just replace those bits of functionality altogether, which sidesteps the headache.

I think this should be promoted into a general discussion about a new tools/ folder, and we should formalize the process of building new tools for Calyx (languages supported, build flows, especially for Python).

@rachitnigam
Contributor

One problem I've been running into when using fud2 is that if the Calyx compiler itself changes, the .fud2 is not invalidated and fud2 will skip the compilation step. We should add a dependency on the various tools being run to make sure that when the tools themselves change, the build is re-executed from the right step!

Woo! New set of challenges from incremental execution but it'll be worth it once we have it figured out.

@sampsyo
Contributor Author

sampsyo commented Feb 27, 2024

Ah yes, exciting!! Makes sense to force (e.g.) Calyx compilation rules to have an implicit dependency on the Calyx compiler binary.

This is mostly a note to myself, but making that work would probably benefit from some sort of abstraction inside the fud-core emitter to make it easy to include that dependency every time… Ninja does not let you attach dependencies to rules (only build statements), so there is a need to repeat this dependency every place the rule is used. That is very understandable for Ninja, but it means we will want a way for fud2 code to do the right thing by default instead of needing to remember the implicit dependency every time. (This already comes up in #1910, where we need similar implicit dependencies on resource files.)
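
A minimal sketch of what such an abstraction might look like, assuming a hypothetical `NinjaEmitter` class (this is not the fud-core API, and fud2 itself is Rust; Python is used only for brevity). Implicit dependencies are declared once, alongside the rule, and the emitter repeats them after `|` on every build statement that uses the rule. The `calyx_exe` variable name and its path are made up for illustration.

```python
class NinjaEmitter:
    """Hypothetical emitter that remembers per-rule implicit dependencies."""

    def __init__(self):
        self.lines = []
        self.rule_deps = {}  # rule name -> list of implicit dependencies

    def var(self, name, value):
        self.lines.append(f"{name} = {value}")

    def rule(self, name, command, implicit=()):
        self.rule_deps[name] = list(implicit)
        self.lines += [f"rule {name}", f"  command = {command}"]

    def build(self, output, rule, source):
        # Ninja takes implicit dependencies only on build statements,
        # after a `|`, so the emitter repeats them on every use of the rule.
        deps = self.rule_deps.get(rule, [])
        suffix = " | " + " ".join(deps) if deps else ""
        self.lines.append(f"build {output}: {rule} {source}{suffix}")

    def text(self):
        return "\n".join(self.lines) + "\n"

emitter = NinjaEmitter()
emitter.var("calyx_exe", "/path/to/calyx")  # illustrative path
emitter.rule("calyx", "$calyx_exe -b verilog $in -o $out", implicit=["$calyx_exe"])
emitter.build("foo.sv", "calyx", "foo.futil")
emitter.build("bar.sv", "calyx", "bar.futil")
```

Both build statements end with `| $calyx_exe` without the caller having to remember it, so a changed compiler binary would make Ninja consider both outputs out of date.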

@sampsyo
Contributor Author

sampsyo commented May 9, 2024

Adding an idea to the tracker above that might make sense in the medium term: internal ops.

Currently, fud2 only orchestrates external commands. Everything it does must be implemented in a separate binary that can be invoked on the command line. This is great for modularity/testability/clarity, but it comes with obvious downsides: the overhead of serializing everything to files on disk, and the complexity of managing external dependencies. That's why clang, for example, is both a compiler driver and a toolbox of actual toolchain elements that it wraps up into one big burrito.

It would be cool if we could optionally include functionality inside the fud2 binary itself for certain, specific cases. For example, a version of fud2 could link in the entire Calyx compiler as a library. Then we wouldn't need to find the Calyx compiler executable; fud2 could instead emit Ninja "callbacks" to itself to do the compilation. As in, maybe fud2 exposes a command like this:

fud2 internal-op calyx -- foo.futil -o foo.sv

…and emits Ninja commands to call it this way instead of an external Calyx binary. This would just make the dependency management easier, at the expense of a gigantic monolithic fud2 binary.

Beyond this, fud2 could try to detect when there is a "chain" of internal ops in the plan. Then it could call itself to do the entire chain in a single fud2 internal-op invocation. This could then potentially help with serialization overhead.

Finally, in the limit, if the entire plan is a chain of internal ops, then we don't need to involve Ninja at all. Basically, the idea would be that fud2 could follow a "progressive enhancement" strategy where external commands always exist as a fallback, but we can use internal ops opportunistically, solely as an optimization with identical semantics.
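
To make the chain-detection idea a bit more concrete, here is a minimal sketch. Everything in it is hypothetical: the op names, the `INTERNAL_OPS` set, and the `fuse_plan`/`needs_ninja` helpers are illustrations, not existing fud2 code. Consecutive internal ops are grouped into one step, and Ninja is needed only if some external step remains.

```python
from itertools import groupby

# Hypothetical set of ops that the driver could run in-process.
INTERNAL_OPS = {"dahlia-to-calyx", "calyx-to-verilog"}

def fuse_plan(plan, internal=INTERNAL_OPS):
    """Group consecutive internal ops into one step; external ops stay separate."""
    steps = []
    for is_internal, group in groupby(plan, key=lambda op: op in internal):
        ops = list(group)
        if is_internal:
            steps.append(("internal", ops))
        else:
            steps.extend(("external", [op]) for op in ops)
    return steps

def needs_ninja(steps):
    """Ninja is required as soon as any step falls back to an external command."""
    return any(kind == "external" for kind, _ in steps)

steps = fuse_plan(["dahlia-to-calyx", "calyx-to-verilog", "icarus"])
```

Here the first two ops would collapse into a single internal step, while `icarus` stays an external command that still goes through Ninja.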

@rachitnigam
Contributor

This is a cool idea!! A potential challenge is that this makes fud2 a full-blown build tool (like fud). For example, the approach of "fusing" build steps implies that ninja does not get to see (and cache) the results of build artifacts. This also complicates the debugging story; there are now two possible flows for every build: one through fud2 and one through ninja.

An alternative approach to the serialization problem is coming up with binary formats that can be mmap'd directly by the various tools. For example, we don't really have a binary format for Calyx IR, which causes expensive serialization and deserialization.

@sampsyo
Contributor Author

sampsyo commented May 13, 2024

Ah, yes, thank you for articulating the downside. It would cost a lot of complexity! And, as you say, create two implementations to test for every possible plan. I guess one thing it wouldn't incur, however, is process management: everything would be in-process (that would be the point) so no std::process::Command necessary.

And yeah, you're right that there might be better alternatives to explore if serialization becomes a bottleneck… I guess the real lesson here is that we should wait for an actual bottleneck to appear before picking the solution. 🤪


2 participants