Skip to content
This repository has been archived by the owner on Mar 29, 2024. It is now read-only.

Consider redesigning how Module (hierarchies) work #23

Open
yupferris opened this issue Jan 12, 2021 · 10 comments
Open

Consider redesigning how Module (hierarchies) work #23

yupferris opened this issue Jan 12, 2021 · 10 comments

Comments

@yupferris
Copy link
Owner

yupferris commented Jan 12, 2021

Sketch/braindump:

Currently, a module hierarchy in kaze might look like this (taken/adapted from cases in sim tests):

fn test<'a>(c: &'a Context<'a>) -> &Module<'a> {
    test_inner(c);

    let m = c.module("Test");
    let i1 = m.instance("inner1", "TestInner");
    i1.drive_input("i1", m.input("i1", 32));
    i1.drive_input("i2", m.input("i2", 32));
    let i2 = m.instance("inner2", "TestInner");
    i2.drive_input("i1", m.input("i3", 32));
    i2.drive_input("i2", m.input("i4", 32));
    let i3 = m.instance("inner3", "TestInner");
    i3.drive_input("i1", i1.output("o"));
    i3.drive_input("i2", i2.output("o"));
    m.output("o", i3.output("o"));

    m
}

fn test_inner<'a>(c: &'a Context<'a>) -> &Module<'a> {
    let m = c.module("TestInner");

    let i1 = m.input("i1", 32);
    let i2 = m.input("i2", 32);
    m.output("o", i1 & i2);

    m
}

This fn creates a simple inner module ("TestInner") instantiated 3 times inside the top-level module ("Test").

While this is easy enough to understand, the code itself has a few problems:

  • It's quite verbose, especially horizontally. Conceptually, we're not doing something very complicated, yet it still takes quite a few characters to express this. Not only does that make it annoying to type/put together, it also makes the code quite dense and a bit hard to understand what's going on as things get more complicated.
  • Input/output names are specified as strings in several different places. While kaze validates module hierarchies internally before generating code (which ensures that string names are checked/validated at some point and keeps things safe), it would be nice for as many of these errors as possible to be caught at rust compile time and displayed in the editor (eg. referring to a port that doesn't exist).
  • It can be difficult to get an overview of the top-level ports of a module. This is especially important when browsing unfamiliar RTL code (which includes code you wrote but haven't looked at in a while!)

Additionally, there are some more general loose ends with the current pattern for creating/instantiating modules:

  • It's kindof nice to constrain module generation to a fn, but what does that really get us?
  • Why do we return the Module at the end? This makes it easy to generate code for the top-level module, but a more common case actually is just to call a module generation function, ignore the return value, and then instantiate the module after (by string name, not the reference we just got back!), as is done for TestInner above. Feels a bit janky to say the least.
  • Currently, verilog gen requires the user to explicitly generate code for each Module that's been instantiated in their design, but other than providing a way to iterate over all Modules in a Context (which is a hack) there's no useful way to query which modules are instantiated (so we don't necessarily know which modules are/aren't important and the only conservative thing is to always generate all modules).
  • Specialization for modules with different parameters has to be done by hand. This leads to many Modules with hand-generated mangled names. If we're already mangling names, why not try to make this automatic somehow and save some typing/decision making?

Generally, module generation/instantiation feels a bit half-baked, and I'd really like to improve things. I feel it's quite important to have a good pattern for this - we want to encourage the use of several decoupled modules in designs, and if it's too hairy to do this meaningfully in a lot of cases, then users will tend to create larger modules with much more logic, which are harder to build and maintain (I find myself doing this more often than I would like).

One thought I've had after looking a bit at nMigen is for a Module to not represent a verilog module 1:1, but instead have it represent a specific instance (which is currently represented by Instance in kaze). Then the convention might be to wrap each Module (instance!) in a wrapper struct that exposes the inputs/outputs as Signals (or relevant wrappers) as fields. This way, adding a new module into the Context and instantiating it (which are very commonly done one right after another) are merged into one action. For codegen, we can make verilog and rust sim gen more symmetric by always specifying a top-level module and having kaze always output the whole hierarchy (with mangled names for inner modules in the verilog case now, and still flattening the graph for rust sim). This then greatly simplifies codegen for the verilog case, since we only write code to output top-level modules.

Mockup:

struct Test<'a> {
    pub i1: &'a Input<'a>,
    pub i2: &'a Input<'a>,
    pub i3: &'a Input<'a>,
    pub i4: &'a Input<'a>,

    pub o: &'a Output<'a>,
}

impl<'a> Test<'a> {
    // TODO: Optionally make this a submodule
    // TODO: How do we get the inner module again in order to generate code?
    pub fn new<S: Into<String>>(c: &'a Context<'a>, instance_name: S) -> Test<'a> {
        let m = c.module(instance_name, "Test");

        let i1 = m.input("i1", 32);
        let i2 = m.input("i2", 32);
        let i3 = m.input("i3", 32);
        let i4 = m.input("i4", 32);

        let inner1 = TestInner::new(c, "inner1");
        inner1.i1.drive(i1);
        inner1.i2.drive(i2);
        let inner2 = TestInner::new(c, "inner2");
        inner2.i1.drive(i3);
        inner2.i2.drive(i4);
        let inner3 = TestInner::new(c, "inner3");
        inner3.i1.drive(inner1.o);
        inner3.i2.drive(inner2.o);
        let o = m.output("o", inner3.o);

        Test {
            i1,
            i2,
            i3,
            i4,

            o,
        }
    }
}

struct TestInner<'a> {
    pub i1: &'a Input<'a>,
    pub i2: &'a Input<'a>,
    pub o: &'a Output<'a>,
}

impl<'a> TestInner<'a> {
    // TODO: Optionally make this a submodule
    // TODO: How do we get the inner module again in order to generate code?
    pub fn new<S: Into<String>>(c: &'a Context<'a>, instance_name: S) -> TestInner<'a> {
        let m = c.module(instance_name, "TestInner");

        let i1 = m.input("i1", 32);
        let i2 = m.input("i2", 32);
        let o = m.output("o", i1 & i2);

        TestInner {
            i1,
            i2,
            o,
        }
    }
}

Overall, I think this will make modules a bit more verbose (mostly vertically due to struct wrapping etc), but I think the verbosity adds clarity, and the code that actually constructs Modules by combining Signals should get a bit lighter. Further, I think it's very useful to have instance signals exposed on a struct, as this could be extended by having other patterns - for example, we can group some signals with "bus port" structs, and add convenience functions to bind compatible bus ports to one another. Abstractions like this would be made ad-hoc and should reduce a lot of boilerplate in larger designs, especially where certain buses are used a lot (eg. xenowing). These abstractions can also potentially provide better semantic errors (eg. non-matching bus widths) before actually connecting signals (and deferring errors to lower-level kaze signal errors).

A small caveat: kaze currently treats all cases where Signals belonging to different Modules cannot be combined, but if inputs/outputs were to represent inputs/outputs on the submodule instance, we need to make an exception for them. It probably also makes sense to continue to disallow inputs/outputs for any Module that isn't a submodule of the current Module.

It might also be possible/desirable to hide some of the boilerplate with proc macros, but I'm always hesitant about magic syntax like this. If this is a common pattern that works well it might be worth it tho.

Whatever pattern(s) we end up with, even if we support them with proc macros for syntactic convenience, I think it's crucial that they be easy to understand/expressible in plain code. Further, a user can choose to forego this pattern entirely!

Some unresolved issues include:

  • How do we get the inner module from each wrapper struct in order to generate code for it? In nMigen, a new module is a new class that derives from a common base (Elaboratable I believe it's called), so the new module is the same object we'd generate code for, rather than a wrapper. Do we want to use traits/inheritance of some kind to mimic this? If we eventually move to an API where we separate unvalidated/validated graphs and we produce a validated graph by consuming an unvalidated graph (and transitively, all of the references to nodes in the graph, which includes modules!), how do we get ahold of the module(s) again in order to specify the top-level one(s) for codegen? Do we need to do that, or does it makes sense to "generate all top-level modules in this validated graph" always?
  • How do we get modules to either represent top-level modules or submodules in a nice way?
  • How do we specify instance names in a nice way? Do we always need instance names then (eg for top level modules, where this doesn't actually make sense in most cases)?
  • How can we use Inputs transparently as Signals and as sinks for Signals depending on context? Does it make sense to do Into<&'a Signal<'a>> everywhere instead of &'a Signal<'a> directly? Can this extend to Register as well so we don't have to use Register::value (which I tend to forget a lot at least)?
@yupferris
Copy link
Owner Author

I just realized, considering moving signal bit widths into the type system (#13), it fits really nicely with the proposed Input and Output types, because their widths would be explicitly encoded in their parameterization at their declarations, which makes the above struct definitions read more like port declarations in Verilog, which I find to be a good pattern and something that's missing often from kaze. Furthermore, we can actually do that with today's rust, as long as we check that the Signal used to construct an Input/Output's width matches the constant parameter in the constructor. This would be an excellent first step towards #13 if the above strategy ends up working.

@jdonszelmann
Copy link
Contributor

@yupferris I think I can get something like this to work:
image
It's very similar to what you say in your last comment I think

@jdonszelmann
Copy link
Contributor

The generic S could also just be a non-generic struct (like Signal) but the current signal would not work because of all of the lifetime arguments. Module definitions would become overly verbose. In fact, with this shorthand S I kind of like it

@jdonszelmann
Copy link
Contributor

jdonszelmann commented Aug 28, 2021

The only thing I'm not sure about is the fact that previously you could name inputs and outputs. You can't really name type arguments. If we want to keep naming, the only solution I can think of would be adding a method to Module which returns the names of all signals (which I hate). Macros could help with that but still. For example, a macro like this:

impl<S: Signal> Module<S> for And {
    inputs!("i1", "i2");
}

which would expand to something like:

type Inputs = (S, S);
fn inputs(&self) -> {("i1", "i2")}

but I don't like that much

The naming of the module itself is less of a problem, as we can get the name of the struct at runtime (https://doc.rust-lang.org/std/any/fn.type_name.html)

@yupferris
Copy link
Owner Author

yupferris commented Aug 30, 2021

Thanks for the feedback. I have to be honest, though - I have no idea what you're talking about/suggesting, or how to interpret your feedback!

It may not have been written clearly above in my original wall of text, especially with a bunch of unresolved issues outlined (this was all written before I actually started on the implementation), so I can understand that it may have been easy to gravitate towards that, and for that I apologize. I'll attempt to clarify a bit:

The TL;DR is that the current approach and syntax for declaring and instantiating new Modules doesn't make much sense to me anymore after having used the language for a bit, for reasons outlined above.

What I'm specifically after is:

  • What do you think about the syntax I suggest above for Module declarations, assuming the role of a user using the language? Is it too verbose? Could it be improved?
  • Are there alternative ideas that I should consider?
  • Do you have any examples of user code that would make or break these strategies, or motivate alternatives?

While I understand that these questions are inherently tied to what's actually implementable within Rust, I would appreciate it if the feedback were more general, perhaps even assuming an ideal implementation that couldn't necessarily exist.

I already have implementations underway on this branch where I'm working through potential issues and changing things around where it makes sense. I'm not interested and explicitly do not want help/collaboration for the concrete details; it's something I've been working on for some time and need to continue doing on my own.

@jdonszelmann
Copy link
Contributor

jdonszelmann commented Aug 30, 2021

I see. I haven't been too clear either. My comments specifically answer point 2: alternatives. I think it might be possible to make modules be structs which implements the a "Module" trait. The reasoning behind that is that when modules become structs, instances of modules become instances of struct structs in rust. In my example of "and", I have an And struct which implements a trait, in which it defines inputs, outputs and a function mapping inputs to outputs. But as you say, you might be less interested in that (which i get).

However, about your first and last point: I think the new way is still quite verbose and not that much better than the old way. What might help it is if names are automatically assigned by default (though names can still be optionally given). When you need to refer to a specific input. In many cases the names add a lot of verbosity while you can just work with references to that input. An example:

image

does inner.o really need a name? we refer to it as inner.o, never by it's name. Without names your code would look like this:

struct Test<'a> {
    pub i1: &'a Input<'a>,
    pub i2: &'a Input<'a>,
    pub i3: &'a Input<'a>,
    pub i4: &'a Input<'a>,

    pub o: &'a Output<'a>,
}

impl<'a> Test<'a> {
    // TODO: Optionally make this a submodule
    // TODO: How do we get the inner module again in order to generate code?
    pub fn new(c: &'a Context<'a>) -> Test<'a> {
        let m = c.module();

        let i1 = m.input(32);
        let i2 = m.input(32);
        let i3 = m.input(32);
        let i4 = m.input(32);

        let inner1 = TestInner::new(c);
        inner1.i1.drive(i1);
        inner1.i2.drive(i2);
        let inner2 = TestInner::new(c);
        inner2.i1.drive(i3);
        inner2.i2.drive(i4);
        let inner3 = TestInner::new(c);
        inner3.i1.drive(inner1.o);
        inner3.i2.drive(inner2.o);
        let o = m.output("o", inner3.o);

        Test {
            i1,
            i2,
            i3,
            i4,

            o,
        }
    }
}

struct TestInner<'a> {
    pub i1: &'a Input<'a>,
    pub i2: &'a Input<'a>,
    pub o: &'a Output<'a>,
}

impl<'a> TestInner<'a> {
    // TODO: Optionally make this a submodule
    // TODO: How do we get the inner module again in order to generate code?
    pub fn new(c: &'a Context<'a>) -> TestInner<'a> {
        let m = c.module();

        let i1 = m.input(32);
        let i2 = m.input(32);
        let o = m.output(i1 & i2);

        TestInner {
            i1,
            i2,
            o,
        }
    }
}

Optional names could be added with a sort of builder pattern like this:

let i1 = m.input(32).with_name("i1")

Generating names could be done using a scheme (for example, call inputs "i1", "i2", "i3", etc.).

I do not entirely get your TODO about "making this a submodule". If it means you want to make a distinction between modules and submodules I must say I don't think that's a good idea. I do think it makes sense to generate code for all top level modules as you call it. In fact, I think you could just generate code for every module that is added to the context. Or is that something you don't want?

About macros: I too am quite hesitant about them. I especially don't like that getting error reporting right in them is so damn hard and if you don't have nice error reporting in them it's a pain to work with them as you also don't get autocomplete with them in most/all editors. However, I think normal macros can already help a lot, and it's much easier to work with those.

But more generally, I do think working with structs like this is a good way to go forward. It's definitely much better than what's there right now.

@yupferris
Copy link
Owner Author

Apologies for not writing back about this until now, but I've struggled to be able to make time to internalize what you've written and respond properly while prioritizing IRL things and the projects I'd like to use kaze for 🙂 .

At this point, I'd like to finalize this work as it is today. Though it's not 1:1 with what we've discussed a bit above, and design-wise I wouldn't consider it complete, I believe it's a step in the right direction, and I need to move on to be able to properly work on other things on top of it (including your other pull requests).

Anyways, to address some specific points and clarify some things:

I think it might be possible to make modules be structs which implements the a "Module" trait. The reasoning behind that is that when modules become structs, instances of modules become instances of struct structs in rust.

The first thing I gather from this (which you also mention a bit later) is that representing instances of modules as instance of Rust structs is intuitive and something we agree would be good to work towards.

As for doing this via implementing a Module trait, which perhaps requires that an implementation implements an elaborate (or similarly-named) function, I think that's a good idea. This is also a common pattern in other meta-HDLs (both Chisel and nMigen do it this way I believe). However, one thing that's not clear to me is how to declare ports in a nice way.

I would really like inputs/outputs to be expressed as fields on a given module struct. This means we don't have to look them up by string names when referring to them, which, in addition to requiring fewer characters, opens up for the possibility for additional type safety. For example, with separate Input and Output types, trying to connect incompatible ports will result in a compile-time error that is typically highlighted in existing Rust editing environments. I'm generally of the opinion that everything that can be brought to an error at Rust compile-time is beneficial, as when working with kaze so far, even a small amount tooling support (especially that which comes "for free") is extremely helpful.

Additionally (and this paragraph also answers your "does an output really need a name?" question), a very large amount of time developing a kaze module is spent looking at trace output, and having proper names to refer to all ports (and registers, and potentially other signals as well, though those should probably be opt-in with a .trace("name") fn or similar on Signals, as many Signal instances are short-lived temporaries as part of a larger expression) is an absolute necessity. Additionally, for me as a developer of kaze itself, I spend a lot of time looking at the generated verilog and Rust simulator code, and I need the names there to be proper ones as well. Auto-generated names would be a nightmare to work with (especially as they would change when new ports are added), and opting-in for each port would be even more verbose than what I have now and what I'm suggesting above.

I do not entirely get your TODO about "making this a submodule".

Right, what I mean by this is there needs to be a way for a module instance to be declared as an instance inside of another module that kaze has knowledge of. This is mostly for codegen, but it's also for error checking. If kaze Signals are being connected, they must only be connected to other Signals in the same context (for example, a module output must only be connected to signals in the scope of the parent module), so kaze must know about the module hierarchy.

The way I've handled this currently is that I've introduced a ModuleParent trait, which both Context and Module implement. Module constructors take an impl ModuleParent argument that they use to construct the corresponding kaze Module. I don't think this is too bad.

There is still a distinction between top-level and other modules though, but this doesn't exist in the module code itself; it only exists at codegen time. kaze needs to know which modules to generate. For verilog, kaze has previously gotten away with just generating all modules, and since kaze instances mapped 1:1 with verilog instances, the verilog compiler would later be able to put the pieces back together. For sim, this is different, as the module hierarchy is flattened (to simplify how the simulator works), so a user must specify which modules to actually generate simulators for, and all of the logic in child modules is included in that simulator. With the suggestion above, the verilog output will work in a similar manner, which is more symmetric/easier to use (and doesn't require some hacks that I previously added to be able to iterate over all of the modules in a context in order to actually generate everything for the verilog case).

Anyways, in light of all of that, one thing is for sure:

However, about your first and last point: I think the new way is still quite verbose and not that much better than the old way.

I definitely agree with the bit about verbosity; it's really unfortunate/annoying. Common things like making registers is not terrible (it's better than the equivalent verilog in most cases), but creating a module (or even just a port) is too much hassle.

I do think it's important to note that with the new way, it's easier to build ad-hoc abstractions. For example, one can now collect a bunch of Inputs and Outputs into another struct to define a "meta-port" as a collection of ports, complete with a connect function to connect it with a compatible port of "opposite polarity". With the currently-released kaze versions, this requires storing a bunch of string names in addition to each module instance (in the case where such a "meta-port" contains ports going in both directions), in addition to a lot more typing!

The more I think about it, the more I think I might want a module (proc) macro to declare a module struct that desugars to something similar to what I've implemented that's similar to the above. This could handle repeating port names and some plumbing details gracefully, while users who are allergic to macros still don't have to use it if they don't want to (in fact, they can opt out of the 1:1 Module/struct correspondence entirely, probably). However, it rules out module structs that contain additional (non-port) fields, but arguably it's better that the module struct is only concerned with the module itself, and (meta)data could live on a wrapper object that contains a field for the module as well as the additional fields, and the macro approach doesn't hinder this one bit.

Some other loose ends:

The naming of the module itself is less of a problem, as we can get the name of the struct at runtime (https://doc.rust-lang.org/std/any/fn.type_name.html)

This I was unaware of, thanks for bringing it to my attention! I think I'll definitely want to use this.

Another thing that bugs me when working with ad-hoc abstractions, and in particular the "meta-port" example above: It's actually annoying that this only works for inputs/outputs, because it forces these abstractions to only work at module boundaries, and sometimes it might be nice to use them entirely within a module as well. This requires something like verilog's wire construct to work properly, as we need some kind of "late binding" like we have with Inputs now, where the Input is first instantiated undriven, and is driven later. This can be emulated today by making a Wire module (with parameterized bit width), which is something I've actually done in a few places to get around the fact that a Mem port requires its input signals to be driven at construction time.

Anyways, I'm going to work on getting what I have on master and cleaned up a bit, and perhaps even released (kaze is still pre-1.0.0 so I'm not totally averse to making some breaking changes along the way; for a "langauge" like this that's standard affair). I'll leave this issue open for now, but I think I want to split it into more focused/specific issues at some point (hopefully soon).

@yupferris
Copy link
Owner Author

Also, just to have an example built with what I currently have, here's an LFSR:

use kaze::*;

pub struct Lfsr<'a> {
    pub m: &'a Module<'a>,
    pub shift_enable: &'a Input<'a>,
    pub value: &'a Output<'a>,
}

impl<'a> Lfsr<'a> {
    pub fn new(instance_name: impl Into<String>, p: &'a impl ModuleParent<'a>) -> Lfsr<'a> {
        let m = p.module(instance_name, "Lfsr");

        let shift_enable = m.input("shift_enable", 1);

        let state = m.reg("state", 16);
        state.default_value(0xace1u32);
        let value = m.output("value", state.bits(7, 0));

        state.drive_next(if_(shift_enable, {
            let feedback_bit = state.bit(0) ^ state.bit(2) ^ state.bit(3) ^ state.bit(5);
            feedback_bit.concat(state.bits(15, 1))
        }).else_({
            state
        }));

        Lfsr {
            m,
            shift_enable,
            value,
        }
    }
}

@Hazematman
Copy link

Has the ModuleParent trait been properly implemented into kaze? I tried to copy the code here into a basic example https://github.com/xenowing/xenowing/blob/master/rtl/src/fifo.rs#L18 and I get a not found in this scope error.

I have to admit, I'm not the familiar with Rust, so I may just be doing something incorrectly.

@yupferris
Copy link
Owner Author

yupferris commented Feb 20, 2023

It's implemented on master, which can be used from cargo but not in a proper release, as I still haven't gone back and fixed tests, added docs, etc. Not sure when I'll ever get around to this beyond "eventually". That said, I don't have any immediate plans to roll back these changes or anything like that, and in general the xenowing code should be considered the closest thing to "best practices" as kaze has.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants